Pyparsing

Latest version: v3.1.2

3.0.3

-----------------------------
- Fixed regex typo in `one_of` fix for `as_keyword=True`.

- Fixed a whitespace-skipping bug, Issue 319, introduced as part of the revert
of the `LineStart` changes. Reported by Marc-Alexandre Côté,
thanks!

- Added header column labeling > 100 in `with_line_numbers` - some input lines
are longer than others.

3.0.2

-----------------------------
- Reverted change in behavior with `LineStart` and `StringStart`, which changed the
interpretation of when and how `LineStart` and `StringStart` should match when
a line starts with spaces. In 3.0.0, the `xxxStart` expressions were not
really treated like expressions in their own right, but as modifiers to the
following expression when used like `LineStart() + expr`, so that if there
were whitespace on the line before `expr` (which would match in versions prior
to 3.0.0), the match would fail.

3.0.0 implemented this by automatically promoting `LineStart() + expr` to
`AtLineStart(expr)`, which broke existing parsers that did not expect `expr` to
necessarily be right at the start of the line, but only be the first token
found on the line. This was reported as a regression in Issue 317.

In 3.0.2, pyparsing reverts to the previous behavior, but retains the new
`AtLineStart` and `AtStringStart` expression classes, so that parsers can choose
whichever behavior applies in their specific instance. Specifically:

# matches expr if it is the first token on the line
# (allows for leading whitespace)
LineStart() + expr

# matches only if expr is found in column 1
AtLineStart(expr)
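
A minimal sketch of the difference described above, assuming pyparsing 3.0.2 or later is installed. The `word` expression and the sample strings are illustrative only and are not part of the release notes.

```python
import pyparsing as pp

word = pp.Word(pp.alphas)

# First token on the line: leading whitespace is skipped before `word`.
first_token = pp.LineStart() + word
# Must begin in column 1: no whitespace is skipped before `word`.
column_one = pp.AtLineStart(word)

indented = "   hello"

first_result = first_token.parse_string(indented).as_list()

try:
    column_one.parse_string(indented)
    column_one_matched = True
except pp.ParseException:
    column_one_matched = False

# With no leading whitespace, the column-1 form matches as well.
flush_result = column_one.parse_string("hello").as_list()
```

Parsers written before 3.0.0 that rely on "first token on the line" semantics can keep using `LineStart() + expr`; `AtLineStart(expr)` is the opt-in for the stricter column-1 check.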

- Performance enhancement to `one_of` to always generate an internal `Regex`,
even if `caseless` or `as_keyword` args are given as `True` (unless explicitly
disabled by passing `use_regex=False`).
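
As a rough illustration of `as_keyword=True` (the keyword list and inputs are made up for this sketch), a keyword-style `one_of` matches whole words only, whether or not the internal `Regex` is used:

```python
import pyparsing as pp

# as_keyword=True requires a word break after the matched symbol.
keyword = pp.one_of("if else while", as_keyword=True)

matched = keyword.parse_string("while").as_list()

try:
    # "whiles" merely starts with a keyword, so it should be rejected.
    keyword.parse_string("whiles")
    partial_matched = True
except pp.ParseException:
    partial_matched = False
```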

- `IndentedBlock` class now works with `recursive` flag. By default, the
results parsed by an `IndentedBlock` are grouped. This can be disabled by constructing
the `IndentedBlock` with `grouped=False`.

3.0.1

-----------------------------
- Fixed bug where `Word(max=n)` did not match word groups less than length 'n'.
Thanks to Joachim Metz for catching this!
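
A small sketch of the fixed behavior, assuming pyparsing 3.0.1 or later; the limit of 3 and the inputs are arbitrary:

```python
import pyparsing as pp

# max=3 is an upper bound, so shorter words should match too.
short_word = pp.Word(pp.alphas, max=3)

two_chars = short_word.parse_string("ab").as_list()
three_chars = short_word.parse_string("abc").as_list()
```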

- Fixed bug where `ParseResults` accidentally created recursive contents.
Joachim Metz on this one also!

- Fixed bug where `warn_on_multiple_string_args_to_oneof` warning is raised
even when not enabled.

3.0.0

-----------------------------
- A consolidated list of all the changes in the 3.0.0 release can be found in
`docs/whats_new_in_3_0_0.rst`.
(https://github.com/pyparsing/pyparsing/blob/master/docs/whats_new_in_3_0_0.rst)

Version 3.0.0.final - October, 2021
-----------------------------------
- Added support for Python `-W` warning option to call `enable_all_warnings()` at startup.
Also detects setting of `PYPARSINGENABLEALLWARNINGS` environment variable to any non-blank
value. (If using `-Wd` for testing, but wishing to disable pyparsing warnings, add
`-Wi:::pyparsing`.)

- Fixed named results returned by `url` to match fields as they would be parsed
using `urllib.parse.urlparse`.
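
The sketch below compares some of the named results with `urllib.parse.urlparse`. The results names used here (`scheme`, `host`, `port`, `path`, `query`, `fragment`) are an assumption based on recent pyparsing 3.x releases, and the sample URL is invented for illustration.

```python
from urllib.parse import urlparse

import pyparsing as pp

sample_url = (
    "https://bob:secret@www.example.com:8080/path/to/resource?filter=int#book-mark"
)

parsed = pp.pyparsing_common.url.parse_string(sample_url)
reference = urlparse(sample_url)

# Collect the fields side by side; pyparsing returns the port as a string.
comparison = {
    "scheme": (parsed["scheme"], reference.scheme),
    "host": (parsed["host"], reference.hostname),
    "port": (parsed["port"], str(reference.port)),
    "path": (parsed["path"], reference.path),
    "query": (parsed["query"], reference.query),
    "fragment": (parsed["fragment"], reference.fragment),
}
```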

- Early response to `with_line_numbers` was positive, with some requested enhancements:
. added a trailing "|" at the end of each line (to show presence of trailing spaces);
can be customized using the `eol_mark` argument
. added `expand_tabs` argument, to control calling `str.expandtabs` (defaults to True
to match `parseString`)
. added `mark_spaces` argument to support display of a printing character in place of
spaces, or Unicode symbols for space and tab characters
. added `mark_control` argument to support highlighting of control characters using
'.' or Unicode symbols, such as "␍" and "␊".
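
A brief sketch of two of these arguments, assuming a current pyparsing 3.x release; the sample text and the marker characters are arbitrary choices:

```python
import pyparsing as pp

ppt = pp.pyparsing_test

text = "hello world\nbye"

# Use "$" as the end-of-line mark and show each space as ".".
marked = ppt.with_line_numbers(text, eol_mark="$", mark_spaces=".")
print(marked)
```

Each numbered line in the result should end with the chosen `eol_mark`, which makes trailing whitespace visible.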

- Modified helpers `common_html_entity` and `replace_html_entity()` to use the HTML
entity definitions from `html.entities.html5`.

- Updated the class diagram in the pyparsing docs directory, along with the supporting
.puml file (PlantUML markup) used to create the diagram.

- Added global method `autoname_elements()` to call `set_name()` on all locally
defined `ParserElements` that haven't been explicitly named using `set_name()`, using
their local variable name. Useful for setting names on multiple elements when
creating a railroad diagram.

import pyparsing as pp

a = pp.Literal("a")
b = pp.Literal("b").set_name("bbb")
pp.autoname_elements()

`a` will get named "a", while `b` will keep its name "bbb".

3.0.0rc2

--------------------------------
- Added `url` expression to `pyparsing_common`. (Sample code posted by Wolfgang Fahl,
very nice!)

This new expression has been added to the `urlExtractorNew.py` example, to show how
it extracts URL fields into separate results names.

- Added method to `pyparsing_test` to help debugging, `with_line_numbers`.
Returns a string with line and column numbers corresponding to values shown
when parsing with expr.set_debug():

import pyparsing as pp
ppt = pp.pyparsing_test

data = """\
   A
      100"""
expr = pp.Word(pp.alphanums).set_name("word").set_debug()
print(ppt.with_line_numbers(data))
expr[...].parseString(data)

prints:

           1
  1234567890
1:   A
2:      100
Match word at loc 3(1,4)
   A
   ^
Matched word -> ['A']
Match word at loc 11(2,7)
      100
      ^
Matched word -> ['100']

- Added new example `cuneiform_python.py` to demonstrate creating a new Unicode
range, and writing a Cuneiform->Python transformer (inspired by zhpy).

- Fixed issue 272, reported by PhasecoreX, when `LineStart()` expressions would match
input text that was not necessarily at the beginning of a line.

As part of this fix, two new classes have been added: `AtLineStart` and `AtStringStart`.
The following expressions are equivalent:

LineStart() + expr and AtLineStart(expr)
StringStart() + expr and AtStringStart(expr)

[`LineStart` and `StringStart` changes reverted in 3.0.2.]

- Fixed `ParseFatalExceptions` failing to override normal exceptions or expression
matches in `MatchFirst` expressions. Addresses issue 251, reported by zyp-rgb.

- Fixed bug in which `ParseResults` replaces a collection type value with an invalid
type annotation (as a result of changed behavior in Python 3.9). Addresses issue 276, reported by
Rob Shuler, thanks.

- Fixed bug in `ParseResults` when calling `__getattr__` for special double-underscored
methods. Now raises `AttributeError` for non-existent results when accessing a
name starting with '__'. Addresses issue 208, reported by Joachim Metz.
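
To illustrate the distinction (the results name `word` and the attribute names are made up for this sketch): an undefined ordinary results name still returns an empty string, while an undefined double-underscored name raises `AttributeError`, which `hasattr` reports as `False`.

```python
import pyparsing as pp

result = pp.Word(pp.alphas)("word").parse_string("abc")

defined_name = result.word          # a results name that exists
missing_name = result.no_such_name  # undefined ordinary name: empty string

# Undefined dunder names raise AttributeError, so hasattr() is False.
has_dunder = hasattr(result, "__no_such_dunder__")
```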

- Modified debug fail messages to include the expression name to make it easier to sync
up match vs success/fail debug messages.

3.0.0rc1

----------------------------------
- Railroad diagrams have been reformatted:
. creating diagrams is easier - call

expr.create_diagram("diagram_output.html")

create_diagram() takes 3 arguments:
. the filename to write the diagram HTML
. optional 'vertical' argument, to specify the minimum number of items in a path
to be shown vertically; default=3
. optional 'show_results_names' argument, to specify whether results name
annotations should be shown; default=False
. every expression that gets a name using `setName()` gets separated out as
a separate subdiagram
. results names can be shown as annotations to diagram items
. `Each`, `FollowedBy`, and `PrecededBy` elements get [ALL], [LOOKAHEAD], and [LOOKBEHIND]
annotations
. removed annotations for Suppress elements
. some diagram cleanup when a grammar contains Forward elements
. check out the examples make_diagram.py and railroad_diagram_demo.py

- Type annotations have been added to most public API methods and classes.

- Better exception messages to show full word where an exception occurred.

Word(alphas, alphanums)[...].parseString("ab1 123", parseAll=True)

Was:
pyparsing.ParseException: Expected end of text, found '1' (at char 4), (line:1, col:5)
Now:
pyparsing.exceptions.ParseException: Expected end of text, found '123' (at char 4), (line:1, col:5)

- Suppress can be used to suppress text skipped using "...".

from pyparsing import Keyword, Suppress

source = "lead in START relevant text END trailing text"
start_marker = Keyword("START")
end_marker = Keyword("END")
find_body = Suppress(...) + start_marker + ... + end_marker
print(find_body.parseString(source).dump())

Prints:

['START', 'relevant text ', 'END']
- _skipped: ['relevant text ']

- Added new module-level string constants `identchars` and `identbodychars` to help
in defining identifier `Word` expressions.

Instead of writing::

import pyparsing as pp
identifier = pp.Word(pp.alphas + "_", pp.alphanums + "_")

you will be able to write::

identifier = pp.Word(pp.identchars, pp.identbodychars)

Those constants have also been added to all the Unicode string classes::

import pyparsing as pp
ppu = pp.pyparsing_unicode

cjk_identifier = pp.Word(ppu.CJK.identchars, ppu.CJK.identbodychars)
greek_identifier = pp.Word(ppu.Greek.identchars, ppu.Greek.identbodychars)

- Added a caseless parameter to the `CloseMatch` class to allow for casing to be
ignored when checking for close matches. (Issue 281) (PR by Adrian Edwards, thanks!)
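
A minimal sketch of the `caseless` parameter; the target sequence and the sample input are illustrative, and the default of one allowed mismatch is assumed:

```python
import pyparsing as pp

target = "ATCATCGAATGGA"
strict = pp.CloseMatch(target)                  # case-sensitive
relaxed = pp.CloseMatch(target, caseless=True)  # ignores letter case

# Lower-case input with "x" in place of "t" at index 9.
sample = "atcatcgaaxgga"

result = relaxed.parse_string(sample)
mismatch_positions = list(result["mismatches"])

try:
    # Without caseless, every lower-case letter counts as a mismatch.
    strict.parse_string(sample)
    strict_matched = True
except pp.ParseException:
    strict_matched = False
```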

- Fixed bug in Located class when used with a results name. (Issue 294)

- Fixed bug in `QuotedString` class when the escaped quote string is not a
repeated character. (Issue 263)

- `parseFile()` and `create_diagram()` methods now will accept `pathlib.Path`
arguments.
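
For example, a `pathlib.Path` can be passed directly instead of a string filename. The file name and contents below are invented for this sketch, and the snake_case name `parse_file()` is used in place of `parseFile()`:

```python
import tempfile
from pathlib import Path

import pyparsing as pp

words = pp.Word(pp.alphas)[...]

with tempfile.TemporaryDirectory() as tmp_dir:
    source_path = Path(tmp_dir) / "words.txt"
    source_path.write_text("hello world\n", encoding="utf-8")

    # Pass the Path object itself; no str() conversion is needed.
    parsed_words = words.parse_file(source_path).as_list()
```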
