Beautifulsoup4

Latest version: v4.12.3

4.1.0

* Added experimental support for fixing Windows-1252 characters
embedded in UTF-8 documents. (UnicodeDammit.detwingle())

* Fixed the handling of &quot; with the built-in parser. [bug=993871]

* Comments, processing instructions, document type declarations, and
markup declarations are now treated as preformatted strings, the way
CData blocks are. [bug=1001025]

* Fixed a bug with the lxml treebuilder that prevented the user from
adding attributes to a tag that didn't originally have
attributes. [bug=1002378] Thanks to Oliver Beattie for the patch.

* Fixed some edge-case bugs having to do with inserting an element
into a tag it's already inside, and replacing one of a tag's
children with another. [bug=997529]

* Added the ability to search for attribute values specified in UTF-8. [bug=1003974]

This caused a major refactoring of the search code. All the tests
pass, but it's possible that some searches will behave differently.
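
As an illustration of the UnicodeDammit.detwingle() entry above, here is a minimal sketch that builds a document mixing UTF-8 and Windows-1252 bytes and then normalizes it to pure UTF-8. The sample text and variable names are assumptions made for the example, not part of the changelog.

```python
from bs4 import UnicodeDammit

# Three snowmen encoded as UTF-8, followed by a quoted phrase whose
# curly quotes are encoded as Windows-1252 (bytes 0x93 and 0x94).
snowmen = "\N{SNOWMAN}" * 3
quote = "\N{LEFT DOUBLE QUOTATION MARK}I like snowmen!\N{RIGHT DOUBLE QUOTATION MARK}"
mixed = snowmen.encode("utf8") + quote.encode("windows_1252")

# detwingle() takes bytes and returns bytes, rewriting the embedded
# Windows-1252 characters as their UTF-8 equivalents.
fixed = UnicodeDammit.detwingle(mixed)

# The result now decodes cleanly as UTF-8.
text = fixed.decode("utf8")
print(text)
```

Calling detwingle() before handing the bytes to BeautifulSoup or UnicodeDammit is the usual order, since the method only repairs the byte stream and does not parse anything.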

4.0.5

* Added a new method, wrap(), which wraps an element in a tag.

* Renamed replace_with_children() to unwrap(), which is easier to
understand and also the jQuery name of the function.

* Made encoding substitution in <meta> tags completely transparent (no
more %SOUP-ENCODING%).

* Fixed a bug in decoding data that contained a byte-order mark, such
as data encoded in UTF-16LE. [bug=988980]

* Fixed a bug that made the HTMLParser treebuilder generate XML
definitions ending with two question marks instead of
one. [bug=984258]

* Upon document generation, CData objects are no longer run through
the formatter. [bug=988905]

* The test suite now passes when lxml is not installed, whether or not
html5lib is installed. [bug=987004]

* Print a warning on HTMLParseErrors to let people know they should
install a better parser library.
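
The wrap() and unwrap() entries above can be sketched as follows. The markup is an assumed example, and the built-in "html.parser" is used so that no extra parser library is needed.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<p>I wish I was bold.</p>", "html.parser")

# wrap() places the element inside the tag passed to it.
# Here the paragraph's string is wrapped in a new <b> tag.
soup.p.string.wrap(soup.new_tag("b"))
wrapped = str(soup)

# unwrap() is the inverse: it removes the tag and leaves its
# children where the tag used to be.
soup.b.unwrap()
unwrapped = str(soup)

print(wrapped)
print(unwrapped)
```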

4.0.4

* Fixed a bug that sometimes created disconnected trees.

* Fixed a bug with the string setter that moved a string around the
tree instead of copying it. [bug=983050]

* Attribute values are now run through the provided output formatter.
Previously they were always run through the 'minimal' formatter. In
the future I may make it possible to specify different formatters
for attribute values and strings, but for now, consistent behavior
is better than inconsistent behavior. [bug=980237]

* Added the missing renderContents method from Beautiful Soup 3. Also
added an encode_contents() method to go along with decode_contents().

* Give a more useful error when the user tries to run the Python 2
version of BS under Python 3.

* UnicodeDammit can now convert Microsoft smart quotes to ASCII with
UnicodeDammit(markup, smart_quotes_to="ascii").
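
A short sketch of two of the 4.0.4 entries above: converting smart quotes to ASCII, and the decode_contents()/encode_contents() pair. The sample markup is assumed, and the encoding is passed explicitly as ["windows-1252"] (something the changelog entry does not do) so that the result does not depend on which encoding-detection library happens to be installed.

```python
from bs4 import BeautifulSoup, UnicodeDammit

# Bytes 0x93, 0x94 and 0x92 are Windows-1252 smart quotes.
markup = b"<p>I just \x93love\x94 Microsoft Word\x92s smart quotes</p>"

# smart_quotes_to="ascii" replaces them with plain ASCII quotes.
dammit = UnicodeDammit(markup, ["windows-1252"], smart_quotes_to="ascii")
ascii_quoted = dammit.unicode_markup

# decode_contents() renders a tag's children as a Unicode string;
# encode_contents() renders the same children as a bytestring
# (UTF-8 unless another encoding is requested).
soup = BeautifulSoup("<div><p>caf\u00e9</p></div>", "html.parser")
as_text = soup.div.decode_contents()
as_bytes = soup.div.encode_contents()

print(ascii_quoted)
print(as_text)
print(as_bytes)
```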

4.0.3

* Fixed a typo that caused some versions of Python 3 to convert the
Beautiful Soup codebase incorrectly.

* Got rid of the 4.0.2 workaround for HTML documents--it was
unnecessary and the workaround was triggering a (possibly different,
but related) bug in lxml. [bug=972466]

4.0.2

* Worked around a possible bug in lxml that prevents non-tiny XML
documents from being parsed. [bug=963880, bug=963936]

* Fixed a bug where specifying `text` while also searching for a tag
only worked if `text` wanted an exact string match. [bug=955942]
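
The `text` fix above concerns searches that combine a tag name with a non-exact string filter, such as a regular expression. The sketch below uses assumed markup and the `string` keyword, which is the name this argument has carried since Beautiful Soup 4.4.0. On the releases listed here it was spelled `text`.

```python
import re

from bs4 import BeautifulSoup

html = (
    '<a href="/one">First link</a>'
    '<a href="/two">Second link</a>'
    "<b>First bold</b>"
)
soup = BeautifulSoup(html, "html.parser")

# Match only <a> tags whose string starts with "First".
# The <b> tag has matching text but the wrong tag name.
matches = soup.find_all("a", string=re.compile("^First"))
hrefs = [a["href"] for a in matches]

print(hrefs)
```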

4.0.1

* This is the first official release of Beautiful Soup 4. There is no