Beautifulsoup4

Latest version: v4.12.3

3.0.7

Fixed a UnicodeDecodeError when unpickling documents that contain
non-ASCII characters.

Fixed a TypeError that occurred in some circumstances when a tag
contained no text.

Jump through hoops to avoid the use of chardet, which can be extremely
slow in some circumstances. UTF-8 documents should never trigger the
use of chardet.
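
The idea of trying a cheap check before an expensive detector can be sketched roughly as follows. This is an illustration only, not Beautiful Soup's actual code: guess_encoding and the slow_detector callable are hypothetical names, with slow_detector standing in for a chardet-style fallback.

```python
def guess_encoding(data, slow_detector):
    """Return an encoding name for `data` (bytes).

    `slow_detector` is a hypothetical callable standing in for an
    expensive detector such as chardet. It is only called when the
    cheap UTF-8 check fails.
    """
    try:
        data.decode("utf-8")  # strict decode succeeds only for valid UTF-8
        return "utf-8"
    except UnicodeDecodeError:
        return slow_detector(data)


calls = []


def fake_detector(data):
    # Records each call so we can see when the slow path is taken.
    calls.append(data)
    return "windows-1252"


utf8_result = guess_encoding("caf\u00e9".encode("utf-8"), fake_detector)
fallback_result = guess_encoding(b"caf\xe9", fake_detector)
```

In this sketch the valid UTF-8 input never reaches the detector; only the second input, which is not valid UTF-8, does.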

Whitespace is preserved inside <pre> and <textarea> tags that contain
nothing but whitespace.

Beautiful Soup can now parse a doctype that's scoped to an XML namespace.

3.0.7a

Added an import that makes BS work in Python 2.3.

3.0.6

Got rid of a very old debug line that prevented chardet from working.

Added a Tag.decompose() method that completely disconnects a tree or a
subset of a tree, breaking it up into bite-sized pieces that are
easy for the garbage collector to collect.
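
What "disconnecting a tree" means can be shown with a toy example. The Node class below is a hypothetical stand-in for a parsed tag, not Beautiful Soup's Tag, and its decompose() is only a sketch of the general technique: clearing parent and child references so that no reference cycles are left behind.

```python
class Node:
    """Toy tree node; a stand-in for a parsed tag, not Beautiful Soup's Tag."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def decompose(self):
        """Detach this node from its parent and break every link below it."""
        if self.parent is not None:
            self.parent.children.remove(self)
        # Walk the subtree iteratively, clearing parent/child references so
        # that no reference cycles remain among the detached nodes.
        stack = [self]
        while stack:
            node = stack.pop()
            stack.extend(node.children)
            node.children = []
            node.parent = None


root = Node("html")
body = Node("body", root)
para = Node("p", body)
body.decompose()
```

After the call, root no longer refers to body, and neither body nor para refers to any other node.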

Tag.extract() now returns the tag that was extracted.

Tag.findNext() now does something with the keyword arguments you pass
it instead of dropping them on the floor.

Fixed a Unicode conversion bug.

Fixed a bug that garbled some <meta> tags when rewriting them.

3.0.5

Soup objects can now be pickled, and copied with copy.deepcopy.

Tag.append now works properly on existing BS objects. (It wasn't
originally intended for outside use, but it can be now.) (Giles
Radford)

Passing in a nonexistent encoding will no longer crash the parser on
Python 2.4 (John Nagle).
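
One way to tolerate a bogus encoding name is to treat it like any other failed candidate. The sketch below uses only standard-library behavior (bytes.decode raises LookupError for an unknown encoding); try_decode is a hypothetical helper, not part of Beautiful Soup.

```python
def try_decode(data, candidate_encodings):
    """Decode `data` with the first candidate that is both known and valid.

    Returns (text, encoding), or (None, None) if no candidate works.
    """
    for name in candidate_encodings:
        try:
            return data.decode(name), name
        except (LookupError, UnicodeDecodeError):
            # LookupError: the encoding name is unknown.
            # UnicodeDecodeError: the bytes are not valid in that encoding.
            continue
    return None, None


text, used = try_decode(b"caf\xc3\xa9", ["no-such-encoding", "utf-8"])
```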

Fixed an underlying bug in SGMLParser, which treated character codes
up to 255, instead of 127, as ASCII (John Nagle).

Entities are converted more consistently to Unicode characters.

Entity references in attribute values are now converted to Unicode
characters when appropriate. Numeric entities are always converted,
because SGMLParser always converts them outside of attribute values.
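
For readers on a modern Python, the kind of conversion described here can be illustrated with the standard library's html.unescape. This is an analogue for illustration, not the mechanism Beautiful Soup 3.0.5 uses; the sample strings are made up.

```python
import html

# Named, decimal, and hexadecimal references all become the same character.
named = html.unescape("caf&eacute;")
decimal = html.unescape("caf&#233;")
hexadecimal = html.unescape("caf&#xe9;")

# An entity reference inside an attribute value, such as an href.
attr_value = html.unescape("/search?q=a&amp;page=2")
```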

ALL_ENTITIES happens to just be the XHTML entities, so I renamed it to
XHTML_ENTITIES.

The regular expression for bare ampersands was too loose. In some
cases ampersands were not being escaped. (Sam Ruby?)
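
A stricter pattern for bare ampersands might look like the sketch below. The regular expression is an assumption written for illustration, not the one shipped in Beautiful Soup: it treats an ampersand as bare unless it begins a named, decimal, or hexadecimal entity reference that ends in a semicolon.

```python
import re

# An ampersand is "bare" when it does not start a named (&amp;),
# decimal (&#38;), or hexadecimal (&#x26;) entity reference.
BARE_AMPERSAND = re.compile(r"&(?!#\d+;|#[xX][0-9a-fA-F]+;|\w+;)")


def escape_bare_ampersands(text):
    """Escape only the ampersands that are not part of an entity reference."""
    return BARE_AMPERSAND.sub("&amp;", text)


escaped = escape_bare_ampersands("AT&T &amp; &#38; &x")
```

Here the ampersands in "AT&T" and "&x" are escaped, while the existing "&amp;" and "&#38;" references are left alone.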

Non-breaking spaces and other special Unicode space characters are no
longer folded to ASCII spaces. (Robert Leftwich)
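
The distinction can be sketched with a whitespace-collapsing helper that only touches ASCII whitespace. collapse_ascii_whitespace is a hypothetical function for illustration, not Beautiful Soup's code. Note that Python's \s and str.split() do treat U+00A0 as whitespace on text strings, so the character class is spelled out explicitly.

```python
import re

# Only ASCII whitespace; U+00A0 (non-breaking space) is deliberately excluded.
ASCII_WHITESPACE = re.compile(r"[ \t\n\r\f]+")


def collapse_ascii_whitespace(text):
    """Collapse runs of ASCII whitespace, leaving special spaces untouched."""
    return ASCII_WHITESPACE.sub(" ", text)


collapsed = collapse_ascii_whitespace("a \n\t b\u00a0c")
```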

Information inside a TEXTAREA tag is now parsed literally, not as HTML
tags. TEXTAREA now works exactly the same way as SCRIPT. (Zephyr Fang)

3.0.4

Fixed a bug that crashed Unicode conversion in some cases.

Fixed a bug that prevented UnicodeDammit from being used as a
general-purpose data scrubber.

Fixed some unit test failures when running against Python 2.5.

When considering whether to convert smart quotes, UnicodeDammit now
looks at the original encoding in a case-insensitive way.
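
A case-insensitive check of this kind can be sketched as follows. The function name and the set of encodings are assumptions chosen for illustration; they are not taken from this changelog.

```python
# Illustrative set of encodings whose documents may contain smart quotes.
SMART_QUOTE_ENCODINGS = {"windows-1252", "iso-8859-1", "iso-8859-2"}


def may_contain_smart_quotes(original_encoding):
    """Compare the encoding name case-insensitively; None never matches."""
    if original_encoding is None:
        return False
    return original_encoding.lower() in SMART_QUOTE_ENCODINGS


mixed_case = may_contain_smart_quotes("Windows-1252")
upper_case = may_contain_smart_quotes("ISO-8859-1")
other = may_contain_smart_quotes("UTF-8")
```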

3.0.3

Beautiful Soup is now usable as a way to clean up invalid XML/HTML (be
sure to pass in an appropriate value for convertEntities, or XML/HTML
entities might stick around that aren't valid in HTML/XML). The result
may not validate, but it should be good enough to not choke a
real-world XML parser. Specifically, the output of a properly
constructed soup object should always be valid as part of an XML
document, but parts may be missing if they were missing in the
original. As always, if the input is valid XML, the output will also
be valid.
