Html5lib

Latest version: v1.1

Safety actively analyzes 621498 Python packages for vulnerabilities to keep your Python projects secure.

Scan your dependencies

Page 1 of 4

1.1

~~~

Breaking changes:

* Drop support for Python 3.3. (358)
* Drop support for Python 3.4. (421)

Deprecations:

* Deprecate the ``html5lib`` sanitizer (``html5lib.serialize(sanitize=True)`` and
``html5lib.filters.sanitizer``). We recommend users migrate to `Bleach
<https://github.com/mozilla/bleach>`. Please let us know if Bleach doesn't suffice for your
use. (443)

Other changes:

* Try to import from ``collections.abc`` to remove DeprecationWarning and ensure
``html5lib`` keeps working in future Python versions. (403)
* Drop optional ``datrie`` dependency. (442)

1.0.1

~~~~~

Released on December 7, 2017

Breaking changes:

* Drop support for Python 2.6. (330) (Thank you, Hugo, Will Kahn-Greene!)
* Remove ``utils/spider.py`` (353) (Thank you, Jon Dufresne!)

Features:

* Improve documentation. (300, 307) (Thank you, Jon Dufresne, Tom Most,
Will Kahn-Greene!)
* Add iframe seamless boolean attribute. (Thank you, Ritwik Gupta!)
* Add itemscope as a boolean attribute. (194) (Thank you, Jonathan Vanasco!)
* Support Python 3.6. (333) (Thank you, Jon Dufresne!)
* Add CI support for Windows using AppVeyor. (Thank you, John Vandenberg!)
* Improve testing and CI and add code coverage (323, 334), (Thank you, Jon
Dufresne, John Vandenberg, Sam Sneddon, Will Kahn-Greene!)
* Semver-compliant version number.

Bug fixes:

* Add support for setuptools < 18.5 to support environment markers. (Thank you,
John Vandenberg!)
* Add explicit dependency for six >= 1.9. (Thank you, Eric Amorde!)
* Fix regexes to work with Python 3.7 regex adjustments. (318, 379) (Thank
you, Benedikt Morbach, Ville Skyttä, Mark Vasilkov!)
* Fix alphabeticalattributes filter namespace bug. (324) (Thank you, Will
Kahn-Greene!)
* Include license file in generated wheel package. (350) (Thank you, Jon
Dufresne!)
* Fix annotation-xml typo. (339) (Thank you, Will Kahn-Greene!)
* Allow uppercase hex chararcters in CSS colour check. (377) (Thank you,
Komal Dembla, Hugo!)

1.0

~~~

Released and unreleased on December 7, 2017. Badly packaged release.

1.0b3

Not secure
~~~~~

Released on July 24, 2013

* Removed ``RecursiveTreeWalker`` from ``treewalkers._base``. Any
implementation using it should be moved to
``NonRecursiveTreeWalker``, as everything bundled with html5lib has
for years.

* Fix 67 so that ``BufferedStream`` to correctly returns a bytes
object, thereby fixing any case where html5lib is passed a
non-seekable RawIOBase-like object.

1.0b2

Not secure
~~~~~

Released on June 27, 2013

* Removed reordering of attributes within the serializer. There is now
an ``alphabetical_attributes`` option which preserves the previous
behaviour through a new filter. This allows attribute order to be
preserved through html5lib if the tree builder preserves order.

* Removed ``dom2sax`` from DOM treebuilders. It has been replaced by
``treeadapters.sax.to_sax`` which is generic and supports any
treewalker; it also resolves all known bugs with ``dom2sax``.

* Fix treewalker assertions on hitting bytes strings on
Python 2. Previous to 1.0b1, treewalkers coped with mixed
bytes/unicode data on Python 2; this reintroduces this prior
behaviour on Python 2. Behaviour is unchanged on Python 3.

1.0b1

Not secure
~~~~~

Released on May 17, 2013

* Implementation updated to implement the `HTML specification
<http://www.whatwg.org/specs/web-apps/current-work/>`_ as of 5th May
2013 (`SVN <http://svn.whatwg.org/webapps/>`_ revision r7867).

* Python 3.2+ supported in a single codebase using the ``six`` library.

* Removed support for Python 2.5 and older.

* Removed the deprecated Beautiful Soup 3 treebuilder.
``beautifulsoup4`` can use ``html5lib`` as a parser instead. Note that
since it doesn't support namespaces, foreign content like SVG and
MathML is parsed incorrectly.

* Removed ``simpletree`` from the package. The default tree builder is
now ``etree`` (using the ``xml.etree.cElementTree`` implementation if
available, and ``xml.etree.ElementTree`` otherwise).

* Removed the ``XHTMLSerializer`` as it never actually guaranteed its
output was well-formed XML, and hence provided little of use.

* Removed default DOM treebuilder, so ``html5lib.treebuilders.dom`` is no
longer supported. ``html5lib.treebuilders.getTreeBuilder("dom")`` will
return the default DOM treebuilder, which uses ``xml.dom.minidom``.

* Optional heuristic character encoding detection now based on
``charade`` for Python 2.6 - 3.3 compatibility.

* Optional ``Genshi`` treewalker support fixed.

* Many bugfixes, including:

* 33: null in attribute value breaks XML AttValue;

* 4: nested, indirect descendant, <button> causes infinite loop;

* `Google Code 215
<http://code.google.com/p/html5lib/issues/detail?id=215>`_: Properly
detect seekable streams;

* `Google Code 206
<http://code.google.com/p/html5lib/issues/detail?id=206>`_: add
support for <video preload=...>, <audio preload=...>;

* `Google Code 205
<http://code.google.com/p/html5lib/issues/detail?id=205>`_: add
support for <video poster=...>;

* `Google Code 202
<http://code.google.com/p/html5lib/issues/detail?id=202>`_: Unicode
file breaks InputStream.

* Source code is now mostly PEP 8 compliant.

* Test harness has been improved and now depends on ``nose``.

* Documentation updated and moved to https://html5lib.readthedocs.io/.

Page 1 of 4

© 2024 Safety CLI Cybersecurity Inc. All Rights Reserved.