Commit abf6224

Remove docs for HTMLTokenizer and HTMLSanitizer
HTMLTokenizer is now a private API (I cannot find a public export). HTMLSanitizer no longer exists as a tokenizer, and has been replaced with a filter.
1 parent 964d0e1 commit abf6224

1 file changed (+0, -38 lines changed)

doc/movingparts.rst

Lines changed: 0 additions & 38 deletions
@@ -169,41 +169,3 @@ the following way:
 * If all else fails, the default encoding will be used. This is usually
   `Windows-1252 <http://en.wikipedia.org/wiki/Windows-1252>`_, which is
   a common fallback used by Web browsers.
-
-
-Tokenizers
-----------
-
-The part of the parser responsible for translating a raw input stream
-into meaningful tokens is the tokenizer. Currently html5lib provides
-two.
-
-To set up a tokenizer, simply pass it when instantiating
-a :class:`~html5lib.html5parser.HTMLParser`:
-
-.. code-block:: python
-
-  import html5lib
-  from html5lib import sanitizer
-
-  p = html5lib.HTMLParser(tokenizer=sanitizer.HTMLSanitizer)
-  p.parse("<p>Surprise!<script>alert('Boo!');</script>")
-
-HTMLTokenizer
-~~~~~~~~~~~~~
-
-This is the default tokenizer, the heart of html5lib. The implementation
-can be found in `html5lib/tokenizer.py
-<https://github.com/html5lib/html5lib-python/blob/master/html5lib/tokenizer.py>`_.
-
-HTMLSanitizer
-~~~~~~~~~~~~~
-
-This is a tokenizer that removes unsafe markup and CSS styles from the
-input. Elements that are known to be safe are passed through and the
-rest is converted to visible text. The default configuration of the
-sanitizer follows the `WHATWG Sanitization Rules
-<http://wiki.whatwg.org/wiki/Sanitization_rules>`_.
-
-The implementation can be found in `html5lib/sanitizer.py
-<https://github.com/html5lib/html5lib-python/blob/master/html5lib/sanitizer.py>`_.
