Skip to content

html5lib/html5lib-php

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

html5lib - php flavour

This is an implementation of the tokenization and tree-building parts
of the HTML5 specification in PHP.  Potential uses of this library
can be found in web-scrapers and HTML filters.

Warning: This is a pre-alpha release, and as such, certain parts of
this code are not up-to-snuff (e.g. error reporting and performance).
However, the code is very close to spec and passes 100% of tests
not related to parse errors.  Nevertheless, expect to have to update
your code on the next upgrade.


Usage notes:

    <?php
    require_once '/path/to/HTML5/Parser.php';
    $dom = HTML5_Parser::parse('<html><body>...');
    $nodelist = HTML5_Parser::parseFragment('<b>Boo</b><br>');
    $nodelist = HTML5_Parser::parseFragment('<td>Bar</td>', 'table');


Documentation:

HTML5_Parser::parse($text)
    $text  : HTML to parse
    return : DOMDocument of parsed document

HTML5_Parser::parseFragment($text, $context)
    $text    : HTML to parse
    $context : String name of context element
    return   : DOMDocument of parsed document


Developer notes:

* To setup unit tests, you need to add a small stub file test-settings.php
  that contains $simpletest_location = 'path/to/simpletest/'; This needs to
  be version 1.1 (or, until that is released, SVN trunk) of SimpleTest.

* We don't want to ultimately use PHP's DOM because it is not tolerant
  of certain types of errors that HTML 5 allows (for example, an element
  "foo@bar"). But the current implementation uses it, since it's easy.
  Eventually, this html5lib implementation will get a version of SimpleTree;
  and may possibly start using that by default.

    vim: et sw=4 sts=4

About

PHP port of html5lib, currently unmaintained.

Resources

Stars

Watchers

Forks

Packages

No packages published

Languages

pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy