libhtml is a C library for parsing HTML that aims to conform to the HTML5 specification and be useful for parsing real-world web pages.

The library is still in the planning stage. For now it consists only of a simple test program that attempts to detect the character encoding of HTML files.
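
As a rough illustration of what encoding detection can start with, the sketch below checks a buffer for a byte order mark (BOM), the first signal the HTML5 encoding sniffing rules consider. It is not taken from libhtml's test program; the function name sniff_bom and its signature are assumptions made for this example only.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of libhtml.
 * Returns the encoding name implied by a leading byte order mark,
 * or NULL if the buffer does not start with a recognized BOM.
 */
const char *sniff_bom(const unsigned char *buf, size_t len)
{
    /* UTF-8 BOM: EF BB BF */
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return "UTF-8";

    /* UTF-16 big-endian BOM: FE FF */
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        return "UTF-16BE";

    /* UTF-16 little-endian BOM: FF FE */
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        return "UTF-16LE";

    /* No BOM: a fuller detector would go on to check transport-level
       information and prescan the document for a meta charset. */
    return NULL;
}
```

A caller would read the first few bytes of a file into a buffer and pass them in. A NULL result means only that no BOM was found, not that the encoding is unknown.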

To check out the code from the Subversion repository:

svn co libhtml

For more information, see the SourceForge project page for libhtml.