(PHP 5, PHP 7, PHP 8)
DOMDocument::loadHTML — Load HTML from a string
The function parses the HTML contained in the string source.
Unlike loading XML, HTML does not have to be well-formed to load.
This function parses the input using an HTML 4 parser. The parsing rules of HTML 5, which is what modern web browsers use, are different. Depending on the input this might result in a different DOM structure. Therefore this function cannot be safely used for sanitizing HTML.
As an example, some HTML elements will implicitly close a parent element when encountered. The rules for automatically closing parent elements differ between HTML 4 and HTML 5 and thus the resulting DOM structure that DOMDocument sees might be different from the DOM structure a web browser sees, possibly allowing an attacker to break the resulting HTML.
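One way to see which structure the parser actually built is to walk the resulting DOM and print the element nesting, then compare it with what a browser's inspector shows for the same markup. The following is a minimal sketch, not part of the DOM extension: `dumpTree()` is a hypothetical helper, and the sample markup is an arbitrary choice. No expected output is given because the result depends on the parser in use.

```php
<?php
// Hypothetical helper: return the element names under $node,
// one per line, indented by nesting depth.
function dumpTree(DOMNode $node, int $depth = 0): string
{
    $out = '';
    foreach ($node->childNodes as $child) {
        if ($child instanceof DOMElement) {
            $out .= str_repeat('  ', $depth) . $child->nodeName . "\n";
            $out .= dumpTree($child, $depth + 1);
        }
    }
    return $out;
}

// Keep parser warnings out of the output for this demonstration.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
// The <p> is never closed explicitly; where the parser places the
// <div> relative to it is the kind of detail that can differ.
$doc->loadHTML('<p>One<div>Two</div>');
libxml_clear_errors();

echo dumpTree($doc);
?>
```

If the printed nesting differs from the browser's, filtering performed on the DOMDocument tree does not describe what the browser will render.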
If an empty string is passed as the source, a warning will be generated. This warning is not generated by libxml and cannot be handled using libxml's error handling functions.
While malformed HTML should load successfully, this function may generate E_WARNING errors when it encounters bad markup. libxml's error handling functions may be used to handle these errors.
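A minimal sketch of that approach, using `libxml_use_internal_errors()`, `libxml_get_errors()` and `libxml_clear_errors()`. The sample markup and the `<bogus>` tag are made up for illustration, and whether a given input produces any errors depends on the libxml version in use.

```php
<?php
// Collect libxml errors internally instead of emitting E_WARNING.
// The call returns the previous setting so it can be restored later.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$loaded = $doc->loadHTML('<html><body><p>Unclosed<bogus>tag</body></html>');

// Each entry is a LibXMLError object with level, code, line and message.
$errors = libxml_get_errors();
foreach ($errors as $error) {
    printf("line %d: %s\n", $error->line, trim($error->message));
}

// Empty the error buffer and restore the earlier error-handling mode.
libxml_clear_errors();
libxml_use_internal_errors($previous);
?>
```

Restoring the previous setting keeps the change local, so other code that relies on the default warning behaviour is unaffected.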
Version | Description |
---|---|
8.3.0 | This function now has a tentative bool return type. |
8.0.0 | Calling this function statically will now throw an Error. Previously, an E_DEPRECATED was raised. |
Example #1 Creating a Document
<?php
$doc = new DOMDocument();
// Parse the HTML string into the document.
$doc->loadHTML("<html><body>Test<br></body></html>");
// Serialize the parsed document back to HTML.
echo $doc->saveHTML();
?>