tidy::parseString()
(PHP 5, PHP 7, PECL tidy >= 0.5.2)
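Parse a document stored in a string
Description
Object-oriented style
public tidy::parseString(string $input [, mixed $config [, string $encoding ]]): bool
Procedural style
tidy_parse_string(string $input [, mixed $config [, string $encoding ]]): tidy
Parameters
- $input
The data to be parsed.
- $config
The configuration can be passed either as an array or as a string. If a string is passed, it is interpreted as the name of a configuration file; otherwise, it is interpreted as the options themselves.
- $encoding
The encoding parameter sets the encoding for input/output documents. The possible values for encoding are: ascii, latin0, latin1, raw, utf8, iso2022, mac, win1252, ibm858, utf16, utf16le, utf16be, big5, and shiftjis.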
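The two forms of $config can be compared side by side. The following is a minimal sketch, not a definitive recipe: the input string, the option choices, and the temporary config file are assumptions made for illustration, and the file uses Tidy's "option: value" config syntax.

```php
<?php
// Sample input: an unclosed <p> and an HTML-style <br>.
$input = '<p>first line<br>second line';

// 1) Options passed directly as an array.
$options = ['show-body-only' => true, 'output-xhtml' => true];
$fromArray = tidy_parse_string($input, $options, 'utf8');
$fromArray->cleanRepair();
$arrayResult = (string) $fromArray;

// 2) The same options read from a config file. The file is a throwaway
//    temp file created only for this sketch; any readable path would do.
$configFile = tempnam(sys_get_temp_dir(), 'tidy');
file_put_contents($configFile, "show-body-only: yes\noutput-xhtml: yes\n");
$fromFile = tidy_parse_string($input, $configFile, 'utf8');
$fromFile->cleanRepair();
$fileResult = (string) $fromFile;
unlink($configFile);

echo $arrayResult, PHP_EOL;
echo $fileResult, PHP_EOL;
```

Both calls should yield the same repaired fragment, since only the way the options are delivered differs.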
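Return Values
As the signatures in the description indicate, tidy::parseString() returns a bool, while tidy_parse_string() returns a tidy object.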
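The difference between the two return types shows up in how the result is checked. The sketch below assumes a small sample input and the show-body-only option, both chosen only for illustration: the method reports success as a bool on an existing object, whereas the function hands back the object itself.

```php
<?php
$input  = '<title>demo</title><p>unclosed paragraph';
$config = ['show-body-only' => true];

// Object-oriented style: create the object first, then parse into it.
// The bool return value says whether parsing succeeded.
$tidy = new tidy();
$ok = $tidy->parseString($input, $config, 'utf8');
$ooResult = null;
if ($ok) {
    $tidy->cleanRepair();
    $ooResult = (string) $tidy;
}

// Procedural style: the function returns the tidy object to work with.
$doc = tidy_parse_string($input, $config, 'utf8');
$procResult = null;
if ($doc instanceof tidy) {
    tidy_clean_repair($doc);
    $procResult = (string) $doc;
}

echo $ooResult, PHP_EOL;
echo $procResult, PHP_EOL;
```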
Examples
<?php
// Capture the malformed HTML below in an output buffer.
ob_start();
?>

<html>
  <head>
    <title>test</title>
  </head>
  <body>
    <p>error<br>another line</i>
  </body>
</html>

<?php
$buffer = ob_get_clean();

// Indent the output, emit XHTML, and wrap lines at 200 characters.
$config = array('indent' => TRUE, 'output-xhtml' => TRUE, 'wrap' => 200);

$tidy = tidy_parse_string($buffer, $config, 'UTF8');

// Repair the parsed markup, then print it.
$tidy->cleanRepair();
echo $tidy;
?>
The above example will output:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>
      test
    </title>
  </head>
  <body>
    <p>
      error<br />
      another line
    </p>
  </body>
</html>
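See Also
- tidy::parseFile() - Parse markup in a file or URI
- tidy::repairFile() - Repair a file and return it as a string
- tidy::repairString() - Repair a string using an optionally provided configuration file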
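For orientation, the parse-then-repair sequence used on this page can be set next to the one-call procedural counterpart of tidy::repairString(), tidy_repair_string(). This is a rough sketch under assumed sample input and options, intended only to show how the two approaches relate.

```php
<?php
$input  = '<p>one<br>two';
$config = ['show-body-only' => true, 'output-xhtml' => true];

// Two steps: parse into a tidy object, repair it, then cast it to a string.
$doc = tidy_parse_string($input, $config, 'utf8');
$doc->cleanRepair();
$twoStep = (string) $doc;

// One step: tidy_repair_string() parses, repairs, and returns a string.
$oneStep = tidy_repair_string($input, $config, 'utf8');

echo $twoStep, PHP_EOL;
echo $oneStep, PHP_EOL;
```

The two-step form keeps the tidy object around for further inspection; the one-step form is shorter when only the repaired string is needed.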
<?php
/**
 * Simpler version without pretty print config options.
 */
function tidy_html5($html, array $config = [], $encoding = 'utf8') {
    // Caller-supplied options take precedence over these defaults.
    $config += [
        'doctype'             => '<!DOCTYPE html>',
        'drop-empty-elements' => 0,
        'new-blocklevel-tags' => 'article aside audio bdi canvas details dialog figcaption figure footer header hgroup main menu menuitem nav section source summary template track video',
        'new-empty-tags'      => 'command embed keygen source track wbr',
        'new-inline-tags'     => 'audio command datalist embed keygen mark menuitem meter output progress source time video wbr',
        'tidy-mark'           => 0,
    ];
    $html = tidy_parse_string($html, $config, $encoding); // doctype not inserted
    tidy_clean_repair($html); // doctype inserted
    return $html;
}

$html = '</z><p><a href="#">Link</a></p><p><img src="logo.png"/>Seçond para</p><i class="fa"></i><p></p>';
?>

echo tidy_html5($html); outputs:
<!DOCTYPE html>
<html>
<head>
<title></title>
</head>
<body>
<p><a href="#">Link</a></p>
<p><img src="logo.png">Seçond para</p>
<i class="fa"></i>
<p></p>
</body>
</html>

echo tidy_html5($html, ['indent'=>2, 'indent-spaces'=>4]); outputs:
<!DOCTYPE html>
<html>
<head>
<title></title>
</head>
<body>
<p><a href="#">Link</a></p>
<p><img src="logo.png">Seçond para</p><i class="fa"></i>
<p></p>
</body>
</html>

echo tidy_html5($html, ['indent'=>1], 'ascii'); outputs:
<!DOCTYPE html>
<html>
<head>
<title></title>
</head>
<body>
<p>
<a href="#">Link</a>
</p>
<p>
<img src="logo.png">Seçond para
</p><i class="fa"></i>
<p></p>
</body>
</html>

echo tidy_html5($html, ['show-body-only'=>1]); outputs:
<p><a href="#">Link</a></p>
<p><img src="logo.png">Seçond para</p>
<i class="fa"></i>
<p></p>

<?php
/**
 * UTF-8 HTML5-compatible Tidy
 *
 * @param string $html
 * @param array $config
 * @param string $encoding
 */
function tidy_html5($html, array $config = [], $encoding = 'utf8') {
    // Caller-supplied options take precedence over these defaults.
    $config += [
        'clean'       => TRUE,
        'doctype'     => 'omit',
        'indent'      => 2, // auto
        'output-html' => TRUE,
        'tidy-mark'   => FALSE,
        'wrap'        => 0,
        // HTML5 tags
        'new-blocklevel-tags' => 'article aside audio bdi canvas details dialog figcaption figure footer header hgroup main menu menuitem nav section source summary template track video',
        'new-empty-tags'      => 'command embed keygen source track wbr',
        'new-inline-tags'     => 'audio command datalist embed keygen mark menuitem meter output progress source time video wbr',
    ];
    $html = tidy_parse_string($html, $config, $encoding);
    tidy_clean_repair($html);
    // Tidy is told to omit the doctype, so the HTML5 doctype is prepended by hand.
    return '<!DOCTYPE html>' . PHP_EOL . $html;
}

$html = '</z><p><a href="#">Link</a></p><p>Second para</p>';

echo tidy_html5($html);
?>

Output:
<!DOCTYPE html>
<html>
<head>
<title></title>
</head>
<body>
<p><a href="#">Link</a></p>
<p>Second para</p>
</body>
</html>