tidy::repairString()
(PHP 5, PHP 7, PECL tidy >= 0.7.0)
Repair a string using an optionally provided configuration file
Description
Object-oriented style
public tidy::repairString(string $data[, mixed $config[, string $encoding]]): string
Procedural style
tidy_repair_string(string $data[, mixed $config[, string $encoding]]): string
Parameters
- $data
The data to be repaired.
- $config
The config $config can be passed either as an array or as a string. If a string is passed, it is interpreted as the name of the configuration file; otherwise, it is interpreted as the options themselves.
Check http://tidy.sourceforge.net/docs/quickref.html for an explanation of each option.
- $encoding
The $encoding parameter sets the encoding for input/output documents. The possible values for encoding are: ascii, latin0, latin1, raw, utf8, iso2022, mac, win1252, ibm858, utf16, utf16le, utf16be, big5, and shiftjis.
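As a minimal sketch of the two optional parameters, the call below passes the options as an array and sets the encoding to utf8. The input markup is made up for illustration, and the option names (indent, output-xhtml, show-body-only) are taken from the Tidy quick reference; the exact output depends on the installed libtidy version.

```php
<?php
// This sketch needs the tidy extension; stop early if it is missing.
if (!extension_loaded('tidy')) {
    exit("The tidy extension is not loaded.\n");
}

// Hypothetical input with an unclosed <b> and an unclosed <p>.
$html = '<p>unclosed paragraph <b>bold text';

// Options passed as an array. A string here would instead be
// interpreted as the name of a configuration file.
$config = [
    'indent'         => true,
    'output-xhtml'   => true,
    'show-body-only' => true,
];

// The third argument sets the input/output encoding.
$clean = tidy_repair_string($html, $config, 'utf8');

echo $clean, "\n";
```

The same call works in object-oriented style as (new tidy())->repairString($html, $config, 'utf8').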
Return Values
Returns the repaired string.
Examples
tidy::repairString() example
<?php
ob_start();
?>

<html>
  <head>
    <title>test</title>
  </head>
  <body>
    <p>error</i>
  </body>
</html>

<?php

$buffer = ob_get_clean();
$tidy = new tidy();
$clean = $tidy->repairString($buffer);

echo $clean;
?>
The above example will output:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>
<head>
<title>test</title>
</head>
<body>
<p>error</p>
</body>
</html>
See Also
- tidy::parseFile() - Parse markup in file or URI
- tidy::parseString() - Parse a document stored in a string
- tidy::repairFile() - Repair a file and return it as a string
You can also use this function to repair XML, for example if stray ampersands or similar characters are breaking it:
<?php
$xml = tidy_repair_string($xml, array(
    'output-xml' => true,
    'input-xml' => true
));
?>
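To make the XML case concrete, here is a small sketch with a made-up document containing one unescaped ampersand. The sample markup and the variable names are assumptions for illustration only, and the repaired text may differ between libtidy versions, so inspect the output rather than relying on a fixed result.

```php
<?php
// This sketch needs the tidy extension; stop early if it is missing.
if (!extension_loaded('tidy')) {
    exit("The tidy extension is not loaded.\n");
}

// Hypothetical XML with a stray "&" that a strict XML parser would reject.
$broken = '<?xml version="1.0"?><items><item>Fish & Chips</item></items>';

// Treat both input and output as XML rather than HTML.
$fixed = tidy_repair_string($broken, [
    'input-xml'  => true,
    'output-xml' => true,
], 'utf8');

// Print both versions so the change to the ampersand can be compared.
echo "Before: ", $broken, "\n";
echo "After:  ", $fixed, "\n";
```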
Tidy makes it simple to repair a broken ods/odt document. I wrote the following script to be run from the command line:
<?php
$zip = new ZipArchive();
// ZipArchive::open() returns true on success and an error code otherwise,
// so compare strictly against true.
if ($zip->open($argv[1]) === true) {
    $fp = $zip->getStream('content.xml'); // file inside the archive
    if (!$fp) {
        die("Error: can't get stream to document file");
    }
    $buf = ''; // file buffer
    ob_start(); // swallow any CRC error message printed while reading
    while (!feof($fp)) {
        $buf .= fread($fp, 2048);
    }
    ob_end_clean();
    fclose($fp);
    $zip->close();

    $config = array(
        'indent' => true,
        'clean' => true,
        'input-xml' => true,
        'output-xml' => true,
        'wrap' => false
    );
    $tidy = new tidy();
    $xml = $tidy->repairString($buf, $config);

    // split() was removed in PHP 7; explode() is enough for a literal "\n".
    $lines = explode("\n", $xml);
    $file = tempnam('/tmp', 'xml');
    $fp = fopen($file, 'w'); // the temporary file only needs to be written
    foreach ($lines as $key => $value) {
        fwrite($fp, trim($value));
        if ($key == 0) {
            // keep the first line (normally the XML declaration) on its own line
            fwrite($fp, "\n");
        }
    }
    fclose($fp);

    // Replace content.xml in the archive with the repaired copy.
    if ($zip->open($argv[1]) === true) {
        $zip->deleteName('content.xml');
        $zip->addFile($file, 'content.xml');
        $zip->close();
        echo 'recovery complete';
    } else {
        echo 'recovery failed';
    }
    unlink($file);
}
?>
Save it to a file called fixdoc and invoke it as:
php fixdoc yourbrokendoc
For your safety, please work on a copy of your document.
The docs referenced at http://tidy.sourceforge.net/docs/quickref.html above state that the configuration option 'sort-attributes' is an enumeration of 'none' and 'alpha', which implies that either string is an acceptable value. This may not be the case, however: on my system, the option was not honored until I set it to true. The same may apply to other options, so experiment a bit. The output of tidy::getConfig() may be useful in this regard.
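One way to experiment is to parse a small document with the option set and then look at what tidy::getConfig() reports for it. The sketch below assumes the installed libtidy knows the 'sort-attributes' option; the sample markup is made up, and the reported value and type will vary between systems.

```php
<?php
// This sketch needs the tidy extension; stop early if it is missing.
if (!extension_loaded('tidy')) {
    exit("The tidy extension is not loaded.\n");
}

// Hypothetical markup with attributes in non-alphabetical order.
$html = '<p title="t" class="c">text</p>';

$tidy = new tidy();
// Set the option the way the note above reports as working (boolean true).
$tidy->parseString($html, ['sort-attributes' => true], 'utf8');

// getConfig() returns the current options as an associative array.
$config = $tidy->getConfig();

if (array_key_exists('sort-attributes', $config)) {
    // Show both the stored value and its PHP type.
    echo 'sort-attributes = ';
    var_dump($config['sort-attributes']);
} else {
    echo "This libtidy build does not report a sort-attributes option.\n";
}
```

Repeating the run with 'alpha' instead of true and comparing the two dumps shows which form the local build actually honors.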