htmlspecialchars_decode()
(PHP 5 >= 5.1.0, PHP 7)
将特殊的 HTML 实体转换回普通字符
说明
htmlspecialchars_decode(string $string[,int $flags= ENT_COMPAT | ENT_HTML401] ): string
此函数的作用和htmlspecialchars()刚好相反。它将特殊的HTML实体转换回普通字符。
被转换的实体有:&,"(没有设置ENT_NOQUOTES
时),'(设置了ENT_QUOTES
时),<以及>。
参数
- $string
要解码的字符串
- $flags
用下列标记中的一个或多个作为一个位掩码,来指定如何处理引号和使用哪种文档类型。默认为ENT_COMPAT | ENT_HTML401。
有效的$flags常量 常量名 说明 ENT_COMPAT
转换双引号,不转换单引号。 ENT_QUOTES
单引号和双引号都转换。 ENT_NOQUOTES
单引号和双引号都不转换。 ENT_HTML401
作为HTML 4.01编码处理。 ENT_XML1
作为XML 1编码处理。 ENT_XHTML
作为XHTML编码处理。 ENT_HTML5
作为HTML 5编码处理。
返回值
返回解码后的字符串。
更新日志
版本 | 说明 |
---|---|
5.4.0 | 增加了ENT_HTML401 、ENT_XML1 、ENT_XHTML 和ENT_HTML5 等常量。 |
范例
一个htmlspecialchars_decode()的例子
<?php $str = "<p>this -> "</p>\n"; echo htmlspecialchars_decode($str); // 注意,这里的引号不会被转换 echo htmlspecialchars_decode($str, ENT_NOQUOTES); ?>
以上例程会输出:
<p>this -> "</p> <p>this -> "</p>
参见
htmlspecialchars()
将特殊字符转换为 HTML 实体html_entity_decode()
Convert HTML entities to their corresponding charactersget_html_translation_table()
返回使用 htmlspecialchars 和 htmlentities 后的转换表
This should be the best way to do it. (Reposted because the other one seems a bit slower and because those who used the code under called it htmlspecialchars_decode_php4) <?php if ( !function_exists('htmlspecialchars_decode') ) { function htmlspecialchars_decode($text) { return strtr($text, array_flip(get_html_translation_table(HTML_SPECIALCHARS))); } } ?>
The example for "htmlspecialchars_decode()" below sadly does not work for all PHP4 versions. Quote from the PHP manual: "get_html_translation_table() will return the translation table that is used internally for htmlspecialchars() and htmlentities()." But it does NOT! At least not for PHP version 4.4.2. This was already reported in a bug report (http://bugs.php.net/bug.php?id=25927), but it was marked as BOGUS. Proof: Code: -------------------- <?php var_dump(get_html_translation_table(HTML_SPECIALCHARS,ENT_QUOTES)); var_dump(htmlspecialchars('\'',ENT_QUOTES)); ?> -------------------- Output: -------------------- array '"' => '"' ''' => ''' '<' => '<' '>' => '>' '&' => '&' ''' -------------------- This comment now is not to report this bug again (though I really believe it is one), but to complete the example and warn people of this pitfall. To make sure your htmlspecialchars_decode fake for PHP4 works, you should do something like this: <?php function htmlspecialchars_decode($string,$style=ENT_COMPAT) { $translation = array_flip(get_html_translation_table(HTML_SPECIALCHARS,$style)); if($style === ENT_QUOTES){ $translation['''] = '\''; } return strtr($string,$translation); } ?> Br, Thomas
that works also with ä and " and so on. get_html_translation_table(HTML_ENTITIES) => offers more characters than HTML_SPECIALCHARS function htmlspecialchars_decode_PHP4($uSTR) { return strtr($uSTR, array_flip(get_html_translation_table(HTML_ENTITIES, ENT_QUOTES))); }
If you use `htmlspecialchars()` to change things like the ampersand (&) into it's HTML equivalent (&), you might run into a situation where you mistakenly pass the same string to the function twice, resulting in things appearing on your website like, as I call it, the ampersanded amp; "&". Clearly nobody want's "&" on his or her web page where there is supposed to be just an ampersand. Here's a quick and easy trick to make sure this doesn't happen: <?php $var = "This is a string that could be passed to htmlspecialchars multiple times."; if (htmlspecialchars_decode($var) == $var) { $var = htmlspecialchars($var); } echo $var; ?> Now, if your dealing with text that is a mixed bag (has HTML entities and non-HTML entities) you're on your own.
Keep in mind that you should never trust user input - particularly for "mixed-bag" input containing a combination of plain text and markup or scripting code. Why? Well, consider someone sending '&<script>alert('XSS');</script>' to your PHP script: <?php $var = "&<script>alert('XSS');</script>"; $var = (htmlspecialchars_decode($var) == $var) ? htmlspecialchars($var) : $var; echo $var; ?> Since '&' decodes into '&', (htmlspecialchars_decode($var) == $var) will be -false-, thus returning $var without that it's escaped. In consequence, the script-tags are untouched, and you've just opened yourself to XSS. There is, unfortunately, no reliable way to determine whether HTML is escaped or not that does not come with this caveat that I know of. Rather than try and catch the case 'I've already encoded this', you are better off avoiding double-escaping by simply escaping the HTML as close to the actual output as you can muster, e.g. in the view in an MVC development structure.
or of course: <?php $var = "Blue & yellow make green."; $var = (htmlspecialchars_decode($var) == $var) ? htmlspecialchars($var) : $var; echo $var; // outputs Blue & yellow make green. // you can do it a bunch of times, it still won't screw you! $var = (htmlspecialchars_decode($var) == $var) ? htmlspecialchars($var) : $var; $var = (htmlspecialchars_decode($var) == $var) ? htmlspecialchars($var) : $var; echo $var; // still outputs Blue & yellow make green. ?> Put it in a function. Add it to the method of some abstract data class.
[Update of previous note, having noticed I forgot to put in quote style] PHP4 Compatible function: <?php function htmlspecialchars_decode_php4 ($str, $quote_style = ENT_COMPAT) { return strtr($str, array_flip(get_html_translation_table(HTML_SPECIALCHARS, $quote_style))); } ?>
For PHP4 Compatibility: <?php function htmlspecialchars_decode_php4 ($str) { return strtr($str, array_flip(get_html_translation_table(HTML_SPECIALCHARS))); } ?>