• 首页
  • vue
  • TypeScript
  • JavaScript
  • scss
  • css3
  • html5
  • php
  • MySQL
  • redis
  • jQuery
  • xml_parser_create()

    (PHP 4, PHP 5, PHP 7)

    建立一个 XML 解析器

    描述

    xml_parser_create([string $encoding]): resource

    函数xml_parser_create()建立一个新的 XML 解析器并返回可被其它 XML 函数使用的资源句柄。

    可选参数$encoding在 PHP 4 中用来指定要被解析的 XML 输入的字符编码方式。PHP 5 开始,自动侦测输入的 XML 的编码,因此$encoding参数仅用来指定解析后输出数据的编码。在 PHP 4 总,默认输出的编码与输入数据的编码是相同的。如果传递了空字符串,解析器会尝试搜索头 3 或 4 个字节以确定文档的编码。在 PHP 5.0.0 和 5.0.1 总,默认输出的字符编码是 ISO-8859-1,而 PHP 5.0.2 及以上版本是 UTF-8。解析器支持的编码有ISO-8859-1,UTF-8US-ASCII

    请参阅函数xml_parser_create_ns()和xml_parser_free()。

    I created a function, which combines xml_paresr_create and all functions around.
    <?php
    function html_parse($file)
       {
       $array = str_split($file, 1);
       $count = false;
       $text = "";
       $end = false;
       foreach($array as $temp)
        {
        switch($temp)
         {
         case "<":
          between($text);
          $text = "";
          $count = true;
          $end = false;
          break;
         case ">":
          if($end == true) {end_tag($text);}
          else {start_tag($text);}
          $text = "";
          break;
         case "/":
          if($count == true) {$end = true;}
          else {$text = $text . "/";}
          break;
         default:
          $count = false;
          $text = $text . $temp;
         }
        }
       }
    ?>
    The input value is a string.
    It calls functions start_tag() , between() and end_tag() just like the original xml parser.
    But it has a few differences:
     - It does NOT check the code. Just resends values to that three functions, no matter, if they are right
     - It works with parameters. For example: from tag <sth b="42"> sends sth b="42"
     - It works wit diacritics. The original parser sometimes wrapped the text before the first diacritics appearance.
     - Works with all encoding. If the input is UTF-8, the output will be UTF-8 too
     - It works with strings. Not with file pointers.
     - No "Reserved XML name" error
     - No doctype needed
     - It does not work with commentaries, notes, programming instructions etc. Just the tags
    definition of the handling functions is:
    <?php
    function between($stuff) {}
    ?>
    No other attributes
    To maintain compatibility between PHP4 and PHP5 you should always pass a string argument to this function. PHP4 autodetects the format of the input if you leave it out whereas PHP5 will assume the format to be ISO-8859-1 (and choke on the byte order marker of UTF-8 files).
    Calling the function as <?php $res = xml_parser_create('') ?> will cause both versions of PHP to autodetect the format.
    The above "XML to array" code does not work properly if you have several tags on the same level and with the same name, example:
    <currenterrors>
    <error>
    <description>This is a real error...</description>
    </error>
    <error>
    <description>This is a second error...</description>
    </error>
    <error>
    <description>Lots of errors today...</description>
    </error>
    <error>
    <description>This is the last error...</description>
    </error>
    </currenterrors>
    It will then only display the first <error>-tag.
    In this case you will need to number the tags automatically or maybe have several arrays for each new element.
    Internals has proposed[1] changing this extension from resource-based to object-based. When this change is made, xml_parser_create will return an object, not a resource. Application developers are encouraged to replace any checks for explicit success, like:
    <?php
    $res = xml_parser_create(/*...*/);
    if (! is_resource($res)) {
      // ...
    }
    ?>
    With a check for explicit failure:
    <?php
    $res = xml_parser_create(/*...*/);
    if (false === $res) {
      // ...
    }
    [1]: https://marc.info/?l=php-internals&m=154998365013373&w=2
    Even though I passed "UTF-8" as encoding type PHP (Version 4.3.3) did *not* treat the input file as UTF-8. The input file was missing the BOM header bytes (which may indeed be omitted, according to RFC3629...but things are a bit unclear there. The RFC seems to make mere recommendations concering the BOM header). If you want to sure that PHP treats an UTF-8 encoded file correctly, make sure that it begins with the corresponding 3 byte BOM header (0xEF 0xBB 0xBF)
    I'd also recommend adding the option below 
    xml_parser_set_option($parser,XML_OPTION_SKIP_WHITE,1);
    In PHP 5, when including in your xml file the definition '<?xml version="1.0" encoding="ISO-8859-1" ?>',  I'd also recommend adding the option below:
     
    xml_parser_set_option($xml_parser,XML_OPTION_TARGET_ENCODING, "ISO-8859-1").
    It works fine!
    If your enconding is 'UTF-8', just replace 'ISO-8859-1'.