  • mb_regex_encoding()

    (PHP 4 >= 4.2.0, PHP 5, PHP 7)

    Set/Get character encoding for multibyte regex

    Description

    mb_regex_encoding([string $encoding = mb_regex_encoding()]): mixed

    Set/Get character encoding for a multibyte regex.

    Parameters

    $encoding

    The $encoding parameter is the character encoding. If it is omitted, the internal character encoding value is used.

    Return Values

    If $encoding is set, returns TRUE on success or FALSE on failure. In this case, the internal character encoding is NOT changed. If $encoding is omitted, the current character encoding name for a multibyte regex is returned.
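The two call forms described above can be sketched as follows. This is a minimal illustration, assuming the mbstring extension is loaded and that 'EUC-JP' is accepted as a regex encoding (it appears in the list of supported names in the notes on this page). The variable names are illustrative only.

```php
<?php
// Remember the current settings so the sketch leaves no side effects.
$internalBefore = mb_internal_encoding();
$regexBefore    = mb_regex_encoding();

// Setter form: returns true on success.
$setResult = mb_regex_encoding('EUC-JP');

// Getter form: returns the current regex encoding name as a string.
$regexNow = mb_regex_encoding();

// The internal encoding is a separate setting and stays untouched.
$internalUnchanged = (mb_internal_encoding() === $internalBefore);

var_dump($setResult, $regexNow, $internalUnchanged);

// Restore the original regex encoding.
mb_regex_encoding($regexBefore);
```

Saving and restoring the previous value is a design choice for this sketch, since the regex encoding is global state shared by every mb_ereg*() call in the request.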

    Changelog

    Version  Description
    5.6.0    Default encoding is changed to UTF-8. It was EUC-JP previously.

    See Also

    Beware: mb_regex_encoding() does not support the same set of encodings as listed by mb_list_encodings().
    Example:
    <?php
    mb_internal_encoding('CP936');
    mb_regex_encoding('CP936'); # this line produces an error
    ?>
    
    mb_ereg functionality is provided by the Oniguruma regex library, not by PCRE. mb_regex_encoding() supports only a subset of the encoding names returned by mb_list_encodings() and mb_encoding_aliases().
    Currently the following names are supported (case-insensitive):
    UCS-4
    UCS-4LE
    UTF-32
    UTF-32BE
    UTF-32LE
    UTF-16
    UTF-16BE
    UTF-16LE
    UTF-8
    utf8
    ASCII
    US-ASCII
    EUC-JP
    eucJP
    x-euc-jp
    SJIS
    eucJP-win
    SJIS-win
    CP932
    MS932
    Windows-31J
    ISO-8859-1
    ISO-8859-2
    ISO-8859-3
    ISO-8859-4
    ISO-8859-5
    ISO-8859-6
    ISO-8859-7
    ISO-8859-8
    ISO-8859-9
    ISO-8859-10
    ISO-8859-13
    ISO-8859-14
    ISO-8859-15
    ISO-8859-16
    EUC-CN
    EUC_CN
    eucCN
    gb2312
    EUC-TW
    EUC_TW
    eucTW
    BIG-5
    CN-BIG5
    BIG-FIVE
    BIGFIVE
    EUC-KR
    EUC_KR
    eucKR
    KOI8-R
    KOI8R
    The list is a mixture of base names and aliases and applies to PHP 5.4.45 (Oniguruma lib v4.7.1), PHP 5.6.31 (v5.9.5), PHP 7.0.22 (v5.9.6) and PHP 7.1.8 (v5.9.6). Be aware of the inconsistency: mb_regex_encoding() accepts for example the base name 'UTF-8' and its only alias 'utf8', but it does not accept aliases 'utf16', 'utf32' or 'latin1'.
    Additionally, note that the informal name/alias 'latin9' for ISO/IEC 8859-15:1999 (including the Euro sign at 0xA4) is also not known by mb_list_encodings(). It can only be addressed as 'ISO-8859-15' or 'ISO_8859-15', and for mb_regex_encoding() solely as 'ISO-8859-15'.
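Because the accepted names differ from those of mb_list_encodings(), it can help to probe a name before relying on it. The following is a minimal sketch, not part of the mbstring API: the helper name regex_encoding_supported() is hypothetical. It assumes that an unknown name either makes mb_regex_encoding() return false with a warning (PHP 7 and earlier) or throw a ValueError (PHP 8 and later), and it handles both cases.

```php
<?php
// Hypothetical helper: reports whether mb_regex_encoding() accepts $name.
function regex_encoding_supported(string $name): bool
{
    // Save the current regex encoding so probing has no lasting effect.
    $previous = mb_regex_encoding();

    try {
        // PHP 7 and earlier: returns false and raises a warning
        // (suppressed with @) for an unknown name.
        $ok = @mb_regex_encoding($name);
    } catch (\ValueError $e) {
        // PHP 8 and later: an unknown name throws ValueError.
        $ok = false;
    }

    // Restore whatever was active before the probe.
    mb_regex_encoding($previous);

    return $ok === true;
}

var_dump(regex_encoding_supported('UTF-8'));
var_dump(regex_encoding_supported('NOT-AN-ENCODING'));
```

The same helper can be called with names such as 'CP936' or 'latin1' to check how a particular PHP build behaves, since the set of accepted names depends on the bundled Oniguruma version.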
    mb_regex_encoding() does not recognize CP1252 or Windows-1252 as valid encodings, although both appear in the list generated by mb_list_encodings().
    ISO-8859-1 (AKA "Latin-1") is supported, but it's not the same as the Windows variety of Latin-1.
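One possible workaround for Windows-1252 data, offered here as an assumption rather than an officially documented recipe, is to convert the text to UTF-8 with mb_convert_encoding(), run the multibyte regex with the regex encoding set to UTF-8, and convert the result back. The sample string and variable names below are illustrative only.

```php
<?php
// "caf\xE9" is "café" encoded as Windows-1252 (0xE9 is é).
$cp1252 = "caf\xE9";

// Convert to UTF-8, which mb_regex_encoding() accepts.
$utf8 = mb_convert_encoding($cp1252, 'UTF-8', 'Windows-1252');

// Switch the regex encoding to UTF-8, remembering the old value.
$previous = mb_regex_encoding();
mb_regex_encoding('UTF-8');

// Run the multibyte regex on the UTF-8 copy ("\xC3\xA9" is é in UTF-8).
$replaced = mb_ereg_replace("\xC3\xA9", 'e', $utf8);

// Restore the previous regex encoding.
mb_regex_encoding($previous);

// Convert back if the rest of the program expects Windows-1252.
$result = mb_convert_encoding($replaced, 'Windows-1252', 'UTF-8');

var_dump($result);
```

The round trip costs two conversions, so for large inputs it may be simpler to keep the data in UTF-8 throughout.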
    To also change the regex encoding:
    <?php
    echo "current mb_internal_encoding: ".mb_internal_encoding()."<br />";
    echo "changing mb_internal_encoding to UTF-8<br />";
    mb_internal_encoding("UTF-8"); 
    echo "new mb_internal_encoding: ".mb_internal_encoding()."<br />";
    echo "current mb_regex_encoding: ".mb_regex_encoding()."<br />";
    echo "changing mb_regex_encoding to UTF-8<br />";
    mb_regex_encoding('UTF-8');
    echo "new mb_regex_encoding: ".mb_regex_encoding()."<br />";
    ?>
    
    Return values vary in setting and getting:
    <?php
    echo mb_regex_encoding();
    // returns the encoding name as a string
    ?>
    <?php
    echo mb_regex_encoding("UTF-8");
    // returns true (success) or false (failure) as a boolean
    ?>