• 首页
  • vue
  • TypeScript
  • JavaScript
  • scss
  • css3
  • html5
  • php
  • MySQL
  • redis
  • jQuery
  • Normalizer::normalize()

    normalizer_normalize

    (PHP 5 >= 5.3.0, PHP 7, PECL intl >= 1.0.0)

    Normalizes the input provided and returns the normalized string

    说明

    面向对象风格
    publicstaticNormalizer::normalize(string $input[,int $form= Normalizer::FORM_C]): string
    过程化风格
    normalizer_normalize(string $input[,int $form= Normalizer::FORM_C]): string

    Normalizes the input provided and returns the normalized string

    参数

    $input

    The input string to normalize

    $form

    One of the normalization forms.

    返回值

    The normalized string or FALSE if an error occurred.

    范例

    normalizer_normalize() example

    <?php
    $char_A_ring = "\xC3\x85"; // 'LATIN CAPITAL LETTER A WITH RING ABOVE' (U+00C5)
    $char_combining_ring_above = "\xCC\x8A";  // 'COMBINING RING ABOVE' (U+030A)
     
    $char_1 = normalizer_normalize( $char_A_ring, Normalizer::FORM_C );
    $char_2 = normalizer_normalize( 'A' . $char_combining_ring_above, Normalizer::FORM_C );
     
    echo urlencode($char_1);
    echo ' ';
    echo urlencode($char_2);
    ?>
    

    OO example

    <?php
    $char_A_ring = "\xC3\x85"; // 'LATIN CAPITAL LETTER A WITH RING ABOVE' (U+00C5)
    $char_combining_ring_above = "\xCC\x8A";  // 'COMBINING RING ABOVE' (U+030A)
     
    $char_1 = Normalizer::normalize( $char_A_ring, Normalizer::FORM_C );
    $char_2 = Normalizer::normalize( 'A' . $char_combining_ring_above, Normalizer::FORM_C );
     
    echo urlencode($char_1);
    echo ' ';
    echo urlencode($char_2);
    ?>
    

    以上例程会输出:

    %C3%85 %C3%85
    

    参见

    • normalizer_is_normalized() Checks if the provided string is already in the specified normalization form
    Especially when matching texts against each-other or against keywords, it is helpful to normalize the texts before.
    The following function removes all diacritics (marks like accents) from a given UTF8-encoded texts and returns ASCii-text.
    Be sure to have the PHP-Normalizer-extension (intl and icu) installed.
    Tipp: You may also want to map the text to lower case before execute matching procedures ...
    <?php
    function normalizeUtf8String( $s)
    {
      // Normalizer-class missing!
      if (! class_exists("Normalizer", $autoload = false))
        return $original_string;
      
      
      // maps German (umlauts) and other European characters onto two characters before just removing diacritics
      $s  = preg_replace( '@\x{00c4}@u'  , "AE",  $s );  // umlaut Ä => AE
      $s  = preg_replace( '@\x{00d6}@u'  , "OE",  $s );  // umlaut Ö => OE
      $s  = preg_replace( '@\x{00dc}@u'  , "UE",  $s );  // umlaut Ü => UE
      $s  = preg_replace( '@\x{00e4}@u'  , "ae",  $s );  // umlaut ä => ae
      $s  = preg_replace( '@\x{00f6}@u'  , "oe",  $s );  // umlaut ö => oe
      $s  = preg_replace( '@\x{00fc}@u'  , "ue",  $s );  // umlaut ü => ue
      $s  = preg_replace( '@\x{00f1}@u'  , "ny",  $s );  // ñ => ny
      $s  = preg_replace( '@\x{00ff}@u'  , "yu",  $s );  // ÿ => yu
      
      
      // maps special characters (characters with diacritics) on their base-character followed by the diacritical mark
        // exmaple: Ú => U´, á => a`
      $s  = Normalizer::normalize( $s, Normalizer::FORM_D );
      
      
      $s  = preg_replace( '@\pM@u'    , "",  $s );  // removes diacritics
      
      
      $s  = preg_replace( '@\x{00df}@u'  , "ss",  $s );  // maps German ß onto ss
      $s  = preg_replace( '@\x{00c6}@u'  , "AE",  $s );  // Æ => AE
      $s  = preg_replace( '@\x{00e6}@u'  , "ae",  $s );  // æ => ae
      $s  = preg_replace( '@\x{0132}@u'  , "IJ",  $s );  // ? => IJ
      $s  = preg_replace( '@\x{0133}@u'  , "ij",  $s );  // ? => ij
      $s  = preg_replace( '@\x{0152}@u'  , "OE",  $s );  // Π=> OE
      $s  = preg_replace( '@\x{0153}@u'  , "oe",  $s );  // œ => oe
      
      $s  = preg_replace( '@\x{00d0}@u'  , "D",  $s );  // Ð => D
      $s  = preg_replace( '@\x{0110}@u'  , "D",  $s );  // Ð => D
      $s  = preg_replace( '@\x{00f0}@u'  , "d",  $s );  // ð => d
      $s  = preg_replace( '@\x{0111}@u'  , "d",  $s );  // d => d
      $s  = preg_replace( '@\x{0126}@u'  , "H",  $s );  // H => H
      $s  = preg_replace( '@\x{0127}@u'  , "h",  $s );  // h => h
      $s  = preg_replace( '@\x{0131}@u'  , "i",  $s );  // i => i
      $s  = preg_replace( '@\x{0138}@u'  , "k",  $s );  // ? => k
      $s  = preg_replace( '@\x{013f}@u'  , "L",  $s );  // ? => L
      $s  = preg_replace( '@\x{0141}@u'  , "L",  $s );  // L => L
      $s  = preg_replace( '@\x{0140}@u'  , "l",  $s );  // ? => l
      $s  = preg_replace( '@\x{0142}@u'  , "l",  $s );  // l => l
      $s  = preg_replace( '@\x{014a}@u'  , "N",  $s );  // ? => N
      $s  = preg_replace( '@\x{0149}@u'  , "n",  $s );  // ? => n
      $s  = preg_replace( '@\x{014b}@u'  , "n",  $s );  // ? => n
      $s  = preg_replace( '@\x{00d8}@u'  , "O",  $s );  // Ø => O
      $s  = preg_replace( '@\x{00f8}@u'  , "o",  $s );  // ø => o
      $s  = preg_replace( '@\x{017f}@u'  , "s",  $s );  // ? => s
      $s  = preg_replace( '@\x{00de}@u'  , "T",  $s );  // Þ => T
      $s  = preg_replace( '@\x{0166}@u'  , "T",  $s );  // T => T
      $s  = preg_replace( '@\x{00fe}@u'  , "t",  $s );  // þ => t
      $s  = preg_replace( '@\x{0167}@u'  , "t",  $s );  // t => t
      
      // remove all non-ASCii characters
      $s  = preg_replace( '@[^\0-\x80]@u'  , "",  $s ); 
      
      
      // possible errors in UTF8-regular-expressions
      if (empty($s))
        return $original_string;
      else
        return $s; 
    }
    ?>
    The above function is mainly based on the following article:
    http://ahinea.com/en/tech/accented-translate.html
    You can use the 'original' abbreviations if you feel more comfortable:
    <?php
    Normalizer::NFD;
    Normalizer::NFKD;
    Normalizer::NFC;
    Normalizer::NFKC;
    ?>
    
    "If you get error messages while starting apache of xampp package with activated extension=intl.dll," do NOT copy any files around.
    Use Apache's "LoadFile …" functionality to load any missing DLL's not found within a %PATH%. Even php##ts.dll itself.
    If you get error messages while starting apache of xampp package with activated extension=intl.dll, copy the files
      * icudt##.dll
      * icuin##.dll
      * icuio##.dll
      * icule##.dll
      * iculx##.dll
      * icutu##.dll
      * icuuc##.dll
    ## = version number
    from "/program files/xampp/php"
    into your "/program files/xampp/apache/bin" or whereever your xampp resides :-)