• 首页
  • vue
  • TypeScript
  • JavaScript
  • scss
  • css3
  • html5
  • php
  • MySQL
  • redis
  • jQuery
  • mb_split()

    (PHP 4 >= 4.2.0, PHP 5, PHP 7)

    使用正则表达式分割多字节字符串

    说明

    mb_split(string $pattern,string $string[,int $limit= -1]): array

    使用正则表达式$pattern分割多字节$string并返回结果array。

    参数

    $pattern

    正则表达式。

    $string

    待分割的string。

    $limit
    如果指定了可选参数$limit,将最多分割为$limit个元素。

    返回值

    array的结果。

    注释

    Note:

    Thecharacter encoding specified by mb_regex_encoding()will be used as the character encoding for this function by default.

    参见

    a (simpler) way to extract all characters from a UTF-8 string to array with a single call to a built-in function:
    <?php
     $str = 'Ма-
    руся';
     print_r(preg_split('//u', $str, null, PREG_SPLIT_NO_EMPTY));
    ?>
    Output:
    Array
    (
      [0] => М
      [1] => а
      [2] => -
      [3] => 
      [4] => р
      [5] => у
      [6] => с
      [7] => я
    )
    The $pattern argument doesn't use /pattern/ delimiters, unlike other regex functions such as preg_match.
    <?php
      # Works. No slashes around the /pattern/
      print_r( mb_split("\s", "hello world") );
      Array (
       [0] => hello
       [1] => world
      )
      # Doesn't work:
      print_r( mb_split("/\s/", "hello world") );
      Array (
       [0] => hello world
      )
    ?>
    
    I figure most people will want a simple way to break-up a multibyte string into its individual characters. Here's a function I'm using to do that. Change UTF-8 to your chosen encoding method.
    <?php
    function mbStringToArray ($string) {
      $strlen = mb_strlen($string);
      while ($strlen) {
        $array[] = mb_substr($string,0,1,"UTF-8");
        $string = mb_substr($string,1,$strlen,"UTF-8");
        $strlen = mb_strlen($string);
      }
      return $array;
    }
    ?>
    
    To split an string like this: "日、に、本、ほん、語、ご" using the "、" delimiter i used:
       $v = mb_split('、',"日、に、本、ほん、語、ご");
    but didn't work.
    The solution was to set this before:
        mb_regex_encoding('UTF-8');
       mb_internal_encoding("UTF-8"); 
       $v = mb_split('、',"日、に、本、ほん、語、ご");
    and now it's working:
    Array
    (
      [0] => 日
      [1] => に
      [2] => 本
      [3] => ほん
      [4] => 語
      [5] => ご
    )
    In addition to Sezer Yalcin's tip.
    This function splits a multibyte string into an array of characters. Comparable to str_split().
    <?php
    function mb_str_split( $string ) {
      # Split at all position not after the start: ^
      # and not before the end: $
      return preg_split('/(?<!^)(?!$)/u', $string );
    }
    $string  = '火车票';
    $charlist = mb_str_split( $string );
    print_r( $charlist );
    ?>
    # Prints:
    Array
    (
      [0] => 火
      [1] => 车
      [2] => 票
    )
    I agree that some people might want a mb_explode('', $string);
    this is my solution for it:
    <?php
    $string = 'Hallöle';
    $array = array_map(function ($i) use ($string) { 
      return mb_substr($string, $i, 1); 
    }, range(0, mb_strlen($string) -1));
    expect($array)->toEqual(['H', 'a', 'l', 'l', 'ö', 'l', 'e']);
    ?>
    
    an other way to str_split multibyte string:
    <?php
    $s='әӘөүҗңһ';
    //$temp_s=iconv('UTF-8','UTF-16',$s);
    $temp_s=mb_convert_encoding($s,'UTF-16','UTF-8');
    $temp_a=str_split($temp_s,4);
    $temp_a_len=count($temp_a);
    for($i=0;$i<$temp_a_len;$i++){
      //$temp_a[$i]=iconv('UTF-16','UTF-8',$temp_a[$i]);
      $temp_a[$i]=mb_convert_encoding($temp_a[$i],'UTF-8','UTF-16');
    }
    echo('<pre>');
    print_r($temp_a);
    echo('</pre>');
    //also possible to directly use UTF-16:
    define('SLS',mb_convert_encoding('/','UTF-16'));
    $temp_s=mb_convert_encoding($s,'UTF-16','UTF-8');
    $temp_a=str_split($temp_s,4);
    $temp_s=implode(SLS,$temp_a);
    $temp_s=mb_convert_encoding($temp_s,'UTF-8','UTF-16');
    echo($temp_s);
    ?>
    
    We are talking about Multi Byte ( e.g. UTF-8) strings here, so preg_split will fail for the following string: 
    'Weiße Rosen sind nicht grün!'
    And because I didn't find a regex to simulate a str_split I optimized the first solution from adjwilli a bit:
    <?php
    $string = 'Weiße Rosen sind nicht grün!'
    $stop  = mb_strlen( $string);
    $result = array();
    for( $idx = 0; $idx < $stop; $idx++)
    {
      $result[] = mb_substr( $string, $idx, 1);
    }
    ?>
    Here is an example with adjwilli's function:
    <?php
    mb_internal_encoding( 'UTF-8');
    mb_regex_encoding( 'UTF-8'); 
    function mbStringToArray
    ( $string
    )
    {
     $stop  = mb_strlen( $string);
     $result = array();
     for( $idx = 0; $idx < $stop; $idx++)
     {
       $result[] = mb_substr( $string, $idx, 1);
     }
     return $result;
    }
    echo '<pre>', PHP_EOL, 
    print_r( mbStringToArray( 'Weiße Rosen sind nicht grün!', true)), PHP_EOL,
    '</pre>'; 
    ?>
    Let me know [by personal email], if someone found a regex to simulate a str_split with mb_split.

    上篇:mb_send_mail()

    下篇:mb_strcut()