• 首页
  • vue
  • TypeScript
  • JavaScript
  • scss
  • css3
  • html5
  • php
  • MySQL
  • redis
  • jQuery
  • convert_cyr_string()

    (PHP 4, PHP 5, PHP 7)

    将字符由一种 Cyrillic 字符转换成另一种

    说明

    convert_cyr_string(string $str, string $from, string $to) : string

    此函数将给定的字符串从一种 Cyrillic 字符转换成另一种,返回转换之后的字符串。

    参数

    $str

    要转换的字符。

    $from

    单个字符,代表源 Cyrillic 字符集。

    $to

    单个字符,代表了目标 Cyrillic 字符集。

    支持的类型有:

    • k - koi8-r
    • w - windows-1251
    • i - iso8859-5
    • a - x-cp866
    • d - x-cp866
    • m - x-mac-cyrillic

    返回值

    返回转换后的字符串。

    注释

    Note:此函数可安全用于二进制对象。

    Only this code works OK for me, for translating win-1251 to utf-8 for macedonian letters!
    // Modificated by tapin13
    // Corrected by Timuretis
    // Corrected by Sote for macedonian cyrillic
    // Convert win-1251 to utf-8
    function unicode_mk_cyr($str) {
       $encode = "";
       for ($ii=0;$ii<strlen($str);$ii++) {
         $xchr=substr($str,$ii,1);
         echo "<p>".ord($xchr)."</p>\n";
         if (ord($xchr)>191) {
           $xchr=ord($xchr)+848;
           $xchr="&#" . $xchr . ";";
         }
         if(ord($xchr) == 129) {
            $xchr = "&#1027;"; 
         } 
         if(ord($xchr) == 163) {
            $xchr = "&#1032;"; 
         }    
         if(ord($xchr) == 138) {
            $xchr = "&#1033;"; 
         }
         if(ord($xchr) == 140) {
            $xchr = "&#1034;"; 
         }
         if(ord($xchr) == 143) {
            $xchr = "&#1039;"; 
         }
         if(ord($xchr) == 141) {
            $xchr = "&#1036;"; 
         }  
         if(ord($xchr) == 189) {
            $xchr = "&#1029;"; 
         }                 
          
         if(ord($xchr) == 188) {
            $xchr = "&#1112;"; 
         }
         if(ord($xchr) == 131) {
            $xchr = "&#1107;"; 
         }
         if(ord($xchr) == 190) {
            $xchr = "&#1109;"; 
         }
         if(ord($xchr) == 154) {
            $xchr = "&#1113;"; 
         }
         if(ord($xchr) == 156) {
            $xchr = "&#1114;"; 
         }
         if(ord($xchr) == 159) {
            $xchr = "&#1119;"; 
         }
         if(ord($xchr) == 157) {
            $xchr = "&#1116;"; 
         }                           
         $encode=$encode . $xchr;
      }
       return $encode;
    }
    He is improved function to decode win1251->UTF8
    <?php
        function win2utf($s){ 
          $c209 = chr(209); $c208 = chr(208); $c129 = chr(129);
          for($i=0; $i<strlen($s); $i++)  { 
            $c=ord($s[$i]);
            if ($c>=192 and $c<=239) $t.=$c208.chr($c-48);
            elseif ($c>239) $t.=$c209.chr($c-112);
            elseif ($c==184) $t.=$c209.$c209;
            elseif ($c==168)  $t.=$c208.$c129;
            else $t.=$s[$i];
          } 
          return $t;
        }
    ?>
    To: mihailsbo at lycos dot ru
    Transliteration could be done easier:
    <?
    function transliterate($cyrstr)
      {
        $ru = array('A', 'a',
              '?', '?',
              '?', '?',
              '?', '?',
              '?', '?',
              '?', '?',
              '?', '?',
              '?', '?',
              '?', '?',
              '?', '?',
              '?', '?',
              '?', '?',
              '?', '?',
              '?', '?',
              '?', '?',
              '?', '?',
              '?', '?',
              '?', '?',
              '?', '?',
              '?', '?',
              '?', '?',
              '?', '?',
              '?', '?',
              '?', '?',
              '?', '?',
              '?', '?',
              '?', '?',
              '?', '?',
              '?', '?',
              '?', '?',
              '?', '?',
              '?', '?',
              '?', '?');
        $en = array('A', 'a',
              'B', 'b',
              'V', 'v',
              'G', 'g',
              'D', 'd',
              'E', 'e',
              'E', 'e',
              'Zh', 'zh',
              'Z', 'z',
              'I', 'i',
              'J', 'j',
              'K', 'k',
              'L', 'l',
              'M', 'm',
              'N', 'n',
              'O', 'o',
              'P', 'p',
              'R', 'r',
              'S', 's',
              'T', 't',
              'U', 'u',
              'F', 'f',
              'H', 'h',
              'C', 'c',
              'Ch', 'ch',
              'Sh', 'sh',
              'Sch', 'sch',
              '\'', '\'',
              'Y', 'y', 
              '\'', '\'',
              'E', 'e',
              'Ju', 'ju',
              'Ja', 'ja');
        return str_replace($ru, $en, $cyrstr);
      }
    ?>
    cathody at mail dot ru(27-Jul-2005 06:41)
    You function doesn't work on my PC..
    It's work:
    function Encode2($str,$type)
    {
      $conv=array();
      for($x=192;$x<=239;$x++)
        $conv[u][chr($x)]=chr(208).chr($x-48);
      for($x=240;$x<=255;$x++)
        $conv[u][chr($x)]=chr(209).chr($x-112);
      $conv[u][chr(168)]=chr(208).chr(129);
      $conv[u][chr(184)]=chr(209).chr(209);
      $conv[w]=array_reverse($conv[u]);
      if($type=='w'  ||  $type=='u')
        return strtr($str,$conv[$type]);
      else
        return $str;
    }
    Check this code -- exelent to convert win-1251 to UTF-8 
    just one fix!!!
        if ($c==184) { $t.=chr(209).chr(145); continue; }; 
    Anything more it is not necessary.
    It is grateful to threed [at] koralsoft.com
    28-Jul-2003 03:37 
    i tried all functions here to convert from cp1251 to unicode, but they don't work. i think that this work : 
    <?php 
    function win3utf($s)  { 
      for($i=0, $m=strlen($s); $i<$m; $i++)  { 
        $c=ord($s[$i]); 
        if ($c<=127) {$t.=chr($c); continue; } 
        if ($c>=192 && $c<=207)  {$t.=chr(208).chr($c-48); continue; } 
        if ($c>=208 && $c<=239) {$t.=chr(208).chr($c-48); continue; } 
        if ($c>=240 && $c<=255) {$t.=chr(209).chr($c-112); continue; } 
        if ($c==184) { $t.=chr(209).chr(209); continue; }; 
      if ($c==168) { $t.=chr(208).chr(129); continue; }; 
      } 
      return $t; 
    } 
    ?>
    i tried all functions here to convert from cp1251 to unicode, but they don't work. i think that this work :
    <?php
    function win3utf($s)  {
      for($i=0, $m=strlen($s); $i<$m; $i++)  {
        $c=ord($s[$i]);
        if ($c<=127) {$t.=chr($c); continue; }
        if ($c>=192 && $c<=207)  {$t.=chr(208).chr($c-48); continue; }
        if ($c>=208 && $c<=239) {$t.=chr(208).chr($c-48); continue; }
        if ($c>=240 && $c<=255) {$t.=chr(209).chr($c-112); continue; }
        if ($c==184) { $t.=chr(209).chr(209); continue; };
      if ($c==168) { $t.=chr(208).chr(129); continue; };
      }
      return $t;
    }
    ?>
    A better function to convert cp1251 string to utf8.
    Works with russian and ukrainian text.
    function unicod($str) {
      $conv=array();
      for($x=128;$x<=143;$x++) $conv[$x+112]=chr(209).chr($x);
      for($x=144;$x<=191;$x++) $conv[$x+48]=chr(208).chr($x);
      $conv[184]=chr(209).chr(145); #ё
      $conv[168]=chr(208).chr(129); #Ё
      $conv[179]=chr(209).chr(150); #і
      $conv[178]=chr(208).chr(134); #І
      $conv[191]=chr(209).chr(151); #ї
      $conv[175]=chr(208).chr(135); #ї
      $conv[186]=chr(209).chr(148); #є
      $conv[170]=chr(208).chr(132); #Є
      $conv[180]=chr(210).chr(145); #ґ
      $conv[165]=chr(210).chr(144); #Ґ
      $conv[184]=chr(209).chr(145); #Ґ
      $ar=str_split($str);
      foreach($ar as $b) if(isset($conv[ord($b)])) $nstr.=$conv[ord($b)]; else $nstr.=$b;
      return $nstr;
    }
    //I've also built the same way for hebrew to utf converting
    function heb2utf($s) {
      for($i=0, $m=strlen($s); $i<$m; $i++)  { 
        $c=ord($s[$i]); 
        if ($c<=127) {$t.=chr($c); continue; } 
        if ($c>=224 )  {$t.=chr(215).chr($c-80); continue; } 
       
      } 
      return $t; 
    }
    //Simple unicoder and decoder for hebrew and russian:
    function unicode_hebrew($str) {
      for ($ii=0;$ii<strlen($str);$ii++) {
        $xchr=substr($str,$ii,1);
        if (ord($xchr)>223) {
          $xchr=ord($xchr)+1264;
          $xchr="&#" . $xchr . ";";
        }
        $encode=$encode . $xchr;
      }
      return $encode;
    }
    function unicode_russian($str) {
      for ($ii=0;$ii<strlen($str);$ii++) {
        $xchr=substr($str,$ii,1);
        if (ord($xchr)>191) {
          $xchr=ord($xchr)+848;
          $xchr="&#" . $xchr . ";";
        }
        $encode=$encode . $xchr;
      }
      return $encode;
    }
    function decode_unicoded_hebrew($str) {
      $decode="";
      $ar=split("&#",$str);
      foreach ($ar as $value ) {
        $in1=strpos($value,";"); //end of code
        if ($in1>0) {// unicode
          $code=substr($value,0,$in1);
            
          if ($code>=1456 and $code<=1514) { //hebrew
              $code=$code-1264;
            $xchr=chr($code);
            } else { //other unicode
            $xchr="&#" . $code . ";";
           } 
          $xchr=$xchr . substr($value,$in1+1);  
        } else //not unicode
           $xchr = $value;
        
        $decode=$decode . $xchr;
      }
      return $decode;
    }
    function decode_unicoded_russian($str) {
      $decode="";
      $ar=split("&#",$str);
      foreach ($ar as $value ) {
        $in1=strpos($value,";"); //end of code
        if ($in1>0) {// unicode
          $code=substr($value,0,$in1);
            
          if ($code>=1040 and $code<=1103) {
              $code=$code-848;
            $xchr=chr($code);
            } else {
            $xchr="&#" . $code . ";";
           } 
          $xchr=$xchr . substr($value,$in1+1);  
        } else
           $xchr = $value;
        
        $decode=$decode . $xchr;
      }
      return $decode;
    }
    Not all of cyrilic characters are supported by this function. Cyrilic chars from Macedonian Alphabet like Sh, kj, dz' ,nj are not supported.
    :) what about NUMBER!!!???
      function Utf8Win($str,$type="w")
      {
        static $conv='';
        if (!is_array($conv))
        {
          $conv = array();
          for($x=128;$x<=143;$x++)
          {
            $conv['u'][]=chr(209).chr($x);
            $conv['w'][]=chr($x+112);
          }
          for($x=144;$x<=191;$x++)
          {
            $conv['u'][]=chr(208).chr($x);
            $conv['w'][]=chr($x+48);
          }
          $conv['u'][]=chr(208).chr(129); // Ё
          $conv['w'][]=chr(168);
          $conv['u'][]=chr(209).chr(145); // ё
          $conv['w'][]=chr(184);
          $conv['u'][]=chr(208).chr(135); // Ї
          $conv['w'][]=chr(175);
          $conv['u'][]=chr(209).chr(151); // ї
          $conv['w'][]=chr(191);
          $conv['u'][]=chr(208).chr(134); // І
          $conv['w'][]=chr(178);
          $conv['u'][]=chr(209).chr(150); // і
          $conv['w'][]=chr(179);
          $conv['u'][]=chr(210).chr(144); // Ґ
          $conv['w'][]=chr(165);
          $conv['u'][]=chr(210).chr(145); // ґ
          $conv['w'][]=chr(180);
          $conv['u'][]=chr(208).chr(132); // Є
          $conv['w'][]=chr(170);
          $conv['u'][]=chr(209).chr(148); // є
          $conv['w'][]=chr(186);
          $conv['u'][]=chr(226).chr(132).chr(150); // №
          $conv['w'][]=chr(185);
        }
        if ($type == 'w') { return str_replace($conv['u'],$conv['w'],$str); }
        elseif ($type == 'u') { return str_replace($conv['w'], $conv['u'],$str); }
        else { return $str; }
      }
    Sorry for my previous post. NOT array_reverce, array_flip is actual function. Correct function:
    function Encode($str,$type=u)
    {
      $conv=array();
      for($x=192;$x<=239;$x++)
        $conv[u][chr($x)]=chr(208).chr($x-48);
      for($x=240;$x<=255;$x++)
        $conv[u][chr($x)]=chr(209).chr($x-112);
      $conv[u][chr(168)]=chr(208).chr(129);
      $conv[u][chr(184)]=chr(209).chr(209);
      $conv[w]=array_flip($conv[u]);
      if($type=='w'  ||  $type=='u')
        return strtr($str,$conv[$type]);
      else
        return $str;
    }
    Sorry for my English ;)
    threed's function works great, but the replacement for the letter small io (&#1105;) needs to be changed from 
    <?php
    if ($c==184) { $t.=chr(209).chr(209); continue; };
    ?>
    to
    <?php
    if ($c==184) { $t.=chr(209).chr(145); continue; };
    ?>
    so, the final working result should look like this:
    <?php
    function win3utf($s) {
      for($i=0, $m=strlen($s); $i<$m; $i++)  {
        $c=ord($s[$i]);
        if ($c<=127) {$t.=chr($c); continue; }
        if ($c>=192 && $c<=207) {$t.=chr(208).chr($c-48); continue; }
        if ($c>=208 && $c<=239) {$t.=chr(208).chr($c-48); continue; }
        if ($c>=240 && $c<=255) {$t.=chr(209).chr($c-112); continue; }
        if ($c==184) { $t.=chr(209).chr(209); continue; };
        if ($c==168) { $t.=chr(208).chr(129); continue; };
      }
      return $t;
    }
    ?>
    I have made mistake remove this test line:
         echo "<p>".ord($xchr)."</p>\n";
    code should be like this:
    // Modificated by tapin13
    // Corrected by Timuretis
    // Corrected by Sote for macedonian cyrillic
    // Convert win-1251 to utf-8
    function unicode_mk_cyr($str) {
       $encode = "";
       for ($ii=0;$ii<strlen($str);$ii++) {
         $xchr=substr($str,$ii,1);
         if (ord($xchr)>191) {
           $xchr=ord($xchr)+848;
           $xchr="&#" . $xchr . ";";
         }
         if(ord($xchr) == 129) {
            $xchr = "&#1027;";
         }
         if(ord($xchr) == 163) {
            $xchr = "&#1032;";
         }   
         if(ord($xchr) == 138) {
            $xchr = "&#1033;";
         }
         if(ord($xchr) == 140) {
            $xchr = "&#1034;";
         }
         if(ord($xchr) == 143) {
            $xchr = "&#1039;";
         }
         if(ord($xchr) == 141) {
            $xchr = "&#1036;";
         }  
         if(ord($xchr) == 189) {
            $xchr = "&#1029;";
         }                
         
         if(ord($xchr) == 188) {
            $xchr = "&#1112;";
         }
         if(ord($xchr) == 131) {
            $xchr = "&#1107;";
         }
         if(ord($xchr) == 190) {
            $xchr = "&#1109;";
         }
         if(ord($xchr) == 154) {
            $xchr = "&#1113;";
         }
         if(ord($xchr) == 156) {
            $xchr = "&#1114;";
         }
         if(ord($xchr) == 159) {
            $xchr = "&#1119;";
         }
         if(ord($xchr) == 157) {
            $xchr = "&#1116;";
         }                          
         $encode=$encode . $xchr;
      }
       return $encode;
    }
    previous bit of code (grmaxim's win_to_utf8 function) didn't work for me, so I wrote my own func to convert from win1251 to utf8:
    <?php
    function win2utf($s) {
      for($i=0,$m=strlen($s);$i<$m;$i++) {
        $c=ord($s[$i]);
        if ($c>127) // convert only special chars
          if   ($c==184)  $t.=chr(209).chr(209); // small io
          elseif ($c==168)  $t.=chr(208).chr(129); // capital io
          else       $t.=($c>239?chr(209): chr(208)).chr($c-48);
        else $t.=$s[$i];
      }
      return $t;
    }
    ?>
    Hope this helps
    Unfortunately input data must be a string only. But it is may be changed! ;) 
    To convert multi-dimensional array I use this recursive function:
    <?php
    function convert_cyr_array($array,$from,$to){
      foreach($array as $key=>$value){
        if(is_array($value)) {
          $result[$key] = convert_cyr_array($value,$from,$to);
          continue;
        }
        $result[$key] = convert_cyr_string($value,$from,$to);
      }
      return $result;
    }
    ?>
    An example:
    <?php
    $array[0] = "сВМПЛП";
    $array[1] = "зТХЫБ";
    $array[2] = array("пЗХТЕГ","рПНЙДПТ");
    $array[3] = array(
               array("бРЕМШУЙО","нБОДБТЙО"),
               array("бВТЙЛПУ","рЕТУЙЛ")
            );
    $result = convert_cyr_array($array,"k","w");
    /* Returns:
    Array
    (
     [0] => Яблоко
     [1] => Груша
     [2] => Array
      (
       [0] => Огурец
       [1] => Помидор
      )
     [3] => Array
      (
       [0] => Array
        (
         [0] => Апельсин
         [1] => Мандарин
        )
       [1] => Array
        (
         [0] => Абрикос
         [1] => Персик
        )
      )
    )*/
    ?>
    Praising other people for their efforts to write a convenient UTF8 to Win-1251 functions may I mention that, since str_replace allows arrays as parameters, the function may be rewritten in a slightly efficient way (moreover, the array generated may be stored for performance improvement):
    <?php 
     function Encode ( $str, $type )
    {
     // $type: 
    // 'w' - encodes from UTF to win 
     // 'u' - encodes from win to UTF 
      static $conv='';
      if (!is_array ( $conv ))
      {  
        $conv=array ();
        for ( $x=128; $x <=143; $x++ )
        {
         $conv['utf'][]=chr(209).chr($x);
         $conv['win'][]=chr($x+112);
        }
        for ( $x=144; $x <=191; $x++ )
        {
            $conv['utf'][]=chr(208).chr($x);
            $conv['win'][]=chr($x+48);
        }
     
        $conv['utf'][]=chr(208).chr(129);
        $conv['win'][]=chr(168);
        $conv['utf'][]=chr(209).chr(145);
        $conv['win'][]=chr(184);
       }
       if ( $type=='w' )
         return str_replace ( $conv['utf'], $conv['win'], $str );
       elseif ( $type=='u' )
         return str_replace ( $conv['win'], $conv['utf'], $str );
       else
        return $str;
     }
    ?>

    上篇:chunk_split()

    下篇:convert_uudecode()