
UTF-8

 

This page only illustrates how you can convert UTF-16BE into UTF-8. Read RFC 2279 for first-hand information.

  1. Take the hex code of the character to find out how many bytes you need:

    0000-007F 1 byte
    0080-07FF 2 bytes
    0800-FFFF 3 bytes

  2. Take the binary form of the character and fill in the empty bits:

    1 byte 0xxxxxxx
    2 bytes 110xxxxx 10xxxxxx
    3 bytes 1110xxxx 10xxxxxx 10xxxxxx
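The two steps above can be sketched in Python. This is only an illustration, not a full encoder: the function name utf8_encode_16bit is made up for this page, it handles a single 16-bit value (0000-FFFF), and it does not treat the surrogate range D800-DFFF specially.

```python
def utf8_encode_16bit(code: int) -> bytes:
    """Encode one 16-bit code value as UTF-8 (sketch; surrogates not handled)."""
    if not 0 <= code <= 0xFFFF:
        raise ValueError("this sketch only covers 0000-FFFF")

    # Step 1: the range of the hex code decides the number of bytes.
    if code <= 0x7F:
        # 1 byte: 0xxxxxxx
        return bytes([code])
    if code <= 0x7FF:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code >> 6),
                      0x80 | (code & 0x3F)])

    # Step 2 for 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
    return bytes([0xE0 | (code >> 12),
                  0x80 | ((code >> 6) & 0x3F),
                  0x80 | (code & 0x3F)])


if __name__ == "__main__":
    for value in (0x41, 0xE9, 0x8336):
        print(f"{value:04X} -> {utf8_encode_16bit(value).hex().upper()}")
```

The shifts pick out the x bits from left to right, and the OR with 0xC0, 0xE0 or 0x80 supplies the fixed leading bits of each pattern.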

Example

Cha 'tea' is encoded in UTF-16BE as 0x8336, which falls in the range 0800-FFFF, so you need 3 bytes. The binary form of 0x8336 is
 
10000011 00110110
 
Split the 16 bits into groups of 4, 6 and 6 (1000 001100 110110), fill them into the 3-byte pattern, and you will get
 
11101000 10001100 10110110
 
Thus you have converted 0x8336 to 0xE88CB6. Set your browser to UTF-8 to see these bytes as one Chinese character: cha. In Western mode (Windows-1252), you will see three characters: e grave, OE and paragraph.
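The worked example can be reproduced and checked in Python. The sketch below assumes that "Western mode" means the Windows-1252 code page (Python's "cp1252" codec); the variable names are chosen for this illustration only.

```python
code = 0x8336  # UTF-16BE value of cha 'tea'

# Binary form of the 16-bit value, split into groups of 4, 6 and 6 bits.
bits = format(code, "016b")
high, middle, low = bits[:4], bits[4:10], bits[10:]

# Fill the groups into the 3-byte pattern 1110xxxx 10xxxxxx 10xxxxxx.
filled = ["1110" + high, "10" + middle, "10" + low]
data = bytes(int(b, 2) for b in filled)

print(" ".join(filled))       # 11101000 10001100 10110110
print(data.hex().upper())     # E88CB6

# Read as UTF-8, the three bytes are one character, U+8336.
as_utf8 = data.decode("utf-8")
print(len(as_utf8), hex(ord(as_utf8)))

# Read as Windows-1252 (assumed meaning of "Western mode"),
# the same bytes are three characters: e grave, OE and paragraph.
as_western = data.decode("cp1252")
print([hex(ord(c)) for c in as_western])   # ['0xe8', '0x152', '0xb6']
```

The same bytes give one character or three depending only on the encoding the reader assumes, which is why the browser setting matters.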

Source of Information

Francois Yergeau. 1998. UTF-8, a transformation format of ISO 10646. RFC 2279.

Gyula Zsigri


August 19, 2000
