UTF-8
This page is only an illustration of how you can convert
UTF-16BE into
UTF-8. Read RFC 2279 to get first-hand information.
- Take the hex code of the character to find out how many bytes
you need:
| 0000-007F | 1 byte  |
| 0080-07FF | 2 bytes |
| 0800-FFFF | 3 bytes |
- Take the binary form of the character
and fill its bits into the x positions of the matching pattern, right-aligned, padding with leading zeros where needed:
| 1 byte  | 0xxxxxxx                   |
| 2 bytes | 110xxxxx 10xxxxxx          |
| 3 bytes | 1110xxxx 10xxxxxx 10xxxxxx |
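The two steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not a reference implementation: the function name `utf8_encode_bmp` is made up for this example, it handles only the range 0000-FFFF covered by the tables, and it rejects surrogate code units (D800-DFFF), which the tables do not address.

```python
def utf8_encode_bmp(code: int) -> bytes:
    """Encode one 16-bit character code (0000-FFFF) as UTF-8 bytes.

    Hypothetical helper for illustration; it mirrors the two tables above.
    """
    if not 0 <= code <= 0xFFFF:
        raise ValueError("only codes 0000-FFFF are covered here")
    if 0xD800 <= code <= 0xDFFF:
        # Assumption: surrogate code units are rejected, because the
        # tables above do not describe how to handle surrogate pairs.
        raise ValueError("surrogate code units are not handled")

    if code <= 0x7F:
        # 1 byte: 0xxxxxxx
        return bytes([code])
    if code <= 0x7FF:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([
            0b11000000 | (code >> 6),
            0b10000000 | (code & 0b111111),
        ])
    # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
    return bytes([
        0b11100000 | (code >> 12),
        0b10000000 | ((code >> 6) & 0b111111),
        0b10000000 | (code & 0b111111),
    ])


print(utf8_encode_bmp(0x8336).hex())  # e88cb6
```

Each branch shifts the code right to isolate the bits for one byte, masks them to six bits where needed, and ORs in the fixed prefix from the pattern table.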
Example
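The character for 'tea' (茶) is encoded in UTF-16BE as 0x8336,
which falls in the range 0800-FFFF, so you need 3 bytes.
- The binary form of 0x8336 is
  10000011 00110110
- Split these 16 bits into groups of 4, 6 and 6 bits:
  1000 001100 110110
- Fill in the bits and you will get
  11101000 10001100 10110110
- Thus you have converted 0x8336 to 0xE88CB6.
Set your browser to UTF-8 to see these bytes as one Chinese
character: 茶.
In Western mode, you will see three characters instead, one per byte
(in Windows-1252, for example: èŒ¶).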
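The worked example can be cross-checked with Python's built-in codecs. This is a small sketch, and it assumes that "Western mode" corresponds to the Windows-1252 code page (`cp1252`); a browser may use a different Western encoding, in which case the three characters shown would differ.

```python
# The UTF-16BE bytes of the character from the example.
utf16be = bytes.fromhex("8336")

# Decode them to a character, then re-encode that character as UTF-8.
char = utf16be.decode("utf-16-be")
utf8 = char.encode("utf-8")
print(utf8.hex())  # e88cb6

# The same three bytes, interpreted in two different ways.
as_utf8 = utf8.decode("utf-8")       # one Chinese character
as_western = utf8.decode("cp1252")   # assumption: Western mode = Windows-1252
print(len(as_utf8), len(as_western))  # 1 3
```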
Source of Information
Francois Yergeau. 1998. UTF-8, a transformation format of
ISO 10646.
RFC 2279.