U+18000 "𘀀" Tangut Ideograph-# Unicode Character
Unicode Version 17.0
U+18000 "𘀀" Tangut Ideograph-# is the very first encoded glyph in the Tangut script block, representing a logogram from the extinct Tangut language used by the Tangut Empire during the Western Xia dynasty in medieval China. This specific ideograph, like all Tangut characters, is highly complex in structure, typically comprising dozens of strokes that combine semantic and phonetic components. Its inclusion in Unicode, beginning with version 9.0 in 2016, was a monumental achievement that required extensive scholarly collaboration to decipher and standardize thousands of characters from fragments of historical manuscripts and dictionaries recovered from sites like Khara-Khoto. As a result, this character now serves as a digital gateway for researchers and linguists to study, preserve, and revitalize access to a once lost writing system that encodes the unique history and culture of a vanished civilization.
General Properties
Encodings
| HTML Decimal Encoding |
𘀀 |
| HTML Hex Encoding |
𘀀 |
| UTF-8 Encoding |
0xF0 0x98 0x80 0x80 |
| UTF-16 Encoding |
0xD820 0xDC00 |
| UTF-32 Encoding |
0x00018000 |
| C/C++/Java Escape |
\ud820\udc00 |
Unicode Properties