Table of Contents for The Unicode Standard, Version 4.0
Unicode Consortium Members and Directors
ix
Current Associate Members
ix
Current Liaison Members
ix
Current Specialist Members
x
Current Individual Members
x
Current Members of the Board of Directors
x
Former Members of the Board of Directors
x
About the Unicode Standard
xxxi
Concepts, Architecture, Conformance, and Guidelines
xxxi
Character Block Descriptions
xxxii
Charts and Han Radical-Stroke Index
xxxiii
The Unicode Character Database and Technical Reports
xxxiii
Notational Conventions
xxxiv
Unicode Anonymous FTP Site
xxxvii
Unicode E-mail Discussion List
xxxvii
How to Contact the Unicode Consortium
xxxviii
Interpreting Characters
5
1
The Unicode Standard and ISO/IEC 10646
5
1
The Unicode Technical Committee
6
1
Submitting New Characters
7
4
Text Elements, Characters, and Text Processes
12
1
Text Processes and Encoding
13
1
Unicode Design Principles
14
9
Characters, Not Glyphs
15
2
Compatibility Characters
23
1
Compatibility Characters
23
1
Compatibility Decomposable Characters
23
1
Mapping Compatibility Characters
23
1
Code Points and Characters
24
2
Comparison of the Advantages of UTF-32, UTF-16, and UTF-8
31
1
Allocation Areas and Character Blocks
36
1
Assignment of Code Points
42
1
Sequence of Base Characters and Diacritics
44
1
Multiple Combining Characters
44
2
Ligated Multiple Base Characters
46
1
Spacing Clones of European Diacritical Marks
46
1
"Characters" and Grapheme Clusters
46
1
Special Characters and Noncharacters
47
1
Special Noncharacter Code Points
48
1
Layout and Format Control Characters
48
1
The Replacement Character
48
1
Conforming to the Unicode Standard
48
2
Versions of the Unicode Standard
55
3
Errata, Corrigenda, and Future Updates
56
1
References to the Unicode Standard
57
1
References to Unicode Character Properties
57
1
References to Unicode Algorithms
57
1
Conformance Requirements
58
5
Unassigned Code Points
58
1
Character Encoding Forms
60
1
Character Encoding Schemes
61
1
Default Casing Operations
62
1
Unicode Standard Annexes
62
1
Character Identity and Semantics
63
1
Characters and Encoding
64
2
Normative and Informative Properties
66
2
Simple and Derived Properties
68
1
Default Property Values
69
1
Compatibility Decomposition
71
1
Canonical Decomposition
72
1
Unicode Encoding Forms
73
5
Encoding Form Conversion
78
1
Unicode Encoding Schemes
78
4
Canonical Ordering Behavior
82
3
Application of Combining Marks
83
1
Canonical Ordering and Collation
85
1
Conjoining Jamo Behavior
85
4
Hangul Syllable Boundaries
86
1
Standard Korean Syllables
86
1
Hangul Syllable Composition
87
1
Hangul Syllable Decomposition
88
1
Default Case Operations
89
6
Case Conversion of Strings
90
1
Case Detection for Strings
90
1
Unicode Character Database
96
1
Combining Classes---Normative
97
1
Directionality---Normative
98
1
General Category---Normative
98
2
Numeric Value---Normative
100
1
Ideographic Numeric Values
100
1
Bidi Mirrored---Normative
101
1
Letters, Alphabetic, and Ideographic
102
1
Characters with Unusual Properties
103
4
Implementation Guidelines
107
40
Transcoding to Other Standards
107
2
Unknown and Missing Characters
110
1
Reserved and Private-Use Character Codes
110
1
Interpretable but Unrenderable Characters
110
1
Default Property Values
110
1
Default Ignorable Code Points
111
1
Interacting with Downlevel Systems
111
1
Handling Surrogate Pairs in UTF-16
111
3
Language Information in Plain Text
119
2
Requirements for Language Tagging
119
1
Language Tags and Han Unification
120
1
Editing and Selection
121
1
Consistent Text Elements
121
1
Strategies for Handling Nonspacing Marks
122
3
Rendering Nonspacing Marks
125
5
Canonical Equivalence
127
1
Locating Text Element Boundaries
130
1
Property-Based Identifier Syntax
130
1
Alternative Recommendation
132
1
Sorting and Searching
132
3
Culturally Expected Sorting and Searching
133
1
Language-Insensitive Sorting
133
1
UTF-8 in UTF-16 Order
135
1
UTF-16 in UTF-8 Order
136
1
Complications for Case Mapping
137
1
Default Ignorable Code Points
142
5
Writing Systems and Punctuation
147
18
Punctuation: U+0020--U+00BF
152
2
General Punctuation: U+2000--U+206F
154
6
CJK Symbols and Punctuation: U+3000--U+303F
160
1
CJK Compatibility Forms: U+FE30--U+FE4F
161
1
Small Form Variants: U+FE50--U+FE6F
162
3
European Alphabetic Scripts
165
26
Letters of Basic Latin: U+0041--U+007A
166
1
Letters of the Latin-1 Supplement: U+00C0--U+00FF
167
1
Latin Extended-A: U+0100--U+017F
167
2
Latin Extended-B: U+0180--U+024F
169
1
IPA Extensions: U+0250--U+02AF
170
1
Phonetic Extensions: U+1D00--U+1D6A
171
1
Latin Extended Additional: U+1E00--U+1EFF
172
1
Latin Ligatures: U+FB00--U+FB06
173
1
Greek: U+0370--U+03FF
174
3
Greek Extended: U+1F00--U+1FFF
177
2
Cyrillic: U+0400--U+04FF
179
1
Cyrillic Supplement: U+0500--U+052F
179
1
Armenian: U+0530--U+058F
180
2
Georgian: U+10A0--U+10FF
182
2
Spacing Modifier Letters: U+02B0--U+02FF
184
2
Combining Diacritical Marks: U+0300--U+036F
186
2
Combining Marks for Symbols: U+20D0--U+20FF
188
1
Combining Half Marks: U+FE20--U+FE2F
188
3
Middle Eastern Scripts
191
26
Hebrew: U+0590--U+05FF
192
2
Alphabetic Presentation Forms: U+FB1D--U+FB4F
194
1
Arabic: U+0600--U+06FF
195
4
Arabic Presentation Forms-A: U+FB50--U+FDFF
204
1
Arabic Presentation Forms-B: U+FE70--U+FEFF
205
1
Syriac: U+0700--U+074F
206
4
Syriac Cursive Joining
210
2
Thaana: U+0780--U+07BF
213
4
Devanagari: U+0900--U+097F
219
13
Bengali: U+0980--U+09FF
232
2
Gurmukhi: U+0A00--U+0A7F
234
2
Gujarati: U+0A80--U+0AFF
236
1
Oriya: U+0B00--U+0B7F
237
2
Tamil: U+0B80--U+0BFF
239
5
Telugu: U+0C00--U+0C7F
244
1
Kannada: U+0C80--U+0CFF
245
3
Malayalam: U+0D00--U+0D7F
248
2
Sinhala: U+0D80--U+0DFF
250
1
Tibetan: U+0F00--U+0FFF
251
9
Limbu: U+1900--U+194F
260
5
Southeast Asian Scripts
265
26
Myanmar: U+1000--U+109F
271
3
Khmer: U+1780--U+17FF
274
9
Khmer Symbols: U+19E0--U+19FF
283
1
Tai Le: U+1950--U+197F
284
2
Tagalog: U+1700--U+171F
286
1
Hanunoo: U+1720--U+173F
286
1
Buhid: U+1740--U+175F
286
1
Tagbanwa: U+1760--U+177F
286
5
CJK Unified Ideographs
293
11
CJK Unified Ideographs Ext. B: U+20000--U+2A6D6
304
1
CJK Compatibility Ideographs: U+F900--U+FAFF
305
1
CJK Compatibility Supplement: U+2F800--U+2FA1D
305
1
Kanbun: U+3190--U+319F
305
1
CJK and KangXi Radicals: U+2E80--U+2FD5
306
1
Ideographic Description: U+2FF0--U+2FFB
307
3
Bopomofo: U+3100--U+312F
310
2
Hiragana and Katakana
312
2
Hiragana: U+3040--U+309F
312
1
Katakana: U+30A0--U+30FF
312
1
Katakana Phonetic Extensions: U+31F0--U+31FF
313
1
Halfwidth and Fullwidth Forms: U+FF00--U+FFEF
313
1
Hangul Jamo: U+1100--U+11FF
314
1
Hangul Compatibility Jamo: U+3130--U+318F
314
1
Hangul Syllables: U+AC00--U+D7A3
315
2
Additional Modern Scripts
321
16
Ethiopic: U+1200--U+137F
322
3
Mongolian: U+1800--U+18AF
325
4
Osmanya: U+10480--U+104AF
329
1
Cherokee: U+13A0--U+13FF
330
1
Canadian Aboriginal Syllabics
331
1
Canadian Aboriginal Syllabics: U+1400--U+167F
331
1
Deseret: U+10400--U+1044F
332
2
Shavian: U+10450--U+1047F
334
3
Ogham: U+1680--U+169F
338
1
Old Italic: U+10300--U+1032F
339
2
Runic: U+16A0--U+16F0
341
2
Gothic: U+10330--U+1034F
343
1
Ugaritic: U+10380--U+1039F
344
1
Linear B Syllabary: U+10000--U+1007F
345
1
Linear B Ideograms: U+10080--U+100FF
345
1
Aegean Numbers: U+10100--U+1013F
345
1
Cypriot Syllabary: U+10800--U+1083F
346
3
Currency Symbols: U+20A0--U+20CF
351
2
Letterlike Symbols: U+2100--U+214F
353
1
Math Alphanumeric Symbols: U+1D400--U+1D7FF
354
1
Mathematical Alphabets
354
2
Fonts Used for Mathematical Alphabets
356
2
Number Forms: U+2150--U+218F
358
1
Superscripts and Subscripts: U+2070--U+209F
359
1
Mathematical Operators: U+2200--U+22FF
360
2
Supplements to Mathematical Symbols and Arrows
362
1
Supplemental Math Operators: U+2A00--U+2AFF
362
1
Miscellaneous Math Symbols-A: U+27C0--U+27EF
362
1
Miscellaneous Math Symbols-B: U+2980--U+29FF
362
1
Arrows: U+2190--U+21FF
363
1
Standardized Variants of Mathematical Symbols
363
2
Control Pictures: U+2400--U+243F
365
1
Miscellaneous Technical: U+2300--U+23FF
365
2
Optical Character Recognition: U+2440--U+245F
367
1
Box Drawing: U+2500--U+257F
368
1
Block Elements: U+2580--U+259F
368
1
Geometric Shapes: U+25A0--U+25FF
368
2
Miscellaneous Symbols and Dingbats
370
3
Miscellaneous Symbols: U+2600--U+26FF
370
1
Dingbats: U+2700--U+27BF
371
1
Yijing Hexagram Symbols: U+4DC0--U+4DFF
372
1
Tai Xuan Jing Symbols: U+1D300--U+1D356
372
1
Enclosed Alphanumerics: U+2460--U+24FF
373
1
Enclosed CJK Letters and Months: U+3200--U+32FF
373
1
CJK Compatibility: U+3300--U+33FF
373
1
Braille Patterns: U+2800--U+28FF
374
2
Byzantine Musical Symbols
376
1
Byzantine Musical Symbols: U+1D000--U+1D0FF
376
1
Western Musical Symbols
377
6
Musical Symbols: U+1D100--U+1D1FF
377
6
Special Areas and Format Characters
383
30
Deprecated Format Characters
394
2
Deprecated Format Characters: U+206A--U+206F
394
2
Surrogates Area: U+D800--U+DFFF
396
1
Private-Use Characters
398
2
Private Use Area: U+E000--U+F8FF
398
1
Supplementary Private Use Areas
399
1
Noncharacters: U+FFFE, U+FFFF, and Others
400
1
Specials: U+FEFF, U+FFF0--U+FFFD
401
4
Tag Characters: U+E0000--U+E007F
405
8
Images in the Code Charts and Character Lists
414
1
Information About Languages
415
1
CJK Unified Ideographs
417
1
Han Radical-Stroke Index
1189
152
Han Unification History
1341
2
Abstracts of Unicode Technical Reports
1343
4
Unicode Standard Annexes
1343
1
UAX #9: The Bidirectional Algorithm
1343
1
UAX #11: East Asian Width
1343
1
UAX #14: Line Breaking Properties
1343
1
UAX #15: Unicode Normalization Forms
1344
1
UAX #24: Script Names
1344
1
UAX #29: Text Boundaries
1344
1
Unicode Technical Standards
1344
1
UTS #6: A Standard Compression Scheme for Unicode
1344
1
UTS #10: Unicode Collation Algorithm
1344
1
Unicode Technical Reports
1344
1
UTR #17: Character Encoding Model
1344
1
UTR #18: Unicode Regular Expression Guidelines
1344
1
UTR #20: Unicode in XML and Other Markup Languages
1345
1
UTR #22: Character Mapping Markup Language (CharMapML)
1345
1
UTR #26: Compatibility Encoding Scheme for UTF-16: 8-Bit (CESU-8)
1345
1
Other Unicode References
1345
2
Unicode Technical Notes
1345
1
FAQ (Frequently Asked Questions)
1346
1
Where Is My Character?
1346
1
Relationship to ISO/IEC 10646
1347
8
Encoding Forms in ISO/IEC 10646
1350
1
UCS Transformation Formats
1351
1
Synchronization of the Standards
1352
1
Identification of Features for the Unicode Standard
1352
1
Character Functional Specifications
1353
2
Changes from Unicode Version 3.0
1355
8
Versions of the Unicode Standard
1355
2
Changes from Unicode Version 3.0 to Version 3.1
1357
1
New Characters Added
1357
1
Unicode Character Database Changes
1357
1
Changes Affecting Conformance
1357
1
Unicode Standard Annexes
1358
1
Changes from Unicode Version 3.1 to Version 3.2
1358
1
New Characters Added
1358
1
Unicode Character Database Changes
1358
1
Changes Affecting Conformance
1358
1
Unicode Standard Annexes
1359
1
Changes from Unicode Version 3.2 to Version 4.0
1359
4
New Characters Added
1359
1
Unicode Character Database Changes
1359
1
Changes Affecting Conformance
1360
1
Unicode Standard Annexes
1360
1
Source Standards and Specifications
1385
6
Source Dictionaries for Han Unification
1391
1
Other Sources for the Unicode Standard
1391
6
Selected Resources: Technical
1397
3
Selected Resources: Scripts and Languages
1400
7
Unicode Names Index
1407
42