About Character Sets

Important:
This is retired content. This content is outdated and is no longer being maintained. It is provided as a courtesy for individuals who are still using these technologies. This content may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist.

4/8/2010

A character set is a group of characters from a given language. For example, the ASCII character set is the standard United States-English character set. MLang provides a number of APIs to help you use multiple character sets, including APIs that perform conversions to Unicode and font linking.
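As a rough illustration of what "conversion to Unicode" means, the sketch below decodes bytes from a few non-Unicode encodings into Unicode strings. It uses Python's built-in codecs rather than MLang, so the helper names `to_unicode` and `code_points` are hypothetical and do not correspond to any MLang method.

```python
def to_unicode(raw, encoding):
    """Decode bytes stored in the named encoding into a Unicode string."""
    return raw.decode(encoding)


def code_points(text):
    """List the Unicode code points of a string in U+XXXX notation."""
    return ["U+%04X" % ord(ch) for ch in text]


# The same idea applied to three source encodings (chosen for illustration):
samples = [
    (b"A", "ascii"),             # single-byte ASCII
    (b"\xe9", "cp1252"),         # single-byte Windows Western European
    (b"\x82\xa0", "shift_jis"),  # two-byte Japanese encoding
]

for raw, encoding in samples:
    text = to_unicode(raw, encoding)
    print(encoding, raw.hex(), code_points(text))
```

Each input byte sequence only has meaning once its encoding is known, which is why a conversion routine needs the source encoding as a parameter.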

The following terms and definitions pertain to character sets, and will help you better understand MLang methods:

Encoding

A mapping of a character to a sequence of bits. All encodings except Unicode are called multibyte encodings.
Charset

The application of an encoding for each character in a character set. In other words, it is a character set in which every character has been assigned an encoding-unique numeric value.
Code page

A unique physical implementation of a charset. In the MLang API, a code page is usually identified by a DWORD. Each bit in the DWORD represents a specific code page. When a bit is set to 1, its corresponding code page is considered a member of the set; when the bit is set to 0, its code page is not a member. Thus, the DWORD 0x1e0000 represents the code pages corresponding to the bits 0x100000, 0x80000, 0x40000, and 0x20000.
Font linking

The process of creating customized fonts that can display text in characters from a variety of different languages. This functionality is especially useful when dealing with Unicode strings, which can contain characters from many character sets at once.
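
The bit arithmetic in the Code page definition can be sketched as follows. This is a minimal illustration in Python, not MLang code: the function names `split_code_page_mask` and `has_code_page` are hypothetical, and the sketch assumes only that a DWORD is an unsigned 32-bit value in which each set bit marks one code page as a member.

```python
def split_code_page_mask(mask):
    """Return the single-bit flags set in a 32-bit mask, highest bit first."""
    if not 0 <= mask <= 0xFFFFFFFF:
        raise ValueError("mask must fit in an unsigned 32-bit DWORD")
    return [1 << bit for bit in range(31, -1, -1) if mask & (1 << bit)]


def has_code_page(mask, flag):
    """Report whether every bit of the given flag is set in the mask."""
    return (mask & flag) == flag


# The example value from the definition above.
flags = split_code_page_mask(0x1E0000)
print([hex(f) for f in flags])  # ['0x100000', '0x80000', '0x40000', '0x20000']
print(has_code_page(0x1E0000, 0x40000))  # True
print(has_code_page(0x1E0000, 0x10000))  # False
```

Which code page each bit stands for is defined by the platform and is not reproduced here; the sketch shows only how membership in the set is tested.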
