In addition to the specific escape sequences for important control characters, Emacs provides several types of escape syntax that you can use to specify non-ASCII text characters.
?\N{NAME}
represents the Unicode character named NAME. Thus, ‘?\N{LATIN SMALL LETTER A WITH GRAVE}’ is equivalent to ?à and denotes the Unicode character U+00E0. To simplify entering multi-line strings, you can replace spaces in the names by non-empty sequences of whitespace (e.g., newlines).
?\N{U+X}
represents a character with Unicode code point X, where X is a hexadecimal number. Also, ?\uxxxx and ?\Uxxxxxxxx represent code points xxxx and xxxxxxxx, respectively, where each x is a single hexadecimal digit. For example, ?\N{U+E0}, ?\u00e0 and ?\U000000E0 are all equivalent to ?à and to ‘?\N{LATIN SMALL LETTER A WITH GRAVE}’. The Unicode Standard defines code points only up to ‘U+10ffff’, so if you specify a code point higher than that, Emacs signals an error.
?\xe0
is the character à (a with grave accent).
You can use any number of hex digits, so you can represent any
character code in this way.
?\002
represents the character C-b, written as a backslash followed by up to three octal digits. Only characters up to octal code 777 can be specified this way.
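As a rough illustration of the forms listed above, the following sketch compares the different spellings of à as character constants. It assumes the file is read as UTF-8 so that the literal ?à is decoded correctly; the symbol `rejected` is only a placeholder chosen for this example.

```elisp
;; All of these read as the same character code, 224 (#xe0).
(= ?\N{LATIN SMALL LETTER A WITH GRAVE}
   ?\N{U+E0}
   ?\u00e0
   ?\U000000E0
   ?\xe0
   ?\340                                ; octal 340 is decimal 224
   ?à)
;; ⇒ t

;; The octal escape ?\002 is the control character C-b, code 2.
(= ?\002 ?\C-b 2)
;; ⇒ t

;; A code point above U+10FFFF is rejected by the reader.  Reading the
;; text from a string lets the error be caught at run time instead of
;; when this file itself is read.  Per the description above, this
;; form should take the error branch and return the placeholder symbol.
(condition-case nil
    (read "?\\U00110000")
  (error 'rejected))
```

Since a character constant is just an integer to Lisp, `=` is enough for the comparison; no string or character-set functions are needed.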
These escape sequences may also be used in strings. See Non-ASCII in Strings.
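A minimal sketch of the same escapes inside string constants follows. It uses only the Unicode-style forms (\N{...} and \u), since hexadecimal and octal escapes in strings have their own rules that are covered in the section referenced above; the sample word "voilà" is just an illustration.

```elisp
;; The named form and the \u form produce the same one-character string.
(string= "\N{LATIN SMALL LETTER A WITH GRAVE}" "\u00e0")
;; ⇒ t

;; An escape can appear in the middle or at the end of ordinary text.
(string= "voil\u00e0" "voilà")
;; ⇒ t

;; Each escape contributes exactly one character to the string.
(length "\N{U+E0}")
;; ⇒ 1
```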