+<sect1 id="charset-handling">
+<title>Character Set Handling</title>
+
+<para>
+A <quote>character set</quote> is basically a mapping between bytes and
+glyphs and implies a certain character encoding scheme. For example, the
+ISO 8859 family of character sets uses an encoding of 8 bits per
+character. For the Unicode character set, different character encodings
+may be used, UTF-8 being the most popular. In UTF-8, a character is
+represented using a variable number of bytes ranging from 1 to 4.
+</para>
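+
+<para>
+As a small illustration of this variable length (a sketch assuming a
+POSIX shell with <literal>printf</literal> and <literal>wc</literal>
+available), the following commands count the bytes of the UTF-8 encoded
+forms of the letter a (U+0061), a with diaeresis (U+00E4), and the euro
+sign (U+20AC). The bytes are given as octal escapes so that the example
+does not depend on the locale of the terminal it is typed into.
+</para>
+
+<screen>
+$ printf 'a' | wc -c             # 1 byte  (61)
+$ printf '\303\244' | wc -c      # 2 bytes (C3 A4)
+$ printf '\342\202\254' | wc -c  # 3 bytes (E2 82 AC)
+</screen>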
+
+<para>
+Since Mutt is a command-line tool run from a shell and delegates
+certain tasks to external tools (such as an editor for composing/editing
+messages), all of these tools need to agree on a character set and
+encoding. There is no reliable way to deduce the character set of a
+plain text file, so interoperability is achieved through well-defined
+environment variables. The full set of these variables can be printed by
+issuing <literal>locale</literal> on the command line.
+</para>
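+
+<para>
+For instance, on a system set up for US English with UTF-8, the output
+might look roughly like the abbreviated listing below. The locale name
+<literal>en_US.UTF-8</literal> is only an assumed example; the actual
+names and values depend on the system. <literal>locale charmap</literal>
+prints just the character encoding of the current locale.
+</para>
+
+<screen>
+$ locale
+LANG=en_US.UTF-8
+LC_CTYPE="en_US.UTF-8"
+LC_COLLATE="en_US.UTF-8"
+LC_MESSAGES="en_US.UTF-8"
+LC_ALL=
+$ locale charmap
+UTF-8
+</screen>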
+
+<para>
+Upon startup, Mutt determines the character set on its own using
+routines that inspect locale-specific environment variables. Therefore,
+it is generally not necessary to set the <literal>$charset</literal>
+variable in Mutt. It may even be counter-productive as Mutt uses system
+and library functions that derive the character set themselves and on
+which Mutt has no influence. It's safest to let Mutt work out the locale
+setup itself.
+</para>
+
+<para>
+If you happen to work with several character sets on a regular basis,
+it's highly advisable to use Unicode and a UTF-8 locale. Unicode can
+represent nearly all characters in a message at the same time. When not
+using a Unicode locale, it may happen that you receive messages with
+characters not representable in your locale. When displaying such a
+message, or replying to or forwarding it, information may get lost,
+possibly rendering the message unusable, not only for you but also for
+the recipient. This breakage is not reversible because lost information
+cannot be guessed.
+</para>
+
+<para>
+A Unicode locale makes all conversions superfluous, which eliminates the
+risk of conversion errors. It also eliminates potentially wrong
+expectations about the character set between Mutt and external programs.
+</para>
+
+<para>
+The terminal emulator used must also be properly configured for the
+current locale. Terminal emulators usually do <emphasis>not</emphasis>
+derive the locale from environment variables; they need to be configured
+separately. If the terminal is incorrectly configured, Mutt may display
+random and unexpected characters (question marks, octal codes, or just
+random glyphs), format strings may not work as expected, you may not be
+able to enter non-ASCII characters, and possibly more. A correct setup
+is very important because data is always represented using bytes, and
+to the machine all character sets <quote>look</quote> the same.
+</para>
+
+<para>
+Warning: A mismatch between what system and library functions think the
+locale is and what Mutt was told the locale is may make Mutt behave
+badly with non-ASCII input: it will fail at seemingly random places.
+This warning is to be taken seriously since not only local mail handling
+may suffer: sent messages may carry wrong character set information that
+the <emphasis>receiver</emphasis> has to deal with. The need to set
+<literal>$charset</literal> directly in most cases points at terminal
+and environment variable setup problems, not Mutt problems.
+</para>
+
+<para>
+A list of officially assigned and known character sets can be found at
+<ulink url="http://www.iana.org/assignments/character-sets">IANA</ulink>;
+a list of locally supported locales can be obtained by running
+<literal>locale -a</literal>.
+</para>
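+
+<para>
+One possible way to check for and select a UTF-8 locale from a
+Bourne-style shell is sketched below. The locale name
+<literal>en_US.UTF-8</literal> is an assumption for illustration; use a
+name that <literal>locale -a</literal> actually lists on your system,
+and configure the terminal emulator to match.
+</para>
+
+<screen>
+$ locale -a | grep -i utf        # list installed UTF-8 locales, if any
+$ export LC_CTYPE=en_US.UTF-8    # assumed name; pick one from the list
+$ locale charmap                 # verify the resulting encoding
+</screen>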
+
+</sect1>
+