//
//  Copyright (c) 2009-2011 Artyom Beilis (Tonkikh)
//
//  Distributed under the Boost Software License, Version 1.0. (See
//  accompanying file LICENSE_1_0.txt or copy at
//  http://www.boost.org/LICENSE_1_0.txt)
//

// vim: tabstop=4 expandtab shiftwidth=4 softtabstop=4 filetype=cpp.doxygen
/*!
\page std_locales Introduction to C++ Standard Library localization support

\section std_locales_basics Getting familiar with standard C++ Locales

The C++ standard library offers a simple and powerful way to provide locale-specific information. It is done via the \c 
std::locale class, the container that holds all the required information about a specific culture, such as number formatting
patterns, date and time formatting, currency, case conversion etc.

All this information is provided by facets, special classes derived from the \c std::locale::facet base class. Such facets are
packed into the \c std::locale class and allow you to provide arbitrary information about the locale. The \c std::locale class
keeps reference counters on installed facets and can be efficiently copied.

Each facet that was installed into the \c std::locale object can be fetched using the \c std::use_facet function. For example,
the \c std::ctype<Char> facet provides rules for case conversion, so you can convert a character to upper-case like this:

\code
std::ctype<char> const &ctype_facet = std::use_facet<std::ctype<char> >(some_locale);
char upper_a = ctype_facet.toupper('a');
\endcode
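
As a self-contained sketch of the same lookup, the helper below fetches \c std::ctype<char> from the classic "C" locale,
which is always available, and uses it to convert a single character. The name \c to_upper_classic is hypothetical and
exists only for this illustration.

```cpp
#include <locale>

// Hypothetical helper: upper-case one character using the std::ctype<char>
// facet installed in the classic "C" locale.
char to_upper_classic(char c)
{
    std::locale const &loc = std::locale::classic();
    std::ctype<char> const &ctype_facet = std::use_facet<std::ctype<char> >(loc);
    return ctype_facet.toupper(c);
}
```

Calling \c to_upper_classic('a') returns \c 'A'. Characters without an upper-case form, such as digits, are returned unchanged.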

A locale object can be imbued into an \c iostream so that the stream formats its output according to that locale:

\code
std::cout.imbue(std::locale("en_US.UTF-8"));
std::cout << 1345.45 << std::endl;
std::cout.imbue(std::locale("ru_RU.UTF-8"));
std::cout << 1345.45 << std::endl;
\endcode

Assuming both locales are installed and implemented correctly, this would display:

\verbatim
    1,345.45
    1 345,45
\endverbatim
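
Named locales such as \c en_US.UTF-8 may not be installed on every system, so the following is a minimal sketch that shows
the same effect without them. It assumes a hypothetical facet \c simple_numpunct derived from \c std::numpunct<char> and a
hypothetical helper \c format_number; neither is part of the standard library or of Boost.Locale.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: groups digits by three and lets the caller choose
// the thousands separator and the decimal point.
class simple_numpunct : public std::numpunct<char> {
public:
    simple_numpunct(char thousands, char decimal)
        : thousands_(thousands), decimal_(decimal) {}
protected:
    virtual char do_thousands_sep() const { return thousands_; }
    virtual char do_decimal_point() const { return decimal_; }
    virtual std::string do_grouping() const { return "\3"; }
private:
    char thousands_;
    char decimal_;
};

// Formats a number through a stream imbued with the classic locale
// plus the custom punctuation facet.
std::string format_number(double value, char thousands, char decimal)
{
    std::ostringstream out;
    out.imbue(std::locale(std::locale::classic(),
                          new simple_numpunct(thousands, decimal)));
    out << value;
    return out.str();
}
```

With these definitions, \c format_number(1345.45, ',', '.') yields "1,345.45" and \c format_number(1345.45, '.', ',') yields
"1.345,45". The stream itself is unchanged; only the imbued locale differs.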

You can also create your own facets and install them into existing locale objects. For example:

\code
    class measure : public std::locale::facet {
    public:
        typedef enum { inches, ... } measure_type;
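        // Every facet type needs a static id member so that std::use_facet can find it
        static std::locale::id id;
        measure(measure_type m,size_t refs=0);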
        double from_metric(double value) const;
        std::string name() const;
        ...
    };
\endcode
And now you can simply provide this information to a locale:

\code
    // Create a locale based on en_US.UTF-8 with the measure facet added
    // and install it as the global locale.
    std::locale::global(std::locale(std::locale("en_US.UTF-8"),new measure(measure::inches)));
\endcode


Now you can print a distance according to the correct locale:

\code
    void print_distance(std::ostream &out,double value)
    {
        // Fetch the measure facet from the stream's locale
        measure const &m = std::use_facet<measure>(out.getloc());
        out << m.from_metric(value) << " " << m.name();
    }
\endcode
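
The pieces above can be combined into one compilable sketch. The member bodies are assumptions made for illustration only:
the input is taken to be in centimeters, the enumeration has just two values, and the unit names "cm" and "in" as well as
the helper \c distance_in are hypothetical.

```cpp
#include <cstddef>
#include <locale>
#include <ostream>
#include <sstream>
#include <string>

class measure : public std::locale::facet {
public:
    enum measure_type { centimeters, inches };
    // Required by std::use_facet and std::has_facet to identify the facet.
    static std::locale::id id;

    explicit measure(measure_type m, std::size_t refs = 0)
        : std::locale::facet(refs), type_(m) {}

    // Assumption: the metric input is given in centimeters.
    double from_metric(double cm) const
    {
        return type_ == inches ? cm / 2.54 : cm;
    }
    std::string name() const
    {
        return type_ == inches ? "in" : "cm";
    }
private:
    measure_type type_;
};

std::locale::id measure::id;

void print_distance(std::ostream &out, double cm)
{
    // Fetch the measure facet from the stream's locale
    measure const &m = std::use_facet<measure>(out.getloc());
    out << m.from_metric(cm) << " " << m.name();
}

// Hypothetical helper: prints a distance into a string using a locale
// that carries the requested measure facet.
std::string distance_in(measure::measure_type unit, double cm)
{
    std::ostringstream out;
    out.imbue(std::locale(std::locale::classic(), new measure(unit)));
    print_distance(out, cm);
    return out.str();
}
```

Here \c distance_in(measure::inches, 25.4) produces "10 in", while \c distance_in(measure::centimeters, 25.4) produces
"25.4 cm". The locale takes ownership of the facet created with \c new, so no explicit \c delete is needed.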

This technique was adopted by the Boost.Locale library in order to provide powerful and correct localization. Instead of using
the very limited C++ standard library facets, it uses ICU under the hood to create its own much more powerful ones.

\section std_locales_common Common Critical Problems with the Standard Library

Several issues in the standard library prevent the use of its full power:

-   Setting the global locale has bad side effects.
    \n
    Consider the following code:
    \n
    \code
        int main()
        {
            // Set the system's default locale as global
            std::locale::global(std::locale(""));
            std::ofstream csv("test.csv");
            csv << 1.1 << ","  << 1.3 << std::endl;
        }
    \endcode
    \n
    What would be the content of \c test.csv ? Depending on the system locale, it may be "1.1,1.3"
    as you expected, or it may be "1,1,1,3".
    \n
    Moreover, it even affects \c printf and libraries like \c boost::lexical_cast, producing
    incorrect or unexpected formatting. In fact, many third-party libraries break in such a
    situation.
    \n
    Unlike the standard localization library, Boost.Locale never changes the basic number formatting,
    even when it uses \c std based localization backends, so by default, numbers are always
    formatted using the C-style locale. Localized number formatting requires specific flags.
    \n
-   Number formatting is broken on some locales.
    \n
    Some locales use the non-breaking space character U+00A0 as the thousands separator, so
    in the \c ru_RU.UTF-8 locale the number 1024 should be displayed as "1 024", where the space
    is the Unicode character with code point U+00A0. Unfortunately, many libraries don't handle
    this correctly. For example, GCC and SunStudio output only "\xC2", the first byte of the
    UTF-8 sequence "\xC2\xA0" that represents this code point, and thereby
    generate invalid UTF-8.
    \n
-   Locale names are not standardized. For example, under MSVC you need to provide the name
    \c en-US or \c English_USA.1252 , whereas on POSIX platforms it would be \c en_US.UTF-8
    or \c en_US.ISO-8859-1 .
    \n
    Moreover, MSVC does not support UTF-8 locales at all.
    \n
-   Many standard libraries provide only the C and POSIX locales. For example, GCC supports
    localization only under Linux. On all other platforms, attempting to create a locale other
    than "C" or "POSIX" fails.

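The first problem in the list, the global locale leaking into machine-readable output, can be reproduced and worked around
with the standard library alone. The following is a minimal sketch, not a recommendation from Boost.Locale: it assumes a
hypothetical facet \c comma_decimal that imitates a locale with a comma as the decimal point, and the helpers
\c csv_row_default , \c csv_row_classic and \c compare_rows are likewise invented for this illustration.

```cpp
#include <locale>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical facet imitating a locale whose decimal point is a comma.
struct comma_decimal : std::numpunct<char> {
    virtual char do_decimal_point() const { return ','; }
};

// Formats a CSV row with the locale a new stream picks up by default,
// which is a copy of the current global locale.
std::string csv_row_default(double a, double b)
{
    std::ostringstream out;
    out << a << "," << b;
    return out.str();
}

// Formats a CSV row with the classic "C" locale imbued explicitly,
// so the global locale has no influence on the result.
std::string csv_row_classic(double a, double b)
{
    std::ostringstream out;
    out.imbue(std::locale::classic());
    out << a << "," << b;
    return out.str();
}

// Runs both helpers while a comma-decimal locale is global,
// then restores the previous global locale.
std::pair<std::string, std::string> compare_rows()
{
    std::locale previous = std::locale::global(
        std::locale(std::locale::classic(), new comma_decimal));
    std::pair<std::string, std::string> rows(csv_row_default(1.1, 1.3),
                                             csv_row_classic(1.1, 1.3));
    std::locale::global(previous);
    return rows;
}
```

The first element of the pair returned by \c compare_rows() is "1,1,1,3" and the second is "1.1,1.3". Imbuing
\c std::locale::classic() into every stream that writes machine-readable data keeps that data stable, but it has to be done
for each such stream.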
*/