<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/xhtml;charset=UTF-8"/>
<meta http-equiv="X-UA-Compatible" content="IE=9"/>
<meta name="generator" content="Doxygen 1.8.6"/>
<title>Boost.Locale: Introduction to C++ Standard Library localization support</title>
<link href="tabs.css" rel="stylesheet" type="text/css"/>
<script type="text/javascript" src="jquery.js"></script>
<script type="text/javascript" src="dynsections.js"></script>
<link href="navtree.css" rel="stylesheet" type="text/css"/>
<script type="text/javascript" src="resize.js"></script>
<script type="text/javascript" src="navtree.js"></script>
<script type="text/javascript">
$(document).ready(initResizable);
$(window).load(resizeHeight);
</script>
<link href="doxygen.css" rel="stylesheet" type="text/css" />
</head>
<body>
<div id="top"><!-- do not remove this div, it is closed by doxygen! -->
<div id="titlearea">
<table cellspacing="0" cellpadding="0">
<tbody>
<tr style="height: 56px;">
<td id="projectlogo"><img alt="Logo" src="boost-small.png"/></td>
<td style="padding-left: 0.5em;">
<div id="projectname">Boost.Locale
</div>
</td>
</tr>
</tbody>
</table>
</div>
<!-- end header part -->
<!-- Generated by Doxygen 1.8.6 -->
<div id="navrow1" class="tabs">
<ul class="tablist">
<li><a href="index.html"><span>Main&#160;Page</span></a></li>
<li class="current"><a href="pages.html"><span>Related&#160;Pages</span></a></li>
<li><a href="modules.html"><span>Modules</span></a></li>
<li><a href="namespaces.html"><span>Namespaces</span></a></li>
<li><a href="annotated.html"><span>Classes</span></a></li>
<li><a href="files.html"><span>Files</span></a></li>
<li><a href="examples.html"><span>Examples</span></a></li>
</ul>
</div>
</div><!-- top -->
<div id="side-nav" class="ui-resizable side-nav-resizable">
<div id="nav-tree">
<div id="nav-tree-contents">
<div id="nav-sync" class="sync"></div>
</div>
</div>
<div id="splitbar" style="-moz-user-select:none;"
class="ui-resizable-handle">
</div>
</div>
<script type="text/javascript">
$(document).ready(function(){initNavTree('std_locales.html','');});
</script>
<div id="doc-content">
<div class="header">
<div class="headertitle">
<div class="title">Introduction to C++ Standard Library localization support </div> </div>
</div><!--header-->
<div class="contents">
<div class="textblock"><h1><a class="anchor" id="std_locales_basics"></a>
Getting familiar with standard C++ Locales</h1>
<p>The C++ standard library offers a simple and powerful way to provide locale-specific information through the <code>std::locale</code> class, a container that holds the information required for a specific culture, such as number formatting patterns, date and time formatting, currency, and case conversion.</p>
<p>All this information is provided by facets, special classes derived from the <code>std::locale::facet</code> base class. Facets are packed into a <code>std::locale</code> object and can carry arbitrary information about the locale. The <code>std::locale</code> class keeps reference counts on its installed facets, so it can be copied efficiently.</p>
<p>Any facet installed in a <code>std::locale</code> object can be fetched with the <code>std::use_facet</code> function. For example, the <code>std::ctype&lt;Char&gt;</code> facet provides rules for case conversion, so you can convert a character to upper case like this:</p>
<div class="fragment"><div class="line">std::ctype&lt;char&gt; <span class="keyword">const</span> &amp;ctype_facet = std::use_facet&lt;std::ctype&lt;char&gt; &gt;(some_locale);</div>
<div class="line"><span class="keywordtype">char</span> upper_a = ctype_facet.toupper(<span class="charliteral">&#39;a&#39;</span>);</div>
</div><!-- fragment --><p>A locale object can be imbued into an <code>iostream</code> so that the stream formats its output according to that locale:</p>
<div class="fragment"><div class="line">std::cout.imbue(std::locale(<span class="stringliteral">&quot;en_US.UTF-8&quot;</span>));</div>
<div class="line">std::cout &lt;&lt; 1345.45 &lt;&lt; std::endl;</div>
<div class="line">std::cout.imbue(std::locale(<span class="stringliteral">&quot;ru_RU.UTF-8&quot;</span>));</div>
<div class="line">std::cout &lt;&lt; 1345.45 &lt;&lt; std::endl;</div>
</div><!-- fragment --><p>Depending on the platform's locale data, this would display something like:</p>
<pre class="fragment">1,345.45
1&#160;345,45
</pre><p>You can also create your own facets and combine them with an existing locale to build a new locale object. For example:</p>
<div class="fragment"><div class="line"><span class="keyword">class </span>measure : <span class="keyword">public</span> std::locale::facet {</div>
<div class="line"><span class="keyword">public</span>:</div>
<div class="line">    <span class="keyword">typedef</span> <span class="keyword">enum</span> { inches, ... } measure_type;</div>
<div class="line">    <span class="comment">// Every facet needs a static id so that std::use_facet can find it</span></div>
<div class="line">    <span class="keyword">static</span> std::locale::id id;</div>
<div class="line">    measure(measure_type m,<span class="keywordtype">size_t</span> refs=0);</div>
<div class="line">    double from_metric(<span class="keywordtype">double</span> value) const;</div>
<div class="line">    std::<span class="keywordtype">string</span> name() const;</div>
<div class="line">    ...</div>
<div class="line">};</div>
</div><!-- fragment --><p>You can then attach this facet to a locale:</p>
<div class="fragment"><div class="line"><span class="comment">// Build a locale from en_US.UTF-8 plus the measure facet and make it the global default</span></div>
<div class="line">std::locale::global(std::locale(std::locale(<span class="stringliteral">&quot;en_US.UTF-8&quot;</span>),<span class="keyword">new</span> measure(measure::inches)));</div>
</div><!-- fragment --><p>Now you can print a distance according to the locale of the stream. Streams created after the call above pick up the new global locale; a stream that already exists needs an explicit <code>imbue</code>:</p>
<div class="fragment"><div class="line"><span class="keywordtype">void</span> print_distance(std::ostream &amp;out,<span class="keywordtype">double</span> value)</div>
<div class="line">{</div>
<div class="line">    <span class="comment">// Fetch locale information from stream</span></div>
<div class="line">    measure <span class="keyword">const</span> &amp;m = std::use_facet&lt;measure&gt;(out.getloc());</div>
<div class="line">    out &lt;&lt; m.from_metric(value) &lt;&lt; <span class="stringliteral">&quot; &quot;</span> &lt;&lt; m.name();</div>
<div class="line">}</div>
</div><!-- fragment --><p>Boost.Locale adopts this technique to provide powerful and correct localization. Instead of relying on the very limited facets of the C++ standard library, it uses ICU under the hood to create its own, much more powerful ones.</p>
<h1><a class="anchor" id="std_locales_common"></a>
Common Critical Problems with the Standard Library</h1>
<p>Several issues in standard library implementations prevent you from using the full power of this design:</p>
<ul>
<li>Setting the global locale has unwanted side effects. <br/>
Consider the following code: <br/>
<div class="fragment"><div class="line"><span class="keywordtype">int</span> main()</div>
<div class="line">{</div>
<div class="line">    <span class="comment">// Set system&#39;s default locale as global</span></div>
<div class="line">    std::locale::global(std::locale(<span class="stringliteral">&quot;&quot;</span>));</div>
<div class="line">    std::ofstream csv(<span class="stringliteral">&quot;test.csv&quot;</span>);</div>
<div class="line">    csv &lt;&lt; 1.1 &lt;&lt; <span class="stringliteral">&quot;,&quot;</span> &lt;&lt; 1.3 &lt;&lt; std::endl;</div>
<div class="line">}</div>
</div><!-- fragment --> <br/>
What is the content of <code>test.csv</code>? It may be "1.1,1.3" as you expected, or, if the system locale uses a comma as the decimal separator, it may be "1,1,1,3". <br/>
Moreover, the global locale also affects <code>printf</code> and libraries such as <code>boost::lexical_cast</code>, producing incorrect or unexpected formatting. In fact, many third-party libraries break in this situation. <br/>
Unlike the standard localization library, Boost.Locale never changes the basic number formatting, even when it uses <code>std</code>-based localization backends. By default, numbers are always formatted according to the C locale; localized number formatting requires specific flags. <br/>
</li>
<li>Number formatting is broken in some locales. <br/>
Some locales use the non-breaking space character U+00A0 as the thousands separator. In the <code>ru_RU.UTF-8</code> locale, for example, the number 1024 should be displayed as "1 024", where the space is the Unicode character with code point U+00A0. Unfortunately, many libraries do not handle this correctly. For example, GCC and SunStudio emit only the byte "\xC2", the first byte of the UTF-8 sequence "\xC2\xA0" that encodes this code point, and so generate invalid UTF-8. <br/>
</li>
<li>Locale names are not standardized. For example, under MSVC you need to provide the name <code>en-US</code> or <code>English_USA.1252</code>, whereas on POSIX platforms it would be <code>en_US.UTF-8</code> or <code>en_US.ISO-8859-1</code>. <br/>
Moreover, at the time of writing, MSVC does not support UTF-8 locales at all. <br/>
</li>
<li>Many standard library implementations provide only the C and POSIX locales. GCC, for example, supports localization only under Linux; on all other platforms, attempting to create a locale other than "C" or "POSIX" fails. </li>
</ul>
</div></div><!-- contents -->
</div><!-- doc-content -->
<li class="footer">
&copy; Copyright 2009-2012 Artyom Beilis, Distributed under the <a href="http://www.boost.org/LICENSE_1_0.txt">Boost Software License</a>, Version 1.0.
</li>
</ul>
</div>
</body>
</html>