diff --git a/doc/source/_static/option_unicode01.png b/doc/source/_static/option_unicode01.png
new file mode 100644
index 0000000000000..d7168de126c5b
Binary files /dev/null and b/doc/source/_static/option_unicode01.png differ
diff --git a/doc/source/_static/option_unicode02.png b/doc/source/_static/option_unicode02.png
new file mode 100644
index 0000000000000..89e81e4f5f0ed
Binary files /dev/null and b/doc/source/_static/option_unicode02.png differ
diff --git a/doc/source/_static/option_unicode03.png b/doc/source/_static/option_unicode03.png
new file mode 100644
index 0000000000000..0b4ee876e17fe
Binary files /dev/null and b/doc/source/_static/option_unicode03.png differ
diff --git a/doc/source/_static/option_unicode04.png b/doc/source/_static/option_unicode04.png
new file mode 100644
index 0000000000000..1b839a44422b3
Binary files /dev/null and b/doc/source/_static/option_unicode04.png differ
diff --git a/doc/source/options.rst b/doc/source/options.rst
index 46ff2b6e5c343..bb78b29d7f205 100644
--- a/doc/source/options.rst
+++ b/doc/source/options.rst
@@ -454,10 +454,14 @@ Unicode Formatting
 Some East Asian countries use Unicode characters its width is corresponding to 2 alphabets.
 If DataFrame or Series contains these characters, default output cannot be aligned properly.
 
+.. note:: Screen captures are attached for each outputs to show the actual results.
+
 .. ipython:: python
 
    df = pd.DataFrame({u'国籍': ['UK', u'日本'], u'名前': ['Alice', u'しのぶ']})
-   df
+   df;
+
+.. image:: _static/option_unicode01.png
 
 Enable ``display.unicode.east_asian_width`` allows pandas to check each character's "East Asian Width" property.
 These characters can be aligned properly by checking this property, but it takes longer time than standard ``len`` function.
@@ -465,31 +469,32 @@ These characters can be aligned properly by checking this property, but it takes
 .. ipython:: python
 
    pd.set_option('display.unicode.east_asian_width', True)
-   df
+   df;
+
+.. image:: _static/option_unicode02.png
 
 In addition, Unicode contains characters which width is "Ambiguous". These character's width should be either 1 or 2 depending on terminal setting or encoding. Because this cannot be distinguished from Python, ``display.unicode.ambiguous_as_wide`` option is added to handle this.
 
 By default, "Ambiguous" character's width, "¡" (inverted exclamation) in below example, is regarded as 1.
 
-.. note::
-
-   This should be aligned properly in terminal which uses monospaced font.
-
 .. ipython:: python
 
    df = pd.DataFrame({'a': ['xxx', u'¡¡'], 'b': ['yyy', u'¡¡']})
-   df
+   df;
+
+.. image:: _static/option_unicode03.png
 
 Enabling ``display.unicode.ambiguous_as_wide`` lets pandas to regard these character's width as 2. Note that this option will be effective only when ``display.unicode.east_asian_width`` is enabled. Confirm starting position has been changed, but not aligned properly because the setting is mismatched with this environment.
 
 .. ipython:: python
 
    pd.set_option('display.unicode.ambiguous_as_wide', True)
-   df
+   df;
+
+.. image:: _static/option_unicode04.png
 
 .. ipython:: python
    :suppress:
 
    pd.set_option('display.unicode.east_asian_width', False)
    pd.set_option('display.unicode.ambiguous_as_wide', False)
-
diff --git a/doc/source/whatsnew/v0.17.0.txt b/doc/source/whatsnew/v0.17.0.txt
index 1e240d0786082..ab9cc17a3f990 100644
--- a/doc/source/whatsnew/v0.17.0.txt
+++ b/doc/source/whatsnew/v0.17.0.txt
@@ -353,10 +353,16 @@ Some East Asian countries use Unicode characters its width is corresponding to 2
 .. ipython:: python
 
    df = pd.DataFrame({u'国籍': ['UK', u'日本'], u'名前': ['Alice', u'しのぶ']})
-   df
+   df;
+
+.. image:: _static/option_unicode01.png
+
+.. ipython:: python
 
    pd.set_option('display.unicode.east_asian_width', True)
-   df
+   df;
+
+.. image:: _static/option_unicode02.png
 
 For further details, see :ref:`here `
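
Reviewer note: the options.rst text in this patch says that alignment depends on each character's "East Asian Width" property, and that "Ambiguous" characters count as 1 or 2 columns depending on a setting. The sketch below illustrates that idea with the standard library's ``unicodedata.east_asian_width``. It is only an illustration under stated assumptions: ``display_width`` and ``pad_left`` are hypothetical helper names, not pandas functions, and the logic is not claimed to match pandas' internal implementation. It also ignores zero-width and combining characters.

```python
import unicodedata


def display_width(text, ambiguous_as_wide=False):
    """Estimate how many terminal columns `text` occupies.

    Hypothetical helper for illustration only, not part of pandas.
    """
    width = 0
    for ch in text:
        category = unicodedata.east_asian_width(ch)
        if category in ("W", "F"):
            # Wide and Fullwidth characters take two columns.
            width += 2
        elif category == "A" and ambiguous_as_wide:
            # Ambiguous characters take two columns only when requested.
            width += 2
        else:
            # Everything else is counted as one column in this sketch.
            width += 1
    return width


def pad_left(text, target, ambiguous_as_wide=False):
    """Right-align `text` to `target` columns using the estimated width."""
    missing = target - display_width(text, ambiguous_as_wide)
    return " " * max(0, missing) + text


if __name__ == "__main__":
    # len() counts code points, so it underestimates wide characters.
    for value in ["UK", "日本", "Alice", "しのぶ"]:
        print(repr(pad_left(value, 8)), len(value), display_width(value))
```

With plain ``len``, ``u'日本'`` counts as 2 and is padded as if it were as narrow as ``'UK'``. Counting columns instead gives 4, which is the kind of correction the ``display.unicode.east_asian_width`` option is described as applying.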