bpo-31793: Documentation: Specialize smart-quotes for japanese. #4006

Merged: methane merged 1 commit into python:master on Nov 7, 2017

Conversation

JulienPalard
Member

@JulienPalard JulienPalard commented Oct 15, 2017

This PR specializes smart quotes for Japanese and, in doing so, shows how to specialize smart quoting of the documentation for other languages.

I ran a test build using:

rm -rf docutils.conf build/
sphinx-build -j8 -b html -d build/doctrees -D latex_elements.papersize= -D locale_dirs=/tmp/mdk/locales/ -D language=fr -D gettext_compact=0 -Ea -A daily=1 -A switchers=1 . build/html
mv build build_fr
sphinx-build -j8 -b html -d build/doctrees -D latex_elements.papersize= -D locale_dirs=/tmp/mdk/locales/ -D language=ja -D gettext_compact=0 -Ea -A daily=1 -A switchers=1 . build/html
mv build build_ja
sphinx-build -j8 -b html -d build/doctrees -D latex_elements.papersize= -Ea -A daily=1 -A switchers=1 . build/html
mv build build_en

cp /tmp/docutils.conf docutils.conf
sphinx-build -j8 -b html -d build/doctrees -D latex_elements.papersize= -D locale_dirs=/tmp/mdk/locales/ -D language=fr -D gettext_compact=0 -Ea -A daily=1 -A switchers=1 . build/html
mv build build_fr_docutils
sphinx-build -j8 -b html -d build/doctrees -D latex_elements.papersize= -D locale_dirs=/tmp/mdk/locales/ -D language=ja -D gettext_compact=0 -Ea -A daily=1 -A switchers=1 . build/html
mv build build_ja_docutils
sphinx-build -j8 -b html -d build/doctrees -D latex_elements.papersize= -Ea -A daily=1 -A switchers=1 . build/html
mv build build_en_docutils
$ cat docutils.conf 
[restructuredtext parser]
smartquotes-locales: ja: ""''

The fr and en builds showed no difference at all, but ja differed as expected, for example:

< インタープリタ名は 「シェバン」 行としてアーカイブの起点に書込まれます。
---
> インタープリタ名は &quot;シェバン&quot; 行としてアーカイブの起点に書込まれます。
206,207c206,207
< <em>main</em> 引数は 「pkg.module:callable」 の形式を取り、
< アーカイブは 「pkg.module」 をインポートして実行され、

(6431 lines of diff on the generated HTML).
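
The comparison of two build trees described above can be approximated in Python. This is only a minimal sketch, not the command that was actually run: the function name `count_changed_lines` is hypothetical, and it assumes both trees contain UTF-8 HTML files under the same relative paths.

```python
import difflib
from pathlib import Path


def count_changed_lines(dir_a, dir_b, pattern="*.html"):
    """Return {relative_path: changed_line_count} for files that differ.

    Hypothetical helper: walks dir_a, pairs each file with the file at the
    same relative path in dir_b, and counts the lines that are not equal.
    """
    changed = {}
    a_root, b_root = Path(dir_a), Path(dir_b)
    for a_file in sorted(a_root.rglob(pattern)):
        rel = a_file.relative_to(a_root)
        b_file = b_root / rel
        if not b_file.is_file():
            continue  # file only exists in one tree; ignored in this sketch
        a_lines = a_file.read_text(encoding="utf-8").splitlines()
        b_lines = b_file.read_text(encoding="utf-8").splitlines()
        matcher = difflib.SequenceMatcher(None, a_lines, b_lines, autojunk=False)
        # For every non-equal block, count the larger side of the change.
        n = sum(
            max(i2 - i1, j2 - j1)
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"
        )
        if n:
            changed[rel.as_posix()] = n
    return changed
```

For the builds above it could be called as `count_changed_lines("build_ja", "build_ja_docutils")`; an empty result would correspond to the "not a single difference" outcome seen for fr and en.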

https://bugs.python.org/issue31793

Member

@vstinner vstinner left a comment

LGTM.

It seems like @methane likes it as well:
python/docsbuild-scripts#32 (comment)

@methane
Member

methane commented Nov 7, 2017

LGTM. But how about disabling smart quotes completely?

If we were building the English document for French readers, converting "foo" to «foo» would be a good idea.
But we are translating the document. Where «» is better than "", translators can use «» directly.

@vstinner
Member

vstinner commented Nov 7, 2017

French: 6.1.3. Fichiers Python « compilés »
https://docs.python.org/fr/dev/tutorial/modules.html#compiled-python-files

Original (English): https://docs.python.org/dev/tutorial/modules.html#compiled-python-files

Do « and » come from the translation, or from smart quotes?

If they come from the translation, where are the smart quotes used?

@JulienPalard
Member Author

« and » come from the smart quotes, here's the po file:

#: ../Doc/tutorial/modules.rst:193
msgid "\"Compiled\" Python files"
msgstr "Fichiers Python \"compilés\""

@methane
Member

methane commented Nov 7, 2017

Then, without smart-quote, translators can choose quote symbols.
With smart-quote, translators can't.

@vstinner
Member

vstinner commented Nov 7, 2017

« and » come from the smart quotes, here's the po file: (...)

OK. So I understand that all quotes would have to be modified in the French PO file if we disable smart quotes for all languages. I dislike asking the French translators to modify so many translations :-/

What is the impact of this change on the Japanese PO translation? Does it break anything?

@vstinner
Member

vstinner commented Nov 7, 2017

@methane: "With smart-quote, translators can't."

Hum. I don't understand something. The bpo title is "Allow to specialize smart quotes in documentation translations". What is the purpose of this PR in this case?

@vstinner
Member

vstinner commented Nov 7, 2017

@methane: "With smart-quote, translators can't."

Sorry, I don't fully understand the consequences of smart quotes.

Let's say that this PR is merged.

  • Original text: msgid "\"Compiled\" Python files"
  • French translation: msgstr "Fichiers Python \"compilés\""
  • (Hypothetical) Japanese translation: msgstr "Fichiers Python \"compilés\""
  • French rendered as: Fichiers Python « compilés »
  • Japanese rendered as: Fichiers Python "compilés"
  • (Hypothetical) Japanese translation 2: msgstr "Fichiers Python [compilés]"
  • (Hypothetical) Japanese translation 2 rendered as: Fichiers Python [compilés]

Smart quotes only change the rendered text if the translation keeps "..." quotes, no? But with this PR, "..." is left unchanged. Japanese translators are free to replace "..." quotes with whatever else in the PO file, no?
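
The scenario in the list above can be illustrated with a toy transform. This is a deliberately simplified sketch, not the docutils SmartQuotes algorithm: the `QUOTE_PAIRS` table and the `educate_quotes` function are hypothetical, the French entry uses plain spaces (the real transform may use different spacing), and the `ja` entry maps straight quotes to themselves to mirror `ja: ""''` in docutils.conf.

```python
import re

# Hypothetical per-language quote pairs, for illustration only.
QUOTE_PAIRS = {
    "en": ("“", "”"),
    "fr": ("« ", " »"),
    "ja": ('"', '"'),  # identity mapping: straight quotes stay straight
}


def educate_quotes(text, language):
    """Replace each "..." pair in text with the quote pair for language."""
    opening, closing = QUOTE_PAIRS[language]
    return re.sub(
        r'"([^"]*)"',
        lambda match: opening + match.group(1) + closing,
        text,
    )


print(educate_quotes('Fichiers Python "compilés"', "fr"))
print(educate_quotes('Fichiers Python "compilés"', "ja"))
print(educate_quotes("Fichiers Python [compilés]", "fr"))
```

Under these assumptions only text that still contains straight quotes is touched, so a translator who writes other delimiters directly in the PO file is unaffected whichever language is selected.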

@JulienPalard
Member Author

One question is: "Were smart quotes introduced just to fix the work of lazy translators, or are they a genuinely useful feature even for good translators?" If it's just to fix laziness, I would be OK with dropping them; let's do things properly in the first place.

I searched for an answer to this question and found http://docutils.sourceforge.net/docs/user/smartquotes.html

It looks like SmartQuotes is also used to transform --- into an em dash, so it is used by the English documentation too, not only by translations. Some examples can be seen around: https://docs.python.org/3/distutils/apiref.html#module-distutils.filelist

So completely removing smartquotes would also require the English documents to be reviewed (actually a single sed is probably enough).
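
As a rough idea of what that single substitution might look like, here is a Python sketch. It is an assumption about the intended rewrite, not a tested migration: the function name `replace_triple_hyphen` is hypothetical, and it only handles ` --- ` surrounded by spaces, as in the prose usage discussed here.

```python
import re

EM_DASH = "\u2014"


def replace_triple_hyphen(text):
    """Replace ' --- ' with a spaced em dash; return (new_text, count)."""
    return re.subn(r" --- ", f" {EM_DASH} ", text)


new_text, count = replace_triple_hyphen("glob --- Unix style pathname expansion")
print(count, new_text)
```

A real run would need to skip literal blocks and code samples, which this sketch does not attempt.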

The smartquotes documentation also says it may be better to do it right in the first place than to rely on smartquotes:

Even if you do care about accurate typography, you still might want to think twice before "auto-educating" the quote characters in your documents. As there is always a chance that the algorithm gets it wrong, you may instead prefer to use the compose key or some other means to insert the correct Unicode characters into the source.

If we're going to remove smartquotes, I'm OK to review the French translation myself to fix quotes; let's not make this a blocking issue.

@vstinner
Member

vstinner commented Nov 7, 2017

--- hyphen is very common in the Python documentation:

haypo@selma$ grep -R ' --- ' Doc|wc -l
469

@methane
Member

methane commented Nov 7, 2017

We use 「おはよう」 for Japanese sentences, and both 「Good morning!」 and "Good morning!" are OK.
But in technical documents, we don't use 「options」. We use "options" instead.

So this pull request is OK for the Japanese documentation.
I just wonder whether French translators would rather type « when they want « and " when they want ".
But I don't know much about French. If you're OK with smart quotes, let's merge it.

@JulienPalard
Member Author

I searched the French documentation for a few minutes and didn't find a paragraph where " … " would be better than the « … » imposed by smartquotes, and I have never encountered one in the past. But we have only translated a quarter of the whole, so we may run into smartquotes-induced problems in the future; I can't know.

@merwok
Member

merwok commented Nov 7, 2017

It seems to me that the automatic replacement of --- with an em dash, and of straight quotes with correct French quotes, is a work-around for poor Unicode support in input methods and displays. Maybe it was or is useful for docs extracted from docstrings that are ASCII-only, like the rule for the stdlib. But for translations in 2017, I would not rely on these automatic replacements but use the real characters directly in po files: accents, chevron quotes, ™ symbols, long dashes and all. It really isn’t fun to read or edit source full of \uXXX or &xxx; escapes.

(Same rule for non-breaking spaces if the Python French translation uses the France variant; in Québec or Switzerland there are no spaces.)

@merwok
Member

merwok commented Nov 7, 2017

After reading most recent replies: if having straight quotes in source and using the automated replacement works for Julien, then +1! And if desired by the translation team, the source could be moved to correct quotes over time, removing the need for the transform.

@methane methane merged commit 5a66c8a into python:master Nov 7, 2017
@miss-islington
Contributor

Thanks @JulienPalard for the PR, and @methane for merging it 🌮🎉.. I'm working now to backport this PR to: 2.7.
🐍🍒⛏🤖

@miss-islington
Contributor

Thanks @JulienPalard for the PR, and @methane for merging it 🌮🎉.. I'm working now to backport this PR to: 3.6.
🐍🍒⛏🤖

@bedevere-bot

GH-4324 is a backport of this pull request to the 2.7 branch.

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Nov 7, 2017
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Nov 7, 2017
@bedevere-bot

GH-4325 is a backport of this pull request to the 3.6 branch.

methane pushed a commit that referenced this pull request Nov 7, 2017
methane pushed a commit that referenced this pull request Nov 7, 2017
embray pushed a commit to embray/cpython that referenced this pull request Nov 9, 2017
adrianliaw added a commit to adrianliaw/cpython that referenced this pull request Oct 11, 2018
This is a replacement for python#9337.

* Using conf.py to disable smartquotes through Sphinx instead of using docutils.conf
* Disable smartquotes for Japanese, French and Traditional Chinese translations

See:
* Suggesting to use conf.py: https://mail.python.org/pipermail/doc-sig/2018-September/004084.html
* Original smartquotes issue in ja translations: https://bugs.python.org/issue31793
* Disabling for ja: python#4006
* Smartquotes issue in fr: python/python-docs-fr#303
* Smartquotes issue in zh_TW: https://mail.python.org/pipermail/doc-sig/2018-August/004079.html
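
For illustration, disabling smartquotes per language from Sphinx's conf.py could look roughly like the fragment below. This is a sketch assuming Sphinx's `smartquotes_excludes` configuration value; the exact setting used in the referenced commit is not shown in this conversation, so the language list here is only inferred from the bullet points above.

```python
# conf.py fragment (sketch): skip the smartquotes transform for the listed
# translation languages and for the man and text builders.
smartquotes_excludes = {
    "languages": ["ja", "fr", "zh_TW"],
    "builders": ["man", "text"],
}
```

With a setting like this, a build run with `-D language=ja` would leave straight quotes untouched without needing a docutils.conf file.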
@JulienPalard JulienPalard deleted the issue-31793-smart-quotes branch June 16, 2019 14:04