Remove u string prefix from docs #1174
akx left a comment:
Thank you for the PR! Generally I'm on board with this, but there are some inconsistencies and weirdness here.
  'apples, oranges, and pears'
  >>> format_list(['apples', 'oranges', 'pears'], locale='zh')
- u'apples\u3001oranges\u548cpears'
+ 'apples、oranges和pears'
Why do some of the changes have escaped sequences replaced with characters and others don't? 🤔
I think I'd rather not change any escape sequences to the unescaped versions, just so it's easy to see e.g. the difference between - – — 😄
Like I said in the PR description:
I also updated some strings to include the character instead of \u and \x literals since that's what the Python 3 REPL does
I don't think I missed any. Some sequences are still escaped because they're unprintable characters that the Python REPL itself escapes, like \u2009. I think console examples should print exactly what the REPL outputs; we could argue about what to put in the code that causes the REPL to print that string, though.
Like we can have
>>> 'apples\u3001oranges\u548cpears'
'apples、oranges和pears'
but we definitely shouldn't have
>>> 'apples\u3001oranges\u548cpears'
'apples\u3001oranges\u548cpears'
because that's not what actually happens in the REPL.
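To make the REPL behaviour described above concrete, here is a minimal sketch using only the standard library. The REPL echoes a value via `repr()`, which keeps printable characters as-is and escapes non-printable ones; the sample strings are taken from the diff in this thread, and the variable names are just for illustration.

```python
# Printable CJK punctuation and ideographs are shown as-is by repr().
printable = 'apples\u3001oranges\u548cpears'
print(repr(printable))       # 'apples、oranges和pears'

# U+202F (narrow no-break space) is not printable, so repr() escapes it.
unprintable = 'Apr 1, 2007, 11:30:00\u202fAM'
print(repr(unprintable))     # 'Apr 1, 2007, 11:30:00\u202fAM'

# str.isprintable() follows the same printable/non-printable split.
for ch in '\u3001\u548c\u2009\u202f':
    print(f'U+{ord(ch):04X} printable={ch.isprintable()}')
```

So whether an escape survives in the expected output depends on the character, not on how the literal was typed.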
I also separated the 3 things I did into 3 separate commits for easier review.
  >>> fmt = Format('en_US', tzinfo=get_timezone('US/Eastern'))
  >>> fmt.datetime(datetime(2007, 4, 1, 15, 30))
- u'Apr 1, 2007, 11:30:00\u202fAM'
+ 'Apr 1, 2007, 11:30:00\\u202fAM'
Why does a single backslash turn into two here? (And in other places in the diff.)
I explained in the PR:
I also noticed some literals in docstrings were not escaped correctly (you need to escape the backslash unless it's an r""" docstring, otherwise the docstring will contain the actual character instead of the literal)
It's because this is a string within a string (a docstring) that starts here:
Line 91 in 3000762
Someone pasted REPL output directly into the docstring without escaping the backslashes. The other way to do this right is to use an r""" docstring, which some of the code does.
You can see that this is wrong if you open a Python REPL and run this code:
from babel.dates import get_timezone
from babel.support import Format
fmt = Format('en_US', tzinfo=get_timezone('US/Eastern'))
help(fmt.datetime)
You'll see the docstring example rendered with the actual character, but if you actually type the code into the REPL, the character is escaped.
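The docstring point can be checked without Babel at all. The sketch below defines three throwaway functions (hypothetical names, not from the Babel codebase) whose docstrings differ only in how the backslash is written, then inspects what each `__doc__` really contains.

```python
def unescaped():
    """Example output: 'Apr 1\u202fAM'"""   # escape is processed: real U+202F


def escaped():
    """Example output: 'Apr 1\\u202fAM'"""  # doubled backslash: literal \u202f text


def raw():
    r"""Example output: 'Apr 1\u202fAM'"""  # raw docstring: literal \u202f text


for func in (unescaped, escaped, raw):
    doc = func.__doc__
    has_real_char = '\u202f' in doc        # the actual narrow no-break space
    has_literal_escape = '\\u202f' in doc  # backslash followed by "u202f"
    print(func.__name__, has_real_char, has_literal_escape)
```

Only the `escaped` and `raw` variants store the text that the REPL would print, which is why the diff doubles the backslashes in non-raw docstrings.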
It would be good to add doctest to babel to check that the REPL output in the examples actually matches what you get if you were to run the code, but I'm not going to work on that.
We do have that... Lines 22 to 24 in 98b9562
I thought it would catch the incorrectly escaped strings, but it turns out doctest thinks that ['E d.\\u2009–\\u2009', 'E d.M.'] and ['E d.\u2009–\u2009', 'E d.M.'] are the same, which is surprising because they obviously are not, but they won't fix it: python/cpython#129257 (comment)
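For anyone who wants to reproduce the doctest surprise in isolation, here is a small sketch using the public `doctest.OutputChecker` class. My reading of the standard library (not something stated in this thread) is that the checker folds both strings to backslash-escaped ASCII before comparing, so the last two lines mimic that folding with plain `str.encode`; treat that explanation as an assumption.

```python
import doctest

# Expected output as it ends up in a non-raw docstring written with "\u2009":
# the docstring holds real THIN SPACE characters.
want = "['E d.\u2009–\u2009', 'E d.M.']\n"

# What the interpreter actually prints: repr() escapes U+2009.
got = repr(['E d.\u2009–\u2009', 'E d.M.']) + "\n"

print(want == got)  # False: the two strings really are different

checker = doctest.OutputChecker()
matches = checker.check_output(want, got, 0)  # 0 = no option flags
print(matches)      # True: doctest accepts the mismatch

# Assumed mechanism: both sides compare equal once non-ASCII is escaped.
folded_want = want.encode('ascii', 'backslashreplace')
folded_got = got.encode('ascii', 'backslashreplace')
print(folded_want == folded_got)
```

This is why the existing doctest run does not flag docstrings whose expected output contains the real character instead of the escape.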
@verhovsky Very sorry for the delay! Can you rebase this (to fix the
Codecov Report: All modified and coverable lines are covered by tests ✅
Coverage on master is unchanged by #1174 at 91.98% (27 files, 4693 lines, 4317 hits, 376 misses).
I did a search and replace for u" and (?<!')\bu' and checked each case. I also updated some strings to include the character instead of \u and \x literals, since that's what the Python 3 REPL does. I also noticed some literals in docstrings were not escaped correctly (you need to escape the backslash unless it's an r""" docstring; otherwise the docstring will contain the actual character instead of the literal).
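As an illustration of what the single-quote pattern from the description matches, here is a rough sketch with the standard `re` module. The helper name `strip_u_prefix` and the sample lines are made up for this example, and it is not the tooling actually used for the PR (each hit was checked by hand).

```python
import re

# Pattern quoted in the PR description: a word-boundary "u" directly
# before a single quote, but not when the "u" itself follows a quote.
U_PREFIX = re.compile(r"(?<!')\bu'")


def strip_u_prefix(line: str) -> str:
    """Drop the legacy u prefix from single-quoted literals (hypothetical helper)."""
    return U_PREFIX.sub("'", line)


samples = [
    "u'Apr 1, 2007'",     # legacy literal: prefix removed
    "return u'%s' % x",   # prefix after whitespace: removed
    "x = 'menu' + 'u'",   # 'menu' has no word boundary; 'u' follows a quote
]
results = [strip_u_prefix(s) for s in samples]
for before, after in zip(samples, results):
    print(f"{before!r} -> {after!r}")
```

The negative lookbehind is what keeps a one-character string like 'u' from being mangled, which is presumably why the plain `u'` search was not enough.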