codecs module doesn't recognize new C++ 23 universal-character-name \u{xxx}. #130475

Closed
mrolle45 opened this issue Feb 22, 2025 · 6 comments
Labels
type-feature A feature request or enhancement

Comments

@mrolle45

mrolle45 commented Feb 22, 2025

Bug report

Bug description:

The C++ 23 Standard has a new syntax for universal character names, which codecs.decode does not recognize. I ran this on Python 3.13, and the same occurs with earlier Python versions.

Python 3.13.1 (tags/v3.13.1:0671451, Dec  3 2024, 19:06:28) [MSC v.1942 64 bit (AMD64)] on win32
>>> import codecs
>>> codecs.decode('\u0041',encoding='unicode-escape')
'A'
>>> codecs.decode('\u{41}',encoding='unicode-escape')
  File "<python-input-3>", line 1
    codecs.decode('\u{41}',encoding='unicode-escape')
                  ^^^^^^^^^^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-1: truncated \uXXXX escape

The result should be 'A'.
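
Note that the traceback above is a SyntaxError raised while Python compiles the string literal, before codecs.decode is ever called. A minimal sketch, assuming a raw string is used so that the backslash sequence reaches the codec unchanged, shows how the unicode-escape codec itself treats both forms. The helper name `try_decode` is hypothetical and exists only for this illustration.

```python
import codecs


def try_decode(text: str) -> str:
    """Decode text with the unicode-escape codec.

    On failure, return the codec's error reason instead of raising,
    so both cases can be printed side by side.
    """
    try:
        return codecs.decode(text, 'unicode-escape')
    except UnicodeDecodeError as exc:
        return f'error: {exc.reason}'


# Raw strings keep the backslash, so the codec (not the parser) sees the escape.
print(try_decode(r'\u0041'))   # four hex digits: decodes to 'A'
print(try_decode(r'\u{41}'))   # braces are not accepted by the codec
```

With a raw string, the second call fails inside the codec with a UnicodeDecodeError rather than at compile time, which is the behavior this request is really about.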

For reference, this is quoted from the C++ 23 Standard, Appendix A.3:

universal-character-name:
    ...
    \u{ simple-hexadecimal-digit-sequence }

simple-hexadecimal-digit-sequence:
    hexadecimal-digit
    simple-hexadecimal-digit-sequence hexadecimal-digit
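
The grammar above can be handled in user code without any change to codecs. The following is a rough sketch, not anything provided by Python: `decode_delimited_escapes` is a hypothetical helper, and rejecting surrogates and values above U+10FFFF is an assumption made here so that only Unicode scalar values are produced.

```python
import re

# \u{ one or more hexadecimal digits }, following the quoted grammar.
_DELIMITED_ESCAPE = re.compile(r'\\u\{([0-9A-Fa-f]+)\}')


def decode_delimited_escapes(text: str) -> str:
    """Replace every \\u{X...} sequence in text with the character it names.

    Hypothetical helper for illustration only. It does not treat a
    doubled backslash specially, so r'\\\\u{41}' is also rewritten.
    """
    def _replace(match: re.Match) -> str:
        code_point = int(match.group(1), 16)
        # Assumption of this sketch: accept Unicode scalar values only.
        if code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
            raise ValueError(f'not a Unicode scalar value: {match.group(0)}')
        return chr(code_point)

    return _DELIMITED_ESCAPE.sub(_replace, text)


print(decode_delimited_escapes(r'\u{41}'))
print(decode_delimited_escapes(r'smile: \u{1F600}'))
```

Running this pass before (or instead of) the unicode-escape codec keeps the existing \uXXXX handling untouched.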

Please update codecs in Python 3.13, and all earlier Python versions that are still publishing bug fixes.

CPython versions tested on:

3.13

Operating systems tested on:

Windows

@mrolle45 mrolle45 added the type-bug An unexpected behavior, bug, or error label Feb 22, 2025
@gvanrossum
Member

Where is it written that Python should follow the C++ standard?

@gvanrossum gvanrossum added type-feature A feature request or enhancement and removed type-bug An unexpected behavior, bug, or error labels Feb 22, 2025
@picnixz
Member

picnixz commented Feb 22, 2025

I've already replied on #129392 and decided to close this feature request (#129392 (comment)). In addition, I'm not sure that the { } braces are even part of the input rather than notation indicating a mandatory value. See https://en.cppreference.com/w/cpp/language/character_literal and https://stackoverflow.com/questions/77055189/c23-char-now-supports-unicode. EDIT: the braces do indeed seem to be expected (https://en.cppreference.com/w/cpp/language/escape).

I will again reject that feature because no evidence has been shown that it's an important feature.

@picnixz picnixz closed this as completed Feb 22, 2025
@gvanrossum
Member

Since the same user @mrolle45 submitted the same issue twice, and so far hasn’t engaged in either ticket, can we conclude this is abuse and ban them? The way the issue was written (both times) also made me think they know better.

@picnixz
Member

picnixz commented Feb 22, 2025

Let's warn them first. One could say that the C++23 feature (which I'm not aware of, though) is "evidence" of a need for support. However, I'm not convinced by just supporting an appendix of C++23. Unless better evidence is presented on either of the issues that supporting \u{ordinal-value} is necessary, I would advise @mrolle45 to avoid re-opening similar issues, lest we consider it abuse.

@mrolle45
Author

mrolle45 commented Feb 24, 2025


Sorry, I forgot that I had already raised the issue before and didn't notice it while scanning for older issues. Please forgive me. I had no intention of being abusive.

Just to make things clearer, I have posted a new Idea on Discourse suggesting that Python should keep up to date with the C and C++ standards.

Somebody, please check the Discourse and enter this as a new Issue if you see that it has traction. I don't have the attention span to check this myself.

@gvanrossum
Member

If you care about the issue, please track it yourself. You can't just drop an idea and disappear -- that's an inefficient use of the Python project's scarce resources.

3 participants