RGA.diff: Investigate what bad can happen when splitting multi-codepoint "characters" into codepoints #146

Open
@cblp

What can go wrong if we split é into e + ´ (a base letter followed by a combining accent)?

See also country flags, each of which consists of two code points (a pair of regional indicator symbols).
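To make the question concrete, here is a small Python sketch (not taken from the RGA code) of how such strings look at the code-point level. It assumes the decomposed form of é (U+0065 followed by U+0301 COMBINING ACUTE ACCENT) and uses one flag purely as an example; the helper name `codepoints` is made up for illustration.

```python
import unicodedata

# Assumption: é in decomposed form, i.e. base letter + combining acute accent.
e_acute = "e\u0301"
# Example flag: REGIONAL INDICATOR SYMBOL LETTER U + LETTER A.
flag = "\U0001F1FA\U0001F1E6"

def codepoints(s):
    """Return the code points of s in U+XXXX notation (illustrative helper)."""
    return [f"U+{ord(c):04X}" for c in s]

print(codepoints(e_acute))  # ['U+0065', 'U+0301']
print(codepoints(flag))     # ['U+1F1FA', 'U+1F1E6']

# An edit that works on single code points can keep one half and drop the other:
# dropping the accent silently turns é into plain e.
print(e_acute[:1] == "e")   # True

# The two code points together are canonically equivalent to precomposed é.
print(unicodedata.normalize("NFC", e_acute) == "\u00e9")  # True
```

Whether a code-point-level diff actually produces such half-edits in RGA.diff is exactly what this issue asks to investigate; the sketch only shows that the representation allows it.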

For this, Unicode has a concept of “grapheme cluster”. There’s also “extended grapheme cluster” (EGC), which is basically an updated version of the concept.

http://unicode.org/glossary/#grapheme_cluster

http://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries
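As a rough illustration of what grouping by cluster could look like, here is a deliberately simplified Python sketch. It is an assumption-laden approximation, not the UAX #29 algorithm: it handles only two cases (combining marks attach to the preceding code point, and regional indicators are paired) and ignores everything else, such as ZWJ emoji sequences and Hangul syllables. The name `approx_clusters` is hypothetical.

```python
import unicodedata

def is_regional_indicator(ch):
    # Regional indicator symbols occupy U+1F1E6..U+1F1FF.
    return 0x1F1E6 <= ord(ch) <= 0x1F1FF

def approx_clusters(text):
    """Group code points into approximate grapheme clusters.

    Simplification (assumption): only combining marks and regional
    indicator pairs are handled. Real segmentation per UAX #29 has
    many more rules.
    """
    clusters = []
    ri_open = False  # True while the last cluster is one unpaired regional indicator
    for ch in text:
        if clusters and unicodedata.category(ch).startswith("M"):
            # Combining mark: attach to the preceding cluster.
            clusters[-1] += ch
            ri_open = False
        elif clusters and ri_open and is_regional_indicator(ch):
            # Second half of a flag: complete the pair.
            clusters[-1] += ch
            ri_open = False
        else:
            clusters.append(ch)
            ri_open = is_regional_indicator(ch)
    return clusters

word = "cafe\u0301"
print(len(word), len(approx_clusters(word)))  # 5 4

two_flags = "\U0001F1FA\U0001F1E6\U0001F1E9\U0001F1EA"
print(len(approx_clusters(two_flags)))  # 2
```

Diffing over units like these instead of single code points would be one possible way to avoid splitting a "character", at the cost of depending on a segmentation algorithm and its Unicode version.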
