Arabic: Support Harakat #1314

Closed
Jason3S opened this issue Jul 29, 2022 · 5 comments · Fixed by #1317 or #1313

Comments

@Jason3S
Collaborator

Jason3S commented Jul 29, 2022

Following up on the linked discussion about improving the dictionary:

We have a problem with Harakat (diacritics) in Arabic. They are treated as full letters. How can we ignore them?

[Screenshot: Screen Shot 2022-07-29 at 11 43 43]

Originally posted by @WatheqAlshowaiter in #941 (comment)

@Jason3S
Collaborator Author

Jason3S commented Jul 29, 2022

@WatheqAlshowaiter,

Great example. Would you please add (in text) some example word pairs?

It is possible that words with Harakat are not in the dictionary. You might need to follow up with @linuxscout to see if they are there. I'll regenerate the dictionary to a greater depth to see if those words are added, but he might have a better idea of how Harakat are handled in other spell checkers.

@linuxscout

Hi @Jason3S,
To resolve this issue, we must remove diacritics before spell checking,
for example by adding a remove_diacritics(word) step.

The word compared against the dictionary is the non-diacritized word (the word without diacritics).
Hunspell handles this case with a feature named "IGNORE", which we use to ignore a list of diacritics in the word. In the code, we remove diacritics before spell checking, but we keep them for display.
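
A minimal sketch of the approach described above, assuming the diacritics to strip are the eight harakat code points U+064B through U+0652. The `is_correct` helper and the one-word dictionary are hypothetical and only illustrate the idea; this is not the Hunspell or CSpell implementation.

```python
# Hypothetical sketch: strip Arabic harakat before the dictionary lookup.
# Assumption: only U+064B..U+0652 are treated as ignorable
# (fathatan, dammatan, kasratan, fatha, damma, kasra, shadda, sukun).
HARAKAT = {chr(cp) for cp in range(0x064B, 0x0653)}


def remove_diacritics(word: str) -> str:
    """Return the word with every harakat character removed."""
    return "".join(ch for ch in word if ch not in HARAKAT)


def is_correct(word: str, dictionary: set) -> bool:
    """Look up the non-diacritized form; the caller keeps `word` for display."""
    return remove_diacritics(word) in dictionary


# The dictionary stores the plain form: kaf, teh, alef, beh ("book").
dictionary = {"\u0643\u062A\u0627\u0628"}

# The same word written with a kasra after kaf and a fatha after teh.
diacritized = "\u0643\u0650\u062A\u064E\u0627\u0628"

print(is_correct(diacritized, dictionary))
```

The original string is never modified, so the editor can still show the word with its diacritics while the lookup uses the stripped form.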

@linuxscout

@Jason3S No need to rebuild the dictionary

@Jason3S
Collaborator Author

Jason3S commented Jul 29, 2022

@linuxscout,

Thank you.

Found it:
[image]

I'll see if I can add support for IGNORE to cspell.

Jason3S added a commit to streetsidesoftware/cspell that referenced this issue Jul 29, 2022
Added the ability to specify characters to be ignored (removed) from a word before checking the word in the dictionary.

Related to Arabic Harakat vowel accents.
streetsidesoftware/cspell-dicts#1314
Jason3S added a commit to streetsidesoftware/cspell that referenced this issue Jul 29, 2022
* feat: Support Ignoring characters before checking

   Added the ability to specify characters to be ignored (removed) from a word before checking the word in the dictionary.

   Related to Arabic Harakat vowel accents.
   streetsidesoftware/cspell-dicts#1314
Jason3S added a commit that referenced this issue Jul 29, 2022
Jason3S added a commit that referenced this issue Jul 29, 2022
@Jason3S
Collaborator Author

Jason3S commented Jul 29, 2022

I have added support to CSpell to ignore accents if specified in the dictionary definition.
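
To illustrate what "specified in the dictionary definition" could mean, here is a rough sketch in which each dictionary carries its own set of ignorable characters. The `DictionaryDefinition` class and its `ignore` field are hypothetical names chosen for this example; the thread does not show the actual CSpell option or schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DictionaryDefinition:
    """Hypothetical stand-in for a dictionary definition (not the CSpell schema)."""
    name: str
    words: frozenset
    ignore: str = ""  # characters that may be removed from a word before lookup

    def has(self, word: str) -> bool:
        # Try the word exactly as written first.
        if word in self.words:
            return True
        if not self.ignore:
            return False
        # Otherwise drop the ignorable characters and try again.
        stripped = word.translate({ord(ch): None for ch in self.ignore})
        return stripped in self.words


# Assumption: the ignorable set is the harakat range U+064B..U+0652.
harakat = "".join(chr(cp) for cp in range(0x064B, 0x0653))
plain_word = "\u0643\u062A\u0627\u0628"               # stored without diacritics
diacritized = "\u0643\u0650\u062A\u064E\u0627\u0628"  # same word with harakat

with_ignore = DictionaryDefinition("ar", frozenset({plain_word}), ignore=harakat)
without_ignore = DictionaryDefinition("ar-strict", frozenset({plain_word}))

print(with_ignore.has(diacritized), without_ignore.has(diacritized))
```

Keeping the ignore set on the definition means only dictionaries that opt in, such as an Arabic one, strip characters; other dictionaries keep exact matching.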

I'll update the VS Code extension tomorrow.
