Use regex to match badwords #24

Merged
merged 8 commits into from
Oct 6, 2020
Conversation

nathanaelhoun
Contributor

@nathanaelhoun nathanaelhoun commented Sep 25, 2020

Summary

Use regex to match badwords.

The regex is built when the configuration changes and is then used to filter each post.

  • The plugin can detect multi-word badwords (breaking change: the word list is now comma-separated)
  • The plugin now detects badwords even with punctuation
  • Default bad word list is updated to add more multi-word bad words and use regex
  • README.md updated
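
As a rough illustration of the approach described above (not the code from this PR), here is a minimal Go sketch that compiles one regex from a comma-separated word list and applies it to a message. The names `buildBadWordsRegex` and `censor` and the `****` mask are hypothetical. The sketch also assumes that list entries are inserted as regex fragments without escaping, so that entries can themselves use regex syntax.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// buildBadWordsRegex is a hypothetical helper. It turns a comma-separated
// list into a single case-insensitive alternation wrapped in word boundaries.
// Entries are NOT escaped (assumption), so they may contain regex syntax.
// It returns a nil regex when the list holds no usable entries.
func buildBadWordsRegex(list string) (*regexp.Regexp, error) {
	var words []string
	for _, w := range strings.Split(list, ",") {
		if w = strings.TrimSpace(w); w != "" {
			words = append(words, w)
		}
	}
	if len(words) == 0 {
		return nil, nil
	}
	return regexp.Compile(`(?i)\b(` + strings.Join(words, "|") + `)\b`)
}

// censor replaces every match with a fixed mask (the mask is an assumption).
func censor(re *regexp.Regexp, msg string) string {
	if re == nil {
		return msg
	}
	return re.ReplaceAllString(msg, "****")
}

func main() {
	// In a plugin this would run once per configuration change,
	// and the compiled regex would be reused for every post.
	re, err := buildBadWordsRegex("bad word, darn")
	if err != nil {
		panic(err)
	}
	fmt.Println(censor(re, "Darn! That is a bad word."))
	fmt.Println(censor(re, "A bad day, in a word."))
}
```

Compiling once per configuration change keeps the per-post cost to a single regex scan. The `\b` anchors are what let a trailing `!` or `.` sit next to a match without hiding it.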

Ticket Link

Closes #20 and closes #22

@codecov-commenter

codecov-commenter commented Sep 25, 2020

Codecov Report

Merging #24 into master will increase coverage by 9.05%.
The diff coverage is 69.56%.

@@            Coverage Diff             @@
##           master      #24      +/-   ##
==========================================
+ Coverage   39.02%   48.07%   +9.05%     
==========================================
  Files           3        3              
  Lines          41       52      +11     
==========================================
+ Hits           16       25       +9     
- Misses         22       24       +2     
  Partials        3        3              
Impacted Files            Coverage Δ
server/configuration.go   32.25% <58.33%> (+18.62%) ⬆️
server/plugin.go          75.00% <81.81%> (+2.77%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 3aeb251...bef1495.

@hanzei hanzei added the "2: Dev Review" (requires review by a core committer) and "3: QA Review" (requires review by a QA tester) labels Sep 25, 2020
@hanzei hanzei self-requested a review September 25, 2020 16:01
@hanzei hanzei mentioned this pull request Sep 25, 2020
3 tasks
Contributor

@hanzei hanzei left a comment

LGTM 👍

@hanzei hanzei requested a review from levb October 2, 2020 11:48

@levb levb left a comment

LGTM, with a nit.

@levb levb removed the "2: Dev Review" (requires review by a core committer) label Oct 5, 2020
@levb

levb commented Oct 5, 2020

@nathanaelhoun thank you for this meaningful improvement. Do you mind fixing my nit comment, if you agree with it? Otherwise, LGTM. @DHaussermann will you have the cycles to test this, or should we consider the new unit tests sufficient?

@DHaussermann DHaussermann self-requested a review October 5, 2020 17:19
@DHaussermann

Yes, I can take a quick look once changes are completed.

@nathanaelhoun
Contributor Author

@levb Thanks! Agree with you, so I applied it.

@DHaussermann You can test this :-)

@DHaussermann DHaussermann left a comment

Tested and passed

  • Regex match solves both issues as expected
  • Can now filter words that contain a space like "bad word" with no impact on the individual word
  • Tested basic ways of circumventing filter:
    • add ! and other punctuation
    • add dash or other special char at the end
    • encase the word in a comment block
  • Users will no longer side-step the filter naturally by punctuating words
  • No issues found.
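
The checks listed above could be expressed roughly as the following Go sketch. It is an illustration under assumptions, not the unit tests from this PR: the pattern stands in for a hypothetical word list `bad word,darn`, `isFiltered` is a made-up helper, and the backtick case is only a guess at what encasing a word in a block looks like.

```go
package main

import (
	"fmt"
	"regexp"
)

// badWords is a hypothetical compiled pattern for the list "bad word,darn".
var badWords = regexp.MustCompile(`(?i)\b(bad word|darn)\b`)

// isFiltered reports whether the message contains a listed bad word.
func isFiltered(msg string) bool {
	return badWords.MatchString(msg)
}

func main() {
	cases := []struct {
		msg  string
		want bool
	}{
		{"darn!", true},                // trailing punctuation
		{"darn-", true},                // trailing dash
		{"`darn`", true},               // wrapped in backticks (assumed form)
		{"That is a BAD WORD.", true},  // multi-word entry, any case
		{"bad luck, last word", false}, // individual words are unaffected
		{"darning", false},             // no match inside a longer word
	}
	for _, c := range cases {
		fmt.Printf("%q filtered=%t want=%t\n", c.msg, isFiltered(c.msg), c.want)
	}
}
```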

LGTM!

Thanks @nathanaelhoun Much appreciated. You've added a huge amount of value by making this change!

@DHaussermann DHaussermann added the "4: Reviews Complete" (all reviewers have approved the pull request) label and removed the "3: QA Review" (requires review by a QA tester) label Oct 6, 2020
@levb levb merged commit c4f4195 into mattermost-community:master Oct 6, 2020
@hanzei hanzei added this to the v1.0.0 milestone Oct 7, 2020
@nathanaelhoun nathanaelhoun deleted the gh-20-multiword-censor branch October 8, 2020 09:02