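Running singularize() from the R pluralize package on the english.words list from vwr (CELEX, ~66K words) yields ~600 inconsistencies, i.e. outputs that differ from the input word but are not themselves in the word list: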
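library(pluralize)
library(vwr)
# Pair each word with its proposed singular; keep strings as characters, not factors
dat <- data.frame(word = english.words, sing = singularize(english.words), stringsAsFactors = FALSE)
# Keep rows where the word was changed and the result is not a known word
dat <- dat[dat$word != dat$sing & !(dat$sing %in% english.words), ]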
Examples (word: singularize() output):
abdomen: abdoman
always: alway
amaryllis: amarylli
appendicitis: appendiciti
asbestos: asbesto
axis: axi
(...) and so on.
One possible workaround is to discard any singular form that does not appear in a chosen dictionary or lexicon.
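A minimal sketch of that workaround in base R, under some assumptions: filter_singulars() is a hypothetical helper (not part of pluralize), and naive_singular() is a toy stand-in used only so the example runs without the packages. A proposed singular is kept only when the lexicon contains it; otherwise the original word is returned unchanged.

```r
# Hypothetical helper (not part of pluralize): accept a proposed singular
# only if it is found in the lexicon, else fall back to the original word.
filter_singulars <- function(words, singulars, lexicon) {
  stopifnot(length(words) == length(singulars))
  ifelse(singulars %in% lexicon, singulars, words)
}

# Toy stand-in for a singularizer (an assumption for illustration only;
# it simply strips a trailing "s" and does not reproduce pluralize's rules).
naive_singular <- function(x) sub("s$", "", x)

# Small made-up lexicon and inputs for the demonstration.
lexicon <- c("axis", "always", "cat", "cats", "asbestos")
words   <- c("axis", "always", "cats", "asbestos")

result <- filter_singulars(words, naive_singular(words), lexicon)
print(result)
```

With the packages from this report loaded, the same helper could presumably be called as filter_singulars(english.words, singularize(english.words), english.words), using the word list itself as the lexicon. Falling back to the original word is one design choice; returning NA for rejected forms would be an alternative if the rejections need to be inspected.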