Description
Expected Behavior
Hi, I would like to know how MITIE handles the segmentation of out-of-vocabulary (OOV) words.
For example, two of my training examples look like this:
1. The daily active users of [League Of Legends](name) on November 10 (Chinese: [英雄联盟](name)11.10的日活)
2. The daily active users of [Tomb Raider 3](name) on November 10 (Chinese: [古墓丽影3](name)11.10的日活)
My training samples are in Chinese and contain many entities that are game names. Some game names contain numbers and some do not, such as "古墓丽影3" and "英雄联盟". In the examples above, I want MITIE to identify the entities as "古墓丽影3" and "英雄联盟". "11.10" is a short form of the date and should not be included in the entity.
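To make the expected output concrete, here is a minimal sketch in plain Python. It does not use MITIE's API, and `parse_annotated` is only a hypothetical helper for illustration. It strips the `[text](label)` annotations from an example and lists the character spans I expect for each entity.

```python
import re

# Matches the [entity text](label) annotation used in the training examples.
ANNOTATION = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")


def parse_annotated(example):
    """Return (plain_text, [(start, end, label, value), ...]) for one example.

    Hypothetical helper, not part of MITIE. Offsets are character offsets
    into the plain text, with `end` exclusive.
    """
    parts = []
    entities = []
    pos = 0      # read position in the annotated string
    length = 0   # length of the plain text built so far
    for match in ANNOTATION.finditer(example):
        before = example[pos:match.start()]
        parts.append(before)
        length += len(before)
        value, label = match.group(1), match.group(2)
        entities.append((length, length + len(value), label, value))
        parts.append(value)
        length += len(value)
        pos = match.end()
    parts.append(example[pos:])
    return "".join(parts), entities


for sample in ("[英雄联盟](name)11.10的日活", "[古墓丽影3](name)11.10的日活"):
    text, entities = parse_annotated(sample)
    print(text, entities)
```

In the second example the entity ends with "3" and is immediately followed by "11.10", so the raw text reads "古墓丽影311.10". There is no whitespace or punctuation marking where the game name ends and the date begins.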
Current Behavior
I labeled the entities correctly. However, the entity in the first sample is often identified as "英雄联盟11" rather than "英雄联盟". How can I deal with this problem? I tried adding several more training examples, but it did not help. Should I add more data?
- Version: 0.7.0
- Where did you get MITIE: pip install
- Platform: windows64 and linux64