CHANGELOG (44 additions, 0 deletions)
@@ -1,3 +1,47 @@
+========
+1.6.2
+========
+---------------
+Overview
+---------------
+In this release, we focused on reviewing our streaming performance by measuring the number of sentences processed per second through a LightPipeline.
+We increased Norvig Spell Checker throughput by more than 300% by disabling DoubleVariants and improving the order of the algorithm's steps. It is now reported capable of 42K sentences per second.
+Symmetric Delete Spell Checker is also faster, although it has been reported to process 2K sentences per second.
+NerCRF has been reported to process about 300 sentences per second, while NerDL is roughly twice as fast (about 700 sentences per second).
+Vivekn Sentiment Analysis was improved and is now capable of processing 100K sentences per second (previously it was below 500).
+Finally, SentenceDetector performance was improved by about 40%, from ~30K rows processed per second to ~40K. However, we have now enabled Abbreviation processing by default, which reduces the final speed to 22K rows per second: a net loss in speed, but better accuracy.
+Again, thanks to the community for helping with feedback. We welcome everyone asking questions or giving feedback in our Slack channel or reporting issues on GitHub.
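The throughput figures above are stated in sentences per second. As a rough illustration of how such a number can be obtained, here is a minimal Python sketch that times a per-sentence callable. It is not the benchmark used for this release: `sentences_per_second` and `toy_annotate` are hypothetical names, and `toy_annotate` merely stands in for whatever call a fitted pipeline exposes.

```python
import time
from typing import Callable, Sequence


def sentences_per_second(annotate: Callable[[str], object],
                         sentences: Sequence[str],
                         repeats: int = 3) -> float:
    """Return the best throughput observed over `repeats` timed passes."""
    if not sentences:
        raise ValueError("need at least one sentence to time")
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for sentence in sentences:
            annotate(sentence)
        best = min(best, time.perf_counter() - start)
    # Guard against a zero reading on very coarse clocks.
    return len(sentences) / max(best, 1e-9)


def toy_annotate(sentence: str) -> list:
    # Hypothetical stand-in for a real annotator call (assumption).
    return sentence.lower().split()


sample = ["The quick brown fox jumps over the lazy dog."] * 10_000
rate = sentences_per_second(toy_annotate, sample)
print(f"{rate:,.0f} sentences per second")
```

Taking the best of several passes reduces the influence of warm-up and scheduler noise; the absolute number depends entirely on the machine and on the callable being timed.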
+
+---------------
+Enhancements
+---------------
+* OCR now features kernel segmentation, which significantly improves image-based PDF processing
+* Vivekn Sentiment Analysis prediction performance improved through better data structures
+* Both Norvig and Symmetric Delete spell checkers now have improved performance
+* SentenceDetector accuracy improved through better handling of abbreviations. UseAbbreviations is now turned ON by default
+* SentenceDetector performance improved significantly through better preloading of rules
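To illustrate why abbreviation handling changes sentence boundaries, and why it adds a lookup per token, here is a toy Python splitter. It is only a sketch under stated assumptions: the `ABBREVIATIONS` list and the `use_abbreviations` flag are hypothetical and only conceptually mirror UseAbbreviations; the actual SentenceDetector rules are not reproduced here.

```python
# Hypothetical abbreviation list, for illustration only.
ABBREVIATIONS = {"dr.", "mr.", "mrs.", "e.g.", "i.e.", "etc."}


def split_sentences(text, use_abbreviations=True):
    """Split on tokens ending in . ! ? unless the token is a known abbreviation."""
    sentences, current = [], []
    for token in text.split():
        current.append(token)
        ends_sentence = token.endswith((".", "!", "?"))
        is_abbreviation = use_abbreviations and token.lower() in ABBREVIATIONS
        if ends_sentence and not is_abbreviation:
            sentences.append(" ".join(current))
            current = []
    if current:
        # Keep trailing text that has no closing punctuation.
        sentences.append(" ".join(current))
    return sentences


text = "Dr. Smith arrived. He was late."
with_abbreviations = split_sentences(text, use_abbreviations=True)
without_abbreviations = split_sentences(text, use_abbreviations=False)
print(with_abbreviations)
print(without_abbreviations)
```

With the flag on, "Dr." stays attached to the sentence it belongs to; with it off, the title is split off as its own sentence.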
+
+---------------
+Bug fixes
+---------------
+* Fixed NerDL not training correctly (broken since 1.6.0). Pretrained models are not affected
+* Fixed NerConverter not properly handling multiple sentences per row (after using SentenceDetector), which caused an unhandled exception in some scenarios
+* TensorFlow sessions now all support allow_soft_placement, allowing GPU-based graphs to work with and without a GPU
+* Norvig Spell Checker: fixed a missing step in the algorithm that checks for additional variants. May improve accuracy
+* Norvig Spell Checker: DoubleVariants disabled by default. It was not improving accuracy significantly and was hurting performance badly
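The cost of double variants can be seen in a small Python sketch of the classic Norvig candidate generation. This is the well-known textbook formulation, not the Spark NLP implementation; the function names `single_variants` and `double_variants` are chosen here for illustration and are assumptions.

```python
import string


def single_variants(word, alphabet=string.ascii_lowercase):
    """All strings one edit away: deletes, transposes, replaces, inserts."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:]
                for left, right in splits if right for c in alphabet]
    inserts = [left + c + right for left, right in splits for c in alphabet]
    return set(deletes + transposes + replaces + inserts)


def double_variants(word):
    """All strings two edits away: every single variant of every single variant."""
    return {second
            for first in single_variants(word)
            for second in single_variants(first)}


one_edit = single_variants("speling")
two_edits = double_variants("speling")
print(len(one_edit), len(two_edits))
```

The second set is built by expanding every member of the first, so the candidate count grows roughly quadratically with the number of single edits, and each candidate still has to be looked up in the dictionary.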
+
+---------------
+Developer API
+---------------
+* New FeatureSet allows HashSet params
+
+---------------
+Models
+---------------
+* Vivekn Sentiment Pipeline doesn't have Spell Checker anymore
 Since we are dealing with small amounts of data, we put in practice LightPipelines.
 </p>
 <p>
-<a class="btn btn-warning btn-cta" style="float: center;margin-top: 10px;" href="https://github.com/JohnSnowLabs/spark-nlp/blob/1.6.1/example/src/TrainViveknSentiment.scala" target="_blank"> Take me to notebook!</a>
+<a class="btn btn-warning btn-cta" style="float: center;margin-top: 10px;" href="https://github.com/JohnSnowLabs/spark-nlp/blob/1.6.2/example/src/TrainViveknSentiment.scala" target="_blank"> Take me to notebook!</a>
-<a class="btn btn-warning btn-cta" style="float: center;margin-top: 10px;" href="https://github.com/JohnSnowLabs/spark-nlp/blob/1.6.1/python/example/vivekn-sentiment/sentiment.ipynb" target="_blank"> Take me to notebook!</a>
+<a class="btn btn-warning btn-cta" style="float: center;margin-top: 10px;" href="https://github.com/JohnSnowLabs/spark-nlp/blob/1.6.2/python/example/vivekn-sentiment/sentiment.ipynb" target="_blank"> Take me to notebook!</a>
 Each of these sentences will be used for giving a score to text
 </p>
 </p>
-<a class="btn btn-warning btn-cta" style="float: center;margin-top: 10px;" href="https://github.com/JohnSnowLabs/spark-nlp/blob/1.6.1/python/example/dictionary-sentiment/sentiment.ipynb" target="_blank"> Take me to notebook!</a>
+<a class="btn btn-warning btn-cta" style="float: center;margin-top: 10px;" href="https://github.com/JohnSnowLabs/spark-nlp/blob/1.6.2/python/example/dictionary-sentiment/sentiment.ipynb" target="_blank"> Take me to notebook!</a>
 approach to use the same pipeline for tagging external resources.
 </p>
 <p>
-<a class="btn btn-warning btn-cta" style="float: center;margin-top: 10px;" href="https://github.com/JohnSnowLabs/spark-nlp/blob/1.6.1/python/example/crf-ner/ner.ipynb" target="_blank"> Take me to notebook!</a>
+<a class="btn btn-warning btn-cta" style="float: center;margin-top: 10px;" href="https://github.com/JohnSnowLabs/spark-nlp/blob/1.6.2/python/example/crf-ner/ner.ipynb" target="_blank"> Take me to notebook!</a>
 and it will leverage batch-based distributed calls to native TensorFlow libraries during prediction.
 </p>
 <p>
-<a class="btn btn-warning btn-cta" style="float: center;margin-top: 10px;" href="https://github.com/JohnSnowLabs/spark-nlp/blob/1.6.1/python/example/dl-ner/ner.ipynb" target="_blank"> Take me to notebook!</a>
+<a class="btn btn-warning btn-cta" style="float: center;margin-top: 10px;" href="https://github.com/JohnSnowLabs/spark-nlp/blob/1.6.2/python/example/dl-ner/ner.ipynb" target="_blank"> Take me to notebook!</a>
 </p>
 </div>
 <div>
@@ -211,7 +211,7 @@ <h4 id="text-notebook" class="section-block"> Simple Text Matching</h4>
 This annotator is an AnnotatorModel and does not require training.
 </p>
 <p>
-<a class="btn btn-warning btn-cta" style="float: center;margin-top: 10px;" href="https://github.com/JohnSnowLabs/spark-nlp/blob/1.6.1/python/example/text-matcher/extractor.ipynb" target="_blank"> Take me to notebook!</a>
+<a class="btn btn-warning btn-cta" style="float: center;margin-top: 10px;" href="https://github.com/JohnSnowLabs/spark-nlp/blob/1.6.2/python/example/text-matcher/extractor.ipynb" target="_blank"> Take me to notebook!</a>
 </p>
 </div>
 <div>
@@ -226,7 +226,7 @@ <h4 id="assertion-notebook" class="section-block"> Assertion Status with LogReg<
 dataset will return the appropriate result.
 </p>
 <p>
-<a class="btn btn-warning btn-cta" style="float: center;margin-top: 10px;" href="https://github.com/JohnSnowLabs/spark-nlp/blob/1.6.1/python/example/logreg-assertion/assertion.ipynb" target="_blank"> Take me to notebook!</a>
+<a class="btn btn-warning btn-cta" style="float: center;margin-top: 10px;" href="https://github.com/JohnSnowLabs/spark-nlp/blob/1.6.2/python/example/logreg-assertion/assertion.ipynb" target="_blank"> Take me to notebook!</a>
 </p>
 </div>
 <div>
@@ -241,7 +241,7 @@ <h4 id="dlassertion-notebook" class="section-block"> Deep Learning Assertion Sta
 graphs may be redesigned if needed.
 </p>
 <p>
-<a class="btn btn-warning btn-cta" style="float: center;margin-top: 10px;" href="https://github.com/JohnSnowLabs/spark-nlp/blob/1.6.1/python/example/dl-assertion/assertion.ipynb" target="_blank"> Take me to notebook!</a>
+<a class="btn btn-warning btn-cta" style="float: center;margin-top: 10px;" href="https://github.com/JohnSnowLabs/spark-nlp/blob/1.6.2/python/example/dl-assertion/assertion.ipynb" target="_blank"> Take me to notebook!</a>
 Such components may then be injected seamlessly into further pipelines, and so on.
 </p>
 <p>
-<a class="btn btn-warning btn-cta" style="float: center;margin-top: 10px;" href="https://github.com/JohnSnowLabs/spark-nlp/blob/1.6.1/python/example/model-downloader/ModelDownloaderExample.ipynb" target="_blank"> Take me to notebook!</a>
+<a class="btn btn-warning btn-cta" style="float: center;margin-top: 10px;" href="https://github.com/JohnSnowLabs/spark-nlp/blob/1.6.2/python/example/model-downloader/ModelDownloaderExample.ipynb" target="_blank"> Take me to notebook!</a>