Sparknlp

Latest version: v1.0.0

2.6.5

========
----------------
Bugfixes
----------------
* Fix a bug in batching sentences in BertSentenceEmbeddings
* Fix AttributeError when trying to load a saved EmbeddingsFinisher in Python

----------------
Enhancements
----------------
* Improve exception handling in DocumentAssembler when the user passes a corrupted DataFrame

========

2.6.4

========
----------------
Bugfixes
----------------
* Fix loading from a local folder with no access to the cache folder
* Fix NullPointerException in DocumentAssembler when rows contain null values
* Fix dynamic padding in BertSentenceEmbeddings

========

2.6.3

========
---------------
New Features
---------------
* Add enableMemoryOptimizer to allow training NerDLApproach on a dataset larger than the available memory
* Add option to explode sentences in SentenceDetectorDL
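
To illustrate what "exploding" sentences means in general, here is a minimal pure-Python sketch. It does not use Spark NLP or show the SentenceDetectorDL API; the row layout and the names `explode_sentences`, `doc_id`, and `sentences` are hypothetical and chosen only for illustration. The idea is that each detected sentence becomes its own row instead of all sentences of a document sharing one row.

```python
from typing import Any, Dict, List


def explode_sentences(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Turn one row per document into one row per sentence.

    Hypothetical row layout: {"doc_id": int, "sentences": List[str]}.
    """
    exploded: List[Dict[str, Any]] = []
    for row in rows:
        for sentence in row["sentences"]:
            # Each sentence keeps a reference to the document it came from.
            exploded.append({"doc_id": row["doc_id"], "sentence": sentence})
    return exploded


# Two documents: the first holds two sentences, the second holds one.
rows = [
    {"doc_id": 1, "sentences": ["Hello world.", "How are you?"]},
    {"doc_id": 2, "sentences": ["Single sentence."]},
]
exploded_rows = explode_sentences(rows)
```

With the sample input, the two document rows become three sentence rows, which is convenient when a downstream step expects one sentence per row.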

----------------
Enhancements
----------------
* Improve POS (AveragedPerceptron) performance
* Improve Norvig Spell Checker performance

----------------
Bugfixes
----------------
* Fix SentenceDetectorDL unsupported model error in pretrained function
* Fix a race condition in Lru that can cause a NullPointerException during LightPipeline operations with embeddings
* Fix max sequence length calculation in BertEmbeddings and BertSentenceEmbeddings
* Fix threshold in YakeModel on the Python side

========

2.6.2

========
---------------
New Features
---------------
* Introducing a new SentenceDetectorDL

----------------
Enhancements
----------------
* Improve the quality of BioBERT models for BertEmbeddings (higher accuracy in sequence classification)
* Improve the quality of Sentence BioBERT models for BertSentenceEmbeddings (higher accuracy in text classification)
* Add unit test to MultiClassifierDL annotator
* Better error handling in SentimentDLApproach
* Improve loadSavedModel in BertEmbeddings and BertSentenceEmbeddings

----------------
Bugfixes
----------------
* Fix BERT LaBSE model for BertSentenceEmbeddings
* Fix loadSavedModel for BertSentenceEmbeddings in Python

---------------
Deprecations
---------------
* DeepSentenceDetector is deprecated in favor of SentenceDetectorDL

========

2.6.1

========
----------------
Bugfixes
----------------
* Fix a bug in ClassifierDL that resulted in low accuracy during training

========

2.6.0

========
------------------------------
Major features and improvements
------------------------------

* **NEW:** A new MultiClassifierDL annotator for multi-label text classification
* **NEW:** A new BertSentenceEmbeddings annotator with 41 available pre-trained models for sentence embeddings used in SentimentDL, ClassifierDL, and MultiClassifierDL annotators
* **NEW:** A new YakeModel annotator implementing an unsupervised, corpus-independent, domain- and language-independent, single-document keyword extraction algorithm
* Integrate 24 new Small BERT models, where the smallest model is 24x smaller and 28x faster compared to BERT base models
* Add 3 new ELECTRA small, base, and large models
* Add 4 new Finnish BERT models for BertEmbeddings and BertSentenceEmbeddings
* Improve BertEmbeddings memory consumption by 30%
* Improve BertEmbeddings performance by more than 70% with new built-in dynamic shape inputs
* Remove the poolingLayer parameter in BertEmbeddings in favor of sequence_output that is provided by TF Hub models for new BERT models
* Add validation loss, validation accuracy, validation F1, and validation True Positive Rate during training in MultiClassifierDL
* Add parameter to enable/disable list detection in SentenceDetector
* Unify the logging in ClassifierDL and SentimentDL during training
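
The "dynamic shape inputs" item above can be pictured with a small pure-Python sketch. This is not the BertEmbeddings implementation; it only assumes that "dynamic" means padding each batch to its own longest sequence rather than to a fixed maximum length. The names `PAD_ID`, `pad_batch_dynamic`, and `pad_batch_fixed`, and the token ids, are hypothetical.

```python
from typing import List

PAD_ID = 0  # hypothetical padding token id, for illustration only


def pad_batch_fixed(batch: List[List[int]], max_len: int) -> List[List[int]]:
    """Truncate or pad every sequence to exactly max_len tokens."""
    padded = []
    for seq in batch:
        clipped = seq[:max_len]
        padded.append(clipped + [PAD_ID] * (max_len - len(clipped)))
    return padded


def pad_batch_dynamic(batch: List[List[int]], max_len: int) -> List[List[int]]:
    """Pad only up to the longest sequence in this batch, capped at max_len."""
    if not batch:
        return []
    target = min(max(len(seq) for seq in batch), max_len)
    return pad_batch_fixed(batch, target)


# A batch of two short token-id sequences.
batch = [[101, 7592, 102], [101, 7592, 2088, 999, 102]]
fixed = pad_batch_fixed(batch, max_len=128)
dynamic = pad_batch_dynamic(batch, max_len=128)
```

In this sketch the fixed variant produces rows of 128 tokens each, while the dynamic variant produces rows of 5 tokens, so far fewer padding positions need to be processed for short inputs.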

----------------
Bugfixes
----------------
* Fix Tokenization bug with Bigrams in the exception list
* Fix the versioning error in second SBT projects that caused models not to be found via the pretrained function
* Fix logging to file in NerDLApproach, ClassifierDL, SentimentDL, and MultiClassifierDL on HDFS
* Fix ignored modified tokens in BertEmbeddings; it now considers modified tokens instead of the originals

========
