Sparknlp

Latest version: v1.0.0

Safety actively analyzes 630169 Python packages for vulnerabilities to keep your Python projects secure.

Scan your dependencies

Page 2 of 12

2.5.5

========
---------------
New Features
---------------
- Add getClasses() function to NerDLModel
- Add getClasses() function to ClassifierDLModel
- Add getClasses() function to SentimentDLModel

---------------------
Enhancements
---------------------
- Improve max sequence length calculation in BertEmbeddings and XlnetEmbeddings

----------------
Bugfixes
----------------
- Fix a bug in RegexTokenizer in Python
- Fix StopWordsCleaner exception in Python when pretrained() is used
- Fix max sequence length issue in AlbertEmbeddings and SentencePiece generation
- Fix HDFS support for setGaphFolder param in NerDLApproach

========

2.5.4

========
---------------
New Features
---------------
* Add support for Apache Spark 2.3.x including new Maven artifacts and full support of all pre-trained models/pipelines
* Add 43 new pre-trained models in 43 languages to StopWordsCleaner annotator
* Introduce a new RegexTokenizer to split text by regex pattern

---------------------
Enhancements
---------------------
* Retrained 6 new BioBERT and ClinicalBERT models
* Add a new param to `start()` function to start the session for Apache Spark 2.3.x

----------------
Bugfixes
----------------
* Add missing library for SentencePiece used by AlbertEmbeddings and XlnetEmbeddings on Windows
* Fix ModuleNotFoundError in LanguageDetectorDL pipelines in Python


========

2.5.3

========
---------------
New Features
---------------
* TextMatcher now can construct the chunks from tokens instead of the original documents via buildFromTokens param
* CoNLLGenerator now is accessible in Python


----------------
Bugfixes
----------------
* Fix a bug in ContextSpellChecker resulting in IllegalArgumentException

---------------------
Enhancements
---------------------
* Improve RocksDB connection to support different storage capabilities
* Improve parameters naming convention in ContextSpellChecker

---------------------
Enhancements
---------------------
* Add NerConverter to documentation
* Fix multi-language tabs in documentation


========

2.5.2

========
---------------
New Features
---------------
* Introducing a new LanguageDetectorDL state-of-the-art annotator to detect and identify languages in documents and sentences
* Add a new param entityValue to TextMatcher to add custom value inside metadata. Useful in post-processing when there are multiple TextMatcher annotators with multiple dictionaries https://github.com/JohnSnowLabs/spark-nlp/issues/920

----------------
Bugfixes
----------------
* Add missing TensorFlow graphs to train ContextSpellChecker annotator https://github.com/JohnSnowLabs/spark-nlp/issues/912
* Fix misspelled param in classThreshold param in ContextSpellChecker annotator https://github.com/JohnSnowLabs/spark-nlp/issues/911
* Fix a bug where setGraphFolder in NerDLApproach annotator couldn't find a graph on Databricks (DBFS) https://github.com/JohnSnowLabs/spark-nlp/issues/739
* Fix a bug in NerDLApproach when includeConfidence was set to true https://github.com/JohnSnowLabs/spark-nlp/issues/917
* Fix a bug in BertEmbeddings https://github.com/JohnSnowLabs/spark-nlp/issues/906 https://github.com/JohnSnowLabs/spark-nlp/issues/918

---------------------
Enhancements
---------------------
* Improve TF backend in ContextSpellChecker annotator


========

2.5.1

========
---------------
New Features
---------------
* Add Python support for PubTator reader to convert automatic annotations of the biomedical datasets into DataFrame
* Add 6 new pre-trained BERT models from BioBERT and ClinicalBERT

---------------------
Enhancements
---------------------
* Add unit tests for XlnetEmbeddings
* Add unit tests for AlbertEmbeddings
* Add unit tests for ContextSpellChecker


========

2.5.0

========
---------------
New Features
---------------
* A new AlbertEmbeddings annotator with 4 available pre-trained models
* A new XlnetEmbeddings annotator with 2 available pre-trained models
* A new ContextSpellChecker annotator, the state-of-the-art annotator for spell checking
* A new SentimentDL annotator for multi-class sentiment analysis. This annotator comes with 2 available pre-trained models trained on IMDB and Twitter datasets
* Add new PubTator reader to convert automatic annotations of the biomedical datasets into DataFrame
* Introducing a new outputLogsPath param for NerDLApproach, ClassifierDLApproach and SentimentDLApproach annotators
* Refactored CoNLLGenerator to actually use NER labels from the DataFrame
* Unified params in NerDLModel in both Scala and Python
* Extend and complete Scaladoc APIs for all the annotators

----------------
Bugfixes
----------------
* Fix position of tokens in Normalizer
* Fix Lemmatizer exception on a bad input
* Fix annotator logs failing on object storage file systems like DBFS

----------------
Documentation
----------------
* Update documentation for release of Spark NLP 2.5.x
* Update the entire [spark-nlp-workshop](https://github.com/JohnSnowLabs/spark-nlp-models) notebooks for Spark NLP 2.5.x
* Update the entire [spark-nlp-models](https://github.com/JohnSnowLabs/spark-nlp-workshop) repository with new pre-trained models and pipelines

========

Page 2 of 12

© 2024 Safety CLI Cybersecurity Inc. All Rights Reserved.