Gensim

Latest version: v4.3.2


2.2.0

:star2: New features:
* Add sklearn wrapper for RpModel (chinmayapancholi13, [1395](https://github.com/RaRe-Technologies/gensim/pull/1395))
* Add sklearn wrappers for LdaModel and LsiModel (chinmayapancholi13, [1398](https://github.com/RaRe-Technologies/gensim/pull/1398))
* Add sklearn wrapper for LdaSeq (chinmayapancholi13, [1405](https://github.com/RaRe-Technologies/gensim/pull/1405))
* Add keras wrapper for Word2Vec model (chinmayapancholi13, [1248](https://github.com/RaRe-Technologies/gensim/pull/1248))
* Add LdaModel.diff method (menshikh-iv, [1334](https://github.com/RaRe-Technologies/gensim/pull/1334))
* Allow use of truncated Dictionary for coherence measures. Fix 1342 (macks22, [1349](https://github.com/RaRe-Technologies/gensim/pull/1349))


:+1: Improvements:
* Fix save_as_text/load_as_text for Dictionary (vlejd, [1402](https://github.com/RaRe-Technologies/gensim/pull/1402))
* Add sampling support for corpus. Fix 308 (vlejd, [1408](https://github.com/RaRe-Technologies/gensim/pull/1408))
* Add napoleon extension to sphinx (rasto2211, [1411](https://github.com/RaRe-Technologies/gensim/pull/1411))
* Add KeyedVectors support to AnnoyIndexer (quole, [1318](https://github.com/RaRe-Technologies/gensim/pull/1318))
* Add BaseSklearnWrapper (chinmayapancholi13, [1383](https://github.com/RaRe-Technologies/gensim/pull/1383))
* Replace num_words with topn in model for unification. Fix 1198 (prakhar2b, [1200](https://github.com/RaRe-Technologies/gensim/pull/1200))
* Rename out_path to out_name & add logging for WordRank model. Fix 1310 (parulsethi, [1332](https://github.com/RaRe-Technologies/gensim/pull/1332))
* Remove multiple iterations of corpus in p_boolean_document (danielchamberlain, [1325](https://github.com/RaRe-Technologies/gensim/pull/1325))
* Fix codestyle in TfIdf (piskvorky, [1313](https://github.com/RaRe-Technologies/gensim/pull/1313))
* Fix warnings from Sphinx. Partial fix 1192 (souravsingh, [1330](https://github.com/RaRe-Technologies/gensim/pull/1330))
* Add test_env to setup.py (menshikh-iv, [1336](https://github.com/RaRe-Technologies/gensim/pull/1336))


:red_circle: Bug fixes:
* Add cleanup in annoy test (prakhar2b, [1420](https://github.com/RaRe-Technologies/gensim/pull/1420))
* Add cleanup in lda backprop test (prakhar2b, [1417](https://github.com/RaRe-Technologies/gensim/pull/1417))
* Fix out-of-vocab in FastText (jayantj, [1409](https://github.com/RaRe-Technologies/gensim/pull/1409))
* Add cleanup in WordRank test (parulsethi, [1410](https://github.com/RaRe-Technologies/gensim/pull/1410))
* Fix rest requirements in Travis. Partial fix 1393 (ibrahimsharaf, menshikh-iv, [1400](https://github.com/RaRe-Technologies/gensim/pull/1400))
* Fix morfessor exception. Partial fix 1324 (souravsingh, [1406](https://github.com/RaRe-Technologies/gensim/pull/1406))
* Fix test for FastText (prakhar2b, [1371](https://github.com/RaRe-Technologies/gensim/pull/1371))
* Fix WikiCorpus (alekol, [1333](https://github.com/RaRe-Technologies/gensim/pull/1333))
* Fix backward incompatibility for LdaModel (chinmayapancholi13, [1327](https://github.com/RaRe-Technologies/gensim/pull/1327))
* Fix support for old and new FastText model format. Fix 1301 (prakhar2b, [1319](https://github.com/RaRe-Technologies/gensim/pull/1319))
* Fix wrapper tests. Fix 1323 (shubhamjain74, [1359](https://github.com/RaRe-Technologies/gensim/pull/1359))
* Update export_phrases method. Fix 794 (toumorokoshi, [1362](https://github.com/RaRe-Technologies/gensim/pull/1362))
* Fix sklearn exception in test (souravsingh, [1350](https://github.com/RaRe-Technologies/gensim/pull/1350))


:books: Tutorial and doc improvements:
* Fix incorrect link in tutorials (aneesh-joshi, [1426](https://github.com/RaRe-Technologies/gensim/pull/1426))
* Add notebook with sklearn wrapper examples (chinmayapancholi13, [1428](https://github.com/RaRe-Technologies/gensim/pull/1428))
* Replace absolute paths with relative paths in notebooks (vochicong, [1414](https://github.com/RaRe-Technologies/gensim/pull/1414))
* Fix code-style in keras notebook (chinmayapancholi13, [1394](https://github.com/RaRe-Technologies/gensim/pull/1394))
* Replace absolute paths with relative paths in notebooks (vochicong, [1407](https://github.com/RaRe-Technologies/gensim/pull/1407))
* Fix typo in quickstart guide (vochicong, [1404](https://github.com/RaRe-Technologies/gensim/pull/1404))
* Update docstring for WordRank. Fix 1384 (parulsethi, [1378](https://github.com/RaRe-Technologies/gensim/pull/1378))
* Update docstring for SkLdaModel (chinmayapancholi13, [1382](https://github.com/RaRe-Technologies/gensim/pull/1382))
* Update logic for updatetype in LdaModel (chinmayapancholi13, [1389](https://github.com/RaRe-Technologies/gensim/pull/1389))
* Update docstring for Doc2Vec (jstol, [1379](https://github.com/RaRe-Technologies/gensim/pull/1379))
* Fix docstring for KL-distance (viciousstar, [1373](https://github.com/RaRe-Technologies/gensim/pull/1373))
* Update Corpora_and_Vector_Spaces tutorial (charliejharrison, [1308](https://github.com/RaRe-Technologies/gensim/pull/1308))
* Add visualization for the difference between LdaModels (menshikh-iv, [1374](https://github.com/RaRe-Technologies/gensim/pull/1374))
* Fix punctuation & typo in changelog (piskvorky, menshikh-iv, [1366](https://github.com/RaRe-Technologies/gensim/pull/1366))
* Fix PEP8 & typo in several PRs (menshikh-iv, [1369](https://github.com/RaRe-Technologies/gensim/pull/1369))
* Update docstrings connected with backward compatibility for LdaModel (chinmayapancholi13, [1365](https://github.com/RaRe-Technologies/gensim/pull/1365))
* Update Corpora_and_Vector_Spaces tutorial (schuyler1d, [1360](https://github.com/RaRe-Technologies/gensim/pull/1360))
* Fix typo in Doc2Vec docstring (fujiyuu75, [1356](https://github.com/RaRe-Technologies/gensim/pull/1356))
* Update Annoy tutorial (pmbaumgartner, [1355](https://github.com/RaRe-Technologies/gensim/pull/1355))
* Update temp folder in tutorials (yl2526, [1352](https://github.com/RaRe-Technologies/gensim/pull/1352))
* Remove spaces after print in Topics_and_Transformation tutorial (gsimore, [1354](https://github.com/RaRe-Technologies/gensim/pull/1354))
* Update Dictionary docstring (oonska, [1347](https://github.com/RaRe-Technologies/gensim/pull/1347))
* Add section headings in word2vec notebook (MikeTheReader, [1348](https://github.com/RaRe-Technologies/gensim/pull/1348))
* Fix broken urls in starter tutorials (ka7eh, [1346](https://github.com/RaRe-Technologies/gensim/pull/1346))
* Update quick start notebook (yardsale8, [1345](https://github.com/RaRe-Technologies/gensim/pull/1345))
* Fix typo in quick start notebook (MikeTheReader, [1344](https://github.com/RaRe-Technologies/gensim/pull/1344))
* Fix docstring in keyedvectors (chinmayapancholi13, [1337](https://github.com/RaRe-Technologies/gensim/pull/1337))

2.1.0

:star2: New features:
* Add modified save_word2vec_format for Doc2Vec, to save document vectors. (parulsethi, [1256](https://github.com/RaRe-Technologies/gensim/pull/1256))


:+1: Improvements:
* Add automatic code style check limited only to the code modified in PR (tmylk, [1287](https://github.com/RaRe-Technologies/gensim/pull/1287))
* Replace `logger.warn` by `logger.warning` (chinmayapancholi13, [1295](https://github.com/RaRe-Technologies/gensim/pull/1295))
* Improve word2vec docstrings and add deprecation labels (shubhvachher, [1274](https://github.com/RaRe-Technologies/gensim/pull/1274))
* Stop passing 'sentences' as parameter to Doc2Vec. Fix 511 (gogokaradjov, [1306](https://github.com/RaRe-Technologies/gensim/pull/1306))
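
The `logger.warn` to `logger.warning` replacement listed above follows the Python standard library, where `Logger.warn` is a deprecated alias of `Logger.warning`. Below is a minimal standard-library sketch of the preferred call. The logger name and the message are made up for illustration and are not taken from gensim's code.

```python
import io
import logging

# Capture log output in memory so the result can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(message)s"))

# Hypothetical logger name, used only for this illustration.
logger = logging.getLogger("changelog_example")
logger.setLevel(logging.WARNING)
logger.propagate = False  # keep the example isolated from the root logger
logger.addHandler(handler)

# Preferred spelling: warning(), with lazy %-style argument formatting.
logger.warning("vocabulary is empty; %s", "build it before training")

captured = stream.getvalue()
```

Both spellings log at the `WARNING` level, so the replacement changes no behavior. It only avoids the deprecated name.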


:red_circle: Bug fixes:
* Allow indexing with np.int64 in doc2vec. Fix 1231 (bogdanteleaga, [1254](https://github.com/RaRe-Technologies/gensim/pull/1254))
* Update Doc2Vec docstring. Fix 1302 (datapythonista, [1307](https://github.com/RaRe-Technologies/gensim/pull/1307))
* Ignore rst and ipynb file in Travis flake8 validations (datapythonista, [1309](https://github.com/RaRe-Technologies/gensim/pull/1309))


:books: Tutorial and doc improvements:
* Update Tensorboard Doc2Vec notebook (parulsethi, [1286](https://github.com/RaRe-Technologies/gensim/pull/1286))
* Update Doc2Vec IMDB Notebook, replace codecs with smart_open (robotcator, [1278](https://github.com/RaRe-Technologies/gensim/pull/1278))
* Add explanation of `size` to Word2Vec Notebook (jbcoe, [1305](https://github.com/RaRe-Technologies/gensim/pull/1305))
* Add extra param to WordRank notebook. Fix 1276 (parulsethi, [1300](https://github.com/RaRe-Technologies/gensim/pull/1300))
* Update warning message in WordRank (parulsethi, [1299](https://github.com/RaRe-Technologies/gensim/pull/1299))

2.0.0

Breaking changes:

Any direct call to the `train()` method of Word2Vec/Doc2Vec now requires an explicit `epochs` parameter and an explicit estimate of corpus size. The most common way to call `train` is `vec_model.train(sentences, total_examples=vec_model.corpus_count, epochs=vec_model.iter)`.
See the [method documentation](https://github.com/RaRe-Technologies/gensim/blob/develop/gensim/models/word2vec.py#L766) for more information.
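
As a rough illustration of the new calling convention, the helper below wraps the call shown above. It is a sketch, not part of gensim: the function name `retrain_on_same_corpus` is hypothetical, and it assumes `model` is a Word2Vec/Doc2Vec instance whose vocabulary has already been built, so that the `corpus_count` and `iter` attributes named in the changelog are set.

```python
def retrain_on_same_corpus(model, sentences):
    """Train `model` on `sentences`, passing the arguments 2.0.0 requires.

    Assumes `model` exposes `corpus_count` (number of examples seen when the
    vocabulary was built) and `iter` (configured number of passes), as in the
    changelog's own example.
    """
    model.train(
        sentences,
        total_examples=model.corpus_count,  # explicit corpus size estimate
        epochs=model.iter,                  # explicit number of passes
    )
    return model
```

With a real model this would be called after `model.build_vocab(sentences)`. When training on a different corpus than the one used to build the vocabulary, `total_examples` should presumably describe that new corpus instead.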


* Explicit epochs and corpus size in word2vec train(). (gojomo, robotcator, [1139](https://github.com/RaRe-Technologies/gensim/pull/1139), [1237](https://github.com/RaRe-Technologies/gensim/pull/1237))

New features:
* Add output word prediction in word2vec. Only for negative sampling scheme. See [ipynb](https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/word2vec.ipynb) (chinmayapancholi13, [1209](https://github.com/RaRe-Technologies/gensim/pull/1209))
* scikit_learn wrapper for LSI Model in Gensim (chinmayapancholi13, [1244](https://github.com/RaRe-Technologies/gensim/pull/1244))
* Add the 'keep_tokens' parameter to 'filter_extremes'. (toliwa, [1210](https://github.com/RaRe-Technologies/gensim/pull/1210))
* Load FastText models with specified encoding (jayantj, [1210](https://github.com/RaRe-Technologies/gensim/pull/1189))


Improvements:
* Fix loading large FastText models on Mac. (jaksmid, [1196](https://github.com/RaRe-Technologies/gensim/pull/1214))
* Sklearn LDA wrapper now works in sklearn pipeline (kris-singh, [1213](https://github.com/RaRe-Technologies/gensim/pull/1213))
* glove2word2vec conversion script refactoring (parulsethi, [1247](https://github.com/RaRe-Technologies/gensim/pull/1247))
* Add Word2vec error message when update is called before train. Fix 1162 (hemavakade, [1205](https://github.com/RaRe-Technologies/gensim/pull/1205))
* Allow training if model is not modified by "_minimize_model". Add deprecation warning. (chinmayapancholi13, [1207](https://github.com/RaRe-Technologies/gensim/pull/1207))
* Update the warning text when building vocab on a trained w2v model (prakhar2b, [1190](https://github.com/RaRe-Technologies/gensim/pull/1190))

Bug fixes:

* Fix word2vec reset_from bug in v1.0.1. Fix 1230. (Kreiswolke, [1234](https://github.com/RaRe-Technologies/gensim/pull/1234))

* Distributed LDA: checking the length of docs instead of the boolean value, plus int index conversion (saparina, [1191](https://github.com/RaRe-Technologies/gensim/pull/1191))

* syn0_lockf initialised with zero in intersect_word2vec_format() (KiddoZhu, [1267](https://github.com/RaRe-Technologies/gensim/pull/1267))

* Fix wordrank max_iter_dump calculation. Fix 1216 (ajkl, [1217](https://github.com/RaRe-Technologies/gensim/pull/1217))

* Make SgNegative test use skip-gram (shubhvachher, [1252](https://github.com/RaRe-Technologies/gensim/pull/1252))

* pep8/pycodestyle fixes for hanging indents in Summarization module (SamriddhiJain, [1202](https://github.com/RaRe-Technologies/gensim/pull/1202))

* Fix WordRank and Mallet wrappers single vs double quote issue on Windows. (prakhar2b, [1208](https://github.com/RaRe-Technologies/gensim/pull/1208))


* Fix 824: no corpus in init, but trim_rule in init (prakhar2b, [1186](https://github.com/RaRe-Technologies/gensim/pull/1186))

* Hardcode version number. Fix 1138. (tmylk, [1138](https://github.com/RaRe-Technologies/gensim/pull/1138))

Tutorial and doc improvements:

* Color dictionary according to topic notebook update (bhargavvader, [1164](https://github.com/RaRe-Technologies/gensim/pull/1164))

* Fix hdp show_topic/s docstring (parulsethi, [1264](https://github.com/RaRe-Technologies/gensim/pull/1264))

* Add docstrings for word2vec.py forwarding functions (shubhvachher, [1251](https://github.com/RaRe-Technologies/gensim/pull/1251))

* Update description for worker_loop function used in score function (chinmayapancholi13, [1206](https://github.com/RaRe-Technologies/gensim/pull/1206))

1.0.1

* Rebuild cumulative table on load. Fix 1180. (tmylk, [1181](https://github.com/RaRe-Technologies/gensim/pull/1181))
* most_similar_cosmul bug fix (dkim010, [1177](https://github.com/RaRe-Technologies/gensim/pull/1177))
* Fix loading old word2vec models pre-1.0.0 (jayantj, [1179](https://github.com/RaRe-Technologies/gensim/pull/1179))
* Load utf-8 words in fasttext (jayantj, [1176](https://github.com/RaRe-Technologies/gensim/pull/1176))

1.0.0

New features:
* Add Author-topic modeling (olavurmortensen, [893](https://github.com/RaRe-Technologies/gensim/pull/893))
* Add FastText word embedding wrapper (Jayantj, [847](https://github.com/RaRe-Technologies/gensim/pull/847))
* Add WordRank word embedding wrapper (parulsethi, [1066](https://github.com/RaRe-Technologies/gensim/pull/1066), [1125](https://github.com/RaRe-Technologies/gensim/pull/1125))
* Add VarEmbed word embedding wrapper (anmol01gulati, [1067](https://github.com/RaRe-Technologies/gensim/pull/1067))
* Add sklearn wrapper for LDAModel (AadityaJ, [932](https://github.com/RaRe-Technologies/gensim/pull/932))

Deprecated features:

* Move `load_word2vec_format` and `save_word2vec_format` out of Word2Vec class to KeyedVectors (tmylk, [1107](https://github.com/RaRe-Technologies/gensim/pull/1107))
* Move properties `syn0norm`, `syn0`, `vocab`, `index2word` from Word2Vec class to KeyedVectors (tmylk, [1147](https://github.com/RaRe-Technologies/gensim/pull/1147))
* Remove support for Python 2.6, 3.3 and 3.4 (tmylk, [1145](https://github.com/RaRe-Technologies/gensim/pull/1145))


Improvements:

* Python 3.6 support (tmylk, [1077](https://github.com/RaRe-Technologies/gensim/pull/1077))
* Phrases and Phraser allow a generator corpus (ELind77, [1099](https://github.com/RaRe-Technologies/gensim/pull/1099))
* Ignore DocvecsArray.doctag_syn0norm in save. Fix 789 (accraze, [1053](https://github.com/RaRe-Technologies/gensim/pull/1053))
* Fix bug in LsiModel that occurs when id2word is a Python 3 dictionary. (cvangysel, [1103](https://github.com/RaRe-Technologies/gensim/pull/1103))
* Fix broken link to paper in readme (bhargavvader, [1101](https://github.com/RaRe-Technologies/gensim/pull/1101))
* Lazy formatting in evaluate_word_pairs (akutuzov, [1084](https://github.com/RaRe-Technologies/gensim/pull/1084))
* Add deacc option to keywords pre-processing (bhargavvader, [1076](https://github.com/RaRe-Technologies/gensim/pull/1076))
* Generate Deprecated exception when using Word2Vec.load_word2vec_format (tmylk, [1165](https://github.com/RaRe-Technologies/gensim/pull/1165))
* Fix hdpmodel constructor docstring for print_topics (toliwa, [1152](https://github.com/RaRe-Technologies/gensim/pull/1152))
* Default to per_word_topics=False in LDA get_item for performance (menshikh-iv, [1154](https://github.com/RaRe-Technologies/gensim/pull/1154))
* Fix bound computation in Author Topic models. (olavurmortensen, [1156](https://github.com/RaRe-Technologies/gensim/pull/1156))
* Write UTF-8 byte strings in tensorboard conversion (tmylk, [1144](https://github.com/RaRe-Technologies/gensim/pull/1144))
* Make top_topics and sparse2full compatible with numpy 1.12 strict int indexing (tmylk, [1146](https://github.com/RaRe-Technologies/gensim/pull/1146))

Tutorial and doc improvements:

* Clarifying comment in is_corpus func in utils.py (greninja, [1109](https://github.com/RaRe-Technologies/gensim/pull/1109))
* Tutorial Topics_and_Transformations fix markdown and add references (lgmoneda, [1120](https://github.com/RaRe-Technologies/gensim/pull/1120))
* Fix doc2vec-lee.ipynb results to match previous behavior (bahbbc, [1119](https://github.com/RaRe-Technologies/gensim/pull/1119))
* Remove Pattern lib dependency in News Classification tutorial (luizcavalcanti, [1118](https://github.com/RaRe-Technologies/gensim/pull/1118))
* Corpora_and_Vector_Spaces tutorial text clarification (lgmoneda, [1116](https://github.com/RaRe-Technologies/gensim/pull/1116))
* Update Transformation and Topics link from quick start notebook (mariana393, [1115](https://github.com/RaRe-Technologies/gensim/pull/1115))
* Quick Start Text clarification and typo correction (luizcavalcanti, [1114](https://github.com/RaRe-Technologies/gensim/pull/1114))
* Fix typos in Author-topic tutorial (Fil, [1102](https://github.com/RaRe-Technologies/gensim/pull/1102))
* Address benchmark inconsistencies in Annoy tutorial (droudy, [1113](https://github.com/RaRe-Technologies/gensim/pull/1113))
* Add note about Annoy speed depending on numpy BLAS setup in annoytutorial.ipynb (greninja, [1137](https://github.com/RaRe-Technologies/gensim/pull/1137))
* Fix dependencies description on doc2vec-IMDB notebook (luizcavalcanti, [1132](https://github.com/RaRe-Technologies/gensim/pull/1132))
* Add documentation for WikiCorpus metadata. (kirit93, [1163](https://github.com/RaRe-Technologies/gensim/pull/1163))

1.0.0rc2

* Add note about Annoy speed depending on numpy BLAS setup in annoytutorial.ipynb (greninja, [1137](https://github.com/RaRe-Technologies/gensim/pull/1137))
* Remove direct access to properties moved to KeyedVectors (tmylk, [1147](https://github.com/RaRe-Technologies/gensim/pull/1147))
* Remove support for Python 2.6, 3.3 and 3.4 (tmylk, [1145](https://github.com/RaRe-Technologies/gensim/pull/1145))
* Write UTF-8 byte strings in tensorboard conversion (tmylk, [1144](https://github.com/RaRe-Technologies/gensim/pull/1144))
* Make top_topics and sparse2full compatible with numpy 1.12 strict int indexing (tmylk, [1146](https://github.com/RaRe-Technologies/gensim/pull/1146))
