* new LDA implementation after Hoffman et al.: Online Learning for Latent Dirichlet Allocation
* distributed LDA
* updated LDA docs (wiki experiments, distributed tutorial)
* matrixmarket header now uses capital 'M's: MatrixMarket. (André Lynum reported that Matlab has trouble processing the lowercase version; header example below)
* moved code to github
* started gensim Google group
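For reference, a minimal Matrix Market file with the corrected banner; the first line is the part Matlab is case-sensitive about:

    %%MatrixMarket matrix coordinate real general
    % 3x4 sparse matrix with 2 non-zero entries, 1-based indices
    3 4 2
    1 2 0.5
    3 4 1.0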
0.7.6
* added workaround for a bug in numpy: pickling a fortran-order array (e.g. an LSA model) and then loading it back and using it results in a segfault (thx to Brian Merrel; sketch of the idea below)
* bundled a new version of ez_setup.py: the old one failed with Python 2.6 when setuptools was missing (thx to Alan Salmoni)
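The workaround boils down to forcing arrays into C (row-major) order before pickling. A minimal sketch of the idea in plain numpy; the helper name is made up here and gensim's internal fix may differ:

    import pickle
    import numpy

    def pickle_safely(arr, fname):
        # numpy of this era can segfault when a Fortran-order (column-major)
        # array is unpickled and then used; convert to C-contiguous order first
        with open(fname, 'wb') as fout:
            pickle.dump(numpy.ascontiguousarray(arr), fout, protocol=-1)

    arr = numpy.asfortranarray(numpy.random.rand(100, 200))  # e.g. an LSA projection
    pickle_safely(arr, '/tmp/lsa_projection.pkl')
    with open('/tmp/lsa_projection.pkl', 'rb') as fin:
        restored = pickle.load(fin)
    assert numpy.allclose(arr, restored)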
0.7.5
* further optimization to LSA; this is the version used in my NIPS workshop paper
* got rid of the SVDLIBC dependency (one-pass LSA now uses a stochastic algo for the base decompositions)
0.7.4
* sped up Latent Dirichlet Allocation ~10x (through scipy.weave, optional)
* finally, distributed LDA! scales almost linearly, but there is no tutorial yet; see the tutorial on distributed LSI, everything is completely analogous (usage sketch below)
* several minor fixes and improvements; one nasty bug fixed (lsi[corpus] didn't work; thx to Danilo Spinelli)
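A minimal usage sketch of distributed LDA, assuming the same pattern as the distributed LSI tutorial (dispatcher and worker processes already running); parameter names follow the later snake_case API and the file paths are placeholders:

    from gensim import corpora, models

    # corpus and dictionary prepared beforehand (paths are placeholders)
    corpus = corpora.MmCorpus('/tmp/corpus.mm')
    id2word = corpora.Dictionary.load('/tmp/dictionary.dict')

    # exactly like distributed LSI: distributed=True farms the expensive
    # inference step out to the worker processes
    lda = models.LdaModel(corpus, id2word=id2word, num_topics=100, distributed=True)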
0.7.3
* added stochastic SVD decomposition (faster than the current one-pass LSI algo, but needs two passes over the input corpus; sketch below)
* published gensim on mloss.org
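The two passes are the usual randomized SVD recipe: one pass to sample the range of the input, a second to project onto it. A dense-numpy sketch of the idea, not gensim's actual streamed implementation:

    import numpy

    def stochastic_svd(a, rank, oversample=10):
        m, n = a.shape
        k = min(rank + oversample, min(m, n))
        # pass 1 over the input: sample its range with a Gaussian test matrix
        omega = numpy.random.randn(n, k)
        q, _ = numpy.linalg.qr(a.dot(omega))  # orthonormal basis for the sample
        # pass 2 over the input: project onto the basis, decompose the small matrix
        b = q.T.dot(a)
        u_b, s, vt = numpy.linalg.svd(b, full_matrices=False)
        return q.dot(u_b)[:, :rank], s[:rank], vt[:rank]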
0.7.2
* added workaround for a numpy bug where SVD sometimes fails to converge for no good reason (retry sketch below)
* changed content of gensim's PyPi title page
* completed HTML tutorial on distributed LSA
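One common workaround pattern for this non-convergence bug is to retry the decomposition on a minutely perturbed copy of the input; a sketch of that idea (gensim's actual fix may differ):

    import numpy

    def robust_svd(a, retries=3, eps=1e-10):
        for attempt in range(retries):
            try:
                return numpy.linalg.svd(a, full_matrices=False)
            except numpy.linalg.LinAlgError:
                # "SVD did not converge": jitter the input slightly and retry
                a = a + eps * numpy.random.randn(*a.shape)
        raise numpy.linalg.LinAlgError("SVD failed to converge after %i retries" % retries)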