gensim

Gensim is a FREE Python library
  • Scalable statistical semantics
  • Analyze plain-text documents for semantic structure
  • Retrieve semantically similar documents

Who is using Gensim?
Doing something interesting with gensim? Ask to be featured here.

  • “Here at Tailwind, we use Gensim to help our customers post interesting and relevant content to Pinterest. No fuss, no muss. Just fast, scalable language processing.” Waylon Flinn, Tailwind
  • “We are using gensim every day. Over 15 thousand times per day to be precise. Gensim’s LDA module lies at the very core of the analysis we perform on each uploaded publication to figure out what it’s all about. It simply works.” Andrius Butkus, Issuu
  • “Gensim hits the sweetest spot of being a simple yet powerful way to access some incredibly complex NLP goodness.” Alan J. Salmoni, Roistr.com
  • “I used gensim at Ghent university. I found it easy to build prototypes with various models, extend it with additional features and gain empirical insights quickly. It's a reliable library that can be used beyond prototyping too.” Dieter Plaetinck, IBCN group
  • “We used gensim in several text mining projects at Sports Authority. The data were from free-form text fields in customer surveys, as well as social media sources. Having gensim significantly sped our time to development, and it is still my go-to package for topic modeling with large retail data sets.” Josh Hemann, Sports Authority
  • “Semantic analysis is a hot topic in online marketing, but there are few products on the market that are truly powerful. Gensim is undoubtedly one of the best frameworks that efficiently implement algorithms for statistical analysis. Few products, even commercial, have this level of quality.” Bruno Champion, DynAdmic
  • “Based on our experience with gensim on DML-CZ, we naturally opted to use it on a much bigger scale for similarity of fulltexts of scientific papers in the European Digital Mathematics Library. In evaluation with other approaches, gensim became a clear winner, especially because of speed, scalability and ease of use.” Petr Sojka, EuDML
  • “We have been using gensim in several DTU courses related to digital media engineering and find it immensely useful as the tutorial material provides students an excellent introduction to quickly understand the underlying principles in topic modeling based on both LSA and LDA.” Michael Kai Petersen, Technical University of Denmark