LSA, VSM, & SVD – Introduction to Text Analytics with R Part 7

Part 7 of this video series covers latent semantic analysis (LSA), the vector space model (VSM), and singular value decomposition (SVD):

– The trade-offs of expanding the text analytics feature space with n-grams.
– How bag-of-words representations map to the vector space model (VSM).
– Using the dot product between document vectors as a proxy for correlation.
– Latent semantic analysis (LSA) as a means to address the curse of dimensionality in text analytics.
– How LSA is implemented using singular value decomposition (SVD).
– Mapping new data into the lower dimensional SVD space.
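
The topics above can be sketched end to end in a few lines of base R. This is a minimal illustration on a made-up four-document corpus, not the code from the video: the corpus, the helper names (`bigrams`, `to_counts`, `project`), and the choice of `k = 2` are all assumptions for the example, and base `svd()` stands in for whichever truncated SVD routine the series' own code uses on the real spam data.

```r
# Toy corpus (hypothetical); each string is one document.
docs <- c(d1 = "free prize call now",
          d2 = "call me now",
          d3 = "free prize free cash",
          d4 = "see you at lunch")

# Bag-of-words: split on whitespace, then count each vocabulary term per document.
tokens <- strsplit(tolower(docs), "\\s+")
vocab  <- sort(unique(unlist(tokens)))
dtm <- t(vapply(tokens,
                function(tk) as.numeric(table(factor(tk, levels = vocab))),
                numeric(length(vocab))))
dimnames(dtm) <- list(names(docs), vocab)   # rows = documents, columns = terms

# N-gram trade-off: adding bigrams captures word order but grows the feature space.
bigrams <- function(tk) paste(head(tk, -1), tail(tk, -1), sep = "_")
n_unigram_features <- length(vocab)
n_with_bigrams <- length(unique(unlist(lapply(tokens, function(tk) c(tk, bigrams(tk))))))

# Vector space model: each row of dtm is a document vector.
# Dot products between rows grow with the number of shared terms.
dots   <- dtm %*% t(dtm)
norms  <- sqrt(diag(dots))
cosine <- dots / outer(norms, norms)        # length-normalized similarity

# LSA via SVD of the term-document matrix: tdm = U * diag(d) * t(V).
tdm <- t(dtm)
k   <- 2                                    # number of latent dimensions (assumption)
s   <- svd(tdm, nu = k, nv = k)
U_k     <- s$u                              # terms x k
sigma_k <- s$d[1:k]                         # top k singular values
V_k     <- s$v                              # documents x k: documents in the reduced space

# Map a term-count vector into the reduced space: d_hat = diag(1 / sigma_k) %*% t(U_k) %*% d.
project <- function(term_counts) as.vector((t(U_k) %*% term_counts) / sigma_k)

# Count vocabulary terms in new text; words outside the vocabulary are dropped.
to_counts <- function(text) {
  tk <- strsplit(tolower(text), "\\s+")[[1]]
  as.numeric(table(factor(tk, levels = vocab)))
}

new_hat <- project(to_counts("free cash now"))
print(round(cosine, 2))
print(new_hat)
```

Projecting a training document with `project()` reproduces its row of `V_k`, which is a quick sanity check that new data lands in the same space as the training documents. On real data the same projection would be applied to vectors preprocessed exactly like the training set.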

Kaggle Dataset:
Kaggle Spam Data Set

The data and R code are available here

Full Series:
Introduction to Text Analytics with R

More Data Science Material:
[Video] Introduction to Natural Language Processing
[Blog] Liberating the Data Artist: Rethinking Creativity in Data Science


About The Author
- Data Science Dojo is a paradigm shift in data science learning. We enable all professionals (and students) to extract actionable insights from data.

