N-grams – Introduction to Text Analytics with R Part 6
November 26, 2013 6:00 am
Part 6 covers:
• Validating the effectiveness of TF-IDF in improving model accuracy.
• Introducing N-grams as an extension to the bag-of-words model that captures word ordering.
• Discussing the trade-offs of using N-grams and how text analytics suffers from the “Curse of Dimensionality”.
• Illustrating how quickly text analytics can strain the limits of your computer hardware.
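To make the N-gram idea and its dimensionality cost concrete, here is a minimal sketch in base R. It is only an illustration under stated assumptions: `make_ngrams` is a hypothetical helper written for this example, and the three toy documents are made up rather than taken from the spam data set. It builds bigrams by joining adjacent tokens, then compares how many distinct features the unigram and bigram vocabularies produce.

```r
# Hypothetical helper (not from any package): build n-grams from a
# character vector of tokens by joining each run of n adjacent tokens.
make_ngrams <- function(tokens, n) {
  if (length(tokens) < n) return(character(0))
  starts <- seq_len(length(tokens) - n + 1)
  vapply(starts,
         function(i) paste(tokens[i:(i + n - 1)], collapse = "_"),
         character(1))
}

# Toy documents, invented for illustration only.
docs <- c("free prize call now", "call me now", "free call now please")
tokens <- strsplit(docs, " ", fixed = TRUE)

# Distinct features (columns in a document-term matrix) for each model.
unigrams <- unique(unlist(tokens))
bigrams  <- unique(unlist(lapply(tokens, make_ngrams, n = 2)))

cat("unigram features:", length(unigrams), "\n")
cat("bigram features: ", length(bigrams), "\n")
cat("combined:        ", length(unigrams) + length(bigrams), "\n")
```

Even on three short documents, adding bigrams more than doubles the number of columns. On a real corpus the bigram vocabulary typically grows much faster than the unigram vocabulary, which is where the memory pressure comes from.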
Kaggle Dataset:
Kaggle Spam Data Set
The data and R code here
Full Series:
Introduction to Text Analytics with R
More Data Science Material:
[Video] Introduction to N-Grams
[Blog] Natural Language Processing with R Programming Books