Your First Test – Introduction to Text Analytics with R Part 11
November 26, 2013 11:00 am
Your First Test includes specific coverage of:
– Pre-processing new, unseen textual data to allow for predictions from our trained model.
– The importance of caching the IDF values calculated from the training data set, so that the same weights can be reused to compute TF-IDF for new, unseen, pre-processed data.
– Performing SVD projections of new, unseen, pre-processed textual data into the latent semantic space.
– Creating predictions and evaluating model effectiveness in the context of accuracy, sensitivity, and specificity.
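The steps listed above can be sketched roughly in base R as follows. This is a minimal illustration, not the code from the video: the toy count matrices, the three-term vocabulary, the helper names (tf.idf.docs, project, evaluate), the choice of k = 2, and treating "spam" as the positive class are all assumptions made here for the example. The series' own code may rely on different packages and functions.

```r
# Toy document-term count matrices (rows = documents, columns = terms).
# Assumption: the test data has already been pre-processed so that its
# columns match the training vocabulary, in the same order.
train.counts <- matrix(c(2, 0, 1,
                         0, 3, 1,
                         1, 1, 0,
                         0, 2, 2),
                       nrow = 4, byrow = TRUE,
                       dimnames = list(NULL, c("free", "meeting", "call")))
test.counts <- matrix(c(1, 0, 1,
                        0, 0, 0),
                      nrow = 2, byrow = TRUE,
                      dimnames = list(NULL, colnames(train.counts)))

# Term frequency: each document's counts divided by its total count.
term.frequency <- function(row) row / sum(row)

# Inverse document frequency for one term (one column of counts).
inverse.doc.freq <- function(col) log10(length(col) / sum(col > 0))

# Compute the IDF vector ONCE, from the training data, and cache it.
train.idf <- apply(train.counts, 2, inverse.doc.freq)

# TF-IDF for any count matrix using a supplied (cached) IDF vector.
tf.idf.docs <- function(counts, idf) {
  tf <- t(apply(counts, 1, term.frequency))  # documents x terms
  tfidf <- sweep(tf, 2, idf, `*`)            # scale each term column by its IDF
  tfidf[!is.finite(tfidf)] <- 0              # empty documents give 0/0 = NaN
  tfidf
}

train.tfidf <- tf.idf.docs(train.counts, train.idf)
test.tfidf  <- tf.idf.docs(test.counts, train.idf)  # reuse the training IDF

# SVD of the term-document matrix built from the training data.
k <- 2
train.svd <- svd(t(train.tfidf), nu = k, nv = k)
sigma.inverse <- 1 / train.svd$d[1:k]
u.transpose <- t(train.svd$u)                # k x terms

# Project documents into the latent space: d.hat = Sigma^-1 U^T d
project <- function(tfidf) {
  t(sigma.inverse * (u.transpose %*% t(tfidf)))  # documents x k
}

train.latent <- project(train.tfidf)
test.latent  <- project(test.tfidf)

# Accuracy, sensitivity, and specificity from predicted vs. actual labels.
evaluate <- function(predicted, actual, positive = "spam") {
  tp <- sum(predicted == positive & actual == positive)
  tn <- sum(predicted != positive & actual != positive)
  fp <- sum(predicted == positive & actual != positive)
  fn <- sum(predicted != positive & actual == positive)
  c(accuracy    = (tp + tn) / length(actual),
    sensitivity = tp / (tp + fn),
    specificity = tn / (tn + fp))
}

# Made-up labels, only to exercise the function.
actual    <- c("spam", "ham", "ham", "spam", "ham")
predicted <- c("spam", "ham", "spam", "ham", "ham")
metrics <- evaluate(predicted, actual)
```

Projecting the training documents with the same formula reproduces the right singular vectors in train.svd$v, which is a handy sanity check before projecting test data. Note that which class counts as "positive" changes the meaning of sensitivity and specificity, so it is worth stating explicitly when reporting results.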
Kaggle Dataset:
Kaggle Spam Data Set
The data and R code here
Full Series:
Introduction to Text Analytics with R
More Data Science Material:
[Video] Steps in Experimentation
[Blog] Text Mining: Breathing Structure to the Unstructured
Tags: R Programming, Text Analytics