caret Package – Machine Learning with R

The R programming language is growing rapidly in popularity and is widely adopted across industries. This popularity is due, in part, to R’s huge collection of open source machine learning algorithms. If you are a data scientist working with R, the caret package (short for [C]lassification [A]nd [RE]gression [T]raining) is a must-have tool in your toolbelt. The package provides capabilities that are useful at every stage of the data science project lifecycle. Most important of all, it provides a common interface for training, tuning, and evaluating more than 200 machine learning algorithms. Not surprisingly, caret is a sure-fire way to increase your productivity as a data scientist!

In this presentation, Dave Langer will introduce the package. The presentation focuses on using caret to implement some of the most common tasks of the data science project lifecycle and shows how to incorporate it into your daily work.

Viewers will learn how to:

• Create stratified random samples of data useful for training machine learning models.
• Train machine learning models using a common interface.
• Leverage caret's powerful features for cross-validation and hyperparameter tuning.
• Scale caret with multi-core, parallel training.
• Increase their knowledge of caret's many features.
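
As a rough illustration of the tasks listed above, here is a minimal sketch in R. It is not the code from the presentation: the built-in iris dataset, the "rpart" model, the 80/20 split, the 10 folds, and the two worker processes are all assumptions chosen for the example, and it presumes the caret, rpart, and doParallel packages are installed.

```r
# Minimal sketch only; dataset, model, and settings are illustrative assumptions.
library(caret)
library(doParallel)  # also attaches foreach and parallel

set.seed(42)

# Stratified random split: sampling is done within each level of Species,
# so the class proportions are preserved in the training set.
train_idx <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
train_set <- iris[train_idx, ]
test_set  <- iris[-train_idx, ]

# Resampling scheme: 10-fold cross-validation.
ctrl <- trainControl(method = "cv", number = 10)

# Register a parallel backend; caret runs its resampling loop through foreach.
cl <- makePSOCKcluster(2)  # 2 workers is an arbitrary choice
registerDoParallel(cl)

# Common interface: swapping `method` switches the algorithm.
# tuneLength = 5 asks caret to try 5 candidate values of the tuning parameter.
fit <- train(Species ~ ., data = train_set, method = "rpart",
             trControl = ctrl, tuneLength = 5)

# Shut down the workers and return to sequential execution.
stopCluster(cl)
registerDoSEQ()

# Evaluate on the held-out data.
preds <- predict(fit, newdata = test_set)
print(confusionMatrix(preds, test_set$Species))
```

Printing `fit` shows the cross-validated accuracy for each candidate tuning value and which one caret selected. Changing `method` to another supported model name is typically the only edit needed to try a different algorithm.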

R code and accompanying dataset can be found here

Package website:
http://topepo.github.io/caret/index.html

More Data Science Material:
[Video Series] Introduction to dplyr
[Video] Data Visualization with R and ggplot2
[Blog] The 4 Pillars of Data Democratization

About The Author
- Data Science Dojo is a paradigm shift in data science learning. We enable all professionals (and students) to extract actionable insights from data.
