R programming is rapidly becoming a valuable skill for data professionals of all stripes and a must-have skill for aspiring data scientists. Adding R programming to your data analyst skillset allows you to leverage powerful data visualizations, statistical analyses, and …
Microsoft’s Power BI is a powerful technology for quickly creating rich visualizations. Power BI has many practical uses for the modern data professional, including executive dashboards, operational dashboards, and visualizations for data exploration and analysis.
Event logs are everywhere and represent a prime source of Big Data. Event log sources run the gamut from e-commerce web servers to devices participating in globally distributed Internet of Things (IoT) architectures. Even Enterprise Resource Planning (ERP) systems produce …
In the final video in our Data Mining Fundamentals series, we conclude our discussion of different visualization techniques for data exploration with scatter plots and contour plots. We will define each plot, and share examples of when you can use …
Histograms and box plots are among the most popular visualization techniques. In this tutorial, we discuss the unique benefits of both and provide examples of when you can use each for your data exploration and visualization.
Measuring center and spread is the next topic in our discussion of data exploration and visualization. We discuss measures of center, such as the median and mean, and look at measures of spread, such as range and variance.
Data exploration uses visualization and calculation to better understand the characteristics of data. We will cover the key motivations for data exploration as well as the techniques it uses.
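As a rough illustration of these measures, here is a minimal sketch using Python's standard `statistics` module. The sample values are made up for the example and do not come from the tutorial.

```python
import statistics

# Hypothetical sample with one large outlier (40)
values = [2, 4, 4, 5, 7, 9, 40]

# Measures of center
mean = statistics.mean(values)      # pulled upward by the outlier
median = statistics.median(values)  # middle value, robust to the outlier

# Measures of spread
value_range = max(values) - min(values)
variance = statistics.variance(values)  # sample variance (divides by n - 1)

print(mean, median, value_range, variance)
```

The gap between the mean and the median in this sample shows why the median is often preferred when outliers are present.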
Summary statistics are numbers that summarize properties of data, and the frequency of an attribute value is a percentage measuring how often the value occurs in the data set. We will also describe percentiles, and provide examples of each.
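A minimal sketch of both ideas, assuming Python's standard library; the `colors` and `scores` lists are hypothetical data chosen only for illustration.

```python
from collections import Counter
import statistics

# Hypothetical categorical attribute
colors = ["red", "blue", "red", "green", "red", "blue"]

# Frequency of each value, expressed as a percentage of all records
counts = Counter(colors)
frequencies = {value: 100 * n / len(colors) for value, n in counts.items()}

# Hypothetical numeric attribute
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95]

# 25th, 50th, and 75th percentiles (quartiles); the "inclusive" method
# treats the data as the whole population rather than a sample
quartiles = statistics.quantiles(scores, n=4, method="inclusive")

print(frequencies)
print(quartiles)
```

Note that different tools use different interpolation rules for percentiles, so results can vary slightly between libraries.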
Correlation, and how to evaluate it visually, is the next step in our discussion of similarity and dissimilarity. Correlation measures the linear relationship between objects, and to evaluate it visually, you will need to build a scatter plot.
Euclidean distance and cosine similarity are the next aspects of similarity and dissimilarity we will discuss. We will show you how to calculate the Euclidean distance and construct a distance matrix.
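To make the idea concrete, the sketch below computes the Pearson correlation coefficient by hand in plain Python. The function name `pearson` and the `hours` and `scores` data are assumptions made for this example, not material from the tutorial.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the summed squared deviations
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical paired observations
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]

r = pearson(hours, scores)
print(round(r, 3))
```

A value near 1 or -1 indicates a strong linear relationship, while a value near 0 indicates little or none; a scatter plot of the same pairs would show how closely the points follow a straight line.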
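The calculations can be sketched in a few lines of plain Python. The helper names and the four sample points are illustrative assumptions, not taken from the video.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def cosine_similarity(p, q):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)

# Hypothetical 2-D points
points = [(0, 2), (2, 0), (3, 1), (5, 1)]

# Distance matrix: entry [i][j] holds the distance between points i and j
distance_matrix = [[euclidean(p, q) for q in points] for p in points]

for row in distance_matrix:
    print([round(d, 3) for d in row])
```

The matrix is symmetric with zeros on the diagonal, so in practice only the upper or lower triangle needs to be stored.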
This series is part of our pre-bootcamp course work …