In part 3 of this video series, learn how to evaluate time series model predictions using mean absolute error and Python’s statistics and matplotlib packages. We plot actual versus predicted values and calculate the mean absolute error to help evaluate our ARIMA time series model. We also look at potential issues when modeling time series, and how to take this further and learn more in depth. This series is intended for **intermediate** and **advanced** users. We have a data science bootcamp for complete beginners!

Hi, welcome back to this Data Science Dojo video tutorial series on time series.

In part two we left it at modeling our data and predicting five timestamps ahead into the future. In part three we’ll evaluate our predictions and see how far off the mark they were from the actual values in our holdout data, that is, the last five timestamps of our full sample dataset.

So now we’re going to plot actual versus predicted.

We’re going to get two versions of our time series: one with all our values, the last five being actual values, and then we’ll overlay the plot with all the values again but with the last five being predicted values. We should see some difference between the actual and predicted values in the last five timestamps.

So first we’re going to read in our entire sample which includes our last five values as our actual values.

And once again we’re going to use pandas read csv function.

Gonna read in our full sample, or entire dataset.

And once again, we’ll use our date time column as our index column.

Which is the first column.

And we will parse these dates.

We’ll use this squeeze option to return a series.
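The read-in step described above can be sketched roughly as follows. This is a minimal sketch, not the exact code from the video: the CSV contents and column names are made up, and an in-memory string stands in for the real file. Note that recent pandas releases (2.0 and later) no longer accept a `squeeze` argument in `read_csv`, so the sketch calls `.squeeze("columns")` on the result instead.

```python
import io

import pandas as pd

# Hypothetical stand-in for the full-sample CSV file; the real file name and
# column names are not given in the transcript.
csv_text = """date_time,sentiment
2018-01-01 00:00:00,0.10
2018-01-01 01:00:00,0.12
2018-01-01 02:00:00,0.09
"""

# index_col=0 uses the first column as the index; parse_dates=True parses it as dates.
frame = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)

# Squeeze the single-column DataFrame down to a Series.
actual = frame.squeeze("columns")

print(type(actual).__name__)
print(actual.index.dtype)
```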

Now, I want to print the row values, or index values, of our last five values, the holdout set, as we’re going to input these into another series. To get them, we’ll just call index on those values.

We’re going to get the last five here for our actual values.

I’m gonna get the index values for these starting at position 19 and going up to 23, so the slice ends at 24.

And we’re going to print these out, so we can have a look as well.

Okay, let’s have a look at these.

Alright, so these values here are basically the timestamps for our holdout set. We’re going to input these into another series with our prediction values, to tie our predicted values to each of their timestamps.

Another way you can create a time series is using the Series function here. We were reading from a CSV before, but you can do it this way as well.

Give it our predicted values, and we’re gonna create the row index.

I’m gonna paste in these values here, just so you can see the last five timestamps.

But you can just feed it that index values variable.

I’ll clean these up a bit.

Okay, great. And let’s print this just to make sure that it is in a correct format.
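As a rough sketch of this step, assuming a made-up 24-hour series and made-up forecast numbers (neither comes from the video), the holdout timestamps can be sliced from the index and passed straight to `pd.Series`:

```python
import pandas as pd

# Hypothetical 24-timestamp hourly series standing in for the full sample.
index = pd.date_range("2018-01-01", periods=24, freq=pd.Timedelta(hours=1))
actual = pd.Series([0.5 - 0.01 * i for i in range(24)], index=index)

# Positions 19 through 23 are the last five timestamps (the holdout set).
holdout_index = actual.index[19:24]
print(holdout_index)

# Made-up predictions for illustration only; use your own model's forecasts.
predicted_values = [0.30, 0.29, 0.27, 0.28, 0.26]

# Tie each predicted value to its timestamp.
predictions = pd.Series(predicted_values, index=holdout_index)
print(predictions)
```

Feeding the sliced index in directly avoids pasting timestamps by hand, which is easy to get wrong.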

Alright, let’s have a look.

Okay, great. So we have our predicted values tied to their timestamps now in a series, and what we’re going to do is append that onto our training set so that, as I said, we have one version of our series with the predicted values and one version with the actual values. So let’s go ahead and do that.

You can comment these out, as we no longer need to print them.

And I’m just going to print the tail end of this, just to make sure it appended onto the end of the training set.

Okay, let’s have a look here.

Okay, great. So it looks like it successfully appended onto the training set here. So now we have a full series with predicted values and a full series with actual values.
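Here is a minimal sketch of the append step, using hypothetical data. One caveat: `Series.append` was removed in pandas 2.0, so this sketch uses `pd.concat`, which works on both older and newer versions.

```python
import pandas as pd

# Hypothetical full sample of 24 hourly timestamps.
index = pd.date_range("2018-01-01", periods=24, freq=pd.Timedelta(hours=1))
actual = pd.Series([0.5 - 0.01 * i for i in range(24)], index=index)

# The first 19 timestamps are the training set.
train = actual.iloc[:19]

# Made-up predictions tied to the last five timestamps.
predictions = pd.Series([0.30, 0.29, 0.27, 0.28, 0.26], index=actual.index[19:24])

# Training values followed by predicted values: the "predicted" version of the series.
with_predicted = pd.concat([train, predictions])

# Check the tail to confirm the predictions landed at the end.
print(with_predicted.tail(7))
```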

Okay, now let’s plot the actual versus predicted.

I’m going to create a plot here.

We’ll start with our predicted values, and I’m just gonna plot them in the color orange.

And I’m gonna give it a label so I can add a legend later.

And I’m gonna do the same for actual, obviously.

And I’ll just color this a different color, so maybe blue.

And I’ll also give it a label.

And I’m also going to create a legend for this, so we can differentiate these lines.

I’ll just place it in the upper left, it’s a pretty reasonable location.
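The plotting steps above might look something like this. It is a sketch on hypothetical data; the `Agg` backend and the output file name are assumptions added so the script runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data: 24 hourly timestamps, last five replaced by made-up predictions.
index = pd.date_range("2018-01-01", periods=24, freq=pd.Timedelta(hours=1))
actual = pd.Series([0.5 - 0.01 * i for i in range(24)], index=index)
predictions = pd.Series([0.30, 0.29, 0.27, 0.28, 0.26], index=index[19:24])
with_predicted = pd.concat([actual.iloc[:19], predictions])

fig, ax = plt.subplots()

# Predicted version first, in orange, with a label for the legend.
ax.plot(with_predicted.index, with_predicted.values, color="orange", label="Predicted")

# Actual version overlaid in blue.
ax.plot(actual.index, actual.values, color="blue", label="Actual")

# The legend lets us tell the two lines apart.
ax.legend(loc="upper left")

fig.savefig("actual_vs_predicted.png")
```

The two lines overlap for the first 19 timestamps, so only the last five show a visible gap.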

Okay, let’s have a look at our actual versus predicted.

See if it was way off the mark or not.

Okay, so having a look at this, the predicted line follows the same general downward pattern as the actual. It’s quite off the mark here, but we can’t tell exactly how far off the mark. So we need to calculate the mean absolute error as a way of seeing how big these differences between actual and predicted are, so let’s go ahead and do that.

So I’ll comment these out.

And we’re going to calculate the mean absolute error to evaluate the model: we take the differences between the actual values and the predicted values and average over their absolute values. So first of all we’ll get the actual values in our holdout set.

We’ll just get the index starting at 19 and ending at 23, so the last five values, our holdout set.

We’ll do the same for predicted.

Okay, great. Now we’re going to go through and compare each value. We’re going to take the first actual value and subtract the first predicted value, then take the second actual value and subtract the second predicted value, and so on and so forth.

We’re going to store all these differences between the two in an array called prediction errors. And then at the end of that we’re just going to average over their absolute values to get an idea of the mean absolute error, or the overall size of the error here.

So, for example, you take the first actual value minus the first predicted value.

And we’re going to append that onto our prediction errors array.

And we want to have a quick look at these differences, to see if they’re quite big or not.
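The loop just described can be sketched like this. The numbers are made up for illustration, and the variable names are assumptions based on what is said in the video.

```python
# Made-up holdout values for illustration only.
actual_values = [0.31, 0.27, 0.30, 0.25, 0.24]
predicted_values = [0.30, 0.29, 0.27, 0.28, 0.26]

# Walk through both lists in step: actual minus predicted for each timestamp.
prediction_errors = []
for actual_value, predicted_value in zip(actual_values, predicted_values):
    prediction_errors.append(actual_value - predicted_value)

# Positive errors mean the model underestimated; negative means it overestimated.
print(prediction_errors)
```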

Alright, let’s have a look at these.

Tabs versus four spaces: the war begins.

We use four spaces in this instance. Just make sure that’s consistent, because Python relies on indentation and mixing tabs and spaces causes errors.

Okay, so here are our differences between actual and predicted.

So they don’t seem too bad. In some cases they might be quite far off the mark, considering that we have values that go six places after the decimal point. Zero point two, zero point two five might be quite a big difference.

But the way to really judge this is to average over their absolute values.

Okay, we’ll store it in a variable called mean absolute error, and we’re going to use the statistics package to get the mean.

We take the mean of the absolute values of our prediction errors. And that’s pretty much it.

And we obviously want to print this, so let’s have a look at it.
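As a minimal sketch with made-up error values, the mean absolute error comes down to one call to `statistics.mean` over the absolute errors:

```python
import statistics

# Made-up prediction errors (actual minus predicted) for illustration only.
prediction_errors = [0.01, -0.02, 0.03, -0.03, -0.02]

# Take absolute values first so positive and negative errors do not cancel out.
absolute_errors = [abs(error) for error in prediction_errors]
mean_absolute_error = statistics.mean(absolute_errors)

print(round(mean_absolute_error, 3))
```

Without the absolute values, over- and underestimates would offset each other and make the model look better than it is.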

Okay, so our mean absolute error is about 0.02.

That basically means the model is off the mark by about 0.02 on average, so it’s either underestimating or overestimating. But considering, as I said, that there are six digits past the decimal point, maybe this is quite a big difference.

Maybe it’s not too big of a deal. It’s something that we need to consider here.

You’d have to think about this and decide whether you would accept this model as it is.

There are a few problems to be aware of in this model. For one, the data might not be entirely stationary. Even though it looked fairly stationary to our judgement when we were plotting it before, a test would help better determine this. So what we could do is use the Augmented Dickey-Fuller test to check whether those two rounds of differencing that we did resulted in stationary data or not.

So let’s have a look here and see why we’re getting a relatively big mean absolute error.

And we’re going to print the p-value for this test. If the p-value is greater than 0.05, which is our significance level, we fail to reject the null hypothesis that the data is non-stationary. And if it’s less than or equal to 0.05, we reject that null hypothesis and say that the data is stationary.

So if we want it to be stationary, we want to see a p-value less than or equal to 0.05.

Let’s see if this is the case.

Okay, let’s print this and have a look.

Okay, so we probably wouldn’t accept the model as it is, because the test confirms that we have stationarity issues with our data; it’s not completely stationary yet.

So this could be a reason why it’s a bit off the mark. So then we need to look at better transforming this data. One way you could do this is to stabilize the variance by applying, say, the cube root, which can handle both negative and positive values. And then you can difference the data.
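One way to sketch that transformation, on made-up values, is with NumPy's `np.cbrt` followed by `np.diff`. Raising a NumPy float array to the power `1/3` gives `nan` for negative entries, which is why the dedicated cube root function is used here.

```python
import numpy as np

# Made-up values with both signs, for illustration only.
values = np.array([-8.0, -1.0, 0.0, 1.0, 8.0, 27.0])

# The cube root is defined for negative numbers and shrinks large swings.
cube_rooted = np.cbrt(values)

# First difference: each value minus the one before it.
differenced = np.diff(cube_rooted)

print(cube_rooted)
print(differenced)
```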

You might also want to compare models with different AR and MA terms. Remember when we printed the summary of our model and there were some terms that weren’t really significant enough to be included in the model? Maybe look at running a model with just one MA term and see if that makes a difference to the results.

Also, another thing to consider is that this is a very small sample size of only 24 timestamps in our entire dataset, with 19 in our training set.

There might not be enough data to spare for a holdout set. So to get more out of your data for training, you could look at rolling over the time series a few timestamps at a time to create different holdout sets. This allows you to train on more timestamps, so holding out a single chunk at the end doesn’t stop the model from capturing the last timestamps.
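The rolling idea can be sketched with a small helper that yields positions for an expanding training window. The function name and the window sizes below are hypothetical choices, not something shown in the video.

```python
def rolling_holdout_splits(n_timestamps, initial_train_size, holdout_size):
    """Yield (train_positions, holdout_positions) for an expanding training window."""
    train_end = initial_train_size
    while train_end + holdout_size <= n_timestamps:
        train_positions = list(range(train_end))
        holdout_positions = list(range(train_end, train_end + holdout_size))
        yield train_positions, holdout_positions
        # Roll forward: the previous holdout chunk joins the training data.
        train_end += holdout_size


# Hypothetical sizes: 24 timestamps, start training on 14, hold out 5 at a time.
splits = list(rolling_holdout_splits(n_timestamps=24, initial_train_size=14, holdout_size=5))
for train_positions, holdout_positions in splits:
    print(len(train_positions), holdout_positions)
```

Each split would get its own model fit and its own mean absolute error, and averaging those errors gives a steadier picture than a single holdout set.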

Another thing is that the data only covers 24 hours in one day.

I mean, would we start to capture more of a trend in hourly sentiment if we collected data over several days?

How would you go about collecting more data?

So that’s something else to think about.

So what I would like you to do now, is take on this challenge and further improve on this model.

So you’ve been given a head start, now I want you to take this example and improve on it.

Sometimes we get into the habit of just following along and copying what somebody else is doing, but I want you to think critically about this, and think about some of the issues that we talked about and how you can take this further.

To study time series further, you also need to understand things like model diagnostics and using the AIC to search for the best model parameters.

You need to be able to handle any datetime data issues. You might want to try other modeling techniques.

So, time series is something that we plan to introduce in Data Science Dojo’s post-bootcamp material, but you can learn more during a short, intense bootcamp. We cover some key machine learning algorithms and techniques, and we take you through the critical thinking process behind many data science tasks. You can check out the curriculum below this video.

But keep fine-tuning and keep practicing.

If you found this video tutorial useful, give us a like.

Otherwise, you can check out our other videos at tutorials.datasciencedojo.com

**Watch Part 1:**

Read and Transform your Data: Time Series in Python

**Watch Part 2:**

ARIMA modeling and forecasting: Time Series in Python

**Code, R & Python Script Repository**

**Packages Used:**

pandas

matplotlib

StatsModels

statistics

**More Data Science Material:**

[Video] Getting started with Python and R for Data Science

[Video] Web scraping in Python and Beautiful Soup

[Blog] Breaking the Curse of Dimensionality with Python
