# ARIMA modeling and forecasting: Time Series in Python Part 2

In part 2 of this video series, learn how to build an ARIMA time series model using Python’s statsmodels package and forecast N timestamps ahead into the future. Now that we have differenced our data to make it more stationary, we need to determine the autoregressive (AR) and moving average (MA) terms in our model. To determine these, we look at the autocorrelation function (ACF) plot and the partial autocorrelation function (PACF) plot. This series is intended for intermediate and advanced users. If you are looking to further your knowledge in data science, why not check out our data science bootcamp?

#### Transcript

Hi, welcome back to this Data Science Dojo video tutorial series on time series.
In part one, we left off at differencing our data to make it more stationary,
as this is a requirement of many time series models. In part two we’ll
take our differenced data, start modeling on it, and forecast into the future.
So, what we need to do now is look at the autocorrelation function and
partial autocorrelation function plots, or ACF and PACF for short.
These plots help determine the number of autoregressive terms and moving
average terms in an autoregressive moving average model, or help spot seasonality
or periodic trends. So I’ll explain what I mean by autoregressive and moving
average. An autoregressive term forecasts the next timestamp’s
value by regressing over the previous values, and a moving average term
forecasts the next timestamp’s value from the previous forecast errors. The autoregressive
integrated moving average (ARIMA) model, which is the one we’re going to
use, is useful for non-stationary data as it allows us to difference the data, and its
seasonal variant has an additional seasonal differencing parameter for seasonal non-stationary data.
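
For reference, the description above corresponds to the usual textbook form of an ARIMA(p, d, q) model. The symbols below are conventional notation and are an assumption here, since the video does not write the equation out: $y_t$ is the original series, $y'_t$ is the series after $d$ rounds of differencing, $B$ is the backshift operator ($B y_t = y_{t-1}$), $\phi_i$ are the $p$ autoregressive coefficients, $\theta_j$ are the $q$ moving average coefficients, and $\varepsilon_t$ is the forecast error at time $t$.

$$
y'_t = (1 - B)^d \, y_t, \qquad
y'_t = c + \sum_{i=1}^{p} \phi_i \, y'_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \, \varepsilon_{t-j}
$$

With $d = 2$, as used later in this tutorial, $y'_t = y_t - 2 y_{t-1} + y_{t-2}$.
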
So first let’s produce these plots and then I’ll explain how to interpret them.
Our first plot is going to be the ACF plot,
in a different style,
and we’re going to produce a PACF plot as well.
Okay, let’s have a look at these.
Okay, so the ACF and the PACF plots include a 95% confidence interval band.
Anything outside this kind of shaded band here
is a statistically significant correlation. So if we see a significant spike at lag X
in the ACF, that helps us determine the number of moving average terms, and if we
see a significant spike at lag X in the PACF, that helps us determine the number
of autoregressive terms. So here in the ACF plot we see a spike at about one.
That will, in turn, help us determine the number of moving average terms, and
if we look at the PACF, we can see two major spikes here, one at about
five, and one I think at about thirteen. So that will help us determine the
number of AR terms. For now we’re just going to go ahead with a model that only
includes about five AR terms and see how that goes.
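
To make the idea of a "significant spike" concrete, the sketch below computes a sample ACF by hand and flags lags that fall outside an approximate 95% band of ±1.96/√n. This is an illustration under stated assumptions: the data is a simulated MA(1) process, `sample_acf` is a hypothetical helper, and the simple white-noise band is not necessarily the same band that statsmodels draws on its plots.

```python
import numpy as np

def sample_acf(x, nlags):
    """Sample autocorrelation at lags 0..nlags (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    lags = [np.dot(x[:-k], x[k:]) / denom for k in range(1, nlags + 1)]
    return np.array([1.0] + lags)

# Simulated MA(1) process: x_t = e_t + 0.8 * e_(t-1).
rng = np.random.default_rng(1)
noise = rng.normal(size=500)
x = noise[1:] + 0.8 * noise[:-1]

acf_vals = sample_acf(x, nlags=10)

# Approximate 95% band for a white-noise series of this length.
band = 1.96 / np.sqrt(len(x))
significant = [k for k in range(1, 11) if abs(acf_vals[k]) > band]
print("lags outside the band:", significant)
```

Because the simulated series has one moving average term, lag 1 stands out in the ACF, which is the pattern the transcript uses to choose the number of MA terms. Other lags can occasionally cross the band by chance.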
So, now that we have looked at
our ACF and PACF plots, we can build our ARIMA model, which takes into account
the number of terms that we need to use. And just keep in mind this model’s also
going to infer the frequency, so we need to make sure there are no gaps between our
datetimes before we start modeling.
Okay, so let’s call this the ARMA 1 model.
And I’m going to apply our ARIMA model.
And we’re gonna give it our data.
And the order is gonna be our AR and MA terms and differencing.
So, first we’ll put in the number of AR terms here.
Then two rounds of differences, or two sets of differences, and one MA term here.
And I’m going to put an option here, and specify transparams as false.
If you set it as true, it ensures that things are kept stationary, but
you’ll see why I have to set this as false later on in the video tutorial
series, when we talk about issues with our model.
And we’re going to print the summary of our model, so we can get a few details
of the model here, so let’s do that.
I’ll explain how to interpret the summary as well.
Okay, let’s go ahead and run this.
So we’ve had a look at our autocorrelation and partial
autocorrelation and now we’ve built our model.
Alright, so this shows us a summary of our model here.
We probably want to look at the p-values for the coefficients
of our terms here, so our AR terms and our MA terms.
Looking at this is useful because if the p-value for, say, an AR or an MA
coefficient is greater than 0.05, which is our significance level,
a kind of cut-off mark to determine whether it’s significant or not,
then we can say it’s probably not significant enough of a term to keep in the model.
So looking at this, we might want to remodel and include only this
AR or MA term here, as the other ones might not be necessary.
But for the purpose of demonstration, let’s go ahead, and then we’ll
discuss issues with our model later on.
The next step is, we want to predict the next 5 hours, or the next 5 timestamps ahead,
which is our test holdout set.
So I’ll comment these out so they’re not too much of a distraction.
And we’ll give it our model,
and we use the predict function here.
And I’m going to give it the timestamps. The last timestamp was basically 6:00 p.m.
on the 6th of February 9, 2019. So I’m gonna take the timestamps into the future
from the last timestamp, which is from 7:00 p.m. to 11:00 p.m., the five
timestamps ahead, so let’s do this.
I’m also going to set typ to levels, and you’ll see later on why we need to specify that.
Okay, let’s run this.
I’m also going to print these predictions, obviously.
Okay, let’s run this.
Alright, so here are our forecasts, or our predictions, for the next five hours ahead.
We can kind of see it going in this sort of downward trajectory here, so it
predicts that sentiment is likely to turn in a kind of bad direction.
But what we need to keep in mind is, with time series we need to back-transform
our differenced predicted values so we can compare them with our de-differenced, or
original, actual values. This is automatically done when predicting,
when we specify typ as levels here.
We kind of want to predict on the
original scale, not on the differenced kind of scale.
Nevertheless, we’re going to demonstrate how to de-transform, say, two rounds of differences
using cumulative sum, when you’ve been given the original data. So the first step in that
is we want to basically get the second round of differences back to the first
round of differences, and then take that de-differenced data and get it back to
the original. So it’s kind of like a two-step process.
So let’s go ahead and demonstrate this.
So, as I said, we want to get our second round of differences
back to the first round. So I’ll just call this undiff 1.
Take our second round of differences.
And we’re going to fill in any missing values just so they don’t cause us any problems.
And the next step, we want to get that
undifferenced data back to the original. So this is undiff 2.
Once again, fill in any missing values.
Okay, now we can compare these.
There are going to be small differences between our original data and our undifferenced data,
but we’re going to round to six places after the decimal point.
I mean, our values only come in six places after the decimal point anyway,
so they’re not very big differences to care about, and the values are essentially the same
when we do round to six places past the decimal point. So let’s have a look at this.
And we’ll round this.
And we’ll just look at our original data first,
to about six places after the decimal point.
I want to see if it’s equal to our undifferenced data,
also rounded to six places after the decimal point.
And just for our own sanity check, we can just look at the first few values for the original values
and compare them with the de-differenced values to see if they’re on par.
Let’s have a look at this.
Okay, cool. So, it’s come back as true, as there are no
real differences between them. So our undifferenced data and our
original values are on par. And you can have your own kind of sanity check here
to make sure, say, the first few examples are definitely the same.
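
The two-step undifferencing described above can be sketched with pandas alone. This is one way to do it under stated assumptions: the series is synthetic and rounded to six decimal places, the names `undiff1` and `undiff2` follow the transcript, and the missing values are filled with the seed values needed to restart each cumulative sum (the video may fill them differently).

```python
import numpy as np
import pandas as pd

# Hypothetical original data, stored to six decimal places.
rng = np.random.default_rng(3)
series = pd.Series(rng.normal(size=50).cumsum()).round(6)

diff1 = series.diff()   # first round of differences (NaN at position 0)
diff2 = diff1.diff()    # second round of differences (NaN at positions 0, 1)

# Step 1: second differences back to first differences.
# Seed the cumulative sum with the first available first difference.
undiff1 = diff2.copy()
undiff1.iloc[1] = diff1.iloc[1]
undiff1 = undiff1.cumsum()

# Step 2: first differences back to the original scale.
# Seed the cumulative sum with the first original value.
undiff2 = undiff1.copy()
undiff2.iloc[0] = series.iloc[0]
undiff2 = undiff2.cumsum()

# Compare after rounding away floating-point noise.
print(series.round(6).equals(undiff2.round(6)))

# Sanity check: the first few values side by side.
print(pd.DataFrame({"original": series, "undifferenced": undiff2}).head())
```

Each cumulative sum needs one known starting value, which is why this approach only works when the original data (or at least its first values) is available.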
Now that we have modeled the data and made our predictions, we’ll compare our
predictions against the actual values in part three.
Thanks for watching. If you found this video tutorial useful, give us a like. Otherwise, you can check out our
other videos at tutorials.datasciencedojo.com

Watch Part 1 Here:

Watch Part 3 Here:
Mean Absolute Error for Forecast Evaluation: Time Series in Python

Code, R & Python Script Repository

Packages Used:
pandas
matplotlib
StatsModels
statistics


Rebecca holds a bachelor’s degree of information and media from the University of Technology Sydney and a post graduate diploma in mathematics and statistics from the University of Southern Queensland. She has a background in technical writing for games dev and has written for tech publications.

• Anonymous

Thank you so much! 🙂 It is a very nice explanation.

• Rebecca

Thanks! Glad you found it useful
