ARIMA modeling and forecasting: Time Series in Python Part 2

In part 2 of this video series, learn how to build an ARIMA time series model using Python’s statsmodels package and forecast N timestamps ahead into the future. Now that we have differenced our data to make it more stationary, we need to determine the Autoregressive (AR) and Moving Average (MA) terms in our model. To determine these, we look at the Autocorrelation Function (ACF) plot and the Partial Autocorrelation Function (PACF) plot. This series is aimed at intermediate and advanced users.

Watch Part 1 Here:
Read and Transform your data: Time Series in Python

Watch Part 3 Here:
Mean Absolute Error for Forecast Evaluation: Time Series in Python

Code, R & Python Script Repository

Packages Used:
pandas
matplotlib
statsmodels
statistics

More Data Science Material:
[Video] Getting started with Python and R for Data Science
[Video] Web scraping in Python and Beautiful Soup
[Blog] Supercharge your Python Plots with Zero Extra Code

Rebecca Merrett
About The Author
- Rebecca holds a bachelor’s degree in information and media from the University of Technology Sydney, and is undertaking a postgraduate diploma in mathematics and statistics at the University of Southern Queensland. She has a background in writing for tech publications.

6 Comments

  • Anonymous

    Thank you, Rebecca. Great videos indeed. One observation: for some reason I was expecting you to use hourly_sentiment_series_diff2 instead of hourly_sentiment_series in ARMA1model (line 38, video 2). Or was the previous differencing only there to come up with the correct orders for the model? Sergey

  • Rebecca

    Thanks! Within ARIMA() you specify the order as the number of AR terms, the number of differences, and the number of MA terms. So the order in this example is order=(5,2,1), with 2 sets of differences applied to the data. We use hourly_sentiment_series_diff2 only for plotting, to see whether differencing helped make the data more stationary. We first check whether 1 set of differences is enough before applying differences again. If you need to difference 2 times, set the middle value of order to 2. If 1 set of differences does a fairly good job, set it to 1. ARIMA() will then difference the data the same way we differenced it for plotting and checking how it looks, which is why the undifferenced hourly_sentiment_series is passed to the model.
