ARIMA modeling and forecasting: Time Series in Python Part 2

In part 2 of this video series, learn how to build an ARIMA time series model using Python’s statsmodels package and forecast N timestamps into the future. Now that we have differenced our data to make it more stationary, we need to determine the Autoregressive (AR) and Moving Average (MA) terms in our model. To determine these, we look at the Autocorrelation Function (ACF) plot and the Partial Autocorrelation Function (PACF) plot. This series is intended for intermediate and advanced users. If you are looking to further your knowledge in data science, why not check out our data science bootcamp?

Watch Part 1 Here:
Read and Transform your data: Time Series in Python

Watch Part 3 Here:
Mean Absolute Error for Forecast Evaluation: Time Series in Python

Code, R & Python Script Repository

Packages Used:

More Data Science Material:
[Video] Getting started with Python and R for Data Science
[Video] Web scraping in Python and Beautiful Soup
[Blog] Supercharge your Python Plots with Zero Extra Code


Rebecca Merrett
About The Author
- Rebecca holds a bachelor’s degree in information and media from the University of Technology Sydney and a postgraduate diploma in mathematics and statistics from the University of Southern Queensland. She has a background in technical writing for game development and has written for tech publications.


  • Sergey

    Thank you Rebecca. Great videos indeed. One observation: for some reason I was expecting you to use hourly_sentiment_series_diff2 instead of hourly_sentiment_series in ARMA1model (line 38, video 2). Or was the previous differencing only meant to come up with the correct orders for the model? Sergey

  • Rebecca

    Thanks! Within ARIMA() you specify the order as the number of AR terms, the number of differences, and the number of MA terms. So the order in this example is order=(5,2,1), with 2 sets of differences on the data. We use hourly_sentiment_series_diff2 only for plotting, to see whether differencing helped make the data more stationary. We first check whether 1 set of differences is enough before applying differences again. If you need to difference 2 times, then you set the middle value of order to 2. If 1 set of diffs does a fairly good job, then set it to 1. The model will then difference the data the same way we differenced it for the purpose of plotting and checking how it looks.

