The post-COVID supply chain world has become significantly more unpredictable and complex. Predicting customer demand is as important as ever, and a precise demand forecasting model is worth its weight in gold. Improving forecast accuracy aids in inventory optimization, warehouse resource/labor planning, supply & logistics planning, financial planning, and enhancing customer service levels.
It has become imperative for large organizations to invest in advanced forecasting. While statistical forecasting has been around for decades, time series forecasting is now evolving at breakneck speed. Because of this pace of innovation, traditional supply chain planning platforms sometimes cannot keep up with the latest technology and AI innovations. As a result, many enterprises today are investing in building advanced forecasting models to augment the base forecast coming out of the supply chain platforms.
As someone who is leading the inbound supply chain digital transformation of a Fortune 100 company, I have been fortunate to be in the middle of some of these cool innovations in advanced forecasting, where we are experimenting with the latest time series forecasting machine learning models. I have been regularly working with some of the public datasets in Kaggle competitions as a test bed for the advanced forecasting work, and here are some of my learnings from those experiments.
Never underestimate Data Preparation & EDA: While it is natural for us to want to jump straight into experimentation with various models, we must never underestimate the data preparation, EDA, and feature engineering portions.
- If we are transforming the time series data into a supervised learning dataset, we should create lag features.
- Derive date-related features (day of week, month, holiday flags, etc.) from the timestamp values.
- If the observations are not at the time frequency the business needs, we need to upsample or downsample them to the required granularity.
- We need to make sure we decompose the time series into its components – Trend, Seasonality, and Error/Noise. It is important to visualize and understand whether the trend and seasonality are additive or multiplicative.
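- Always create ACF & PACF plots to see how strongly the series is correlated with its own past values and at which lags.
- Test for stationarity of the time series using the Augmented Dickey-Fuller (ADF) test, and if it is non-stationary, we need to convert it to a stationary time series (for example, by differencing).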
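Several of the steps above can be sketched with pandas alone. This is a minimal illustration on a synthetic daily series, not a full pipeline: the series, the lag choices (1, 7, 14), and the function name build_features are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical daily demand: linear trend + weekly seasonality + noise.
rng = np.random.default_rng(0)
steps = np.arange(120)
demand = pd.Series(
    100 + 0.5 * steps + 10 * np.sin(2 * np.pi * steps / 7) + rng.normal(0, 2, 120),
    index=pd.date_range("2023-01-01", periods=120, freq="D"),
    name="demand",
)

def build_features(series, lags=(1, 7, 14)):
    """Turn a time series into a supervised-learning frame (illustrative helper)."""
    frame = series.to_frame()
    for lag in lags:
        # Each lag column holds the value observed `lag` periods earlier.
        frame[f"lag_{lag}"] = series.shift(lag)
    # Date-related features derived from the timestamp index.
    frame["day_of_week"] = frame.index.dayofweek
    frame["month"] = frame.index.month
    frame["is_weekend"] = frame.index.dayofweek >= 5
    # The first max(lags) rows have no complete history, so drop them.
    return frame.dropna()

features = build_features(demand)

# Downsample daily -> weekly averages, then upsample back to daily by interpolation.
weekly = demand.resample("W").mean()
daily_again = weekly.resample("D").interpolate()

# First differencing is one common way to remove a trend before modelling.
diffed = demand.diff().dropna()

# A single autocorrelation value; the weekly pattern shows up at lag 7.
weekly_autocorr = diffed.autocorr(lag=7)
```

For the decomposition, ACF/PACF plots, and the ADF test itself, a library such as statsmodels (seasonal_decompose, plot_acf, plot_pacf, adfuller) is the usual choice; they are left out here to keep the sketch dependency-light.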
Know your domain – it is important: While this statement holds good for any data science-related project, it is particularly relevant for a time series forecasting model.
- Often in a real-world project, it is not a “one size fits all” model that solves your forecasting problems. We need to carefully examine the input data and understand the different unique patterns/types for proper segmentation analysis.
For example, in the food service industry, forecasting for perishable products can follow a different approach compared to forecasting for frozen/non-perishable products with a significantly longer shelf life. A fast-moving product category will behave differently than a slow mover.
- Often, it might be beneficial to have a clustering model to properly segment the demand patterns by product category, customer segments, supplier segments, etc., before applying any time series model to them.
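As a rough sketch of such a segmentation step, the snippet below summarizes each item's demand history with a few features and groups the items with a plain k-means loop written in NumPy. The item names, the three features, and the deterministic initialization are assumptions chosen for illustration; on real data we would normally standardize the features and use a tested library implementation.

```python
import numpy as np

def demand_profile(history):
    """Summarize one item's history: log mean level, variability, share of zero periods."""
    history = np.asarray(history, dtype=float)
    mean = history.mean()
    cv = history.std() / mean if mean > 0 else 0.0
    return np.array([np.log1p(mean), cv, np.mean(history == 0)])

def kmeans(points, k, n_iter=50):
    """Plain Lloyd's k-means with a deterministic start (evenly spaced rows by first feature)."""
    order = np.argsort(points[:, 0])
    centroids = points[order[np.linspace(0, len(points) - 1, k).astype(int)]].copy()
    for _ in range(n_iter):
        # Distance of every point to every centroid, then nearest-centroid assignment.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical weekly histories: five fast movers and five slow, intermittent movers.
rng = np.random.default_rng(42)
histories = {f"fast_{i}": rng.poisson(50, 52) for i in range(5)}
histories.update({f"slow_{i}": rng.poisson(0.3, 52) for i in range(5)})

profiles = np.vstack([demand_profile(h) for h in histories.values()])
labels, _ = kmeans(profiles, k=2)
segments = dict(zip(histories, labels.tolist()))
```

Each segment can then get its own modeling approach, for instance a different model family for intermittent slow movers than for steady fast movers.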
Understand your External Demand Signals: External demand signals often play a very important role in the sophistication of forecasting models. In today’s world, when unstructured data has increased exponentially, it is crucial to automatically ingest relevant external data like weather, events, social media feeds, marketing fliers, etc., as inputs to the forecasting model. This capability helps in two ways:
- Lets the forecasting model account for exogenous variables (factors outside the demand history itself), thereby improving forecast accuracy.
- Automates the process of ingesting these external demand signals, instead of demand planners having to manually gather, analyze, and input the data.
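To make this concrete, here is a small sketch of how external signals might be joined to the demand history as extra model inputs. The tables, column names (max_temp_c, event), and the function add_external_signals are hypothetical; a real pipeline would pull these from weather or event feeds.

```python
import pandas as pd

# Hypothetical daily demand for one store.
demand = pd.DataFrame({
    "date": pd.date_range("2024-07-01", periods=7, freq="D"),
    "units": [120, 115, 130, 160, 170, 210, 190],
})

# Hypothetical weather feed with one missing day (2024-07-03).
weather = pd.DataFrame({
    "date": pd.to_datetime(["2024-07-01", "2024-07-02", "2024-07-04",
                            "2024-07-05", "2024-07-06", "2024-07-07"]),
    "max_temp_c": [27.0, 28.5, 31.0, 32.5, 33.0, 30.0],
})

# Hypothetical event calendar.
events = pd.DataFrame({
    "date": pd.to_datetime(["2024-07-04", "2024-07-06"]),
    "event": ["holiday", "local_festival"],
})

def add_external_signals(demand, weather, events):
    """Left-join external signals onto demand so no demand rows are lost."""
    out = demand.merge(weather, on="date", how="left")
    out = out.merge(events, on="date", how="left")
    # Carry the last known temperature forward over gaps in the feed.
    out["max_temp_c"] = out["max_temp_c"].ffill()
    # A simple binary flag is often easier for a model to use than the event name.
    out["has_event"] = out["event"].notna().astype(int)
    return out

enriched = add_external_signals(demand, weather, events)
```

One caveat when using such features for forecasting: at prediction time only forecasts of the weather are available, not actuals, so training on forecast values keeps the setup honest.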
Know your Models – their strengths and weaknesses: While it is always tempting to look for the “latest and greatest” models out there to solve a forecasting problem (and in a topic that is evolving as fast as advanced forecasting, that is quite a temptation for us to resist), it is important to know their relative strengths and weaknesses. We should always be mindful of the operational cost, complexity, and ROI of a particular model before choosing one.
- Holt Winters Model: A lot of time series forecasting problems can be solved with the traditional Holt Winters model (Triple Exponential Smoothing). This model works well with time series that have a clear trend and seasonality. Another advantage is its simplicity. However, it forecasts each series from its own history only and cannot take external variables or relationships between series into account, which can impact forecast accuracy. Also, I have seen the Holt Winters model produce lower MAPE in multiplicative trend/seasonality scenarios.
- Auto Regressive Model (ARIMA/SARIMA/SARIMAX): Auto Regressive models are known for their simplicity and ability to handle time series data with linear patterns. However, they cannot model non-linear patterns, and they forecast a single target series: SARIMAX can take exogenous regressors as inputs, but jointly forecasting several series requires a multivariate extension such as VAR.
- Long Short Term Memory (LSTM): As a deep learning model, LSTM can handle time series data with long-term dependencies. It can also handle non-stationary data. However, LSTM is computationally expensive to train and typically requires a large amount of training data. LSTM is also complex, and it can be challenging to interpret the results of an LSTM model.
- Tree Based Models (LightGBM, Random Forest, XGBoost): These models are powerful, but they have a limitation regarding time series forecasting. A tree predicts by averaging target values it saw during training, so it is great at identifying patterns but cannot extrapolate a trend beyond the range of the training data. Hence, it is important to de-trend a time series and make it stationary before feeding it to tree-based models.
- Prophet: Facebook Prophet is a robust and user-friendly library for Time Series Forecasting. Its main advantage is that it handles seasonality, holidays, and missing data really well, and its simplicity allows for quick experimentation. However, being an additive model by default (multiplicative seasonality has to be switched on through its seasonality_mode setting), it sometimes struggles to predict non-additive trends. So, we can definitely consider Prophet if we are trying to build a quick model to minimize Time to Market, but we might want to evaluate alternatives if we want more stable and accurate models.
- DeepAR: DeepAR from Amazon is an autoregressive recurrent neural network (LSTM-based) with further sophistication to improve forecast accuracy on complex data. One of its main advantages is that it is effective at learning seasonal dependencies with minimal tuning. It also produces probabilistic forecasts (as Monte Carlo samples) rather than a single point estimate. However, DeepAR requires a large amount of data for training and does not perform particularly well on small datasets. Also, being a deep learning model, its results are not easily interpretable.
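To show how little machinery the simplest of these models needs, below is a minimal sketch of additive Holt Winters written in NumPy. It follows the standard additive update equations; the initialization scheme, the default smoothing parameters, and the function name are assumptions made for this example, and in practice a library implementation (for instance ExponentialSmoothing in statsmodels) that also fits the parameters would be the usual choice.

```python
import numpy as np

def holt_winters_additive(y, season_len, alpha=0.3, beta=0.1, gamma=0.2, horizon=1):
    """Additive triple exponential smoothing with fixed (not fitted) parameters."""
    y = np.asarray(y, dtype=float)
    m = season_len
    if len(y) < 2 * m:
        raise ValueError("need at least two full seasons to initialize")

    # Initial trend: average per-period change between the first two seasons.
    first, second = y[:m].mean(), y[m:2 * m].mean()
    trend = (second - first) / m
    # Initial level: position of the fitted trend line at the end of season one.
    level = first + trend * (m - 1) / 2
    # Initial seasonal terms: first-season values minus that trend line.
    seasonal = [y[i] - (first + trend * (i - (m - 1) / 2)) for i in range(m)]

    for t in range(m, len(y)):
        s_prev = seasonal[t - m]
        new_level = alpha * (y[t] - s_prev) + (1 - alpha) * (level + trend)
        new_trend = beta * (new_level - level) + (1 - beta) * trend
        seasonal.append(gamma * (y[t] - level - trend) + (1 - gamma) * s_prev)
        level, trend = new_level, new_trend

    n = len(y)
    # Project the last level and trend forward, reusing the latest seasonal terms.
    return np.array([
        level + h * trend + seasonal[n - m + (h - 1) % m]
        for h in range(1, horizon + 1)
    ])

# Hypothetical quarterly-style series: linear trend plus a repeating 4-period pattern.
pattern = np.array([10.0, -5.0, -15.0, 10.0])
history = 100 + 2 * np.arange(24) + np.tile(pattern, 6)
forecast = holt_winters_additive(history, season_len=4, horizon=4)
```

A multiplicative variant divides by and multiplies with the seasonal terms instead of subtracting and adding them, which is worth trying when the seasonal swings grow with the level of the series.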
Conclusion – It is an Ensemble Cast
At the end of the day, there is no one best model for time series forecasting, especially for a large organization’s sales data. It is important for us to examine both internal and external data from a domain perspective, perform extensive EDA and feature engineering to understand the nuances, and perform enough clustering/segmentation of data. It is also important that we run enough experiments with different models to arrive at solid comparisons of forecast accuracy. More often than not, it will be an ensemble of methods that works best for large, complex time series datasets.
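As a closing illustration, here is a minimal sketch of comparing models by MAPE and blending them with weights proportional to the inverse of their error. The numbers are contrived so that the two hypothetical models err in opposite directions, and the helper names are assumptions; real gains from ensembling are usually smaller, and the weights should be learned on a validation window separate from the one used for the final comparison.

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error in percent, skipping periods with zero actuals."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    mask = actual != 0
    return float(np.mean(np.abs((actual[mask] - forecast[mask]) / actual[mask])) * 100)

def inverse_error_weights(errors):
    """Give each model a weight proportional to 1 / error, normalized to sum to 1."""
    inverse = 1.0 / np.asarray(errors, dtype=float)
    return inverse / inverse.sum()

# Contrived example: model A over-forecasts by 10%, model B under-forecasts by 10%.
actual = np.array([100.0, 120.0, 90.0, 110.0])
model_forecasts = {
    "model_a": np.array([110.0, 132.0, 99.0, 121.0]),
    "model_b": np.array([90.0, 108.0, 81.0, 99.0]),
}

errors = {name: mape(actual, f) for name, f in model_forecasts.items()}
weights = inverse_error_weights(list(errors.values()))

# Weighted blend of the individual forecasts.
ensemble = sum(w * f for w, f in zip(weights, model_forecasts.values()))
ensemble_error = mape(actual, ensemble)
```

Here the opposite biases cancel, so the blend beats both individual models; the same comparison loop extends naturally to more models and to per-segment evaluation.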