Galit Shmueli

Chair Professor @ NTHU, Taiwan. Pioneered business analytics courses at U of Maryland, Indian School of Business, NTHU & Statistics.com. Read about her research & textbooks at galitshmueli.com

Location Taiwan

Activity

  • As a final bonus, this week the "Practical Time Series Forecasting" e-textbooks are at a token $1.99 (or less) on Amazon and Google Play Books

    https://amzn.to/2xe6PsF
    https://amzn.to/3eek6Sv

  • @FergusHoward if you have a specific question please post it here and one of the educators will respond. Please note that the goal of the discussion boards is for anyone to respond (other participants, other educators)

  • Hello Chibamba,
    The focus of this course is forecasting, and even that is such a large topic to cover! We therefore cannot teach R programming in this course. The R code and examples shown in our course (and in the textbook) are intended for those with at least basic R knowledge. For others, we provide a free license for the Excel add-on Analytic Solver...

  • David: we double-checked the links to the Kaggle pages mentioned here and they all seem to work fine. Please try again, or try another browser

  • Please note that in later weeks we will be using an Excel add-on called Analytic Solver (XLMiner), which only works if you have Microsoft Excel. You can use the online version of Office 365 instead of installing Excel on your computer. Alternatively, you can use R.

    I will also note that for the purpose of visualization, you are better off using Tableau or other...

  • Hello Vic: We cannot model the noise — that is the uncertainty part of forecasting, but we do try to quantify it, which gives us an idea of how wrong our forecasts will be

  • Thanks for catching this. We found the new URL and it is now updated

  • Sonam - you are right! The interview was filmed in my office, where prayer flags from Bhutan are hung for good luck

  • Thank you for this comment. There are so many interesting related topics that we could have mentioned! We had to choose the most important topics in order to keep the course focused. Most people are familiar with cross-sectional data (every Stat 101 course covers cross-sectional data analysis), so we use this familiarity to contrast it with time series data....

  • Hello Robert. This course was designed to closely follow the textbook Practical Time Series Forecasting (R or XLMiner editions) -- see www.forecastingbook.com. You can use the XLMiner edition and work through it with different software of your choice.

  • Thank you for your honest feedback, Ben. This is a learning journey for all of us and an opportunity to improve.

  • Kleyn - thus far, it looks like the machine learning methods you mentioned are not up to speed for time series forecasting. See this interesting blog post: https://www.r-bloggers.com/timeseries-forecasting-using-extreme-gradient-boosting/

  • Yes, good catch, Ben. It's indeed a typo. This means you really understand it!

  • Thank you Rich. We will take a look.

  • Guus - remember that there's another component remaining after removing trend and seasonality (besides error): level. The MA is trying to capture the level by averaging out the noise.
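To illustrate the point above about the moving average capturing the level, here is a minimal Python sketch. The function name `trailing_moving_average` and the sample numbers are hypothetical and not taken from the course materials; the series is a made-up constant level of 100 plus noise.

```python
def trailing_moving_average(series, window):
    """Average the most recent `window` values at each time point.

    Positions that do not yet have a full window are returned as None.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    result = []
    for t in range(len(series)):
        if t + 1 < window:
            result.append(None)  # not enough history yet
        else:
            chunk = series[t + 1 - window : t + 1]
            result.append(sum(chunk) / window)
    return result


# Hypothetical series: level of 100 with noise and no trend or seasonality
noisy = [102, 98, 101, 99, 103, 97]
smoothed_short = trailing_moving_average(noisy, window=2)
smoothed_long = trailing_moving_average(noisy, window=6)
```

With a wider window more of the noise is averaged out, so the last value of `smoothed_long` lands on the underlying level in this made-up example.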

  • Kleyn - see the next video/article on capturing seasonality. That's where we introduce the use of dummy variables.

  • Rich - you reached an important conclusion. The two options for dealing with a complex series are to create a more complicated model or to simplify the data. Splitting the series is simplifying the data. Jackie suggested differencing - that is also simplifying the data. And I agree these are good ideas.

  • Ben - please see this page on damped trend in exponential smoothing: https://www.otexts.org/fpp/7/4. In short, it means the trend approaches a constant after some time. "The effect of this is that short-run forecasts are trended while long-run forecasts are constant."
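The damped-trend idea in the comment above can be sketched in Python. This assumes the standard additive damped-trend forecast equation, in which the h-step-ahead forecast is the level plus (phi + phi^2 + ... + phi^h) times the trend; the function name and the values for level, trend, and phi are hypothetical.

```python
def damped_trend_forecast(level, trend, phi, horizon):
    """h-step-ahead forecast: level + (phi + phi^2 + ... + phi^h) * trend."""
    if not 0 < phi < 1:
        raise ValueError("phi must be strictly between 0 and 1 for damping")
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    damping_sum = sum(phi ** k for k in range(1, horizon + 1))
    return level + damping_sum * trend


# Hypothetical final level and trend estimates, with damping parameter 0.8
level, trend, phi = 100.0, 5.0, 0.8

# Short-run forecasts still rise from one step to the next
short_run = [damped_trend_forecast(level, trend, phi, h) for h in (1, 2, 3)]

# Long-run forecasts flatten out toward this constant
long_run_limit = level + trend * phi / (1 - phi)
far_ahead = damped_trend_forecast(level, trend, phi, 200)
```

Here `short_run` keeps increasing, while `far_ahead` is practically equal to `long_run_limit`, which matches the quoted description of trended short-run and constant long-run forecasts.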

  • Please see the Google Sheet that Nick Danks posted for an earlier step: Columns E and F contain formulas for the two steps of un-differencing. https://www.futurelearn.com/courses/business-analytics-forecasting/1/steps/115988
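For readers who prefer code to spreadsheet formulas, here is a rough Python sketch of differencing twice and then un-differencing in two steps. It is not a transcription of the Google Sheet: the function names, the sample numbers, and the choice of a lag-4 difference followed by a lag-1 difference are all assumptions made for illustration.

```python
def difference(series, lag):
    """Lag-k difference: y[t] - y[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def undifference(diffed, initial_values):
    """Invert a lag-k difference, given the first k values of the original series."""
    lag = len(initial_values)
    restored = list(initial_values)
    for d in diffed:
        # The value `lag` positions back plus the difference gives the next value
        restored.append(restored[-lag] + d)
    return restored


# Hypothetical quarterly series
series = [10, 12, 15, 13, 14, 17, 21, 18]

# Difference twice: first lag-4 (seasonality), then lag-1 (trend)
seasonal_diff = difference(series, lag=4)
double_diff = difference(seasonal_diff, lag=1)

# Un-difference in the reverse order: undo lag-1 first, then lag-4
step_one = undifference(double_diff, seasonal_diff[:1])
step_two = undifference(step_one, series[:4])
```

The key design point is that each un-differencing step needs the starting values that were dropped by the corresponding differencing step, and the steps are undone in reverse order.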

  • Amanda - yes, no trend means the series is not consistently increasing or decreasing (but it might increase or decrease due to seasonality or noise)

  • We are using the terms "error" and "noise" interchangeably, to indicate the non-systematic component of the time series.

  • The software will do that for you. If you're interested in the technical details, here's one good description (they call the ACF plot a correlogram): http://www.ltrr.arizona.edu/~dmeko/notes_3.pdf

  • You can do both: compute autocorrelation for the raw series and for residuals. Note that if the series has trend/seasonality, that will show up in the ACF plot and will reflect the autocorrelation due to trend/seasonality.
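As a small illustration of how trend and seasonality show up in autocorrelation, here is a Python sketch of the usual sample autocorrelation at a given lag. The function name and both example series are made up; in practice the software computes and plots this for you.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation at `lag`, using the overall mean and variance."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must be between 0 and len(series) - 1")
    mean = sum(series) / n
    deviations = [y - mean for y in series]
    denominator = sum(d * d for d in deviations)
    if denominator == 0:
        raise ValueError("series is constant, autocorrelation is undefined")
    numerator = sum(deviations[t] * deviations[t - lag] for t in range(lag, n))
    return numerator / denominator


# Hypothetical series with a pure linear trend
trending = list(range(1, 11))
trend_lag1 = autocorrelation(trending, 1)

# Hypothetical series with a repeating pattern of period 4 and no trend
seasonal = [1, 3, 2, 0] * 3
seasonal_lag1 = autocorrelation(seasonal, 1)
seasonal_lag4 = autocorrelation(seasonal, 4)
```

In this toy example the trending series has a strong positive lag-1 autocorrelation, and the seasonal series has a much stronger autocorrelation at lag 4 than at lag 1.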

  • Thank you all for this feedback. It's highly useful, as the idea of the test is not to test on semantics, but on understanding. Rich, can you please specify which paragraph you mean? Great to hear that you are enjoying the course!

  • Luis and Isabel: The wording was just a bit different from what Luis posted, which makes all the difference. First, for the normal errors, it said "If the forecast error distribution is not normally distributed (a bell curve) around zero, then we CANNOT compute prediction intervals". This is incorrect because we can always compute empirical prediction...

  • Ben - you are correct that differencing is not a smoothing method. We included it in this module because it is an important and simple operation that often precedes smoothing (or other methods that assume no trend and/or seasonality).

  • The I in ARIMA is the differencing operation that is required before fitting an ARMA model. You cannot tune the ARMA parameters to capture seasonality or trend, because by design it assumes both are absent.

  • Juan and Abheyjot: you raise a critical issue when using programming! When you're deep into the code, be careful of tunnel vision, where you lose sight of the data and the bigger problem. One good approach is to create visualizations of the data your code is creating. Looking at your creation through charts can help!
    Abheyjot, can you share how you noticed the...

  • Please use visualization to determine whether each series shows seasonality. We saw how to do that in Week 2.

  • Kleyn, in time series partitioning we must maintain the temporal sequence of the rows, so your randomization of row order is not a good idea. If you want to split manually without creating a ts object, create two series: one until a certain date and another from that date forward. The choice of the split should be based on the business goal, data length,...

  • Naive forecasting can also work well when the measurements are very frequent (e.g., bike demand every minute), so that adjacent measurements are very similar.

  • Kleyn, unlike in cross-sectional data, in time series we only partition into two parts, and we partition temporally (the validation period is the most recent part of the series), so beware of how you are performing cross validation.
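A temporal partition like the one described above can be sketched in a few lines of Python. The function name `temporal_split` and the sample values are hypothetical; the only idea carried over from the discussion is that row order is preserved and the most recent observations form the validation period.

```python
def temporal_split(series, n_validation):
    """Split into an earlier training period and a later validation period.

    The order of observations is kept; nothing is shuffled.
    """
    if not 0 < n_validation < len(series):
        raise ValueError("n_validation must be between 1 and len(series) - 1")
    cut = len(series) - n_validation
    return series[:cut], series[cut:]


# Hypothetical monthly values, oldest first
monthly = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119]
training, validation = temporal_split(monthly, n_validation=3)
```

How many periods to hold out is a judgment call that depends on the forecasting goal and the length of the series, so `n_validation=3` here is only a placeholder.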

  • Isabel - can you show us your dashboard? It would make your points very clear. As they say "a picture is worth a thousand words".

  • Qlikview is indeed another great interactive visualization tool.

  • Yanwei - your visualization is good! You combined several charts into a dashboard. You might want to make sure the date is identified by Tableau as a date, so that when you use day-of-week it will show the names of the days rather than numbers.

  • Good visualization Jackie! Notice that week that looks different from all the others. What might that be?

  • It loads properly on our side. Can you try a different browser, or a different computer?

  • Luis - the Cycle component that you mention is probably "business cycle". While it is likely to be present in time series, we typically do not see a whole cycle because business cycles are very long. So, we effectively do not model them. Our 4 component breakdown is because the methods we will see can capture level, trend, and seasonality and we will quantify...

  • Charts are one way. Domain knowledge is another way (additive means growth in terms of fixed values, whereas multiplicative means percentage growth). We can try using methods for fitting both types, and then look at the forecast errors series to see if we captured the trend and/or seasonality properly. We'll see more in later weeks.
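The distinction between additive and multiplicative growth can be made concrete with a short Python sketch. The function names and the numbers (a start of 100, a fixed step of 10, a growth rate of 10%) are made up for illustration.

```python
def additive_trend(start, step, n):
    """Series that grows by a fixed amount each period."""
    return [start + step * t for t in range(n)]


def multiplicative_trend(start, rate, n):
    """Series that grows by a fixed percentage each period."""
    return [start * (1 + rate) ** t for t in range(n)]


additive = additive_trend(100, 10, 5)
multiplicative = multiplicative_trend(100, 0.10, 5)

# Additive growth: period-to-period differences are constant
additive_diffs = [b - a for a, b in zip(additive, additive[1:])]

# Multiplicative growth: period-to-period ratios are constant
multiplicative_ratios = [b / a for a, b in zip(multiplicative, multiplicative[1:])]
```

Looking at differences versus ratios in this way is one simple check to pair with charts and domain knowledge; real series are noisy, so neither will be exactly constant.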

  • For those with admin restrictions on your computer, you can use the cloud version of Analytic Solver (which has all the XLMiner functionality). This works through a browser but requires an Internet connection. Once you get the license (using the instructions on this page) go to http://analyticsolver.com/

  • The audio quality of this video is indeed not as clear as all the others because we shot it impromptu when we discovered the Youbike event. However, we think it's valuable! As Mahsa suggested, please use the subtitles. Apologies for the inconvenience.

  • Let me add: of course in both cases (predictive and descriptive) you ideally want to explore your data. We'll see lots of that in Week 2. However, determining whether the eventual goal is to forecast future values (predictive) or only quantify patterns in the series (descriptive) will affect many things, such as choice of method and how to evaluate...

  • Reino, that's a good point about notation. Yes, it is also customary to use ( ) instead of subscripts, such as Y(t), F(t+k), etc. And indeed, it is very important to remember that to get the forecast error you subtract the forecast from the actual!
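The sign convention mentioned above (error = actual minus forecast) can be written out as a tiny Python sketch. The function name and the numbers are hypothetical.

```python
def forecast_errors(actuals, forecasts):
    """Forecast error for each period: actual value minus forecast."""
    if len(actuals) != len(forecasts):
        raise ValueError("actuals and forecasts must have the same length")
    return [y - f for y, f in zip(actuals, forecasts)]


# Made-up actual values and forecasts for three periods
actual = [120, 130, 125]
forecast = [115, 135, 125]
errors = forecast_errors(actual, forecast)
```

With this convention a positive error means the forecast was too low (under-forecast) and a negative error means it was too high (over-forecast).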

  • Thank you for catching this Jackie! We updated the article to talk about Taipei MRT. (The Amtrak example is described in the textbook Practical Time Series Forecasting, and hence the confusion)

  • Thanks for catching that glitch Michael. Please check again - the file should be accessible.

  • Galit Shmueli replied to [Learner left FutureLearn]

    Dear Martin - the discount code is for purchase from Createspace.com, not from Amazon. Please use this link for the Excel book: https://www.createspace.com/6192529 with discount code A5QCYT4W