Optimal Clustering of Products for Regression-Type and Classification-Type Predictive Modeling for Assortment Planning
In collaboration with a national retailer, this study assessed how different ways of clustering sparse-demand products affect sales prediction accuracy, and identified scenarios in which framing the problem as a regression problem or as a classification problem leads to the best demand decision support. The problem is motivated by the difficulty of modeling very sparse demand: some retailers frame prediction as a classification problem, estimating the propensity that a product will or will not sell within a specified planning horizon, while others model it in a regression setting whose response is plagued by zeros. We clustered products with the k-means, SOM, and HDBSCAN algorithms using lifecycle, failure-rate, product-usability, and market-type features, and found a consistent story behind the generated clusters, which were primarily distinguished by particular demand patterns. We then aggregated the clustering results into a single input feature, which improved the prediction accuracy of the predictive models we examined. For forecasting sales, we investigated a variety of regression- and classification-type models and report a short list of those that performed best in each case. Lastly, we identify scenarios we observed when modeling the problem as a classification problem versus a regression problem, so that our partner can be more strategic in how they use these forecasts for their assortment decisions.
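The cluster-label-as-feature idea described above can be sketched briefly. This is a minimal illustration, not the study's actual pipeline: the product features and all values are synthetic, and only k-means is shown.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical product features: lifecycle length (weeks), failure rate,
# and demand sparsity (share of zero-demand weeks). Synthetic data.
n_products = 300
features = np.column_stack([
    rng.uniform(10, 200, n_products),   # lifecycle length
    rng.uniform(0.0, 0.3, n_products),  # failure rate
    rng.uniform(0.5, 1.0, n_products),  # share of zero-demand weeks
])

# Standardize so no single feature dominates the distance metric.
X = StandardScaler().fit_transform(features)

# Cluster products; the label becomes a single categorical input feature
# for a downstream regression- or classification-type demand model.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
cluster_feature = kmeans.labels_

print(np.bincount(cluster_feature))
```

In practice the cluster label would be one-hot encoded or target-encoded before being appended to the demand model's feature set.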
This study describes an optimization solution to minimize the retailer's inventory-system costs. In the past, all demand was forecast yearly, and information about each item's demand distribution was not used: the retailer derived weekly and monthly forecasts simply by dividing the yearly forecast by fixed factors. As a result, it purchased items in bulk from vendors to prepare for unexpected demand, which generated large holding costs. By modeling the demand distribution of each item, a dynamic economic order quantity model becomes possible. We solved this problem by fitting a different distribution to each item's demand, built formulas to calculate costs and service levels, and then optimized the model to minimize cost while meeting several business requirements, such as a minimum service level, for each item type. Lastly, we show the impact that the quality of the demand forecast has on the business.
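For reference, the cost trade-off underlying an economic order quantity model can be sketched with the classic EOQ formula. This is the textbook static version, not the study's dynamic, distribution-aware model, and the demand and cost parameters below are hypothetical.

```python
import math

def eoq(annual_demand, order_cost, holding_cost_per_unit):
    """Classic economic order quantity: Q* = sqrt(2 * D * S / H)."""
    return math.sqrt(2 * annual_demand * order_cost / holding_cost_per_unit)

def total_annual_cost(annual_demand, order_cost, holding_cost_per_unit, q):
    """Ordering cost D*S/Q plus average holding cost H*Q/2."""
    return annual_demand * order_cost / q + holding_cost_per_unit * q / 2

# Hypothetical item: 1200 units/year demand, $50 per order, $2/unit/year holding.
D, S, H = 1200.0, 50.0, 2.0
q_star = eoq(D, S, H)
print(round(q_star, 1), round(total_annual_cost(D, S, H, q_star), 2))
```

At the optimum, ordering and holding costs are equal; ordering in larger lots (as the retailer previously did) pushes the holding-cost term up, which is the source of the holding costs the study targets.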
In collaboration with a national consulting company, this study's objectives are twofold: (1) determine which machine learning approaches perform best at predicting demand for grocery items, and (2) measure the performance one could expect from an open-source workflow versus proprietary in-house machine learning software. The motivation behind this research is that consulting companies regularly help their retail clients understand demand as accurately as possible, and in a scalable and efficient manner; efficient and accurate demand forecasts enable retailers to anticipate demand and plan better. In addition to delivering accurate results, data science teams must also continue to develop and improve their workflow so that experiments can be performed with greater ease and speed. We found that using open-source technologies such as scikit-learn, PostgreSQL, and R, a well-performing workflow could be developed to train and score forecasts for thousands of products and stores accurately at various levels of aggregation (e.g., day/week/month) using deep-learning algorithms. The performance of our solution has yet to be compared against the commercial platform of the data science team we collaborated with, and that comparison will be added soon. Learning how they achieved their performance gains (in model accuracy and runtime) made this collaboration a valuable learning experience.
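The core of such an open-source forecasting workflow can be sketched with scikit-learn. This is a hedged, self-contained illustration on synthetic store/item/week data, not the study's production code; the features, demand function, and network size are all assumptions.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)

# Synthetic weekly records: store id, item id, week-of-year, price -> units sold.
n = 2000
X = np.column_stack([
    rng.integers(0, 10, n),     # store
    rng.integers(0, 50, n),     # item
    rng.integers(1, 53, n),     # week of year
    rng.uniform(1.0, 10.0, n),  # price
])
# Demand falls with price and carries a mild seasonal signal, plus noise.
y = 20 - 1.5 * X[:, 3] + 3 * np.sin(2 * np.pi * X[:, 2] / 52) + rng.normal(0, 1, n)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A small neural-network regressor standing in for the deep-learning models.
model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=500, random_state=0),
)
model.fit(X_train, y_train)
mae = mean_absolute_error(y_test, model.predict(X_test))
print(round(mae, 2))
```

In a full workflow the training data would come from a PostgreSQL query aggregated to the desired level (day/week/month), and scoring would loop over product-store combinations.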
This study provides an analysis to retrospectively investigate how various promotional activities (e.g., discount rates and bundling) affect a firm's KPIs such as sales, traffic, and margins. The motivation for this study is that in the retail industry, a small change in price has significant business implications. The Fortune 500 retailer we collaborated with thrives on low price margins and had historically run many promotions; however, until this study, it had limited ability to estimate the impact of these promotions on the business. The solution employs a traditional log-log model of demand versus price to obtain a baseline measure of price sensitivity, followed by an efficient dynamic time-series intermittent forecast to estimate the promotional lift. We believe our approach is both a novel and practical way to retrospectively understand the promotional effects of test-and-learn experiments, which any retailer could implement to help improve their revenue management.
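The log-log baseline mentioned above rests on the fact that in a model log(q) = a + b*log(p), the slope b is the price elasticity of demand. A minimal sketch on synthetic data with a known elasticity (all values are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulate price/demand pairs with a true elasticity of -2.0:
# log(q) = a + b * log(p) + noise, where b is the price elasticity.
true_elasticity = -2.0
price = rng.uniform(1.0, 20.0, 500)
log_q = 5.0 + true_elasticity * np.log(price) + rng.normal(0, 0.1, 500)

# Fit the log-log model by ordinary least squares.
A = np.column_stack([np.ones_like(price), np.log(price)])
coef, *_ = np.linalg.lstsq(A, log_q, rcond=None)
intercept, elasticity = coef
print(round(elasticity, 2))
```

The fitted slope recovers the elasticity; the residual demand not explained by this baseline during promotional periods is what the intermittent time-series forecast then attributes to promotional lift.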
The objective of this study is to design and develop a better revenue management system that leverages an understanding of price elasticity and promotional effects to predict demand for grocery items. This study is important because the use of sales promotions in grocery retailing has intensified over the last decade as competition between retailers has increased. Category managers constantly face the challenge of maximizing sales and profits for each category, and price elasticities of demand play a major role in selecting products for promotion; they are a major lever retailers use to push not only the products on sale but other products as well. We model price sensitivity, develop highly accurate predictive demand models based on product, discount, and other promotional attributes using machine learning approaches, and compare the performance of those models against time-series forecasts.