Deadline: June 29, 2020 00:00 CEST
Eligibility: Only students with a valid enrollment certificate can take part


Since 2000, the DATA MINING CUP (DMC) has inspired students around the world to pursue intelligent data analysis. In the 20th DATA MINING CUP in 2019, about 150 teams from 114 universities in 28 countries took part. The best teams will be invited to Berlin for the awards ceremony at the retail intelligence summit.

Forecasting demand for optimized inventory planning

It is no secret that the ability to optimize stocks provides many benefits for retail companies. Different advantages may accrue, depending on the type of company, its strategy and its situation.

For example, store-based retailers can downsize their storage space and enlarge the sales area to provide a more open and inviting shopping experience. Online retailers, on the other hand, may be able to scale up their business without relocating their entire operations to a larger facility.

Overall, optimized inventory planning helps to reduce the number of slow-moving goods, because retailers only stock products that people actually buy. This, in turn, means that it is not necessary to send customers away because products are temporarily unavailable; this increases both revenues and customer satisfaction.

Moreover, fewer slow-moving goods mean less reorganization, accounting and clearance, which also reduces the work time required and the outlay for logistical services.

For these reasons, forecasting demand is the focus of this year’s DATA MINING CUP.


An established retailer wants to optimize its inventory planning to significantly reduce not only its storage space but also its costs and its need for logistical operations. It plans to restock its inventory every other week and to keep in stock only the items that it actually sells during that period.

The goal of the participating teams is to create a machine learning model to predict the demand for every product over the two-week period. It is important to point out that some products will be promoted for limited periods of time.

Products that are promoted during the simulation period will be earmarked. For the training period, however, whether a product was being promoted has to be derived from the transaction data.

Finally, the model does not need to be able to respond to price changes during the simulation period. To simplify matters, prices will not be changed during the period.

To create this model, the teams receive the exact time of every transaction over a period of six months, along with other features that describe the products.


Historical data must be used to create a machine learning model that reliably forecasts the demand for each item in the “items.csv” file over a period of 14 days. The forecast period starts on 30 June 2018 00:00:00, the day after the last date in the transaction files.
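As a rough illustration of the 14-day window described above, the sketch below builds the list of forecast days and computes a naive baseline for a single item. This is not the competition's method or a recommended model: the 28-day lookback, the function names and the toy history are all assumptions made for the example.

```python
from datetime import date, timedelta

FORECAST_START = date(2018, 6, 30)  # stated in the task description
HORIZON_DAYS = 14                   # stated in the task description
LOOKBACK_DAYS = 28                  # assumption: arbitrary history window


def forecast_window(start=FORECAST_START, days=HORIZON_DAYS):
    """Return the calendar days covered by the forecast."""
    return [start + timedelta(days=i) for i in range(days)]


def naive_forecast(daily_units, start=FORECAST_START,
                   lookback=LOOKBACK_DAYS, horizon=HORIZON_DAYS):
    """Scale the mean daily demand of the last `lookback` days to the horizon.

    `daily_units` maps a date to units sold for one item; days without
    an entry count as zero demand.
    """
    history = [daily_units.get(start - timedelta(days=i), 0)
               for i in range(1, lookback + 1)]
    return sum(history) / lookback * horizon


# Made-up history for a single hypothetical item.
toy_history = {
    date(2018, 6, 1): 7,   # outside the 28-day lookback, ignored
    date(2018, 6, 28): 3,
    date(2018, 6, 29): 4,
}

window = forecast_window()
print(window[0], window[-1])        # first and last forecast day
print(naive_forecast(toy_history))  # total units over the 14 days
```

A baseline like this ignores promotions and weekday effects, so it is mainly useful as a reference point against which a trained model can be compared.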

The historical demand for an item can be derived from the “orders.csv” file by aggregating the orders for each item, for example per day. Because the “orders.csv” file is not pre-aggregated, participants are free to choose the granularity of their time steps.
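The daily aggregation described above could be sketched as follows, using only the Python standard library. The column names (`time`, `itemID`, `order`), the comma delimiter and the sample rows are assumptions for illustration; the real “orders.csv” layout is defined in the competition's Data section and may differ.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Hypothetical sample in the shape assumed for "orders.csv"; the real
# column names and delimiter may differ.
SAMPLE_ORDERS = """time,itemID,order
2018-06-28 09:15:00,1,2
2018-06-28 17:40:00,1,1
2018-06-29 08:05:00,1,4
2018-06-29 11:30:00,2,3
"""


def daily_demand(rows):
    """Sum the ordered quantities per (item, calendar day)."""
    totals = defaultdict(int)
    for row in rows:
        day = datetime.strptime(row["time"], "%Y-%m-%d %H:%M:%S").date()
        totals[(row["itemID"], day)] += int(row["order"])
    return dict(totals)


demand = daily_demand(csv.DictReader(io.StringIO(SAMPLE_ORDERS)))
for (item, day), units in sorted(demand.items()):
    print(item, day, units)
```

Swapping `.date()` for a coarser key, such as the ISO week, would change the time step without touching the rest of the function. Note that item and day combinations without any order do not appear in the result and would need to be filled with zeros before modelling.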

In addition to time-dependent features, participants are allowed to use any attribute provided by the “items.csv”, “orders.csv” and “infos.csv” files. The solution file must match the applicable specifications described in the Data section.


  • 1st place: EUR 2,000.00
  • 2nd place: EUR 1,000.00
  • 3rd place: EUR 500.00

Interested in applying for the DATA MINING CUP 2020? Register your participation by following the registration link and taking the suggested steps.


