Long Tran-Thanh (University of Warwick): COPs, Bandits, and AI for Good | HUN-REN Alfréd Rényi Institute of Mathematics

Online, Zoom webinar

Description

EGERVÁRY seminar

Abstract: In recent years, there has been increasing interest in applying techniques from artificial intelligence (AI) to tackle societal and environmental challenges, ranging from climate change and natural disasters to food safety and disease spread. These efforts are typically known under the name AI for Good. While much of the research in this area has focused on designing machine learning algorithms to learn new insights or predict future events from previously collected data, there is another domain where AI has been found to be useful, namely resource allocation and decision making. In particular, a key step in addressing societal and environmental challenges is to efficiently allocate a set of scarce resources to mitigate the problem(s). For example, in the case of wildfire, a decision maker has to adaptively and sequentially allocate a limited number of firefighting units to stop the spread of the fire as soon as possible. Another example comes from the problem of housing management for people in need, where a limited number of housing units have to be allocated to applicants in an online manner over time.

While sequential resource allocation problems can often be cast as (online) combinatorial optimisation problems (COPs), they can differ from standard COPs when the decision maker has to act under uncertainty (e.g., the value of an action is not known in advance, or future events are unknown at the decision-making stage). In the presence of such uncertainty, a popular tool from the decision-making literature, called multi-armed bandits, comes in handy. In this talk, I will demonstrate how to efficiently combine COPs with bandit models to tackle some AI for Good problems. In particular, I will first discuss knapsack bandits, a model that combines knapsack problems with sequential decision making under uncertainty to efficiently allocate limited resources, as in wildfire mitigation. In the second part of the presentation, I will talk about the blocking bandit model, which integrates interval scheduling into the framework of sequential decision making under uncertainty, and can be used for housing assignment for people in need.

Please contact Tamás Király (tkiraly[at]cs.elte.hu) for Zoom access.