Conference Agenda
Overview and details of the sessions of the OR 2024 conference. Select a date or location to show only the sessions held on that day or at that location. Select a single session for a detailed view (with abstracts and downloads, if available).
Session Overview
WC 02: Learning for Optimization 2
Presentations
Learning the Follower's Objective Function in Sequential Bilevel Games
University of Trier, Germany
We consider bilevel optimization problems in which the leader has no or only partial knowledge of the follower's objective function. The studied setting is a sequential one in which the bilevel game is played repeatedly, which allows the leader to learn the follower's objective function over time. We focus on two methods: a multiplicative weights update (MWU) method and a method based on the lower level's KKT conditions, used in the fashion of inverse optimization. The MWU method requires fewer assumptions, but its convergence guarantee applies only to the objective function values, whereas the inverse KKT method requires stronger assumptions but allows learning the objective function itself. The applicability of the proposed methods is shown in two case studies: first, a repeatedly played continuous knapsack interdiction problem and, second, a sequential bilevel pricing game in which the leader needs to learn the follower's utility function.

Addressing Real-World Side Constraints in Combinatorial Optimization with Deep Reinforcement Learning
Bielefeld University, Germany
Deep Reinforcement Learning (DRL) methods have garnered increasing attention for combinatorial optimization challenges, particularly in domains such as routing and scheduling. While recent approaches have demonstrated notable efficacy, especially on classic problems like the Traveling Salesman Problem (TSP) and the Capacitated Vehicle Routing Problem (CVRP), they often operate within simplified problem settings that lack real-world side constraints. Consequently, DRL methods encounter difficulties in generating feasible solutions for more complex scenarios. In this study, we address these limitations by introducing additional real-world side constraints and exploring diverse mechanisms to accommodate them while steering the search towards feasible solutions. Our experiments extend to a variety of combinatorial optimization problems, including the Capacitated Vehicle Routing Problem with Time Windows (CVRPTW) and the Skill-VRP, showcasing our approach's effectiveness in handling practical constraints.

Optimizing Ambulance Dispatching and Redeployment: A Structured Learning Approach
1: School of Management, Technical University of Munich; 2: Munich Data Science Institute, Technical University of Munich
Minimizing response times to serve patients in a timely manner is crucial for Emergency Medical Service (EMS) systems. Achieving this goal necessitates optimizing operational decision-making to efficiently manage the available ambulances. Recent efforts focus on developing online ambulance dispatching policies and on determining optimal waiting positions for idle ambulances that enable a fast response to future requests. Against this background, we study a centrally controlled EMS system in which the dispatcher must (i) dispatch an ambulance upon receiving an emergency call and (ii) redeploy it to a waiting location upon completion of its service. In this context, we aim to learn an online ambulance dispatching and redeployment policy that minimizes the mean response time of ambulances within the system. As a basis, we first present a mixed-integer linear program to derive optimal dispatching and redeployment decisions for offline settings. Second, we introduce a machine learning (ML) pipeline enriched by a combinatorial optimization (CO) layer that leverages these optimal solutions by learning an online ambulance dispatching and redeployment policy in a supervised fashion. Within the pipeline, we train an ML predictor to parameterize the problem instances subsequently solved in the CO layer, generating feasible solutions to our original problem instances. We learn a policy by minimizing the non-optimality between the optimal offline solutions and the online solutions derived from the CO layer. To evaluate the performance of the learned policies against current industry practices, we conduct a numerical case study based on San Francisco's 911 call data.
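The multiplicative weights update named in the first abstract admits a compact generic form. The sketch below is a minimal illustration of such an update over a finite set of candidate follower objectives; the candidate set, the learning rate eta, and the per-round loss signal are illustrative assumptions, not the authors' implementation.

import numpy as np

def mwu_step(weights, losses, eta=0.1):
    # One multiplicative weights update: candidates with lower loss gain weight.
    # weights: probability vector over candidate follower objectives.
    # losses:  per-candidate loss in [0, 1], e.g. the gap between the follower
    #          response observed this round and the response each candidate
    #          objective predicts (an illustrative choice of loss).
    new_w = weights * np.exp(-eta * losses)
    return new_w / new_w.sum()

# Toy usage: three candidate objectives, uniform prior, one round of feedback.
weights = np.ones(3) / 3
losses = np.array([0.8, 0.1, 0.5])   # hypothetical losses for this round
weights = mwu_step(weights, losses)  # mass shifts toward the second candidate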
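One common mechanism for steering a constructive DRL policy toward feasible solutions, as discussed in the second abstract, is to mask actions that would violate a side constraint before sampling. The sketch below illustrates that generic idea for a time-window check; the function, its inputs, and the softmax normalization are assumptions for illustration rather than the specific method proposed in the talk.

import numpy as np

def feasible_action_probs(logits, arrival_times, tw_close, visited):
    # Mask customers that are already served or whose time window would be
    # missed, then renormalize the policy's scores over the feasible actions.
    infeasible = visited | (arrival_times > tw_close)
    masked = np.where(infeasible, -np.inf, logits)
    shifted = np.exp(masked - masked[~infeasible].max())  # stable softmax
    return shifted / shifted.sum()

# Toy usage: three open customers; the third one's time window can no longer be met.
logits = np.array([0.2, 1.5, 0.7])
probs = feasible_action_probs(
    logits,
    arrival_times=np.array([3.0, 4.0, 9.0]),
    tw_close=np.array([8.0, 8.0, 8.0]),
    visited=np.array([False, False, False]),
)
# probs[2] == 0.0, so sampling can only pick feasible customers.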
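The third abstract describes an ML predictor that parameterizes problem instances solved by a CO layer, trained to minimize the non-optimality against offline-optimal solutions. The sketch below illustrates that structure with a generic assignment problem standing in for the dispatching-and-redeployment problem; using linear_sum_assignment as the CO layer and a cost-gap loss are assumptions for illustration, not the authors' formulation.

import numpy as np
from scipy.optimize import linear_sum_assignment

def co_layer(predicted_costs):
    # CO layer: assign each ambulance to one request or waiting site at minimum
    # predicted cost (a generic stand-in for dispatching and redeployment).
    rows, cols = linear_sum_assignment(predicted_costs)
    decision = np.zeros_like(predicted_costs)
    decision[rows, cols] = 1.0
    return decision

def non_optimality(decision, offline_optimal, true_costs):
    # Gap between the true cost of the online decision and of the offline
    # optimum; the training signal the ML predictor is pushed to minimize.
    return float(((decision - offline_optimal) * true_costs).sum())

In practice, gradients through the argmin of such a CO layer are typically obtained by perturbing the predicted costs and smoothing the resulting solutions; that machinery is omitted here and is a general structured-learning ingredient rather than a claim about this specific pipeline.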