Publication Details
An Oracle-Guided Approach to Constrained Policy Synthesis Under Uncertainty
Macák Filip (DITS)
Andriushchenko Roman, Ing. (DITS)
Češka Milan, doc. RNDr., Ph.D. (DITS)
Junges Sebastian
Katoen Joost-Pieter
Markov decision processes, model-based reasoning, search, decision making under uncertainty
Dealing with aleatoric uncertainty is key in many domains involving sequential
decision making, e.g., planning in AI, network protocols, and symbolic program
synthesis. This paper presents a general-purpose model-based framework to obtain
policies operating in uncertain environments in a fully automated manner. The new
concept of coloured Markov Decision Processes (MDPs) enables a succinct
representation of a wide range of synthesis problems. A coloured MDP describes
a collection of possible policy configurations with their structural
dependencies. The framework covers the synthesis of (a) programmatic policies
from probabilistic program sketches and (b) finite-state controllers representing
policies for partially observable MDPs (POMDPs), including decentralised POMDPs
as well as constrained POMDPs. We show that all these synthesis problems can be
cast as exploring memoryless policies in the corresponding coloured MDP. This
exploration uses a symbiosis of two orthogonal techniques: abstraction refinement (using a novel refinement method) and counter-example generalisation. Our approach outperforms dedicated synthesis techniques on some problems and significantly improves on an earlier version of this framework.
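
As a rough, self-contained illustration of the exploration scheme the abstract describes (not the authors' implementation; the family, the holes, and the toy objective below are all invented for this sketch), the following Python code explores a family of policies given by "holes" with option sets. An optimistic bound computed on each sub-family plays the role of the oracle: sub-families whose bound cannot beat the incumbent are pruned wholesale, and the rest are refined by splitting a hole, mirroring the interplay of abstraction refinement and counterexample-style pruning.

from math import inf

# Toy "family" of policies: each hole (an option point of the coloured
# MDP) maps to a set of admissible options; a concrete policy picks one
# option per hole. Model and objective are invented for illustration.

def evaluate_policy(policy):
    """Stand-in for model checking the Markov chain induced by one policy.
    A real framework would invoke a probabilistic model checker here."""
    return sum((opt + 1) * (hole + 1) for hole, opt in policy.items()) / 12.0

def family_bound(family):
    """The 'oracle': an optimistic upper bound on the best value any policy
    in the family can achieve. In the paper's setting this role is played
    by model checking an abstraction of the coloured MDP; here the bound
    is easy because the toy objective decomposes over holes."""
    return sum((max(opts) + 1) * (hole + 1) for hole, opts in family.items()) / 12.0

def split(family):
    """Refinement step: split the hole with the most remaining options."""
    hole = max(family, key=lambda h: len(family[h]))
    opts = sorted(family[hole])
    mid = len(opts) // 2
    left, right = dict(family), dict(family)
    left[hole], right[hole] = set(opts[:mid]), set(opts[mid:])
    return left, right

def synthesise(family, best=(None, -inf)):
    """Oracle-guided exploration: prune sub-families whose bound cannot
    beat the incumbent, evaluate singleton families, refine the rest."""
    if family_bound(family) <= best[1]:
        return best  # whole sub-family ruled out without enumeration
    if all(len(opts) == 1 for opts in family.values()):
        policy = {hole: next(iter(opts)) for hole, opts in family.items()}
        value = evaluate_policy(policy)
        return (policy, value) if value > best[1] else best
    for sub in split(family):
        best = synthesise(sub, best)
    return best

if __name__ == "__main__":
    # Two holes with three options each: nine candidate policies in total.
    print(synthesise({0: {0, 1, 2}, 1: {0, 1, 2}}))

The recursion terminates because every split strictly shrinks some option set, and the bound-based pruning is what lets such a search scale beyond explicit enumeration of all policies.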
@article{BUT196710,
  author="MACÁK, F. and ANDRIUSHCHENKO, R. and ČEŠKA, M. and JUNGES, S. and KATOEN, J.",
  title="An Oracle-Guided Approach to Constrained Policy Synthesis Under Uncertainty",
  journal="Journal of Artificial Intelligence Research",
  year="2025",
  volume="82",
  pages="433--469",
  doi="10.1613/jair.1.16593",
  issn="1076-9757",
  url="https://www.jair.org/index.php/jair/article/view/16593"
}