Multi-Agent Intention Progression with Reward Machines

Michael Dann; Yuan Yao; Natasha Alechina; Brian Logan; John Thangarajah

Research output: Chapter in Book/Report/Conference proceeding › Published conference contribution

Abstract

Recent work in multi-agent intention scheduling has shown that enabling agents to predict the actions of other agents when choosing their own actions may be beneficial. However, existing approaches to ‘intention-aware’ scheduling assume that the programs of other agents are known, or are “similar” to that of the agent making the prediction. While this assumption is reasonable in some circumstances, it is less plausible when the agents are not co-designed. In this paper, we present a new approach to multi-agent intention scheduling in which agents predict the actions of other agents based on a high-level specification of the tasks performed by an agent in the form of a reward machine (RM) rather than on its (assumed) program. We show how a reward machine can be used to generate tree and rollout policies for an MCTS-based scheduler. We evaluate our approach in a range of multi-agent environments, and show that RM-based scheduling outperforms previous intention-aware scheduling approaches in settings where agents are not co-designed.
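
For readers unfamiliar with the term, a reward machine is commonly described as a finite-state machine whose transitions are triggered by high-level events and emit rewards. The following is a minimal Python sketch of that general idea only, not the paper's implementation or its MCTS scheduler; the class name, state names (`u0`, `u1`, `u2`), and events (`got_item`, `delivered`) are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RewardMachine:
    """Finite-state machine that emits a reward on each labelled transition."""
    initial: str
    # Maps (state, event) -> (next_state, reward). All names are illustrative.
    transitions: dict = field(default_factory=dict)

    def step(self, state, event):
        # An event with no matching transition leaves the state unchanged
        # and yields zero reward (an assumption of this sketch).
        return self.transitions.get((state, event), (state, 0.0))

    def run(self, events):
        """Return (final_state, total_reward) for a sequence of events."""
        state, total = self.initial, 0.0
        for event in events:
            state, reward = self.step(state, event)
            total += reward
        return state, total


# Hypothetical task: pick up an item, then deliver it.
rm = RewardMachine(
    initial="u0",
    transitions={
        ("u0", "got_item"): ("u1", 0.0),
        ("u1", "delivered"): ("u2", 1.0),
    },
)
final_state, total_reward = rm.run(["delivered", "got_item", "delivered"])
```

In this sketch the first `delivered` event is ignored because the item has not yet been picked up, so reward is only emitted once the events occur in the order the task specifies.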

Original language	English
Title of host publication	Proceedings of the 31st International Joint Conference on Artificial Intelligence and the 25th European Conference on Artificial Intelligence
Publication status	Published - 30 Jul 2022
Event	31st International Joint Conference on Artificial Intelligence and the 25th European Conference on Artificial Intelligence, Messe Wien, Vienna, Austria. Duration: 23 Jul 2022 → 29 Jul 2022. Conference number: 31. https://ijcai-22.org/

Conference

Conference	31st International Joint Conference on Artificial Intelligence
Abbreviated title	IJCAI-ECAI 2022
Country/Territory	Austria
City	Vienna
Period	23/07/22 → 29/07/22
Internet address	https://ijcai-22.org/

Cite this

@inproceedings{c36a451dd4d94e5d965ec8fe51bc2ecf,

title = "Multi-Agent Intention Progression with Reward Machines",

abstract = "Recent work in multi-agent intention scheduling has shown that enabling agents to predict the actions of other agents when choosing their own actions may be beneficial. However, existing approaches to {\textquoteleft}intention-aware{\textquoteright} scheduling assume that the programs of other agents are known, or are {\textquotedblleft}similar{\textquotedblright} to that of the agent making the prediction. While this assumption is reasonable in some circumstances, it is less plausible when the agents are not co-designed. In this paper, we present a new approach to multi-agent intention scheduling in which agents predict the actions of other agents based on a high-level specification of the tasks performed by an agent in the form of a reward machine (RM) rather than on its (assumed) program. We show how a reward machine can be used to generate tree and rollout policies for an MCTS-based scheduler. We evaluate our approach in a range of multi-agent environments, and show that RM-based scheduling outperforms previous intention-aware scheduling approaches in settings where agents are not co-designed.",

author = "Michael Dann and Yuan Yao and Natasha Alechina and Brian Logan and John Thangarajah",

year = "2022",

month = jul,

day = "30",

language = "English",

booktitle = "Proceedings of the 31st International Joint Conference on Artificial Intelligence and the 25th European Conference on Artificial Intelligence",

note = "31st International Joint Conference on Artificial Intelligence and the 25th European Conference on Artificial Intelligence, IJCAI-ECAI 2022; Conference date: 23-07-2022 through 29-07-2022",

url = "https://ijcai-22.org/",

}

TY - GEN

T1 - Multi-Agent Intention Progression with Reward Machines

AU - Dann, Michael

AU - Yao, Yuan

AU - Alechina, Natasha

AU - Logan, Brian

AU - Thangarajah, John

N1 - Conference code: 31

PY - 2022/7/30

Y1 - 2022/7/30

N2 - Recent work in multi-agent intention scheduling has shown that enabling agents to predict the actions of other agents when choosing their own actions may be beneficial. However, existing approaches to ‘intention-aware’ scheduling assume that the programs of other agents are known, or are “similar” to that of the agent making the prediction. While this assumption is reasonable in some circumstances, it is less plausible when the agents are not co-designed. In this paper, we present a new approach to multi-agent intention scheduling in which agents predict the actions of other agents based on a high-level specification of the tasks performed by an agent in the form of a reward machine (RM) rather than on its (assumed) program. We show how a reward machine can be used to generate tree and rollout policies for an MCTS-based scheduler. We evaluate our approach in a range of multi-agent environments, and show that RM-based scheduling outperforms previous intention-aware scheduling approaches in settings where agents are not co-designed.

AB - Recent work in multi-agent intention scheduling has shown that enabling agents to predict the actions of other agents when choosing their own actions may be beneficial. However, existing approaches to ‘intention-aware’ scheduling assume that the programs of other agents are known, or are “similar” to that of the agent making the prediction. While this assumption is reasonable in some circumstances, it is less plausible when the agents are not co-designed. In this paper, we present a new approach to multi-agent intention scheduling in which agents predict the actions of other agents based on a high-level specification of the tasks performed by an agent in the form of a reward machine (RM) rather than on its (assumed) program. We show how a reward machine can be used to generate tree and rollout policies for an MCTS-based scheduler. We evaluate our approach in a range of multi-agent environments, and show that RM-based scheduling outperforms previous intention-aware scheduling approaches in settings where agents are not co-designed.

M3 - Published conference contribution

BT - Proceedings of the 31st International Joint Conference on Artificial Intelligence and the 25th European Conference on Artificial Intelligence

T2 - 31st International Joint Conference on Artificial Intelligence

Y2 - 23 July 2022 through 29 July 2022

ER -
