In making practical decisions, agents are expected to comply with ideals of behaviour, or norms. In reality, it may not be possible for an individual, or a team of agents, to be fully compliant; actual behaviour often differs from the ideal. The question we address in this paper is how to design agents that select collective strategies to avoid the more critical failures (norm violations) and to mitigate the effects of violations that do occur. We model the normative requirements of a system through contrary-to-duty obligations and violation severity levels, and propose N-Dec-POMDPs, a novel multi-agent planning mechanism based on Decentralised POMDPs that uses a qualitative reward function to capture levels of compliance. We develop mechanisms for solving this type of multi-agent planning problem and show, through empirical analysis, that the joint policies generated are as good as those produced by existing methods, with significant reductions in execution time.
- Multi-agent planning