We present an approach that uses combinatorial optimization to decide which spatial relations between objects are needed to describe an indoor scene accurately. We extract scene models from object configurations acquired while actions characteristic of a scene are demonstrated. We model scenes as graphs using Implicit Shape Models (ISMs), a variant of the Generalized Hough Transform. ISMs are limited to representing scenes as star-shaped topologies of object relations, which leads to false positives in scene recognition. To describe other relation topologies, we introduced trees of ISMs in prior work, together with a method for learning such ISM trees from demonstrations. Because that method only creates topologies that correspond to spanning trees, it omits certain relations, so false positives still occur. In this paper, we introduce a method that converts any relation topology corresponding to a connected graph into an ISM tree using a heuristic depth-first search. This makes it possible to use complete graphs as scene models. Although complete graphs cause no false positives, they are intractable for scene recognition. To achieve efficiency, we contribute a method that searches for an optimal relation topology by traversing the space of connected scene graphs over a given set of objects, using an optimization similar to hill climbing. Optimality is defined as minimizing the computational cost of scene recognition while producing a minimum of false positives. Experiments with up to 15 objects show that the presented method achieves both. Costs that grow exponentially with the number of objects are thereby shifted from online recognition to offline optimization.
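The idea of searching the space of connected scene graphs can be illustrated with a minimal Python sketch. It is not the method of this paper: the neighborhood (adding or removing a single relation), the steepest-descent acceptance rule, the toy cost function, and the example objects are all assumptions made for illustration. In particular, `toy_cost` and `required` are hypothetical stand-ins for the measured recognition cost and the false positives that would be observed when a relation is omitted.

```python
from itertools import combinations


def rel(a, b):
    """An undirected spatial relation between two objects."""
    return frozenset((a, b))


def is_connected(nodes, edges):
    """Return True if the undirected graph (nodes, edges) is connected."""
    nodes = list(nodes)
    if not nodes:
        return True
    adjacency = {n: set() for n in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        for neighbor in adjacency[stack.pop()]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return len(seen) == len(nodes)


def neighbors(nodes, edges):
    """Yield connected topologies that differ from `edges` by one relation."""
    all_edges = {rel(a, b) for a, b in combinations(nodes, 2)}
    for edge in all_edges - edges:
        yield edges | {edge}  # adding a relation keeps the graph connected
    for edge in edges:
        candidate = edges - {edge}
        if is_connected(nodes, candidate):
            yield candidate


def hill_climb(nodes, start, cost):
    """Steepest-descent hill climbing over connected relation topologies."""
    current = frozenset(start)
    current_cost = cost(current)
    while True:
        best = min(neighbors(nodes, current), key=cost, default=None)
        if best is None or cost(best) >= current_cost:
            return current  # local optimum: no neighbor is strictly better
        current, current_cost = best, cost(best)


# Hypothetical example: four objects, starting from a star around "plate".
objects = ["plate", "cup", "fork", "knife"]
star = {rel("plate", "cup"), rel("plate", "fork"), rel("plate", "knife")}

# Assumption: omitting any of these relations would cause false positives.
required = {rel("fork", "knife"), rel("plate", "cup")}


def toy_cost(edges):
    """Assumed cost: one unit per relation, heavy penalty per missing one."""
    return len(edges) + 10 * len(required - edges)


best = hill_climb(objects, star, toy_cost)
print(sorted(sorted(edge) for edge in best), toy_cost(best))
```

In this toy setting the search first adds the missing fork-knife relation and then drops one relation that has become redundant, ending with a connected topology of three relations. A real cost function would have to be evaluated by running scene recognition on demonstration data, which is where the offline effort mentioned above would be spent.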