We present an approach for recognizing indoor scenes made up of object constellations that cannot be captured from a single viewpoint and therefore require object search by a mobile robot. In our approach, which we call Active Scene Recognition (ASR), robots predict object poses from learnt spatial relations, which they combine with their estimates of the scenes present. Our models for estimating scenes and predicting poses are Implicit Shape Model (ISM) trees from prior work. ISMs model scenes as sets of objects with spatial relations between them and are learnt from observations. In prior work, we presented a realization of ASR that was limited to choosing orientations for a fixed robot head and that searched for objects using positions while ignoring object types. In this paper, we introduce an integrated system that extends ASR to selecting both positions and orientations of camera views for a mobile robot with a pivoting head. We contribute an approach for Next-Best-View estimation in object search on predicted object poses. It is defined on 6-DoF viewing frustums and optimizes the searched view, together with the objects to be searched in it, based on 6-DoF pose predictions. To prevent combinatorial explosion when searching the camera pose space, we introduce a hierarchical approach that samples robot positions with increasing resolution.
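The idea of sampling robot positions with increasing resolution can be illustrated with a minimal coarse-to-fine sketch. This is not the system described here: it considers only 2D positions, ignores camera orientation and viewing frustums, and assumes a hypothetical `utility` function that rates a candidate position (for example, by how many predicted object poses would be visible from it). The function name, the square grid layout, and the halving schedule are all assumptions made for illustration.

```python
from typing import Callable, Tuple

Position = Tuple[float, float]


def hierarchical_position_search(
    utility: Callable[[Position], float],  # hypothetical rating of a robot position
    center: Position,
    half_extent: float,
    levels: int = 4,
    grid_size: int = 5,
) -> Position:
    """Coarse-to-fine search for the highest-rated 2D position.

    At each level, a grid_size x grid_size grid is centered on the best
    position found so far; the grid's extent is then halved, so the
    sampling resolution doubles from one level to the next.
    """
    best = center
    best_score = utility(center)
    for _ in range(levels):
        step = 2.0 * half_extent / (grid_size - 1)
        cx, cy = best  # grid center for this level
        for i in range(grid_size):
            for j in range(grid_size):
                candidate = (
                    cx - half_extent + i * step,
                    cy - half_extent + j * step,
                )
                score = utility(candidate)
                if score > best_score:
                    best, best_score = candidate, score
        half_extent /= 2.0  # refine around the current best position
    return best


if __name__ == "__main__":
    # Toy utility with a single peak at (1.3, -0.7); an assumption for the demo.
    peak = lambda p: -((p[0] - 1.3) ** 2 + (p[1] + 0.7) ** 2)
    print(hierarchical_position_search(peak, (0.0, 0.0), 4.0, levels=6))
```

The number of utility evaluations grows linearly with the number of levels (about `levels * grid_size**2`), whereas a single dense grid at the finest resolution would grow quadratically with the resolution. The trade-off is that such a greedy refinement can miss a narrow optimum that falls between coarse samples.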