TY - JOUR
T1 - Active scene recognition for programming by demonstration using next-best-view estimates from hierarchical Implicit Shape Models
AU - Meissner, Pascal
AU - Reckling, Reno
AU - Wittenbeck, Valerij
AU - Schmidt-Rohr, Sven R.
AU - Dillmann, Rudiger
PY - 2014/9/22
Y1 - 2014/9/22
N2 - We present an approach that combines passive scene understanding with object search in order to recognize scenes in indoor environments that cannot be perceived from a single point of view. Passive scene recognition is performed using Implicit Shape Models based on spatial relations between objects. ISMs, a variant of the Generalized Hough Transform, are extended to describe scenes as sets of objects with relations lying between them. Relations are expressed as six-degree-of-freedom (DoF) relative object poses. They are extracted from sensor recordings of human demonstrations of actions usually taking place in the corresponding scene. In a scene ISMs solely represent relations of n objects towards a common reference. Violations of other relations are not detectable. To overcome this limitation, we extend our scene model, using hierarchical agglomerative clustering, to a binary tree consisting of ISMs. Active scene recognition aims to simultaneously detect present scenes and look for objects these scenes consist of. For a pivoting stereo camera rig, we achieve this by performing recognition with ISMs in an object search loop using next-best-view (NBV) estimates. A criterion, on which we greedily choose views the rig shall adopt next, is the confidence to detect objects in them. In each step during the search, confidences on potential positions of objects, not found yet, are calculated based on the best available scene hypothesis. This is done by reversing the principle of ISMs and using spatial relations to predict potential object positions starting from the objects already detected.
AB - We present an approach that combines passive scene understanding with object search in order to recognize scenes in indoor environments that cannot be perceived from a single point of view. Passive scene recognition is performed using Implicit Shape Models based on spatial relations between objects. ISMs, a variant of the Generalized Hough Transform, are extended to describe scenes as sets of objects with relations lying between them. Relations are expressed as six-degree-of-freedom (DoF) relative object poses. They are extracted from sensor recordings of human demonstrations of actions usually taking place in the corresponding scene. In a scene ISMs solely represent relations of n objects towards a common reference. Violations of other relations are not detectable. To overcome this limitation, we extend our scene model, using hierarchical agglomerative clustering, to a binary tree consisting of ISMs. Active scene recognition aims to simultaneously detect present scenes and look for objects these scenes consist of. For a pivoting stereo camera rig, we achieve this by performing recognition with ISMs in an object search loop using next-best-view (NBV) estimates. A criterion, on which we greedily choose views the rig shall adopt next, is the confidence to detect objects in them. In each step during the search, confidences on potential positions of objects, not found yet, are calculated based on the best available scene hypothesis. This is done by reversing the principle of ISMs and using spatial relations to predict potential object positions starting from the objects already detected.
UR - http://www.scopus.com/inward/record.url?scp=84929179706&partnerID=8YFLogxK
U2 - 10.1109/ICRA.2014.6907680
DO - 10.1109/ICRA.2014.6907680
M3 - Conference article
AN - SCOPUS:84929179706
SP - 5585
EP - 5591
JO - Proceedings - IEEE International Conference on Robotics and Automation
JF - Proceedings - IEEE International Conference on Robotics and Automation
SN - 1050-4729
M1 - 6907680
T2 - 2014 IEEE International Conference on Robotics and Automation, ICRA 2014
Y2 - 31 May 2014 through 7 June 2014
ER -