VSS 2014 Winners > Moqian Tian
Viewpoint invariant object recognition: Spatiotemporal information during unsupervised learning enhances generalization
Moqian Tian1 and Kalanit Grill-Spector1, 2
1 Department of Psychology, Stanford University, Stanford, CA
2 Stanford Neuroscience Institute, Stanford University, Stanford, CA
People can learn to recognize object across views in an unsupervised way just from the natural viewing statistics. However, it is unknown what information is used during unsupervised training to obtain view invariant recognition. We distinguished temporal information, motion information and spatiotemporal information and tested their role in unsupervised learning to acquire view invariant object recognition. Participants were tested in a difficult task of discriminating similar 3D novel objects across in depth rotation that they performed only slightly above chance before training. After unsupervised training, participant showed significant improvement in discriminating objects across in depth rotation, with no advantage provided by spatiotemporal information or motion compared to temporal proximity alone. However, when training with a small number of object views covering the same view space, rendering views that are 30° apart compared to 7.5° in the previous case, unsupervised learning with spatiotemporal information produced better performance than learning with temporal proximity alone. These results suggest that while it is possible to learn view invariant recognition just from observing many views of an object presented in temporal proximity, spatiotemporal information enhances learning by producing broader view tuning compared to learning via temporal association alone. Our finding help to unify mixed results about spatiotemporal information’s benefit in learning viewpoint invariant recognition, and in a higher conceptual level discussed different situations when part based and view based analysis are utilized. These results have important implications for theories of object recognition as well as for computational algorithms that learn from examples.