The goal of this project is to model the process of ‘full interpretation’ of object images, namely the ability to identify and localize all semantic features and parts that are recognized by human observers. Our approach is based on interpreting multiple reduced but still interpretable local regions that together comprise the complete object. In such reduced regions, interpretation is simpler, since the number of semantic components is small and the variability of possible configurations is low. To identify useful components and relations used in the local interpretation process, we consider the interpretation of ‘minimal configurations’, which are reduced local regions that are minimal in the sense that any further (small) reduction makes them unrecognizable and uninterpretable. We also study the implications of ‘full interpretation’ for difficult visual tasks, such as recognizing actions and social interactions.