Tutorials

REGISTRATION OPENING ON SEPTEMBER 7 AND 8 WILL BE AT 8:15

Robots are meant to operate in the real world. However, even the best system we can engineer today is bound to fail whenever the setting is not heavily constrained, because the real world is generally too nuanced and unpredictable to be summarized within a limited set of specifications. Novel situations will inevitably arise, and the system will always have gaps or ambiguities in its own knowledge. This calls for robots that are able to learn continuously over time. In this tutorial I will focus mainly on the lifelong learning of perceptual and semantic object knowledge, i.e. (a) knowledge about the visual appearance of objects, which the robot needs in order to recognize and localize objects in its own environment, and (b) knowledge about properties that directly affect how an object should be manipulated, where it should be found and where it should be placed. Over the last decade, the computer and robot visual learning communities have investigated a set of research issues closely related to the problems outlined above, such as domain adaptation, attributes, active learning, dataset bias and transfer learning. This research has benefited from cross-fertilization between the two fields, but it has largely been conducted separately. The time is ripe for a critical review of the field that can promote further integration of experts from the two domains.

This tutorial will introduce the audience to these research threads from the point of view of computer and robot vision, identifying current trends, successful examples of cross-fertilization, and open challenges in both communities. Links to resources available online and ongoing research projects will be provided. For each of the topics discussed (see syllabus with references below), strong emphasis will be placed on the differences between computer and robot vision that arise from the robot having a moving body (embodiment) and needing to perform actions in its surrounding world (situatedness).

This tutorial will deal with the process of recovering the 3D position of points (structure) imaged from different locations and angular attitudes (motion). Historically, this problem has been dealt with in Photogrammetry since the first photographic camera was invented (~1840). More recently (~1980), Computer Vision rediscovered some of the old findings and also contributed its own results. I will try to give a unified view of the classical results from both disciplines and an outlook on recent trends.
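As a rough illustration of the "structure" half of the problem, the sketch below recovers one 3D point from two views by midpoint triangulation, assuming the motion is already known. The function name `triangulate_midpoint`, the camera centers, and the ray directions are hypothetical values chosen for the example; they are not taken from the tutorial, which may well cover other formulations.

```python
def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))

def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint triangulation (illustrative helper, not from the tutorial).

    c1, c2: camera centers; d1, d2: viewing-ray directions through the
    observed image point. Returns the midpoint of the shortest segment
    joining the two rays c1 + s*d1 and c2 + t*d2.
    """
    w = [a - b for a, b in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b  # zero when the rays are parallel
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel; depth is undetermined")
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = [ci + s * di for ci, di in zip(c1, d1)]
    p2 = [ci + t * di for ci, di in zip(c2, d2)]
    return [(u + v) / 2 for u, v in zip(p1, p2)]

# Assumed toy setup: two cameras one unit apart observing the same point.
true_point = [1.0, 2.0, 10.0]
cam1, cam2 = [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]
ray1 = [p - c for p, c in zip(true_point, cam1)]
ray2 = [p - c for p, c in zip(true_point, cam2)]
estimate = triangulate_midpoint(cam1, ray1, cam2, ray2)
```

With noise-free rays the two lines intersect and the estimate coincides with the true point. With noisy measurements the rays are generally skew, and the midpoint is only one of several possible estimates.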

Brain functional and structural connectivity can be assessed with advanced techniques of magnetic resonance imaging (MRI), namely functional MRI (fMRI) and diffusion tensor imaging (DTI). In particular, resting-state fMRI is a powerful means of observing functional interactions in the resting condition. Using this technique, the existence of network patterns characterized by coherent spontaneous activity in the human resting brain has been demonstrated and is nowadays considered a fundamental property of brain functional organization. The analysis of these networks requires the development and application of several mathematical and statistical methods. New approaches are continuously emerging in the attempt to provide a more complete and thorough description of functional network architecture. For example, recent findings have highlighted the non-stationarity of functional networks, which is missed by conventional functional connectivity analysis that relies on correlation computed over the full duration of the scan (i.e., several minutes). This observation has encouraged the development of new methods for exploring brain network dynamics, which are particularly relevant to brain diseases involving dynamic neuronal processes, such as epilepsy. Furthermore, understanding the link between functional connectivity and the underlying structural architecture remains a major challenge for the field and would be important for obtaining a complete picture of human brain organization. This tutorial will focus on novel emerging methods for exploring human brain connectivity and on their potential value in clinical applications.
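The difference between full-scan and time-resolved connectivity can be sketched with a sliding-window correlation, one common way of exposing non-stationarity. The signals below are synthetic, and the window length, step, and function names are assumptions made for illustration; this is not necessarily the method presented in the tutorial.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def sliding_window_correlation(x, y, window, step=1):
    """Correlation of x and y within each window of `window` samples."""
    return [pearson(x[s:s + window], y[s:s + window])
            for s in range(0, len(x) - window + 1, step)]

# Synthetic "regional" signals (assumed, not real fMRI data):
# y follows x during the first half of the scan and -x during the second.
n = 200
x = [math.sin(0.3 * t) for t in range(n)]
y = [x[t] if t < n // 2 else -x[t] for t in range(n)]

static_fc = pearson(x, y)  # one value for the whole scan
dynamic_fc = sliding_window_correlation(x, y, window=40, step=20)
```

In this toy case the full-scan correlation is close to zero, while the windowed values move from strongly positive to strongly negative, so the change in coupling is only visible in the time-resolved estimate.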

Deep learning has become a major breakthrough in artificial intelligence and has achieved remarkable success in solving grand challenges in many fields, including computer vision, speech recognition, and natural language processing. Its success benefits from the large training datasets and massively parallel computational power that have emerged in recent years, as well as from advanced model design and training strategies. The most important breakthrough of deep learning in computer vision happened in 2012, when Hinton’s group won the ImageNet object recognition challenge with a deep convolutional neural network, beating conventional computer vision technologies by a large margin.

In this tutorial, I will introduce deep learning and its applications in computer vision. I will start with a historical overview of deep learning and an introduction to several classical deep models. Through concrete examples on image classification, face recognition, object detection, image segmentation, and video understanding, I will explain why deep learning works in computer vision and how to design effective deep models and learning strategies. Some open questions related to deep learning will also be discussed at the end.
