Talks

Talk

- 22 July 2019 • 10:30 AM - 22 April 2019 • 11:30 AM

- Richard T Bryan

- 2P4

- Chengde Wan

- PS-Aquarium

Deep learning has significantly advanced the state of the art in 3D hand pose estimation, and accuracy can be improved further with larger amounts of labelled data. However, acquiring 3D hand pose labels can be extremely difficult. In this talk, I will present our two recent works on leveraging self-supervised learning techniques for hand pose estimation from depth maps. In both works, we incorporate a differentiable renderer into the network and formulate the training loss as a model-fitting error used to update the network parameters. In the first part of the talk, I will present our earlier work, which approximates the hand surface with a set of spheres. We then model the pose prior as a variational lower bound with a variational auto-encoder (VAE). In the second part, I will present our latest work on regressing the vertex coordinates of a hand mesh model with a 2D fully convolutional network (FCN) in a single forward pass. In the first stage, the network estimates a dense correspondence field from every pixel on the image grid to the mesh grid. In the second stage, we design a differentiable operator to map the features learned in the previous stage and regress a 3D coordinate map on the mesh grid. Finally, we sample from the mesh grid to recover the mesh vertices and fit an articulated template mesh to them in closed form. Without any human annotation, both works perform competitively with strongly supervised methods. The latter work will also be extended to be compatible with the MANO model.

Organizers: Dimitrios Tzionas

Talk Archive

- Karl Rohe

- MPI IS Lecture Hall (N0.002)

This paper uses the relationship between graph conductance and spectral clustering to study (i) the failures of spectral clustering and (ii) the benefits of regularization. The explanation is simple. Sparse and stochastic graphs create a lot of small trees that are connected to the core of the graph by only one edge. Graph conductance is sensitive to these noisy "dangling sets." Spectral clustering inherits this sensitivity. The second part of the paper starts from a previously proposed form of regularized spectral clustering and shows that it is related to the graph conductance on a "regularized graph." We call the conductance on the regularized graph CoreCut. Based upon previous arguments that relate graph conductance to spectral clustering (e.g. Cheeger inequality), minimizing CoreCut relaxes to regularized spectral clustering. Simple inspection of CoreCut reveals why it is less sensitive to small cuts in the graph. Together, these results show that unbalanced partitions from spectral clustering can be understood as overfitting to noise in the periphery of a sparse and stochastic graph. Regularization fixes this overfitting. In addition to this statistical benefit, these results also demonstrate how regularization can improve the computational speed of spectral clustering. We provide simulations and data examples to illustrate these results.
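The effect of a dangling set on graph conductance can be illustrated with a small sketch. It assumes that regularization adds tau/N to every entry of the adjacency matrix, which is one commonly used form; the abstract does not spell out the exact form, and the toy graph, the function names and the value of `tau` are hypothetical.

```python
from itertools import combinations


def build_graph(edges):
    """Undirected, unit-weight graph stored as {node: {neighbour: weight}}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, {})[v] = 1.0
        adj.setdefault(v, {})[u] = 1.0
    return adj


def conductance(adj, subset, tau=0.0):
    """Conductance of `subset` on the graph with tau/N added to every
    adjacency entry (assumed form of regularization; tau=0 is the raw graph)."""
    n = len(adj)
    inside = set(subset)
    outside = [v for v in adj if v not in inside]
    # Edge weight crossing the cut, plus tau/N for each (inside, outside) pair.
    cut = sum(w for u in inside for v, w in adj[u].items() if v not in inside)
    cut += tau / n * len(inside) * len(outside)
    # Every node's degree grows by tau, so each volume grows by tau * |set|.
    vol_in = sum(sum(adj[u].values()) for u in inside) + tau * len(inside)
    vol_out = sum(sum(adj[u].values()) for u in outside) + tau * len(outside)
    return cut / min(vol_in, vol_out)


# Hypothetical toy graph: two 4-cliques joined by four edges (the "core"),
# plus a 3-node path hanging off node 0 by a single edge (a "dangling set").
edges = list(combinations(range(4), 2)) + list(combinations(range(4, 8), 2))
edges += [(0, 4), (1, 5), (2, 6), (3, 7)]
edges += [(0, 8), (8, 9), (9, 10)]
adj = build_graph(edges)

dangling = {8, 9, 10}               # cuts off only the small tree
balanced = {0, 1, 2, 3, 8, 9, 10}   # separates the two cliques

for tau in (0.0, 3.0):
    print(tau, conductance(adj, dangling, tau), conductance(adj, balanced, tau))
```

On this toy graph the dangling path has the lower conductance without regularization (0.2 versus 0.25), while with `tau = 3.0` the balanced cut becomes the cheaper one.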

Organizers: Damien Garreau

Talk

- 15 June 2018 • 11:00 - 12:00

- Adrián Javaloy

- S2 seminar room

The problem of text normalization is simple to understand: transform arbitrary text into its spoken form. In the context of text-to-speech systems, which we will focus on, this can be exemplified by turning the text “$200” into “two hundred dollars”. Lately, interest in solving this problem with deep learning techniques has risen, since it is a highly context-dependent problem that is still being solved with ad hoc solutions. So much so that Google even started a competition on Kaggle to solve it. In this talk we will see how this problem was approached as part of a Master's thesis. Namely, the problem is tackled as if it were a machine translation problem from English to normalized English, so the proposed architecture is a neural machine translation architecture with the addition of traditional attention mechanisms. This network is typically composed of an encoder and a decoder, both of which are multi-layer LSTM networks. As part of this work, and with the aim of demonstrating the feasibility of convolutional neural networks in natural-language processing problems, we propose and compare different convolutional architectures for the encoder. In particular, we propose a new architecture called Causal Feature Extractor, which proves to be a strong encoder as well as an attention-friendly architecture.

Organizers: Philipp Hennig

IS Colloquium

- 11 June 2018 • 11:15 - 12:15

- Cédric Archambeau

- MPI IS Lecture Hall (N0.002)

Bayesian optimization (BO) is a model-based approach for gradient-free black-box function optimization, such as hyperparameter optimization. Typically, BO relies on conventional Gaussian process regression, whose algorithmic complexity is cubic in the number of evaluations. As a result, Gaussian process-based BO cannot leverage large numbers of past function evaluations, for example, to warm-start related BO runs. After a brief intro to BO and an overview of several use cases at Amazon, I will discuss a multi-task adaptive Bayesian linear regression model whose computational complexity is linear in the number of function evaluations and which can leverage information from related black-box functions through a shared deep neural net. Experimental results show that the neural net learns a representation suitable for warm-starting related BO runs and that these runs can be accelerated when the target black-box function (e.g., validation loss) is learned together with other related signals (e.g., training loss). The proposed method was found to be at least one order of magnitude faster than competing neural network-based methods recently published in the literature. This is joint work with Valerio Perrone, Rodolphe Jenatton, and Matthias Seeger.
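The linear scaling mentioned in the abstract can be seen in a minimal single-task sketch of Bayesian linear regression. This is not the speaker's model: a fixed polynomial basis (`features`) stands in for the learned neural-net representation, and the function names, the prior precision `alpha` and the noise precision `beta` are all assumptions made for illustration.

```python
def features(x):
    """Hypothetical fixed basis, a stand-in for learned neural-net features."""
    return [1.0, x, x * x]


def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - tail) / M[i][i]
    return w


def fit_blr(xs, ys, alpha=1.0, beta=25.0):
    """Posterior mean of the weights: (alpha*I + beta*Phi^T Phi)^-1 beta*Phi^T y."""
    d = len(features(xs[0]))
    A = [[alpha if i == j else 0.0 for j in range(d)] for i in range(d)]
    b = [0.0] * d
    # One pass over the data: O(N * d^2), i.e. linear in the number of evaluations.
    for x, y in zip(xs, ys):
        phi = features(x)
        for i in range(d):
            b[i] += beta * phi[i] * y
            for j in range(d):
                A[i][j] += beta * phi[i] * phi[j]
    # The d x d solve costs O(d^3) and does not depend on N.
    return solve(A, b)


# Noise-free toy data from y = 1 + 2x - 0.5x^2 (hypothetical).
xs = [i / 10 for i in range(-20, 21)]
ys = [1.0 + 2.0 * x - 0.5 * x * x for x in xs]
weights = fit_blr(xs, ys, alpha=1e-6, beta=1.0)
print(weights)
```

With a near-flat prior the posterior mean recovers the generating coefficients closely. The point of the sketch is that the data enter only through d x d sufficient statistics, in contrast to the N x N kernel matrix of a Gaussian process.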

Organizers: Isabel Valera

Talk

- 11 June 2018 • 15:00 - 17:00

- Prof. Martin Spindler

- MPI IS Lecture Hall (N0.002)

In this talk, an introduction to the double machine learning framework is given first. This framework allows inference on parameters in high-dimensional settings. Then two applications are presented, namely transformation models and Gaussian graphical models in high-dimensional settings. Both kinds of models are widely used by practitioners. As high-dimensional data sets become increasingly available, it is important to allow for situations where the number of parameters is large compared to the sample size.
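As a rough illustration of the double machine learning idea, the sketch below applies the commonly used partialling-out estimator with cross-fitting to a partially linear model. The abstract does not state which estimator or nuisance learners the talk uses, so the model, the simple least-squares nuisance fits, the simulated data and all names here are assumptions; in a high-dimensional setting the nuisance fits would typically be replaced by a regularized learner such as the lasso.

```python
import random


def ols_fit(xs, ys):
    """Least-squares line y = a + b*x; returns a prediction function."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    return lambda x: a + b * x


def dml_partial_linear(x, d, y, n_folds=2):
    """Cross-fitted partialling-out estimate of theta in y = theta*d + g(x) + noise."""
    n = len(y)
    res_d, res_y = [0.0] * n, [0.0] * n
    for k in range(n_folds):
        test = [i for i in range(n) if i % n_folds == k]
        train = [i for i in range(n) if i % n_folds != k]
        # Nuisance functions are fitted on the other folds only (cross-fitting).
        m_hat = ols_fit([x[i] for i in train], [d[i] for i in train])  # E[d | x]
        l_hat = ols_fit([x[i] for i in train], [y[i] for i in train])  # E[y | x]
        for i in test:
            res_d[i] = d[i] - m_hat(x[i])
            res_y[i] = y[i] - l_hat(x[i])
    # Regress the outcome residuals on the treatment residuals.
    num = sum(rd * ry for rd, ry in zip(res_d, res_y))
    den = sum(rd * rd for rd in res_d)
    return num / den


# Simulated data (hypothetical): the confounder x drives both d and y.
rng = random.Random(0)
theta_true = 2.0
x = [rng.gauss(0, 1) for _ in range(2000)]
d = [0.8 * xi + rng.gauss(0, 1) for xi in x]
y = [theta_true * di + 1.5 * xi + rng.gauss(0, 1) for di, xi in zip(d, x)]

theta_hat = dml_partial_linear(x, d, y)
print(theta_hat)
```

Residualizing both the treatment and the outcome before the final regression is what makes the estimate of `theta` insensitive to small errors in the nuisance fits.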

Organizers: Philipp Geiger

- Dr. Greg Byrnes

- Room 3P02 - Stuttgart

Gliding evolved at least nine times in mammals. Despite the abundance and diversity of gliding mammals, little is known about their convergent morphology and mechanisms of aerodynamic control. Many gliding animals are capable of impressive and agile aerial behaviors and their flight performance depends on the aerodynamic forces resulting from airflow interacting with a flexible, membranous wing (patagium). Although the mechanisms that gliders use to control dynamic flight are poorly understood, the shape of the gliding membrane (e.g., angle of attack, camber) is likely a primary factor governing the control of the interaction between aerodynamic forces and the animal’s body. Data from field studies of gliding behavior, lab experiments examining membrane shape changes during glides and morphological and materials testing data of gliding membranes will be presented that can aid our understanding of the mechanisms gliding mammals use to control their membranous wings and potentially provide insights into the design of man-made flexible wings.

Organizers: Metin Sitti, Ardian Jusufi

IS Colloquium

- 08 June 2018 • 11:00 - 12:00

- Prof. Javier Cudeiro

- MPI-IS lecture hall (N0.002)

Visual perception involves a complex interaction between feedforward and feedback processes. A mechanistic understanding of these processes, and of their limitations, is a necessary first step towards elucidating key aspects of perceptual functions and dysfunctions. In this talk, I will review our ongoing effort to understand how feedback visual processing operates at the level of the thalamus, a dynamic relay station halfway between the retina and the cortex. I will present experimental evidence from several recent electrophysiology studies performed on subjects engaged in visual detection tasks. The results show that modulatory driving provided by top-down processes (the feedback from primary visual cortex) critically influences the ongoing thalamic activity and shapes the message to be delivered to the cortex. When neuromodulatory techniques (Transcranial Magnetic Stimulation or static magnetic fields) are used to transiently disrupt cortical activity, two very interesting effects show up: (1) alterations in stimulus detection and (2) dramatic modifications of the spatial properties of thalamic receptive fields. Finally, I will show how sensory information can be a powerful tool to interact with the motor system and re-organize altered patterns of movement in neurological disorders such as Parkinson's disease.

Organizers: Daniel Cudeiro

- Dr. Hadi Eghlidi

- MPI-IS Stuttgart, Room 5H7

Investigation and control of biological and synthetic nanoscopic species in liquids, at the ultimate resolution of a single entity, are important in diverse fields such as biology, medicine, physics, chemistry and the emerging field of nanorobotics. Progress made to date on trapping and/or manipulating nanoscopic objects includes methods that use permanently imposed force fields of various kinds, such as optical, electrical and magnetic forces, to counteract their inherent Brownian motion.

Organizers: Peer Fischer, Ardian Jusufi

Talk

- 05 June 2018 • 11:00 - 12:00

- Wenzhen Yuan

- MPI-IS Stuttgart, Heisenbergstr. 3, Room 2P4

Why can't current robots act intelligently in real-world environments? A major challenge lies in the lack of adequate tactile sensing technologies. Robots need tactile sensing to understand the physical environment and to detect contact states during manipulation. Progress requires advances in the sensing hardware, but also in the software that exploits the tactile signals. We developed a high-resolution tactile sensor, GelSight, which measures the geometry and traction field of the contact surface. To interpret the high-resolution tactile signal, we use both traditional statistical models and deep neural networks. I will describe my research on both exploration and manipulation. For exploration, I use active touch to estimate the physical properties of objects. The work has included learning the hardness of artificial objects, as well as estimating the general properties of natural objects via autonomous tactile exploration. For manipulation, I study the robot's ability to detect slip or incipient slip with tactile sensing during grasping. The research helps robots better understand and flexibly interact with the physical world.

Organizers: Katherine J. Kuchenbecker