Health care is probably the last remaining unsafe critical system. A large proportion of reported medical errors occur in the hospital operating room (OR), a highly complex sociotechnical environment. Because technology is being introduced into the OR faster than surgeons can learn to use it, surgical errors result from unfamiliar instrumentation, increased motor, perceptual, and cognitive demands on surgeons, and a lack of adequate training. Effective technology design for minimally invasive surgery requires an understanding of the system constraints of remote surgery and of the complex interaction between humans and technology in the OR. This talk will describe research activities in the Ergonomics in Remote Environments Laboratory at Wright State University that address some of these human factors issues.
Organizers: Katherine J. Kuchenbecker
Clearly explaining the rationale for a classification decision to an end-user can be as important as the decision itself. Existing approaches for deep visual recognition are generally opaque and do not output any justification text; contemporary vision-language models can describe image content but fail to take into account the class-discriminative image aspects that justify visual predictions. In this talk, I will present my past and current work on Zero-Shot Learning, Vision and Language for Generative Modeling, and Explainable Artificial Intelligence, covering (1) how we can generalize image classification models to cases where no visual training data is available, (2) how to generate images and image features from detailed visual descriptions, and (3) how our models focus on the discriminating properties of the visible object, jointly predict a class label, and explain why the predicted label is appropriate for the image whereas another label is not.
Organizers: Andreas Geiger
Complex shapes can be summarized using a coarsely defined structure that is consistent and robust across a variety of observations. However, existing synthesis techniques do not consider structural decomposition during synthesis, which leads to implausible or structurally unrealistic shapes. We explore how structure-aware reasoning can benefit existing generative techniques for complex 2D and 3D shapes. We evaluate our methodology on a 3D dataset of chairs and a 2D dataset of typefaces.
Organizers: Sergi Pujades
Touch requires mechanical contact and is governed by the physics of friction. Frictional movements may convert the continuous 3D profile of textural objects into discrete and probabilistic movement events of the viscoelastic integument (skin/hair) called stick-slip movements (slips). This complex transformation may further be determined by the microanatomy and the active movements of the sensing organ. Thus, the integument may realize a computation, transforming the tactile world in a context-dependent way, long before it even activates neurons. The possibility that the tactile world is perceived through these ‘fractured goggles’ of friction has been largely ignored by classical perceptual and neuroscientific work. I will present biomechanical, neuroscientific, and behavioral work supporting the slip hypothesis.
Organizers: Katherine J. Kuchenbecker
Optimal control problems are often too complex to solve analytically. Computational methods usually replace the continuous infinite-dimensional problem with a finite-dimensional discrete approximation. The talk will survey classical discretization techniques based on a Runge-Kutta approximation to the differential equations (an h-method) and then introduce recent approximations based on collocation at the roots of orthogonal polynomials (a p-method). The best approximations are often achieved using an hp-framework that combines the best features of both approaches. Numerical results obtained with GPOPS-II (General Pseudospectral Optimal Control Software package) will be presented.
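As a rough illustration of the h-method versus p-method distinction described in the abstract above, the following sketch discretizes only a toy differential equation, x'(t) = -x(t) with x(0) = 1 on [0, 1], and not a full optimal control problem. The test equation, the function names, and the choice of Legendre-Gauss nodes are assumptions made for this example; it does not use GPOPS-II and is not the speaker's implementation.

```python
import numpy as np
from numpy.polynomial import legendre as L


def rk4_decay(n_steps):
    """h-method sketch: classical RK4 on x' = -x, x(0) = 1; returns x(1)."""
    h = 1.0 / n_steps
    x = 1.0
    f = lambda x: -x
    for _ in range(n_steps):
        k1 = f(x)
        k2 = f(x + 0.5 * h * k1)
        k3 = f(x + 0.5 * h * k2)
        k4 = f(x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return x


def collocation_decay(n):
    """p-method sketch: one degree-n polynomial collocated at the n roots of
    the Legendre polynomial P_n; returns the approximation of x(1)."""
    # Map t in [0, 1] to tau in [-1, 1]; then dx/dtau = -0.5 * x.
    tau, _ = L.leggauss(n)          # roots of P_n (Legendre-Gauss nodes)
    V = L.legvander(tau, n)         # V[i, k] = P_k(tau_i)
    eye = np.eye(n + 1)
    # dV[i, k] = P_k'(tau_i), built from the derivative of each basis polynomial
    dV = np.column_stack(
        [L.legval(tau, L.legder(eye[k])) for k in range(n + 1)]
    )
    # Row 0 enforces x(-1) = 1; remaining rows enforce the ODE at the nodes.
    A = np.vstack([L.legvander(np.array([-1.0]), n), dV + 0.5 * V])
    b = np.zeros(n + 1)
    b[0] = 1.0
    coeffs = np.linalg.solve(A, b)
    return L.legval(1.0, coeffs)


if __name__ == "__main__":
    exact = np.exp(-1.0)
    for n in (2, 4, 8):
        print(n, abs(rk4_decay(n) - exact), abs(collocation_decay(n) - exact))
```

Refining the h-method means taking more steps of a fixed-order scheme, while refining the p-method means raising the polynomial degree on a single interval; an hp-scheme would do both, using several intervals with a polynomial on each.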
Organizers: Jia-Jie Zhu
The Gaussian mechanism is an essential building block used in a multitude of differentially private data analysis algorithms. In this talk I will revisit the classical analysis of the Gaussian mechanism and show it has several important limitations. For example, our analysis reveals that the variance formula for the original mechanism is far from tight in the high privacy regime and that it cannot be extended to the low privacy regime. We address these limitations by developing a new Gaussian mechanism whose variance is optimally calibrated by solving an equation involving the Gaussian cumulative distribution function. Our analysis side-steps the use of tail bound approximations and relies on a novel characterisation of differential privacy that might be of independent interest. We numerically show that analytical calibration removes at least a third of the variance of the noise compared to the classical Gaussian mechanism. We also propose to equip the Gaussian mechanism with a post-processing step based on adaptive denoising estimators, leveraging the fact that the variance of the perturbation is known. Experiments with synthetic and real data show that this denoising step yields dramatic accuracy improvements in the high-dimensional regime. Based on joint work with Y.-X. Wang to appear at ICML 2018. Pre-print: https://arxiv.org/abs/1805.06530
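The calibration idea in the abstract above can be sketched numerically. The code below compares the textbook calibration sigma = Delta * sqrt(2 ln(1.25/delta)) / epsilon (usually stated for epsilon < 1) with a calibration that solves the exact condition Phi(Delta/(2 sigma) - epsilon sigma/Delta) - e^epsilon Phi(-Delta/(2 sigma) - epsilon sigma/Delta) <= delta, where Phi is the Gaussian CDF. The function names and the plain bisection search are assumptions chosen for illustration; the pre-print describes its own numerical procedure, which this sketch does not reproduce.

```python
import math


def gaussian_cdf(t):
    """Standard normal CDF, written with erfc for accuracy in the tails."""
    return 0.5 * math.erfc(-t / math.sqrt(2.0))


def classical_sigma(eps, delta, sens=1.0):
    """Textbook calibration, normally stated only for eps in (0, 1)."""
    return sens * math.sqrt(2.0 * math.log(1.25 / delta)) / eps


def gaussian_delta(sigma, eps, sens=1.0):
    """Smallest delta for which noise of scale sigma gives (eps, delta)-DP,
    according to the exact CDF-based condition quoted in the lead-in."""
    a = sens / (2.0 * sigma)
    b = eps * sigma / sens
    return gaussian_cdf(a - b) - math.exp(eps) * gaussian_cdf(-a - b)


def analytic_sigma(eps, delta, sens=1.0, rel_tol=1e-9):
    """Bisection for the smallest sigma with gaussian_delta(sigma) <= delta.
    (Hypothetical helper; relies on gaussian_delta decreasing in sigma.)"""
    lo = 1e-6 * sens            # far too little noise: condition fails
    hi = sens
    while gaussian_delta(hi, eps, sens) > delta:
        hi *= 2.0               # grow until the condition holds
    while hi - lo > rel_tol * hi:
        mid = 0.5 * (lo + hi)
        if gaussian_delta(mid, eps, sens) > delta:
            lo = mid
        else:
            hi = mid
    return hi                   # hi always satisfies the condition


if __name__ == "__main__":
    eps, delta = 0.5, 1e-5
    s_c = classical_sigma(eps, delta)
    s_a = analytic_sigma(eps, delta)
    print("classical sigma:", s_c)
    print("analytic sigma: ", s_a)
    print("variance ratio: ", (s_a / s_c) ** 2)
```

Unlike the classical formula, the bisection version places no restriction on epsilon, so the same routine can be called in the low privacy regime (epsilon > 1).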
I will describe recent research in my lab on haptics and robotics. It has been a longstanding challenge to realize engineering systems that can match the amazing perceptual and motor feats of biological systems for touch, including the human hand. Some of the difficulties of meeting this objective can be traced to our limited understanding of the mechanics, to the high dimensionality of the signals, and to the multiple length and time scales (physical regimes) involved. An additional source of richness and complication arises from the sensitive dependence of what we feel on what we do, i.e. on the tight coupling between touch-elicited mechanical signals, object contacts, and actions. I will describe research in my lab that has aimed at addressing these challenges, and will explain how the results are guiding the development of new technologies for haptics, wearable computing, and robotics.
Organizers: Katherine J. Kuchenbecker
This paper uses the relationship between graph conductance and spectral clustering to study (i) the failures of spectral clustering and (ii) the benefits of regularization. The explanation is simple. Sparse and stochastic graphs create a lot of small trees that are connected to the core of the graph by only one edge. Graph conductance is sensitive to these noisy "dangling sets." Spectral clustering inherits this sensitivity. The second part of the paper starts from a previously proposed form of regularized spectral clustering and shows that it is related to the graph conductance on a "regularized graph." We call the conductance on the regularized graph CoreCut. Based upon previous arguments that relate graph conductance to spectral clustering (e.g. Cheeger inequality), minimizing CoreCut relaxes to regularized spectral clustering. Simple inspection of CoreCut reveals why it is less sensitive to small cuts in the graph. Together, these results show that unbalanced partitions from spectral clustering can be understood as overfitting to noise in the periphery of a sparse and stochastic graph. Regularization fixes this overfitting. In addition to this statistical benefit, these results also demonstrate how regularization can improve the computational speed of spectral clustering. We provide simulations and data examples to illustrate these results.
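The sensitivity to "dangling sets" described above can be checked on a toy graph. The sketch below assumes that the regularized graph adds a weight of tau/N to every pair of nodes (one common form of regularization; the abstract does not say which form is used) and that CoreCut is the conductance measured on that graph. The toy graph, the function names, and the choice of tau as the average degree are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def conductance(A, S):
    """cut(S, S^c) / min(vol(S), vol(S^c)) for a weighted adjacency matrix A."""
    n = A.shape[0]
    mask = np.zeros(n, dtype=bool)
    mask[list(S)] = True
    cut = A[mask][:, ~mask].sum()
    deg = A.sum(axis=1)
    return cut / min(deg[mask].sum(), deg[~mask].sum())


def core_cut(A, S, tau):
    """Conductance of S on the assumed regularized graph A + (tau / N) * J,
    where J is the all-ones matrix."""
    n = A.shape[0]
    return conductance(A + (tau / n) * np.ones((n, n)), S)


def toy_graph():
    """A 6-node clique (the core) plus a 2-node path hanging off node 0."""
    A = np.zeros((8, 8))
    for i in range(6):
        for j in range(i + 1, 6):
            A[i, j] = A[j, i] = 1.0
    for i, j in [(0, 6), (6, 7)]:   # the dangling set is {6, 7}
        A[i, j] = A[j, i] = 1.0
    return A


if __name__ == "__main__":
    A = toy_graph()
    tau = A.sum() / A.shape[0]      # average degree, an assumed choice
    dangling, balanced = [6, 7], [0, 1, 2]
    print("raw conductance:", conductance(A, dangling), conductance(A, balanced))
    print("CoreCut:        ", core_cut(A, dangling, tau), core_cut(A, balanced, tau))
```

On the raw graph the dangling set has the lower conductance, so a conductance-minimizing partition would split off two peripheral nodes; on the regularized graph the ordering flips and the cut through the core becomes the cheaper one.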
Organizers: Damien Garreau
The problem of text normalization is simple to understand: transform a given arbitrary text into its spoken form. In the context of text-to-speech systems, which we will focus on, this can be exemplified by turning the text “$200” into “two hundred dollars”. Lately, interest in solving this problem with deep learning techniques has risen, since it is a highly context-dependent problem that is still being solved by ad-hoc solutions. So much so that Google even launched a competition on Kaggle to solve this problem. In this talk we will see how this problem has been approached as part of a Master's thesis. Namely, the problem is tackled as if it were an automatic translation problem from English to normalized English, and so the proposed architecture is a neural machine translation architecture with the addition of traditional attention mechanisms. This network is typically composed of an encoder and a decoder, both of which are multi-layer LSTM networks. As part of this work, and with the aim of demonstrating the feasibility of convolutional neural networks in natural-language processing problems, we propose and compare different architectures for the encoder based on convolutional networks. In particular, we propose a new architecture called Causal Feature Extractor, which proves to be a great encoder as well as an attention-friendly architecture.
Organizers: Philipp Hennig