I am a PhD student in the Autonomous Vision Group of Andreas Geiger at the University of Tübingen and the MPI for Intelligent Systems in Tübingen, as well as at Robert Bosch GmbH in Leonberg, where I work under the supervision of Alexandru Condurache in the Driver Assistance Systems department.
The topic of my PhD is "Efficient Invariant Deep Models for Computer Vision". Convolutional neural networks (CNNs) have revolutionized the field of computer vision and currently offer state-of-the-art performance for many computer vision tasks, including image classification, object detection and semantic segmentation. However, their success comes at the cost of large annotated datasets, which require considerable effort to create. I therefore investigate how the data efficiency of deep neural networks can be improved by learning invariance/equivariance or by encoding it directly into deep neural network architectures.
2014 - 2016 MSc Computer Science, KTH Royal Institute of Technology
2010 - 2013 BSc Media Computer Science, Stuttgart Media University
2016 Master Thesis Intern, Bosch
2015 Data Mining Intern, Bosch North America
Palo Alto, CA, USA
2013 - 2014 HMI Research Assistant, Bosch North America
European Conference on Computer Vision (ECCV), September 2018 (conference)
Omnidirectional cameras offer great benefits over classical cameras wherever a wide field of view is essential, such as in virtual reality applications or in autonomous robots. Unfortunately, standard convolutional neural networks are not well suited for this scenario as the natural projection surface is a sphere which cannot be unwrapped to a plane without introducing significant distortions, particularly in the polar regions. In this work, we present SphereNet, a novel deep learning framework which encodes invariance against such distortions explicitly into convolutional neural networks. Towards this goal, SphereNet adapts the sampling locations of the convolutional filters, effectively reversing distortions, and wraps the filters around the sphere. By building on regular convolutions, SphereNet enables the transfer of existing perspective convolutional neural network models to the omnidirectional case. We demonstrate the effectiveness of our method on the tasks of image classification and object detection, exploiting two newly created semi-synthetic and real-world omnidirectional datasets.
In International Conference on Computer Vision Theory and Applications, 2018 (inproceedings)
Deep convolutional neural networks are the current state-of-the-art solution to many computer vision tasks. However, their ability to handle large global and local image transformations is limited. Consequently, extensive data augmentation is often utilized to incorporate prior knowledge about desired invariances to geometric transformations such as rotations or scale changes. In this work, we combine data augmentation with an unsupervised loss which enforces similarity between the predictions of augmented copies of an input sample. Our loss acts as an effective regularizer which facilitates the learning of transformation invariant representations. We investigate the effectiveness of the proposed similarity loss on rotated MNIST and the German Traffic Sign Recognition Benchmark (GTSRB) in the context of different classification models including ladder networks. Our experiments demonstrate improvements over the standard data augmentation approach for supervised and semi-supervised learning tasks, particularly when little annotated data is available. In addition, we analyze the performance of the proposed approach with respect to its hyperparameters, including the strength of the regularization as well as the layer where representation similarity is enforced.
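The combination of a supervised loss with an unsupervised similarity term can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the paper's implementation: the function names are hypothetical, the supervised term is plain cross-entropy on one augmented copy, and the similarity term is the squared distance between the softmax outputs for the two copies.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def similarity_regularized_loss(logits_a, logits_b, label, weight=1.0):
    """Hypothetical sketch: supervised cross-entropy on one augmented
    copy plus an unsupervised penalty pushing the predictions for two
    augmented copies of the same input towards each other."""
    p_a = softmax(logits_a)
    p_b = softmax(logits_b)
    # Supervised term: cross-entropy of the first copy's prediction.
    ce = -np.log(p_a[label])
    # Unsupervised similarity term: zero iff both augmented copies
    # yield identical predictions, so it rewards invariance.
    sim = np.sum((p_a - p_b) ** 2)
    return ce + weight * sim
```

The `weight` hyperparameter corresponds to the regularization strength analyzed in the abstract: with `weight=0` the loss reduces to ordinary supervised training with data augmentation, while larger values enforce increasingly transformation-invariant predictions.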