
Institute Talks

Designing Mobile Robots for Physical Interaction with Sandy Terrains

IS Colloquium
  • 18 March 2024 • 16:00—17:00
  • Dr. Hannah Stuart
  • Hybrid - Webex plus in-person attendance in Copper (2R04)

One day, robots will widely support exploration and development of unstructured natural environments. Much of the work I will present in this lecture is supported by NASA and is focused on robot design research relevant to accessing the surfaces of the Moon or Mars. Tensile elements appear repeatedly across the wide array of missions envisioned to support human or robotic exploration and habitation of the Moon. With a single secured tether, rovers or astronauts, or both, could belay down into steep lunar craters to explore permanently shadowed regions; the tether prevents catastrophic slipping or falling and mitigates risk while we search for water resources. Like a tent that relies on tensioned ropes to sustain its structure, tensegrity-based antennae, dishes, and habitats can be made large and strong using very lightweight materials. We ask: Where does the tension in these lightweight systems go? Ultimately, these concepts will require anchors that attach cables autonomously, securely, and reliably to the surrounding regolith to react tensile forces. The development of new autonomous burrowing and anchoring technologies, along with modeling techniques to guide their design and adoption, is thus critical across multiple space-relevant programs. Yet burrowing and anchoring are hard problems for multiple reasons, and they remain fundamental areas of discovery. Our goal is to understand how the mechanics of granular and rocky interaction influences the design and control of small-scale robotic systems for such forceful manipulations. I will present new mobility and anchoring strategies that enable small robots to resist, or "tug," massive loads in loose terrain like regolith. The idea is that multiple tethered agents can work together to perform large-scale manipulations, even where traction and gravity are low. The resulting generalizable methods for rapidly modeling granular interactions also inform new mobility gaits for moving over, through, or under loose sand more efficiently.

Organizers: Katherine J. Kuchenbecker


Bioadhesive Technologies with Mechanical Principles

Talk
  • 18 March 2024 • 11:00—12:15
  • Dr. Jianyu Li
  • Silver (2P04)

Bioadhesive technologies are important in a wide range of applications, spanning from wound management to wearable technologies. Forming and controlling tough adhesion on biological tissues has been a long-standing challenge, necessitating transdisciplinary approaches. In my talk, I will share our recent progress in the design, mechanics, and applications of tough bioadhesives. I will first discuss the limitations of clinically used surgical glues and blood clots in terms of adhesion properties. I will then present the mechanical principles for making tough bioadhesives that exhibit superior adhesion performance on diverse tissues. Furthermore, I will discuss our transdisciplinary approaches and underlying mechanisms for controlling tough bioadhesion through ultrasound and interfacial entanglements. Lastly, I will showcase the applications of tough bioadhesives in wound management, tissue repair, and hemorrhage control. This talk will highlight the synergy of materials and mechanics in the development of new biomaterials poised to address biomedical challenges.

Organizers: Christoph Keplinger Adrian Koh

Geometric Regularizations for 3D Shape Generation

Talk
  • 13 March 2024 • 15:00—16:00
  • Qixing Huang
  • N3.022

Generative models, which map a latent parameter space to instances in an ambient space, enjoy various applications in 3D vision and related domains. The standard scheme for these models is probabilistic: it aligns the ambient distribution induced by pushing a prior distribution on the latent space through the generative model with the empirical distribution of the training instances. While this paradigm has proven quite successful on images, its current applications in 3D generation face fundamental challenges from limited training data and poor generalization behavior. The key difference between image generation and shape generation is that 3D shapes possess various priors in geometry, topology, and physical properties. Existing probabilistic 3D generative approaches do not preserve these desired properties, resulting in synthesized shapes with various types of distortions. In this talk, I will discuss recent work that seeks to establish a novel geometric framework for learning shape generators. The key idea is to model various geometric, physical, and topological priors of 3D shapes as suitable regularization losses by developing computational tools in differential geometry and computational topology. I will discuss applications in deformable shape generation, latent space design, joint shape matching, and 3D man-made shape generation.
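
As a rough illustration of the recipe described above (a minimal sketch with hypothetical names and shapes, not the speaker's code), a geometric prior can enter training as a differentiable regularization loss added to the data term; here, a uniform-Laplacian smoothness penalty on generated mesh vertices:

```python
import torch

def laplacian_smoothness(verts: torch.Tensor, edges: torch.Tensor) -> torch.Tensor:
    """Penalize each vertex's offset from the mean of its neighbors."""
    # verts: (V, 3) vertex positions; edges: (E, 2) undirected edge list
    src = torch.cat([edges[:, 0], edges[:, 1]])   # count both edge directions
    dst = torch.cat([edges[:, 1], edges[:, 0]])
    neighbor_sum = torch.zeros_like(verts).index_add_(0, src, verts[dst])
    degree = torch.zeros(verts.shape[0], dtype=verts.dtype, device=verts.device)
    degree.index_add_(0, src, torch.ones_like(src, dtype=verts.dtype))
    lap = verts - neighbor_sum / degree.clamp(min=1).unsqueeze(-1)
    return lap.pow(2).sum(dim=-1).mean()

# In a training step, the prior simply joins the generator's data term:
# loss = data_loss(generated, target) + lam * laplacian_smoothness(verts, edges)
```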

Organizers: Yuliang Xiu


  • Marie Großmann
  • Hybrid - Webex plus in-person attendance in 5N18

The sensory perception of the world, including seeing and hearing, tasting and smelling, touching and feeling, comprises social skills that are necessary for becoming a social counterpart. In this context, the construction of perceiving technology is an intersection where technical artifacts gain the capability to sense and interact with their environment. Sensors as technical artifacts not only measure various (physical) states, with their presented results influencing perceptions and actions, but also undergo technical and computational processing. Sensors generate differences by capturing and measuring variations in their surroundings. In this talk, I will share insights from my qualitative social research in the lab, using a sociological engagement with technology, materiality, and science research as a starting point to sharpen a sociological perspective on the construction of technical perceptions. The focus will be on how knowledge about perception is implemented in technology and materiality when sensors are constructed.

Organizers: Katherine J. Kuchenbecker


Mining Visual Knowledge from Large Pre-trained Models

Talk
  • 18 January 2024 • 15:00—16:00
  • Luming Tang
  • N3.022

Computer vision has made huge progress in the past decade under the dominant supervised learning paradigm, that is, training large-scale neural networks on each task with ever larger datasets. However, in many cases, scalable data or annotation collection is intractable. In contrast, humans can easily adapt to new vision tasks with very little data or few labels. To bridge this gap, we found that there actually exists rich visual knowledge in large pre-trained models, i.e., models trained on scalable internet images with either self-supervised or generative objectives. We proposed different techniques to extract this implicit knowledge and use it to accomplish specific downstream tasks where data is constrained, including recognition, dense prediction, and generation. Specifically, I will present the following three works. First, I will introduce an efficient and effective way to adapt pre-trained vision transformers to a variety of low-shot downstream tasks while tuning less than 1 percent of the model parameters. Second, I will show that accurate visual correspondences emerge from a strong generative model (i.e., diffusion models) without any supervision. Finally, I will demonstrate that an adapted diffusion model can complete a photo with true scene contents using only a few casually captured reference images.
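
One common recipe for this kind of parameter-efficient adaptation, sketched here under my own assumptions rather than as the speaker's method, is to freeze the pretrained backbone entirely and train only a few new parameters, such as learnable prompt tokens and a small task head:

```python
import torch
import torch.nn as nn

class PromptedClassifier(nn.Module):
    """Frozen backbone + learnable prompt tokens + linear head (hypothetical names)."""
    def __init__(self, backbone: nn.Module, embed_dim: int,
                 num_prompts: int, num_classes: int):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad = False                       # backbone stays frozen
        self.prompts = nn.Parameter(torch.zeros(num_prompts, embed_dim))
        nn.init.normal_(self.prompts, std=0.02)
        self.head = nn.Linear(embed_dim, num_classes)     # trained from scratch

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (B, N, D) patch embeddings produced by the frozen model
        b = tokens.shape[0]
        x = torch.cat([self.prompts.expand(b, -1, -1), tokens], dim=1)
        feats = self.backbone(x)                          # (B, num_prompts + N, D)
        return self.head(feats.mean(dim=1))               # pool, then classify
```

Only the prompts and head receive gradient updates, which keeps the trainable-parameter count tiny relative to the backbone.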

Organizers: Yuliang Xiu Yandong Wen


  • Dr. Janneke Schwaner
  • Hybrid - Webex plus in-person attendance in 5N18

Animals seem to effortlessly navigate complex terrain. This stands in stark contrast with even the most advanced robots, illustrating that navigating complex terrain is by no means trivial. The human neuromusculoskeletal system is equipped with two key mechanisms that allow us to recover from unexpected perturbations: muscle intrinsic properties and sensory-driven feedback control. We used unique in vivo and in situ approaches to explore how guinea fowl (Numida meleagris) integrate these two mechanisms to maintain robust locomotion. For example, our work showed modular task-level control of leg length and leg angular trajectory when the birds negotiate speed perturbations during walking, with different neuromechanical control and perturbation sensitivity in each actuation mode. We also discovered gait-specific control mechanisms in walking and running over obstacles. Additionally, by combining in vivo and in situ experimental approaches, we found that guinea fowl lateral gastrocnemius (LG) muscles do not operate at optimal muscle lengths during force production in walking and running, providing a safety factor against potential unexpected perturbations. Lastly, we will highlight work showing how kangaroo rats circumvent the mechanical limitations of skeletal muscle to jump, as well as how these animals overcome angular-momentum limitations when reorienting in midair during predator-escape leaps. Elucidating frameworks of function, adaptability, and individual variation across neuromuscular systems will provide a stepping stone for understanding fundamental muscle mechanics, sensory feedback, and neuromuscular health. Additionally, this research has the potential to reveal the functional significance of individual morphological, physiological, and neuromuscular variation in relation to locomotion. This knowledge can subsequently inform individualized rehabilitation approaches and treatments for neuromuscular conditions, such as stroke-related motor impairments, which require an integrated understanding of the dynamic interactions between musculoskeletal mechanics and sensorimotor control. This work also provides foundational knowledge for the development of dynamic assistive devices and robots that can navigate complex terrains.

Organizers: Katherine J. Kuchenbecker Andrew Schulz


  • Partha Ghosh
  • N3.022 Aquarium and Zoom

We present a novel unconditional video generative model designed to address long-term spatial and temporal dependencies. To capture these dependencies, our approach incorporates a hybrid explicit-implicit tri-plane representation inspired by 3D-aware generative frameworks developed for three-dimensional object representation, and it employs a single latent code to model an entire video sequence. Individual video frames are then synthesized from an intermediate tri-plane representation, which is itself derived from the primary latent code. This strategy reduces computational complexity by a factor of two, as measured in FLOPs. Consequently, our approach facilitates the efficient and temporally coherent generation of videos. Moreover, our joint frame modeling approach, in contrast to autoregressive methods, mitigates the generation of visual artifacts. We further enhance the model's capabilities by integrating an optical-flow-based module within our Generative Adversarial Network (GAN)-based generator architecture, thereby compensating for the constraints imposed by a smaller generator size. As a result, our model is capable of synthesizing high-fidelity video clips at a resolution of 256×256 pixels, with durations extending to more than 5 seconds at a frame rate of 30 fps. The efficacy and versatility of our approach are empirically validated through qualitative and quantitative assessments on three different datasets comprising both synthetic and real video clips.
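
To make the tri-plane idea concrete, here is a minimal sketch (illustrative shapes and names, not the authors' code) of how a space-time query point can gather features from three axis-aligned 2D feature planes before a decoder turns the fused feature into frame content:

```python
import torch
import torch.nn.functional as F

def triplane_features(planes: torch.Tensor, coords: torch.Tensor) -> torch.Tensor:
    # planes: (3, C, H, W) feature planes for the (x,y), (x,t), and (y,t) axes
    # coords: (N, 3) query points, with x, y, t each normalized to [-1, 1]
    projections = (coords[:, [0, 1]], coords[:, [0, 2]], coords[:, [1, 2]])
    feats = []
    for plane, uv in zip(planes, projections):
        grid = uv.view(1, -1, 1, 2)                               # (1, N, 1, 2)
        f = F.grid_sample(plane[None], grid, align_corners=True)  # (1, C, N, 1)
        feats.append(f.view(plane.shape[0], -1).t())              # (N, C)
    return feats[0] + feats[1] + feats[2]                         # fused per-point feature
```

Because the planes are explicit 2D grids, the lookup cost scales with the plane resolution rather than with a full space-time feature volume.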

Organizers: Yandong Wen


  • Weiyang Liu
  • N3.022 Aquarium and Zoom

Large foundation models are becoming ubiquitous, but training them from scratch is prohibitively expensive. Efficiently adapting these powerful models to downstream tasks is therefore increasingly important. In this work, we study a principled finetuning paradigm -- Orthogonal Finetuning (OFT) -- for downstream task adaptation. Despite demonstrating good generalizability, OFT still uses a fairly large number of trainable parameters due to the high dimensionality of orthogonal matrices. To address this, we start by examining OFT from an information-transmission perspective and then identify a few key desiderata that enable better parameter efficiency. Inspired by how the Cooley-Tukey fast Fourier transform algorithm enables efficient information transmission, we propose an efficient orthogonal parameterization using butterfly structures. Applying this parameterization to OFT yields a novel parameter-efficient finetuning method called Orthogonal Butterfly (BOFT). By subsuming OFT as a special case, BOFT introduces a generalized orthogonal finetuning framework. Finally, we conduct an extensive empirical study of adapting large vision transformers, large language models, and text-to-image diffusion models to various downstream tasks in computer vision and natural language processing. The results validate the effectiveness of BOFT as a generic finetuning method.
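
As a simplified sketch of the underlying idea, plain OFT (not the butterfly-factorized BOFT variant) multiplies a frozen pretrained weight by a learned orthogonal matrix; here the matrix is produced with a Cayley transform of a skew-symmetric parameter (illustrative code, not the authors'):

```python
import torch
import torch.nn as nn

class OrthogonallyFinetunedLinear(nn.Module):
    """Wrap a frozen pretrained linear layer; learn only an orthogonal rotation."""
    def __init__(self, pretrained: nn.Linear):
        super().__init__()
        d = pretrained.out_features
        self.register_buffer("W0", pretrained.weight.detach())  # frozen weight
        self.register_buffer("b0", None if pretrained.bias is None
                             else pretrained.bias.detach())     # frozen bias
        self.A = nn.Parameter(torch.zeros(d, d))                # only trainable tensor

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        S = self.A - self.A.t()                                  # skew-symmetric
        I = torch.eye(S.shape[0], device=x.device, dtype=x.dtype)
        Q = torch.linalg.solve(I + S, I - S)                     # Cayley map: Q is orthogonal
        out = x @ (Q @ self.W0).t()
        return out if self.b0 is None else out + self.b0
```

BOFT keeps this structure but replaces the single dense matrix with a product of sparse butterfly factors, which is where the parameter savings described in the abstract come from.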

Organizers: Yandong Wen


Project neuroArm: Image-guided Medical Robotics Program

Talk
  • 17 October 2023 • 14:00—15:00
  • Dr. Diego Ospina
  • Hybrid - Webex plus in-person attendance in 5N18

Project neuroArm was established in 2002 with the idea of building the world's first robot for brain surgery and stereotaxy. Since the launch (2007) and integration of the neuroArm robot into the neurosurgical operating room (May 2008), the project has continued to spawn new technological innovations, advance tele-robotics through sensors and AI, and develop intelligent surgical systems that improve the safety of surgery. This talk will provide a high-level overview of two such technologies that the team at Project neuroArm is currently developing and deploying: i) neuroArm+HD, a medical-grade sensory immersive workstation designed to enhance learning, performance, and safety in robot-assisted microsurgery and tele-operations; and ii) SmartForceps, sensorized surgical bipolar forceps for real-time recording, display, monitoring, and uploading of tool-tissue interaction forces during surgery.

Organizers: Katherine J. Kuchenbecker Rachael L'Orsa


Ghost on the Shell: An Expressive Representation of General 3D Shapes

Talk
  • 12 October 2023 • 10:00—11:00
  • Zhen Liu
  • Hybrid

The creation of photorealistic virtual worlds requires accurate modeling of the 3D surface geometry of a wide range of objects. Meshes are appealing for this purpose since they 1) enable fast physics-based rendering with realistic material and lighting, 2) support physical simulation, and 3) are memory-efficient for modern graphics pipelines. Recent work on reconstructing and statistically modeling 3D shapes, however, has critiqued meshes as being topologically inflexible. To capture a wide range of object shapes, a 3D representation must be able to model solid, watertight shapes as well as thin, open surfaces. Recent work has focused on the former, and methods for reconstructing open surfaces do not support fast reconstruction with material and lighting or unconditional generative modeling. Inspired by the observation that open surfaces can be seen as islands floating on watertight surfaces, we parametrize open surfaces by defining a manifold signed distance field on watertight templates. With this parametrization, we further develop a grid-based, differentiable representation that parametrizes both watertight and non-watertight meshes of arbitrary topology. Our new representation, called Ghost-on-the-Shell (G-Shell), enables two important applications: differentiable rasterization-based reconstruction from multiview images and generative modeling of non-watertight meshes. We empirically demonstrate that G-Shell achieves state-of-the-art performance on non-watertight mesh reconstruction and generation tasks, while also performing effectively for watertight meshes.
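
The "islands on a watertight surface" intuition can be illustrated with a toy discrete version (my sketch, not the G-Shell implementation), in which a scalar field defined on the template keeps only the faces inside the open region:

```python
import torch

def carve_open_surface(faces: torch.Tensor, msdf: torch.Tensor) -> torch.Tensor:
    # faces: (F, 3) vertex indices of the watertight template mesh
    # msdf:  (V,) per-vertex manifold signed distance values
    keep = (msdf[faces] > 0).all(dim=1)   # faces lying fully inside the open region
    return faces[keep]
```

G-Shell itself makes this carving differentiable and grid-based, so it can be driven by rasterization losses or a generative model.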

Organizers: Yandong Wen


Towards Seamless Handovers with Legged Manipulators

Talk
  • 10 October 2023 • 14:00—15:00
  • Andreea Tulbure
  • Hybrid - Webex plus in-person attendance in 5N18

Deploying perception and control modules for handovers is challenging because they require a high degree of robustness and generalizability to work reliably across a diversity of objects and situations, as well as adaptivity to adjust to individual preferences. On legged robots, deployment is particularly challenging because of the limited computational resources and the additional sensing noise resulting from locomotion. In this talk, I will discuss how we tackle some of these challenges: I will first introduce our perception framework and discuss insights from the first human-robot handover user study with legged manipulators. Furthermore, I will show how we combine imitation and reinforcement learning to achieve some degree of adaptivity during handovers. Finally, I will present our work in which the robot takes into account the post-handover task of the collaboration partner when handing over an object. This is beneficial in situations where the human's range of motion is constrained during the handover or where time is critical.

Organizers: Katherine J. Kuchenbecker


Gesture-Based Nonverbal Interaction for Exercise Robots

PhD Thesis Defense
  • 09 October 2023 • 13:30—14:30
  • Mayumi Mohan
  • Webex plus in-person attendance in N3.022 at the Tübingen site of MPI-IS

When teaching or coaching, humans augment their words with carefully timed hand gestures, head and body movements, and facial expressions to provide feedback to their students. Robots, however, rarely utilize these nuanced cues. A minimally supervised social robot equipped with these abilities could support people in exercise, physical therapy, and learning new activities. This thesis examines how the intuitive power of human gestures can be harnessed to enhance human-robot interaction. To address this question, this research explores gesture-based interactions to expand the capabilities of a socially assistive robotic exercise coach, investigating the perspectives of both novice users and exercise-therapy experts. This thesis begins by concentrating on the user's engagement with the robot, analyzing the feasibility of minimally supervised gesture-based interactions. This exploration seeks to establish a framework in which robots can interact with users in a more intuitive and responsive manner. The investigation then shifts its focus toward the professionals who are integral to the success of these innovative technologies: the exercise-therapy experts. Roboticists face the challenge of translating the knowledge of these experts into robotic interactions. We address this challenge by developing a teleoperation algorithm that enables exercise therapists to create customized gesture-based interactions for a robot. Thus, this thesis lays the groundwork for dynamic gesture-based interactions in minimally supervised environments, with implications not only for exercise-coach robots but also for broader applications in human-robot interaction.

Organizers: Mayumi Mohan Katherine J. Kuchenbecker