

2024


PuzzleAvatar: Assembling 3D Avatars from Personal Albums

Xiu, Y., Liu, Z., Tzionas, D., Black, M. J.

ACM Transactions on Graphics, 43(6), ACM, December 2024 (article) To be published

Abstract
Generating personalized 3D avatars is crucial for AR/VR. However, recent text-to-3D methods that generate avatars for celebrities or fictional characters struggle with everyday people. Methods for faithful reconstruction typically require full-body images in controlled settings. What if a user could just upload their personal "OOTD" (Outfit Of The Day) photo collection and get a faithful avatar in return? The challenge is that such casual photo collections contain diverse poses, challenging viewpoints, cropped views, and occlusion (albeit with a consistent outfit, accessories, and hairstyle). We address this novel "Album2Human" task by developing PuzzleAvatar, a novel model that generates a faithful 3D avatar (in a canonical pose) from a personal OOTD album, while bypassing the challenging estimation of body and camera pose. To this end, we fine-tune a foundational vision-language model (VLM) on such photos, encoding the appearance, identity, garments, hairstyles, and accessories of a person into (separate) learned tokens and instilling these cues into the VLM. In effect, we exploit the learned tokens as "puzzle pieces" from which we assemble a faithful, personalized 3D avatar. Importantly, we can customize avatars by simply interchanging tokens. As a benchmark for this new task, we collect a new dataset, called PuzzleIOI, with 41 subjects in a total of nearly 1K OOTD configurations, in challenging partial photos with paired ground-truth 3D bodies. Our evaluation shows that PuzzleAvatar not only achieves high reconstruction accuracy, outperforming TeCH and MVDreamBooth, but also offers unique scalability to album photos and strong robustness. Our code and data are publicly available for research purposes.
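The token-as-puzzle-piece customization described above can be sketched as prompt assembly from per-attribute placeholder tokens. Note that the token names and prompt template below are invented for illustration; in PuzzleAvatar these are learned embeddings inside a fine-tuned VLM, not strings.

```python
# Hypothetical sketch: per-attribute "puzzle piece" tokens assembled into a
# personalization prompt. Swapping a token customizes the avatar.
tokens = {
    "identity": "<asset0>",   # invented placeholder names
    "hair": "<asset1>",
    "top": "<asset2>",
    "bottom": "<asset3>",
    "shoes": "<asset4>",
}

def build_prompt(tokens):
    # Assemble one text prompt from the separate learned tokens.
    return ("a photo of a person with {hair} hair, wearing {top}, "
            "{bottom} and {shoes}, face of {identity}").format(**tokens)

prompt = build_prompt(tokens)
# Customization = interchanging tokens, e.g. borrowing another subject's top:
tokens_b = dict(tokens, top="<asset7>")
print(build_prompt(tokens_b))
```

The real system conditions 3D avatar generation on such token combinations rather than producing text.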

ps

Page Code Video DOI [BibTex]



StableNormal: Reducing Diffusion Variance for Stable and Sharp Normal

Ye, C., Qiu, L., Gu, X., Zuo, Q., Wu, Y., Dong, Z., Bo, L., Xiu, Y., Han, X.

ACM Transactions on Graphics, 43(6), ACM, December 2024 (article) To be published

Abstract
This work addresses the challenge of high-quality surface normal estimation from monocular colored inputs (i.e., images and videos), a field that has recently been revolutionized by repurposing diffusion priors. However, previous attempts still struggle with stochastic inference, which conflicts with the deterministic nature of the Image2Normal task, and with a costly ensembling step that slows down estimation. Our method, StableNormal, mitigates the stochasticity of the diffusion process by reducing inference variance, thus producing "Stable-and-Sharp" normal estimates without any additional ensembling. StableNormal works robustly under challenging imaging conditions, such as extreme lighting, blurring, and low quality. It is also robust against transparent and reflective surfaces, as well as cluttered scenes with numerous objects. Specifically, StableNormal employs a coarse-to-fine strategy, which starts with a one-step normal estimator (YOSO) to derive an initial normal guess that is relatively coarse but reliable, followed by a semantic-guided refinement process (SG-DRN) that refines the normals to recover geometric details. The effectiveness of StableNormal is demonstrated through competitive performance on standard datasets such as DIODE-indoor, iBims, ScanNetV2, and NYUv2, and in various downstream tasks, such as surface reconstruction and normal enhancement. These results show that StableNormal retains both the "stability" and "sharpness" needed for accurate normal estimation. StableNormal represents an early attempt to repurpose diffusion priors for deterministic estimation. To democratize this, our code and models are publicly available.

ps

Page Huggingface Demo Code Video DOI [BibTex]



Reinforcement learning in cold atom experiments

Reinschmidt, M., Fortágh, J., Günther, A., Volchkov, V.

Nature Communications, 15:8532, October 2024 (article)

Abstract
Cold atom traps are at the heart of many quantum applications in science and technology. The preparation and control of atomic clouds involves complex optimization processes that could be supported and accelerated by machine learning. In this work, we introduce reinforcement learning to cold atom experiments and demonstrate a flexible and adaptive approach to control a magneto-optical trap. Instead of following a set of predetermined rules to accomplish a specific task, the objectives are defined by a reward function. This approach not only optimizes the cooling of atoms just as an experimentalist would do, but also enables new operational modes such as the preparation of pre-defined numbers of atoms in a cloud. The machine control is trained to be robust against external perturbations and able to react to situations not seen during the training. Finally, we show that the time-consuming training can be performed in silico using a generic simulation, and demonstrate successful transfer to the real-world experiment.
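The reward-driven control idea in this abstract can be illustrated with a minimal tabular Q-learning toy, in which an agent tunes a single hypothetical trap setting to maximize a simulated atom count. All numbers and the one-parameter "trap" are invented for illustration; the real experiment optimizes many continuous parameters of a magneto-optical trap.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a trap: one discrete "detuning" setting (0..9) determines
# the trapped atom number, with setting 6 optimal. Invented numbers.
def atom_count(setting):
    return 100 - 10 * abs(setting - 6)

n_settings, n_actions = 10, 3          # actions: decrease / stay / increase
Q = np.zeros((n_settings, n_actions))
alpha, gamma, eps = 0.5, 0.9, 0.2      # learning rate, discount, exploration

for _ in range(500):                   # training episodes
    s = int(rng.integers(n_settings))
    for _ in range(20):
        # epsilon-greedy action selection
        a = int(rng.integers(n_actions)) if rng.random() < eps else int(np.argmax(Q[s]))
        s_next = int(np.clip(s + (a - 1), 0, n_settings - 1))
        r = atom_count(s_next)         # the reward IS the atom number
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

# The learned greedy policy should steer any starting setting to the optimum.
s = 0
for _ in range(15):
    s = int(np.clip(s + (int(np.argmax(Q[s])) - 1), 0, n_settings - 1))
print(s)
```

The point of the sketch is the abstract's core claim: no rule for "good cooling" is hard-coded; the behavior emerges from maximizing the reward.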

OS Lab

link (url) DOI [BibTex]



Hexagonal electrohydraulic modules for rapidly reconfigurable high-speed robots

Yoder, Z., Rumley, E., Schmidt, I., Rothemund, P., Keplinger, C.

Science Robotics, 9, September 2024 (article)

Abstract
Robots made from reconfigurable modular units feature versatility, cost efficiency, and improved sustainability compared with fixed designs. Reconfigurable modules driven by soft actuators provide adaptable actuation, safe interaction, and wide design freedom, but existing soft modules would benefit from high-speed and high-strain actuation, as well as driving methods well-suited to untethered operation. Here, we introduce a class of electrically actuated robotic modules that provide high-speed (a peak contractile strain rate of 4618% per second, 15.8-hertz bandwidth, and a peak specific power of 122 watts per kilogram), high-strain (49% contraction) actuation and that use magnets for reversible mechanical and electrical connections between neighboring modules, thereby serving as building blocks for rapidly reconfigurable and highly agile robotic systems. The actuation performance of each hexagonal electrohydraulic (HEXEL) module is enabled by a synergistic combination of soft and rigid components; a hexagonal exoskeleton of rigid plates amplifies the motion produced by soft electrohydraulic actuators and provides a mechanical structure and connection platform for reconfigurable robots composed of many modules. We characterize the actuation performance of individual HEXEL modules, present a model that captures their quasi-static force-stroke behavior, and demonstrate both a high-jumping and a fast pipe-crawling robot. Using embedded magnetic connections, we arranged multiple modules into reconfigurable robots with diverse functionality, including a high-stroke muscle, a multimodal active array, a table-top active platform, and a fast-rolling robot. We further leveraged the magnetic connections for hosting untethered, snap-on driving electronics, together highlighting the promise of HEXEL modules for creating rapidly reconfigurable high-speed robots.

rm

link (url) DOI [BibTex]


Fiber-Optic Shape Sensing Using Neural Networks Operating on Multispecklegrams

Cao, C. G. L., Javot, B., Bhattarai, S., Bierig, K., Oreshnikov, I., Volchkov, V. V.

IEEE Sensors Journal, 24(17):27532-27540, September 2024 (article)

Abstract
Application of machine learning techniques on fiber speckle images to infer fiber deformation allows the use of an unmodified multimode fiber to act as a shape sensor. This approach eliminates the need for complex fiber design or construction (e.g., Bragg gratings and time-of-flight). Prior work in shape determination using neural networks trained on a finite number of possible fiber shapes (formulated as a classification task), or trained on a few continuous degrees of freedom, has been limited to reconstruction of fiber shapes only one bend at a time. Furthermore, generalization to shapes that were not used in training is challenging. Our innovative approach improves generalization capabilities, using computer vision-assisted parameterization of the actual fiber shape to provide a ground truth, and multiple specklegrams per fiber shape obtained by controlling the input field. Results from experimenting with several neural network architectures, shape parameterization, number of inputs, and specklegram resolution show that fiber shapes with multiple bends can be accurately predicted. Our approach is able to generalize to new shapes that were not in the training set. This approach of end-to-end training on parameterized ground truth opens new avenues for fiber-optic sensor applications. We publish the datasets used for training and validation, as well as an out-of-distribution (OOD) test set, and encourage interested readers to access these datasets for their own model development.
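As a rough illustration of regressing shape parameters directly from specklegrams, here is a minimal ridge-regression baseline on synthetic data. The mapping, dimensions, and noise level are invented stand-ins; the paper trains neural networks on real specklegrams with computer-vision-derived ground truth.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in: each fiber shape has 3 bend parameters, and each
# "specklegram" is a 256-pixel intensity vector depending on them through
# a mildly nonlinear map (invented -- real specklegrams are far richer).
n_train, n_test, n_px, n_params = 400, 50, 256, 3
W_true = 0.3 * rng.normal(size=(n_params, n_px))

def specklegram(params):
    return np.sin(params @ W_true) + 0.01 * rng.normal(size=n_px)

P_train = rng.uniform(-1, 1, size=(n_train, n_params))
X_train = np.array([specklegram(p) for p in P_train])
P_test = rng.uniform(-1, 1, size=(n_test, n_params))
X_test = np.array([specklegram(p) for p in P_test])

# Ridge regression from pixels to bend parameters (closed form).
lam = 1.0
A = X_train.T @ X_train + lam * np.eye(n_px)
B = np.linalg.solve(A, X_train.T @ P_train)   # maps pixels -> parameters

rmse = np.sqrt(np.mean((X_test @ B - P_test) ** 2))
print(round(float(rmse), 3))
```

Held-out shapes are predicted directly from pixel intensities, mirroring the end-to-end training on parameterized ground truth that the paper advocates.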

hi ei OS Lab zwe-sw

DOI [BibTex]


Localization and recognition of human action in 3D using transformers

Sun, J., Huang, L., Hongsong Wang, C. Z. J. Q., Islam, M. T., Xie, E., Zhou, B., Xing, L., Chandrasekaran, A., Black, M. J.

Nature Communications Engineering, 13(125), September 2024 (article)

Abstract
Understanding a person's behavior from their 3D motion sequence is a fundamental problem in computer vision with many applications. An important component of this problem is 3D action localization, which involves recognizing what actions a person is performing and when those actions occur in the sequence. To promote progress in the 3D action localization community, we introduce a new, challenging, and more complex benchmark dataset, BABEL-TAL (BT), for 3D action localization. Important baselines and evaluation metrics, as well as human evaluations, are carefully established on this benchmark. We also propose a strong baseline model, Localizing Actions with Transformers (LocATe), that jointly localizes and recognizes actions in a 3D sequence. LocATe shows superior performance on BABEL-TAL as well as on the large-scale PKU-MMD dataset, achieving state-of-the-art performance by using only 10% of the labeled training data. Our research could advance the development of more accurate and efficient systems for human behavior analysis, with potential applications in areas such as human-computer interaction and healthcare.

ps

paper DOI [BibTex]



Cutaneous Electrohydraulic (CUTE) Wearable Devices for Pleasant Broad-Bandwidth Haptic Cues

Sanchez-Tamayo, N., Yoder, Z., Rothemund, P., Ballardini, G., Keplinger, C., Kuchenbecker, K. J.

Advanced Science, (2402461):1-14, September 2024 (article)

Abstract
By focusing on vibrations, current wearable haptic devices underutilize the skin's perceptual capabilities. Devices that provide richer haptic stimuli, including contact feedback and/or variable pressure, are typically heavy and bulky due to the underlying actuator technology and the low sensitivity of hairy skin, which covers most of the body. This paper presents a system architecture for compact wearable devices that deliver salient and pleasant broad-bandwidth haptic cues: Cutaneous Electrohydraulic (CUTE) devices combine a custom materials design for soft haptic electrohydraulic actuators that feature high stroke, high force, and electrical safety with a comfortable mounting strategy that places the actuator in a non-contact resting position. A prototypical wrist-wearable CUTE device produces rich tactile sensations by making and breaking contact with the skin (2.44 mm actuation stroke), applying high controllable forces (exceeding 2.3 N), and delivering vibrations at a wide range of amplitudes and frequencies (0-200 Hz). A perceptual study with fourteen participants achieved 97.9% recognition accuracy across six diverse cues and verified their pleasant and expressive feel. This system architecture for wearable devices gives unprecedented control over the haptic cues delivered to the skin, providing an elegant and discreet way to activate the user's sense of touch.

hi rm

DOI [BibTex]


Electrohydraulic Musculoskeletal Robotic Leg for Agile, Adaptive, yet Energy-Efficient Locomotion

Buchner, T. J. K., Fukushima, T., Kazemipour, A., Gravert, S., Prairie, M., Romanescu, P., Arm, P., Zhang, Y., Wang, X., Zhang, S. L., Walter, J., Keplinger, C., Katzschmann, R. K.

Nature Communications, 15(1), September 2024 (article)

Abstract
Robotic locomotion in unstructured terrain demands an agile, adaptive, and energy-efficient architecture. To traverse such terrains, legged robots use rigid electromagnetic motors and sensorized drivetrains to adapt to the environment actively. These systems struggle to compete with animals that excel through their agile and effortless motion in natural environments. We propose a bio-inspired musculoskeletal leg architecture driven by antagonistic pairs of electrohydraulic artificial muscles. Our leg is mounted on a boom arm and can adaptively hop on varying terrain in an energy-efficient yet agile manner. It can also detect obstacles through capacitive self-sensing. The leg performs powerful and agile gait motions beyond 5 Hz and high jumps up to 40% of the leg height. Our leg's tunable stiffness and inherent adaptability allow it to hop over grass, sand, gravel, pebbles, and large rocks using only open-loop force control. The electrohydraulic leg features a low cost of transport (0.73), and while squatting, it consumes only a fraction of the energy (1.2%) of its conventional electromagnetic counterpart. Its agile, adaptive, and energy-efficient properties point toward a new class of musculoskeletal robots for versatile locomotion and operation in unstructured natural environments.

rm

Press release Video (overview) Video (technical description) Article in pdf link (url) DOI [BibTex]



Building Instructions You Can Feel: Edge-Changing Haptic Devices for Digitally Guided Construction

Tashiro, N., Faulkner, R., Melnyk, S., Rodriguez, T. R., Javot, B., Tahouni, Y., Cheng, T., Wood, D., Menges, A., Kuchenbecker, K. J.

ACM Transactions on Computer-Human Interaction, September 2024 (article) Accepted

Abstract
Recent efforts to connect builders to digital designs during construction have primarily focused on visual augmented reality, which requires accurate registration and specific lighting, and which could prevent a user from noticing safety hazards. Haptic interfaces, on the other hand, can convey physical design parameters through tangible local cues that don't distract from the surroundings. We propose two edge-changing haptic devices that use small inertial measurement units (IMUs) and linear actuators to guide users to perform construction tasks in real time: Drangle gives feedback for angling a drill relative to gravity, and Brangle assists with orienting bricks in the plane. We conducted a study with 18 participants to evaluate user performance and gather qualitative feedback. All users understood the edge-changing cues from both devices with minimal training. Drilling holes with Drangle was somewhat less accurate but much faster and easier than with a mechanical guide; 89% of participants preferred Drangle over the mechanical guide. Users generally understood Brangle's feedback but found its hand-size-specific grip, palmar contact, and attractive tactile cues less intuitive than Drangle's generalized form factor, fingertip contact, and repulsive cues. After summarizing design considerations, we propose application scenarios and speculate how such devices could improve construction workflows.

hi

[BibTex]



EarthRanger: An Open-Source Platform for Ecosystem Monitoring, Research, and Management

Wall, J., Lefcourt, J., Jones, C., Doehring, C., O’Neill, D., Schneider, D., Steward, J., Krautwurst, J., Wong, T., Jones, B., Goodfellow, K., Schmitt, T., Gobush, K., Douglas-Hamilton, I., Pope, F., Schmidt, E., Palmer, J., Stokes, E., Reid, A., Elbroch, M. L., Kulits, P., Villeneuve, C., Matsanza, V., Clinning, G., Oort, J. V., Denninger-Snyder, K., Daati, A. P., Gold, W., Cunliffe, S., Craig, B., Cork, B., Burden, G., Goss, M., Hahn, N., Carroll, S., Gitonga, E., Rao, R., Stabach, J., Broin, F. D., Omondi, P., Wittemyer, G.

Methods in Ecology and Evolution, 13, British Ecological Society, September 2024 (article)

ps

DOI [BibTex]



A Probabilistic Model behind Self-Supervised Learning

Bizeul, A., Schölkopf, B., Allen, C.

Transactions on Machine Learning Research, September 2024 (article) To be published

ei

PDF [BibTex]



The Fairness-Quality Trade-off in Clustering

Hakim, R., Stoica, A., Papadimitriou, C. H., Yannakakis, M.

arXiv preprint arXiv:2408.10002, September 2024 (article)

sf

link (url) [BibTex]



Augmenting Robot-Assisted Pattern Cutting With Periodic Perturbations – Can We Make Dry Lab Training More Realistic?

Sharon, Y., Nevo, T., Naftalovich, D., Bahar, L., Refaely, Y., Nisky, I.

IEEE Transactions on Biomedical Engineering, August 2024 (article)

Abstract
Objective: Teleoperated robot-assisted minimally invasive surgery (RAMIS) offers many advantages over open surgery, but RAMIS training still requires optimization. Existing motor learning theories could improve RAMIS training. However, there is a gap between current knowledge, based on simple movements, and the training approaches required for the more complicated work of RAMIS surgeons. Here, we studied how surgeons cope with time-dependent perturbations. Methods: We used the da Vinci Research Kit and investigated the effect of time-dependent force and motion perturbations on learning a circular pattern-cutting surgical task. Fifty-four participants were assigned to two experiments, with two groups in each: a control group trained without perturbations and an experimental group trained with 1 Hz perturbations. In the first experiment, force perturbations alternately pushed participants' hands inwards and outwards in the radial direction. In the second experiment, the perturbation constituted a periodic up-and-down motion of the task platform. Results: Participants trained with perturbations learned how to overcome them and improved their performance during training without impairing it after the perturbations were removed. Moreover, training with motion perturbations gave participants an advantage when encountering the same or other perturbations after training, compared to training without perturbations. Conclusion: Periodic perturbations can enhance RAMIS training without impeding the learning of the perturbed task. Significance: Our results demonstrate that using challenging training tasks that include perturbations can better prepare surgical trainees for the dynamic environment they will face with patients in the operating room.

hi

DOI [BibTex]



Re-Thinking Inverse Graphics with Large Language Models

Kulits, P., Feng, H., Liu, W., Abrevaya, V., Black, M. J.

Transactions on Machine Learning Research, August 2024 (article)

Abstract
Inverse graphics -- the task of inverting an image into physical variables that, when rendered, enable reproduction of the observed scene -- is a fundamental challenge in computer vision and graphics. Successfully disentangling an image into its constituent elements, such as the shape, color, and material properties of the objects of the 3D scene that produced it, requires a comprehensive understanding of the environment. This complexity limits the ability of existing carefully engineered approaches to generalize across domains. Inspired by the zero-shot ability of large language models (LLMs) to generalize to novel contexts, we investigate the possibility of leveraging the broad world knowledge encoded in such models to solve inverse-graphics problems. To this end, we propose the Inverse-Graphics Large Language Model (IG-LLM), an inverse-graphics framework centered around an LLM, that autoregressively decodes a visual embedding into a structured, compositional 3D-scene representation. We incorporate a frozen pre-trained visual encoder and a continuous numeric head to enable end-to-end training. Through our investigation, we demonstrate the potential of LLMs to facilitate inverse graphics through next-token prediction, without the application of image-space supervision. Our analysis enables new possibilities for precise spatial reasoning about images that exploit the visual knowledge of LLMs. We release our code and data at https://ig-llm.is.tue.mpg.de/ to ensure the reproducibility of our investigation and to facilitate future research.

ps

link (url) [BibTex]



Leveraging Task Structures for Improved Identifiability in Neural Network Representations

Chen*, W., Horwood*, J., Heo, J., Hernández-Lobato, J. M.

Transactions on Machine Learning Research, August 2024, *equal contribution (article)

ei

link (url) [BibTex]



Multimodal Multi-User Surface Recognition with the Kernel Two-Sample Test

Khojasteh, B., Solowjow, F., Trimpe, S., Kuchenbecker, K. J.

IEEE Transactions on Automation Science and Engineering, 21(3):4432-4447, July 2024 (article)

Abstract
Machine learning and deep learning have been used extensively to classify physical surfaces through images and time-series contact data. However, these methods rely on human expertise and entail the time-consuming processes of data and parameter tuning. To overcome these challenges, we propose an easily implemented framework that can directly handle heterogeneous data sources for classification tasks. Our data-versus-data approach automatically quantifies distinctive differences in distributions in a high-dimensional space via kernel two-sample testing between two sets extracted from multimodal data (e.g., images, sounds, haptic signals). We demonstrate the effectiveness of our technique by benchmarking against expertly engineered classifiers for visual-audio-haptic surface recognition due to the industrial relevance, difficulty, and competitive baselines of this application; ablation studies confirm the utility of key components of our pipeline. As shown in our open-source code, we achieve 97.2% accuracy on a standard multi-user dataset with 108 surface classes, outperforming the state-of-the-art machine-learning algorithm by 6% on a more difficult version of the task. The fact that our classifier obtains this performance with minimal data processing in the standard algorithm setting reinforces the powerful nature of kernel methods for learning to recognize complex patterns. Note to Practitioners—We demonstrate how to apply the kernel two-sample test to a surface-recognition task, discuss opportunities for improvement, and explain how to use this framework for other classification problems with similar properties. Automating surface recognition could benefit both surface inspection and robot manipulation. Our algorithm quantifies class similarity and therefore outputs an ordered list of similar surfaces. This technique is well suited for quality assurance and documentation of newly received materials or newly manufactured parts. 
More generally, our automated classification pipeline can handle heterogeneous data sources including images and high-frequency time-series measurements of vibrations, forces and other physical signals. As our approach circumvents the time-consuming process of feature engineering, both experts and non-experts can use it to achieve high-accuracy classification. It is particularly appealing for new problems without existing models and heuristics. In addition to strong theoretical properties, the algorithm is straightforward to use in practice since it requires only kernel evaluations. Its transparent architecture can provide fast insights into the given use case under different sensing combinations without costly optimization. Practitioners can also use our procedure to obtain the minimum data-acquisition time for independent time-series data from new sensor recordings.
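The data-versus-data idea at the heart of this framework can be sketched with a kernel two-sample test: compute a biased squared maximum mean discrepancy (MMD) between two sample sets and assess significance with a permutation test. The Gaussian kernel, bandwidth, and synthetic 5-D "surface features" below are illustrative choices, not the paper's actual pipeline or data.

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_kernel(X, Y, sigma):
    # Pairwise Gaussian kernel matrix between rows of X and rows of Y.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma**2))

def mmd2(X, Y, sigma):
    # Biased estimate of the squared maximum mean discrepancy.
    return (gaussian_kernel(X, X, sigma).mean()
            + gaussian_kernel(Y, Y, sigma).mean()
            - 2.0 * gaussian_kernel(X, Y, sigma).mean())

# Two "surfaces": 5-D feature vectors from slightly different distributions.
n = 100
X = rng.normal(0.0, 1.0, size=(n, 5))
Y = rng.normal(0.5, 1.0, size=(n, 5))
sigma = 1.0

stat = mmd2(X, Y, sigma)

# Permutation test: shuffle the pooled samples to estimate the null.
pooled = np.vstack([X, Y])
null = []
for _ in range(200):
    idx = rng.permutation(2 * n)
    null.append(mmd2(pooled[idx[:n]], pooled[idx[n:]], sigma))
p_value = (np.sum(np.array(null) >= stat) + 1) / (len(null) + 1)
print(p_value)
```

Because the test statistic quantifies distributional distance directly, the same machinery extends to heterogeneous modalities (images, sounds, haptic signals) by concatenating or comparing their respective kernels.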

hi

DOI Project Page [BibTex]



Deep Backtracking Counterfactuals for Causally Compliant Explanations

Kladny, K., Kügelgen, J. V., Schölkopf, B., Muehlebach, M.

Transactions on Machine Learning Research, July 2024 (article)

ei lds

arXiv link (url) [BibTex]



Fingertip Dynamic Response Simulated Across Excitation Points and Frequencies

Serhat, G., Kuchenbecker, K. J.

Biomechanics and Modeling in Mechanobiology, 23, pages: 1369-1376, May 2024 (article)

Abstract
Predicting how the fingertip will mechanically respond to different stimuli can help explain human haptic perception and enable improvements to actuation approaches such as ultrasonic mid-air haptics. This study addresses this goal using high-fidelity 3D finite element analyses. We compute the deformation profiles and amplitudes caused by harmonic forces applied in the normal direction at four locations: the center of the finger pad, the side of the finger, the tip of the finger, and the oblique midpoint of these three sites. The excitation frequency is swept from 2.5 to 260 Hz. The simulated frequency response functions (FRFs) obtained for displacement demonstrate that the relative magnitudes of the deformations elicited by stimulating at each of these four locations greatly depends on whether only the excitation point or the entire finger is considered. The point force that induces the smallest local deformation can even cause the largest overall deformation at certain frequency intervals. Above 225 Hz, oblique excitation produces larger mean displacement amplitudes than the other three forces due to excitation of multiple modes involving diagonal deformation. These simulation results give novel insights into the combined influence of excitation location and frequency on the fingertip dynamic response, potentially facilitating the design of future vibration feedback devices.
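The notion of a displacement frequency response function over the study's 2.5-260 Hz sweep can be illustrated with a single-degree-of-freedom damped oscillator. The parameter values below are invented for illustration only; the paper uses a high-fidelity 3D finite-element fingertip model with many coupled modes.

```python
import numpy as np

# Unit-force displacement FRF of a damped mass-spring oscillator:
# |X(w)| = 1 / sqrt((k - m w^2)^2 + (c w)^2). Invented parameter values.
m, k, zeta = 0.005, 2000.0, 0.1        # mass (kg), stiffness (N/m), damping ratio
c = 2.0 * zeta * np.sqrt(k * m)        # damping coefficient (N s/m)

f = np.linspace(2.5, 260.0, 1000)      # frequency sweep (Hz), as in the study
w = 2.0 * np.pi * f
amp = 1.0 / np.sqrt((k - m * w**2) ** 2 + (c * w) ** 2)

f_res = f[np.argmax(amp)]              # frequency of the FRF peak
print(round(float(f_res), 1))
```

For this single mode the peak sits near the natural frequency sqrt(k/m)/(2*pi); the paper's finding that different excitation points reorder the response magnitudes reflects the interplay of many such modes in the full 3D model.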

hi

DOI Project Page [BibTex]



Closing the Loop in Minimally Supervised Human-Robot Interaction: Formative and Summative Feedback

Mohan, M., Nunez, C. M., Kuchenbecker, K. J.

Scientific Reports, 14(10564):1-18, May 2024 (article)

Abstract
Human instructors fluidly communicate with hand gestures, head and body movements, and facial expressions, but robots rarely leverage these complementary cues. A minimally supervised social robot with such skills could help people exercise and learn new activities. Thus, we investigated how nonverbal feedback from a humanoid robot affects human behavior. Inspired by the education literature, we evaluated formative feedback (real-time corrections) and summative feedback (post-task scores) for three distinct tasks: positioning in the room, mimicking the robot's arm pose, and contacting the robot's hands. Twenty-eight adults completed seventy-five 30-second-long trials with no explicit instructions or experimenter help. Motion-capture data analysis shows that both formative and summative feedback from the robot significantly aided user performance. Additionally, formative feedback improved task understanding. These results show the power of nonverbal cues based on human movement and the utility of viewing feedback through formative and summative lenses.

hi

DOI Project Page [BibTex]


Grundfragen der künstlichen Intelligenz [Fundamental Questions of Artificial Intelligence]

Schölkopf, B.

astronomie - Das Magazin, 42, May 2024 (article)

ei

link (url) [BibTex]



Exploring Weight Bias and Negative Self-Evaluation in Patients with Mood Disorders: Insights from the BodyTalk Project

Meneguzzo, P., Behrens, S. C., Pavan, C., Toffanin, T., Quiros-Ramirez, M. A., Black, M. J., Giel, K., Tenconi, E., Favaro, A.

Frontiers in Psychiatry, 15, Sec. Psychopathology, May 2024 (article)

Abstract
Background: Negative body image and adverse body self-evaluation represent key psychological constructs within the realm of weight bias (WB), potentially intertwined with the negative self-evaluation characteristic of depressive symptomatology. Although WB encapsulates an implicit form of self-critical assessment, its exploration among people with mood disorders (MD) has been under-investigated. Our primary goal is to comprehensively assess both explicit and implicit WB, seeking to reveal specific dimensions that could interconnect with the symptoms of MDs. Methods: A cohort comprising 25 MD patients and 35 demographically matched healthy peers (with 83% female representation) participated in a series of tasks designed to evaluate the congruence between various computer-generated body representations and a spectrum of descriptive adjectives. Our analysis delved into multiple facets of body image evaluation, scrutinizing the associations between different body sizes and emotionally charged adjectives (e.g., active, apple-shaped, attractive). Results: No discernible differences emerged concerning body dissatisfaction or the correspondence of different body sizes with varying adjectives. Interestingly, MD patients exhibited a markedly higher tendency to overestimate their body weight (p = 0.011). Explicit WB did not show significant variance between the two groups, but MD participants demonstrated a notable implicit WB within a specific weight rating task for BMI between 18.5 and 25 kg/m2 (p = 0.012). Conclusions: Despite the striking similarities in the assessment of participants’ body weight, our investigation revealed an implicit WB among individuals grappling with MD. This bias potentially assumes a role in fostering self-directed negative evaluations, shedding light on a previously unexplored facet of the interplay between WB and mood disorders.

ps

paper link (url) DOI [BibTex]



The Poses for Equine Research Dataset (PFERD)

Li, C., Mellbin, Y., Krogager, J., Polikovsky, S., Holmberg, M., Ghorbani, N., Black, M. J., Kjellström, H., Zuffi, S., Hernlund, E.

Nature Scientific Data, 11, May 2024 (article)

Abstract
Studies of quadruped animal motion help us to identify diseases, understand behavior, and unravel the mechanics behind gaits in animals. The horse is likely the best-studied animal in this respect, but data capture is challenging and time-consuming. Computer vision techniques improve animal motion extraction, but their development relies on reference datasets, which are scarce, not open-access, and often provide data from only a few anatomical landmarks. Addressing this data gap, we introduce PFERD, a video and 3D marker motion dataset from horses using a full-body set-up of over 100 densely placed skin-attached markers and synchronized videos from ten camera angles. Five horses of diverse conformations provide data for various motions, from basic poses (e.g., walking, trotting) to advanced motions (e.g., rearing, kicking). We further express the 3D motions with current techniques and a 3D parameterized model, the hSMAL model, establishing a baseline for markerless 3D horse motion capture. PFERD enables advanced biomechanical studies and provides a resource of ground-truth data for the methodological development of markerless motion capture.

ps

paper [BibTex]


AiroTouch: Enhancing Telerobotic Assembly through Naturalistic Haptic Feedback of Tool Vibrations

Gong, Y., Mat Husin, H., Erol, E., Ortenzi, V., Kuchenbecker, K. J.

Frontiers in Robotics and AI, 11(1355205):1-15, May 2024 (article)

Abstract
Teleoperation allows workers to safely control powerful construction machines; however, its primary reliance on visual feedback limits the operator's efficiency in situations with stiff contact or poor visibility, hindering its use for assembly of pre-fabricated building components. Reliable, economical, and easy-to-implement haptic feedback could fill this perception gap and facilitate the broader use of robots in construction and other application areas. Thus, we adapted widely available commercial audio equipment to create AiroTouch, a naturalistic haptic feedback system that measures the vibration experienced by each robot tool and enables the operator to feel a scaled version of this vibration in real time. Accurate haptic transmission was achieved by optimizing the positions of the system's off-the-shelf accelerometers and voice-coil actuators. A study was conducted to evaluate how adding this naturalistic type of vibrotactile feedback affects the operator during telerobotic assembly. Thirty participants used a bimanual dexterous teleoperation system (Intuitive da Vinci Si) to build a small rigid structure under three randomly ordered haptic feedback conditions: no vibrations, one-axis vibrations, and summed three-axis vibrations. The results show that users took advantage of both tested versions of the naturalistic haptic feedback after gaining some experience with the task, causing significantly lower vibrations and forces in the second trial. Subjective responses indicate that haptic feedback increased the realism of the interaction and reduced the perceived task duration, task difficulty, and fatigue. As hypothesized, higher haptic feedback gains were chosen by users with larger hands and for the smaller sensed vibrations in the one-axis condition. These results elucidate important details for effective implementation of naturalistic vibrotactile feedback and demonstrate that our accessible audio-based approach could enhance user performance and experience during telerobotic assembly in construction and other application domains.

hi

DOI Project Page [BibTex]


VIPurPCA: Visualizing and Propagating Uncertainty in Principal Component Analysis

Zabel, S., Hennig, P., Nieselt, K.

IEEE Transactions on Visualization and Computer Graphics, 30(4):2011-2022, April 2024 (article)

ei

DOI [BibTex]


Integration of Generative AI in the Digital Markets Act: Contestability and Fairness from a Cross-Disciplinary Perspective

Yasar, A. G., Chong, A., Dong, E., Gilbert, T., Hladikova, S., Mougan, C., Shen, X., Singh, S., Stoica, A., Thais, S.

LSE Legal Studies Working Paper, March 2024 (article)

Abstract
The EU’s Digital Markets Act (DMA) aims to address the lack of contestability and unfair practices in digital markets. But the current framework of the DMA does not adequately cover the rapid advance of generative AI. As the EU adopts AI-specific rules and considers possible amendments to the DMA, this paper suggests that generative AI should be added to the DMA’s list of core platform services. This amendment is the first necessary step to address the emergence of entrenched and durable positions in the generative AI industry.

sf

link (url) [BibTex]


Modeling Fatigue in Manual and Robot-Assisted Work for Operator 5.0

Allemang–Trivalle, A., Donjat, J., Bechu, G., Coppin, G., Chollet, M., Klaproth, O. W., Mitschke, A., Schirrmann, A., Cao, C. G. L.

IISE Transactions on Occupational Ergonomics and Human Factors, 12(1-2):135-147, March 2024 (article)

hi

DOI [BibTex]


Learning Graph Embeddings for Open World Compositional Zero-Shot Learning

Mancini, M., Naeem, M. F., Xian, Y., Akata, Z.

IEEE Transactions on Pattern Analysis and Machine Intelligence, 46(3):1545-1560, IEEE, New York, NY, March 2024 (article)

ei

DOI [BibTex]


A mathematical principle for the gamification of behavior change

Lieder, F., Chen, P., Prentice, M., Amo, V., Tošić, M.

JMIR Serious Games, 12, JMIR Publications, March 2024 (article)

Abstract
Many people want to build good habits to become healthier, live longer, or become happier but struggle to change their behavior. Gamification can make behavior change easier by awarding points for the desired behavior and deducting points for its omission.

re

link (url) DOI [BibTex]


IMU-Based Kinematics Estimation Accuracy Affects Gait Retraining Using Vibrotactile Cues

Rokhmanova, N., Pearl, O., Kuchenbecker, K. J., Halilaj, E.

IEEE Transactions on Neural Systems and Rehabilitation Engineering, 32, pages: 1005-1012, February 2024 (article)

Abstract
Wearable sensing using inertial measurement units (IMUs) is enabling portable and customized gait retraining for knee osteoarthritis. However, the vibrotactile feedback that users receive directly depends on the accuracy of IMU-based kinematics. This study investigated how kinematic errors impact an individual's ability to learn a therapeutic gait using vibrotactile cues. Sensor accuracy was computed by comparing the IMU-based foot progression angle to marker-based motion capture, which was used as ground truth. Thirty subjects were randomized into three groups to learn a toe-in gait: one group received vibrotactile feedback during gait retraining in the laboratory, another received feedback outdoors, and the control group received only verbal instruction and proceeded directly to the evaluation condition. All subjects were evaluated on their ability to maintain the learned gait in a new outdoor environment. We found that subjects with high tracking errors exhibited more incorrect responses to vibrotactile cues and slower learning rates than subjects with low tracking errors. Subjects with low tracking errors outperformed the control group in the evaluation condition, whereas those with higher error did not. Errors were correlated with foot size and angle magnitude, which may indicate a non-random algorithmic bias. The accuracy of IMU-based kinematics has a cascading effect on feedback; ignoring this effect could lead researchers or clinicians to erroneously classify a patient as a non-responder if they did not improve after retraining. To use patient and clinician time effectively, future implementation of portable gait retraining will require assessment across a diverse range of patients.

hi

DOI Project Page [BibTex]


Network propagation for GWAS analysis: a practical guide to leveraging molecular networks for disease gene discovery

Visonà, G., Bouzigon, E., Demenais, F., Schweikert, G.

Briefings in Bioinformatics, 25(2), February 2024 (article)

ei

DOI [BibTex]


Trained recurrent neural networks develop phase-locked limit cycles in a working memory task

Pals, M., Macke, J. H., Barak, O.

PLOS Computational Biology, 20(2), February 2024 (article)

ei

DOI [BibTex]


Pre-treatment 18F-FDG-PET/CT parameters as biomarkers for progression free survival, best overall response and overall survival in metastatic melanoma patients undergoing first-line immunotherapy

Peisen, F., Gerken, A., Dahm, I., Nikolaou, K., Eigentler, T., Amaral, T., Moltz, J. H., Othman, A. E., Gatidis, S.

PLOS ONE, 19(1), January 2024 (article)

ei

DOI [BibTex]


Towards fully covariant machine learning

Villar, S., Hogg, D. W., Yao, W., Kevrekidis, G. A., Schölkopf, B.

Transactions on Machine Learning Research, January 2024 (article)

ei

link (url) [BibTex]


How Should Robots Exercise with People? Robot-Mediated Exergames Win with Music, Social Analogues, and Gameplay Clarity

Fitter, N. T., Mohan, M., Preston, R. C., Johnson, M. J., Kuchenbecker, K. J.

Frontiers in Robotics and AI, 10(1155837):1-18, January 2024 (article)

Abstract
The modern worldwide trend toward sedentary behavior comes with significant health risks. An accompanying wave of health technologies has tried to encourage physical activity, but these approaches often yield limited use and retention. Due to their unique ability to serve as both a health-promoting technology and a social peer, we propose robots as a game-changing solution for encouraging physical activity. This article analyzes the eight exergames we previously created for the Rethink Baxter Research Robot in terms of four key components that are grounded in the video-game literature: repetition, pattern matching, music, and social design. We use these four game facets to assess gameplay data from 40 adult users who each experienced the games in balanced random order. In agreement with prior research, our results show that relevant musical cultural references, recognizable social analogues, and gameplay clarity are good strategies for taking an otherwise highly repetitive physical activity and making it engaging and popular among users. Others who study socially assistive robots and rehabilitation robotics can benefit from this work by considering the presented design attributes to generate future hypotheses and by using our eight open-source games to pursue follow-up work on social-physical exercise with robots.

hi

DOI Project Page [BibTex]


Robust Surface Recognition with the Maximum Mean Discrepancy: Degrading Haptic-Auditory Signals through Bandwidth and Noise

(Best ToH Short Paper Award at the IEEE Haptics Symposium Conference 2024)

Khojasteh, B., Shao, Y., Kuchenbecker, K. J.

IEEE Transactions on Haptics, 17(1):58-65, January 2024, Presented at the IEEE Haptics Symposium (article)

Abstract
Sliding a tool across a surface generates rich sensations that can be analyzed to recognize what is being touched. However, the optimal configuration for capturing these signals is yet unclear. To bridge this gap, we consider haptic-auditory data as a human explores surfaces with different steel tools, including accelerations of the tool and finger, force and torque applied to the surface, and contact sounds. Our classification pipeline uses the maximum mean discrepancy (MMD) to quantify differences in data distributions in a high-dimensional space for inference. With recordings from three hemispherical tool diameters and ten diverse surfaces, we conducted two degradation studies by decreasing sensing bandwidth and increasing added noise. We evaluate the haptic-auditory recognition performance achieved with the MMD to compare newly gathered data to each surface in our known library. The results indicate that acceleration signals alone have great potential for high-accuracy surface recognition and are robust against noise contamination. The optimal accelerometer bandwidth exceeds 1000 Hz, suggesting that useful vibrotactile information extends beyond human perception range. Finally, smaller tool tips generate contact vibrations with better noise robustness. The provided sensing guidelines may enable superhuman performance in portable surface recognition, which could benefit quality control, material documentation, and robotics.
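As a rough illustration of the distribution-comparison statistic this abstract relies on (not the authors' actual pipeline), a biased estimate of the squared maximum mean discrepancy under an RBF kernel can be sketched as follows; the kernel choice and bandwidth are assumptions for the sketch:

```python
import numpy as np

def mmd_rbf(X, Y, sigma=1.0):
    """Biased estimate of squared MMD between sample sets X and Y (RBF kernel)."""
    def k(A, B):
        # Pairwise squared Euclidean distances, then the Gaussian kernel.
        d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d / (2.0 * sigma ** 2))
    # MMD^2 = E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)]
    return k(X, X).mean() + k(Y, Y).mean() - 2.0 * k(X, Y).mean()
```

In a recognition setting like the one described, a newly gathered recording would be compared against the stored samples of each known surface and assigned to the surface yielding the smallest discrepancy.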

hi

DOI Project Page [BibTex]


A simple quantitative model of neuromodulation, Part I: Ion flow through neural ion channels

Werneck, L., Han, M., Yildiz, E., Keip, M., Sitti, M., Ortiz, M.

Journal of the Mechanics and Physics of Solids, 182, pages: 105457, 2024 (article)

Abstract
We develop a simple model of ionic current through neuronal membranes as a function of membrane potential and extracellular ion concentration. The model combines a simplified Poisson–Nernst–Planck (PNP) model of ion transport through individual ion channels with channel activation functions calibrated from ad hoc in-house experimental data. The simplified PNP model is validated against bacterial gramicidin A ion channel data. The calibrated model accounts for the transport of calcium, sodium, potassium, and chloride and exhibits remarkable agreement with the experimentally measured current–voltage curves for the differentiated human neural cells.
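For context, the Nernst–Planck flux that simplified PNP models of this kind build on is, in a standard textbook form (not taken from the paper itself):

```latex
J_i = -D_i \left( \nabla c_i + \frac{z_i F}{R T}\, c_i \, \nabla \phi \right)
```

where $J_i$ is the flux of ion species $i$, $D_i$ its diffusivity, $c_i$ its concentration, $z_i$ its valence, $\phi$ the electric potential, $F$ the Faraday constant, $R$ the gas constant, and $T$ the temperature; the potential is in turn coupled to the charge density through Poisson's equation.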

pi

DOI [BibTex]


Nanodiamond-Enhanced Magnetic Resonance Imaging

Lazovic, J., et al.

Advanced Materials, 36(11):2310109, 2024 (article)

zwe-ms pi

DOI [BibTex]


Balancing a 3D Inverted Pendulum using Remote Magnetic Manipulation

Zughaibi, J., Nelson, B. J., Muehlebach, M.

IEEE Robotics and Automation Letters, 2024 (article) In revision

lds

link (url) [BibTex]


Small-pore hydridic frameworks store densely packed hydrogen

Oh, H., Tumanov, N., Ban, V., Li, X., Richter, B., Hudson, M. R., Brown, C. M., Iles, G. N., Wallacher, D., Jorgensen, S. W., Daemen, L., Balderas-Xicohténcatl, R., Cheng, Y., Ramirez-Cuesta, A. J., Heere, M., Posada-Pérez, S., Hautier, G., Hirscher, M., Jensen, T. R., Filinchuk, Y.

Nature Chemistry, 16(5):809-816, Nature Publishing Group, London, UK, 2024 (article)

mms

DOI [BibTex]


Unravelling parameter interactions in calcium alginate/polyacrylamide double network hydrogels using a design of experiments approach for the optimization of mechanical properties

Gorke, O., Stuhlmüller, M., Tovar, G. E. M., Southan, A.

Materials Advances, 5, pages: 2851-2859, Royal Society of Chemistry, 2024 (article)

zwe-csfm

pdf link (url) DOI [BibTex]


Artificial-goosebump-driven microactuation

Zhang, M., Pal, A., Lyu, X., Wu, Y., Sitti, M.

Nature Materials, 23:560-569, 2024 (article)

pi

link (url) DOI [BibTex]


Learning Soft Millirobot Multimodal Locomotion with Sim-to-Real Transfer

Demir, S. O., Tiryaki, M. E., Karacakol, A. C., Sitti, M.

Advanced Science, 2024 (article)

Abstract
With wireless multimodal locomotion capabilities, magnetic soft millirobots have emerged as potential minimally invasive medical robotic platforms. Due to their diverse shape programming capability, they can generate various locomotion modes, and their locomotion can be adapted to different environments by controlling the external magnetic field signal. Existing adaptation methods, however, are based on hand-tuned signals. Here, a learning-based adaptive magnetic soft millirobot multimodal locomotion framework empowered by sim-to-real transfer is presented. Developing a data-driven magnetic soft millirobot simulation environment, the periodic magnetic actuation signal is learned for a given soft millirobot in simulation. Then, the learned locomotion strategy is deployed to the real world using Bayesian optimization and Gaussian processes. Finally, automated domain recognition and locomotion adaptation for unknown environments using a Kullback-Leibler divergence-based probabilistic method are illustrated. This method can enable soft millirobot locomotion to quickly and continuously adapt to environmental changes and explore the actuation space for unanticipated solutions with minimum experimental cost.
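As a minimal sketch of the kind of divergence-based domain recognition the abstract mentions (the actual method, sensor features, and distribution models are not specified here and are assumptions), one could model each known environment as a 1-D Gaussian over some locomotion feature and pick the environment closest in KL divergence to the current observations:

```python
import numpy as np

def kl_gaussian(mu0, var0, mu1, var1):
    """KL( N(mu0, var0) || N(mu1, var1) ) for 1-D Gaussians."""
    return 0.5 * (np.log(var1 / var0) + (var0 + (mu0 - mu1) ** 2) / var1 - 1.0)

def recognize_domain(obs, domains):
    """Pick the stored domain whose Gaussian model is closest (in KL) to the observations.

    domains maps a name to a (mean, variance) pair; obs is a sequence of scalar features.
    """
    mu, var = float(np.mean(obs)), float(np.var(obs)) + 1e-9  # guard against zero variance
    return min(domains, key=lambda name: kl_gaussian(mu, var, *domains[name]))
```

For example, with stored models `{"water": (0.0, 1.0), "sand": (5.0, 1.0)}` and observations clustered near 5, the function would select `"sand"`; the names and numbers are purely illustrative.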

pi

DOI [BibTex]


Towards a systems theory of algorithms

Dörfler, F., He, Z., Belgioioso, G., Bolognani, S., Lygeros, J., Muehlebach, M.

IEEE Control Systems Letters, 2024 (article)

lds

link (url) [BibTex]


Coherent magnons with giant nonreciprocity at nanoscale wavelengths

Gallardo, R., Weigand, M., Schultheiss, K., Kakay, A., Mattheis, R., Raabe, J., Schütz, G., Deac, A., Lindner, J., Wintz, S.

ACS Nano, 18(7):5249-5257, American Chemical Society, Washington, DC, 2024 (article)

mms

DOI [BibTex]


Hydrogen-stabilized ScYNdGd medium-entropy alloy for hydrogen storage

Balcerzak, M., Ponsoni, J. B., Petersen, H., Menéndez, C., Ternieden, J., Zhang, L., Winkelmann, F., Aguey-Zinsou, K., Hirscher, M., Felderhoff, M.

Journal of the American Chemical Society, 146(8):5283-5294, American Chemical Society, Washington, DC, 2024 (article)

mms

DOI [BibTex]


Machine learning of a density functional for anisotropic patchy particles

Simon, A., Weimar, J., Martius, G., Oettel, M.

Journal of Chemical Theory and Computation, 2024 (article)

al

link (url) DOI [BibTex]


Parameterizing pressure-temperature profiles of exoplanet atmospheres with neural networks

Gebhard, T. D., Angerhausen, D., Konrad, B. S., Alei, E., Quanz, S. P., Schölkopf, B.

Astronomy & Astrophysics, 681, 2024 (article)

ei

DOI [BibTex]


InterCap: Joint Markerless 3D Tracking of Humans and Objects in Interaction from Multi-view RGB-D Images

Huang, Y., Taheri, O., Black, M. J., Tzionas, D.

International Journal of Computer Vision (IJCV), 2024 (article)

Abstract
Humans constantly interact with objects to accomplish tasks. To understand such interactions, computers need to reconstruct these in 3D from images of whole bodies manipulating objects, e.g., for grasping, moving and using the latter. This involves key challenges, such as occlusion between the body and objects, motion blur, depth ambiguities, and the low image resolution of hands and graspable object parts. To make the problem tractable, the community has followed a divide-and-conquer approach, focusing either only on interacting hands, ignoring the body, or on interacting bodies, ignoring the hands. However, these are only parts of the problem. On the contrary, recent work focuses on the whole problem. The GRAB dataset addresses whole-body interaction with dexterous hands but captures motion via markers and lacks video, while the BEHAVE dataset captures video of body-object interaction but lacks hand detail. We address the limitations of prior work with InterCap, a novel method that reconstructs interacting whole-bodies and objects from multi-view RGB-D data, using the parametric whole-body SMPL-X model and known object meshes. To tackle the above challenges, InterCap uses two key observations: (i) Contact between the body and object can be used to improve the pose estimation of both. (ii) Consumer-level Azure Kinect cameras let us set up a simple and flexible multi-view RGB-D system for reducing occlusions, with spatially calibrated and temporally synchronized cameras. With our InterCap method we capture the InterCap dataset, which contains 10 subjects (5 males and 5 females) interacting with 10 daily objects of various sizes and affordances, including contact with the hands or feet. To this end, we introduce a new data-driven hand motion prior, as well as explore simple ways for automatic contact detection based on 2D and 3D cues. In total, InterCap has 223 RGB-D videos, resulting in 67,357 multi-view frames, each containing 6 RGB-D images, paired with pseudo ground-truth 3D body and object meshes. Our InterCap method and dataset fill an important gap in the literature and support many research directions. Data and code are available at https://intercap.is.tue.mpg.de.

ps

Paper link (url) DOI [BibTex]