
Perceiving Systems

Language, Vision, and World Models

[Figure: ChatPose teaser]
ChatPose embeds SMPL poses as distinct signal tokens within a multimodal LLM, enabling the direct generation of 3D body poses from both textual and visual inputs. This lets the LLM apply its extensive world knowledge when reasoning about human poses, unifying classical 3D human pose estimation and generation tasks while supporting user interaction.
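
The idea of a pose "signal token" can be illustrated with a small sketch. This is only one plausible reading of the description above, not ChatPose's actual implementation: it assumes the LLM's vocabulary is extended with a special pose token, and that the hidden state at each position where that token appears is mapped by a linear head to SMPL pose parameters (24 joints, 3 axis-angle values each). The token id, the hidden size, the random weights, and the names `POSE_TOKEN_ID`, `make_projection`, `project`, and `decode_poses` are all hypothetical.

```python
import random

# Hypothetical values for illustration only; none are taken from ChatPose.
POSE_TOKEN_ID = 32000        # assumed id of a special pose token in the vocabulary
HIDDEN_DIM = 8               # assumed (tiny) LLM hidden size
NUM_JOINTS = 24              # SMPL: global orientation + 23 body joints
POSE_DIM = NUM_JOINTS * 3    # axis-angle, 3 values per joint


def make_projection(seed=0):
    """Build a randomly initialised linear head (weight, bias)."""
    rng = random.Random(seed)
    weight = [[rng.gauss(0.0, 0.02) for _ in range(POSE_DIM)]
              for _ in range(HIDDEN_DIM)]
    bias = [0.0] * POSE_DIM
    return weight, bias


def project(hidden, weight, bias):
    """Apply the linear head to one hidden-state vector."""
    return [
        sum(h * weight[i][j] for i, h in enumerate(hidden)) + bias[j]
        for j in range(POSE_DIM)
    ]


def decode_poses(token_ids, hidden_states, weight, bias):
    """Return one NUM_JOINTS x 3 axis-angle pose per pose token in the sequence."""
    poses = []
    for token_id, hidden in zip(token_ids, hidden_states):
        if token_id != POSE_TOKEN_ID:
            continue  # ordinary text tokens carry no pose
        flat = project(hidden, weight, bias)
        poses.append([flat[3 * j: 3 * j + 3] for j in range(NUM_JOINTS)])
    return poses


# Stand-in for LLM output: five tokens, one of which is the pose token.
rng = random.Random(1)
token_ids = [1, 517, 902, POSE_TOKEN_ID, 2]
hidden_states = [[rng.gauss(0.0, 1.0) for _ in range(HIDDEN_DIM)]
                 for _ in token_ids]

weight, bias = make_projection()
poses = decode_poses(token_ids, hidden_states, weight, bias)
print(len(poses), len(poses[0]), len(poses[0][0]))  # 1 24 3
```

In a real system the hidden states would come from the multimodal LLM and the head would be trained; the sketch only shows how a single token position in ordinary generated text could carry a full 3D body pose.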