This is the era of large generative models: they can generate text, images, audio, video, and 3D objects. Combine all of these, and we may get a world!
Whether it is the world model that Yann LeCun is exploring, the spatial intelligence that Fei-Fei Li is pursuing, or similar concepts proposed by other research teams, we are clearly getting closer to that world. Just a few hours ago, we took another step toward it: CMU, together with more than 20 other research labs, released an open-source generative physics engine called Genesis. As the name suggests, this may really be the starting point of a new world.
According to posts on X by Zhou Xian, a doctoral student at the CMU Robotics Institute, and Professor Gan Chuang, the project lead, the project took more than 2 years, and nearly 20 institutions at home and abroad took part in internal testing.
The resulting Genesis generative physics engine can generate 4D dynamic worlds. It is built on a physics simulation platform designed for general-purpose robotics and physical AI applications.
- Open source address: https://github.com/Genesis-Embodied-AI/Genesis
- Project page: https://genesis-embodied-ai.github.io/
- Documentation address: https://genesis-world.readthedocs.io/en/latest/
The Genesis technical paper has not been released yet. According to the official documentation, however, the main features of Genesis include:
- Installation is effortless, and the API design is extremely simple and user-friendly.
- Unprecedented parallel simulation speed: Genesis is the fastest physics engine in the world, with simulation speeds 10 to 80 times faster than existing GPU-accelerated robot simulators (Isaac Gym/Sim/Lab, MuJoCo MJX, etc.) (yes, this is a bit sci-fi), without compromising simulation accuracy or fidelity.
- A unified framework that supports various SOTA physics solvers for modeling a wide range of materials and physical phenomena.
- Photorealistic ray-tracing rendering with performance optimization.
- Differentiability: Genesis is designed to be fully compatible with differentiable simulation. Currently, its MPM solver and Tool Solver are differentiable, and differentiability for other solvers will be added soon (starting with rigid-body simulation).
- Physically accurate and differentiable tactile sensors.
- Native support for generative simulation, allowing data of various modalities to be generated through language prompts: interactive scenes, task proposals, rewards, assets, character actions, policies, trajectories, camera actions, (physically accurate) videos, etc.
Genesis also supports a range of hardware and operating systems.
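The differentiability feature in the list above is easier to picture with a toy example. The following is a minimal sketch in plain Python, not Genesis code and not its MPM solver or Tool Solver: it pushes forward-mode derivatives (dual numbers) through an explicit-Euler rollout of a 1D point mass with linear drag, then uses the resulting gradient to find an initial velocity that reaches a target position. The `Dual` class, the function names, and all constants are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value together with its derivative w.r.t. one chosen input."""
    val: float
    grad: float = 0.0

    def __add__(self, other):
        other = lift(other)
        return Dual(self.val + other.val, self.grad + other.grad)

    __radd__ = __add__

    def __sub__(self, other):
        other = lift(other)
        return Dual(self.val - other.val, self.grad - other.grad)

    def __mul__(self, other):
        other = lift(other)
        # Product rule: d(ab) = a*db + b*da
        return Dual(self.val * other.val,
                    self.val * other.grad + self.grad * other.val)

    __rmul__ = __mul__


def lift(x):
    """Treat plain numbers as constants (zero derivative)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def simulate(v0, steps=100, dt=0.01, drag=0.5):
    """Explicit-Euler rollout of a 1D point mass with linear drag."""
    x, v = lift(0.0), lift(v0)
    for _ in range(steps):
        x = x + dt * v
        v = v - dt * drag * v
    return x


def final_position_and_grad(v0):
    """Return the final position and its derivative w.r.t. v0."""
    out = simulate(Dual(v0, 1.0))  # seed d(v0)/d(v0) = 1
    return out.val, out.grad


def solve_initial_velocity(target, v0=0.0, lr=0.5, iters=200):
    """Gradient descent on (x_T - target)^2 using the simulator's gradient."""
    for _ in range(iters):
        x_final, dx_dv0 = final_position_and_grad(v0)
        v0 -= lr * 2.0 * (x_final - target) * dx_dv0
    return v0


best_v0 = solve_initial_velocity(target=1.0)
reached, _ = final_position_and_grad(best_v0)
print(f"initial velocity {best_v0:.4f} m/s reaches x = {reached:.6f} m")
```

The point of the sketch is only that, once every simulation step is differentiable, an optimizer can adjust inputs (here a single initial velocity) directly from gradients of the outcome instead of relying on trial and error.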
To back up these claims, Zhou Xian shared an example on X: on a single RTX 4090, Genesis simulates about 430,000 times faster than real time, and it takes only 26 seconds to train a robot locomotion policy that can be transferred to the real world.
Zhou Xian said: “Our goal is to build a general data engine that uses an upper-level generative framework to automatically create physical worlds and many types of data, including environments, camera motions, robot task proposals, reward functions, robot policies, character motions, fully interactive 3D scenes, open-world articulated assets, and more, so as to automatically generate data for robotics, physical AI, and other applications.”
Since its announcement, Genesis has received widespread praise.
The GitHub project also passed 1.5k stars within just a few hours.
Genesis: A comprehensive physics simulation platform
Genesis is a comprehensive physics simulation platform designed specifically for general-purpose robotics, embodied AI, and physical AI applications. It is several things at once:
- A general physics engine rebuilt from scratch that can simulate a wide range of materials and physical phenomena;
- A lightweight, ultra-fast, Pythonic, and user-friendly robot simulation platform;
- A powerful and fast photorealistic rendering system;
- A generative data engine that converts natural language descriptions input by users into various data forms.
Genesis is powered by a redesigned and rebuilt general-purpose physics engine that integrates various physics solvers and their couplings into a unified framework. This core engine is further enhanced by a generative agent architecture running at a higher level, with the aim of fully automating data generation for robotics and other fields.
Creating a good simulator is very challenging.
Professor Gan Chuang explained: “The core of our method is to reverse-engineer human mental models and build robot brains driven by generative physics engines! I realize that many robotics experts are skeptical about this approach. They point to the difficulty of setting up simulators and of closing the sim-to-real gap, and they advocate learning only from real-world data. I understand these concerns, but I firmly believe that we cannot just bypass them!”
The generative framework aims to automatically generate data, including:
- Physically accurate and spatially consistent videos;
- Camera motion and parameters;
- Human and animal character motion;
- Robot manipulation and locomotion policies that can be deployed in the real world;
- Fully interactive 3D scenes;
- Open-world articulated object generation;
- Speech audio, facial animation and emotions.
For now, the team is open-sourcing the underlying physics engine and simulation platform. Access to the generative framework will be rolled out gradually in the near future.
Genesis delivers excellent performance and impressive results.
As a highly optimized physics engine, Genesis uses GPU acceleration for parallel computation and provides unprecedented simulation speed in a variety of scenarios.
When simulating a manipulation scene, Genesis runs at 43 million frames per second, which is 430,000 times faster than real time.
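The two figures quoted here can be cross-checked with simple arithmetic. The sketch below assumes that both numbers describe the same benchmark and that a "frame" is one simulation step, which the source does not spell out. Under those assumptions, they imply a real-time rate of 100 steps per second, that is, a time step of 0.01 s. The function and variable names are made up for this illustration.

```python
def implied_real_time_rate(sim_fps, speedup):
    """Steps per second that would count as real time, given a speedup."""
    return sim_fps / speedup


def wall_clock_seconds(simulated_seconds, speedup):
    """Wall-clock time needed to simulate a given span at that speedup."""
    return simulated_seconds / speedup


SIM_FPS = 43_000_000  # frames per second, as reported
SPEEDUP = 430_000     # times faster than real time, as reported

rate_hz = implied_real_time_rate(SIM_FPS, SPEEDUP)  # 100.0 steps per second
dt = 1.0 / rate_hz                                  # 0.01 s per step
one_hour = wall_clock_seconds(3600.0, SPEEDUP)      # roughly 0.0084 s

print(f"implied real-time rate: {rate_hz:.0f} steps/s (dt = {dt} s)")
print(f"one simulated hour takes about {one_hour * 1000:.1f} ms of wall-clock time")
```

Put differently, at the reported speedup an hour of simulated experience would cost less than a hundredth of a second of wall-clock time.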
In large-scale simulations, Genesis uses “auto-hibernation” to speed up the simulation of entities that have converged and become static. This feature is still being tested and will be released in version 0.1.1.
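The source does not describe how auto-hibernation works internally. Many physics engines use a similar idea, often called sleeping: a body whose speed stays below a threshold for several consecutive steps is skipped until something disturbs it. The sketch below illustrates that general idea with point masses falling onto the ground. It is not Genesis's implementation, and the `Body` class, the thresholds, and the function names are all hypothetical.

```python
from dataclasses import dataclass

GRAVITY = -9.81      # m/s^2
SLEEP_SPEED = 1e-3   # hypothetical speed threshold, m/s
SLEEP_AFTER = 10     # hypothetical number of consecutive quiet steps


@dataclass
class Body:
    height: float
    velocity: float = 0.0
    still_steps: int = 0
    asleep: bool = False


def wake(body):
    """Call this when something touches or pushes a sleeping body."""
    body.asleep = False
    body.still_steps = 0


def step(bodies, dt=0.01):
    """Advance awake bodies by one step; return how many were integrated."""
    integrated = 0
    for body in bodies:
        if body.asleep:
            continue  # hibernating bodies cost nothing this step
        integrated += 1
        body.velocity += GRAVITY * dt
        body.height += body.velocity * dt
        if body.height <= 0.0:  # perfectly inelastic ground contact
            body.height, body.velocity = 0.0, 0.0
        # Count consecutive quiet steps and hibernate once there are enough.
        body.still_steps = body.still_steps + 1 if abs(body.velocity) < SLEEP_SPEED else 0
        body.asleep = body.still_steps >= SLEEP_AFTER
    return integrated


bodies = [Body(height=0.0), Body(height=1.0)]
work = [step(bodies) for _ in range(100)]
print(f"bodies integrated per step: first={work[0]}, last={work[-1]}")
print(f"total body updates: {sum(work)} instead of {len(bodies) * len(work)}")
```

In a scene where most objects have settled, skipping them in this way leaves the solver with only the few bodies that are still moving, which is where the speedup in large-scale simulations would come from.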
Speed comparison between Genesis and commonly used CPU-based and GPU-based robot simulators.
Zhou Xian stated that Genesis's GPU-parallelized inverse kinematics (IK) solver can solve IK for 10,000 Franka manipulators within 2 milliseconds.
Next, let's look at some specific examples.
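For readers unfamiliar with inverse kinematics: the task is to find joint angles that place the end effector at a desired pose. The quoted figure works out to an average of about 0.2 microseconds per arm (2 ms divided by 10,000). The sketch below is a toy illustration of batched IK only, using the textbook closed-form solution for a planar two-link arm in plain Python. It is not Genesis's solver, and a real Franka arm has seven joints and generally needs an iterative method. The link lengths and function names are assumptions.

```python
import math
import random

LINK_1, LINK_2 = 0.4, 0.3  # hypothetical link lengths in metres


def forward(q1, q2):
    """End-effector position of a planar two-link arm."""
    x = LINK_1 * math.cos(q1) + LINK_2 * math.cos(q1 + q2)
    y = LINK_1 * math.sin(q1) + LINK_2 * math.sin(q1 + q2)
    return x, y


def inverse(x, y):
    """Closed-form IK; returns the solution with elbow angle in [0, pi]."""
    cos_q2 = (x * x + y * y - LINK_1**2 - LINK_2**2) / (2 * LINK_1 * LINK_2)
    if abs(cos_q2) > 1.0:
        raise ValueError("target is out of reach")
    q2 = math.acos(cos_q2)
    q1 = math.atan2(y, x) - math.atan2(LINK_2 * math.sin(q2),
                                       LINK_1 + LINK_2 * math.cos(q2))
    return q1, q2


def solve_batch(targets):
    """Each target is independent, which makes batched IK easy to parallelize."""
    return [inverse(x, y) for x, y in targets]


rng = random.Random(0)
# Build reachable targets by sampling joint angles and running them forward.
targets = [forward(rng.uniform(-math.pi, math.pi), rng.uniform(0.2, 2.8))
           for _ in range(10_000)]
solutions = solve_batch(targets)
worst = max(math.dist(forward(q1, q2), target)
            for (q1, q2), target in zip(solutions, targets))
print(f"solved {len(solutions)} targets, worst position error {worst:.2e} m")
```

Because no target depends on any other, a GPU implementation can assign one thread (or a small group of threads) to each arm, which is presumably what makes the per-arm cost so small at large batch sizes.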
Generating 4D dynamic and physical worlds
Genesis's physics engine is paired with VLM-based generative agents. These agents use the APIs provided by the simulation infrastructure as tools to create 4D dynamic worlds, which then serve as the underlying data source for extracting data in various modalities.
Combined with the generative camera and object motion module, Genesis can generate physically accurate and view-consistent videos and other forms of data.
Genesis also supports simulating a wide variety of materials, including rigid bodies, articulated bodies, fabrics, liquids, smoke, deformable bodies, thin-shell materials, elastic/plastic bodies, robotic muscles, and more.
Simulating a layer of chocolate sauce is naturally no problem.
The texture of the shredded foam also looks very realistic.
The planet and the spaceship are rendered in great detail as well. The scene looks like it comes from a big-budget science fiction movie.
The bullet bursting the water balloon looks as if it had been captured with professional high-speed photography equipment.
A pot of alphabet candies looks springy and elastic.
The simulation of the inflatable doll is also just right, and it humorously mimics real situations.
Character action generation
Such a high-quality physics engine is also good news for the game industry. Many complex motions and effects can be generated quickly from prompts:
Prompt: A miniature Wukong holding a stick runs across a tabletop for 3 seconds, then jumps into the air and swings his right arm downward as he lands. The camera starts with a close-up of his face, then steadily follows the character while gradually zooming out. When Wukong reaches the highest point of the jump, the action pauses for a few seconds. The camera circles 360 degrees around the character while slowly rising, and then the action resumes.
The time it takes to design such motions drops immediately.
Robot policy generation
Genesis can use generative robotic agents and the physics engine to automatically generate robot policies and demonstration data for various skills in different scenarios. This means researchers can quickly obtain physically plausible robot action plans in simulation and reliably transfer them to physical robots.
Below are some examples of robots of different forms performing different tasks.
Prompt: A mobile Franka robotic arm uses a bowl and a microwave oven to make popcorn.
Prompt: The Unitree Go2 quadruped robot runs in the rain (Sim).
The path from a prompt, to a policy in the simulation environment, to a physical robot can be this smooth:
Prompt: The Unitree H1-2 humanoid robot walks forward (Sim2Real).
Doing a handstand requires precise balance control and whole-body coordination. Such a difficult action can now be achieved through Genesis in Sim2Real:
Prompt: The quadruped robot performs a handstand on its two front legs (Sim2Real).
A handstand is not all. With the help of Genesis, the robot dog can also learn “gymnastics skills” faster and steadily perform two consecutive backflips:
The quadruped robot performs two backflips in a row (Sim2Real).
Actions that require interaction with objects in the real world, such as pulling a chair, are also no problem:
Mobile manipulation by a large underactuated robot (Sim2Real)
3D and fully interactive scene generation
Genesis's generative framework supports the generation of fully interactive 3D scenes, which can be used to train robot skills.
A home interior with a living room (including a dining area), a bathroom, a study, and a bedroom.
A restaurant interior.
Open world articulated object generation
Genesis can also generate objects with articulated structures and their interaction processes, such as opening and closing car doors, opening and closing laptop computers, and folding metal blades.
Soft robots
Genesis is also the first platform to provide comprehensive support for soft muscles and soft robots, as well as their interaction with rigid robots. It also ships with a URDF-like configuration system for soft robots. The official documentation provides a tutorial: https://genesis-world.readthedocs.io/en/latest/user_guide/getting_started/soft_robots.html.
Genesis is also capable of simulating hybrid robots with soft skin and rigid bones.
Speech audio, facial expressions and emotion generation
Audio and facial expressions are also modalities that Genesis aims to integrate. Here are two examples:
The character’s emotions change from neutral to angry and then to happy.
Genesis generalizes the change of emotions to different faces
Conclusion
Finally, Zhou Xian showed a Tetris game created with Genesis. The blocks in the game are made of jelly material and can move in accordance with realistic physical laws.
We may have come across similar videos before, but those were carefully crafted by visual effects artists. Now Genesis can generate such results with one click, and they can be developed further into real, achievable technological breakthroughs.
Professor Gan Chuang shared his experience of participating in this project on X:
“In 2018, I decided to shift my research focus from vision to embodied AI, because I was fascinated by the idea of creating general agents that can interact with the physical world and with other intelligent beings with human-like flexibility. We call this field embodied AGI.”
He also wrote: “To be honest, sometimes I think this simulator may be too advanced to be released, but we believe it is crucial to make it completely open source and build a strong community around our mission! Please join the Genesis community! We hope to convince the robotics research community that ‘Generative Physics Simulator is All You Need!’”
We are very much looking forward to seeing Genesis in practical applications!