
Google DeepMind today announced Genie 3, a new general-purpose world model that succeeds Genie 2. Genie 3 can generate interactive environments from a simple text prompt. Google claims that users can navigate the generated environments in real time at 24 frames per second and at 720p resolution.
According to Google DeepMind, Genie 3 can simulate natural phenomena with great realism, including water flow, lighting effects, and complex environmental interactions. It can also generate realistic ecosystems, capturing detailed animal behaviors and the intricate growth patterns of plant life.
Genie 3 also enables imaginative world-building, supporting expressive animated characters. Additionally, it can generate immersive experiences of distant locations and historical eras with high fidelity.
Google claims that this high degree of controllability and real-time interactivity in Genie 3 was achieved through several significant technical breakthroughs. When generating each frame, the model takes into account the previously generated trajectory, which grows over time. Google also highlighted that Genie 3-generated environments remain largely consistent for several minutes, with visual memory extending as far back as one minute.
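To make the frame-by-frame description above concrete, here is a minimal toy sketch in Python. It is not Google's implementation, and the article gives no architectural details: `toy_next_frame`, `Trajectory`, and `rollout` are hypothetical names, and the "model" returns a text label instead of pixels. The sketch shows only the loop structure, in which each new frame is conditioned on a trajectory that grows with every step, and the arithmetic behind the reported figures: one minute of visual memory at 24 frames per second corresponds to 1,440 frames.

```python
from dataclasses import dataclass, field

FPS = 24             # frame rate reported in the article
MEMORY_SECONDS = 60  # "as far back as one minute"


@dataclass
class Trajectory:
    """Everything generated so far; it grows by one entry per step."""
    frames: list = field(default_factory=list)
    actions: list = field(default_factory=list)


def toy_next_frame(prompt, trajectory, action):
    """Hypothetical stand-in for the model: returns a label, not an image."""
    return f"frame {len(trajectory.frames)} of '{prompt}' after {action}"


def rollout(prompt, actions):
    """Generate one frame per user action, conditioning on the full history."""
    trajectory = Trajectory()
    for action in actions:
        frame = toy_next_frame(prompt, trajectory, action)
        trajectory.frames.append(frame)
        trajectory.actions.append(action)
    return trajectory


def frames_in_memory_window(fps=FPS, seconds=MEMORY_SECONDS):
    """Number of past frames covered by the stated visual memory."""
    return fps * seconds
```

In this toy version the history is kept in full and never truncated, which is an assumption made for simplicity. How Genie 3 actually stores or compresses its trajectory is not described in the announcement.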
Genie 3 does have limitations. The Google DeepMind team has outlined the following known constraints of the model:
- Limited action space. Although promptable world events allow for a wide range of environmental interventions, they are not necessarily performed by the agent itself. The range of actions agents can perform directly is currently constrained.
- Interaction and simulation of other agents. Accurately modeling complex interactions between multiple independent agents in shared environments is still an ongoing research challenge.
- Accurate representation of real-world locations. Genie 3 is currently unable to simulate real-world locations with perfect geographic accuracy.
- Text rendering. Clear and legible text is often only generated when provided in the input world description.
- Limited interaction duration. The model can currently support a few minutes of continuous interaction, rather than extended hours.
Genie 3 is currently available to select creators and academics. Google says it is exploring how to make the model available to additional testers in the future.