Google DeepMind at ICML 2024

New approaches in generative AI and multimodality

Generative AI technologies and multimodal capabilities are expanding the creative possibilities of digital media.

We’ll present VideoPoet, which uses an LLM to generate state-of-the-art video and audio from multimodal inputs including images, text, audio and other video.

We’ll also share Genie (generative interactive environments), which can generate a range of playable environments for training AI agents from text prompts, images, photos or sketches.

Finally, we introduce MagicLens, a novel image retrieval system that uses text instructions to retrieve images with richer relations beyond visual similarity.

Supporting the AI community

We’re proud to sponsor ICML and foster a diverse community in AI and machine learning by supporting initiatives led by Disability in AI, Queer in AI, LatinX in AI and Women in Machine Learning.

If you’re at the conference, visit the Google DeepMind and Google Research booths to meet our teams, see live demos and find out more about our research.
