Robotics in The Age of Generative AI

Robotics

in the age of
Generative AI
Vincent Vanhoucke
Distinguished Scientist, Google DeepMind
LLMs
Embodied AI
say-can.github.io

I spilled my drink, can you help?

LLM
“Find a cleaner”
“Find a sponge”
“Go to the trash can”
“Pick up the sponge”
“Try using the vacuum”

CoRL 2022 Special Innovation Award


Candidate skills (listed under LLM, Value Functions, and SayCan):
“Find a cleaner”
“Find a sponge”
“Go to the trash can”
“Pick up the sponge”
“Try using the vacuum”

I would:
1. Find a sponge
2. Pick up the sponge
3. Come to you
4. Put down the sponge
5. Done
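
The selection step behind this slide can be sketched as below. This is a minimal illustration, assuming (as described on the SayCan project page) that each candidate skill's LLM score is multiplied by its value-function score and the highest product is chosen. All numbers and the `saycan_select` helper are hypothetical and do not come from the talk.

```python
# Hypothetical scores, for illustration only (not values from the talk or paper).
llm_scores = {  # how useful the LLM judges each skill as the next step
    "Find a cleaner": 0.30,
    "Find a sponge": 0.25,
    "Go to the trash can": 0.05,
    "Pick up the sponge": 0.25,
    "Try using the vacuum": 0.15,
}
value_scores = {  # how likely each skill is to succeed from the current state
    "Find a cleaner": 0.10,
    "Find a sponge": 0.90,
    "Go to the trash can": 0.80,
    "Pick up the sponge": 0.05,
    "Try using the vacuum": 0.00,
}


def saycan_select(llm, value):
    """Multiply the two scores per skill and return the best skill plus all products."""
    combined = {skill: llm[skill] * value[skill] for skill in llm}
    best = max(combined, key=combined.get)
    return best, combined


best_skill, combined_scores = saycan_select(llm_scores, value_scores)
print(best_skill)
```

With these made-up numbers, "Find a cleaner" has the highest LLM score but a low chance of success, so the product favors "Find a sponge", matching the first step of the plan on the slide.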



say-can.github.io
SayCan
Perception Actuation

Planning

Goal
Perception Actuation

LLM

Goal
VLM Actuation

LLM

Goal
socraticmodels.github.io
innermonologue.github.io
Inner Monologue

[Diagram: SayCan vs. Inner Monologue. Labels: Language Model, Robot, Value Functions, Robot, Human, Scene Descriptor, Language Model, Success Detector]
robot-help.github.io
Robots that ask for help

CoRL 2023 Best Student Paper Award
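
One way to decide when a robot should ask for help is sketched below. It assumes a conformal-prediction recipe (which is my understanding of the approach on the project page): calibrate a threshold on held-out examples, build a set of plausible options, and ask only when that set is not a single option. The function names, confidences, and option labels are hypothetical.

```python
import math


def calibrate_threshold(true_option_confidences, epsilon):
    """Split-conformal threshold from the confidence given to the correct option
    on each calibration example. epsilon is the tolerated error rate."""
    n = len(true_option_confidences)
    scores = sorted(1.0 - c for c in true_option_confidences)  # nonconformity
    rank = min(math.ceil((n + 1) * (1.0 - epsilon)), n)
    return scores[rank - 1]


def prediction_set(option_confidences, qhat):
    """Keep every option whose nonconformity score is within the threshold."""
    return [o for o, c in option_confidences.items() if 1.0 - c <= qhat]


def plan_or_ask(option_confidences, qhat):
    """Act when exactly one option survives; otherwise ask the human."""
    options = prediction_set(option_confidences, qhat)
    if len(options) == 1:
        return ("act", options[0])
    return ("ask", options)


# Made-up calibration data and test cases.
calibration = [0.9, 0.8, 0.5, 0.45, 0.7, 0.4, 0.95, 0.6, 0.85]
qhat = calibrate_threshold(calibration, epsilon=0.2)

clear_case = {"option A": 0.90, "option B": 0.07, "option C": 0.03}
ambiguous_case = {"option A": 0.48, "option B": 0.47, "option C": 0.05}
print(plan_or_ask(clear_case, qhat))
print(plan_or_ask(ambiguous_case, qhat))
```

The design choice here is that the robot's willingness to act is tied to a calibrated error rate rather than to a hand-picked confidence cutoff.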


Language driven exploration

auto- .github.io
VLM Actuation

LLM

Goal
VLM Code LM

LLM

Goal
code-as-policies.github.io
Code as policies

ICRA 2023 Outstanding Robot Learning Paper Award
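
The idea named in the title can be illustrated with a small sketch: a language model writes a short program against a few robot primitives, and the robot runs that program. Everything below is hypothetical. The primitives are stubs, the "generated" code is hard-coded rather than produced by a model, and the restricted `exec` namespace is not a real security sandbox.

```python
# Hypothetical perception and control primitives; a real system would wrap robot APIs.
log = []


def get_object_names():
    return ["red block", "blue block", "bowl"]


def pick(name):
    log.append(("pick", name))


def place(name, target):
    log.append(("place", name, target))


# Stand-in for text a code-writing LM might return for the instruction
# "put all the blocks in the bowl". Hard-coded here; no model is called.
generated_policy = '''
for name in get_object_names():
    if "block" in name:
        pick(name)
        place(name, "bowl")
'''


def run_policy(code, primitives):
    """Execute generated code with access to the given primitives only."""
    namespace = {"__builtins__": {}, **primitives}
    exec(code, namespace)


run_policy(
    generated_policy,
    {"get_object_names": get_object_names, "pick": pick, "place": place},
)
print(log)
```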


robot-teaching.github.io
Non-expert teaching
mujoco.org
MuJoCo 3

Accelerated physics with MuJoCo XLA in JAX (MJX)
robot-teaching.github.io
Language Model Predictive Control
VLM Code LM

LLM

Goal
palm-e.github.io
PaLM-E: An embodied multimodal language model

PaLM-E is massive (562B params).

Yet we observe positive transfer across robots using little robot data.
video-language-planning.github.io
Video Language Planning
VLM Code LM

LLM

Goal
robotics-transformer1.github.io
RT-1: Robotics Transformer v1

RT-1 is able to reach ~100% performance on seen tasks, while maintaining better robustness to unseen variability.
robotics-transformer1.github.io
RT-1: Diversity is all you need?
VLM Code LM

LLM

Goal
robotics-transformer2.github.io
RT-2: Making VLMs ‘speak robot’
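
A rough sketch of what 'speaking robot' could look like: each dimension of a continuous action is discretized into bins, and the bin indices are written out as text that a vision-language model can emit like any other tokens. The 256-bin choice follows my understanding of the RT-2 paper; the action bounds, dimensions, and function names below are assumptions made for illustration.

```python
NUM_BINS = 256  # assumed bin count per action dimension


def encode_action(action, low, high):
    """Map each continuous action value to a bin index and join them as text."""
    tokens = []
    for a, lo, hi in zip(action, low, high):
        a = min(max(a, lo), hi)  # clip to the valid range
        frac = (a - lo) / (hi - lo)
        tokens.append(min(int(frac * NUM_BINS), NUM_BINS - 1))
    return " ".join(str(t) for t in tokens)


def decode_action(text, low, high):
    """Map a string of bin indices back to bin-center action values."""
    return [
        lo + (int(t) + 0.5) / NUM_BINS * (hi - lo)
        for t, lo, hi in zip(text.split(), low, high)
    ]


# Made-up 3-dimensional action with symmetric bounds.
low, high = [-1.0, -1.0, -1.0], [1.0, 1.0, 1.0]
action = [0.0, -1.0, 1.0]

action_text = encode_action(action, low, high)
recovered = decode_action(action_text, low, high)
print(action_text)
print(recovered)
```

Decoding returns bin centers, so the round trip is lossy by at most half a bin width per dimension.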
robotics-transformer2.github.io
RT-2: Emergent transfer

“Move coke can to Taylor Swift”
“Move the banana to the sum of 2 + 1”
robotics-transformer2.github.io
RT-2: Scaling
deepmind.com/blog/robocat-a-self-improving-robotic-agent
RoboCat: scaling across robots

7DoF 7DoF 5DoF 7DoF


robotics-transformer-x.github.io
Open X-Embodiment:
Open foundations for robotics
robotics-transformer-x.github.io
Open X-Embodiment: RT-1-X
Open X-Embodiment: RT-2-X
VLM Code LM

LLM

Goal
VLM Code LM

LLM

Robot Data

Goal
VLM Code LM

LLM

Robot Data

Dialects of ‘Robotese’

Goal
Thank you
