cs188 Fa24 Lec26

The document concludes the CS 188 Artificial Intelligence course at UC Berkeley, highlighting various applications of AI including language assistants, robot locomotion, and weather prediction. It emphasizes the importance of reinforcement learning and multimodal models in advancing AI capabilities. The document also encourages continued learning through suggested courses and resources.

CS 188: Artificial Intelligence

Conclusion

Instructors: Igor Mordatch & Pieter Abbeel --- University of California, Berkeley
Ketrina Yim
CS188 Artist
Pac-Man Beyond the Game!
Pacman: Beyond Simulation?

Students at Colorado University: http://pacman.elstonj.com

[VIDEO: Roomba Pacman.mp4]

Pacman: Beyond Simulation!

Bugman?
§ AI = Animal Intelligence?
§ Wim van Eck at Leiden University
§ Pacman controlled by a human
§ Ghosts controlled by crickets
§ Vibrations drive crickets toward or away from Pacman’s location

http://pong.hku.nl/~wim/bugman.htm
[VIDEO: bugman_movie_1.mov]

Bugman
Course Topics

Core Components of Rational Agents:

§ Search & Planning
§ Reinforcement Learning
§ Probability & Inference
§ Supervised Learning
Applications
Applications: Language Assistants

[OpenAI]
Applications: Language Assistants
§ Step 1: train large language model to mimic human-written text
§ Build a model P(next word | all past words seen so far)
§ Hold a history of 1 million past words (4 thousand page book)
§ Model is a neural network with transformer architecture
§ Has around 10-500 billion connection parameters
§ Human brain has around 1000 trillion connections

§ Train to maximize probability (equiv. log-prob.) of next word in the dataset
§ Train on 10 trillion words
§ Human reads around 1-10 billion words in a lifetime
§ GPT3 took 12 days on 6 thousand processors
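
As a rough illustration of "maximize the log-probability of the next word", the sketch below swaps the transformer for a much simpler count-based bigram model, which conditions only on the single previous word rather than the full history. The toy corpus and the function names `train_bigram` and `avg_log_prob` are assumptions made for this example, not part of the lecture.

```python
import math
from collections import Counter, defaultdict

def train_bigram(words):
    """Count-based estimate of P(next word | previous word)."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    # Normalizing the counts gives the maximum-likelihood estimate.
    return {
        prev: {w: c / sum(nxts.values()) for w, c in nxts.items()}
        for prev, nxts in counts.items()
    }

def avg_log_prob(model, words):
    """Average log-probability the model assigns to each observed next word."""
    pairs = list(zip(words, words[1:]))
    return sum(math.log(model[p][n]) for p, n in pairs) / len(pairs)

# Toy corpus (made up for illustration).
corpus = "the cat sat on the mat the cat ran".split()
model = train_bigram(corpus)
print(model["the"])                 # "cat" gets probability 2/3, "mat" gets 1/3
print(avg_log_prob(model, corpus))  # the training objective (higher is better)
```

A neural language model optimizes the same quantity by gradient ascent over its parameters instead of by counting.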
Applications: Language Assistants
§ Step 1: train large language model to mimic human-written text
§ Query: “What is the population of Berkeley?”
§ Human-like completion: “This question always fascinated me!”

§ Step 2: fine-tune model to generate helpful text

§ Query: “What is the population of Berkeley?”
§ Helpful completion: “It is 117,145 as of the 2021 census”

§ Use Reinforcement Learning in Step 2

Applications: Language Assistants
§ MDP:
§ State: sequence of words seen so far (ex. “What is the population of Berkeley? ”)
§ 100,000^1,000 possible states
§ Huge, but can be processed with feature vectors or neural networks
§ Action: next word (ex. “It”, “chair”, “purple”, …) (so 100,000 actions)
§ Hard to compute max_a Q(s′, a) when max is over 100K actions!
§ Transition T: easy, just append action word to state words
§ s: “My name“ a: “is“ s’: “My name is“
§ Reward R: ???
§ Humans rate model completions (ex. “What is the population of Berkeley? ”)
§ “It is 117,145“: +1 “It is 5“: -1 “Destroy all humans“: -1
§ Learn a reward model approximating R and use that (model-based RL)
§ Often use policy gradient (Proximal Policy Optimization) but looking into Q Learning
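
The MDP above can be sketched in a few lines. This is only an illustration: the lookup table `RATINGS` stands in for a learned reward model, the default reward of 0 for unrated completions is an assumption, and the names `transition` and `reward` are hypothetical.

```python
import math

VOCAB_SIZE = 100_000   # from the slide: about 100,000 possible next words
MAX_LENGTH = 1_000     # from the slide's state count of 100,000^1,000

def transition(state, action):
    """T(s, a): deterministic, append the action word to the state words."""
    return state + (action,)

# Human ratings of completions, copied from the slide's examples.
RATINGS = {
    ("It", "is", "117,145"): +1,
    ("It", "is", "5"): -1,
    ("Destroy", "all", "humans"): -1,
}

def reward(completion):
    """Look up a rating; unrated completions get 0 (an assumption)."""
    return RATINGS.get(tuple(completion), 0)

s = ("My", "name")
s_next = transition(s, "is")
print(s_next)  # ('My', 'name', 'is')

# Size of the state space as a power of ten: 100,000^1,000 = 10^exponent.
exponent = MAX_LENGTH * round(math.log10(VOCAB_SIZE))
print(exponent)  # 5000
```

A table like this cannot score completions nobody has rated, which is why a reward model that generalizes from the ratings is learned instead.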
Applications: Robot Locomotion

[Extreme Parkour with Legged Robots, Cheng et al, 2023]

Applications: Robot Locomotion
§ MDP:
§ State: image of robot camera + N joint angles + accelerometer + …
§ Angles are N-dimensional continuous vector!
§ Processed with hand-designed feature vectors or neural networks
§ Action: N motor commands (continuous vector!)
§ Can’t easily compute max_a Q(s′, a) when a is continuous
§ Use policy search methods or adapt Q learning to continuous actions
§ Transition T: real world (don’t have access)
§ Reward R: hand-designed rewards
§ Stay upright, keep forward velocity, etc
§ Learning in the real world may be slow and unsafe
§ Build a simulator (model) and learn there first, then deploy in real world
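
A minimal sketch of two ideas from this slide: a hand-designed reward, and one simple way to adapt the max over actions to a continuous action space by sampling candidate actions. The reward weights, the target velocity, the action range [-1, 1], and the toy Q-function are all illustrative assumptions, not values from the lecture or the cited paper.

```python
import random

def reward(upright, forward_velocity, target_velocity=1.0):
    """Hand-designed reward: stay upright and keep a forward velocity."""
    upright_term = 1.0 if upright else -10.0
    velocity_penalty = abs(forward_velocity - target_velocity)
    return upright_term - velocity_penalty

def approx_best_action(q, state, n_joints, n_samples=1000, rng=None):
    """Approximate argmax_a Q(state, a) by scoring random continuous actions."""
    rng = rng or random.Random(0)
    best_action, best_value = None, float("-inf")
    for _ in range(n_samples):
        action = [rng.uniform(-1.0, 1.0) for _ in range(n_joints)]
        value = q(state, action)
        if value > best_value:
            best_action, best_value = action, value
    return best_action, best_value

def toy_q(state, action):
    """Hypothetical Q-function whose true maximum is at the all-zero action."""
    return -sum(x * x for x in action)

action, value = approx_best_action(toy_q, state=None, n_joints=2)
print(action, value)  # an action near [0, 0] with value slightly below 0
```

Sampling scales poorly as the number of joints grows, which is one reason policy search methods are often preferred for this kind of problem.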
Applications: Mathematics & Reasoning

[OpenAI o1, 2024] [AlphaProof, 2024]

Applications: Mathematics & Reasoning
Use Search (powered by a solver network) to generate proofs
Use Reinforcement Learning to improve solver network

[AlphaProof, 2024]
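
To make "search powered by a solver network" concrete, here is a toy best-first search in which a hand-written scoring function plays the role the learned network would play. This is not AlphaProof's method. The puzzle (reach a target number from 1 using the steps "+1" and "*2") and every name in the code are assumptions chosen for illustration.

```python
import heapq

def best_first_search(start, goal, successors, score):
    """Expand the lowest-scoring state first; return the steps that reach goal."""
    frontier = [(score(start), start, [])]
    seen = {start}
    while frontier:
        _, state, path = heapq.heappop(frontier)
        if state == goal:
            return path
        for step, nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (score(nxt), nxt, path + [step]))
    return None

TARGET = 10

def successors(n):
    """Two allowed steps; values above 2 * TARGET are pruned to keep search finite."""
    candidates = (("+1", n + 1), ("*2", n * 2))
    return [(name, m) for name, m in candidates if m <= 2 * TARGET]

def score(n):
    """Stand-in for a learned solver network: distance to the target."""
    return abs(TARGET - n)

steps = best_first_search(1, TARGET, successors, score)
print(steps)  # a sequence of "+1" and "*2" steps that turns 1 into 10
```

In the reinforcement learning loop described above, the outcomes of such searches would be used to improve the scoring function.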
Applications: Weather Prediction
Model weather state with a Markov Chain and learn the transition distribution

[Probabilistic weather forecasting with machine learning, 2024]
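
A minimal sketch of learning a transition distribution, assuming a tiny discrete weather state ("sun" or "rain") and a made-up observation history. Real forecasting models work with high-dimensional continuous states and learned neural transition models, so this only illustrates the Markov chain idea.

```python
from collections import Counter, defaultdict

def learn_transitions(sequence):
    """Estimate P(next state | current state) from observed consecutive pairs."""
    counts = defaultdict(Counter)
    for today, tomorrow in zip(sequence, sequence[1:]):
        counts[today][tomorrow] += 1
    return {
        s: {t: c / sum(nxt.values()) for t, c in nxt.items()}
        for s, nxt in counts.items()
    }

def step(dist, transitions):
    """Push a distribution over states one time step forward."""
    out = defaultdict(float)
    for s, p in dist.items():
        for t, q in transitions[s].items():
            out[t] += p * q
    return dict(out)

# Made-up observation history, for illustration only.
history = ["sun", "sun", "rain", "rain", "sun", "sun", "sun", "rain"]
T = learn_transitions(history)
print(T["sun"])                 # sun -> sun with 0.6, sun -> rain with 0.4
print(step({"sun": 1.0}, T))    # one-day-ahead forecast starting from sun
```

Calling `step` repeatedly rolls the forecast further into the future.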


Frontiers
Frontiers: Multimodal Models
We’re moving beyond text-only inputs to images, audio, etc
Images broken up into a sequence of “words”
Train to predict image captions
Images & words are understood in relation to each other
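
One common way to break an image into a sequence of "words" is to cut it into fixed-size patches and flatten each one. The sketch below assumes a grayscale image stored as a list of rows; the 4x4 image, the patch size, and the name `image_to_patches` are illustrative assumptions, and real models then map each patch to a vector with a learned projection.

```python
def image_to_patches(image, patch):
    """Split a 2D image (list of rows) into a row-major sequence of flat patches."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be a multiple of the patch size")
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            patches.append([
                image[r][c]
                for r in range(top, top + patch)
                for c in range(left, left + patch)
            ])
    return patches

# A 4x4 toy "image" whose pixel values are 0..15.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
tokens = image_to_patches(image, patch=2)
print(len(tokens))  # 4
print(tokens[0])    # [0, 1, 4, 5]
```

The resulting sequence can be fed to the same kind of sequence model that processes text tokens.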

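[Figure: an image followed by the caption words “... photo of stop sign”]

All data (text, images, audio, etc.) are understood in relation to each other

[Figure: words and images placed in relation to each other; labels: water, river, ocean, upward, airplane, traffic sign, head, heel, toe, stop, go]

Example queries mixing text and images:
If [image] was invented by Wright brothers. Who invented [image]?
What is the fastest-growing news source according to [image]?
What action should I take from [image] to accomplish “ “?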

Frontiers: Agents
We’re moving from prediction machines to agents driven by goals
Take actions to accomplish long-term tasks
Use tools & interact with the world (virtual and physical)

[Yahoo, 2024]

[Bloomberg, 2024]
Frontiers: Agents
Software Engineering

[SWE-Agent, Yang et al, 2024]

Frontiers: Agents
Software Engineering

Scientific Discovery

[ChemCrow, Bran et al, 2023]

Frontiers: Agents
Software Engineering

Scientific Discovery

Robotics

[SayCan, Ahn et al, 2022]

Frontiers: Video Models

[OpenAI Sora, 2024]

Frontiers: Video Models
Modeling video is not just useful for generation, but for understanding:

Language Modeling = understand the world from written experience

Video Modeling = understand the world from non-verbal experience?
Frontiers: Forecasting Progress

§ Language model Scaling Laws extrapolate:

§ If we [make model bigger / add more data / …]
§ What would accuracy become?

[Kaplan et al, 2020]
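
Scaling-law extrapolation usually amounts to fitting a power law to measurements from small models and reading off its value at a larger scale. The sketch below fits y = a * x^b by least squares in log-log space. The data points are synthetic (generated from loss = 10 * N^-0.1, a made-up curve rather than a published result), and `fit_power_law` is a hypothetical helper name.

```python
import math

def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**b in log-log space; returns (a, b)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(xs)
    mean_x, mean_y = sum(lx) / n, sum(ly) / n
    b = (sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
         / sum((u - mean_x) ** 2 for u in lx))
    a = math.exp(mean_y - b * mean_x)
    return a, b

# Synthetic "model size vs. loss" measurements (made up for illustration).
sizes = [1e6, 1e7, 1e8, 1e9]
losses = [10.0 * n ** -0.1 for n in sizes]

a, b = fit_power_law(sizes, losses)
predicted = a * 1e10 ** b   # extrapolate to a model 10x larger than any measured
print(a, b, predicted)
```

Such a fit only describes smooth trends, so it says nothing about capabilities that appear abruptly at some scale.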


§ But some capabilities emerge unexpectedly

[Brown et al, 2020]

What will be AI’s impact in the future?

§ You get to determine that!

§ As you apply AI

§ As researchers / developers

§ As policymakers

§ As informed public voices

Where to Go Next?
Where to go next?
§ Congratulations, you’ve seen the basics of modern AI
§ … and done some amazing work putting it to use!

§ How to continue:
§ Machine learning: cs189, cs182, stat154
§ Data Science: data 100, data 102
§ Data / Ethics: data c104
§ Probability: ee126, stat134
§ Optimization: ee127
§ Cognitive modeling: cog sci 131
§ Machine learning theory: cs281a/b
§ Computer vision: cs280
§ Reinforcement Learning: cs285
§ Robotics: cs287, cs287h
§ NLP: cs288
§ … and more; ask if you’re interested
Lightweight Opportunities to Keep Learning
§ Andrew Ng weekly newsletter:
The Batch: https://www.deeplearning.ai/thebatch/

§ Jack Clark (former Comms Director OpenAI) weekly newsletter:
Import AI: https://jack-clark.net/

§ Rachel Thomas AI Ethics course:
Course website: ethics.fast.ai

§ Pieter Abbeel podcast:
The Robot Brains Podcast: https://therobotbrains.ai
That’s It!

§ Help us out with some course evaluations

§ Good luck on the final!

§ Have a great winter break, and always maximize your expected utilities!
