
LLM Prompt Engineering & RLHF:
History & Techniques

Riley Goodside

A history of the techniques for prompting large language models like GPT-3: the model's origins, few-shot learning, zero-shot prompting, instruction tuning, and reinforcement learning from human feedback (RLHF) to align models.


Pre-trained GPT-3 Davinci
A journey into the past… No instruction prompting.


Origins of GPT-3
Diagram from Yao Fu (2022).



Pre-trained GPT-3 Davinci
GPT-3: a “few-shot learner”


Few-shot examples
A.k.a. “in-context learning”

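A minimal sketch of what such a prompt looks like, with an illustrative sentiment task (not from the deck): solved examples are prepended to the query, and the model continues the pattern.

```python
# Few-shot / in-context learning: prepend solved examples to the query.
# Task and labels are illustrative; a base model like GPT-3 Davinci
# completes the final, unanswered line by imitating the pattern.
examples = [
    ("I loved this movie!", "positive"),
    ("The plot made no sense.", "negative"),
    ("Best meal I've had all year.", "positive"),
]

def few_shot_prompt(query: str) -> str:
    shots = "\n".join(f"Review: {t}\nSentiment: {s}" for t, s in examples)
    return f"{shots}\nReview: {query}\nSentiment:"

print(few_shot_prompt("Two hours of my life I won't get back."))
# The base model tends to continue this text with " negative".
```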


Zero-shot prompt programming
Clever zero-shot prompts can beat few-shot examples.

https://arxiv.org/pdf/2102.07350.pdf
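An illustrative zero-shot "prompt program" in the spirit of Reynolds & McDonell (2021), linked above; the wording here is a sketch modeled on the paper's translation prompts, not quoted from it:

```python
# Zero-shot prompt programming: no examples, just a framing in which
# the most plausible continuation of the text is a correct answer.
def zero_shot_translation_prompt(french: str) -> str:
    return (
        "A French sentence is provided below. The masterful French "
        "translator flawlessly translates it into English:\n\n"
        f"French: {french}\n"
        "English:"
    )

print(zero_shot_translation_prompt("Je ne mange pas de fromage."))
```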


Zero-shot prompt programming
Prompting via “memetic proxy”.
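A sketch of the idea: instead of stating instructions, invoke a culturally familiar character or scenario whose expected behavior carries them implicitly. The professor framing below is illustrative, not from the deck.

```python
# Prompting via "memetic proxy": the persona implicitly specifies
# tone, patience, and accuracy without explicit instructions.
def memetic_proxy_prompt(question: str) -> str:
    return (
        "A student asks their patient, brilliant physics professor a "
        "question. The professor answers clearly and accurately.\n\n"
        f"Student: {question}\n"
        "Professor:"
    )

print(memetic_proxy_prompt("Why is the sky blue?"))
```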


GPT-3 Codex
New capabilities from code training.

https://arxiv.org/pdf/2107.03374.pdf
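A sketch of the kind of prompt this enables: a code-trained model can complete a function body from its signature and docstring alone. The function here is illustrative.

```python
# Codex-style prompt: everything up to and including the docstring is
# the prompt; the model writes the body.
codex_prompt = '''\
def is_palindrome(s: str) -> bool:
    """Return True if s reads the same forwards and backwards,
    ignoring case and non-alphanumeric characters."""
'''
# A code-trained model typically continues with something like:
#     cleaned = "".join(c.lower() for c in s if c.isalnum())
#     return cleaned == cleaned[::-1]
```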


Instruction tuning
Aligning models with user intent.

https://arxiv.org/pdf/2203.02155.pdf


RLHF
Reinforcement learning from human feedback further aligns models.

(Diagram from OpenAI ChatGPT announcement.)
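As a minimal sketch of the middle stage of that diagram (the reward model), assuming PyTorch: human preference rankings become a pairwise loss that scores preferred responses higher.

```python
import torch
import torch.nn.functional as F

# Reward-model step of RLHF: given scalar scores for a human-preferred
# response and a rejected one, minimize -log(sigmoid(r_chosen - r_rejected))
# so preferred responses are ranked higher. The scores here are
# illustrative stand-ins for a reward model's outputs.
r_chosen = torch.tensor([1.3, 0.2])
r_rejected = torch.tensor([0.4, 0.9])
loss = -F.logsigmoid(r_chosen - r_rejected).mean()
print(loss)  # this reward signal then drives RL fine-tuning (PPO)
```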




Zero-shot prose prompting
Sometimes you can “just ask”.
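Illustrative: the same task as the earlier few-shot sketch, posed directly with no examples at all.

```python
# Zero-shot prose prompt: just ask, no examples, no special framing.
prompt = (
    "Classify the sentiment of this movie review as positive or negative.\n\n"
    "Review: Two hours of my life I won't get back.\n"
    "Sentiment:"
)
```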


Prompting with the “format trick”
“Use this format:” is all you need.
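A sketch of the trick (the field names and email are illustrative): state "Use this format:" and give a template; the model fills in the placeholders.

```python
# The "format trick": the template pins down the output's structure.
prompt = """\
Extract details from the email below.

Use this format:

Sender: <name>
Request: <one-line summary>
Urgency: <low / medium / high>

Email:
Hi team, the demo server is down and the client call is in an hour.
Can someone restart it? - Dana
"""
```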


Specifying tasks using code prompts
Prompting through partial code.
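A sketch with an illustrative task: the "prompt" is an unfinished program whose natural continuation performs the task.

```python
# Prompting through partial code: the model completes the literal.
partial_code_prompt = '''\
# Map each US state mentioned in the text to its capital.
# Text: "We drove from Texas through Oklahoma and into Kansas."
capitals = {
'''
# A code-trained model tends to continue:
#     "Texas": "Austin",
#     "Oklahoma": "Oklahoma City",
#     "Kansas": "Topeka",
# }
```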


Specifying tasks using code prompts
Prompting with imaginary variables.
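A sketch of the idea (the variable and sentence are illustrative): reference a variable that is never defined, but whose name and comment specify the value the model must infer.

```python
# Prompting with imaginary variables: `reversed_words` is never
# assigned; asked for this program's output, the model must infer it.
imaginary_prompt = '''\
sentence = "The quick brown fox jumps over the lazy dog."
# `reversed_words` holds `sentence` with its word order reversed.
print(reversed_words)
'''
# Expected inferred output: "dog. lazy the over jumps fox brown quick The"
```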


“You are GPT-3”
Using an external interpreter to overcome model limitations in conversational Q&A.
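A minimal sketch of the pattern, assuming a placeholder completion function `ask_model` (not a real API): the prompt tells the model it may emit Python between markers, and a driver loop executes the code and feeds the output back.

```python
import contextlib
import io

SYSTEM = (
    "You are GPT-3. You are unreliable at arithmetic and lookups, but "
    "you can run Python. When computation is needed, reply with code "
    "between <run> and </run>; its output will be returned to you."
)

def run_python(code: str) -> str:
    # Capture stdout from the model-written snippet.
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, {})  # demo only: never exec untrusted code like this
    return buf.getvalue()

def answer(question: str, ask_model) -> str:
    transcript = f"{SYSTEM}\n\nQ: {question}\nA:"
    while True:
        reply = ask_model(transcript)
        if "<run>" not in reply:
            return reply  # final prose answer
        code = reply.split("<run>")[1].split("</run>")[0]
        transcript += reply + f"\nOutput: {run_python(code)}\n"
```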


A harder problem
Question: “What is the final character of the MD5 hash of the final digit of the release year of the Grimes album Visions?”


Chain-of-thought prompting
Figure 1 from Jason Wei et al. (2022).
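The essence of the figure, reproduced as a sketch (problems paraphrased from Wei et al.): the worked example includes its reasoning, so the model reasons before answering.

```python
# Few-shot chain-of-thought: the exemplar answer shows its work.
cot_prompt = """\
Q: Roger has 5 tennis balls. He buys 2 cans of 3 tennis balls each.
How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 balls is 6 balls.
5 + 6 = 11. The answer is 11.

Q: The cafeteria had 23 apples. They used 20 and bought 6 more.
How many apples do they have?
A:"""
# The model imitates the reasoning pattern: 23 - 20 = 3, 3 + 6 = 9.
```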


Zero-shot chain-of-thought
Figure 1 from Takeshi Kojima et al. (2022).
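The whole technique fits in one line, as in Kojima et al.: append a trigger phrase instead of worked examples. The question below is illustrative.

```python
# Zero-shot chain-of-thought: no exemplars, just the trigger phrase.
question = ("If a train leaves at 3:40 pm and the trip takes 2 hours "
            "and 35 minutes, when does it arrive?")
prompt = f"Q: {question}\nA: Let's think step by step."
# Kojima et al. then use a second prompt to extract the final answer
# from the generated reasoning (e.g. "Therefore, the answer is ...").
```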


Zero-shot chain-of-thought
Figure 2 from Takeshi Kojima et al. (2022).


Self-consistency and consensus
Figure 1 from Xuezhi Wang et al. (2022).
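A minimal sketch, assuming placeholder `sample_model` and `extract_answer` functions: sample several reasoning paths at nonzero temperature and take the majority answer.

```python
from collections import Counter

# Self-consistency: marginalize over sampled chains of thought by
# majority vote on the parsed final answers.
def self_consistent_answer(prompt, sample_model, extract_answer, n=10):
    answers = [extract_answer(sample_model(prompt)) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]
```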


“Visions” example
Question: “What is the final character of the MD5 hash of the final digit of the release year of the Grimes album Visions?”

First, we prompt to generate code…

…which produces the answer by prompting GPT-3 again in a loop.

We verify this answer using Python.
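The verification step is easy to reproduce. Visions was released in 2012, so the final digit is “2”:

```python
import hashlib

# Verify: final character of MD5("2"), "2" being the final digit of 2012.
digest = hashlib.md5(b"2").hexdigest()
print(digest)      # c81e728d9d4c2f636f067f89cc14862c
print(digest[-1])  # "c"
```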


RLHF & prompting
RLHF tuning reduces the need for pre-trained prompt engineering.

Many adversarial or misleading questions fail on pre-RLHF GPT-3.

(Examples from Douglas Hofstadter and David Bender in The Economist.)


RLHF & prompting
RLHF tuning reduces the need for pre-trained prompt engineering.

RLHF-tuned ChatGPT succeeds on many previously hard problems.

(Examples from Douglas Hofstadter and David Bender in The Economist.)


RLHF & prompting
RLHF tuning reduces the need for pre-trained prompt engineering.

Directed tool use and unprompted chain-of-thought are now incorporated into chat models like Bing Search.

(Screenshot from Thomas Rice.)


Closing remarks
A poem from the digital Bard.



Open discussion
Suggested topics:

- Retrieval augmentation
- Tuned “soft prompts”
- Self-evaluation
- Diverse prompt ensembles
- When to fine-tune?
- Open-source models
- Mode collapse in RLHF
- Constitutional AI / RLAIF

