
LLM Prompt Engineering & RLHF:
History & Techniques

Riley Goodside

A history of the techniques for prompting large language models like GPT-3: the model's origins, few-shot learning, zero-shot prompting, instruction tuning, and reinforcement learning from human feedback (RLHF) to align models.


Pre-trained GPT-3 Davinci
A journey into the past… No instruction prompting.


Origins of GPT-3
Diagram from Yao Fu (2022).



Pre-trained GPT-3 Davinci
GPT-3: a “few-shot learner”


Few-shot examples
A.k.a. “in-context learning”

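A minimal sketch of what such a prompt looks like, with an illustrative sentiment task (not from the deck): solved examples are prepended to the query, and the model continues the pattern.

```python
# Few-shot / in-context learning: prepend solved examples to the query.
# Task and labels are illustrative; a base model like GPT-3 Davinci
# completes the final, unanswered line by imitating the pattern.
examples = [
    ("I loved this movie!", "positive"),
    ("The plot made no sense.", "negative"),
    ("Best meal I've had all year.", "positive"),
]

def few_shot_prompt(query: str) -> str:
    shots = "\n".join(f"Review: {t}\nSentiment: {s}" for t, s in examples)
    return f"{shots}\nReview: {query}\nSentiment:"

print(few_shot_prompt("Two hours of my life I won't get back."))
# The base model tends to continue this text with " negative".
```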


Zero-shot prompt programming
Clever zero-shot prompts can beat few-shot examples.

https://arxiv.org/pdf/2102.07350.pdf
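An illustrative zero-shot "prompt program" in the spirit of Reynolds & McDonell (2021), linked above; the wording here is a sketch modeled on the paper's translation prompts, not quoted from it:

```python
# Zero-shot prompt programming: no examples, just a framing in which
# the most plausible continuation of the text is a correct answer.
def zero_shot_translation_prompt(french: str) -> str:
    return (
        "A French sentence is provided below. The masterful French "
        "translator flawlessly translates it into English:\n\n"
        f"French: {french}\n"
        "English:"
    )

print(zero_shot_translation_prompt("Je ne mange pas de fromage."))
```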


Zero-shot prompt programming
Prompting via “memetic proxy”.
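A sketch of the idea: instead of stating instructions, invoke a culturally familiar character or scenario whose expected behavior carries them implicitly. The professor framing below is illustrative, not from the deck.

```python
# Prompting via "memetic proxy": the persona implicitly specifies
# tone, patience, and accuracy without explicit instructions.
def memetic_proxy_prompt(question: str) -> str:
    return (
        "A student asks their patient, brilliant physics professor a "
        "question. The professor answers clearly and accurately.\n\n"
        f"Student: {question}\n"
        "Professor:"
    )

print(memetic_proxy_prompt("Why is the sky blue?"))
```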


GPT-3 Codex
New capabilities from code training.

https://arxiv.org/pdf/2107.03374.pdf
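A sketch of the kind of prompt this enables: a code-trained model can complete a function body from its signature and docstring alone. The function here is illustrative.

```python
# Codex-style prompt: everything up to and including the docstring is
# the prompt; the model writes the body.
codex_prompt = '''\
def is_palindrome(s: str) -> bool:
    """Return True if s reads the same forwards and backwards,
    ignoring case and non-alphanumeric characters."""
'''
# A code-trained model typically continues with something like:
#     cleaned = "".join(c.lower() for c in s if c.isalnum())
#     return cleaned == cleaned[::-1]
```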


Instruction tuning
Aligning models with user intent.

https://arxiv.org/pdf/2203.02155.pdf


RLHF
Reinforcement learning from human feedback further aligns models.

(Diagram from OpenAI ChatGPT announcement.)
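As a minimal sketch of the middle stage of that diagram (the reward model), assuming PyTorch: human preference rankings become a pairwise loss that scores preferred responses higher.

```python
import torch
import torch.nn.functional as F

# Reward-model step of RLHF: given scalar scores for a human-preferred
# response and a rejected one, minimize -log(sigmoid(r_chosen - r_rejected))
# so preferred responses are ranked higher. The scores here are
# illustrative stand-ins for a reward model's outputs.
r_chosen = torch.tensor([1.3, 0.2])
r_rejected = torch.tensor([0.4, 0.9])
loss = -F.logsigmoid(r_chosen - r_rejected).mean()
print(loss)  # this reward signal then drives RL fine-tuning (PPO)
```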




Zero-shot prose prompting
Sometimes you can “just ask”.
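Illustrative: the same task as the earlier few-shot sketch, posed directly with no examples at all.

```python
# Zero-shot prose prompt: just ask, no examples, no special framing.
prompt = (
    "Classify the sentiment of this movie review as positive or negative.\n\n"
    "Review: Two hours of my life I won't get back.\n"
    "Sentiment:"
)
```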


Prompting with the “format trick”
“Use this format:” is all you need.
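A sketch of the trick (the field names and email are illustrative): state "Use this format:" and give a template; the model fills in the placeholders.

```python
# The "format trick": the template pins down the output's structure.
prompt = """\
Extract details from the email below.

Use this format:

Sender: <name>
Request: <one-line summary>
Urgency: <low / medium / high>

Email:
Hi team, the demo server is down and the client call is in an hour.
Can someone restart it? - Dana
"""
```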


Specifying tasks using code prompts
Prompting through partial code.
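A sketch with an illustrative task: the "prompt" is an unfinished program whose natural continuation performs the task.

```python
# Prompting through partial code: the model completes the literal.
partial_code_prompt = '''\
# Map each US state mentioned in the text to its capital.
# Text: "We drove from Texas through Oklahoma and into Kansas."
capitals = {
'''
# A code-trained model tends to continue:
#     "Texas": "Austin",
#     "Oklahoma": "Oklahoma City",
#     "Kansas": "Topeka",
# }
```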


Specifying tasks using code prompts
Prompting with imaginary variables.
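A sketch of the idea (the variable and sentence are illustrative): reference a variable that is never defined, but whose name and comment specify the value the model must infer.

```python
# Prompting with imaginary variables: `reversed_words` is never
# assigned; asked for this program's output, the model must infer it.
imaginary_prompt = '''\
sentence = "The quick brown fox jumps over the lazy dog."
# `reversed_words` holds `sentence` with its word order reversed.
print(reversed_words)
'''
# Expected inferred output: "dog. lazy the over jumps fox brown quick The"
```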


“You are GPT-3”
Using an external interpreter to overcome model limitations in conversational Q&A.
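A minimal sketch of the pattern, assuming a placeholder completion function `ask_model` (not a real API): the prompt tells the model it may emit Python between markers, and a driver loop executes the code and feeds the output back.

```python
import contextlib
import io

SYSTEM = (
    "You are GPT-3. You are unreliable at arithmetic and lookups, but "
    "you can run Python. When computation is needed, reply with code "
    "between <run> and </run>; its output will be returned to you."
)

def run_python(code: str) -> str:
    # Capture stdout from the model-written snippet.
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, {})  # demo only: never exec untrusted code like this
    return buf.getvalue()

def answer(question: str, ask_model) -> str:
    transcript = f"{SYSTEM}\n\nQ: {question}\nA:"
    while True:
        reply = ask_model(transcript)
        if "<run>" not in reply:
            return reply  # final prose answer
        code = reply.split("<run>")[1].split("</run>")[0]
        transcript += reply + f"\nOutput: {run_python(code)}\n"
```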


A harder problem
Question: “What is the final character of the MD5 hash of the final digit of the release year of the Grimes album Visions?”


Chain-of-thought prompting
Figure 1 from Jason Wei et al. (2022).
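The essence of the figure, reproduced as a sketch (problems paraphrased from Wei et al.): the worked example includes its reasoning, so the model reasons before answering.

```python
# Few-shot chain-of-thought: the exemplar answer shows its work.
cot_prompt = """\
Q: Roger has 5 tennis balls. He buys 2 cans of 3 tennis balls each.
How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 balls is 6 balls.
5 + 6 = 11. The answer is 11.

Q: The cafeteria had 23 apples. They used 20 and bought 6 more.
How many apples do they have?
A:"""
# The model imitates the reasoning pattern: 23 - 20 = 3, 3 + 6 = 9.
```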


Zero-shot chain-of-thought
Figure 1 from Takeshi Kojima et al. (2022).
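The whole technique fits in one line, as in Kojima et al.: append a trigger phrase instead of worked examples. The question below is illustrative.

```python
# Zero-shot chain-of-thought: no exemplars, just the trigger phrase.
question = ("If a train leaves at 3:40 pm and the trip takes 2 hours "
            "and 35 minutes, when does it arrive?")
prompt = f"Q: {question}\nA: Let's think step by step."
# Kojima et al. then use a second prompt to extract the final answer
# from the generated reasoning (e.g. "Therefore, the answer is ...").
```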


Zero-shot chain-of-thought
Figure 2 from Takeshi Kojima et al. (2022).


Self-consistency and consensus
Figure 1 from Xuezhi Wang et al. (2022).
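A minimal sketch, assuming placeholder `sample_model` and `extract_answer` functions: sample several reasoning paths at nonzero temperature and take the majority answer.

```python
from collections import Counter

# Self-consistency: marginalize over sampled chains of thought by
# majority vote on the parsed final answers.
def self_consistent_answer(prompt, sample_model, extract_answer, n=10):
    answers = [extract_answer(sample_model(prompt)) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]
```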


“Visions” example
Question: “What is the final character of the MD5 hash of the final digit of the release year of the Grimes album Visions?”

First, we prompt to generate code…

…which produces the answer by prompting GPT-3 again in a loop.

We verify this answer using Python.
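The verification step is easy to reproduce. Visions was released in 2012, so the final digit is “2”:

```python
import hashlib

# Verify: final character of MD5("2"), "2" being the final digit of 2012.
digest = hashlib.md5(b"2").hexdigest()
print(digest)      # c81e728d9d4c2f636f067f89cc14862c
print(digest[-1])  # "c"
```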


RLHF & prompting
RLHF tuning reduces the need for pre-trained prompt engineering.

Many adversarial or misleading questions fail on pre-RLHF GPT-3.

(Examples from Douglas Hofstadter and David Bender in The Economist.)


RLHF & prompting
RLHF tuning reduces the need for pre-trained prompt engineering.

RLHF-tuned ChatGPT succeeds on many previously hard problems.

(Examples from Douglas Hofstadter and David Bender in The Economist.)


RLHF & prompting
RLHF tuning reduces the need for pre-trained prompt engineering.

Directed tool use and unprompted chain-of-thought are now incorporated into chat models like Bing Search.

(Screenshot from Thomas Rice.)


Closing remarks
A poem from the digital Bard.



Open discussion
Suggested topics:

- Retrieval augmentation
- Tuned “soft prompts”
- Self-evaluation
- Diverse prompt ensembles
- When to fine-tune?
- Open-source models
- Mode collapse in RLHF
- Constitutional AI / RLAIF

