LLM Prompt Engineering & RLHF - History and Techniques
https://arxiv.org/pdf/2102.07350.pdf
https://arxiv.org/pdf/2107.03374.pdf
https://arxiv.org/pdf/2203.02155.pdf
Pre-RLHF GPT-3 fails on many adversarial/misleading questions
- Retrieval augmentation
- Tuned “soft prompts”
- Self-evaluation
- Diverse prompt ensembles
- When to fine-tune?
- Open-source models
- Mode collapse in RLHF
- Constitutional AI / RLAIF