
Generating your first text

Generative AI
Marlon S. Viñán Ludeña
Traditional Machine Learning
[Diagram: the traditional machine learning training workflow]
Training LLMs
1. Language modeling: The first step, called pretraining, takes
the majority of the computation and training time. An LLM is
trained on a vast corpus of internet text, allowing the model to
learn grammar, context, and language patterns. The resulting
model is often referred to as a foundation model or base
model. These models generally do not follow instructions.
2. Fine-tuning: The second step, fine-tuning (sometimes called
post-training), takes the previously trained model and trains it
further on a narrower task.
Large Language Models Applications
1. Detecting whether a review left by a customer is positive or negative: This is (supervised)
classification and can be handled with both encoder- and decoder-only models, either with pretrained
models or by fine-tuning them.
2. Developing a system for finding common topics in ticket issues: This is (unsupervised)
classification, for which we have no predefined labels. We can leverage encoder-only models to perform
the classification itself and decoder-only models for labeling the topics.
3. Building a system for retrieval and inspection of relevant documents: A major component of
language model systems is their ability to draw on external sources of information. Using semantic search,
we can build systems that allow us to easily access and find information for an LLM to use.
4. Constructing an LLM chatbot that can leverage external resources, such as tools and
documents: This is a combination of techniques that demonstrates how the true power of LLMs can be
found through additional components and methods such as prompt engineering, retrieval-augmented
generation, and fine-tuning.
5. Constructing an LLM capable of writing recipes based on a picture showing the products in
your fridge: This is a multimodal task, where the LLM takes in an image and reasons about what it sees.
Responsible LLM development and Usage
1. Bias and fairness: LLMs are trained on large amounts of data that might contain
biases.
2. Transparency and accountability: Because of LLMs’ incredible capabilities, it is not
always clear whether you are talking with a human or an LLM. As such, using LLMs to
interact with humans can have unintended consequences when there is no human in
the loop.
3. Generating harmful content: LLMs can be used to generate fake news, articles,
and other misleading sources of information.
4. Intellectual property: When the output is similar to a phrase in the training data,
does the intellectual property belong to the author of that phrase? Without access to
the training data, it remains unclear when copyrighted material is being used by the
LLM.
5. Regulation: for example, the European AI Act.
Proprietary, Private Models
Open Models
Open source frameworks
● llama.cpp
● LangChain
● Hugging Face Transformers
● LlamaIndex
Generating Your First Text
Model: Phi-3 mini

Hardware: runs on less than 8 GB of VRAM

License: MIT

When you use an LLM, two models are loaded:

1. The generative model itself
2. Its underlying tokenizer
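
As a minimal sketch of what this looks like with Hugging Face Transformers (assuming the microsoft/Phi-3-mini-4k-instruct checkpoint and the accelerate package for device placement):

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "microsoft/Phi-3-mini-4k-instruct"

# 1. The generative model itself
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",   # place the weights on GPU/CPU automatically
    torch_dtype="auto",  # reuse the dtype stored in the checkpoint
)

# 2. Its underlying tokenizer, which converts text to token IDs and back
tokenizer = AutoTokenizer.from_pretrained(model_name)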
Notes: Transformers.pipeline

return_full_text: By setting this to False, only the model's output is
returned, not the prompt.

max_new_tokens: The maximum number of tokens the model will generate.
By setting a limit, we prevent long and unwieldy output, as some models
might otherwise continue generating until they reach their context window.

do_sample: Whether the model uses a sampling strategy to choose the next
token. By setting this to False, the model will always select the most
probable next token (greedy decoding).
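
Put together, these parameters might be passed to a text-generation pipeline as in the sketch below, which assumes the model and tokenizer loaded earlier; recent versions of transformers also accept chat-style message lists like this and apply the model's chat template automatically:

from transformers import pipeline

# Build a text-generation pipeline from the model and tokenizer above.
generator = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    return_full_text=False,  # return only the generated text, not the prompt
    max_new_tokens=500,      # cap how many tokens may be generated
    do_sample=False,         # greedy decoding: always pick the most probable token
)

messages = [{"role": "user", "content": "Create a funny joke about chickens."}]
output = generator(messages)
print(output[0]["generated_text"])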
Challenge
Decoding the Language: Exploring Prompts and Parameters in LLMs

Learning Objectives:
● Students will understand the basic functionality of a pre-trained Large
Language Model (LLM).
● Students will explore how different prompts affect LLM outputs.
Decoding the Language: Exploring Prompts and Parameters in LLMs

Materials:
● Access to the provided Google Colab notebook:
https://colab.research.google.com/drive/1toOYoeLnFAaW0OWZ-bt4jixKzbrjlqg-?usp=sharing
● Internet connection.
● A text editor or document for recording observations.
Decoding the Language: Exploring Prompts and Parameters in LLMs
Phase 1: Prompt Engineering Exploration

1. Introduction and Exploration:


○ Students begin by running the provided Colab notebook, ensuring they understand the basic code structure.
○ They should focus on the section where prompts are defined and the LLM response is generated.
○ Students are asked to run the default prompt and observe the output.
2. Prompt Modification:
○ Students are tasked with modifying the provided prompt in three distinct ways:
■ Specificity: Make the prompt more specific (e.g., instead of "Tell me a story," try "Tell me a short
science fiction story about a robot on Mars.").
■ Role-Playing: Instruct the LLM to adopt a specific persona (e.g., "Act as a Shakespearean playwright
and write a short monologue.").
■ Constraint: Add constraints to the output (e.g., "Write a poem that is exactly four lines long.").
○ For each modification, students should (a runnable sketch of the three variants follows this section):
■ Record the modified prompt.
■ Run the code and record the LLM's output.
■ Document their observations: How did the output change? Did the LLM adhere to the modifications?
3. Analysis:
○ Students should write a brief reflection on the impact of prompt engineering. What makes a "good" prompt?
How can prompts be used to guide the LLM's behavior?
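
As a starting point for recording outputs, the three prompt variants could be run in one loop, reusing the generator pipeline defined earlier; the prompts are the illustrative examples from the instructions above, not required wording:

# Illustrative Phase 1 prompt variants; substitute your own modifications.
prompts = {
    "specificity": "Tell me a short science fiction story about a robot on Mars.",
    "role-playing": "Act as a Shakespearean playwright and write a short monologue.",
    "constraint": "Write a poem that is exactly four lines long.",
}

for label, prompt in prompts.items():
    output = generator([{"role": "user", "content": prompt}])
    print(f"--- {label} ---")
    print(output[0]["generated_text"])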
Decoding the Language: Exploring Prompts and Parameters in LLMs

Phase 2: Creative Experimentation


1. Design a prompt that generates a short story of at least 100 words.
2. Modify the prompt to control the story’s tone (e.g., make it humorous,
dramatic, or scientific).
3. Change the configuration parameters to investigate their influence on the
output. Setting do_sample = True activates sampling-based decoding
strategies, which choose the next token from the probability distribution
over the entire vocabulary (see the sketch after this list).
4. Compare the responses and describe how prompt wording affects the
model’s output.
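
A minimal sketch of such a parameter experiment, reusing the generator pipeline from earlier; the temperature and top_p values are illustrative choices, not values prescribed by the lab:

# Generation kwargs passed at call time are forwarded to model.generate()
# and override the defaults the pipeline was built with.
output = generator(
    [{"role": "user", "content": "Tell me a humorous story of at least 100 words."}],
    do_sample=True,   # sample instead of always taking the most probable token
    temperature=0.9,  # higher values flatten the distribution (more randomness)
    top_p=0.95,       # nucleus sampling: keep only the smallest token set whose
                      # cumulative probability exceeds 0.95
)
print(output[0]["generated_text"])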
Decoding the Language: Exploring Prompts and Parameters in LLMs

Deliverable:
At the end of the lab, submit a brief report (max 2 pages) including:
● Answers to all questions.
● Screenshots of key results from the Colab notebook.
● Reflections on what you learned about LLMs from the experiments.

Due date: March 19, until 11:59 p.m.
