Class 1: Generating Your First Text
Generative AI
Marlon S. Viñán Ludeña
Traditional Machine Learning
Training LLMs
1. Language modeling: The first step, called pretraining, takes up the majority of the computation and training time. An LLM is trained on a vast corpus of internet text, allowing the model to learn grammar, context, and language patterns (a minimal sketch of this next-token objective follows the list). The resulting model is often referred to as a foundation model or base model. These models generally do not follow instructions.
2. Fine-tuning: The second step, fine-tuning (sometimes called post-training), takes the previously trained base model and trains it further on a narrower task, such as following instructions.
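To make step 1 concrete, below is a minimal sketch of the next-token prediction objective that pretraining optimizes, using Hugging Face Transformers. The gpt2 checkpoint is only illustrative (any causal LM would do), and real pretraining applies this loss over a vast corpus rather than a single sentence.

from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative checkpoint only; pretraining runs this objective at a much larger scale.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

text = "Large language models learn grammar, context, and language patterns."
inputs = tokenizer(text, return_tensors="pt")

# Passing the input ids as labels makes the model compute the cross-entropy
# loss of predicting each token from the tokens that precede it.
outputs = model(**inputs, labels=inputs["input_ids"])
print(outputs.loss)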
Large Language Models Applications
1. Detecting whether a review left by a customer is positive or negative: This is (supervised)
classification and can be handled with both encoder-only and decoder-only models, either with pretrained
models or by fine-tuning them (a short example follows this list).
2. Developing a system for finding common topics in ticket issues: This is (unsupervised)
classification for which we have no predefined labels. We can leverage encoder-only models to perform
the classification itself and decoder-only models for labeling the topics.
3. Building a system for retrieval and inspection of relevant documents: A major component of
language model systems is their ability to draw on external sources of information. Using semantic search,
we can build systems that allow us to easily access and find information for an LLM to use.
4. Constructing an LLM chatbot that can leverage external resources, such as tools and
documents: This is a combination of techniques that demonstrates how the true power of LLMs can be
found through additional components (methods such as prompt engineering, retrieval-augmented
generation, and fine-tuning an LLM).
5. Constructing an LLM capable of writing recipes based on a picture showing the products in
your fridge: This is a multimodal task where the LLM takes in an image and reasons about what it sees.
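As a concrete illustration of application 1, here is a hedged sketch of supervised sentiment classification with an encoder-only model through the Hugging Face pipeline API. The checkpoint name is illustrative; any model fine-tuned for sentiment would work.

from transformers import pipeline

# Encoder-only model fine-tuned for sentiment; the checkpoint is illustrative.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

review = "The product arrived late and the packaging was damaged."
print(classifier(review))
# Expected output shape: [{'label': 'NEGATIVE', 'score': ...}]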
Responsible LLM development and Usage
1. Bias and fairness: LLMs are trained on large amounts of data that might contain
biases.
2. Transparency and accountability: Due to LLMs' incredible capabilities, it is not
always clear whether you are talking with a human or an LLM. As such, the use of
LLMs when interacting with humans can have unintended consequences when there
is no human in the loop.
3. Generating harmful content: LLMs can be used to generate fake news, fake articles,
and other misleading sources of information.
4. Intellectual property: When the output is similar to a phrase in the training data,
does the intellectual property belong to the author of that phrase? Without access to
the training data it remains unclear when copyrighted material is being used by the
LLM.
5. Regulation: for example, the European AI Act.
Proprietary, Private Models
Open Models
Open source frameworks
● llama.cpp
● LangChain
● Hugging Face Transformers
● LlamaIndex
Generating Your First Text
Model: Phi-3 mini
do_sample: Whether the model uses a sampling strategy to choose the next
token. By setting this to False, the model will always select the most
probable next token (greedy decoding).
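Below is a minimal sketch of generating your first text with Phi-3 mini through Hugging Face Transformers. It assumes the microsoft/Phi-3-mini-4k-instruct checkpoint, a recent transformers release, and enough memory (ideally a GPU); adjust to your setup.

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_name = "microsoft/Phi-3-mini-4k-instruct"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",   # place the model on a GPU if one is available
    torch_dtype="auto",
)

generator = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    return_full_text=False,  # return only the newly generated text
    max_new_tokens=100,
    do_sample=False,         # greedy decoding: always pick the most probable next token
)

messages = [{"role": "user", "content": "Write a short poem about machine learning."}]
print(generator(messages)[0]["generated_text"])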
Challenge
Decoding the Language: Exploring Prompts and Parameters in LLMs
Learning Objectives:
● Students will understand the basic functionality of a pre-trained Large
Language Model (LLM).
● Students will explore how different prompts affect LLM outputs.
Decoding the Language: Exploring Prompts and Parameters in LLMs
Materials:
● Access to the provided Google Colab notebook:
https://colab.research.google.com/drive/1toOYoeLnFAaW0OWZ-bt4jixKzbrjlqg-?usp=sharing
● Internet connection.
● A text editor or document for recording observations.
Decoding the Language: Exploring Prompts and Parameters in LLMs
Phase 1: Prompt Engineering Exploration
Deliverable:
At the end of the lab, submit a brief report (max 2 pages) including:
● Answers to all questions.
● Screenshots of key results from the Colab notebook.
● Reflections on what you learned about LLMs from the experiments.