LLM Fine Tune
Positional encoding

Pivotal in transformers and sequence-to-sequence models, conveying critical information regarding the positions or sequencing of elements within a given sequence.

class PositionalEncoding(nn.Module):
    """
    https://pytorch.org/tutorials/beginner/transformer_tutorial.html
    """
    def __init__(self, d_model, vocab_size=5000, dropout=0.1):
        super().__init__()
        self.dropout = nn.Dropout(p=dropout)
        pe = torch.zeros(vocab_size, d_model)
        position = torch.arange(0, vocab_size, dtype=torch.float).unsqueeze(1)
        div_term = torch.exp(
            torch.arange(0, d_model, 2).float()
            * (-math.log(10000.0) / d_model)
        )
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        pe = pe.unsqueeze(0)
        self.register_buffer("pe", pe)

    def forward(self, x):
        x = x + self.pe[:, : x.size(1), :]
        return self.dropout(x)
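A minimal usage sketch (assuming torch, math, and torch.nn are imported; the embedding table and batch shapes below are illustrative): embed a batch of token IDs and add positional information.

import math
import torch
import torch.nn as nn

d_model = 100
embedding = nn.Embedding(5000, d_model)        # token embedding table
pos_encoding = PositionalEncoding(d_model=d_model)

tokens = torch.randint(0, 5000, (32, 20))      # (batch_size, seq_len) of token IDs
x = embedding(tokens)                          # (32, 20, 100) embedded tokens
x = pos_encoding(x)                            # same shape, with positions added and dropout applied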
GloVe embeddings

An unsupervised learning algorithm to obtain vector representations for words. The GloVe model is trained on aggregated global word-to-word co-occurrence statistics from a corpus, and the resulting representations show linear substructures of the word vector space.

class GloVe_override(Vectors):
    url = {
        "6B": "https://cf-courses-data.s3.us.cloud-object-storage.appd",  # URL truncated in the source
    }
    def __init__(self, name="6B", dim=100, **kwargs) -> None:
        url = self.url[name]
        name = "glove.{}.{}d.txt".format(name, str(dim))
        #name = "glove.{}/glove.{}.{}d.txt".format(name, name, str(dim))
        super(GloVe_override, self).__init__(name, url=url, **kwargs)

class GloVe_override2(Vectors):
    url = {
        "6B": "https://cf-courses-data.s3.us.cloud-object-storage.appd",  # URL truncated in the source
    }
    def __init__(self, name="6B", dim=100, **kwargs) -> None:
        url = self.url[name]
        #name = "glove.{}.{}d.txt".format(name, str(dim))
        name = "glove.{}/glove.{}.{}d.txt".format(name, name, str(dim))
        super(GloVe_override2, self).__init__(name, url=url, **kwargs)

try:
    glove_embedding = GloVe_override(name="6B", dim=100)
except:
    try:
        glove_embedding = GloVe_override2(name="6B", dim=100)
    except:
        glove_embedding = GloVe(name="6B", dim=100)
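A small usage sketch (assuming torchtext is installed and the vectors have been downloaded): look up 100-dimensional vectors for individual tokens and compare two words with cosine similarity.

import torch
import torch.nn.functional as F

# Fetch the 100-dimensional vectors for two tokens
king = glove_embedding.get_vecs_by_tokens(["king"])[0]
queen = glove_embedding.get_vecs_by_tokens(["queen"])[0]

# Cosine similarity between the two word vectors
similarity = F.cosine_similarity(king.unsqueeze(0), queen.unsqueeze(0))
print(similarity.item())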
Convert the data set objects to data loaders

Used in PyTorch-based projects. It includes creating data set objects, specifying data-loading parameters, and converting these data sets into data loaders.

BATCH_SIZE = 32

train_dataloader = DataLoader(
    split_train_, batch_size=BATCH_SIZE, shuffle=True, collate_fn=collate_batch
)
valid_dataloader = DataLoader(
    split_valid_, batch_size=BATCH_SIZE, shuffle=True, collate_fn=collate_batch
)
test_dataloader = DataLoader(
    test_dataset, batch_size=BATCH_SIZE, shuffle=True, collate_fn=collate_batch
)
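The collate_fn name is cut off in the source; a hypothetical collate_batch compatible with these loaders might look like the sketch below (it assumes the text_pipeline, label_pipeline, and device objects used elsewhere in this sheet).

import torch
from torch.nn.utils.rnn import pad_sequence

def collate_batch(batch):
    # Convert raw (label, text) pairs into padded tensors on the target device
    label_list, text_list = [], []
    for _label, _text in batch:
        label_list.append(label_pipeline(_label))
        text_list.append(torch.tensor(text_pipeline(_text), dtype=torch.int64))
    label_list = torch.tensor(label_list, dtype=torch.int64)
    text_list = pad_sequence(text_list, batch_first=True)
    return label_list.to(device), text_list.to(device)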
Training function

Helps train the model by iteratively updating its parameters to minimize the loss function, improving the model's performance on a given task.

def train_model(model, optimizer, criterion, train_dataloader, valid_dataloader, epochs=1000, save_dir="", file_name=None):
    cum_loss_list = []
    acc_epoch = []
    acc_old = 0
    model_path = os.path.join(save_dir, file_name)
    acc_dir = os.path.join(save_dir, os.path.splitext(file_name)[0] + "_acc")    # suffix truncated in the source
    loss_dir = os.path.join(save_dir, os.path.splitext(file_name)[0] + "_loss")  # suffix truncated in the source
    time_start = time.time()
    for epoch in tqdm(range(1, epochs + 1)):
        model.train()
        cum_loss = 0
        for idx, (label, text) in enumerate(train_dataloader):
            optimizer.zero_grad()
            label, text = label.to(device), text.to(device)
            predicted_label = model(text)
            loss = criterion(predicted_label, label)
            loss.backward()
            torch.nn.utils.clip_grad_norm_(model.parameters(), 0.1)
            optimizer.step()
            cum_loss += loss.item()
        print(f"Epoch {epoch}/{epochs} - Loss: {cum_loss}")
        cum_loss_list.append(cum_loss)
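The function is cut off here by a page break in the source; a hedged sketch of how the remainder might evaluate on the validation loader and checkpoint the best model (none of this continuation is from the source):

        # --- continuation sketch, not from the source ---
        model.eval()
        correct, total = 0, 0
        with torch.no_grad():
            for label, text in valid_dataloader:
                label, text = label.to(device), text.to(device)
                predicted = model(text).argmax(dim=1)
                correct += (predicted == label).sum().item()
                total += label.size(0)
        acc = correct / total
        acc_epoch.append(acc)
        if acc > acc_old:
            acc_old = acc
            torch.save(model.state_dict(), model_path)   # keep the best checkpoint
    print(f"Training time: {time.time() - time_start:.1f}s")
    return cum_loss_list, acc_epoch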
Fine-tune a model on the AG News data set

Fine-tuning a pretrained model on the AG News data set categorizes news articles into one of four categories: Sports, Business, Sci/Tech, or World. Start by training a model from scratch on the AG News data set. If you want to train the model for 2 epochs on a smaller data set to demonstrate what the training process would look like, uncomment the part that says ### Uncomment to Train ### before running the cell. Training for 2 epochs on the reduced data set can take approximately 3 minutes.

train_iter_ag_news = AG_NEWS(split="train")
num_class_ag_news = len(set([label for (label, text) in train_iter_ag_news]))
num_class_ag_news

# Split the dataset into training and testing iterators.
train_iter_ag_news, test_iter_ag_news = AG_NEWS()

# Convert the training and testing iterators to map-style datasets.
train_dataset_ag_news = to_map_style_dataset(train_iter_ag_news)
test_dataset_ag_news = to_map_style_dataset(test_iter_ag_news)

# Determine the number of samples to be used for training and validation.
num_train_ag_news = int(len(train_dataset_ag_news) * 0.95)

# Randomly split the training dataset into training and validation datasets.
# The training dataset will contain 95% of the samples, and the validation set the remaining 5%.
split_train_ag_news_, split_valid_ag_news_ = random_split(
    train_dataset_ag_news, [num_train_ag_news, len(train_dataset_ag_news) - num_train_ag_news]
)

# Make the training set smaller to allow it to run fast as an example.
# IF YOU WANT TO TRAIN ON THE AG_NEWS DATASET, COMMENT OUT THE 2 LINES BELOW.
# HOWEVER, NOTE THAT TRAINING WILL TAKE A LONG TIME
num_train_ag_news = int(len(train_dataset_ag_news) * 0.05)
split_train_ag_news_, _ = random_split(
    split_train_ag_news_, [num_train_ag_news, len(split_train_ag_news_) - num_train_ag_news]
)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
device

def label_pipeline(x):
    return int(x) - 1

from torch.nn.utils.rnn import pad_sequence

def collate_batch_ag_news(batch):
    label_list, text_list = [], []
    for _label, _text in batch:
        label_list.append(label_pipeline(_label))
        text_list.append(torch.tensor(text_pipeline(_text), dtype=torch.int64))
    label_list = torch.tensor(label_list, dtype=torch.int64)
    text_list = pad_sequence(text_list, batch_first=True)
    return label_list.to(device), text_list.to(device)

BATCH_SIZE = 32

train_dataloader_ag_news = DataLoader(
    split_train_ag_news_, batch_size=BATCH_SIZE, shuffle=True, collate_fn=collate_batch_ag_news
)
valid_dataloader_ag_news = DataLoader(
    split_valid_ag_news_, batch_size=BATCH_SIZE, shuffle=True, collate_fn=collate_batch_ag_news
)
test_dataloader_ag_news = DataLoader(
    test_dataset_ag_news, batch_size=BATCH_SIZE, shuffle=True, collate_fn=collate_batch_ag_news
)

model_ag_news = Net(num_class=4, vocab_size=vocab_size).to(device)
model_ag_news.to(device)

'''
### Uncomment to Train ###
LR = 1
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model_ag_news.parameters(), lr=LR)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, 1.0, gamma=0.1)
save_dir = ""
file_name = "model_AG News small1.pth"
train_model(model=model_ag_news, optimizer=optimizer, criterion=criterion,
            train_dataloader=train_dataloader_ag_news, valid_dataloader=valid_dataloader_ag_news,
            epochs=2, save_dir=save_dir, file_name=file_name)
'''
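A short follow-up sketch for inspecting predictions (assuming the model has been trained or loaded): AG News labels 1-4 map to indices 0-3 after label_pipeline, so the predicted indices can be translated back to category names.

ag_news_classes = {0: "World", 1: "Sports", 2: "Business", 3: "Sci/Tech"}

model_ag_news.eval()
with torch.no_grad():
    labels, texts = next(iter(test_dataloader_ag_news))   # one padded batch, already on device
    predictions = model_ag_news(texts).argmax(dim=1)

print([ag_news_classes[int(p)] for p in predictions[:5]])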
Adaptor model

FeatureAdapter is a neural network module that introduces a low-dimensional bottleneck in a transformer architecture to allow fine-tuning with fewer parameters. It compresses the original high-dimensional embeddings into a lower dimension, applies a nonlinear transformation, and then expands it back to the original dimension. This process is followed by a residual connection that adds the transformed output back to the original input to preserve information and promote gradient flow.

class FeatureAdapter(nn.Module):
    """
    Attributes:
        size (int): The bottleneck dimension to which the embeddings are compressed.
        model_dim (int): The original dimension of the embeddings or features.
    """
    def __init__(self, bottleneck_size=50, model_dim=100):
        super().__init__()
        self.bottleneck_transform = nn.Sequential(
            nn.Linear(model_dim, bottleneck_size),  # Down-project to the bottleneck size
            nn.ReLU(),                              # Apply non-linearity
            nn.Linear(bottleneck_size, model_dim)   # Up-project back to the model dimension
        )

    def forward(self, x):
        """
        Forward pass of the FeatureAdapter. Applies the bottleneck transform to the input
        tensor and adds a skip connection.

        Args:
            x (Tensor): Input tensor with shape (batch_size, seq_length, model_dim).
        Returns:
            Tensor: Output tensor after applying the adapter transformation,
            maintaining the original input shape.
        """
        transformed_features = self.bottleneck_transform(x)  # Transform features through the bottleneck
        output_with_residual = transformed_features + x      # Add the residual (skip) connection
        return output_with_residual
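A hedged sketch of how such an adapter is typically used: freeze the pretrained model and wrap one of its linear sublayers so that only the adapter parameters are trained (the wrapper class and the fc1 layer name are illustrative assumptions, not from the source).

class LinearWithAdapter(nn.Module):
    # Wrap an existing (frozen) linear layer and add a trainable adapter after it
    def __init__(self, linear, bottleneck_size=50):
        super().__init__()
        self.linear = linear
        self.adapter = FeatureAdapter(bottleneck_size=bottleneck_size,
                                      model_dim=linear.out_features)

    def forward(self, x):
        return self.adapter(self.linear(x))

# Freeze the base model, then insert adapters where desired
for param in model.parameters():
    param.requires_grad = False
model.fc1 = LinearWithAdapter(model.fc1, bottleneck_size=50)   # 'fc1' is a hypothetical layer name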
Traverse the IMDB data set

This code snippet traverses the IMDB data set by obtaining, loading, and exploring the data. It also performs basic operations, visualizes the data, and analyzes and interprets the data set.

class IMDBDataset(Dataset):
    def __init__(self, root_dir, train=True):
        """
        root_dir: The base directory of the IMDB dataset.
        train: A boolean flag indicating whether to use training or test data.
        """
        self.root_dir = os.path.join(root_dir, "train" if train else "test")
        self.neg_files = [os.path.join(self.root_dir, "neg", f) for f in os.listdir(os.path.join(self.root_dir, "neg"))]
        self.pos_files = [os.path.join(self.root_dir, "pos", f) for f in os.listdir(os.path.join(self.root_dir, "pos"))]
        self.files = self.neg_files + self.pos_files
        self.labels = [0] * len(self.neg_files) + [1] * len(self.pos_files)
        self.pos_inx = len(self.pos_files)

    def __len__(self):
        return len(self.files)

    def __getitem__(self, idx):
        file_path = self.files[idx]
        label = self.labels[idx]
        with open(file_path, 'r', encoding='utf-8') as file:
            content = file.read()
        return label, content
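A brief usage sketch (the ./aclImdb directory is the conventional layout for this data set and is an assumption here):

train_ds = IMDBDataset(root_dir="./aclImdb", train=True)
print(len(train_ds))                 # total number of reviews
label, review = train_ds[0]
print(label, review[:100])           # 0 = negative, 1 = positive

train_loader = DataLoader(train_ds, batch_size=8, shuffle=True)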
Training a BERT model for MLM task

This code snippet trains the model with the specified parameters and data set.

training_args = TrainingArguments(
    output_dir="./trained_model",   # Specify the output directory for the trained model
    overwrite_output_dir=True,
    do_eval=False,
    learning_rate=5e-5,
)
Generate text

This code snippet generates text sequences based on the input and doesn't compute the gradient when generating output.

# Generate text
output_ids = model.generate(
    inputs.input_ids,
    attention_mask=inputs.attention_mask,
    pad_token_id=tokenizer.eos_token_id,
    max_length=50,
    num_return_sequences=1
)
output_ids

or

with torch.no_grad():
    outputs = model(**inputs)
outputs
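A short follow-up sketch showing how the inputs are typically built and how the generated IDs are decoded back into text (the prompt is a hypothetical example; tokenizer.decode is the standard Hugging Face call):

prompt = "Once upon a time"                       # hypothetical prompt
inputs = tokenizer(prompt, return_tensors="pt")

# Decode the generated token IDs back into a string
generated_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(generated_text)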
ListDataset

Inherits from Dataset and creates a torch Dataset from a list. This class is then used to generate a Dataset object from instructions.

class ListDataset(Dataset):
    def __init__(self, original_list):
        self.original_list = original_list

    def __len__(self):
        return len(self.original_list)

    def __getitem__(self, i):
        return self.original_list[i]

instructions_torch = ListDataset(instructions)
gen_pipeline

This code snippet builds a text-generation pipeline around the model and tokenizer; the pipeline takes the token IDs from the model output, decodes them back into text, and returns the responses.

gen_pipeline = pipeline("text-generation",
                        model=model,
                        tokenizer=tokenizer,
                        device=device,
                        batch_size=2,
                        max_length=50,
                        truncation=True,
                        padding=False,
                        return_full_text=False)
torch.no_grad()

This code generates text from the given input using a pipeline while optimizing resource usage by limiting the input size and disabling gradient calculation.

with torch.no_grad():
    # Due to resource limitation, only apply the function on 3 records
    pipeline_iterator = gen_pipeline(instructions_torch[:3],
                                     max_length=50,   # this is set to 50
                                     num_beams=5,
                                     early_stopping=True,)

    generated_outputs_base = []
    for text in pipeline_iterator:
        generated_outputs_base.append(text[0]["generated_text"])
SFTTrainer

This code snippet sets up a training configuration for the model with 'SFTConfig', then initializes the 'SFTTrainer' with the model, datasets, formatting function, and additional settings.

training_args = SFTConfig(
    output_dir="/tmp",
    num_train_epochs=10,
    save_strategy="epoch",
    fp16=True,
    per_device_train_batch_size=2,   # Reduce batch size
    per_device_eval_batch_size=2,    # Reduce batch size
    max_seq_length=1024,
    do_eval=True
)

trainer = SFTTrainer(
    model,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
    formatting_func=formatting_prompts_func,
    args=training_args,
    packing=False,
    data_collator=collator,
)
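The formatting_prompts_func and collator referenced above are defined elsewhere in the lab; a hedged sketch of what they commonly look like with TRL (the 'instruction'/'output' field names and the response template are assumptions):

from trl import DataCollatorForCompletionOnlyLM

def formatting_prompts_func(example):
    # Turn each record of a batch into a single prompt/response string
    output_texts = []
    for i in range(len(example["instruction"])):
        text = f"### Instruction:\n{example['instruction'][i]}\n### Response:\n{example['output'][i]}"
        output_texts.append(text)
    return output_texts

# Mask the prompt tokens so the loss is computed only on the response part
collator = DataCollatorForCompletionOnlyLM("### Response:\n", tokenizer=tokenizer)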
def plot_matrix_and_subspace(F)

The code snippet is useful for visualizing the column vectors of a matrix, and the subspace they span, in 3D space.

def plot_matrix_and_subspace(F):
    assert F.shape[0] == 3, "Matrix F must have rows equal to 3 for 3D plotting"
    ax = plt.figure().add_subplot(projection='3d')
    # Plot each column vector of F as a point and line from the origin
    for i in range(F.shape[1]):
        ax.quiver(0, 0, 0, F[0, i], F[1, i], F[2, i], color='blue', arrow_length_ratio=0.1)
    if F.shape[1] == 2:
        # Calculate the normal to the plane spanned by the columns of F
        normal_vector = np.cross(F[:, 0], F[:, 1])
        # Plot the plane
        xx, yy = np.meshgrid(np.linspace(-3, 3, 10), np.linspace(-3, 3, 10))
        zz = (-normal_vector[0] * xx - normal_vector[1] * yy) / normal_vector[2]
        ax.plot_surface(xx, yy, zz, alpha=0.5, color='green', label='Subspace')
    # Set plot limits and labels
    ax.set_xlim([-3, 3])
    ax.set_ylim([-3, 3])
    ax.set_zlim([-3, 3])
    ax.set_xlabel('$x_{1}$')
    ax.set_ylabel('$x_{2}$')
    ax.set_zlabel('$x_{3}$')
    #ax.legend()
    plt.show()
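A quick usage sketch (assuming numpy and matplotlib are imported as np and plt): plot two linearly independent columns and the plane they span.

import numpy as np
import matplotlib.pyplot as plt

# Two linearly independent columns in R^3; the green surface is their span
F = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
plot_matrix_and_subspace(F)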
nn.Parameter

The provided code defines the trainable parameters of the 'LoRALayer' module used during training. The 'LoRALayer' is used as an intermediate layer in a simple neural network.

class LoRALayer(torch.nn.Module):
    def __init__(self, in_dim, out_dim, rank, alpha):
        super().__init__()
        std_dev = 1 / torch.sqrt(torch.tensor(rank).float())
        self.A = torch.nn.Parameter(torch.randn(in_dim, rank) * std_dev)
        self.B = torch.nn.Parameter(torch.zeros(rank, out_dim))
        self.alpha = alpha

    def forward(self, x):
        x = self.alpha * (x @ self.A @ self.B)
        return x
load_dataset

The data set is loaded using the load_dataset function from the datasets library, specifically loading the "train" split.

dataset_name = "imdb"
ds = load_dataset(dataset_name, split="train")

N = 5
for sample in range(N):
    print('text', ds[sample]['text'])
    print('label', ds[sample]['label'])

ds = ds.rename_columns({"text": "review"})
ds
ds = ds.filter(lambda x: len(x["review"]) > 200, batched=False)
build_dataset

Incorporates the necessary steps to build a data set object for use as an input to PPOTrainer.

del(ds)
dataset_name = "imdb"
ds = load_dataset(dataset_name, split="train")
ds = ds.rename_columns({"text": "review"})

def build_dataset(config, dataset_name="imdb", input_min_text_length=2, input_max_text_length=8):
    """
    Build dataset for training. This builds the dataset from `load_dataset`; one should
    customize this function to train the model on its own dataset.

    Args:
        dataset_name (`str`):
            The name of the dataset to be loaded.

    Returns:
        dataloader (`torch.utils.data.DataLoader`):
            The dataloader for the dataset.
    """
    tokenizer = AutoTokenizer.from_pretrained(config.model_name)
    tokenizer.pad_token = tokenizer.eos_token
    # load imdb with datasets
    ds = load_dataset(dataset_name, split="train")
    ds = ds.rename_columns({"text": "review"})
    ds = ds.filter(lambda x: len(x["review"]) > 200, batched=False)
    input_size = LengthSampler(input_min_text_length, input_max_text_length)

    def tokenize(sample):
        sample["input_ids"] = tokenizer.encode(sample["review"])[: input_size()]
        sample["query"] = tokenizer.decode(sample["input_ids"])
        return sample

    ds = ds.map(tokenize, batched=False)
    ds.set_format(type="torch")
    return ds
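A hedged usage sketch (the PPOConfig import and the model name follow TRL's PPO sentiment example and are assumptions here; the exact API differs between TRL versions):

from trl import PPOConfig
from trl.core import LengthSampler

config = PPOConfig(model_name="lvwerra/gpt2-imdb")   # hypothetical model choice
dataset = build_dataset(config)
print(dataset[0]["query"], dataset[0]["input_ids"])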
Tokenizing data

This code snippet instantiates a tokenizer from the BERT base cased model, defines a 'tokenize_function' that applies padding and truncation to the text, and maps it over the data set in batches.

# Instantiate a tokenizer using the BERT base cased model
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

# Define a function to tokenize examples
def tokenize_function(examples):
    # Tokenize the text using the tokenizer
    # Apply padding to ensure all sequences have the same length
    # Apply truncation to limit the maximum sequence length
    return tokenizer(examples["text"], padding="max_length", truncation=True)

# Apply the tokenize function to the dataset in batches
tokenized_datasets = dataset.map(tokenize_function, batched=True)
Training loop

The train_model function trains a model using a set of training data provided through a dataloader. It begins by setting up a progress bar to help monitor the training progress visually. The model is switched to training mode, which is necessary for certain model behaviors like dropout to work correctly during training. The function processes the data in batches for each epoch, which involves several steps per batch: transferring the data to the correct device (like a GPU), running the data through the model to get outputs and calculate the loss, updating the model's parameters using the calculated gradients, adjusting the learning rate, and clearing the old gradients.

def train_model(model, tr_dataloader):
    # Create a progress bar to track the training progress
    progress_bar = tqdm(range(num_training_steps))
    # Set the model in training mode
    model.train()
    tr_losses = []
    # Training loop
    for epoch in range(num_epochs):
        total_loss = 0
        # Iterate over the training data batches
        for batch in tr_dataloader:
            # Move the batch to the appropriate device
            batch = {k: v.to(device) for k, v in batch.items()}
            # Forward pass through the model
            outputs = model(**batch)
            # Compute the loss
            loss = outputs.loss
            # Backward pass (compute gradients)
            loss.backward()
            total_loss += loss.item()
            # Update the model parameters
            optimizer.step()
            # Update the learning rate scheduler
            lr_scheduler.step()
            # Clear the gradients
            optimizer.zero_grad()
            # Update the progress bar
            progress_bar.update(1)
        tr_losses.append(total_loss / len(tr_dataloader))
    # Plot the loss per epoch
    plt.plot(tr_losses)
    plt.title("Training loss")
    plt.xlabel("Epoch")
    plt.ylabel("Loss")
    plt.show()
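This function relies on several globals defined elsewhere in the lab (optimizer, lr_scheduler, num_epochs, num_training_steps, device); a hedged setup sketch of what they typically look like with a Hugging Face model (names and values here are assumptions):

from torch.optim import AdamW
from transformers import get_scheduler

num_epochs = 3
num_training_steps = num_epochs * len(tr_dataloader)

optimizer = AdamW(model.parameters(), lr=5e-5)
lr_scheduler = get_scheduler(
    "linear",
    optimizer=optimizer,
    num_warmup_steps=0,
    num_training_steps=num_training_steps,
)

train_model(model, tr_dataloader)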
Configure BitsAndBytes

Defines the quantization parameters.

config_bnb = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize the model to 4-bits when you load it
    bnb_4bit_quant_type="nf4",              # use a special 4-bit data type for weights initialized from a normal distribution
    bnb_4bit_use_double_quant=True,         # nested quantization scheme to quantize the already quantized weights
    bnb_4bit_compute_dtype=torch.bfloat16,  # use bfloat16 for faster computation
    llm_int8_skip_modules=["classifier", "pre_classifier"]  # Don't convert the "classifier" and "pre_classifier" layers to low precision
)
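A hedged usage sketch: pass the config when loading a pretrained model (the DistilBERT checkpoint and label count are illustrative assumptions, chosen because the skipped modules above match DistilBERT's classification head):

import torch
from transformers import AutoModelForSequenceClassification, BitsAndBytesConfig

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased",          # hypothetical checkpoint
    quantization_config=config_bnb,
    num_labels=2,
)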