
Some core concepts

Navigating the landscape of generative AI models is, for most of us, like steering a ship through uncharted waters. Understanding these models and their core concepts is not just about mastering jargon; it is about wielding tools that can redefine how businesses innovate, communicate, and stay ahead as their industries evolve.

Here are some of the most popular model families used in the generative space.

Foundational Models
Foundational models are advanced AI frameworks transforming language, image generation, and comprehension tasks across diverse industries. They are large, multipurpose machine learning models pre-trained at scale on diverse data to learn broad representations and patterns. Those learned representations let a single model be adapted to many downstream tasks through transfer learning rather than training bespoke models from scratch, which accelerates development and improves performance.
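
To make the transfer-learning point concrete, here is a minimal sketch of adapting a pre-trained model to a downstream classification task using the Hugging Face transformers library. The checkpoint name, label count, and example sentence are illustrative assumptions, not specifics from this text.

```python
# A minimal sketch of transfer learning with a pre-trained foundational model,
# using the Hugging Face transformers library. The checkpoint, label count,
# and example sentence are illustrative assumptions.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load weights pre-trained on a large, diverse text corpus.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=2,  # a fresh task head; only this part starts untrained
)

# Instead of training a bespoke model from scratch, you would now fine-tune
# briefly on a small labeled dataset (training loop elided for brevity).
inputs = tokenizer("This product exceeded my expectations.", return_tensors="pt")
logits = model(**inputs).logits
print(logits.shape)  # torch.Size([1, 2]): one score per class
```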

Let’s review some of them.

The following table summarizes the main types of foundational models:

| Type | Definition | Use Cases | Examples |
|---|---|---|---|
| Language Models | Models trained on large text corpora to generate human-like text and power natural language tasks. | Content generation, chatbots, search, document analysis | GPT-3, BERT, T5 |
| Image Generators | Models that synthesize realistic images from text descriptions. | Creative tools, media, advertising, ecommerce | DALL-E 2, Stable Diffusion, Imagen |
| Speech Models | Models that convert between speech and text, for synthesis or transcription. | Voice assistants, audiobooks, accessibility tools | Whisper, WaveNet |
| Recommenders | Models that suggest content to users based on preferences. | Retail, media, advertising, ecommerce | YouTube RS, Amazon RS, Spotify RS |
| Translation Models | Models that translate text between languages. | Localization, travel, customer support | Google Translate, WMT, M2M-100 |
| Protein Models | Models that predict 3D protein structure from an amino acid sequence. | Drug discovery, materials science, agriculture | AlphaFold, RoseTTAFold |
| Game-Playing Models | Models trained with reinforcement learning to play games. | Game testing, education, robotics | AlphaGo, AlphaStar, OpenAI Five |
| Tabular Data Models | Models that generate synthetic tabular data. | Finance, healthcare, insurance | TableGAN, TVAE |
| Optimization Models | Models that solve complex optimization problems. | Logistics, manufacturing, transportation | DeepMind Gato |

Large Language Models (LLMs)


Large language models are neural networks trained on vast text datasets, enabling them to generate human-like language and power advanced natural language applications, often through few-shot learning. Because they learn to extract meaning from text during pretraining, they can be adapted to many downstream NLP tasks by prompting with a few examples or by fine-tuning on small datasets.
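
As a concrete illustration of few-shot learning, the sketch below places a handful of labeled examples in a prompt and leaves the model to continue the pattern, with no weight updates at all. The task and the `llm.generate` call are hypothetical stand-ins for whichever LLM API you use.

```python
# A minimal sketch of few-shot prompting: a few labeled examples are placed
# in the prompt and the LLM infers the pattern from context alone.
# The task and the `llm.generate` call are hypothetical stand-ins.
prompt = """Classify the sentiment of each review as Positive or Negative.

Review: The battery died after two days.
Sentiment: Negative

Review: Setup took five minutes and it works flawlessly.
Sentiment: Positive

Review: The screen cracked within the first week.
Sentiment:"""

# completion = llm.generate(prompt)  # expected continuation: "Negative"
```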

Some common types of LLMs are listed in the following table:


| Type | Description | Use Cases | Examples |
|---|---|---|---|
| Autoregressive | Predict the next word based on context. Good for fluent generation. | Content creation, summaries, chatbots | GPT-3, GPT-4 |
| Encoder-Decoder | Encode input, decode for tasks like translation. Flexible. | Translation, question answering, text summarization | BART, T5 |
| Bidirectional | Learn context by processing text in both directions. Good for analysis. | Sentiment analysis, entity recognition, search | BERT, RoBERTa |
| Question Answering | Trained to directly answer questions. Specialized for QA. | Conversational search, analytics, dialog systems | ELI5, FARM |
| Dialog | Trained on conversations. Useful for chatbots. | Virtual assistants, customer service, conversational AI | Meena, Blenderbot |
| Multi-Modal | Process text and images for multimodal generation. | Creative applications, contextual generation | DALL-E 2, Imagen |
| Specialized Task | Tailored for specific tasks like code or protein structure. Trade generality for depth. | Drug discovery, software development, theorem proving | AlphaFold, Codex |

Generative adversarial networks (GANs)


Generative adversarial networks (GANs) are a powerful AI technique in which two neural networks, a generator and a discriminator, compete against each other in a minimax game. The generator learns to produce increasingly realistic synthetic data while the discriminator tries to tell real data from fake. This competition drives both models to improve until the generated outputs are indistinguishable from actual training data.
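
The adversarial loop can be sketched in a few lines of PyTorch. The example below uses toy networks and synthetic "real" data purely to show the alternating discriminator and generator updates of the minimax game; every size and hyperparameter is an illustrative assumption.

```python
# A minimal PyTorch sketch of the GAN minimax game: the discriminator learns
# to separate real from fake, the generator learns to fool it. Networks and
# data are toy assumptions for illustration.
import torch
import torch.nn as nn

latent_dim, data_dim = 16, 2
G = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, data_dim))
D = nn.Sequential(nn.Linear(data_dim, 32), nn.ReLU(), nn.Linear(32, 1))
loss_fn = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)

for step in range(1000):
    real = torch.randn(64, data_dim) * 0.5 + 2.0   # stand-in "real" data
    fake = G(torch.randn(64, latent_dim))          # synthetic samples

    # Discriminator step: push real toward label 1, fake toward label 0.
    d_loss = loss_fn(D(real), torch.ones(64, 1)) + \
             loss_fn(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator step: make the discriminator label fakes as real.
    g_loss = loss_fn(D(fake), torch.ones(64, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
```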

| GAN Type | Description | Use Cases | Examples |
|---|---|---|---|
| Vanilla GAN | Original basic architecture with a generator and a discriminator | Concept learning | GAN, DCGAN |
| Conditional GAN | Generator and discriminator conditioned on additional input | Controlled generation | pix2pix, CycleGAN |
| Image-to-Image GAN | Input image transformed into an output image | Image editing, colorization | Pix2Pix |
| CycleGAN | Paired generators/discriminators between domains | Style transfer, domain adaptation | CycleGAN, DiscoGAN |
| StyleGAN | GAN specialized for very realistic image generation | Media, entertainment | StyleGAN, StyleGAN2 |
| BigGAN | Massively scaled-up GAN architecture | High-res image generation | BigGAN |
| Text-to-Image GAN | Generator maps text to images | Multimodal creative apps | DALL-E, Imagen |

Diffusion Models
Diffusion models are generative deep learning models that progressively add structured noise to data and then train a neural network to reverse that process for high-fidelity generation. By modelling the noise schedule, they offer fine-grained conditional control for generating and manipulating images, audio, 3D scenes, and other data.
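
As a sketch of the forward half of this process, the snippet below implements a DDPM-style closed-form noising step under an assumed linear noise schedule; a real diffusion model would then train a network to predict and remove the added noise, reversing the process step by step.

```python
# A minimal sketch of the DDPM-style forward (noising) process: data is
# progressively mixed with Gaussian noise according to a schedule, and a
# network is trained to predict that noise so it can be reversed. The
# linear schedule and tensor shapes are illustrative assumptions.
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)            # noise schedule
alphas_bar = torch.cumprod(1.0 - betas, dim=0)   # cumulative signal retention

def q_sample(x0, t):
    """Sample x_t ~ q(x_t | x_0) in closed form."""
    noise = torch.randn_like(x0)
    signal_scale = alphas_bar[t].sqrt()
    noise_scale = (1.0 - alphas_bar[t]).sqrt()
    return signal_scale * x0 + noise_scale * noise, noise

x0 = torch.randn(8, 3, 32, 32)                   # a toy batch of "images"
x_t, eps = q_sample(x0, t=500)
# Training would minimize ||eps_theta(x_t, t) - eps||^2 for a denoising net.
```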

Various diffusion model types are described in the following table:

| Type | Description | Use Cases | Examples |
|---|---|---|---|
| Denoising Diffusion | Progressive image denoising | Image generation | DDPM |
| Diffusion Probabilistic | Extend to probabilistic modeling | Accurate generation | DPIM |
| Score-Based Diffusion | Leverage the score function for sampling | Efficient generation | SDSM |
| Language-Conditioned | Condition on text for control | Text-based generation | DALL-E |
| Video Diffusion | Apply across time for video | Video generation | V-Diffusion |
| Audio Diffusion | Capture properties of natural sound | Audio generation | WaveGrad |
| 3D Scene Diffusion | Generate 3D spaces | 3D scene modeling | msd-nerf |

Variational autoencoders (VAEs)


Variational autoencoders (VAEs) are deep generative models that learn latent representations of data through probabilistic encoders and decoders. They consist of an encoder network that maps data examples to distributions over a latent space, and a decoder network that reconstructs the data from samples drawn from those distributions. The encoder and decoder are trained jointly to maximize a lower bound on the data likelihood, balancing reconstruction accuracy against keeping the latent space smooth and continuous.
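
The encoder-decoder structure and the reparameterization trick it relies on can be sketched as follows. The Gaussian prior and KL-plus-reconstruction loss are standard for VAEs, but the layer sizes and architecture here are illustrative assumptions.

```python
# A minimal PyTorch sketch of a VAE: the encoder maps data to a distribution
# over latent space, the decoder reconstructs from samples, and the loss
# combines reconstruction with a KL term that keeps the latent space smooth.
# Sizes are toy assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, data_dim=784, latent_dim=16):
        super().__init__()
        self.enc = nn.Linear(data_dim, 128)
        self.mu = nn.Linear(128, latent_dim)       # mean of q(z|x)
        self.logvar = nn.Linear(128, latent_dim)   # log-variance of q(z|x)
        self.dec = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                 nn.Linear(128, data_dim))

    def forward(self, x):
        h = F.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        return self.dec(z), mu, logvar

def vae_loss(recon, x, mu, logvar):
    # Reconstruction term plus KL divergence to the unit Gaussian prior.
    recon_loss = F.mse_loss(recon, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon_loss + kl
```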

VAEs are used for:

● Generating new data similar to the data on which the model was trained (text, images, audio, etc.). Example: the discrete VAE that DALL-E uses to represent images
● Anomaly or outlier detection. Example: credit card fraud detection
● Dimensionality reduction for visualizing high-dimensional data

Types of VAEs include:

● Conditional VAEs - Encoder/decoder conditioned on auxiliary inputs to target generation
● Disentangled VAEs - Latent codes isolate explanatory factors of variation
● Hierarchical VAEs - Model hierarchical dependencies with layers of latent variables
● Multimodal VAEs - Jointly model data across modalities like text and images

Autoregressive Models

Autoregressive models are generative deep learning models that factorize the joint probability of a sequence into a product of conditional probabilities. They estimate the probability of each token conditioned on the previous tokens, a process that can generate variable-length outputs.
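
The chain-rule factorization p(x1, ..., xT) = p(x1) · p(x2 | x1) · ... · p(xT | x1, ..., xT-1) translates directly into a sampling loop. In the sketch below, `next_token_probs` is a random toy stand-in for a trained model; it exists only to show how each token is drawn conditioned on everything generated so far.

```python
# A minimal sketch of autoregressive sampling under the chain-rule
# factorization. `next_token_probs` is a hypothetical stand-in for a
# trained model that returns p(next token | all previous tokens).
import torch

def next_token_probs(tokens, vocab_size=100):
    # Toy stand-in model: returns a distribution over the next token.
    torch.manual_seed(sum(tokens))        # deterministic toy behavior
    return torch.softmax(torch.randn(vocab_size), dim=0)

def generate(prompt, max_new_tokens=10):
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        probs = next_token_probs(tokens)  # condition on all prior tokens
        tokens.append(int(torch.multinomial(probs, 1)))  # sample next token
    return tokens

print(generate([1, 5, 9]))  # prompt token IDs followed by 10 sampled tokens
```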

Autoregressive models are commonly used for:

● Natural language generation. Examples: GPT-3, XLNet
● Time series forecasting. Example: retail sales predictions
● Image generation. Example: PixelCNN
● Audio generation. Example: WaveNet

Types of autoregressive models include:

● Transformer-based models - Leverage attention mechanisms over sequences
● Pixel autoregressive models - Model 2D image structures
● Neural additive models - Jointly train both shallow and deep networks
● Temporal autoregressive models - Specialize in forecasting temporal sequences
● Distribution-based models - Model distributions rather than scalar outputs

The modeling flexibility of autoregressive factorization has made this technique effective across
different data types. Fine-tuning on downstream tasks further leverages generative pretraining.
