Author: Muhammad Omar Akhlaq
1. PANDAS
Data Reading and Writing:
read_csv(): Reads data from a CSV file into a DataFrame.
read_excel(): Reads data from an Excel file into a DataFrame.
to_csv(): Writes data from a DataFrame to a CSV file.
to_excel(): Writes data from a DataFrame to an Excel file.
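As a minimal sketch of the read and write functions above, the following round trip writes a small DataFrame to CSV text and reads it back. The sample data, column names, and the in-memory buffer (used here in place of a file path) are assumptions for illustration.

```python
import io

import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({"name": ["Ada", "Grace"], "score": [91, 88]})

# Write to an in-memory text buffer instead of a file on disk.
buffer = io.StringIO()
df.to_csv(buffer, index=False)  # index=False leaves out the row labels

# Rewind the buffer and read the same CSV text back into a DataFrame.
buffer.seek(0)
restored = pd.read_csv(buffer)

print(restored)
```

Passing a path such as "data.csv" instead of the buffer works the same way; read_excel() and to_excel() follow the same pattern but typically need an extra Excel engine package installed.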
DataFrame Basics:
head(): Displays the first few rows of the DataFrame.
tail(): Displays the last few rows of the DataFrame.
info(): Provides concise summary information about the DataFrame.
describe(): Generates descriptive statistics of the DataFrame.
shape: Returns the dimensions (rows, columns) of the DataFrame.
columns: Returns the column labels of the DataFrame.
index: Returns the row labels of the DataFrame.
Data Selection and Manipulation:
loc[]: Accesses a group of rows and columns by labels.
iloc[]: Accesses a group of rows and columns by integer position.
drop(): Drops specified rows or columns from the DataFrame.
fillna(): Fills NaN (missing) values in the DataFrame with specified values.
groupby(): Groups data based on specified criteria.
merge(): Merges DataFrames using a database-style join operation.
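The selection and manipulation methods above can be combined roughly as follows. This is a sketch on made-up data; the frames `sales` and `regions` and their columns are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration.
sales = pd.DataFrame(
    {
        "region": ["north", "south", "north", "south"],
        "units": [10.0, np.nan, 30.0, 5.0],
    },
    index=["a", "b", "c", "d"],
)
regions = pd.DataFrame({"region": ["north", "south"], "manager": ["Kim", "Lee"]})

by_label = sales.loc["a", "units"]   # label-based lookup
by_position = sales.iloc[2, 1]       # position-based lookup (third row, second column)

# Replace the missing value in "units" with 0.
filled = sales.fillna({"units": 0.0})

# Sum the units within each region.
totals = filled.groupby("region")["units"].sum()

# Left join: keep every row of `filled` and attach the matching manager.
merged = filled.merge(regions, on="region", how="left")

print(totals)
print(merged)
```

Note that merge() returns a new frame with a fresh integer index, so the original row labels "a" to "d" do not carry over.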
Data Aggregation and Calculation:
sum(), mean(), median(): Aggregate functions for calculations.
max(), min(): Finds maximum and minimum values.
apply(): Applies a function along an axis of the DataFrame.
pivot_table(): Creates a spreadsheet-style pivot table as a DataFrame.
Data Cleaning and Handling:
drop_duplicates(): Drops duplicate rows from the DataFrame.
rename(): Renames columns or index labels.
astype(): Converts the data type of a column to another data type.
set_index(), reset_index(): Sets or resets the DataFrame index.
Time Series Analysis (for date-time data):
to_datetime(): Converts a column to datetime format.
resample(): Performs data aggregation at different time frequencies.
date_range(): Generates date-time indices.
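A short sketch of the time series helpers, assuming six made-up daily readings; the values and the 2-day bin size are arbitrary choices for illustration.

```python
import pandas as pd

# Six hypothetical daily readings starting on 1 January 2024.
dates = pd.date_range("2024-01-01", periods=6, freq="D")
readings = pd.Series([1, 2, 3, 4, 5, 6], index=dates)

# Group the readings into 2-day bins and sum each bin.
two_day = readings.resample("2D").sum()

# Parse date strings into datetime values.
parsed = pd.to_datetime(pd.Series(["2024-01-01", "2024-02-15"]))

print(two_day)
print(parsed.dt.month.tolist())
```

resample() needs a datetime-like index (or an `on=` column), which is why the series is indexed by the output of date_range().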
2. NUMPY
Array Creation:
np.array(): Creates an array from a Python list or tuple.
np.zeros(): Generates an array of zeros with a specified shape.
np.ones(): Generates an array of ones with a specified shape.
np.arange(): Creates an array with evenly spaced values within a given interval.
np.linspace(): Generates an array with a specified number of elements within a range.
Array Manipulation:
np.reshape(): Changes the shape of an array without changing its data.
ndarray.flatten(), np.ravel(): Flatten a multidimensional array into a 1D array. flatten() is an array method that always returns a copy; ravel() returns a view when possible.
np.concatenate(): Joins arrays along a specified axis.
np.split(): Splits an array into multiple sub-arrays.
np.transpose(), ndarray.T: Transposes array dimensions.
np.vstack(), np.hstack(): Stacks arrays vertically or horizontally.
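The manipulation functions above can be tried on a small array. This sketch also illustrates one practical difference between flattening with a copy and flattening with a view; the array contents are arbitrary.

```python
import numpy as np

a = np.arange(6)        # [0 1 2 3 4 5]
m = a.reshape(2, 3)     # 2 rows, 3 columns

flat_copy = m.flatten()  # always a new, independent array
flat_view = m.ravel()    # a view here, because `m` is contiguous

m[0, 0] = 99             # modify the original array
print(flat_view[0])      # the view sees the change
print(flat_copy[0])      # the copy does not

stacked = np.vstack([m, m])                  # stack rows: shape (4, 3)
joined = np.concatenate([m, m], axis=1)      # join columns: shape (2, 6)
left, right = np.split(joined, 2, axis=1)    # split back into two (2, 3) halves
```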
Array Operations:
np.sum(), np.mean(), np.median(): Aggregate functions for calculations.
np.max(), np.min(): Finds maximum and minimum values.
np.dot(): Computes the dot product of two arrays (matrix multiplication for 2D arrays).
np.sort(): Sorts elements in an array.
np.unique(): Finds unique elements in an array.
Array Indexing and Slicing:
Slicing: Allows accessing sub-arrays within arrays using indices.
Fancy Indexing: Uses arrays of indices to access specific elements.
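To make the two indexing styles concrete, here is a small sketch with arbitrary values. It also shows boolean masking, a closely related form of indexing that the list above does not name.

```python
import numpy as np

x = np.array([10, 20, 30, 40, 50])

window = x[1:4]         # slicing: positions 1, 2, 3 (a view of x)
picked = x[[0, 2, 4]]   # fancy indexing: positions 0, 2, 4 (a copy)
large = x[x > 25]       # boolean mask: elements greater than 25 (a copy)

window[0] = -1          # writes through to x[1], because slices are views
picked[0] = -5          # leaves x unchanged, because fancy indexing copies

print(x)
print(picked)
print(large)
```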
Mathematical Functions:
np.sin(), np.cos(), np.tan(): Trigonometric functions.
np.exp(), np.log(), np.sqrt(): Exponential, logarithmic, and square root functions.
np.absolute(), np.power(): Absolute value and exponentiation functions.
Random Sampling:
np.random.rand(): Generates random numbers from a uniform distribution over [0, 1).
np.random.randn(): Generates random numbers from the standard normal distribution (mean 0, variance 1).
np.random.randint(): Generates random integers within a specified range.
np.random.choice(): Picks random elements from an array.
Linear Algebra:
np.linalg.inv(): Computes the inverse of a matrix.
np.linalg.det(): Computes the determinant of a matrix.
np.linalg.solve(): Solves a system of linear equations.
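As a worked example of the linear algebra routines, consider the system 2x + y = 5 and x + 3y = 10 (chosen here only for illustration). A sketch:

```python
import numpy as np

# Coefficient matrix and right-hand side of the example system.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([5.0, 10.0])

solution = np.linalg.solve(A, b)   # solves A @ solution = b
det = np.linalg.det(A)             # 2*3 - 1*1 = 5
A_inv = np.linalg.inv(A)           # inverse of A

print(solution)
print(det)
```

For solving a system, solve() is generally preferable to computing inv(A) and multiplying, since it avoids forming the inverse explicitly.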
3. Scikit-Learn
Data Preprocessing:
StandardScaler: Standardizes features by removing the mean and scaling to unit variance.
MinMaxScaler: Scales features to a given range (typically 0 to 1).
OneHotEncoder, LabelEncoder: Convert categorical values into numerical representations (LabelEncoder is intended for target labels rather than input features).
train_test_split: Splits datasets into training and testing subsets for model evaluation.
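A common preprocessing pattern is to split first and then fit the scaler on the training portion only. The sketch below uses made-up data; the 30% test size and the fixed random_state are arbitrary choices.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical data: 10 samples with 2 features, and binary labels.
X = np.arange(20, dtype=float).reshape(10, 2)
y = np.array([0, 1] * 5)

# Hold out 30% of the samples for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Learn the mean and standard deviation from the training data only,
# then apply the same transformation to the test data.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.mean(axis=0))  # close to 0 for each feature
```

Fitting the scaler on the training split alone keeps information about the test split from leaking into the model.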
Supervised Learning:
Regression:
LinearRegression, Ridge, Lasso, ElasticNet: Linear regression and regularized versions.
Classification:
LogisticRegression, SVC, RandomForestClassifier, KNeighborsClassifier: Classifiers based on different algorithms (logistic regression, support vector machines, random forests, k-nearest neighbors).
DecisionTreeClassifier, GradientBoostingClassifier: Decision tree-based models.
Unsupervised Learning:
Clustering:
KMeans, DBSCAN, AgglomerativeClustering: Algorithms for clustering data.
Dimensionality Reduction:
PCA, TruncatedSVD, FactorAnalysis: Methods for reducing dimensions while preserving important
information.
Model Evaluation and Metrics:
accuracy_score, precision_score, recall_score: Evaluation metrics for classification models.
mean_squared_error, mean_absolute_error, r2_score: Evaluation metrics for regression models.
cross_val_score, GridSearchCV: Tools for cross-validation and hyperparameter tuning.
confusion_matrix, classification_report: Metrics to assess classification performance.
Pipelines and Feature Selection:
Pipeline: Chains together multiple steps into a single workflow.
FeatureUnion: Combines different transformers applied in parallel.
SelectKBest, SelectFromModel: Feature selection methods based on statistics or model importance.
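The following sketch chains a scaler and a classifier in a Pipeline and scores it with 5-fold cross-validation. The synthetic dataset from make_classification and the step names "scale" and "clf" are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic classification data (illustrative only).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Each step is a (name, estimator) pair; the last step is the model.
model = Pipeline(
    [
        ("scale", StandardScaler()),
        ("clf", LogisticRegression()),
    ]
)

# The whole pipeline is refit on each training fold, so the scaler
# never sees the held-out fold.
scores = cross_val_score(model, X, y, cv=5)
print(scores.mean())
```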
Ensemble Methods:
VotingClassifier, VotingRegressor: Combines multiple models' predictions for improved performance.
BaggingClassifier, RandomForestClassifier: Bootstrap aggregating methods for classification.
GradientBoostingClassifier, AdaBoostClassifier: Boosting methods for classification.
Neural Network Support:
MLPClassifier, MLPRegressor: Multi-layer Perceptron models for classification and regression.
Text Analysis:
CountVectorizer, TfidfVectorizer: Convert text data into numerical feature vectors.
TfidfTransformer: Applies Term Frequency-Inverse Document Frequency transformation.
Model Serialization:
pickle: Built-in Python module for serializing and deserializing scikit-learn models.
4. TENSORFLOW
Core Components:
Tensors: Fundamental data structures in TensorFlow, similar to multi-dimensional arrays.
Operations: Functions that manipulate tensors, perform computations, and define the computational
graph.
Graph: The computational graph defines the operations and dependencies between tensors.
Model Building:
Keras API: High-level API within TensorFlow for building and training neural networks easily.
tf.keras.layers: Module containing various layers like Dense, Conv2D, LSTM, etc., for building neural
network architectures.
tf.keras.Sequential: Allows the sequential stacking of layers to create a model.
Training and Optimization:
tf.keras.Model.compile(): Compiles the model, defining loss functions, optimizers, and metrics.
tf.keras.Model.fit(): Trains the model on training data.
tf.keras.Model.evaluate(): Evaluates the model's performance on a test dataset.
tf.optimizers: Module containing optimization algorithms like SGD, Adam, RMSprop, etc.
Customization and Extensions:
tf.GradientTape: Records operations for automatic differentiation and custom gradient computation.
tf.function: Converts Python functions into graph-based TensorFlow functions for performance
optimization.
Custom Layers, Metrics, Losses, and Callbacks: Allows the creation of custom components for specific
needs.
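A minimal sketch of tf.GradientTape and tf.function working together, assuming TensorFlow 2.x. The function `square` and the starting value 3.0 are made up for illustration.

```python
import tensorflow as tf

x = tf.Variable(3.0)


@tf.function  # traced into a graph on first call
def square(v):
    return v * v


# Operations executed inside the tape context are recorded
# so that gradients can be computed afterwards.
with tf.GradientTape() as tape:
    y = square(x)

grad = tape.gradient(y, x)  # dy/dx = 2 * x
print(float(grad))
```

Trainable variables are watched by the tape automatically; plain tensors need an explicit tape.watch() call.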
Deployment and Serialization:
tf.saved_model: Tools for saving and loading models in the SavedModel format for deployment.
tf.keras.Model.save(), tf.keras.models.load_model(): Saving and loading models using the Keras API.
Data Handling:
tf.data.Dataset: A powerful API for creating input pipelines to handle large datasets efficiently.
tf.data.experimental: Module containing experimental features for data pipeline handling.
GPU and Distributed Computing:
tf.device(): Context manager to explicitly specify the device (CPU or GPU) for execution.
tf.distribute: Module providing tools for distributed training across multiple devices or machines.
Miscellaneous:
tf.math: Module containing mathematical operations on tensors.
tf.image: Module for image processing operations in TensorFlow.
tf.strings: Module for string manipulation operations.
5. KERAS
Model Building:
Sequential: A linear stack of layers for building sequential models.
Functional API (tf.keras.Model): Allows creating complex models with shared layers and multiple inputs
or outputs.
Dense, Conv2D, LSTM, Dropout, etc.: Various layer types for constructing neural network architectures.
Compilation and Configuration:
compile(): Configures the model for training by specifying the optimizer, loss function, and metrics.
Optimizers (SGD, Adam, RMSprop, etc.): Algorithms for optimizing model weights during training.
Loss Functions (mean_squared_error, categorical_crossentropy, etc.): Measure the error between predictions and targets, which training minimizes.
Training and Evaluation:
fit(): Trains the model on training data.
evaluate(): Evaluates the model's performance on a test dataset.
predict(): Generates predictions for new data.
Callbacks (EarlyStopping, ModelCheckpoint, etc.): Tools for customizing training behavior.
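The build, compile, fit, evaluate, and predict steps fit together roughly as in the sketch below. It assumes the Keras bundled with TensorFlow 2.x (tf.keras); the random data, layer sizes, and epoch count are arbitrary.

```python
import numpy as np
import tensorflow as tf

# Hypothetical data: 64 samples, 4 features, binary target.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 4)).astype("float32")
y = (X.sum(axis=1) > 0).astype("float32")

# A small feed-forward network built as a linear stack of layers.
model = tf.keras.Sequential(
    [
        tf.keras.Input(shape=(4,)),
        tf.keras.layers.Dense(8, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ]
)

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Stop early if the training loss stops improving for 2 epochs.
stopper = tf.keras.callbacks.EarlyStopping(monitor="loss", patience=2)
history = model.fit(X, y, epochs=5, batch_size=16, callbacks=[stopper], verbose=0)

loss, accuracy = model.evaluate(X, y, verbose=0)
probabilities = model.predict(X[:5], verbose=0)  # shape (5, 1)
```

Evaluating on the training data, as done here for brevity, overstates performance; in practice a separate test set would be passed to evaluate().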
Regularization and Optimization:
Dropout, BatchNormalization: Techniques for regularization and improving training convergence.
Regularizers: Methods for applying penalties on layer parameters during optimization.
Customization and Extension:
Layer and Model classes: Allows building custom layers and models by subclassing.
Custom Callbacks, Custom Losses, Custom Metrics: Enables customizing and extending Keras
functionality.
Serialization and Deployment:
save() and load_model(): Functions for saving and loading models.
model.to_json() and model_from_json(): Serialize the model architecture to and from JSON format.
Preprocessing and Utilities:
preprocessing: Module for data preprocessing techniques like normalization, text tokenization, etc.
utils: Module containing utility functions for working with Keras models and layers.
GPU and Distributed Computing:
Keras models can leverage TensorFlow's capabilities for GPU and distributed computing seamlessly.
6. PyTorch
Tensor Operations:
torch.Tensor: The core data structure; supports tensor operations similar to those on NumPy arrays.
Math operations: Element-wise operations, matrix multiplications, and other mathematical functions.
Indexing and Slicing: Accessing and manipulating tensor elements.
Autograd and Dynamic Computation Graph:
torch.autograd: Automatic differentiation engine for computing gradients.
torch.autograd.Function: Base class for defining custom autograd operations.
backward(): Computes the gradient of a tensor with respect to the leaf tensors of its computational graph.
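A minimal autograd sketch: the function y = sum of x squared is chosen only because its gradient, 2x, is easy to check by hand.

```python
import torch

# requires_grad=True asks autograd to track operations on x.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

y = (x ** 2).sum()  # scalar result, graph built dynamically
y.backward()        # populates x.grad with dy/dx

print(x.grad)       # gradient is 2 * x
```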
Neural Network Building Blocks:
torch.nn: Module for building neural network architectures.
torch.nn.Module: Base class for creating custom neural network modules.
Layers: Various layers like Linear, Conv2d, LSTM, Dropout, etc.
Activation functions: ReLU, Sigmoid, Tanh, etc.
Initialization methods: Different weight initialization techniques.
Optimizers and Loss Functions:
Optimizers (SGD, Adam, RMSprop, etc.): Algorithms for optimizing model weights.
torch.optim: Module containing optimization algorithms.
Loss functions (MSELoss, CrossEntropyLoss, etc.): Measure the error between predictions and targets.
Training and Evaluation:
torch.nn.functional: Functions for implementing neural network operations.
torch.utils.data: Tools for handling datasets and data loaders for efficient batching and loading.
torch.utils.data.Dataset: Base class for creating custom datasets.
torch.utils.data.DataLoader: Loads data into batches for training and evaluation.
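The data utilities above are usually combined with an optimizer and a loss function in a training loop. The sketch below fits a single linear layer to synthetic data; the data, learning rate, batch size, and epoch count are arbitrary assumptions.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic regression data: the target is the sum of the three features.
X = torch.randn(32, 3)
y = X.sum(dim=1, keepdim=True)

loader = DataLoader(TensorDataset(X, y), batch_size=8, shuffle=True)
model = nn.Linear(3, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

with torch.no_grad():
    initial_loss = loss_fn(model(X), y).item()

for epoch in range(20):
    for xb, yb in loader:
        optimizer.zero_grad()          # clear gradients from the previous step
        loss = loss_fn(model(xb), yb)  # forward pass on one batch
        loss.backward()                # compute gradients
        optimizer.step()               # update the weights

with torch.no_grad():
    final_loss = loss_fn(model(X), y).item()

print(initial_loss, final_loss)
```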
GPU and Distributed Computing:
torch.device: Object representing the device (CPU or GPU) on which tensors are allocated; tensors and models are moved to it with .to(device).
torch.distributed: Module for distributed computing across multiple devices or machines.
Serialization and Deployment:
torch.save() and torch.load(): Functions for saving and loading models.
Model serialization and deployment mechanisms for inference in production environments.
Miscellaneous:
torch.cuda: Module for GPU-related functionalities and operations.
torchvision: Module containing datasets, model architectures, and image transformation utilities for
computer vision tasks.
torchtext: Module for text-related utilities and datasets.
7. Matplotlib
Basic Plotting:
plt.plot(): Creates line plots.
plt.scatter(): Generates scatter plots.
plt.bar(), plt.barh(): Creates vertical and horizontal bar plots.
plt.hist(): Generates histograms.
plt.boxplot(): Creates boxplots to visualize data distributions.
Customization and Styling:
plt.xlabel(), plt.ylabel(): Sets labels for the x and y axes.
plt.title(): Sets the title of the plot.
plt.legend(): Adds legends to the plot.
plt.grid(): Displays grid lines on the plot.
plt.xlim(), plt.ylim(): Sets the limits of the x and y axes.
Subplots and Layouts:
plt.subplots(): Creates multiple subplots within a single figure.
plt.subplot(): Adds individual subplots to a figure with custom configurations.
plt.tight_layout(): Automatically adjusts subplot parameters for better layout.
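A sketch of a two-panel figure follows. It uses the Axes methods (set_xlabel, set_title, and so on), which are the per-subplot counterparts of plt.xlabel() and plt.title(). The "Agg" backend and the in-memory PNG buffer are assumptions made so the example runs without a display; the data is arbitrary.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, assumed for headless use
import matplotlib.pyplot as plt

x = [0, 1, 2, 3]

# One row, two columns of subplots in a single figure.
fig, (ax_line, ax_bar) = plt.subplots(1, 2, figsize=(8, 3))

ax_line.plot(x, [v ** 2 for v in x], label="x squared")
ax_line.set_xlabel("x")
ax_line.set_ylabel("y")
ax_line.set_title("Line plot")
ax_line.legend()
ax_line.grid(True)

ax_bar.bar(x, [3, 1, 4, 1])
ax_bar.set_title("Bar chart")

fig.tight_layout()  # avoid overlapping labels

# Save as PNG into memory; a file name such as "plot.png" also works.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
```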
Advanced Plot Types:
plt.contour(), plt.contourf(): Generates contour plots.
plt.pcolor(), plt.pcolormesh(): Creates pseudocolor plots.
plt.quiver(): Displays vector fields.
plt.imshow(): Displays images.
Annotations and Text:
plt.text(): Adds text at specified coordinates on the plot.
plt.annotate(): Annotates a specific point on the plot with optional arrow indicators.
Save and Show:
plt.show(): Displays the plot.
plt.savefig(): Saves the plot as an image file (PNG, JPG, PDF, etc.).
Specialized Plots:
plt.pie(): Generates pie charts.
plt.stem(): Creates stem plots.
plt.violinplot(): Displays violin plots.
3D Plotting (with mpl_toolkits.mplot3d):
Axes3D: Provides 3D axes for plotting.
plot_surface(): Plots 3D surfaces.
scatter(): Creates 3D scatter plots.
Interactive and Animation:
Interactive mode: Enables interactive plotting in supported environments.
Animation: The matplotlib.animation module creates animated plots.
8. Seaborn
Data Visualization:
sns.lineplot(): Generates line plots with optional estimation and confidence intervals.
sns.scatterplot(): Creates scatter plots with optional hue and size mapping.
sns.barplot(), sns.countplot(): Produces bar plots and count plots.
sns.boxplot(), sns.violinplot(): Displays boxplots and violin plots to show data distributions.
Categorical Data:
sns.catplot(): Figure-level interface for categorical plots (strip, swarm, box, violin, etc.), selected with the kind parameter.
sns.swarmplot(): Visualizes categorical data along with the distribution of observations.
Distribution Visualization:
sns.histplot(), sns.kdeplot(): Displays histograms and kernel density estimation plots.
sns.rugplot(): Shows individual data points as dashes on a plot axis.
Relationship Plots:
sns.pairplot(): Creates a matrix of scatterplots for examining pairwise relationships in a dataset.
sns.heatmap(): Generates a heatmap to visualize matrix-like data.
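One frequent use of sns.heatmap() is to display a correlation matrix. The sketch below assumes a small made-up DataFrame and a non-interactive Matplotlib backend; the column names and color map are arbitrary.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, assumed for headless use
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical measurements for illustration.
df = pd.DataFrame(
    {
        "height": [150, 160, 170, 180],
        "weight": [50, 58, 69, 80],
        "age": [40, 25, 33, 21],
    }
)

corr = df.corr()  # pairwise correlations between the columns

# annot=True writes each value into its cell; vmin/vmax fix the color scale.
ax = sns.heatmap(corr, annot=True, vmin=-1, vmax=1, cmap="coolwarm")
ax.set_title("Correlation matrix")

plt.close(ax.figure)
```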
Regression and Model Visualization:
sns.regplot(), sns.lmplot(): Displays linear regression models.
sns.residplot(): Plots the residuals of a linear regression model.
sns.jointplot(): Visualizes the relationship between two variables and their individual distributions.
Color Palettes and Styles:
sns.set_palette(): Sets the color palette for the plot.
sns.set_style(): Sets the visual aesthetic styles for the plot.
sns.color_palette(): Creates color palettes for use in plots.
Axis and Figure-Level Functions:
FacetGrid and PairGrid: Allow the creation of customized subplots.
sns.relplot(), sns.lineplot(): Figure-level and axes-level interfaces, respectively, for relational plots.
Themes and Contexts:
sns.set_theme(): Sets the overall visual theme for the plots.
sns.plotting_context(): Controls the scaling of plot elements.
Statistical Estimation:
sns.pointplot(): Visualizes point estimates and confidence intervals.
sns.barplot(): Displays the central tendency of a numeric variable.
9. Plotly
Basic Plotting:
go.Scatter(): Generates scatter plots.
go.Bar(): Creates bar charts.
go.Histogram(): Displays histograms.
go.Box(): Generates box plots.
go.Surface(): Generates 3D surface plots.
Additional Plot Types:
go.Pie(): Generates pie charts.
go.Candlestick(): Displays financial candlestick charts.
go.Heatmap(): Creates heatmaps.
go.Contour(): Generates contour plots.
Specialized Visualizations:
go.Scatter3d(): Generates 3D scatter plots.
go.Choropleth(): Creates choropleth maps.
go.FigureWidget(): Enables interactive figures with widgets for live updating.
Customization and Layout:
plotly.graph_objs.Layout: Allows configuring layout settings for the plot.
update_layout(): Updates layout settings of the plot.
Interactivity and Animation:
plotly.express: High-level functions for creating interactive plots easily.
plotly.graph_objs.Figure: Creates interactive figures.
plotly.offline.plot() and plotly.offline.iplot(): Render Plotly figures to a standalone HTML file or inline in Jupyter notebooks.
Dash Integration:
dcc.Graph() (formerly dash_core_components.Graph()): Integrates Plotly graphs into Dash web applications.
html.Div() (formerly dash_html_components.Div()): Creates HTML div elements for organizing the layout in Dash apps.
Export and Sharing:
plotly.io.write_image(): Saves figures as images (PNG, JPEG, SVG, etc.).
plotly.io.show(): Displays plots in Jupyter Notebooks or standalone HTML pages.
plotly.io.write_html(): Saves figures as standalone HTML files.
Themes and Styling:
plotly.io.templates: Provides built-in templates for different plot styles.
update_traces(): Allows customization of individual traces within a plot.
Dashboards and Web Apps:
dash: Integrates Plotly plots into web applications using Dash, a Python web framework.
10. NLTK
Corpus and Text Processing:
nltk.corpus: Module for accessing built-in corpora and lexical resources.
nltk.word_tokenize(): Tokenizes text into words.
nltk.sent_tokenize(): Tokenizes text into sentences.
Basic Text Processing and Analysis:
nltk.FreqDist(): Generates frequency distributions of words.
nltk.Text(): Wraps a sequence of tokens for advanced operations like concordance and similar context
search.
Part-of-Speech Tagging:
nltk.pos_tag(): Assigns parts of speech (POS) tags to words in a text.
Stemming and Lemmatization:
nltk.PorterStemmer(), nltk.LancasterStemmer(): Implements stemming algorithms.
nltk.WordNetLemmatizer(): Lemmatizes words based on WordNet.
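A small sketch of stemming followed by a frequency count. The token list is made up and split by hand, because the NLTK tokenizers and the WordNet lemmatizer need extra data packages downloaded first, whereas PorterStemmer and FreqDist work without them.

```python
from nltk import FreqDist
from nltk.stem import PorterStemmer

# Hypothetical, pre-tokenized input.
tokens = ["cats", "cat", "running", "runs", "run"]

stemmer = PorterStemmer()
stems = [stemmer.stem(token) for token in tokens]  # reduce each word to its stem

# Count how often each stem occurs.
freq = FreqDist(stems)

print(stems)
print(freq.most_common(1))
```

Stemming is rule-based and can produce non-words; lemmatization with WordNetLemmatizer returns dictionary forms instead, at the cost of needing the WordNet data.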
Named Entity Recognition (NER):
nltk.ne_chunk(): Labels named entities such as persons, organizations, locations, etc.
Parsing and Syntax:
nltk.ChartParser(), nltk.RecursiveDescentParser(): Implements parsers for syntactic analysis.
nltk.ParentedTree(): Represents constituency-based parse trees.
WordNet Interface:
nltk.corpus.wordnet: Interface to access WordNet, a lexical database for English.
wordnet.synsets(): Retrieves the synsets (sets of synonyms) for a word from WordNet.
Text Classification:
nltk.classify: Module containing various classifiers for text classification.
nltk.NaiveBayesClassifier(), nltk.DecisionTreeClassifier(): Examples of classifiers.
Machine Learning for NLP:
nltk.classify.scikitlearn: Integrates NLTK classifiers with scikit-learn for machine learning.
Tokenization and Chunking:
nltk.chunk: Module for chunking and extracting phrases from sentences.
nltk.RegexpParser(): Creates chunk parsers using regular expressions.
Sentiment Analysis:
nltk.sentiment: Module providing sentiment analysis tools and lexicons.
Collocations and Bigrams:
nltk.collocations: Module for extracting collocations and bigrams from text.
Language Models and Probabilities:
nltk.probability: Module for implementing probability distributions and frequency estimation.