
CADD Unit 4

Bioinformatics is an interdisciplinary field that merges biology, computer science, and information technology to analyze biological data, originating in the 1970s and focusing on genome data analysis since the late 1980s. Its goals include data organization, tool development, and deriving biological insights, with applications in genomics, proteomics, and drug discovery. Chemoinformatics, on the other hand, applies computational techniques to chemical problems, facilitating drug design and discovery, and includes tools for structure searching and data analysis.

Introduction to Bioinformatics

Definition:

Bioinformatics is an interdisciplinary field that combines biology, computer science, and information technology to analyze and interpret biological data.
It solves problems in the biological sciences through computation.
It handles biological data collected by experimental techniques and documents them in databases for scientific use.
Origins of the Term:
First used by Paulien Hogeweg (Dutch theoretical biologist) and Ben Hesper in the early 1970s.
From the late 1980s, it primarily referred to the computational analysis of genome data.

Goals of Bioinformatics:

• Data Organization: Efficient storage, management, and retrieval of biological data.
• Tool Development: Creating algorithms and statistical tools for interpreting complex biological data.
• Biological Insight: Deriving meaningful conclusions from raw data, such as gene function, evolutionary relationships, and disease mechanisms.
Scientific Disciplines Associated with Bioinformatics:
1. Traditional and Advanced Sciences:
1. Plant sciences
2. Animal sciences
3. Molecular biology
4. Genetics
5. Evolutionary biology
2. Other Fields:
1. Pharmaceuticals
2. Mathematical and statistical sciences
3. Omics (e.g., genomics, proteomics)
Support Systems for Bioinformatics:
1. Technology and Infrastructure:
1. Computer science
2. Information technology (IT)
3. Computational resources (serve as the backbone)

Scope of Bioinformatics:
• Molecular Biology: Genome sequencing, gene expression analysis.
• Structural Biology: Protein structure prediction, molecular modeling.
• Systems Biology: Pathway mapping, systems modeling.
• Drug Discovery: Target identification, virtual screening.
• Evolutionary Biology: Phylogenetic analysis, genome evolution.

Applications:

1. Genomics – Sequence analysis, gene prediction, comparative genomics.
2. Proteomics – Protein structure/function analysis, interactions.
3. Transcriptomics – Gene expression profiling using microarrays or RNA-seq.
4. Metabolomics – Metabolic pathway analysis.
5. Pharmacogenomics – Personalizing medicine based on genetic profile.
6. Systems Biology – Understanding complex biological systems through modeling.
7. Biological Databases – Storing and accessing biological information in resources such as GenBank, PDB, and Swiss-Prot.
8. Agriculture – Bioinformatics facilitates the use of genetic, genomic, and proteomic information to develop crops that are resistant, nutritionally enhanced, and more profitable.
9. Drug Discovery – The same data and techniques are applied in discovering therapeutic drugs.

Across these areas, the functions of bioinformatics include data storage, mining, and processing; structural and functional annotation of genes and proteins; system modeling; and drug discovery.

Central Dogma in Bioinformatics:

The central dogma describes the flow of genetic information: DNA → RNA → Protein

Bioinformatics tools assist in analyzing each step:

• DNA: Genome assembly, mutation detection
• RNA: Transcript quantification
• Protein: Function prediction, interaction modelling
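
The DNA → RNA → Protein flow can be illustrated with a minimal Python sketch. The function names, the example sequence, and the deliberately partial codon table are assumptions made for illustration; a real tool would carry the full 64-codon genetic code.

```python
# Minimal sketch of DNA -> RNA -> protein.
# Only the codons needed for the demo are listed (illustrative subset).
CODON_TABLE = {
    "AUG": "M",                          # start codon, methionine
    "GCU": "A", "GCC": "A",              # alanine
    "UUU": "F", "UUC": "F",              # phenylalanine
    "GGA": "G", "GGC": "G",              # glycine
    "AAA": "K", "AAG": "K",              # lysine
    "UAA": "*", "UAG": "*", "UGA": "*",  # stop codons
}


def transcribe(coding_dna: str) -> str:
    """Return the mRNA for a coding-strand DNA sequence (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate mRNA codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        # "X" marks a codon that is missing from the demo table.
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


dna = "ATGGCCTTTAAAGGATAA"  # hypothetical coding sequence
mrna = transcribe(dna)
print(mrna)
print(translate(mrna))
```

Here the DNA step is a plain string, the RNA step is a character substitution, and the protein step is a table lookup over non-overlapping triplets.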

Major Bioinformatics Databases:

Database                   Description
GenBank                    Nucleotide sequence database maintained by NCBI
EMBL                       European Molecular Biology Laboratory nucleotide archive
DDBJ                       DNA Data Bank of Japan
ZINC                       Database of commercially available molecules for computational screening
TrEMBL                     Computer-annotated supplement of Swiss-Prot
PDB (Protein Data Bank)    3D structures of proteins and nucleic acids
Pfam                       Protein family and domain database
KEGG                       Kyoto Encyclopedia of Genes and Genomes – pathways and function
BLAST                      Basic Local Alignment Search Tool; a tool for comparing sequences rather than a database
GEO                        Gene Expression Omnibus; a functional genomics database containing gene expression and microarray data
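
Sequence comparison of the kind BLAST performs can be sketched with a local alignment score. The code below is not BLAST itself, which is a faster heuristic; it is the exact Smith-Waterman dynamic programme, shown as a minimal illustration. The scoring values (match 2, mismatch -1, gap -2) and the function name are assumptions chosen for the example.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between a and b (linear gap penalty)."""
    previous = [0] * (len(b) + 1)  # scores for the previous row of the matrix
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            current[j] = max(
                0,                       # start a new local alignment
                previous[j - 1] + pair,  # align a[i-1] with b[j-1]
                previous[j] + gap,       # gap in b
                current[j - 1] + gap,    # gap in a
            )
            best = max(best, current[j])
        previous = current
    return best


print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

The two hypothetical sequences share the stretch ACG, so the best local score comes from those three matches.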

Introduction to Chemoinformatics

Definition:

Chemoinformatics (or cheminformatics) is the use of computer and information techniques to solve chemical problems. It involves data storage, structure searching, and data analysis relevant to chemical and biological information.
Scope:
Applied to collections of small molecules of any size N, from tens or thousands of compounds to libraries of about 10⁶ and on to the roughly 10⁶⁰ molecules often estimated for drug-like chemical space.
Fields of Application:
Medical science: Developing novel and effective drugs.
Material science: Creating new and superior materials.
Allied fields: Agrochemicals and biotechnology.

Goals of Chemoinformatics:

• Convert chemical data into a digital form that is easily accessible and analyzable.
• Facilitate drug design and discovery using computational tools.
• Develop chemical databases and virtual libraries.
• Predict physicochemical and biological properties of molecules.

Key Concepts:

1. Molecular Representations – SMILES, InChI, 2D & 3D formats
2. Descriptors – Numeric values that describe molecular features
3. QSAR Modeling – Quantitative Structure-Activity Relationship to predict biological activity
4. Virtual Screening – In silico filtering of large libraries for active compounds
5. Scaffold Hopping – Identifying new chemical cores with the same activity
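
As a rough illustration of how a descriptor feeds a QSAR model, the sketch below fits a one-descriptor linear model by ordinary least squares. The training values (logP against pIC50) are made up for the example, and real QSAR models typically use many descriptors and proper validation.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical training set: one descriptor (logP) and one activity (pIC50).
logp = [1.0, 2.0, 3.0, 4.0]
pic50 = [5.1, 5.9, 7.1, 7.9]

slope, intercept = fit_line(logp, pic50)


def predict(descriptor_value: float) -> float:
    """Predicted activity for a new compound from its descriptor value."""
    return slope * descriptor_value + intercept


print(round(slope, 2), round(intercept, 2))
print(round(predict(2.5), 2))
```

The fitted line is the "quantitative" part of QSAR: once slope and intercept are known, activity for an untested compound is estimated from its descriptor alone.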
Open source software:
Free and open-source software (FOSS) tools are programs whose source code anyone can download, use, and modify. Some FOSS licenses, such as the GNU Lesser General Public License (LGPL), additionally require that modified versions of the licensed code be made available again under the same terms when they are distributed.
Key Areas:

1. Molecular Representations:
o SMILES (Simplified Molecular Input Line Entry System)
o InChI (IUPAC International Chemical Identifier)
o 2D & 3D structures used in modeling and visualization.
2. Chemical Databases:
o PubChem
o ChEMBL
o ZINC database
o DrugBank
3. Structure-Activity Relationship (SAR):
o Examines the relationship between a compound’s structure and its biological activity.
o QSAR (Quantitative SAR): Uses statistical models to predict activity quantitatively.
4. Virtual Screening:
o Computational technique to screen large libraries of compounds.
o Saves time and cost compared to traditional lab screening.
5. Drug Design Tools:
o Ligand-based design (pharmacophore modeling)
o Structure-based design (docking studies, molecular dynamics)
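
Ligand-based virtual screening is often done by ranking a library by fingerprint similarity to a known active. The sketch below uses the Tanimoto coefficient on fingerprints represented as sets of "on" bit positions. The compound names, bit sets, and the 0.5 threshold are hypothetical; in practice the fingerprints would come from a cheminformatics toolkit.

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto coefficient: shared bits divided by all bits set in either."""
    if not fp_a and not fp_b:
        return 0.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)


def screen(query: set, library: dict, threshold: float = 0.5) -> list:
    """Return (name, similarity) pairs at or above threshold, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical fingerprints: each set holds the positions of the "on" bits.
library = {
    "cmpd_1": {1, 4, 7, 9, 12},
    "cmpd_2": {1, 4, 7, 9, 15},
    "cmpd_3": {2, 3, 5},
    "cmpd_4": {1, 4, 8, 20, 21, 22},
}
query = {1, 4, 7, 9, 12}  # fingerprint of a known active (assumed)

for name, score in screen(query, library):
    print(name, round(score, 3))
```

Only the compounds that share most of their bits with the query survive the threshold, which is how a large library is narrowed before any laboratory work.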

Applications:

• Lead identification and optimization in drug discovery.
• Prediction of ADME-Tox properties (Absorption, Distribution, Metabolism, Excretion, Toxicity).
• Design of novel compounds with desired bioactivity.
• Chemical data mining and knowledge discovery.
• Tasks such as spectra simulation, structure elucidation, reaction modeling, and synthesis planning.
Cheminformatics tools
JChemPaint (JCP):
Open-source tool for drawing and editing 2D chemical structures.
Developed using the Chemistry Development Kit (CDK).
RDKit:
Offers molecule drawing and editing capabilities.
Includes a Python API for integration into data analysis workflows.
ChemicalToolbox:
Provides a user-friendly graphical interface for cheminformatics analysis.
Built on the Galaxy platform for reproducible workflows.
ChemDraw:
A widely used commercial software for creating chemical structure diagrams.
Supports advanced features like reaction mechanisms and spectral analysis.
Avogadro:
Open-source molecular editor and visualization tool.
Ideal for 3D modeling and quantum chemistry calculations.
MayaChemTools:
Includes cheminformatics utilities for molecular design and analysis.
Offers command-line tools but integrates with graphical environments.
COMMERCIAL CHEMINFORMATICS TOOLS FOR CHEMICAL STRUCTURE CREATION AND EDITING:
ChemDraw:
A leading software for drawing and editing chemical structures.
Offers advanced features like reaction mechanisms, spectral analysis, and 3D modeling.
MarvinSketch:
A powerful tool for creating and visualizing chemical structures.
Supports a wide range of file formats and chemical calculations.
BIOVIA Draw:
Provides tools for creating publication-quality chemical drawings.
Integrates with other BIOVIA software for cheminformatics and molecular modeling.
ACD/ChemSketch:
A comprehensive tool for drawing chemical structures and reactions.
Includes features for property prediction and molecular modeling.
ACD/ChemSketch is freeware for drawing chemical structures, including organics, organometallics, polymers, and Markush structures. It has options for structure cleaning, viewing and naming, InChI conversion, stereo descriptors, and more. The freeware version comes without technical support and has fewer functions than the commercial version, which adds structure search capabilities.
Molecular Operating Environment (MOE):
A versatile platform for molecular modeling and cheminformatics.
Supports structure creation, visualization, and analysis.

Major Chemoinformatics Databases

Database         Description
PubChem          Open chemistry database by NCBI; includes structure and bioactivity
ChEMBL           Bioactivity database of drug-like small molecules
ZINC Database    Free database of commercially available compounds for virtual screening
DrugBank         Comprehensive drug and drug target information
ChemSpider       Free chemical structure database by the Royal Society of Chemistry
BindingDB        Database of binding affinities between protein targets and small molecules
PDBbind          Contains experimentally measured binding affinities for protein-ligand complexes

ADME DATABASES
ADME databases are specialized resources that provide information on the Absorption, Distribution,
Metabolism, and Excretion (ADME) properties of drugs. These databases are crucial for drug
discovery and development, as they help predict how a drug will behave in the human body. Here are
a few notable ADME databases:
Fujitsu's ADME Database: This database contains over 130,000 entries related to pharmacokinetics,
including data on drug-metabolizing enzymes (like cytochrome P450s) and transporters. It provides
both in vitro and human clinical drug interaction data.
WangLab's ADME Databases: These include multiple datasets such as water solubility, Caco-2
permeability, blood-brain permeability, and oral bioavailability. They are useful for benchmarking
experiments and building predictive models.
ADME@NCATS: Developed by the National Center for Advancing Translational Sciences (NCATS) at the National Institutes of Health, this resource offers in silico prediction models for various ADME properties. It allows users to input molecular data and receive predictions with confidence scores.
Key ADME Databases and Tools

Database/Tool            Description
ADMETlab                 An online platform for comprehensive ADMET property prediction (covers absorption, distribution, metabolism, excretion, and toxicity).
SwissADME                Free web tool by the Swiss Institute of Bioinformatics for evaluating pharmacokinetics, drug-likeness, and medicinal chemistry friendliness of small molecules.
pkCSM                    Predicts pharmacokinetic properties using graph-based signatures. Offers data on absorption, BBB permeability, metabolism, and more.
PreADMET                 Web-based application for predicting ADME and toxicity properties based on molecular structure.
admetSAR                 A structure-activity relationship-based database and prediction tool for over 60 ADMET properties.
QikProp (Schrödinger)    Commercial software that provides accurate predictions of ADME properties. Useful in virtual screening and lead optimization.
Toxtree                  Open-source application to estimate toxic hazard using decision tree approaches; includes some ADME-related predictions.
eTOX (eTRANSAFE)         A collaborative project integrating data from pharmaceutical industries to build a shared ADMET database.
OCHEM                    Online Chemical Modeling Environment – supports QSAR modeling for ADMET and toxicity prediction.
FAF-Drugs4               Free web service for filtering chemical libraries based on physicochemical properties and ADMET rules.
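
Filtering a library on physicochemical properties can be sketched with Lipinski's rule of five, a widely used drug-likeness rule (molecular weight at most 500, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors, with one violation commonly tolerated). This is an illustrative stand-in, not the rule set of any tool listed above, and the compounds and their property values are invented.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str
    mol_weight: float   # g/mol
    logp: float         # octanol-water partition coefficient
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors


def rule_of_five_violations(c: Compound) -> int:
    """Count how many of the four Lipinski thresholds are exceeded."""
    checks = [
        c.mol_weight > 500,
        c.logp > 5,
        c.h_donors > 5,
        c.h_acceptors > 10,
    ]
    return sum(checks)


def passes_filter(c: Compound, max_violations: int = 1) -> bool:
    """Keep compounds with at most max_violations rule-of-five violations."""
    return rule_of_five_violations(c) <= max_violations


# Hypothetical library entries with made-up property values.
compounds = [
    Compound("cmpd_a", 350.4, 2.1, 2, 5),
    Compound("cmpd_b", 612.7, 6.3, 3, 8),
    Compound("cmpd_c", 520.0, 4.0, 1, 9),
]

kept = [c.name for c in compounds if passes_filter(c)]
print(kept)
```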

Commonly Predicted ADME Properties:

• Absorption: Caco-2 permeability, HIA (Human Intestinal Absorption), P-gp substrate
• Distribution: Volume of distribution, BBB (Blood-Brain Barrier) penetration, plasma protein binding
• Metabolism: CYP450 enzyme inhibition/induction (e.g., CYP3A4, CYP2D6)
• Excretion: Renal clearance, half-life prediction
• Toxicity (Tox): Ames mutagenicity, hepatotoxicity, cardiotoxicity (hERG inhibition)
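
Two of the properties above, volume of distribution and clearance, together determine the elimination half-life. The sketch below assumes a simple one-compartment model with first-order elimination, where t½ = ln(2) × Vd / CL; the function name and the numeric values are illustrative only.

```python
import math


def elimination_half_life(volume_of_distribution_l: float,
                          clearance_l_per_h: float) -> float:
    """Half-life in hours, assuming one compartment and first-order elimination."""
    if clearance_l_per_h <= 0:
        raise ValueError("clearance must be positive")
    return math.log(2) * volume_of_distribution_l / clearance_l_per_h


# Hypothetical drug: Vd = 42 L, total clearance = 7 L/h.
half_life_h = elimination_half_life(42.0, 7.0)
print(round(half_life_h, 2))
```

A larger volume of distribution or a lower clearance both lengthen the half-life, which is why these properties are usually predicted together.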

Databases
Chemical Databases
Chemical databases store information about chemical compounds, including their:
• Structures (2D & 3D)
• Molecular formulas
• Physical and chemical properties
• Spectral data
• Synthetic routes
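
A stored molecular formula already allows simple properties to be derived. As a minimal sketch, the code below parses a flat formula such as C9H8O4 (aspirin) and sums standard atomic masses to get the molecular weight. The small mass table and the function names are assumptions for the example, and formulas with parentheses or charges are not handled.

```python
import re

# Standard atomic masses (g/mol) for a few common elements; illustrative subset.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007,
               "O": 15.999, "S": 32.06, "Cl": 35.45}


def parse_formula(formula: str) -> dict:
    """Return element counts for a flat formula such as 'C9H8O4'."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def molecular_weight(formula: str) -> float:
    """Sum of atomic masses; raises KeyError for elements not in the table."""
    return sum(ATOMIC_MASS[s] * n for s, n in parse_formula(formula).items())


record = {"name": "aspirin", "formula": "C9H8O4"}  # minimal database-style record
print(parse_formula(record["formula"]))
print(round(molecular_weight(record["formula"]), 3))
```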

Applications:

• Structure searching (e.g., substructure or similarity)
• Virtual screening
• Property prediction (e.g., solubility, boiling point)
• Compound procurement or sourcing

1. PubChem: A comprehensive repository for chemical molecules and their activities.
2. ChemSpider: A free chemical structure database with over 130 million structures.
3. ChEMBL: Focuses on bioactive molecules with drug-like properties.

These contain information on chemical structures, properties, and identifiers.

Database      Description
PubChem       Open chemistry database maintained by NCBI; contains compounds, substances, and bioassays.
ChemSpider    Free database by the Royal Society of Chemistry with data on chemical structures, spectra, properties.
ZINC          Contains purchasable chemical compounds for virtual screening; widely used in docking studies.
ChEBI         Chemical Entities of Biological Interest – focuses on small molecules of biological relevance.
MolPort       Commercial compound sourcing database for screening and synthesis.
Reaxys        Commercial database for chemical reaction and substance data, including experimental conditions.
eMolecules    Offers searchable information on commercially available molecules.

Biochemical Databases
Biochemical databases contain data related to biological macromolecules, such as:
• Proteins
• DNA/RNA sequences
• Enzymes
• Pathways
• Biological interactions
Applications:

• Protein structure prediction
• Sequence alignment and comparison
• Functional annotation
• Understanding metabolic and signaling pathways
1. GenBank: A nucleotide sequence database maintained by NCBI.
2. Protein Data Bank (PDB): Contains 3D structural data of biomolecules.
3. UniProt: A protein sequence and functional information database.

Database                   Description
UniProt                    Protein sequence and functional information (Swiss-Prot & TrEMBL).
PDB (Protein Data Bank)    3D structures of proteins, DNA, RNA, and protein-ligand complexes.
KEGG                       Kyoto Encyclopedia of Genes and Genomes – pathways, genes, and chemical functions.
BioCyc                     Provides data on metabolic pathways and genomes from various organisms.
STRING                     Database of known and predicted protein-protein interactions.
Pfam                       Protein families database based on conserved domains.
NCBI Gene                  Gene-specific information including sequences, variants, and expression.

Pharmaceutical Databases
Pharmaceutical databases provide detailed data on drugs and drug targets, including:
• Approved and experimental drugs
• Drug structures
• Mechanisms of action
• Pharmacokinetics (ADME)
• Side effects
• Clinical trial data

Applications:

• Drug repurposing and discovery
• Target identification
• Pharmacogenomics (gene-drug interactions)
• ADMET prediction
• Safety and efficacy evaluation

1. DrugBank: Offers detailed drug and drug target information.
2. FDA Drug Databases: Includes drug approvals, safety data, and more.
3. Pharma Data: Provides insights into clinical trials, market trends, and patient demographics.

Database                              Description
DrugBank                              Detailed chemical, pharmacological, and pharmaceutical data on drugs and their targets.
ChEMBL                                Bioactivity database with drug-like small molecules and their activity on biological targets.
ClinicalTrials.gov                    Registry of clinical trials worldwide, including drug interventions and outcomes.
PharmGKB                              Focuses on how genetic variation affects drug response (pharmacogenomics).
Drugs@FDA                             U.S. FDA database of approved drugs, including labels and regulatory status.
DailyMed                              Provides FDA-approved drug labeling and dosage information.
RxList                                Offers detailed drug descriptions and uses, mainly for consumer health reference.
SIDER                                 Side Effect Resource – information on adverse drug reactions.
TTD (Therapeutic Target Database)     Information on known and explored therapeutic protein and nucleic acid targets.
