CADD Unit 4
Definition:
Goals of Bioinformatics:
Scope of Bioinformatics:
• Molecular Biology: Genome sequencing, gene expression analysis.
• Structural Biology: Protein structure prediction, molecular modeling.
• Systems Biology: Pathway mapping, systems modeling.
• Drug Discovery: Target identification, virtual screening.
• Evolutionary Biology: Phylogenetic analysis, genome evolution.
Applications:
Database                  Description
GenBank                   Nucleotide sequence database maintained by NCBI
EMBL                      European Molecular Biology Laboratory nucleotide archive
DDBJ                      DNA Data Bank of Japan
ZINC                      Database of commercially available molecules for computational screening
TrEMBL                    Computer-annotated supplement of Swiss-Prot
PDB (Protein Data Bank)   3D structures of proteins and nucleic acids
Pfam                      Protein family and domain database
KEGG                      Kyoto Encyclopedia of Genes and Genomes – pathways and function
BLAST                     Basic Local Alignment Search Tool; a tool for comparing sequences
GEO                       Gene Expression Omnibus; a functional genomics database containing gene expression and microarray data
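Sequence comparison of the kind BLAST performs is built on local alignment scoring. The sketch below is not BLAST itself (BLAST uses a faster seed-and-extend heuristic); it is a minimal Smith-Waterman scorer that shows what a local alignment score means. The scoring values (match +2, mismatch -1, gap -2) and the function name are assumptions chosen for illustration.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Illustrative scoring scheme only; real tools use substitution
    matrices and affine gap penalties.
    """
    # Only the previous row is needed when we want the score alone.
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Four matching positions, each worth +2.
print(smith_waterman_score("ACGT", "ACGT"))  # 8
```

Because the score is floored at zero, unrelated sequences score 0 and only the best-matching local region contributes, which is the property that makes local alignment useful for database searches.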
Introduction to Chemoinformatics
Definition:
Chemoinformatics (or cheminformatics) is the use of computer and informational techniques to solve
chemical problems. It involves data storage, structure searching, and data analysis relevant to
chemical and biological information.
Scope:
Applied to collections of small molecules of widely varying size (N ≈ 10, 100, 1,000 ... 10⁶), up to the roughly 10⁶⁰ molecules estimated for drug-like chemical space.
Fields of Application:
Medical science: Developing novel and effective drugs.
Material science: Creating new and superior materials.
Allied fields: Agrochemicals and biotechnology.
Goals of Chemoinformatics:
• Convert chemical data into a digital form that is easily accessible and analyzable.
• Facilitate drug design and discovery using computational tools.
• Develop chemical databases and virtual libraries.
• Predict physicochemical and biological properties of molecules.
Key Concepts:
1. Molecular Representations:
o SMILES (Simplified Molecular Input Line Entry System)
o InChI (IUPAC International Chemical Identifier)
o 2D & 3D structures used in modeling and visualization.
2. Chemical Databases:
o PubChem
o ChEMBL
o ZINC database
o DrugBank
3. Structure-Activity Relationship (SAR):
o Examines the relationship between a compound’s structure and its biological activity.
o QSAR (Quantitative SAR): Uses statistical models to predict activity quantitatively.
4. Virtual Screening:
o Computational technique to screen large libraries of compounds.
o Saves time and cost compared to traditional lab screening.
5. Drug Design Tools:
o Ligand-based design (pharmacophore modeling)
o Structure-based design (docking studies, molecular dynamics)
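As an illustration of ligand-based virtual screening, the sketch below ranks a small library by Tanimoto similarity to a query molecule. It assumes each molecule has already been converted to a fingerprint, represented here as a set of "on" bit positions; the fingerprints, compound names, and the 0.4 threshold are hypothetical and chosen only to make the example readable. In practice a cheminformatics toolkit would generate the fingerprints from SMILES or InChI.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared bits divided by total distinct bits."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def screen(query, library, threshold=0.5):
    """Return (name, similarity) pairs at or above threshold, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical fingerprints: sets of bit positions that are switched on.
query = {1, 4, 7, 9}
library = {
    "cmpd_A": {1, 4, 7, 9},  # identical to the query
    "cmpd_B": {1, 4, 8},     # shares two bits with the query
    "cmpd_C": {2, 3},        # shares nothing with the query
}

for name, score in screen(query, library, threshold=0.4):
    print(name, round(score, 2))
```

Only compounds that pass the similarity threshold would move on to docking or experimental testing, which is where the time and cost savings over laboratory screening come from.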
Applications:
Database        Description
PubChem         Open chemistry database by NCBI; includes structure and bioactivity
ChEMBL          Bioactivity database of drug-like small molecules
ZINC Database   Free database of commercially available compounds for virtual screening
DrugBank        Comprehensive drug and drug target information
ChemSpider      Free chemical structure database by Royal Society of Chemistry
BindingDB       Database of binding affinities between protein targets and small molecules
PDBbind         Contains experimentally measured binding affinities for protein-ligand complexes
ADME DATABASES
ADME databases are specialized resources that provide information on the Absorption, Distribution,
Metabolism, and Excretion (ADME) properties of drugs. These databases are crucial for drug
discovery and development, as they help predict how a drug will behave in the human body. Here are
a few notable ADME databases:
Fujitsu's ADME Database: This database contains over 130,000 entries related to pharmacokinetics,
including data on drug-metabolizing enzymes (like cytochrome P450s) and transporters. It provides
both in vitro and human clinical drug interaction data.
WangLab's ADME Databases: These include multiple datasets such as water solubility, Caco-2
permeability, blood-brain permeability, and oral bioavailability. They are useful for benchmarking
experiments and building predictive models.
ADME@NCATS: Developed by the National Center for Advancing Translational Sciences (NCATS), part of the
National Institutes of Health, this resource offers in silico prediction models for various ADME
properties. It allows users to input molecular data and receive predictions with confidence scores.
Key ADME Databases and Tools
Database/Tool   Description
PreADMET        Web-based application for predicting ADME and toxicity properties based on molecular structure.
admetSAR        A structure-activity relationship-based database and prediction tool for over 60 ADMET properties.
FAF-Drugs4      Free web service for filtering chemical libraries based on physicochemical properties and ADMET rules.
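Filtering a library on physicochemical rules can be illustrated with Lipinski's rule of five, a widely used oral drug-likeness filter. The sketch below is not the implementation of any tool listed above; it assumes the four descriptors have already been calculated, and the descriptor values and compound names are hypothetical.

```python
def lipinski_violations(mw, logp, hbd, hba):
    """Count rule-of-five violations for one molecule.

    mw   : molecular weight in Da        (violation if > 500)
    logp : octanol-water partition coeff (violation if > 5)
    hbd  : hydrogen-bond donors          (violation if > 5)
    hba  : hydrogen-bond acceptors       (violation if > 10)
    """
    return sum([mw > 500, logp > 5, hbd > 5, hba > 10])


def passes_rule_of_five(descriptors, max_violations=1):
    """Common convention: tolerate at most one violation."""
    return lipinski_violations(**descriptors) <= max_violations


# Hypothetical descriptor values for two library members.
library = {
    "cmpd_1": {"mw": 180.2, "logp": 1.2, "hbd": 1, "hba": 4},
    "cmpd_2": {"mw": 720.0, "logp": 6.3, "hbd": 6, "hba": 12},
}

kept = [name for name, desc in library.items() if passes_rule_of_five(desc)]
print(kept)  # ['cmpd_1']
```

Dedicated ADMET tools apply many more filters than this (toxicophores, solubility, permeability models), so a rule-of-five check is best read as a first, coarse pass.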
Databases
Chemical Databases
Chemical databases store information about chemical compounds, including their:
• Structures (2D & 3D)
• Molecular formulas
• Physical and chemical properties
• Spectral data
• Synthetic routes
Applications:
Database     Description
PubChem      Open chemistry database maintained by NCBI; contains compounds, substances, and bioassays.
ChemSpider   Free database by the Royal Society of Chemistry with data on chemical structures, spectra, properties.
ZINC         Contains purchasable chemical compounds for virtual screening; widely used in docking studies.
ChEBI        Chemical Entities of Biological Interest – focuses on small molecules of biological relevance.
MolPort      Commercial compound sourcing database for screening and synthesis.
Reaxys       Commercial database for chemical reaction and substance data, including experimental conditions.
eMolecules   Offers searchable information on commercially available molecules.
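Several of these databases can be queried programmatically as well as through a web browser. As a hedged example, the sketch below builds a request URL for PubChem's PUG REST service to look up basic properties by compound name. The URL layout and property names follow the PUG REST conventions as understood at the time of writing and should be checked against the current PubChem documentation; the helper name pubchem_property_url is hypothetical. The code only constructs the URL and does not contact the server.

```python
from urllib.parse import quote

PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def pubchem_property_url(
    name,
    properties=("MolecularFormula", "MolecularWeight", "InChIKey"),
):
    """Build a PUG REST URL that requests properties for a compound name."""
    # Percent-encode the name so spaces and slashes are safe in the path.
    encoded = quote(name, safe="")
    return (
        f"{PUG_REST_BASE}/compound/name/{encoded}"
        f"/property/{','.join(properties)}/JSON"
    )


print(pubchem_property_url("aspirin"))
```

The resulting URL can be opened in a browser or fetched with urllib.request.urlopen, and the JSON response parsed with the json module.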
Biochemical Databases
Biochemical databases contain data related to biological macromolecules, such as:
• Proteins
• DNA/RNA sequences
• Enzymes
• Pathways
• Biological interactions
Applications:
Database                  Description
PDB (Protein Data Bank)   3D structures of proteins, DNA, RNA, and protein-ligand complexes.
Pharmaceutical Databases
Pharmaceutical databases provide detailed data on drugs and drug targets, including:
• Approved and experimental drugs
• Drug structures
• Mechanisms of action
• Pharmacokinetics (ADME)
• Side effects
• Clinical trial data
Applications:
Database                            Description
RxList                              Offers detailed drug descriptions and uses, mainly for consumer health reference.
SIDER                               Side Effect Resource – information on adverse drug reactions.
TTD (Therapeutic Target Database)   Information on known and explored therapeutic protein and nucleic acid targets.