18-DIP - Database of Interacting Proteins-26-09-2024

The document discusses databases for biological interactions, focusing on the BioGRID, STRING, and DIP databases, which catalog protein-protein interactions and related data. BioGRID is a curated resource that includes over 1.7 million interactions from various species, while STRING integrates known and predicted protein interactions and supports advanced search features. DIP catalogs experimentally determined interactions and is part of the International Molecular Exchange Consortium, facilitating collaboration among major interaction data providers.

Uploaded by

Piyush Kumar


Databases for biological interactions

PPI network components

• Nodes represent proteins (for example, signaling receptors).

• A line connecting two nodes is an edge; it represents an interaction between the two proteins.

• A hub gene is a gene that is highly connected to other genes in a network and is thought to be important for the regulation of the network as a whole.

• Seeds are the proteins used to start the analysis and around which the network is built.
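
The components above can be illustrated with a small sketch. The protein names and interactions are made up for illustration, and treating the node with the most edges as the hub is a simplifying assumption rather than a definition taken from any particular tool.

```python
from collections import defaultdict

# Hypothetical edge list: each pair is one edge (interaction) between two nodes (proteins).
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P4", "P5")]


def build_network(edge_list):
    """Return an adjacency map: node -> set of neighbouring nodes."""
    network = defaultdict(set)
    for a, b in edge_list:
        network[a].add(b)
        network[b].add(a)
    return dict(network)


def degrees(network):
    """Number of edges attached to each node."""
    return {node: len(neighbours) for node, neighbours in network.items()}


def expand_from_seeds(network, seeds):
    """Return the seeds plus their direct interaction partners (one expansion step)."""
    selected = set(seeds)
    for seed in seeds:
        selected |= network.get(seed, set())
    return selected


network = build_network(edges)
degree = degrees(network)
hub = max(degree, key=degree.get)                # most highly connected node
subnetwork = expand_from_seeds(network, {"P5"})  # network built around the seed P5
```

Here `hub` is the protein with the largest number of interaction partners, and `subnetwork` contains the seed together with its direct neighbours.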
BioGRID (The Biological General Repository for Interaction Datasets)
BioGRID Database
• The Biological General Repository for Interaction Datasets (BioGRID) is a curated biological database of protein-protein interactions, genetic interactions, chemical interactions, and post-translational modifications.
• The BioGRID is hosted in Toronto, Ontario, Canada and Dallas, Texas, United States and is
partnered with the Saccharomyces Genome Database, FlyBase, WormBase, PomBase, and the
Alliance of Genome Resources.
• The BioGRID is funded by the National Institutes of Health (NIH) and the Canadian Institutes of Health Research (CIHR).
• It is an observer member of the International Molecular Exchange Consortium (IMEx).
• In addition to collaborating with experts in themed curation project efforts, BioGRID actively works
together with MOD and meta-database resources in order to facilitate the widespread propagation
of BioGRID records.
• Open-access database resource that houses manually curated protein and genetic interactions from
multiple species including yeast, worm, fly, mouse, and human.
• In addition to general curation across species, BioGRID undertakes themed curation projects in specific aspects of cellular regulation, for example the ubiquitin-proteasome system, as well as specific disease areas, such as COVID-19, the disease caused by the SARS-CoV-2 virus.
• BioGRID also now curates gene‐phenotype relationships from genome‐wide CRISPR
screens.
• All protein and genetic interactions annotated in BioGRID are exclusively derived from
expert manual curation of experimental data reported in peer‐reviewed publications.
• BioGRID currently holds over 1,740,000 interactions curated from both high-throughput datasets and individual focused studies, as derived from over 70,000 publications in the primary literature.
Partners
BioGRID Architecture and Data Model

• The BioGRID database architecture consists of three distinct components: (i) the Core, (ii) the Web, and (iii) the Interaction Management System (IMS).
• Each of the three components has a specific role in driving the BioGRID
system and can easily be modified to meet rapidly changing needs in data
management without entailing major changes to the applications it supports.
• All of the BioGRID databases use MySQL 5.1.
• BioGRID provides interaction data to several model organism databases,
resources such as Entrez-Gene, SGD, TAIR, FlyBase and other interaction
meta-databases.
Searches in BioGRID

• BioGRID searches can be performed by clicking on the ‘gene’ tab on the main search page.
• To perform a search, simply enter a term; the search engine will look for matching identifiers.
• The advanced search option can also be used to narrow down the results.
Search options at BioGRID: wildcard searches and publication searches
Wildcard searches
• The BioGRID supports wildcard searching (i.e. searching where the match is not exact but covers a range of possible matches) on the tail end of any keyword entered in the search field.
• To perform a wildcard search, simply enter your prefix (must be 3 letters or more)
followed by a star (*).
• Examples: STE*, CDC*, YAL01*, CLN*
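
The wildcard rule can be mimicked locally with a short sketch. This is not BioGRID's own search code: the function name, the identifier list, and the case-insensitive matching are assumptions made for illustration. Only the rule itself (a prefix of at least 3 characters followed by a star) comes from the slide.

```python
def wildcard_search(query, identifiers):
    """Match a trailing-wildcard query such as 'STE*' against a list of identifiers."""
    if not query.endswith("*"):
        # No wildcard: fall back to an exact match (case-insensitive by assumption).
        return [name for name in identifiers if name.upper() == query.upper()]
    prefix = query[:-1]
    if len(prefix) < 3 or "*" in prefix:
        raise ValueError("prefix must be at least 3 characters, with '*' only at the end")
    return [name for name in identifiers if name.upper().startswith(prefix.upper())]


# Hypothetical identifier list for illustration.
identifiers = ["STE2", "STE11", "CDC28", "CLN1", "CLN2", "YAL017W"]
ste_hits = wildcard_search("STE*", identifiers)
yal_hits = wildcard_search("YAL01*", identifiers)
```

A query such as `ST*` is rejected because its prefix is shorter than three characters.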

Publication Searches

• Publications can be searched by PubMed ID, author name, or keyword terms.

Example: 9006895
BioGRID-ORCS
• A recent extension of BioGRID, named the Open Repository of CRISPR Screens (ORCS, orcs.thebiogrid.org), captures single mutant phenotypes and genetic interactions from published high-throughput genome-wide CRISPR/Cas9-based genetic screens.
• ORCS is updated on a quarterly basis and is fully searchable by gene/protein, phenotype, cell line, authors, and other attributes. All data in ORCS can be downloaded in standard formats.
• BioGRID-ORCS covers 360 publications and 93,209 genes, returning 1,982 CRISPR screens from 5 major model organism species, 801 cell lines, and 135 cell types.
Themed Curation Projects

• Due to the overwhelming size of published scientific literature containing human (Homo
sapiens) gene, protein, and chemical interactions, BioGRID has taken a targeted, project-based
approach to curation of human interaction data in manageable collections of high impact data.
• These themed curation projects represent central biological processes with disease relevance
such as chromatin modification, autophagy, and the ubiquitin-proteasome system or diseases of
interest including glioblastoma, Fanconi Anemia, and COVID-19.
• As of 18 October 2020, BioGRID themed curation project efforts have resulted in the extraction
of 424,631 interactions involving 2,361 proteins from more than 37,000 scientific articles.
STRING (Search Tool for the Retrieval of Interacting Genes/Proteins)
STRING
• STRING is a database of known and predicted protein-protein interactions. The database aims to
integrate all known and predicted associations between proteins, including both physical
interactions as well as functional associations.
• The STRING database contains information from numerous sources, including experimental data,
computational prediction methods and public text collections.
• Freely accessible.
• STRING has been developed by a consortium of academic institutions including CPR, EMBL, KU,
SIB, TUD and UZH.
• The STRING database currently covers 59,309,604 proteins from 12,535 organisms.
• Apart from the website, the database can be queried directly from within Cytoscape and from
within R (via a Bioconductor package).
Why is STRING Important?

• Proteins are essential for almost every function in a living cell. Understanding how they interact helps us figure out how cells work, how diseases develop, and how we might treat them.

• STRING helps visualize these interactions in the form of networks, making it easier to see how proteins connect with each other.
Features
• In STRING, each protein-protein interaction is annotated with one or more 'scores'.
• These scores are indicators of confidence, i.e. how likely STRING judges an interaction to be true, given the available evidence. All scores range from 0 to 1, with 1 being the highest possible confidence. A score of 0.5 would indicate that roughly every second interaction might be erroneous (i.e., a false positive).
• Results of the various computational predictions can be inspected from different designated
views.
• There are two modes of STRING: Protein-mode and COG-mode (clusters of orthologous groups).
• In COG-mode, predicted interactions are transferred by orthology: an interaction described in one organism is propagated to the orthologous proteins in other organisms.
• A web interface is available to access the data and to give a fast overview of the proteins
and their interactions.
• A Cytoscape plug-in is available for using STRING data.
• STRING data can also be accessed through the application programming interface (API) by constructing a URL that contains the request.
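
As a hedged sketch of what "constructing a URL that contains the request" can look like, the snippet below only builds the URL and does not send it. The base address, the `network` method, and the parameter names `identifiers`, `species`, and `required_score` follow the public STRING API documentation as I understand it, and should be checked against the current documentation before use. The helper name `build_string_url` and the default threshold are hypothetical.

```python
from urllib.parse import urlencode

STRING_API_BASE = "https://string-db.org/api"  # assumed base address of the STRING API


def build_string_url(identifiers, species, min_confidence=0.4,
                     output_format="tsv", method="network"):
    """Build (but do not send) a STRING API request URL."""
    if not 0.0 <= min_confidence <= 1.0:
        raise ValueError("min_confidence must be between 0 and 1")
    params = {
        # Multiple identifiers are separated by a carriage return (encoded as %0D).
        "identifiers": "\r".join(identifiers),
        # NCBI taxonomy identifier, e.g. 9606 for human.
        "species": species,
        # The API takes the confidence threshold scaled to 0-1000.
        "required_score": round(min_confidence * 1000),
    }
    return f"{STRING_API_BASE}/{output_format}/{method}?{urlencode(params)}"


url = build_string_url(["TP53", "EGFR"], species=9606, min_confidence=0.7)
```

The resulting `url` can then be opened in a browser or fetched with any HTTP client. The 0 to 1 confidence scale described in the Features list maps onto the 0 to 1000 scale of the `required_score` parameter.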
Data sources
• STRING imports data from experimentally derived protein–protein interactions through
literature curation.
• STRING also stores computationally predicted interactions from: (i) text mining of scientific texts, (ii) interactions computed from genomic features, and (iii) interactions transferred from model organisms based on orthology.
• All predicted or imported interactions are benchmarked against a common reference of
functional partnership as annotated by KEGG (Kyoto Encyclopedia of Genes and
Genomes).
• Interactions in STRING are derived from five main sources:
Imported data

STRING imports protein association knowledge from databases of physical interaction and databases
of curated biological pathway knowledge (MINT, HPRD, BIND, DIP, BioGRID, KEGG, Reactome,
IntAct, EcoCyc, NCI-Nature Pathway Interaction Database, GO).

Text mining

A large body of scientific texts (SGD, OMIM, FlyBase, PubMed) is parsed to search for statistically relevant co-occurrences of gene names.
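
The idea of counting co-occurrences of gene names can be sketched as follows. This is a toy illustration, not STRING's text-mining pipeline: it simply counts how many texts mention two names together and applies no statistical test. The example sentences and the function name are made up.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrence_counts(texts, gene_names):
    """Count how many texts mention each pair of gene names together."""
    counts = Counter()
    for text in texts:
        # Split the text into upper-case alphanumeric tokens.
        tokens = set(re.findall(r"[A-Za-z0-9]+", text.upper()))
        mentioned = sorted(name for name in gene_names if name.upper() in tokens)
        for pair in combinations(mentioned, 2):
            counts[pair] += 1
    return counts


# Made-up example sentences for illustration.
texts = [
    "CDC28 binds CLN2 during G1.",
    "CLN2 levels depend on CDC28 activity.",
    "STE2 is a pheromone receptor.",
]
pairs = cooccurrence_counts(texts, ["CDC28", "CLN2", "STE2"])
```

A real pipeline would also have to decide whether a raw count is higher than expected by chance, which is the "statistically relevant" part of the description above.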
Predicted data

• Neighborhood: Similar genomic context in different species suggests a similar function of the proteins.
• Fusion-fission events: Proteins that are fused in some genomes are very likely to be functionally linked (as in other genomes where the genes are not fused).
• Occurrence: Proteins that have a similar function or occur in the same metabolic pathway tend to be present or absent together across genomes, and therefore have similar phylogenetic profiles.
• Coexpression: Predicted association between genes based on observed patterns of simultaneous expression of genes.
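
As one possible illustration of the coexpression channel, the sketch below scores pairs of genes by the Pearson correlation of their expression profiles. Pearson correlation is a common choice that is swapped in here for illustration; the slides do not say which measure STRING uses. The gene names and expression values are invented.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long profiles."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


# Invented expression values across four conditions.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
r_ab = pearson(expression["geneA"], expression["geneB"])  # rises together
r_ac = pearson(expression["geneA"], expression["geneC"])  # moves in opposite directions
```

Pairs with a correlation close to 1 across many conditions would be candidates for a coexpression link.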
[Figure: illustrations of the Occurrence, Neighborhood, Fusion-fission events, and Coexpression evidence types]
Searches in STRING
[Figure: STRING search results page; labels: Network, Legend, Data Settings]
DIP: The Database of Interacting Proteins
DIP
• The Database of Interacting Proteins (DIP) is a biological database which catalogs
experimentally determined interactions between proteins.
• It combines information from a variety of sources to create a single, consistent set of
protein–protein interactions.
• The data stored within DIP have been curated, both manually, by expert curators, and
automatically, using computational approaches that utilize the knowledge about the
protein–protein interaction networks extracted from the most reliable, core subset of the
DIP data.
• The database was initially released in 2002. As of 2014, DIP is curated by the research
group of David Eisenberg at UCLA.
DIP
• DIP is a member of the International Molecular Exchange Consortium (IMEx), a group of the major public providers of interaction data.
• Other participating databases include the Biomolecular Interaction Network Database
(BIND), IntAct, the Molecular Interaction Database (MINT), MIPS, MPact, and BioGRID.
• The databases of IMEx work together to prevent duplications of effort, collecting data from
non-overlapping sources and sharing the curated interaction data.
• DIP is useful for understanding protein function and protein–protein relationships, studying
the properties of networks of interacting proteins, benchmarking predictions of protein–
protein interactions, and studying the evolution of protein–protein interactions.
The DIP database is composed of nodes and edges:

• DIP Nodes (proteins)
Each protein in the DIP database has a unique identifier of the form <DIP:nnnN> (e.g. DIP:310N). These proteins are linked to major protein databases such as PIR, SWISS-PROT, and GenBank.
• DIP Edges (interactions)
Each interaction between proteins is also given a unique identifier of the form <DIP:nnnE> (e.g. DIP:1234E). Each edge record provides access to information such as the region involved in the interaction, the dissociation constant, and the experimental methods used to identify and characterize the interaction.
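
The identifier scheme can be handled with a short parser. This sketch assumes the `DIP:nnnN` / `DIP:nnnE` notation exactly as written above; identifiers downloaded from DIP may use a different separator, so the pattern is an assumption to adjust. The function name is hypothetical.

```python
import re

# Assumed pattern: "DIP:", a number, then N (node/protein) or E (edge/interaction).
DIP_ID = re.compile(r"^DIP:(\d+)([NE])$")


def parse_dip_id(identifier):
    """Split a DIP identifier into its kind ('node' or 'edge') and number."""
    match = DIP_ID.match(identifier.strip())
    if match is None:
        raise ValueError(f"not a DIP identifier: {identifier!r}")
    number, suffix = match.groups()
    kind = "node" if suffix == "N" else "edge"
    return kind, int(number)


node = parse_dip_id("DIP:310N")
edge = parse_dip_id("DIP:1234E")
```

The trailing letter alone tells a program whether a record refers to a protein or to an interaction.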
Relational structure of DIP

The DIP database is composed of three linked tables: a table of protein information, a table of protein–protein
interactions, and a table describing details of experiments detecting the protein–protein interactions.

(i) The protein information table contains protein identification codes from the SWISS-PROT, PIR and GenBank
sequence databases, as well as each protein’s gene name, description, enzyme code and cellular localization,
when known.

(ii) The interaction table describes proteins that interact from the protein information table, as well as the ranges
of amino acids and the protein domains involved in the protein–protein interaction, when known.

(iii) The experimental article table details the experiments used to detect the interactions from the interaction
table and their associated literature citations. This table includes the MEDLINE standard article code
(PMID/UID), as well as the authors, title, journal and year of publication of the article. Over 20 different
experimental techniques are represented in DIP, including co-immunoprecipitation, yeast two-hybrid and in
vitro binding assays. Where determined, a dissociation constant is also included.
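
The three linked tables can be sketched with an in-memory SQLite database. All table names, column names, and rows below are hypothetical placeholders chosen to mirror the description above; they are not DIP's actual schema or data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE protein (                -- (i) protein information
    protein_id   INTEGER PRIMARY KEY,
    gene_name    TEXT,
    swissprot_id TEXT,
    description  TEXT
);
CREATE TABLE interaction (            -- (ii) protein-protein interactions
    interaction_id INTEGER PRIMARY KEY,
    protein_a      INTEGER NOT NULL REFERENCES protein(protein_id),
    protein_b      INTEGER NOT NULL REFERENCES protein(protein_id),
    residue_range  TEXT               -- amino acid range, when known
);
CREATE TABLE experiment (             -- (iii) experiments and citations
    experiment_id  INTEGER PRIMARY KEY,
    interaction_id INTEGER NOT NULL REFERENCES interaction(interaction_id),
    technique      TEXT,
    pmid           TEXT,
    kd_molar       REAL               -- dissociation constant, where determined
);
""")

# Placeholder rows, not real DIP records.
conn.executemany("INSERT INTO protein VALUES (?, ?, ?, ?)", [
    (1, "GENE1", "X00001", "placeholder protein 1"),
    (2, "GENE2", "X00002", "placeholder protein 2"),
])
conn.execute("INSERT INTO interaction VALUES (1, 1, 2, NULL)")
conn.execute("INSERT INTO experiment VALUES (1, 1, 'yeast two-hybrid', NULL, NULL)")


def interactions_for_gene(connection, gene_name):
    """Return (partner A, partner B, technique) rows involving the given gene."""
    rows = connection.execute("""
        SELECT pa.gene_name, pb.gene_name, e.technique
        FROM interaction AS i
        JOIN protein AS pa ON pa.protein_id = i.protein_a
        JOIN protein AS pb ON pb.protein_id = i.protein_b
        JOIN experiment AS e ON e.interaction_id = i.interaction_id
        WHERE pa.gene_name = ? OR pb.gene_name = ?
    """, (gene_name, gene_name))
    return rows.fetchall()


hits = interactions_for_gene(conn, "GENE2")
```

Keeping experiments in their own table lets one interaction be supported by several experiments and publications without repeating the protein information.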
Searching The Database
DIP can be searched in a variety of ways. One can look for interactions
involving a specific protein by entering its gene name or its accession code from
GenBank, PIR or SWISS-PROT. More general searches can be performed for
information such as organisms, protein superfamilies, keywords, experimental
techniques or literature citations.
Applications of PPI databases
Network biology:
• Predict functions of genes/proteins
• Network medicine: drug targets, drug repurposing
• Molecular mechanisms behind certain phenotypes
• Systems biology
• Enhance our understanding of model organisms
• Ecological systems
Applications
• Disease Research: PPI databases are widely used to identify protein networks involved in diseases, including cancer, neurodegenerative disorders, infectious diseases, and autoimmune diseases such as rheumatoid arthritis.

• Drug Target Identification: By exploring PPIs, researchers can identify potential drug targets and understand the impact of drugs on cellular networks.

• Systems Biology: PPI data aid in the construction of large-scale interaction networks, which are crucial for systems biology studies aimed at understanding complex biological systems.

• Functional Genomics: Researchers can predict the function of unknown proteins based on their interactions with well-characterized proteins.
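
The functional genomics use case is often described as "guilt by association". A minimal sketch, assuming a simple majority vote over annotated interaction partners (real methods are more elaborate), with invented protein names and annotations:

```python
from collections import Counter


def predict_function(protein, interactions, annotations):
    """Guess a function by majority vote over annotated interaction partners."""
    votes = Counter()
    for a, b in interactions:
        if protein in (a, b):
            partner = b if a == protein else a
            # Ignore self-interactions and partners without an annotation.
            if partner != protein and partner in annotations:
                votes[annotations[partner]] += 1
    if not votes:
        return None  # no annotated partners, so no prediction
    return votes.most_common(1)[0][0]


# Invented interactions and annotations: U1 is the uncharacterized protein.
interactions = [("U1", "K1"), ("U1", "K2"), ("K3", "U1"), ("K1", "K2")]
annotations = {"K1": "DNA repair", "K2": "DNA repair", "K3": "transport"}
prediction = predict_function("U1", interactions, annotations)
```

Two of the three annotated partners of `U1` share an annotation, so that annotation is returned as the prediction.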
