
Emerging research fields across database management systems
Rijul Chauhan
Department of Computer Science Engineering
Delhi Technological University, New Delhi, India
rijulchauhan4@gmail.com

Abstract
The database community is continuously exploring multidisciplinary research fields, so that general-purpose databases can now support multiple data models, extend capabilities such as spatial and graph processing, and provide data virtualization, distributed storage, and in-memory storage. These new research avenues become evident from the papers written by doctoral students and professors at universities worldwide. This paper surveys the emerging fields they describe.

1. SURVEY OF TOPICS

Our survey divides the topic into different research domains. We start with automatic database administration and tuning, followed by data mining changes from data streams, and then information retrieval and extraction.

1.1 Automatic Database Administration and Tuning
Today’s database systems all have numerous features, making it very difficult to choose among these features for the needs of the specific applications using them. For example, building indexes and views can improve performance on a given query workload, but it is often very difficult to select the necessary indexes and views because such decisions depend on how the queries are executed. As the cost of hardware has dropped dramatically, the cost of the humans who tune and manage database systems now often dominates the cost of ownership. To reduce this cost, it is desirable to automate database tuning and administration. There are many unsolved research problems in this area. First, very little work has been done on automatically tuning system parameters, and it is challenging to predict system performance after changing such parameters. Second, very little is known about how to adjust a system as its workload changes. Third, even knowing the various features to tune, it remains challenging to identify system bottlenecks.

1.2 Data Mining Changes from Data Streams
Data mining, the science of discovering meaningful knowledge in data, has always been a core area of research. A large number of emerging applications, such as network flow analysis, e-business, and online stock market analysis, have to handle various data streams. It is demanding to conduct advanced analysis and data mining over fast and large data streams to capture trends, patterns, and exceptions. Recently, some interesting results have been reported for modelling and handling data streams, such as monitoring statistics over streams and query answering. Previous studies (e.g., [21, 29]) argue that mining data streams is challenging because random access to fast and large data streams may be impossible; thus, multi-pass algorithms (i.e., ones that load data items into main memory multiple times) are often infeasible. Another problem in the area of data mining is frequent subgraph mining. Interesting research problems on mining changes in data streams can be divided into three
categories: modelling and representation of changes, mining methods, and interactive exploration of changes. Solutions to these problems have been outlined, together with preliminary experimental validation focusing on query optimization and time complexity. To the best of our knowledge, these problems have not yet been researched systematically, and the above list is by no means complete. We believe that thorough studies of these issues will bring many challenges, opportunities, and benefits to stream data processing, management, and analysis.
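Because random access to a stream is impossible and multi-pass algorithms are infeasible, stream mining typically relies on one-pass, bounded-memory summaries. As an illustrative sketch (not an algorithm from the surveyed papers), the classic Misra–Gries algorithm finds frequent items in a single pass:

```python
def misra_gries(stream, k):
    """One-pass frequent-items summary using at most k-1 counters.

    Any item occurring more than len(stream)/k times is guaranteed
    to survive among the returned candidates.
    """
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # Decrement every counter; drop those that reach zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters

# 'a' appears 6 times in a stream of 10 (> 10/3), so it must survive.
stream = ['a', 'b', 'a', 'c', 'a', 'a', 'd', 'a', 'b', 'a']
print(sorted(misra_gries(stream, k=3)))  # → ['a']
```

The summary never holds more than k-1 counters, so memory stays constant no matter how long the stream runs; this is exactly the trade-off the multi-pass discussion above motivates.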
1.3 Information retrieval
Recent years have seen an opening of the border between database research and information retrieval. Storing data and querying data go hand in hand, and this applies to both structured and unstructured data. In many scenarios, queries can be classified into “lookup searches” and “exploratory searches”: in lookup searches, users “look up” details on topics known to them; in exploratory searches, they “explore” new information. One area of information retrieval gaining more and more attention lately is the exploitation of the deep Web. Tjin-Kam-Jet proposes to address this challenge in a distributed environment. Given that the deep Web is up to two orders of magnitude larger than the surface Web, he argues that distribution might be the key to scalability. This work proposes to automatically convert free-text queries into structured queries for complex Web forms, in order to make the deep Web more easily searchable. Challenges include developing a formal query description syntax, translating queries with the correct interpretation, bridging the gap between user expectations and system capabilities, adapting query descriptions for resource selection, ranking the top-k resources, merging results from resources to maximize precision and recall, and ranking suggestions for users with respect to resources. He aims to evaluate this solution with a prototype system and user studies, the criteria being processing time and user satisfaction.

1.4 Information extraction
Information extraction, in its widest sense, is the extraction of structured data from unstructured data. One domain where large corpora of unstructured text would particularly benefit from information extraction is the medical domain. A medical report is a natural-language description of diagnoses, treatments, or medications, together with structured information about the patient. The goal is to extract a chronology of events from the reports. Such chronologies can then be used to review a patient’s history or to gather statistical data about the effectiveness or consequences of medical treatments. The task is challenging because the reports use medical jargon and colloquial temporal expressions (“two days ago”). The paper conducts two initial case studies: in the first, machine learning on medical reports is used to determine whether patients qualify for leukemia trials; in the second, a bio-specimen repository is augmented with data from medical reports. This additional data facilitates the classification of tissue probes and also information retrieval on the specimen database. Intensive research has been conducted on the challenges that arise in data integration. The first challenge is how to support interoperability of sources that have different data models (relational, XML, etc.), schemas, data representations, and querying interfaces. Wrapper techniques have been developed to solve these issues. The second challenge is how to model source contents and user queries, and two approaches have been widely adopted. In the local-as-view (LAV) approach, a collection of global predicates is used to describe source contents as views and to formulate user queries. Given a user query, the mediation system decides how to answer the query by synthesizing source views; this is called answering queries using views. Many techniques have been developed to solve this problem, and they can also be used in other database applications such as data warehousing and query optimization. Another approach to data integration, called the global-as-view (GAV),
assumes that user queries are posed directly on global views that are defined on source relations. In this approach, a query plan can be generated using a view-expansion process; researchers mainly focus on efficient query processing in this case. The third challenge is how to process and optimize queries when sources have limited query capabilities. For instance, the Amazon.com source can be viewed as a database that provides book information. However, we cannot easily download all its books; instead, we query the source by filling out Web search forms and retrieving the results. Studies have been conducted on how to model and compute source capabilities, how to generate plans for queries, and how to optimize queries in the presence of limited capabilities.
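The GAV idea can be sketched in a few lines. In this minimal illustration (the source and view names are hypothetical, not drawn from any surveyed system), each global view is defined as a query over source relations, and a user query posed on the global view is answered by expanding the view definition into a plan over the sources:

```python
# Hypothetical sources: one exports (isbn, title), the other (isbn, price).
source_titles = [("111", "Databases"), ("222", "Streams")]
source_prices = [("111", 40), ("222", 55)]

def global_book_view():
    """GAV view definition: Book(isbn, title, price) is *defined as*
    a join over the source relations, so expanding the view yields
    a query plan directly over the sources."""
    prices = dict(source_prices)
    return [(isbn, title, prices[isbn])
            for isbn, title in source_titles if isbn in prices]

def answer_query(predicate):
    """Answer a user query on the global view by view expansion:
    evaluate the view definition, then apply the query's predicate."""
    return [row for row in global_book_view() if predicate(row)]

# "Find books cheaper than 50" posed against the global view:
print(answer_query(lambda row: row[2] < 50))  # → [('111', 'Databases', 40)]
```

Under LAV the arrow is reversed: the sources would be described as views over global predicates, and answering the same query would require synthesizing those views rather than simply expanding a definition.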
1.5 Network Analysis
Network analysis is a set of techniques derived from network theory, which has evolved from computer science to demonstrate the power of social network influences. Using network analysis in domain analysis can add another layer of methodological triangulation by providing a different way to read and interpret the same data. The use of network analysis in knowledge organization domain analysis is recent and still evolving. The visualization technique involves mapping relationships among entities based on the symmetry or asymmetry of their relative proximity. Network analysis can be illustrated as a series of steps: choosing a threshold, applying the threshold to a correlation matrix to produce an adjacency matrix, and producing the network from the adjacency matrix. Like factor analysis, network analysis can begin with a correlation matrix of associations among a set of observed variables. In the first and second steps, a threshold is chosen and applied to “binarize” or “dichotomize” the correlation matrix, creating an adjacency matrix: correlations with an absolute value above the threshold are given a “1” and those below a “0”. (The binarization process is optional; an alternative, although computationally more complex, option is to construct a weighted network.) From the adjacency matrix, a network can be straightforwardly constructed: each observed variable is represented as a “node” in the network, and any pair of nodes with a “1” in the adjacency matrix is given an “edge”, or connection, between them. Note that the choice of threshold is a controversial one and can have a significant effect on the structure of the resulting network. The choice may depend on several factors: the size of the sample from which the data was drawn, the choice of type I error rate, the density of the resulting network, and the domain from which the data was drawn. Network metrics should ideally be applied across a range of thresholds to demonstrate that the result is not based on an arbitrary threshold choice. Fortunately, most network scientists are sensitive to this issue, and many networks have been observed to have robust community structure across a range of thresholds.
Network analysis in GIS is based on the mathematical sub-disciplines of graph theory and topology. Any network consists of a set of connected vertices and edges. Graph theory describes, measures, and compares graphs or networks. The topological properties of networks are connectivity, adjacency, and incidence; these properties serve as a basis for analysis. Simple examples of networks in GIS are streets, power lines, and city centerlines.
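The threshold-then-binarize construction described above can be sketched as follows. The correlation matrix here is made up for illustration; it is not data from any study, and it also shows concretely why the threshold choice matters:

```python
def binarize(corr, threshold):
    """Dichotomize a correlation matrix: |r| above the threshold -> 1, else 0."""
    n = len(corr)
    return [[1 if i != j and abs(corr[i][j]) > threshold else 0
             for j in range(n)] for i in range(n)]

def edges(adjacency):
    """Read the network off the adjacency matrix: one edge per '1' above the diagonal."""
    n = len(adjacency)
    return [(i, j) for i in range(n) for j in range(i + 1, n) if adjacency[i][j]]

# Made-up correlations among three observed variables (nodes 0, 1, 2).
corr = [[ 1.0, 0.8, -0.2],
        [ 0.8, 1.0,  0.5],
        [-0.2, 0.5,  1.0]]

# A different threshold yields a different network structure:
print(edges(binarize(corr, 0.4)))  # → [(0, 1), (1, 2)]
print(edges(binarize(corr, 0.6)))  # → [(0, 1)]
```

Running the same data through two thresholds produces two different topologies, which is precisely why metrics should be reported across a range of thresholds rather than at a single arbitrary cut-off.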
1.6 Role of Database Management Systems in GIS
Since the early ’90s, Geographical Information Systems (GIS) have become sophisticated systems for maintaining and analysing spatial and thematic information on spatial objects. DBMSs are increasingly important in GIS, since DBMSs are traditionally used to handle large volumes of data and to ensure the logical consistency and integrity of data, which have also become major requirements in GIS. Today, spatial data is mostly part of a complete work and information process. In many organisations there is a need to implement GIS functionality as part of a central Database Management System (DBMS), at least at the conceptual level, in which spatial data and alphanumerical data are maintained in one integrated environment. Consequently, the DBMS occupies a central place in the new generation of GIS architectures. Much progress has been made in managing spatial and non-spatial information for objects in one integrated DBMS environment, called a geo-DBMS. The Open Geospatial Consortium (OGC) has contributed largely to this progress: it adopted the ISO 19107 international standard (ISO 2001) as Topic 1 of its Abstract Specifications, Feature Geometry. These Abstract Specifications provide conceptual schemas for describing the spatial characteristics of spatial objects (geographic features, in OGC terms) and a set of spatial operations consistent with these schemas and with vector geometry and topology up to three dimensions embedded in 3D space. According to the specifications, a spatial object is represented by two structures: a geometrical structure (simple feature) and a topological structure (complex feature). While the geometrical structure provides direct access to the coordinates of individual objects, the topological structure encapsulates information about their spatial relationships. Currently, no 3D primitive is implemented. However, most DBMSs, including Oracle, Postgres, IBM, Ingres, and Informix, support the storage of simple features in 3D space; in general, it is possible to store, for example, a polygon in 3D. 3D volumetric objects can be stored in a geometrical model as polyhedrons (bodies with flat faces) built from 3D polygons, in two ways: as a set of polygons or as a multipolygon.

[Figure: Topological structure in the spatial DBMS of the Netherlands]
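The “set of polygons” representation of a polyhedron can be sketched as follows. This is a simplified illustration, not how any particular geo-DBMS stores solids internally (real systems use spatial types rather than plain lists); it also shows the kind of cheap consistency check such a representation enables:

```python
# A polyhedron stored as a set of flat 3D polygons: a unit cube given
# as six faces, each face a ring of (x, y, z) vertices.
cube_faces = [
    [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)],  # bottom
    [(0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1)],  # top
    [(0, 0, 0), (1, 0, 0), (1, 0, 1), (0, 0, 1)],  # front
    [(0, 1, 0), (1, 1, 0), (1, 1, 1), (0, 1, 1)],  # back
    [(0, 0, 0), (0, 1, 0), (0, 1, 1), (0, 0, 1)],  # left
    [(1, 0, 0), (1, 1, 0), (1, 1, 1), (1, 0, 1)],  # right
]

def euler_characteristic(faces):
    """V - E + F over the face set; equals 2 for a simple closed
    polyhedron, a quick sanity check on a stored solid."""
    vertices = {v for face in faces for v in face}
    edges = set()
    for face in faces:
        # Walk each face ring, collecting undirected edges.
        for a, b in zip(face, face[1:] + face[:1]):
            edges.add(frozenset((a, b)))
    return len(vertices) - len(edges) + len(faces)

print(euler_characteristic(cube_faces))  # → 2 (8 vertices - 12 edges + 6 faces)
```

The alternative mentioned above, a single multipolygon, stores the same six rings as one geometry value instead of a face list; the topological structure would additionally record which faces share which edges.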
 
2. ACKNOWLEDGEMENT
The author wishes to sincerely acknowledge the guidance and support of his mentors, Ms. Ayushi Vijhani and Professor K. C. Tiwari, Multidisciplinary Centre for Geoinformatics, Delhi Technological University, New Delhi.

3. REFERENCES
[1] A. Rajaraman, Y. Sagiv, and J. D. Ullman. Answering queries using templates with binding patterns.
[2] S. Chaudhuri and V. R. Narasayya. Index merging. In ICDE 1999, pages 296–303, 1999.
[3] P. Domingos and G. Hulten. Mining high-speed data streams. In Proc. 2000 ACM SIGKDD Int. Conf. Knowledge Discovery in Databases (KDD’00), pages 71–80, Boston, MA, Aug. 2000.
[4] D. Zhang, A. Markowetz, V. J. Tsotras, D. Gunopulos, and B. Seeger. Efficient computation of temporal aggregates with range predicates. In ACM International Symposium on Principles of Database Systems (PODS), 2001.
[5] D. Zhang, D. Gunopulos, V. J. Tsotras, and B. Seeger. Temporal aggregation over data streams using multiple granularities. In Proceedings of the International Conference on Extending Database Technology (EDBT), 2002.
[6] P. B. Gibbons and Y. Matias. Synopsis data structures for massive data sets. DIMACS Series in Discrete Mathematics and Theoretical Computer Science: Special Issue on External Memory Algorithms and Visualization, A:39–70, 1999.
[7] A. Nica and A. S. Varde, editors. Proceedings of the Third Ph.D. Workshop in CIKM, PIKM 2010, Nineteenth ACM Conference on Information and Knowledge Management, CIKM 2010, Toronto, Canada, Oct. 2010. ACM.
[8] A. Gupta and I. S. Mumick, editors. Materialized Views: Techniques, Implementations and Applications. MIT Press, June 1999.
[9] K. S. Candan, W.-S. Li, Q. Luo, W.-P. Hsiung, and D. Agrawal. Enabling dynamic content caching for database-driven web sites. In Proc. of the 2001 ACM SIGMOD Intl. Conf. on Management of Data, Santa Barbara, California, USA, May 2001.
[10] S. Saltenis, C. Jensen, S. Leutenegger, and M. Lopez. Indexing the positions of continuously moving objects. SIGMOD, 2000.
[11] M. Hadjieleftheriou, G. Kollios, and V. Tsotras. Performance evaluation of spatio-temporal selectivity techniques. SSDBM, 2003.
[12] J. Gehrke, F. Korn, and D. Srivastava. On computing correlated aggregates over continuous data streams. In Proc. 2001 ACM SIGMOD Int. Conf. Management of Data (SIGMOD’01), pages 13–24, Santa Barbara, CA, May 2001.
[13] M. Vazirgiannis, Y. Theodoridis, and T. Sellis. Spatio-temporal composition and indexing for large multimedia applications. Multimedia Systems, 6(4):284–298, 1998.
[14] D. Quass, A. Gupta, I. S. Mumick, and J. Widom. Making views self-maintainable for data warehousing. In Proc. of the 1996 Intl. Conf. on Parallel and Distributed Information Systems, pages 158–169, December 1996.
[15] https://www.google.com
[16] https://www.ieee.org/conferences/publishing/templates.html