DM Trend Seminar
71762133046 – SUDHARSHINI R
71762133048 – SUTHARSHANA S S
CONTENTS
Trend analysis can also be used for time-series forecasting. ARIMA (auto-regressive
integrated moving average), long-memory time-series modeling, and autoregression are
popular methods for such analysis.
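As a rough illustration of autoregression, the simplest of the methods named above, the sketch below fits a first-order model x[t] = c + phi * x[t-1] by ordinary least squares and uses it to forecast. The function names and the synthetic series are assumptions made for this example; a full ARIMA model would also include differencing and moving-average terms.

```python
def fit_ar1(series):
    """Fit x[t] = c + phi * x[t-1] by ordinary least squares."""
    x_prev = series[:-1]
    x_next = series[1:]
    n = len(x_prev)
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    cov = sum((a - mean_prev) * (b - mean_next) for a, b in zip(x_prev, x_next))
    var = sum((a - mean_prev) ** 2 for a in x_prev)
    phi = cov / var
    c = mean_next - phi * mean_prev
    return c, phi


def forecast(series, steps, c, phi):
    """Roll the fitted model forward from the last observed value."""
    predictions = []
    last = series[-1]
    for _ in range(steps):
        last = c + phi * last
        predictions.append(last)
    return predictions


# Synthetic, noise-free series generated by x[t] = 2 + 0.5 * x[t-1] (illustrative only).
series = [0.0]
for _ in range(9):
    series.append(2 + 0.5 * series[-1])

c, phi = fit_ar1(series)
next_values = forecast(series, 3, c, phi)
print(c, phi, next_values)
```

On real data the series would contain noise, so the fitted coefficients would only approximate the underlying process.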
SEQUENTIAL PATTERN MINING IN
SYMBOLIC SEQUENCES
Symbolic sequences represent ordered sets of elements or events, found in various
applications like customer shopping sequences and biological sequences.
In bioinformatics, research predominantly focuses on the complex semantic meaning of
biological sequences.
Sequential pattern mining is extensively applied to symbolic sequences.
A sequential pattern is a frequent subsequence within a single sequence or a set of
sequences. Sequential pattern mining finds such frequently occurring subsequences.
Algorithms for mining sequential patterns have been developed, including scalable
approaches and methods for mining closed sequential patterns.
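The idea of a frequent subsequence can be sketched as follows. This is a brute-force illustration, not one of the scalable algorithms mentioned above: it counts how many sequences contain a candidate pattern in order (its support) and keeps the length-2 patterns that meet a minimum support. The shopping sequences and function names are made up for the example.

```python
def is_subsequence(pattern, sequence):
    """True if the items of pattern appear in sequence in order (gaps allowed)."""
    remaining = iter(sequence)
    return all(item in remaining for item in pattern)


def support(pattern, database):
    """Number of sequences in the database that contain the pattern."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def frequent_pairs(database, min_support):
    """Enumerate all ordered item pairs and keep the frequent ones."""
    items = sorted({item for seq in database for item in seq})
    result = {}
    for first in items:
        for second in items:
            count = support((first, second), database)
            if count >= min_support:
                result[(first, second)] = count
    return result


# Hypothetical customer shopping sequences.
database = [
    ["bread", "milk", "eggs"],
    ["bread", "eggs"],
    ["milk", "bread", "eggs"],
    ["milk", "eggs"],
]

patterns = frequent_pairs(database, min_support=3)
print(patterns)
```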
User-specified constraints can reduce the search space in sequential pattern mining,
deriving only patterns of interest. This is referred to as constraint-based sequential
pattern mining.
Constraints can be relaxed or additional constraints enforced to derive different kinds of
patterns, such as enforcing gap constraints or deriving periodic sequential patterns.
Partial order patterns can also be mined by relaxing strict sequential ordering requirements.
Sequential pattern mining methodology can be extended to mining trees, lattices, episodes,
and other ordered patterns.
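A gap constraint can be illustrated with a small matcher. The sketch below assumes one common reading of the constraint: at most max_gap items may be skipped between consecutive matched items. The function name and sample sequence are hypothetical.

```python
def matches_with_max_gap(pattern, sequence, max_gap):
    """True if pattern occurs in order in sequence with at most max_gap
    items skipped between consecutive matched items."""

    def search(p_idx, start, end):
        if p_idx == len(pattern):
            return True
        for pos in range(start, min(end, len(sequence))):
            if sequence[pos] == pattern[p_idx]:
                # The next item must appear within max_gap skipped positions.
                if search(p_idx + 1, pos + 1, pos + 2 + max_gap):
                    return True
        return False

    # The first pattern item may start anywhere in the sequence.
    return search(0, 0, len(sequence))


sequence = list("axxbxc")
print(matches_with_max_gap("ab", sequence, max_gap=2))
print(matches_with_max_gap("ab", sequence, max_gap=1))
```

The search backtracks over every possible position for each item, because a greedy match on the first occurrence can miss a valid later one.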
MINING SPATIAL DATA
Spatial data mining discovers patterns and knowledge from spatial data. Spatial data, in
many cases, refer to geospace-related data stored in geospatial data repositories.
Recently, large geographic data warehouses have been constructed by integrating thematic
and geographically referenced data from multiple sources. From these, we can construct
spatial data cubes that contain spatial dimensions and measures, and support spatial OLAP
for multidimensional spatial data analysis.
Spatial data mining can be performed on spatial data warehouses, spatial databases, and
other geospatial data repositories.
MINING SPATIOTEMPORAL DATA AND
MOVING OBJECTS
Spatiotemporal data mining is about extracting patterns and knowledge from data that spans both
space and time.
It's vital for understanding phenomena like urban evolution, weather patterns, and natural disasters.
With the rise of GPS, mobile devices, and digital mapping services, this field has grown
immensely.
Moving-object data, such as wildlife telemetry and vehicle GPS data, is a key focus.
This involves discovering relationships among moving objects, identifying movement patterns like
clusters and swarms, and analyzing periodic and trajectory patterns.
In essence, spatiotemporal data mining helps us comprehend complex spatial and temporal
dynamics across various fields, from ecology to transportation and beyond.
MINING CYBER-PHYSICAL SYSTEM
DATA
Cyber-physical systems (CPS) consist of interconnected physical and information components,
forming heterogeneous cyber-physical networks.
Data in CPS are dynamic, noisy, and contain rich spatiotemporal information, vital for real-time
decision-making.
Mining cyber-physical data involves linking current situations with vast information bases,
performing real-time calculations, and providing prompt responses.
Research in this field focuses on rare-event detection, anomaly analysis, reliability,
trustworthiness, effective spatiotemporal data analysis, and integrating stream data mining with
real-time automated control processes.
CPS and networks are expected to be ubiquitous, playing critical roles in modern information
infrastructure.
MINING MULTIMEDIA DATA
Multimedia data mining involves discovering patterns from multimedia databases containing
images, videos, audio, sequences, and hypertext data.
It integrates disciplines like image processing, computer vision, data mining, and pattern
recognition.
Key issues in multimedia data mining include content-based retrieval, similarity search,
generalization, and multidimensional analysis.
Multimedia data cubes incorporate additional dimensions and measures for multimedia
information.
Other topics in multimedia mining include classification, prediction analysis, association mining,
and specific techniques for video and audio data mining.
It's an interdisciplinary field with applications across various domains, facilitating the extraction of
valuable insights from multimedia collections.
MINING TEXT DATA
Text mining is an interdisciplinary field drawing from information retrieval, data mining,
machine learning, statistics, and computational linguistics.
It aims to extract high-quality information from text sources like news articles, emails,
blogs, and web pages.
This is achieved through discovering patterns and trends using statistical pattern learning,
topic modeling, and language modeling.
Typical text mining tasks include categorization, clustering, concept extraction, sentiment
analysis, summarization, and entity-relation modeling.
Other areas include multilingual mining, contextual analysis, and trust analysis.
Text mining finds applications in security, biomedical literature analysis, media analysis,
and customer relationship management.
Various software and tools are available for text mining, often utilizing resources like
WordNet, Semantic Web, and Wikipedia to enhance understanding and analysis of text
data.
Overall, text mining plays a crucial role in extracting valuable insights from large volumes
of textual information across diverse domains.
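One simple form of statistical pattern learning on text is term weighting. The sketch below computes TF-IDF scores, assuming whitespace tokenization and the plain definition tf * log(N / df). The sample documents and function names are illustrative only, and real text mining tools typically add stop-word removal, stemming, and smoothing.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def tf_idf(documents):
    """Return one {term: score} dictionary per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


documents = [
    "data mining finds patterns",
    "text mining finds topics",
    "stream data arrives fast",
]

scores = tf_idf(documents)
top_term = max(scores[0], key=scores[0].get)
print(top_term, scores[0][top_term])
```

Terms that occur in only one document receive the highest weight, which is why such scores are often used as input features for categorization and clustering.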
MINING WEB DATA
Web mining is the application of data mining techniques to discover patterns, structures,
and knowledge from the Web.
According to analysis targets, web mining can be organized into three main areas:
1. Web content mining
2. Web structure mining
3. Web usage mining
WEB CONTENT MINING
Web content mining involves analyzing web content, including text, multimedia, and
structured data, to understand web pages' content and provide valuable information for web
search and analysis.
The surface web is indexed by typical search engines, while the deep web consists of
content not accessible through standard searches, often provided by underlying database
engines.
Extensive research has been conducted by academics, search engines, and web service
companies in web content mining.
However, concerns about privacy arise due to the potential disclosure of personal
information through web content mining. Privacy-preserving data mining techniques aim to
address these concerns by developing methods to protect individuals' privacy on the web.
WEB STRUCTURE MINING
Web structure mining is the process of using graph and network mining theory and
methods to analyze the nodes and connection structures on the Web.
It extracts patterns from hyperlinks, where a hyperlink is a structural component that
connects a web page to another location.
It can also mine the document structure within a page (e.g., analyzing the tree-like
structure of a page to describe HTML or XML tag usage).
Both kinds of web structure mining help us understand web content.
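Hyperlink analysis can be sketched by treating pages as graph nodes and links as directed edges. The example below counts incoming links per page, a very basic structural measure. The page names and link map are hypothetical, and practical link analysis uses far more elaborate graph methods.

```python
from collections import Counter


def in_link_counts(links):
    """Count incoming hyperlinks for every page in a {source: [targets]} map."""
    counts = Counter({page: 0 for page in links})
    for source, targets in links.items():
        for target in targets:
            counts[target] += 1
    return counts


# Hypothetical site: each key links to the pages in its list.
links = {
    "home": ["about", "products"],
    "about": ["home"],
    "products": ["home", "about"],
    "blog": ["home"],
}

counts = in_link_counts(links)
for page, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
    print(page, n)
```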
WEB USAGE MINING
Web usage mining is the process of extracting useful information (e.g., user click streams)
from server logs.
It finds patterns related to general or particular groups of users; understands users’ search
patterns, trends, and associations; and predicts what users are looking for on the Internet.
It helps improve search efficiency and effectiveness, and helps promote products or
related information to different groups of users at the right time.
Web search companies routinely conduct web usage mining to improve their quality of
service.
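A click stream can be mined for common navigation steps, as in the sketch below. It assumes log records have already been parsed into (user, timestamp, page) tuples, which is a hypothetical format chosen for this example, and it counts how often users move from one page directly to another.

```python
from collections import Counter, defaultdict


def transition_counts(log):
    """Count page-to-page transitions within each user's ordered click stream."""
    pages_by_user = defaultdict(list)
    for user, _timestamp, page in sorted(log, key=lambda rec: (rec[0], rec[1])):
        pages_by_user[user].append(page)

    counts = Counter()
    for pages in pages_by_user.values():
        # Pair each page with the one visited immediately after it.
        counts.update(zip(pages, pages[1:]))
    return counts


# Hypothetical parsed server-log records: (user, timestamp, page).
log = [
    ("u1", 1, "/home"), ("u1", 2, "/search"), ("u1", 3, "/item"),
    ("u2", 1, "/home"), ("u2", 2, "/search"),
    ("u3", 5, "/search"), ("u3", 9, "/item"),
]

transitions = transition_counts(log)
print(transitions)
```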
MINING DATA STREAMS
Stream data refers to data that flows continuously into a system, characterized by vast
volumes, dynamic changes, potential infinity, and multidimensional features.
Traditional database systems cannot store such data, and most systems can only read the
stream once sequentially, posing significant challenges for effective mining.
Techniques for handling stream data include using sliding windows or tilted time windows
to collect information, along with methods like microclustering, limited aggregation, and
approximation.
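The sliding-window idea can be sketched as a one-pass summary that keeps only the most recent values in memory. The class name, window size, and sample stream below are assumptions for illustration. Tilted time windows, which store older data at coarser granularity, are not shown.

```python
from collections import deque


class SlidingWindowMean:
    """Maintain the mean of the last `size` values seen in a stream."""

    def __init__(self, size):
        self.window = deque(maxlen=size)
        self.total = 0.0

    def add(self, value):
        # Subtract the value about to be evicted before the deque drops it.
        if len(self.window) == self.window.maxlen:
            self.total -= self.window[0]
        self.window.append(value)
        self.total += value
        return self.total / len(self.window)


def stream():
    """Stand-in for an unbounded data source; each item is read exactly once."""
    yield from [1, 2, 3, 4, 5]


summary = SlidingWindowMean(size=3)
means = [summary.add(value) for value in stream()]
print(means)
```

Memory use stays bounded by the window size no matter how long the stream runs, which is the point of the technique.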
Applications of stream data mining span various domains, such as real-time detection of
anomalies in network traffic and of botnets.
VISUAL AND AUDIO DATA MINING
Visual data mining leverages data and knowledge visualization techniques to extract
implicit and valuable insights from large datasets.
It harnesses the capabilities of the human visual system, which includes the eyes and the
brain's powerful processing and reasoning capabilities.
This approach effectively combines data visualization and data mining, integrating
techniques from computer graphics, multimedia systems, human-computer interaction,
pattern recognition, and high-performance computing.
In general, data visualization and data mining can be integrated in the following ways:
DATA MINING RESULT VISUALIZATION
Visualization of data mining results is the presentation of the results or knowledge obtained
from data mining in visual forms.
DATA MINING PROCESS
VISUALIZATION
This type of visualization presents the various processes of data mining in visual forms so
that users can see how the data are extracted and from which database or data warehouse
they are extracted, as well as how the selected data are cleaned, integrated, preprocessed,
and mined.
Moreover, it may also show which method is selected for data mining, where the results
are stored, and how they may be viewed.
INTERACTIVE VISUAL DATA MINING
In (interactive) visual data mining, visualization tools can be used in the data mining
process to help users make smart data mining decisions.
AUDIO DATA MINING
Audio data mining uses audio signals to indicate the patterns of data or the features of data
mining results.
Although visual data mining may disclose interesting patterns using graphical displays, it
requires users to concentrate on watching patterns and identifying interesting or novel
features within them.
If patterns can be transformed into sound and music, then instead of watching pictures, we
can listen to pitches, rhythm, tune, and melody to identify anything interesting or unusual.
Audio data mining is an interesting complement to visual mining.
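Transforming patterns into sound can be sketched by mapping data values to pitches. The example below is one possible mapping, assumed for illustration: values are scaled linearly onto MIDI notes 60 to 72 (one octave starting at middle C), and each note is converted to a frequency with the standard equal-temperament formula. It computes pitches only and does not produce audio.

```python
def to_midi_notes(values, low_note=60, high_note=72):
    """Scale values linearly onto a range of MIDI note numbers."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series maps to a single repeated note.
        return [low_note for _ in values]
    span = high_note - low_note
    return [low_note + round((v - lo) / (hi - lo) * span) for v in values]


def note_to_frequency(note):
    """Equal-temperament frequency in Hz, with MIDI note 69 (A4) at 440 Hz."""
    return 440.0 * 2 ** ((note - 69) / 12)


# Hypothetical measurements; the outlier at the end becomes the highest pitch.
values = [0, 5, 10]
notes = to_midi_notes(values)
frequencies = [note_to_frequency(n) for n in notes]
print(notes, frequencies)
```

With such a mapping, an unusual value stands out as a note that is noticeably higher or lower than its neighbors.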