
Artificial Intelligence and Innovative Approaches to Web Mining

S. Jaiganesh¹, Dr. L. R. Aravind Babu²*

¹ Research Scholar, Department of Computer and Information Science, Annamalai University,
Annamalai Nagar, Tamil Nadu, India, jganesh0@gmail.com
² Head of the Department, Department of Computer and Information Science, Annamalai University,
Annamalai Nagar, Tamil Nadu, India, er.arvee@rediffmail.com

ABSTRACT

Artificial intelligence applied to web mining is known as web intelligence; its aim is to make
the web sufficiently intelligent. Most of the data on the internet is noisy and semi-structured
or unstructured, and it comes in the form of text, audio, video, and other file types. As data
accumulates, the kinds of information that users require keep changing. Web mining arises from
the many uses of data on the internet. The web has taken over the role of data warehouses as a
store of massive amounts of usage, content, and structure information. Because of the vast
amount of data online, intelligent approaches to information retrieval and manipulation are
required. Specifically, artificial intelligence must be integrated with web data using a
variety of methodologies.

In web mining, AI comes in several forms, and the level of optimization is determined by each
technique's algorithms. For example, web agents working on behalf of users may use search
engine intelligence to find and filter the necessary information. AI algorithms such as page
ranking trust the most popular pages and give priority to internal and external communication.
Personalization of web publications allows more information to be gathered from web server
logs for user identification and classification, which enables further study. Despite the
volume of online data, a query search should not need much processing time. AI helps meet
demands for information retrieval through sequential and parallel processing. Combining
machine learning methods with artificial intelligence algorithms provides an effective way to
monitor, investigate, and forecast user requests. As artificial intelligence (AI) develops,
business intelligence (BI) will eventually integrate more with web data than with big data,
which combines observation, interaction, and transactional data.

KEYWORDS: Artificial intelligence, personalization, adaptive web, web intelligence, web
mining, web content mining, web usage mining, web structure mining.

I. INTRODUCTION

This systematic review evaluates the integration of AI with web mining. The diverse
integration of AI in web mining enables computers on the World Wide Web (WWW) to
communicate with one another and respond appropriately to stimuli. The research question is
expressed as follows: RQ: What role does AI play in web mining? The purpose of the
systematic review is to examine the impact of AI on the three forms of web mining. In other
words, it asks how AI has enhanced the intelligence of the web, or made it clever enough to
function well. Artificial intelligence is the area of computer science focused on teaching
machines to act like people [6]. Web mining may be understood as the application of data
mining methods to automatically retrieve, extract, and analyze information [10].

Table 1. Steps in Evolution of Web Mining [12].

Evolutionary Step                              Enabling Technologies

Data Collection (1960s)                        Computers, Tapes, Disks
Data Access (1980s)                            Relational Databases (RDBMS), Structured Query
                                               Language (SQL), ODBC
Data Warehousing & Decision Support (1990s)    On-Line Analytic Processing (OLAP),
                                               Multidimensional Databases, Data Warehouses
Data Mining (2000s)                            Advanced Algorithms, Multiprocessor Computers,
                                               Massive Databases
Web Mining (Emerging Today)                    WWW, Internet, Monumental-Scale Databases

Fig. 1. Classification of Web Mining [12].

Web mining analysis relies on three general sets of information: previous usage patterns,
degree of shared content, and inter-memory associative link structures. These correspond to
the three subsets of web mining, namely web usage mining, web content mining, and web
structure mining.

Web usage mining

Web usage mining consists of four processing phases: data collection, preprocessing,
pattern discovery, and pattern analysis. In pattern discovery, methods such as machine
learning and data mining are applied to the preprocessed data to find patterns and make use
of the information found. The step that follows pattern discovery is pattern analysis. Its
purpose is to verify that the patterns found are accurate and to show how they can be used to
extract information from the web and to conduct online searches.
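
The four phases above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the log records, the 30-minute session timeout, the list of static file extensions, and all function names are hypothetical and do not come from the reviewed studies. Simple counting of page-to-page transitions stands in for the machine learning and data mining methods mentioned in the text.

```python
from collections import Counter
from datetime import datetime, timedelta

# Phase 1 (data collection): hypothetical, hand-made log records of the form
# (user id, ISO timestamp, requested page).
RAW_LOG = [
    ("u1", "2024-01-01T10:00:00", "/home"),
    ("u1", "2024-01-01T10:01:00", "/products"),
    ("u1", "2024-01-01T10:02:00", "/cart"),
    ("u2", "2024-01-01T10:00:30", "/home"),
    ("u2", "2024-01-01T10:03:00", "/products"),
    ("u1", "2024-01-01T12:00:00", "/home"),
    ("u1", "2024-01-01T12:01:00", "/products"),
    ("u2", "2024-01-01T10:03:05", "/logo.png"),
]

SESSION_GAP = timedelta(minutes=30)  # assumed inactivity timeout


def preprocess(raw_log):
    """Phase 2: drop static resources, parse timestamps, sort by user and time."""
    cleaned = [
        (user, datetime.fromisoformat(ts), page)
        for user, ts, page in raw_log
        if not page.endswith((".png", ".css", ".js"))
    ]
    return sorted(cleaned)


def sessionize(records, gap=SESSION_GAP):
    """Phase 2 (continued): start a new session per user or after a long gap."""
    sessions = []
    last_user, last_time = None, None
    for user, ts, page in records:
        if user != last_user or ts - last_time > gap:
            sessions.append([])
        sessions[-1].append(page)
        last_user, last_time = user, ts
    return sessions


def discover_patterns(sessions):
    """Phase 3: count consecutive page pairs across all sessions."""
    pairs = Counter()
    for session in sessions:
        pairs.update(zip(session, session[1:]))
    return pairs


def analyze(pairs, min_count=2):
    """Phase 4: keep only transitions seen at least `min_count` times."""
    return {pair: n for pair, n in pairs.items() if n >= min_count}


sessions = sessionize(preprocess(RAW_LOG))
frequent = analyze(discover_patterns(sessions))
print(frequent)
```

On this toy log the only transition that survives the threshold is the move from `/home` to `/products`. A real deployment would replace the counting step with whichever mining method suits the site.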

Web content mining

Web content mining is the process of extracting valuable information from web content
such as text, images, audio, and video. Two approaches are used in web content mining: the
agent-based approach and the database approach. The agent-based approach uses three types
of agents: intelligent search agents, information filtering/categorizing agents, and
personalized web agents. Intelligent search agents use user profiles and domain
characteristics to automatically search for information based on specific queries.
Information filtering/categorizing agents filter data using a variety of techniques.
Personalized web agents learn user preferences and discover documents related to those user
profiles. The database approach works with well-formed databases whose schemas and
attributes are defined.

Unstructured text mining, structured text mining, semi-structured text mining, and
multimedia mining are the four methods used in web content mining to extract data.
Implementations of the findings have led to two trends in web usage mining: Customized
Usage Tracking and General Access Pattern Tracking [4].

Web structure mining

Web structure mining can be described in terms of graphs: web pages are represented as
nodes and hyperlinks as edges. In essence, the graph illustrates the interaction between
the user and the web. The goal of web structure mining is to provide organized summaries of
data from web pages [12].
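
As a minimal sketch of this graph view, the snippet below stores a small site as an adjacency mapping and derives a simple structural summary from it. The page names, the `LINKS` mapping, and the choice of summary fields are hypothetical and serve only as an illustration.

```python
from collections import defaultdict

# Hypothetical site: each page (node) maps to the pages it links to (edges).
LINKS = {
    "index.html": ["about.html", "products.html"],
    "about.html": ["index.html"],
    "products.html": ["index.html", "about.html"],
    "orphan.html": ["index.html"],
}


def in_degrees(links):
    """Count incoming hyperlinks for every page that appears in the graph."""
    counts = defaultdict(int)
    for source, targets in links.items():
        counts[source] += 0  # make sure pages without in-links are listed
        for target in targets:
            counts[target] += 1
    return dict(counts)


def summarize(links):
    """Return a small structural summary of the link graph."""
    degrees = in_degrees(links)
    return {
        "pages": len(degrees),
        "links": sum(len(targets) for targets in links.values()),
        "unlinked": sorted(p for p, d in degrees.items() if d == 0),
        "most_linked": max(sorted(degrees), key=degrees.get),
    }


summary = summarize(LINKS)
print(summary)
```

Here `index.html` receives the most in-links and `orphan.html` receives none, which is the kind of organized summary a structure-mining step might report.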

Two perspectives have been taken into consideration while studying the Web: its
semantics and its graph structure. Research on Web structure looks at a number of structural
characteristics of graphs that emerge from the Web, such as the graph of hyperlinks and the
graph created by interconnections between dispersed search servers. Not only is the study of
the Web as a graph intriguing in and of itself, but it also provides important insights into
the social phenomena that characterize the growth of the Web and into the algorithms used for
crawling, searching, and community finding. Tim Berners-Lee, the inventor of the World Wide
Web, initiated research into the semantics of the Web. The term "semantic Web" refers to a
version of the Web where data may be processed by machines in ways that encourage the use of
intelligent network services like search agents and information brokers [1].

Fig. 3. Web graph structure [11]. (Figure labels: Hyper Link, Web Document.)

Link structures, with or without link descriptions, serve as the foundation for web
structure mining. The Markov chain model is a valuable tool for classifying web pages and for
deriving relationships and similarities between websites. The aim of web structure mining is
to create organized summaries of websites and web pages. It also analyzes and describes HTML
and XML documents using a tree-like structure.
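
One common way to read a link graph as a Markov chain is the random surfer model: from each page, the next page is chosen uniformly among its out-links. The sketch below builds those transition probabilities and estimates the long-run visit share of each page by repeated multiplication. It is an illustration only, not necessarily the model used in the reviewed studies. The three-page graph is hypothetical, and the code assumes every page has at least one out-link and that the chain converges (page-ranking schemes usually add a damping factor to guarantee this).

```python
# Hypothetical link graph in which every page has at least one out-link.
LINKS = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}


def transition_probabilities(links):
    """Each out-link of a page is followed with equal probability."""
    return {
        page: {target: 1.0 / len(targets) for target in targets}
        for page, targets in links.items()
    }


def stationary_distribution(links, iterations=200):
    """Approximate the long-run visit share of each page by power iteration."""
    probs = transition_probabilities(links)
    rank = {page: 1.0 / len(links) for page in links}
    for _ in range(iterations):
        nxt = {page: 0.0 for page in links}
        for page, row in probs.items():
            for target, p in row.items():
                nxt[target] += rank[page] * p
        rank = nxt
    return rank


rank = stationary_distribution(LINKS)
print({page: round(share, 3) for page, share in rank.items()})
```

For this graph the shares settle at 0.4 for A, 0.2 for B, and 0.4 for C. Pages with similar shares or similar transition rows could then be grouped, which is one way such a model can support comparing pages.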

Web mining is an online endeavor. Web mining data is saved in server databases and web
logs, whereas data mining data is stored in data warehouses (databases) [12].

A web user navigates the internet without knowing the paths taken by other users who
have the same goal, much like ants, which have no global view of their surroundings.
Target page and route information can be stored on a dedicated server so that an ant colony
optimization (ACO) algorithm can be used to identify the shortest path to a given document or
cluster of documents [2].
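
The pheromone idea behind ACO can be illustrated with a simplified, deterministic sketch. It is not a full ACO implementation, in which artificial ants choose links probabilistically over many iterations. Here, recorded user paths to the same target deposit pheromone on the links they used, shorter paths deposit more, older trails evaporate, and a new visitor follows the strongest trail. The paths, the evaporation rate, and the function names are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical navigation paths, each ending at the same target document.
RECORDED_PATHS = [
    ["home", "news", "archive", "paper"],
    ["home", "search", "paper"],
    ["home", "search", "paper"],
    ["home", "news", "search", "paper"],
]

EVAPORATION = 0.1  # assumed fraction of pheromone lost per recorded path


def lay_pheromone(paths, evaporation=EVAPORATION):
    """Deposit pheromone on each traversed link; shorter paths deposit more."""
    pheromone = defaultdict(float)
    for path in paths:
        for edge in pheromone:  # evaporate existing trails first
            pheromone[edge] *= 1.0 - evaporation
        deposit = 1.0 / (len(path) - 1)  # inverse of the number of clicks
        for edge in zip(path, path[1:]):
            pheromone[edge] += deposit
    return dict(pheromone)


def follow_trail(pheromone, start, target, max_steps=10):
    """Greedily follow the strongest outgoing trail from `start` to `target`."""
    path = [start]
    while path[-1] != target and len(path) <= max_steps:
        options = {
            dst: level
            for (src, dst), level in pheromone.items()
            if src == path[-1] and dst not in path
        }
        if not options:
            break
        path.append(max(sorted(options), key=options.get))
    return path


best_path = follow_trail(lay_pheromone(RECORDED_PATHS), "home", "paper")
print(best_path)
```

With these recorded paths, the strongest trail leads from `home` through `search` to `paper`, the two-click route.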

The challenge of adaptive websites is defined in [Perkowitz and Etzioni, 1998] as follows:
“Adaptive websites are websites that automatically improve their organization and presentation
by learning from visitor access patterns” [3]. Adaptive websites are an intriguing and demanding
topic.

Web page ranking is the primary component of any information retrieval framework. Web search
engines act as a middleman between users and information sources. To display web pages, search
engines employ crawler, spider, and indexer programs [7]. According to Tilak Patidar and Aditya
Ambasth (2016), AJAX and deep web crawling offer opportunities for more precise and in-depth
web crawling. Giving the spiders human-like selection intelligence moves crawling in the
direction of artificial intelligence [9].

According to some definitions, web intelligence (WI) is the use of advanced artificial
intelligence and information technology approaches for the investigation, analysis, and
knowledge extraction of web data. WI examines the essential functions and real-world
applications of artificial intelligence (AI) and advanced IT on the next generation of
web-related systems, services, and activities [6].

II. METHODS

Information Sources

Books, journals, conference papers, theses, web pages, presentations, and technical reports
were among the information sources searched. Public electronic databases including IEEE, ACM,
ScienceDirect, CiteSeer, Google, and Google Scholar were also included. Finding an electronic
or paper copy of work specifically on "AI in web mining" is difficult, because the topic spans
several domains. AI itself is a relatively recent and advanced topic, and the same holds for
web intelligence. Nonetheless, the term describes applied AI on the web well enough. Although
AI is a broad field of research, web intelligence is created when it is combined with web
mining. The review examines the available forms of web mining, in which each type of AI
algorithm follows the fundamental idea of web data mining and processing in its own way.

Study Selection

The research question was examined thoroughly while choosing the studies, and terms
connected to it, such as web intelligence, were also used. The following checklists of
inclusion and exclusion criteria were used in the paper selection process.

i. Inclusion Criteria

• The primary study must cover artificial intelligence (AI) methods, machine learning
  (ML) approaches, and deep learning (DL) or neural networks in relation to at least one
  web mining classification.
• The primary study discusses the research topic as its main point, either completely or
  partially.
• The study is unpublished but discusses the classification of web mining and includes
  useful concepts in the field of knowledge.
• The study has an ISSN but no publication year.

ii. Exclusion Criteria

• A publication that discusses AI methods, ML (machine learning), and DL (deep learning),
  either separately or in combination, without mentioning the idea of a web mining
  application, given that machine learning and deep learning are subsets of AI.
• A duplicate of the data included in the other primary studies.
• The name of the paper's author is not stated.
• It is either a magazine or a book.

iii. Data Extraction

Further analysis was conducted on the data taken from each primary study selected for the
systematic review. Each document was read line by line to capture every detail and grasp its
ideas. Any associated idea that enhances the research was recorded as stated by its author
and cited according to the IEEE citation standard.

iv. Quality Assessment

Twelve publications were evaluated; three were unpublished but nonetheless valuable for
their domain knowledge, and nine were published in reputable scientific journals. Since web
intelligence is still a relatively young area of study, it can be difficult to find
publications that are closely related to the topic being studied. Because document analysis
was used, experimenter bias and subjective professional-knowledge bias are unlikely to be
present to an extent that threatens validity. Every paper is listed in the references
section. The material of all the papers is well structured, combines information to a certain
extent, and serves as a roadmap for further research and analysis.

v. Data Synthesis

Information was derived mostly by identifying the key terms that matter most for the
study's detailed description and applicability. To serve the study's goal of addressing
"AI in Web Mining," each detail was related to the overall notion of providing intelligence
to the web (web intelligence).

III. RESULTS

Twelve publications were chosen for the final systematic review after meeting all of the
requirements. Nearly all of the papers were written by IT domain specialists. Document
analysis was the strategy used to investigate the goal of the study, namely how artificial
intelligence (AI) makes information retrieval considerably simpler for users. Without the
various strategies applied at various levels, DRIP (Data Rich, Information Poor) may well
occur. The papers cover numerous web intelligence characteristics, which makes it easier to
see how the web is rapidly becoming self-managing. Ioana Moisil [2] claims that artificial
intelligence techniques and algorithms are used in nearly all web mining operations in order
for them to function effectively.

Among the tasks included in the web mining process are finding and analyzing patterns in
unstructured text, structured text mining, semi-structured text mining, and multimedia
mining. A handful of the numerous strategies from the primary studies are the Markov chain
model, SVM, Naïve Bayes, clustering, and classification. Adaptive websites, web agents, web
page ranking, web search engines, crawlers, spiders, indexer programs, and web crawling are
all products of applying these AI approaches to the web.

Clustering and classification are used in web content mining, AJAX crawling, and deep web
crawling. As data mining techniques, clustering and classification operate objectively on the
basis of similarity and attribute selection, respectively. Not only can significant web pages
be identified, but people with similar interests may also be recognized by using the same
clusters of connected pages. Up to 1996, pages were retrieved according to content similarity
[11]. Crawlers may be universal, focused, or topical. The most advanced type of topical
crawlers are adaptive ones, which are built with various machine learning approaches, namely
classifiers that direct them over the web [2]. The matrix of link strengths was computed based
on the co-occurrence of links in web pages (user choices) [2]. Association is based on
co-occurrence or frequency.

Three data mining approaches are used in web usage mining: clustering, classification, and
association. Association and classification techniques are used in web structure mining.
Association, classification, and clustering are all machine learning approaches, and machine
learning is a subset of artificial intelligence.
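
The co-occurrence idea behind link strengths and association can be sketched as follows. This is a minimal illustration, not the computation used in [2]. The sessions are hypothetical sets of pages that users chose to visit, the strength of a pair is simply the number of sessions containing both pages, and `confidence` is the usual association-rule estimate of how often one page is visited given that another was.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sessions: the set of pages each user chose to visit.
SESSIONS = [
    {"home", "products", "cart"},
    {"home", "products"},
    {"home", "blog"},
    {"products", "cart"},
]


def link_strengths(sessions):
    """Count how often each unordered pair of pages occurs in the same session."""
    strengths = Counter()
    for pages in sessions:
        strengths.update(combinations(sorted(pages), 2))
    return strengths


def confidence(sessions, antecedent, consequent):
    """Estimate P(consequent visited | antecedent visited) from the sessions."""
    with_antecedent = [s for s in sessions if antecedent in s]
    if not with_antecedent:
        return 0.0
    return sum(consequent in s for s in with_antecedent) / len(with_antecedent)


strengths = link_strengths(SESSIONS)
print(strengths.most_common(2))
print(confidence(SESSIONS, "cart", "products"))
```

In this toy data every session that contains `cart` also contains `products`, so the rule from `cart` to `products` has confidence 1.0, while the rule from `home` to `products` holds in two of three sessions.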

Fig. 4. Flow Diagram of the study selection.


IV. DISCUSSIONS

Although AI dates back to the 1950s, its application in web mining has advanced
significantly only in recent years. Even with proactive measures, managing web documents
becomes challenging when web mining is done without artificial intelligence. Using AI in web
mining yields strong results when the volume of data produced across web services grows very
large. Nearly all of the articles discuss the benefits of web intelligence, but security
concerns must also be addressed. The investigation conducted for the systematic review is
evaluated as follows.

The primary studies served as the basis for the systematic review, which was structured to
give a concise yet detailed explanation of how various AI algorithms and strategies led to
the development of intelligence in the web. As discussed, AI enables the web to function by
endowing it with the capacity for rational or logical thinking, and it gives the web its
dynamic qualities. The search for patterns on the internet, that is, for relationships
between data, supports many applications for solving problems that range from easy to
difficult. With web intelligence, global firms benefit greatly: they can maintain their
competitiveness in the market and pursue a strategy for maximum profit through business
analytics. The internet has also indirectly aided research and development that improves
humankind's ability to survive. This work makes some contribution to the investigation of
taxonomies and ontologies in the newly developing field of web intelligence.

V. CONCLUSION

As the amount of online data grows, artificial intelligence (AI) must be used in web
mining for best results. Given how much of today's world is built around the web, web mining
without AI is hard to imagine, and society would struggle to function without it. AI enhances
mankind rather than replacing it. The results of the primary and secondary research showed
that, although the quantity of data rose geometrically, AI applied to web mining helps create
an environment that makes it possible for people to engage online.

AI has transformed the way people see the world by changing how they interact online, in
tandem with other major creative forces. The goal of the systematic review was to examine how
artificial intelligence is used in web mining, and the research question was supported by
extensive and rich data triangulation. The principal concerns in future web development can
be addressed by building on the successful outcomes recently attained in the field of web
intelligence. Security and privacy concerns are left for future work. Based on the factual
information gathered from the sources, the study combined these domains.

REFERENCES

[1] Y. Y. Yao, Ning Zhong, Jiming Liu, and Setsuo Ohsuga, “Web Intelligence (WI), Research
Challenges and Trends in the New Information Age,” unpublished.

[2] Ioana Moisil, “Advanced AI Techniques for Web Mining,” Mathematical Methods,
Computational Techniques, Non-Linear Systems, Intelligent Systems, ISSN: 1790-2769.

[3] Oznur Kirmemis Alkan and Pinar Senkul, “IntWEB: An AI-Based Approach for Adaptive
Web,” unpublished.

[4] R. Malarvizhi and K. Saraswathi, “Web Content Mining Techniques Tools & Algorithms – A
Comprehensive Study,” International Journal of Computer Trends and Technology (IJCTT),
Volume 4, August 2013.

[5] Pradnyesh Bhisikar and Prof. Amit Sahu, “Overview on Web Mining and Different
Technique for Web Personalisation,” Applications (IJERA), ISSN: 2248-9622, Vol. 3,
pp. 543-545, March-April 2013.

[6] Rahul Pareek, “Web Intelligence-An Emerging vertical of Artificial Intelligence,”
International Journal of Engineering and Computer Science, ISSN: 231-7242, Vol. 3,
pp. 9430-9436, December 2014.

[7] Seema Rani and Upasana Garg, “A Ranking Of Web Documents Using Semantic Similarity
And Artificial Intelligence Based Search Engine,” International Journal of Science, Engineering
and Technology Research (IJSETR), Volume 3, December 2014.

[8] Subhendu Kumar Pani, Deepak Mohapatra, and Bikram Keshari Ratha, “Integration of Web
mining and web crawler: Relevance and State of Art,” International Journal on Computer Science
and Engineering, Vol. 02, No. 03, pp. 772-776, 2010.

[9] Tilak Patidar and Aditya Ambasth, “Improvised Architecture for Distributed Web Crawling,”
International Journal of Computer Applications (0975 – 8887), Volume 151, October 2016.

[10] Tulasi Gayatri Devi and Aparna KS, “A Survey on Web Mining: Overview, Techniques,
Tools, and Applications,” International Journal for Research in Applied Science & Engineering
Technology (IJRASET), Volume 4, January 2016.

[11] K. Harish Kumar, “A Study on Web Mining Types and Applications,” International Journal
of Trend in Research and Development, Volume 3(5), ISSN: 2394-9333.

[12] Kavita Sharma, Gulshan Shrivastava, and Vikas Kumar, “Web Mining: Today and
Tomorrow,” unpublished.
