Artificial Intelligence and Innovative A
In web mining, artificial intelligence (AI) comes in several forms, and the level of optimization
is determined by each technique's algorithms. For example, web agents working on behalf of users
may apply search engine intelligence to find and filter the information they need. AI algorithms
such as page ranking trust the most popular pages and give priority to internal and external
communication. Personalization of web publications allows more information to be gathered from
web server logs for user identification and classification, which enables further study. Despite
the volume of online data, a query search should not require much processing time, and AI helps
meet the demands of information retrieval through sequential and parallel processing. Combining
machine learning methods with AI algorithms provides an effective way to monitor, investigate,
and forecast user requests. As AI develops, business intelligence (BI) will eventually integrate
more closely with web data than with big data, which combines observation, interaction, and
transactional data.
I. INTRODUCTION
This systematic review evaluates the integration of AI with web mining. The diverse
integration of AI in web mining enables a collection of computers on the World Wide Web (WWW)
to communicate with one another and respond appropriately to stimuli. The research question is:
RQ: What role does AI play in web mining? The purpose of the systematic review was to examine
the impact of AI on the three forms of web mining: web content mining, web usage mining, and
web structure mining. In other words, the review asks how AI has made the web intelligent
enough to function well. Artificial intelligence is the area of computer science concerned with
teaching machines to act like people [6]. Web mining may be understood as the application of
data mining methods to automatically retrieve, extract, and analyze information from the web
[10].
Web mining comprises four processing phases: data collection, preprocessing, pattern
discovery, and pattern analysis. In pattern discovery, methods such as machine learning and data
mining are applied to the preprocessed data to find patterns. The step that follows is pattern
analysis, whose purpose is to verify that the patterns found are accurate and to determine how
they can be used to extract information from the web and support online searches.
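The four phases can be illustrated on a toy web server log. The following is only a minimal sketch, not the procedure of any study cited here: the sample log lines (in Common Log Format), the helper names, and the `min_support` threshold are all hypothetical.

```python
import re
from collections import Counter

# Phase 1, data collection: a hypothetical sample in Common Log Format.
RAW_LOG = [
    '10.0.0.1 - - [01/Jan/2024:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 512',
    '10.0.0.1 - - [01/Jan/2024:10:00:05 +0000] "GET /logo.png HTTP/1.1" 200 2048',
    '10.0.0.2 - - [01/Jan/2024:10:01:00 +0000] "GET /index.html HTTP/1.1" 200 512',
    '10.0.0.2 - - [01/Jan/2024:10:01:30 +0000] "GET /news.html HTTP/1.1" 200 1024',
    '10.0.0.3 - - [01/Jan/2024:10:02:00 +0000] "GET /missing.html HTTP/1.1" 404 0',
]

LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+) [^"]*" (\d{3}) \S+$')
STATIC_SUFFIXES = (".png", ".jpg", ".gif", ".css", ".js")


def parse_line(line):
    """Turn one log line into a record, or None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    host, method, path, status = match.groups()
    return {"host": host, "method": method, "path": path, "status": int(status)}


def preprocess(lines):
    """Phase 2: keep successful page requests; drop static files and errors."""
    records = []
    for line in lines:
        record = parse_line(line)
        if record is None or record["status"] != 200:
            continue
        if record["path"].endswith(STATIC_SUFFIXES):
            continue
        records.append(record)
    return records


def discover_patterns(records):
    """Phase 3: count how many distinct hosts requested each page."""
    visitors = {}
    for record in records:
        visitors.setdefault(record["path"], set()).add(record["host"])
    return Counter({path: len(hosts) for path, hosts in visitors.items()})


def analyze_patterns(counts, min_support=2):
    """Phase 4: keep only pages requested by at least min_support hosts."""
    return {path: n for path, n in counts.items() if n >= min_support}


records = preprocess(RAW_LOG)
popular = analyze_patterns(discover_patterns(records))
print(popular)  # {'/index.html': 2}
```

In this sketch the analysis step is a simple support threshold; a real study would substitute whatever validation the discovered patterns require.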
Web content mining is the process of extracting valuable information from web content
such as text, images, audio, and video. Two approaches are used in web content mining: the
agent-based approach and the database approach. The agent-based approach involves three types
of agents: intelligent search agents, information filtering/categorizing agents, and
personalized web agents. Intelligent search agents use user profiles and domain characteristics
to search automatically for information that matches specific queries. Information
filtering/categorizing agents filter data using a variety of techniques. Personalized web agents
learn user preferences and discover documents related to those user profiles. The database
approach relies on well-formed databases with defined schemas and attributes.
Unstructured text mining, structured text mining, semi-structured text mining, and
multimedia mining are the four methods used in web content mining to extract data.
Applications of the findings have followed two trends in web usage mining: Customized
Usage Tracking and General Access Pattern Tracking [4].
Web structure mining can be described in terms of graphs: web pages are represented as
nodes and hyperlinks as edges. In essence, the graph illustrates the interaction between the
user and the web. The goal of web structure mining is to provide organized summaries of data
from web pages [12].
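As a minimal sketch of this graph view, the example below extracts hyperlinks from a few HTML fragments with Python's standard `html.parser` module and stores the result as an adjacency list. The pages in `PAGES` and the helper names are hypothetical and serve only to illustrate the node and edge representation.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect the href targets of <a> tags in one HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical site: URL -> HTML source of that page.
PAGES = {
    "/home": '<a href="/news">News</a> <a href="/about">About</a>',
    "/news": '<a href="/home">Home</a> <a href="/about">About</a>',
    "/about": '<a href="/home">Home</a>',
}


def build_web_graph(pages):
    """Nodes are pages; an edge u -> v means page u links to page v."""
    graph = {}
    for url, html in pages.items():
        parser = LinkExtractor()
        parser.feed(html)
        parser.close()
        # Keep only links that point to pages inside the collection.
        graph[url] = [target for target in parser.links if target in pages]
    return graph


def in_degrees(graph):
    """Number of incoming links per page, a simple popularity signal."""
    counts = {node: 0 for node in graph}
    for targets in graph.values():
        for target in targets:
            counts[target] += 1
    return counts


web_graph = build_web_graph(PAGES)
print(web_graph)
print(in_degrees(web_graph))  # {'/home': 2, '/news': 1, '/about': 2}
```

An adjacency list is used here because it stays compact for sparse link graphs; the in-degree count is just one example of a structural summary that can be read off the graph.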
The Web has been studied from two perspectives: its graph structure and its semantics.
Research on Web structure looks at a number of structural characteristics of graphs that emerge
from the Web, such as the graph of hyperlinks and the graph created by interconnections between
distributed search servers. The study of the Web as a graph is intriguing in its own right, and
it also provides important insights into the social phenomena that characterize the growth of
the Web and into the algorithms used for crawling, searching, and community discovery. Research
on the semantics of the Web was initiated by Tim Berners-Lee, the inventor of the World Wide
Web. The term "semantic Web" refers to a version of the Web in which data can be processed by
machines in ways that support intelligent network services such as search agents and information
brokers [1].
[Figure: diagram with the labels "Hyper Link" and "Web Document"]
Link structures, with or without link descriptions, serve as the foundation for web
structure mining. The Markov chain model is a valuable tool for classifying web pages and for
deriving relationships and similarities between websites. The aim of web structure mining is to
create organized summaries of websites and web pages. It analyzes and describes HTML and XML
documents using a tree-like structure.
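One way to read a link structure as a Markov chain is the "random surfer" used in PageRank-style ranking: each page is a state, and following a link is a transition. The sketch below is an illustration under that assumption, not the model of any cited study; the three-page graph `LINKS`, the damping factor of 0.85, and the function names are hypothetical choices.

```python
def transition_matrix(graph):
    """Row-stochastic matrix: entry [i][j] is the chance of moving i -> j."""
    nodes = sorted(graph)
    index = {node: i for i, node in enumerate(nodes)}
    size = len(nodes)
    matrix = [[0.0] * size for _ in range(size)]
    for node, targets in graph.items():
        row = index[node]
        if targets:
            for target in targets:
                matrix[row][index[target]] += 1.0 / len(targets)
        else:
            # A page with no outgoing links jumps to any page uniformly.
            for col in range(size):
                matrix[row][col] = 1.0 / size
    return nodes, matrix


def stationary_distribution(matrix, damping=0.85, iterations=100):
    """Power iteration: repeatedly push probability mass along the links."""
    size = len(matrix)
    rank = [1.0 / size] * size
    for _ in range(iterations):
        new_rank = [(1.0 - damping) / size] * size
        for i in range(size):
            for j in range(size):
                new_rank[j] += damping * rank[i] * matrix[i][j]
        rank = new_rank
    return rank


# Hypothetical link graph: A links to B and C, B links to C, C links to A.
LINKS = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
nodes, P = transition_matrix(LINKS)
scores = dict(zip(nodes, stationary_distribution(P)))
print(scores)
```

In this small example page C receives links from both A and B and ends up with the largest long-run probability, followed by A and then B.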
Web mining operates on online data. Web mining data is stored in server databases and web
logs, whereas data mining data is stored in data warehouses (databases) [12].
A web user navigates the internet without knowing the paths taken by other users who
have the same goal, much like ants, which have no global view of their surroundings. Target
page and route information can be stored on a dedicated server so that an ant colony
optimization (ACO) algorithm can be used to identify the shortest path to a given document or
cluster of documents [2].
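The idea can be sketched with a very small ACO loop in which simulated users ("ants") follow links, and links on shorter routes accumulate more pheromone. This is a simplified illustration and not the algorithm of [2]: the site in `SITE`, the parameter values, and the function names are hypothetical, and the site is built so that every walk from "home" eventually reaches "target".

```python
import random

# Hypothetical site: page -> pages it links to. There are no cycles and
# every route from "home" ends at "target", so an ant never gets stuck.
SITE = {
    "home": ["a", "b"],
    "a": ["target"],
    "b": ["c"],
    "c": ["target"],
    "target": [],
}


def walk(site, pheromone, start, goal, rng):
    """One ant follows links, preferring links that carry more pheromone."""
    path = [start]
    while path[-1] != goal:
        current = path[-1]
        choices = site[current]
        weights = [pheromone[(current, nxt)] for nxt in choices]
        path.append(rng.choices(choices, weights=weights, k=1)[0])
    return path


def aco_shortest_path(site, start, goal, ants=20, rounds=10,
                      evaporation=0.5, seed=0):
    rng = random.Random(seed)
    pheromone = {(u, v): 1.0 for u, targets in site.items() for v in targets}
    best = None
    for _ in range(rounds):
        paths = [walk(site, pheromone, start, goal, rng) for _ in range(ants)]
        # Evaporation: old trails fade so early choices do not dominate.
        for edge in pheromone:
            pheromone[edge] *= 1.0 - evaporation
        # Deposit: each ant reinforces its route, shorter routes more strongly.
        for path in paths:
            for edge in zip(path, path[1:]):
                pheromone[edge] += 1.0 / (len(path) - 1)
            if best is None or len(path) < len(best):
                best = path
        # Pheromone stays positive, so every link remains selectable.
    return best, pheromone


best_path, trail = aco_shortest_path(SITE, "home", "target")
print(best_path)
```

The route returned is the shortest one that any ant actually walked; because the walks are random, a sketch like this gives no guarantee of finding the global optimum on larger sites.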
The challenge of adaptive websites is defined in [Perkowitz and Etzioni, 1998] as follows:
“Adaptive websites are websites that automatically improve their organization and presentation
by learning from visitor access patterns” [3]. Adaptive websites are an intriguing and demanding
topic.
Web page ranking is the primary component of any information retrieval framework. Web
search engines act as intermediaries between users and information sources. To display web
pages, search engines employ crawler, spider, and indexer programs [7].
According to Tilak Patidar and Aditya Ambasth (2016), AJAX crawling and deep web crawling
are interesting topics that provide opportunities for more precise and in-depth web crawling.
Giving spiders human-like selection intelligence moves web crawling in the direction of
artificial intelligence [9].
According to some definitions, web intelligence (WI) is the use of advanced artificial
intelligence and information technology approaches for the investigation, analysis, and
knowledge extraction of web data. WI examines the essential functions and real-world
applications of AI and advanced IT in the next generation of web-related systems, services, and
activities [6].
II. METHODS
Information Sources
The information sources searched included books, journals, conference proceedings, theses,
webpages, presentations, and technical reports. Public electronic databases and search engines,
including IEEE, ACM, Science Direct, Citeseer, Google, and Google Scholar, were also used.
Finding an electronic or printed source devoted to "AI in web mining" is difficult, because the
topic connects several domains. AI itself is a relatively recent and advanced topic, and the
same is true of web intelligence. Nonetheless, the term "web intelligence" adequately describes
AI applied to the web: although AI is a broad field of research, web intelligence is created
when it is combined with web mining. The review therefore examined the available forms of web
mining, in each of which AI algorithms follow the fundamental idea of mining and processing web
data in their own way.
Study Selection
The research question was examined thoroughly when choosing the studies, and related
terms, such as web intelligence, were also used in the search. The following checklists of
inclusion and exclusion criteria were used in selecting papers for the review.
i. Inclusion Criteria
The primary studies must cover artificial intelligence (AI) methods, machine learning
(ML) approaches, and deep learning (DL) or neural networks in relation to at least one
web mining category.
The primary studies discuss the research topic as their main point, either
completely or partially.
Unpublished studies that discuss the classification of web mining and contain useful
concepts in the field of knowledge.
Studies with an ISSN but no publication year.
Further analysis was conducted using the data extracted from each primary study
referenced in the systematic review. Each document was read line by line to capture every detail
and grasp its central idea. Any associated idea that enhanced the research was attributed to its
author, as specified by the IEEE citation standard.
iv. Quality Assessment
Twelve publications were evaluated: three were unpublished but were nonetheless included
as valuable domain knowledge, and nine were published in reputable scientific journals. Since
web intelligence is still a relatively young area of study, it can be difficult to find
publications that are closely related to the topic being studied. Because document analysis was
used, experimenter bias and subjective professional-knowledge bias are unlikely to be present to
an extent that threatens validity. Every paper evaluated is listed in the references section.
The material in all of the papers is well structured, combines information to a certain extent,
and serves as a roadmap for further research and analysis.
v. Data Synthesis
The information was derived mostly by identifying the root terms that matter most for the
detailed description and applicability of the study. To address the study's goal of tackling the
question of "AI in Web Mining," each detail was related to the overall notion of providing
intelligence to the web (web intelligence).
RESULTS
Twelve publications were chosen for the final systematic review after meeting all of the
requirements. Nearly all of the papers were written by IT domain specialists. Document analysis
was used to investigate the goal of the study, namely how AI is used to make information
retrieval considerably simpler for users. Without the various strategies applied at various
levels, DRIP (Data Rich, Information Poor) would undoubtedly occur. Numerous web intelligence
characteristics are covered, which makes it easier to see how the web is rapidly becoming
self-managing. According to Ioana Moisil [2], AI techniques and algorithms are used in nearly
all web mining operations in order for them to function effectively.
The tasks included in the web mining process are finding and analyzing patterns through
unstructured text mining, structured text mining, semi-structured text mining, and multimedia
mining. A handful of the many techniques reported in the primary studies are the Markov chain
model, SVM, Naïve Bayes, clustering, and classification. Adaptive websites, web agents, web page
ranking, web search engines, crawlers, spiders, indexer programs, and web crawling are all
products of applying these AI approaches to the web. Clustering and classification are used in
web content mining, AJAX crawling, and deep web crawling. As data mining techniques, clustering
and classification operate objectively on the basis of similarity and attribute selection,
respectively. Clusters of connected pages make it possible to identify not only significant web
pages but also people with similar interests. Until 1996, pages were retrieved according to
content similarity [11].
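As an illustration of classification in web content mining, the sketch below trains a small multinomial Naïve Bayes classifier on page text and assigns a category to a new page. It is a minimal example under stated assumptions, not an implementation from the primary studies: the training pages, the category labels, and the function names are hypothetical, and Laplace (add-one) smoothing is a choice made here.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labelled pages: (page text, category).
TRAINING_PAGES = [
    ("match goal team score", "sports"),
    ("team wins final match", "sports"),
    ("election vote parliament", "politics"),
    ("parliament passes vote law", "politics"),
]


def train(pages):
    """Count pages per category and word occurrences per category."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in pages:
        class_counts[label] += 1
        for word in text.split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary


def classify(text, model):
    """Return the category with the highest log-probability for the text."""
    class_counts, word_counts, vocabulary = model
    total_pages = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        total_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_pages)
        for word in text.split():
            if word not in vocabulary:
                continue  # words never seen in training carry no evidence
            # Laplace smoothing avoids zero probabilities for unseen pairs.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAINING_PAGES)
print(classify("team score in the final", model))  # sports
print(classify("vote on new law", model))          # politics
```

Log-probabilities are summed instead of multiplying raw probabilities so that long pages do not underflow to zero.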
Crawlers may be universal, targeted, or topical. The most advanced topical crawlers are
adaptive ones, which use machine learning approaches, namely classifiers, to guide them across
the web [2]. The matrix of link strengths is computed from the co-occurrence of links in web
pages (user choices) [2]. Association is based on co-occurrence or frequency. Three data mining
approaches are used in web usage mining: clustering, classification, and various methods of
association. Association and classification techniques are used in web structure mining.
Association, classification, and clustering are all machine learning approaches, and machine
learning is a subset of artificial intelligence.
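A co-occurrence count of the kind that underlies association can be sketched as follows. The example counts how often two pages are chosen in the same user session and keeps the pairs that meet a minimum count; the sessions, the `min_count` threshold, and the function names are hypothetical and are not taken from [2].

```python
from collections import Counter
from itertools import combinations

# Hypothetical user sessions: the set of pages each visitor chose.
SESSIONS = [
    {"/home", "/news", "/sports"},
    {"/home", "/news"},
    {"/home", "/about"},
    {"/news", "/sports"},
]


def link_strengths(sessions):
    """Count, for every pair of pages, the sessions in which both appear."""
    strengths = Counter()
    for pages in sessions:
        # Sorting gives each unordered pair one canonical key.
        for first, second in combinations(sorted(pages), 2):
            strengths[(first, second)] += 1
    return strengths


def strongest_associations(strengths, min_count=2):
    """Keep page pairs that co-occur in at least min_count sessions."""
    return sorted(pair for pair, count in strengths.items() if count >= min_count)


strengths = link_strengths(SESSIONS)
print(strongest_associations(strengths))
# [('/home', '/news'), ('/news', '/sports')]
```

The counter plays the role of a sparse, symmetric link-strength matrix: only pairs that actually co-occur are stored.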
Relative to the time that has passed since the inception of AI in the 1950s, the
application of AI in web mining has advanced significantly in recent years. Even with proactive
measures, managing web documents through web mining without AI becomes challenging. As the
volume of data produced across web services grows, applying AI in web mining yields very good
outcomes. Nearly all of the articles discuss the benefits of web intelligence, but security
concerns must also be addressed. The investigation conducted in the systematic review can be
evaluated as follows.
The systematic review was based on the primary studies and was structured to provide a
concise yet detailed explanation of how various AI algorithms and strategies led to the
development of intelligence in the web. AI enabled the web to function by endowing it with the
capacity for rational or logical reasoning, and it provided the web with its dynamic qualities.
The search for patterns in web data, that is, the relationships between data items, supports
many applications for solving problems that range from easy to difficult. Through web
intelligence, global firms have benefited greatly in maintaining their competitiveness in the
market and in implementing strategies for maximum profit through business analytics. The web
has also indirectly aided research and development that improves humankind's ability to
survive. This work makes some contribution to the investigation of taxonomies and ontologies in
the emerging field of web intelligence.
CONCLUSION
As the amount of online data grows, AI must be used in web mining for best results.
Given how the world is now organized, web mining without AI is hard to imagine, and society
would struggle to function without it. AI enhances humankind rather than replacing it. The
primary and secondary research showed that, although the quantity of data has grown
geometrically, applying AI to web mining creates an environment that makes it possible for
people to engage online.
AI, together with leading innovators, has transformed the way people see the world by
changing how they interact online. The goal of the systematic review was to examine how AI is
used in web mining, and the research question was motivated by extensive and rich data
triangulation. The principal concerns in future web development can be addressed by building on
the successful outcomes attained recently in the field of web intelligence. Security and
privacy concerns are left for future work. The study combined these domains on the basis of the
factual information gathered from its sources.
REFERENCES
[1] Y. Y. Yao, Ning Zhong, Jiming Liu, and Setsuo Ohsuga, “Web Intelligence (WI), Research
Challenges and Trends in the New Information Age,” unpublished.
[3] Oznur Kirmemis Alkan and Pinar Senkul, “IntWEB: An AI-Based Approach for Adaptive
Web,” unpublished.
[4] R. Malarvizhi and K. Saraswathi, “Web Content Mining Techniques Tools & Algorithms – A
Comprehensive Study,” International Journal of Computer Trends and Technology (IJCTT),
Volume 4, August 2013.
[5] Pradnyesh Bhisikar and Prof. Amit Sahu, “Overview on Web Mining and Different
Technique for Web Personalisation,” Applications (IJERA), ISSN: 2248-9622, Vol. 3,
pp. 543-545, March-April 2013.
[7] Seema Rani and Upasana Garg, “A Ranking Of Web Documents Using Semantic Similarity
And Artificial Intelligence Based Search Engine,” International Journal of Science, Engineering
and Technology Research (IJSETR), Volume 3, December 2014.
[8] Subhendu Kumar Pani, Deepak Mohapatra, and Bikram Keshari Ratha, “Integration of Web
mining and web crawler: Relevance and State of Art,” International Journal on Computer Science
and Engineering, Vol. 02, No. 03, pp. 772-776, 2010.
[9] Tilak Patidar and Aditya Ambasth, “Improvised Architecture for Distributed Web Crawling,”
International Journal of Computer Applications (0975 – 8887), Volume 151, October 2016.
[10] Tulasi Gayatri Devi and Aparna KS, “A Survey on Web Mining: Overview, Techniques,
Tools, and Applications,” International Journal for Research in Applied Science & Engineering
Technology (IJRASET), Volume 4, January 2016.
[11] K. Harish Kumar, “A Study on Web Mining Types and Applications,” International Journal
of Trend in Research and Development, Volume 3(5), ISSN: 2394-9333.
[12] Kavita Sharma, Gulshan Shrivastava, and Vikas Kumar, “Web Mining: Today and
Tomorrow,” unpublished.