BOOK RECOMMENDATION SYSTEM

Submitted in partial fulfillment of the requirements for the award of


Bachelor of Engineering Degree in Computer Science and Engineering

By

BAVISETTI GOWTHAM (Reg.No – 39110146)

BETHA SHANMUKHA (Reg.No – 39110151)

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

SCHOOL OF COMPUTING

SATHYABAMA
INSTITUTE OF SCIENCE AND TECHNOLOGY
(DEEMED TO BE UNIVERSITY)
Accredited with Grade “A” by NAAC | 12B Status by UGC | Approved by AICTE
JEPPIAAR NAGAR, RAJIV GANDHI SALAI,
CHENNAI - 600119

APRIL - 2023

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

BONAFIDE CERTIFICATE

This is to certify that this Project Report is the bonafide work of BAVISETTI
GOWTHAM (39110146) who carried out the Project Phase-2 entitled “BOOK
RECOMMENDATION SYSTEM” under my supervision from Jan 2023 to April 2023.

Internal Guide
Dr. M. D. ANTO PRAVEENA M.E., Ph.D.,

Head of the Department


Dr. L. LAKSHMANAN, M.E., Ph.D.,

Submitted for Viva-voce Examination held on 19.04.2023

Internal Examiner External Examiner

DECLARATION

I, BAVISETTI GOWTHAM (Reg.No – 39110146), hereby declare that the
Project Phase-2 Report entitled “BOOK RECOMMENDATION SYSTEM” done
by me under the guidance of Dr. M. D. ANTO PRAVEENA, M.E., Ph.D., is
submitted in partial fulfillment of the requirements for the award of Bachelor of
Engineering degree in Computer Science and Engineering.

DATE: 19-04-2023

PLACE: CHENNAI SIGNATURE OF THE CANDIDATE

ACKNOWLEDGEMENT

I am pleased to acknowledge my sincere thanks to the Board of Management of
SATHYABAMA for their kind encouragement in doing this project and for
completing it successfully. I am grateful to them.

I convey my thanks to Dr. T. Sasikala M.E., Ph.D., Dean, School of Computing, and
Dr. L. Lakshmanan M.E., Ph.D., Head of the Department of Computer Science and
Engineering, for providing me with the necessary support and details at the right time
during the progressive reviews.

I would like to express my sincere and deep sense of gratitude to my Project Guide
Dr. M. D. Anto Praveena, M.E., Ph.D., whose valuable guidance, suggestions and
constant encouragement paved the way for the successful completion of my Phase-2
project work.

I wish to express my thanks to all Teaching and Non-teaching staff members of the
Department of Computer Science and Engineering who were helpful in many
ways for the completion of the project.

ABSTRACT

Today the World Wide Web provides users with a vast array of information, and commercial
activity on the Web has increased to the point where hundreds of new companies are adding
web pages daily. This has led to the problem of information overload. Recommender systems
have been developed to overcome this problem by providing recommendations that help
individual users identify content of interest by using the opinions of a community of users
and/or the user’s preferences.

The aim of this thesis was to design and evaluate different approaches for producing
personalised recommendations within the book domain. To achieve this goal, the project
first investigated existing recommender systems and profiling techniques. The next step was
to build users’ profiles by monitoring users’ behaviour, and develop three different
approaches for producing recommendations. Finally, an evaluation of the system
recommendations’ accuracy was done, by first conducting live user experiments and then
performing offline analysis to measure the recommendations’ accuracy using appropriate
methods for testing.

The system evaluation results show that the accuracy of the system recommendations is very
good and that a recommender system based on the combination of content-based and
collaborative filtering approaches provides more accurate recommendations for the book
domain.

TABLE OF CONTENTS

Chapter No.   TITLE

              ABSTRACT
              LIST OF FIGURES
1             INTRODUCTION
              1.1 Problem Definition
              1.2 Scope and Objectives
2             REQUIREMENTS ANALYSIS
              3.1 Feasibility Studies / Risk Analysis of the Project
              3.2 Software Requirements Specification Document
4             DESCRIPTION OF PROPOSED SYSTEM
              4.1 Selected Methodology or Process Model
              4.2 Data Sets
              4.3 Architecture / Overall Design of Proposed System
              4.4 Description of Software for Implementation and Testing
                  Plan of the Proposed Model/System
              4.5 Project Management Plan
5             IMPLEMENTATION DETAILS
              5.1 Development and Deployment Setup
              5.2 Algorithms
              5.3 Testing
6             RESULTS AND DISCUSSIONS
7             CONCLUSION
              7.1 Conclusion
              7.2 Future Work
              7.3 Research Issues
              REFERENCES
APPENDIX
A. SOURCE CODE
B. SCREENSHOTS
C. RESEARCH PAPER

LIST OF FIGURES

FIGURE NO.    FIGURE NAME    PAGE NO.

CHAPTER 1

INTRODUCTION

This is an experimental project that first designs and then evaluates different
approaches for offering recommendations to readers regarding books they may wish to
purchase, as part of an online bookshop website.

Today, as a result of the growth of the Internet, the World Wide Web provides access to a
vast array of information through web pages. Also, commercial activity on the Web has
increased to the point where hundreds of new companies are adding Web pages daily. With
this increase in information sources, a problem of information overload occurs, in which
users are trying to deal with an excess of information that is not useful to them as they try to
make sensible decisions (Losee, 1989). As a response to this problem, a range of tools to help
with retrieving, searching, and filtering have been developed.

The tool most widely used to alleviate the problem of information overload is the search
engine. The benefits for the users from search engine technology have decreased as the
number of web pages has grown. In addition, the user must first consider the large number of
search tools available and decide which one to access. Then the user must interact with each
one individually because search engines are typically not personalised to individual users or
their prevailing context. Users usually make a choice on the basis of their personal experience
or other people’s experience. Based on these facts, recommender systems have been
developed to provide recommendations that help individual users identify content of interest
by using the opinions of a community of users and/or the user’s preferences.

Today various recommendation systems play an important role in supporting commercial
websites, helping users find items that they already know they would like to purchase as
well as discover new items of which they had been unaware. The ability to persuade
consumers to buy a suitable item is a significant goal for any recommender system in an
e-commerce environment. However, for any recommender system to be successful, the
consumer must trust and accept the system’s recommendations. This is achieved with a
clear explanation from the system, presented in a way that is in keeping with the
consumer’s preferences. A good recommender system can significantly contribute to
achieving the consumer’s acceptance of the system recommendations.

1.1 PROBLEM DEFINITION


This project aims to design and evaluate different approaches for computing
recommendations within the book domain to provide personalised recommendations to the
users.

1.2 SCOPE AND OBJECTIVES

• A recommender system is an effective solution to the problem of information
overload on e-commerce websites.
• The proposed system aims to offer users accurate recommendations.
• It aims to be a reliable suggestion technology for the book domain.

Objectives:
• Investigate and assess the profiling techniques and recommender systems that are
already in use.
• Build a user's profile for the recommender system by observing dynamic user
behaviour. The user profile needs to change to reflect the user's shifting interests.
• Create a recommender system that uses a variety of computation methods.
• Use appropriate methods to assess the accuracy of the system's recommendations.

Dept of ISE,SKIT 2
CHAPTER 2

REQUIREMENT ANALYSIS

2.1 FEASIBILITY STUDIES / RISK ANALYSIS OF THE PROJECT

The project will create and assess a collaborative filtering and content-based recommender
system for a real online bookstore. Machine learning methods are typically needed for
content-based recommendations in order to identify trends in the products customers like
(Middleton, 2003). The experiences of actual users will be reflected in the content-based
technology. Users' profiles will be created so that their behaviour may be tracked.
Additionally, the system will produce recommendations by comparing the contents of the
books in the user's profile with those that the user hasn't reviewed.

2.2 SOFTWARE REQUIREMENTS SPECIFICATION DOCUMENT

2.2.1 Hardware Requirements

• System : Intel® Core™ i5-9300H CPU @ 2.40 GHz
• Monitor : LED
• Mouse : Logitech
• RAM : 8 GB or above
• Hard Disk : 1 TB

2.2.2 Software Requirements

• Operating System : Windows 10, Kali Linux
• Language : Python 3
• Framework : Flask 2.2.2

2.2.3 Python:
Python is a high-level, interpreted, interactive and object-oriented scripting
language. Python is designed to be highly readable. It frequently uses English keywords
where other languages use punctuation, and it has fewer syntactical constructions than
other languages.
• Python is Interpreted − Python is processed at runtime by the interpreter. You do
not need to compile your program before executing it. This is similar to Perl and
PHP.
• Python is Interactive − You can sit at a Python prompt and interact with the
interpreter directly to write your programs.
• Python is Object-Oriented − Python supports the object-oriented style or technique of
programming that encapsulates code within objects.
• Python is a Beginner's Language − Python is a great language for beginner-level
programmers and supports the development of a wide range of applications,
from simple text processing to WWW browsers to games.

2.2.4 History of Python

• Python was developed by Guido van Rossum in the late eighties and early nineties at
the National Research Institute for Mathematics and Computer Science (CWI) in the
Netherlands.
• Python is derived from many other languages, including ABC, Modula-3, C, C++,
ALGOL 68, Smalltalk, the Unix shell and other scripting languages.
• Python is copyrighted. Its source code is freely available under the open-source
Python Software Foundation License, which is compatible with the GNU General
Public License (GPL).
• Python is now maintained by a team of core developers under the Python Software
Foundation. Guido van Rossum directed its progress until he stepped down as
project lead in 2018.

2.2.5 Python Features

Python's features include −
• Easy-to-learn − Python has few keywords, a simple structure, and a clearly defined
syntax. This allows the student to pick up the language quickly.
• Easy-to-read − Python code is clearly defined and easy on the eyes.
• A broad standard library − The bulk of Python's library is very portable and
cross-platform compatible on UNIX, Windows, and Macintosh.
• Interactive Mode − Python has support for an interactive mode which allows
interactive testing and debugging of snippets of code.
• Portable − Python can run on a wide variety of hardware platforms and has the same
interface on all platforms.
• Extendable − You can add low-level modules to the Python interpreter. These
modules enable programmers to add to or customize their tools to be more efficient.
• Databases − Python provides interfaces to all major commercial databases.
• GUI Programming − Python supports GUI applications that can be created and
ported to many system calls, libraries and windowing systems, such as Windows MFC,
Macintosh, and the X Window System of Unix.
• Scalable − Python provides a better structure and support for large programs than
shell scripting.

Apart from the above-mentioned features, Python has many other good features; a few are
listed below −
• It supports functional and structured programming methods as well as OOP.
• It can be used as a scripting language or can be compiled to byte-code for building
large applications.
• It provides very high-level dynamic data types and supports dynamic type checking.
• It supports automatic garbage collection.

2.2.6 Getting Python
The most up-to-date source code, binaries, documentation, news, etc. are
available on the official Python website, https://www.python.org.

2.2.7 Windows Installation

Here are the steps to install Python on a Windows machine.
• Open a web browser and go to https://www.python.org/downloads/.
• Follow the link for the Windows installer for the version you need to install. Older
releases were distributed as a python-XYZ.msi file, where XYZ is the version number;
current Python 3 releases are distributed as an executable (.exe) installer.
• To use a python-XYZ.msi installer, the Windows system must support Microsoft
Installer 2.0. Save the installer file to your local machine and then run it to find out
if your machine supports MSI.
• Run the downloaded file. This brings up the Python install wizard, which is easy
to use. Just accept the default settings, wait until the install is finished, and you
are done.

The Python language has many similarities to Perl, C, and Java. However, there are some
definite differences between the languages.

2.2.8 First Python Program

Let us execute programs in different modes of programming.

Interactive Mode Programming

Invoking the interpreter without passing a script file as a parameter brings up the following
prompt −

$ python
Python 2.4.3 (#1, Nov 11 2010, 13:34:43)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-48)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>

Type the following text at the Python prompt and press Enter −

>>> print("Hello, Python!")

The parenthesised form print("Hello, Python!") is required in Python 3, the version used
in this project, and it also works in Python 2, such as version 2.4.3 shown above. The older
statement form print "Hello, Python!" works only in Python 2. This produces the following
result −

Hello, Python!

2.2.9 Script Mode Programming

Invoking the interpreter with a script parameter begins execution of the script and continues
until the script is finished. When the script is finished, the interpreter is no longer active.
Let us write a simple Python program in a script. Python files have the extension .py. Type the
following source code in a test.py file −

print("Hello, Python!")

We assume that you have the Python interpreter set in the PATH variable. Now, try to run this
program as follows −

$ python test.py

This produces the following result −

Hello, Python!

CHAPTER 3

DESCRIPTION OF PROPOSED SYSTEM

The application will be developed using the incremental development methodology and will
be made up of four increments: Front End, Learning module, Recommendation module and
Database increment. The requirements outlined in the Requirements Document will be
mapped to manageable increments.

3.1 SELECTED METHODOLOGY OR PROCESS MODEL

Recommender systems have been developed to overcome the above-mentioned limitations
of searching through the massive volume of information available. Recommender systems,
in comparison with other filtering tools, require less experience on the part of the user and
less effort to specify their interests when querying and operating the system (Resnick and
Varian, 1997).
Recommender systems rely on different technologies for computing recommendations.
The most important approaches are content-based filtering and collaborative filtering.
Content-based filtering treats users as individuals, while recommender systems
employing the collaborative filtering approach treat the user as part of a group (Fasli,
2006). In addition, an advanced recommender system that combines content-based and
collaborative filtering to avoid the limitations of each is said to use a hybrid approach.

3.1.1 MODULE DESCRIPTION

Content-based filtering approach

The content-based filtering approach identifies the similarity between a user and new
items using the content of the previously evaluated items in the user profile. Each
item in a user profile is characterized by a set of attributes, which is constructed by
extracting a set of features from the item. Such a profile is used to determine whether a new
item is similar to the items that the user has preferred in the past. For instance, NewsWeeder
is a netnews-filtering system that suggests news articles to the user based on the user’s
profile (Lang, 1995). Most content-based approaches are performed on textual documents,
such as web pages and articles. A textual document can easily be broken down into
individual words, unlike video and physical resources, which require sophisticated analysis.
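The idea of comparing a new item with a profile built from previously liked items can be illustrated with a minimal sketch. It assumes a bag-of-words representation and cosine similarity, which are common choices but are not stated in this report; the `books` catalogue and the function names are hypothetical.

```python
from collections import Counter
from math import sqrt

# Hypothetical catalogue: book id -> short textual description (illustrative only).
books = {
    "b1": "classical mythology greek gods heroes",
    "b2": "greek heroes and ancient myths",
    "b3": "introduction to organic chemistry",
}

def vectorize(text):
    """Break a text into words and count them (bag of words)."""
    return Counter(text.lower().split())

def cosine(u, v):
    """Cosine similarity between two word-count vectors."""
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    norm = sqrt(sum(c * c for c in u.values())) * sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0

def recommend(liked, k=1):
    """Rank books the user has not seen by similarity to the user's profile."""
    profile = Counter()
    for book_id in liked:
        profile.update(vectorize(books[book_id]))
    scores = {
        book_id: cosine(profile, vectorize(text))
        for book_id, text in books.items()
        if book_id not in liked
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

Calling `recommend(["b1"])` ranks the unseen books against a profile built from book `b1`; the book that shares words with it scores higher than the unrelated one.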

Collaborative filtering approach


Collaborative filtering recommendations are based on the opinions of a community of
similar users. The basic idea is that users recommend items to one another. Collaborative
filtering makes this possible by asking the users to rate items, which allows the system to
recommend new items that similar users have rated highly. For instance, MovieLens is a
movie recommender system that uses collaborative filtering to help people find movies they
will like in the huge stream of available movies. Collaborative filtering works well for
multimedia technology such as music and movies.

Data Set
During the last few decades, with the rise of YouTube, Amazon, Netflix and many other such
web services, recommender systems have taken up more and more space in our lives. From
e-commerce (suggesting to buyers articles that could interest them) to online advertisement
(suggesting to users the right content, matching their preferences), recommender systems are
today unavoidable in our daily online journeys.

In a very general way, recommender systems are algorithms aimed at suggesting relevant
items to users (items being movies to watch, text to read, products to buy or anything else,
depending on the industry).

Recommender systems are critical in some industries, as they can generate a huge
amount of income when they are efficient and can also be a way to stand out significantly from
competitors. As proof of the importance of recommender systems, we can mention that, a
few years ago, Netflix organised a challenge (the “Netflix Prize”) where the goal was to
produce a recommender system that performed better than its own algorithm, with a prize of
1 million dollars to win.

3.2 DATA SET


Books are identified by their respective ISBN. Invalid ISBNs have already been
removed from the dataset. Moreover, some content-based information is given (Book-Title,
Book-Author, Year-Of-Publication, Publisher), obtained from Amazon Web Services. Note
that in case of several authors, only the first is provided. URLs linking to cover images are
also given, appearing in three different flavours (Image-URL-S, Image-URL-M,
Image-URL-L), i.e., small, medium, large. These URLs point to the Amazon web site.

3.2.1 Recommender System


Recommender systems intend to provide users with suggestions of items that they may be
interested in, based upon their past preferences, history of purchase, or demographic
information, as well as the environment of possible items. In addition, a recommender system
helps the site adapt itself and provide individual personalisation for each consumer; this
increases the sales for the commercial site.

Different forms for providing recommendations have been developed; they can be classified
into the following forms: attribute-based recommendations, item-to-item correlation,
people-to-people correlation and non-personalised recommendations (Konstan et al., 2001).
For more detailed descriptions.


3.2.2 Content based filtering approach


Content-based filtering has some shortcomings in recommending items. A user's selection is
based on the subjective attributes (such as the quality) of the item (Goldberg et al., 1992); in
contrast, content based approaches are based on objective attributes (such as the description
of an item) about the items. Also, some items the users may be interested in cannot be
recommended to them because content-based methods compare new items with the items
previously seen by the user, while the user's interests may be beyond the scope of the
previously seen items. Finally, multimedia technology such as sound, video or physical items
cannot be analysed automatically for relevant attribute information, due to limitations of
resources (Jennings et al., 2005).

3.2.3 Collaborative Filtering approach

Collaborative filtering, described in Section 3.1.1, works well for multimedia items such as
music and movies. However, it also has some limitations:
New user problem: A new user starts off with a profile of interests from scratch. The system
needs to know the user's preferences for different items to generate accurate recommendations.
Cold start problem: New items cannot be recommended until more information is obtained,
which happens when another user either rates the item or provides feedback on it (Fasli,
2006). As a result, the items recommended by the system may not be similar enough to the
users’ interests.
Scalability: A collaborative filtering algorithm should address the scalability issue that arises
as the number of users increases and their collective profile size becomes large (Fasli, 2006).


The schematic diagram of the collaborative filtering process is shown in Fig 3.1. As the
figure shows, there is a list of users U = {u1, u2, …, um} and a list of items
I = {i1, i2, …, in}. Each user has a list of items that he or she has rated. For the active
user, the collaborative filtering algorithm generates a recommendation, which is a list of N
items that the active user is most likely to like. The process also outputs a prediction, which
is the predicted rating of item j by the active user (Sarwar et al., 2001).

Fig 3.1 Collaborative Filtering Process

3.2.4 Hybrid Approach

The hybrid approach was introduced to combine the advantages of content-based and
collaborative filtering techniques and so help overcome their limitations: the strength
of one filtering technique is used to offset the weakness of the other. The hybrid filtering
approach is also called “collaborative via content” because content-based profiles are also
taken into account when identifying the similarities among users for collaborative
recommendations (Pazzani, 1999).
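One simple way to see how the two techniques can cover for each other is a weighted blend of their scores. The sketch below is only an illustration under that assumption; it is not the "collaborative via content" method of Pazzani (1999), and the function name `hybrid_score` and the weight `alpha` are hypothetical.

```python
def hybrid_score(content_score, collaborative_score, alpha=0.5):
    """Blend two scores in [0, 1]; alpha is the weight of the content-based score.

    Assumption: when no collaborative score exists (for example a new book
    with no ratings yet), the content-based score is used on its own.
    """
    if collaborative_score is None:
        return content_score
    return alpha * content_score + (1 - alpha) * collaborative_score
```

With this fallback, a newly added book that nobody has rated can still be scored from its content, which is the cold start case that collaborative filtering alone cannot handle.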

3.2.5 User - Based collaborative Filtering

The user-based algorithm rests on the assumption that each user belongs to a larger group of
similarly behaving individuals. It uses statistical techniques to find a set of users with similar
interests, known as neighbours, in the entire user-item database, to generate a list of
recommendations for the active user (Middleton, 2003).
Neighbourhood-based algorithms use different similarity measures to compute the similarity
between the active user and other users in the database, such as the Pearson correlation
coefficient and mean squared differences (Breese et al., 1998). Moreover, to predict the rating
that the active user would give an item, the ratings of that item from the most similar users
are averaged, weighted by their similarities to the active user. The Pearson correlation
(Fig 3.2) reflects the degree of linear relationship between two variables and ranges from
+1 to -1. A positive correlation means that the two users have similar tastes, while a
negative correlation indicates that the users have dissimilar tastes (Fasli, 2006). The Pearson
correlation coefficient method defines the similarity between two users by:

Fig 3.2: Pearson correlation
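As an illustration of the two steps described above, the following sketch computes the standard Pearson correlation between two users over the books both have rated, and then predicts a rating as a mean-centred, similarity-weighted average of the neighbours' ratings. The mean-centred form is one common variant and is an assumption here; the toy `ratings` data and the function names are hypothetical.

```python
from math import sqrt

# Hypothetical toy ratings (user -> {book: rating}); not taken from the dataset.
ratings = {
    "u1": {"b1": 8, "b2": 6, "b3": 9},
    "u2": {"b1": 7, "b2": 5, "b3": 8, "b4": 9},
    "u3": {"b1": 3, "b2": 9, "b3": 2, "b4": 4},
}

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def pearson(a, b):
    """Pearson correlation of users a and b over the books both have rated."""
    common = sorted(set(ratings[a]) & set(ratings[b]))
    if len(common) < 2:
        return 0.0
    mean_a = mean(ratings[a][i] for i in common)
    mean_b = mean(ratings[b][i] for i in common)
    num = sum((ratings[a][i] - mean_a) * (ratings[b][i] - mean_b) for i in common)
    den = sqrt(sum((ratings[a][i] - mean_a) ** 2 for i in common)) * sqrt(
        sum((ratings[b][i] - mean_b) ** 2 for i in common)
    )
    return num / den if den else 0.0

def predict(active, book):
    """Mean-centred, similarity-weighted average of the neighbours' ratings."""
    neighbours = [u for u in ratings if u != active and book in ratings[u]]
    weights = {u: pearson(active, u) for u in neighbours}
    norm = sum(abs(w) for w in weights.values())
    base = mean(ratings[active].values())
    if norm == 0:
        return base  # no usable neighbours: fall back to the user's own mean
    offset = sum(
        w * (ratings[u][book] - mean(ratings[u].values())) for u, w in weights.items()
    )
    return base + offset / norm
```

In the toy data, `u1` and `u2` rate the shared books in the same pattern, so their correlation is close to +1, while `u1` and `u3` disagree and get a negative correlation. `predict("u1", "b4")` therefore lands above `u1`'s own average, because the like-minded neighbour rated `b4` highly.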

3.2.6 Item Based collaborative Filtering

Item-based algorithms were developed to overcome the scalability problems of user-based
recommendations. Unlike a user-based approach, the item-based approach looks at the set
of items that the active user has rated, computes the similarity between those items and the
target item, and then selects the items most similar to the target item (Sarwar et al., 2001).
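The item-based step can be sketched as follows. This minimal example assumes plain cosine similarity between item rating vectors, computed over the users who rated both items; Sarwar et al. (2001) also discuss other measures, and the toy `ratings` data and function names here are hypothetical.

```python
from math import sqrt

# Hypothetical toy ratings (user -> {book: rating}); illustrative only.
ratings = {
    "u1": {"b1": 9, "b2": 8},
    "u2": {"b1": 8, "b2": 9, "b3": 2},
    "u3": {"b1": 2, "b2": 1, "b3": 9},
    "u4": {"b1": 1, "b3": 8},
}

def item_vector(item):
    """Ratings of one item, keyed by the users who rated it."""
    return {user: r[item] for user, r in ratings.items() if item in r}

def item_similarity(i, j):
    """Cosine similarity of items i and j over users who rated both."""
    vi, vj = item_vector(i), item_vector(j)
    common = vi.keys() & vj.keys()
    if not common:
        return 0.0
    dot = sum(vi[u] * vj[u] for u in common)
    norm = sqrt(sum(vi[u] ** 2 for u in common)) * sqrt(sum(vj[u] ** 2 for u in common))
    return dot / norm if norm else 0.0

def most_similar_rated(user, target, k=1):
    """Among the items the user has rated, pick the k most similar to the target."""
    rated = [i for i in ratings[user] if i != target]
    return sorted(rated, key=lambda i: item_similarity(target, i), reverse=True)[:k]
```

Because item-to-item similarities depend only on the rating matrix, they can be precomputed offline, which is the usual argument for the better scalability of this approach.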


3.3 ARCHITECTURE / OVERALL DESIGN OF PROPOSED SYSTEM


Fig 3.3 depicts the architecture of the system, which is intended to provide users with
suggestions of items that they may be interested in, based upon their past preferences,
history of purchase, or demographic information, as well as the environment of possible items.

Fig 3.3: System Architecture

3.4 DESCRIPTION OF SOFTWARE FOR IMPLEMENTATION AND TESTING PLAN OF THE
PROPOSED MODEL/SYSTEM

Flask Framework:
Flask is a web application framework written in Python. It was created by Armin Ronacher,
who led an international group of Python enthusiasts named Pocoo, and it is now maintained
by the Pallets project. Flask is based on the Werkzeug WSGI toolkit and the Jinja2 template
engine. Both originated as Pocoo projects.
The HTTP protocol is the foundation of data communication on the World Wide Web. Different
methods of data retrieval from a specified URL are defined in this protocol.

Table 3.1 summarizes the different HTTP methods –

Table 3.1: HTTP methods and their descriptions

S.No   Method   Description
1      GET      Requests data from the server; any data sent is visible in the URL.
                Most common method.
2      HEAD     Same as GET, but without a response body.
3      POST     Used to send HTML form data to the server. Data received by the POST
                method is not cached by the server.
4      PUT      Replaces all current representations of the target resource with the
                uploaded content.
5      DELETE   Removes all current representations of the target resource given by a
                URL.

By default, a Flask route responds only to GET requests. This can be changed by passing a
methods argument to the route() decorator. To demonstrate the POST method in URL routing,
an HTML form can be created that uses the POST method to send form data to a URL.
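A minimal sketch of such a route is given here. It assumes Flask is installed; the URL `/recommend`, the form field `title` and the view function name are hypothetical and are not taken from the project's source code.

```python
from flask import Flask, request
from markupsafe import escape

app = Flask(__name__)

# Accept both GET and POST by passing the methods argument to route().
@app.route("/recommend", methods=["GET", "POST"])
def recommend():
    if request.method == "POST":
        # Form data sent with POST is available in request.form.
        title = request.form.get("title", "")
        # Escape user input before echoing it back into the page.
        return f"Recommendations requested for: {escape(title)}"
    # A GET request returns a small HTML form that posts back to this URL.
    return (
        '<form method="post">'
        '<input name="title">'
        '<input type="submit" value="Recommend">'
        "</form>"
    )
```

If the code is saved as, say, `app.py`, it can be started with `flask --app app run`. Opening `/recommend` in a browser shows the form, and submitting it sends the title with POST.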


3.5 PROJECT MANAGEMENT PLAN

Fig 3.4: Flow Chart

Fig 3.4 depicts the project management plan as a flowchart that follows standard, widely
recognised project management practices and methodologies. It covers project initiation,
planning, execution, monitoring, and closure, with processes, tools, and techniques for
each phase.

CHAPTER 4

IMPLEMENTATION DETAILS

A recommendation engine is a class of machine learning system that offers relevant
suggestions to the customer. Before recommendation systems, people mostly decided what to
buy by taking suggestions from friends. Now Google knows what news you will read, and
YouTube knows what type of videos you will watch, based on your search history, watch
history, or purchase history.

A recommendation system helps an organization create loyal customers and build trust by
offering them the products and services for which they came to the site. Today's
recommendation systems are powerful enough to handle even a new customer who is visiting
the site for the first time: they can recommend products that are currently trending or highly
rated, and they can also recommend the products that bring maximum profit to the
company.

4.1 DEVELOPMENT AND DEPLOYMENT DETAILS


Machine learning is widely used for prediction, and a large number of algorithms for it
are available in various libraries. In this project, we build a prediction model on historical
data using different machine learning algorithms and classifiers, plot the results, and
calculate the accuracy of the model on the testing data.
Building and training a model with various algorithms on a large dataset is one part of the
work. Using the model within different applications is the second part: deploying machine
learning in the real world. To use the model to make predictions on new data, we have to
deploy it over the internet so that the outside world can use it. This chapter describes how
we trained a machine learning model and built a web application on it using Flask. Several
libraries used by this model must be installed first.

Use the pip command to install the libraries. Note that scikit-learn is installed under the
package name scikit-learn, although it is imported as sklearn.

pip install pandas
pip install numpy
pip install scikit-learn


4.2 ALGORITHM

• Content-based filtering: The algorithm recommends products that are similar to those the
user has already used or watched. In simple words, we try to find look-alike items. For
example, a person who likes to watch Sachin Tendulkar's shots may also like watching
Ricky Ponting's shots, because the two videos have similar tags and similar categories.
• Collaborative filtering: Collaborative filtering recommender systems are based on past
interactions between users and target items. In simple words, we search for look-alike
customers and offer products based on what a customer's look-alike has chosen. For
example, suppose X and Y are two similar users, X has watched movies A, B, and C, and Y
has watched movies B, C, and D. We then recommend movie A to Y and movie D to X.
• Hybrid filtering: This is a combination of the two methods above. It is a more complex
model that recommends products based on your own history as well as on users similar to
you. Some organizations use this method: Facebook, for example, shows news that is
important to you and to others in your network, and LinkedIn does the same.
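
The X and Y example from the collaborative filtering description can be sketched in a few lines. This is only an illustration: the function name is hypothetical, and counting shared items is a deliberately crude stand-in for a real similarity measure.

```python
def recommend_from_lookalike(history, user):
    """Suggest items seen by the most similar other user but not by `user`."""
    seen = history[user]
    # Pick the other user who shares the most items with `user`.
    lookalike = max(
        (other for other in history if other != user),
        key=lambda other: len(history[other] & seen),
    )
    # Recommend whatever the look-alike has seen that `user` has not.
    return sorted(history[lookalike] - seen)

# The example from the text: X watched A, B, C and Y watched B, C, D.
history = {"X": {"A", "B", "C"}, "Y": {"B", "C", "D"}}
print(recommend_from_lookalike(history, "Y"))  # ['A']
print(recommend_from_lookalike(history, "X"))  # ['D']
```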

Dataset description
We have three files in our dataset, which was extracted from book-selling websites.

• Books – The first file contains all the information related to books, such as
author, title, and publication year.
• Users – The second file contains registered users' information, such as user ID and
location.
• Ratings – The third file records which user has given what rating to which book.

Based on these three files, we can build a powerful collaborative filtering model.

Dept of ISE,SKIT 18

Loading data
Let us start by importing the libraries and loading the datasets. While loading the files, we
run into a few problems:

• The values in the CSV file are separated by semicolons, not by commas.
• Some lines are malformed and cannot be imported with pandas; the parser raises an
error when it reaches them.
• The file is encoded in Latin-1 rather than UTF-8.
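
The three loading problems listed above map onto three `pandas.read_csv` arguments. The sketch below uses a small in-memory stand-in instead of the real file, so the sample rows are assumptions. `on_bad_lines="skip"` requires pandas 1.3 or later, and skipping malformed rows is one possible way to handle them, not necessarily what the project did.

```python
import io
import pandas as pd

# Stand-in for the books file: semicolon-separated, Latin-1 encoded,
# with one malformed row that has too many fields.
raw = (
    "ISBN;Book-Title;Book-Author\n"
    "0001;Clara Callan;Richard Wright\n"
    "0002;Broken;Row;With;Too;Many;Fields\n"
    "0003;El Señor de los Anillos;J. R. R. Tolkien\n"
).encode("latin-1")

books = pd.read_csv(
    io.BytesIO(raw),
    sep=";",              # values are separated by semicolons
    encoding="latin-1",   # the file is not UTF-8
    on_bad_lines="skip",  # drop rows the parser cannot handle (pandas >= 1.3)
    dtype=str,            # keep ISBNs as text so leading zeros survive
)
print(books)
```

With a real file, the first argument would be the file path instead of the `io.BytesIO` buffer.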

Preprocessing data: The books file has some extra columns that are not required for our
task, such as the image URLs. We also rename the columns of each file, because the column
names contain spaces and uppercase letters; correcting them makes the data easier to use.
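
A minimal sketch of the two preprocessing steps described above, dropping unneeded columns and normalising column names. The sample frame and the exact renaming rule (lower case, underscores instead of spaces and hyphens) are assumptions for illustration.

```python
import pandas as pd

# Illustrative one-row sample; the real books file has many more rows.
books = pd.DataFrame({
    "ISBN": ["0001"],
    "Book-Title": ["Clara Callan"],
    "Book-Author": ["Richard Wright"],
    "Year-Of-Publication": [2001],
    "Image-URL-S": ["http://example.com/s.jpg"],
})

# Drop the image URL columns, which are not needed for modelling.
image_columns = [c for c in books.columns if c.startswith("Image-URL")]
books = books.drop(columns=image_columns)

# Normalise names: lower case, with spaces and hyphens replaced by underscores.
books.columns = (
    books.columns.str.strip().str.lower().str.replace(r"[\s-]+", "_", regex=True)
)
print(list(books.columns))
```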

The dataset is reliable and can be considered large. We have data on 271,360 books,
approximately 278,000 registered users on the website, and about 11 lakh (1.1 million)
ratings given by those users. Hence we can say that the dataset is good and reliable.

We do not simply want to find a similarity between users or books. What we want is this: if
user A has read and liked books x and y, user B has also liked these two books, and user A
has now read and liked a book z that B has not read, then we should recommend book z to
user B. This is what collaborative filtering is.

This is achieved using Matrix Factorization. We create one matrix in which the columns are
users, the index is books, and the values are ratings; in other words, we create a pivot
table. If we took all the books and all the users for modelling, the resulting matrix would
be very large and mostly empty. We therefore have to reduce the number of users and books,
because we cannot consider a user who has only registered on the website or has read only
one or two books. We cannot rely on such a user to recommend books to others, because we
have to extract knowledge from the data. So we limit the numbers: we keep only users who
have rated at least 200 books, and only books that have received at least 50 ratings from
users.
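
The filtering and pivot steps described above can be sketched as follows. The column names (`user_id`, `title`, `rating`), the function name, and the toy data are assumptions. The sketch also assumes that book ratings are counted after inactive users have been removed, which the text does not state.

```python
import pandas as pd

def build_book_user_matrix(ratings, min_user_ratings=200, min_book_ratings=50):
    """Filter sparse users and books, then pivot into a book x user rating matrix."""
    # Keep only users who have rated enough books.
    user_counts = ratings.groupby("user_id")["rating"].count()
    active_users = user_counts[user_counts >= min_user_ratings].index
    filtered = ratings[ratings["user_id"].isin(active_users)]

    # Keep only books that received enough ratings from those users.
    book_counts = filtered.groupby("title")["rating"].count()
    popular_titles = book_counts[book_counts >= min_book_ratings].index
    filtered = filtered[filtered["title"].isin(popular_titles)]

    # Rows (index) are books, columns are users, values are ratings.
    pivot = filtered.pivot_table(index="title", columns="user_id", values="rating")
    return pivot.fillna(0)  # unrated book/user pairs become 0

toy = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 3, 3, 4],
    "title":   ["A", "B", "C", "A", "B", "B", "C", "D"],
    "rating":  [8, 7, 9, 9, 6, 5, 8, 10],
})
# Tiny thresholds so the toy data survives; the text uses 200 and 50.
matrix = build_book_user_matrix(toy, min_user_ratings=2, min_book_ratings=2)
print(matrix)
```

In the toy data, user 4 has rated only one book and is dropped, which in turn removes book D from the matrix.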


Exploratory data analysis

The primary goal of EDA is to support the analysis of data prior to making any conclusions.
It may aid in the detection of apparent errors, as well as a deeper understanding of data
patterns, the detection of outliers or anomalous events, and the discovery of interesting
relationships between variables.
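
As a hedged illustration of the kind of checks EDA involves, the sketch below summarises a toy ratings table. The column names and values are assumptions, and the report does not state which EDA steps were actually run.

```python
import pandas as pd

# Toy ratings table; values are illustrative only.
ratings = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 3],
    "rating":  [8, 7, 9, 0, 10, 5],
})

# How are the rating values distributed?
rating_counts = ratings["rating"].value_counts().sort_index()

# How many ratings does each user contribute? Very heavy or very light users stand out here.
ratings_per_user = ratings.groupby("user_id").size()

# Summary statistics give a quick check for implausible values.
summary = ratings["rating"].describe()

print(rating_counts, ratings_per_user, summary, sep="\n\n")
```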

Website Deployment

We use PyCharm Community Edition to develop and deploy the website, by creating a project
named Book Recommendation System.

Flask provides configuration and conventions, with sensible defaults, to get started. The
different parts of the Flask framework can be used, customized, and extended, and
community-maintained extensions add even more functionality.

from flask import Flask

def create_app():
    app = Flask(__name__)
    hello.init_app(app)  # hello is the module or extension explained below
    return app

• create_app function: This function is a convention used in Flask applications to
create an instance of the Flask application. Inside the function:
• app = Flask(__name__): Creates a Flask application instance. __name__ is a special
Python variable that holds the name of the current module.
• hello.init_app(app): Initializes the hello module or blueprint with the Flask
application app. This typically means registering routes or configuring the hello
module to work with the Flask application.
• return app: Finally, the function returns the Flask app object, which represents your
entire application.
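
The application factory pattern described above can be made runnable with a small blueprint. In this sketch the `hello` blueprint, its route, and the response text are hypothetical, and `app.register_blueprint(hello)` is used in place of `hello.init_app(app)` because the source does not show how `hello` is defined.

```python
from flask import Blueprint, Flask

# Hypothetical blueprint standing in for the `hello` module described above.
hello = Blueprint("hello", __name__)

@hello.route("/")
def greet():
    return "Hello from the book recommender"

def create_app():
    """Application factory: build, configure, and return the Flask app."""
    app = Flask(__name__)
    app.register_blueprint(hello)  # plays the role of hello.init_app(app)
    return app

app = create_app()
```

Creating the app inside a function makes it easy to build a fresh instance for testing, for example with `app.test_client()`.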

CHAPTER 5

RESULTS AND DISCUSSION

Today the World Wide Web provides users with a vast array of information, and commercial
activity on the Web has increased to the point where hundreds of new companies are adding
web pages daily. This has led to the problem of information overload. Recommender systems
have been developed to overcome this problem by providing recommendations that help
individual users identify content of interest by using the opinions of a community of users
and/or the user’s preferences.

The aim of this thesis was to design and evaluate different approaches for producing
personalised recommendations within the book domain. To achieve this goal, the project
first investigated existing recommender systems and profiling techniques. The next step was
to build users’ profiles by monitoring users’ behaviour, and develop three different
approaches for producing recommendations. Finally, an evaluation of the system
recommendations’ accuracy was done, by first conducting live user experiments and then
performing offline analysis to measure the recommendations’ accuracy using appropriate
methods for testing.

The system evaluation results show that the accuracy of the system recommendations is very
good and that a recommender system based on the combination of content-based and
collaborative filtering approaches provides more accurate recommendations for the book
domain.

CHAPTER 6
CONCLUSION

6.1 CONCLUSION
All of our systems (purely content-based, purely collaborative-filtering, and hybrid)
performed quite well. Looking back on the project, one thing we might have done differently
is to spend more time searching for a dataset of ratings with a higher rating variance per
user. Had we been able to find such a dataset, our
implementations of algorithms would have been tested on data that would have been more
representative of what a typical commercial recommendation system could access in creating
its predictions. However, given the data that was available to us, as well as the results our
various approaches produced, our systems were largely successful, providing insight into
how the different systems we regularly use work and the varying algorithms that make that
possible.

6.2 FUTURE WORK


Given more information regarding the books dataset, namely features like Genre,
Description etc., we could implement a content-filtering based recommendation system and
compare the results with the existing collaborative-filtering based system.
We would also like to explore various approaches for clustering users based on Age,
Location, etc., and then implement voting algorithms to recommend items to a user
depending on the cluster to which that user belongs.


APPENDIX

A. SOURCE CODE
Fig A.1 shows the Flask implementation used for the website deployment.
from flask import Flask, render_template, request
import pickle
import numpy as np

# Load the objects saved during model training.
popular_df = pickle.load(open('popular.pkl', 'rb'))
pt = pickle.load(open('pt.pkl', 'rb'))
books = pickle.load(open('books.pkl', 'rb'))
similarity_scores = pickle.load(open('similarity_scores.pkl', 'rb'))

app = Flask(__name__)


@app.route('/')
def index():
    # Home page: list the most popular books.
    return render_template('index.html',
                           book_name=list(popular_df['Book-Title'].values),
                           author=list(popular_df['Book-Author'].values),
                           image=list(popular_df['Image-URL-M'].values),
                           votes=list(popular_df['num_ratings'].values),
                           rating=list(popular_df['avg_rating'].values)
                           )


@app.route('/recommendation')
def recommendation_ui():
    return render_template('recommendation.html')


@app.route('/recommend_books', methods=['POST'])
def recommend():
    user_input = request.form.get('user_input')
    # Row position of the requested title in the pivot table.
    index = np.where(pt.index == user_input)[0][0]
    # Rank all books by similarity to the requested one; position 0 is the book itself.
    similar_items = sorted(list(enumerate(similarity_scores[index])),
                           key=lambda x: x[1], reverse=True)[1:6]
    data = []
    for i in similar_items:
        item = []
        temp_df = books[books['Book-Title'] == pt.index[i[0]]]
        item.extend(temp_df.drop_duplicates('Book-Title')['Book-Title'].values)
        item.extend(temp_df.drop_duplicates('Book-Title')['Book-Author'].values)
        item.extend(temp_df.drop_duplicates('Book-Title')['Image-URL-M'].values)
        data.append(item)
    print(data)
    return render_template('recommendation.html', data=data)


if __name__ == '__main__':
    app.run()

Fig A.1: Flask Sourcecode
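
The lookup performed by the recommend route can be exercised without Flask or the pickled files. The sketch below assumes that `similarity_scores` holds pairwise cosine similarities between the rows of `pt`; the report does not show how the pickled files were produced, so the helper names and toy matrix here are illustrative only.

```python
import numpy as np
import pandas as pd

def cosine_similarity_matrix(matrix):
    """Pairwise cosine similarity between the rows of a 2-D array."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for all-zero rows
    unit = matrix / norms
    return unit @ unit.T

def similar_titles(pt, similarity_scores, title, top_n=5):
    """Return the titles most similar to `title`, or [] if it is unknown."""
    matches = np.where(pt.index == title)[0]
    if len(matches) == 0:
        return []
    row = similarity_scores[matches[0]]  # scores for the requested book
    ranked = sorted(enumerate(row), key=lambda x: x[1], reverse=True)
    # Position 0 is the book itself (similarity 1), so skip it.
    return [pt.index[i] for i, _ in ranked[1:top_n + 1]]

# Toy book x user rating matrix (rows are books, columns are users).
pt = pd.DataFrame(
    [[8, 9, 0], [7, 6, 5], [9, 0, 8]],
    index=["A", "B", "C"], columns=[1, 2, 3],
)
scores = cosine_similarity_matrix(pt.to_numpy(dtype=float))
print(similar_titles(pt, scores, "A", top_n=2))
```

Returning an empty list for an unknown title avoids the IndexError that a bare `np.where(...)[0][0]` raises when the user types a book that is not in the pivot table.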


B. SCREENSHOTS

First of all we are importing the required libraries and datasets (Fig B.1)

Fig B.1 Importing Libraries and Datasets

In Fig B.2, the books dataset is merged with the ratings dataset to find the books with the
highest average rating.

Fig B.2 Popularity-based filtering / Content-based filtering output


In the next step, data pre-processing is carried out to modify the data as required
(Fig B.3).

Fig B.3 Data Pre-processing


Webpage deployment of the content-based filtering model is shown in Fig B.4.

Fig B.4 Webpage for content-based filtering model

Webpage deployment of the collaborative filtering model is shown in Fig B.5.

Fig B.5 Webpage for collaborative filtering model
