Book Recommendation System Documentation Final
By
Mr. Suresh Babu
Assoc. Professor
CERTIFICATE
This is to certify that the Project Report on “Book Recommendation System” is a bonafide
work by Argyadip Das (21911A1206), Subramanyam (21911A1205), Praneeth (21911A1223)
in partial fulfilment of the requirement for the award of the degree of Bachelor of Technology
in “INFORMATION TECHNOLOGY” JNTU Hyderabad during the year 2024-2025.
Project Guide
Mr. B. Suresh Babu, M.Tech, Associate Professor
DECLARATION
Argyadip Das (21911A1206)
A. Subramanyam (21911A1205)
Praneeth Teja (21911A1223)
ACKNOWLEDGEMENT
I would like to express my sincere and deep sense of gratitude to my Project Guide,
Mr. Suresh Babu, Associate Professor, whose valuable guidance, suggestions and
constant encouragement paved the way for the successful completion of our project
work.
I wish to express my thanks to all Teaching and Non-teaching staff members of the
Department of Information Technology who were helpful in many ways for the
completion of the project.
TABLE OF CONTENTS
Chapter No.    TITLE
ABSTRACT
1      INTRODUCTION
2      LITERATURE SURVEY
2.2    Motivation
3      REQUIREMENTS ANALYSIS
3.1    Feasibility Studies / Risk Analysis of the Project
3.2    Software Requirements Specification Document
4      DESCRIPTION OF PROPOSED SYSTEM
4.1    Selected Methodology or process model
4.2    Data sets
REFERENCES
APPENDIX
A. SOURCE CODE
B. SCREENSHOTS
C. RESEARCH PAPER
LIST OF FIGURES
ABSTRACT
Today the World Wide Web provides users with a vast array of information, and
commercial activity on the Web has increased to the point where hundreds of new
companies are adding web pages daily. This has led to the problem of information
overload. Recommender systems have been developed to overcome this problem
by providing recommendations that help individual users identify content of interest
by using the opinions of a community of users and/or the user’s preferences.
The aim of this thesis was to design and evaluate different approaches for producing
personalised recommendations within the book domain. To achieve this goal, the
project first investigated existing recommender systems and profiling techniques.
The next step was to build users’ profiles by monitoring users’ behaviour, and
develop three different approaches for producing recommendations. Finally, an
evaluation of the system recommendations’ accuracy was done, by first conducting
live user experiments and then performing offline analysis to measure the
recommendations’ accuracy using appropriate methods for testing.
The system evaluation results show that the accuracy of the system
recommendations is very good and that a recommender system based on the
combination of content-based and collaborative filtering approaches provides more
accurate recommendations for the book domain.
CHAPTER 1
INTRODUCTION
Today, as a result of the growth of the Internet, the World Wide Web provides access
to a vast array of information through web pages. Also, commercial activity
on the Web has increased to the point where hundreds of new companies are adding
Web pages daily. With this increase in the information sources, a problem of
information overload occurs, in which the users are trying to deal with an excess of
information that is not useful to them as they try to make sensible decisions (Losee,
1989). As a response to this problem, a range of tools to help with retrieving,
searching, and filtering have been developed.
The tool most widely used to alleviate the problem of information overload is the
search engine. The benefits for the users from search engine technology have
decreased as the number of web pages has grown. In addition, the user must first
consider the large number of search tools available and decide which one to access.
Then the user must interact with each one individually because search engines are
typically not personalised to individual users or their prevailing context. Users usually
make a choice on the basis of their personal experience or other people’s
experience. Based on these facts, recommender systems have been developed to
provide recommendations that help individual users identify content of interest by
using the opinions of a community of users and/or the user’s preferences.
system’s recommendations. This is done with a clear explanation from the system,
presented in a way that is in keeping with the consumer’s preferences. A good
recommender system can significantly contribute to achieving the consumer’s
acceptance of the system recommendations.
Objectives:
• Look into and assess the profiling and recommender systems that are
already in use.
• Build a user's profile for a recommender system by observing dynamic
user behaviour. The user profile needs to change to reflect
the user's shifting interests.
• Create a recommender system that uses a variety of computation
methods.
• Use appropriate methods to assess the accuracy of the system's
recommendations.
CHAPTER 2
LITERATURE SURVEY
A unique book recommendation system was proposed by Binge Cui and Xin Chen.
When readers are unable to locate the desired book using the library's bibliographic
retrieval system, they are directed to the recommendation pages. It is a web-based
system for recommending books to a library's patrons. After logging in, a user can
search for books using author names or keywords such as book titles. The bibliographic
retrieval system first looks for books matching these keywords. If the retrieval
system returns no results, the keywords are submitted to the Web Books Retrieval
Module. This module lets the librarian or administrator of the online book
recommendation system search online bookshops by keyword, using accounts created
on sites like Amazon. As a result, when a keyword is passed to it, the web retrieval
module searches these online bookshops as the logged-in user. The user receives the
results from these online booksellers in the form of recommendations. The Statistic
and Analysis Module determines the value of each book based on user
recommendations. The Auto-Order Module then generates a book order
automatically according to this value. The Short Message and Email Notification
Module receives a report from the Book Storage System once the purchased books
have been shelved (fig 2.1). Then, using a message and email server, it informs the
readers who suggested the books that they have been acquired.
calculations and results.
For a book recommendation system, Adli Ihsan Hariadi and Dade Nurjanah
presented a hybrid-based approach that blends attribute-based and user
personality-based methodologies. The MSV-MSL (Most Similar Visited Material to
the Most Similar Learner) method is used in this study, and the authors claim that it
is the best hybrid attribute-based strategy. When forming neighbourhoods, the
personality trait is used to compare users. The hybrid method uses the similarity
scores between a target book and its neighbours, as well as between the active user
and that user's neighbours, to generate recommendation scores for the books rated
by the neighbours; the score of book b for user u is denoted score b. The goal is to
match the most similar visited material to the most similar learner. The method
takes advantage of both the collaborative and the content values, and the hybrid
result is then used as the recommendation: the material that, according to the data,
the most similar learner has visited.
2.2 MOTIVATION
Due to the expansion of the Internet, the World Wide Web now offers access to a
wide variety of information via web sites. Additionally, business activity online has
grown to the point that every day, hundreds of new businesses add new Web pages.
Due to the increase in information sources, a problem known as information
overload arises, where users must cope with an abundance of information that is
unhelpful to them in order to make sound judgments (Losee, 1989). A variety of tools
for accessing, finding, and filtering information have been created as a solution to
this issue. The search engine is the resource that is most frequently utilised to
address the issue of information overload. As the number of web pages has
increased, however, the benefits of search engine technology for users have decreased.
Additionally, the user must choose which search tool to utilise after carefully
weighing the several options. Then, as search engines are often not tailored to
specific individuals or their current environment, the user must interact with each
one separately. Users typically base their decisions on either their own or other
people's experiences. These facts led to the development of recommender systems,
which leverage user feedback from a community of users and/or the user's own
preferences to generate recommendations that assist individual users in identifying
content of interest.
overload is a condition when consumers strive to manage more information but are
unable to make rational decisions (Losee, 1989).
A variety of retrieval, searching, and filtering techniques have been created
to aid with the problem of information overload. The search engine is the most
frequently used instrument to help with the issue of information overload. Although
search engines are efficient in filtering pages, consumers find it challenging and
time-consuming to express their needs in a search query. Due to the exponential
growth of web pages, search engine technology's advantages for users have
lessened with time.
CHAPTER 3
REQUIREMENT ANALYSIS
The project will create and assess a collaborative filtering and content-based
recommender system for a real online bookstore. Machine learning methods are
typically needed for content-based recommendations in order to identify trends in
the products customers like (Middleton, 2003). The experiences of actual users will
be reflected in the content-based technology. Users' profiles will be created so that
their behaviour may be tracked. Additionally, the system will produce
recommendations by comparing the contents of the books in the user's profile with
those that the user hasn't reviewed.
• Monitor : LED.
• Mouse : Logitech.
• Hard Disk : 1 TB
• Language : Python 3
3.2.3 Python:
Python is a high-level, interpreted, interactive and object-oriented scripting
language. Python is designed to be highly readable. It uses English keywords
frequently whereas other languages use punctuation, and it has fewer syntactical
constructions than other languages.
• Python is Interpreted − Python is processed at runtime by the interpreter.
You do not need to compile your program before executing it. This is similar
to PERL and PHP.
• Python is Interactive − You can actually sit at a Python prompt and interact
with the interpreter directly to write your programs.
• Easy-to-read − Python code is more clearly defined and visible to the eyes.
• A broad standard library − The bulk of Python's library is very portable and
cross-platform compatible on UNIX, Windows, and Macintosh.
• Interactive Mode − Python has support for an interactive mode which allows
interactive testing and debugging of snippets of code.
• Portable − Python can run on a wide variety of hardware platforms and has
the same interface on all platforms.
• Scalable − Python provides a better structure and support for large programs
than shell scripting.
Apart from the above-mentioned features, Python has a long list of good features;
a few are listed below −
• It supports functional and structured programming methods as well as OOP.
• It provides very high-level dynamic data types and supports dynamic type
checking.
• It can be easily integrated with C, C++, COM, ActiveX, CORBA, and Java.
is available on the official website of Python https://www.python.org.
• Follow the link for the Windows installer python-XYZ.msi file, where XYZ is the
version you need to install.
• Run the downloaded file. This brings up the Python install wizard, which is
really easy to use. Just accept the default settings, wait until the install is
finished, and you are done.
The Python language has many similarities to Perl, C, and Java. However, there
are some definite differences between the languages.
$ python
>>>
Type the following text at the Python prompt and press Enter −
>>> print("Hello, Python!")
In Python 3, the version used in this project, print is a function and must be called
with parentheses as shown. In Python 2 (for example, version 2.4.3), the statement
form print "Hello, Python!" was also accepted. Either way, this produces the
following result −
Hello, Python!
We assume that you have the Python interpreter set in the PATH variable. Now, try to
run this program as follows −
$ python test.py
Hello, Python!
CHAPTER 4
approaches are performed on textual documents, such as web pages and articles.
The textual document can be easily broken down into individual words, unlike video
and physical resources, which require sophisticated analysis.
Collaborative filtering approach
Collaborative filtering recommendations are based on the opinions of a community
of similar users. The basic idea is that users recommend items to one another.
Collaborative filtering makes this possible by asking the users to rate items, which
allows the system to recommend new items that similar users have rated highly. For
instance, MovieLens is a movie recommender system that uses collaborative
filtering to help people find movies they will like in the huge stream of available
movies. Collaborative filtering works well for multimedia technology such as music
and movies.
Data Set
During the last few decades, with the rise of YouTube, Amazon, Netflix and many
other such web services, recommender systems have taken an ever larger place
in our lives. From e-commerce (suggest to buyers articles that could interest them)
to online advertisement (suggest to users the right contents, matching their
preferences), recommender systems are today unavoidable in our daily online
journeys.
Recommender systems are critical in some industries, as they can generate a
huge amount of income when they are efficient and can also be a way to stand out
significantly from competitors. As proof of the importance of recommender
systems, we can mention that, a few years ago, Netflix organised a challenge (the
“Netflix prize”) where the goal was to produce a recommender system that performs
better than its own algorithm, with a prize of 1 million dollars to win.
removed from the dataset. Moreover, some content-based information is given
(Book-Title, Book-Author, Year-Of-Publication, Publisher), obtained from Amazon
Web Services. Note that in case of several authors, only the first is provided. URLs
linking to cover images are also given, appearing in three different flavours (Image-
URL-S, Image-URL-M, Image-URL-L), i.e., small, medium, large. These URLs point
to the Amazon web site.
4.2.1 Recommender System
Recommender systems intend to provide users with suggestions of items that they
may be interested in, based upon their past preferences, history of purchase, or
demographic information, as well as the environment of possible items. In addition,
a recommender system helps the site adapt itself and provide individual
personalisation for each consumer; this increases the sales for the commercial site.
Different forms for providing recommendations have been developed; they can be
classified into the following forms: attribute-based recommendations, item-to-item
correlation, people-to-people correlation and non-personalised recommendations
(see Konstan et al., 2001, for more detailed descriptions).
approaches are performed on textual documents, such as web pages and articles.
The textual document can be easily broken down into individual words, unlike video
and physical resources, which require sophisticated analysis.
Content-based filtering has some shortcomings in recommending items. A user's
selection is based on the subjective attributes (such as the quality) of the item
(Goldberg et al., 1992); in contrast, content based approaches are based on
objective attributes (such as the description of an item) about the items. Also, some
items the users may be interested in cannot be recommended to them because
content-based methods compare new items with the items previously seen by the
user, while the user's interests may be beyond the scope of the previously seen
items. Finally, multimedia technology such as sound, video or physical items cannot
be analysed automatically for relevant attribute information, due to limitations of
resources (Jennings et al., 2005).
The schematic diagram of the collaborative filtering process is shown in Figure
4.1. As the figure shows, there is a list of users U = {u1, u2, …, um} and a list of
items I = {i1, i2, …, in}. Each user has an associated list of items. The
collaborative filtering algorithm generates a recommendation (fig 4.1): a list of N
items that the active user is most likely to like. The process also outputs a
prediction, which is the predicted rating of item j for the active
user (Sarwar et al., 2001).
The user-based algorithm rests on the assumption that each user belongs to a larger
group of similarly behaving individuals. It uses statistical techniques to find a set of
users with similar interests, known as neighbours, in the entire user-item database, to
generate a list of recommendations for the active user (Middleton, 2003).
Different measures of similarity that are based on neighbourhood algorithms are
used to compute the similarity between the active user and other users in the
database, such as the Pearson correlation coefficient and mean squared differences
algorithms (Breese et al., 1998). Moreover, to predict the rating of an item given by
the active user, the ratings from the most similar users for the item are averaged
and weighted by their similarities to the active user. The Pearson Correlation (fig
4.2) reflects the degree of linear relationship between two variables and ranges from
+1 to -1. A positive correlation means that the two users have very similar tastes,
while a negative correlation indicates that the users have dissimilar tastes (Fasli,
2006). The Pearson Correlation Coefficient method defines the similarity between
two users by:
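As a rough illustration, assuming the standard Pearson definition computed over the items that both users have rated, the similarity and the similarity-weighted prediction described above might be sketched as follows. The function names, the dictionary layout and the sample ratings are illustrative assumptions and are not taken from the project code.

```python
from math import sqrt


def pearson_similarity(ratings_a, ratings_b):
    """Pearson correlation between two users over their co-rated items.

    Each argument maps item -> rating. Returns 0.0 when fewer than two
    items are shared or when either user's shared ratings have no variance.
    """
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0
    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_b = sum(ratings_b[i] for i in common) / len(common)
    num = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    den_a = sqrt(sum((ratings_a[i] - mean_a) ** 2 for i in common))
    den_b = sqrt(sum((ratings_b[i] - mean_b) ** 2 for i in common))
    if den_a == 0 or den_b == 0:
        return 0.0
    return num / (den_a * den_b)


def predict_rating(active, others, item):
    """Similarity-weighted average of the neighbours' ratings for `item`.

    Only neighbours with positive similarity who rated the item contribute.
    Returns None when no such neighbour exists.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for ratings in others:
        if item not in ratings:
            continue
        sim = pearson_similarity(active, ratings)
        if sim > 0:
            weighted_sum += sim * ratings[item]
            weight_total += sim
    return weighted_sum / weight_total if weight_total else None


# Hypothetical ratings on a 1-10 scale.
alice = {"book1": 8, "book2": 6, "book3": 9}
bob = {"book1": 7, "book2": 5, "book3": 8, "book4": 9}
carol = {"book1": 2, "book2": 9, "book3": 3, "book4": 4}
```

In this toy data, Bob's ratings move exactly in step with Alice's, so he is a strong neighbour, while Carol's tastes run the opposite way and she is ignored when predicting Alice's rating for book4.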
ALGORITHM
• Content based filtering mechanism
• Collaborative based filtering algorithm
• Cosine similarity
4.3 ARCHITECTURE / OVERALL DESIGN OF PROPOSED SYSTEM
Fig 4.3 depicts the architecture of the System
Flask Framework:
Flask is a web application framework written in Python. It is developed by Armin
Ronacher, who leads an international group of Python enthusiasts named Pocoo. Flask
is based on the Werkzeug WSGI toolkit and the Jinja2 template engine. Both are Pocoo
projects.
The HTTP protocol is the foundation of data communication on the World Wide Web.
Different methods of data retrieval from a specified URL are defined in this protocol.
Table 4.1 summarizes the different HTTP methods −
No.  Method   Description
1    GET
2    HEAD
3    POST     Used to send HTML form data to the server. Data received by the
              POST method is not cached by the server.
4    PUT
5    DELETE
By default, a Flask route responds to GET requests. However, this
preference can be altered by providing the methods argument to the route() decorator.
In order to demonstrate the use of POST method in URL routing, first let us create
an HTML form and use the POST method to send form data to a URL.
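Save the following script as login.html
<html>
<body>
<!-- Assumed: the form tag and input fields are reconstructed. The action URL is
     an assumption; the field name 'nm' matches the Flask code. -->
<form action="http://localhost:5000/login" method="post">
<p>Enter Name:</p>
<p><input type="text" name="nm" /></p>
<p><input type="submit" value="submit" /></p>
</form>
</body>
</html>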
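from flask import Flask, redirect, request, url_for

app = Flask(__name__)

@app.route('/success/<name>')
def success(name):
    return 'welcome %s' % name  # assumed response body

# Assumed route and view name for the form handler
@app.route('/login', methods=['POST', 'GET'])
def login():
    if request.method == 'POST':
        user = request.form['nm']  # field posted by login.html
    else:
        user = request.args.get('nm')  # query-string value for GET
    return redirect(url_for('success', name=user))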
After the development server starts running, open login.html in the browser, enter
name in the text field and click Submit.
Fig 4.5: Local Host Output Console
Change the method parameter to ‘GET’ in login.html and open it again in the
browser. The data is now sent to the server by the GET method. The value of the ‘nm’
parameter is now obtained by −
user = request.args.get('nm')
Here, args is a dictionary object containing the pairs of form parameter and its
corresponding value. The value corresponding to the ‘nm’ parameter is passed on to
the ‘/success’ URL as before.
4.5 PROJECT MANAGEMENT PLAN
CHAPTER 5
IMPLEMENTATION DETAILS
pip install scikit-learn
5.2 ALGORITHM
Content based Filtering: The algorithm recommends products that are similar to
those the user has already watched or used. In simple words, in this algorithm we try
to find look-alike items. For example, a person likes to watch Sachin Tendulkar shots,
so he may like watching Ricky Ponting shots too because the two videos have similar
tags and similar categories.
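One simple way to make "similar tags" concrete is the Jaccard overlap between tag sets. The report does not state which measure is used for content-based matching, so the measure, the function names and the sample tags below are all assumptions made for illustration.

```python
def jaccard(tags_a, tags_b):
    """Share of tags two items have in common (0.0 = none, 1.0 = identical)."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def most_similar(liked, catalogue):
    """Rank the other items in `catalogue` by tag overlap with `liked`."""
    scores = [
        (title, jaccard(catalogue[liked], tags))
        for title, tags in catalogue.items()
        if title != liked
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical video tags, echoing the cricket example in the text.
catalogue = {
    "Tendulkar shots": {"cricket", "batting", "highlights"},
    "Ponting shots": {"cricket", "batting", "highlights"},
    "Cooking basics": {"food", "tutorial"},
}
ranking = most_similar("Tendulkar shots", catalogue)
```

Here the Ponting video shares every tag with the liked video and is ranked first, while the cooking video shares none and comes last.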
Collaborative based Filtering: Collaborative filtering recommender systems
are based on past interactions between users and target items. In simple words, here
we search for look-alike customers and offer products based on what a customer's
look-alikes have chosen. Let us understand with an example. X and Y are two
similar users; user X has watched movies A, B, and C, and user Y has watched movies
B, C, and D. We will then recommend movie A to user Y and movie D to user
X.
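The X and Y example can be written down in a few lines. This is only a sketch of the idea for two users who have already been judged similar; the function name is hypothetical and the similarity step itself is not shown.

```python
def swap_recommendations(watched_x, watched_y):
    """For two similar users, suggest to each what only the other has seen."""
    for_x = sorted(set(watched_y) - set(watched_x))
    for_y = sorted(set(watched_x) - set(watched_y))
    return for_x, for_y


# User X watched A, B, C; user Y watched B, C, D.
for_x, for_y = swap_recommendations({"A", "B", "C"}, {"B", "C", "D"})
```

With these inputs, for_x holds only D and for_y holds only A, matching the example in the text.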
Hybrid filtering method: It is basically a combination of both the above methods.
It is a more complex model which recommends products based on your history as well
as on users similar to you.
Some organizations use this method, such as Facebook, which shows
news that is important for you and for others in your network; the same approach is
used by LinkedIn too.
Dataset description
We have 3 files in our dataset, which is extracted from book-selling websites.
• Books – The first file contains all the information related to
books, such as author, title, publication year, etc.
• Users – The second file contains registered users' information, such as user id
and location.
• Ratings – The third file records which user has given what
rating to which book.
Based on these three files we can build a powerful collaborative filtering
model. Let's get started.
Loading data
Let us start by importing libraries and loading the datasets. While loading the files
we face some problems:
• The values in the CSV file are separated by semicolons, not by commas.
• Some lines are malformed and cannot be parsed, so pandas raises an error
when it tries to import them.
• The encoding of the files is Latin-1.
So, while loading the data we have to handle these exceptions. After running the
loading code you will get some warnings showing which lines had errors and
were skipped while loading.
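A minimal sketch of such a loading step with pandas is given below. It assumes pandas 1.3 or later (for the on_bad_lines option); the helper name and the file names in the trailing comments are assumptions, since the report does not list them. A tiny in-memory sample stands in for a real file so that the behaviour on a malformed row can be seen.

```python
import io

import pandas as pd


def load_table(source):
    """Read one semicolon-separated, Latin-1 encoded file, skipping bad rows."""
    return pd.read_csv(
        source,
        sep=";",              # fields are separated by semicolons
        encoding="latin-1",   # the files are not UTF-8
        on_bad_lines="warn",  # report and skip malformed lines (pandas >= 1.3)
    )


# In-memory stand-in for a real file; the second data row has an extra field.
sample = io.BytesIO(
    b"ISBN;Book-Title\n"
    b"0001;Caf\xe9 Stories\n"
    b"0002;Broken;extra\n"
    b"0003;Plain Title\n"
)
demo = load_table(sample)

# With the real files (names assumed here), the calls would look like:
# books = load_table("BX-Books.csv")
# users = load_table("BX-Users.csv")
# ratings = load_table("BX-Book-Ratings.csv")
```

The malformed row is reported in a warning and dropped, and the accented title is decoded correctly because the encoding is given explicitly.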
Preprocessing Data: In the books file, we have some extra columns which
are not required for our task, such as image URLs. We will also rename the columns
of each file, since the column names contain spaces and uppercase letters, to
make them easier to use.
The dataset is reliable and can be considered a large dataset. We have data on 271360
books, approximately 278000 registered users on the website, and
about 11 lakh (1.1 million) ratings. Hence we can say that the dataset we have is
good and reliable.
The recommender is built on a book-by-user rating matrix: we will create one matrix
where the columns are users, the index is books and the values are ratings. In other
words, we have to create a pivot table.
If we take all the books and all the users for modelling, it will create
a problem. So we have to decrease the number of users and
books, because we cannot consider a user who has only registered on the website
or has only read one or two books. We cannot rely on such a user to recommend
books to others, because we have to extract knowledge from data. So we will
limit this number: we will keep only users who have rated more than 200 books, and
only those books which have received at least
50 ratings from these users.
The primary goal of EDA is to support the analysis of data prior to making any
conclusions. It may aid in the detection of apparent errors, as well as a deeper
understanding of data patterns, the detection of outliers or anomalous events, and
the discovery of interesting relationships between variables.
After merging the ratings dataset with the books dataset, we group the ratings by
book title and count them.
# Assumed merge step (not shown in the report): join ratings to books on ISBN
ratings_with_name = ratings.merge(books, on='ISBN')
num_rating_df = ratings_with_name.groupby('Book-Title').count()['Book-Rating'].reset_index()
num_rating_df.rename(columns={'Book-Rating': 'num_ratings'}, inplace=True)
num_rating_df
Using the mean function, we calculate the average rating of each book.
avg_rating_df = ratings_with_name.groupby('Book-Title')['Book-Rating'].mean().reset_index()
avg_rating_df.rename(columns={'Book-Rating': 'avg_rating'}, inplace=True)
avg_rating_df
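By keeping only books with at least 250 ratings and sorting by average rating, we can
retrieve the 50 highest-rated books from the dataset.
popular_df = num_rating_df.merge(avg_rating_df, on='Book-Title')
popular_df = popular_df[popular_df['num_ratings'] >= 250].sort_values('avg_rating', ascending=False).head(50)
popular_df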
Collaborative filtering is a technique that can filter out items that a user might like on
the basis of reactions by similar users.
It works by searching a large group of people and finding a smaller set of users with
tastes similar to a particular user. It looks at the items they like and combines them
to create a ranked list of suggestions.
There are many ways to decide which users are similar and combine their choices
to create a list of recommendations. This section shows how to do that with
Python.
To experiment with recommendation algorithms, you’ll need data that contains a set
of items and a set of users who have reacted to some of the items.
While working with such data, you’ll mostly see it in the form of a matrix consisting
of the reactions given by a set of users to some items from a set of items. Each row
would contain the ratings given by a user, and each column would contain the
ratings received by an item.
To build a system that can automatically recommend items to users based on the
preferences of other users, the first step is to find similar users or items. The second
step is to predict the ratings of the items that are not yet rated by a user.
x = ratings_with_name.groupby('User-ID').count()['Book-Rating'] > 200
padhe_likhe_users = x[x].index  # users with more than 200 ratings
filtered_rating = ratings_with_name[ratings_with_name['User-ID'].isin(padhe_likhe_users)]
y = filtered_rating.groupby('Book-Title').count()['Book-Rating'] >= 50
famous_books = y[y].index  # books with at least 50 ratings from those users
final_ratings = filtered_rating[filtered_rating['Book-Title'].isin(famous_books)]
pt = final_ratings.pivot_table(index='Book-Title', columns='User-ID', values='Book-Rating')
pt.fillna(0, inplace=True)  # treat unrated books as 0 so cosine similarity can be computed
Cosine Similarity
Cosine similarity is a measure of similarity between two vectors of an inner product
space. It is measured by the cosine of the angle between the two vectors.
from sklearn.metrics.pairwise import cosine_similarity
similarity_scores = cosine_similarity(pt)
similarity_scores.shape
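To make the definition concrete, the cosine of the angle between two vectors u and v is their dot product divided by the product of their lengths. The small pure-Python function below is a sketch of that formula for a single pair of rating vectors; it is not the scikit-learn implementation, and the sample vectors are hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine of the angle between two equal-length rating vectors.

    Returns 0.0 if either vector is all zeros (no ratings at all).
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical rows of a book-by-user matrix (0 = not rated).
book_a = [8, 0, 6, 0]
book_b = [4, 0, 3, 0]  # same direction as book_a, half the magnitude
book_c = [0, 7, 0, 9]  # rated by a disjoint set of users
```

Books a and b are rated in the same proportions by the same users, so their similarity is 1, while books a and c share no raters and their similarity is 0. The measure depends on the direction of the rating vectors, not on their magnitude.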
import numpy as np

def recommend(book_name):
    # position of the requested book in the pivot table index
    index = np.where(pt.index == book_name)[0][0]
    # five most similar books, skipping the book itself at position 0
    similar_items = sorted(list(enumerate(similarity_scores[index])),
                           key=lambda x: x[1], reverse=True)[1:6]
    data = []
    for i in similar_items:
        item = []
        temp_df = books[books['Book-Title'] == pt.index[i[0]]]
        item.extend(temp_df.drop_duplicates('Book-Title')['Book-Title'].values)
        item.extend(temp_df.drop_duplicates('Book-Title')['Book-Author'].values)
        item.extend(temp_df.drop_duplicates('Book-Title')['Image-URL-M'].values)
        data.append(item)
    return data
Accuracy
One of the approaches to measure the accuracy of your result is the Root Mean
Square Error (RMSE), in which you predict ratings for a test dataset of user-item
pairs whose rating values are already known. The difference between the known
value and the predicted value would be the error. Square all the error values for the
test set, find the average (or mean), and then take the square root of that average
to get the RMSE.
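The steps above translate directly into code. The following is a minimal sketch with made-up rating values; the function name and the sample data are illustrative and do not come from the project's evaluation.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root Mean Square Error between known and predicted ratings."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical test set: known ratings and the ratings a model predicted.
known = [8, 6, 10, 7]
guessed = [7, 6, 8, 9]
error = rmse(known, guessed)
```

For this sample the errors are 1, 0, 2 and -2, the squared errors sum to 9, their mean is 2.25, and the RMSE is 1.5. A lower RMSE means the predicted ratings are closer to the known ones.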
Website Deployment
We use PyCharm Community Edition to develop and deploy the website, in a project
named Book Recommendation System.
Flask provides configuration and conventions, with sensible defaults, to get started.
This section of the documentation explains the different parts of the Flask framework
and how they can be used, customized, and extended. Beyond Flask itself, look for
community-maintained extensions to add even more functionality.
from flask import Flask

def create_app():
    app = Flask(__name__)
    # `hello` is assumed to be a Flask extension instance created elsewhere
    hello.init_app(app)
    return app
CHAPTER 6
Today the World Wide Web provides users with a vast array of information, and
commercial activity on the Web has increased to the point where hundreds of new
companies are adding web pages daily. This has led to the problem of information
overload. Recommender systems have been developed to overcome this problem
by providing recommendations that help individual users identify content of interest
by using the opinions of a community of users and/or the user’s preferences.
The aim of this thesis was to design and evaluate different approaches for producing
personalised recommendations within the book domain. To achieve this goal, the
project first investigated existing recommender systems and profiling techniques.
The next step was to build users’ profiles by monitoring users’ behaviour, and
develop three different approaches for producing recommendations. Finally, an
evaluation of the system recommendations’ accuracy was done, by first conducting
live user experiments and then performing offline analysis to measure the
recommendations’ accuracy using appropriate methods for testing.
The system evaluation results show that the accuracy of the system
recommendations is very good and that a recommender system based on the
combination of content-based and collaborative filtering approaches provides more
accurate recommendations for the book domain.
CHAPTER 7
CONCLUSION
7.1 CONCLUSION
All of our systems (purely content-based, purely collaborative-filtering, and hybrid)
performed quite well. Looking back on the project, one thing that we might have
chosen to do differently in retrospect would have been to spend more time searching
for a dataset of ratings with a higher rating variance per user. Had we been able to
find such a dataset, our implementations of algorithms would have been tested on
data that would have been more representative of what a typical commercial
recommendation system could access in creating its predictions. However, given
the data that was available to us, as well as the results our various approaches
produced, our systems were largely successful, providing insight into how the
different systems we regularly use work and the varying algorithms that make that
possible.
• Handling of sparsity was a major challenge as well since the user interactions
were not present for the majority of the books.
• Since the data consisted of text data, data cleaning was a major challenge in
features like Location etc.
• Decision making on missing value imputations and outlier treatment was quite
challenging as well.
REFERENCES
[1] Ahuja, Rishabh, Arun Solanki, and Anand Nayyar. “Movie recommender system
using K-Means clustering and K-Nearest Neighbor.” In 2019 9th International
Conference on Cloud Computing, Data Science & Engineering (Confluence),
pp. 263-268. IEEE, 2019.
[2] Badriyah, Tessy, Erry Tri Wijayanto, Iwan Syarif, and Prima Kristalina. “A hybrid
recommendation system for E-commerce based on product description and user
profile.” In 2017 Seventh International Conference on Innovative Computing
Technology (INTECH), pp. 95-100. IEEE, 2017.
[3] Chen, Junnan, Courtney Miller, and Gaby G. Dagher. “Product recommendation
system for small online retailers using association rules mining.” In Proceedings of
the 2014 International Conference on Innovative Design and Manufacturing
(ICIDM), pp. 71-77. IEEE, 2014.
[4] Jisha, R. C., Ram Krishnan, and Varun Vikraman. “Mobile applications
recommendation based on user ratings and permissions.” In 2018 International
Conference on Advances in Computing, Communications and Informatics (ICACCI),
pp. 1000-1005. IEEE, 2018.
[5] Keerthana, N. K., Shriram K. Vasudevan, and Nalini Sampath. “An Effective
Approach to Cluster Customers with a Product Recommendation System.” Journal
of Computational and Theoretical Nanoscience Vol. 17, No. 1, pp. 347-352. IEEE,
2020.
[6] Kurmashov, Nursultan, Konstantin Latuta, and Abay Nussipbekov. “Online book
recommendation System.” In 2015 Twelve International Conference on Electronics
Computer and Computation (ICECCO), pp. 1-4. IEEE, 2015.
[8] Pai, Maya L., Suchithra M. S., and Dhanya M. “Analysis of Soil Parameters for
Proper Fertilizer Recommendation to Increase the Productivity of Paddy Field
Cultivation.” Department of Computer Science and IT, School of Arts and
Sciences, Kochi, Amrita Vishwa Vidyapeetham. International Journal of Advanced
Science and Technology Vol. 29, No. 03, pp. 4681-4696. IEEE, 2020.
[10] Mohamed, Marwa Hussien, Mohamed Helmy Khafagy, and Mohamed Hasan
Ibrahim. “Recommender systems challenges and solutions survey.” In 2019
International Conference on Innovative Trends in Computer Engineering (ITCE),
pp. 149-155. IEEE, 2019.
[13] Tewari, Anand Shanker, and Kumari Priyanka. “Book recommendation system
based on collaborative filtering and association rule mining for college students.” In
2014 International Conference on Contemporary Computing and Informatics (IC3I),
pp. 135-138. IEEE, 2014.
APPENDIX
A. SOURCE CODE
Fig A.1 shows the Flask implementation used to deploy the website.
from flask import Flask, render_template, request
import pickle
import numpy as np

# Load the pickled data frames and the similarity matrix.
popular_df = pickle.load(open('popular.pkl', 'rb'))
pt = pickle.load(open('pt.pkl', 'rb'))
books = pickle.load(open('books.pkl', 'rb'))
similarity_scores = pickle.load(open('similarity_scores.pkl', 'rb'))

app = Flask(__name__)


@app.route('/')
def index():
    # Home page: pass the columns of the popular-books table to the template.
    return render_template('index.html',
                           book_name=list(popular_df['Book-Title'].values),
                           author=list(popular_df['Book-Author'].values),
                           image=list(popular_df['Image-URL-M'].values),
                           votes=list(popular_df['num_ratings'].values),
                           rating=list(popular_df['avg_rating'].values)
                           )


@app.route('/recommendation')
def recommendation_ui():
    return render_template('recommendation.html')


@app.route('/recommend_books', methods=['POST'])
def recommend():
    user_input = request.form.get('user_input')
    # Position of the requested title in pt's index
    # (raises IndexError if the title is not present).
    index = np.where(pt.index == user_input)[0][0]
    # Use the similarity row of the requested book (not row 0) and keep the
    # five best matches, skipping the first entry, normally the book itself.
    similar_items = sorted(list(enumerate(similarity_scores[index])),
                           key=lambda x: x[1], reverse=True)[1:6]
    data = []
    for i in similar_items:
        item = []
        temp_df = books[books['Book-Title'] == pt.index[i[0]]]
        item.extend(temp_df.drop_duplicates('Book-Title')['Book-Title'].values)
        item.extend(temp_df.drop_duplicates('Book-Title')['Book-Author'].values)
        item.extend(temp_df.drop_duplicates('Book-Title')['Image-URL-M'].values)
        data.append(item)
    print(data)
    return render_template('recommendation.html', data=data)
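The lookup in recommend() depends on how pt and similarity_scores were built, which the listing does not show. The following is a minimal sketch, assuming that pt is a book-by-user ratings table and that similarity_scores holds the pairwise cosine similarity of its rows. The toy ratings and the names ratings, titles, cosine and top_similar are hypothetical and are not taken from the project code.

```python
import math

# Hypothetical toy data: each row is a book, each column one user's rating
# (0 means the user has not rated the book).
ratings = {
    "Book A": [5, 4, 0, 1],
    "Book B": [4, 5, 0, 0],
    "Book C": [0, 1, 5, 4],
    "Book D": [1, 0, 4, 5],
}
titles = list(ratings)  # plays the role of pt.index in this sketch


def cosine(u, v):
    """Cosine similarity of two equal-length rating vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Square matrix: similarity_scores[i][j] compares book i with book j.
similarity_scores = [
    [cosine(ratings[a], ratings[b]) for b in titles] for a in titles
]


def top_similar(title, k=2):
    """Return the k titles most similar to `title`, excluding the title itself."""
    if title not in titles:
        return []  # unknown title: nothing to recommend
    index = titles.index(title)
    ranked = sorted(enumerate(similarity_scores[index]),
                    key=lambda x: x[1], reverse=True)
    return [titles[i] for i, _ in ranked if i != index][:k]


print(top_similar("Book A"))
```

Returning an empty list for an unknown title is one way to avoid the IndexError that a failed lookup would otherwise raise in the web handler.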
B. SCREENSHOTS
First, the required libraries and datasets are imported (Fig B.1).
In the next step, data pre-processing is carried out to modify the data as required
(Fig B.2).
In Fig B.3, the books dataset is merged with the ratings dataset to find the books
with the highest average rating.
Fig B.4 shows the output of the content-based filtering model, i.e. the 10 books
with the highest average rating.
The collaborative filtering model recommends books to users based on the books
those users have interacted with (Fig B.5).
The webpage deployment of the content-based filtering model is shown in Fig B.7.
The webpage deployment of the collaborative filtering model is shown in Fig B.8.
C. RESEARCH PAPER
The system provides prompt recommendations for any services and goods the customer
requests, “Whatever and Whenever”, thanks to these portable technologies. Consider
the scenario in which a user wishes to choose an e-book or other reading content but
lacks the necessary personal experience. Unfortunately, because so many items or
e-resources are available, such users are unable to find the right ones. A
recommender system is what we propose to avoid this situation. Fig. 1 depicts the
flow of the recommender system.

suggestion of books for online learning is illustrated in Section IV, and the terms
are briefly defined in Section V. The dataset, experimental evaluation, and result
analysis are all described in Section VI. Section VII concludes the paper.

II. LITERATURE REVIEW

The field of recommendation systems has expanded significantly over the past 20
years. Customers are overloaded with information, which is addressed by offering
unique, specialized material and advice. A few modern cutting-edge techniques are
shown in the analysis sections.
address usage scenarios like proposing a new book to a new customer or giving generic
advice based on the dataset. Our system includes each of these applications, giving
users a selection of options for picking the ideal book to read.

The primary shortcoming of the earlier solutions is that they handle only one
application, namely a current customer receiving suggestions from the dataset, and
do not deal with use cases such as recommending a new book to a brand-new customer
or giving generic advice based on the dataset. Each of these applications is included
in our system, providing customers with a variety of options for selecting the
appropriate book to read.

III. METHODOLOGY

The algorithm used in the “Book Recommendation System” initiative aims to assist
users in selecting books that pique their interest and so motivate them to learn
more. In this case, we use cosine similarity, KNN, and Pearson correlation. Looking
for similarities across books is becoming more and more routine. As previously
indicated, the three major use cases for this system are suggestions for current
users, recommendations for new users, and ratings for newly uploaded books. Several
approaches are used to deal with each of these. The main method used in this project
is user-based collaborative filtering: the system predicts what a user will like from
the ratings given to a product by other readers who share the target user's
preferences. Fig 2 depicts the difference between content-based filtering and
collaborative filtering.

dataset contains the user ID, location, and age. The third dataset is the ratings
dataset, which contains the user ID, the ISBN of the book, and the rating the user
gave to that book.

B. Algorithm

1) Pearson correlation: The Pearson correlation coefficient (Pearson's r) is used to
gauge the linear correlation between variables. Its purpose here is to build
recommendations using rating counts.

A graph of the rating distribution is drawn from the ratings and the rating counts;
most individuals have given a rating of 0.

To obtain the counts in descending order, first group the ratings data frame by ISBN,
then take the book rating column. The reasoning is that if more people rate a book,
it is likely to be a well-known book overall. To do this, we created a data frame of
rating counts for all the ISBNs and then combined it with the books dataset using the
ISBN field to obtain the titles of the most popular books. In this way we can find
the top 5 books by number of ratings. For the correlation, both the mean rating and
the rating count were computed: two fields, the rating of the book and the rating
count, are generated in the data frame, and the data is then presented in descending
order. If the rating count alone were used to determine popularity, a book that the
greatest number of people have rated but that does not have a decent rating would
appear popular, and such a book should not be promoted. To make recommendations, the
system must therefore take into account both the average rating and the number of
ratings. For statistical significance, users with fewer than 200 ratings and books
with fewer than 100 ratings are excluded in order to create the best recommendation
system. The user ID and ISBN are then combined using a pivot table; with the user ID
as the index and the books as the column features, the table shows whether each
individual has provided a rating or not. The pivot turns the ratings table into a 2D
matrix in which a missing rating is represented as NaN. The correlation between the
ratings and the rating averages is then computed. Fig 3 depicts the output of the
Pearson algorithm.
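The steps described for the Pearson approach (rating count and mean per book, activity thresholds, pivot, correlation) can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' code. The toy rows, the helper name pearson, and the small thresholds MIN_USER_RATINGS and MIN_BOOK_RATINGS are hypothetical stand-ins for the 200 and 100 cut-offs in the text, and correlating one book's ratings with another's is only one possible reading of the final correlation step.

```python
import math
from collections import defaultdict

# Hypothetical (user_id, isbn, rating) rows standing in for the ratings dataset.
rows = [
    (1, "isbn-a", 8), (1, "isbn-b", 7), (1, "isbn-c", 2),
    (2, "isbn-a", 9), (2, "isbn-b", 8), (2, "isbn-c", 1),
    (3, "isbn-a", 4), (3, "isbn-b", 5), (3, "isbn-c", 9),
    (4, "isbn-a", 6),
]
MIN_USER_RATINGS = 2  # assumption for the toy data; the text uses 200
MIN_BOOK_RATINGS = 3  # assumption for the toy data; the text uses 100

# Rating count and mean per book, sorted by count in descending order.
by_book = defaultdict(list)
for _, isbn, rating in rows:
    by_book[isbn].append(rating)
stats = sorted(
    ((isbn, len(r), sum(r) / len(r)) for isbn, r in by_book.items()),
    key=lambda s: s[1], reverse=True,
)

# Keep only sufficiently active users and sufficiently rated books.
per_user = defaultdict(int)
for user, _, _ in rows:
    per_user[user] += 1
kept = [
    (u, i, r) for u, i, r in rows
    if per_user[u] >= MIN_USER_RATINGS and len(by_book[i]) >= MIN_BOOK_RATINGS
]

# Pivot: pivot[user][isbn] = rating; a missing key plays the role of NaN.
pivot = defaultdict(dict)
for u, i, r in kept:
    pivot[u][i] = r


def pearson(book_x, book_y):
    """Pearson's r between two books over the users who rated both."""
    pairs = [(p[book_x], p[book_y]) for p in pivot.values()
             if book_x in p and book_y in p]
    n = len(pairs)
    if n < 2:
        return float("nan")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


print(stats[0])
print(pearson("isbn-a", "isbn-b"), pearson("isbn-a", "isbn-c"))
```

With pandas, the same steps are usually expressed with groupby, pivot_table and corrwith; the plain-Python form is used here only to keep the sketch self-contained.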
3) Collaborative filtering: Collaborative filtering is one of the methods we use to
predict the book ratings of an already subscribed user by computing the resemblance
among readers. The fundamental concept of collaborative filtering is as follows. Say
we want to make recommendations for user x. To start, we identify many other users
who share user x's likes and dislikes. K Nearest Neighbor is used to accomplish this.
Nearest neighbor is an unsupervised learning algorithm; it is imported from the
sklearn.neighbors module of the sklearn library. When determining which users are
similar, the users' previous ratings of the books are considered.

4) Suggestion of books to new users: A few books should also be recommended to any
new user added to the dataset so that they can rate them. Future recommendations for
that person can be improved using these ratings. The top 10 books, as determined by
the overall average rating of all the books, are presented to each new user added to
this recommendation system. The average rating of each book is determined by dividing
the sum of its ratings by the number of ratings it received. The books are then
arranged in descending order. After the user has rated these books, collaborative
filtering is used to create the user's subsequent suggestions. Fig 5 depicts the
books with the highest ratings and Fig 6 depicts the highest-rated authors.

approach for determining a model's error.

2) MAE: This method differs from the former only in how the error is computed: MAE
estimates the error for continuous variables as the mean of the absolute differences
between the predicted and actual values.

From the ratings pivot table, five readers were chosen at random. For these five
users, ratings of previously reviewed books were predicted and saved. The actual
ratings from the dataset were saved as well. There were 666 ratings in total. These
served as the basis for calculating the Root Mean Squared Error and Mean Absolute
Error values.
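The two error measures can be written down directly. The following is a minimal sketch with hypothetical actual and predicted ratings (not the 666 ratings used in the evaluation); rmse and mae are illustrative helper names.

```python
import math


def rmse(actual, predicted):
    """Root Mean Squared Error: square root of the mean squared difference."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mae(actual, predicted):
    """Mean Absolute Error: mean of the absolute differences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical ratings, for illustration only.
actual = [8, 5, 10, 7]
predicted = [7, 5, 8, 9]

print(rmse(actual, predicted), mae(actual, predicted))
```

Because RMSE squares each error before averaging, it is never smaller than MAE on the same data and penalizes large errors more heavily.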
System.” In 2015 Twelve International Conference on Electronics Computer and
Computation (ICECCO), pp. 1-4. IEEE, 2015.

[2] Tewari, Anand Shanker, and Kumari Priyanka. “Book recommendation system based
on collaborative filtering and association rule mining for college students.” In 2014
International Conference on Contemporary Computing and Informatics (IC3I),
pp. 135-138. IEEE, 2014.

[3] Mohamed, Marwa Hussien, Mohamed Helmy Khafagy, and Mohamed Hasan Ibrahim.
“Recommender systems challenges and solutions survey.” In 2019 International
Conference on Innovative Trends in Computer Engineering (ITCE), pp. 149-155.
IEEE, 2019.

[4] Badriyah, Tessy, Erry Tri Wijayanto, Iwan Syarif, and Prima Kristalina. “A hybrid
recommendation system for E-commerce based on product description and user
profile.” In 2017 Seventh International Conference on Innovative Computing
Technology (INTECH), pp. 95-100. IEEE, 2017.

[5] Ahuja, Rishabh, Arun Solanki, and Anand Nayyar. “Movie recommender system
using K-Means clustering and K-Nearest Neighbor.” In 2019 9th International
Conference on Cloud Computing, Data Science & Engineering (Confluence),
pp. 263-268. IEEE, 2019.

Recommendation to Increase the Productivity of Paddy Field Cultivation.” Department
of Computer Science and IT, School of Arts and Sciences, Kochi, Amrita Vishwa
Vidyapeetham. International Journal of Advanced Science and Technology Vol. 29,
No. 03, pp. 4681-4696. IEEE, 2020.

[11] Jisha, R. C., Ram Krishnan, and Varun Vikraman. “Mobile applications
recommendation based on user ratings and permissions.” In 2018 International
Conference on Advances in Computing, Communications and Informatics (ICACCI),
pp. 1000-1005. IEEE, 2018.

[12] Keerthana, N. K., Shriram K. Vasudevan, and Nalini Sampath. “An Effective
Approach to Cluster Customers with a Product Recommendation System.” Journal of
Computational and Theoretical Nanoscience Vol. 17, No. 1, pp. 347-352. IEEE, 2020.

[13] P. V. Devika, K. Jothisree, P. V. Rahul, S. Arjun, and Jayasree Narayan. “Book
Recommendation System.” In 2021 12th International Conference on Computing
Communication and Networking Technologies (ICCCNT), 2021.