
Bda Mini Project Part2

This document outlines a project focused on developing a collaborative filtering-based book recommendation system that utilizes both memory-based and model-based approaches. It discusses the motivation, problem statement, objectives, and proposed methodology, including data exploration and machine learning techniques for generating recommendations. The project aims to enhance user experience by providing personalized book suggestions based on user preferences and ratings.

Uploaded by

vishal jadhav

Index

TABLE OF CONTENTS

Abstract
List of Figures
List of Tables
Chapter 1 Introduction
Chapter 2 Problem Statement
Chapter 3 Literature Review
Chapter 4 Objectives and Scope
Chapter 5 Proposed Methodology
Chapter 6 System Architecture
Chapter 7 Tools and Dataset Required
Chapter 8 Implementation Screenshot
Chapter 9 Conclusion
References

Abstract

The data available online helps users find information about anything that interests them. But
because the data is huge and complex, it is difficult to extract useful information from it.
Recommender systems are effective software techniques for overcoming this problem. Based on the
available information about users and items, these techniques provide recommendations to users
in their areas of interest. Recommender systems have wide applications, such as suggesting
items to online shoppers, recommending articles or books for online reading, and recommending
movies, music, or news. In this project, a collaborative filtering based book recommendation
platform is proposed that uses both memory based and model based approaches.

List of Figures

Sr. No. Name of Figure

1 System Architecture

List of Tables

Sr. No. Name of Table

1 Tools

Chapter 1

Introduction
1.1 Motivation:
A great deal of research has been done in both business and academia on developing new
approaches for service recommender systems. Many firms capture large-scale data
about their customers, providers, and operations. The growth in the number of
consumers, services, and other online data places service recommender systems in a "Big
Data" setting, which poses crucial challenges for them. These challenges also apply to
most existing service recommender systems, such as hotel reservation systems and
restaurant recommendation systems.

1.1.1 Need of the problem


A book recommendation system is a type of recommendation system
that recommends similar books to readers based on their interests.
Book recommendation systems are used by online services that provide
e-books, such as Google Play Books, Open Library, and Goodreads.

1.2 Scope of the project:


Given more information regarding the books dataset, namely features like Genre,
Description etc., we could implement a content-filtering based recommendation system
and compare the results with the existing collaborative-filtering based system.

1.3 Aim:
This project aims to build and optimize a book recommendation system based on
collaborative filtering, and tackles an example of both the memory based and the
model based approach (using KNNWithMeans and Singular Value Decomposition, respectively).
Recommendation systems are one of the largest application areas of Big Data
Analytics. They enable tailoring personalized content for users, thereby generating
revenue for businesses.

Chapter 2
Problem Statement

2.1 Problem statement


Books are recommended by a clustering model, which we train and build using various
features such as users' ratings, book descriptions, and book titles. The system groups
users into clusters so that data points within a cluster are similar to each other and
dissimilar to data points in other clusters. The system will also find the average rating
for each cluster and the top-rated books among the users of each cluster. The books
shortlisted by the system will be used to train the model in the future. The prediction
model needs to be trained so as to produce better results.
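
The per-cluster average rating and top-rated books described above can be sketched with a pandas group-by. This is only an illustration under stated assumptions: the `cluster` column, the toy rows, and the helper `top_books` are hypothetical, and the cluster labels are taken as already assigned by some clustering step.

```python
import pandas as pd

# Toy ratings with a precomputed, hypothetical 'cluster' label per rating's user
toy = pd.DataFrame({
    "cluster": [0, 0, 0, 0, 1, 1, 1],
    "Book-Title": ["A", "A", "B", "C", "A", "B", "B"],
    "Book-Rating": [9, 7, 4, 10, 2, 8, 6],
})

# Average rating given by the users of each cluster
cluster_mean = toy.groupby("cluster")["Book-Rating"].mean()

def top_books(df, cluster, n):
    """Titles in `cluster` ranked by their mean rating, highest first."""
    in_cluster = df[df["cluster"] == cluster]
    means = in_cluster.groupby("Book-Title")["Book-Rating"].mean()
    return list(means.sort_values(ascending=False).head(n).index)

top_cluster_0 = top_books(toy, cluster=0, n=2)
top_cluster_1 = top_books(toy, cluster=1, n=1)
```

Ranking by mean rating alone favors books with very few ratings, so a minimum rating count per book would be a sensible extra filter in practice.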

2.2 Objectives
The objective of book recommender systems is to provide recommendations based on
recorded information on the users' preferences. These systems use information filtering
techniques to process information and provide the user with potentially more relevant
items. The goal is to recommend relevant books to users based on popularity and user interests.

2.3 Specifications of the system


The recommendation system is built with collaborative filtering and uses centered cosine
similarity, cosine similarity, and k nearest neighbors (KNN). Producing a book recommendation
for a student involves the following steps in the program code. First, all the book borrowing
data is placed into a matrix, stored as an array, on which the algorithm operates. Second, the
rankings are normalized: when recommendations are based on borrowing data, a book's count keeps
growing each time the same student borrows it, which makes the ranking scale erratic, so the
ranks are normalized using centered cosine similarity. Third, once normalization is complete,
the cosine similarity algorithm is applied to obtain the book rating results. Finally, the
results of the previous calculation are sorted from the closest using k nearest neighbors.
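
The steps above (build a matrix, center each row, apply cosine similarity, sort by closeness) can be sketched with NumPy as follows. This is a minimal sketch, not the project's code: the function names and the toy matrix are hypothetical, and missing entries are assumed to be marked with `np.nan`.

```python
import numpy as np

def centered_cosine_similarity(ratings):
    """Pairwise centered cosine similarity between rows of a user-item matrix.

    Missing entries are np.nan. Each row is centered on the mean of its
    observed values; missing entries then count as 0.
    """
    row_means = np.nanmean(ratings, axis=1, keepdims=True)
    centered = np.where(np.isnan(ratings), 0.0, ratings - row_means)
    norms = np.linalg.norm(centered, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for constant rows
    unit = centered / norms
    return unit @ unit.T

def nearest_neighbors(similarity, index, k):
    """Indices of the k rows most similar to row `index`, excluding itself."""
    order = np.argsort(-similarity[index])
    return [int(i) for i in order if i != index][:k]

# Toy data (hypothetical): 4 students x 4 books
toy = np.array([
    [5.0, 4.0, np.nan, 1.0],
    [4.0, 5.0, 1.0, np.nan],
    [1.0, np.nan, 5.0, 4.0],
    [np.nan, 1.0, 4.0, 5.0],
])
sim = centered_cosine_similarity(toy)
neighbors_of_first = nearest_neighbors(sim, index=0, k=1)
```

Centering before taking the cosine means a student who scores everything high and one who scores everything low can still be recognized as having the same relative preferences.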

Chapter 3

Literature Review

The reviewed work proposes a simple, comprehensible system for book recommendations that helps
readers find the right book. In recent years, data analysis challenges have centered on
service recommendation systems. For consumers, network resources are thoroughly
interconnected and developing quickly. The proposed method works on training,
feedback, management, reporting, configuration, and using these to supply useful information to
the user in order to assist decision-making and item recommendation.
Book recommendation systems have developed rapidly because of internet
technology and library modernization, which give librarians a new way to
gather readers' demands. However, existing recommendation systems cannot provide enough
information for readers to decide whether or not to recommend a book, and they do not analyze
the recommendation information. Some systems also lack a feedback mechanism for readers,
which may dampen their enthusiasm. To solve these problems, the authors designed a novel
book recommendation system.
Readers are redirected to the recommendation pages when they cannot find the required book
through the library catalogue retrieval system. The recommendation pages contain all the
essential and extended book information for readers to consult. Readers can recommend a book on
these pages, and the recommendation information is analyzed by the recommendation system to
support well-founded purchasing decisions. The authors proposed two formulas to compute the book
value and the number of copies, respectively, based on the recommendation information.
Application of the recommendation system shows that both the utilization of recommended
books and readers' satisfaction increased greatly.

Chapter 4

Objectives and scope

Objectives:

The objective of book recommender systems is to provide recommendations based on recorded
information on the users' preferences. These systems use information filtering techniques to
process information and provide the user with potentially more relevant items. The goal is to
recommend relevant books to users based on popularity and user interests.

Scope:

Given more information regarding the books dataset, namely features like Genre, Description
etc., we could implement a content-filtering based recommendation system and compare the
results with the existing collaborative-filtering based system.

We would like to explore various approaches for clustering users based on age,
location, etc., and then implement voting algorithms to recommend items to each user depending
on the cluster to which that user belongs.

Chapter 5

Proposed methodology

Data exploration & cleaning

This project will use the 'Book-Crossing dataset' collected by Cai-Nicolas Ziegler.

The dataset consists of 3 different tables:

• 'BX-Users': 278,858 records

• 'BX-Books': 271,379 records

• 'BX-Book-Ratings' : 1,149,780 records

Exploratory data analysis

Ratings are of two types: implicit and explicit. An implicit rating is inferred from
tracking user interaction with an item, such as a user clicking on it, and is recorded as '0'.
An explicit rating is one a user deliberately assigns to an item, on a scale of '1-10'.
• Majority of ratings are implicit i.e., rating '0'

• Rating of '8' has the highest rating count among explicit ratings '1-10'
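
The implicit/explicit split described above can be checked with a few lines of pandas. This is a sketch on a toy table whose values are made up; only the column names ('User-ID', 'ISBN', 'Book-Rating') follow the ratings table used later in the program.

```python
import pandas as pd

# Toy stand-in for the 'BX-Book-Ratings' table (values are hypothetical)
toy_ratings = pd.DataFrame({
    "User-ID": [1, 1, 2, 2, 3, 3],
    "ISBN": ["a", "b", "a", "c", "b", "c"],
    "Book-Rating": [0, 8, 0, 8, 0, 5],
})

# Rating 0 marks an implicit interaction; 1-10 are explicit ratings
implicit = toy_ratings[toy_ratings["Book-Rating"] == 0]
explicit = toy_ratings[toy_ratings["Book-Rating"].between(1, 10)]

implicit_share = len(implicit) / len(toy_ratings)
# Most frequent value among the explicit ratings
most_common_explicit = int(explicit["Book-Rating"].value_counts().idxmax())
```

Because a '0' does not mean "worst possible score", treating implicit rows as ordinary ratings would pull predicted ratings down, which is one reason to inspect the two groups separately.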

Machine Learning – Model Selection
• After data cleaning, there were 78,782 records left

• 686 unique users who have rated > 250 books each

• 1913 unique book titles that have received > 50 ratings each

• Use Python’s surprise library algorithms for building recommendations
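
The activity thresholds in the list above (users with more than 250 ratings, books with more than 50 ratings) amount to two count-based filters. The sketch below shows one way to express them; the helper name `filter_by_activity` and the toy rows are hypothetical, and counting book ratings after the user filter is an assumption, since the order is not stated here.

```python
import pandas as pd

def filter_by_activity(ratings, min_user_ratings, min_book_ratings):
    """Keep users with more than `min_user_ratings` ratings, then keep
    books with more than `min_book_ratings` ratings among what remains."""
    user_counts = ratings["User-ID"].value_counts()
    active_users = user_counts[user_counts > min_user_ratings].index
    ratings = ratings[ratings["User-ID"].isin(active_users)]

    book_counts = ratings["ISBN"].value_counts()
    popular_books = book_counts[book_counts > min_book_ratings].index
    return ratings[ratings["ISBN"].isin(popular_books)]

# Toy data with small thresholds; the project uses 250 and 50
toy = pd.DataFrame({
    "User-ID": [1, 1, 1, 2, 2, 3],
    "ISBN": ["a", "b", "c", "a", "b", "a"],
    "Book-Rating": [8, 0, 5, 7, 9, 0],
})
filtered = filter_by_activity(toy, min_user_ratings=1, min_book_ratings=1)
```

Filtering this way trades coverage for density: the remaining user-item matrix is far less sparse, which tends to help neighborhood-based models.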

5 fold cross validation model performance on training set (with default model
parameters):

Machine Learning – Hyperparameter tuning with GridSearchCV


KNNWithMeans Model Parameters:

• Name: distance measure, e.g., MSD or cosine
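
To make the two distance measures concrete, the sketch below computes both on the items two rating vectors have in common. It follows the usual definitions as I understand them from the Surprise documentation (MSD similarity as 1 / (MSD + 1)); the function names and toy ratings are hypothetical, and this is not the library's own implementation.

```python
import math

def common_ratings(ratings_a, ratings_b):
    """Rating pairs for the keys that both dictionaries contain."""
    shared = sorted(set(ratings_a) & set(ratings_b))
    return [(ratings_a[key], ratings_b[key]) for key in shared]

def msd_similarity(ratings_a, ratings_b):
    """1 / (MSD + 1), where MSD is the mean squared difference over common keys."""
    pairs = common_ratings(ratings_a, ratings_b)
    if not pairs:
        return 0.0
    msd = sum((x - y) ** 2 for x, y in pairs) / len(pairs)
    return 1.0 / (msd + 1.0)

def cosine_similarity(ratings_a, ratings_b):
    """Cosine of the angle between the two vectors of common ratings."""
    pairs = common_ratings(ratings_a, ratings_b)
    dot = sum(x * y for x, y in pairs)
    norm_a = math.sqrt(sum(x * x for x, _ in pairs))
    norm_b = math.sqrt(sum(y * y for _, y in pairs))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Two books rated by overlapping users (toy values)
book_x = {"u1": 8, "u2": 6, "u3": 10}
book_y = {"u1": 4, "u2": 3, "u3": 5, "u4": 9}
msd_xy = msd_similarity(book_x, book_y)
cos_xy = cosine_similarity(book_x, book_y)
```

In this toy case the ratings of `book_y` are exactly half those of `book_x`, so cosine reports a perfect match while MSD reports a low similarity; the choice between them decides whether rating scale matters.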

Chapter 6

System Architecture
The datasets were pre-processed to make them suitable for developing the recommendation system. Feature
extraction is performed, in which truncated SVD is used to reduce the number of features in the dataset,
followed by data splitting, in which the data is divided into training and testing sets in an 80:20 ratio.
A content based filtering system is developed that takes the book description as input, and a collaborative
filtering system is developed by building a model with the K-Means algorithm. The model is then tested
on the test data.
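
The feature reduction and 80:20 split described above can be sketched with NumPy alone. This is an illustration under assumptions: the feature matrix is random stand-in data, the helper names are hypothetical, and a full SVD truncated to the leading components is used in place of a dedicated truncated-SVD routine.

```python
import numpy as np

def truncated_svd(matrix, n_components):
    """Project the rows of `matrix` onto its top `n_components` singular directions."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

def split_80_20(n_rows, seed=0):
    """Shuffle the row indices and split them 80:20 into train and test."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_rows)
    cut = int(n_rows * 0.8)
    return order[:cut], order[cut:]

rng = np.random.default_rng(42)
features = rng.normal(size=(10, 6))  # hypothetical: 10 books x 6 features
reduced = truncated_svd(features, n_components=2)
train_idx, test_idx = split_80_20(len(reduced))
```

For a large sparse matrix, computing the full decomposition first is wasteful; a routine that computes only the leading components would be the practical choice.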

Fig. System Architecture

Chapter 7
Tools and Dataset Required

Tools

Software            Google Colaboratory
Browser             Chrome
Operating System    Windows 10
Backend             Python

Dataset

Chapter 8

Implementation Screenshot

Program:

!pip install surprise

import warnings

import numpy as np
import pandas as pd
import plotly.offline as py
import plotly.graph_objs as go
import plotly.io as pio

from surprise import Reader, Dataset
from surprise import KNNBasic, KNNWithMeans, KNNWithZScore, KNNBaseline, SVD
from surprise import accuracy
# train_test_split here is the surprise version, not the scikit-learn one
from surprise.model_selection import train_test_split, cross_validate, GridSearchCV

pio.renderers.default = "png"  # render Plotly figures as static PNG images
warnings.filterwarnings("ignore")

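def loaddata(filename):
    # The Book-Crossing CSV files are ';'-separated and Latin-1 encoded.
    # on_bad_lines='skip' (pandas >= 1.3) silently drops malformed rows; it
    # replaces the older error_bad_lines/warn_bad_lines arguments.
    df = pd.read_csv(f'{filename}.csv', sep=';', on_bad_lines='skip', encoding='latin-1')
    return df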

book = loaddata("BX-Books")

user = loaddata("BX-Users")

rating = loaddata("BX-Book-Ratings")

rating.shape

rating.head(3)

rating.info()

print(f'Duplicate entries: {rating.duplicated().sum()}')

# keep only users with more than 250 ratings
rating = rating[rating['User-ID'].isin(rating_users[rating_users['Rating']>250]['index'])]

# keep only books with more than 50 ratings
rating = rating[rating['ISBN'].isin(rating_books[rating_books['Rating']>50]['index'])]

rating

rating = rating.merge(book, on="ISBN")[['User-ID','Book-Title','Book-Rating']] # merging with the book dataframe

rating

print(f'Duplicate entries: {rating.duplicated().sum()}')

rating.drop_duplicates(inplace=True)

rating

list_of_distinct_users = list(rating['User-ID'].unique())

reader = Reader(rating_scale=(0, 10))

data = Dataset.load_from_df(rating[['User-ID','Book-Title','Book-Rating']], reader)

raw_ratings = data.raw_ratings

import random

random.shuffle(raw_ratings) # shuffle dataset

threshold = int(len(raw_ratings)*0.8)

train_raw_ratings = raw_ratings[:threshold] # 80% of data is trainset

test_raw_ratings = raw_ratings[threshold:] # 20% of data is testset

data.raw_ratings = train_raw_ratings # data is now the trainset

trainset = data.build_full_trainset()

testset = data.construct_testset(test_raw_ratings)

models=[KNNBasic(),KNNWithMeans(),KNNWithZScore(),KNNBaseline(),SVD()]

results = {}

for model in models:

                recommendations[index] += all_item_weighted_rating[index]
            else:
                recommendations[index] = all_item_weighted_rating[index]

    # normalize each accumulated score by the item's weight and the number of liked items
    for index in range(len(all_item_indices)):
        if all_item_weights[index] != 0:
            recommendations[index] = recommendations[index] / \
                (all_item_weights[index] * like_recommend)

    # convert the dictionary to (item, score) records sorted by score, highest first
    temp_df = pd.Series(recommendations).reset_index().sort_values(by=0, ascending=False)
    recommendations = list(temp_df.to_records(index=False))

    final_recommendations = []
    count = 0

    for item, score in recommendations:
        flag = True
        for userItem, userRating in trainset.ur[userID]:
            if item == userItem:
                flag = False  # item was already rated by the user, so skip it
                break
        if flag:
            # trainset stores items by inner id; convert to raw id & append
            final_recommendations.append(trainset.to_raw_iid(item))
            count += 1
        if count >= get_recommend:  # stop once 'get_recommend' recommendations are collected
            break

    return final_recommendations

recommendationsKNN = generate_recommendationsKNN(userID=13552, like_recommend=5, get_recommend=10)

recommendationsKNN

def generate_recommendationsSVD(userID=13552, get_recommend=10):
    model = SVD(n_factors=50, n_epochs=10, lr_all=0.005, reg_all=0.2)
    model.fit(trainset)

    # anti-testset: every (user, item) pair that has no rating in the trainset
    testset = trainset.build_anti_testset()
    predictions = model.test(testset)
    predictions_df = pd.DataFrame(predictions)

    # keep the rows for this user with the highest estimated ratings
    predictions_userID = predictions_df[predictions_df['uid'] == userID].\
        sort_values(by="est", ascending=False).head(get_recommend)

    recommendations = list(predictions_userID['iid'])
    return recommendations

recommendationsSVD = generate_recommendationsSVD(userID=13552, get_recommend=10)

recommendationsSVD

Output:

Chapter 9
Conclusion
We have successfully implemented both a memory based and a model based collaborative
filtering approach to make book recommendations in this project. In cases involving a new user
or a new item, where little is known about rating preferences, collaborative filtering may not be
the method of choice for generating recommendations; content based filtering methods may
be more appropriate. A book recommendation system is a type of recommendation system
that recommends similar books to readers based on their interests. Book recommendation
systems are used by online services that provide e-books, such as Google Play Books,
Open Library, and Goodreads. In industry, a hybrid approach that combines multiple
techniques is often taken for building real-time recommendations.

References

[1] B. Cui and X. Chen, "An Online Book Recommendation System Based on Web Service," 2009 Sixth International Conference on Fuzzy Systems and Knowledge Discovery, 2009, pp. 520-524, doi: 10.1109/FSKD.2009.328.

[2] A. S. Tewari, A. Kumar and A. G. Barman, "Book recommendation system based on combine features of content based filtering, collaborative filtering and association rule mining," 2014 IEEE International Advance Computing Conference (IACC), 2014, pp. 500-503, doi: 10.1109/IAdCC.2014.6779375.

[3] S. S. Sohail, J. Siddiqui and R. Ali, "Book recommendation system using opinion mining technique," 2013 International Conference on Advances in Computing, Communications and Informatics (ICACCI), 2013, pp. 1609-1614, doi: 10.1109/ICACCI.2013.6637421.

[4] P. Jomsri, "Book recommendation system for digital library based on user profiles by using association rule," Fourth edition of the International Conference on the Innovative Computing Technology (INTECH 2014), 2014, pp. 130-134, doi: 10.1109/INTECH.2014.6927766.

[5] S. Kanetkar, A. Nayak, S. Swamy and G. Bhatia, "Web-based personalized hybrid book recommendation system," 2014 International Conference on Advances in Engineering & Technology Research (ICAETR - 2014), 2014, pp. 1-5, doi: 10.1109/ICAETR.2014.7012952.

[6] N. Kurmashov, K. Latuta and A. Nussipbekov, "Online book recommendation system," 2015 Twelve International Conference on Electronics Computer and Computation (ICECCO), 2015, pp. 1-4, doi: 10.1109/ICECCO.2015.7416895.
