CHAPTER 1
INTRODUCTION
1.1 MOTIVATION
Present marketing trends are being driven by the retail sector's rapid structural development and the expansion of online businesses. Sales forecasting is now a crucial task for many businesses, and store management in particular depends on it. Sales prediction is used to forecast product sales at the numerous stores and outlets of major retail corporations across different cities. Store sales forecasting can be divided into two categories: (i) projecting sales from currently open stores for distribution, setting target sales, determining their feasibility, and managing finances; and (ii) forecasting sales from potential new stores to support site selection. Predictive analytics is used to estimate future probabilities and trends. Retailers are attempting to improve their product offerings, pricing strategies, and service standards through predictive analytics in order to establish and strengthen a sustainable competitive advantage. Intelligent forecasting can be quite important in sales management: it leverages information from retail sales publicity to estimate the most effective advertising channels and thus predict the development of sales. Despite the complexity of the forecasting process, there is a simple method for assessing its accuracy. Machine learning can be used to find out which factors affect sales and to predict how many sales there will be in the near future.
The problem statement for advertising and sales analysis using Python and machine
learning involves developing a system to effectively analyze advertising and sales data to
improve marketing strategies and increase revenue. The system will utilize the Python programming language and machine learning techniques to analyze large datasets, identify patterns, and predict future sales trends. This will involve cleaning and preprocessing the data,
performing exploratory data analysis, developing predictive models, and evaluating the
accuracy and effectiveness of these models. The system will also provide visualizations and
insights to aid in decision-making. The goal of this project is to provide a comprehensive and
automated solution for businesses to make data-driven decisions and optimize their advertising
and sales strategies.

In this project, we developed a simple linear regression model that predicts how the sales of a specific product (in this case, a Samsung television whose data was scraped from the internet) are impacted by its advertising. The dataset contains advertising features and a sales label, and the model must forecast a continuous sales output. We also examined whether the mean absolute error, mean squared error, and root mean squared error metrics were acceptably low. To determine whether the test residuals support the linear regression model, we included features for television, radio, and newspaper advertisements. Additionally, we used the Gradio Python package, a GUI-based framework that accepts these three inputs and produces a sales prediction; the interface can also be hosted in the cloud, making it accessible from anywhere in the world. Finally, we used a variety of Python visualization tools to analyze the data and determine which parameter is most useful for future projection.
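To make this pipeline concrete, the following is a minimal sketch of such a regression workflow. The file name Advertising.csv and the column names TV, Radio, Newspaper, and Sales are illustrative assumptions, not the project's actual dataset:

# Minimal sketch of the regression pipeline described above.
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, mean_squared_error

df = pd.read_csv("Advertising.csv")           # advertising spend + sales (assumed file)
X = df[["TV", "Radio", "Newspaper"]]          # features: ad spend per channel
y = df["Sales"]                               # label: sales (continuous)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

model = LinearRegression().fit(X_train, y_train)
pred = model.predict(X_test)

print("MAE :", mean_absolute_error(y_test, pred))
print("MSE :", mean_squared_error(y_test, pred))
print("RMSE:", np.sqrt(mean_squared_error(y_test, pred)))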
CHAPTER 2
LITERATURE SURVEY
Description: This paper examines the use of machine learning algorithms to predict
advertising success. The authors use Python to preprocess the data and apply algorithms such
as Logistic Regression, Random Forest, and Gradient Boosting. The study finds that machine
learning can accurately predict advertising success and that Random Forest outperforms other
algorithms in this context.
Title: "Predictive Modeling of Sales and Advertising Data with Machine Learning"
Description: This paper explores the use of machine learning algorithms to predict sales and
advertising outcomes. The authors use Python to preprocess the data and apply algorithms such
as Gradient Boosting, Artificial Neural Networks, and Random Forest. The study finds that
machine learning can improve the accuracy of sales and advertising predictions compared to
traditional statistical models.
Title: "Sales Forecasting using Machine Learning: A Case Study for a Large Retailer in Brazil"
Description: This paper explores the use of machine learning algorithms in predicting sales
for a large retailer in Brazil. The authors use Python to preprocess the data and apply machine
learning algorithms such as Random Forest, Gradient Boosting, and Artificial Neural Networks
to forecast sales. The study finds that machine learning can improve sales forecasting accuracy
compared to traditional statistical models.
Title: "A Comparative Study of Machine Learning Algorithms for Sales Forecasting in Retail Industry"
Description: This paper compares the performance of different machine learning algorithms
for sales forecasting in the retail industry. The authors use Python to preprocess the data and
apply algorithms such as Random Forest, Support Vector Regression, and Artificial Neural
Networks. The study finds that Random Forest outperforms other algorithms in terms of
accuracy.
CHAPTER 3
EXISTING SYSTEM
There are several existing systems that use Python and machine learning for advertisement and sales analysis. Some examples include:
One such system is a Python-based dashboard that helps marketers analyze the performance of their AdWords campaigns. It uses machine learning algorithms to provide insights into which keywords and ads are driving the most clicks and conversions.
Another is a machine-learning model that predicts future sales based on historical data. It can be used to help businesses make decisions about inventory, staffing, and pricing.
A third is a machine learning technique that groups customers based on their buying behavior and demographic information. It can be used to identify the most profitable customer segments and tailor marketing campaigns to them.
There are several Python libraries that can be used to analyze social media data,
including sentiment analysis, topic modeling, and network analysis. This can be useful for
understanding how customers are talking about a brand and identifying opportunities for
engagement.
Another existing approach is a product recommendation model that takes into account customer data such as age, gender, location, and previous
transaction history. The model could then use machine learning algorithms to analyze this data
and recommend products that are likely to be of interest to the customer. This approach can
help improve customer engagement, increase sales, and ultimately drive revenue growth for
the business.
Disadvantages of the existing system:
• Takes more time: Finding predictions manually requires an increasing amount of time when no computer operation is available.
• Absence of EDA: Without exploratory data analysis, we are unable to analyze the data and cannot determine which data points should receive high priority.
CHAPTER 4
PROPOSED SYSTEM
4.1 FUNCTIONAL REQUIREMENTS
These are the features that the system must fulfill to meet the needs of the end users. Each of these functions must be implemented in the system, much as a contract must be executed. They specify or describe the inputs to be provided to the system, the actions to be performed, and the expected results. Unlike non-functional requirements, they define the specific behavior contained in the finished product.
• Data Collection
• Testing Values
• Removing Duplicate Rows
• Data Cleaning
• Data Scaling
• Finding Values
• Data Visualization
• Learning Models
• Deployment
4.2 NON-FUNCTIONAL REQUIREMENTS
Basically, these are the quality requirements that the system must meet in accordance with the project's commitments. Depending on the project, these standards may differ in importance or be applied to varying degrees. They are also called quality attributes.
• Portability
• Security
• Manageability
• Reliability
• Scalability
• Performance
• Reusability
• Flexibility
• Personalization: Machine learning algorithms can analyze consumer data to identify
individual preferences and behaviors. This can enable businesses to personalize their
advertising and sales strategies to better target specific audiences and increase customer
engagement.
• Scalability: Machine learning algorithms can handle large amounts of data and can be
easily scaled up or down depending on the size of the business. This can help businesses
to grow and expand their operations without having to invest in additional resources.
Overall, a proposed system that uses machine learning for advertising and sales analysis
can provide several benefits to businesses, including improved accuracy, cost savings, real-
time insights, personalization, and scalability.
CHAPTER 5
SYSTEM DESIGN
System Design involves creating a software application that uses advanced data
analysis techniques to identify patterns and relationships in sales and advertising data. This
involves building machine learning models and using them to make predictions and
recommendations to optimize advertising and sales strategies.
• Low-Level Design (LLD)
Unified Modeling Language (UML) is a visual modeling language used to design and
document software systems. It provides a standardized way of representing software
components, their relationships, and their behavior. UML diagrams can be used to describe the
structure of the system, the behavior of the system, and the interactions between the system
and its environment.
UML is a powerful tool for software designers and developers because it provides a way to
communicate complex design concepts in a clear and concise manner. By using UML
diagrams, designers can visualize the structure and behavior of a system and communicate that
design to other developers or stakeholders. UML also provides a standardized way of
representing software systems, making it easier to collaborate with others and maintain
consistency across the development process. Overall, UML is an essential tool for designing
and documenting software systems.
• UML can help software developers to better understand and communicate the design
and architecture of a software system.
• UML diagrams can be used to document and visualize the various components,
relationships, and interactions within a software system.
• UML can help to identify potential design flaws and inconsistencies before
development begins, saving time and resources in the long run.
• However, UML can also be complex and time-consuming to learn and use effectively
and may not be necessary for small or simple software projects.
• The need for UML also depends on the development methodology being used, as some
methodologies may place less emphasis on modeling and documentation.
• Ultimately, the decision to use UML should be based on the specific needs and
requirements of the software project and the development team.
• Structural diagrams:
These show the structure of a system or software application. They typically include class diagrams, object diagrams, and component diagrams, and show the relationships between different components and how they interact with each other.
• Behavior diagrams:
These are used to model the dynamic behavior of a system by describing how
it responds to internal and external events. They focus on the interactions between the
system's components and how they collaborate to achieve specific functionality.
Examples of UML behavioral diagrams include use case diagrams, activity diagrams,
and sequence diagrams.
A class diagram is a type of UML diagram used to represent the structure of a software
system. It shows the classes in the system, their attributes, methods, and the relationships
between them. Each class is represented as a box with its name, and its attributes and methods
listed underneath. The relationships between the classes are shown as lines connecting the
boxes, with arrows indicating the direction of the relationship. Overall, a class diagram
provides a visual representation of the classes and their relationships in a software system,
making it easier to understand and communicate the system's structure.
FIG: 5.2.1.1 CLASS DIAGRAM

FIG: 5.2.2.1 USE CASE DIAGRAM

FIG: 5.2.3.1 ACTIVITY DIAGRAM
A sequence diagram is a type of interaction diagram that shows how objects in a system
interact with each other over time. It illustrates the interactions between objects in a sequential
order and is used to model the behavior of a system. It shows the messages that pass between
objects and the order in which they occur. The sequence diagram provides a visual
representation of the interactions in a system, making it easier to understand and analyze the
behavior of the system.
FIG: 5.2.4.1 SEQUENCE DIAGRAM
5.4 DATA FLOW DIAGRAM
Python and machine learning are powerful tools that can be used to analyze and gain insights from large amounts of data, helping businesses make data-driven decisions. Python offers a wide range of libraries, such as Pandas, NumPy, and Matplotlib, that provide efficient
and effective data manipulation, analysis, and visualization capabilities. Machine learning
algorithms, such as decision trees, random forests, and support vector machines, can be used
to analyze and predict customer behavior, preferences, and sentiment based on historical data.
By using Python and machine learning, businesses can analyze and understand customer
behavior, improve advertisement targeting, and optimize sales strategies. The use of these
technologies can also help businesses identify trends and patterns in customer data, which can
be used to develop new products and services that meet customers' needs and preferences.
Overall, the use of Python and machine learning can help businesses gain a competitive edge
and improve their bottom line.
Python is a highly adaptable programming language that can be used in various applications.
It is commonly used for a wide range of purposes, including but not limited to:
• Data analysis and manipulation using libraries such as Pandas and NumPy
• Web development using frameworks like Django and Flask
• Machine learning and artificial intelligence
• Automation of repetitive tasks using scripting
• Scientific computing and numerical analysis using libraries such as SciPy and
Matplotlib
5.5.1 IDE
A group of software tools called Project Jupyter are used in interactive computing. In
2001, Fernando Perez created IPython as an improved version of the Python interpreter. In
2011, IPython Notebook, a web-based interface to the IPython terminal, was released. Project
Jupyter began as an IPython spin-off project in 2014.
IDE used:
5.5.2 VS CODE
Visual Studio Code (VS Code) is a free source-code editor developed by Microsoft that
runs on Windows, Linux, and macOS. It provides a rich set of features and tools that help
developers to write, debug, and deploy code quickly and easily. The editor offers support for
many programming languages, with features such as syntax highlighting, code completion,
and code snippets. VS Code also integrates with Git for version control, has a built-in terminal,
and allows for the installation of extensions and themes to customize the editor's functionality
and appearance. It is a popular choice among developers for its versatility, ease of use, and
extensive capabilities.
5.5.3 JUPYTER
FIG: 5.5.3.1 JUPYTER NOTEBOOK CODE
CHAPTER 6
IMPLEMENTATION
6.1 LIBRARIES
NumPy:
NumPy is a Python library used for scientific computing. It provides a powerful array
data structure, as well as functions for performing mathematical operations on those arrays.
NumPy is designed to be efficient and fast, making it a popular choice for working with large
data sets, machine learning algorithms, and other computational tasks.
One of the key features of NumPy is its support for multidimensional arrays. These arrays can
be used to represent vectors, matrices, and other higher-dimensional data structures. NumPy
also provides a wide range of mathematical functions, including basic arithmetic operations,
linear algebra functions, and statistical functions. Additionally, NumPy can be used in
conjunction with other Python libraries, such as Matplotlib, to create visualizations of the data.
Overall, NumPy is a powerful tool for scientific computing and data analysis in Python.
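As a brief illustration of these capabilities (not code from this project), the following snippet shows a NumPy array, a few statistical functions, and element-wise and linear-algebra operations:

# A small illustration of NumPy arrays and vectorized math.
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])   # a 2-D array (matrix)
print(a.mean(), a.std())                  # statistical functions
print(a @ a)                              # linear algebra: matrix product
print(np.sqrt(a))                         # element-wise operation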
Pandas:
Python's Pandas library is a popular open-source tool for handling and analyzing data. It offers highly effective and simple-to-use data structures for data analysis activities such as data cleansing, transformation, and analysis. Pandas is used extensively in data science, machine learning, and scientific computing.

The two primary data structures that Pandas offers are the Series and the DataFrame. A Series is an object that resembles a one-dimensional array and may store any type of data, including numbers, strings, and Python objects. A DataFrame is a two-dimensional tabular data structure made up of rows and columns, similar to a spreadsheet.

Additionally, Pandas has strong data-manipulation features such as filtering, grouping, joining, and aggregating, which let users manipulate huge datasets quickly and simply. Pandas also includes native support for reading and writing data in a number of different formats, including CSV, Excel, SQL databases, and others. Because it can manage missing or NaN (Not a Number) values in data sets, it is also highly helpful for data cleaning and pre-processing chores.
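A short, hypothetical example of these features follows; the file sales.csv and its units and year columns are assumptions made for illustration:

# Illustrative Pandas usage; file and column names are assumed.
import pandas as pd

s = pd.Series([10, 20, None, 40], name="units")   # a one-dimensional Series
df = pd.read_csv("sales.csv")                     # a two-dimensional DataFrame
df = df.drop_duplicates()                         # remove duplicate records
df["units"] = df["units"].fillna(0)               # handle NaN values
recent = df[df["year"] >= 2022]                   # filtering rows
df.to_csv("sales_clean.csv", index=False)         # write back out in CSV format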
Matplotlib:
Matplotlib is a popular data visualization library for Python that provides a wide range
of tools and functions for creating static, animated, and interactive visualizations of data. It can
be used to create line plots, scatter plots, bar charts, histograms, and many other types of
visualizations, making it a versatile and powerful tool for data analysis and communication.
Seaborn:
Seaborn is a Python data visualization library based on the popular matplotlib library.
It provides a high-level interface for creating attractive and informative statistical graphics,
such as scatter plots, line plots, heatmaps, and more. Seaborn offers many built-in features and
themes that can be used to customize the appearance of the visualizations, making it easier to
create professional-looking plots with minimal effort. It is widely used in data science, machine
learning, and statistical analysis applications for exploring and presenting data in a clear and
compelling way.
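The following sketch shows the kinds of plots described above; the Advertising.csv file and its columns are assumed for illustration:

# Hedged plotting example using Matplotlib and Seaborn together.
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

df = pd.read_csv("Advertising.csv")                  # assumed dataset
sns.scatterplot(data=df, x="TV", y="Sales")          # scatter plot of spend vs. sales
plt.title("TV ad spend vs. sales")
plt.show()

sns.heatmap(df.corr(numeric_only=True), annot=True)  # correlation heat map
plt.show()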
Gradio:
Gradio is a free and open-source Python toolkit that makes it easier to build and share interfaces for machine-learning models. With Gradio, developers can quickly build user interfaces for their machine learning models, enabling users to interact with them via a web-based interface. The library supports a wide variety of inputs, including text, images, and audio, as well as outputs, including images, text, and interactive visualizations.

One of Gradio's main advantages is its simplicity: developers do not need to be experts in web programming or user interface design to build and deploy their machine learning models quickly. With just a few lines of code, developers can create interactive demos using Gradio's user-friendly interface.

Gradio's high degree of adaptability also enables developers to customize their interfaces for particular use cases: they can quickly add custom input and output types, adjust the user interface's layout and appearance, and more.
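Below is a minimal sketch of the kind of three-input sales interface this report describes. The predict_sales function here is a hypothetical stand-in; in the actual project the fitted regression model would be called instead:

# Minimal Gradio sketch of a three-input sales predictor.
import gradio as gr

def predict_sales(tv, radio, newspaper):
    # placeholder coefficients; replace with model.predict(...) on the fitted regressor
    return 0.045 * tv + 0.19 * radio + 0.003 * newspaper

demo = gr.Interface(
    fn=predict_sales,
    inputs=["number", "number", "number"],   # TV, radio, newspaper ad spend
    outputs="number",                        # predicted sales
)
demo.launch()   # launch(share=True) gives a public link accessible from anywhere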
6.2 DATA PREPARATION
Preparing and preprocessing data for advertising and sales analysis involves several
steps, including data cleaning, data transformation, and data integration. Here are some steps
to follow using Python and machine learning techniques:
Data collection:
Collecting data from various data sources including social media, advertising
platforms, transactional data, CRM systems, etc.
Data Visualization:
To better understand patterns, trends, and correlations in the data, data visualization in
Python entails using libraries like Matplotlib, Seaborn, and Plotly to produce illuminating and
aesthetically pleasing graphs, charts, and plots.
Labeling:
Each data point in a dataset is given a categorical or numerical label denoting the class or category to which it belongs. Labeling is a crucial phase in supervised machine learning because it allows the algorithm to learn from the labeled data and generate precise predictions about new, unlabeled data.
Data Selection:
Data selection is a critical step in data processing that involves identifying and
extracting specific subsets of data from a larger data set. This process involves analyzing the
data and selecting only the relevant data points or records based on specific criteria, such as
time period, geographical location, or other attributes.
Data selection is important because it helps to ensure that the data being analyzed is
accurate, relevant, and representative of the population being studied. By selecting only the
data points that are needed for a particular analysis, researchers can avoid including irrelevant
or noisy data that might skew the results. Additionally, data selection can help to reduce the
processing time and storage requirements for large data sets, making it more efficient to work
with the data.
Data preprocessing is the process of cleaning, transforming, and preparing raw data to make it suitable for analysis. This involves identifying and correcting errors, removing duplicates and missing values, scaling and normalizing data, and other steps to ensure that the data is accurate, consistent, and complete. The main steps are outlined below; a short illustrative sketch follows the list.
• Data Formatting: Data formatting refers to the process of structuring and organizing
data in a specific format or structure that is suitable for analysis, storage, and
transmission. This involves converting data from one format to another, such as
converting text data to numerical data, or standardizing the format of data to ensure
consistency. Data formatting can also involve adding metadata or annotations to the
data to provide additional context or information about the data. Proper data formatting
is essential for accurate analysis and efficient data processing.
• Data cleaning: Data cleaning is the process of identifying and correcting errors or
inconsistencies in a data set. It involves detecting and removing duplicate records,
handling missing or incomplete data, correcting inaccurate values, and dealing with
outliers. Data cleaning is essential for ensuring that the data being analyzed is accurate
and reliable, which is crucial for making sound decisions. It is often a time-consuming
process, but it is critical for achieving meaningful insights and drawing reliable
conclusions from the data.
• Data anonymization: Data anonymization is the process of removing or obfuscating
personally identifiable information (PII) from a dataset, in order to protect the privacy
of individuals whose data is included. This is done by replacing sensitive information
with non-sensitive equivalents, or by altering the data in such a way that it cannot be
linked back to individuals. The goal of data anonymization is to provide useful data for
analysis while minimizing the risk of identification or other privacy violations.
• Data sampling: Data sampling is the process of selecting a representative subset of
data from a larger dataset for analysis. This technique involves selecting a random
sample of data points or records based on a predetermined sample size and criteria.
Sampling is useful for making inferences about a population based on a smaller sample,
as it reduces the time and computational resources required for analysis and can
improve the accuracy of the results.
• Data transformation: In this final preprocessing stage, a data scientist consolidates or alters data so that it is suitable for mining (the creation of algorithms to extract knowledge from data) or machine learning. Data can be changed via attribute aggregations, attribute decompositions, and scaling (normalization). This step is also known as feature engineering.
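A short sketch of the preprocessing steps listed above follows; the dataset and column names are assumptions carried over from the earlier regression sketch:

# Sketch of formatting, cleaning, and sampling on an assumed dataset.
import pandas as pd

df = pd.read_csv("Advertising.csv")                  # assumed dataset
df = df.drop_duplicates()                            # remove duplicate rows
df = df.dropna(subset=["Sales"])                     # drop rows missing the label
df["TV"] = pd.to_numeric(df["TV"], errors="coerce")  # data formatting: enforce numeric type
df = df.fillna(df.mean(numeric_only=True))           # impute any remaining gaps
sample = df.sample(frac=0.1, random_state=1)         # data sampling: 10% random subset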
Scaling
Data may include numerical attributes (features) with very different ranges of values, measured for example in meters or kilometers. These attributes must be scaled so that they share the same scale, for example between 0 and 1, or 1 and 10, for the smallest and largest values of an attribute.
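For instance, min-max scaling to the [0, 1] range can be sketched as follows (the two ad-spend rows are invented for illustration):

# Min-max scaling sketch with scikit-learn.
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X = np.array([[230.1, 37.8, 69.2],
              [44.5, 39.3, 45.1]])            # illustrative ad-spend rows
X_scaled = MinMaxScaler().fit_transform(X)    # each column now spans [0, 1]
print(X_scaled)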
Decomposition
Decomposition is a technique used in data analysis to break down complex data sets
into simpler, more manageable components. It involves identifying patterns and relationships
within the data and separating them into individual components based on their characteristics
or properties. Decomposition can be used to analyze time series data, where the data is broken
down into trend, seasonality, and residual components. It can also be used in machine learning
algorithms, such as principal component analysis, to identify the most important features of a
data set and reduce its dimensionality. Overall, decomposition is a powerful tool for
simplifying complex data sets and gaining insights into their underlying structures.
Aggregation
Aggregation is a process in which multiple data values are combined into a single
summary value. This process is often used in data analysis to simplify the data and make it
easier to understand and interpret. Aggregation can involve calculating summary statistics,
such as the mean, median, or mode of a set of data points. It can also involve grouping data by
a specific variable, such as time period, location, or category, and then calculating summary
statistics for each group.
Aggregation is an important tool for data analysis because it helps to identify patterns and
trends in the data. By summarizing large amounts of data into a few key metrics, analysts can
gain insights into the behavior of the data and make more informed decisions. Aggregation can
also help to reduce the complexity of the data and make it more manageable for further analysis
or visualization.
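A minimal aggregation sketch, with invented Region and Sales values, might look like this:

# Grouping by a variable and computing summary statistics per group.
import pandas as pd

df = pd.DataFrame({
    "Region": ["North", "North", "South", "South"],
    "Sales":  [120, 80, 200, 150],
})                                                           # illustrative data
summary = df.groupby("Region")["Sales"].agg(["mean", "median", "sum"])
print(summary)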
Data splitting is the process of dividing a data set into two or more subsets for the purpose of training and testing machine learning models.

Training set
A training set is the subset of the data that the machine learning model is fitted on; the model learns its parameters from these examples.

Test set
A test set is a subset of a larger data set that is used to evaluate the performance of a
machine learning model. It is a set of data points that have been held back from the training
set, which the model has not seen during the training process. The purpose of a test set is to
provide an independent measure of the model's accuracy and to assess its ability to generalize
to new, unseen data.
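A simple splitting sketch follows; the 80/20 split ratio and the toy data are assumptions, not figures taken from this report:

# Illustrative train/test split.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(10, 2)   # 10 samples, 2 features (toy data)
y = np.arange(10)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
print(len(X_train), "training rows,", len(X_test), "test rows")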
6.4 MODELING
Model training
Model training is the process of training a machine learning model to make accurate
predictions or classifications based on input data. This involves feeding the model with large
amounts of training data and adjusting its parameters to minimize the error between the
predicted and actual outputs. The model is iteratively refined until it achieves a level of
accuracy that meets the desired performance criteria. Model training is a critical step in the
machine learning pipeline and requires careful selection of training data, feature engineering,
and tuning of hyperparameters to achieve the best possible performance.
Supervised learning
Supervised learning is a type of machine learning in which the model is trained on labeled data, learning a mapping from inputs to known outputs. It is widely used to solve real-world problems, as it allows machines to learn from data and make accurate predictions without the need for explicit programming.
Unsupervised learning
Unsupervised learning is a type of machine learning in which the model learns patterns
and relationships in the data without explicit guidance or supervision from a human. In other
words, the model is not given labeled data to train on, but instead must identify patterns and
structure in the data on its own. Unsupervised learning algorithms can be used for tasks such
as clustering, anomaly detection, and dimensionality reduction. These techniques can be
applied to a wide range of applications, including customer segmentation, image recognition,
and natural language processing. Overall, unsupervised learning is a powerful tool for
exploring and analyzing large, complex data sets.
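As a hedged illustration of one such technique, the following sketch clusters invented two-feature customer records with KMeans:

# Minimal clustering sketch for customer segmentation (invented data).
import numpy as np
from sklearn.cluster import KMeans

customers = np.array([[25, 300], [30, 320], [55, 90], [60, 110]])
# columns: age, annual spend (hypothetical features)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(customers)
print(labels)   # cluster id assigned to each customer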
Model evaluation and testing is the process of assessing the performance of a machine
learning model using various metrics and techniques. This involves testing the model on a set
of data that was not used to train the model and evaluating its accuracy, precision, recall, and
other performance measures.
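For example, classification-style metrics of the kind named above can be computed with scikit-learn; the y_true and y_pred values below are made up for illustration:

# Computing accuracy, precision, and recall on held-out labels.
from sklearn.metrics import accuracy_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]
print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))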
FIG: 6.3.1 DATA ANALYSIS OF DIFFERENT DATAPOINTS

FIG: 6.3.3 LINEAR REGRESSION LINE

FIG: 6.3.5 HEAT MAP

FIG: 6.3.7 KDE PLOT
CHAPTER 7
TESTING
Testing refers to the process of evaluating a software product or application to identify defects,
errors, or other issues that may affect its functionality, performance, or usability. It involves
executing specific test cases or scenarios and comparing the actual results with expected
outcomes. The goal of testing is to ensure that the software meets the intended requirements,
works as intended, and provides a satisfactory user experience. Testing can be conducted at
different stages of the software development lifecycle, including unit testing, integration
testing, system testing, and acceptance testing.
To conduct testing, specific scenarios or test cases are designed to evaluate the software's behavior under different conditions. Test cases define the inputs, actions, and expected outputs for each test scenario and help ensure that the software meets the intended requirements and functions as expected. Test cases can be created at various stages of the software development lifecycle, such as unit testing, integration testing, system testing, and acceptance testing. The effectiveness of testing depends on the quality and coverage of the test cases used to evaluate the software.
White Box Testing
White box testing is a software testing technique that examines the internal structure
and workings of the software being tested. It involves evaluating the code, design, and
architecture of the software to ensure that it functions as intended and meets the intended
requirements. White box testing is also known as clear box testing, glass box testing, or
structural testing.
Black Box Testing
Black box testing is a software testing technique that focuses on evaluating the external
behavior and functionality of the software being tested. It involves testing the software without
knowledge of its internal structure or workings and using test cases to validate that it functions
as intended and meets the intended requirements. Black box testing is also known as functional
testing or specification-based testing.
Functional Testing
Functional testing is a type of black box testing that validates the software against its functional requirements, checking that each function produces the expected output for a given input.

Non-functional Testing
Non-functional testing evaluates attributes that are not tied to specific functions, such as performance, usability, reliability, and security.
Grey Box Testing
Grey box testing is a software testing technique that combines aspects of both white
box and black box testing. It involves testing the software with partial knowledge of its internal
workings and design, and using test cases to evaluate both its external behavior and internal
structure. Grey box testing is often used in integration testing and system testing.
Unit Testing
Unit testing is a software testing technique that involves testing individual units or
components of a software application in isolation from the rest of the system. It typically
involves writing test cases for each unit and running them automatically to verify that the unit
meets its intended requirements. Unit testing is performed during the development phase to
detect defects early and ensure that the software functions as intended.
Unit testing frameworks are software tools that help automate the process of writing, executing,
and reporting unit tests. They provide a set of pre-built test functions and assertion libraries
that can be used to test individual units of code in various programming languages. Some
popular unit testing frameworks include JUnit for Java, NUnit for .NET, and pytest for Python.
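As a small, hypothetical illustration, the following pytest-style tests exercise the predict_sales stand-in used earlier in this report's Gradio sketch:

# Tiny pytest example; predict_sales is the hypothetical helper, not project code.
def predict_sales(tv, radio, newspaper):
    return 0.045 * tv + 0.19 * radio + 0.003 * newspaper

def test_predict_sales_is_nonnegative():
    assert predict_sales(100, 20, 10) >= 0

def test_more_tv_spend_never_lowers_sales():
    assert predict_sales(200, 20, 10) >= predict_sales(100, 20, 10)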
7.3.1 Test fixture
A test fixture is a fixed set of test data or preconditions that are used as the basis for
running one or more test cases. It provides a known baseline for testing and helps ensure that
tests are repeatable and consistent. Test fixtures are commonly used in unit testing to set up
the environment for testing individual units of code.
7.3.2 Test case
A test case is a specific scenario or set of steps designed to evaluate the behavior and
functionality of a software application. It defines the inputs, actions, and expected outputs for
each scenario, and is used to verify that the software meets its intended requirements and
functions as expected.
7.3.3 Test suite
A test suite is a collection of test cases or scenarios that are designed to test specific
aspects or functions of a software application. It is used to ensure that the software meets its
intended requirements and functions as expected in different scenarios. Test suites are often
organized into groups based on their objectives and can be run automatically or manually. The
results of the test suite can provide valuable feedback on the overall quality of the software
product.
7.3.4 Test runner
A test runner is a software tool that helps automate the process of executing and
managing software tests. It provides an interface for running test cases, organizing and
managing test suites, and reporting the results of the tests. A test runner can also be used to
automate the process of running tests as part of a continuous integration or delivery pipeline,
making it easier to ensure that software is tested thoroughly and reliably.
POSSIBLE OUTCOMES:
• The first is a passing test, which indicates that the software is functioning as intended
and meets its intended requirements.
• The second is a failing test, which indicates that the software has defects or errors that
need to be addressed.
• The third is an inconclusive test, which occurs when the test cannot determine whether
the software is functioning correctly or not.
TEST CASE 3 — Interface
Objective: To check whether correct input is issued while using the GUI.
Condition: If the inputs are correct, the GUI produces correct results.
Expected Output: It produces the sales report in a neat format.
Status: Verified
CHAPTER 8
RESULTS
When the error prediction procedure is complete, we find that the RMSE value is 0.916. This helped us analyze whether or not our variables fitted the provided model appropriately and also helped us attain an overall accuracy of 92%. Additionally, the Gradio GUI, which is viewable from anywhere in the world, was developed.
FIG: 8.1 DIFFERENCE BETWEEN ACTUAL vs GENERATED OUTPUT
CHAPTER 9
CONCLUSION
When the error prediction procedure is complete, we find that the RMSE value is 0.916. This helped us analyze whether or not our variables fitted the provided model appropriately and also helped us attain an overall accuracy of 92%. In the future, we may expand this research by considering several different prediction-based algorithms and ultimately grading them according to their accuracy and error levels. With more categorical variables included, we can also use a much bigger data set. This would help us understand and put into practice a method with correspondingly far higher accuracy and lower error levels. Using this unique data set, a survey article covering all the methods may also be written.
In conclusion, Python and Machine Learning are powerful tools for analyzing advertisements
and sales data. Businesses can learn about consumer behavior and enhance their marketing
tactics to increase sales success by using machine learning algorithms and statistical models.
Overall, Python and Machine Learning are essential tools for businesses looking to improve
their advertising and sales performance. By leveraging the power of data analysis and
predictive modeling, businesses can make data-driven decisions that lead to increased revenue
and growth.
CHAPTER 10
FUTURE WORK
There are several potential areas of future work in advertisement and sales analysis using
Python and machine learning. Some of them are:
• Advanced predictive modeling: While conventional machine learning models like linear regression, decision trees, and random forests are helpful for predicting sales and optimizing ad spend, more advanced models like neural networks, deep learning models, and ensemble methods could provide even better results. Future work could explore the use of these advanced models to improve prediction accuracy.
• Personalization: Personalized marketing and sales strategies are becoming increasingly popular, and future work could focus on developing models that can recommend products or services based on customer preferences and behavior. To create individualized recommendations, companies may use data about their customers' browsing and purchasing patterns and demographics.
• Real-time analysis: With the increasing availability of real-time data, future work
could explore the use of machine learning models that can analyze sales and ad
performance in real-time. This could allow businesses to make immediate adjustments
to their strategies, improving their chances of success.
• Natural language processing: Natural language processing (NLP) is the machine learning discipline focused on the study of human language. Future work could explore the use of NLP techniques to analyze customer reviews, feedback, and user-generated content such as social media posts in order to acquire insights into customer sentiment and preferences.
• Visualization: While machine learning models can provide valuable insights,
visualizing these insights in a way that is easy to understand and act on is also
important. Future work could focus on developing interactive dashboards and
visualizations that can help businesses understand and act on the insights provided by
machine learning models.
Overall, the field of advertisement and sales analysis using Python and machine learning
is constantly evolving, and there are many potential areas of future work. By leveraging
advanced techniques and technologies, businesses can gain a competitive edge and improve
their bottom line.