Documentation
The primary objective of this project is to analyze the smartphone market in Kazakhstan by
collecting and examining data from a leading local electronics retailer, Technodom.kz. By
scraping data on smartphones available on their website, we aim to uncover insights into pricing
trends, brand popularity, consumer ratings, and other key factors influencing the market.
Understanding the dynamics of the smartphone market is crucial for manufacturers, retailers, and
consumers alike. In Kazakhstan, smartphones are an essential part of daily life, and the market is
highly competitive with numerous brands vying for consumer attention. Analyzing data from a
major retailer provides valuable insights into market trends, consumer preferences, and potential
areas for growth or improvement.
Data Collection
The dataset was collected by scraping the official website of Technodom.kz, one of the largest
electronics retailers in Kazakhstan. The specific category targeted was smartphones and mobile
phones.
https://www.technodom.kz/catalog/smartfony-i-gadzhety/smartfony-i-telefony
Methods of Collection
Data was collected using a Python script that employed the following techniques:
Web Scraping Libraries: Utilized requests to fetch HTML content and BeautifulSoup for
parsing the HTML structure.
Pagination Handling: Iterated through multiple pages (up to page 19) to collect a
comprehensive list of available smartphones.
Data Extraction: Extracted relevant information such as brand, full name, price, rating, number
of reviews, model year, screen size, screen resolution, discount, refresh rate, and matrix type.
Data Storage: Compiled the extracted data into two pandas DataFrames and saved them as CSV
files for further analysis.
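The collection steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the CSS selectors (li.product-card, .product-title, .product-price), the page query parameter, and the function names are assumptions and would need to be checked against the live markup of the site.

```python
import requests
from bs4 import BeautifulSoup

BASE_URL = "https://www.technodom.kz/catalog/smartfony-i-gadzhety/smartfony-i-telefony"


def parse_products(html):
    """Extract name and price text from one catalog page.

    The selectors below are hypothetical placeholders.
    """
    soup = BeautifulSoup(html, "html.parser")
    products = []
    for card in soup.select("li.product-card"):
        name = card.select_one(".product-title")
        price = card.select_one(".product-price")
        products.append({
            "Full Name": name.get_text(strip=True) if name else None,
            "Price": price.get_text(strip=True) if price else None,
        })
    return products


def scrape_catalog(last_page=19):
    """Walk catalog pages 1..last_page and collect all parsed products."""
    rows = []
    for page in range(1, last_page + 1):
        # The "page" parameter name is an assumption about the site's URLs.
        response = requests.get(BASE_URL, params={"page": page}, timeout=30)
        response.raise_for_status()
        rows.extend(parse_products(response.text))
    return rows
```

Keeping the parsing in its own function means it can be tested on saved HTML without sending requests to the site. The list returned by scrape_catalog() can be passed to pandas.DataFrame and written out with to_csv.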
Dataset Description
Methodology
Data Cleaning
Handling Missing Values: Replaced 'No rating' and 'No reviews' entries with NaN for accurate
statistical calculations.
Data Type Conversions:
o Cleaned the 'Price' column by removing currency symbols and spaces, converting it to a
numeric float type.
o Converted 'Rating' and 'Reviews' columns to numeric types, handling non-numeric
entries appropriately.
Normalization: Applied min-max scaling to the 'Rating' column to normalize values between 0
and 1.
Feature Extraction: Parsed the 'Screen Resolution' column to extract 'Width' and 'Height' as
separate numerical features.
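The cleaning steps above might look like the sketch below. The sample rows are made up for illustration, the 'Rating_scaled' column name is an assumption, and the code assumes that the first number in 'Screen Resolution' is the width; the real data may order the values differently.

```python
import numpy as np
import pandas as pd

# Hypothetical raw rows in the shape the scraper is described as producing.
raw = pd.DataFrame({
    "Price": ["199 990 ₸", "1 099 990 ₸", "89 990 ₸"],
    "Rating": ["4.8", "No rating", "4.0"],
    "Reviews": ["125", "No reviews", "7"],
    "Screen Resolution": ["2340x1080", "2796x1290", None],
})

# Replace the placeholder strings with NaN.
df = raw.replace(["No rating", "No reviews"], np.nan)

# Price: drop everything except digits and the decimal point, then cast to float.
df["Price"] = pd.to_numeric(
    df["Price"].str.replace(r"[^\d.]", "", regex=True), errors="coerce"
).astype(float)

# Rating and Reviews: anything non-numeric becomes NaN.
for col in ["Rating", "Reviews"]:
    df[col] = pd.to_numeric(df[col], errors="coerce")

# Min-max scaling of Rating to the range [0, 1]; NaN values stay NaN.
rating = df["Rating"]
df["Rating_scaled"] = (rating - rating.min()) / (rating.max() - rating.min())

# Split "2340x1080" into two numeric columns (width assumed to come first).
parts = df["Screen Resolution"].str.extract(r"(\d+)\s*[xX×]\s*(\d+)")
df["Width"] = pd.to_numeric(parts[0])
df["Height"] = pd.to_numeric(parts[1])

print(df)
```

Using errors="coerce" keeps the conversion from failing on unexpected strings, at the cost of silently turning them into NaN, so it is worth counting the missing values before and after.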
Analysis
Descriptive Statistics: Calculated measures such as mean, median, variance, and standard
deviation for key numerical attributes.
Grouping and Aggregation:
o Grouped data by 'Brand' to analyze average prices, maximum prices, and total reviews.
o Created pivot tables to summarize average and maximum ratings by brand.
Regression Modeling:
o Utilized the patsy library to prepare data for modeling.
o Employed statsmodels and scikit-learn to build and evaluate a linear regression
model predicting 'Rating' based on 'Price' and 'Reviews'.
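A compact sketch of the analysis steps above, run on a small made-up sample rather than the scraped dataset. The output column names (avg_price, max_price, total_reviews) are illustrative assumptions. The regression uses the statsmodels formula interface, which relies on patsy to build the design matrices; the scikit-learn variant is not shown.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical sample standing in for the cleaned dataset.
df = pd.DataFrame({
    "Brand": ["Samsung", "Samsung", "Apple", "Apple", "Xiaomi", "Xiaomi"],
    "Price": [150000.0, 450000.0, 600000.0, 800000.0, 90000.0, 120000.0],
    "Rating": [4.5, 4.8, 4.9, 4.7, 4.2, np.nan],
    "Reviews": [120.0, 45.0, 300.0, 80.0, 15.0, np.nan],
})

# Descriptive statistics for the key numeric columns.
stats = df[["Price", "Rating", "Reviews"]].agg(["mean", "median", "var", "std"])

# Average price, maximum price, and total reviews per brand.
by_brand = df.groupby("Brand").agg(
    avg_price=("Price", "mean"),
    max_price=("Price", "max"),
    total_reviews=("Reviews", "sum"),
)

# Pivot table with the average and maximum rating per brand.
rating_pivot = df.pivot_table(index="Brand", values="Rating", aggfunc=["mean", "max"])

# Linear regression of Rating on Price and Reviews.
# Rows with missing values are dropped before fitting.
model = smf.ols("Rating ~ Price + Reviews", data=df).fit()

print(stats)
print(by_brand)
print(rating_pivot)
print(model.params)
```

On the real dataset, model.summary() reports the coefficient estimates together with their p-values and the R-squared of the fit, which is the natural place to check whether price has any measurable effect on rating.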
Visualization
Key Findings
1. Price Distribution:
o The average smartphone price is ₸307,917, and prices vary widely across brands.
o Outlier detection showed that a small number of high-end models pull the average
price upward.
2. Brand Analysis:
o Samsung and Apple are among the top brands in terms of the number of products
offered.
o Apple models tend to be priced higher than those of other brands.
3. Consumer Ratings:
o The average rating across all products is relatively high, suggesting general consumer
satisfaction.
o No significant correlation was found between price and rating, indicating that higher
price does not necessarily equate to higher consumer satisfaction.
4. Distribution of Ratings:
o The histogram shows that most smartphones have ratings between 4.0 and 5.0.
5. Price Outliers:
o The boxplot highlights several outliers with exceptionally high prices.
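The outlier and correlation findings can be checked with a few lines of pandas. The report does not state which outlier method was used, so the sketch below assumes the common 1.5 x IQR rule (the same rule that determines the whiskers of a default boxplot) and runs on made-up prices and ratings.

```python
import pandas as pd

# Hypothetical prices (in tenge) and ratings, with one very expensive model.
df = pd.DataFrame({
    "Price": [90000.0, 120000.0, 150000.0, 180000.0, 200000.0, 250000.0, 1200000.0],
    "Rating": [4.2, 4.5, 4.4, 4.8, 4.1, 4.6, 4.7],
})

# IQR rule: flag prices more than 1.5 * IQR outside the quartiles.
q1, q3 = df["Price"].quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (df["Price"] < q1 - 1.5 * iqr) | (df["Price"] > q3 + 1.5 * iqr)
outliers = df[is_outlier]

# Compare the mean price with and without the flagged rows.
mean_all = df["Price"].mean()
mean_without = df.loc[~is_outlier, "Price"].mean()

# Pearson correlation between price and rating.
corr = df["Price"].corr(df["Rating"])

print(outliers)
print(mean_all, mean_without, corr)
```

Comparing the two means shows how strongly a handful of expensive models can shift the average, which is why the median is often reported alongside it.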
Problems Faced
3. Encoding Management:
o Ensured correct encoding (utf-8) when reading from and writing to CSV files.
o Utilized Python's string handling capabilities to manage multilingual text.
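As an illustration of the encoding handling, the sketch below writes a DataFrame containing Cyrillic text and the tenge sign to a CSV file and reads it back with an explicit utf-8 encoding. The sample row and the file name are made up.

```python
import os
import tempfile

import pandas as pd

# Hypothetical row with Cyrillic text and the tenge currency sign.
df = pd.DataFrame({
    "Full Name": ["Смартфон, 128 ГБ, чёрный"],
    "Price": ["199 990 ₸"],
})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "smartphones.csv")
    # State the encoding explicitly on both write and read.
    df.to_csv(path, index=False, encoding="utf-8")
    restored = pd.read_csv(path, encoding="utf-8")

print(restored)
```

If the CSV files are meant to be opened in Excel, encoding="utf-8-sig" may be a better choice for writing, since the byte order mark helps Excel detect UTF-8.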
Conclusion
Summary of Insights
Recommendations
For Retailers:
o Consider promoting mid-range smartphones with high ratings to attract cost-conscious
consumers.
o Leverage consumer reviews and ratings in marketing strategies to build trust.
For Manufacturers:
o Focus on maintaining high product quality regardless of price point to enhance
consumer satisfaction.
o Explore opportunities in the budget segment without compromising on key features.