
Documentation

Uploaded by

aalbaaba394
Copyright
© All Rights Reserved

Analysis of Smartphone Market in Kazakhstan

Using Data from Technodom.kz


Introduction

Objective of the Project

The primary objective of this project is to analyze the smartphone market in Kazakhstan by
collecting and examining data from a leading local electronics retailer, Technodom.kz. By
scraping data on smartphones available on their website, we aim to uncover insights into pricing
trends, brand popularity, consumer ratings, and other key factors influencing the market.

Background of the Problem

Understanding the dynamics of the smartphone market is crucial for manufacturers, retailers, and
consumers alike. In Kazakhstan, smartphones are an essential part of daily life, and the market is
highly competitive with numerous brands vying for consumer attention. Analyzing data from a
major retailer provides valuable insights into market trends, consumer preferences, and potential
areas for growth or improvement.

Data Collection

Source of the Dataset

The dataset was collected by scraping the official website of Technodom.kz, one of the largest
electronics retailers in Kazakhstan. The specific category targeted was smartphones and mobile
phones.

https://www.technodom.kz/catalog/smartfony-i-gadzhety/smartfony-i-telefony

Methods of Collection

Data was collected using a Python script that employed the following techniques:

• Web Scraping Libraries: Utilized requests to fetch HTML content and BeautifulSoup for
parsing the HTML structure.
• Pagination Handling: Iterated through multiple pages (up to page 19) to collect a
comprehensive list of available smartphones.
• Data Extraction: Extracted relevant information such as brand, full name, price, rating, number
of reviews, model year, screen size, screen resolution, discount, refresh rate, and matrix type.
• Data Storage: Compiled the extracted data into two pandas DataFrames and saved them as CSV
files for further analysis.
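
The collection steps above might look roughly like the following sketch. The CSS class names (product-card, product-title, product-price, product-rating), the "page" query parameter, and the function names are hypothetical placeholders: the report does not show the actual Technodom.kz markup, which would need to be inspected before writing real selectors.

```python
import requests
from bs4 import BeautifulSoup

BASE_URL = "https://www.technodom.kz/catalog/smartfony-i-gadzhety/smartfony-i-telefony"


def parse_listing(html):
    """Extract one dict per product card from a listing page's HTML."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    # Assumption: each product sits in an element with class "product-card".
    for card in soup.select(".product-card"):
        name = card.select_one(".product-title")
        price = card.select_one(".product-price")
        rating = card.select_one(".product-rating")
        rows.append({
            "Full Name": name.get_text(strip=True) if name else None,
            "Price": price.get_text(strip=True) if price else None,
            # Missing ratings are stored as a placeholder string.
            "Rating": rating.get_text(strip=True) if rating else "No rating",
        })
    return rows


def scrape_catalog(last_page=19):
    """Fetch pages 1..last_page and collect all parsed rows."""
    rows = []
    for page in range(1, last_page + 1):
        # Assumption: pagination is exposed as a "page" query parameter.
        response = requests.get(BASE_URL, params={"page": page}, timeout=30)
        response.raise_for_status()
        rows.extend(parse_listing(response.text))
    return rows


# Offline check of the parser on a tiny hand-written page (not real site HTML).
SAMPLE_HTML = """
<div class="product-card">
  <span class="product-title">Phone A 128GB</span>
  <span class="product-price">199 990 ₸</span>
  <span class="product-rating">4.8</span>
</div>
<div class="product-card">
  <span class="product-title">Phone B 64GB</span>
  <span class="product-price">89 990 ₸</span>
</div>
"""
sample_rows = parse_listing(SAMPLE_HTML)
```

Keeping the parsing step separate from the network call makes it possible to test the selectors on saved HTML without requesting the site again.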

Dataset Description

The final dataset comprises information on numerous smartphone models available on
Technodom.kz. Key attributes include:

• Brand: The manufacturer of the smartphone.
• Full Name: The complete product name as listed on the website.
• Price: The listed price in Kazakhstani Tenge (₸).
• Rating: Consumer rating out of 5 stars.
• Reviews: Number of consumer reviews.
• Model Year: The year the smartphone model was released.
• Screen Size: The diagonal size of the display in inches.
• Screen Resolution: The display resolution in pixels (width x height).
• Discount: Any discount available on the product.
• Refresh Rate: The screen refresh rate in Hertz.
• Matrix Type: The type of display matrix (e.g., IPS, AMOLED).

Methodology

Data Cleaning

• Handling Missing Values: Replaced 'No rating' and 'No reviews' entries with NaN for accurate
statistical calculations.
• Data Type Conversions:
o Cleaned the 'Price' column by removing currency symbols and spaces, converting it to a
numeric float type.
o Converted 'Rating' and 'Reviews' columns to numeric types, handling non-numeric
entries appropriately.
• Normalization: Applied min-max scaling to the 'Rating' column to normalize values between 0
and 1.
• Feature Extraction: Parsed the 'Screen Resolution' column to extract 'Width' and 'Height' as
separate numerical features.
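
As an illustration, the cleaning steps above could be written in pandas along these lines. The sample rows are invented, and the 'Rating Scaled' column name and the exact regular expressions are assumptions rather than the project's actual code.

```python
import numpy as np
import pandas as pd

# Invented sample rows that mimic raw scraped strings.
raw = pd.DataFrame({
    "Price": ["199 990 ₸", "89 990 ₸", "649 990 ₸"],
    "Rating": ["4.8", "No rating", "4.2"],
    "Reviews": ["120", "No reviews", "35"],
    "Screen Resolution": ["2400x1080", "1600x720", "2796x1290"],
})

# Replace the placeholder strings with NaN.
df = raw.replace(["No rating", "No reviews"], np.nan)

# Keep digits and the decimal point only, then convert to float.
df["Price"] = df["Price"].str.replace(r"[^\d.]", "", regex=True).astype(float)

# errors="coerce" turns any remaining non-numeric entry into NaN.
df["Rating"] = pd.to_numeric(df["Rating"], errors="coerce")
df["Reviews"] = pd.to_numeric(df["Reviews"], errors="coerce")

# Min-max scaling: (x - min) / (max - min); NaN values stay NaN.
r_min, r_max = df["Rating"].min(), df["Rating"].max()
df["Rating Scaled"] = (df["Rating"] - r_min) / (r_max - r_min)

# Split "WIDTHxHEIGHT" into two numeric columns.
size = df["Screen Resolution"].str.extract(r"(\d+)\s*[xX×]\s*(\d+)")
df["Width"] = pd.to_numeric(size[0])
df["Height"] = pd.to_numeric(size[1])
```

Writing the scaled values to a separate column keeps the original 1 to 5 ratings available for plots and summaries.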

Analysis

• Descriptive Statistics: Calculated measures such as mean, median, variance, and standard
deviation for key numerical attributes.
• Grouping and Aggregation:
o Grouped data by 'Brand' to analyze average prices, maximum prices, and total reviews.
o Created pivot tables to summarize average and maximum ratings by brand.
• Regression Modeling:
o Utilized the patsy library to prepare data for modeling.
o Employed statsmodels and scikit-learn to build and evaluate a linear regression
model predicting 'Rating' based on 'Price' and 'Reviews'.

Visualization

• Libraries Used: Employed matplotlib and seaborn for creating visualizations.
• Plots Created:
o Histogram: Distribution of ratings across all products.
o Boxplot: Detection of outliers in pricing.
o Scatter Plot: Relationship between price and rating for the top 10 brands by product
count.
o Violin Plot (a less common plot type): Distribution of prices by brand.

Results and Insights

Key Findings

1. Price Distribution:
o The average smartphone price is ₸307,917, and prices vary widely across brands.
o Outlier detection revealed that a few high-end models pull the average price upward.

2. Brand Analysis:
o Samsung and APPLE are among the top brands in terms of the number of products
offered.
o APPLE tends to have higher-priced models compared to other brands.

3. Consumer Ratings:
o The average rating across all products is relatively high, suggesting general consumer
satisfaction.
o No significant correlation was found between price and rating, indicating that higher
price does not necessarily equate to higher consumer satisfaction.

4. Reviews and Popularity:
o Products with more reviews tend to have higher visibility, but a larger number of reviews
does not always correspond to a higher rating.
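
The outlier and correlation findings above can be checked with a few lines of pandas. This is a sketch on invented numbers: the report does not state which outlier rule was applied, so the 1.5 x IQR fence used here is an assumption, as are the variable names.

```python
import pandas as pd

# Invented prices and ratings; one price is deliberately extreme.
df = pd.DataFrame({
    "Price": [100000.0, 120000.0, 140000.0, 160000.0, 180000.0, 900000.0],
    "Rating": [4.6, 4.2, 4.9, 4.4, 4.7, 4.5],
})

# IQR rule (assumed): flag prices above Q3 + 1.5 * IQR.
q1, q3 = df["Price"].quantile([0.25, 0.75])
upper_fence = q3 + 1.5 * (q3 - q1)
outliers = df[df["Price"] > upper_fence]

# Compare the mean with and without the flagged rows.
mean_all = df["Price"].mean()
mean_without = df.loc[df["Price"] <= upper_fence, "Price"].mean()

# Pearson correlation between price and rating; values near 0 mean no linear trend.
price_rating_corr = df["Price"].corr(df["Rating"])
```

In this toy example a single extreme price nearly doubles the mean, which is why the median is often reported alongside the average for skewed price data.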

Visualizations and Interpretations

1. Distribution of Ratings:
o The histogram shows that most smartphones have ratings between 4.0 and 5.0.

2. Price Outliers:
o The boxplot highlights several outliers with exceptionally high prices.

3. Price Distribution by Brand:
o The violin plot shows that APPLE smartphones are priced higher on average than those of
other brands.

4. Price vs. Rating for Top Brands:
o The scatter plot indicates no clear trend between price and rating among top brands.

Challenges and Solutions

Problems Faced

1. Data Extraction Issues:
o Inconsistent HTML structure on different pages made it challenging to extract certain
features.
o Missing or non-standardized data entries for some products.

2. Data Cleaning Complexities:
o Non-numeric entries like 'No rating' and 'No reviews' required careful handling to avoid
errors in analysis.

3. Encoding and Language Barriers:
o The website content is in Russian, which led to encoding issues when saving CSV files
and parsing text.

Solutions

1. Robust Scraping Logic:
o Implemented try-except blocks to handle exceptions during scraping so that processing
continued without interruption.
o Used regular expressions and conditional checks to manage inconsistent HTML
elements.

2. Data Preprocessing Techniques:
o Replaced placeholder text with NaN and converted data types using pandas functions.
o Employed feature extraction methods to parse complex text fields into usable numerical
data.

3. Encoding Management:
o Ensured correct encoding (utf-8) when reading from and writing to CSV files.
o Utilized Python's string handling capabilities to manage multilingual text.
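
The solutions above can be illustrated with a short sketch: a parsing helper that uses a regular expression inside a try-except block, and a CSV round trip with an explicit utf-8 encoding. The parse_price helper, the sample strings, and the file name are hypothetical and not taken from the project code.

```python
import os
import re
import tempfile

import pandas as pd


def parse_price(text):
    """Return the price as a float, or None if the text cannot be parsed."""
    try:
        digits = re.sub(r"[^\d]", "", text)
        return float(digits)
    except (TypeError, ValueError):
        # TypeError: text is None; ValueError: no digits were left.
        return None


# The last entry is Russian for "Out of stock" and contains no digits.
prices = [parse_price(t) for t in ["199 990 ₸", None, "Нет в наличии"]]

# Round-trip Cyrillic text through a CSV file with an explicit encoding.
df = pd.DataFrame({"Full Name": ["Смартфон Example 128GB"], "Price": [199990.0]})
path = os.path.join(tempfile.mkdtemp(), "phones.csv")
df.to_csv(path, index=False, encoding="utf-8")
restored = pd.read_csv(path, encoding="utf-8")
```

Returning None instead of raising lets the scraping loop continue past a malformed product, and the missing value can then be handled during cleaning.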

Conclusion

Summary of Insights

• The smartphone market in Kazakhstan, as reflected on Technodom.kz, offers a wide range of
products across various price points.
• Brand Influence: Certain brands dominate the market in terms of product availability and
pricing strategies.
• Consumer Satisfaction: High average ratings suggest that consumers are generally satisfied with
their smartphone purchases.
• Price vs. Quality: The lack of correlation between price and rating indicates that consumers can
find quality smartphones at various price levels.

Recommendations

• For Retailers:
o Consider promoting mid-range smartphones with high ratings to attract cost-conscious
consumers.
o Leverage consumer reviews and ratings in marketing strategies to build trust.

• For Manufacturers:
o Focus on maintaining high product quality regardless of price point to enhance
consumer satisfaction.
o Explore opportunities in the budget segment without compromising on key features.

• For Future Research:
o Incorporate additional data such as sales figures or stock availability for a more
comprehensive market analysis.
o Analyze the impact of discounts and promotions on consumer purchasing behavior.
