
Discussion #2 Shahrzad Karbasi



Concordia University Chicago


MBA Program

Student: Shahrzad Karbasi


Shahrzad.krbc@gmail.com
Crf_karbass

Day Telephone: 0049 176 164 85048


Evening Telephone: 0049 176 164 85048

Assignment Title:
Date of Submission: 6/11/2024
Assignment Due Date: 6/11/2024

Course: Business Analytics

Section Number:
Semester: 3
Course Instructor:

Certification of Authorship: I certify that I am the author of this paper and that any assistance I
received in its preparation is fully acknowledged and disclosed in the paper. I also have cited any
sources from which I used data, ideas, or words, either quoted directly or paraphrased. I certify
that this paper was prepared by me specifically for the purpose of this assignment, as directed.

Student’s Signature: shahrzad karbasi

[Digital signature]

Shahrzad Karbasi

Concordia University Chicago

Business Analytics

Professor Farshad Badie

Discussion #2

Data management and analysis of college major survey responses


This analysis examines academic discipline-specific datasets collected from survey
respondents, focusing on how data management practices can improve the quality and
usability of survey results. Specifically, this paper details the data cleaning steps
undertaken to ensure accuracy and consistency, followed by an interpretation of
frequency distributions and descriptive statistics for several key variables. These efforts
provide insights into the diversity of educational backgrounds represented in the dataset
and highlight trends in respondent disciplines.

Data cleaning steps


Data cleaning is an essential step in preparing datasets for meaningful analysis. For this
project, the data cleaning process began by examining the variable “MAJOR1,” which
represents the respondents' chosen academic major. The first step was to standardize the
labels so that all major names were formatted consistently. This process
minimized inconsistencies that could lead to confusion or inaccuracy when interpreting
the data.
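
The label standardization described above can be sketched as follows. This is a minimal illustration, not the procedure actually used for this assignment: the function name `standardize_label`, the choice of title case, and the sample labels are all assumptions made for the example.

```python
import re


def standardize_label(raw: str) -> str:
    """Trim the label, collapse repeated whitespace, and apply title case."""
    collapsed = re.sub(r"\s+", " ", raw.strip())
    return collapsed.title()


# Hypothetical raw responses, not taken from the survey data.
raw_majors = [
    "  business administration",
    "Business  Administration",
    "NURSING",
    "nursing ",
]

clean_majors = [standardize_label(m) for m in raw_majors]

# After standardization, the four raw spellings reduce to two distinct labels.
print(sorted(set(clean_majors)))
```

Title case is only one possible convention; any rule works as long as it is applied to every label before counting.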

Next, handling missing data was a priority, as the dataset contained several reserved
codes for responses that were either unavailable or inapplicable. These codes included
"don't know", "no answer", "not applicable" and "rejected on the web". To focus on
meaningful and relevant responses, records carrying these codes were removed from the
analysis. In addition, I merged categories with a low number of responses. For
example, fields with fewer than five respondents, such as "optometry" and "gerontology,"
were grouped under broader labels such as "other professions" or similar terms. This
consolidation reduced the complexity of the dataset and allowed for more focused insights
into prominent areas of study.
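
The two steps above, dropping reserved codes and merging small categories, might look like the sketch below. The reserved codes and the threshold of five come from the text; the exact spelling of the codes in the raw file, the label "Other Professions", the function name `clean_and_merge`, and the sample responses are assumptions for illustration only.

```python
from collections import Counter

# Reserved codes named in the text; their exact spelling in the raw file is assumed.
RESERVED_CODES = {"don't know", "no answer", "not applicable", "rejected on the web"}
MIN_RESPONDENTS = 5                # threshold stated in the text
OTHER_LABEL = "Other Professions"  # hypothetical catch-all label


def clean_and_merge(responses):
    """Drop reserved codes, then fold majors with fewer than
    MIN_RESPONDENTS responses into a single catch-all category."""
    valid = [r for r in responses if r.lower() not in RESERVED_CODES]
    counts = Counter(valid)
    merged = Counter()
    for major, n in counts.items():
        key = major if n >= MIN_RESPONDENTS else OTHER_LABEL
        merged[key] += n
    return merged


# Hypothetical responses, not the survey data.
responses = (
    ["Nursing"] * 6
    + ["Optometry"] * 2
    + ["Gerontology"] * 1
    + ["No answer"] * 4
    + ["Don't know"] * 3
)

merged_counts = clean_and_merge(responses)
print(dict(merged_counts))
```

In this toy input, the seven reserved-code responses are dropped and the three optometry and gerontology responses end up in the catch-all category.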

Frequency distribution and descriptive statistics


With data cleaning completed, I calculated frequency distributions and descriptive
statistics, including mean, median, mode, and standard deviation, for three variables of
interest: "MAJOR1" (college major), "COUNT" (number of respondents in each major),
and "PCT" (percentage of respondents by discipline).

MAJOR1 (College major):


Because this variable is categorical, it has no meaningful mean, and with no natural
ordering of majors a median is not well defined either; the mode is the informative
summary. Business Administration was the most frequently selected major, with 221
respondents, making it the mode. Education and Nursing were the next most common fields
of study, indicating that many respondents come from practical, career-oriented
backgrounds. This reflects a trend toward fields with broad professional applications
and suggests that survey respondents may be primarily seeking education that directly
prepares them for specific occupations.
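
For a categorical variable such as MAJOR1, the frequency distribution and its mode can be obtained with a simple counter. The sketch below uses made-up responses rather than the survey data, and the variable names are illustrative.

```python
from collections import Counter

# Hypothetical cleaned responses, not the survey data.
majors = ["Business Administration"] * 5 + ["Education"] * 3 + ["Nursing"] * 2

freq = Counter(majors)

# most_common() sorts categories by descending frequency,
# so the first entry is the mode of the categorical variable.
mode_major, mode_count = freq.most_common(1)[0]

for major, n in freq.most_common():
    print(f"{major:<25} {n}")
print(f"Mode: {mode_major} ({mode_count} responses)")
```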

COUNT (number of respondents in each field):


The number of respondents in each field, excluding non-applicable responses, averaged
approximately 27.3. However, the median of 8 shows that many disciplines have relatively
low representation and only a few have high numbers. Business Administration, with 221
respondents, is the largest category (the modal major), which highlights the large gap
between this field and the others. A standard deviation of 47.8 further emphasizes this
variability, indicating a wide spread in the number of respondents across fields. This
spread shows that while the dataset includes a diverse range of majors, only a small
subset captures a significant number of respondents.
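
The gap between the mean and the median is typical of a right-skewed distribution, in which one very large category pulls the mean upward while the median stays low. The sketch below illustrates this with toy counts, not the survey counts. The helper `describe` is a hypothetical name, and using the sample standard deviation is an assumption, since the text does not say which form was used.

```python
import statistics


def describe(counts):
    """Return the basic descriptive statistics for a list of category counts."""
    return {
        "mean": statistics.mean(counts),
        "median": statistics.median(counts),
        "mode": statistics.mode(counts),
        # Sample standard deviation (n - 1 denominator); an assumption here.
        "stdev": statistics.stdev(counts),
    }


# Toy counts with one dominant category, not the survey counts.
toy_counts = [200, 40, 12, 8, 8, 6, 5]

summary = describe(toy_counts)
for name, value in summary.items():
    print(f"{name:>6}: {value:.1f}")
```

In this toy data the single large value lifts the mean to several times the median, and the standard deviation exceeds the mean, which is the same pattern reported above for COUNT.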

PCT (percentage of respondents by discipline):


The percentage of respondents in each discipline provides a clearer sense of
representation in the dataset. The average percentage per major is 1.2%, while the
median is 0.3%, with Business Administration accounting for the highest share at 14.8%.
Most other fields represent less than 1% of all responses. A standard deviation
of 2.8% indicates substantial variability in representation, with respondents concentrated
in only a few popular disciplines. The high percentage for Business Administration, in
particular, suggests that it appeals to a broad audience, possibly because of its versatility
and broad applicability to the workforce.
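
A percentage variable like PCT can be derived from the counts by dividing each count by the total. The sketch below assumes the denominator is the sum of the counts supplied; whether the reported percentages use all respondents or only valid responses is not stated in the text. The function name `percent_shares` and the counts are hypothetical.

```python
def percent_shares(counts: dict) -> dict:
    """Convert category counts to percentages of the total, rounded to one decimal."""
    total = sum(counts.values())
    return {major: round(100 * n / total, 1) for major, n in counts.items()}


# Toy counts, not the survey data.
toy_counts = {"Business Administration": 50, "Education": 30, "Nursing": 20}

shares = percent_shares(toy_counts)
for major, pct in shares.items():
    print(f"{major:<25} {pct:5.1f}%")
```

The choice of denominator matters: including missing responses in the total lowers every category's share.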

Overall, the data show a clear pattern across academic fields. Majors such as business
administration, education, and nursing are dominant, and most other majors have only
limited representation. This distribution suggests that respondents may be predominantly
from vocational and professional backgrounds with a strong emphasis on practical and
applicable skills. In addition, a significant portion of missing data—comprising 56.8% of
the data set—was effectively managed by removing these entries from the analysis,
allowing the study to focus on meaningful responses.

This analysis demonstrates how effective data management and cleaning practices
enhance the clarity and accuracy of survey data. By standardizing labels, consolidating
categories, and managing missing data, the dataset was refined to enable valuable
insights into trends in college majors among survey respondents. The frequency
distributions and descriptive statistics suggest a predominance of career-oriented majors,
with a skewed distribution favoring fields like Business Administration and Education.
These findings provide a foundational understanding of the educational diversity among
respondents and underscore the value of data cleaning in preparing datasets for accurate
and insightful analysis.
