Four Visualizations
This assignment is a case study of housing unit sales in King County, US, between 2014 and 2015. The
data were obtained from Kaggle, visualizations were created with Tableau 22.0, and their managerial
interpretations are discussed. The paper also gives a brief account of a recent development in
visualization, data democratization, given its relevance to the housing data obtained from Kaggle.
Some insights on the relevance of the visualization concepts to the UAE Ministry of Health are also
included.
For instance, the chosen dataset, house sales in King County, US, can help us uncover insights about
the housing market in that county of the United States. However, some aspects of the information
may not be easily understood without specialist knowledge, such as that held by some of the real
estate agents in the country.
Data democratization refers to enabling everyone in a company to be comfortable with data, so that
they can work with it, or at least interact with it, regardless of their technical knowledge of data
analysis. In the present dataset, data democratization is important because some details of the
housing sales in the county can be better understood by individuals who were involved in the
transactions.
According to Sil, Sharma, Jhamb, Marathe, and Sharma (2021), data democratization in multiple
contexts can help us gain a better understanding of the data, including the ways it is visualized or
interpreted as well as the insights it provides.
The first visualization focuses on the number of bedrooms sold per ZIP code. The student
constructed a heatmap to show which localities purchased the greatest number of bedrooms and, by
extension, houses. The heatmap visualization is shown below.
The variables in the visualization are:
‘SUM (Bedrooms)’ – the total number of bedrooms sold in the ZIP code, obtained by applying the SUM
function to the number of bedrooms of each house sold
Zip code – the ZIP code representing the locality in the county
The greater the number of bedrooms sold in a ZIP code, the darker the shade of blue of its box in
the heatmap. A later visualization covers the total price of the houses sold per ZIP code, but the
number of houses sold is not the same as their total price: some neighbourhoods may have a higher
sales volume of housing units, while others may have higher sales revenue from housing units.
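The SUM (Bedrooms) aggregation behind the heatmap can be sketched outside Tableau as a simple group-and-sum. The snippet below is a minimal illustration only: the sample rows are hypothetical, and the key names `zipcode` and `bedrooms` are assumed stand-ins for the dataset's columns.

```python
from collections import defaultdict

# Hypothetical sample rows; the real dataset has one row per house sale.
# The key names "zipcode" and "bedrooms" are assumptions for illustration.
sales = [
    {"zipcode": "98052", "bedrooms": 3},
    {"zipcode": "98052", "bedrooms": 4},
    {"zipcode": "98039", "bedrooms": 5},
    {"zipcode": "98006", "bedrooms": 2},
]

def sum_bedrooms_by_zip(rows):
    """Add up the bedrooms of every house sold in each ZIP code."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["zipcode"]] += row["bedrooms"]
    return dict(totals)

bedroom_totals = sum_bedrooms_by_zip(sales)
print(bedroom_totals)
```

Each total would correspond to one box in the heatmap, with larger totals drawn in a darker shade.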
The second visualization compares two variables, ‘Yr_built’ and ‘SUM(Price)’. The definitions of the
two variables are as follows:
Yr_built – the year in which the house was originally built (not renovated)
SUM (Price) – the total sale price of all houses sold that were built in the respective year,
obtained by applying the SUM function to the price of each house
The illustration is a line graph with a trend line showing how the year of construction relates to
house prices, or what customers are willing to pay for them. The purpose of the graph is to reveal
whether there is any trend in how houses are priced according to the year in which they were
built.
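The points plotted on such a line graph can be approximated by grouping sales by year of construction and summing their prices. This is a rough sketch under stated assumptions: the rows are invented for illustration, and `yr_built` and `price` are assumed key names.

```python
from collections import defaultdict

# Hypothetical sample rows; "yr_built" and "price" are assumed key names.
sales = [
    {"yr_built": 1995, "price": 300000},
    {"yr_built": 2014, "price": 650000},
    {"yr_built": 1995, "price": 450000},
    {"yr_built": 2015, "price": 500000},
]

def sum_price_by_year_built(rows):
    """Return (year, total price) pairs sorted by year, ready for a line graph."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["yr_built"]] += row["price"]
    return sorted(totals.items())

line_points = sum_price_by_year_built(sales)
for year, total in line_points:
    print(year, total)
```

Because this is a sum, each point reflects both how many houses from that build year were sold and how much each one sold for.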
Third visualization (sum price of houses sold by ZIP code)
The third visualization is about the sum price of houses sold, classified by ZIP code. The three
variables included here were ‘MEDIAN (Price)’, ‘SUM (Price)’, and Zip code. The definitions of the
variables are given below:
SUM (Price) – the sum of the prices of all the houses sold in a ZIP code (in this visualization)
Zip code – the ZIP code for the locality in King County
Since house sales bring profits to the real estate companies in the county, the visualization uses
the automatic green colour palette. The higher the sum price of the houses sold in a ZIP code or
locality, the greener its box in the heatmap. For instance, the 98004 ZIP code is greener than the
98039 ZIP code because the sum price for 98004 is approximately $429 million, which is greater than
the $108 million for 98039.
The fourth and final visualization again considers two variables, ‘Zip code’ and ‘MEDIAN (Price)’,
defined as follows:
Zip code – the ZIP code for the locality in King County
MEDIAN (Price) – the median price of the houses sold in that locality or ZIP code
The difference between the third and fourth visualizations is that the third is a heatmap of the
ZIP codes or localities with the highest SUM (Price), that is, the most profitable ones, while the
fourth focuses on the median price of the houses sold per locality or ZIP code.
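The distinction between SUM (Price) and MEDIAN (Price) can be illustrated with a small sketch. The two localities and their prices below are entirely hypothetical (the labels `ZIP_A` and `ZIP_B` are placeholders, not real ZIP codes); they are chosen only to show that the locality with the highest total need not have the highest median.

```python
from collections import defaultdict
from statistics import median

# Hypothetical sales: ZIP_A has many mid-priced sales, ZIP_B a few expensive ones.
sales = [
    {"zipcode": "ZIP_A", "price": 400000},
    {"zipcode": "ZIP_A", "price": 450000},
    {"zipcode": "ZIP_A", "price": 500000},
    {"zipcode": "ZIP_A", "price": 550000},
    {"zipcode": "ZIP_A", "price": 600000},
    {"zipcode": "ZIP_B", "price": 900000},
    {"zipcode": "ZIP_B", "price": 1100000},
]

def summarize_price_by_zip(rows):
    """Compute the sum and the median of sale prices for each ZIP code."""
    prices = defaultdict(list)
    for row in rows:
        prices[row["zipcode"]].append(row["price"])
    return {z: {"sum": sum(p), "median": median(p)} for z, p in prices.items()}

summary = summarize_price_by_zip(sales)
highest_sum = max(summary, key=lambda z: summary[z]["sum"])
highest_median = max(summary, key=lambda z: summary[z]["median"])
print(highest_sum, highest_median)
```

In this made-up data the first locality leads on total revenue while the second leads on median price, which is the kind of difference the two visualizations are meant to separate.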
The visualization dashboard combines the four visualizations: the number of bedrooms sold per ZIP
code, the sum price of the houses sold per ZIP code, the median price of houses sold by ZIP code,
and the price of houses sold versus the year of construction.
The first visualization, which examines the number of bedrooms sold per ZIP code, indicates that ZIP
codes 98052, 98038, and 98006 are among the localities with the highest sales volume (in terms of
the number of bedrooms sold), with 2,076, 2,072, and 1,913 bedrooms respectively. It appears that
residents in those localities may be purchasing more properties, so marketing campaigns could be
directed at prospective buyers in those areas. In contrast, ZIP codes 98148 and 98039 had only 179
and 203 bedrooms sold. This can be interpreted as those localities having few buyers for housing
properties, and thus any further construction there should be halted or discontinued to save capital
that may be reallocated elsewhere.
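The ranking described above can be reproduced with a short sort. The sketch below uses only the five bedroom totals quoted in this section, not the full set of ZIP codes in the dataset, so it illustrates the ranking step rather than the complete result.

```python
# Bedroom totals quoted in the text for five ZIP codes (not the full dataset).
bedrooms_sold = {
    "98052": 2076,
    "98038": 2072,
    "98006": 1913,
    "98148": 179,
    "98039": 203,
}

# Sort ZIP codes from the most to the fewest bedrooms sold.
ranked = sorted(bedrooms_sold.items(), key=lambda item: item[1], reverse=True)

top_three = ranked[:3]    # candidates for targeted marketing
bottom_two = ranked[-2:]  # localities with the lowest volume

print(top_three)
print(bottom_two)
```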
The second visualization helps us understand how the year of construction affects the price for
which a house may be sold. The illustration shows that although there is an upward trend, houses
built in specific years may sell for very little in total. For instance, the houses built in 2014
sold cumulatively for $382 million, but houses built in 2015 sold cumulatively for only $28.87
million. It is nevertheless clear that the houses constructed in the 1990s sold for much less. The
managerial takeaway is that the company should not purchase older residences and properties with
the intention of renovating and reselling them, for two reasons: (1) the additional costs of
renovating older properties, and (2) the overall drop in price due to the age of the properties.
The third visualization shows the cumulative sales value of houses sold per ZIP code. Its purpose is
to help us as managers analyse and understand where the total sales revenue from housing units was
highest. ZIP codes 98004, 98006, and 98052 were clearly among the highest-grossing neighbourhoods,
as the company generated its highest sales there. A connection can be drawn with the first
visualization, where neighbourhoods 98006 and 98052 also accounted for some of the highest numbers
of bedrooms sold. This further confirms that those ZIP codes should be targeted by the company for
further property development, because buyers there have both the capital and the willingness to
purchase houses.
Finally, the fourth visualization shows the median price of the houses sold per ZIP code. It was
created to help us understand which localities have the wealthiest customers. The median price was
highest, at $1.892 million, in ZIP code 98039, though that is a statistical outlier. Nonetheless,
ZIP codes 98004 and 98040 also had high median prices of $1.15 million and $993,750 respectively,
indicating that wealthier customers live in those ZIP codes; hence, the company could consider
investing in luxury residential properties targeted at a relatively small market segment with
adequate capital to purchase those housing units. Across the remaining ZIP codes, the median price
ranges from $235,000 in ZIP code 98002 to $915,000 in ZIP code 98112, which gives an idea of what
housing prices in the county look like and the range within which most properties can be priced.