SpaceY Data Analytics Final Presentation DJ
Outline
• Executive Summary
• Introduction
• Methodology
• Results
• Conclusion
Executive Summary
• Summary of methodologies
In this exercise, the team used multiple methodologies to acquire data and analyze the factors behind
successful rocket landings by SpaceX.
• Collect: Data acquired from the SpaceX REST API and by web scraping Wikipedia
• Wrangle: Convert raw data to create a usable outcome variable
• Explore: Data visualization techniques to explore trends, considering factors such as payload, launch site, and yearly
trends
• Analyze: Analyzing the data with SQLite
• Geographic Mapping: Geographically visualize launch site success rates and proximity to geographical features
using Folium
• Dashboard: An interactive dashboard showing the launch sites with the most successes and the most successful payload ranges
• Build Models: Machine learning to predict landing outcomes using logistic regression, support vector machine
(SVM), decision tree, and K-nearest neighbors (KNN)
Executive Summary
• Visualization
• Highest performing payload mass is between 2,000 and 6,000 kg
• SpaceX has chosen sites near the equator. Other factors, such as proximity to a main road and a railway, were also important
• Being close to a coast leaves room for failed landings
• The most successful site is KSC LC-39A, with a success rate of 77%
• Predictive Analytics
• The decision tree classifier was the best-performing learning algorithm for the data available
Introduction
• SpaceX, officially known as Space Exploration Technologies Corp., is a
revolutionary American aerospace company founded by Elon Musk in
2002.
• Key Accomplishments:
• Reusable Rockets: SpaceX has successfully developed and implemented reusable rockets like the Falcon 9
and Falcon Heavy, significantly reducing launch costs.
• Human Spaceflight: SpaceX achieved the milestone of sending astronauts to the International Space
Station (ISS) with its Dragon spacecraft, marking the first time a private company has transported humans
to orbit.
• Future Outlook:
• SpaceX continues to drive the future of space exploration with its
ambitious projects like Starship and Starlink. The company's
innovative approach and commitment to reducing costs have
positioned it as a major player in the global space industry, shaping
the future of space travel and exploration.
Objectives of the Analysis
• Understand key considerations for the success of SpaceX
Methodology
Executive Summary
• Data collection methodology:
• Data were collected mainly using two methods: web scraping of Wikipedia and the SpaceX REST API.
• SpaceX REST API:
1. Identify the REST API supported URL
2. Call the GET request
3. Check if the status code is 200
4. data = pd.json_normalize(response.json())
• Web scraping:
1. html_data = requests.get(static_url), then check html_data.status_code
2. Read the HTML data using a Beautiful Soup object: soup = BeautifulSoup(html_data.text)
3. html_tables = soup.find_all('table')
4. Add column names to a DataFrame
5. Write to the DataFrame for each row found with find_all on tr and td
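The REST API branch of the flow above can be sketched as follows. This is a minimal illustration rather than the project's actual notebook: the helper `launches_to_frame` and the sample payload with its field names are hypothetical, and the HTTP call itself is left out so the snippet runs offline. It assumes pandas is installed.

```python
import pandas as pd


def launches_to_frame(status_code, payload):
    """Flatten a decoded JSON payload into a DataFrame if the request succeeded."""
    # Mirror the "check if the status code is 200" step before using the data.
    if status_code != 200:
        raise RuntimeError(f"request failed with status {status_code}")
    # Nested dictionaries become dotted column names such as "rocket.name".
    return pd.json_normalize(payload)


# Hypothetical sample shaped like a nested API response (not real SpaceX data).
sample_payload = [
    {"flight_number": 1, "rocket": {"name": "Falcon 9"}, "payload": {"mass_kg": 500}},
    {"flight_number": 2, "rocket": {"name": "Falcon 9"}, "payload": {"mass_kg": 677}},
]

launch_df = launches_to_frame(200, sample_payload)
print(launch_df.columns.tolist())
```

In a live run, `status_code` and `payload` would come from the response object returned by the GET request.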
Full code can be found here
Data Wrangling
• Key lines of Code
import pandas as pd  # Import pandas
import numpy as np
df = pd.read_csv(csv_file_path)
df.isnull().sum() / len(df) * 100
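The key lines above can be extended into a small wrangling sketch that also derives the outcome variable mentioned in the summary. The sample rows, the column names `PayloadMass`, `Outcome`, and `Class`, and the rule that an outcome starting with "True" counts as a successful landing are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows; column names and outcome labels are assumptions.
df = pd.DataFrame(
    {
        "PayloadMass": [500.0, np.nan, 677.0, 3170.0],
        "Outcome": ["None None", "False Ocean", "True ASDS", "True RTLS"],
    }
)

# Percentage of missing values per column, as in the key lines above.
missing_pct = df.isnull().sum() / len(df) * 100

# Assumed rule: an outcome starting with "True" is a successful landing (1), else 0.
df["Class"] = np.where(df["Outcome"].str.startswith("True"), 1, 0)

print(missing_pct)
print(df["Class"].tolist())
```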
• SQL queries performed:
• %sql create table SPACEXTABLE as select * from SPACEXTBL where Date is not null
• %sql SELECT AVG(PAYLOAD_MASS__KG_) FROM SPACEXTBL WHERE "Booster_Version" LIKE 'F9 v1.1%'
EDA with SQL
• %sql SELECT DISTINCT "Booster_Version" FROM SPACEXTBL WHERE "PAYLOAD_MASS__KG_" BETWEEN 4000 AND 6000
• %sql SELECT "Mission_Outcome", COUNT("Mission_Outcome") AS "Events" FROM SPACEXTBL WHERE "Date" BETWEEN '2010-06-04' AND '2017-03-20' GROUP BY "Mission_Outcome" ORDER BY "Events" DESC
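The two queries above can be tried outside the notebook with Python's built-in sqlite3 module. The sketch below builds an in-memory table whose rows, booster names, and dates are hypothetical and chosen only so that both queries return something; only the table and column names are taken from the slides.

```python
import sqlite3

# In-memory table with hypothetical rows; values are illustrative, not real launch data.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE SPACEXTBL ("Date" TEXT, "Booster_Version" TEXT, '
    '"PAYLOAD_MASS__KG_" INTEGER, "Mission_Outcome" TEXT)'
)
rows = [
    ("2012-05-22", "BOOSTER A", 525, "Success"),
    ("2016-05-06", "BOOSTER B", 4696, "Success"),
    ("2016-08-14", "BOOSTER C", 4600, "Success"),
    ("2015-06-28", "BOOSTER D", 1952, "Failure (in flight)"),
    ("2018-01-01", "BOOSTER E", 5000, "Success"),
]
conn.executemany("INSERT INTO SPACEXTBL VALUES (?, ?, ?, ?)", rows)

# Distinct booster versions with a payload between 4000 and 6000 kg.
boosters = [
    r[0]
    for r in conn.execute(
        'SELECT DISTINCT "Booster_Version" FROM SPACEXTBL '
        'WHERE "PAYLOAD_MASS__KG_" BETWEEN ? AND ? ORDER BY "Booster_Version"',
        (4000, 6000),
    )
]

# Mission outcomes counted within a date range, most frequent first.
outcomes = conn.execute(
    'SELECT "Mission_Outcome", COUNT("Mission_Outcome") AS "Events" '
    'FROM SPACEXTBL WHERE "Date" BETWEEN ? AND ? '
    'GROUP BY "Mission_Outcome" ORDER BY "Events" DESC',
    ("2010-06-04", "2017-03-20"),
).fetchall()

print(boosters)
print(outcomes)
conn.close()
```

Passing the bounds as parameters avoids quoting problems with string literals. Dates stored as ISO-formatted text compare correctly with BETWEEN because their text order matches chronological order.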
Build a Dashboard with Plotly Dash
Predictive Analysis (Classification)
Section 2
Flight Number vs. Launch Site
Payload vs. Launch Site
SpaceX has launched very heavy payloads (~15,000 kg) with a high success rate.
Most launches around 7,000 kg were successful.
The VAFB SLC 4E site was not used for launches above 10,000 kg.
Success Rate vs. Orbit Type
• We can observe a 100% success rate for the ES-L1, GEO, HEO, and SSO orbit types
Flight Number vs. Orbit Type
Payload vs. Orbit Type
Launch Success Yearly Trend
All Launch Site Names
• SPACEXTBL contains all the launches, so we need the distinct values in the "Launch_Site" column to see the names of all sites.
• The query that returns this result is below.
• %sql SELECT DISTINCT "Launch_Site" FROM SPACEXTBL
Launch Site Names Begin with 'CCA'
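• The query and part of its result are as follows.
• %sql SELECT * FROM SPACEXTBL WHERE "Launch_Site" LIKE 'CCA%' LIMIT 5
Date | Time | Booster version | Launch site | Payload | Payload mass (kg) | Orbit | Customer | Mission outcome | Landing outcome
2012-05-22 | 7:44:00 | F9 v1.0 B0005 | CCAFS LC-40 | Dragon demo flight C2 | 525 | LEO (ISS) | NASA (COTS) | Success | No attempt
2012-10-08 | 0:35:00 | F9 v1.0 B0006 | CCAFS LC-40 | SpaceX CRS-1 | 500 | LEO (ISS) | NASA (CRS) | Success | No attempt
2013-03-01 | 15:10:00 | F9 v1.0 B0007 | CCAFS LC-40 | SpaceX CRS-2 | 677 | LEO (ISS) | NASA (CRS) | Success | No attempt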
Total Payload Mass
• The total payload mass for the customer NASA (CRS) missions is 45,596 kg
• The query is as follows
• %sql SELECT SUM(PAYLOAD_MASS__KG_) FROM SPACEXTBL WHERE CUSTOMER = 'NASA (CRS)'
Average Payload Mass by F9 v1.1
First Successful Ground Landing Date
Successful Drone Ship Landing with Payload between 4000 and 6000
• Boosters that successfully landed on a drone ship with a payload mass greater than 4000 kg but less than 6000 kg:
Booster_Version
F9 FT B1022
F9 FT B1026
F9 FT B1021.2
F9 FT B1031.2
• SQL query is as below.
Total Number of Successful and Failure Mission Outcomes
Boosters Carried Maximum Payload
5. F9 B5 B1048.5
6. F9 B5 B1051.4
7. F9 B5 B1049.5
8. F9 B5 B1060.2
9. F9 B5 B1058.3
10. F9 B5 B1051.6
11. F9 B5 B1060.3
12. F9 B5 B1049.7
2015 Launch Records
Rank Landing Outcomes Between 2010-06-04 and 2017-03-20
Section 3
SpaceX Launch Sites
Key considerations in selecting a space
launch site are listed below. They can be
observed from the map here and from the other
assessments below.
1. Proximity to the Equator:
Earth's rotational velocity gives an initial boost to a rocket
launched eastward. This boost is greatest near the equator,
where the Earth's rotational speed is highest. Orbital inclination
also matters: launching closer to the equator allows easier
access to a wider range of orbital inclinations, including
geostationary orbits.
2. Downrange Safety:
Launch sites are typically located in remote areas with minimal
population density to minimize the risk of casualties in case of
launch failures. Many launch sites are also situated near
large bodies of water so that falling debris lands in a safe,
unpopulated area.
3. Infrastructure and Accessibility:
Transportation: Good transportation links are essential for
transporting rocket components, fuel, and personnel to the
launch site.
Support Facilities: Adequate infrastructure, including power,
water, and communication systems, is necessary for the
operation of the launch site.
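The equatorial boost in point 1 can be made concrete with a rough calculation. The sketch below treats the Earth as a sphere, so the eastward surface speed at a given latitude is the angular velocity times the radius times the cosine of the latitude, which gives about 465 m/s at the equator. The function name and the rounded latitudes are assumptions chosen for illustration, not figures from the analysis.

```python
import math

EARTH_EQUATORIAL_RADIUS_M = 6_378_137.0  # WGS 84 equatorial radius
SIDEREAL_DAY_S = 86_164.1  # one full rotation relative to the stars


def surface_speed_m_s(latitude_deg):
    """Eastward speed of a point on the surface, treating Earth as a sphere."""
    omega = 2 * math.pi / SIDEREAL_DAY_S  # angular velocity in rad/s
    return omega * EARTH_EQUATORIAL_RADIUS_M * math.cos(math.radians(latitude_deg))


# Approximate latitudes, rounded for illustration only.
for name, lat in [("Equator", 0.0), ("Florida sites", 28.5), ("California site", 34.6)]:
    print(f"{name}: {surface_speed_m_s(lat):.0f} m/s")
```

The lower the latitude, the larger the share of orbital velocity the launch site contributes before the engines do any work.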
Launch Outcomes
• 10 of 13 launches (77%) from KSC LC-39A were successful
KSC LC-39A Site Success Rate
Impact on Payload Mass to Launch Success
Confusion Matrix of Decision Tree Classifier
• A confusion matrix summarizes the performance of a
classification algorithm
• The presence of false positives (Type I errors) is a weakness: it reduces the precision and F1 score
• The confusion matrix outputs for the decision tree
classifier are as below:
• 12 true positives
• 2 true negatives
• 4 false positives
• 0 false negatives
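The effect of the false positives on precision and F1 can be checked directly from the four counts above. The sketch below uses the standard definitions of accuracy, precision, recall, and F1; the variable names are illustrative.

```python
# Counts reported above for the decision tree classifier.
tp, tn, fp, fn = 12, 2, 4, 0

accuracy = (tp + tn) / (tp + tn + fp + fn)  # share of all predictions that were correct
precision = tp / (tp + fp)  # share of predicted successes that were real
recall = tp / (tp + fn)  # share of real successes that were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of precision and recall

print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
# prints: accuracy=0.778 precision=0.750 recall=1.000 f1=0.857
```

With no false negatives, recall is perfect, so the four false positives are the only thing holding precision and F1 below 1.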
Conclusions