Data analysis with Python

The document outlines a project analyzing SpaceX launch data to understand cost reduction through reusability strategies. It details the methodology for data collection, processing, and analysis, including the development of predictive models and interactive dashboards. The expected outcome is a machine learning model predicting the reusability of the first stage of rockets, aimed at supporting decision-making for Space Y.

Uploaded by

Hoài An Nguyen


Nguyen Hoai An

November, 2024
Outline

• Executive Summary
• Introduction
• Methodology
• Results
• Conclusion
• Appendix

Executive Summary

• Methodology
  • Data collection on SpaceX launches and reusability success rates.
  • Application of data science techniques to build and evaluate predictive models.
  • Use of visual dashboards to present insights and support business decisions.
• Expected Outcomes
  • A machine learning model that predicts the reusability of the first stage.

Introduction

The project analyzes SpaceX launch records to understand:

• How has SpaceX succeeded in reducing launch costs?
• What impact does the reusable first stage have on lowering SpaceX’s launch costs?
• How does SpaceX’s reusability strategy provide a competitive advantage in the commercial and scientific sectors?
• Which factors have led to cost reduction in space travel, and how are companies achieving it?
Project goal: Development of visual dashboards to support decision-making and strategy formulation at Space Y.

Section 1

Methodology

Executive Summary
• Data collection methodology:
  • Describe how data was collected
• Perform data wrangling
  • Describe how data was processed
• Perform exploratory data analysis (EDA) using visualization and SQL
• Perform interactive visual analytics using Folium and Plotly Dash
• Perform predictive analysis using classification models
  • How to build, tune, evaluate classification models

Data Collection – Scraping - Wrangling
https://github.com/ViviAhn/Python_Capstone/blob/main/spacex_API.ipynb
https://github.com/ViviAhn/Python_Capstone/blob/main/Webscraping.ipynb

Objective: Collect and analyze SpaceX past launch data to predict rocket landing outcomes.

1. Understanding the SpaceX REST API
• Step 1: Accessing API Data
  • Target API Endpoint: api.spacexdata.com/v4/launches/past
  • Use a GET request with the requests library to retrieve data.
  • Data includes details on rocket types, payloads, and landing outcomes.

2. Data Normalization and Transformation
• Step 2: Convert JSON Data
  • Use json_normalize to flatten JSON data into a table format.
• Step 3: Gather Additional Data
  • Make further API calls to collect specific details like Booster and Launchpad information if needed.

3. Data Cleaning and Filtering
• Step 4: Filter by Rocket Type
  • Remove Falcon 1 launches, focusing analysis on Falcon 9 only.
• Step 5: Handle NULL Values
  • For PayloadMass, replace NULL values with the column's mean.
  • Keep NULLs in the LandingPad column unchanged.
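Steps 2, 4, and 5 above can be sketched as follows. This is a minimal illustration, assuming pandas is installed. The records in `sample_launches` are made-up stand-ins for an API response (no network call is made), and the nested field names are assumptions; only the column names `PayloadMass` and `LandingPad` come from the slide.

```python
import pandas as pd

# Hypothetical records shaped like a nested JSON response
# (illustrative values only, not real SpaceX data).
sample_launches = [
    {"flight_number": 1, "rocket": {"name": "Falcon 9"},
     "payload": {"mass_kg": 500.0}, "landing_pad": None},
    {"flight_number": 2, "rocket": {"name": "Falcon 9"},
     "payload": {"mass_kg": None}, "landing_pad": "pad-A"},
    {"flight_number": 3, "rocket": {"name": "Falcon 1"},
     "payload": {"mass_kg": 20.0}, "landing_pad": None},
]

# Step 2: flatten the nested JSON into a flat table.
df = pd.json_normalize(sample_launches).rename(columns={
    "rocket.name": "BoosterVersion",
    "payload.mass_kg": "PayloadMass",
    "landing_pad": "LandingPad",
})
df["PayloadMass"] = pd.to_numeric(df["PayloadMass"], errors="coerce")

# Step 4: drop Falcon 1 launches, keeping Falcon 9 only.
falcon9 = df[df["BoosterVersion"] != "Falcon 1"].copy()

# Step 5: fill missing PayloadMass with the column mean;
# LandingPad NULLs are deliberately left untouched.
mean_mass = falcon9["PayloadMass"].mean()
falcon9["PayloadMass"] = falcon9["PayloadMass"].fillna(mean_mass)
```

Computing the mean after the Falcon 9 filter keeps Falcon 1 payloads from influencing the imputed value.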
Data Collection – Scraping - Wrangling (process overview)

1. API URLs
2. Data Retrieval
3. Data Filtering and Transformation
4. Store all data in DataFrame
SpaceX API - Data Collection
https://github.com/ViviAhn/Python_Capstone/blob/main/spacex_API.ipynb

• API URLs: spacex_url, static_json_url
• Data Retrieval: requests.get; the response is converted to a DataFrame.
• Data Filtering and Transformation:
  • Obtain Rocket ID → SpaceX API → Retrieve Rocket Info → Append `BoosterVersion`
  • Obtain Launchpad ID → SpaceX API → Retrieve Launchpad Info → Append `Longitude`, `Latitude`, `LaunchSite`
  • Obtain Payload ID → SpaceX API → Retrieve Payload Info → Append `PayloadMass`, `Orbit`
  • Obtain Core ID → SpaceX API → Retrieve Core Info → Append core details
• Store all data in DataFrame
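The ID-lookup pattern in the flow above (obtain an ID, call the API, append a field) might look like the sketch below for the rocket branch. The function name `collect_booster_versions`, the injected `fetch_json` callable, and the `/v4/rockets/<id>` URL pattern are assumptions made for illustration; the notebook's own helpers may differ. An offline stand-in replaces the real HTTP call so the example runs without network access.

```python
from typing import Callable, Dict, List


def collect_booster_versions(
    rocket_ids: List[str], fetch_json: Callable[[str], Dict]
) -> List[str]:
    """Look up each rocket ID and collect its name as BoosterVersion."""
    booster_versions = []
    for rocket_id in rocket_ids:
        # Assumed URL pattern, modeled on the v4 endpoint named in the slides.
        url = f"https://api.spacexdata.com/v4/rockets/{rocket_id}"
        booster_versions.append(fetch_json(url)["name"])
    return booster_versions


# Offline stand-in for `lambda url: requests.get(url).json()`.
# The IDs and names here are made up for the example.
FAKE_RESPONSES = {"id-a": {"name": "Falcon 1"}, "id-b": {"name": "Falcon 9"}}


def fake_fetch(url: str) -> Dict:
    return FAKE_RESPONSES[url.rsplit("/", 1)[-1]]


booster_column = collect_booster_versions(["id-a", "id-b", "id-b"], fake_fetch)
```

Passing the fetch function in as a parameter keeps the lookup logic testable; with the requests library installed, the real call would be supplied in place of `fake_fetch`.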
SpaceX API - Data Wrangling
https://github.com/ViviAhn/Python_Capstone/blob/main/spacex_API.ipynb

• Filter Data for Single-Core and Payload Launches: Keep only launches with exactly one core and one payload.
• Map IDs: Flatten nested lists for core and payload fields.
• Date Conversion and Filtering: Convert date_utc to datetime and keep only launches up to November 13, 2020.
• Create Launch Dictionary: Combine lists into a structured dictionary for each flight.
• Handle Null Values: Calculate the mean of PayloadMass, fill missing values, and verify completeness.
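The first three wrangling steps can be sketched with pandas as follows. The three rows in `raw` are invented for illustration, and treating the cutoff as inclusive (`<=`) is an assumption; the column names `cores`, `payloads`, and `date_utc` follow the fields named on the slide.

```python
import datetime

import pandas as pd

# Made-up rows shaped like the raw launch records.
raw = pd.DataFrame({
    "cores": [[{"core": "c1"}], [{"core": "c2"}, {"core": "c3"}], [{"core": "c4"}]],
    "payloads": [["p1"], ["p2"], ["p3"]],
    "date_utc": [
        "2020-01-07T02:19:00.000Z",
        "2019-06-25T06:30:00.000Z",
        "2020-12-06T16:17:00.000Z",
    ],
})

# Keep only launches with exactly one core and one payload.
one_each = (raw["cores"].map(len) == 1) & (raw["payloads"].map(len) == 1)
single = raw[one_each].copy()

# Flatten the single-element lists into plain values.
single["cores"] = single["cores"].map(lambda items: items[0])
single["payloads"] = single["payloads"].map(lambda items: items[0])

# Convert date_utc to a date and apply the cutoff (assumed inclusive).
single["date"] = pd.to_datetime(single["date_utc"]).dt.date
single = single[single["date"] <= datetime.date(2020, 11, 13)]
```

Here the second row is dropped for having two cores and the third for falling after the cutoff, leaving one launch.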
Web Scraping Process – Data Wrangling
https://github.com/ViviAhn/Python_Capstone/blob/main/Webscraping.ipynb

• Web Scraping Process
  • Static URL Definition
  • Download and Parse HTML
  • Finding Tables and Columns
  • Extracting Launch Data
• Data Wrangling Process
  • Convert extracted data dictionary to a pandas DataFrame.
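A rough sketch of the table-to-dictionary-to-DataFrame path described above. To stay self-contained it parses an inline HTML string with the standard library's `html.parser` rather than downloading a page; the notebook may well use a different parsing library. The class name `TableParser` and the sample table contents are hypothetical.

```python
from html.parser import HTMLParser

import pandas as pd


class TableParser(HTMLParser):
    """Collect header names and row cells from a simple HTML table."""

    def __init__(self):
        super().__init__()
        self.headers = []
        self.rows = []
        self._row = []
        self._cell_tag = None
        self._text = ""

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("th", "td"):
            self._cell_tag = tag
            self._text = ""

    def handle_data(self, data):
        if self._cell_tag is not None:
            self._text += data

    def handle_endtag(self, tag):
        if tag in ("th", "td") and tag == self._cell_tag:
            target = self.headers if tag == "th" else self._row
            target.append(self._text.strip())
            self._cell_tag = None
        elif tag == "tr" and self._row:
            self.rows.append(self._row)


# Made-up table standing in for the downloaded page.
sample_html = """
<table>
  <tr><th>Flight No.</th><th>Launch site</th><th>Payload</th></tr>
  <tr><td>1</td><td>Site A</td><td>Demo payload A</td></tr>
  <tr><td>2</td><td>Site A</td><td>Demo payload B</td></tr>
</table>
"""

parser = TableParser()
parser.feed(sample_html)

# Build a column-oriented dictionary, then convert it to a DataFrame.
launch_dict = {
    header: [row[i] for row in parser.rows]
    for i, header in enumerate(parser.headers)
}
scraped_df = pd.DataFrame(launch_dict)
```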
Exploratory Data Analysis
https://github.com/ViviAhn/Python_Capstone/blob/main/EDA%20with%20SQL.ipynb

Objective: Highlight the importance of EDA in uncovering insights from data, forming the foundation for data science projects.

1. Understanding Exploratory Data Analysis
• Step 1: EDA Overview
  • Analyze datasets to summarize main characteristics, focusing on visualizations.
  • Gain an initial understanding of the data before further analysis.
• Step 2: Engage in EDA Labs
  • Use a database to perform EDA.
  • Objective: Explore data to predict the success of Falcon 9’s first-stage landings.

2. Key Features for Prediction
• Step 3: Identify Critical Attributes
  • Recognize essential attributes such as:
    • Launch number
    • Launch site (e.g., different sites have varied success rates).
• Step 4: Feature Combination for Accuracy
  • Combine features to enhance prediction accuracy.
  • Example: At site CCAFS LC-40, the success rate is 60%, but rises to 100% when mass exceeds 10,000 kg.

3. Preparing Data for Machine Learning
• Step 5: Convert Categorical Variables
  • Apply one-hot encoding to convert categorical variables for model compatibility.
• Step 6: Identify Correlated Attributes
  • Determine which attributes correlate with successful landings to boost predictive model accuracy.
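Step 5 (one-hot encoding) can be illustrated with `pandas.get_dummies`. The three-row `features` frame below is made up for the example, and the choice of `LaunchSite` and `Orbit` as the categorical columns is an assumption based on the attributes named elsewhere in the deck.

```python
import pandas as pd

# Made-up feature rows; only the column names echo the slides.
features = pd.DataFrame({
    "FlightNumber": [1, 2, 3],
    "LaunchSite": ["CCAFS SLC-40", "KSC LC-39A", "CCAFS SLC-40"],
    "Orbit": ["LEO", "GTO", "ISS"],
})

# Each category becomes its own 0/1 column; numeric columns pass through.
encoded = pd.get_dummies(features, columns=["LaunchSite", "Orbit"], dtype=float)
```

The result has one numeric column plus one indicator column per distinct launch site and orbit, which is the form most scikit-learn classifiers expect.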
EDA with SQL
https://github.com/ViviAhn/Python_Capstone/blob/main/EDA%20with%20SQL.ipynb

1. Display Unique Launch Sites:
• SELECT DISTINCT Launch_Site FROM SPACEXTABLE;
• This query selects the distinct values from the Launch_Site column.
2. Display Launches from Launch Sites Starting with 'CCA' (Top 5):
• SELECT * FROM SPACEXTABLE WHERE Launch_Site LIKE 'CCA%' LIMIT 5;
• This query selects all columns from SPACEXTABLE where the Launch_Site starts with "CCA" and limits the results to the first 5 rows.
3. Total Payload Mass for NASA (CRS) Launches:
• SELECT SUM(PAYLOAD_MASS__KG_) AS Total_Payload_Mass FROM SPACEXTABLE WHERE Customer = 'NASA (CRS)';
• This query calculates the sum of PAYLOAD_MASS__KG_ for launches where the Customer is "NASA (CRS)".
4. Average Payload Mass for F9 v1.1 Boosters:
• SELECT AVG(PAYLOAD_MASS__KG_) AS Average_Payload_Mass FROM SPACEXTABLE WHERE Booster_Version = 'F9 v1.1';
• This query calculates the average of PAYLOAD_MASS__KG_ for launches with the Booster_Version "F9 v1.1".
EDA with SQL

5. Date of First Successful Ground Pad Landing:
• SELECT MIN(Date) AS First_Successful_Landing_Date FROM SPACEXTABLE WHERE Landing_Outcome = 'Success (ground pad)';
• This query finds the earliest date (MIN(Date)) among launches whose landing succeeded on a ground pad (Landing_Outcome = 'Success (ground pad)'). Filtering on Mission_Outcome = 'Success' instead would return the date of the first successful mission, not the first ground pad landing.

6. Booster Versions with Successful Drone Ship Landings (4000 kg < Payload Mass < 6000 kg):
• SELECT Booster_Version FROM SPACEXTABLE WHERE Landing_Outcome = 'Success (drone ship)' AND PAYLOAD_MASS__KG_ > 4000 AND PAYLOAD_MASS__KG_ < 6000;
• This query selects Booster_Version for successful drone ship landings (Landing_Outcome = 'Success (drone ship)') with payload mass between 4000 kg and 6000 kg.

7. Total Successful and Failed Missions (Two approaches):
• Approach 1: SELECT Mission_Outcome, COUNT(*) AS Count FROM SPACEXTABLE GROUP BY Mission_Outcome ORDER BY Count DESC;
  This query groups missions by Mission_Outcome and counts the occurrences for each. It sorts them by count in descending order.
• Approach 2: SELECT SUM(CASE WHEN Mission_Outcome = 'Success' THEN 1 ELSE 0 END) AS Successful_Missions, SUM(CASE WHEN Mission_Outcome = 'Failure (in flight)' THEN 1 ELSE 0 END) AS Failed_Missions FROM SPACEXTABLE;
  This query uses a CASE statement to create separate counts for successful and failed missions based on Mission_Outcome.
EDA with SQL

8. Booster Versions with Maximum Payload Mass (using a subquery):
• SELECT Booster_Version, PAYLOAD_MASS__KG_ FROM SPACEXTABLE WHERE PAYLOAD_MASS__KG_ = (SELECT MAX(PAYLOAD_MASS__KG_) FROM SPACEXTABLE);
• The subquery finds the maximum PAYLOAD_MASS__KG_.
• The outer query selects the corresponding Booster_Version and PAYLOAD_MASS__KG_ from the main table.
9. Failure Drone Ship Landings by Month (2015) with Booster Version and Launch Site:
• Filtering for 2015: WHERE substr(Date, 0, 5) = '2015'.
• Filtering for Drone Ship Failures: AND Landing_Outcome = 'Failure (drone ship)'.
• Extracting Month Names: the two-digit month is extracted from the Date column and matched to a month name.
• Selecting Relevant Columns: selects the extracted Month, Landing_Outcome, Booster_Version, and Launch_Site columns.
10. Rank the count of landing outcomes:
• Filtering by Date Range: WHERE Date BETWEEN '2010-06-04' AND '2017-03-20'
• Grouping and Counting: GROUP BY Landing_Outcome groups the rows by outcome; COUNT(*) AS Count counts the number of records in each group.
• Ranking: ORDER BY Count DESC orders the results by the Count column in descending order, providing a ranking of landing outcomes based on their frequency.
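Queries 9 and 10 are described only in fragments, so here is one way the pieces could be assembled, run against an in-memory SQLite table from Python. The five rows are invented, the dates are assumed to be stored as YYYY-MM-DD text, and mapping month numbers to names in Python (instead of a SQL CASE expression) is a substitution made to keep the example short.

```python
import calendar
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SPACEXTABLE ("
    "Date TEXT, Booster_Version TEXT, Launch_Site TEXT, Landing_Outcome TEXT)"
)
# Invented rows for illustration only.
conn.executemany(
    "INSERT INTO SPACEXTABLE VALUES (?, ?, ?, ?)",
    [
        ("2015-01-10", "demo-A", "CCAFS LC-40", "Failure (drone ship)"),
        ("2015-04-14", "demo-B", "CCAFS LC-40", "Failure (drone ship)"),
        ("2015-12-22", "demo-C", "CCAFS LC-40", "Success (ground pad)"),
        ("2016-04-08", "demo-D", "CCAFS LC-40", "Success (drone ship)"),
        ("2018-01-01", "demo-E", "KSC LC-39A", "Success (drone ship)"),
    ],
)

# Query 9: 2015 drone ship failures with the month digits extracted.
month_rows = conn.execute(
    """
    SELECT substr(Date, 6, 2) AS Month, Landing_Outcome,
           Booster_Version, Launch_Site
    FROM SPACEXTABLE
    WHERE substr(Date, 0, 5) = '2015'
      AND Landing_Outcome = 'Failure (drone ship)'
    ORDER BY Date
    """
).fetchall()
failures_2015 = [
    (calendar.month_name[int(month)], outcome, booster, site)
    for month, outcome, booster, site in month_rows
]

# Query 10: rank landing outcomes by frequency within the date range.
ranking = conn.execute(
    """
    SELECT Landing_Outcome, COUNT(*) AS Count
    FROM SPACEXTABLE
    WHERE Date BETWEEN '2010-06-04' AND '2017-03-20'
    GROUP BY Landing_Outcome
    ORDER BY Count DESC
    """
).fetchall()
```

Because the dates are ISO-formatted text, both the `substr` filters and the `BETWEEN` comparison work on plain string ordering.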
EDA with Data Visualization
https://github.com/ViviAhn/Python_Capstone/blob/main/EDA%20with%20Visualization%20Lab.ipynb

• Visualize the relationship between Flight Number and Launch Site
• Visualize the relationship between Payload Mass and Launch Site
• Visualize the success rate of each orbit type
• Visualize the relationship between Flight Number and Orbit type
• Visualize the relationship between Payload Mass and Orbit type
• Visualize the launch success yearly trend

These charts allow exploration of potential patterns and relationships between variables, showing whether values are concentrated in a few categories or more evenly distributed across them.
Interactive Visual Analytics and Dashboards

Objective: Develop interactive visualizations and dashboards to enable real-time data exploration and enhance data storytelling.

1. Interactive Visual Analytics
• Step 1: Enable User Interactions
  • Allow users to interact with data using:
    • Zooming
    • Panning
    • Filtering
    • Searching
    • Linking
  • Goal: Facilitate quicker identification of visual patterns.
• Step 2: Advantages of Interactive Dashboards
  • Provide a dynamic way to present findings.
  • Offer more engagement compared to static graphs.

2. Building with Folium
• Step 3: Analyze Launch Site Geolocations
  • Use Folium to visualize launch sites on an interactive map.
  • Mark locations and examine proximities to reveal patterns.
• Step 4: Determine Optimal Launch Sites
  • Use map exploration to identify potential launch site advantages.

3. Creating a Dashboard with Plotly Dash
• Step 5: Set Up Dashboard Components
  • Build a dashboard using Plotly Dash.
  • Add interactive input components such as:
    • Dropdown lists
    • Range sliders
• Step 6: Visualize SpaceX Data
  • Create interactive visualizations.
  • Allow users to interact with charts to gain deeper insights into SpaceX data.
Build an Interactive Map with Folium
https://nbviewer.org/github/ViviAhn/Python_Capstone/blob/main/Interactive%20Visual%20Analytics%20with%20Folium%20lab.ipynb

Maps built:
• Folium map with the NASA Johnson Space Center and all launch sites
• Folium map with success/failed launches for each site
• Folium map and the distances between a launch site and its proximities

Markers, circles, and lines are added to the map to enhance visualization and convey specific information about locations, distances, and regions on the map.
• Markers: Represent specific points of interest, such as a launch site. Provide information when clicked, often using pop-ups or custom icons. Useful for showing precise locations.
• Circles: Highlight areas around a point, such as a radius around a launch site.
• Lines (or PolyLines): Connect two or more locations or show distances between points. Lines can also indicate relationships between a launch site and a destination or transportation network.

Together, these elements help users interpret spatial data in an intuitive and interactive way, adding context.
Build a Dashboard with Plotly Dash

This project is based on a Plotly Dash application that performs interactive visual analytics on SpaceX launch data, allowing users to select specific launch sites from a dropdown menu. The analysis includes:
• A pie chart of launch outcomes
• A scatter plot showing the correlation between payload and success

Using these plots, we can highlight the success rate and payload capabilities of each launch site, providing an overview of the performance at different launch sites.

GitHub for the spacex_dash_app.py:
https://github.com/ViviAhn/Python_Capstone/blob/main/spacex_dash_app.py

Dash for results: http://127.0.0.1:8050/

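The data preparation behind the dropdown-driven pie chart might be sketched as below. This shows only the pandas logic a Dash callback could call, not the Dash app itself. The function name `pie_chart_data`, the column names `Launch Site` and `class`, and the four sample rows are all assumptions; the actual spacex_dash_app.py may be organized differently.

```python
import pandas as pd

# Made-up launch records; `class` is 1 for success and 0 for failure.
launches = pd.DataFrame({
    "Launch Site": ["KSC LC-39A", "KSC LC-39A", "CCAFS SLC-40", "VAFB SLC-4E"],
    "class": [1, 0, 1, 1],
})


def pie_chart_data(df: pd.DataFrame, site: str = "ALL") -> pd.Series:
    """Return the values a pie chart would show for the dropdown choice."""
    if site == "ALL":
        # Successful launches per site.
        return df.groupby("Launch Site")["class"].sum()
    # Success vs. failure counts for the selected site.
    return df.loc[df["Launch Site"] == site, "class"].value_counts()


all_sites = pie_chart_data(launches)
one_site = pie_chart_data(launches, "KSC LC-39A")
```

Keeping this logic in a plain function lets it be checked without starting the dashboard server.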
Predictive Analysis (Classification)

Objective: Predict the successful landing of the Falcon 9 first stage.

1. Building the Machine Learning Pipeline
• Step 1: Data Preprocessing
  • Standardize data for consistency and accuracy in analysis.
• Step 2: Data Splitting
  • Use train_test_split to split data into training and testing sets.

2. Model Training and Hyperparameter Tuning
• Step 3: Algorithm Selection
  • Select and test algorithms:
    • Logistic Regression
    • Support Vector Machines
    • Decision Tree Classifier
    • K-nearest neighbors
• Step 4: Hyperparameter Tuning
  • Perform Grid Search to identify optimal hyperparameters for each algorithm.

3. Model Evaluation
• Step 5: Model Accuracy Assessment
  • Evaluate model accuracy with cross-validation on the training data (best_score_).
• Step 6: Result Summarization
  • Summarize results with a confusion matrix to analyze performance and accuracy.
Predictive Analysis (Classification)
https://github.com/ViviAhn/Python_Capstone/blob/main/4.%20Predictive%20Analysis.ipynb

• Data Loading and Preprocessing: Imported feature data (X) and target data (Y); standardized the features for consistent scaling.
• Train-Test Split: Split dataset into training and testing subsets to evaluate generalizability (test size = 0.2, random state = 2).
• Model Selection and Hyperparameter Tuning: Model initialization, followed by hyperparameter tuning with GridSearchCV.
• Model Evaluation: Accuracy (best_score_) for each model.
• Find Best Model: The model with the highest accuracy is selected as the best-performing model.

Models: Logistic Regression, Support Vector Machine (SVM), Decision Tree, K-Nearest Neighbors (KNN).
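The pipeline above can be sketched with scikit-learn as follows. The data is synthetic (`make_classification`), and the parameter grids and `cv=5` are assumptions chosen to keep the example fast; only the four model families, `test_size=0.2`, `random_state=2`, and the use of `GridSearchCV` with `best_score_` come from the slide.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the launch features (X) and landing class (Y).
X, Y = make_classification(n_samples=90, n_features=8, random_state=0)

# Standardize, then split with the settings stated on the slide.
X = StandardScaler().fit_transform(X)
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, random_state=2
)

# Model families from the slide; the grids themselves are illustrative.
candidates = {
    "Logistic Regression": (LogisticRegression(max_iter=1000), {"C": [0.01, 0.1, 1]}),
    "SVM": (SVC(), {"kernel": ["linear", "rbf"], "C": [0.1, 1, 10]}),
    "Decision Tree": (DecisionTreeClassifier(random_state=0), {"max_depth": [2, 4, 6]}),
    "KNN": (KNeighborsClassifier(), {"n_neighbors": [3, 5, 7]}),
}

results = {}
for name, (model, grid) in candidates.items():
    search = GridSearchCV(model, grid, cv=5)
    search.fit(X_train, Y_train)
    results[name] = {
        "cv_accuracy": search.best_score_,                 # mean cross-validated accuracy
        "test_accuracy": search.score(X_test, Y_test),     # held-out accuracy
    }

# Pick the model with the highest cross-validated accuracy.
best_model = max(results, key=lambda name: results[name]["cv_accuracy"])
```

`best_score_` is the mean cross-validated accuracy on the training folds, so the held-out test accuracy is recorded alongside it as a separate check.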
Results

• Exploratory data analysis results

• Interactive analytics demo in screenshots
• Predictive analysis results

Section 2
Flight Number vs. Launch Site

Payload vs. Launch Site

Success Rate vs. Orbit Type

import matplotlib.pyplot as plt
import seaborn as sns

# df is the launch DataFrame with 'Orbit' and 'Class' columns
# (the mean of Class per orbit is that orbit's success rate).
orbit_success_rate = df.groupby('Orbit')['Class'].mean().reset_index()

# Plot the bar chart
plt.figure(figsize=(10, 6))
sns.barplot(x='Orbit', y='Class', data=orbit_success_rate, palette="bright")  # or "hls"

# Set axis labels and title
plt.xlabel("Orbit Type", fontsize=14)
plt.ylabel("Success Rate", fontsize=14)
plt.title("Success Rate for Each Orbit", fontsize=16)

# Rotate x-axis labels for better readability
plt.xticks(rotation=45)
plt.show()
Flight Number vs. Orbit Type

Payload vs. Orbit Type

Launch Success Yearly Trend

All Launch Site Names

The unique launch sites are:

CCAFS SLC-40: Cape Canaveral Air Force Station, Space Launch Complex 40
CCAFS LC-40: Cape Canaveral Air Force Station, Launch Complex 40
VAFB SLC-4E: Vandenberg Air Force Base, Space Launch Complex 4E
KSC LC-39A: Kennedy Space Center, Launch Complex 39A

These locations are primary launch sites used for different missions, likely
chosen based on mission requirements such as orbit and payload type.
These sites are unique because of their specific geographical positions,
infrastructure, and suitability for different types of orbits, which allows for
a diverse range of mission profiles based on payload and destination.

Launch Site Names Begin with 'CCA'

All the records retrieved have the launch site starting with "CCA", indicating that they were launched from the
Cape Canaveral Air Force Station (CCAFS).

Total Payload Mass

The result shows the total payload mass that NASA (CRS) has launched on SpaceX rockets.

Average Payload Mass by F9 v1.1

This represents the average payload mass, in kilograms, carried by the F9 v1.1 booster version.
First Successful Ground Landing Date

This indicates that the first successful ground landing achieved by SpaceX
occurred on December 22, 2015.

Successful Drone Ship Landing with Payload between 4000 and 6000

• These are different versions of the Falcon 9 Full Thrust (denoted by F9 FT), specifically the booster serial numbers used for these successful launches with payloads between 4,000 kg and 6,000 kg.
• F9 FT is a version of SpaceX's Falcon 9 rocket designed for increased performance and reusability.
• The booster serial numbers (e.g., B1022, B1026) represent the specific Falcon 9 boosters used in these missions. These are identifiers for the individual rockets used in the launches.
• The output shows that the specified boosters were
Total Number of Successful and Failure Mission Outcomes

• Success (98 occurrences): Indicates that there were 98 missions where the outcome was successful.
• Success (payload status unclear) (1 occurrence): Indicates that there was 1 mission with a successful launch outcome, but the payload's status after deployment was unclear.
• Success (1 occurrence, possible data anomaly): Another row labeled "Success" with a count of 1 suggests a potential inconsistency in the data, where the same outcome is recorded in slightly different ways (e.g., extra spaces or case sensitivity).
• Failure (in flight) (1 occurrence): Indicates there was 1 mission where the launch failed in flight.
Boosters Carried Maximum Payload

The result shows all the launches where the payload mass was 15,600 kg, along with the corresponding booster versions: the Falcon 9 Block 5 (F9 B5) boosters used for those launches.
2015 Launch Records

Both failed drone ship landings in 2015 involved F9 v1.1 boosters launched from CCAFS LC-40.
Rank Landing Outcomes Between 2010-06-04 and 2017-03-20

• The most common outcome is no attempt at landing, with 10 occurrences, meaning the landing operation was not part of the mission.
• Successful and failed drone ship landings are balanced at 5 each, showing both successful and unsuccessful efforts.
• There are fewer landings on ground pads, with 3 successful recoveries.
• A variety of ocean-based landing outcomes show different levels of control and success, with 3 controlled and 2 uncontrolled ocean landings.
• Other less frequent outcomes include parachute failures and a precluded attempt.
• This breakdown provides insights into the success rate and types of landing attempts made for rocket boosters, showing where the missions either succeeded or faced challenges during recovery operations.
Section 3
Task 1: Mark all launch sites on a map

The map effectively highlights the launch sites relative to NASA's center. There is a concentration of 3 launch sites along the East Coast of the United States, particularly in Florida. This region benefits from favorable weather conditions and proximity to the Atlantic Ocean, which is often used as a safety measure for rocket launches. The inclusion of Vandenberg Air Force Base (VAFB) on the West Coast demonstrates the importance of having launch sites on both coasts. This allows for launches into
Task 2: Mark the success/failed launches for each site on the map

The map highlights launch sites, visually represented by the clusters of markers.
• Green Markers: represent successful launches. A higher concentration of green markers at a site indicates a higher success rate.
• Red Markers: represent failed launches. A higher number of red markers suggests a higher failure rate.

When we zoom in on the map, we can see the markers for each launch site. For example, the image shows results for two launch sites:
• CCAFS SLC-40 with 7 markers, meaning 7 launches with 4 failed (red) and 3 successful (green)
Task 3: Calculate the distances between a launch site and its proximities

The map highlights the distances from each launch site to the nearest railway (green line), highway (pink line), and coastline (blue line). It shows that the launch sites are equidistant from all transport methods, indicating that these sites are conveniently accessible for testing purposes.
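The slides do not say how the distances were computed. One common choice for point-to-point distances on a map is the haversine great-circle formula, sketched below; the function name `haversine_km` and the two coordinate pairs are hypothetical and are not taken from the project.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two (lat, lon) points."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (
        sin(dlat / 2) ** 2
        + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Hypothetical coordinates: a launch site and a nearby coastline point.
site = (28.50, -80.50)
coast = (28.50, -80.49)
distance_to_coast = haversine_km(*site, *coast)
```

The returned value can be used as the label on a line drawn between the two points on the map.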
Section 4
Launch success count for all sites

• The pie chart shows the distribution of successful launches across different launch sites, suggesting that KSC LC-39A has been the most reliable and successful launch site for SpaceX, contributing 41.7% of all successful launches.
• The chart also provides information about the payload capacity of SpaceX's rockets and the range of missions they can undertake, which extends from 0 kg to 9000 kg.
Launch site with highest launch success ratio

The chart provides a concise overview of the launch history of KSC LC-39A, highlighting its success rate and payload capabilities. It indicates that KSC LC-39A has a relatively high success rate, with a majority of launches ending successfully at 76.9%.
Payload vs. Launch Outcome for all sites

The plot shows that SpaceX has consistently improved the reliability of its rockets, leading to high success rates
across different payload masses and booster versions.
Different booster versions have varying capabilities and performance characteristics. However, the plot suggests
that SpaceX has been able to achieve high success rates across all versions.

Section 5
Classification Accuracy

Best performing model: Decision Tree

Accuracy: 0.9036
Confusion Matrix

• The matrix shows the number of correct and incorrect predictions for each class.
• Based on this confusion matrix, we can calculate several performance metrics:
  • Precision: 0.8
  • Recall: 1.0
  • F1-Score: 0.8889
• The model has a high accuracy, precision, and recall for predicting "landed" instances.
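The three metrics are linked by standard formulas, which the short sketch below works through. The confusion-matrix counts are not given in the text, so the values `tp=12, fp=3, fn=0` are illustrative counts chosen only because they reproduce the stated precision and recall; the function name is hypothetical.

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Compute precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)   # share of predicted "landed" that were correct
    recall = tp / (tp + fn)      # share of actual "landed" that were found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative counts (not from the slide) matching precision 0.8, recall 1.0.
precision, recall, f1 = precision_recall_f1(tp=12, fp=3, fn=0)
```

With precision 0.8 and recall 1.0, F1 = 2 × 0.8 × 1.0 / 1.8 ≈ 0.8889, which matches the value reported above.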
Conclusions
• CCAFS SLC-40 was the most used launch site.
• The first successful ground landing achieved by SpaceX occurred on December 22, 2015.
• Successful mission outcomes outnumbered failures…
• Launch site proximity analysis: Markers, circles, and lines are added to the map to enhance visualization and convey specific information about locations, distances, and regions. All four launch sites are located near the coastline, ensuring safety, ease of launching, and environmental protection. The distance from each launch site to the transportation network is less than 1 km, making it convenient for transportation.
• Dashboard: KSC LC-39A has been the most reliable and successful launch site for SpaceX, contributing 41.7% of all successful launches.
• Predictive Analysis: Best performing model: Decision Tree with accuracy: 0.9036

Raúl Estrada
No ratings yet
Learning Apache Spark 2
From Everand
Learning Apache Spark 2
Muhammad Asif Abbasi
No ratings yet



