
Commit 792dc5a

Merge pull request larymak#286 from gideonclottey/development
Development
2 parents 73d0c76 + 032912b commit 792dc5a

File tree

3 files changed: +107 -0 lines changed

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
Job Title,Location,Salary,Company Name
SEN Tutor,"SW1, South West London, SW1A 2DD",Recently,Deckers
"SW1, South West London, SW1A 2DD",Recently,£28 - £33 per hour,Deckers
Recently,£28 - £33 per hour,SEN Tutor,Targeted Provision Ltd
£28 - £33 per hour,SEN Tutor,"SW1, South West London, SW1A 2DD",Deckers
SEN Tutor,"SW1, South West London, SW1A 2DD",Recently,Deckers
"SW1, South West London, SW1A 2DD",Recently,£28 - £33 per hour,Deckers
Recently,£28 - £33 per hour,Supply Chain Administrator,Deckers
£28 - £33 per hour,Supply Chain Administrator,"WC2, Central London, WC2N 5DU",EMBS
Supply Chain Administrator,"WC2, Central London, WC2N 5DU",Recently,Deckers
"WC2, Central London, WC2N 5DU",Recently,Unspecified,CV Screen Ltd
Recently,Unspecified,Accounts Payable Assistant,Deckers
Unspecified,Accounts Payable Assistant,"St James, WC2N 5DU",Deckers
Accounts Payable Assistant,"St James, WC2N 5DU",Recently,Webhelp UK
"St James, WC2N 5DU",Recently,Unspecified,Applause IT Limited
Recently,Unspecified,Total Rewards Analyst,Johnson & Associates Rec Specialists Ltd
Unspecified,Total Rewards Analyst,"WC2, Central London, WC2N 5DU",Johnson & Associates Rec Specialists Ltd
Total Rewards Analyst,"WC2, Central London, WC2N 5DU",Recently,Johnson & Associates Rec Specialists Ltd
"WC2, Central London, WC2N 5DU",Recently,Unspecified,Johnson & Associates Rec Specialists Ltd
Recently,Unspecified,SEN Tutor,Elliot Marsh
Unspecified,SEN Tutor,"WC2, Central London, WC2N 5DU",Elliot Marsh
SEN Tutor,"WC2, Central London, WC2N 5DU",Recently,Get Recruited (UK) Ltd
"WC2, Central London, WC2N 5DU",Recently,£28 - £33 per hour,Elliot Marsh
Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
import csv

import requests
from bs4 import BeautifulSoup

# URL of the job site (using Totaljobs as an example)
url = 'https://www.totaljobs.com/jobs/in-london'

r = requests.get(url)

# Parse the HTML with Beautiful Soup
html_soup = BeautifulSoup(r.content, 'html.parser')

# Target the jobs container
job_details = html_soup.find('div', class_='ResultsContainer-sc-1rtv0xy-2')
if job_details is None:
    raise SystemExit('Results container not found; the page layout may have changed.')

# Pull out the needed tags
job_titles = job_details.find_all(['h2', 'li', 'dl'])
company_name = job_details.find_all('div', class_='sc-fzoiQi')

# Combined list of every extracted tag (not used below)
total_job_info = job_titles + company_name

# Write the data to a CSV file. Note that the loop slides a one-tag window
# through job_titles, so consecutive rows overlap by three fields.
with open('job_data_2.csv', mode='w', newline='') as file:
    writer = csv.writer(file)
    writer.writerow(['Job Title', 'Location', 'Salary', 'Company Name'])  # header row
    min_length = min(len(job_titles), len(company_name))
    for i in range(0, min_length - 3):
        job_title = job_titles[i].text.strip()
        location = job_titles[i + 1].text.strip()
        salary = job_titles[i + 2].text.strip()
        company = company_name[i + 3].text.strip()
        writer.writerow([job_title, location, salary, company])
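The sliding-window loop above is why the committed CSV rows overlap and drift out of alignment. A more robust pattern is to treat each job card as one self-contained element and read every field from inside it. The sketch below is an illustration only: the `article`/`job-card` selector and the four field class names are hypothetical placeholders, not Totaljobs' real markup.

```python
import csv

import requests
from bs4 import BeautifulSoup


def field(card, class_name):
    """Return the stripped text of the first matching tag, or a fallback."""
    tag = card.find(class_=class_name)
    return tag.text.strip() if tag else 'Unspecified'


soup = BeautifulSoup(requests.get('https://www.totaljobs.com/jobs/in-london').content,
                     'html.parser')

with open('job_data_2.csv', mode='w', newline='') as file:
    writer = csv.writer(file)
    writer.writerow(['Job Title', 'Location', 'Salary', 'Company Name'])
    # One card per listing: every field comes from the same card, so a
    # missing field cannot shift the columns of later rows.
    for card in soup.find_all('article', class_='job-card'):  # hypothetical selector
        writer.writerow([field(card, 'job-title'), field(card, 'location'),
                         field(card, 'salary'), field(card, 'company')])
```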
Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
# WebScraping-for-job-Website

In this code we fetch information about available job listings from the job website Totaljobs, filter them according to skills, and save the output in a local file.

This program is able to fetch the:
* Job Title/Role needed
* Company name
* Location
* Salary

### User Story
As a data analyst, I want to be able to extract large amounts of information from the web into a CSV file.

### Acceptance Criteria

- It is done when I can make a request to a specified URL.
- It is done when I get a response from that URL.
- It is done when I get the target content from the URL.
- It is done when that content is saved in a CSV file.
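Taken together, the criteria map onto a few lines of code. A minimal sketch (the URL is the one this project targets; the output filename and the page-title field are placeholders standing in for the real target content):

```python
import csv

import requests
from bs4 import BeautifulSoup

r = requests.get('https://www.totaljobs.com/jobs/in-london')  # request a specified URL
r.raise_for_status()                                          # confirm we got a response
soup = BeautifulSoup(r.content, 'html.parser')                # pull target content from the page
with open('job_data.csv', 'w', newline='') as f:              # save that content in a CSV file
    writer = csv.writer(f)
    writer.writerow(['Page Title'])
    writer.writerow([soup.title.text.strip()])  # assumes the page has a <title>
```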

#### Sample Output
![](https://github.com/larymak/Python-project-Scripts/blob/main/WebScraping/posts/Capture.PNG)

### Packages used
- BeautifulSoup
- requests
- csv (Python standard library)
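Assuming a standard pip setup, the two third-party packages can be installed with `pip install requests beautifulsoup4`; the csv module ships with Python.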

### Challenges encountered:
- The only real difficulty was locating the precise IDs and parsing selectors (find element by ID, XPath, class, and find_all) that would bring the information back correctly.
- Overall, our team successfully applied Python web scraping to complete our assignment.

## Steps To Execution
- Fork this repository and navigate to the WebScraping-Data-Analytics folder.
- Execute the program by running the pydatanalytics.py file: `$ python pydatanalytics.py`
- The program will then fetch the information and write it to a CSV file.

### Team Members
- [@gideonclottey](https://github.com/gideonclottey)
- [@Dev-Godswill](https://github.com/Dev-Godswill)
- [@ozomata](https://github.com/ozomata)
- [@narinder-bit](https://github.com/narinder-bit)
- [@Sonia-devi](https://github.com/Sonia-devi)
