Mawdoo3 Data Engineer Screening Test
Question 1:
You and your team have decided to build a web application that is very similar to Twitter, but
greatly simplified. The app requirements are as follows:
- Users can sign up and sign in using their email and password
- Users can publish textual posts
- Posts can be associated with #tags
- Users can follow other users, in which case they will be able to see their posts
- A user can interact with a post by commenting on it, liking it, or disliking it
You're asked to design an SQL database model for this app.
1. Draw an ER diagram showing the tables and the relations between them.
2. Based on the model you designed, write an SQL query to find the top 10 trending
#tags in the last 7 days, ranked by the number of posts published with each #tag.
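One possible shape for such a model is sketched below using Python's built-in sqlite3 module, so that the schema and the trending query can be run together. All table and column names (`users`, `posts`, `tags`, `post_tags`, `follows`, `comments`, `reactions`, `created_at`, and so on) are assumptions made for illustration, and the date arithmetic uses SQLite syntax, which differs in other SQL dialects.

```python
import sqlite3

# Hypothetical schema: every table and column name here is an assumption.
SCHEMA = """
CREATE TABLE users (
    id            INTEGER PRIMARY KEY,
    email         TEXT NOT NULL UNIQUE,
    password_hash TEXT NOT NULL
);
CREATE TABLE posts (
    id         INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES users(id),
    body       TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT (datetime('now'))
);
CREATE TABLE tags (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
-- Many-to-many link between posts and tags.
CREATE TABLE post_tags (
    post_id INTEGER NOT NULL REFERENCES posts(id),
    tag_id  INTEGER NOT NULL REFERENCES tags(id),
    PRIMARY KEY (post_id, tag_id)
);
-- Self-referencing many-to-many link between users.
CREATE TABLE follows (
    follower_id INTEGER NOT NULL REFERENCES users(id),
    followee_id INTEGER NOT NULL REFERENCES users(id),
    PRIMARY KEY (follower_id, followee_id)
);
CREATE TABLE comments (
    id         INTEGER PRIMARY KEY,
    post_id    INTEGER NOT NULL REFERENCES posts(id),
    user_id    INTEGER NOT NULL REFERENCES users(id),
    body       TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT (datetime('now'))
);
-- One like or dislike per user per post.
CREATE TABLE reactions (
    post_id INTEGER NOT NULL REFERENCES posts(id),
    user_id INTEGER NOT NULL REFERENCES users(id),
    kind    TEXT NOT NULL CHECK (kind IN ('like', 'dislike')),
    PRIMARY KEY (post_id, user_id)
);
"""

# Count posts per tag over the last 7 days and keep the 10 most used tags.
TRENDING_TAGS = """
SELECT t.name, COUNT(*) AS post_count
FROM post_tags AS pt
JOIN posts AS p ON p.id = pt.post_id
JOIN tags  AS t ON t.id = pt.tag_id
WHERE p.created_at >= datetime('now', '-7 days')
GROUP BY t.id, t.name
ORDER BY post_count DESC, t.name
LIMIT 10;
"""


def create_database():
    """Create an in-memory database with the sketched schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def top_trending_tags(conn):
    """Return (tag name, post count) rows for the last 7 days."""
    return conn.execute(TRENDING_TAGS).fetchall()
```

Keeping tags in their own table with a `post_tags` link table means a tag is counted once per post, and ties in the ranking are broken alphabetically here only to make the output deterministic.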
Question 2:
Write a function (in Python, Java, or ...) that accepts two inputs representing time spans in
the format [“HH:mm”, “HH:mm”] (the first item in the array is the ‘from’
time and the second item is the ‘to’ time).
The function should sort the two time spans in ascending order and merge them if they overlap.
● example 1:
○ input:
■ time_span1 = [“12:00”, “14:00”]
■ time_span2 = [“13:00”, “15:00”]
○ Output should be:
■ [[“12:00”, “15:00”]]
● example 2:
○ input:
■ time_span1 = [“17:00”, “20:00”]
■ time_span2 = [“01:00”, “02:40”]
○ Output should be:
■ [[“01:00”, “02:40”], [“17:00”, “20:00”]]
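A minimal Python sketch of the behaviour described in the two examples might look like the following. The function names `to_minutes` and `merge_time_spans` are hypothetical. The sketch also assumes that neither span crosses midnight and that spans which merely touch (one ends exactly when the other starts) are merged; the task does not specify either point.

```python
def to_minutes(hhmm):
    """Convert an "HH:mm" string to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def merge_time_spans(time_span1, time_span2):
    """Sort two [from, to] spans by start time and merge them if they overlap.

    Assumptions (not stated in the task): spans do not cross midnight,
    and spans that only touch are treated as overlapping.
    """
    first, second = sorted(
        [time_span1, time_span2], key=lambda span: to_minutes(span[0])
    )
    if to_minutes(second[0]) <= to_minutes(first[1]):
        # The later span starts before the earlier one ends: merge them,
        # keeping whichever end time is later.
        end = max(first[1], second[1], key=to_minutes)
        return [[first[0], end]]
    return [list(first), list(second)]


print(merge_time_spans(["12:00", "14:00"], ["13:00", "15:00"]))
# [['12:00', '15:00']]
print(merge_time_spans(["17:00", "20:00"], ["01:00", "02:40"]))
# [['01:00', '02:40'], ['17:00', '20:00']]
```

Comparing minutes since midnight rather than raw strings keeps the logic correct even if an input omits the leading zero, for example "9:30".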
Question 3:
Write a function (in Python or Java) to
read data from the S3 file <s3://dc-test-dataset/test.csv.gz> and derive the following
insights (the data represents bike booking transactions):
1. What is the average trip time in minutes?
2. What is the percentage of female customers?
3. How old are the youngest and the oldest customers? Do the numbers make
sense? If not, how would you fix them?
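The three insights could be computed along the lines of the sketch below, which uses only the Python standard library and reads from any gzip-compressed binary stream. The column names `trip_duration_seconds`, `gender`, and `birth_year` are hypothetical, since the task does not show the file's header, and so are the "female" label and the 100-year plausibility cut-off. Fetching the object from S3 is left out; the stream would have to come from an S3 client or a downloaded copy of the file.

```python
import csv
import gzip
import io

# Hypothetical column names: the real header of test.csv.gz is not given.
DURATION_COL = "trip_duration_seconds"
GENDER_COL = "gender"
BIRTH_YEAR_COL = "birth_year"


def bike_insights(gz_stream, current_year, max_plausible_age=100):
    """Compute the three insights from a gzip-compressed CSV byte stream."""
    durations = []
    total_rows = 0
    female_rows = 0
    ages = []
    with gzip.open(gz_stream, mode="rt", newline="") as text:
        for row in csv.DictReader(text):
            total_rows += 1
            durations.append(float(row[DURATION_COL]))
            # Assumes gender is stored as a text label such as "female".
            if row[GENDER_COL].strip().lower() == "female":
                female_rows += 1
            # Rows without a birth year are skipped for the age statistics.
            if row[BIRTH_YEAR_COL].strip():
                ages.append(current_year - int(float(row[BIRTH_YEAR_COL])))
    if total_rows == 0:
        raise ValueError("the file contains no data rows")

    # One possible fix for implausible ages: drop values outside a sane range.
    plausible = [age for age in ages if 0 < age <= max_plausible_age]
    return {
        "avg_trip_minutes": sum(durations) / len(durations) / 60,
        "female_percentage": 100 * female_rows / total_rows,
        "raw_age_range": (min(ages), max(ages)) if ages else None,
        "plausible_age_range": (min(plausible), max(plausible)) if plausible else None,
    }


# Usage with a small in-memory sample instead of the real S3 object.
sample = (
    "trip_duration_seconds,gender,birth_year\n"
    "600,female,1990\n"
    "1200,male,1885\n"
)
stream = io.BytesIO(gzip.compress(sample.encode("utf-8")))
print(bike_insights(stream, current_year=2020))
```

Reporting both the raw and the filtered age range makes it visible when the data contains implausible birth years, which is what the third insight asks about. Dropping such rows is only one option; treating them as missing values is another.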