What is the difference between map and flatMap in Scala?
Data Engineer Interview Questions
18,724 data engineer interview questions shared by candidates
How large a dataset have you worked with in your projects?
What is dual eligibility?
Below is a table called fact_daily_users, which contains the users who were active on a specific date, with the specific action they made (aggregated total actions per row).

user_id | date  | action | total_actions | day_in_row
332     | 17/06 | view   | 1             | 1
332     | 20/06 | view   | 6             | 1
332     | 20/06 | click  | 2             | 1
221     | 24/06 | view   | 4             | 1
221     | 24/06 | click  | 2             | 1
221     | 21/06 | view   | 1             | 2
221     | 20/06 | view   | 1             | 1
332     | 21/06 | view   | 4             | 2
332     | 21/06 | click  | 3             | 2

Q1: Write a query that calculates the logic of the day_in_row field with SQL; no joins are allowed. "day_in_row" shows, per row, the number of consecutive days for the user, that is, users who return to the website day after day.
Q2: Write a function (Python / Java) that takes a SQL query and returns its output as a JSON object.
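One possible way to approach both parts is sketched below, using Python's built-in sqlite3 module. Several things here are assumptions that the question does not state: the dates are stored as ISO strings with a made-up year (2023), the SQLite build supports window functions (version 3.25 or later), and the names DAY_IN_ROW_SQL, query_to_json and build_sample_db are hypothetical. The query uses the common "gaps and islands" idea: subtracting a per-user dense rank from the day number gives a value that stays constant within a run of consecutive days.

```python
import json
import sqlite3

# Assumption: dates are ISO strings; the year 2023 is invented for illustration.
# streak_id is constant for consecutive days of the same user, because both the
# day number and the dense rank grow by one from one active day to the next.
DAY_IN_ROW_SQL = """
WITH ranked AS (
    SELECT user_id, date, action, total_actions,
           CAST(julianday(date) AS INTEGER)
             - DENSE_RANK() OVER (PARTITION BY user_id ORDER BY date) AS streak_id
    FROM fact_daily_users
)
SELECT user_id, date, action, total_actions,
       DENSE_RANK() OVER (PARTITION BY user_id, streak_id ORDER BY date) AS day_in_row
FROM ranked
ORDER BY user_id, date, action
"""


def query_to_json(conn, sql):
    """Run a SQL query and return the result set as a JSON array of objects."""
    cur = conn.execute(sql)
    columns = [col[0] for col in cur.description]
    return json.dumps([dict(zip(columns, row)) for row in cur.fetchall()])


def build_sample_db():
    """Create an in-memory table holding the sample rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fact_daily_users "
        "(user_id INTEGER, date TEXT, action TEXT, total_actions INTEGER)"
    )
    rows = [
        (332, "2023-06-17", "view", 1),
        (332, "2023-06-20", "view", 6),
        (332, "2023-06-20", "click", 2),
        (221, "2023-06-24", "view", 4),
        (221, "2023-06-24", "click", 2),
        (221, "2023-06-21", "view", 1),
        (221, "2023-06-20", "view", 1),
        (332, "2023-06-21", "view", 4),
        (332, "2023-06-21", "click", 3),
    ]
    conn.executemany("INSERT INTO fact_daily_users VALUES (?, ?, ?, ?)", rows)
    return conn


if __name__ == "__main__":
    print(query_to_json(build_sample_db(), DAY_IN_ROW_SQL))
```

DENSE_RANK is used rather than ROW_NUMBER because a user can have several rows (view, click) on the same date, and those rows should share one rank. The common table expression is not a join, so the sketch stays within the constraint of Q1.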
What is Integration Runtime in Azure?
Python
1. How to remove duplicates
2. Word frequency in a dictionary
3. Dictionary questions based on multiple constraints
SQL
1. Find the time to the next station for a train
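The first two Python items above could be answered along these lines. This is only a sketch: the function names are hypothetical, and it assumes that order should be preserved when removing duplicates and that words are split on whitespace and compared case-insensitively, none of which the question specifies.

```python
from collections import Counter


def remove_duplicates(items):
    """Drop repeated values while keeping the first occurrence of each.

    dict keys keep insertion order, so this preserves the original order.
    Items must be hashable.
    """
    return list(dict.fromkeys(items))


def word_frequency(text):
    """Count how often each whitespace-separated word occurs, ignoring case."""
    return dict(Counter(text.lower().split()))


if __name__ == "__main__":
    print(remove_duplicates([3, 1, 3, 2, 1]))
    print(word_frequency("The cat and the hat"))
```

If order does not matter, `list(set(items))` is shorter, but it does not guarantee any particular ordering of the result.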
How would you use Python to flatten a JSON object?
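A minimal recursive sketch of one answer follows. The function name flatten, the dot separator, and the choice to use list indices as key parts are assumptions made for illustration; an interviewer may expect a different key scheme.

```python
def flatten(obj, parent_key="", sep="."):
    """Flatten nested dicts and lists into a single-level dict.

    Nested keys are joined with `sep`; list elements use their index as the
    key part. Empty dicts and lists produce no entries in this sketch.
    """
    items = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            new_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
            items.update(flatten(value, new_key, sep))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            new_key = f"{parent_key}{sep}{index}" if parent_key else str(index)
            items.update(flatten(value, new_key, sep))
    else:
        # Scalar leaf (str, number, bool, None): store it under the built key.
        items[parent_key] = obj
    return items


if __name__ == "__main__":
    nested = {"a": {"b": 1, "c": [2, {"d": 3}]}, "e": None}
    print(flatten(nested))
```

For JSON text rather than an already-parsed object, the input would first go through `json.loads`. Very deep documents could hit Python's recursion limit, in which case an explicit stack is a reasonable alternative.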
AWS - Boto3
SQL - CASE statement - MOD function - QUALIFY with ROW_NUMBER - Checksum
Unix - Find hidden files within a folder
Tons of behavioral questions
1) Imagine you are data modeling Netflix; create the entities and relationships involved using a modeling tool. 2) Follow a link to a SQL tool that shows 4 tables. Run some provided SQL to populate the tables, then write the SQL to aggregate the data and provide a result set showing the top row.
Query a many-to-many relationship without violating the grain of a fact table.