How old are you? Do you speak English?
Lead Data Engineer Interview Questions
208 lead data engineer interview questions shared by candidates
In SQL, what is the difference between GROUP BY and PARTITION BY?
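One way to illustrate the difference is a minimal sketch using Python's built-in sqlite3 module. The `rides` table and its values are made up for illustration, and window functions assume SQLite 3.25 or newer. GROUP BY collapses each group into a single row, while PARTITION BY keeps every row and attaches the per-group aggregate to it.

```python
import sqlite3

# Hypothetical sample data, used only to show the two result shapes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rides (city TEXT, fare REAL)")
conn.executemany(
    "INSERT INTO rides VALUES (?, ?)",
    [("Paris", 10.0), ("Paris", 20.0), ("Lyon", 5.0)],
)

# GROUP BY: one output row per city.
grouped = conn.execute(
    "SELECT city, SUM(fare) FROM rides GROUP BY city ORDER BY city"
).fetchall()

# PARTITION BY: every input row is kept, with the per-city total attached.
windowed = conn.execute(
    "SELECT city, fare, SUM(fare) OVER (PARTITION BY city) "
    "FROM rides ORDER BY city, fare"
).fetchall()

print(grouped)   # 2 rows, one per city
print(windowed)  # 3 rows, one per ride
```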
- How to ensure idempotency?
- What is an integration test?
- How to define the vision, mission, values and goals of a team?
- How to manage underperformance of a team member?
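For the idempotency question, one common approach (among several) is to key every write on a stable identifier so that replaying a batch overwrites rows instead of duplicating them. The sketch below assumes a hypothetical `events` table and a hypothetical `load_batch` helper; it is an illustration, not a reference answer.

```python
import sqlite3

def load_batch(conn, rows):
    """Write rows keyed by event_id; re-running the same batch changes nothing."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events (event_id TEXT PRIMARY KEY, amount REAL)"
    )
    # The primary key makes a replayed row replace itself instead of duplicating.
    conn.executemany("INSERT OR REPLACE INTO events VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
batch = [("e1", 10.0), ("e2", 7.5)]  # hypothetical events
load_batch(conn, batch)
load_batch(conn, batch)  # simulated retry of the same batch

row_count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
total = conn.execute("SELECT SUM(amount) FROM events").fetchone()[0]
```

After the retry, the table still holds the same two rows and the same total, which is the property an idempotent job needs.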
Context: Given a night with X ride requests and Y available drivers in a fictional city, you have to develop a batch processing application that aggregates and exposes data coming from the matching engine. Every Z seconds, the matching engine tries to match every pair of request and driver available in the city. Some are matched, some are not. The results of each matching tick are stored in a set of files. Given this data, we want to get an overview of marketplace health; multiple applications could follow, such as heat maps. The metrics that we want to use for heat mapping are driver match rate and request match rate.
Exercise: Develop a microservice that will:
- Aggregate matching data by fetching new matching data and using it to compute driver match rates and request match rates by geo-spatial units of your choice.
- Expose an endpoint that returns the adjusted values of request and driver match rate, following this formula..
Bonus: Plot the driver and request match rate for one night.
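The aggregation step of this exercise could be sketched as follows. Everything here is an assumption: the record schema (`kind`, `lat`, `lon`, `matched`), the square grid used as the geo-spatial unit, and the function names are all hypothetical. Only raw match rates are computed, because the adjustment formula is not given in the question.

```python
from collections import defaultdict

def cell_of(lat, lon, size=0.01):
    """Map a coordinate to a square grid cell (a hypothetical geo-spatial unit)."""
    return (int(lat // size), int(lon // size))

def match_rates(records):
    """Aggregate raw match rates per cell.

    records: iterable of dicts with keys 'kind' ('driver' or 'request'),
    'lat', 'lon' and 'matched' (bool). This schema is assumed, not given.
    """
    # Per cell and kind: [number matched, number seen].
    counts = defaultdict(lambda: {"driver": [0, 0], "request": [0, 0]})
    for r in records:
        tally = counts[cell_of(r["lat"], r["lon"])][r["kind"]]
        tally[0] += int(r["matched"])
        tally[1] += 1
    # A rate is None when a cell saw no driver (or no request) at all.
    return {
        cell: {kind: (m / n if n else None) for kind, (m, n) in kinds.items()}
        for cell, kinds in counts.items()
    }
```

A service endpoint would then look up a cell in the returned dictionary and apply whatever adjustment the exercise specifies.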
Q: How to measure the satisfaction of team members?
Q: What is the GIL in Python?
Q: How to optimise a SQL query?
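For the SQL optimisation question, a typical first step is to inspect the query plan and add an index on the filtered column. The sketch below uses SQLite through Python's sqlite3 module; the `orders` table and the index name are hypothetical, and other databases expose their plans differently (for example through their own EXPLAIN variants).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

query = "SELECT * FROM orders WHERE customer_id = 42"

def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

plan_before = plan(query)  # full table scan: no index covers customer_id yet
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(query)   # the planner can now search via the new index

print(plan_before)
print(plan_after)
```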
Generic questions about previous roles, motivations, etc.
On the call: a difficult situation you faced and how you handled it.
Experience-based questions: how I handled different challenges, what solutions I provided, different ETL approaches, and various AWS services.
Find the maximum value in all possible subarrays of size K (sliding window maximum problem). Given an array of integers and a window size K, return the maximum values as the window slides through the array. This is a LeetCode hard-level algorithm question that's rarely relevant for data engineering roles, which typically focus on SQL optimization, ETL pipelines, and data architecture rather than complex algorithmic problems.
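A common way to solve the sliding window maximum in linear time is a monotonic deque of indices. The sketch below is one such solution; the function name `sliding_window_max` and the choice to return an empty list for an invalid K are assumptions, not part of the original question.

```python
from collections import deque

def sliding_window_max(nums, k):
    """Return the maximum of every contiguous window of size k in nums."""
    if k <= 0 or k > len(nums):
        return []
    window = deque()  # indices whose values are in strictly decreasing order
    result = []
    for i, value in enumerate(nums):
        # Drop the front index once it falls out of the current window.
        if window and window[0] <= i - k:
            window.popleft()
        # Drop smaller-or-equal values: they can never be a maximum again.
        while window and nums[window[-1]] <= value:
            window.pop()
        window.append(i)
        # The front of the deque is the maximum of the window ending at i.
        if i >= k - 1:
            result.append(nums[window[0]])
    return result

print(sliding_window_max([1, 3, -1, -3, 5, 3, 6, 7], 3))  # [3, 3, 5, 5, 6, 7]
```

Each index is pushed and popped at most once, so the running time is O(n) regardless of K, compared with O(n·K) for recomputing the maximum of every window.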
- What were the everyday processes/rituals of my current position, and how did they affect my productivity? I was also presented with HTB's existing processes and asked to comment on them.
- What tech stack am I currently working with, and what am I most comfortable with? I was also told about HTB's existing tech stack and made a detailed comparison between the two.
- My opinion/view on future HTB projects/features that I am likely to be involved in.