Qualification: Bachelor’s degree and at least four years of experience in designing, developing, deploying, and/or supporting data pipelines using Databricks
Key Accountabilities & Responsibilities:
• Designing and implementing data ingestion pipelines from multiple sources using Azure Databricks
• Developing scalable, reusable frameworks for ingesting data sets
• Integrating the end-to-end data pipeline to take data from source systems to target data repositories, ensuring that data quality and consistency are maintained at all times
• Working with event-based and streaming technologies to ingest and process data
• Working with other members of the project team to support delivery of additional project components (API interfaces, Search)
• Evaluating the performance and applicability of multiple tools against customer requirements
• Migrating and optimizing legacy SQL scripts to Databricks SQL to ensure efficient query execution on large-scale datasets stored in Databricks and Delta Lake
• Modifying and adapting SQL queries to account for differences between traditional relational database systems and Databricks' distributed environment, including handling large datasets, partitioning, and parallel execution
• Ensuring that converted SQL queries return accurate, expected results, with thorough testing and validation to confirm data integrity after migration
• Troubleshooting performance and query optimization issues in SQL-based workflows so that they run smoothly within the Databricks environment
• Staying up to date with the latest Databricks SQL features, best practices, and emerging technologies to continuously improve SQL conversion and execution processes
• Working with cross-functional teams, including data engineers, data scientists, and business intelligence teams, to ensure seamless integration of Databricks SQL queries with existing data pipelines and analytics tools
Technical / Professional Skills
Technical Expertise:
• Expertise in designing and deploying data applications on cloud platforms such as Azure or AWS
• Hands-on experience with Databricks, including Databricks SQL, Delta Lake, and the Databricks SQL Analytics workspace
• In-depth knowledge of SQL, including complex queries, joins, subqueries, window functions, aggregations, and data transformations
• Knowledge of relational databases and experience migrating queries from traditional SQL environments (e.g., MySQL, SQL Server) to Databricks SQL
• Familiarity with business intelligence tools such as Tableau, Power BI, and SSRS, and with their integration with Databricks
• Demonstrated analytical and problem-solving skills, particularly those that apply to a big data environment
• Strong troubleshooting and debugging abilities, especially in the context of data migrations and performance issues
Education & Experience: Minimum 5 years of relevant experience
Compensation: Depending on the interview
Location: Work from office, Hinjewadi, Pune