Role Summary
Senior Data Product Engineer with deep expertise in dbt, Databricks, Power BI/Tableau, and AWS who can architect and implement scalable, reliable, and high-performance data products. Design and deliver production-grade data pipelines and models, optimize query and compute performance, and expose data to end users through well-structured semantic layers and dashboards.
Responsibilities
- Architect and implement end-to-end ELT workflows with dbt (Core and Cloud), ensuring modular, testable, and reusable transformations.
- Build high-performance data pipelines in Databricks (PySpark, Delta Lake, Unity Catalog) for batch and streaming workloads (a minimal PySpark sketch follows this list).
- Engineer scalable data ingestion pipelines into AWS (S3, Kinesis, Glue, Lambda, Step Functions) with strong monitoring and fault tolerance; ensure observability, cost efficiency, and scalability in all pipeline and compute designs (an illustrative Lambda handler follows this list).
- Design normalized and star-schema models for analytical workloads, following dbt's best practices and software engineering principles.
- Implement data quality testing frameworks (dbt tests, Great Expectations, or custom validations) with automated CI/CD integration (an illustrative validation script follows this list).
- Manage data versioning, lineage, and governance through tools such as Unity Catalog and AWS Lake Formation.
- Develop semantic data layers that support self-service analytics across BI tools (Tableau, Power BI, etc.).
- Build interactive, real-time dashboards with metric consistency and role-based access control.
- Partner with analysts and data scientists to optimize queries and deliver production-ready datasets.
- Automate deployment pipelines with CI/CD (GitHub Actions, GitLab CI, or AWS CodePipeline) for dbt and Databricks.
- Implement infrastructure as code (IaC) for reproducibility (Terraform, CloudFormation).
- Ensure system reliability through observability and monitoring (Datadog, CloudWatch, Prometheus, or similar).
- Benchmark and optimize SQL, Spark, and BI query performance at scale.
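To illustrate the Databricks work above, here is a minimal PySpark/Delta Lake batch-transform sketch. The catalog, table, and column names (raw.sales.orders, analytics.sales.orders_daily, order_ts, amount, region) are hypothetical, not part of this posting.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # supplied automatically on a Databricks cluster

run_date = "2024-01-15"  # illustrative; normally injected by the job scheduler

orders = spark.read.table("raw.sales.orders")  # hypothetical Unity Catalog source table

daily = (
    orders
    .where(F.to_date("order_ts") == F.lit(run_date))
    .groupBy(F.to_date("order_ts").alias("order_date"), "region")
    .agg(
        F.count("*").alias("order_count"),
        F.sum("amount").alias("revenue"),
    )
)

# Overwriting only the processed partition keeps reruns idempotent.
(
    daily.write.format("delta")
    .mode("overwrite")
    .option("replaceWhere", f"order_date = '{run_date}'")
    .saveAsTable("analytics.sales.orders_daily")  # hypothetical target table
)
```

Partition-scoped overwrites like this are one common way to make daily batch jobs safely re-runnable on Delta Lake.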
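For the AWS ingestion responsibilities, a minimal sketch of a Lambda handler that lands Kinesis records in S3 as newline-delimited JSON; the bucket and key prefix are assumptions for illustration.

```python
import base64
import json
import uuid

import boto3

s3 = boto3.client("s3")
BUCKET = "example-data-landing"  # hypothetical landing bucket
PREFIX = "ingest/events"         # hypothetical key prefix


def handler(event, context):
    """Land a batch of Kinesis records in S3 as newline-delimited JSON."""
    records = [
        json.loads(base64.b64decode(r["kinesis"]["data"]))
        for r in event["Records"]
    ]
    key = f"{PREFIX}/{uuid.uuid4()}.json"
    body = "\n".join(json.dumps(r) for r in records)
    s3.put_object(Bucket=BUCKET, Key=key, Body=body.encode("utf-8"))
    # Any exception propagates, so Lambda's retry and DLQ settings
    # provide the fault tolerance the role calls for.
    return {"written": len(records), "key": key}
```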
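And for data quality testing, a minimal custom-validation sketch in PySpark, standing in for dbt tests or Great Expectations; table and column names carry over from the first sketch and are likewise hypothetical.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
df = spark.read.table("analytics.sales.orders_daily")  # hypothetical table

checks = {
    "order_date is never null": df.where(F.col("order_date").isNull()).count() == 0,
    "revenue is non-negative": df.where(F.col("revenue") < 0).count() == 0,
    "no duplicate (order_date, region)": (
        df.groupBy("order_date", "region").count().where("count > 1").count() == 0
    ),
}

failed = [name for name, passed in checks.items() if not passed]
if failed:
    # Failing loudly lets CI/CD or the orchestrator block a bad deploy.
    raise AssertionError(f"Data quality checks failed: {failed}")
```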
Qualifications
- Bachelor's degree in Computer Science, Information Systems, Engineering, or related field.
- 8+ years in data engineering, analytics engineering, or data platform development.
- Expert-level proficiency in dbt: advanced macros, Jinja, testing, exposures, dbt Cloud deployment.
- Databricks: Spark (PySpark, SQL), Delta Lake, Unity Catalog.
- AWS: S3, Glue, Lambda, Step Functions, DataSync, EMR, Redshift, IAM, and networking/security fundamentals.
- Data Visualization: Power BI or Tableau.
- Strong programming in Python and SQL (including query optimization).
- Experience with distributed systems and large-scale datasets (TB–PB scale).
- Experience implementing CI/CD pipelines, data testing, and infrastructure as code.
- Solid understanding of data governance, security, and compliance in enterprise environments.
Preferred Skills
- Experience with real-time data pipelines (Kafka, Kinesis, Delta Live Tables).
- Familiarity with containerization and orchestration (Docker, Kubernetes, EKS).
- Exposure to machine learning workflows in Databricks.
Education
- Listed in Qualifications above.
Additional Requirements
- Location: US-based candidates (salary band provided in the description).