Role Summary
The Data Architect will be an integral member of the Digital Information Delivery team, based in Teaneck, NJ (Remote). This hands-on role leads the architecture, delivery, and governance of reporting and data delivery at Phibro. The ideal candidate has proven experience in data warehousing and IBM Cognos BI, and will focus on data architecture, sourcing, reporting, and analytics throughout project lifecycles.
Responsibilities
- Modern Data Architecture Leadership
  - Own and enhance Phibro’s modernized data architecture built on Microsoft Fabric.
  - Define technical standards for medallion architecture, CDC strategy, Fabric pipelines, optimization techniques, and Lakehouse/Warehouse usage.
  - Evaluate and guide decisions on Spark vs. T-SQL warehouse design patterns.
  - Oversee performance tuning to reduce compute consumption (CUs) across lakehouse notebooks, pipelines, and warehouse SQL objects.
- Fabric Migration & Platform Modernization
  - Lead the migration of legacy SQL Server/SSIS processes to Fabric Lakehouse, Warehouse, and Data Factory.
  - Re-engineer existing pipelines, stored procedures, and ETL frameworks into scalable Fabric-native solutions.
  - Manage mirroring, CDC enablement, and ingestion performance for JDE and other enterprise systems.
  - Validate and refine Silver and Gold layer ingestion, including schema evolution, SCD Type 2, and materialized view refresh logic.
- Data Engineering & Integration
  - Design and maintain robust data ingestion pipelines for financials, sales, inventory, and master data domains.
  - Partner with engineering teams to optimize Spark notebooks, Fabric pipelines, and Delta Lake tables.
  - Ensure efficient CDC handling, bronze-to-gold transformations, and dependency orchestration.
  - Establish operational runbooks and streamline orchestration sequencing.
- Data Warehouse, MDM, & Governance
  - Architect scalable enterprise data models for Finance, Supply Chain, Sales, Animal Health operations, and Planning.
  - Support MDM modernization initiatives such as Customer Master redesign and Business Unit Master optimization.
  - Promote data quality, auditability, lineage, and governance using Purview.
  - Establish standards for schema design, keys, constraints, and gold-layer performance.
- Business Partnership & Delivery
  - Partner with Finance, Supply Chain, Commercial, and Production teams to understand reporting and analytics needs.
  - Support month-end close processes through reliable data pipelines and validated warehouse layers.
  - Provide guidance to leadership on capacity planning, cost optimization, and architecture selection.
  - Collaborate with vendors (Microsoft, Virtusa, Qlik, etc.) on engineering, troubleshooting, and roadmap planning.
Qualifications
- Required: 10+ years of experience in data architecture, including extensive SQL Server and ETL/SSIS experience.
- Required: Hands-on expertise with Microsoft Fabric components: Lakehouse, Warehouse, Pipelines, Data Factory, Spark, and Dataflows, as well as Delta Lake, Materialized Views, Mirroring, and CDC frameworks.
- Required: Strong understanding of medallion architecture and big-data ingestion patterns.
- Required: Proven experience modernizing legacy environments into cloud-first architectures.
- Required: Experience with financial systems, reporting, and month-end processes.
- Required: Strong performance optimization skills (Spark tuning, SQL optimization, CU cost control).
- Preferred: Microsoft Azure Data Engineering or Fabric certification.
- Preferred: Experience with Qlik Replicate, Azure SQL Managed Instance, Synapse, or Databricks.
- Preferred: Familiarity with data governance tools (Purview) and MDM frameworks.
- Preferred: Experience with Power BI, Cognos, or similar analytics tools.
- Preferred: Prior exposure to integrating with ERP systems such as JDE.
Education
- Bachelor’s or Master’s degree in Computer Science, Information Systems, Engineering, or a related field.
Skills
- Key Technologies: Microsoft Fabric (Lakehouse, Warehouse, Pipelines, CDC, Mirroring, Spark)
- Azure services: Data Factory, SQL Database, Managed Instance, Data Lake Storage
- SQL Server / SSIS
- Data Modeling: SCDs, star schema, dimensional modeling
- Visualization: Power BI, Cognos
- Governance: Purview
- ETL/ELT: Qlik Replicate, ADF, T-SQL, Spark