Summary
The successful candidate will collaborate closely with departments across the organization to design and implement data solutions that drive operational excellence and informed decision-making.
Project Objectives
- Construct and maintain efficient ETL/ELT pipelines within the Navy/DoD common data environments (Advana/Jupiter) to support reporting, dashboarding, and modeling initiatives.
- Proactively identify and resolve data pipeline issues through in-depth root cause analysis and effective remediation strategies.
- Conduct rigorous data quality assessments, exploratory data analysis (EDA), and validation procedures to safeguard data integrity.
- Develop and maintain comprehensive database documentation using automated metadata collection, including lineage details for upstream transformations.
- Optimize complex SQL queries for performance and reusability.
- Collaborate with non-technical stakeholders to elicit and refine data requirements.
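As a rough illustration of the pipeline-building and SQL work the objectives describe, the sketch below implements a tiny extract-transform-load pass in plain Python with SQLite. All table and column names are hypothetical; nothing here assumes the actual Advana/Jupiter schemas or tooling.

```python
import sqlite3

def run_minimal_etl(rows):
    """Minimal ETL sketch: extract raw rows, transform (trim, normalize
    casing, drop duplicates and nulls), and load into a reporting table.
    Table/column names are illustrative, not from any real environment."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw_readiness (unit TEXT, status TEXT)")
    conn.executemany("INSERT INTO raw_readiness VALUES (?, ?)", rows)

    # Transform + load: one SQL pass normalizes values and de-duplicates.
    conn.execute("""
        CREATE TABLE reporting_readiness AS
        SELECT DISTINCT TRIM(unit) AS unit, UPPER(status) AS status
        FROM raw_readiness
        WHERE unit IS NOT NULL
    """)
    return conn.execute(
        "SELECT unit, status FROM reporting_readiness ORDER BY unit"
    ).fetchall()
```

In a production pipeline the same shape would typically be expressed in PySpark or warehouse SQL rather than SQLite, but the extract/transform/load separation is the same.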
Project Scope
- Data Engineering: Design, build, and maintain scalable data pipelines using PySpark, SQL, and Python on big data platforms like Databricks or Snowflake.
- Data Quality: Implement robust data quality checks and monitoring mechanisms.
- Data Governance: Establish and enforce data standards and best practices.
- Data Visualization: Collaborate with BI teams to develop interactive dashboards and reports.
- Stakeholder Management: Effectively communicate technical concepts to non-technical audiences.
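The data-quality scope above could, under simple assumptions, start with row-level checks like the following pure-Python sketch. The field names and the 5% null-rate threshold are hypothetical placeholders, not project requirements.

```python
def check_data_quality(records, required_fields, max_null_rate=0.05):
    """Sketch of a basic data-quality check: computes per-field null rates
    for required fields and flags any field exceeding the threshold.
    Field names and threshold are illustrative assumptions."""
    total = len(records)
    null_counts = {f: 0 for f in required_fields}
    for rec in records:
        for f in required_fields:
            if rec.get(f) in (None, ""):
                null_counts[f] += 1
    null_rates = {f: (c / total if total else 0.0)
                  for f, c in null_counts.items()}
    failed = [f for f, rate in null_rates.items() if rate > max_null_rate]
    return {"passed": not failed,
            "null_rates": null_rates,
            "failed_fields": failed}
```

Checks of this kind are usually scheduled alongside the pipeline run so that failures block downstream dashboards rather than silently feeding them bad data.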
Project Deliverables
- Fully operational ETL/ELT pipelines within Advana/Jupiter.
- High-quality, reliable, and secure data assets.
- Optimized database performance and query efficiency.
- Comprehensive data documentation and metadata management.
- Collaborative relationships with internal and external stakeholders.