Job Description
We have a direct-hire position for a Data Engineer with one of our clients in Jacksonville, FL.
No third-party candidates will be considered for this position.
This position will lead the design, evaluation, implementation, and testing of a scalable, high-performance enterprise data warehouse. In this role, you will be responsible for developing dimensional data models, ETL/ELT pipelines, and cloud-based data solutions that enable robust analytics and reporting. You will collaborate with cross-functional teams to ensure data quality, governance, and security while optimizing performance for large-scale datasets. The ideal candidate has a strong background in SQL, cloud data platforms (AWS, GCP, or Azure), and data pipeline automation and is enthusiastic about building modern data architectures and enabling data-driven decision-making.
Essential Functions, Duties, and Responsibilities
Data Warehouse Design & Architecture
- Lead the end-to-end design and implementation of a scalable, high-performance data warehouse.
- Define data architecture principles, strategies, and best practices to ensure optimal performance and maintainability.
Dimensional Modeling & Data Modeling
- Design and implement robust dimensional models (star and snowflake schemas) to optimize analytical queries.
- Develop conceptual, logical, and physical data models to support reporting, analytics, and business intelligence (BI).
- Ensure data models align with business requirements and support self-service analytics.
- Leverage existing data infrastructure to fulfill data-related requests and perform necessary data housekeeping, including data cleansing, normalization, hashing, and required data model changes. (A minimal star-schema sketch follows this list.)
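To make the modeling items above concrete, here is a minimal, purely illustrative star-schema sketch. The table and column names (dim_date, dim_customer, fact_sales) are hypothetical, and SQLite is used only so the example is self-contained; the actual platform, grain, and dimensions would be defined on the job.

```python
import sqlite3

# Hypothetical star schema: one additive fact table keyed to two dimensions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (
    date_key  INTEGER PRIMARY KEY,  -- surrogate key, e.g. 20240131
    full_date TEXT NOT NULL,
    month     INTEGER NOT NULL,
    year      INTEGER NOT NULL
);
CREATE TABLE dim_customer (
    customer_key INTEGER PRIMARY KEY,  -- surrogate key
    customer_id  TEXT NOT NULL,        -- natural/business key
    segment      TEXT
);
CREATE TABLE fact_sales (
    date_key     INTEGER NOT NULL REFERENCES dim_date(date_key),
    customer_key INTEGER NOT NULL REFERENCES dim_customer(customer_key),
    quantity     INTEGER NOT NULL,
    amount       REAL NOT NULL        -- additive measure
);
""")

# Typical analytical pattern: join the fact to its dimensions and aggregate.
rows = conn.execute("""
    SELECT d.year, d.month, c.segment, SUM(f.amount) AS revenue
    FROM fact_sales f
    JOIN dim_date d     ON d.date_key = f.date_key
    JOIN dim_customer c ON c.customer_key = f.customer_key
    GROUP BY d.year, d.month, c.segment
""").fetchall()
```

The dimensions carry descriptive attributes and surrogate keys, while the fact table holds foreign keys plus additive measures; that separation is what keeps the join-and-aggregate query pattern fast for analytics.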
ETL/ELT Development & Data Pipelines
- Design, develop, and optimize ETL/ELT pipelines to ingest, transform, and load structured and unstructured data from various sources.
- Ensure data pipelines are scalable, efficient, and maintainable while handling large datasets.
- Implement incremental data processing strategies to manage real-time and batch workloads (a minimal watermark-based sketch follows this list).
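As a rough sketch of one incremental-processing approach (a high-watermark pattern, not a prescribed implementation): the orders, stg_orders, and etl_watermark tables, the updated_at column, and the SQLite connections below are all assumptions made only to keep the example self-contained.

```python
import sqlite3

def incremental_load(src: sqlite3.Connection, dwh: sqlite3.Connection) -> int:
    """Copy only rows newer than the last recorded watermark (hypothetical schema)."""
    # Last successfully loaded timestamp; fall back to the epoch on the first run.
    watermark = dwh.execute(
        "SELECT MAX(loaded_through) FROM etl_watermark"
    ).fetchone()[0] or "1970-01-01T00:00:00"

    # Pull only source rows changed since the watermark (assumes an updated_at column).
    new_rows = src.execute(
        "SELECT id, amount, updated_at FROM orders WHERE updated_at > ?",
        (watermark,),
    ).fetchall()

    if new_rows:
        dwh.executemany(
            "INSERT OR REPLACE INTO stg_orders (id, amount, updated_at) VALUES (?, ?, ?)",
            new_rows,
        )
        # Advance the watermark to the newest timestamp just loaded.
        dwh.execute(
            "INSERT INTO etl_watermark (loaded_through) VALUES (?)",
            (max(r[2] for r in new_rows),),
        )
        dwh.commit()
    return len(new_rows)
```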
Data Governance, Quality, and Security
- Establish data governance practices, including data cataloging, lineage tracking, and metadata management.
- Implement data validation, anomaly detection, and quality monitoring to ensure accuracy and consistency (an illustrative check sketch follows this list).
- Collaborate with security teams to enforce role-based access control (RBAC), encryption, and compliance standards (GDPR, HIPAA, etc.).
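The data-validation bullet above might look something like the sketch below: a couple of rule-based checks run against a hypothetical fact_sales table. The table, columns, and checks are assumptions for illustration, not an actual framework.

```python
import sqlite3

def run_quality_checks(dwh: sqlite3.Connection) -> list:
    """Return descriptions of failed checks on a hypothetical fact_sales table."""
    failures = []

    # Completeness: fact rows must reference the date dimension.
    missing = dwh.execute(
        "SELECT COUNT(*) FROM fact_sales WHERE date_key IS NULL"
    ).fetchone()[0]
    if missing:
        failures.append(f"{missing} fact_sales rows missing date_key")

    # Validity: the amount measure should never be negative.
    negative = dwh.execute(
        "SELECT COUNT(*) FROM fact_sales WHERE amount < 0"
    ).fetchone()[0]
    if negative:
        failures.append(f"{negative} fact_sales rows with negative amount")

    # Further checks (row-count anomalies vs. a trailing average, referential
    # integrity, freshness) would follow the same pattern.
    return failures
```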
Performance Tuning & Optimization
- Optimize database performance through indexing, partitioning, caching, and query tuning (a minimal indexing sketch follows this list).
- Monitor and troubleshoot slow queries, ensuring efficient use of resources.
- Analyze data to spot anomalies and trends and to correlate similar data sets.
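For the indexing and query-tuning items above, a minimal illustration of the workflow: inspect the plan, add an index on the filter column, and re-check the plan. SQLite and the fact_sales table are stand-ins chosen only to keep the snippet runnable, not the target platform.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fact_sales (date_key INTEGER, customer_key INTEGER, amount REAL)"
)

query = "SELECT SUM(amount) FROM fact_sales WHERE date_key BETWEEN 20240101 AND 20240131"

# Before indexing: the plan shows a full scan of fact_sales.
print(conn.execute("EXPLAIN QUERY PLAN " + query).fetchall())

# Add an index on the filter column, then confirm the plan now uses it.
conn.execute("CREATE INDEX ix_fact_sales_date ON fact_sales (date_key)")
print(conn.execute("EXPLAIN QUERY PLAN " + query).fetchall())
```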
Cloud Technologies
- Architect and implement cloud-based data warehouse solutions (e.g., Snowflake, AWS Redshift, Google BigQuery, Azure Synapse).
Collaboration & Cross-Functional Work
- Work closely with business stakeholders & analysts to translate requirements into scalable data solutions.
- Partner with software engineers and DevOps teams to ensure seamless data integration and infrastructure reliability.
Monitoring & Incident Response
- Establish monitoring solutions for data pipeline failures, schema changes, and data anomalies.
- Set up logging and alerting mechanisms to proactively identify and resolve issues (a minimal sketch follows this list).
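A bare-bones sketch of the logging-and-alerting idea above; the alert callback is a placeholder for whatever paging or webhook mechanism the team actually uses.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("pipeline")

def run_step(name, step_fn, alert):
    """Run one pipeline step, logging failures and handing them to an alert callback."""
    try:
        step_fn()
        log.info("step %s succeeded", name)
    except Exception:
        log.exception("step %s failed", name)
        alert(f"pipeline step {name} failed")  # placeholder for a pager/webhook call
        raise

# Example (hypothetical): run_step("load_orders", load_orders, alert=notify_on_call)
```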
Mentorship & Best Practices
- Guide and mentor other members of the data team on best practices in data modeling, ETL, and architecture.
- Define and document standards for data warehouse development, maintenance, and governance.
Data Science & Machine Learning
- Design, develop, and implement statistical models for classification and information extraction from data.
- Design, develop, and implement natural language processing software modules.
Qualifications
Educational and Experience Requirements
- Bachelor’s degree and 6+ years of experience in data analytics, data engineering, data architecture, or software engineering with a focus on data warehouse design and implementation.
- Proven experience designing and implementing dimensional data models (star schema, snowflake schema) for enterprise-scale data warehouses.
- Hands-on experience with ETL/ELT development using tools like dbt, Informatica, Talend, Apache Airflow, or custom data pipelines.
- 3+ years of experience working with cloud-based data warehouse platforms such as Snowflake, AWS Redshift, Google BigQuery, or Azure Synapse Analytics.
- Strong knowledge of SQL, Python, and/or Scala for data processing and transformation.
- Experience with relational and (preferably) NoSQL databases (e.g., SQL Server, MySQL, MongoDB, Cassandra).
- Experience with Microsoft Fabric/Synapse/OneLake preferred.
- Experience implementing data governance, data quality frameworks, and role-based access control (RBAC).
- Experience working closely with other departments and teams to ensure alignment on goals and objectives for successful project execution.
- Ability to perform work under minimal supervision.
- Experience understanding data requirements and performing data preparation and feature engineering to support model training and evaluation preferred.
Knowledge, Skills, and Abilities
- Identify areas of improvement, troubleshoot issues, and propose creative solutions that drive efficiency and better business results.
- Continuously assess and refine internal processes to enhance team productivity and reduce bottlenecks.
- Able to assess complex problems, weigh options, and devise practical solutions.
- Handle complex issues and problems, referring only the most complex issues to higher-level staff.
- Deep understanding of dimensional modeling (star and snowflake schemas), OLAP vs. OLTP, and modern data warehouse design principles.
- In-depth knowledge of ETL/ELT best practices, incremental data processing, and tools such as dbt, Apache Airflow, Informatica, Talend, and Fivetran.
- Understanding of data governance frameworks, data lineage, metadata management, role-based access control (RBAC), and compliance regulations (GDPR, HIPAA, CCPA).
- Knowledge of query tuning, indexing, partitioning, caching strategies, and workload optimization for large-scale data warehouses.
- Regularly update and communicate project status, timelines, and risks to both internal and external stakeholders.
- Ability to write optimized, complex SQL queries for data retrieval, transformation, and aggregation.
- Hands-on experience building scalable, fault-tolerant data pipelines for batch and real-time processing.
- Experience with RESTful APIs, event-driven architecture (Kafka, Pub/Sub), and integrating third-party data sources.
- Practical experience deploying and managing data solutions on AWS (S3, Glue, Lambda), GCP (Dataflow, Pub/Sub), or Azure (Data Factory, Synapse Analytics).
- Familiarity with BI tools (Tableau, Looker, Power BI, Mode Analytics) and self-service analytics enablement.
- Design and implement high-performance, resilient, and cost-effective data architectures.
- Analyze and resolve data integrity, scalability, and pipeline failures with structured problem-solving.
- Provide leadership, coaching, and/or mentoring to foster growth in the team.
- Explain technical concepts to non-technical stakeholders and advocate for data-driven decision-making.
- Stay current with evolving data engineering trends, tools, and methodologies.
Skill Requirements
- Typing/computer keyboard
- Utilize computer software (specified above)
- Retrieve and compile information
- Verify data and information
- Organize and prioritize information/tasks
- Advanced mathematical concepts (fractions, decimals, ratios, percentages, graphs)
- Verbal communication
- Written communication
- Research, analyze and interpret information
- Investigate, evaluate, recommend action
- Basic mathematical concepts (e.g. add, subtract)
- Abstract mathematical concepts (interpolation, inference, frequency, reliability, formulas, equations, statistics)
Required Skills
SQL, ETL Processes, Python, Azure, Google Cloud, Scala, Snowflake, Java, Azure Data Factory, AWS, Hadoop, Data Vault 2.0