SAIC is seeking a Sr. Data Engineer to perform data model design, data formatting, and ETL development optimized for efficient storage, access, and computation to serve various use cases. You will work closely with data scientists, software developers, and leadership to understand use cases and requirements, then leverage appropriate tools and resources to achieve desired customer deliverables. You will explore data from various sources; develop new tools, code, and services to execute data engineering activities; develop new data models and modify existing ones; and write code for ETL processes in a fast-paced environment.
Job Duties Include:
Move structured and unstructured data (gigabyte to terabyte range) using Sponsor-approved methods.
Execute data ingestion activities for storing data in a local or enterprise-level (Integrated Data Layer) location.
View data in its source format.
Develop code to format data that facilitates exploration.
Analyze source data formats and work with Data Scientists and Mission Partners to determine the formats and transforms that best meet mission objectives.
Develop code and tools to provide one-time and ongoing data formatting and transformations into enterprise or boutique data models.
Implement existing ETL code and best practices/standards that are currently in use in the enterprise.
Develop an ETL Code Transition Plan when the Sponsor identifies a specific project. Projects will be identified periodically.
Develop and deliver Software Documentation for each code project that includes ETL mappings, a code use guide, code location (generally GitHub) and access instructions, and anomalies encountered.
Facilitate Code Reviews twice a year for each mission partner organization and once for each project.
Required Qualifications:
Must have an active/current TS/SCI with Polygraph
Bachelor’s Degree (or equivalent experience) and 14+ years of experience
Experience with ETL code and tools
Amazon Web Services (AWS) experience
Experience working and developing capabilities on Linux and Windows
5+ years of Spark experience
5+ years of Hadoop experience
Experience developing, testing, and maintaining Python programs as packages or notebooks
Experience developing and maintaining data processing flows using NiFi
Experience working with and maintaining SQL database systems, particularly PostgreSQL and MySQL/MariaDB
Experience working with search-oriented NoSQL systems such as Elasticsearch and SOLR, and with key-value data stores
Experience developing data pipelines and automation
Experience working with diverse data types including text, image, video, audio, and binary files
Experience with geospatial tools and with processing and analyzing geospatial data
Experience developing and maintaining dashboards for users to engage with data using Kibana or equivalent technologies