HPC Engineering Lead - Military veterans preferred

SAIC (www.saic.com)


  Full-time employee

United States


JOB DESCRIPTION: This position provides technical leadership and systems engineering management for High Performance Computing (HPC). The role is responsible for systems planning, system and lower-level requirements development, system design, analyses, and trade studies as they relate to HPC solutions that provide a robust, reliable, cost-effective, and scalable infrastructure for Next Generation Gene Sequencing and other high-throughput data analysis requirements. Support includes planning, analysis, design, development, testing, configuration, installation, implementation, integration, maintenance, and management of all HPC-related hardware, software, and infrastructure.

Job Specific Responsibilities: 

  • Provide efficient and effective support of HPC-related hardware, software, and infrastructure components to maximize the performance and availability of HPC solutions.
  • Provide timely and effective maintenance and repair support on HPC-related hardware, software, and infrastructure components to maximize the performance and availability of HPC solutions.
  • Provide efficient performance monitoring of all HPC-related hardware, software, and infrastructure components, including timely and accurate notification of HPC-related issues.
  • Provide after-hours monitoring and timely resolution commensurate with the mission criticality of the affected system(s).
  • Support, maintain, and, as required, enhance established strategies for application hosting to ensure continuity of business operations and timely recovery in the event of disaster.
  • Collect, store, and analyze data relevant to HPC solutions to perform and report accurate root cause analysis of all related issues and to support trend analysis and forecasting.
  • Effectively manage the procurement and maintenance of all supplies, materials, supporting software licenses, and service agreements required to ensure that supported HPC-related hardware, software, and infrastructure components maximize the performance and availability of HPC solutions.
  • Ensure effective change and configuration management of all supported HPC solutions to establish and maintain consistency of their performance, security, and functional and physical attributes with approved requirements, design, and operational information throughout their life cycle.
  • Assist in the development and maintenance of standard operating procedures for operation, maintenance, and repair of HPC hardware, software, and infrastructure components.
  • Ensure all HPC-related data and documentation are added to, and kept current within, the Knowledge Database and Document Library to provide efficient access to a complete and current source of operationally relevant structured and unstructured data.


  • Cisco Systems, Nexus
  • Brocade
  • Microsoft Windows
  • Linux
  • VMware
  • EMC
  • DDN
  • Dell
  • IBM 


  • Performance Management
  • Requirements Development
  • High Performance Computing Cluster
  • HPC cluster and management tool, job schedulers
  • SAN-Storage
  • Security
  • Systems Architecture
  • Network Systems Architecture
  • Communication Systems T


  • Cisco Systems
  • Routing/Switching
  • Networking
  • Unified Communications
  • VMware
  • Unified Communications (including Skype for Business and Polycom VT) 




  • Bachelor's degree in Computer Science, Computer Engineering, or a related field
  • Minimum of 10 years of experience as an Information Systems Engineer
  • Minimum of seven years of HPC experience
  • In-depth knowledge of HPC clusters and software, such as cluster management/provisioning tools, job schedulers (SGE, PBS, etc.), parallel file systems, MPI, MPICH, etc.
  • Deep understanding of HPC technologies such as storage, high-speed interconnects (InfiniBand, 10GigE, etc.), and cluster file systems (GPFS, Lustre, etc.)
  • Experience with scientific computing support, including scientific computing software and application support. Experience with bioinformatics or biomechanics software and application support is a plus.
  • Eight to ten years of experience in team leadership
  • Five years of experience providing formal documentation and briefings
  • Able to obtain a Public Trust L5 security clearance


  • Master of Science in Computer Science or Computer Engineering
  • Graduate degree in Computer Science or Computer Systems Engineering
  • Experience with or knowledge of HHS EPLC