RedLine Performance Solutions (RedLine) has been in the HPC solutions engineering services business for approximately 17 years and keeps the bar of excellence high for new hires. This enables RedLine to accomplish what other firms cannot and promotes a high level of staff retention. We offer services ranging from full life cycle HPC systems engineering to remote managed services to HPC program analysis. We are located in the Washington, DC area and are looking for a Cloud System Administrator Lead – OpenStack to join us for our NASA NACS High Performance Computing contract.
US citizenship and the ability to obtain a Public Trust security clearance are mandatory requirements for this position. The position is located at a customer site in Greenbelt, MD. Preference is for local candidates, but we will consider relocation as well.
The Cloud System Administrator is responsible for leading a team of system administrators integrating, supporting, and troubleshooting a compute-intensive private cloud environment. Primary system administration duties include maintaining a 300+ node OpenStack environment, KVM virtualization, a multi-petabyte GPFS storage filesystem, and SLURM workload management, all supporting a technical workload that is primarily analytics and scientific computing in the realm of climate science. Regular interaction and communication with the Program Manager, Site Lead, Customer, and site staff at scheduled customer meetings is required. This role is responsible for keeping the customer informed of activities and progress and answering customer inquiries concerning all aspects of the various HPC systems and private and public cloud infrastructures.
Duties and Responsibilities:
- Troubleshoot and resolve in-depth technical problems on Linux
- Direct the work of other system administrators in integration and troubleshooting
- Monitor system performance and maintain high availability of critical company systems
- Test and certify security patches and new software before production deployment
- Provide off-hours support as required to maintain the availability of key systems and services
- Work on special projects as assigned
- Participate in weekly teleconference team meetings and prepare meeting minutes
- Deploy and test OpenStack cloud
- Troubleshoot encountered issues and provide solutions.
- Contribute to deployment and administrative repositories by creating Puppet manifests, shell scripts, Python scripts, etc.
- Actively contribute to reviews and documentation
- Design, implement, and customize OpenStack features in Python; fix defects and provide improvements wherever required
- Understand the OpenStack ecosystem, engage in discussions with the OpenStack community, and implement best practices
Qualifications:
- Bachelor’s degree in Computer Science, Management Information Systems, or other technical discipline plus 10 years of experience required, or Master’s degree or equivalent plus 5 years of relevant experience.
- Recent cloud computing experience utilizing AWS (preferred), Google Cloud Platform, MS Azure, or other related platforms, and/or private cloud deployments utilizing OpenStack or similar.
- Required skills include Linux, OpenStack, KVM, GPFS, SLURM, Puppet and automation of AWS or other public cloud vendors.
- Strong scripting abilities (Bash, Korn, PowerShell, Ruby, Python, etc.)
- Experience with automation technologies covering automated deployment, configuration, testing, and monitoring
- Experience troubleshooting RPM-based or APT-based Linux distributions
- Experience with web-based development, scientific computing, or numerical analytics a plus.
- Experience communicating with users, other technical teams, and management to collect requirements and to describe software product features and technical designs.
- Good organization skills to balance and prioritize work, and ability to multitask
- Good communication skills for working with support personnel, customers, and managers
Travel required: Minimal
Please email firstname.lastname@example.org with your resume if this opportunity is of interest to you.