InfiniBand is among the most common and well-known cluster interconnect technologies. However, the complexities of an InfiniBand (IB) network can frustrate even the most experienced cluster administrators. Maintaining a balanced fabric topology, dealing with underperforming hosts and links, and chasing potential improvements keep all of us on our toes. Sometimes, though, a little research and experimentation […]
Most organizations need some level and type of archive, whether it’s small or large. Sometimes it’s a legal requirement, as it is for companies that keep medical or financial records. Firms using HPC systems often need to store large data sets long-term in order to analyze them later. Organizations with small archive requirements, say less than a […]
There’s a big difference between basic system monitoring and performance monitoring. In the world of HPC, this distinction is greatly magnified. In the former case, monitoring often boils down to checking binary indicators to make sure system components are up or down, on or off, available or not. Red light/green light monitoring is certainly a […]
We’ve been talking about how baselining and benchmarking help optimize the performance of your HPC system. Last time, we discussed the importance of maintaining a set of consistent benchmarks so you can analyze the performance of your HPC system throughout its lifetime. In this post, we’ll cover some specific baselining tips. Baselining should always start with […]
HPC systems are frustratingly complex beasts to tame. Competing interests from hardware and software support teams, coupled with demands from IT security compliance, can make “consistency” in a cluster tough to achieve. Baselining is the natural answer. With baselining, we set a series of “custom” benchmarks. By custom, I mean they are a given […]