High-Performance Chemistry

Sebastian Good

February 10, 2012


There’s an old standby that tells us:

A supercomputer is a device for turning compute-bound problems into I/O-bound problems.

If that’s the case, then perhaps concepts from freshman chemistry are useful when looking at HPC problems. In chemistry, the rate-limiting step is the step that constrains the rest of the reaction. We are looking at a problem these days where we would like to compute a 12 MB extract from 600 GB of data, and if we had our druthers, we’d do it in 100 ms. There are lots of pieces to this pipeline, but it’s worth considering just the raw bandwidth required, much as we would consider flux in a chemical reaction. For us, that’s 600 GB in 100 ms, or 6 TB/s. Not many devices can push that many bytes per second. Wikipedia has a handy list of device bandwidths we can look at to figure out just what kinds of things might get close. One device they don’t list there is GPU RAM, but there’s an entry over at Video Card with some details about graphics card RAM, which has pretty fabulous bandwidth.
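The flux arithmetic is simple enough to sketch as a back-of-envelope calculation (numbers taken from the problem above; Python used purely for illustration):

```python
# Required "flux": scan 600 GB of data within a 100 ms budget.
data_bytes = 600e9   # 600 GB to look at
budget_s = 0.100     # 100 ms response-time target

required_bandwidth = data_bytes / budget_s  # bytes per second
print(f"{required_bandwidth / 1e12:.0f} TB/s")  # prints "6 TB/s"
```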

The short answer is that only RAM gets close to that kind of bandwidth. I didn’t mention that the 600 GB is a subset of about 5 TB of data we’re going to keep live in RAM, and we can’t afford that much GPU RAM. The maximum theoretical bandwidth of main system RAM appears to be on the order of 20 GB/s. That suggests that if we really have to look at all 600 GB of the data, and it’s all in RAM, we still need a minimum of three hundred RAM buses to have a hope of answering our question in that time period, and that’s assuming the problem is perfectly parallelizable!
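The three-hundred-bus figure falls straight out of dividing the required flux by the per-bus bandwidth (the 20 GB/s figure is the rough theoretical maximum cited above):

```python
required_bandwidth = 6e12  # 6 TB/s, from 600 GB in 100 ms
ram_bandwidth = 20e9       # ~20 GB/s theoretical max per main-memory bus

# Minimum number of RAM buses needed, assuming perfect parallelism.
buses = required_bandwidth / ram_bandwidth
print(buses)  # prints "300.0"
```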

There are probably other important rate-limiting steps, like the speed of the floating-point units or vector processing power, but this is at least a useful one to get on the table for our problem. There are also ways to make better use of that bandwidth, such as compressing data in RAM. What if we could compress 10x? Or use a heuristic that estimates from 50% of the data? These are potential improvements, but the basic number is an excellent sanity check.
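Those what-ifs compose multiplicatively against the baseline. A quick sketch, under the hypothetical figures floated above (10x compression, a heuristic that touches half the data):

```python
buses_needed = 300  # baseline from the bandwidth estimate
compression = 10    # hypothetical 10x in-RAM compression ratio
sampling = 0.5      # hypothetical heuristic touching only 50% of the data

# Each factor reduces the bytes that must actually cross the bus.
print(buses_needed / compression)             # prints "30.0"
print(buses_needed / compression * sampling)  # prints "15.0"
```

Even both combined leave us needing fifteen buses, which is why the raw number makes such a good sanity check.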
