NUNA

The “vader” shared memory transport in Open MPI: Now featuring 3 flavors of zero copy!

Today’s blog post is by Nathan Hjelm, a Research Scientist at Los Alamos National Laboratory, and a core developer on the Open MPI project. The latest version of the “vader” shared memory Byte Transport Layer (BTL) in the upcoming Open MPI v1.8.4 release is bringing better small me…

Traffic (redux)

I’ve written about network traffic before (see this post and this post). It’s the subject of endless blog posts, help forums, and instructional guides across the internet. In a High Performance Computing (HPC) context, there are some fascinating aspects about network traffic that are fai…

Process affinity: Hop on the bus, Gus!

Today’s blog post is written by Joshua Ladd, Open MPI developer and HPC Algorithms Engineer at Mellanox Technologies. At some point in the process of pondering this blog post I noticed that my subconscious had, much to my annoyance, registered a snippet of the chorus to Paul Simon’s time…

Open MPI: Binding to core by default

After years of discussion, the upcoming release of Open MPI 1.7.4 will change how processes are laid out (“mapped”) and bound by default. Here are the specifics: if the number of processes is <= 2, processes will be mapped by core; if the number of processes is > 2, processes…

EuroMPI’13 Cisco slides: Open MPI Process Affinity User Interface

The slides below are from my presentation at EuroMPI’13 about Open MPI’s flexible process affinity interface (in OMPI 1.7.2 and later).  I described this system in prior blog entries (one, two, three), but many people keep asking me about it. Josh Hursey from U. Wisconsin, LaCrosse, wr…

How many network links do you have for MPI traffic?

If you’re a bargain basement HPC user, you might well scoff at the idea of having more than one network interface for your MPI traffic. “I’ve got (insert your favorite high bandwidth network name here)! That’s plenty to serve all my cores! Why would I need more than that?”…

Latency Analogies (part 2)

In a prior blog post, I talked about latency analogies.  I compared levels of latencies to your home, your neighborhood, a far-away neighborhood, and another city.  I talked about these localities in terms of communication. Let’s extend that analogy to talk about data locality.…

Latency Analogies

Multiple readers have told me that it is difficult for them to understand and/or visualize the effects of latency on their HPC applications, particularly in modern NUMA (non-uniform memory access) and NUNA (non-uniform network access) environments. Let’s break down the different levels of lat…