In my previous blog entry, I answered a user question about how MPI defines its global constants, specifically in the context of interactions with other languages.
I went beyond that answer, and also explained why MPI does not define an ABI.
In this entry, I’ll go into the “how does MPI interact with other languages?” part of the question.
Let me start off by saying: MPI officially only defines bindings for C and Fortran.
MPI-2 defined C++ bindings, but those were deprecated in MPI-2.2 and deleted in MPI-3.0 (they were quite complicated to maintain, there were many C++-specific errata during the MPI-2 timeframe, and very few people used them).
But the point remains: other languages are heating up in HPC these days. For example, scripting languages are becoming popular again.
You obviously get lower raw performance from scripting languages than from C or Fortran, but raw performance is not necessarily the goal. There’s a wide range of applications that can gain huge performance benefits from parallelization, even if they leave a little performance on the table due to language runtime overhead. The typical argument is that prototyping is faster and productivity is higher in scripting languages than in C (and possibly Fortran).
- mpi4py (MPI for Python) is likely the best example of a well-supported, active scripting language project for MPI.
- There are a few commercial products out there, too, such as MATLAB’s Parallel Computing Toolbox.
- There’s a minor resurgence of using MPI with Java (e.g., bundled in Open MPI). Java’s not quite a scripting language, but it fits the category of “not C or Fortran,” so I included it in this list. There were several Java MPI variants in the late ’90s, most (all?) of which no longer exist.
- And finally, there are a bunch of old, unmaintained MPI-based inter-language projects available (Google for them). These reflect an interest in, and a desire for, interfacing to MPI, but they have become abandonware over time.
MPI’s support for non-C/Fortran languages mainly consists of two things:
- “That which works with C, works with MPI.” This is not as trite as it sounds: the point here is that many scripting languages include some type of back-end interface to C. Hence, using this interface, one can effectively create an MPI interface in any language that has a good interface to C.
- MPI-3.0 added the MPI_MPROBE suite of functions. One of their explicit goals was to help scripting languages: they provide the ability to safely receive a message of unknown size.
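To make the second point a little more concrete, here is a toy sketch of why a matched probe is safer than a plain probe followed by a receive. It is a single-process, in-memory model written for illustration only: `ToyMailbox`, `MatchedMessage`, `probe_then_recv`, `mprobe_then_mrecv`, and the `intruder` callback are all hypothetical names, not part of MPI or mpi4py. The assumption being modeled is that another receiver (for example, a second thread) can take a message between the probe and the receive.

```python
from collections import deque


class MatchedMessage:
    """Handle for a message that has already been taken off the queue."""

    def __init__(self, payload: bytes):
        self._payload = payload
        self.size = len(payload)

    def recv_into(self, buf: bytearray) -> int:
        # Only the holder of this handle can receive this exact message.
        if len(buf) < self.size:
            raise BufferError("buffer too small for message")
        buf[: self.size] = self._payload
        return self.size


class ToyMailbox:
    """Hypothetical stand-in for an MPI receive queue (not a real API)."""

    def __init__(self):
        self._queue = deque()

    def send(self, payload: bytes) -> None:
        self._queue.append(payload)

    def probe(self) -> int:
        # Like MPI_Probe: report the size but leave the message queued.
        return len(self._queue[0])

    def recv_into(self, buf: bytearray) -> int:
        # Like MPI_Recv: take whatever message is at the head right now.
        return MatchedMessage(self._queue.popleft()).recv_into(buf)

    def mprobe(self) -> MatchedMessage:
        # Like MPI_Mprobe: remove the message from the queue and return a
        # handle to it, so no other receiver can match it.
        return MatchedMessage(self._queue.popleft())


def probe_then_recv(box: ToyMailbox, intruder) -> bytes:
    buf = bytearray(box.probe())  # size the buffer for the head message
    intruder()                    # another receiver sneaks in here
    n = box.recv_into(buf)        # may now match a different message
    return bytes(buf[:n])


def mprobe_then_mrecv(box: ToyMailbox, intruder) -> bytes:
    msg = box.mprobe()            # the message is now ours alone
    buf = bytearray(msg.size)
    intruder()                    # the intruder can only take other messages
    n = msg.recv_into(buf)
    return bytes(buf[:n])
```

In this model, `probe_then_recv` can end up with a buffer sized for one message while receiving another, whereas `mprobe_then_mrecv` always receives the message it measured. A language binding that has to allocate its own receive buffers needs that second guarantee.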
Honestly, that’s pretty weak support for non-C/Fortran languages. 🙁
Even though the MPI Forum is unlikely to support languages other than C or Fortran in the near term, I’d be interested in hearing from readers what you need from MPI to support your language.
Is better inter-language operability support something that the MPI Forum should add in MPI-4? If so, what does it look like? What kinds of hooks would be necessary? What kinds of interfaces and functionality would be useful?
FWIW, I put this question to the Perl CPAN community in 2009, but didn’t get any response. Is there any more interest these days?
Thanks!
I can only talk about strongly-typed compiled languages (like Rust, which is my main interest).
For a zero-cost MPI wrapper, I want to map to the same C representation that a given implementation uses. Rust allows me to do that via its FFI with C. The only things I need to know are the C types of the API (constants, functions, …). Each MPI implementation uses different types in its C interface, so I need a different FFI C wrapper for each. This is particularly painful in Rust because it does not integrate the C preprocessor (which is a good thing), so either the build system has to preprocess the sources, or the wrapper has to know the types for each version of each implementation and hope that they don’t change.
Then I want to write code against one MPI implementation and have it working with any implementation via a recompile. I have to hope that changing the types from one implementation to another doesn’t break my program. I don’t think this ever happened to me in practice but it feels very fragile.
As far as user codes are concerned — and inter-language bindings wrappers should be no different than other MPI applications — the types are the same, regardless of the implementation.
If Rust can’t utilize the C preprocessor — i.e., it can’t see/utilize the internals of mpi.h, where such types are declared — that sounds like the Rust<-->C integration is not fully complete. Specifically, from your description, it sounds like publicly-declared types (e.g., structs) would need to be mirrored in Rust, since their respective C .h files could not be used. That seems odd / incomplete.
As for the reasons why MPI does not have an ABI, see my prior blog entry (http://blogs.cisco.com/performance/mpi-outside-of-c-and-fortran). The reasons that I cite may not make you happy, but know that at least there are definite, deliberate reasons why there is no ABI today (that may change someday, but that’s the way it is today).
I’ll settle for good Fortran bindings 😛 How many implementations are fully functional when Fortran codes are compiled with 64b integers (the -i8 way)? How many support Fortran 2008 float128 everywhere?
Sounds like a quality of implementation issue, not a standards issue. 🙂
Open MPI has worked properly when Fortran is compiled with -i8 in the past, but I don’t think it’s something we test regularly.
Thanks, Jeff, it was a very interesting read.