MPI Test Suite Test Descriptions
Each of the collapsible regions below corresponds to the test groups displayed
in the MPI Test Suite Results table. Click the name of a test group to
see the individual tests which comprise it. Some tests appear in multiple
groups.
Back to Results Table
Topology
The Network topology tests are designed to examine the operation of specific communication patterns such as Cartesian and Graph topology.
MPI_Cart_create basic
This test creates a cartesian mesh and tests for errors.
MPI_Cartdim_get zero-dim
Check that the MPI implementation properly handles zero-dimensional Cartesian communicators - the original standard implies that these should be consistent with higher dimensional topologies and therefore should work with any MPI implementation. MPI 2.1 made this requirement explicit.
MPI_Cart_map basic
This test creates a cartesian map and tests for errors.
MPI_Cart_shift basic
This test exercises MPI_Cart_shift().
MPI_Cart_sub basic
This test exercises MPI_Cart_sub().
MPI_Dims_create nodes
This test uses multiple variations for the arguments of MPI_Dims_create() and tests whether the product of ndims (number of dimensions) and the returned dimensions are equal to nnodes (number of nodes) thereby determining if the decomposition is correct. The test also checks for compliance with the MPI_- standard section 6.5 regarding decomposition with increasing dimensions. The test considers dimensions 2-4.
MPI_Dims_create special 2d/4d
This test is similar to topo/dims1 but only exercises dimensions 2 and 4 including test cases whether all dimensions are specified.
MPI_Dims_create special 3d/4d
This test is similar to topo/dims1 but only considers special cases using dimensions 3 and 4.
MPI_Dist_graph_create
This test excercises MPI_Dist_graph_create() and MPI_Dist_graph_adjacent().
MPI_Graph_create null/dup
Create a communicator with a graph that contains null edges and one that contains duplicate edges.
MPI_Graph_create zero procs
Create a communicator with a graph that contains no processes.
MPI_Graph_map basic
Simple test of MPI_Graph_map().
MPI_Topo_test datatypes
Check that topo test returns the correct type, including MPI_UNDEFINED.
MPI_Topo_test dgraph
Specify a distributed graph of a bidirectional ring of the MPI_COMM_WORLD communicator. Thus each node in the graph has a left and right neighbor.
MPI_Topo_test dup
Create a cartesian topology, get its characteristics, then dup it and check that the new communicator has the same properties.
Neighborhood collectives
A basic test for the 10 (5 patterns x {blocking,non-blocking}) MPI-3 neighborhood collective routines.
Basic Functionality
This group features tests that emphasize basic MPI functionality such as initializing MPI and retrieving its rank.
Basic send/recv
This is a basic test of the send/receive with a barrier using MPI_Send() and MPI_Recv().
Const cast
This test is designed to test the new MPI-3.0 const cast applied to a "const *" buffer pointer.
Elapsed walltime
This test measures how accurately MPI can measure 1 second.
Generalized request basic
Simple test of generalized requests. This simple code allows us to check that requests can be created, tested, and waited on in the case where the request is complete before the wait is called.
Init arguments
In MPI-1.1, it is explicitly stated that an implementation is allowed to require that the arguments argc and argv passed by an application to MPI_Init in C be the same arguments passed into the application as the arguments to main. In MPI-2 implementations are not allowed to impose this requirement. Conforming implementations of MPI allow applications to pass NULL for both the argc and argv arguments of MPI_Init(). This test prints the result of the error status of MPI_Init(). If the test completes without error, it reports 'No errors.'
Input queuing
Test of a large number of MPI datatype messages with no preposted receive so that an MPI implementation may have to queue up messages on the sending side. Uses MPI_Type_Create_indexed_block to create the send datatype and receives data as ints.
Intracomm communicator
This program calls MPI_Reduce with all Intracomm Communicators.
Isend and Request_free
Test multiple non-blocking send routines with MPI_Request_Free. Creates non-blocking messages with MPI_Isend(), MPI_Ibsend(), MPI_Issend(), and MPI_Irsend() then frees each request.
Large send/recv
This test sends the length of a message, followed by the message body.
MPI_ANY_{SOURCE,TAG}
This test uses MPI_ANY_SOURCE and MPI_ANY_TAG in repeated MPI_Irecv() calls. One implementation delivered incorrect data when using both ANY_SOURCE and ANY_TAG.
MPI_Abort() return exit
This program calls MPI_Abort and confirms that the exit status in the call is returned to the invoking environment.
MPI Attribues test
This is a test of creating and inserting attribues in different orders to ensure that the list management code handles all cases.
MPI_BOTTOM basic
Simple test using MPI_BOTTOM for MPI_Send() and MPI_Recv().
MPI_Bsend alignment
Simple test for MPI_Bsend() that sends and receives multiple messages with message sizes chosen to expose alignment problems.
MPI_Bsend buffer alignment
Test bsend with a buffer with alignment between 1 and 7 bytes.
MPI_Bsend detach
Test the handling of MPI_Bsend() operations when a detach occurs between MPI_Bsend() and MPI_Recv(). Uses busy wait to ensure detach occurs between MPI routines and tests with a selection of communicators.
MPI_Bsend() intercomm
Simple test for MPI_Bsend() that creates an intercommunicator with two evenly sized groups and then repeatedly sends and receives messages between groups.
MPI_Bsend ordered
Test bsend message handling where different messages are received in different orders.
MPI_Bsend repeat
Simple test for MPI_Bsend() that repeatedly sends and receives messages.
MPI_Bsend with init and start
Simple test for MPI_Bsend() that uses MPI_Bsend_init() to create a persistent communication request and then repeatedly sends and receives messages. Includes tests using MPI_Start() and MPI_Startall().
MPI_Cancel completed sends
Calls MPI_Isend(), forces it to complete with a barrier, calls MPI_Cancel(), then checks cancel status. Such a cancel operation should silently fail. This test returns a failure status if the cancel succeeds.
MPI_Cancel sends
Test of various send cancel calls. Sends messages with MPI_Isend(), MPI_Ibsend(), MPI_Irsend(), and MPI_Issend() and then immediately cancels them. Then verifies message was cancelled and was not received by destination process.
MPI_Finalized() test
This tests when MPI_Finalized() will work correctly if MPI_INit() was not called. This behaviour is not defined by the MPI standard, therefore this test is not garanteed.
MPI_Get_library_version test
MPI-3.0 Test returns MPI library version.
MPI_Get_version() test
This MPI_3.0 test prints the MPI version. If running a version of MPI < 3.0, it simply prints "No Errors".
MPI_Ibsend repeat
Simple test for MPI_Ibsend() that repeatedly sends and receives messages.
MPI_{Is,Query}_thread() test
This test examines the MPI_Is_thread() and MPI_Query_thread() call after being initilized using MPI_Init_thread().
MPI_Isend root cancel
This test case has the root send a non-blocking synchronous message to itself, cancels it, then attempts to read it.
MPI_Isend root
This is a simple test case of sending a non-blocking message to the root process. Includes test with a null pointer. This test uses a single process.
MPI_Isend root probe
This is a simple test case of the root sending a message to itself and probing this message.
MPI_Mprobe() series
This tests MPI_Mprobe() using a series of tests. Includes tests with send and Mprobe+Mrecv, send and Mprobe+Imrecv, send and Improbe+Mrecv, send and Improbe+Irecv, Mprobe+Mrecv with MPI_PROC_NULL, Mprobe+Imrecv with MPI_PROC_NULL, Improbe+Mrecv with MPI_PROC_NULL, Improbe+Imrecv, and test to verify MPI_Message_c2f() and MPI_Message_f2c() are present.
MPI_Probe() null source
This program checks that MPI_Iprobe() and MPI_Probe() correctly handle a source of MPI_PROC_NULL.
MPI_Probe() unexpected
This program verifies that MPI_Probe() is operating properly in the face of unexpected messages arriving after MPI_Probe() has been called. This program may hang if MPI_Probe() does not return when the message finally arrives. Tested with a variety of message sizes and number of messages.
MPI_Request_get_status
Test MPI_Request_get_status(). Sends a message with MPI_Ssend() and creates receives request with MPI_Irecv(). Verifies Request_get_status does not return correct values prior to MPI_Wait() and returns correct values afterwards. The test also checks that MPI_REQUEST_NULL and MPI_STATUS_IGNORE work as arguments as required beginning with MPI-2.2.
MPI_Request many irecv
This test issues many non-blocking receives followed by many blocking MPI_Send() calls, then issues an MPI_Wait() on all pending receives using multiple processes and increasing array sizes. This test may fail due to bugs in the handling of request completions or in queue operations.
MPI_{Send,Receive} basic
This is a simple test using MPI_Send() and MPI_Recv(), MPI_Sendrecv(), and MPI_Sendrecv_replace() to send messages between two processes using a selection of communicators and datatypes and increasing array sizes.
MPI_{Send,Receive} large backoff
Head to head MPI_Send() and MPI_Recv() to test backoff in device when large messages are being transferred. Includes a test that has one process sleep prior to calling send and recv.
MPI_{Send,Receive} vector
This is a simple test of MPI_Send() and MPI_Recv() using MPI_Type_vector() to create datatypes with an increasing number of blocks.
MPI_Send intercomm
Simple test of intercommunicator send and receive using a selection of intercommunicators.
MPI_Status large count
This test manipulates an MPI status object using MPI_Status_set_elements_x() with various large count values and verifying MPI_Get_elements_x() and MPI_Test_cancelled() produce the correct values.
MPI_Test pt2pt
This test program checks that the point-to-point completion routines can be applied to an inactive persistent request, as required by the MPI-1 standard. See section 3.7.3. It is allowed to call MPI TEST with a null or inactive request argument. In such a case the operation returns with flag = true and empty status. Tests both persistent send and persistent receive requests.
MPI_Waitany basic
This is a simple test of MPI_Waitany().
MPI_Waitany comprehensive
This program checks that the various MPI_Test and MPI_Wait routines allow both null requests and in the multiple completion cases, empty lists of requests.
MPI_Wtime() test
This program tests the ability of mpiexec to timeout a process after no more than 3 minutes. By default, it will run for 30 secs.
Many send/cancel order
Test of various receive cancel calls. Creates multiple receive requests then cancels three requests in a more interesting order to ensure the queue operation works properly. The other request receives the message.
Message patterns
This test sends/receives a number of messages in different patterns to make sure that all messages are received in the order they are sent. Two processes are used in the test.
Persistent send/cancel
Test cancelling persistent send calls. Tests various persistent send calls including MPI_Send_init(), MPI_Bsend_init(), MPI_Rsend_init(), and MPI_Ssend_init() followed by calls to MPI_Cancel().
Ping flood
This test sends a large number of messages in a loop in the source process, and receives a large number of messages in a loop in the destination process using a selection of communicators, datatypes, and array sizes.
Preposted receive
Test root sending to self with a preposted receive for a selection of datatypes and increasing array sizes. Includes tests for MPI_Send(), MPI_Ssend(), and MPI_Rsend().
Race condition
Repeatedly sends messages to the root from all other processes. Run this test with 8 processes. This test was submitted as a result of problems seen with the ch3:shm device on a Solaris system. The symptom is that the test hangs; this is due to losing a message, probably due to a race condition in a message-queue update.
Sendrecv from/to
This test uses MPI_Sendrecv() sent from and to rank=0. Includes test for MPI_Sendrecv_replace().
Simple thread finalize
The test here is a simple one that Finalize exits, so the only action is to write no error.
Simple thread initialize
The test initializes a thread, then calls MPI_Finalize() and prints "No errors".
Communicator Testing
This group features tests that emphasize MPI calls that create, manipulate, and delete MPI Communicators.
Comm_create_group excl 4 rank
This test using 4 processes creates a group with the even processes using MPI_Group_excl() and uses this group to create a communicator. Then both the communicator and group are freed.
Comm_create_group excl 8 rank
This test using 8 processes creates a group with the even processes using MPI_Group_excl() and uses this group to create a communicator. Then both the communicator and group are freed.
Comm_create_group incl 2 rank
This test using 2 processes creates a group with ranks less than size/2 using MPI_Group_range_incl() and uses this group to create a communicator. Then both the communicator and group are freed.
Comm_create_group incl 4 rank
This test using 4 processes creates a group with ranks less than size/2 using MPI_Group_range_incl() and uses this group to create a communicator. Then both the communicator and group are freed.
Comm_create_group incl 8 rank
This test using 8 processes creates a group with ranks less than size/2 using MPI_Group_range_incl() and uses this group to create a communicator. Then both the communicator and group are freed.
Comm_create_group random 2 rank
This test using 2 processes creates and frees groups by randomly adding processes to a group, then creating a communicator with the group.
Comm_create_group random 4 rank
This test using 4 processes creates and frees groups by randomly adding processes to a group, then creating a communicator with the group.
Comm_create_group random 8 rank
This test using 8 processes creates and frees groups by randomly adding processes to a group, then creating a communicator with the group.
Comm_create group tests
Simple test that gets the group of an intercommunicator using MPI_Group_rank() and MPI_Group_size() using a selection of intercommunicators.
Comm_create intercommunicators
This program tests MPI_Comm_create() using a selection of intercommunicators. Creates a new communicator from an intercommunicator, duplicates the communicator, and verifies that it works. Includes test with one side of intercommunicator being set with MPI_GROUP_EMPTY.
Comm creation comprehensive
Check that Communicators can be created from various subsets of the processes in the communicator. Uses MPI_Comm_group(), MPI_Group_range_incl(), and MPI_Comm_dup() to create new communicators.
Comm_{dup,free} contexts
This program tests the allocation and deallocation of contexts by using MPI_Comm_dup() to create many communicators in batches and then freeing them in batches.
Comm_dup basic
This test exercises MPI_Comm_dup() by duplicating a communicator, checking basic properties, and communicating with this new communicator.
Comm_dup contexts
Check that communicators have separate contexts. We do this by setting up non-blocking receives on two communicators and then sending to them. If the contexts are different, tests on the unsatisfied communicator should indicate no available message. Tested using a selection of intercommunicators.
Comm_{get,set}_name basic
Simple test for MPI_Comm_get_name() using a selection of communicators.
Comm_idup 2 rank
Multiple tests using 2 processes that make rank 0 wait in a blocking receive until all other processes have called MPI_Comm_idup(), then call idup afterwards. Should ensure that idup doesn't deadlock. Includes a test using an intercommunicator.]
Comm_idup 4 rank
Multiple tests using 4 processes that make rank 0 wait in a blocking receive until all other processes have called MPI_Comm_idup(), then call idup afterwards. Should ensure that idup doesn't deadlock. Includes a test using an intercommunicator.
Comm_idup 9 rank
Multiple tests using 9 processes that make rank 0 wait in a blocking receive until all other processes have called MPI_Comm_idup(), then call idup afterwards. Should ensure that idup doesn't deadlock. Includes a test using an intercommunicator.]
Comm_idup multi
Simple test creating multiple communicators with MPI_Comm_idup.
Comm_idup overlap
Each pair of processes uses MPI_Comm_idup() to dup the communicator such that the dups are overlapping. If this were done with MPI_Comm_dup() this should deadlock.
Comm_split basic
Simple test for MPI_Comm_split().
Comm_split intercommunicators
This tests MPI_Comm_split() using a selection of intercommunicators. The split communicator is tested using simple send and receive routines.
Comm_split key order
This test ensures that MPI_Comm_split breaks ties in key values by using the original rank in the input communicator. This typically corresponds to the difference between using a stable sort or using an unstable sort. It checks all sizes from 1..comm_size(world)-1, so this test does not need to be run multiple times at process counts from a higher-level test driver.
Comm_split_type basic
Tests MPI_Comm_split_type() including a test using MPI_UNDEFINED.
Comm_with_info dup 2 rank
This test exercises MPI_Comm_dup_with_info() with 2 processes by setting the info for a communicator, duplicating it, and then testing the communicator.
Comm_with_info dup 4 rank
This test exercises MPI_Comm_dup_with_info() with 4 processes by setting the info for a communicator, duplicating it, and then testing the communicator.
Comm_with_info dup 9 rank
This test exercises MPI_Comm_dup_with_info() with 9 processes by setting the info for a communicator, duplicating it, and then testing the communicator.
Context split
This test uses MPI_Comm_split() to repeatedly create and free communicators. This check is intended to fail if there is a leak of context ids. This test needs to run longer than many tests because it tries to exhaust the number of context ids. The for loop uses 10000 iterations, which is adequate for MPICH (with only about 1k context ids available).
Intercomm_create basic
A simple test of MPI_Intercomm_create() that creates an intercommunicator and verifies that it works.
Intercomm_create many rank 2x2
Test for MPI_Intercomm_create() using at least 33 processes that exercises a loop bounds bug by creating and freeing two intercommunicators with two processes each.
Intercomm_merge
Test MPI_Intercomm_merge() using a selection of intercommunicators. Includes multiple tests with different choices for the high value.
Intercomm probe
Test MPI_Probe() with a selection of intercommunicators. Creates and intercommunicator, probes it, and then frees it.
MPI_Info_create basic
Simple test for MPI_Comm_{set,get}_info.
Multiple threads context dup
This test creates communicators concurrently in different threads.
Multiple threads context idup
This test creates communicators concurrently, non-blocking, in different threads.
Multiple threads dup leak
This test repeatedly duplicates and frees communicators with multiple threads concurrently to stress the multithreaded aspects of the context ID allocation code. Thanks to IBM for providing the original version of this test.
Simple thread comm dup
This is a simple test of threads in MPI with communicator duplication.
Simple thread comm idup
This is a simple test of threads in MPI with non-blocking communicator duplication.
Thread Group creation
Every thread paticipates in a distinct MPI_Comm_create group, distinguished by its thread-id (used as the tag). Threads on even ranks join an even comm and threads on odd ranks join the odd comm.
Threaded group
In this test a number of threads are created with a distinct MPI communicator (or comm) group distinguished by its thread-id (used as a tag). Threads on even ranks join an even comm and threads on odd ranks join the odd comm.
Error Processing
This group features tests of MPI error processing.
Error Handling
Reports the default action taken on an error. It also reports if error handling can be changed to "returns", and if so, if this functions properly.
File IO error handlers
This test exercises MPI I/O and MPI error handling techniques.
MPI_Abort() return exit
This program calls MPI_Abort and confirms that the exit status in the call is returned to the invoking environment.
MPI_Add_error_class basic
Create NCLASSES new classes, each with 5 codes (160 total).
MPI_Comm_errhandler basic
Test comm_{set,call}_errhandle.
MPI_Error_string basic
Test that prints out MPI error codes from 0-53.
MPI_Error_string error class
Simple test where an MPI error class is created, and an error string introduced for that string.
User error handling 1 rank
Ensure that setting a user-defined error handler on predefined communicators does not cause a problem at finalize time. Regression test for former issue. Runs on 1 rank.
User error handling 2 rank
Ensure that setting a user-defined error handler on predefined communicators does not cause a problem at finalize time. Regression test for former issue. Runs on 2 ranks.
UTK Test Suite
This group features the test suite developed at the University of Tennesss Knoxville for MPI-2.2 and earlier specifications. Though techically not a functional group, it was retained to allow comparison with the previous benchmark suite.
Alloc_mem
Simple check to see if MPI_Alloc_mem() is supported.
Assignment constants
Test for Named Constants supported in MPI-1.0 and higher. The test is a Perl script that constructs a small seperate main program in either C or FORTRAN for each constant. The constants for this test are used to assign a value to a const integer type in C and an integer type in Fortran. This test is the de facto test for any constant recognized by the compiler. NOTE: The constants used in this test are tested against both C and FORTRAN compilers. Some of the constants are optional and may not be supported by the MPI implementation. Failure to verify these constants does not necessarily constitute failure of the MPI implementation to satisfy the MPI specifications. ISSUE: This test may timeout if separate program executions initialize slowly.
C/Fortran interoperability supported
Checks if the C-Fortran (F77) interoperability functions are supported using the MPI-2.2 specification.
Communicator attributes
Returns all communicator attributes that are not supported. The test is run as a single process MPI job and fails if any attributes are not supported.
Compiletime constants
The MPI-3.0 specifications require that some named constants be known at compiletime. The report includes a record for each constant of this class in the form "X MPI_CONSTANT is [not] verified by METHOD" where X is either 'c' for the C compiler, or 'F' for the FORTRAN 77 compiler. For a C langauge compile, the constant is used as a case label in a switch statement. For a FORTRAN language compile, the constant is assigned to a PARAMETER. The report sumarizes with the number of constants for each compiler that was successfully verified.
Datatypes
Tests for the presence of constants from MPI-1.0 and higher. It constructs small separate main programs in either C, FORTRAN, or C++ for each datatype. It fails if any datatype is not present. ISSUE: This test may timeout if separate program executions initialize slowly.
Deprecated routines
Checks all MPI deprecated routines as of MPI-2.2, but not including routines removed by MPI-3 if this is an MPI-3 implementation.
Error Handling
Reports the default action taken on an error. It also reports if error handling can be changed to "returns", and if so, if this functions properly.
Errorcodes
The MPI-3.0 specifications require that the same constants be available for the C language and FORTRAN. The report includes a record for each errorcode of the form "X MPI_ERRCODE is [not] verified" where X is either 'c' for the C compiler, or 'F' for the FORTRAN 77 compiler. The report sumarizes with the number of errorcodes for each compiler that were successfully verified.
Extended collectives
Checks if "extended collectives" are supported, i.e., collective operations with MPI-2 intercommunicators.
Init arguments
In MPI-1.1, it is explicitly stated that an implementation is allowed to require that the arguments argc and argv passed by an application to MPI_Init in C be the same arguments passed into the application as the arguments to main. In MPI-2 implementations are not allowed to impose this requirement. Conforming implementations of MPI allow applications to pass NULL for both the argc and argv arguments of MPI_Init(). This test prints the result of the error status of MPI_Init(). If the test completes without error, it reports 'No errors.'
MPI-2 replaced routines
Checks the presence of all MPI-2.2 routines that replaced deprecated routines.
MPI-2 type routines
This test checks that a subset of MPI-2 routines that replaced MPI-1 routines work correctly.
Master/slave
This test running as a single MPI process spawns four slave processes using MPI_Comm_spawn(). The master process sends and receives a message from each slave. If the test completes, it will report 'No errors.', otherwise specific error messages are listed.
One-sided communication
Checks MPI-2.2 one-sided communication modes reporting those that are not defined. If the test compiles, then "No errors" is reported, else, all undefined modes are reported as "not defined."
One-sided fences
Verifies that one-sided communication with active target synchronization with fences functions properly. If all operations succeed, one-sided communication with active target synchronization with fences is reported as supported. If one or more operations fail, the failures are reported and one-sided-communication with active target synchronization with fences is reported as NOT supported.
One-sided passiv
Verifies one-sided-communication with passive target synchronization functions properly. If all operations succeed, one-sided-communication with passive target synchronization is reported as supported. If one or more operations fail, the failures are reported and one-sided-communication with passive target synchronization with fences is reported as NOT supported.
One-sided post
Verifies one-sided-communication with active target synchronization with post/start/complete/wait functions properly. If all operations succeed, one-sided communication with active target synchronization with post/start/complete/wait is reported as supported. If one or more operations fail, the failures are reported and one-sided-communication with active target synchronization with post/start/complete/wait is reported as NOT supported.
One-sided routines
Reports if one-sided communication routines are defined. If this test compiles, one-sided communication is reported as defined, otherwise "not supported".
Thread support
Reports the level of thread support provided. This is either MPI_THREAD_SINGLE, MPI_THREAD_FUNNELED, MPI_THREAD_SERIALIZED, or MPI_THREAD_MULTIPLE.
Group Communicator
This group features tests of MPI communicator group calls.
MPI_Group_Translate_ranks perf
Measure and compare the relative performance of MPI_Group_translate_ranks with small and large group2 sizes but a constant number of ranks. This serves as a performance sanity check for the Scalasca use case where we translate to MPI_COMM_WORLD ranks. The performance should only depend on the number of ranks passed, not the size of either group (especially group2). This test is probably only meaningful for large-ish process counts.
MPI_Group_excl basic
This is a test of MPI_Group_excl().
MPI_Group_incl basic
This is a simple test of creating a group array.
MPI_Group_incl empty
This is a test to determine if an empty group can be created.
MPI_Group irregular
This is a test comparing small groups against larger groups, and use groups with irregular members (to bypass optimizations in group_translate_ranks for simple groups).
MPI_Group_translate_ranks
This is a test of MPI_Group_translate_ranks().
Win_get_group basic
This is a simple test of MPI_Win_get_group() for a selection of communicators.
Parallel Input/Output
This group features tests that involve MPI parallel input/output operations.
Asynchronous IO basic
Test asynchronous I/O with multiple completion. Each process writes to separate files and reads them back.
Asynchronous IO collective
Test asynchronous collective reading and writing. Each process asynchronously to to a file then reads it back.
Asynchronous IO contig
Test contiguous asynchronous I/O. Each process writes to separate files and reads them back. The file name is taken as a command-line argument, and the process rank is appended to it.
Asynchronous IO non-contig
Tests noncontiguous reads/writes using non-blocking I/O.
File IO error handlers
This test exercises MPI I/O and MPI error handling techniques.
MPI_File_get_type_extent
Test file_get_extent.
MPI_File_set_view displacement_current
Test set_view with DISPLACEMENT_CURRENT. This test reads a header then sets the view to every "size" int, using set view and current displacement. The file is first written using a combination of collective and ordered writes.
MPI_File_write_ordered basic
Test reading and writing ordered output.
MPI_File_write_ordered zero
Test reading and writing data with zero length. The test then looks for errors in the MPI IO routines and reports any that were found, otherwise "No errors" is reported.
MPI_Info_set file view
Test file_set_view. Access style is explicitly described as modifiable. Values include read_once, read_mostly, write_once, write_mostly, random.
MPI_Type_create_resized basic
Test file views with MPI_Type_create_resized.
MPI_Type_create_resized x2
Test file views with MPI_Type_create_resized, with a resizing of the resized type.
Datatypes
This group features tests that involve named MPI and user defined datatypes.
Aint add and diff
Tests the MPI 3.1 standard functions MPI_Aint_diff and MPI_Aint_add.
Blockindexed contiguous convert
This test converts a block indexed datatype to a contiguous datatype.
Blockindexed contiguous zero
This tests the behavior with a zero-count blockindexed datatype.
C++ datatypes
This test checks for the existence of four new C++ named predefined datatypes that should be accessible from C and Fortran.
Datatype commit-free-commit
This test creates a valid datatype, commits and frees the datatype, then repeats the process for a second datatype of the same size.
Datatype get structs
This test was motivated by the failure of an example program for RMA involving simple operations on a struct that included a struct. The observed failure was a SEGV in the MPI_Get.
Datatype inclusive typename
Sample some datatypes. See 8.4, "Naming Objects" in MPI-2. The default name is the same as the datatype name.
Datatype match size
Test of type_match_size. Check the most likely cases. Note that it is an error to free the type returned by MPI_Type_match_size. Also note that it is an error to request a size not supported by the compiler, so Type_match_size should generate an error in that case.
Datatype reference count
Test to check if freed datatypes have reference count semantics. The idea here is to create a simple but non-contiguous datatype, perform an irecv with it, free it, and then create many new datatypes. If the datatype was freed and the space was reused, this test may detect an error.
Datatypes basic and derived
This program is derived from one in the MPICH-1 test suite. It tests a wide variety of basic and derived datatypes.
Datatypes comprehensive
This program is derived from one in the MPICH-1 test suite. This test sends and receives EVERYTHING from MPI_BOTTOM, by putting the data into a structure.
Datatypes
Tests for the presence of constants from MPI-1.0 and higher. It constructs small separate main programs in either C, FORTRAN, or C++ for each datatype. It fails if any datatype is not present. ISSUE: This test may timeout if separate program executions initialize slowly.
Get_address math
This routine shows how math can be used on MPI addresses and verifies that it produces the correct result.
Get_elements contig
Uses a contig of a struct in order to satisfy two properties: (A) a type that contains more than one element type (the struct portion) (B) a type that has an odd number of ints in its "type contents" (1 in this case). This triggers a specific bug in some versions of MPICH.
Get_elements pair
Send a { double, int, double} tuple and receive as a pair of MPI_DOUBLE_INTs. this should (a) be valid, and (b) result in an element count of 3.
Get_elements partial
Receive partial datatypes and check that MPI_Getelements gives the correct version.
LONG_DOUBLE size
This test ensures that simplistic build logic/configuration did not result in a defined, yet incorrectly sized, MPI predefined datatype for long double and long double Complex. Based on a test suggested by Jim Hoekstra @ Iowa State University. The test also considers other datatypes that are optional in the MPI-3 specification.
Large counts for types
This test checks for large count functionality ("MPI_Count") mandated by MPI-3, as well as behavior of corresponding pre-MPI-3 interfaces that have better defined behavior when an "int" quantity would overflow.
Large types
This test checks that MPI can handle large datatypes.
Local pack/unpack basic
This test users MPI_Pack() on a communication buffer, then call MPU_Unpack() to confirm that the unpacked data matches the original. This routine performs all work within a simple processor.
Noncontiguous datatypes
This test uses a structure datatype that describes data that is contiguous, but is is manipulated as if it is noncontiguous. The test is designed to expose flaws in MPI memory management should they exist.
Pack/Unpack matrix transpose
This test confirms that an MPI packed matrix can be unpacked correctly by the MPI infrastructure.
Pack/Unpack multi-struct
This test confirms that packed structures, including array-of-struct and struct-of-struct unpack properly.
Pack/Unpack sliced
This test confirms that sliced array pack and unpack properly.
Pack/Unpack struct
This test confirms that a packed structure unpacks properly.
Pack basic
Tests functionality of MPI_Type_get_envelope() and MPI_Type_get_contents() on a MPI_FLOAT. Returns the number of errors encountered.
Pack_external_size
Tests functionality of MPI_Type_get_envelope() and MPI_Type_get_contents() on a packed-external MPI_FLOAT. Returns the number of errors encountered.
Pair types optional
Check for optional datatypes such as LONG_DOUBLE_INT.
Simple contig datatype
This test checks to see if we can create a simple datatype made from many contiguous copies of a single struct. The struct is built with monotone decreasing displacements to avoid any struct->config optimizations.
Simple zero contig
Tests behaviour with a zero count contig.
Struct zero count
Tests behavior with a zero-count struct of builtins.
Type_commit basic
Tests that verifies that the MPI_Type_commit succeeds.
Type_commit basic
This test builds datatypes using various components and confirms that MPI_Type_commit() succeeded.
Type_create_darray cyclic
Several cyclic checks of a custom struct darray.
Type_create_darray pack
Performs a sequence of tests building darrays with single-element blocks, running through all the various positions that the element might come from.
Type_create_darray pack many rank
Performs a sequence of tests building darrays with single-element blocks, running through all the various positions that the element might come from. Should be run with many ranks (at least 32).
Type_create_hindexed_block contents
This test is a simple check of MPI_Type_create_hindexed_block() using MPI_Type_get_envelope() and MPI_Type_get_contents().
Type_create_hindexed_block
Tests behavior with a hindexed_block that can be converted to a contig easily. This is specifically for coverage. Returns the number of errors encountered.
Type_create_resized 0 lower bound
Test of MPI datatype resized with 0 lower bound.
Type_create_resized lower bound
Test of MPI datatype resized with non-zero lower bound.
Type_create_resized
Tests behavior with resizing of a simple derived type.
Type_create_subarray basic
This test creates a subarray and confirms its contents.
Type_create_subarray pack/unpack
This test confirms that a packed sub-array can be properly unpacked.
Type_free memory
This test is used to confirm that memory is properly recovered from freed datatypes. The test may be run with valgrind or similar tools, or it may be run with MPI implementation specific options. For this test it is run only with standard MPI error checking enabled.
Type_get_envelope basic
This tests the functionality of MPI_Type_get_envelope() and MPI_Type_get_contents().
Type_hindexed zero
Tests hindexed types with all zero length blocks.
Type_hvector_blklen loop
Inspired by the Intel MPI_Type_hvector_blklen test. Added to include a test of a dataloop optimization that failed.
Type_hvector counts
Tests vector and struct type creation and commits with varying counts and odd displacements.
Type_indexed many
Tests behavior with an indexed array that can be compacted but should continue to be stored as an indexed type. Specifically for coverage. Returns the number of errors encountered. This test checks that both the upper and lower boundary of an hindexed MPI type is correct. This routine checks the alignment of a custom datatype. This test creates an MPI_Type_struct() datatype, assigns data and sends the structure to a second process. The second process receives the structure and confirms that the information contained in the structure agrees with the original data. This test is inspired by the Intel MPI_Type_vector_blklen test. The test fundamentally tries to deceive MPI into scrambling the data using padded struct types, and MPI_Pack() and MPI_Unpack(). The data is then checked to make sure the original data was not lost in the process. If "No errors" is reported, then the MPI functions that manipulated the data did not corrupt the test data. This test creates an empty packed indexed type, and then checks that the last 40 entrines of the unpacked recv_buffer have the corresponding elements from the send buffer.Type_indexed not compacted
Type_{lb,ub,extent}
Type_struct() alignment
Type_struct basic
Type_vector blklen
Zero sized blocks
Collectives
This group features tests of utilizing MPI collectives.
Allgather basic
Gather data from a vector to a contiguous vector for a selection of communicators. This is the trivial version based on the allgather test (allgatherv but with constant data sizes).
Allgather double zero
This test is similar to "Allgather in-place null", but uses MPI_DOUBLE with separate input and output arrays and performs an additional test for a zero byte gather operation.
Allgather in-place null
This is a test of MPI_Allgather() using MPI_IN_PLACE and MPI_DATATYPE_NULL to repeatedly gather data from a vector that increases in size each iteration for a selection of communicators.
Allgather intercommunicators
Allgather tests using a selection of intercommunicators and increasing array sizes. Processes are split into two groups and MPI_Allgather() is used to have each group send data to the other group and to send data from one group to the other.
Allgatherv 2D
This test uses MPI_Allgatherv() to define a two-dimensional table.
Allgatherv in-place
Gather data from a vector to a contiguous vector using MPI_IN_PLACE for a selection of communicators. This is the trivial version based on the coll/allgather tests with constant data sizes.
Allgatherv intercommunicators
Allgatherv test using a selection of intercommunicators and increasing array sizes. Processes are split into two groups and MPI_Allgatherv() is used to have each group send data to the other group and to send data from one group to the other. Similar to Allgather test (coll/icallgather).
Allgatherv large
This test is the same as Allgatherv basic (coll/coll6) except the size of the table is greater than the number of processors.
Allreduce flood
Tests the ability of the implementation to handle a flood of one-way messages by repeatedly calling MPI_Allreduce(). Test should be run with 2 processes.
Allreduce in-place
MPI_Allreduce() Test using MPI_IN_PLACE for a selection of communicators.
Allreduce intercommunicators
Allreduce test using a selection of intercommunicators and increasing array sizes.
Allreduce mat-mult
This test implements a simple matrix-matrix multiply for a selection of communicators using a user-defined operation for MPI_Allreduce(). This is an associative but not commutative operation where matSize=matrix. The number of matrices is the count argument, which is currently set to 1. The matrix is stored in C order, so that c(i,j) = cin[j+i*matSize].
Allreduce non-commutative
This tests MPI_Allreduce() using apparent non-commutative operators using a selection of communicators. This forces MPI to run code used for non-commutative operators.
Allreduce operations
This tests all possible MPI operation codes using the MPI_Allreduce() routine.
Allreduce user-defined
This example tests MPI_Allreduce() with user-defined operations using a selection of communicators similar to coll/allred3, but uses 3x3 matrices with integer-valued entries. This is an associative but not commutative operation. The number of matrices is the count argument. Tests using separate input and output matrices and using MPI_IN_PLACE. The matrix is stored in C order.
Allreduce user-defined long
Tests user-defined operation on a long value. Tests proper handling of possible pipelining in the implementation of reductions with user-defined operations.
Allreduce vector size
This tests MPI_Allreduce() using vectors with size greater than the number of processes for a selection of communicators.
Alltoall basic
Simple test for MPI_Alltoall().
Alltoall communicators
Tests MPI_Alltoall() by calling it with a selection of communicators and datatypes. Includes test using MPI_IN_PLACE.
Alltoall intercommunicators
Alltoall test using a selction of intercommunicators and increasing array sizes.
Alltoall threads
The LISTENER THREAD waits for communication from any source (including calling thread) messages which it receives that has tag REQ_TAG. Each thread enters an infinite loop that will stop only if every node in the MPI_COMM_WORLD sends a message containing -1.
Alltoallv communicators
This program tests MPI_Alltoallv() by having each processor send different amounts of data to each processor using a selection of communicators. The test uses only MPI_INT which is adequate for testing systems that use point-to-point operations. Includes test using MPI_IN_PLACE.
Alltoallv halo exchange
This tests MPI_Alltoallv() by having each processor send data to two neighbors only, using counts of 0 for the other neighbors for a selection of communicators. This idiom is sometimes used for halo exchange operations. The test uses MPI_INT which is adequate for testing systems that use point-to-point operations.
Alltoallv intercommunicators
This program tests MPI_Alltoallv using int array and a selection of intercommunicators by having each process send different amounts of data to each process. This test sends i items to process i from all processes.
Alltoallw intercommunicators
This program tests MPI_Alltoallw by having each process send different amounts of data to each process. This test is similar to the Alltoallv test (coll/icalltoallv), but with displacements in bytes rather than units of the datatype. This test sends i items to process i from all process.
Alltoallw matrix transpose
Tests MPI_Alltoallw() by performing a blocked matrix transpose operation. This more detailed example test was taken from MPI - The Complete Reference, Vol 1, p 222-224. Please refer to this reference for more details of the test.
Alltoallw matrix transpose comm
This program tests MPI_Alltoallw() by having each processor send different amounts of data to all processors. This is similar to the "Alltoallv communicators" test, but with displacements in bytes rather than units of the datatype. Currently, the test uses only MPI_INT which is adequate for testing systems that use point-to-point operations. Includes test using MPI_IN_PLACE.
Alltoallw zero types
This test makes sure that counts with non-zero-sized types on the send (recv) side match and don't cause a problem with non-zero counts and zero-sized types on the recv (send) side when using MPI_Alltoallw and MPI_Alltoallv. Includes tests using MPI_IN_PLACE.
BAND operations
Test MPI_BAND (bitwise and) operations using MPI_Reduce() on optional datatypes. Note that failing this test does not mean that there is something wrong with the MPI implementation.
BOR operations
Test MPI_BOR (bitwise or) operations using MPI_Reduce() on optional datatypes. Note that failing this test does not mean that there is something wrong with the MPI implementation.
BXOR Operations
Test MPI_BXOR (bitwise excl or) operations using MPI_Reduce() on optional datatypes. Note that failing this test does not mean that there is something wrong with the MPI implementation.
Barrier intercommunicators
This test checks that MPI_Barrier() accepts intercommunicators. It does not check for the semantics of a intercomm barrier (all processes in the local group can exit when (but not before) all processes in the remote group enter the barrier.
Bcast basic
Test broadcast with various roots, datatypes, and communicators.
Bcast intercommunicators
Broadcast test using a selection of intercommunicators and increasing array sizes.
Bcast intermediate
Test broadcast with various roots, datatypes, sizes that are not powers of two, larger message sizes, and communicators.
Bcast sizes
Tests MPI_Bcast() repeatedly using MPI_INT with a selection of data sizes.
Bcast zero types
Tests broadcast behavior with non-zero counts but zero-sized types.
Collectives array-of-struct
Tests various calls to MPI_Reduce(), MPI_Bcast(), and MPI_Allreduce() using arrays of structs.
Exscan basic
Simple test of MPI_Exscan() using single element int arrays.
Exscan communicators
Tests MPI_Exscan() using int arrays and a selection of communicators and array sizes. Includes tests using MPI_IN_PLACE.
Extended collectives
Checks if "extended collectives" are supported, i.e., collective operations with MPI-2 intercommunicators.
Gather 2D
This test uses MPI_Gather() to define a two-dimensional table.
Gather basic
This tests gathers data from a vector to contiguous datatype using doubles for a selection of communicators and array sizes. Includes test for zero length gather using MPI_IN_PLACE.
Gather communicators
This test gathers data from a vector to contiguous datatype using a double vector for a selection of communicators. Includes a zero length gather and a test to ensure aliasing is disallowed correctly.
Gather intercommunicators
Gather test using a selection of intercommunicators and increasing array sizes.
Gatherv 2D
This test uses MPI_Gatherv() to define a two-dimensional table. This test is similar to Gather test (coll/coll2).
Gatherv intercommunicators
Gatherv test using a selection of intercommunicators and increasing array sizes.
Iallreduce basic
Simple test for MPI_Iallreduce() and MPI_Allreduce().
Ibarrier
This test calls MPI_Ibarrier() followed by an conditional loop containing usleep(1000) and MPI_Test(). This test hung indefinitely under some MPI implementations.
LAND operations
Test MPI_LAND (logical and) operations using MPI_Reduce() on optional datatypes. Note that failing this test does not mean that there is something wrong with the MPI implementation.
LOR operations
Test MPI_LOR (logical or) operations using MPI_Reduce() on optional datatypes. Note that failing this test does not mean that there is something wrong with the MPI implementation.
LXOR operations
Test MPI_LXOR (logical excl or) operations using MPI_Reduce() on optional datatypes. Note that failing this test does not mean that there is something wrong with the MPI implementation.
MAXLOC operations
Test MPI_MAXLOC operations using MPI_Reduce() on optional datatypes. Note that failing this test does not mean that there is something wrong with the MPI implementation.
MAX operations
Test MPI_MAX operations using MPI_Reduce() on optional datatypes. Note that failing this test does not mean that there is something wrong with the MPI implementation.
MINLOC operations
Test MPI_MINLOC operations using MPI_Reduce() on optional datatypes. Note that failing this test does not mean that there is something wrong with the MPI implementation.
MIN operations
Test MPI_Min operations using MPI_Reduce() on optional datatypes. Note that failing this test does not mean that there is something wrong with the MPI implementation.
MScan
Tests user defined collective operations for MPI_Scan(). The operations are inoutvec[i] += invec[i] op inoutvec[i] and inoutvec[i] = invec[i] op inoutvec[i] (see MPI-1.3 Message-Passing Interface section 4.9.4). The order of operation is important. Note that the computation is in process rank (in the communicator) order independant of the root process.
Non-blocking basic
This is a weak sanity test that all non-blocking collectives specified by MPI-3 are present in the library and accept arguments as expected. This test does not check for progress, matching issues, or sensible output buffer values.
Non-blocking intracommunicator
This is a basic test of all 17 non-blocking collective operations specified by the MPI-3 standard. It only exercises the intracommunicator functionality, does not use MPI_IN_PLACE, and only transmits/receives simple integer types with relatively small counts. It does check a few fancier issues, such as ensuring that "premature user releases" of MPI_Op and MPI_Datatype objects does not result in a segfault.
Non-blocking overlapping
This test attempts to execute multiple simultaneous non-blocking collective (NBC) MPI routines at the same time, and manages their completion with a variety of routines (MPI_{Wait,Test}{,_all,_any,_some}). The test also exercises a few point-to-point operations.
Non-blocking wait
This is a weak sanity test that all non-blocking collectives specified by MPI-3 are present in the library and take arguments as expected. Includes calls to MPI_Wait() immediately following non-blocking collective calls. This test does not check for progress, matching issues, or sensible output buffer values.
Op_{create,commute,free}
A simple test of MPI_Op_Create/Commutative/free on predefined reduction operations and both commutative and non-commutative user defined operations.
PROD operations
Test MPI_PROD operations using MPI_Reduce() on optional datatypes. Note that failing this test does not mean that there is something wrong with the MPI implementation.
Reduce/Bcast multi-operation
This test repeats pairs of calls to MPI_Reduce() and MPI_Bcast() using different reduction operations and checks for errors.
Reduce/Bcast user-defined
Tests user defined collective operations for MPI_Reduce() followed by MPI_Broadcast(). The operation is inoutvec[i] = invec[i] op inoutvec[i] (see MPI-1.3 Message-Passing Interface section 4.9.4). The order of operation is important. Note that the computation is in process rank (in the communicator) order independant of the root process.
Reduce/Bcast user-defined
This test calls MPI_Reduce() and MPI_Bcast() with a user defined operation.
Reduce_Scatter_block large data
Test of reduce scatter block with large data (needed in MPICH to trigger the long-data algorithm). Each processor contributes its rank + the index to the reduction, then receives the ith sum. Can be called with any number of processors.
Reduce_Scatter intercomm. large
Test of reduce scatter block with large data on a selection of intercommunicators (needed in MPICH to trigger the long-data algorithm). Each processor contributes its rank + the index to the reduction, then receives the ith sum. Can be called with any number of processors.
Reduce_Scatter large data
Test of reduce scatter with large data (needed to trigger the long-data algorithm). Each processor contributes its rank + index to the reduction, then receives the "ith" sum. Can be run with any number of processors.
Reduce_Scatter user-defined
Test of reduce scatter using user-defined operations. Checks that the non-communcative operations are not commuted and that all of the operations are performed.
Reduce any-root user-defined
This tests implements a simple matrix-matrix multiply with an arbitrary root using MPI_Reduce() on user-defined operations for a selection of communicators. This is an associative but not commutative operation. For a matrix size of matsize, the matrix is stored in C order where c(i,j) is cin[j+i*matSize].
Reduce basic
A simple test of MPI_Reduce() with the rank of the root process shifted through each possible value using a selection of communicators.
Reduce communicators user-defined
This tests implements a simple matrix-matrix multiply using MPI_Reduce() on user-defined operations for a selection of communicators. This is an associative but not commutative operation. For a matrix size of matsize, the matrix is stored in C order where c(i,j) is cin[j+i*matSize].
Reduce intercommunicators
Reduce test using a selection of intercommunicators and increasing array sizes.
Reduce_local basic
A simple test of MPI_Reduce_local(). Uses MPI_SUM as well as user defined operators on arrays of increasing size.
Reduce_scatter basic
Test of reduce scatter. Each processor contribues its rank plus the index to the reduction, then receives the ith sum. Can be called with any number of processors.
Reduce_scatter_block basic
Test of reduce scatter block. Each process contributes its rank plus the index to the reduction, then receives the ith sum. Can be called with any number of processors.
Reduce_scatter_block user-def
Test of reduce scatter block using user-defined operations to check that non-commutative operations are not commuted and that all operations are performed. Can be called with any number of processors.
Reduce_scatter intercommunicators
Test of reduce scatter with large data on a selection of intercommunicators (needed in MPICH to trigger the long-data algorithm). Each processor contributes its rank + the index to the reduction, then receives the ith sum. Can be called with any number of processors.
SUM operations
This test looks at integer or integer related datatypes not required by the MPI-3.0 standard (e.g. long long) using MPI_Reduce(). Note that failure to support these datatypes is not an indication of a non-compliant MPI implementation.
Scan basic
A simple test of MPI_Scan() on predefined operations and user-defined operations with with inoutvec[i] = invec[i] op inoutvec[i] (see 4.9.4 of the MPI standard 1.3) and inoutvec[i] += invec[i] op inoutvec[i]. The order is important. Note that the computation is in process rank (in the communicator) order, independent of the root.
Scatter 2D
This test uses MPI_Scatter() to define a two-dimensional table. See also Gather test (coll/coll2) and Gatherv test (coll/coll3) for similar tests.
Scatter basic
This MPI_Scatter() test sends a vector and receives individual elements, except for the root process that does not receive any data.
Scatter contiguous
This MPI_Scatter() test sends contiguous data and receives a vector on some nodes and contiguous data on others. There is some evidence that some MPI implementations do not check recvcount on the root process. This test checks for that case.
Scatter intercommunicators
Scatter test using a selection of intercommunicators and increasing array sizes.
Scatterv 2D
This test uses MPI_Scatterv() to define a two-dimensional table.
Scatter vector-to-1
This MPI_Scatter() test sends a vector and receives individual elements.
Scatterv intercommunicators
Scatterv test using a selection of intercommunicators and increasing array sizes.
Scatterv matrix
This is an example of using scatterv to send a matrix from one process to all others, with the matrix stored in Fortran order. Note the use of an explicit upper bound (UB) to enable the sources to overlap. This tests uses scatterv to make sure that it uses the datatype size and extent correctly. It requires the number of processors used in the call to MPI_Dims_create.
User-defined many elements
Test user-defined operations for MPI_Reduce() with a large number of elements. Added because a talk at EuroMPI'12 claimed that these failed with more than 64k elements.
MPI_Info Objects
The info tests emphasize the MPI Info object functionality.
MPI_Info_delete basic
This test exercises the MPI_Info_delete() function.
MPI_Info_dup basic
This test exercises the MPI_Info_dup() function.
MPI_Info_get basic
This is a simple test of the MPI_Info_get() function.
MPI_Info_get ext. ins/del
Test of info that makes use of the extended handles, including inserts and deletes.
MPI_Info_get extended
Test of info that makes use of the extended handles.
MPI_Info_get ordered
This is a simple test that illustrates how named keys are ordered.
MPI_Info_get_valuelen basic
Simple info set and get_valuelen test.
MPI_Info_set/get basic
Simple info set and get test.
Dynamic Process Management
This group features tests that add processes to a running communicator, joining separately started applications, then handling faults/failures.
Creation group intercomm test
In this test processes create an intracommunicator, and creation is collective only on the members of the new communicator, not on the parent communicator. This is accomplished by building up and merging intercommunicators starting with MPI_COMM_SELF for each process involved.
MPI_Comm_accept basic
This tests exercises MPI_Open_port(), MPI_Comm_accept(), and MPI_Comm_disconnect().
MPI_Comm_connect 2 processes
This test checks to make sure that two MPI_Comm_connects to two different MPI ports match their corresponding MPI_Comm_accepts.
MPI_Comm_connect 3 processes
This test checks to make sure that three MPI_Comm_connections to three different MPI ports match their corresponding MPI_Comm_accepts.
MPI_Comm_disconnect basic
A simple test of Comm_disconnect with a master and 2 spawned ranks.
MPI_Comm_disconnect-reconnect basic
A simple test of Comm_connect/accept/disconnect.
MPI_Comm_disconnect-reconnect groups
This test tests the disconnect code for processes that span process groups. This test spawns a group of processes and then merges them into a single communicator. Then the single communicator is split into two communicators, one containing the even ranks and the other the odd ranks. Then the two new communicators do MPI_Comm_accept/connect/disconnect calls in a loop. The even group does the accepting while the odd group does the connecting.
MPI_Comm_disconnect-reconnect repeat
This test spawns two child jobs and has them open a port and connect to each other. The two children repeatedly connect, accept, and disconnect from each other.
MPI_Comm_disconnect send0-1
A test of Comm_disconnect with a master and 2 spawned ranks, after sending from rank 0 to 1.
MPI_Comm_disconnect send1-2
A test of Comm_disconnect with a master and 2 spawned ranks, after sending from rank 1 to 2.
MPI_Comm_join basic
A simple test of Comm_join.
MPI_Comm_spawn basic
A simple test of Comm_spawn.
MPI_Comm_spawn complex args
A simple test of Comm_spawn, with complex arguments.
MPI_Comm_spawn inter-merge
A simple test of Comm_spawn, followed by intercomm merge.
MPI_Comm_spawn many args
A simple test of Comm_spawn, with many arguments.
MPI_Comm_spawn_multiple appnum
This tests spawn_mult by using the same executable and no command-line options. The attribute MPI_APPNUM is used to determine which executable is running.
MPI_Comm_spawn_multiple basic
A simple test of Comm_spawn_multiple with info.
MPI_Comm_spawn repeat
A simple test of Comm_spawn, called twice.
MPI_Comm_spawn with info
A simple test of Comm_spawn with info.
MPI_Intercomm_create
Use Spawn to create an intercomm, then create a new intercomm that includes processes not in the initial spawn intercomm.This test ensures that spawned processes are able to communicate with processes that were not in the communicator from which they were spawned.
MPI_Publish_name basic
This test confirms the functionality of MPI_Open_port() and MPI_Publish_name().
MPI spawn-connect-accept send/recv
Spawns two processes, one connecting and one accepting. It synchronizes with each then waits for them to connect and accept. The connector and acceptor respectively send and receive some data.
MPI spawn-connect-accept
Spawns two processes, one connecting and one accepting. It synchronizes with each then waits for them to connect and accept.
MPI spawn test with threads
Create a thread for each task. Each thread will spawn a child process to perform its task.
Multispawn
This (is a placeholder for a) test that creates 4 threads, each of which does a concurrent spawn of 4 more processes, for a total of 17 MPI processes. The resulting intercomms are tested for consistency (to ensure that the spawns didn't get confused among the threads). As an option, it will time the Spawn calls. If the spawn calls block the calling thread, this may show up in the timing of the calls.
Process group creation
In this test, processes create an intracommunicator, and creation is collective only on the members of the new communicator, not on the parent communicator. This is accomplished by building up and merging intercommunicators using Connect/Accept to merge with a master/controller process.
Taskmaster threaded
This is a simple test that creates threads to verifiy compatibility between MPI and the pthread library.
Threads
This group features tests that utilize thread compliant MPI implementations. This includes the threaded environment provided by MPI-3.0, as well as POSIX compliant threaded libraries such as PThreads.
Alltoall threads
The LISTENER THREAD waits for communication from any source (including calling thread) messages which it receives that has tag REQ_TAG. Each thread enters an infinite loop that will stop only if every node in the MPI_COMM_WORLD sends a message containing -1.
MPI_T multithreaded
This test is adapted from test/mpi/mpi_t/mpit_vars.c. But this is a multithreading version in which multiple threads will call MPI_T routines.
With verbose set, thread 0 will prints out MPI_T control variables, performance variables and their categories.
Multiple threads context dup
This test creates communicators concurrently in different threads.
Multiple threads context idup
This test creates communicators concurrently, non-blocking, in different threads.
Multiple threads dup leak
This test repeatedly duplicates and frees communicators with multiple threads concurrently to stress the multithreaded aspects of the context ID allocation code. Thanks to IBM for providing the original version of this test.
Multispawn
This (is a placeholder for a) test that creates 4 threads, each of which does a concurrent spawn of 4 more processes, for a total of 17 MPI processes. The resulting intercomms are tested for consistency (to ensure that the spawns didn't get confused among the threads). As an option, it will time the Spawn calls. If the spawn calls block the calling thread, this may show up in the timing of the calls.
Multi-target basic
Run concurrent sends to a single target process. Stresses an implementation that permits concurrent sends to different targets.
Multi-target many
Run concurrent sends to different target processes. Stresses an implementation that permits concurrent sends to different targets.
Multi-target non-blocking
Run concurrent sends to different target processes. Stresses an implementation that permits concurrent sends to different targets. Uses non-blocking sends, and have a single thread complete all I/O.
Multi-target non-blocking send/recv
Run concurrent sends to different target processes. Stresses an implementation that permits concurrent sends to different targets. Uses non-blocking sends and recvs, and have a single thread complete all I/O.
Multi-target self
Send to self in a threaded program.
Multi-threaded [non]blocking
The tests blocking and non-blocking capability within MPI.
Multi-threaded send/recv
The buffer size needs to be large enough to cause the rndv protocol to be used. If the MPI provider doesn't use a rndv protocol then the size doesn't matter.
Simple thread comm dup
This is a simple test of threads in MPI with communicator duplication.
Simple thread comm idup
This is a simple test of threads in MPI with non-blocking communicator duplication.
Simple thread finalize
The test here is a simple one that Finalize exits, so the only action is to write no error.
Simple thread initialize
The test initializes a thread, then calls MPI_Finalize() and prints "No errors".
Taskmaster threaded
This is a simple test that creates threads to verifiy compatibility between MPI and the pthread library.
Thread Group creation
Every thread paticipates in a distinct MPI_Comm_create group, distinguished by its thread-id (used as the tag). Threads on even ranks join an even comm and threads on odd ranks join the odd comm.
Thread/RMA interaction
This is a simple test of threads in MPI.
Threaded group
In this test a number of threads are created with a distinct MPI communicator (or comm) group distinguished by its thread-id (used as a tag). Threads on even ranks join an even comm and threads on odd ranks join the odd comm.
Threaded ibsend
This program performs a short test of MPI_BSEND in a multithreaded environment. It starts a single receiver thread that expects NUMSENDS messages and NUMSENDS sender threads, that use MPI_Bsend to send a message of size MSGSIZE to its right neigbour or rank 0 if (my_rank==comm_size-1), i.e. target_rank = (my_rank+1)%size.
After all messages have been received, the receiver thread prints a message, the threads are joined into the main thread and the application terminates.
Threaded request
Threaded generalized request tests.
Threaded wait/test
Threaded wait/test request tests.
MPI-Toolkit Interface
This group features tests that involve the MPI Tool interface available in MPI-3.0 and higher.
MPI_T 3.1 get index call
Tests that the MPI 3.1 Toolkit interface *_get_index name lookup functions work as expected.
MPI_T cycle variables
To print out all MPI_T control variables, performance variables and their categories in the MPI implementation.
MPI_T multithreaded
This test is adapted from test/mpi/mpi_t/mpit_vars.c. But this is a multithreading version in which multiple threads will call MPI_T routines.
With verbose set, thread 0 will prints out MPI_T control variables, performance variables and their categories.
MPI_T string handling
A test that MPI_T string handling is working as expected.
MPI_T write variable
This test writes to control variables exposed by MPI_T functionality of MPI_3.0.
MPI-3.0
This group features tests that exercises MPI-3.0 and higher functionality. Note that the test suite was designed to be compiled and executed under all versions of MPI. If the current version of MPI the test suite is less that MPI-3.0, the executed code will report "MPI-3.0 or higher required" and will exit.
Aint add and diff
Tests the MPI 3.1 standard functions MPI_Aint_diff and MPI_Aint_add.
C++ datatypes
This test checks for the existence of four new C++ named predefined datatypes that should be accessible from C and Fortran.
Comm_create_group excl 4 rank
This test using 4 processes creates a group with the even processes using MPI_Group_excl() and uses this group to create a communicator. Then both the communicator and group are freed.
Comm_create_group excl 8 rank
This test using 8 processes creates a group with the even processes using MPI_Group_excl() and uses this group to create a communicator. Then both the communicator and group are freed.
Comm_create_group incl 2 rank
This test using 2 processes creates a group with ranks less than size/2 using MPI_Group_range_incl() and uses this group to create a communicator. Then both the communicator and group are freed.
Comm_create_group incl 4 rank
This test using 4 processes creates a group with ranks less than size/2 using MPI_Group_range_incl() and uses this group to create a communicator. Then both the communicator and group are freed.
Comm_create_group incl 8 rank
This test using 8 processes creates a group with ranks less than size/2 using MPI_Group_range_incl() and uses this group to create a communicator. Then both the communicator and group are freed.
Comm_create_group random 2 rank
This test using 2 processes creates and frees groups by randomly adding processes to a group, then creating a communicator with the group.
Comm_create_group random 4 rank
This test using 4 processes creates and frees groups by randomly adding processes to a group, then creating a communicator with the group.
Comm_create_group random 8 rank
This test using 8 processes creates and frees groups by randomly adding processes to a group, then creating a communicator with the group.
Comm_idup 2 rank
Multiple tests using 2 processes that make rank 0 wait in a blocking receive until all other processes have called MPI_Comm_idup(), then call idup afterwards. Should ensure that idup doesn't deadlock. Includes a test using an intercommunicator.]
Comm_idup 4 rank
Multiple tests using 4 processes that make rank 0 wait in a blocking receive until all other processes have called MPI_Comm_idup(), then call idup afterwards. Should ensure that idup doesn't deadlock. Includes a test using an intercommunicator.
Comm_idup 9 rank
Multiple tests using 9 processes that make rank 0 wait in a blocking receive until all other processes have called MPI_Comm_idup(), then call idup afterwards. Should ensure that idup doesn't deadlock. Includes a test using an intercommunicator.]
Comm_idup multi
Simple test creating multiple communicators with MPI_Comm_idup.
Comm_idup overlap
Each pair of processes uses MPI_Comm_idup() to dup the communicator such that the dups are overlapping. If this were done with MPI_Comm_dup() this should deadlock.
Comm_split_type basic
Tests MPI_Comm_split_type() including a test using MPI_UNDEFINED.
Comm_with_info dup 2 rank
This test exercises MPI_Comm_dup_with_info() with 2 processes by setting the info for a communicator, duplicating it, and then testing the communicator.
Comm_with_info dup 4 rank
This test exercises MPI_Comm_dup_with_info() with 4 processes by setting the info for a communicator, duplicating it, and then testing the communicator.
Comm_with_info dup 9 rank
This test exercises MPI_Comm_dup_with_info() with 9 processes by setting the info for a communicator, duplicating it, and then testing the communicator.
Compare_and_swap contention
Tests MPI_Compare_and_swap using self communication, neighbor communication, and communication with the root causing contention.
Datatype get structs
This test was motivated by the failure of an example program for RMA involving simple operations on a struct that included a struct. The observed failure was a SEGV in the MPI_Get.
Fetch_and_op basic
This simple set of tests executes the MPI_Fetch_and op() calls on RMA windows using a selection of datatypes with multiple different communicators, communication patterns, and operations.
Get_acculumate basic
Get Accumulated Test. This is a simple test of MPI_Get_accumulate() on a local window.
Get_accumulate communicators
Get Accumulate Test. This simple set of tests executes MPI_Get_accumulate on RMA windows using a selection of datatypes with multiple different communicators, communication patterns, and operations.
Iallreduce basic
Simple test for MPI_Iallreduce() and MPI_Allreduce().
Ibarrier
This test calls MPI_Ibarrier() followed by an conditional loop containing usleep(1000) and MPI_Test(). This test hung indefinitely under some MPI implementations.
Large counts for types
This test checks for large count functionality ("MPI_Count") mandated by MPI-3, as well as behavior of corresponding pre-MPI-3 interfaces that have better defined behavior when an "int" quantity would overflow.
Large types
This test checks that MPI can handle large datatypes.
Linked list construction fetch/op
This test constructs a distributed shared linked list using MPI-3 dynamic windows with MPI_Fetch_and_op. Initially process 0 creates the head of the list, attaches it to an RMA window, and broadcasts the pointer to all processes. All processes then concurrently append N new elements to the list. When a process attempts to attach its element to the tail of list it may discover that its tail pointer is stale and it must chase ahead to the new tail before the element can be attached.
Linked_list construction
Construct a distributed shared linked list using MPI-3 dynamic windows. Initially process 0 creates the head of the list, attaches it to the window, and broadcasts the pointer to all processes. Each process "p" then appends N new elements to the list when the tail reaches process "p-1".
Linked list construction lockall
Construct a distributed shared linked list using MPI-3 dynamic RMA windows. Initially process 0 creates the head of the list, attaches it to the window, and broadcasts the pointer to all processes. All processes then concurrently append N new elements to the list. When a process attempts to attach its element to the tail of list it may discover that its tail pointer is stale and it must chase ahead to the new tail before the element can be attached. This version of the test suite uses MPI_Win_lock_all() instead of MPI_Win_lock(MPI_LOCK_EXCLUSIVE, ...).
Linked_list construction lock excl
MPI-3 distributed linked list construction example. Construct a distributed shared linked list using proposed MPI-3 dynamic windows. Initially process 0 creates the head of the list, attaches it to the window, and broadcasts the pointer to all processes. Each process "p" then appends N new elements to the list when the tail reaches process "p-1". The test uses the MPI_LOCK_EXCLUSIVE argument with MPI_Win_lock().
Linked-list construction lock shr
This test constructs a distributed shared linked list using MPI-3 dynamic windows. Initially process 0 creates the head of the list, attaches it to the window, and broadcasts the pointer to all processes. Each process "p" then appends N new elements to the list when the tail reaches process "p-1". This test is similar to Linked_list construction test 2 (rma/linked_list_bench_lock_excl) but uses an MPI_LOCK_SHARED parameter to MPI_Win_Lock().
Linked_list construction put/get
This test constructs a distributed shared linked list using MPI-3 dynamic windows with MPI_Put and MPI_Get. Initially process 0 creates the head of the list, attaches it to an RMA window, and broadcasts the pointer to all processes. All processes then concurrently append N new elements to the list. When a process attempts to attach its element to the tail of list it may discover that its tail pointer is stale and it must chase ahead to the new tail before the element can be attached.
MCS_Mutex_trylock
This test exercises the MCS_Mutex_lock calls by having multiple competing processes repeatedly lock and unlock a mutex.
MCS_Mutex_trylock
This test exercises the MCS_Mutex_lock calls by having multiple competing processes repeatedly lock and unlock a mutex.
MPI_Dist_graph_create
This test excercises MPI_Dist_graph_create() and MPI_Dist_graph_adjacent().
MPI_Get_library_version test
MPI-3.0 Test returns MPI library version.
MPI_Info_create basic
Simple test for MPI_Comm_{set,get}_info.
MPI_Info_get basic
This is a simple test of the MPI_Info_get() function.
MPI_Mprobe() series
This tests MPI_Mprobe() using a series of tests. Includes tests with send and Mprobe+Mrecv, send and Mprobe+Imrecv, send and Improbe+Mrecv, send and Improbe+Irecv, Mprobe+Mrecv with MPI_PROC_NULL, Mprobe+Imrecv with MPI_PROC_NULL, Improbe+Mrecv with MPI_PROC_NULL, Improbe+Imrecv, and test to verify MPI_Message_c2f() and MPI_Message_f2c() are present.
MPI RMA read-and-ops
This test exercises atomic, one-sided read-and-operation calls. Includes multiple tests for different RMA request-based operations, communicators, and wait patterns.
MPI_Status large count
This test manipulates an MPI status object using MPI_Status_set_elements_x() with various large count values and verifying MPI_Get_elements_x() and MPI_Test_cancelled() produce the correct values.
MPI_T 3.1 get index call
Tests that the MPI 3.1 Toolkit interface *_get_index name lookup functions work as expected.
MPI_T cycle variables
To print out all MPI_T control variables, performance variables and their categories in the MPI implementation.
MPI_T multithreaded
This test is adapted from test/mpi/mpi_t/mpit_vars.c. But this is a multithreading version in which multiple threads will call MPI_T routines.
With verbose set, thread 0 will prints out MPI_T control variables, performance variables and their categories.
MPI_T string handling
A test that MPI_T string handling is working as expected.
MPI_T write variable
This test writes to control variables exposed by MPI_T functionality of MPI_3.0.
MPI_Win_allocate_shared
Test MPI_Win_allocate and MPI_Win_allocate_shared when allocating memory with size of 1GB per process. Also tests having every other process allocate zero bytes and tests having every other process allocate 0.5GB.
Matched Probe
This routine is designed to test the MPI-3.0 matched probe support. The support provided in MPI-2.2 was not thread safe allowing other threads to usurp messages probed in other threads.
The rank=0 process generates a random array of floats that is sent to mpi rank 1. Rank 1 send a message back to rank 0 with the message length of the received array. Rank 1 spawns 2 or more threads that each attempt to read the message sent by rank 0. In general, all of the threads have equal access to the data, but the first one to probe the data will eventually end of processing the data, and all the others will relent. The threads use MPI_Improbe(), so if there is nothing to read, the thread will rest for 0.1 secs before reprobing. If nothing is probed within a fixed number of cycles, the thread exists and sets it thread exit status to 1. If a thread is able to read the message, it returns an exit status of 0.
Multiple threads context dup
This test creates communicators concurrently in different threads.
Multiple threads context idup
This test creates communicators concurrently, non-blocking, in different threads.
Non-blocking basic
This is a weak sanity test that all non-blocking collectives specified by MPI-3 are present in the library and accept arguments as expected. This test does not check for progress, matching issues, or sensible output buffer values.
Non-blocking intracommunicator
This is a basic test of all 17 non-blocking collective operations specified by the MPI-3 standard. It only exercises the intracommunicator functionality, does not use MPI_IN_PLACE, and only transmits/receives simple integer types with relatively small counts. It does check a few fancier issues, such as ensuring that "premature user releases" of MPI_Op and MPI_Datatype objects does not result in a segfault.
Non-blocking overlapping
This test attempts to execute multiple simultaneous non-blocking collective (NBC) MPI routines at the same time, and manages their completion with a variety of routines (MPI_{Wait,Test}{,_all,_any,_some}). The test also exercises a few point-to-point operations.
Non-blocking wait
This is a weak sanity test that all non-blocking collectives specified by MPI-3 are present in the library and take arguments as expected. Includes calls to MPI_Wait() immediately following non-blocking collective calls. This test does not check for progress, matching issues, or sensible output buffer values.
One-Sided get-accumulate indexed
This code performs N strided get accumulate operations into a 2d patch of a shared array. The array has dimensions [X, Y] and the subarray has dimensions [SUB_X, SUB_Y] and begins at index [0, 0]. The input and output buffers are specified using an MPI indexed type.
One-Sided get-accumulate shared
This code performs N strided get accumulate operations into a 2d patch of a shared array. The array has dimensions [X, Y] and the subarray has dimensions [SUB_X, SUB_Y] and begins at index [0, 0]. The input and output buffers are specified using an MPI indexed type. Shared buffers are created by MPI_Win_allocate_shared.
One-Sided put-get shared
This code performs N strided put operations followed by get operations into a 2-D patch of a shared array. The array has dimensions [X, Y] and the subarray has dimensions [SUB_X, SUB_Y] and begins at index [0, 0]. The input and output buffers are specified using an MPI indexed type. Shared buffers are created by MPI_Win_allocate_shared.
RMA MPI_PROC_NULL target
Test MPI_PROC_NULL as a valid target for many RMA operations using active target synchronization, passive target synchronization, and request-based passive target synchronization.
RMA Shared Memory
This simple RMA shared memory test uses MPI_Win_allocate_shared() with MPI_Win_fence() and MPI_Put() calls with and without assert MPI_MODE_NOPRECEDE.
RMA zero-byte transfers
Tests zero-byte transfers for a selection of communicators for many RMA operations using active target synchronizaiton and request-based passive target synchronization.
RMA zero-size compliance
The test uses various combinations of either zero size datatypes or zero size counts for Put, Get, Accumulate, and Get_Accumulate. All tests should pass to be compliant with the MPI-3.0 specification.
Request-based operations
Example 11.21 from the MPI 3.0 spec. The following example shows how RMA request-based operations can be used to overlap communication with computation. Each process fetches, processes, and writes the result for NSTEPS chunks of data. Instead of a single buffer, M local buffers are used to allow up to M communication operations to overlap with computation.
Simple thread comm idup
This is a simple test of threads in MPI with non-blocking communicator duplication.
Thread/RMA interaction
This is a simple test of threads in MPI.
Threaded group
In this test a number of threads are created with a distinct MPI communicator (or comm) group distinguished by its thread-id (used as a tag). Threads on even ranks join an even comm and threads on odd ranks join the odd comm.
Type_create_hindexed_block contents
This test is a simple check of MPI_Type_create_hindexed_block() using MPI_Type_get_envelope() and MPI_Type_get_contents().
Type_create_hindexed_block
Tests behavior with a hindexed_block that can be converted to a contig easily. This is specifically for coverage. Returns the number of errors encountered.
Win_allocate_shared zero
Test MPI_Win_allocate_shared when size of the shared memory region is 0 and when the size is 0 on every other process and 1 on the others.
Win_create_dynamic
This test exercises dynamic RMA windows using the MPI_Win_create_dynamic() and MPI_Accumulate() operations.
Window same_disp_unit
Test the acceptance of the MPI 3.1 standard same_disp_unit info key for window creation.
Win_flush basic
Window Flush. This simple test flushes a shared window using MPI_Win_flush() and MPI_Win_flush_all().
Win_flush_local basic
Window Flush. This simple test flushes a shared window using MPI_Win_flush_local() and MPI_Win_flush_local_all().
Win_get_attr
This test determines which "flavor" of RMA is created by creating windows and using MPI_Win_get_attr to access the attributes of each window.
Win_info
This test creates an RMA info object, sets key/value pairs on the object, then duplicates the info object and confirms that the key/value pairs are in the same order as the original.
Win_shared_query basic
This simple test exercises the MPI_Win_shared_query() by querying a shared window and verifying it produced the correct results.
Win_shared_query non-contig put
MPI_Put test with noncontiguous datatypes using MPI_Win_shared_query() to query windows on different ranks and verify they produced the correct results.
Win_shared_query non-contiguous
This test exercises MPI_Win_shared_query() by querying windows on different ranks and verifying they produced the correct results.
MPI-2.2
This group features tests that exercises MPI functionality of MPI-2.2 and earlier.
Alloc_mem
Simple check to see if MPI_Alloc_mem() is supported.
C/Fortran interoperability supported
Checks if the C-Fortran (F77) interoperability functions are supported using the MPI-2.2 specification.
Comm_create intercommunicators
This program tests MPI_Comm_create() using a selection of intercommunicators. Creates a new communicator from an intercommunicator, duplicates the communicator, and verifies that it works. Includes test with one side of intercommunicator being set with MPI_GROUP_EMPTY.
Comm_split intercommunicators
This tests MPI_Comm_split() using a selection of intercommunicators. The split communicator is tested using simple send and receive routines.
Communicator attributes
Returns all communicator attributes that are not supported. The test is run as a single process MPI job and fails if any attributes are not supported.
Deprecated routines
Checks all MPI deprecated routines as of MPI-2.2, but not including routines removed by MPI-3 if this is an MPI-3 implementation.
Error Handling
Reports the default action taken on an error. It also reports if error handling can be changed to "returns", and if so, if this functions properly.
Extended collectives
Checks if "extended collectives" are supported, i.e., collective operations with MPI-2 intercommunicators.
Init arguments
In MPI-1.1, it is explicitly stated that an implementation is allowed to require that the arguments argc and argv passed by an application to MPI_Init in C be the same arguments passed into the application as the arguments to main. In MPI-2 implementations are not allowed to impose this requirement. Conforming implementations of MPI allow applications to pass NULL for both the argc and argv arguments of MPI_Init(). This test prints the result of the error status of MPI_Init(). If the test completes without error, it reports 'No errors.'
MPI-2 replaced routines
Checks the presence of all MPI-2.2 routines that replaced deprecated routines.
MPI-2 type routines
This test checks that a subset of MPI-2 routines that replaced MPI-1 routines work correctly.
MPI_Topo_test dgraph
Specify a distributed graph of a bidirectional ring of the MPI_COMM_WORLD communicator. Thus each node in the graph has a left and right neighbor.
Master/slave
This test running as a single MPI process spawns four slave processes using MPI_Comm_spawn(). The master process sends and receives a message from each slave. If the test completes, it will report 'No errors.', otherwise specific error messages are listed.
One-sided communication
Checks MPI-2.2 one-sided communication modes reporting those that are not defined. If the test compiles, then "No errors" is reported, else, all undefined modes are reported as "not defined."
One-sided fences
Verifies that one-sided communication with active target synchronization with fences functions properly. If all operations succeed, one-sided communication with active target synchronization with fences is reported as supported. If one or more operations fail, the failures are reported and one-sided-communication with active target synchronization with fences is reported as NOT supported.
One-sided passiv
Verifies one-sided-communication with passive target synchronization functions properly. If all operations succeed, one-sided-communication with passive target synchronization is reported as supported. If one or more operations fail, the failures are reported and one-sided-communication with passive target synchronization with fences is reported as NOT supported.
One-sided post
Verifies one-sided-communication with active target synchronization with post/start/complete/wait functions properly. If all operations succeed, one-sided communication with active target synchronization with post/start/complete/wait is reported as supported. If one or more operations fail, the failures are reported and one-sided-communication with active target synchronization with post/start/complete/wait is reported as NOT supported.
One-sided routines
Reports if one-sided communication routines are defined. If this test compiles, one-sided communication is reported as defined, otherwise "not supported".
Reduce_local basic
A simple test of MPI_Reduce_local(). Uses MPI_SUM as well as user defined operators on arrays of increasing size.
Thread support
Reports the level of thread support provided. This is either MPI_THREAD_SINGLE, MPI_THREAD_FUNNELED, MPI_THREAD_SERIALIZED, or MPI_THREAD_MULTIPLE.
RMA
This group features tests that involve Remote Memory Access, sometimes called one-sided communication. Remote Memory Access is similar in fuctionality to shared memory access.
ADLB mimic
This test uses one server process (S), one target process (T) and a bunch of origin processes (O). 'O' PUTs (LOCK/PUT/UNLOCK) data to a distinct part of the window, and sends a message to 'S' once the UNLOCK has completed. The server forwards this message to 'T'. 'T' GETS the data from this buffer (LOCK/GET/UNLOCK) after it receives the message from 'S', to see if it contains the correct contents.
Accumulate fence sum alloc_mem
Test MPI_Accumulate with fence. This test is the same as "Accumulate with fence sum" except that it uses alloc_mem() to allocate memory.
Accumulate parallel pi
This test calculates pi by integrating the function 4/(1+x*x) using MPI_Accumulate and other RMA functions.
Accumulate with Lock
Accumulate Lock. This test uses MAXLOC and MINLOC with MPI_Accumulate on a 2Int datatype with and without MPI_Win_lock set with MPI_LOCK_SHARED.
Accumulate with fence comms
Simple test of Accumulate/Replace with fence for a selection of communicators and datatypes.
Accumulate with fence sum
Test MPI_Accumulate using MPI_SUM with fence using a selection of communicators and datatypes and verifying the operations produce the correct result.
Alloc_mem
Simple check to see if MPI_Alloc_mem() is supported.
Alloc_mem basic
Allocate Memory. Simple test where MPI_Alloc_mem() and MPI_Free_mem() work together.
Compare_and_swap contention
Tests MPI_Compare_and_swap using self communication, neighbor communication, and communication with the root causing contention.
Contention Put/Get
Contended RMA put/get test. Each process issues COUNT put and get operations to non-overlapping locations on every other process.
Contention Put
Contended RMA put test. Each process issues COUNT put operations to non-overlapping locations on every other process and checks the correct result was returned.
Contiguous Get
This program calls MPI_Get with an indexed datatype. The datatype comprises a single integer at an initial displacement of 1 integer. That is, the first integer in the array is to be skipped. This program found a bug in IBM's MPI in which MPI_Get ignored the displacement and got the first integer instead of the second. Run with one (1) process.
Fetch_and_add allocmem
Fetch and add example from Using MPI-2 (the non-scalable version, Fig. 6.12). This test is the same as fetch_and_add test 1 (rma/fetchandadd) but uses MPI_Alloc_mem and MPI_Free_mem.
Fetch_and_add basic
Fetch and add example from Using MPI-2 (the non-scalable version, Fig. 6.12). Root provides a shared counter array that other processes fetch and increment. Each process records the sum of values in the counter array after each fetch then the root gathers these sums and verifies each counter state is observed.
Fetch_and_add tree allocmem
Scalable tree-based fetch and add example from Using MPI-2, pg 206-207. This test is the same as fetch_and_add test 3 but uses MPI_Alloc_mem and MPI_Free_mem.
Fetch_and_add tree atomic
Scalable tree-based fetch and add example from the book Using MPI-2, p. 206-207. This test is functionally attempting to perform an atomic read-modify-write sequence using MPI-2 one-sided operations. This version uses a tree instead of a simple array, where internal nodes of the tree hold the sums of the contributions of their children. The code in the book (Fig 6.16) has bugs that are fixed in this test.
Fetch_and_op basic
This simple set of tests executes the MPI_Fetch_and op() calls on RMA windows using a selection of datatypes with multiple different communicators, communication patterns, and operations.
{Get,set}_name
This simple test exercises MPI_Win_set_name() and MPI_Win_get_name() using a selection of different windows.
Get_acculumate basic
Get Accumulated Test. This is a simple test of MPI_Get_accumulate() on a local window.
Get_accumulate communicators
Get Accumulate Test. This simple set of tests executes MPI_Get_accumulate on RMA windows using a selection of datatypes with multiple different communicators, communication patterns, and operations.
Get series allocmem
Tests a series of Gets. Run with 2 processors. Same as "Get series" test (rma/test5) but uses alloc_mem.
Get series
Tests a series of Gets. Runs using exactly two processors.
Get with fence basic
Get with Fence. This is a simple test using MPI_Get() with fence for a selection of communicators and datatypes.
Keyvalue create/delete
Free keyval window. Test freeing keyvals while still attached to an RMA window, then make sure that the keyval delete code is still executed. Tested with a selection of windows.
Linked list construction fetch/op
This test constructs a distributed shared linked list using MPI-3 dynamic windows with MPI_Fetch_and_op. Initially process 0 creates the head of the list, attaches it to an RMA window, and broadcasts the pointer to all processes. All processes then concurrently append N new elements to the list. When a process attempts to attach its element to the tail of list it may discover that its tail pointer is stale and it must chase ahead to the new tail before the element can be attached.
Linked_list construction
Construct a distributed shared linked list using MPI-3 dynamic windows. Initially process 0 creates the head of the list, attaches it to the window, and broadcasts the pointer to all processes. Each process "p" then appends N new elements to the list when the tail reaches process "p-1".
Linked list construction lockall
Construct a distributed shared linked list using MPI-3 dynamic RMA windows. Initially process 0 creates the head of the list, attaches it to the window, and broadcasts the pointer to all processes. All processes then concurrently append N new elements to the list. When a process attempts to attach its element to the tail of list it may discover that its tail pointer is stale and it must chase ahead to the new tail before the element can be attached. This version of the test suite uses MPI_Win_lock_all() instead of MPI_Win_lock(MPI_LOCK_EXCLUSIVE, ...).
Linked_list construction lock excl
MPI-3 distributed linked list construction example. Construct a distributed shared linked list using proposed MPI-3 dynamic windows. Initially process 0 creates the head of the list, attaches it to the window, and broadcasts the pointer to all processes. Each process "p" then appends N new elements to the list when the tail reaches process "p-1". The test uses the MPI_LOCK_EXCLUSIVE argument with MPI_Win_lock().
Linked-list construction lock shr
This test constructs a distributed shared linked list using MPI-3 dynamic windows. Initially process 0 creates the head of the list, attaches it to the window, and broadcasts the pointer to all processes. Each process "p" then appends N new elements to the list when the tail reaches process "p-1". This test is similar to Linked_list construction test 2 (rma/linked_list_bench_lock_excl) but uses an MPI_LOCK_SHARED parameter to MPI_Win_Lock().
Linked_list construction put/get
This test constructs a distributed shared linked list using MPI-3 dynamic windows with MPI_Put and MPI_Get. Initially process 0 creates the head of the list, attaches it to an RMA window, and broadcasts the pointer to all processes. All processes then concurrently append N new elements to the list. When a process attempts to attach its element to the tail of list it may discover that its tail pointer is stale and it must chase ahead to the new tail before the element can be attached.
Lock-single_op-unlock
Test passive target RMA on 2 processes with the original datatype derived from the target datatype. Includes multiple tests for MPI_Accumulate, MPI_Put, MPI_Put with MPI_Get move-to-end optimization, and MPI_Put with a MPI_Get already at the end move-to-end optimization.
Locks with no RMA ops
This test creates a window, clears the memory in it using memset(), locks and unlocks it, then terminates.
MCS_Mutex_trylock
This test exercises the MCS_Mutex_lock calls by having multiple competing processes repeatedly lock and unlock a mutex.
MCS_Mutex_trylock
This test exercises the MCS_Mutex_lock calls by having multiple competing processes repeatedly lock and unlock a mutex.
MPI RMA read-and-ops
This test exercises atomic, one-sided read-and-operation calls. Includes multiple tests for different RMA request-based operations, communicators, and wait patterns.
MPI_Win_allocate_shared
Test MPI_Win_allocate and MPI_Win_allocate_shared when allocating memory with size of 1GB per process. Also tests having every other process allocate zero bytes and tests having every other process allocate 0.5GB.
Matrix transpose PSCW
Transposes a matrix using post/start/complete/wait and derived datatypes. Uses vector and hvector (Example 3.32 from MPI 1.1 Standard). Run using 2 processors.
Matrix transpose accum
This does a transpose-accumulate operation. Uses vector and hvector datatypes (Example 3.32 from MPI 1.1 Standard). Run using 2 processors.
Matrix transpose get hvector
This test transpose a matrix with a get operation, fence, and derived datatypes. Uses vector and hvector (Example 3.32 from MPI 1.1 Standard). Run using exactly 2 processorss.
Matrix transpose local accum
This does a local transpose-accumulate operation. Uses vector and hvector datatypes (Example 3.32 from MPI 1.1 Standard). Run using exactly 1 processor.
Matrix transpose passive
Transposes a matrix using passive target RMA and derived datatypes. Uses vector and hvector (Example 3.32 from MPI 1.1 Standard). Run using 2 processors.
Matrix transpose put hvector
Transposes a matrix using put, fence, and derived datatypes. Uses vector and hvector (Example 3.32 from MPI 1.1 Standard). Run using 2 processors.
Matrix transpose put struct
Transposes a matrix using put, fence, and derived datatypes. Uses vector and struct (Example 3.33 from MPI 1.1 Standard). We could use vector and type_create_resized instead. Run using exactly 2 processors.
Mixed synchronization test
Perform several RMA communication operations, mixing synchronization types. Use multiple communication to avoid the single-operation optimization that may be present.
One-Sided accumulate indexed
This code performs N accumulates into a 2d patch of a shared array. The array has dimensions [X, Y] and the subarray has dimensions [SUB_X, SUB_Y] and begins at index [0, 0]. The input and output buffers are specified using an MPI indexed type.
One-Sided accumulate one lock
This code performs one-sided accumulate into a 2-D patch of a shared array.
One-Sided accumulate subarray
This code performs N accumulates into a 2d patch of a shared array. The array has dimensions [X, Y] and the subarray has dimensions [SUB_X, SUB_Y] and begins at index [0, 0]. The input and output buffers are specified using an MPI subarray type.
One-Sided get-accumulate indexed
This code performs N strided get accumulate operations into a 2d patch of a shared array. The array has dimensions [X, Y] and the subarray has dimensions [SUB_X, SUB_Y] and begins at index [0, 0]. The input and output buffers are specified using an MPI indexed type.
One-Sided get-accumulate shared
This code performs N strided get accumulate operations into a 2d patch of a shared array. The array has dimensions [X, Y] and the subarray has dimensions [SUB_X, SUB_Y] and begins at index [0, 0]. The input and output buffers are specified using an MPI indexed type. Shared buffers are created by MPI_Win_allocate_shared.
One-Sided get indexed
This code performs N strided get operations into a 2d patch of a shared array. The array has dimensions [X, Y] and the subarray has dimensions [SUB_X, SUB_Y] and begins at index [0, 0]. The input and output buffers are specified using an MPI indexed type.
One-Sided put-get indexed
This code performs N strided put operations followed by get operations into a 2-D patch of a shared array. The array has dimensions [X, Y] and the subarray has dimensions [SUB_X, SUB_Y] and begins at index [0, 0]. The input and output buffers are specified using an MPI indexed datatype.
One-Sided put-get shared
This code performs N strided put operations followed by get operations into a 2-D patch of a shared array. The array has dimensions [X, Y] and the subarray has dimensions [SUB_X, SUB_Y] and begins at index [0, 0]. The input and output buffers are specified using an MPI indexed type. Shared buffers are created by MPI_Win_allocate_shared.
One-sided communication
Checks MPI-2.2 one-sided communication modes reporting those that are not defined. If the test compiles, then "No errors" is reported, else, all undefined modes are reported as "not defined."
One-sided fences
Verifies that one-sided communication with active target synchronization with fences functions properly. If all operations succeed, one-sided communication with active target synchronization with fences is reported as supported. If one or more operations fail, the failures are reported and one-sided-communication with active target synchronization with fences is reported as NOT supported.
One-sided passiv
Verifies one-sided-communication with passive target synchronization functions properly. If all operations succeed, one-sided-communication with passive target synchronization is reported as supported. If one or more operations fail, the failures are reported and one-sided-communication with passive target synchronization with fences is reported as NOT supported.
One-sided post
Verifies one-sided-communication with active target synchronization with post/start/complete/wait functions properly. If all operations succeed, one-sided communication with active target synchronization with post/start/complete/wait is reported as supported. If one or more operations fail, the failures are reported and one-sided-communication with active target synchronization with post/start/complete/wait is reported as NOT supported.
One-sided routines
Reports if one-sided communication routines are defined. If this test compiles, one-sided communication is reported as defined, otherwise "not supported".
Put-Get-Accum PSCW allocmem
Tests put and get with post/start/complete/wait on 2 processes. Same as Put,Gets,Accumulate test 4 (rma/test2) but uses alloc_mem.
Put-Get-Accum PSCW
Tests put and get with post/start/complete/wait on 2 processes.
Put-Get-Accum fence allocmem
Tests a series of put, get, and accumulate on 2 processes using fence. This test is the same as "Put-Get-Accumulate fence" (rma/test1) but uses alloc_mem.
Put-Get-Accum fence derived
Tests a series of puts, gets, and accumulate on 2 processes using fence. Same as "Put-Get-Accumulate fence" (rma/test1) but uses derived datatypes to receive data.
Put-Get-Accum fence
Tests a series of puts, gets, and accumulate on 2 processes using fence.
Put-Get-Accum lock opt allocmem
Tests passive target RMA on 2 processes. tests the lock-single_op-unlock optimization. Same as "Put-Get-accum lock opt" test (rma/test4) but uses alloc_mem.
Put-Get-Accum lock opt
Tests passive target RMA on 2 processes using a lock-single_op-unlock optimization.
Put-Get-Accum true-1 allocmem
Tests the example in Fig 6.8, pg 142, MPI-2 standard. Process 1 has a blocking MPI_Recv between the Post and Wait. Therefore, this example will not run if the one-sided operations are simply implemented on top of MPI_Isends and Irecvs. They either need to be implemented inside the progress engine or using threads with Isends and Irecvs. In MPICH-2, they are implemented in the progress engine. This test is the same as Put,Gets,Accumulate test 6 (rma/test3) but uses alloc_mem.
Put-Get-Accum true one-sided
Tests the example in Fig 6.8, pg 142, MPI-2 standard. Process 1 has a blocking MPI_Recv between the Post and Wait. Therefore, this example will not run if the one-sided operations are simply implemented on top of MPI_Isends and Irecvs. They either need to be implemented inside the progress engine or using threads with Isends and Irecvs. In MPICH-2 (in MPICH), they are implemented in the progress engine.
Put with fences
Put with Fences used to seperate epochs. This test looks at the behavior of MPI_Win_fence and epochs. Each MPI_Win_fence may both begin and end both the exposure and access epochs. Thus, it is not necessary to use MPI_Win_fence in pairs. Tested with a selection of communicators and datatypes.
The tests have the following form:
Process A Process B fence fence put,put fence fence put,put fence fence put,put put,put fence fence
RMA MPI_PROC_NULL target
Test MPI_PROC_NULL as a valid target for many RMA operations using active target synchronization, passive target synchronization, and request-based passive target synchronization.
RMA Shared Memory
This simple RMA shared memory test uses MPI_Win_allocate_shared() with MPI_Win_fence() and MPI_Put() calls with and without assert MPI_MODE_NOPRECEDE.
RMA contiguous calls
This test exercises the one-sided contiguous MPI calls using repeated RMA calls for multiple operations. Includes multiple tests for different lock modes and assert types.
RMA fence PSCW ordering
This post/start/complete/wait operation test checks an oddball case for generalized active target synchronization where the start occurs before the post. Since start can block until the corresponding post, the group passed to start must be disjoint from the group passed to post for processes to avoid a circular wait. Here, odd/even groups are used to accomplish this and the even group reverses its start/post calls.
RMA fence null
This simple test creates a window with a null pointer then performs a post/start/complete/wait operation.
RMA fence put PSCW
Put with Post/Start/Complete/Wait using a selection of communicators and datatypes.
RMA fence put base
This code performs N strided put operations into a 2d patch of a shared array. The array has dimensions [X, Y] and the subarray has dimensions [SUB_X, SUB_Y] and begins at index [0, 0]. The input and output buffers are specified using an MPI datatype. This test generates a datatype that is relative to an arbitrary base address in memory and tests the RMA implementation's ability to perform the correct transfer.
RMA fence put bottom
One-Sided MPI 2-D Strided Put Test. This code performs N strided put operations into a 2d patch of a shared array. The array has dimensions [X, Y] and the subarray has dimensions [SUB_X, SUB_Y] and begins at index [0, 0]. The input and output buffers are specified using an MPI datatype. This test generates a datatype that is relative to MPI_BOTTOM and tests the RMA implementation's ability to perform the correct transfer.
RMA fence put indexed
Put with Fence for an indexed datatype. One MPI Implementation fails this test with sufficiently large values of blksize. It appears to convert this type to an incorrect contiguous move.
RMA fence put
Tests MPI_Put and MPI_Win_fence with a selection of communicators and datatypes.
RMA get attributes
This test creates a window, then extracts its attributes through a series of MPI_Win_get_attr calls.
RMA lock contention accumulate
This is a modified version of Put,Gets,Accumulate test 9 (rma/test4). Tests passive target RMA on 3 processes. Tests the lock-single_op-unlock optimization.
RMA lock contention basic
Multiple tests for lock contention, including special cases within the MPI implementation; in this case, our coverage analysis showed the lockcontention test was not covering all cases and revealed a bug in the code. In all of these tests, each process writes (or accesses) the values rank + i*size_of_world for NELM times. This test strives to avoid operations not strictly permitted by MPI RMA, for example, it doesn't target the same locations with multiple put/get calls in the same access epoch.
RMA lock contention optimized
Multiple additional tests for lock contention. These are designed to exercise some of the optimizations within MPICH, but all are valid MPI programs. Tests structure includes:
Lock local (must happen
at this time since application can use load store after thelock)
Send message to partner
Receive message
Send ack
Receive ack
Provide a delay so that the partner will see the conflict
Partner executes:
Lock // Note: this may block rma operations (see below)
Unlock
Send back to partner
Unlock
Receive from partner
Check for correct data
The delay may be implemented as a ring of message communication; this is likely to automatically scale the time to what is needed.
RMA many ops basic
Many RMA operations. This simple test creates an RMA window, locks it, and performs many accumulate operations on it.
RMA many ops sync
Tests for correct handling of the case where many RMA operations occur between synchronization events. Includes options for multiple different RMA operations, and is currently run for accumulate with fence. This is one of the ways that RMA may be used, and is used in the reference implementation of the graph500 benchmark.
RMA post/start/complete test
Tests put and get with post/start/complete/test on 2 processes. Same as "Put-Get-Accum PSCW" test (rma/test2), but uses win_test instead of win_wait.
RMA post/start/complete/wait
Accumulate Post-Start-Complete-Wait. This test uses accumulate/replace with post/start/complete/wait for source and destination processes on a selection of communicators and datatypes.
RMA rank 0
Test RMA calls to self using multiple RMA operations and checking the accuracy of the result.
RMA zero-byte transfers
Tests zero-byte transfers for a selection of communicators for many RMA operations using active target synchronizaiton and request-based passive target synchronization.
RMA zero-size compliance
The test uses various combinations of either zero size datatypes or zero size counts for Put, Get, Accumulate, and Get_Accumulate. All tests should pass to be compliant with the MPI-3.0 specification.
Request-based operations
Example 11.21 from the MPI 3.0 spec. The following example shows how RMA request-based operations can be used to overlap communication with computation. Each process fetches, processes, and writes the result for NSTEPS chunks of data. Instead of a single buffer, M local buffers are used to allow up to M communication operations to overlap with computation.
Thread/RMA interaction
This is a simple test of threads in MPI.
Win_allocate_shared zero
Test MPI_Win_allocate_shared when size of the shared memory region is 0 and when the size is 0 on every other process and 1 on the others.
Win_create_dynamic
This test exercises dynamic RMA windows using the MPI_Win_create_dynamic() and MPI_Accumulate() operations.
Win_create_errhandler
This test creates 1000 RMA windows using MPI_Alloc_mem(), then frees the dynamic memory and the RMA windows that were created.
Window attributes order
Test creating and inserting and deleting attributes in different orders using MPI_Win_set_attr and MPI_Win_delete_attr to ensure the list management code handles all cases.
Window same_disp_unit
Test the acceptance of the MPI 3.1 standard same_disp_unit info key for window creation.
Win_errhandler
This test creates and frees MPI error handlers in a loop (1000 iterations) to test the internal MPI RMA memory allocation routines.
Win_flush basic
Window Flush. This simple test flushes a shared window using MPI_Win_flush() and MPI_Win_flush_all().
Win_flush_local basic
Window Flush. This simple test flushes a shared window using MPI_Win_flush_local() and MPI_Win_flush_local_all().
Win_get_attr
This test determines which "flavor" of RMA is created by creating windows and using MPI_Win_get_attr to access the attributes of each window.
Win_get_group basic
This is a simple test of MPI_Win_get_group() for a selection of communicators.
Win_info
This test creates an RMA info object, sets key/value pairs on the object, then duplicates the info object and confirms that the key/value pairs are in the same order as the original.
Win_shared_query basic
This simple test exercises the MPI_Win_shared_query() by querying a shared window and verifying it produced the correct results.
Win_shared_query non-contig put
MPI_Put test with noncontiguous datatypes using MPI_Win_shared_query() to query windows on different ranks and verify they produced the correct results.
Win_shared_query non-contiguous
This test exercises MPI_Win_shared_query() by querying windows on different ranks and verifying they produced the correct results.
Attributes Tests
This group features tests that involve attributes objects.
At_Exit attribute order
The MPI-2.2 specification makes it clear that attributes are called on MPI_COMM_WORLD and MPI_COMM_SELF at the very beginning of MPI_Finalize in LIFO order with respect to the order in which they are set. This is useful for tools that want to perform the MPI equivalent of an "at_exit" action.
This test uses 20 attributes to ensure that the hash-table based MPI implementations do not accidentally pass the test except by being extremely "lucky". There are (20!) possible permutations providing about a 1 in 2.43e18 chance of getting LIFO ordering out of a hash table assuming a decent hash function is used.
At_Exit function
This test demonstrates how to attach an "at-exit()" function to MPI_Finalize().
Attribute/Datatype
This program creates a contiguous datatype from type MPI_INT, attaches an attribute to the type, duplicates it, then deletes both the original and duplicate type.
Attribute callback error
This test exercises attribute routines. It checks for correct behavior of the copy and delete functions on an attribute, particularly the correct behavior when the routine returns a failure.
MPI 1.2 Clarification: Clarification of Error Behavior of Attribute Callback Functions. Any return value other than MPI_SUCCESS is erroneous. The specific value returned to the user is undefined (other than it can't be MPI_SUCCESS). Proposals to specify particular values (e.g., user's value) failed.
Attribute comm callback error
This test exercises attribute routines. It checks for correct behavior of the copy and delete functions on an attribute, particularly the correct behavior when the routine returns failure.
MPI 1.2 Clarification: Clarification of Error Behavior of Attribute Callback Functions. Any return value other than MPI_SUCCESS is erroneous. The specific value returned to the user is undefined (other than it can't be MPI_SUCCESS). Proposals to specify particular values (e.g., user's value) failed. This test is similar in function to attrerr but uses communicators.
Attribute delete/get
This program illustrates the use of MPI_Comm_create_keyval() that creates a new attribute key.
Attribute order
This test creates and inserts attributes in different orders to ensure that the list management code handles all cases properly.
Attribute type callback error
This test checks for correct behavior of the copy and delete functions on an attribute, particularly the correct behavior when the routine returns failure.
Any return value other than MPI_SUCCESS is erroneous. The specific value returned to the user is undefined (other than it can't be MPI_SUCCESS). Proposals to specify particular values (e.g., user's value) have not been successful. This test is similar in function to attrerr but uses types.
Basic Attributes
This test accesses many attributes such as MPI_TAG_UB, MPI_HOST, MPI_IO, MPI_WTIME_IS_GLOBAL, and many others and reports any errors.
Basic MPI-3 attribute
This program tests the integrity of the MPI-3.0 base attributes. The attribute keys tested are: MPI_TAG_UB, MPI_HOST, MPI_IO, MPI_WTIME_IS_GLOBAL, MPI_APPNUM, MPI_UNIVERSE_SIZE, MPI_LASTUSEDCODE
Communicator Attribute Order
This test creates and inserts communicator attributes in different orders to ensure that the list management code handles all cases properly.
Communicator attributes
Returns all communicator attributes that are not supported. The test is run as a single process MPI job and fails if any attributes are not supported.
Function keyval
This test illustrates the use of the copy and delete functions used in the manipulation of keyvals. It also tests to confirm that attributes are copied when communicators are duplicated.
Intercommunicators
This test exercises communicator attribute routines for intercommunicators.
Keyval communicators
This test tests freeing of keyvals while still attached to a communicator, then tests to make sure that the keyval delete and copy functions are executed properly.
Keyval test with types
This tests illustrates the use of keyvals associated with datatypes.
Multiple keyval_free
This tests multiple invocations of keyval_free on the same keyval.
RMA get attributes
This test creates a window, then extracts its attributes through a series of MPI_Win_get_attr calls.
Type Attribute Order
This test creates and inserts type attributes in different orders to ensure that the list management codes handles all cases properly.
Varying communicator orders/types
This test is similar to attr/attrordertype (creates/inserts attributes) but uses a different strategy of mixing attribute order, types, and with different types of communicators.
Performance
This group features tests that involve realtime latency performance analysis of MPI appications. Although performance testing is not an established goal of this test suite, these few tests were included because there has been discussion of including performance testing in future versions of the test suite. Such tests might be useful to aide users in determining what MPI features should be used for their particular application. These tests are exemplary of what future tests could provide.
Datatype creation
Make sure datatype creation is independent of data size. However, that there is no guarantee or expectation that the time would be constant. In particular, some optimizations might take more time than others.
The real goal of this is to ensure that the time to create a datatype doesn't increase strongly with the number of elements within the datatype, particularly for these datatypes that are quite simple patterns.
Group creation
This is a performance test indexed by group number to look at how communicator creation scales with group. The cost should be linear or at worst ts*log(ts), where ts <= number of communicators.
MPI_Group_Translate_ranks perf
Measure and compare the relative performance of MPI_Group_translate_ranks with small and large group2 sizes but a constant number of ranks. This serves as a performance sanity check for the Scalasca use case where we translate to MPI_COMM_WORLD ranks. The performance should only depend on the number of ranks passed, not the size of either group (especially group2). This test is probably only meaningful for large-ish process counts.
MPI-Tracing package
This code is intended to test the trace overhead when using an MPI tracing package. The test is currently run in verbose mode with the number of processes set to 32 to run on the greatest number of HPC systems.
MPI_{pack,unpack} perf
This code may be used to test the performance of some of the noncontiguous datatype operations, including vector and indexed pack and unpack operations. To simplify the use of this code for tuning an MPI implementation, it uses no communication, just the MPI_Pack and MPI_Unpack routines. In addition, the individual tests are in separate routines, making it easier to compare the compiler-generated code for the user (manual) pack/unpack with the code used by the MPI implementation. Further, to be fair to the MPI implementation, the routines are passed the source and destination buffers; this ensures that the compiler can't optimize for statically allocated buffers.
Network performance
This test calculates bulk transfer rates and latency as a function of message buffer size.
Send/Receive basic perf
This program provides a simple test of send-receive performance between two (or more) processes. This test is sometimes called head-to-head or ping-ping test, as both processes send at the same time.
Synchonization basic perf
This test compares the time it takes between a synchronization step between rank 0 and rank 1. If that difference is greater than 10 percent, it is considered an error.
Timer sanity
Check that the timer produces monotone nondecreasing times and that the tick is reasonable.
Transposition type
This test transposes a (100x100) two-dimensional array using two options: (1) manually send and transpose, and (2) send using an automatic hvector type. It fails if (2) is too much slower than (1).
Variable message length
This test measures the latency involved in sending/receiving messages of varying size.