What is an mdspan, and what is it used for?


Over the past year or so I've noticed a few C++-related answers on StackOverflow refer to mdspan's - but I've never actually seen these in C++ code. I tried looking for them in my C++ compiler's standard library directory and in the C++ coding guidelines - but couldn't find them. I did find std::span's; I'm guessing they're related - but how? And what does this addition of "md" stand for?

Please explain what this mysterious entity is about, and when I might want to use it.

Best answer

TL;DR: mdspan is an extension of std::span for multiple dimensions - with a lot of (unavoidable) flexible configurability w.r.t. memory layout and modes of access.


Before you read this answer, you should make sure you're clear on what a span is and what it's used for. Now that that's out of the way: Since mdspans can be rather complex beasts (an implementation typically takes ~7x or more source code than an std::span implementation), we'll start with a simplified description, and keep the advanced capabilities for further below.

"What is it?" (simple version)

An mdspan<T> is:

  1. Literally, a "multi-dimensional span" (of type-T elements).
  2. A generalization of std::span<T>, from a uni-dimensional/linear sequence of elements to multiple dimensions.
  3. A non-owning view of a contiguous sequence of elements of type T in memory, interpreted as a multi-dimensional array.
  4. Basically just a struct { T * ptr; size_type extents[d]; } with some convenience methods (for d dimensions, with the extent of each dimension determined at run-time).
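To make item 4 concrete, here is a minimal sketch of such a struct for the two-dimensional case. The name simple_mdspan_2d and the helper demo_simple_view are hypothetical, made up for this illustration; this is not how std::mdspan is implemented, and it assumes a row-major (C-style) layout with no bounds checking.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for a rank-2 mdspan (not the standard type):
// just a pointer plus two run-time extents, assuming row-major (C-style) order.
template <typename T>
struct simple_mdspan_2d {
    T* ptr;
    std::size_t extents[2];

    std::size_t extent(std::size_t r) const { return extents[r]; }

    // Element (i, j): skip i full rows of extents[1] elements, then j more.
    T& operator()(std::size_t i, std::size_t j) const {
        return ptr[i * extents[1] + j];
    }
};

// Usage sketch: view a 12-element buffer as 2 x 6 and write through the view.
inline int demo_simple_view() {
    std::vector<int> v = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12};
    simple_mdspan_2d<int> view{v.data(), {2, 6}};
    assert(view(1, 0) == 7);  // the second row starts after 6 elements
    view(0, 5) = 60;          // writes through to v[5]; the view owns nothing
    return v[5];
}
```

The point of the sketch is that the view holds no data of its own: copying it copies a pointer and two sizes, and writes go to the underlying buffer.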

Illustration of mdspan-interpreted layout

If we have:

std::vector v = {1,2,3,4,5,6,7,8,9,10,11,12};

we can view the data of v as a 1D array of 12 elements, similar to its original definition:

auto sp1 = std::span(v.data(), 12);
auto mdsp1 = std::mdspan(v.data(), 12);

or a 2D array of extents 2 x 6:

auto mdsp2 = std::mdspan(v.data(), 2, 6 );
// (  1,  2,  3,  4,  5,  6 ),
// (  7,  8,  9, 10, 11, 12 )

or a 3D array 2 x 3 x 2:

auto ms3 = std::mdspan(v.data(), 2, 3, 2);
// ( ( 1,  2 ), ( 3,  4 ), (  5,  6 ) ),
// ( ( 7,  8 ), ( 9, 10 ), ( 11, 12 ) )

and we could also consider it as a 3 x 2 x 2 or 2 x 2 x 3 array, or 3 x 4 and so on.
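All of these views share one buffer; only the index arithmetic changes. The following is a sketch of that arithmetic for the default row-major interpretation. The function offset_row_major and the helper demo_reinterpret are hypothetical names for this illustration, not part of the standard library.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Row-major (C-style) offset of a multi-index: the last index varies fastest.
// A sketch of the arithmetic only; bounds are not checked.
template <std::size_t Rank>
constexpr std::size_t offset_row_major(const std::array<std::size_t, Rank>& idx,
                                       const std::array<std::size_t, Rank>& extents) {
    std::size_t off = 0;
    for (std::size_t r = 0; r < Rank; ++r)
        off = off * extents[r] + idx[r];  // Horner-style accumulation
    return off;
}

// The same 12 elements as in v, read through differently shaped views.
inline void demo_reinterpret() {
    const int buf[12] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12};

    // 2 x 6 view: element (1, 0) is at offset 6.
    assert(buf[offset_row_major<2>({1, 0}, {2, 6})] == 7);
    // 2 x 3 x 2 view: element (1, 2, 0) is at offset 1*6 + 2*2 + 0 = 10.
    assert(buf[offset_row_major<3>({1, 2, 0}, {2, 3, 2})] == 11);
    // 3 x 4 view: element (2, 1) is at offset 2*4 + 1 = 9.
    assert(buf[offset_row_major<2>({2, 1}, {3, 4})] == 10);
}
```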

"When should I use it?"

  • (C++23 and later) When you want to use the multi-dimensional operator[] on some buffer you get from somewhere. Thus in the example above, ms3[1, 2, 0] is 11 and ms3[0, 1, 1] is 4.

  • When you want to pass multi-dimensional data without separating the raw data pointer and the dimensions. You've gotten a bunch of elements in memory, and want to refer to them using more than one dimension. Thus instead of:

    void print_matrix_element(
       float const* matrix, size_t row_width, size_t x, size_t y) 
    {
       std::print("{}", matrix[row_width * x + y]);
    }
    

    you could write:

    void print_matrix_element(
        std::mdspan<float const, std::dextents<size_t, 2>> matrix,
        size_t x, size_t y)
    {
       std::print("{}", matrix[x, y]);
    }
    
  • As the right type for passing multidimensional C arrays around:
    C supports multidimensional arrays perfectly... as long as their dimensions are given at compile time, and you don't try passing them to functions. Doing that is a bit tricky because the outermost dimension experiences decay, so you would actually be passing a pointer. But with mdspans, you can write this:

    #include <cstddef>
    #include <mdspan>
    #include <print>

    template <typename T, typename Extents>
    void print_3d_array(std::mdspan<T, Extents> ms3)
    {
       static_assert(ms3.rank() == 3, "Unsupported rank");
       // read back using 3D view
       for(size_t i=0; i != ms3.extent(0); i++) {
         std::print("slice @ i = {}\n", i);
         for(size_t j=0; j != ms3.extent(1); j++) {
           for(size_t k=0; k != ms3.extent(2); k++)
             std::print("{} ",  ms3[i, j, k]);
           std::print("\n");
         }
       }  
    }
    
    int main() {
        int arr[2][3][2] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 };
    
        auto ms3 = std::mdspan(&arr[0][0][0], 2, 3, 2);
          // Note: This construction can probably be improved, it's kind of fugly
    
        print_3d_array(ms3);
    }
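The decay mentioned in the last bullet can be checked directly in the type system. This is a small sketch; the function names takes_array and takes_array_ref are hypothetical and only declared, never called.

```cpp
#include <type_traits>

// A parameter declared as int[2][3][2] is adjusted to "pointer to int[3][2]":
// the outermost extent silently disappears from the function's type.
void takes_array(int arr[2][3][2]);
static_assert(std::is_same_v<decltype(takes_array), void(int (*)[3][2])>);

// Passing by reference keeps every extent, but all of them must then be
// fixed at compile time as part of the parameter type.
void takes_array_ref(int (&arr)[2][3][2]);
static_assert(std::is_same_v<decltype(takes_array_ref), void(int (&)[2][3][2])>);
```

Neither form carries run-time extents along with the pointer, which is the gap an mdspan parameter fills.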
    

Standardization status

While std::span was standardized in C++20, std::mdspan was not. However, it is part of C++23, where it lives in the <mdspan> header.

You can also use a reference implementation. It is part of the "Kokkos performance portability ecosystem" from the US Sandia National Laboratories.

"What are those 'extra capabilities' which mdspan offers?"

An mdspan actually has 4 template parameters, not just the element type and the extents:

template <
    class T,
    class Extents,
    class LayoutPolicy = layout_right,
    class AccessorPolicy = default_accessor<T>
>
class mdspan;

This answer is already rather long, so we won't give the full details, but:

  • Some of the extents can be "static" rather than "dynamic": specified at compile time, and thus not stored in instance data members. Only the "dynamic" extents are stored. For example, this:

    auto my_extents = std::extents<size_t, std::dynamic_extent, 3, std::dynamic_extent>{ 2, 4 };
    

    ... is an extents object corresponding to std::dextents<size_t, 3>{ 2, 3, 4 }, but which only stores the values 2 and 4 in the class instance; with the compiler knowing it needs to plug in 3 whenever the second dimension is used.

  • You can have the dimensions go from-minor-to-major, in Fortran style instead of from-major-to-minor like in C. Thus, if you set LayoutPolicy = layout_left, then mds[x,y] is at mds.data_handle()[mds.extent(0) * y + x] instead of the usual mds.data_handle()[mds.extent(1) * x + y].

  • You can "reshape" your mdspan into another mdspan with different dimensions but the same overall size, e.g. by constructing a new mdspan with different extents over the same data.

  • You can define a layout policy with "strides": Have consecutive elements in the mdspan be at a fixed distance in memory; have extra offsets at the beginning and/or the end of each line or dimensional slice; etc.

  • You can "cut up" your mdspan with offsets in every dimension (e.g. take a submatrix of a matrix) - and the result is still an mdspan! ... that's because you can have an mdspan with a LayoutPolicy which incorporates these offsets. The standard facility for this, std::submdspan, did not make it into C++23; it was adopted for C++26.

  • Using the AccessorPolicy, you can make mdspan's which actually do own the data they refer to, individually or collectively.
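The layout-related bullets above come down to different offset formulas over the same buffer. Here is a sketch of that arithmetic for the 2D case. The free functions offset_right, offset_left and offset_strided, and the helper demo_layouts, are hypothetical names for this illustration; they mirror what the layout_right, layout_left and layout_stride policies compute, but they are not the standard's API.

```cpp
#include <cassert>
#include <cstddef>

// Offset arithmetic only; these free functions are illustrative, not std APIs.

// Row-major, as with layout_right: the last index is contiguous.
constexpr std::size_t offset_right(std::size_t x, std::size_t y,
                                   std::size_t /*ext0*/, std::size_t ext1) {
    return ext1 * x + y;
}

// Column-major, as with layout_left: the first index is contiguous.
constexpr std::size_t offset_left(std::size_t x, std::size_t y,
                                  std::size_t ext0, std::size_t /*ext1*/) {
    return ext0 * y + x;
}

// Strided, as with layout_stride: each dimension has its own step.
constexpr std::size_t offset_strided(std::size_t x, std::size_t y,
                                     std::size_t stride0, std::size_t stride1) {
    return stride0 * x + stride1 * y;
}

inline void demo_layouts() {
    const int buf[12] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12};

    // Same buffer, same extents (2 x 6), two readings of element (1, 2).
    assert(buf[offset_right(1, 2, 2, 6)] == 9);  // offset 6*1 + 2 = 8
    assert(buf[offset_left(1, 2, 2, 6)] == 6);   // offset 2*2 + 1 = 5

    // A 2 x 3 submatrix of the row-major 2 x 6 matrix, starting at column 3:
    // keep the parent's strides (6, 1) and shift the base pointer by 3.
    const int* sub = buf + 3;
    assert(sub[offset_strided(0, 0, 6, 1)] == 4);
    assert(sub[offset_strided(1, 2, 6, 1)] == 12);  // parent offset 3 + 6 + 2 = 11
}
```

The submatrix case shows why the result of cutting up an mdspan can still be an mdspan: a shifted base pointer plus per-dimension strides is enough to describe it, with no copying.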

Further reading

(some examples were adapted from these sources.)