I am looking at code that copies a sparse file (to another sparse file). It uses DeviceIoControl(... FSCTL_QUERY_ALLOCATED_RANGES ...) to get a list of ranges containing actual data.

Is it guaranteed that the result contains ranges that:

  • don't intersect?

  • are ordered by FileOffset field?

  • aren't empty?

  • have FileOffset + Length > FileOffset (i.e. no wraparound wrt uint64_t)?
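
To make the four properties precise, here is a minimal sketch of a checker. It assumes a hypothetical portable struct `Range` standing in for FILE_ALLOCATED_RANGE_BUFFER (with unsigned offsets and lengths), so it is an illustration of the conditions, not code from the program in question.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical portable stand-in for FILE_ALLOCATED_RANGE_BUFFER.
struct Range { std::uint64_t offset; std::uint64_t length; };

// Returns true if every range is non-empty and does not wrap around,
// and the sequence is ordered by offset with no intersections.
bool ranges_well_formed(const std::vector<Range>& rs)
{
    std::uint64_t prev_end = 0;                              // end of the previous range
    for (const Range& r : rs)
    {
        if (r.length == 0) return false;                     // empty range
        if (r.offset + r.length <= r.offset) return false;   // unsigned sum wrapped around
        if (r.offset < prev_end) return false;               // out of order or intersecting
        prev_end = r.offset + r.length;
    }
    return true;
}
```

Adjacent ranges (one starting exactly where the previous one ends) pass this check, since they do not intersect.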

Edit:

I've implemented validation in case the OS doesn't give any of these guarantees:

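#define NOMINMAX        // keep the min/max macros from breaking std::numeric_limits<>::max()
#include <windows.h>
#include <winioctl.h>   // FILE_ALLOCATED_RANGE_BUFFER
#include <algorithm>
#include <limits>
#include <stdexcept>
#include <utility>

// variant of std::unique whose predicate may merge an element into the last kept one
// p(kept, cur) returns true if cur must be dropped; while nothing is kept yet, kept and cur are the same object
template<class ForwardIt, class BinaryPredicate>
ForwardIt coalesce_neighbors(ForwardIt first, ForwardIt last, BinaryPredicate p)
{
    ForwardIt kept = last;                          // last element kept so far ('last' means none yet)
    for(ForwardIt i = first; i != last; ++i)
        if (!p(kept == last ? *i : *kept, *i))
        {
            if (first != i)
                *first = std::move(*i);
            kept = first;                           // the next candidate is compared against this element
            ++first;
        }
    return first;
}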


// for the given range set: sort by offset, check for validity, discard empty ranges, coalesce intersecting/adjacent ones
// returns the new end of the sanitized sequence
FILE_ALLOCATED_RANGE_BUFFER* sanitize_ranges_(FILE_ALLOCATED_RANGE_BUFFER* p, FILE_ALLOCATED_RANGE_BUFFER* p_end)
{
    auto ui = [](LARGE_INTEGER const& v){ return static_cast<ULONGLONG>(v.QuadPart); };

    std::sort(p, p_end, [=](auto& l, auto& r){ return ui(l.FileOffset) < ui(r.FileOffset); });  // sort ranges by offset

    return coalesce_neighbors(p, p_end, [=](auto& l, auto& r){
        if (std::numeric_limits<ULONGLONG>::max() - ui(r.FileOffset) < ui(r.Length))            // no wraparounds allowed
            throw std::logic_error("invalid range (wraparound)");

        if (ui(r.Length) == 0) return true;                                                     // discard empty ranges

        if (&l != &r && ui(l.FileOffset) + ui(l.Length) >= ui(r.FileOffset))                    // 'l.offset <= r.offset' is guaranteed due to sorting
        {
            ULONGLONG const l_end = ui(l.FileOffset) + ui(l.Length);
            ULONGLONG const r_end = ui(r.FileOffset) + ui(r.Length);
            if (r_end > l_end)                                                                  // r may lie entirely inside l
                l.Length.QuadPart = static_cast<LONGLONG>(r_end - ui(l.FileOffset));            // coalesce intersecting/adjacent ranges
            return true;                                                                        // ... and discard neighbor we ate
        }

        return false;
    });
}

There is 1 best solution below


Well, it is not much, but the usage of FSCTL_QUERY_ALLOCATED_RANGES in this file from a Microsoft repository seems to indicate that the answer to your second question is yes: the results are indeed ordered by FileOffset.

The query is made for the whole file, but if ERROR_MORE_DATA is returned, the query is repeated starting from the end of the last returned range. Resuming there would skip data unless the ranges were returned in ascending order.
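
The resume pattern described above can be sketched roughly as follows. This is a portable illustration, not the code from that repository: `AllocRange`, `QueryResult`, `QueryFn` and `query_all_ranges` are hypothetical names, and the `query` callback stands in for a DeviceIoControl call with FSCTL_QUERY_ALLOCATED_RANGES, with `more_data` playing the role of ERROR_MORE_DATA.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical portable stand-in for FILE_ALLOCATED_RANGE_BUFFER.
struct AllocRange { std::uint64_t offset; std::uint64_t length; };

// One query result: the ranges that fit in the output buffer, plus a flag
// meaning "the output was truncated" (the ERROR_MORE_DATA case).
struct QueryResult { std::vector<AllocRange> ranges; bool more_data; };

// Hypothetical wrapper around the real ioctl for the region [start, start + length).
using QueryFn = std::function<QueryResult(std::uint64_t start, std::uint64_t length)>;

std::vector<AllocRange> query_all_ranges(const QueryFn& query, std::uint64_t file_size)
{
    std::vector<AllocRange> all;
    std::uint64_t start = 0;
    while (start < file_size)
    {
        QueryResult r = query(start, file_size - start);
        all.insert(all.end(), r.ranges.begin(), r.ranges.end());
        if (!r.more_data || r.ranges.empty()) break;       // complete, or nothing to resume from

        const AllocRange& last = r.ranges.back();
        const std::uint64_t next = last.offset + last.length;
        if (next <= start) break;                          // no forward progress: give up
        start = next;                                      // resume after the last returned range
    }
    return all;
}
```

Note that the loop is only correct if each batch is ordered by offset, which is exactly the assumption the code in that repository appears to rely on.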