I very often find myself wanting to filter a vector based on its index rather than its values.

auto some_values = std::vector{1, 0, 4, 6, 2};

// Somewhere I figure out which items to remove.
// This may be from user interaction or something
// nothing to do with the original values:
const auto removing = std::vector<bool>{0, 0, 1, 0, 1};

So, I'm tempted to use erase_if like so:

std::erase_if(some_values, [it = removing.begin()](auto) mutable {return *it++;});

It seems to works for gcc and clang. However, there doesn't seem to be anything on the std::erase cppref page about the order in which the predicate is called, so I assume this is undefined behaviour?

Same issue with std::remove_if. Note that zipping the ranges won't work for most zipping options as usually the resultant range can't resize the underlying data.

Using a for loop and creating a copy of the array isn't too much boilerplate, but I'm currently applying filters to some low-ish level code where I can't just copy all the data. Some of the data will be huge and needs to be responsive during these filter operations.

Worst-case scenario I can add functions like this to solve the problem:

template <class T, auto N, class Pred>
size_t RemoveIfIndex(std::span<T, N> span, Pred&& pred)
{
  size_t write = 0; // Also behaves as a counter
  
  // Initialise to the first deleted index:
  for (; (write < span.size()) && !pred(write); ++write);
  
  // Shuffle the elements:
  for (size_t read = write + 1; read < span.size(); ++read)
  {
    if (!pred(read))
    {
      span[write] = span[read];
      ++write;
    }
  }
  return write;
}

template <class T, class Alloc, class Pred>
size_t EraseIfIndex(std::vector<T, Alloc>& vec, Pred&& pred)
{
  const size_t new_size = RemoveIfIndex(std::span{vec.begin(), vec.end()}, pred);
  vec.resize(new_size);
  return new_size;
}

But I'm hesitant to add more code to our low-level library when there's similar categories of functions I'll need to add (that need index info), and I suspect there's some insight into ranges/adaptors/filters that I'm missing that'd make these functions unnecessary.

4

There are 4 best solutions below

2
Pepijn Kramer On

== warning incorrect example, left for educational purpose == (remove_if moves elements around in memory so indices will not line up during iteration)

Why worst case? Having a working function is a good way to work. It is not always about the minimum lines of code. I would create a function anyway to check for boundary conditions (e.g. vector sizes should match)

#include <algorithm>
#include <iostream>
#include <vector>
#include <stdexcept>

void remove_by_index(std::vector<int>& values, const std::vector<bool>& filter)
{
    if (values.size() != filter.size()) throw std::invalid_argument("");
    std::size_t index{ 0 };
    auto it = std::remove_if(values.begin(), values.end(), [&](const auto) { return filter[index++]; });
    values.erase(it, values.end());
}

int main()
{
    auto some_values = std::vector{ 1, 0, 4, 6, 2 };
    const auto remove = std::vector<bool>{ 0, 0, 1, 0, 1 };

    remove_by_index(some_values, remove);

    for (const auto value : some_values)
    {
        std::cout << value << " ";
    }

    return 0;
}
0
Jerry Coffin On

It may or may not qualify as what you think of "STL", but std::valarray supports mask arrays like this a little more generally, so you can do something like this:

#include <valarray>
#include <iostream>

int main() {
    std::valarray<int> values { 1, 2, 3, 4, 5 };
    std::valarray<bool> mask  { 0, 1, 1, 0, 1 };

    // zero all the values where there's a 1 in the mask array
    values[mask] = 0;

    // sum where there were zeros in the mask array:
    std::cout << values.sum(); // result: 5 (1+4)
}

This can also work with the contents of the array. For example, you can specify a condition (where the name of the valarray acts as kind of a placeholder):

#include <valarray>
#include <iostream>

int main() {
    std::valarray<int> values { 1, 2, 3, 4, 5 };

    values[values > 3] = 0;
    // values = { 1, 2, 3, 0, 0 };
    std::cout << values.sum(); // 6 (1+2+3)
}

So it's not usually for removing elements like you're talking about, but depending on the situation, you may be able to get a similar effect, basically ignoring the elements you don't need/want.

Disclaimer: mostly posting this as something interesting that few people know about. Honestly, it's open to a lot of question how useful it really is--valarray can do some kind of cool things, but it also has some rather strange limitations. For example, from looking at the preceding code, it might initially seem like you should be able to do something like this:

#include <valarray>
#include <iostream>

int main() {
    std::valarray<int> values { 1, 2, 3, 4, 5 };
    std::valarray<bool> mask  { 0, 1, 1, 0, 1 };

    std::cout << values[mask].sum();
}

...but no, that won't work--it'll normally give an error to the effect that std::mask_array doesn't have a member function named sum().

There are pretty good reasons valarray is so rarely used.

0
463035818_is_not_an_ai On

You are right that your erase_if call is not guaranteed to work.

Depending on what you need the filtered elements for, it might be viable to not erase them but rather build a table to look them up in the original vector. Something along the line of ...

#include <vector>
#include <iostream>

template <typename T>
struct filtered_vector {
    std::vector<T>& data;
    std::vector<size_t> indices;
    size_t s;
    filtered_vector(std::vector<T>& data,const std::vector<bool>& removing) : data(data) {
        size_t count = 0;
        for (const auto& r : removing) {
            if (!r) indices.push_back(count);
            ++count;
        }
    }
    T& operator[](size_t index) {
        return data[indices[index]];
    }
    size_t size() { return indices.size();}
};

int main() {
    auto some_values = std::vector{1, 0, 4, 6, 2};
    const auto removing = std::vector<bool>{0, 0, 1, 0, 1};
    filtered_vector<int> f{some_values,removing};
    for (size_t i=0;i<f.size();++i) std::cout << f[i] << " ";
}

As a sidenote, if you do erase and insert elements often but at the same time require stable references to other elements, it might be worth to consider std::list with stable iterators.

2
RyanCu On

Simple C++

std::erase_if(some_values, 
              [&removing, beginPtr = &*some_values.cbegin()](const int& i) 
              {
                return removing[&i - beginPtr];
              });

Using boost

#include <boost/tuple/tuple.hpp>
#include <boost/range/combine.hpp>
#include <boost/range/algorithm/remove_if.hpp>

const auto it = boost::remove_if(
    boost::combine(some_values, removing),
    [](const auto& Elem)
    {
        return boost::get<1>(Elem);
    });
some_values.erase(
    boost::get<0>(it.get_iterator_tuple()),
    some_values.end());

Using std::ranges

See https://stackoverflow.com/a/67845735/266742