How many std::future objects can exist in a system at the same time?


I want to hash a stream of input messages on multiple threads, so I tried using a std::vector<std::future<HashData>> futures; but I am not sure how many future objects can exist in a system at the same time.

std::vector<std::future<HashData>> futures;
std::vector<std::string> messages;

for (int i = 0; i < messages.size(); i++)
{
  std::promise<HashData> promiseHashData;
  std::future<HashData> futureHashData = promiseHashData.get_future();
  futures.emplace_back(std::move(futureHashData));
  std::async(std::launch::async, [&]() {PerformHash(std::move(promiseHashData), messages[i]);});
}

std::vector<HashData> vectorOfHashData;
// wait for  all async tasks to complete
for (auto& futureObj : futures)
{
  vectorOfHashData.push_back(futureObj.get());
}

Is there a limit on how many future objects a system can create (similar to how a system can reach thread saturation if new threads keep being created while existing ones are never destroyed)? I will be calling PerformHash() asynchronously for a large number of messages.

I have recently been exploring concurrency in C++ and want to improve the performance of the hashing task. This idea came to mind, but I am not sure whether it will work. Am I missing something here?

Answered by Yakk - Adam Nevraumont

The problem isn't going to be "how many futures can a vector hold"; futures (on most systems) are just a shared pointer to a block of memory with some cheap concurrency primitives in it.

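The problem is that you create a thread per future and then block forward progress until that thread finishes. Once you fix that, your code has a second problem: it uses dangling references. The lambda captures promiseHashData and i by reference, and promiseHashData is destroyed at the end of each loop iteration while i keeps changing.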

std::vector<std::future<HashData>> futures;
std::vector<std::string> messages;

for (int i = 0; i < messages.size(); i++)
{
  std::promise<HashData> promiseHashData;
  std::future<HashData> futureHashData = promiseHashData.get_future();
  futures.emplace_back(std::move(futureHashData));
  // This captures promiseHashData (and i) by reference.
  // It also creates a thread, then blocks until the
  // thread finishes.
  std::async(std::launch::async, [&]() {PerformHash(std::move(promiseHashData), messages[i]);});
}

So a few points:

  1. Unless the hash data is worth consuming in small pieces, a future<vector<HashData>> is going to be more efficient.

  2. If you want a vector<future>, you'll also want a vector<promise>. Then create a bounded number of threads (or get them from a pool you write) and fulfill those promises.
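Point 1 could be sketched roughly as follows. This is only an illustration: HashData's layout, the ComputeHash helper, and the HashBatchAsync name are assumptions made for the example (the question does not show them), and std::hash merely stands in for the real hashing work.

```cpp
#include <cstddef>
#include <functional>
#include <future>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the question's HashData type.
struct HashData {
    std::size_t value;
};

// Hypothetical stand-in for the real hashing work. std::hash is not a
// cryptographic hash; it only keeps the sketch runnable.
HashData ComputeHash(const std::string& message) {
    return HashData{std::hash<std::string>{}(message)};
}

// Launches one background task that hashes the whole batch and hands back
// a single future for all results. The messages are moved into the task,
// so the task never refers to anything owned by the caller.
std::future<std::vector<HashData>> HashBatchAsync(std::vector<std::string> messages) {
    return std::async(std::launch::async,
                      [batch = std::move(messages)] {
                          std::vector<HashData> out;
                          out.reserve(batch.size());
                          for (const auto& m : batch) {
                              out.push_back(ComputeHash(m));
                          }
                          return out;
                      });
}
```

The caller keeps the returned future and calls get() once, when it actually needs the hashes. That is one shared state and one thread regardless of how many messages are in the batch.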

Creating an unbounded number of futures, then creating an unbounded number of threads to service those futures, is a bad plan.
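A bounded alternative might look like the sketch below: one promise/future pair per message, but only a fixed number of worker threads that claim messages through an atomic index. The HashAll function, the max_workers parameter, HashData's layout, and ComputeHash are all hypothetical names chosen for this example, and std::hash again stands in for the real hash.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <exception>
#include <functional>
#include <future>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the question's HashData type.
struct HashData {
    std::size_t value;
};

// Hypothetical stand-in for the real hashing work (not cryptographic).
HashData ComputeHash(const std::string& message) {
    return HashData{std::hash<std::string>{}(message)};
}

// Hashes every message on at most max_workers threads and returns the
// results in input order.
std::vector<HashData> HashAll(const std::vector<std::string>& messages,
                              unsigned max_workers) {
    std::vector<std::promise<HashData>> promises(messages.size());
    std::vector<std::future<HashData>> futures;
    futures.reserve(messages.size());
    for (auto& p : promises) {
        futures.push_back(p.get_future());
    }

    std::atomic<std::size_t> next{0};  // index of the next unclaimed message
    auto worker = [&] {
        for (;;) {
            const std::size_t i = next.fetch_add(1);
            if (i >= messages.size()) {
                return;  // nothing left to claim
            }
            try {
                promises[i].set_value(ComputeHash(messages[i]));
            } catch (...) {
                // Forward a failure to whoever calls get() on future i.
                promises[i].set_exception(std::current_exception());
            }
        }
    };

    const std::size_t worker_count =
        std::min<std::size_t>(std::max(1u, max_workers), messages.size());
    std::vector<std::thread> workers;
    workers.reserve(worker_count);
    for (std::size_t w = 0; w < worker_count; ++w) {
        workers.emplace_back(worker);
    }

    // Joining before get() keeps the sketch simple: no thread is still
    // joinable if get() rethrows. A real consumer could instead read early
    // futures while later messages are still being hashed.
    for (auto& t : workers) {
        t.join();
    }

    std::vector<HashData> results;
    results.reserve(futures.size());
    for (auto& f : futures) {
        results.push_back(f.get());
    }
    return results;
}
```

Capturing by reference is safe here only because HashAll joins every worker before its locals go out of scope. A call such as HashAll(messages, std::thread::hardware_concurrency()) is one plausible way to pick the bound.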

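Finally, std::async is funny in that it returns a std::future itself. When that future is destroyed, it blocks on the completion of the thread it creates. In your loop the returned future is a temporary, so it is destroyed, and blocks, at the end of each std::async statement. This is atypical behavior, but it prevents losing track of a thread of execution.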
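The blocking destructor can be observed with a small experiment. The sketch below uses a hypothetical OverlapProbe helper (not part of the question) that records how many tasks are running at the same moment. When each future from std::async is dropped right away, the peak stays at 1 because every launch waits for the previous task; when the futures are kept in a vector, the tasks are free to overlap.

```cpp
#include <atomic>
#include <chrono>
#include <future>
#include <thread>
#include <vector>

// Hypothetical helper: counts how many tasks overlap in time.
struct OverlapProbe {
    std::atomic<int> running{0};
    std::atomic<int> peak{0};

    void Task() {
        const int now = running.fetch_add(1) + 1;
        // Raise peak to `now` unless another task already recorded more.
        int seen = peak.load();
        while (seen < now && !peak.compare_exchange_weak(seen, now)) {
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(20));
        running.fetch_sub(1);
    }
};

// Drops each future at the end of the loop body. Its destructor waits for
// the task, so the next task cannot start until this one has finished.
int PeakWhenFutureIsDropped(int tasks) {
    OverlapProbe probe;
    for (int i = 0; i < tasks; ++i) {
        auto dropped = std::async(std::launch::async, [&probe] { probe.Task(); });
    }
    return probe.peak.load();
}

// Keeps every future alive until all tasks are launched, then waits.
int PeakWhenFuturesAreKept(int tasks) {
    OverlapProbe probe;
    std::vector<std::future<void>> pending;
    pending.reserve(tasks);
    for (int i = 0; i < tasks; ++i) {
        pending.push_back(std::async(std::launch::async, [&probe] { probe.Task(); }));
    }
    for (auto& f : pending) {
        f.get();
    }
    return probe.peak.load();
}
```

PeakWhenFutureIsDropped always reports 1 for any positive task count. PeakWhenFuturesAreKept usually reports more than 1, though the exact value depends on scheduling. Keeping the futures removes the accidental serialization, but it still starts one thread per task, so it does not address the unbounded-threads concern.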