Is there an optimal batch size for arc4random_buf?


I need billions of random bytes from arc4random_buf, and my strategy is to request X random bytes at a time, and repeat this many times.

My question is how large should X be. Since the nbytes argument to arc4random_buf can be arbitrarily large, I suppose there must be some kind of internal loop that generates some entropy each time its body is executed. Say, if X is a multiple of the number of random bytes generated each iteration, the performance can be improved because I’m not wasting any entropy.

I’m on macOS, which is unfortunately closed-source, so I cannot simply read the source code. Is there any portable way to determine the optimal X?

Answer by rici

Benchmarking on your typical target systems is probably the best way to figure this out. That said, having looked at a couple of implementations, it seems unlikely that the buffer size will make much difference to the cost of arc4random_buf.
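A benchmark along those lines could look roughly like the sketch below. The names ns_per_byte, fill_fn and baseline_fill are my own, not part of any library; the helper takes the filler as a function pointer so that arc4random_buf (whose signature is void arc4random_buf(void *, size_t)) can be compared against a plain memset baseline. It measures CPU time with the standard clock() function, which is coarse, so use a large total.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Signature shared by arc4random_buf and any stand-in filler. */
typedef void (*fill_fn)(void *buf, size_t nbytes);

/* Baseline filler (hypothetical helper): shows the cost of the loop and
 * the memory writes alone, with no random generation at all. */
void baseline_fill(void *buf, size_t nbytes)
{
    memset(buf, 0, nbytes);
}

/* Hypothetical helper: fills `total` bytes, `chunk` bytes at a time, using
 * `fill`, and returns the average CPU cost in nanoseconds per byte.
 * Returns -1.0 if chunk or total is 0 or if allocation fails. */
double ns_per_byte(fill_fn fill, size_t chunk, size_t total)
{
    if (chunk == 0 || total == 0)
        return -1.0;
    unsigned char *buf = malloc(chunk);
    if (buf == NULL)
        return -1.0;

    size_t done = 0;
    clock_t t0 = clock();
    while (done < total) {
        size_t left = total - done;
        size_t n = left < chunk ? left : chunk;  /* last chunk may be short */
        fill(buf, n);
        done += n;
    }
    clock_t t1 = clock();

    /* Read one byte through a volatile so the fills are not optimised away. */
    volatile unsigned char sink = buf[0];
    (void)sink;
    free(buf);

    double seconds = (double)(t1 - t0) / (double)CLOCKS_PER_SEC;
    return seconds * 1e9 / (double)total;
}

/* Possible usage, on a system whose <stdlib.h> declares arc4random_buf:
 *
 *   for (size_t c = 16; c <= ((size_t)1 << 20); c *= 4)
 *       printf("%zu bytes: %.3f ns/byte\n", c,
 *              ns_per_byte(arc4random_buf, c, (size_t)1 << 28));
 */
```

If the numbers flatten out beyond some chunk size, anything at or above that size is as good as any other on that system.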

The original implementation defines arc4random_buf as a simple loop around a function that generates one byte. As long as the buffer is big enough to avoid excessive call overhead, the size should make little difference.
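To make the shape of that design concrete, here is a toy version built on textbook RC4. It is only an illustration of the "loop around a one-byte generator" structure, not the library's actual source: the names rc4_state, rc4_init, rc4_getbyte and toy_fill are mine, the key is supplied by the caller rather than gathered from the system, and RC4 is no longer considered secure, so do not use this for anything real.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy RC4 state; illustrative only. */
struct rc4_state {
    uint8_t s[256];
    uint8_t i, j;
};

/* Standard RC4 key schedule. keylen must be at least 1. */
void rc4_init(struct rc4_state *st, const uint8_t *key, size_t keylen)
{
    for (int k = 0; k < 256; k++)
        st->s[k] = (uint8_t)k;
    uint8_t j = 0;
    for (int k = 0; k < 256; k++) {
        j = (uint8_t)(j + st->s[k] + key[(size_t)k % keylen]);
        uint8_t tmp = st->s[k];
        st->s[k] = st->s[j];
        st->s[j] = tmp;
    }
    st->i = 0;
    st->j = 0;
}

/* Produces exactly one keystream byte per call. */
uint8_t rc4_getbyte(struct rc4_state *st)
{
    st->i = (uint8_t)(st->i + 1);
    st->j = (uint8_t)(st->j + st->s[st->i]);
    uint8_t tmp = st->s[st->i];
    st->s[st->i] = st->s[st->j];
    st->s[st->j] = tmp;
    return st->s[(uint8_t)(st->s[st->i] + st->s[st->j])];
}

/* Fills a buffer by calling the one-byte generator n times. The work per
 * byte is identical whether n is 1 or 1,000,000. */
void toy_fill(struct rc4_state *st, void *buf, size_t n)
{
    uint8_t *p = buf;
    while (n--)
        *p++ = rc4_getbyte(st);
}
```

With this structure, two requests of 2 bytes produce the same bytes, at the same generation cost, as one request of 4 bytes; only the per-call overhead differs.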

The FreeBSD library implementation appears to optimise by periodically computing about 1K of random bytes at a time. arc4random_buf then uses memcpy to copy bytes from that internal buffer into the user's buffer.

For the FreeBSD implementation, the optimal buffer size would be the amount of data available in the internal buffer, because that minimizes the number of calls to memcpy. However, there's no way to know how much that is, and it will not be the same on every call because of the rekeying algorithm.
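The buffered design can be sketched as follows. This is a simplified model, not FreeBSD's code: POOL_SIZE is an assumed value taken from the "about 1K" figure above, the names pool, pool_refill and pool_read are hypothetical, and the refill step is a counter pattern standing in for the real keystream computation (so the output is NOT random). The copies and refills counters exist only to show how the request size affects the number of memcpy calls.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POOL_SIZE 1024  /* assumed internal buffer size */

struct pool {
    uint8_t data[POOL_SIZE];
    size_t avail;       /* unread bytes left at the end of data[] */
    uint64_t counter;   /* stand-in generator state; NOT random */
    unsigned refills;   /* how many times the pool was regenerated */
    unsigned copies;    /* how many memcpy calls were made */
};

/* Stand-in for the expensive step: regenerate the whole pool at once. */
static void pool_refill(struct pool *p)
{
    for (size_t k = 0; k < POOL_SIZE; k++)
        p->data[k] = (uint8_t)(p->counter++);
    p->avail = POOL_SIZE;
    p->refills++;
}

/* Copies n bytes to buf, refilling the pool whenever it runs dry. */
void pool_read(struct pool *p, void *buf, size_t n)
{
    uint8_t *out = buf;
    while (n > 0) {
        if (p->avail == 0)
            pool_refill(p);
        size_t take = n < p->avail ? n : p->avail;
        memcpy(out, p->data + (POOL_SIZE - p->avail), take);
        p->avail -= take;
        p->copies++;
        out += take;
        n -= take;
    }
}
```

In this model a single 1024-byte request on a fresh pool costs one refill and one memcpy, while two 1000-byte requests cost two refills and three memcpy calls, because the second request straddles a refill. The extra memcpy is cheap next to generating the bytes, which is why the effect is small.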

My guess is that you will find very little difference between buffer sizes above, say, 16K, and the threshold is probably even lower than that. For the FreeBSD implementation, it will be very slightly more efficient if your buffer size is a multiple of 8.


Addendum: All the implementations I know of have a global rekey threshold, so you cannot influence the cost of rekeying by changing the buffer size passed to arc4random_buf. The library simply rekeys after a fixed number of bytes has been generated, regardless of how those bytes were requested.
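A small model of that behaviour, under stated assumptions: REKEY_THRESHOLD is a made-up value chosen for illustration (real libraries use their own, and some also rekey on other events), and gen, gen_consume and rekeys_for are hypothetical names. It only counts bytes; nothing is actually generated.

```c
#include <stddef.h>

#define REKEY_THRESHOLD 4096  /* hypothetical; not any library's real value */

struct gen {
    size_t until_rekey;  /* bytes that may still be produced before rekeying */
    unsigned rekeys;     /* rekey events so far, including the initial keying */
};

/* Accounts for n generated bytes, rekeying whenever the budget is used up. */
void gen_consume(struct gen *g, size_t n)
{
    while (n > 0) {
        if (g->until_rekey == 0) {
            g->rekeys++;
            g->until_rekey = REKEY_THRESHOLD;
        }
        size_t take = n < g->until_rekey ? n : g->until_rekey;
        g->until_rekey -= take;
        n -= take;
    }
}

/* Returns how many rekeys happen when `total` bytes are requested
 * `chunk` bytes at a time. Returns 0 if chunk is 0. */
unsigned rekeys_for(size_t total, size_t chunk)
{
    struct gen g = { 0, 0 };
    if (chunk == 0)
        return 0;
    while (total > 0) {
        size_t n = total < chunk ? total : chunk;
        gen_consume(&g, n);
        total -= n;
    }
    return g.rekeys;
}
```

Because the budget is global, rekeys_for gives the same count for a given total whether the chunk size is 7 bytes or 1 MiB.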