I need billions of random bytes from arc4random_buf, and my strategy is to request X random bytes at a time, and repeat this many times.
My question is how large should X be. Since the nbytes argument to arc4random_buf can be arbitrarily large, I suppose there must be some kind of internal loop that generates some entropy each time its body is executed. Say, if X is a multiple of the number of random bytes generated each iteration, the performance can be improved because I’m not wasting any entropy.
I’m on macOS, which is unfortunately closed-source, so I cannot simply read the source code. Is there any portable way to determine the optimal X?
Doing some benchmarks on typical target systems is probably the best way to figure this out, but looking at a couple of implementations, it seems unlikely that the buffer size will make much difference to the cost of
arc4random_buffer.The original implementation implements
arc4random_bufferas a simple loop around a function which generates one byte. As long as the buffer is big enough to avoid excessive call overhead, it should make little difference.The FreeBSD library implementation appears to attempt to optimise by periodically computing about 1K of random bytes. Then
arc4random_bufferusesmemcpyto copy the bytes from the internal buffer to the user buffer.For the FreeBSD implementation, the optimal buffer size would be the amount of data available in the internal buffer, because that minimizes the number of calls to
memcpy. However, there's no way to know how much that is, and it will not be the same on every call because of the rekeying algorithm.My guess is that you will find very little difference between buffer sizes greater than, say, 16K, and probably even less. For the FreeBSD implementation, it will be very slightly more efficient if your buffer size is a multiple of 8.
Addendum: All the implementations I know of have a global rekey threshold, so you cannot influence the cost of rekeying by changing the buffer size in
arc4random_buffer. The library simply rekeys every X bytes generated.