How to randomly choose any number of elements from an array while reading it

I need to randomly pick a fixed number of elements from an array that is stored in a file. I want to read the file only once and keep just the picked elements, because the array can be very long and I don't want to hold it in memory. Every subarray of that size should be equally likely to be chosen. Also, I don't know the size of the array in advance.

How can I do it?

There are 2 best solutions below

0
On BEST ANSWER

You need something called Reservoir Sampling.

It's explained pretty well in this blog:

http://gregable.com/2007/10/reservoir-sampling.html
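
To make the idea concrete, here is a minimal Python sketch of the basic reservoir sampling scheme (often called Algorithm R). The function name `reservoir_sample` and its parameters are my own choices for illustration, not something taken from the linked post. It keeps the first k elements, then replaces a random slot with the i-th element (0-based) with probability k / (i + 1), so only k elements are ever held in memory and the total length never needs to be known.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(items: Iterable[T], k: int,
                     rng: Optional[random.Random] = None) -> List[T]:
    """Return k items chosen uniformly at random from a stream of unknown length.

    If the stream has fewer than k items, all of them are returned.
    """
    if rng is None:
        rng = random.Random()
    reservoir: List[T] = []
    for i, item in enumerate(items):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Pick a slot in [0, i]; the new item survives only if the
            # slot falls inside the reservoir (probability k / (i + 1)).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


# Hypothetical usage, assuming one element per line in "data.txt":
# with open("data.txt") as f:
#     picked = reservoir_sample((line.rstrip("\n") for line in f), 10)
```

Because the function only iterates once over `items`, it works equally well on a file object, a generator, or a network stream.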

1
On

If you don't care about the exact number of elements you pick, an easy solution is to read the file and keep each element independently with a fixed probability.
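
A rough sketch of that fixed-probability approach in Python, with `bernoulli_sample` and `p` being names I made up for the example. Note that the size of the result is random: on average it is p times the number of elements, but it varies from run to run.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def bernoulli_sample(items: Iterable[T], p: float,
                     rng: Optional[random.Random] = None) -> List[T]:
    """Keep each item independently with probability p (single pass)."""
    if rng is None:
        rng = random.Random()
    # rng.random() returns a float in [0.0, 1.0), so p=0 keeps nothing
    # and p=1 keeps everything.
    return [item for item in items if rng.random() < p]
```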

If you want an exact number and can find out how many elements the file contains before the sampling pass, you can compute the positions of the elements you want up front (as a list of integers), then read the file and keep only the elements at those positions.
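
When the element count is known beforehand, this could look like the following Python sketch. The name `sample_known_size` and the parameter `n` (the known number of elements) are assumptions made for illustration.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def sample_known_size(items: Iterable[T], n: int, k: int,
                      rng: Optional[random.Random] = None) -> List[T]:
    """Pick exactly k of the n items by choosing their indices in advance."""
    if rng is None:
        rng = random.Random()
    # random.sample draws k distinct indices; it raises ValueError if k > n.
    wanted = set(rng.sample(range(n), k))
    # Single pass over the data, keeping only the pre-selected positions.
    return [item for i, item in enumerate(items) if i in wanted]
```

The result preserves the original order of the elements, and only the k chosen indices and the k kept elements are held in memory.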