Ascending Integers in TXT to Array

90 Views Asked by At

my problem is to get huge Text Files (UTF-8 -1byte (ANSI)) containing unsigned Integers without duplicates in Ascending Order into an Array. FAST! So I was going for something like:

while(scan.hasNextInt()) x.add(scan.nextInt());

But whether i go with an ArrayList, Vectors or a plain Array with Files containing millions of Integers it would be wise to determine the maximum Capacity needed to avoid increasing the array size later.

With File.length() i will get the amount of digits + Line Feeds in the File.

In the worst Case it would start at 0 and in each line only increment by 1.
I think somehow the max. capacity is calculable using combinatorics, but I am at a dead end. The fact that smaller Numbers don't get filled with Zeros (002) somehow throws me off.

Taking the size of the first Int into consideration i think one might also be able to approximate a little further to the real amount.

So my most important question is to calculate an approximated [in O(1)]maximum Capacity needed.

In addition I am asking my self if scan.hasNextInt() and scan.nextInt() are the fastest considering this rather unique problem and if parallelization via Threads could speed up the process even more (considering the features of reading from a Hard Drive probably not).

regards Halo

1

There are 1 best solutions below

3
On

Assuming there is only one byte used to separate two numbers (eg. a '\n') we have

  • 10 numbers with 1 digit -> 20 bytes
  • 90 numbers with 2 digits -> 270 bytes
  • 900 numbers with 3 digits -> 3600 bytes
  • ... you get the pattern

If your file size is now 1000 bytes, the max you can have is the 10 1 digits, the 90 two digits, with 710 bytes left for 3 digit numbers. 710/4 = 177.5, which makes at most 10+90+177 = 277 numbers.