Java: parse text into primitive numbers without object instantiation


Using Java, I read text files that contain numbers. There are terabytes of data and hundreds of billions of numbers.

The goal is to fetch the data as fast as possible, and minimize GC activity. I want to parse text directly into primitives (double, float, int).

By directly I mean:

  • without instantiating any transient helper object
  • without boxing data in java.lang.Double, java.lang.Float...
  • without creating transient java.lang.String instances (a mandatory step if you are to call JDK Double.parseDouble(...))

So far I have been using the Javolution framework:

double javolution.text.TypeFormat.parseDouble(CharSequence sequence);

I looked at the Javolution code, and it really does not allocate any transient objects. Because it accepts a CharSequence, you can pass it the characters decoded from the data files without instantiating transient Strings.
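
To illustrate the CharSequence approach, here is a minimal sketch of a reusable view over a decoded char[] buffer. The names CharSlice, CharSliceDemo and sumInts are hypothetical and not part of Javolution or the JDK. The sketch assumes Java 9 or later, where Integer.parseInt(CharSequence, int, int, int) exists; as far as I can tell that overload does not allocate on the success path, but that is worth confirming with a profiler.

```java
// Hypothetical reusable view: a single instance is re-pointed at each token,
// so no String (and no new view) is created per number.
final class CharSlice implements CharSequence {
    private char[] buf;
    private int off;
    private int len;

    CharSlice reset(char[] buf, int off, int len) {
        this.buf = buf;
        this.off = off;
        this.len = len;
        return this;
    }

    @Override public int length() { return len; }

    @Override public char charAt(int index) {
        if (index < 0 || index >= len) throw new IndexOutOfBoundsException();
        return buf[off + index];
    }

    // These two allocate; they exist only to satisfy the interface
    // and are not called in the parsing loop below.
    @Override public CharSequence subSequence(int start, int end) {
        if (start < 0 || end > len || start > end) throw new IndexOutOfBoundsException();
        return new CharSlice().reset(buf, off + start, end - start);
    }

    @Override public String toString() { return new String(buf, off, len); }
}

final class CharSliceDemo {
    /** Sums whitespace-separated decimal ints found in chars[0, count). */
    static long sumInts(char[] chars, int count) {
        CharSlice slice = new CharSlice(); // allocated once, reused for every token
        long sum = 0;
        int i = 0;
        while (i < count) {
            while (i < count && chars[i] <= ' ') i++;  // skip whitespace
            int start = i;
            while (i < count && chars[i] > ' ') i++;   // find the end of the token
            if (i > start) {
                slice.reset(chars, start, i - start);
                // Java 9+ overload that reads digits straight from a CharSequence.
                sum += Integer.parseInt(slice, 0, slice.length(), 10);
            }
        }
        return sum;
    }
}
```

The same slice could be handed to any parser that accepts a CharSequence, such as the TypeFormat.parseDouble method quoted above.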

Are there alternatives or better ways?


There are 2 best solutions below


The method Double.parseDouble(String) does instantiate helper objects under the hood while it parses. It returns a primitive double read from the string, though, not a boxed Double.
This answer offers more details.

As for the rest: the Javolution package appears to be written for real-time performance, so it looks like a suitable choice.
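
If you want to avoid both String and library code, one possible alternative is a hand-rolled parser that works directly on the raw bytes. The following is only a sketch under stated assumptions, not how Javolution or the JDK does it: the class name SimpleDoubleParser is hypothetical, the input is assumed to be ASCII, and it accepts only an optional sign, digits and one decimal point (no exponent, NaN or Infinity).

```java
final class SimpleDoubleParser {
    // Powers of ten that are exactly representable as double.
    private static final double[] POW10 = {
        1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6, 1e7,
        1e8, 1e9, 1e10, 1e11, 1e12, 1e13, 1e14, 1e15
    };

    /**
     * Parses [sign] digits [. digits] from ASCII bytes in buf[off, off + len).
     * Allocates nothing on the success path; exceptions allocate, but only on bad input.
     */
    static double parse(byte[] buf, int off, int len) {
        if (len <= 0) throw new NumberFormatException("empty input");
        int i = off;
        int end = off + len;
        boolean negative = false;
        byte c = buf[i];
        if (c == '-' || c == '+') {
            negative = (c == '-');
            i++;
        }
        long mantissa = 0;       // all digits, with the decimal point ignored
        int digits = 0;          // total digit count (leading zeros included)
        int fractionDigits = 0;  // digits after the decimal point
        boolean seenPoint = false;
        for (; i < end; i++) {
            c = buf[i];
            if (c >= '0' && c <= '9') {
                // At most 15 digits keeps the mantissa below 2^53, so it converts exactly.
                if (digits == 15) throw new NumberFormatException("more than 15 digits");
                mantissa = mantissa * 10 + (c - '0');
                digits++;
                if (seenPoint) fractionDigits++;
            } else if (c == '.' && !seenPoint) {
                seenPoint = true;
            } else {
                throw new NumberFormatException("unexpected byte at index " + i);
            }
        }
        if (digits == 0) throw new NumberFormatException("no digits");
        // Exact mantissa divided by an exact power of ten: a single rounding step.
        double value = mantissa / POW10[fractionDigits];
        return negative ? -value : value;
    }
}
```

The 15-digit limit is deliberate: 10^15 is less than 2^53, so both operands of the final division are exact and the result is rounded only once. Longer inputs or exponents need a slower fallback path, for which a full library parser remains the safer choice.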


StreamTokenizer, examined here, may be worth profiling. It parses decimal numbers as double but does not handle scientific notation.
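
A minimal sketch of reading numbers with StreamTokenizer, for anyone who wants to profile it. The class name TokenizerSum and the sum method are made up for illustration; the tokenizer is left in its default configuration, where number parsing is already enabled.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.UncheckedIOException;

final class TokenizerSum {
    /** Adds up every numeric token the tokenizer finds in the reader. */
    static double sum(Reader reader) {
        StreamTokenizer tokenizer = new StreamTokenizer(reader);
        double total = 0;
        try {
            while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
                if (tokenizer.ttype == StreamTokenizer.TT_NUMBER) {
                    total += tokenizer.nval; // nval is a primitive double field
                }
                // Other token types (words, punctuation) are skipped here.
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }
}
```

Numeric tokens arrive in the nval field without a String being built for them. Word tokens are a different matter: they are delivered as Strings in sval, so input such as 1e3 would, as far as I know, come back as the number 1 followed by a word token rather than as 1000.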