I have some log files compressed with LZO at compression level 7 and with gzip at the default level. My results are as follows:
Runtime of the same MapReduce job over each file:
- 1GB .gz file - 340 seconds
- 1GB .lzo file un-indexed - 410 seconds
- 1GB .lzo file indexed - 380 seconds
The only difference in the MapReduce job is that it uses the Hadoop-LZO library's LzoTextInputFormat class instead of the usual TextInputFormat class.
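For reference, here is roughly what that swap looks like in the job driver. This is a sketch, not my exact code: it assumes the twitter/hadoop-lzo jar is on the classpath (providing `com.hadoop.mapreduce.LzoTextInputFormat`), and the job name, mapper-free setup, and paths are illustrative placeholders.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import com.hadoop.mapreduce.LzoTextInputFormat; // from hadoop-lzo

public class LzoLogJobDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "lzo-log-job"); // illustrative name

        // The only change from the gzip version of the job: with a matching
        // .index file present, this input format makes .lzo files splittable,
        // so multiple map tasks can read one file in parallel.
        job.setInputFormatClass(LzoTextInputFormat.class); // was TextInputFormat.class

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```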
With the indexed file I see 37 map tasks, so the job is being split and the .index file is being used, but the performance still leaves a lot to be desired. Any ideas?