I have to analyze a huge log file for management reporting purposes.
The format of the log file is as follows:
[2014-08-28 08:49:40 GMT][Level:DEBUG] Connection from UGUBUKBBBHJGJ.mt.site (123.131.21.20) , user : 12345678 for compositeId : com.my.solution.name.abc
[2014-08-28 08:49:41 GMT][Level:DEBUG] Connection from TYIYIYPOYUUGG.mt.site (123.131.21.20) , user : 12345678 for compositeId : com.my.solution.name.def
[2014-08-29 05:55:21 GMT][Level:DEBUG] Connection from OJPPMMJOOHJIH.mt.site (123.131.22.33) , user : 12345678 for compositeId : com.my.solution.name.ghi
[2014-08-29 05:55:22 GMT][Level:DEBUG] Connection from HGJJKHKHKHKJH.mt.site (123.131.22.33) , user : 12345678 for compositeId : com.my.solution.name.jkl
I have replaced the actual values in the logs with dummy ones.
How can I split the log file so that each InputSplit contains the logs of a single date only, so that one mapper processes all the logs for that day?
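
To make the grouping key concrete, here is a minimal sketch of how the date could be read from each line. The names LogDate and extractDate are hypothetical, not part of Hadoop, and the pattern assumes every line starts with a timestamp in exactly the format shown in the sample.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a Hadoop class): pulls the date out of one log line.
final class LogDate {
    // Matches the leading "[yyyy-MM-dd HH:mm:ss GMT]" timestamp and captures the date part.
    private static final Pattern TIMESTAMP =
            Pattern.compile("^\\[(\\d{4}-\\d{2}-\\d{2}) \\d{2}:\\d{2}:\\d{2} GMT\\]");

    private LogDate() {}

    /** Returns the date (for example "2014-08-28"), or null if the line has no leading timestamp. */
    static String extractDate(String line) {
        Matcher m = TIMESTAMP.matcher(line);
        return m.find() ? m.group(1) : null;
    }
}
```

This only identifies which day a line belongs to; it does not by itself control where the split boundaries fall.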