How to design a simple log analyser in Java


I want to design a log analyser: I will give an application.log file to a Java program, which will parse it and capture a few fields such as time, IP address, status code (200/401/500 etc.) and request type (GET/POST/PUT etc.). I want to store these fields in a database so that I can use them later for analysis.

I have a few questions that I would like to understand better.

  1. Is there any utility in Java for parsing a log file?
  2. Should I use a SQL or a NoSQL database to store this data?
  3. How do I handle parsing a big file, e.g. 500 MB?
  4. Should I use multithreading to read the file faster?

A sample of the input file:

84.55.41.57 - - [16/Apr/2016:20:21:56 +0100] "GET /john/assets/js/skel.min.js HTTP/1.1" 200 3532 "http://www.example.com/john/index.php" "Mozilla/5.0 (Windows NT 6.0; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0"
84.55.41.57 - - [16/Apr/2016:20:21:56 +0100] "GET /john/images/pic01.jpg HTTP/1.1" 200 9501 "http://www.example.com/john/index.php" "Mozilla/5.0 (Windows NT 6.0; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0"
84.55.41.57 - - [16/Apr/2016:20:21:56 +0100] "GET /john/images/pic03.jpg HTTP/1.1" 200 5593 "http://www.example.com/john/index.php" "Mozilla/5.0 (Windows NT 6.0; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0"


There is 1 best solution below

  1. Is there any utility in Java for parsing a log file?

Yes. If you already know the format of the log, a simple regular expression (the java.util.regex.Pattern API) is enough to parse each line. If your use case is more complex, you can use an existing tool such as AWStats or Logstash.
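As a rough sketch of the regex approach, assuming every line follows the combined access-log layout of the sample in the question: the class name LogLineParser, the Entry holder and the exact pattern are illustrative choices, not part of any library, and the pattern would need adjusting for other formats.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: parses one access-log line into its main fields.
class LogLineParser {

    // Assumed layout: ip ident user [time] "METHOD path protocol" status size ...
    private static final Pattern COMBINED = Pattern.compile(
            "^(\\S+) \\S+ \\S+ \\[([^\\]]+)\\] \"(\\S+) (\\S+) [^\"]*\" (\\d{3}) (\\d+|-)");

    // Matches timestamps such as 16/Apr/2016:20:21:56 +0100
    private static final DateTimeFormatter TIMESTAMP =
            DateTimeFormatter.ofPattern("dd/MMM/yyyy:HH:mm:ss Z", Locale.ENGLISH);

    // Illustrative value holder for the captured fields.
    static final class Entry {
        final String ip;
        final OffsetDateTime time;
        final String method;
        final String path;
        final int status;

        Entry(String ip, OffsetDateTime time, String method, String path, int status) {
            this.ip = ip;
            this.time = time;
            this.method = method;
            this.path = path;
            this.status = status;
        }
    }

    // Returns an empty Optional when the line does not match the assumed layout.
    // A matching line with a malformed timestamp still throws DateTimeParseException.
    static Optional<Entry> parse(String line) {
        Matcher m = COMBINED.matcher(line);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new Entry(
                m.group(1),
                OffsetDateTime.parse(m.group(2), TIMESTAMP),
                m.group(3),
                m.group(4),
                Integer.parseInt(m.group(5))));
    }

    public static void main(String[] args) {
        String line = "84.55.41.57 - - [16/Apr/2016:20:21:56 +0100] "
                + "\"GET /john/images/pic01.jpg HTTP/1.1\" 200 9501 "
                + "\"http://www.example.com/john/index.php\" \"Mozilla/5.0\"";
        parse(line).ifPresent(e ->
                System.out.println(e.ip + " " + e.method + " " + e.path + " " + e.status));
    }
}
```

Compiling the pattern once as a static constant avoids recompiling it for every line, which matters when the file has millions of lines.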

  2. Should I use a SQL or a NoSQL database to store this data?

There is no single right answer. It depends on your use case (read/write pattern, replication needs, etc.).

  3. How do I handle parsing a big file, e.g. 500 MB?

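Java's FileInputStream can read files of any size, because it streams the data instead of loading the whole file into memory. Wrap it in a BufferedReader and process the file one line at a time, and a 500 MB file needs no special handling.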
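A minimal sketch of line-by-line streaming, assuming a UTF-8 encoded access log in the format shown in the question. The class name StreamingLogReader, the method countStatusCodes and the status-code pattern are made up for illustration; the point is only that a single line is held in memory at a time.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example: tallies HTTP status codes without loading the file into memory.
class StreamingLogReader {

    // Assumption: the status code is the first 3-digit number that follows a closing quote.
    private static final Pattern STATUS = Pattern.compile("\" (\\d{3}) ");

    static Map<Integer, Long> countStatusCodes(Path logFile) throws IOException {
        Map<Integer, Long> counts = new TreeMap<>();
        // try-with-resources closes the reader even if parsing fails midway.
        // Assumes UTF-8; bytes that are not valid UTF-8 make readLine() throw.
        try (BufferedReader reader = Files.newBufferedReader(logFile, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                Matcher m = STATUS.matcher(line);
                if (m.find()) {
                    counts.merge(Integer.parseInt(m.group(1)), 1L, Long::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // The file name is taken from the command line, defaulting to application.log.
        Path logFile = Paths.get(args.length > 0 ? args[0] : "application.log");
        countStatusCodes(logFile).forEach((status, n) -> System.out.println(status + ": " + n));
    }
}
```

Memory use stays roughly constant regardless of file size, since only the current line and the small map of counters are retained.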

  4. Should I use multithreading to read the file faster?

I don't think you can use multithreading to read a single file without splitting it first. However, you can run several threads that each parse a separate file.
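One possible shape for the one-thread-per-file idea, assuming the log is already split into several files. The class ParallelLogCounter and its methods are hypothetical, and counting lines stands in for whatever per-line parsing you actually need.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Stream;

// Hypothetical example: processes several log files in parallel, one task per file.
class ParallelLogCounter {

    // Placeholder for real parsing: streams one file and counts its lines.
    static long countLines(Path file) throws IOException {
        try (Stream<String> lines = Files.lines(file)) {
            return lines.count();
        }
    }

    static long countAll(List<Path> files, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (Path file : files) {
                Callable<Long> task = () -> countLines(file);
                results.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> result : results) {
                total += result.get(); // blocks until that file has been processed
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // File names are taken from the command line.
        List<Path> files = new ArrayList<>();
        for (String arg : args) {
            files.add(Paths.get(arg));
        }
        System.out.println("Total lines: " + countAll(files, 4));
    }
}
```

Each task keeps its own state and only the final totals are combined, so no locking is needed. Whether this is actually faster depends on the disk: if reading is the bottleneck rather than parsing, extra threads may not help.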