I'm using proc_open in PHP to call a Java application, pass it text to be processed, and read the output text. The Java execution time is quite long, and I found that reading the input takes most of it. I'm not sure whether the fault lies with PHP or with Java.
My PHP code:
$process_cmd = "java -Dfile.encoding=UTF-8 -jar test.jar";
$env = NULL;
$options = ["bypass_shell" => true];
$cwd = NULL;
$descriptorspec = [
    0 => ["pipe", "r"], // stdin is a pipe that the child will read from
    1 => ["pipe", "w"], // stdout is a pipe that the child will write to
    2 => ["file", "java.error", "a"]
];
$process = proc_open($process_cmd, $descriptorspec, $pipes, $cwd, $env, $options);
if (is_resource($process)) {
    // feeding text to java
    fwrite($pipes[0], $input);
    fclose($pipes[0]);
    // reading output text from java
    $output = stream_get_contents($pipes[1]);
    fclose($pipes[1]);
    $return_value = proc_close($process);
}
My Java code:
public static void main(String[] args) throws Exception {
    long start;
    long end;
    start = System.currentTimeMillis();
    BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
    String in;
    String input = "";
    br = new BufferedReader(new InputStreamReader(System.in));
    while ((in = br.readLine()) != null) {
        input += in + "\n";
    }
    end = System.currentTimeMillis();
    log("Input: " + Long.toString(end - start) + " ms");
    start = System.currentTimeMillis();
    org.jsoup.nodes.Document doc = Jsoup.parse(input);
    end = System.currentTimeMillis();
    log("Parser: " + Long.toString(end - start) + " ms");
    start = System.currentTimeMillis();
    System.out.print(doc);
    end = System.currentTimeMillis();
    log("Output: " + Long.toString(end - start) + " ms");
}
I'm passing the Java program an HTML file of 3800 lines (~200 KB as a standalone file). These are the execution times, broken down, from the log file:
Input: 1169 ms
Parser: 98 ms
Output: 12 ms
My question is this: why does input take 100 times longer than output? Is there a way to make it faster?
Inspect your read block in the Java program: try to use a StringBuilder to concat the data (instead of using += on a String). Details are covered here: Why using StringBuilder explicitly
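A minimal sketch of that read loop with a StringBuilder follows. The class name ReadInput and the readAll helper are made up for illustration, and System.err stands in for the question's log method, which is not shown. Whether this accounts for the whole 1169 ms is an assumption; the measured time also includes waiting for PHP to write into the pipe.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

public class ReadInput {
    // Reads every line from the reader and joins them with "\n".
    // StringBuilder appends in place, so the text read so far is not
    // copied again on every iteration as it is with String +=.
    static String readAll(Reader source) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        long start = System.currentTimeMillis();
        String input = readAll(new InputStreamReader(System.in, StandardCharsets.UTF_8));
        long end = System.currentTimeMillis();
        // System.err is a stand-in for the question's log(...) helper.
        System.err.println("Input: " + (end - start) + " ms, " + input.length() + " chars");
    }
}
```

As in the original loop, every line ends up terminated by "\n", including the last one.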
Generally speaking, to make it faster, consider using an application server (or a simple socket-based server) to keep a JVM running permanently. There is always some overhead when you start a JVM, and on top of that the JIT needs some time to optimize your code. This effort is lost when the JVM exits.
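A simple socket-based server could look roughly like the sketch below. Everything specific in it is an assumption: the class name ProcessingServer, port 9000, and the process placeholder, which stands in for the real work (the Jsoup parsing in the question). It handles one client at a time, has no error handling, and needs Java 9 or later for InputStream.readAllBytes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class ProcessingServer {
    // Hypothetical placeholder for the real processing step.
    static String process(String input) {
        return input.trim();
    }

    // Reads the whole request until end of stream, then writes the result.
    static void handle(InputStream in, OutputStream out) throws IOException {
        String request = new String(in.readAllBytes(), StandardCharsets.UTF_8);
        out.write(process(request).getBytes(StandardCharsets.UTF_8));
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        // Port 9000 is an arbitrary choice for this sketch.
        try (ServerSocket server = new ServerSocket(9000)) {
            while (true) {
                // The JVM stays up between requests, so startup and JIT
                // warm-up costs are paid once rather than per call.
                try (Socket client = server.accept()) {
                    handle(client.getInputStream(), client.getOutputStream());
                }
            }
        }
    }
}
```

Because handle reads until end of stream, the client has to close its write side after sending the text while keeping the read side open. From PHP that should be possible with stream_socket_client followed by stream_socket_shutdown($fp, STREAM_SHUT_WR) before reading the response.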
As for the PHP program: try feeding the Java program from the shell; just use cat to pipe the data (on a UNIX system like Linux). As an alternative, rewrite your Java program to accept a command-line parameter for the file as well. Then you can judge whether your PHP code pipes the data fast enough.
As for the Java program: if you do performance analysis, consider the recommendations in How do I write a correct micro-benchmark in Java
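To compare the pipe against a plain file read, the Java program could take an optional file argument and fall back to stdin when none is given. This is only a sketch: the class name InputSource and the readInput helper are invented for illustration, and InputStream.readAllBytes requires Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class InputSource {
    // Returns the file's content if a path argument is given,
    // otherwise everything available on the supplied stdin stream.
    static String readInput(String[] args, InputStream stdin) throws IOException {
        byte[] bytes = args.length > 0
                ? Files.readAllBytes(Paths.get(args[0]))
                : stdin.readAllBytes();
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        long start = System.nanoTime();
        String input = readInput(args, System.in);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        // Timing goes to stderr so it does not mix with the program's output.
        System.err.println("Input: " + elapsedMs + " ms (" + input.length() + " chars)");
    }
}
```

Running it once with a file name as the argument, once with the same file piped in via cat, and once from PHP gives three "Input" timings to compare. If the first two are fast and only the PHP run is slow, the delay is more likely on the PHP side of the pipe.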