Groovy vs Java file I/O comparison, and how to improve it using withWriter.
I learnt by making a mistake and hope to help someone with this post.
This weekend I tried to compare Java file I/O with Groovy on two fronts:
a. Code brevity.
b. Performance (time taken to execute the task).
The Groovy code was 34 lines and the Java code came out to 48. The Groovy code could be shorter, but I couldn't get multiple assignment working; it failed with a runtime error.
On the performance front Java was the clear winner: the Java code took 425 ms and the Groovy code took 37112 ms. I was expecting Groovy to be slower, but not by this margin! I was hoping I was doing something wrong.
Some code is omitted for brevity. You can download the source code from the GitHub repository.
final File outputFile = new File(/D:\code\IProjects\GroovyPlayground\BigFileInGroovy.txt/)
... //Code omitted for brevity
for (final int i in 1..200000) {
    ... //Code omitted for brevity
    String line = "${accountNumber}, ${customerNumber}, ${totalMin}, ${totalMinsUsed++}, ${totalMinsAvail}, ${price}, John Smith\n"
    if (i == 1) {
        outputFile.write(line)
    } else {
        outputFile.append(line)
    }
}
So I sent an email to the Groovy user group and got prompt replies. Thanks to the awesome Groovy community I learnt what I was doing wrong: I was calling the append method on the File object directly, so every one of the 200000 iterations opened the file, wrote a single line at the end, and closed the file again. Quoting Jochen Theodorou: "In your Java program you use BufferedWritter#append and in Groovy you use File#append. The big difference between those two methods is, that File#append, has to create a new reader, open the file, go to its end and there attach the line, just to close file and reader again. This takes a lot of time and burns away your performance."
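To make the cost concrete, here is a minimal Java sketch of the same open-write-close pattern. It is not the code from my comparison: the class name AppendPerLine, its methods, and the "record N" lines are made up for illustration, and it only assumes the standard java.nio.file.Files API. Every call to Files.write opens the file, writes one line, and closes it, which is roughly what each File#append call was doing in my Groovy loop.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration of the slow pattern: one open/close per line.
class AppendPerLine {

    // Writes `count` lines, reopening the file for every single line.
    static void writeLines(Path file, int count) throws IOException {
        for (int i = 1; i <= count; i++) {
            byte[] bytes = ("record " + i + "\n").getBytes(StandardCharsets.UTF_8);
            if (i == 1) {
                // First line: create or truncate the file (like File#write in Groovy).
                Files.write(file, bytes);
            } else {
                // Later lines: open in append mode, write, close (like File#append).
                Files.write(file, bytes, StandardOpenOption.APPEND);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("append-per-line", ".txt");
        long start = System.nanoTime();
        writeLines(file, 20000);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("Wrote " + Files.readAllLines(file).size()
                + " lines in " + elapsedMs + " ms");
        Files.delete(file);
    }
}
```

The exact timing depends on the machine and file system, so treat the printed number only as a rough indication of how expensive the repeated open and close is.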
So the trick to improve performance was to use withWriter on the File object, which provides a buffered writer for the file within the scope of the closure, so the file is opened once and closed once. The loop now lives inside the closure:
outputFile.withWriter('UTF-8') { writer ->
    for (final int i in 1..200000) {
        ... //Code omitted for brevity
        String line = "${accountNumber}, ${customerNumber}, ${totalMin}, ${totalMinsUsed++}, ${totalMinsAvail}, ${price}, John Smith\n"
        writer << line
    }
}
Now the execution time is 671 ms. That is about 55 times faster than the previous 37112 ms, and much closer to Java's 425 ms.
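For readers coming from the Java side, the closest counterpart to withWriter that I know of is a BufferedWriter inside a try-with-resources block. The sketch below is an assumption about how one might write it, not the Java code from my comparison: the class name BufferedOnce, its method, and the "record N" lines are hypothetical. The file is opened once, all lines go through the buffer, and the writer is flushed and closed automatically when the block ends.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration of the fast pattern: one buffered writer for all lines.
class BufferedOnce {

    // Writes `count` lines through a single BufferedWriter.
    static void writeLines(Path file, int count) throws IOException {
        // newBufferedWriter creates the file or truncates an existing one by default.
        try (BufferedWriter writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            for (int i = 1; i <= count; i++) {
                writer.append("record ").append(Integer.toString(i)).append('\n');
            }
        } // the writer is flushed and closed here, even if an exception is thrown
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("buffered-once", ".txt");
        long start = System.nanoTime();
        writeLines(file, 200000);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("Wrote " + Files.readAllLines(file).size()
                + " lines in " + elapsedMs + " ms");
        Files.delete(file);
    }
}
```

Like the Groovy closure, the try block limits the writer's lifetime to the code that uses it, so there is no explicit close call to forget.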
Big thanks to Leonard Axelsson and Jochen Theodorou, and in general to everyone who replied to my question on the Groovy user group!