Count Lines/Words/Chars
Back to the Language Shootout | Back to Doug's Homepage
[Note: Values have been normalized to fall in the range of 0-10 for aesthetic reasons. Original value ranges are included on the X-axis.] [Results last updated: Tue Oct 9 18:40:32 2001 CDT]
For this test, each program should be implemented to do the same thing, following the guidelines below:
Each program reads its input from standard input, counts the lines, words (whitespace-delimited tokens), and characters, and outputs each count. Programs should read no more than 4K of input at a time. To give a baseline of expected performance, I allow bash to use an external process (wc). All other solutions should be implemented natively.
This test is essentially the same as the wordcount test from Timing Trials, or, the Trials of Timing: Experiments with Scripting and User-Interface Languages by Brian W. Kernighan and Christopher J. Van Wyk.
Note that as in the original version of this test, whitespace is defined as space, newline and tab characters. This is a little different from the Unix wc command, which defines a few more characters to be whitespace.
The programs can assume that the file ends in a newline, and they should be able to handle arbitrarily long lines.
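The rules above can be sketched in Python as follows. This is only an illustration under stated assumptions, not one of the shootout entries: the names `count_stream`, `CHUNK_SIZE`, and `WHITESPACE` are made up for this example, "characters" are taken to mean bytes, and each call asks for at most 4096 bytes (any buffering the interpreter does underneath is not controlled here).

```python
import io
import sys

CHUNK_SIZE = 4096       # the rules cap each read at 4K
WHITESPACE = b" \n\t"   # the test's definition: space, newline, tab only


def count_stream(stream, chunk_size=CHUNK_SIZE):
    """Return (lines, words, chars) for a binary stream read in chunks."""
    lines = words = chars = 0
    in_word = False  # carried across chunks so a split word counts once
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty bytes object means end of input
            break
        chars += len(chunk)
        lines += chunk.count(b"\n")
        for byte in chunk:  # iterating bytes yields ints
            if byte in WHITESPACE:
                in_word = False
            elif not in_word:
                in_word = True
                words += 1
    return lines, words, chars


def main():
    # Hypothetical entry point: count standard input, print the three totals.
    print(*count_stream(sys.stdin.buffer))


# Small in-memory demonstration instead of reading real standard input.
print(*count_stream(io.BytesIO(b"one two\nthree\n")))  # prints "2 3 14"
```

Because the state flag `in_word` survives from one chunk to the next, a word or a very long line that straddles a 4K boundary is still counted once. Note also that a carriage return is not whitespace under this definition, so `b"a\rb\n"` counts as one word here, whereas a tool that uses a broader whitespace set would report two.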
Input file (it is repeated N times).
The correct output (for N = 500, i.e. 500 copies of the input) looks like this:
12500 68500 3048000
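As a quick sanity check on that output, the per-copy counts can be recovered by dividing by N. The sketch below assumes the totals scale linearly with N, which should hold because the input is the same file repeated and ends in a newline (so no word runs across a copy boundary). The helper name `expected_counts` is hypothetical, and the per-copy figures are derived from the N = 500 line above rather than measured from the input file itself.

```python
N_REFERENCE = 500
REFERENCE_OUTPUT = (12500, 68500, 3048000)  # lines, words, chars for N = 500

# Each total must divide evenly by N if the counts really scale linearly.
assert all(total % N_REFERENCE == 0 for total in REFERENCE_OUTPUT)

PER_COPY = tuple(total // N_REFERENCE for total in REFERENCE_OUTPUT)


def expected_counts(n):
    """Return the (lines, words, chars) expected for n copies of the input."""
    return tuple(count * n for count in PER_COPY)


print(PER_COPY)              # (25, 137, 6096)
print(*expected_counts(500))  # 12500 68500 3048000
```

This gives a cheap way to validate a solution at a smaller N before running the full benchmark size.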
The original C program is significantly slower than the version here, which bypasses stdio.
This section is for displaying alternate solutions that are either slower than the ones above or perhaps don't quite meet my criteria for the competition, but are otherwise worthy of comment.
Send me comments or suggestions.