Add a helper script to improve accuracy when benchmarking
This script puts the CPU in 'performance' mode to avoid distorting the
results through CPU frequency changes. It also gives the process maximum
CPU time and I/O priority, using nice and ionice respectively. Finally,
it restricts the program to the number of CPUs specified in the CPUS
environment variable.
The script is used like this:
CPUS=2 ./bench.sh some_prog --some-args
This runs "some_prog" passing "--some-args" as arguments, using 2 CPUs.
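A minimal sketch of what such a wrapper could look like (the cpu_list
helper, the sysfs path and the default CPUS value are illustrative
assumptions, not necessarily the actual bench.sh):

```shell
#!/bin/sh
# Sketch of a benchmark wrapper; paths and defaults are assumptions.

# Build a taskset CPU list "0-(CPUS-1)" from a CPU count.
cpu_list() {
    echo "0-$(($1 - 1))"
}

if [ "$#" -gt 0 ]; then
    : "${CPUS:=1}"
    # Put every CPU in 'performance' mode (needs root; sysfs path may vary).
    for gov in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
        echo performance > "$gov"
    done
    # Maximum CPU priority (nice -20), real-time I/O priority (ionice -c 1),
    # and restrict the program to the first $CPUS CPUs.
    exec nice -n -20 ionice -c 1 taskset -c "$(cpu_list "$CPUS")" "$@"
fi
```

With CPUS=2 the program would run pinned to CPUs 0-1.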
stats.py: Allow specifying input expression and output format
Now the script can take an arbitrary (optional) expression as its first
argument to parse the input file, using a separator (taken as the third
argument) and a field specification in the expression, with $1 standing
for the first field, $2 for the second, etc. (as in AWK). As its second
argument the script takes the output format, as a Python format string
with the keys: min, max, mean, and std.
The defaults are '$1' for the first argument (the expression used to
parse the input), '%(min)s,%(mean)s,%(max)s,%(std)s' for the output
format and ',' for the input separator. These defaults match the old
behaviour.
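A rough sketch of how this argument handling could work (parse_values and
format_stats are illustrative names; the real stats.py may evaluate the
expression differently):

```python
# Rough sketch; names are assumptions, not the actual stats.py internals.
import re
import statistics

def parse_values(lines, expr="$1", sep=","):
    """Evaluate an AWK-like expression ($1 = first field, ...) per line."""
    values = []
    for line in lines:
        fields = line.rstrip("\n").split(sep)
        # Substitute $N with the N-th field (1-based), then evaluate.
        code = re.sub(r"\$(\d+)", lambda m: fields[int(m.group(1)) - 1], expr)
        values.append(float(eval(code)))
    return values

def format_stats(values, fmt="%(min)s,%(mean)s,%(max)s,%(std)s"):
    """Render the four statistics with a Python %-format string."""
    return fmt % {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "std": statistics.stdev(values) if len(values) > 1 else 0.0,
    }
```

For example, parse_values(["1,2", "3,4"], "$1 + $2") sums the two fields
of every line before the statistics are computed.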
Remove some micro benchmarks that provided no added value, rename others
to have shorter names, and add a few new ones: some Olden benchmarks and
a couple of benchmarks that exercise concurrency.
Run the micro benchmarks several times to collect timing statistics. Only
the total run time is collected (in CSV format); the minimum, mean,
maximum and standard deviation are then calculated from the collected
values.
Add arguments to some micro benchmarks that needed them too.
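The timing loop could be sketched like this (collect_times and the CSV
layout, one total run time per line, are assumptions):

```shell
# Sketch of the timing loop; names and CSV layout are assumptions.
collect_times() {  # usage: collect_times N CSVFILE CMD [ARGS...]
    n=$1; csv=$2; shift 2
    : > "$csv"                      # start with an empty CSV
    i=0
    while [ "$i" -lt "$n" ]; do
        start=$(date +%s.%N)
        "$@" > /dev/null            # one benchmark run
        end=$(date +%s.%N)
        # Append the total run time in seconds.
        awk "BEGIN { print $end - $start }" >> "$csv"
        i=$((i + 1))
    done
}
```

For example, collect_times 5 times.csv ./some_prog --some-args produces a
CSV that can then be fed to stats.py.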
The naive collector is now gone, there is no statistics collection any
more, etc.
The object files and binaries are generated in separate directories now.
The statistics graph generation is still there but not working; only
building the benchmarks works, including dil, which is added as a
submodule for simplicity (and to keep track of a known working version).
The graph is now split in two, reversing the y axis of the second graph
(after collection), so the tic labels are now properly displayed as
positive numbers.
Pass the environment variable D_GC_STATS=1 to programs when running them,
to tell them to collect GC statistics, in case the collector doesn't do
so by default but understands the meaning of the environment variable.
Originally a ".c" suffix was used to mark files as being related to
*C*ollections. But since both collection and allocation statistics are
now plotted in the same file, there is no point in that distinction
(which, OTOH, was wrong anyway).
micro: Plot allocation statistics (time, space and histogram)
The allocation statistics are plotted in the same graph as the
collections, using the same program run time axis (where possible) to
ease comparison and analysis.
There is no way to take histogram tic labels from a data file in GNUPlot,
so a new script is added to generate the histogram tic labels from the
data, suitable for direct inclusion in a GNUPlot script.
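One way such a script could build the labels (a sketch; the exact output
the real script emits is an assumption):

```python
# Sketch: map each histogram bar label to its bar position for GNUPlot.
def xtics_line(labels):
    """Build a 'set xtics' command from a list of bar labels."""
    tics = ", ".join('"%s" %d' % (label, pos)
                     for pos, label in enumerate(labels))
    return "set xtics (%s)" % tics
```

The generated line can be written to a file and loaded from the GNUPlot
script with 'load'.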
micro: Improve generation of data to plot the histogram
If a program allocates a lot of cells with different sizes, the histogram
would have a lot of bars to plot, making it very hard to read.
The script that generates the histogram data is improved to take a
maximum number of cell sizes. If that limit is exceeded, the cell sizes
are grouped together in ranges, keeping the histogram readable.
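The grouping could be sketched like this (the function name and the
equal-width bucketing rule are assumptions):

```python
# Sketch of the grouping logic; names and bucketing rule are assumptions.
import math

def group_sizes(counts, max_bars):
    """counts maps cell size -> number of allocations.

    If there are more distinct sizes than max_bars, merge consecutive
    sizes into ranges so at most max_bars bars remain.
    """
    sizes = sorted(counts)
    if len(sizes) <= max_bars:
        return [(str(s), counts[s]) for s in sizes]
    per = math.ceil(len(sizes) / max_bars)  # sizes merged per bar
    bars = []
    for i in range(0, len(sizes), per):
        chunk = sizes[i:i + per]
        label = (str(chunk[0]) if len(chunk) == 1
                 else "%d-%d" % (chunk[0], chunk[-1]))
        bars.append((label, sum(counts[s] for s in chunk)))
    return bars
```

With a limit of 2, four sizes collapse into two ranged bars whose counts
are the sums over each range.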
micro: Generate data to plot an allocation histogram
The newly generated %.h.csv files record how many allocations have been
requested for each object (cell) size. A histogram will be plotted from
this data in the future.
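The counting itself could look like this (a sketch; the "size,count"
per-line layout of the %.h.csv files is an assumption):

```python
# Sketch: count allocation requests per cell size, one CSV line per size.
from collections import Counter

def histogram_csv(cell_sizes):
    """Given the size of every allocation request, emit size,count lines."""
    counts = Counter(cell_sizes)
    return ["%d,%d" % (size, n) for size, n in sorted(counts.items())]
```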