Age | Commit message (Collapse) | Author |
|
Run "make update-copyright", but then also run this,
perl -pi -e 's/2\d\d\d-//' tests/sample-test
to make that one script use the single most recent year number.
|
|
* tests/misc/shuf-reservoir.sh: Restrict the valgrind
"exit on leak" behavior to developer environments where
specific "lint" code is enabled to avoid inconsequential leaks.
|
|
Reservoir sampling optimizes selecting K random lines from large or
unknown-sized input: http://en.wikipedia.org/wiki/Reservoir_sampling
Note this also avoids reading any input when -n0 is specified.
* src/shuf.c (main): Use reservoir-sampling when the number of output
lines is known, and the input size is large or unknown.
(input_size): A new function to get the input size for regular files.
(read_input_reservoir_sampling): New function to read lines from input,
keeping only K lines in memory, replacing lines with decreasing prob.
(write_permuted_output_reservoir): New function to output reservoir.
* tests/misc/shuf-reservoir.sh: An expensive_ test using valgrind to
exercise the reservoir-sampling code.
* tests/local.mk: Reference new test.
* NEWS: Mention the improvement.
|