Age | Commit message (Collapse) | Author |
|
* tests/misc/cut-huge-range.sh (subtract_one): New string.
(CUT_MAX): Don't pass a too-large integer to 'expr'.
|
|
Use a sentinel value that's checked implicitly, rather than
a bit array, to determine if an item should be output.
Benchmark results for this change are:
$ yes abcdfeg | head -n1MB > big-file
$ for c in orig sentinel; do
src/cut-$c 2>/dev/null
echo -ne "\n== $c =="
time src/cut-$c -b1,3 big-file > /dev/null
done
== orig ==
real 0m0.049s
user 0m0.044s
sys 0m0.005s
== sentinel ==
real 0m0.035s
user 0m0.032s
sys 0m0.002s
## Again with --output-delimiter ##
$ for c in orig sentinel; do
src/cut-$c 2>/dev/null
echo -ne "\n== $c =="
time src/cut-$c -b1,3 --output-delimiter=: big-file > /dev/null
done
== orig ==
real 0m0.106s
user 0m0.103s
sys 0m0.002s
== sentinel ==
real 0m0.055s
user 0m0.052s
sys 0m0.003s
eol_range_start: Removed. 'n-' is no longer treated specially,
and instead SIZE_MAX is set for the 'hi' limit, and tested implicitly.
complement_rp: Used to complement 'rp' when '--complement' is specified.
ADD_RANGE_PAIR: Macro renamed to 'add_range_pair' function.
* tests/misc/cut-huge-range.sh: Adjust to the SENTINEL value.
Also remove the overlapping range test as this is no longer
dependent on large ranges and also is already handled with
the EOL-subsumed-3 test in cut.pl.
|
|
This issue was introduced in commit v8.21-43-g3e466ad
* src/cut.c (set_fields): Process all range pairs when merging.
* tests/misc/cut-huge-range.sh: Add a test for this edge case.
Also fix an issue where we could miss reported errors due
to truncation of the 'err' file.
|
|
print_kth() is the central function of cut used to
determine if an item is to be output or not,
so simplify it by moving some logic outside.
Benchmark results for this change are:
$ yes abcdfeg | head -n1MB > big-file
$ for c in orig split; do
src/cut-$c 2>/dev/null
echo -ne "\n== $c =="
time src/cut-$c -b1,3 big-file > /dev/null
done
== orig ==
real 0m0.111s
user 0m0.108s
sys 0m0.002s
== split ==
real 0m0.088s
user 0m0.081s
sys 0m0.007s
* src/cut.c (print_kth): Refactor a branch to outside the function.
Related to http://bugs.gnu.org/13127
|
|
The current implementation of cut, uses a bit array,
an array of `struct range_pair's, and (when --output-delimiter
is specified) a hash_table. The new implementation will use
only an array of `struct range_pair's.
The old implementation is memory inefficient because:
1. When -b with a big num is specified, it allocates a lot of
memory for `printable_field'.
2. When --output-delimiter is specified, it will allocate 31 buckets.
Even if only a few ranges are specified.
Note CPU overhead is increased to determine if an item is to be printed,
as shown by:
$ yes abcdfeg | head -n1MB > big-file
$ for c in with-bitarray without-bitarray; do
src/cut-$c 2>/dev/null
echo -ne "\n== $c =="
time src/cut-$c -b1,3 big-file > /dev/null
done
== with-bitarray ==
real 0m0.084s
user 0m0.078s
sys 0m0.006s
== without-bitarray ==
real 0m0.111s
user 0m0.108s
sys 0m0.002s
Subsequent patches will reduce this overhead.
* src/cut.c (set_fields): Set and initialize RP
instead of printable_field.
* src/cut.c (is_range_start_index): Use CURRENT_RP rather than a hash.
* tests/misc/cut.pl: Check if `eol_range_start' is set correctly.
* tests/misc/cut-huge-range.sh: Rename from cut-huge-to-eol-range.sh,
and add a test to verify large amounts of mem aren't allocated.
Fixes http://bugs.gnu.org/13127
|
|
* init.cfg (require_ulimit_v_): Renamed from require_ulimit_
as this only checks for ulimit -v support. Other uses of
ulimit -t and ulimit -n in tests shouldn't cause false failures
if not supported.
* cfg.mk (sc_prohibit_test_ulimit_without_require_): A new syntax check
to ensure that require_ulimit_v_() is used iff required.
* tests/misc/head-c.sh: Add missing call to require_ulimit_v_.
* tests/rm/many-dir-entries-vs-OOM.sh: Likewise.
* tests/split/r-chunk.sh: Remove non mandatory require_ulimit_ call.
* tests/misc/sort-merge-fdlimit.sh: Likewise.
* tests/cp/link-heap.sh: Adjust to renamed require_ulimit_v_.
* tests/dd/no-allocate.sh: Likewise.
* tests/misc/csplit-heap.sh: Likewise.
* tests/misc/cut-huge-to-eol-range.sh: Likewise.
* tests/misc/printf-surprise.sh: Likewise.
|
|
* src/head.c (elide_tail_bytes_pipe): Don't use calloc as that
bypasses memory overcommit due to the zeroing requirement.
Also realloc rather than malloc the pointer array to avoid
dependence on overcommit entirely.
* tests/misc/head-c.sh: Add a test case.
Fixes http://bugs.gnu.org/13530
|
|
With --suppress-matched, the lines that match the pattern will not be
printed in the output files. I.E. the first line from the second
and subsequent splits will be suppressed.
* src/csplit.c: process_regexp(),process_line_count(): Don't output the
matched lines. Since csplit includes "up to but not including" matched
lines in each split, the first line (in the next group) is the matched
line - so just skip it.
main(): Handle new option.
usage(): Mention new option.
* doc/coreutils.texi (csplit invocation): Mention new option, examples.
* tests/misc/csplit-suppress-matched.pl: New test script.
* tests/local.mk: Reference the new test.
* NEWS: Mention new feature.
|
|
* src/shuf.c (main): If -n0 specified then no data would ever be output,
so exit without reading input.
* tests/misc/shuf.sh: Augment the related test with this case.
|
|
Reservoir sampling optimizes selecting K random lines from large or
unknown-sized input: http://en.wikipedia.org/wiki/Reservoir_sampling
Note this also avoids reading any input when -n0 is specified.
* src/shuf.c (main): Use reservoir-sampling when the number of output
lines is known, and the input size is large or unknown.
(input_size): A new function to get the input size for regular files.
(read_input_reservoir_sampling): New function to read lines from input,
keeping only K lines in memory, replacing lines with decreasing prob.
(write_permuted_output_reservoir): New function to output reservoir.
* tests/misc/shuf-reservoir.sh: An expensive_ test using valgrind to
exercise the reservoir-sampling code.
* tests/local.mk: Reference new test.
* NEWS: Mention the improvement.
|
|
* tests/misc/uniq.pl: Previously, if LOCALE_FR was not defined, all
tests would be skipped. Modified to skip only the relevant test.
|
|
* src/uniq.c (usage): Summarize the new option,
and adjust the --all-repeated option to be more consistent.
(check_file): Merge the --group functionality into
the core loop for the default uniq operation since
it's very similar and can output lines immediately upon reading.
(main): Handle the new --group option and make it
mutually exclusive with other selection options.
* tests/misc/uniq.pl: Add tests.
* NEWS: Mention the new feature.
* doc/coreutils.texi (uniq invocation): Describe --group.
|
|
* src/system.h (emit_ancillary_info): Link to the bug report email
addresses and general help URLs online rather than specifying directly.
This give us greater scope to present better info like describing
the difference between bug-coreutils@gnu.org and coreutils@gnu.org etc.
* tests/misc/help-version.sh: Remove the check for bug-coreutils@gnu.org
* tests/local.mk: Remove the no longer needed PACKAGE_BUGREPORT.
|
|
* tests/misc/uniq.pl: add tests for --ignore-case.
|
|
* NEWS: Mention join's new option: --zero-terminated (-z).
* src/join.c: Add new option, --zero-terminated (-z), to make
join use the NUL byte as separator/delimiter rather than newline.
(get_line): Use readlinebuffer_delim in place of readlinebuffer.
(main): Handle the new option.
(usage): Describe new option the same way sort does.
* doc/coreutils.texi (join invocation): Describe the new option.
* tests/misc/join.pl: add tests for -z option.
|
|
Both factor and numfmt recently introduced debug messages
for developers, enabled by --verbose and ---devdebug respectively.
There were a few issues though:
1. They used different mechanisms to enable these messages.
2. factor used --verbose which might be needed for something else
3. They used different methods to output the messages,
and numfmt used error() which added an unwanted newline
4. numfmt marked all these messages for translation and factor
marked a couple. We really don't need these translated.
So we fix the above issues here while renaming the enabling
option for both commands to ---debug (still undocumented).
* src/factor.c (verbose): Rename to dev_debug and change from int to
bool as it's just a toggle flag.
(long_options): Rename --verbose to ---debug.
* src/system.h (devmsg): A new inline function to output a message
if enabled by a global dev_debug variable in the compilation unit.
* src/numfmt.c: Use devmsg() rather than error().
Also remove the translation tags from these messages.
Also change debug flag to bool from int.
* tests/misc/numfmt.pl: Adjust for the ---devdebug to ---debug change.
* cfg.mk (sc_marked_devdiagnostics): Add a syntax check to ensure
translations are not added to devmsg calls.
Reported by Göran Uddeborg in http://bugs.gnu.org/13665
|
|
* tests/misc/numfmt.pl: When the system locale grouping doesn't
match our expected format for grouping 1234 in the fr_FR locale,
reset the locale to 'C' so as to skip all locale tests.
|
|
Originally requested in Red Hat bugzilla #445213.
* src/stty.c (mode_info): Add support for DTR/DSR hardware flow control,
if available.
* doc/coreutils.texi: Document it.
* tests/misc/stty.sh: Add it to the list of serial options to avoid.
* NEWS: Mention the improvement.
|
|
* AUTHORS: Add my name.
* NEWS: Mention the new program.
* README: Reference the new program.
* src/numfmt.c: New file.
* src/.gitignore: Ignore the new binary.
* build-aux/gen-lists-of-programs.sh: Update.
* scripts/git-hooks/commit-msg: Allow numfmt: commit prefix.
* po/POTFILES.in: Add new c file.
* tests/misc/numfmt.pl: A new test file giving >93% coverage.
* tests/local.mk: Reference the new test.
* man/.gitignore: Ignore the new man page.
* man/local.mk: Reference the new man page.
* man/numfmt.x: A new template.
* doc/coreutils.texi: Document the new command.
|
|
Fixes the issue introduced in unreleased commit v8.20-60-gec48bea.
* src/cut.c (set_fields): Don't access the bit array if
we've an open ended range that's outside any finite range.
* tests/misc/cut.pl: Add tests for this case.
Reported by Marcel Böhme in http://bugs.gnu.org/13627
|
|
* src/timeout.c (unblock_signal): A new function to unblock a
specified signal, or warn if not possible.
(set_timeout): Ensure SIGALRM is unblocked before we setup the timer.
* tests/misc/timeout-blocked.pl: A new test for the issue.
* tests/local.mk: Reference the new test.
* NEWS: Mention the fix.
Fixes: http://bugs.gnu.org/13535
|
|
* src/seq.c (main): With 3 positive integer args we were
checking the end value was == "1", rather than the step value.
* tests/misc/seq.pl: Add tests for this case.
Reported by Marcel Böhme in http://bugs.gnu.org/13525
|
|
* src/seq.c (get_default_format): Also account for the case where '.'
is auto added to the start value, which is significant when the
number sequence narrows.
* tests/misc/seq.pl: Add two new tests for the failing cases.
* NEWS: Mention the fix.
Fixes http://bugs.gnu.org/13394
|
|
* src/cut.c (cut_fields): Handle the edge case where '\n' is
the delimiter, which could be used for example to suppress
the last line if it doesn't contain a '\n'.
* test/misc/cut.pl: Add tests for this edge case.
|
|
Previously line N+1 was inspected before line N was fully output,
which causes output ordering issues at the terminal or delays
from intermittent sources like tail -f.
* src/cut.c (cut_fields): Adjust so that we record the
previous output character so we can use that info to
determine wether to output a '\n' or not.
* tests/misc/cut.pl: Add tests to ensure existing
functionality isn't broken.
* NEWS: Mention the fix.
Fixes bug http://bugs.gnu.org/13498
|
|
Run "make update-copyright", but then also run this,
perl -pi -e 's/2\d\d\d-//' tests/sample-test
to make that one script use the single most recent year number.
|
|
This regression was introduced in commit v8.19-132-g3786fb6.
* src/seq.c (seq_fast): Don't use puts() to output the first number,
and instead insert it into the buffer as for other numbers.
Also output the terminator unconditionally.
* tests/misc/seq.pl: Add some basic tests for the -s option.
* NEWS: Mention the fix.
* THANKS.in: Reported by Philipp Gortan.
|
|
The -z option has been introduced in commit v8.15-60-ga3eb71a,
i.e. in coreutils-8.16. Time to add some tests for it.
* tests/misc/basename.pl: Add tests exercising the -z option.
In the foreach loop to append a newline to the end of each
expected 'OUT' string, skip the -z tests.
|
|
* tests/misc/timeout-group.sh: The kernel might possibly delay
signal propagation to timeout.cmd long enough, that it exits
normally without running the signal handler (as sleep will
be in the same process group and so get the signal too).
So avoid this by explicitly checking that the signal handler
is called, which should always happen under normal circumstances.
Reported by Stefano Lattarini on linux-2.6.30-2-686 and bash-4.2.36.
|
|
* tests/misc/cut-huge-to-eol-range.sh: New test, showing that
the change in v8.20-51-g7d03466 is a bug fix after all.
* tests/local.mk (all_tests): Add it.
* NEWS (Bug fixes): Mention it.
|
|
* tests/misc/cut.pl: This particular bug existed up to v8.10.
|
|
* tests/misc/cut.pl: Since we now output the more
complete error message irrespective of running
in a multi-byte locale or not, adjust the test accordingly.
|
|
* src/cut.c (main): Treat a NUL delimiter (-d '') consistently
with non NUL delimiters, and disallow such a delimiter option,
unless a field is also specified.
(set_fields): Provide a more accurate error message when
a given list is invalid.
* tests/misc/cut.pl: Add a test case.
|
|
When printing output delimiters, and when a to-EOL range subsumes
at least one other range, cut would mistakenly print delimiters for
the subsumed range. This bug was probably introduced via commit
v5.2.1-639-g847e066.
* src/cut.c (set_fields): Ignore any range that is subsumed by a
to-EOL range. Also, move two declarations down.
* tests/misc/cut.pl: Add tests to exercise this.
* NEWS (Bug fixes): Mention it.
Reported by Marcel Böhme in http://bugs.gnu.org/12966
|
|
* src/cut.c (set_fields): When two right-open-ended ranges are
specified, don't blindly let the latter one take precedence over
the former. Instead, use the union of the ranges.
* tests/misc/cut.pl: Add tests to exercise this.
* NEWS (Bug fixes): Mention it.
Reported by Marcel Böhme in http://bugs.gnu.org/12966
Thanks to Berhard Voelker for catching log and NEWS typos.
|
|
* tests/misc/timeout.sh: Take advantage of recent support for
sub-second timeouts to decrease runtime from about 6s to 2s.
|
|
* src/seq.c (scan_arg): Calculate the width more accurately
for numbers specified using scientific notation.
* tests/misc/seq.pl: Add tests for cases that were mishandled.
* NEWS: Mention the fix.
* THANKS.in: Reported by Marcel Böhme.
Fixes http://bugs.gnu.org/12959
|
|
The command "echo 12345 | cut -b 0-" prints an empty line while
it should fail with "fields and positions are numbered from 1".
* src/cut.c (set_fields): Add a diagnostic for the invalid open
range which starts with Zero, i.e., the range 0-.
* tests/misc/cut.pl: Add tests to ensure the range 0- fails for
fields (-f) and for positions (-b, -c).
* NEWS: Mention the fix.
Reported by Marcel Böhme in <http://bugs.gnu.org/12903>.
|
|
It's useful for commands that support running for an indeterminite
amount of time, to not return a specific timeout exit status (124),
and instead let the command handle the timeout signal and return
a status for the work done so far.
* doc/coreutils.texi (timeout invocation): Describe the new option.
* src/timeout.c (preserve_status): A new global boolean to
enable the --preserve-status behavior.
(usage): Describe the new option.
(main): Don't return EXIT_TIMEOUT of preserve_status is set.
* tests/misc/timeout.sh: Add a test for the new option.
|
|
* tests/misc/pr.pl: Refactor this test into ...
* tests/pr/pr-tests.pl: ... here.
* tests/local.mk: Remove the reference to the removed test
Improved by Jim Meyering
|
|
* tests/misc/factor.pl: Correct the precedence and
regular expression in the command to check for GMP.
|
|
In the recent factor rewrite, the GMP code
wasn't actually used; just an error was printed
on integer overflow. While fixing that it was noticed
that correct input validation wasn't done in all cases
when falling back to the GMP code.
* src/factor.c (print_factors) Fallback to GMP on overflow.
(strto2uintmax): Scan the string for invalid characters,
so that case can be detected independently of overflow.
Return an error when an empty string is passed.
Also allow leading spaces and '+' in input numbers.
* tests/misc/factor.pl: Ensure the GMP code is exercised
when compiled in. Also add a test to verify leading
spaces and '+' are allowed.
|
|
When listing a directory containing dangling symlinks,
and not outputting a long format listing, and orphaned links
are set to no coloring in LS_COLORS, then the symlinks
would get no color rather than reverting to the standard
symlink color. The issue was introduced in v8.13-19-g84457c4
* src/ls.c (print_color_indicator): Use the standard method
to check if coloring is specified for orphaned symlinks.
The existing method would consider 'or=00' or 'or=0' as significant
in LS_COLORS. Even 'or=' was significant as in that case the
string='or=' and the length=0. Also apply the same change
for missing symlinks for consistency.
(gobble_file): Remove the simulation of linkok, which is only
tested in print_color_indicator() which now handles this directly
by keying on the LS_COLORS values correctly.
* tests/misc/ls-misc.pl: Add a test case.
* THANKS: Add the reporter.
* NEWS: Mention the fix.
Reported-by: David Matei
|
|
Handle non-negative whole numbers robustly and efficiently when
the increment is 1 and when no format-changing option is specified.
On the correctness front, for very large numbers, seq now works fine:
$ b=1000000000000000000000000000
$ src/seq ${b}09 ${b}11
100000000000000000000000000009
100000000000000000000000000010
100000000000000000000000000011
while the old one would infloop, printing garbage:
$ seq ${b}09 ${b}11 | head -2
99999999999999999997315645440
99999999999999999997315645440
The new code is much more efficient, too:
Old vs new: 55.81s vs 0.82s
$ env time --f=%e seq $((10**8)) > /dev/null
55.81
$ env time --f=%e src/seq $((10**8)) > /dev/null
0.82
* seq.c (incr): New function, inspired by the one in cat.c.
(cmp, seq_fast): New functions, inspired by code in nt-factor
by Torbjörn Granlund and Niels Möller.
(trim_leading_zeros): New function, without which cmp would malfunction.
(all_digits_p): New function.
(main): Hoist the format_str-vs-equal_width check to precede first
treatment of operands, and insert code to call seq_fast when possible.
* NEWS (Bug fixes): Mention the correctness fix.
(Improvements): Mention the speed-up.
* tests/misc/seq.pl: Exercise the new code.
Improved by: Bernhard Voelker.
http://thread.gmane.org/gmane.comp.gnu.coreutils.general/3340
|
|
* NEWS (Bug fixes): Mention it.
* tests/misc/factor.pl: Add five of Torbjörn's tests.
|
|
* Makefile.am (SUBDIRS): Remove 'tests'.
(include): The '$(top_srcdir)/tests/local.mk' file.
(check-root): Remove this convenience target, it's no longer needed
now that the "real" check-root target once in 'tests/Makefile' will
land in the top-level makefile.
* configure.ac (AC_CONFIG_FILES): Remove 'tests/Makefile'.
* tests/Makefile.am: Rename ...
* tests/local.mk: ... like this, with a lot of adjustments.
* tests/init.cfg: Move ...
* init.cfg: ... here. This is necessary, for a limitation of the
gnulib-provided 'tests/init.sh', which unconditionally look for
'init.cfg' in the $(srcdir) directory.
* tests/*/*.sh: Adjust: expect init.sh to be in '$srcdir/tests',
not in '$srcdir', and extend $PATH with './src', not with '../src'.
* tests/Coreutils.pm: Adjust similarly.
* tests/pr/pr-tests.pl ($pfx): Likewise.
|
|
Not only this shrinks the size of the generated Makefile (from > 6300
lines to ~3000), but will allow further simplifications in future
changes.
* tests/Makefile.am (TEST_EXTENSIONS): Add '.sh' and '.pl'.
(PL_LOG_COMPILER, SH_LOG_COMPILER): New, still defined simply to
$(LOG_COMPILER) for the time being.
(TESTS, root_tests): Adjust as described.
* All tests: Rename as described.
|
|
* tests/misc/sort-exit-early: skip_if_root_ as this test
requires an unwritable input and an unreadable output.
|
|
The format used is the BSD traditional format which looks like:
MD5 (/dev/null) = d41d8cd98f00b204e9800998ecf8427e
* NEWS: Add new feature info.
* doc/coreutils.texi (md5sum invocation): Add detailed information
about the new --tag option.
* src/md5sum.c: Add the new --tag option for BSD-style output.
(bsd_split_3): Add ESCAPED_FILENAME parameter.
(print_filename): New function refactored from main().
(filename_unescape): New function refactored from split_3().
* tests/misc/md5sum-bsd: Add tests for the new feature.
|
|
We use print_ver_ to run "PROG --version" for each program under
test. Some tests have been derived from others, while the
argument(s) to print_ver_ have not been adapted.
Add a new cfg.mk rule to prohibit this.
* cfg.mk (sc_prohibit_test_calls_print_ver_with_irrelevant_argument):
New rule, to prohibit a test script from calling print_env_ for a
program not actually used by that test.
* tests/chown/basic: s/\(print_ver_\) chgrp/\1 chown/
* tests/cp/acl: s/\(print_ver_\) mv/\1 cp/
* tests/cp/capability: s/\(print_ver_\) ls/\1 cp/
* tests/cp/cp-parents: s/(print_ver_\) mv/\1 cp/
* tests/du/bind-mount-dir-cycle: s/(print_ver_\) rm/\1 du/
* tests/misc/wc-parallel: s/(print_ver_\) md5sum/\1 wc/
|