summaryrefslogtreecommitdiff
path: root/tests/misc/cut.pl
AgeCommit message (Collapse)Author
2015-09-12cut: refactor into set-fields moduleAssaf Gordon
Extract the functionality of parsing --field=LIST into a separate module, to be used by other programs. * src/cut.c: move field parsing code from here ... * src/set-fields.{c,h}: ... to here. (set_fields): generalize by supporting multiple parsing/reporting options. (struct range_pair): rename to field_range_pair. * src/local.mk: link cut with set-field. * po/POTFILES.in: add set-field.c * tests/misc/cut.pl: update wording of error messages
2015-01-01maint: update all copyright year number rangesPádraig Brady
Run "make update-copyright" and then... * tests/sample-test: Adjust to use the single most recent year. * tests/du/bind-mount-dir-cycle-v2.sh: Fix case in copyright message, so that year is updated automatically in future.
2014-06-01cut: restore special case handling of -f with -d$'\n'Pádraig Brady
commits v8.20-98-g51ce0bf and v8.20-99-gd302aed changed cut(1) to process each line independently and thus promptly output each line without buffering. As part of those changes we removed the special handling of --delimiter=$'\n' --fields=... which could be used to select arbitrary (ranges of) lines, so as to simplify and optimize the implementation while also matching the behavior of different cut(1) implementations. However that GNU behavior was in place for a long time, and could be useful in certain cases like making a separated list like `seq 10 | cut -f1- -d$'\n' --output-delimiter=,` although other tools like head(1) and paste(1) are more suited to this operation. This patch reinstates that functionality but restricts the "line behind" buffering behavior to only the -d$'\n' case. We also fix the following related edge case to be more consistent: before> printf "\n" | cut -s -d$'\n' -f1- | wc -l 2 before> printf "\n" | cut -d$'\n' -f1- | wc -l 1 after > printf "\n" | cut -s -d$'\n' -f1- | wc -l 1 after > printf "\n" | cut -d$'\n' -f1- | wc -l 1 * src/cut.c (cut_fields): Adjust as discussed above. * tests/misc/cut.pl: Likewise. * NEWS: Mention the change in behavior both for v8.21 and this effective revert. * cfg.mk (old_NEWS_hash): Adjust for originally omitted v8.21 entry. * src/paste.c: s/delimeter/delimiter/ comment typo fix.
2014-01-02maint: update all copyright year number rangesBernhard Voelker
Run "make update-copyright", but then also run this, perl -pi -e 's/2\d\d\d-//' tests/sample-test to make that one script use the single most recent year number.
2013-04-29cut: make memory allocation independent of range widthCojocaru Alexandru
The current implementation of cut, uses a bit array, an array of `struct range_pair's, and (when --output-delimiter is specified) a hash_table. The new implementation will use only an array of `struct range_pair's. The old implementation is memory inefficient because: 1. When -b with a big num is specified, it allocates a lot of memory for `printable_field'. 2. When --output-delimiter is specified, it will allocate 31 buckets. Even if only a few ranges are specified. Note CPU overhead is increased to determine if an item is to be printed, as shown by: $ yes abcdfeg | head -n1MB > big-file $ for c in with-bitarray without-bitarray; do src/cut-$c 2>/dev/null echo -ne "\n== $c ==" time src/cut-$c -b1,3 big-file > /dev/null done == with-bitarray == real 0m0.084s user 0m0.078s sys 0m0.006s == without-bitarray == real 0m0.111s user 0m0.108s sys 0m0.002s Subsequent patches will reduce this overhead. * src/cut.c (set_fields): Set and initialize RP instead of printable_field. * src/cut.c (is_range_start_index): Use CURRENT_RP rather than a hash. * tests/misc/cut.pl: Check if `eol_range_start' is set correctly. * tests/misc/cut-huge-range.sh: Rename from cut-huge-to-eol-range.sh, and add a test to verify large amounts of mem aren't allocated. Fixes http://bugs.gnu.org/13127
2013-02-04cut: fix a segfault with disjoint open ended rangesPádraig Brady
Fixes the issue introduced in unreleased commit v8.20-60-gec48bea. * src/cut.c (set_fields): Don't access the bit array if we've an open ended range that's outside any finite range. * tests/misc/cut.pl: Add tests for this case. Reported by Marcel Böhme in http://bugs.gnu.org/13627
2013-01-26cut: fix -f to work with the -d$'\n' edge casePádraig Brady
* src/cut.c (cut_fields): Handle the edge case where '\n' is the delimiter, which could be used for example to suppress the last line if it doesn't contain a '\n'. * test/misc/cut.pl: Add tests for this edge case.
2013-01-26cut: with -f, process each line independentlyPádraig Brady
Previously line N+1 was inspected before line N was fully output, which causes output ordering issues at the terminal or delays from intermittent sources like tail -f. * src/cut.c (cut_fields): Adjust so that we record the previous output character so we can use that info to determine wether to output a '\n' or not. * tests/misc/cut.pl: Add tests to ensure existing functionality isn't broken. * NEWS: Mention the fix. Fixes bug http://bugs.gnu.org/13498
2013-01-01maint: update all copyright year number rangesJim Meyering
Run "make update-copyright", but then also run this, perl -pi -e 's/2\d\d\d-//' tests/sample-test to make that one script use the single most recent year number.
2012-12-06maint: fix a referenced coreutils version in a test commentPádraig Brady
* tests/misc/cut.pl: This particular bug existed up to v8.10.
2012-12-06tests: cut.pl: adjust for changed diagnosticPádraig Brady
* tests/misc/cut.pl: Since we now output the more complete error message irrespective of running in a multi-byte locale or not, adjust the test accordingly.
2012-12-06cut: improve error reportingCojocaru Alexandru
* src/cut.c (main): Treat a NUL delimiter (-d '') consistently with non NUL delimiters, and disallow such a delimiter option, unless a field is also specified. (set_fields): Provide a more accurate error message when a given list is invalid. * tests/misc/cut.pl: Add a test case.
2012-11-24cut: do not print extraneous delimiters in some unusual casesJim Meyering
When printing output delimiters, and when a to-EOL range subsumes at least one other range, cut would mistakenly print delimiters for the subsumed range. This bug was probably introduced via commit v5.2.1-639-g847e066. * src/cut.c (set_fields): Ignore any range that is subsumed by a to-EOL range. Also, move two declarations down. * tests/misc/cut.pl: Add tests to exercise this. * NEWS (Bug fixes): Mention it. Reported by Marcel Böhme in http://bugs.gnu.org/12966
2012-11-24cut: treat -b2-,3- like -b2-, not like -b3-Jim Meyering
* src/cut.c (set_fields): When two right-open-ended ranges are specified, don't blindly let the latter one take precedence over the former. Instead, use the union of the ranges. * tests/misc/cut.pl: Add tests to exercise this. * NEWS (Bug fixes): Mention it. Reported by Marcel Böhme in http://bugs.gnu.org/12966 Thanks to Berhard Voelker for catching log and NEWS typos.
2012-11-19cut: do not accept the invalid range 0-Bernhard Voelker
The command "echo 12345 | cut -b 0-" prints an empty line while it should fail with "fields and positions are numbered from 1". * src/cut.c (set_fields): Add a diagnostic for the invalid open range which starts with Zero, i.e., the range 0-. * tests/misc/cut.pl: Add tests to ensure the range 0- fails for fields (-f) and for positions (-b, -c). * NEWS: Mention the fix. Reported by Marcel Böhme in <http://bugs.gnu.org/12903>.
2012-08-30tests: add .sh and .pl suffixes to shell and perl tests, respectivelyStefano Lattarini
Not only this shrinks the size of the generated Makefile (from > 6300 lines to ~3000), but will allow further simplifications in future changes. * tests/Makefile.am (TEST_EXTENSIONS): Add '.sh' and '.pl'. (PL_LOG_COMPILER, SH_LOG_COMPILER): New, still defined simply to $(LOG_COMPILER) for the time being. (TESTS, root_tests): Adjust as described. * All tests: Rename as described.