author | James Youngman <jay@gnu.org> | 2008-02-19 14:13:00 +0100 |
---|---|---|
committer | Jim Meyering <meyering@redhat.com> | 2008-02-19 15:17:39 +0100 |
commit | a1e715698a038af7ff341011a2aeecf6729c8de9 (patch) | |
tree | 7786e67b64636ee6cca2a2ca720dcc9f1ef14fbf /doc | |
parent | 4242d4f5c4f32374b684882a74e1b773ad01b1d6 (diff) | |
join: new options: --check-order and --nocheck-order.
* src/join.c: Support --check-order and --nocheck-order.
New variables check_input_order, seen_unpairable and
issued_disorder_warning[]. For --check-order, verify that the
input files are in sorted order. For the default case, check the
order only if there are unpairable lines.
(join): Perform ordering checks after reaching EOF on either
input.
(usage): Mention --check-order and --nocheck-order.
(dupline): Save a copy of the previously-read input line so that
we can detect disorder on the input.
(get_line): Temporarily save a copy of the previous line (by
calling dupline) and check relative ordering (by calling
checkorder) before returning the newly-read line.
(getseq, join): Tell get_line which file we are reading from.
(advance_seq): New function, factoring out some of the code
commonly surrounding calls to getseq.
(checkorder): New function. Verifies that a pair of consecutive
input lines are in sorted order.
* doc/coreutils.texi (join invocation): Document the new options
--check-order and --nocheck-order.
* tests/join/Test.pm (tv): Added tests for --check-order and
--nocheck-order.
* NEWS: Mention this new feature.
Diffstat (limited to 'doc')
-rw-r--r-- | doc/coreutils.texi | 27 |
1 files changed, 23 insertions, 4 deletions
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 016673a3b..e8ccb4bd2 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -5149,10 +5149,10 @@ sort a file on its default join field, but if you select a non-default
 locale, join field, separator, or comparison options, then you should
 do so consistently between @command{join} and @command{sort}.
 
-As a @acronym{GNU} extension, if the input has no unpairable lines the
-sort order can be any order that considers two fields to be equal if and
-only if the sort comparison described above considers them to be equal.
-For example:
+If the input has no unpairable lines, a @acronym{GNU} extension is
+available; the sort order can be any order that considers two fields
+to be equal if and only if the sort comparison described above
+considers them to be equal. For example:
 
 @example
 $ cat file1
@@ -5169,6 +5169,19 @@ c c1 c2
 b b1 b2
 @end example
 
+If the @option{--check-order} option is given, unsorted inputs will
+cause a fatal error message. If the option @option{--nocheck-order}
+is given, unsorted inputs will never cause an error message. If
+neither of these options is given, wrongly sorted inputs are diagnosed
+only if an input file is found to contain unpairable lines. If an
+input file is diagnosed as being unsorted, the @command{join} command
+will exit with a nonzero status (and the output should not be used).
+
+Forcing @command{join} to process wrongly sorted input files
+containing unpairable lines by specifying @option{--nocheck-order} is
+not guaranteed to produce any particular output. The output will
+probably not correspond with whatever you hoped it would be.
+
 The defaults are:
 @itemize
 @item the join field is the first field in each line;
@@ -5188,6 +5201,12 @@ The program accepts the following options. Also see @ref{Common options}.
 Print a line for each unpairable line in file @var{file-number} (either
 @samp{1} or @samp{2}), in addition to the normal output.
 
+@item --check-order
+Fail with an error message if either input file is wrongly ordered.
+
+@item --nocheck-order
+Do not check that both input files are in sorted order. This is the default.
+
 @item -e @var{string}
 @opindex -e
 Replace those output fields that are missing in the input with