From 4c795d543908ea4715b3e0bd6c6cf908315936d8 Mon Sep 17 00:00:00 2001
From: Assaf Gordon
Date: Wed, 7 Jan 2015 18:30:28 -0500
Subject: split: new -t option to select record separator

* src/split.c (eolchar): A new variable to hold the separator
character (unibyte for now). This is referenced throughout
rather than hardcoding '\n'.
(usage): Describe the new --separator option, and mention records
along with lines so there is no ambiguity that all options treat
lines and records equivalently.
(main): Have -t update eolchar, or default to '\n'.
* tests/split/record-sep.sh: New test case.
* tests/local.mk: Reference the new test.
* doc/coreutils.texi (split invocation): Document the new option.
Adjust --lines, --line-bytes, --number=[lr]/... to mention they
pertain to records if --separator is specified.
* NEWS: Mention the new feature.
---
 doc/coreutils.texi | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

(limited to 'doc')

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 1cc65329c..5a3c31a15 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -3395,6 +3395,8 @@ The program accepts the following options. Also see @ref{Common options}.
 @opindex -l
 @opindex --lines
 Put @var{lines} lines of @var{input} into each output file.
+If @option{--separator} is specified, then @var{lines} determines
+the number of records.
 
 For compatibility @command{split} also supports an obsolete
 option syntax @option{-@var{lines}}. New scripts should use
@@ -3412,9 +3414,11 @@ Put @var{size} bytes of @var{input} into each output file.
 @opindex -C
 @opindex --line-bytes
 Put into each output file as many complete lines of @var{input} as
-possible without exceeding @var{size} bytes. Individual lines longer than
-@var{size} bytes are broken into multiple files.
+possible without exceeding @var{size} bytes. Individual lines or records
+longer than @var{size} bytes are broken into multiple files.
 @var{size} has the same format as for the @option{--bytes} option.
+If @option{--separator} is specified, then @var{lines} determines
+the number of records.
 
 @item --filter=@var{command}
 @opindex --filter
@@ -3445,7 +3449,7 @@ Split @var{input} to @var{chunks} output files where @var{chunks} may be:
 @example
 @var{n} generate @var{n} files based on current size of @var{input}
 @var{k}/@var{n} only output @var{k}th of @var{n} to stdout
-l/@var{n} generate @var{n} files without splitting lines
+l/@var{n} generate @var{n} files without splitting lines or records
 l/@var{k}/@var{n} likewise but only output @var{k}th of @var{n} to stdout
 r/@var{n} like @samp{l} but use round robin distribution
 r/@var{k}/@var{n} likewise but only output @var{k}th of @var{n} to stdout
@@ -3462,10 +3466,10 @@ or the @var{input} is truncated.
 For @samp{l} mode, chunks are approximately @var{input} size / @var{n}.
 The @var{input} is partitioned into @var{n} equal sized portions, with
 the last assigned any excess. If a line @emph{starts} within a partition
-it is written completely to the corresponding file. Since lines
+it is written completely to the corresponding file. Since lines or records
 are not split even if they overlap a partition, the files written
 can be larger or smaller than the partition size, and even empty
-if a line is so long as to completely overlap the partition.
+if a line/record is so long as to completely overlap the partition.
 
 For @samp{r} mode, the size of @var{input} is irrelevant,
 and so can be a pipe for example.
@@ -3505,6 +3509,17 @@ than the number requested, or if a line is so long as to completely
 span a chunk. The output file sequence numbers, always run consecutively
 even when this option is specified.
 
+@item -t @var{separator}
+@itemx --separator=@var{separator}
+@opindex -t
+@opindex --separator
+@cindex line separator character
+@cindex record separator character
+Use character @var{separator} as the record separator instead of the default
+newline character (ASCII LF).
+To specify ASCII NUL as the separator, use the two-character string @samp{\0},
+e.g., @samp{split -t '\0'}.
+
 @item -u
 @itemx --unbuffered
 @opindex -u
-- 
cgit v1.2.3-54-g00ecf
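
Illustrative note (not part of the patch): the record semantics documented above for --lines combined with --separator can be sketched in a few lines of Python. This is only a rough model under stated assumptions, not the logic in src/split.c. The function name split_records and its parameters are hypothetical, the separator is assumed to be a single byte (matching the "unibyte for now" remark in the commit message), and each record is assumed to keep its trailing separator, the way a line keeps its newline.

```python
def split_records(data: bytes, sep: bytes = b"\n", per_file: int = 1000) -> list[bytes]:
    """Group records into chunks of at most `per_file` records each.

    Hypothetical helper modelling `split -t SEP -l N`; not taken from split.c.
    Each record keeps its trailing separator. A final piece that lacks a
    separator is still counted as one record.
    """
    if len(sep) != 1:
        raise ValueError("separator must be exactly one byte")
    if per_file < 1:
        raise ValueError("per_file must be at least 1")

    pieces = data.split(sep)
    # bytes.split leaves one trailing piece: empty if data ended with sep,
    # otherwise the unterminated last record.
    tail = pieces.pop()
    records = [piece + sep for piece in pieces]
    if tail:
        records.append(tail)

    # Concatenate consecutive records, `per_file` at a time.
    return [
        b"".join(records[i:i + per_file])
        for i in range(0, len(records), per_file)
    ]


if __name__ == "__main__":
    # NUL-separated input, two records per output chunk.
    for chunk in split_records(b"a\0b\0c\0", sep=b"\0", per_file=2):
        print(chunk)
```

The sketch holds the whole input in memory for brevity. A streaming tool cannot assume that, which is one reason to treat this as a reading aid only.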