Diffstat (limited to 'doc')
-rw-r--r-- doc/textutils.texi 130
1 files changed, 66 insertions, 64 deletions
diff --git a/doc/textutils.texi b/doc/textutils.texi
index 2deeef748..968c0312e 100644
--- a/doc/textutils.texi
+++ b/doc/textutils.texi
@@ -1,7 +1,7 @@
\input texinfo
@c %**start of header
@setfilename textutils.info
-@settitle GNU text utilities
+@settitle @sc{gnu} text utilities
@c %**end of header
@include version.texi
@@ -20,7 +20,7 @@
@ifinfo
@format
START-INFO-DIR-ENTRY
-* Text utilities: (textutils). GNU text utilities.
+* Text utilities: (textutils). GNU text utilities.
* cat: (textutils)cat invocation. Concatenate and write files.
* cksum: (textutils)cksum invocation. Print @sc{posix} CRC checksum.
* comm: (textutils)comm invocation. Compare sorted files by line.
@@ -79,7 +79,7 @@ by the Foundation.
@end ifinfo
@titlepage
-@title GNU @code{textutils}
+@title @sc{gnu} @code{textutils}
@subtitle A set of text utilities
@subtitle for version @value{VERSION}, @value{UPDATED}
@author David MacKenzie et al.
@@ -114,7 +114,7 @@ by the Foundation.
@cindex text utilities
@cindex utilities for text handling
-This manual documents version @value{VERSION} of the GNU text utilities.
+This manual documents version @value{VERSION} of the @sc{gnu} text utilities.
@menu
* Introduction:: Caveats, overview, and authors.
@@ -217,11 +217,12 @@ Opening the software toolbox
This manual is incomplete: No attempt is made to explain basic concepts
in a way suitable for novices. Thus, if you are interested, please get
-involved in improving this manual. The entire GNU community will
+involved in improving this manual. The entire @sc{gnu} community will
benefit.
@cindex POSIX.2
-The GNU text utilities are mostly compatible with the @sc{posix.2} standard.
+The @sc{gnu} text utilities are mostly compatible with the @sc{posix.2}
+standard.
@c This paragraph appears in all of fileutils.texi, textutils.texi, and
@c sh-utils.texi too -- so be sure to keep them consistent.
@@ -251,7 +252,7 @@ overall process.
Certain options are available in all of these programs. Rather than
writing identical descriptions for each of the programs, they are
-described here. (In fact, every GNU program accepts (or should accept)
+described here. (In fact, every @sc{gnu} program accepts (or should accept)
these options.)
Some of these programs recognize the @samp{--help} and @samp{--version}
@@ -763,7 +764,8 @@ is not given at all, the default is 16.
@end table
The next several options map the old, pre-@sc{posix} format specification
-options to the corresponding @sc{posix} format specs. GNU @code{od} accepts
+options to the corresponding @sc{posix} format specs.
+@sc{gnu} @code{od} accepts
any combination of old- and new-style options. Format specification
options accumulate.
@@ -1445,13 +1447,13 @@ one-line header consisting of
before the output for each @var{file}.
@cindex BSD @code{tail}
-GNU @code{tail} can output any amount of data (some other versions of
+@sc{gnu} @code{tail} can output any amount of data (some other versions of
@code{tail} cannot). It also has no @samp{-r} option (print in
reverse), since reversing a file is really a different job from printing
the end of a file; BSD @code{tail} (which is the one with @code{-r}) can
only reverse files that are at most as large as its buffer, which is
typically 32k. A more reliable and versatile way to reverse files is
-the GNU @code{tac} command.
+the @sc{gnu} @code{tac} command.
@code{tail} accepts two option formats: the new one, in which numbers
are arguments to the options (@samp{-n 1}), and the old one, in which
@@ -1901,7 +1903,7 @@ is given, file names are also printed (by default). (With the
@samp{--sysv} option, corresponding file names are printed when there is
at least one file argument.)
-By default, GNU @code{sum} computes checksums using an algorithm
+By default, @sc{gnu} @code{sum} computes checksums using an algorithm
compatible with BSD @code{sum} and prints file sizes in units of
1024-byte blocks.
@@ -2133,9 +2135,9 @@ disables this last-resort comparison so that lines in which all fields
compare equal are left in their original relative order. If no fields
or global options are specified, @samp{-s} has no effect.
-GNU @code{sort} (as specified for all GNU utilities) has no limits on
+@sc{gnu} @code{sort} (as specified for all @sc{gnu} utilities) has no limits on
input line length or restrictions on bytes allowed within lines. In
-addition, if the final byte of an input file is not a newline, GNU
+addition, if the final byte of an input file is not a newline, @sc{gnu}
@code{sort} silently supplies one. A line's trailing newline is not
part of the line for comparison purposes.@footnote{@sc{posix}.2-1992
requires that the trailing newline be part of the comparison, and some
@@ -2333,13 +2335,13 @@ and character positions are numbered starting with 0. See below.
@end table
-In addition, when GNU @code{sort} is invoked with exactly one argument,
+In addition, when @sc{gnu} @code{sort} is invoked with exactly one argument,
options @samp{--help} and @samp{--version} are recognized. @xref{Common
options}.
Historical (BSD and System V) implementations of @code{sort} have
differed in their interpretation of some options, particularly
-@samp{-b}, @samp{-f}, and @samp{-n}. GNU sort follows the @sc{posix}
+@samp{-b}, @samp{-f}, and @samp{-n}. @sc{gnu} sort follows the @sc{posix}
behavior, which is usually (but not always!) like the System V behavior.
According to @sc{posix}, @samp{-n} no longer implies @samp{-b}. For
consistency, @samp{-M} has been changed in the same way. This may
@@ -2538,7 +2540,7 @@ Print only duplicate lines.
Print all duplicate lines and only duplicate lines.
This option is useful mainly in conjunction with other options e.g.,
to ignore case or to compare only selected fields.
-This is a GNU extension.
+This is a @sc{gnu} extension.
@c FIXME: give an example showing *how* it's useful
@item -u
@@ -2667,15 +2669,15 @@ ptx -G [@var{option} @dots{}] [@var{input} [@var{output}]]
@end example
The @samp{-G} (or its equivalent: @samp{--traditional}) option disables
-all GNU extensions and reverts to traditional mode, thus introducing some
+all @sc{gnu} extensions and reverts to traditional mode, thus introducing some
limitations and changing several of the program's default option values.
-When @samp{-G} is not specified, GNU extensions are always enabled. GNU
-extensions to @code{ptx} are documented wherever appropriate in this
+When @samp{-G} is not specified, @sc{gnu} extensions are always enabled.
+@sc{gnu} extensions to @code{ptx} are documented wherever appropriate in this
document. For the full list, see @xref{Compatibility in ptx}.
Individual options are explained in the following sections.
-When GNU extensions are enabled, there may be zero, one or several
+When @sc{gnu} extensions are enabled, there may be zero, one or several
@var{file}s after the options. If there is no @var{file}, the program
reads the standard input. If there is one or several @var{file}s, they
give the name of input files which are all read in turn, as if all the
@@ -2685,7 +2687,7 @@ file names and line numbers refer to individual text input files. In
all cases, the program outputs the permuted index to the standard
output.
-When GNU extensions are @emph{not} enabled, that is, when the program
+When @sc{gnu} extensions are @emph{not} enabled, that is, when the program
operates in traditional mode, there may be zero, one or two parameters
besides the options. If there are no parameters, the program reads the
standard input and outputs the permuted index to the standard output.
@@ -2695,7 +2697,7 @@ respectively the name of the @var{input} file to read and the name of
the @var{output} file to produce. @emph{Be very careful} to note that,
in this case, the contents of file given by the second parameter is
destroyed. This behavior is dictated by System V @code{ptx}
-compatibility; GNU Standards normally discourage output parameters not
+compatibility; @sc{gnu} Standards normally discourage output parameters not
introduced by an option.
Note that for @emph{any} file named as the value of an option or as an
@@ -2724,7 +2726,7 @@ exit without further processing.
@item -G
@itemx --traditional
-As already explained, this option disables all GNU extensions to
+As already explained, this option disables all @sc{gnu} extensions to
@code{ptx} and switches to traditional mode.
@item --help
@@ -2745,7 +2747,7 @@ processing.
As it is set up now, the program assumes that the input file is coded
using 8-bit ISO 8859-1 code, also known as Latin-1 character set,
@emph{unless} it is compiled for MS-DOS, in which case it uses the
-character set of the IBM-PC. (GNU @code{ptx} is not known to work on
+character set of the IBM-PC. (@sc{gnu} @code{ptx} is not known to work on
smaller MS-DOS machines anymore.) Compared to 7-bit @sc{ascii}, the set
of characters which are letters is different; this alters the behavior
of regular expression matching. Thus, the default regular expression
@@ -2778,9 +2780,9 @@ is not part of the Break file is a word constituent. If both options
@samp{-b} and @samp{-W} are specified, then @samp{-W} has precedence and
@samp{-b} is ignored.
-When GNU extensions are enabled, the only way to avoid newline as a
+When @sc{gnu} extensions are enabled, the only way to avoid newline as a
break character is to write all the break characters in the file with no
-newline at all, not even at the end of the file. When GNU extensions
+newline at all, not even at the end of the file. When @sc{gnu} extensions
are disabled, spaces, tabs and newlines are always considered as break
characters even if not included in the Break file.
@@ -2823,7 +2825,7 @@ Using this option changes the default value for option @samp{-S}.
Using this option, the program does not try very hard to remove
references from contexts in output, but it succeeds in doing so
@emph{when} the context ends exactly at the newline. If option
-@samp{-r} is used with @samp{-S} default value, or when GNU extensions
+@samp{-r} is used with @samp{-S} default value, or when @sc{gnu} extensions
are disabled, this condition is always met and references are completely
excluded from the output contexts.
@@ -2834,15 +2836,15 @@ This option selects which regular expression will describe the end of a
line or the end of a sentence. In fact, this regular expression is not
the only distinction between end of lines or end of sentences, and input
line boundaries have no special significance outside this option. By
-default, when GNU extensions are enabled and if @samp{-r} option is not
+default, when @sc{gnu} extensions are enabled and if @samp{-r} option is not
used, end of sentences are used. In this case, this @var{regex} is
-imported from GNU Emacs:
+imported from @sc{gnu} Emacs:
@example
[.?!][]\"')@}]*\\($\\|\t\\| \\)[ \t\n]*
@end example
-Whenever GNU extensions are disabled or if @samp{-r} option is used, end
+Whenever @sc{gnu} extensions are disabled or if @samp{-r} option is used, end
of lines are used; in this case, the default @var{regexp} is just:
@example
@@ -2874,8 +2876,8 @@ corresponding characters by @code{ptx} itself.
@itemx --word-regexp=@var{regexp}
This option selects which regular expression will describe each keyword.
-By default, if GNU extensions are enabled, a word is a sequence of
-letters; the @var{regexp} used is @samp{\w+}. When GNU extensions are
+By default, if @sc{gnu} extensions are enabled, a word is a sequence of
+letters; the @var{regexp} used is @samp{\w+}. When @sc{gnu} extensions are
disabled, a word is by default anything which ends with a space, a tab
or a newline; the @var{regexp} used is @samp{[^ \t\n]+}.
@@ -2895,14 +2897,14 @@ the corresponding characters by @code{ptx} itself.
Output format is mainly controlled by the @samp{-O} and @samp{-T} options
described in the table below. When neither @samp{-O} nor @samp{-T} are
-selected, and if GNU extensions are enabled, the program chooses an
+selected, and if @sc{gnu} extensions are enabled, the program chooses an
output format suitable for a dumb terminal. Each keyword occurrence is
output to the center of one line, surrounded by its left and right
contexts. Each field is properly justified, so the concordance output
can be readily observed. As a special feature, if automatic
references are selected by option @samp{-A} and are output before the
left context, that is, if option @samp{-R} is @emph{not} selected, then
-a colon is added after the reference; this nicely interfaces with GNU
+a colon is added after the reference; this nicely interfaces with @sc{gnu}
Emacs @code{next-error} processing. In this default output format, each
white space character, like newline and tab, is merely changed to
exactly one space, with no special attempt to compress consecutive
@@ -2955,7 +2957,7 @@ context. For any other output format, option @samp{-R} is
ignored, with one exception: with @samp{-R} the width of references
is @emph{not} taken into account in total output width given by @samp{-w}.
-This option is automatically selected whenever GNU extensions are
+This option is automatically selected whenever @sc{gnu} extensions are
disabled.
@item -F @var{string}
@@ -2997,7 +2999,7 @@ processing. Each output line will look like:
@end smallexample
so it will be possible to write a @samp{.xx} roff macro to take care of
-the output typesetting. This is the default output format when GNU
+the output typesetting. This is the default output format when @sc{gnu}
extensions are disabled. Option @samp{-M} can be used to change
@samp{xx} to another macro name.
@@ -3042,13 +3044,13 @@ processing for @TeX{}.
@node Compatibility in ptx
-@subsection The GNU extensions to @code{ptx}
+@subsection The @sc{gnu} extensions to @code{ptx}
This version of @code{ptx} contains a few features which do not exist in
System V @code{ptx}. These extra features are suppressed by using the
@samp{-G} command line option, unless overridden by other command line
-options. Some GNU extensions cannot be recovered by overriding, so the
-simple rule is to avoid @samp{-G} if you care about GNU extensions.
+options. Some @sc{gnu} extensions cannot be recovered by overriding, so the
+simple rule is to avoid @samp{-G} if you care about @sc{gnu} extensions.
Here are the differences between this program and System V @code{ptx}.
@itemize @bullet
@@ -3061,8 +3063,8 @@ or, if a second @var{file} parameter is given on the command, to that
@var{file}.
Having output parameters not introduced by options is a dangerous
-practice which GNU avoids as far as possible. So, for using @code{ptx}
-portably between GNU and System V, you should always use it with a
+practice which @sc{gnu} avoids as far as possible. So, for using @code{ptx}
+portably between @sc{gnu} and System V, you should always use it with a
single input file, and always expect the result on standard output. You
might also want to automatically configure in a @samp{-G} option to
@code{ptx} calls in products using @code{ptx}, if the configurator finds
@@ -3071,9 +3073,9 @@ that the installed @code{ptx} accepts @samp{-G}.
@item
The only options available in System V @code{ptx} are options @samp{-b},
@samp{-f}, @samp{-g}, @samp{-i}, @samp{-o}, @samp{-r}, @samp{-t} and
-@samp{-w}. All other options are GNU extensions and are not repeated in
+@samp{-w}. All other options are @sc{gnu} extensions and are not repeated in
this enumeration. Moreover, some options have a slightly different
-meaning when GNU extensions are enabled, as explained below.
+meaning when @sc{gnu} extensions are enabled, as explained below.
@item
By default, concordance output is not formatted for @code{troff} or
@@ -3082,29 +3084,29 @@ or @code{nroff} output may still be selected through option @samp{-O}.
@item
Unless @samp{-R} option is used, the maximum reference width is
-subtracted from the total output line width. With GNU extensions
+subtracted from the total output line width. With @sc{gnu} extensions
disabled, width of references is not taken into account in the output
line width computations.
@item
All 256 characters, even @kbd{NUL}s, are always read and processed from
-input file with no adverse effect, even if GNU extensions are disabled.
+input file with no adverse effect, even if @sc{gnu} extensions are disabled.
However, System V @code{ptx} does not accept 8-bit characters, a few
control characters are rejected, and the tilde @kbd{~} is also rejected.
@item
-Input line length is only limited by available memory, even if GNU
+Input line length is only limited by available memory, even if @sc{gnu}
extensions are disabled. However, System V @code{ptx} processes only
the first 200 characters in each line.
@item
The break (non-word) characters default to be every character except all
-letters of the underlying character set, diacriticized or not. When GNU
+letters of the underlying character set, diacriticized or not. When @sc{gnu}
extensions are disabled, the break characters default to space, tab and
newline only.
@item
-The program makes better use of output line width. If GNU extensions
+The program makes better use of output line width. If @sc{gnu} extensions
are disabled, the program rather tries to imitate System V @code{ptx},
but still, there are some slight disposition glitches this program does
not completely reproduce.
@@ -3339,7 +3341,7 @@ Print a line for each unpairable line in file @var{file-number}
@end table
-In addition, when GNU @code{join} is invoked with exactly one argument,
+In addition, when @sc{gnu} @code{join} is invoked with exactly one argument,
options @samp{--help} and @samp{--version} are recognized. @xref{Common
options}.
@@ -3447,7 +3449,7 @@ from @var{m} through @var{n}, in ascending order. @var{m} should
collate before @var{n}; if it doesn't, an error results. As an example,
@samp{0-9} is the same as @samp{0123456789}.
-GNU @code{tr} does not support the System V syntax that uses square
+@sc{gnu} @code{tr} does not support the System V syntax that uses square
brackets to enclose ranges. Translations specified in that format
sometimes work as expected, since the brackets are often transliterated
to themselves. However, they should be avoided because they sometimes
@@ -3535,7 +3537,7 @@ The syntax @samp{[=@var{c}=]} expands to all of the characters that are
equivalent to @var{c}, in no particular order. Equivalence classes are
a relatively recent invention intended to support non-English alphabets.
But there seems to be no standard way to define them or determine their
-contents. Therefore, they are not fully implemented in GNU @code{tr};
+contents. Therefore, they are not fully implemented in @sc{gnu} @code{tr};
each character's equivalence class consists only of that character,
which is of no particular use.
@@ -3583,8 +3585,8 @@ BSD @code{tr} pads @var{set2} to the length of @var{set1} by repeating
the last character of @var{set2} as many times as necessary. System V
@code{tr} truncates @var{set1} to the length of @var{set2}.
-By default, GNU @code{tr} handles this case like BSD @code{tr}. When
-the @samp{--truncate-set1} (@samp{-t}) option is given, GNU @code{tr}
+By default, @sc{gnu} @code{tr} handles this case like BSD @code{tr}. When
+the @samp{--truncate-set1} (@samp{-t}) option is given, @sc{gnu} @code{tr}
handles this case like the System V @code{tr} instead. This option is
ignored for operations other than translation.
@@ -3723,7 +3725,7 @@ following warning and error messages, for strict compliance with
@item
When the @samp{--delete} option is given but @samp{--squeeze-repeats}
-is not, and @var{set2} is given, GNU @code{tr} by default prints
+is not, and @var{set2} is given, @sc{gnu} @code{tr} by default prints
a usage message and exits, because @var{set2} would not be used.
The @sc{posix} specification says that @var{set2} must be ignored in
this case. Silently ignoring arguments is a bad idea.
@@ -3735,9 +3737,9 @@ value 400 octal does not fit into a single byte.
@end enumerate
-GNU @code{tr} does not provide complete BSD or System V compatibility.
+@sc{gnu} @code{tr} does not provide complete BSD or System V compatibility.
For example, it is impossible to disable interpretation of the @sc{posix}
-constructs @samp{[:alpha:]}, @samp{[=c=]}, and @samp{[c*10]}. Also, GNU
+constructs @samp{[:alpha:]}, @samp{[=c=]}, and @samp{[c*10]}. Also, @sc{gnu}
@code{tr} does not delete zero bytes automatically, unlike traditional
Unix versions, which provide no way to preserve zero bytes.
@@ -3862,13 +3864,13 @@ Robbins.
@node Toolbox introduction
@unnumberedsec Toolbox introduction
-This month's column is only peripherally related to the GNU Project, in
-that it describes a number of the GNU tools on your Linux system and how they
-might be used. What it's really about is the ``Software Tools'' philosophy
+This month's column is only peripherally related to the @sc{gnu} Project, in
+that it describes a number of the @sc{gnu} tools on your Linux system and how
+they might be used. What it's really about is the ``Software Tools'' philosophy
of program development and usage.
The software tools philosophy was an important and integral concept
-in the initial design and development of Unix (of which Linux and GNU are
+in the initial design and development of Unix (of which Linux and @sc{gnu} are
essentially clones). Unfortunately, in the modern day press of
Internetworking and flashy GUIs, it seems to have fallen by the
wayside. This is a shame, since it provides a powerful mental model
@@ -4361,8 +4363,8 @@ appropriate tool, build one.
As of this writing, all the programs we've discussed are available via
anonymous @code{ftp} from @code{prep.ai.mit.edu} as
@file{/pub/gnu/textutils-1.9.tar.gz}.@footnote{Version 1.9 was current
-when this column was written. Check the nearest GNU archive for the
-current version. The main GNU FTP site is now @code{ftp.gnu.org}.}
+when this column was written. Check the nearest @sc{gnu} archive for the
+current version. The main @sc{gnu} FTP site is now @code{ftp.gnu.org}.}
None of what I have presented in this column is new. The Software Tools
philosophy was first introduced in the book @cite{Software Tools},
@@ -4388,8 +4390,8 @@ whose members had ported the original @code{ratfor} programs to essentially
every computer system with a FORTRAN compiler. The popularity of the
group waned in the middle '80s as Unix began to spread beyond universities.
-With the current proliferation of GNU code and other clones of Unix programs,
-these programs now receive little attention; modern C versions are
+With the current proliferation of @sc{gnu} code and other clones of Unix
+programs, these programs now receive little attention; modern C versions are
much more efficient and do more than these programs do. Nevertheless, as
exposition of good programming style, and evangelism for a still-valuable
philosophy, these books are unparalleled, and I recommend them highly.