author | Jim Meyering <jim@meyering.net> | 2001-06-16 11:06:49 +0000 |
---|---|---|
committer | Jim Meyering <jim@meyering.net> | 2001-06-16 11:06:49 +0000 |
commit | 0dfd4b77828e83bebbe720bcf07b76fe2515b1db (patch) | |
tree | 64ecedbdedf480946000473da101346e93c1b4a2 /doc | |
parent | 8d4c961d53a57976d17c524e58a6f2cff0f0d983 (diff) | |
renamed from omni-utils.texi
Diffstat (limited to 'doc')
-rw-r--r-- | doc/coreutils.texi | 11308 |
1 files changed, 11308 insertions, 0 deletions
diff --git a/doc/coreutils.texi b/doc/coreutils.texi new file mode 100644 index 000000000..ec95f354d --- /dev/null +++ b/doc/coreutils.texi @@ -0,0 +1,11308 @@ +\input texinfo +@c %**start of header +@setfilename omni-utils.info +@settitle @sc{gnu} Omni-utils + +@c %**end of header + +@include version.texi +@include constants.texi + +@c Define new indices. +@defcodeindex op +@defcodeindex fl + +@c Put everything in one index (arbitrarily chosen to be the concept index). +@syncodeindex fl cp +@syncodeindex fn cp +@syncodeindex ky cp +@syncodeindex op cp +@syncodeindex pg cp +@syncodeindex vr cp + +@ifinfo +@format +START-INFO-DIR-ENTRY +* @sc{gnu} Utilities: (omni-utils). @sc{gnu} Utilities. +* Common options: Common options. +* File permissions: Access modes. +* Date input formats: Specifying date strings. +* Opening the software toolbox: The software tools philosophy. +* GNU Free Documentation License: The license for this documentation. + +* basename: (omni-utils)basename invocation. Strip directory and suffix. +* cat: (omni-utils)cat invocation. Concatenate and write files. +* chgrp: (omni-utils)chgrp invocation. Change file groups. +* chmod: (omni-utils)chmod invocation. Change file permissions. +* chown: (omni-utils)chown invocation. Change file owners/groups. +* chroot: (omni-utils)chroot invocation. Specify the root directory. +* cksum: (omni-utils)cksum invocation. Print @sc{posix} CRC checksum. +* comm: (omni-utils)comm invocation. Compare sorted files by line. +* cp: (omni-utils)cp invocation. Copy files. +* csplit: (omni-utils)csplit invocation. Split by context. +* cut: (omni-utils)cut invocation. Print selected parts of lines. +* date: (omni-utils)date invocation. Print/set system date and time. +* dd: (omni-utils)dd invocation. Copy and convert a file. +* df: (omni-utils)df invocation. Report filesystem disk usage. +* dir: (omni-utils)dir invocation. List directories briefly. +* dircolors: (omni-utils)dircolors invocation. Color setup for ls. 
+* dirname: (omni-utils)dirname invocation. Strip non-directory suffix. +* du: (omni-utils)du invocation. Report on disk usage. +* echo: (omni-utils)echo invocation. Print a line of text. +* env: (omni-utils)env invocation. Modify the environment. +* expand: (omni-utils)expand invocation. Convert tabs to spaces. +* expr: (omni-utils)expr invocation. Evaluate expressions. +* factor: (omni-utils)factor invocation. Print prime factors +* false: (omni-utils)false invocation. Do nothing, unsuccessfully. +* fmt: (omni-utils)fmt invocation. Reformat paragraph text. +* fold: (omni-utils)fold invocation. Wrap long input lines. +* groups: (omni-utils)groups invocation. Print group names a user is in. +* head: (omni-utils)head invocation. Output the first part of files. +* hostid: (omni-utils)hostid invocation. Print numeric host identifier. +* hostname: (omni-utils)hostname invocation. Print or set system name. +* id: (omni-utils)id invocation. Print real/effective uid/gid. +* install: (omni-utils)install invocation. Copy and change attributes. +* join: (omni-utils)join invocation. Join lines on a common field. +* ln: (omni-utils)ln invocation. Make links between files. +* logname: (omni-utils)logname invocation. Print current login name. +* ls: (omni-utils)ls invocation. List directory contents. +* md5sum: (omni-utils)md5sum invocation. Print or check message-digests. +* mkdir: (omni-utils)mkdir invocation. Create directories. +* mkfifo: (omni-utils)mkfifo invocation. Create FIFOs (named pipes). +* mknod: (omni-utils)mknod invocation. Create special files. +* mv: (omni-utils)mv invocation. Rename files. +* nice: (omni-utils)nice invocation. Modify scheduling priority. +* nl: (omni-utils)nl invocation. Number lines and write files. +* nohup: (omni-utils)nohup invocation. Immunize to hangups. +* od: (omni-utils)od invocation. Dump files in octal, etc. +* paste: (omni-utils)paste invocation. Merge lines of files. +* pathchk: (omni-utils)pathchk invocation. 
Check file name portability. +* pr: (omni-utils)pr invocation. Paginate or columnate files. +* printenv: (omni-utils)printenv invocation. Print environment variables. +* printf: (omni-utils)printf invocation. Format and print data. +* ptx: (omni-utils)ptx invocation. Produce permuted indexes. +* pwd: (omni-utils)pwd invocation. Print working directory. +* rm: (omni-utils)rm invocation. Remove files. +* rmdir: (omni-utils)rmdir invocation. Remove empty directories. +* seq: (omni-utils)seq invocation. Print numeric sequences +* shred: (omni-utils)shred invocation. Remove files more securely. +* sleep: (omni-utils)sleep invocation. Delay for a specified time. +* sort: (omni-utils)sort invocation. Sort text files. +* split: (omni-utils)split invocation. Split into fixed-size pieces. +* stty: (omni-utils)stty invocation. Print/change terminal settings. +* su: (omni-utils)su invocation. Modify user and group id. +* sum: (omni-utils)sum invocation. Print traditional checksum. +* sync: (omni-utils)sync invocation. Synchronize memory and disk. +* tac: (omni-utils)tac invocation. Reverse files. +* tail: (omni-utils)tail invocation. Output the last part of files. +* tee: (omni-utils)tee invocation. Redirect to multiple files. +* test: (omni-utils)test invocation. File/string tests. +* touch: (omni-utils)touch invocation. Change file timestamps. +* tr: (omni-utils)tr invocation. Translate characters. +* true: (omni-utils)true invocation. Do nothing, successfully. +* tsort: (omni-utils)tsort invocation. Topological sort. +* tty: (omni-utils)tty invocation. Print terminal name. +* uname: (omni-utils)uname invocation. Print system information. +* unexpand: (omni-utils)unexpand invocation. Convert spaces to tabs. +* uniq: (omni-utils)uniq invocation. Uniquify files. +* users: (omni-utils)users invocation. Print current user names. +* vdir: (omni-utils)vdir invocation. List directories verbosely. +* wc: (omni-utils)wc invocation. Byte, word, and line counts. 
+* who: (omni-utils)who invocation. Print who is logged in. +* whoami: (omni-utils)whoami invocation. Print effective user id. +* yes: (omni-utils)yes invocation. Print a string indefinitely. +END-INFO-DIR-ENTRY +@end format +@end ifinfo + +@ifinfo +This file documents the GNU command line utilities. + +Copyright (C) 1994, 95, 96, 2001 Free Software Foundation, Inc. + +Permission is granted to copy, distribute and/or modify this document +under the terms of the GNU Free Documentation License, Version 1.1 or +any later version published by the Free Software Foundation; with no +Invariant Sections, with no Front-Cover Texts, and with no Back-Cover +Texts. A copy of the license is included in the section entitled ``GNU +Free Documentation License''. + +@end ifinfo + +@titlepage +@title @sc{gnu} @code{Omni-utils} +@subtitle A set of command line utilities +@subtitle for version @value{VERSION}, @value{UPDATED} +@author David MacKenzie et al. + +@page +@vskip 0pt plus 1filll +Copyright @copyright{} 1994, 95, 96, 2000 Free Software Foundation, Inc. + +Permission is granted to copy, distribute and/or modify this document +under the terms of the GNU Free Documentation License, Version 1.1 or +any later version published by the Free Software Foundation; with no +Invariant Sections, with no Front-Cover Texts, and with no Back-Cover +Texts. A copy of the license is included in the section entitled ``GNU +Free Documentation License''. +@end titlepage + + +@c If your makeinfo doesn't grok this @ifnottex directive, then either +@c get a newer version of makeinfo or do s/ifnottex/ifinfo/ here and on +@c the matching @end directive below. +@ifnottex +@node Top +@top GNU Omni-utils + +@cindex text utilities +@cindex shell utilities +@cindex file utilities +This manual documents version @value{VERSION} of the @sc{gnu} command +line utilities. + +@menu +* Introduction:: Caveats, overview, and authors. +* Common options:: Common options. 
+* Output of entire files:: cat tac nl od +* Formatting file contents:: fmt pr fold +* Output of parts of files:: head tail split csplit +* Summarizing files:: wc sum cksum md5sum +* Operating on sorted files:: sort uniq comm ptx tsort +* Operating on fields within a line:: cut paste join +* Operating on characters:: tr expand unexpand +* Directory listing:: ls dir vdir d v dircolors +* Basic operations:: cp dd install mv rm shred +* Special file types:: ln mkdir rmdir mkfifo mknod +* Changing file attributes:: chgrp chmod chown touch +* Disk usage:: df du sync +* Printing text:: echo printf yes +* Conditions:: false true test expr +* Redirection:: tee +* File name manipulation:: dirname basename pathchk +* Working context:: pwd stty printenv tty +* User information:: id logname whoami groups users who +* System context:: date uname hostname +* Modified command invocation:: chroot env nice nohup su +* Delaying:: sleep +* Numeric operations:: factor seq +* File permissions:: Access modes. +* Date input formats:: Specifying date strings. +* Opening the software toolbox:: The software tools philosophy. +* GNU Free Documentation License:: The license for this documentation. +* Index:: General index. + +@detailmenu + --- The Detailed Node Listing --- + +Common Options + +* Backup options:: Backup options +* Block size:: Block size +* Target directory:: Target directory +* Trailing slashes:: Trailing slashes + +Output of entire files + +* cat invocation:: Concatenate and write files. +* tac invocation:: Concatenate and write files in reverse. +* nl invocation:: Number lines and write files. +* od invocation:: Write files in octal or other formats. + +Formatting file contents + +* fmt invocation:: Reformat paragraph text. +* pr invocation:: Paginate or columnate files for printing. +* fold invocation:: Wrap input lines to fit in specified width. + +Output of parts of files + +* head invocation:: Output the first part of files. 
+* tail invocation:: Output the last part of files. +* split invocation:: Split a file into fixed-size pieces. +* csplit invocation:: Split a file into context-determined pieces. + +Summarizing files + +* wc invocation:: Print byte, word, and line counts. +* sum invocation:: Print checksum and block counts. +* cksum invocation:: Print CRC checksum and byte counts. +* md5sum invocation:: Print or check message-digests. + +Operating on sorted files + +* sort invocation:: Sort text files. +* uniq invocation:: Uniquify files. +* comm invocation:: Compare two sorted files line by line. +* ptx invocation:: Produce a permuted index of file contents. +* tsort invocation:: Topological sort. + +@code{ptx}: Produce permuted indexes + +* General options in ptx:: Options which affect general program behavior. +* Charset selection in ptx:: Underlying character set considerations. +* Input processing in ptx:: Input fields, contexts, and keyword selection. +* Output formatting in ptx:: Types of output format, and sizing the fields. +* Compatibility in ptx:: The GNU extensions to @code{ptx} + +Operating on fields within a line + +* cut invocation:: Print selected parts of lines. +* paste invocation:: Merge lines of files. +* join invocation:: Join lines on a common field. + +Operating on characters + +* tr invocation:: Translate, squeeze, and/or delete characters. +* expand invocation:: Convert tabs to spaces. +* unexpand invocation:: Convert spaces to tabs. + +@code{tr}: Translate, squeeze, and/or delete characters + +* Character sets:: Specifying sets of characters. +* Translating:: Changing one characters to another. +* Squeezing:: Squeezing repeats and deleting. +* Warnings in tr:: Warning messages. 
+ +Directory listing + +* ls invocation:: List directory contents +* dir invocation:: Briefly list directory contents +* vdir invocation:: Verbosely list directory contents +* dircolors invocation:: Color setup for @code{ls} + +@code{ls}: List directory contents + +* Which files are listed:: Which files are listed +* What information is listed:: What information is listed +* Sorting the output:: Sorting the output +* More details about version sort:: More details about version sort +* General output formatting:: General output formatting +* Formatting the file names:: Formatting the file names + +Basic operations + +* cp invocation:: Copy files and directories +* dd invocation:: Convert and copy a file +* install invocation:: Copy files and set attributes +* mv invocation:: Move (rename) files +* rm invocation:: Remove files or directories +* shred invocation:: Remove files more securely + +Special file types + +* ln invocation:: Make links between files +* mkdir invocation:: Make directories +* mkfifo invocation:: Make FIFOs (named pipes) +* mknod invocation:: Make block or character special files +* rmdir invocation:: Remove empty directories + +Changing file attributes + +* chown invocation:: Change file owner and group +* chgrp invocation:: Change group ownership +* chmod invocation:: Change access permissions +* touch invocation:: Change file timestamps + +Disk usage + +* df invocation:: Report filesystem disk space usage +* du invocation:: Estimate file space usage +* sync invocation:: Synchronize data on disk with memory + +Printing text + +* echo invocation:: Print a line of text +* printf invocation:: Format and print data +* yes invocation:: Print a string until interrupted + +Conditions + +* false invocation:: Do nothing, unsuccessfully +* true invocation:: Do nothing, successfully +* test invocation:: Check file types and compare values +* expr invocation:: Evaluate expressions + +@code{test}: Check file types and compare values + +* File type tests:: 
File type tests +* Access permission tests:: Access permission tests +* File characteristic tests:: File characteristic tests +* String tests:: String tests +* Numeric tests:: Numeric tests + +@code{expr}: Evaluate expression + +* String expressions:: <colon> match substr index length quote +* Numeric expressions:: + - * / % +* Relations for expr:: | & < <= = == != >= > +* Examples of expr:: Examples of using @code{expr} + +Redirection + +* tee invocation:: Redirect output to multiple files + +File name manipulation + +* basename invocation:: Strip directory and suffix from a file name +* dirname invocation:: Strip non-directory suffix from a file name +* pathchk invocation:: Check file name portability + +Working context + +* pwd invocation:: Print working directory +* stty invocation:: Print or change terminal characteristics +* printenv invocation:: Print all or some environment variables +* tty invocation:: Print file name of terminal on standard input + +@code{stty}: Print or change terminal characteristics + +* Control:: Control settings +* Input:: Input settings +* Output:: Output settings +* Local:: Local settings +* Combination:: Combination settings +* Characters:: Special characters +* Special:: Special settings + +User information + +* id invocation:: Print real and effective uid and gid +* logname invocation:: Print current login name +* whoami invocation:: Print effective user id +* groups invocation:: Print group names a user is in +* users invocation:: Print login names of users currently logged in +* who invocation:: Print who is currently logged in + +System context + +* date invocation:: Print or set system date and time +* uname invocation:: Print system information +* hostname invocation:: Print or set system name +* hostid invocation:: Print numeric host identifier. 
+ +@code{date}: Print or set system date and time + +* Time directives:: Time directives +* Date directives:: Date directives +* Literal directives:: Literal directives +* Padding:: Padding +* Setting the time:: Setting the time +* Options for date:: Options for @code{date} +* Examples of date:: Examples of @code{date} + +Modified command invocation + +* chroot invocation:: Run a command with a different root directory +* env invocation:: Run a command in a modified environment +* nice invocation:: Run a command with modified scheduling priority +* nohup invocation:: Run a command immune to hangups +* su invocation:: Run a command with substitute user and group id + +Delaying + +* sleep invocation:: Delay for a specified time + +Numeric operations + +* factor invocation:: Print prime factors +* seq invocation:: Print numeric sequences + +File permissions + +* Mode Structure:: Structure of File Permissions +* Symbolic Modes:: Mnemonic permissions representation +* Numeric Modes:: Permissions as octal numbers + +Date input formats + +* General date syntax: General date syntax +* Calendar date items: Calendar date items +* Time of day items: Time of day items +* Time zone items: Time zone items +* Day of week items: Day of week items +* Relative items in date strings: Relative items in date strings +* Pure numbers in date strings: Pure numbers in date strings +* Authors of getdate: Authors of getdate + +Opening the software toolbox + +* Toolbox introduction:: Toolbox introduction +* I/O redirection:: I/O redirection +* The who command:: The @code{who} command +* The cut command:: The @code{cut} command +* The sort command:: The @code{sort} command +* The uniq command:: The @code{uniq} command +* Putting the tools together:: Putting the tools together + +GNU Free Documentation License + +* How to use this License for your documents:: + +@end detailmenu +@end menu + +@end ifnottex + + +@node Introduction +@chapter Introduction + +This manual is incomplete: No attempt is 
made to explain basic concepts +in a way suitable for novices. Thus, if you are interested, please get +involved in improving this manual. The entire @sc{gnu} community will +benefit. + +@cindex @sc{posix.2} +The @sc{gnu} utilities documented here are mostly compatible with the +@sc{posix.2} standard. +@cindex bugs, reporting +Please report bugs to @email{bug-omni-utils@@gnu.org}. Remember +to include the version number, machine architecture, input files, and +any other information needed to reproduce the bug: your input, what you +expected, what you got, and why it is wrong. Diffs are welcome, but +please include a description of the problem as well, since this is +sometimes difficult to infer. @xref{Bugs, , , gcc, Using and Porting GNU CC}. + +@cindex Berry, K. +@cindex Paterson, R. +@cindex Stallman, R. +@cindex Pinard, F. +@cindex MacKenzie, D. +@cindex Meyering, J. +@cindex Youmans, B. +This manual was originally derived from the Unix man pages in the +distributions, which were written by David MacKenzie and updated by Jim +Meyering. What you are reading now is the authoritative documentation +for these utilities; the man pages are no longer being maintained. The +original @code{fmt} man page was written by Ross Paterson. Fran@,{c}ois +Pinard did the initial conversion to Texinfo format. Karl Berry did the +indexing, some reorganization, and editing of the results. Brian +Youmans of the Free Software Foundation office staff combined the +manuals for textutils, fileutils, and sh-utils to produce the present +omnibus manual. Richard Stallman contributed his usual invaluable +insights to the overall process. + +@node Common options +@chapter Common options + +@cindex common options + +Certain options are available in all of these programs. Rather than +writing identical descriptions for each of the programs, they are +described here. (In fact, every @sc{gnu} program accepts (or should accept) +these options.) 
+ +@vindex POSIXLY_CORRECT +Normally options and operands can appear in any order, and programs act +as if all the options appear before any operands. For example, +@samp{sort -r passwd -t :} acts like @samp{sort -r -t : passwd}, since +@samp{:} is an option-argument of @option{-t}. However, if the +@env{POSIXLY_CORRECT} environment variable is set, options must appear +before operands, unless otherwise specified for a particular command. + +Some of these programs recognize the @samp{--help} and @samp{--version} +options only when one of them is the sole command line argument. + +@table @samp + +@item --help +@opindex --help +@cindex help, online +Print a usage message listing all available options, then exit successfully. + +@item --version +@opindex --version +@cindex version number, finding +Print the version number, then exit successfully. + +@item -- +@opindex -- +@cindex option delimiter +Delimit the option list. Later arguments, if any, are treated as +operands even if they begin with @samp{-}. For example, @samp{sort -- +-r} reads from the file named @file{-r}. + +@end table + +@cindex standard input +@cindex standard output +A single @samp{-} is not really an option, though it looks like one. It +stands for standard input, or for standard output if that is clear from +the context, and it can be used either as an operand or as an +option-argument. For example, @samp{sort -o - -} outputs to standard +output and reads from standard input, and is equivalent to plain +@samp{sort}. Unless otherwise specified, @samp{-} can appear in any +context that requires a file name. + +@menu +* Backup options:: -b -S -V, in some programs. +* Block size:: BLOCK_SIZE and --block-size, in some programs. +* Target directory:: --target-directory, in some programs. +* Trailing slashes:: --strip-trailing-slashes, in some programs. 
+@end menu + + +@node Backup options +@section Backup options + +@cindex backup options + +Some @sc{gnu} programs (at least @code{cp}, @code{install}, @code{ln}, and +@code{mv}) optionally make backups of files before writing new versions. +These options control the details of these backups. The options are also +briefly mentioned in the descriptions of the particular programs. + +@table @samp + +@item -b +@itemx @w{@kbd{--backup}[=@var{method}]} +@opindex -b +@opindex --backup +@vindex VERSION_CONTROL +@cindex backups, making +Make a backup of each file that would otherwise be overwritten or removed. +Without this option, the original versions are destroyed. +Use @var{method} to determine the type of backups to make. +When this option is used but @var{method} is not specified, +then the value of the @env{VERSION_CONTROL} +environment variable is used. And if @env{VERSION_CONTROL} is not set, +the default backup type is @samp{existing}. + +Note that the short form of this option, @samp{-b} does not accept any +argument. Using @samp{-b} is equivalent to using @samp{--backup=existing}. + +@vindex version-control @r{Emacs variable} +This option corresponds to the Emacs variable @samp{version-control}; +the values for @var{method} are the same as those used in Emacs. +This option also accepts more descriptive names. +The valid @var{method}s are (unique abbreviations are accepted): + +@table @samp +@item none +@itemx off +@opindex none @r{backup method} +Never make backups. + +@item numbered +@itemx t +@opindex numbered @r{backup method} +Always make numbered backups. + +@item existing +@itemx nil +@opindex existing @r{backup method} +Make numbered backups of files that already have them, simple backups +of the others. + +@item simple +@itemx never +@opindex simple @r{backup method} +Always make simple backups. Please note @samp{never} is not to be +confused with @samp{none}. 
+ +@end table + +@item -S @var{suffix} +@itemx --suffix=@var{suffix} +@opindex -S +@opindex --suffix +@cindex backup suffix +@vindex SIMPLE_BACKUP_SUFFIX +Append @var{suffix} to each backup file made with @samp{-b}. If this +option is not specified, the value of the @env{SIMPLE_BACKUP_SUFFIX} +environment variable is used. And if @env{SIMPLE_BACKUP_SUFFIX} is not +set, the default is @samp{~}, just as in Emacs. + +@itemx --version-control=@var{method} +@opindex --version-control +@c FIXME: remove this block one or two releases after the actual +@c removal from the code. +This option is obsolete and will be removed in a future release. +It has been replaced with @w{@kbd{--backup}}. + +@end table + +@node Block size +@section Block size + +@cindex block size + +Some @sc{gnu} programs (at least @code{df}, @code{du}, and @code{ls}) display +file sizes in ``blocks''. You can adjust the block size to make file +sizes easier to read. The block size used for display is independent of +any filesystem block size. + +Normally, disk usage sizes are rounded up, disk free space sizes are +rounded down, and other sizes are rounded to the nearest value with ties +rounding to an even value. + +@opindex --block-size=@var{size} +@vindex BLOCK_SIZE +@vindex DF_BLOCK_SIZE +@vindex DU_BLOCK_SIZE +@vindex LS_BLOCK_SIZE +@vindex POSIXLY_CORRECT@r{, and block size} + +The default block size is chosen by examining the following environment +variables in turn; the first one that is set determines the block size. + +@table @code + +@item DF_BLOCK_SIZE +This specifies the default block size for the @code{df} command. +Similarly, @env{DU_BLOCK_SIZE} specifies the default for @code{du} and +@env{LS_BLOCK_SIZE} for @code{ls}. + +@item BLOCK_SIZE +This specifies the default block size for all three commands, if the +above command-specific environment variables are not set. 
+ +@item POSIXLY_CORRECT +If neither the @env{@var{command}_BLOCK_SIZE} nor the @env{BLOCK_SIZE} +variables are set, but this variable is set, the block size defaults to 512. + +@end table + +If none of the above environment variables are set, the block size +currently defaults to 1024 bytes, but this number may change in the +future. + +@cindex human-readable output +@cindex SI output + +A block size specification can be a positive integer specifying the number +of bytes per block, or it can be @code{human-readable} or @code{si} to +select a human-readable format. + + +With human-readable formats, output sizes are followed by a size letter +such as @samp{M} for megabytes. @code{BLOCK_SIZE=human-readable} uses +powers of 1024; @samp{M} stands for 1,048,576 bytes. +@code{BLOCK_SIZE=si} is similar, but uses powers of 1000; @samp{M} stands +for 1,000,000 bytes. (SI, the International System of Units, defines +these power-of-1000 prefixes.) + +An integer block size can be followed by a size letter to specify a +multiple of that size. When this notation is used, the size letters +normally stand for powers of 1024, and can be followed by an optional +@samp{B} for ``byte''; but if followed by @samp{D} (for ``decimal +byte''), they stand for powers of 1000. For example, +@code{BLOCK_SIZE=4MB} is equivalent to @code{BLOCK_SIZE=4194304}, and +@code{BLOCK_SIZE=4MD} is equivalent to @code{BLOCK_SIZE=4000000}. + +The following size letters are defined. Large sizes like @code{1Y} +may be rejected by your computer due to limitations of its arithmetic. + +@table @samp +@item k +kilo: @math{2^10 = 1024} for @code{human-readable}, +or @math{10^3 = 1000} for @code{si}. +@item M +Mega: @math{2^20 = 1,048,576} +or @math{10^6 = 1,000,000}. +@item G +Giga: @math{2^30 = 1,073,741,824} +or @math{10^9 = 1,000,000,000}. +@item T +Tera: @math{2^40 = 1,099,511,627,776} +or @math{10^12 = 1,000,000,000,000}. 
+@item P +Peta: @math{2^50 = 1,125,899,906,842,624} +or @math{10^15 = 1,000,000,000,000,000}. +@item E +Exa: @math{2^60 = 1,152,921,504,606,846,976}@* +or @math{10^18 = 1,000,000,000,000,000,000}. +@item Z +Zetta: @math{2^70 = 1,180,591,620,717,411,303,424}@* +or @math{10^21 = 1,000,000,000,000,000,000,000}. +@item Y +Yotta: @math{2^80 = 1,208,925,819,614,629,174,706,176}@* +or @math{10^24 = 1,000,000,000,000,000,000,000,000}. +@end table + +@opindex -k +@opindex --kilobytes +@opindex -h +@opindex --human-readable +@opindex --si + +Block size defaults can be overridden by an explicit +@samp{--block-size=@var{size}} option. The @samp{-k} or +@samp{--kilobytes} option is equivalent to @samp{--block-size=1k}, which +is the default unless the @env{POSIXLY_CORRECT} environment variable is +set. The @samp{-h} or @samp{--human-readable} option is equivalent to +@samp{--block-size=human-readable}. The @samp{--si} option is +equivalent to @samp{--block-size=si}. + +@node Target directory +@section Target directory + +@cindex target directory + +Some @sc{gnu} programs (at least @code{cp}, @code{install}, @code{ln}, and +@code{mv}) allow you to specify the target directory via this option: + +@table @samp + +@itemx @w{@kbd{--target-directory}=@var{directory}} +@opindex --target-directory +@cindex target directory +@cindex destination directory +Specify the destination @var{directory}. + +The interface for most programs is that after processing options and a +finite (possibly zero) number of fixed-position arguments, the remaining +argument list is either expected to be empty, or is a list of items +(usually files) that will all be handled identically. The @code{xargs} +program is designed to work well with this convention. + +The commands in the @code{mv}-family are unusual in that they take +a variable number of arguments with a special case at the @emph{end} +(namely, the target directory). 
This makes it nontrivial to perform some +operations, e.g., ``move all files from here to ../d/'', because +@code{mv * ../d/} might exhaust the argument space, and @code{ls | xargs ...} +doesn't have a clean way to specify an extra final argument for each +invocation of the subject command. (It can be done by going through a +shell command, but that requires more human labor and brain power than +it should.) + +The @w{@kbd{--target-directory}} option allows the @code{cp}, +@code{install}, @code{ln}, and @code{mv} programs to be used conveniently +with @code{xargs}. For example, you can move the files from the +current directory to a sibling directory, @code{d} like this: +(However, this doesn't move files whose names begin with @samp{.}.) + +@smallexample +ls |xargs mv --target-directory=../d +@end smallexample + +If you use the @sc{gnu} @code{find} program, you can move @emph{all} +files with this command: +@example +find . -mindepth 1 -maxdepth 1 \ + | xargs mv --target-directory=../d +@end example + +But that will fail if there are no files in the current directory +or if any file has a name containing a newline character. +The following example removes those limitations and requires both +@sc{gnu} @code{find} and @sc{gnu} @code{xargs}: +@example +find . -mindepth 1 -maxdepth 1 -print0 \ + | xargs --null --no-run-if-empty \ + mv --target-directory=../d +@end example + +@end table + +@node Trailing slashes +@section Trailing slashes + +@cindex trailing slashes + +Some @sc{gnu} programs (at least @code{cp} and @code{mv}) allow you to +remove any trailing slashes from each @var{source} argument before +operating on it. The @w{@kbd{--strip-trailing-slashes}} option enables +this behavior. + +This is useful when a @var{source} argument may have a trailing slash and +specify a symbolic link to a directory. This scenario is in fact rather +common because some shells can automatically append a trailing slash when +performing file name completion on such symbolic links. 
Without this +option, @code{mv}, for example, (via the system's rename function) must +interpret a trailing slash as a request to dereference the symbolic link +and so must rename the indirectly referenced @emph{directory} and not +the symbolic link. Although it may seem surprising that such behavior +be the default, it is required by @sc{posix.2} and is consistent with +other parts of that standard. + +@node Output of entire files +@chapter Output of entire files + +@cindex output of entire files +@cindex entire files, output of + +These commands read and write entire files, possibly transforming them +in some way. + +@menu +* cat invocation:: Concatenate and write files. +* tac invocation:: Concatenate and write files in reverse. +* nl invocation:: Number lines and write files. +* od invocation:: Write files in octal or other formats. +@end menu + +@node cat invocation +@section @code{cat}: Concatenate and write files + +@pindex cat +@cindex concatenate and write files +@cindex copying files + +@code{cat} copies each @var{file} (@samp{-} means standard input), or +standard input if none are given, to standard output. Synopsis: + +@example +cat [@var{option}] [@var{file}]@dots{} +@end example + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp + +@item -A +@itemx --show-all +@opindex -A +@opindex --show-all +Equivalent to @samp{-vET}. + +@item -B +@itemx --binary +@opindex -B +@opindex --binary +@cindex binary and text I/O in cat +On MS-DOS and MS-Windows only, read and write the files in binary mode. +By default, @code{cat} on MS-DOS/MS-Windows uses binary mode only when +standard output is redirected to a file or a pipe; this option overrides +that. Binary file I/O is used so that the files retain their format +(Unix text as opposed to DOS text and binary), because @code{cat} is +frequently used as a file-copying program. 
Some options (see below) +cause @code{cat} to read and write files in text mode because in those +cases the original file contents aren't important (e.g., when lines are +numbered by @code{cat}, or when line endings should be marked). This is +so these options work as DOS/Windows users would expect; for example, +DOS-style text files have their lines end with the CR-LF pair of +characters, which won't be processed as an empty line by @samp{-b} unless +the file is read in text mode. + +@item -b +@itemx --number-nonblank +@opindex -b +@opindex --number-nonblank +Number all nonblank output lines, starting with 1. On MS-DOS and +MS-Windows, this option causes @code{cat} to read and write files in +text mode. + +@item -e +@opindex -e +Equivalent to @samp{-vE}. + +@item -E +@itemx --show-ends +@opindex -E +@opindex --show-ends +Display a @samp{$} after the end of each line. On MS-DOS and +MS-Windows, this option causes @code{cat} to read and write files in +text mode. + +@item -n +@itemx --number +@opindex -n +@opindex --number +Number all output lines, starting with 1. On MS-DOS and MS-Windows, +this option causes @code{cat} to read and write files in text mode. + +@item -s +@itemx --squeeze-blank +@opindex -s +@opindex --squeeze-blank +@cindex squeezing blank lines +Replace multiple adjacent blank lines with a single blank line. On +MS-DOS and MS-Windows, this option causes @code{cat} to read and write +files in text mode. + +@item -t +@opindex -t +Equivalent to @samp{-vT}. + +@item -T +@itemx --show-tabs +@opindex -T +@opindex --show-tabs +Display TAB characters as @samp{^I}. + +@item -u +@opindex -u +Ignored; for Unix compatibility. + +@item -v +@itemx --show-nonprinting +@opindex -v +@opindex --show-nonprinting +Display control characters except for LFD and TAB using +@samp{^} notation and precede characters that have the high bit set with +@samp{M-}. 
On MS-DOS and MS-Windows, this option causes @code{cat} to +read files and standard input in DOS binary mode, so the CR +characters at the end of each line are also visible. + +@end table + + +@node tac invocation +@section @code{tac}: Concatenate and write files in reverse + +@pindex tac +@cindex reversing files + +@code{tac} copies each @var{file} (@samp{-} means standard input), or +standard input if none are given, to standard output, reversing the +records (lines by default) in each separately. Synopsis: + +@example +tac [@var{option}]@dots{} [@var{file}]@dots{} +@end example + +@dfn{Records} are separated by instances of a string (newline by +default). By default, this separator string is attached to the end of +the record that it follows in the file. + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp + +@item -b +@itemx --before +@opindex -b +@opindex --before +The separator is attached to the beginning of the record that it +precedes in the file. + +@item -r +@itemx --regex +@opindex -r +@opindex --regex +Treat the separator string as a regular expression. Users of @code{tac} +on MS-DOS/MS-Windows should note that, since @code{tac} reads files in +binary mode, each line of a text file might end with a CR/LF pair +instead of the Unix-style LF. + +@item -s @var{separator} +@itemx --separator=@var{separator} +@opindex -s +@opindex --separator +Use @var{separator} as the record separator, instead of newline. + +@end table + + +@node nl invocation +@section @code{nl}: Number lines and write files + +@pindex nl +@cindex numbering lines +@cindex line numbering + +@code{nl} writes each @var{file} (@samp{-} means standard input), or +standard input if none are given, to standard output, with line numbers +added to some or all of the lines. 
Synopsis: + +@example +nl [@var{option}]@dots{} [@var{file}]@dots{} +@end example + +@cindex logical pages, numbering on +@code{nl} decomposes its input into (logical) pages; by default, the +line number is reset to 1 at the top of each logical page. @code{nl} +treats all of the input files as a single document; it does not reset +line numbers or logical pages between files. + +@cindex headers, numbering +@cindex body, numbering +@cindex footers, numbering +A logical page consists of three sections: header, body, and footer. +Any of the sections can be empty. Each can be numbered in a different +style from the others. + +The beginnings of the sections of logical pages are indicated in the +input file by a line containing exactly one of these delimiter strings: + +@table @samp +@item \:\:\: +start of header; +@item \:\: +start of body; +@item \: +start of footer. +@end table + +The two characters from which these strings are made can be changed from +@samp{\} and @samp{:} via options (see below), but the pattern and +length of each string cannot be changed. + +A section delimiter is replaced by an empty line on output. Any text +that comes before the first section delimiter string in the input file +is considered to be part of a body section, so @code{nl} treats a +file that contains no section delimiters as a single body section. + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp + +@item -b @var{style} +@itemx --body-numbering=@var{style} +@opindex -b +@opindex --body-numbering +Select the numbering style for lines in the body section of each +logical page. When a line is not numbered, the current line number +is not incremented, but the line number separator character is still +prepended to the line. 
The styles are: + +@table @samp +@item a +number all lines, +@item t +number only nonempty lines (default for body), +@item n +do not number lines (default for header and footer), +@item p@var{regexp} +number only lines that contain a match for @var{regexp}. +@end table + +@item -d @var{cd} +@itemx --section-delimiter=@var{cd} +@opindex -d +@opindex --section-delimiter +@cindex section delimiters of pages +Set the section delimiter characters to @var{cd}; default is +@samp{\:}. If only @var{c} is given, the second remains @samp{:}. +(Remember to protect @samp{\} or other metacharacters from shell +expansion with quotes or extra backslashes.) + +@item -f @var{style} +@itemx --footer-numbering=@var{style} +@opindex -f +@opindex --footer-numbering +Analogous to @samp{--body-numbering}. + +@item -h @var{style} +@itemx --header-numbering=@var{style} +@opindex -h +@opindex --header-numbering +Analogous to @samp{--body-numbering}. + +@item -i @var{number} +@itemx --page-increment=@var{number} +@opindex -i +@opindex --page-increment +Increment line numbers by @var{number} (default 1). + +@item -l @var{number} +@itemx --join-blank-lines=@var{number} +@opindex -l +@opindex --join-blank-lines +@cindex empty lines, numbering +@cindex blank lines, numbering +Consider @var{number} (default 1) consecutive empty lines to be one +logical line for numbering, and only number the last one. Where fewer +than @var{number} consecutive empty lines occur, do not number them. +An empty line is one that contains no characters, not even spaces +or tabs. + +@item -n @var{format} +@itemx --number-format=@var{format} +@opindex -n +@opindex --number-format +Select the line numbering format (default is @code{rn}): + +@table @samp +@item ln +@opindex ln @r{format for @code{nl}} +left justified, no leading zeros; +@item rn +@opindex rn @r{format for @code{nl}} +right justified, no leading zeros; +@item rz +@opindex rz @r{format for @code{nl}} +right justified, leading zeros. 
+@end table + +@item -p +@itemx --no-renumber +@opindex -p +@opindex --no-renumber +Do not reset the line number at the start of a logical page. + +@item -s @var{string} +@itemx --number-separator=@var{string} +@opindex -s +@opindex --number-separator +Separate the line number from the text line in the output with +@var{string} (default is the TAB character). + +@item -v @var{number} +@itemx --starting-line-number=@var{number} +@opindex -v +@opindex --starting-line-number +Set the initial line number on each logical page to @var{number} (default 1). + +@item -w @var{number} +@itemx --number-width=@var{number} +@opindex -w +@opindex --number-width +Use @var{number} characters for line numbers (default 6). + +@end table + + +@node od invocation +@section @code{od}: Write files in octal or other formats + +@pindex od +@cindex octal dump of files +@cindex hex dump of files +@cindex ASCII dump of files +@cindex file contents, dumping unambiguously + +@code{od} writes an unambiguous representation of each @var{file} +(@samp{-} means standard input), or standard input if none are given. +Synopsis: + +@example +od [@var{option}]@dots{} [@var{file}]@dots{} +od -C [@var{file}] [[+]@var{offset} [[+]@var{label}]] +@end example + +Each line of output consists of the offset in the input, followed by +groups of data from the file. By default, @code{od} prints the offset in +octal, and each group of file data is two bytes of input printed as a +single octal number. + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp + +@item -A @var{radix} +@itemx --address-radix=@var{radix} +@opindex -A +@opindex --address-radix +@cindex radix for file offsets +@cindex file offset radix +Select the base in which file offsets are printed. @var{radix} can +be one of the following: + +@table @samp +@item d +decimal; +@item o +octal; +@item x +hexadecimal; +@item n +none (do not print offsets). +@end table + +The default is octal. 
+ +@item -j @var{bytes} +@itemx --skip-bytes=@var{bytes} +@opindex -j +@opindex --skip-bytes +Skip @var{bytes} input bytes before formatting and writing. If +@var{bytes} begins with @samp{0x} or @samp{0X}, it is interpreted in +hexadecimal; otherwise, if it begins with @samp{0}, in octal; otherwise, +in decimal. Appending @samp{b} multiplies @var{bytes} by 512, @samp{k} +by 1024, and @samp{m} by 1048576. + +@item -N @var{bytes} +@itemx --read-bytes=@var{bytes} +@opindex -N +@opindex --read-bytes +Output at most @var{bytes} bytes of the input. Prefixes and suffixes on +@var{bytes} are interpreted as for the @samp{-j} option. + +@item -s [@var{n}] +@itemx --strings[=@var{n}] +@opindex -s +@opindex --strings +@cindex string constants, outputting +Instead of the normal output, output only @dfn{string constants}: at +least @var{n} (3 by default) consecutive @sc{ascii} graphic characters, +followed by a null (zero) byte. + +@item -t @var{type} +@itemx --format=@var{type} +@opindex -t +@opindex --format +Select the format in which to output the file data. @var{type} is a +string of one or more of the type indicator characters listed below. If you +include more than one type indicator character in a single @var{type} +string, or use this option more than once, @code{od} writes one copy +of each output line using each of the data types that you specified, +in the order that you specified. + +Adding a trailing ``z'' to any type specification appends a display +of the @sc{ascii} character representation of the printable characters +to the output line generated by the type specification. + +@table @samp +@item a +named character, +@item c +@sc{ascii} character or backslash escape, +@item d +signed decimal, +@item f +floating point, +@item o +octal, +@item u +unsigned decimal, +@item x +hexadecimal. +@end table + +The type @samp{a} outputs things like @samp{sp} for space, @samp{nl} for +newline, and @samp{nul} for a null (zero) byte.
Type @samp{c} outputs +@samp{ }, @samp{\n}, and @samp{\0}, respectively. + +@cindex type size +Except for types @samp{a} and @samp{c}, you can specify the number +of bytes to use in interpreting each number in the given data type +by following the type indicator character with a decimal integer. +Alternatively, you can specify the size of one of the C compiler's +built-in data types by following the type indicator character with +one of the following characters. For integers (@samp{d}, @samp{o}, +@samp{u}, @samp{x}): + +@table @samp +@item C +char +@item S +short +@item I +int +@item L +long +@end table + +For floating point (@samp{f}): + +@table @asis +@item F +float +@item D +double +@item L +long double +@end table + +@item -v +@itemx --output-duplicates +@opindex -v +@opindex --output-duplicates +Output consecutive lines that are identical. By default, when two or +more consecutive output lines would be identical, @code{od} outputs only +the first line, and puts just an asterisk on the following line to +indicate the elision. + +@item -w[@var{n}] +@itemx --width[=@var{n}] +@opindex -w +@opindex --width +Dump @var{n} input bytes per output line. This must be a multiple of +the least common multiple of the sizes associated with the specified +output types. If @var{n} is omitted, the default is 32. If this option +is not given at all, the default is 16. + +@end table + +The next several options map the old, pre-@sc{posix} format specification +options to the corresponding @sc{posix} format specs. +@sc{gnu} @code{od} accepts +any combination of old- and new-style options. Format specification +options accumulate. + +@table @samp + +@item -a +@opindex -a +Output as named characters. Equivalent to @samp{-ta}. + +@item -b +@opindex -b +Output as octal bytes. Equivalent to @samp{-toC}. + +@item -c +@opindex -c +Output as @sc{ascii} characters or backslash escapes. Equivalent to +@samp{-tc}. + +@item -d +@opindex -d +Output as unsigned decimal shorts.
Equivalent to @samp{-tu2}. + +@item -f +@opindex -f +Output as floats. Equivalent to @samp{-tfF}. + +@item -h +@opindex -h +Output as hexadecimal shorts. Equivalent to @samp{-tx2}. + +@item -i +@opindex -i +Output as decimal shorts. Equivalent to @samp{-td2}. + +@item -l +@opindex -l +Output as decimal longs. Equivalent to @samp{-td4}. + +@item -o +@opindex -o +Output as octal shorts. Equivalent to @samp{-to2}. + +@item -x +@opindex -x +Output as hexadecimal shorts. Equivalent to @samp{-tx2}. + +@item -C +@itemx --traditional +@opindex --traditional +Recognize the pre-@sc{posix} non-option arguments that traditional @code{od} +accepted. The following syntax: + +@smallexample +od --traditional [@var{file}] [[+]@var{offset}[.][b] [[+]@var{label}[.][b]]] +@end smallexample + +@noindent +can be used to specify at most one file and optional arguments +specifying an offset and a pseudo-start address, @var{label}. By +default, @var{offset} is interpreted as an octal number specifying how +many input bytes to skip before formatting and writing. The optional +trailing decimal point forces the interpretation of @var{offset} as a +decimal number. If no decimal is specified and the offset begins with +@samp{0x} or @samp{0X} it is interpreted as a hexadecimal number. If +there is a trailing @samp{b}, the number of bytes skipped will be +@var{offset} multiplied by 512. The @var{label} argument is interpreted +just like @var{offset}, but it specifies an initial pseudo-address. The +pseudo-addresses are displayed in parentheses following any normal +address. + +@end table + + +@node Formatting file contents +@chapter Formatting file contents + +@cindex formatting file contents + +These commands reformat the contents of files. + +@menu +* fmt invocation:: Reformat paragraph text. +* pr invocation:: Paginate or columnate files for printing. +* fold invocation:: Wrap input lines to fit in specified width. 
+@end menu + + +@node fmt invocation +@section @code{fmt}: Reformat paragraph text + +@pindex fmt +@cindex reformatting paragraph text +@cindex paragraphs, reformatting +@cindex text, reformatting + +@code{fmt} fills and joins lines to produce output lines of (at most) +a given number of characters (75 by default). Synopsis: + +@example +fmt [@var{option}]@dots{} [@var{file}]@dots{} +@end example + +@code{fmt} reads from the specified @var{file} arguments (or standard +input if none are given), and writes to standard output. + +By default, blank lines, spaces between words, and indentation are +preserved in the output; successive input lines with different +indentation are not joined; tabs are expanded on input and introduced on +output. + +@cindex line-breaking +@cindex sentences and line-breaking +@cindex Knuth, Donald E. +@cindex Plass, Michael F. +@code{fmt} prefers breaking lines at the end of a sentence, and tries to +avoid line breaks after the first word of a sentence or before the last +word of a sentence. A @dfn{sentence break} is defined as either the end +of a paragraph or a word ending in any of @samp{.?!}, followed by two +spaces or end of line, ignoring any intervening parentheses or quotes. +Like @TeX{}, @code{fmt} reads entire ``paragraphs'' before choosing line +breaks; the algorithm is a variant of that in ``Breaking Paragraphs Into +Lines'' (Donald E. Knuth and Michael F. Plass, @cite{Software---Practice +and Experience}, 11 (1981), 1119--1184). + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp + +@item -c +@itemx --crown-margin +@opindex -c +@opindex --crown-margin +@cindex crown margin +@dfn{Crown margin} mode: preserve the indentation of the first two +lines within a paragraph, and align the left margin of each subsequent +line with that of the second line. 
+ +@item -t +@itemx --tagged-paragraph +@opindex -t +@opindex --tagged-paragraph +@cindex tagged paragraphs +@dfn{Tagged paragraph} mode: like crown margin mode, except that if +indentation of the first line of a paragraph is the same as the +indentation of the second, the first line is treated as a one-line +paragraph. + +@item -s +@itemx --split-only +@opindex -s +@opindex --split-only +Split lines only. Do not join short lines to form longer ones. This +prevents sample lines of code, and other such ``formatted'' text from +being unduly combined. + +@item -u +@itemx --uniform-spacing +@opindex -u +@opindex --uniform-spacing +Uniform spacing. Reduce spacing between words to one space, and spacing +between sentences to two spaces. + +@item -@var{width} +@itemx -w @var{width} +@itemx --width=@var{width} +@opindex -@var{width} +@opindex -w +@opindex --width +Fill output lines up to @var{width} characters (default 75). @code{fmt} +initially tries to make lines about 7% shorter than this, to give it +room to balance line lengths. + +@item -p @var{prefix} +@itemx --prefix=@var{prefix} +Only lines beginning with @var{prefix} (possibly preceded by whitespace) +are subject to formatting. The prefix and any preceding whitespace are +stripped for the formatting and then re-attached to each formatted output +line. One use is to format certain kinds of program comments, while +leaving the code unchanged. + +@end table + + +@node pr invocation +@section @code{pr}: Paginate or columnate files for printing + +@pindex pr +@cindex printing, preparing files for +@cindex multicolumn output, generating +@cindex merging files in parallel + +@code{pr} writes each @var{file} (@samp{-} means standard input), or +standard input if none are given, to standard output, paginating and +optionally outputting in multicolumn format; optionally merges all +@var{file}s, printing all in parallel, one per column. 
Synopsis: + +@example +pr [@var{option}]@dots{} [@var{file}]@dots{} +@end example + +@vindex LC_MESSAGES +By default, a 5-line header is printed at each page: two blank lines; +a line with the date, the filename, and the page count; and two more +blank lines. A footer of five blank lines is also printed. With the @samp{-F} +option, a 3-line header is printed: the leading two blank lines are +omitted; no footer is used. The default @var{page_length} in both cases is 66 +lines. The default number of text lines changes from 56 (without @samp{-F}) +to 63 (with @samp{-F}). The text line of the header takes the form +@samp{@var{date} @var{string} @var{page}}, with spaces inserted around +@var{string} so that the line takes up the full @var{page_width}. Here, +@var{date} is the date (see the @option{-D} or @option{--date-format} +option for details), @var{string} is the centered header string, and +@var{page} identifies the page number. The @env{LC_MESSAGES} locale +category affects the spelling of @var{page}; in the default C locale, it +is @samp{Page @var{number}} where @var{number} is the decimal page +number. + +Form feeds in the input cause page breaks in the output. Multiple form +feeds produce empty pages. + +Columns are of equal width, separated by an optional string (default +is @samp{space}). For multicolumn output, lines will always be truncated to +@var{page_width} (default 72), unless you use the @samp{-J} option. For single +column output no line truncation occurs by default. Use @samp{-W} option to +truncate lines in that case. + +The following changes were made in version 1.22i and apply to later +versions of @command{pr}: +@c FIXME: this whole section here sounds very awkward to me. I +@c made a few small changes, but really it all needs to be redone. - Brian +@c OK, I fixed another sentence or two, but some of it I just don't understand. 
+@c - Brian +@itemize @bullet + +@item +Some lowercase options (@samp{-s}, @samp{-w}) have been +redefined for better @sc{posix} compliance. The output in some other +cases has been adapted to match other Unix systems. These changes are not +compatible with earlier versions of the program. + +@item +Some new uppercase options (@samp{-J}, @samp{-S}, @samp{-W}) +have been introduced to turn off unexpected interference from lowercase +options. The @samp{-N} option and the second argument @var{last_page} +of @samp{+FIRST_PAGE} offer more flexibility. The detailed handling of +form feeds set in the input files requires the @samp{-T} option. + +@item +Uppercase options override lowercase ones. + +@item +Some of the option-arguments (compare @samp{-s}, @samp{-S}, @samp{-e}, +@samp{-i}, @samp{-n}) cannot be specified as separate arguments from the +preceding option letter (already stated in the @sc{posix} specification). +@end itemize + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp + +@item +@var{first_page}[:@var{last_page}] +@itemx --pages=@var{first_page}[:@var{last_page}] +@opindex +@var{first_page}[:@var{last_page}] +@opindex --pages +Begin printing with page @var{first_page} and stop with @var{last_page}. +A missing @samp{:@var{last_page}} implies end of file. When counting +the number of skipped pages, each form feed in the input file results +in a new page. Page counting with and without @samp{+@var{first_page}} +is identical. By default, counting starts with the first page of the input +file (not the first page printed). Line numbering may be altered by the @samp{-N} +option. + +@item -@var{column} +@itemx --columns=@var{column} +@opindex -@var{column} +@opindex --columns +@cindex down columns +With each single @var{file}, produce @var{column} columns of output +(default is 1) and print columns down, unless @samp{-a} is used.
The +column width is automatically decreased as @var{column} increases, unless +you use the @samp{-W/-w} option to increase @var{page_width} as well. +This option might well cause some lines to be truncated. The number of +lines in the columns on each page is balanced. The options @samp{-e} +and @samp{-i} are on for multiple text-column output. Together with the +@samp{-J} option, column alignment and line truncation are turned off. +Lines of full length are joined in a free field format and the @samp{-S} +option may set field separators. @samp{-@var{column}} may not be used +with the @samp{-m} option. + +@item -a +@itemx --across +@opindex -a +@opindex --across +@cindex across columns +With each single @var{file}, print columns across rather than down. The +@samp{-@var{column}} option must be given with @var{column} greater than one. +If a line is too long to fit in a column, it is truncated. + +@item -c +@itemx --show-control-chars +@opindex -c +@opindex --show-control-chars +Print control characters using hat notation (e.g., @samp{^G}); print +other nonprinting characters in octal backslash notation. By default, +nonprinting characters are not changed. + +@item -d +@itemx --double-space +@opindex -d +@opindex --double-space +@cindex double spacing +Double space the output. + +@item -D @var{format} +@itemx --date-format=@var{format} +@cindex time formats +@cindex formatting times +Format header dates using @var{format}, using the same conventions as +for the command @samp{date +@var{format}}; see @ref{date invocation, , +,sh-utils,GNU shell utilities}. Except for directives, which start with +@samp{%}, characters in @var{format} are printed unchanged. You can use +this option to specify an arbitrary string in place of the header date, +e.g., @samp{--date-format="Monday morning"}.
+ +@vindex POSIXLY_CORRECT +@vindex LC_TIME +If the @env{POSIXLY_CORRECT} environment variable is not set, the date +format defaults to @samp{%Y-%m-%d %H:%M} (for example, @samp{2001-12-04 +23:59}); otherwise, the format depends on the @env{LC_TIME} locale +category, with the default being @samp{%b %e %H:%M %Y} (for example, +@samp{Dec@ @ 4 23:59 2001}). + +@item -e[@var{in-tabchar}[@var{in-tabwidth}]] +@itemx --expand-tabs[=@var{in-tabchar}[@var{in-tabwidth}]] +@opindex -e +@opindex --expand-tabs +@cindex input tabs +Expand tabs to spaces on input. Optional argument @var{in-tabchar} is +the input tab character (default is the TAB character). Second optional +argument @var{in-tabwidth} is the input tab character's width (default +is 8). + +@item -f +@itemx -F +@itemx --form-feed +@opindex -F +@opindex -f +@opindex --form-feed +Use a form feed instead of newlines to separate output pages. The default +page length of 66 lines is not altered, but the number of lines of text +per page changes from the default 56 to 63. + +@item -h @var{header} +@itemx --header=@var{header} +@opindex -h +@opindex --header +Replace the filename in the header with the centered string @var{header}. +When using the shell, @var{header} should be quoted and should be +separated from @option{-h} by a space. + +@item -i[@var{out-tabchar}[@var{out-tabwidth}]] +@itemx --output-tabs[=@var{out-tabchar}[@var{out-tabwidth}]] +@opindex -i +@opindex --output-tabs +@cindex output tabs +Replace spaces with tabs on output. Optional argument @var{out-tabchar} +is the output tab character (default is the TAB character). Second optional +argument @var{out-tabwidth} is the output tab character's width (default +is 8). + +@item -J +@itemx --join-lines +@opindex -J +@opindex --join-lines +Merge lines of full length. Used together with the column options +@samp{-@var{column}}, @samp{-a -@var{column}} or @samp{-m}.
Turns off +@samp{-W/-w} line truncation; +no column alignment used; may be used with @samp{-S[@var{string}]}. +@samp{-J} has been introduced (together with @samp{-W} and @samp{-S}) +to disentangle the old (@sc{posix}-compliant) options @samp{-w} and +@samp{-s} along with the three column options. + + +@item -l @var{page_length} +@itemx --length=@var{page_length} +@opindex -l +@opindex --length +Set the page length to @var{page_length} (default 66) lines, including +the lines of the header [and the footer]. If @var{page_length} is less +than or equal to 10 (or <= 3 with @samp{-F}), the header and footer are +omitted, and all form feeds set in input files are eliminated, as if +the @samp{-T} option had been given. + +@item -m +@itemx --merge +@opindex -m +@opindex --merge +Merge and print all @var{file}s in parallel, one in each column. If a +line is too long to fit in a column, it is truncated, unless the @samp{-J} +option is used. @samp{-S[@var{string}]} may be used. Empty pages in +some @var{file}s (form feeds set) produce empty columns, still marked +by @var{string}. The result is a continuous line numbering and column +marking throughout the whole merged file. Completely empty merged pages +show no separators or line numbers. The default header becomes +@samp{@var{date} @var{page}} with spaces inserted in the middle; this +may be used with the @option{-h} or @option{--header} option to fill up +the middle blank part. + +@item -n[@var{number-separator}[@var{digits}]] +@itemx --number-lines[=@var{number-separator}[@var{digits}]] +@opindex -n +@opindex --number-lines +Provide @var{digits} digit line numbering (default for @var{digits} is +5). With multicolumn output the number occupies the first @var{digits} +column positions of each text column or only each line of @samp{-m} +output. With single column output the number precedes each line just as +@samp{-m} does. 
Default counting of the line numbers starts with the +first line of the input file (not the first line printed, compare the +@samp{--pages} option and @samp{-N} option). +Optional argument @var{number-separator} is the character appended to +the line number to separate it from the text that follows. The default +separator is the TAB character. Strictly speaking, a TAB is +printed only with single column output. The @var{TAB}-width varies +with the @var{TAB}-position, e.g., with the left @var{margin} specified +by the @samp{-o} option. With multicolumn output priority is given to +@samp{equal width of output columns} (a @sc{posix} specification). +The @var{TAB}-width is fixed to the value of the first column and does +not change with different values of left @var{margin}. That means a +fixed number of spaces is always printed in the place of the +@var{number-separator tab}. The tabification depends upon the output +position. + +@item -N @var{line_number} +@itemx --first-line-number=@var{line_number} +@opindex -N +@opindex --first-line-number +Start line counting with the number @var{line_number} at the first line of +the first page printed (in most cases not the first line of the input file). + +@item -o @var{margin} +@itemx --indent=@var{margin} +@opindex -o +@opindex --indent +@cindex indenting lines +@cindex left margin +Indent each line with a margin @var{margin} spaces wide (default is zero). +The total page width is the size of the margin plus the @var{page_width} +set with the @samp{-W/-w} option. A limited overflow may occur with +numbered single column output (compare the @samp{-n} option). + +@item -r +@itemx --no-file-warnings +@opindex -r +@opindex --no-file-warnings +Do not print a warning message when an argument @var{file} cannot be +opened. (The exit status will still be nonzero, however.) + +@item -s[@var{char}] +@itemx --separator[=@var{char}] +@opindex -s +@opindex --separator +Separate columns by a single character @var{char}.
The default for +@var{char} is the TAB character without @samp{-w} and @samp{no +character} with @samp{-w}. Without @samp{-s} the default separator +@samp{space} is set. @samp{-s[char]} turns off line truncation of all +three column options (@samp{-COLUMN}|@samp{-a -COLUMN}|@samp{-m}) unless +@samp{-w} is set. This is a @sc{posix}-compliant formulation. + + +@item -S[@var{string}] +@itemx --sep-string[=@var{string}] +@opindex -S +@opindex --sep-string +Use @var{string} to separate output columns. The @samp{-S} option doesn't +affect the @samp{-W/-w} option, unlike the @samp{-s} option which does. It +does not affect line truncation or column alignment. +Without @samp{-S}, and with @samp{-J}, @code{pr} uses the default output +separator, TAB. +Without @samp{-S} or @samp{-J}, @code{pr} uses a @samp{space} +(same as @samp{-S" "}). +Using @samp{-S} with no @var{string} is equivalent to @samp{-S""}. +Note that for some of @code{pr}'s options the single-letter option +character must be followed immediately by any corresponding argument; +there may not be any intervening white space. +@samp{-S/-s} is one of them. Don't use @samp{-S "STRING"}. +@sc{posix} requires this. + +@item -t +@itemx --omit-header +@opindex -t +@opindex --omit-header +Do not print the usual header [and footer] on each page, and do not fill +out the bottom of pages (with blank lines or a form feed). No page +structure is produced, but form feeds set in the input files are retained. +The predefined pagination is not changed. @samp{-t} or @samp{-T} may be +useful together with other options; e.g.: @samp{-t -e4}, expand TAB characters +in the input file to 4 spaces but don't make any other changes. Use of +@samp{-t} overrides @samp{-h}. + +@item -T +@itemx --omit-pagination +@opindex -T +@opindex --omit-pagination +Do not print header [and footer]. In addition eliminate all form feeds +set in the input files. 
+ +@item -v +@itemx --show-nonprinting +@opindex -v +@opindex --show-nonprinting +Print nonprinting characters in octal backslash notation. + +@item -w @var{page_width} +@itemx --width=@var{page_width} +@opindex -w +@opindex --width +Set page width to @var{page_width} characters for multiple text-column +output only (default for @var{page_width} is 72). @samp{-s[CHAR]} turns +off the default page width and any line truncation and column alignment. +Lines of full length are merged, regardless of the column options +set. No @var{page_width} setting is possible with single column output. +A @sc{posix}-compliant formulation. + +@item -W @var{page_width} +@itemx --page_width=@var{page_width} +@opindex -W +@opindex --page_width +Set the page width to @var{page_width} characters. That's valid with and +without a column option. Text lines are truncated, unless @samp{-J} +is used. Together with one of the three column options +(@samp{-@var{column}}, @samp{-a -@var{column}} or @samp{-m}) column +alignment is always used. The separator options @samp{-S} or @samp{-s} +don't affect the @samp{-W} option. Default is 72 characters. Without +@samp{-W @var{page_width}} and without any of the column options NO line +truncation is used (defined to keep downward compatibility and to meet +most frequent tasks). That's equivalent to @samp{-W 72 -J}. The header +line is never truncated. + +@end table + + +@node fold invocation +@section @code{fold}: Wrap input lines to fit in specified width + +@pindex fold +@cindex wrapping long input lines +@cindex folding long input lines + +@code{fold} writes each @var{file} (@samp{-} means standard input), or +standard input if none are given, to standard output, breaking long +lines. Synopsis: + +@example +fold [@var{option}]@dots{} [@var{file}]@dots{} +@end example + +By default, @code{fold} breaks lines wider than 80 columns. The output +is split into as many lines as necessary. 
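As a rough illustration of the line breaking described above, the following sketch assumes a POSIX shell and @sc{gnu} @code{fold}. The sample input and the width given to @samp{-w} are arbitrary choices made only to keep the output short.

```shell
# A 10-character line folded at 4 columns is split into as many
# output lines as necessary.
printf 'abcdefghij\n' | fold -w 4
# → abcd
# → efgh
# → ij

# With no -w, only lines wider than 80 columns are broken: a line of
# 100 zeros becomes two output lines (80 + 20 characters).
printf '%0100d\n' 0 | fold | wc -l
# → 2
```

The second command pipes through @code{wc -l} only to avoid printing the long lines themselves.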
+ +@cindex screen columns +@code{fold} counts screen columns by default; thus, a tab may count more +than one column, backspace decreases the column count, and carriage +return sets the column to zero. + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp + +@item -b +@itemx --bytes +@opindex -b +@opindex --bytes +Count bytes rather than columns, so that tabs, backspaces, and carriage +returns are each counted as taking up one column, just like other +characters. + +@item -s +@itemx --spaces +@opindex -s +@opindex --spaces +Break at word boundaries: the line is broken after the last blank before +the maximum line length. If the line contains no such blanks, the line +is broken at the maximum line length as usual. + +@item -w @var{width} +@itemx --width=@var{width} +@opindex -w +@opindex --width +Use a maximum line length of @var{width} columns instead of 80. + +@end table + + +@node Output of parts of files +@chapter Output of parts of files + +@cindex output of parts of files +@cindex parts of files, output of + +These commands output pieces of the input. + +@menu +* head invocation:: Output the first part of files. +* tail invocation:: Output the last part of files. +* split invocation:: Split a file into fixed-size pieces. +* csplit invocation:: Split a file into context-determined pieces. +@end menu + +@node head invocation +@section @code{head}: Output the first part of files + +@pindex head +@cindex initial part of files, outputting +@cindex first part of files, outputting + +@code{head} prints the first part (10 lines by default) of each +@var{file}; it reads from standard input if no files are given or +when given a @var{file} of @samp{-}. 
Synopses: + +@example +head [@var{option}]@dots{} [@var{file}]@dots{} +head -@var{number} [@var{option}]@dots{} [@var{file}]@dots{} +@end example + +If more than one @var{file} is specified, @code{head} prints a +one-line header consisting of +@example +==> @var{file name} <== +@end example +@noindent +before the output for each @var{file}. + +@code{head} accepts two option formats: the new one, in which numbers +are arguments to the options (@samp{-q -n 1}), and the old one, in which +the number precedes any option letters (@samp{-1q}). + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp + +@item -@var{count}@var{options} +@opindex -@var{count} +This option is only recognized if it is specified first. @var{count} is +a decimal number optionally followed by a size letter (@samp{b}, +@samp{k}, @samp{m}) as in @code{-c}, or @samp{l} to mean count by lines, +or other option letters (@samp{cqv}). + +@item -c @var{bytes} +@itemx --bytes=@var{bytes} +@opindex -c +@opindex --bytes +Print the first @var{bytes} bytes, instead of initial lines. Appending +@samp{b} multiplies @var{bytes} by 512, @samp{k} by 1024, and @samp{m} +by 1048576. + +@item -n @var{n} +@itemx --lines=@var{n} +@opindex -n +@opindex --lines +Output the first @var{n} lines. + +@item -q +@itemx --quiet +@itemx --silent +@opindex -q +@opindex --quiet +@opindex --silent +Never print file name headers. + +@item -v +@itemx --verbose +@opindex -v +@opindex --verbose +Always print file name headers. + +@end table + + +@node tail invocation +@section @code{tail}: Output the last part of files + +@pindex tail +@cindex last part of files, outputting + +@code{tail} prints the last part (10 lines by default) of each +@var{file}; it reads from standard input if no files are given or +when given a @var{file} of @samp{-}.
Synopses: + +@example +tail [@var{option}]@dots{} [@var{file}]@dots{} +tail -@var{number} [@var{option}]@dots{} [@var{file}]@dots{} +tail +@var{number} [@var{option}]@dots{} [@var{file}]@dots{} # obsolescent +@end example + +If more than one @var{file} is specified, @code{tail} prints a +one-line header consisting of +@example +==> @var{file name} <== +@end example +@noindent +before the output for each @var{file}. + +@cindex BSD @code{tail} +@sc{gnu} @code{tail} can output any amount of data (some other versions of +@code{tail} cannot). It also has no @samp{-r} option (print in +reverse), since reversing a file is really a different job from printing +the end of a file; BSD @code{tail} (which is the one with @code{-r}) can +only reverse files that are at most as large as its buffer, which is +typically 32k. A more reliable and versatile way to reverse files is +the @sc{gnu} @code{tac} command. + +@code{tail} accepts two option formats: the new one, in which numbers +are arguments to the options (@samp{-n 1}), and the obsolescent one, in +which the number precedes any option letters (@samp{-1} or @samp{+1}). +Warning: support for the @samp{+1} form will be withdrawn, as future +versions of @sc{posix} will not allow it. + +If any option-argument is a number @var{n} starting with a @samp{+}, +@code{tail} begins printing with the @var{n}th item from the start of +each file, instead of from the end. + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp + +@item -@var{count} +@itemx +@var{count} +@opindex -@var{count} +@opindex +@var{count} +This option is only recognized if it is specified first. @var{count} is +a decimal number optionally followed by a size letter (@samp{b}, +@samp{k}, @samp{m}) as in @code{-c}, or @samp{l} to mean count by lines, +or other option letters (@samp{cfqv}). + +Warning: the @samp{+@var{count}} usage is obsolescent. Future versions +of @sc{posix} will require that support for it be withdrawn. 
Use +@samp{-n +@var{count}} instead. + +@item -c @var{bytes} +@itemx --bytes=@var{bytes} +@opindex -c +@opindex --bytes +Output the last @var{bytes} bytes, instead of final lines. Appending +@samp{b} multiplies @var{bytes} by 512, @samp{k} by 1024, and @samp{m} +by 1048576. + +@item -f +@itemx --follow[=@var{how}] +@opindex -f +@opindex --follow +@cindex growing files +@vindex name @r{follow option} +@vindex descriptor @r{follow option} +Loop forever trying to read more characters at the end of the file, +presumably because the file is growing. This option is ignored when +reading from a pipe. +If more than one file is given, @code{tail} prints a header whenever it +gets output from a different file, to indicate which file that output is +from. + +There are two ways to specify how you'd like to track files with this option, +but that difference is noticeable only when a followed file is removed or +renamed. +If you'd like to continue to track the end of a growing file even after +it has been unlinked, use @samp{--follow=descriptor}. This is the default +behavior, but it is not useful if you're tracking a log file that may be +rotated (removed or renamed, then reopened). In that case, use +@samp{--follow=name} to track the named file by reopening it periodically +to see if it has been removed and recreated by some other program. + +No matter which method you use, if the tracked file is determined to have +shrunk, @code{tail} prints a message saying the file has been truncated +and resumes tracking the end of the file from the newly-determined endpoint. + +When a file is removed, @code{tail}'s behavior depends on whether it is +following the name or the descriptor. When following by name, tail can +detect that a file has been removed and gives a message to that effect, +and if @samp{--retry} has been specified it will continue checking +periodically to see if the file reappears. 
+When following a descriptor, @code{tail} does not detect that the file has +been unlinked or renamed and issues no message; even though the file +may no longer be accessible via its original name, it may still be +growing. + +The option values @samp{descriptor} and @samp{name} may be specified only +with the long form of the option, not with @samp{-f}. + +@item --retry +@opindex --retry +This option is meaningful only when following by name. +Without this option, when @code{tail} encounters a file that doesn't +exist or is otherwise inaccessible, it reports that fact and +never checks it again. With it, @code{tail} keeps checking periodically +to see whether the file appears. + +@item --sleep-interval=@var{n} +@opindex --sleep-interval +Change the number of seconds to wait between iterations (the default is 1). +During one iteration, every specified file is checked to see if it has +changed size. + +@item --pid=@var{pid} +@opindex --pid +When following by name or by descriptor, you may specify the process ID, +@var{pid}, of the sole writer of all @var{file} arguments. Then, shortly +after that process terminates, @code{tail} will also terminate. This will +work properly only if the writer and the tailing process are running on +the same machine. For example, to save the output of a build in a file +and to watch the file grow, if you invoke @code{make} and @code{tail} +like this then the @code{tail} process will stop when your build completes. +Without this option, you would have had to kill the @code{tail -f} +process yourself. +@example +$ make >& makerr & tail --pid=$! -f makerr +@end example +If you specify a @var{pid} that is not in use or that does not correspond +to the process that is writing to the tailed files, then @code{tail} +may terminate long before any @var{file}s stop growing or it may not +terminate until long after the real writer has terminated. +Note that @samp{--pid} cannot be supported on some systems; @code{tail} +will print a warning if this is the case.
+ +@item --max-unchanged-stats=@var{n} +@opindex --max-unchanged-stats +When tailing a file by name, if there have been @var{n} (default +n=@value{DEFAULT_MAX_N_UNCHANGED_STATS_BETWEEN_OPENS}) consecutive +iterations for which the size has remained the same, then +@code{open}/@code{fstat} the file to determine if that file name is +still associated with the same device/inode-number pair as before. +When following a log file that is rotated, this is approximately the +number of seconds between when @code{tail} prints the last pre-rotation lines +and when it prints the lines that have accumulated in the new log file. +This option is meaningful only when following by name. + +@item -n @var{n} +@itemx --lines=@var{n} +@opindex -n +@opindex --lines +Output the last @var{n} lines. + +@item -q +@itemx --quiet +@itemx --silent +@opindex -q +@opindex --quiet +@opindex --silent +Never print file name headers. + +@item -v +@itemx --verbose +@opindex -v +@opindex --verbose +Always print file name headers. + +@end table + + +@node split invocation +@section @code{split}: Split a file into fixed-size pieces + +@pindex split +@cindex splitting a file into pieces +@cindex pieces, splitting a file into + +@code{split} creates output files containing consecutive sections of +@var{input} (standard input if none is given or @var{input} is +@samp{-}). Synopsis: + +@example +split [@var{option}] [@var{input} [@var{prefix}]] +@end example + +By default, @code{split} puts 1000 lines of @var{input} (or whatever is +left over for the last section) into each output file. + +@cindex output file name prefix +The output files' names consist of @var{prefix} (@samp{x} by default) +followed by a group of letters @samp{aa}, @samp{ab}, and so on, such +that concatenating the output files in sorted order by file name produces +the original input file. (If more than 676 output files are required, +@code{split} uses @samp{zaa}, @samp{zab}, etc.) + +The program accepts the following options.
Also see @ref{Common options}. + +@table @samp + +@item -@var{lines} +@itemx -l @var{lines} +@itemx --lines=@var{lines} +@opindex -l +@opindex --lines +Put @var{lines} lines of @var{input} into each output file. + +@item -b @var{bytes} +@itemx --bytes=@var{bytes} +@opindex -b +@opindex --bytes +Put @var{bytes} bytes of @var{input} into each output file. +Appending @samp{b} multiplies @var{bytes} by 512, @samp{k} by 1024, and +@samp{m} by 1048576. + +@item -C @var{bytes} +@itemx --line-bytes=@var{bytes} +@opindex -C +@opindex --line-bytes +Put into each output file as many complete lines of @var{input} as +possible without exceeding @var{bytes} bytes. For lines longer than +@var{bytes} bytes, put @var{bytes} bytes into each output file until +fewer than @var{bytes} bytes of the line are left, then continue +normally. @var{bytes} has the same format as for the @samp{--bytes} +option. + +@item --verbose +@opindex --verbose +Write a diagnostic to standard error just before each output file is opened. + +@end table + + +@node csplit invocation +@section @code{csplit}: Split a file into context-determined pieces + +@pindex csplit +@cindex context splitting +@cindex splitting a file into pieces by context + +@code{csplit} creates zero or more output files containing sections of +@var{input} (standard input if @var{input} is @samp{-}). Synopsis: + +@example +csplit [@var{option}]@dots{} @var{input} @var{pattern}@dots{} +@end example + +The contents of the output files are determined by the @var{pattern} +arguments, as detailed below. An error occurs if a @var{pattern} +argument refers to a nonexistent line of the input file (e.g., if no +remaining line matches a given regular expression). After every +@var{pattern} has been matched, any remaining input is copied into one +last output file. + +By default, @code{csplit} prints the number of bytes written to each +output file after it has been created.
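To make the behavior above concrete, here is a small sketch, assuming a POSIX shell with @code{mktemp} available. The file name @file{input} and its five-line contents are made up for illustration, and the only @var{pattern} used is the simplest kind, a line number.

```shell
# Work in a scratch directory so the generated xx* files are easy to find.
dir=$(mktemp -d)
cd "$dir"
printf '1\n2\n3\n4\n5\n' > input

# Split before line 3; csplit prints the size in bytes of each piece.
csplit input 3
# → 4
# → 6

cat xx00    # lines 1-2: the input up to, but not including, line 3
cat xx01    # lines 3-5: the remaining input, copied into one last file
```

Concatenating @file{xx00} and @file{xx01} in that order reproduces @file{input}.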
+ +The types of pattern arguments are: + +@table @samp + +@item @var{n} +Create an output file containing the input up to but not including line +@var{n} (a positive integer). If followed by a repeat count, also +create an output file containing the next @var{n} lines of the input +file once for each repeat. + +@item /@var{regexp}/[@var{offset}] +Create an output file containing the input from the current line up to (but not +including) the next line of the input file that contains a match for +@var{regexp}. The optional @var{offset} is a @samp{+} or @samp{-} +followed by a positive integer. If it is given, the input up to the +matching line plus or minus @var{offset} is put into the output file, +and the line after that begins the next section of input. + +@item %@var{regexp}%[@var{offset}] +Like the previous type, except that it does not create an output +file, so that section of the input file is effectively ignored. + +@item @{@var{repeat-count}@} +Repeat the previous pattern @var{repeat-count} additional +times. @var{repeat-count} can either be a positive integer or an +asterisk, meaning repeat as many times as necessary until the input is +exhausted. + +@end table + +The output files' names consist of a prefix (@samp{xx} by default) +followed by a suffix. By default, the suffix is an ascending sequence +of two-digit decimal numbers from @samp{00} to @samp{99}. In any case, +concatenating the output files in sorted order by filename produces the +original input file. + +By default, if @code{csplit} encounters an error or receives a hangup, +interrupt, quit, or terminate signal, it removes any output files +that it has created so far before it exits. + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp + +@item -f @var{prefix} +@itemx --prefix=@var{prefix} +@opindex -f +@opindex --prefix +@cindex output file name prefix +Use @var{prefix} as the output file name prefix.
+ +@item -b @var{suffix} +@itemx --suffix=@var{suffix} +@opindex -b +@opindex --suffix +@cindex output file name suffix +Use @var{suffix} as the output file name suffix. When this option is +specified, the suffix string must include exactly one +@code{printf(3)}-style conversion specification, possibly including +format specification flags, a field width, a precision specification, +or all of these kinds of modifiers. The format letter must convert a +binary integer argument to readable form; thus, only @samp{d}, @samp{i}, +@samp{u}, @samp{o}, @samp{x}, and @samp{X} conversions are allowed. The +entire @var{suffix} is given (with the current output file number) to +@code{sprintf(3)} to form the file name suffixes for each of the +individual output files in turn. If this option is used, the +@samp{--digits} option is ignored. + +@item -n @var{digits} +@itemx --digits=@var{digits} +@opindex -n +@opindex --digits +Use output file names containing numbers that are @var{digits} digits +long instead of the default 2. + +@item -k +@itemx --keep-files +@opindex -k +@opindex --keep-files +Do not remove output files when errors are encountered. + +@item -z +@itemx --elide-empty-files +@opindex -z +@opindex --elide-empty-files +Suppress the generation of zero-length output files. (In cases where +the section delimiters of the input file are supposed to mark the first +lines of each of the sections, the first output file will generally be a +zero-length file unless you use this option.) The output file sequence +numbers always run consecutively starting from 0, even when this option +is specified. + +@item -s +@itemx -q +@itemx --silent +@itemx --quiet +@opindex -s +@opindex -q +@opindex --silent +@opindex --quiet +Do not print counts of output file sizes. + +@end table + + +@node Summarizing files +@chapter Summarizing files + +@cindex summarizing files + +These commands generate just a few numbers representing the entire +contents of files.
+ +@menu +* wc invocation:: Print byte, word, and line counts. +* sum invocation:: Print checksum and block counts. +* cksum invocation:: Print CRC checksum and byte counts. +* md5sum invocation:: Print or check message-digests. +@end menu + + +@node wc invocation +@section @code{wc}: Print byte, word, and line counts + +@pindex wc +@cindex byte count +@cindex character count +@cindex word count +@cindex line count + +@code{wc} counts the number of bytes, characters, whitespace-separated +words, and newlines in each given @var{file}, or standard input if none +are given or for a @var{file} of @samp{-}. Synopsis: + +@example +wc [@var{option}]@dots{} [@var{file}]@dots{} +@end example + +@cindex total counts +@vindex POSIXLY_CORRECT +@code{wc} prints one line of counts for each file, and if the file was +given as an argument, it prints the file name following the counts. If +more than one @var{file} is given, @code{wc} prints a final line +containing the cumulative counts, with the file name @file{total}. The +counts are printed in this order: newlines, words, characters, bytes. +By default, each count is output right-justified in a 7-byte field with +one space between fields so that the numbers and file names line up nicely +in columns. However, @sc{posix} requires that there be exactly one space +separating columns. You can make @code{wc} use the @sc{posix}-mandated +output format by setting the @env{POSIXLY_CORRECT} environment variable. + +By default, @code{wc} prints three counts: the newline, word, and byte +counts. Options can specify that only certain counts be printed. +Options do not undo others previously given, so + +@example +wc --bytes --words +@end example + +@noindent +prints both the byte counts and the word counts. + +With the @code{--max-line-length} option, @code{wc} prints the length +of the longest line per file, and if there is more than one file it +prints the maximum (not the sum) of those lengths.
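The following sketch, assuming a POSIX shell and @sc{gnu} @code{wc}, runs the cases described above on a made-up two-line input. Only the numbers are shown in the comments, because the amount of padding around them depends on the @code{wc} version and on @env{POSIXLY_CORRECT}.

```shell
# Sample input: two lines, three words, fourteen bytes.
printf 'one two\nthree\n' | wc
# prints the newline, word, and byte counts: 2 3 14

# Options accumulate, and the output order is fixed (words before bytes)
# no matter in which order the options are given.
printf 'one two\nthree\n' | wc --bytes --words
# prints the word and byte counts: 3 14

# The longest line is "one two", which is 7 characters wide.
printf 'one two\nthree\n' | wc -L
# prints 7
```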
+ +The program accepts the following options. Also see @ref{Common options}. + +@table @samp + +@item -c +@itemx --bytes +@opindex -c +@opindex --bytes +Print only the byte counts. + +@item -m +@itemx --chars +@opindex -m +@opindex --chars +Print only the character counts. + +@item -w +@itemx --words +@opindex -w +@opindex --words +Print only the word counts. + +@item -l +@itemx --lines +@opindex -l +@opindex --lines +Print only the newline counts. + +@item -L +@itemx --max-line-length +@opindex -L +@opindex --max-line-length +Print only the maximum line lengths. + +@end table + + +@node sum invocation +@section @code{sum}: Print checksum and block counts + +@pindex sum +@cindex 16-bit checksum +@cindex checksum, 16-bit + +@code{sum} computes a 16-bit checksum for each given @var{file}, or +standard input if none are given or for a @var{file} of @samp{-}. Synopsis: + +@example +sum [@var{option}]@dots{} [@var{file}]@dots{} +@end example + +@code{sum} prints the checksum for each @var{file} followed by the +number of blocks in the file (rounded up). If more than one @var{file} +is given, file names are also printed (by default). (With the +@samp{--sysv} option, corresponding file names are printed when there is +at least one file argument.) + +By default, @sc{gnu} @code{sum} computes checksums using an algorithm +compatible with BSD @code{sum} and prints file sizes in units of +1024-byte blocks. + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp + +@item -r +@opindex -r +@cindex BSD @code{sum} +Use the default (BSD compatible) algorithm. This option is included for +compatibility with the System V @code{sum}. Unless @samp{-s} was also +given, it has no effect. + +@item -s +@itemx --sysv +@opindex -s +@opindex --sysv +@cindex System V @code{sum} +Compute checksums using an algorithm compatible with System V +@code{sum}'s default, and print file sizes in units of 512-byte blocks. 
+ +@end table + +@code{sum} is provided for compatibility; the @code{cksum} program (see +next section) is preferable in new applications. + + +@node cksum invocation +@section @code{cksum}: Print CRC checksum and byte counts + +@pindex cksum +@cindex cyclic redundancy check +@cindex CRC checksum + +@code{cksum} computes a cyclic redundancy check (CRC) checksum for each +given @var{file}, or standard input if none are given or for a +@var{file} of @samp{-}. Synopsis: + +@example +cksum [@var{option}]@dots{} [@var{file}]@dots{} +@end example + +@code{cksum} prints the CRC checksum for each file along with the number +of bytes in the file, and the filename unless no arguments were given. + +@code{cksum} is typically used to ensure that files +transferred by unreliable means (e.g., netnews) have not been corrupted, +by comparing the @code{cksum} output for the received files with the +@code{cksum} output for the original files (typically given in the +distribution). + +The CRC algorithm is specified by the @sc{posix.2} standard. It is not +compatible with the BSD or System V @code{sum} algorithms (see the +previous section); it is more robust. + +The only options are @samp{--help} and @samp{--version}. @xref{Common +options}. + + +@node md5sum invocation +@section @code{md5sum}: Print or check message-digests + +@pindex md5sum +@cindex 128-bit checksum +@cindex checksum, 128-bit +@cindex fingerprint, 128-bit +@cindex message-digest, 128-bit + +@code{md5sum} computes a 128-bit checksum (or @dfn{fingerprint} or +@dfn{message-digest}) for each specified @var{file}. +If a @var{file} is specified as @samp{-} or if no files are given +@code{md5sum} computes the checksum for the standard input. +@code{md5sum} can also determine whether a file and checksum are +consistent. 
Synopses: + +@example +md5sum [@var{option}]@dots{} [@var{file}]@dots{} +md5sum [@var{option}]@dots{} --check [@var{file}] +@end example + +For each @var{file}, @samp{md5sum} outputs the MD5 checksum, a flag +indicating a binary or text input file, and the filename. +If @var{file} is omitted or specified as @samp{-}, standard input is read. + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp + +@item -b +@itemx --binary +@opindex -b +@opindex --binary +@cindex binary input files +Treat all input files as binary. This option has no effect on Unix +systems, since they don't distinguish between binary and text files. +This option is useful on systems that have different internal and +external character representations. On MS-DOS and MS-Windows, this is +the default. + +@item -c +@itemx --check +Read filenames and checksum information from the single @var{file} +(or from stdin if no @var{file} was specified) and report whether +each named file and the corresponding checksum data are consistent. +The input to this mode of @code{md5sum} is usually the output of +a prior, checksum-generating run of @samp{md5sum}. +Each valid line of input consists of an MD5 checksum, a binary/text +flag, and then a filename. +Binary files are marked with @samp{*}, text with @samp{ }. +For each such line, @code{md5sum} reads the named file and computes its +MD5 checksum. Then, if the computed message digest does not match the +one on the line with the filename, the file is noted as having +failed the test. Otherwise, the file passes the test. +By default, for each valid line, one line is written to standard +output indicating whether the named file passed the test. +After all checks have been performed, if there were any failures, +a warning is issued to standard error. +Use the @samp{--status} option to inhibit that output. 
+If any listed file cannot be opened or read, if any valid line has +an MD5 checksum inconsistent with the associated file, or if no valid +line is found, @code{md5sum} exits with nonzero status. Otherwise, +it exits successfully. + +@item --status +@opindex --status +@cindex verifying MD5 checksums +This option is useful only when verifying checksums. +When verifying checksums, don't generate the default one-line-per-file +diagnostic and don't output the warning summarizing any failures. +Failures to open or read a file still evoke individual diagnostics to +standard error. +If all listed files are readable and are consistent with the associated +MD5 checksums, exit successfully. Otherwise exit with a status code +indicating there was a failure. + +@item -t +@itemx --text +@opindex -t +@opindex --text +@cindex text input files +Treat all input files as text files. This is the reverse of +@samp{--binary}. + +@item -w +@itemx --warn +@opindex -w +@opindex --warn +@cindex verifying MD5 checksums +When verifying checksums, warn about improperly formatted MD5 checksum lines. +This option is useful only if all but a few lines in the checked input +are valid. + +@end table + + +@node Operating on sorted files +@chapter Operating on sorted files + +@cindex operating on sorted files +@cindex sorted files, operations on + +These commands work with (or produce) sorted files. + +@menu +* sort invocation:: Sort text files. +* uniq invocation:: Uniquify files. +* comm invocation:: Compare two sorted files line by line. +* ptx invocation:: Produce a permuted index of file contents. +* tsort invocation:: Topological sort. +@end menu + + +@node sort invocation +@section @code{sort}: Sort text files + +@pindex sort +@cindex sorting files + +@code{sort} sorts, merges, or compares all the lines from the given +files, or standard input if none are given or for a @var{file} of +@samp{-}. By default, @code{sort} writes the results to standard +output.
Synopsis: + +@example +sort [@var{option}]@dots{} [@var{file}]@dots{} +@end example + +@code{sort} has three modes of operation: sort (the default), merge, +and check for sortedness. The following options change the operation +mode: + +@table @samp + +@item -c +@itemx --check +@opindex -c +@opindex --check +@cindex checking for sortedness +Check whether the given files are already sorted: if they are not all +sorted, print an error message and exit with a status of 1. +Otherwise, exit successfully. + +@item -m +@itemx --merge +@opindex -m +@opindex --merge +@cindex merging sorted files +Merge the given files by sorting them as a group. Each input file must +always be individually sorted. It always works to sort instead of +merge; merging is provided because it is faster, in the case where it +works. + +@end table + +@vindex LC_COLLATE +A pair of lines is compared as follows: if any key fields have been +specified, @code{sort} compares each pair of fields, in the order +specified on the command line, according to the associated ordering +options, until a difference is found or no fields are left. +Unless otherwise specified, all comparisons use the character +collating sequence specified by the @env{LC_COLLATE} locale. + +If any of the global options @samp{bdfgiMnr} are given but no key fields +are specified, @code{sort} compares the entire lines according to the +global options. + +Finally, as a last resort when all keys compare equal (or if no ordering +options were specified at all), @code{sort} compares the entire lines. +The last resort comparison honors the @option{--reverse} (@option{-r}) +global option. The @option{--stable} (@option{-s}) option disables this +last-resort comparison so that lines in which all fields compare equal +are left in their original relative order. If no fields or global +options are specified, @option{--stable} (@option{-s}) has no effect. 
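The effect of the last-resort comparison can be seen in a small sketch, assuming a POSIX shell and @sc{gnu} @code{sort}. The three input lines are invented, @samp{-k 2,2n} selects the second field as a numeric key (the key syntax is not covered in this part of the text), and @samp{LC_ALL=C} is set only to make the whole-line comparison predictable.

```shell
# "b 1" and "a 1" have equal keys (field 2, compared numerically).
# Without -s, the last-resort comparison of whole lines puts "a 1" first.
printf 'b 1\na 1\nc 0\n' | LC_ALL=C sort -k 2,2n
# → c 0
# → a 1
# → b 1

# With -s, the tie is left in its original input order.
printf 'b 1\na 1\nc 0\n' | LC_ALL=C sort -s -k 2,2n
# → c 0
# → b 1
# → a 1
```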
+ +@sc{gnu} @code{sort} (as specified for all @sc{gnu} utilities) has no limits on +input line length or restrictions on bytes allowed within lines. In +addition, if the final byte of an input file is not a newline, @sc{gnu} +@code{sort} silently supplies one. A line's trailing newline is not +part of the line for comparison purposes.@footnote{@sc{posix}.2-1992 +requires that the trailing newline be part of the comparison, and some +@code{sort} implementations obey this requirement, but it is widely +considered to be a bug in the standard and the next version of +@sc{posix}.2 will likely remove this requirement.} + +Upon any error, @code{sort} exits with a status of @samp{2}. + +@vindex TMPDIR +If the environment variable @env{TMPDIR} is set, @code{sort} uses its +value as the directory for temporary files instead of @file{/tmp}. The +@option{--temporary-directory} (@option{-T}) option in turn overrides +the environment variable. + + +The following options affect the ordering of output lines. They may be +specified globally or as part of a specific key field. If no key +fields are specified, global options apply to comparison of entire +lines; otherwise the global options are inherited by key fields that do +not specify any special options of their own. In pre-@sc{posix} +versions of @command{sort}, global options affect only later key fields, +so portable shell scripts should specify global options first. + +@table @samp + +@item -b +@itemx --ignore-leading-blanks +@opindex -b +@opindex --ignore-leading-blanks +@cindex blanks, ignoring leading +@vindex LC_CTYPE +Ignore leading blanks when finding sort keys in each line. +The @env{LC_CTYPE} locale determines character types. 
+ +@item -d +@itemx --dictionary-order +@opindex -d +@opindex --dictionary-order +@cindex dictionary order +@cindex phone directory order +@cindex telephone directory order +@vindex LC_CTYPE +Sort in @dfn{phone directory} order: ignore all characters except +letters, digits and blanks when sorting. +The @env{LC_CTYPE} locale determines character types. + +@item -f +@itemx --ignore-case +@opindex -f +@opindex --ignore-case +@cindex ignoring case +@cindex case folding +@vindex LC_CTYPE +Fold lowercase characters into the equivalent uppercase characters when +comparing so that, for example, @samp{b} and @samp{B} sort as equal. +The @env{LC_CTYPE} locale determines character types. + +@item -g +@itemx --general-numeric-sort +@opindex -g +@opindex --general-numeric-sort +@cindex general numeric sort +@vindex LC_NUMERIC +Sort numerically, using the standard C function @code{strtod} to convert +a prefix of each line to a double-precision floating point number. +This allows floating point numbers to be specified in scientific notation, +like @code{1.0e-34} and @code{10e100}. +The @env{LC_NUMERIC} locale determines the decimal-point character. +Do not report overflow, underflow, or conversion errors. +Use the following collating sequence: + +@itemize @bullet +@item +Lines that do not start with numbers (all considered to be equal). +@item +NaNs (``Not a Number'' values, in IEEE floating point arithmetic) +in a consistent but machine-dependent order. +@item +Minus infinity. +@item +Finite numbers in ascending numeric order (with @math{-0} and @math{+0} equal). +@item +Plus infinity. +@end itemize + +Use this option only if there is no alternative; it is much slower than +@option{--numeric-sort} (@option{-n}) and it can lose information when +converting to floating point. 
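As a small sketch of the difference from @option{--numeric-sort} (@option{-n}), the commands below sort the same three made-up lines both ways. The sample values and variable names are assumptions chosen for the illustration.

```shell
# Assumption: force the C locale so the decimal point is ".".
export LC_ALL=C

# -g converts each line with strtod, so 1e3 is read as 1000.
general=$(printf '1e3\n20\n3\n' | sort -g)

# -n does not recognize exponential notation, so 1e3 is read as 1.
plain=$(printf '1e3\n20\n3\n' | sort -n)

printf '%s\n' "$general" "$plain"
```

With @option{-g} the line @samp{1e3} sorts last; with @option{-n} it sorts first.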
+ +@item -i +@itemx --ignore-nonprinting +@opindex -i +@opindex --ignore-nonprinting +@cindex nonprinting characters, ignoring +@cindex unprintable characters, ignoring +@vindex LC_CTYPE +Ignore nonprinting characters. +The @env{LC_CTYPE} locale determines character types. + +@item -M +@itemx --month-sort +@opindex -M +@opindex --month-sort +@cindex months, sorting by +@vindex LC_TIME +An initial string, consisting of any amount of whitespace, followed +by a month name abbreviation, is folded to UPPER case and +compared in the order @samp{JAN} < @samp{FEB} < @dots{} < @samp{DEC}. +Invalid names compare low to valid names. The @env{LC_TIME} locale +determines the month spellings. + +@item -n +@itemx --numeric-sort +@opindex -n +@opindex --numeric-sort +@cindex numeric sort +@vindex LC_NUMERIC +Sort numerically: the number begins each line; specifically, it consists +of optional whitespace, an optional @samp{-} sign, and zero or more +digits possibly separated by thousands separators, optionally followed +by a decimal-point character and zero or more digits. The @env{LC_NUMERIC} +locale specifies the decimal-point character and thousands separator. + +Numeric sort uses what might be considered an unconventional method to +compare strings representing floating point numbers. Rather than first +converting each string to the C @code{double} type and then comparing +those values, @command{sort} aligns the decimal-point characters in the +two strings and compares the strings a character at a time. One benefit +of using this approach is its speed. In practice this is much more +efficient than performing the two corresponding string-to-double (or +even string-to-integer) conversions and then comparing doubles. In +addition, there is no corresponding loss of precision. Converting each +string to @code{double} before comparison would limit precision to about +16 digits on most systems. + +Neither a leading @samp{+} nor exponential notation is recognized. 
+To compare such strings numerically, use the +@option{--general-numeric-sort} (@option{-g}) option. + +@item -r +@itemx --reverse +@opindex -r +@opindex --reverse +@cindex reverse sorting +Reverse the result of comparison, so that lines with greater key values +appear earlier in the output instead of later. + +@end table + +Other options are: + +@table @samp + +@item -o @var{output-file} +@itemx --output=@var{output-file} +@opindex -o +@opindex --output +@cindex overwriting of input, allowed +Write output to @var{output-file} instead of standard output. +If necessary, @command{sort} reads input before opening +@var{output-file}, so you can safely sort a file in place by using +commands like @code{sort -o F F} and @code{cat F | sort -o F}. + +@vindex POSIXLY_CORRECT +If @option{-c} is not also specified, @option{-o} may appear after an +input file even if @env{POSIXLY_CORRECT} is set, e.g., @samp{sort F -o +F}. Warning: this usage is obsolescent. Future versions of @sc{posix} +will require that support for it be withdrawn. Portable scripts should +specify @samp{-o @var{output-file}} before any input files. + +@item -S @var{size} +@itemx --buffer-size=@var{size} +@opindex -S +@opindex --buffer-size +@cindex size for main memory sorting +Use a main-memory sort buffer of the given @var{size}. By default, +@var{size} is in units of 1,024 bytes. Appending @samp{%} causes +@var{size} to be interpreted as a percentage of physical memory. +Appending @samp{k} multiplies @var{size} by 1,024 (the default), +@samp{M} by 1,048,576, @samp{G} by 1,073,741,824, and so on for +@samp{T}, @samp{P}, @samp{E}, @samp{Z}, and @samp{Y}. Appending +@samp{b} causes @var{size} to be interpreted as a byte count, with no +multiplication. + +This option can improve the performance of @command{sort} by causing it +to start with a larger or smaller sort buffer than the default. +However, this option affects only the initial buffer size. 
The buffer +grows beyond @var{size} if @command{sort} encounters input lines larger +than @var{size}. + +@item -t @var{separator} +@itemx --field-separator=@var{separator} +@opindex -t +@opindex --field-separator +@cindex field separator character +Use character @var{separator} as the field separator when finding the +sort keys in each line. By default, fields are separated by the empty +string between a non-whitespace character and a whitespace character. +That is, given the input line @w{@samp{ foo bar}}, @code{sort} breaks it +into fields @w{@samp{ foo}} and @w{@samp{ bar}}. The field separator is +not considered to be part of either the field preceding or the field +following. But note that sort fields that extend to the end of the line, +as @samp{-k 2}, or sort fields consisting of a range, as @samp{-k 2,3}, +retain the field separators present between the endpoints of the range. + +@item -T @var{tempdir} +@itemx --temporary-directory=@var{tempdir} +@opindex -T +@opindex --temporary-directory +@cindex temporary directory +@vindex TMPDIR +Use directory @var{tempdir} to store temporary files, overriding the +@env{TMPDIR} environment variable. If this option is given more than +once, temporary files are stored in all the directories given. If you +have a large sort or merge that is I/O-bound, you can often improve +performance by using this option to specify directories on different +disks and controllers. + +@item -u +@itemx --unique +@opindex -u +@opindex --unique +@cindex uniquifying output + +Normally, output only the first of a sequence of lines that compare +equal. For the @option{--check} (@option{-c}) option, +check that no pair of consecutive lines compares equal. + +@item -k @var{pos1}[,@var{pos2}] +@itemx --key=@var{pos1}[,@var{pos2}] +@opindex -k +@opindex --key +@cindex sort field +Specify a sort field that consists of the part of the line between +@var{pos1} and @var{pos2} (or the end of the line, if @var{pos2} is +omitted), @emph{inclusive}. 
Fields and character positions are numbered +starting with 1. So to sort on the second field, you'd use +@samp{--key=2,2} (@samp{-k 2,2}). See below for more examples. + +@item -z +@itemx --zero-terminated +@opindex -z +@opindex --zero-terminated +@cindex sort zero-terminated lines +Treat the input as a set of lines, each terminated by a zero byte (@sc{ascii} +@sc{nul} (Null) character) instead of an @sc{ascii} @sc{lf} (Line Feed). +This option can be useful in conjunction with @samp{perl -0} or +@samp{find -print0} and @samp{xargs -0} which do the same in order to +reliably handle arbitrary pathnames (even those which contain Line Feed +characters.) + +@item +@var{pos1} [-@var{pos2}] +The obsolescent, traditional option for specifying a sort field. The field +consists of the line between @var{pos1} and up to but @emph{not including} +@var{pos2} (or the end of the line if @var{pos2} is omitted). Fields +and character positions are numbered starting with 0. See below. + +Warning: the @samp{+@var{pos1}} usage is obsolescent. Future versions of +@sc{posix} will require that support for it be withdrawn. Use +@option{--key} (@option{-k}) instead. + +@end table + +Historical (BSD and System V) implementations of @code{sort} have +differed in their interpretation of some options, particularly +@samp{-b}, @samp{-f}, and @samp{-n}. @sc{gnu} sort follows the @sc{posix} +behavior, which is usually (but not always!) like the System V behavior. +According to @sc{posix}, @samp{-n} no longer implies @samp{-b}. For +consistency, @samp{-M} has been changed in the same way. This may +affect the meaning of character positions in field specifications in +obscure cases. The only fix is to add an explicit @samp{-b}. 
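Some of the options described above can be sketched together as follows. The sample data and variable names are assumptions made up for the illustration, not part of @command{sort}.

```shell
# Assumption: force the C locale so results do not depend on the locale.
export LC_ALL=C

# -t sets the field separator; -k 2,2n limits the numeric key to field two.
by_field=$(printf 'x:10\ny:9\nz:100\n' | sort -t : -k 2,2n)

# -u outputs only the first of each run of lines that compare equal.
unique=$(printf 'b\na\nb\na\n' | sort -u)

# -o may name an input file: sort reads its input before opening the output.
tmpfile=$(mktemp)
printf '2\n1\n' > "$tmpfile"
sort -o "$tmpfile" "$tmpfile"
in_place=$(cat "$tmpfile")
rm "$tmpfile"

printf '%s\n' "$by_field" "$unique" "$in_place"
```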
+ +A position in a sort field specified with the @samp{-k} or @samp{+} +option has the form @samp{@var{f}.@var{c}}, where @var{f} is the number +of the field to use and @var{c} is the number of the first character +from the beginning of the field (for @samp{+@var{pos}}) or from the end +of the previous field (for @samp{-@var{pos}}). If the @samp{.@var{c}} +is omitted, it is taken to be the first character in the field. If the +@samp{-b} option was specified, the @samp{.@var{c}} part of a field +specification is counted from the first nonblank character of the field +(for @samp{+@var{pos}}) or from the first nonblank character following +the previous field (for @samp{-@var{pos}}). + +A sort key option may also have any of the option letters @samp{Mbdfinr} +appended to it, in which case the global ordering options are not used +for that particular field. The @samp{-b} option may be independently +attached to either or both of the @samp{+@var{pos}} and +@samp{-@var{pos}} parts of a field specification, and if it is inherited +from the global options it will be attached to both. +Keys may span multiple fields. + +Here are some examples to illustrate various combinations of options. +In them, the @sc{posix} @samp{-k} option is used to specify sort keys rather +than the obsolescent @samp{+@var{pos1}-@var{pos2}} syntax. + +@itemize @bullet + +@item +Sort in descending (reverse) numeric order. + +@example +sort -nr +@end example + +@item +Sort alphabetically, omitting the first and second fields. +This uses a single key composed of the characters beginning +at the start of field three and extending to the end of each line. + +@example +sort -k 3 +@end example + +@item +Sort numerically on the second field and resolve ties by sorting +alphabetically on the third and fourth characters of field five. +Use @samp{:} as the field delimiter. 
+ +@example +sort -t : -k 2,2n -k 5.3,5.4 +@end example + +Note that if you had written @samp{-k 2} instead of @samp{-k 2,2} +@command{sort} would have used all characters beginning in the second field +and extending to the end of the line as the primary @emph{numeric} +key. For the large majority of applications, treating keys spanning +more than one field as numeric will not do what you expect. + +Also note that the @samp{n} modifier was applied to the field-end +specifier for the first key. It would have been equivalent to +specify @samp{-k 2n,2} or @samp{-k 2n,2n}. All modifiers except +@samp{b} apply to the associated @emph{field}, regardless of whether +the modifier character is attached to the field-start and/or the +field-end part of the key specifier. + +@item +Sort the password file on the fifth field and ignore any +leading white space. Sort lines with equal values in field five +on the numeric user ID in field three. + +@example +sort -t : -k 5b,5 -k 3,3n /etc/passwd +@end example + +An alternative is to use the global numeric modifier @samp{-n}. + +@example +sort -t : -n -k 5b,5 -k 3,3 /etc/passwd +@end example + +Finally, to ignore both leading and trailing white space, you +could have applied the @samp{b} modifier to the field-end specifier +for the first key, + +@example +sort -t : -n -k 5b,5b -k 3,3 /etc/passwd +@end example + +or you could have used the global @samp{-b} modifier instead of @samp{-n} +and an explicit @samp{n} with the second key specifier. + +@example +sort -t : -b -k 5,5 -k 3,3n /etc/passwd +@end example + +@item +Generate a tags file in case-insensitive sorted order. + +@smallexample +find src -type f -print0 | sort -t / -z -f | xargs -0 etags --append +@end smallexample + +The use of @samp{-print0}, @samp{-z}, and @samp{-0} in this case means +that pathnames that contain Line Feed characters will not get broken up +by the sort operation. + +@c This example is a bit contrived and needs more explanation.
+@c @item +@c Sort records separated by an arbitrary string by using a pipe to convert +@c each record delimiter string to @samp{\0}, then using sort's -z option, +@c and converting each @samp{\0} back to the original record delimiter. +@c +@c @example +@c printf 'c\n\nb\n\na\n'|perl -0pe 's/\n\n/\n\0/g'|sort -z|perl -0pe 's/\0/\n/g' +@c @end example + +@end itemize + + +@node uniq invocation +@section @code{uniq}: Uniquify files + +@pindex uniq +@cindex uniquify files + +@code{uniq} writes the unique lines in the given @file{input}, or +standard input if nothing is given or for an @var{input} name of +@samp{-}. Synopsis: + +@example +uniq [@var{option}]@dots{} [@var{input} [@var{output}]] +@end example + +By default, @code{uniq} prints the unique lines in a sorted file, i.e., +discards all but one of identical successive lines. Optionally, it can +instead show only lines that appear exactly once, or lines that appear +more than once. + +The input must be sorted. If your input is not sorted, perhaps you want +to use @code{sort -u}. + +If no @var{output} file is specified, @code{uniq} writes to standard +output. + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp + +@item -@var{n} +@itemx -f @var{n} +@itemx --skip-fields=@var{n} +@opindex -@var{n} +@opindex -f +@opindex --skip-fields +Skip @var{n} fields on each line before checking for uniqueness. Fields +are sequences of non-space non-tab characters that are separated from +each other by at least one space or tab. + +@item +@var{n} +@itemx -s @var{n} +@itemx --skip-chars=@var{n} +@opindex +@var{n} +@opindex -s +@opindex --skip-chars +Skip @var{n} characters before checking for uniqueness. If you use both +the field and character skipping options, fields are skipped over first. + +Warning: the @samp{+@var{n}} usage is obsolescent. Future versions of +@sc{posix} will require that support for it be withdrawn. Use @samp{-s +@var{n}} instead. 
+ +@item -c +@itemx --count +@opindex -c +@opindex --count +Print the number of times each line occurred along with the line. + +@item -i +@itemx --ignore-case +@opindex -i +@opindex --ignore-case +Ignore differences in case when comparing lines. + +@item -d +@itemx --repeated +@opindex -d +@opindex --repeated +@cindex duplicate lines, outputting +Print only duplicate lines. + +@item -D +@itemx --all-repeated[=@var{delimit-method}] +@opindex -D +@opindex --all-repeated +@cindex all duplicate lines, outputting +Print all duplicate lines and only duplicate lines. +This option is useful mainly in conjunction with other options e.g., +to ignore case or to compare only selected fields. +The optional @var{delimit-method} tells how to delimit +groups of duplicate lines, and must be one of the following: + +@table @samp + +@item none +Do not delimit groups of duplicate lines. +This is equivalent to @option{--all-repeated} (@option{-D}). + +@item prepend +Output a newline before each group of duplicate lines. + +@item separate +Separate groups of duplicate lines with a single newline. +This is the same as using @samp{prepend}, except that +there is no newline before the first group, and hence +may be better suited for output direct to users. +@end table + +Note that when groups are delimited and the input stream contains +two or more consecutive blank lines, then the output is ambiguous. +To avoid that, filter the input through @samp{tr -s '\n'} to replace +each sequence of consecutive newlines with a single newline. + +This is a @sc{gnu} extension. +@c FIXME: give an example showing *how* it's useful + +@item -u +@itemx --unique +@opindex -u +@opindex --unique +@cindex unique lines, outputting +Print only unique lines. + +@item -w @var{n} +@itemx --check-chars=@var{n} +@opindex -w +@opindex --check-chars +Compare @var{n} characters on each line (after skipping any specified +fields and characters). By default the entire rest of the lines are +compared. 
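A few of the options above can be sketched on small, already sorted inputs. The sample lines and variable names are assumptions made up for the illustration.

```shell
# Assumption: force the C locale so comparisons do not depend on the locale.
export LC_ALL=C

# -d prints one copy of each line that is repeated.
dups=$(printf 'a\na\nb\nc\nc\n' | uniq -d)

# -u prints only the lines that are not repeated.
singles=$(printf 'a\na\nb\nc\nc\n' | uniq -u)

# -i ignores case, so "Foo" and "foo" count as the same line.
folded=$(printf 'Foo\nfoo\nbar\n' | uniq -i)

# -w 1 compares only the first character of each line.
by_prefix=$(printf 'apple\napricot\nbanana\n' | uniq -w 1)

printf '%s\n' "$dups" "$singles" "$folded" "$by_prefix"
```

In each case the first line of a group of equal lines is the one that is output.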
+ +@end table + + +@node comm invocation +@section @code{comm}: Compare two sorted files line by line + +@pindex comm +@cindex line-by-line comparison +@cindex comparing sorted files + +@code{comm} writes to standard output lines that are common, and lines +that are unique, to two input files; a file name of @samp{-} means +standard input. Synopsis: + +@example +comm [@var{option}]@dots{} @var{file1} @var{file2} +@end example + +@vindex LC_COLLATE +Before @code{comm} can be used, the input files must be sorted using the +collating sequence specified by the @env{LC_COLLATE} locale. +If an input file ends in a non-newline +character, a newline is silently appended. The @code{sort} command with +no options always outputs a file that is suitable input to @code{comm}. + +@cindex differing lines +@cindex common lines +With no options, @code{comm} produces three column output. Column one +contains lines unique to @var{file1}, column two contains lines unique +to @var{file2}, and column three contains lines common to both files. +Columns are separated by a single TAB character. +@c FIXME: when there's an option to supply an alternative separator +@c string, append `by default' to the above sentence. + +@opindex -1 +@opindex -2 +@opindex -3 +The options @samp{-1}, @samp{-2}, and @samp{-3} suppress printing of +the corresponding columns. Also see @ref{Common options}. + +Unlike some other comparison utilities, @code{comm} has an exit +status that does not depend on the result of the comparison. +Upon normal completion @code{comm} produces an exit code of zero. +If there is an error it exits with nonzero status. + + +@node tsort invocation +@section @code{tsort}: Topological sort + +@pindex tsort +@cindex topological sort + +@code{tsort} performs a topological sort on the given @var{file}, or +standard input if no input file is given or for a @var{file} of +@samp{-}. 
Synopsis: + +@example +tsort [@var{option}] [@var{file}] +@end example + +@code{tsort} reads its input as pairs of strings, separated by blanks, +indicating a partial ordering. The output is a total ordering that +corresponds to the given partial ordering. + +For example + +@example +tsort <<EOF +a b c +d +e f +b c d e +EOF +@end example + +@noindent +will produce the output + +@example +a +b +c +d +e +f +@end example + +@code{tsort} detects cycles in the input and writes the first cycle +encountered to standard error. + +Note that for a given partial ordering, generally there is no unique +total ordering. + +The only options are @samp{--help} and @samp{--version}. @xref{Common +options}. + + +@node ptx invocation +@section @code{ptx}: Produce permuted indexes + +@pindex ptx + +@code{ptx} reads a text file and essentially produces a permuted index, with +each keyword in its context. The calling sketch is either one of: + +@example +ptx [@var{option} @dots{}] [@var{file} @dots{}] +ptx -G [@var{option} @dots{}] [@var{input} [@var{output}]] +@end example + +The @samp{-G} (or its equivalent: @samp{--traditional}) option disables +all @sc{gnu} extensions and reverts to traditional mode, thus introducing some +limitations and changing several of the program's default option values. +When @samp{-G} is not specified, @sc{gnu} extensions are always enabled. +@sc{gnu} extensions to @code{ptx} are documented wherever appropriate in this +document. For the full list, see @ref{Compatibility in ptx}. + +Individual options are explained in the following sections. + +When @sc{gnu} extensions are enabled, there may be zero, one or several +@var{file}s after the options. If there is no @var{file}, the program +reads the standard input. If there are one or more @var{file}s, they +give the names of input files which are all read in turn, as if all the +input files were concatenated.
However, there is a full contextual +break between each file and, when automatic referencing is requested, +file names and line numbers refer to individual text input files. In +all cases, the program outputs the permuted index to the standard +output. + +When @sc{gnu} extensions are @emph{not} enabled, that is, when the program +operates in traditional mode, there may be zero, one or two parameters +besides the options. If there are no parameters, the program reads the +standard input and outputs the permuted index to the standard output. +If there is only one parameter, it names the text @var{input} to be read +instead of the standard input. If two parameters are given, they give +respectively the name of the @var{input} file to read and the name of +the @var{output} file to produce. @emph{Be very careful} to note that, +in this case, the contents of the file named by the second parameter are +destroyed. This behavior is dictated by System V @code{ptx} +compatibility; @sc{gnu} Standards normally discourage output parameters not +introduced by an option. + +Note that for @emph{any} file named as the value of an option or as an +input text file, a single dash @kbd{-} may be used, in which case +standard input is assumed. However, it would not make sense to use this +convention more than once per program invocation. + +@menu +* General options in ptx:: Options which affect general program behavior. +* Charset selection in ptx:: Underlying character set considerations. +* Input processing in ptx:: Input fields, contexts, and keyword selection. +* Output formatting in ptx:: Types of output format, and sizing the fields. +* Compatibility in ptx:: +@end menu + + +@node General options in ptx +@subsection General options + +@table @samp + +@item -C +@itemx --copyright +Print a short note about the copyright and copying conditions, then +exit without further processing.
+ +@item -G +@itemx --traditional +As already explained, this option disables all @sc{gnu} extensions to +@code{ptx} and switches to traditional mode. + +@item --help +Print a short help on standard output, then exit without further +processing. + +@item --version +Print the program version on standard output, then exit without further +processing. + +@end table + + +@node Charset selection in ptx +@subsection Charset selection + +@c FIXME: People don't necessarily know what an IBM-PC was these days. +As it is set up now, the program assumes that the input file is coded +using 8-bit ISO 8859-1 code, also known as Latin-1 character set, +@emph{unless} it is compiled for MS-DOS, in which case it uses the +character set of the IBM-PC. (@sc{gnu} @code{ptx} is not known to work on +smaller MS-DOS machines anymore.) Compared to 7-bit @sc{ascii}, the set +of characters which are letters is different; this alters the behavior +of regular expression matching. Thus, the default regular expression +for a keyword allows foreign or diacriticized letters. Keyword sorting, +however, is still crude; it obeys the underlying character set ordering +quite blindly. + +@table @samp + +@item -f +@itemx --ignore-case +Fold lower case letters to upper case for sorting. + +@end table + + +@node Input processing in ptx +@subsection Word selection and input processing + +@table @samp + +@item -b @var{file} +@itemx --break-file=@var{file} + +This option provides an alternative (to @samp{-W}) method of describing +which characters make up words. It introduces the name of a +file which contains a list of characters which can@emph{not} be part of +one word; this file is called the @dfn{Break file}. Any character which +is not part of the Break file is a word constituent. If both options +@samp{-b} and @samp{-W} are specified, then @samp{-W} has precedence and +@samp{-b} is ignored.
+ +When @sc{gnu} extensions are enabled, the only way to avoid newline as a +break character is to write all the break characters in the file with no +newline at all, not even at the end of the file. When @sc{gnu} extensions +are disabled, spaces, tabs and newlines are always considered as break +characters even if not included in the Break file. + +@item -i @var{file} +@itemx --ignore-file=@var{file} + +The file associated with this option contains a list of words which will +never be taken as keywords in concordance output. It is called the +@dfn{Ignore file}. The file contains exactly one word in each line; the +end of line separation of words is not subject to the value of the +@samp{-S} option. + +There is a default Ignore file used by @code{ptx} when this option is +not specified, usually found in @file{/usr/local/lib/eign} if this has +not been changed at installation time. If you want to deactivate the +default Ignore file, specify @code{/dev/null} instead. + +@item -o @var{file} +@itemx --only-file=@var{file} + +The file associated with this option contains a list of words which will +be retained in concordance output; any word not mentioned in this file +is ignored. The file is called the @dfn{Only file}. The file contains +exactly one word in each line; the end of line separation of words is +not subject to the value of the @samp{-S} option. + +There is no default for the Only file. When both an Only file and an +Ignore file are specified, a word is considered a keyword only +if it is listed in the Only file and not in the Ignore file. + +@item -r +@itemx --references + +On each input line, the leading sequence of non-white space characters will be +taken to be a reference that has the purpose of identifying this input +line in the resulting permuted index. For more information about reference +production, see @ref{Output formatting in ptx}. +Using this option changes the default value for option @samp{-S}.
+ +Using this option, the program does not try very hard to remove +references from contexts in output, but it succeeds in doing so +@emph{when} the context ends exactly at the newline. If option +@samp{-r} is used with the default value of @samp{-S}, or when @sc{gnu} extensions +are disabled, this condition is always met and references are completely +excluded from the output contexts. + +@item -S @var{regexp} +@itemx --sentence-regexp=@var{regexp} + +This option selects which regular expression will describe the end of a +line or the end of a sentence. In fact, this regular expression is the +only distinction between ends of lines and ends of sentences, and input +line boundaries have no special significance outside this option. By +default, when @sc{gnu} extensions are enabled and the @samp{-r} option is not +used, ends of sentences are used. In this case, the default @var{regexp} is +imported from @sc{gnu} Emacs: + +@example +[.?!][]\"')@}]*\\($\\|\t\\| \\)[ \t\n]* +@end example + +Whenever @sc{gnu} extensions are disabled or the @samp{-r} option is used, ends +of lines are used; in this case, the default @var{regexp} is just: + +@example +\n +@end example + +Using an empty @var{regexp} is equivalent to completely disabling end of +line or end of sentence recognition. In this case, the whole file is +considered to be a single big line or sentence. The user might want to +disallow all truncation flag generation as well, through option @samp{-F +""}. @xref{Regexps, , Syntax of Regular Expressions, emacs, The GNU Emacs +Manual}. + +When the keywords happen to be near the beginning of the input line or +sentence, this often creates an unused area at the beginning of the +output context line; when the keywords happen to be near the end of the +input line or sentence, this often creates an unused area at the end of +the output context line.
The program tries to fill those unused areas +by wrapping around context in them; the tail of the input line or +sentence is used to fill the unused area on the left of the output line; +the head of the input line or sentence is used to fill the unused area +on the right of the output line. + +As a matter of convenience to the user, many usual backslashed escape +sequences from the C language are recognized and converted to the +corresponding characters by @code{ptx} itself. + +@item -W @var{regexp} +@itemx --word-regexp=@var{regexp} + +This option selects which regular expression will describe each keyword. +By default, if @sc{gnu} extensions are enabled, a word is a sequence of +letters; the @var{regexp} used is @samp{\w+}. When @sc{gnu} extensions are +disabled, a word is by default anything which ends with a space, a tab +or a newline; the @var{regexp} used is @samp{[^ \t\n]+}. + +An empty @var{regexp} is equivalent to not using this option. +@xref{Regexps, , Syntax of Regular Expressions, emacs, The GNU Emacs +Manual}. + +As a matter of convenience to the user, many usual backslashed escape +sequences, as found in the C language, are recognized and converted to +the corresponding characters by @code{ptx} itself. + +@end table + + +@node Output formatting in ptx +@subsection Output formatting + +Output format is mainly controlled by the @samp{-O} and @samp{-T} options +described in the table below. When neither @samp{-O} nor @samp{-T} are +selected, and if @sc{gnu} extensions are enabled, the program chooses an +output format suitable for a dumb terminal. Each keyword occurrence is +output to the center of one line, surrounded by its left and right +contexts. Each field is properly justified, so the concordance output +can be readily observed. 
As a special feature, if automatic +references are selected by option @samp{-A} and are output before the +left context, that is, if option @samp{-R} is @emph{not} selected, then +a colon is added after the reference; this nicely interfaces with @sc{gnu} +Emacs @code{next-error} processing. In this default output format, each +white space character, like newline and tab, is merely changed to +exactly one space, with no special attempt to compress consecutive +spaces. This might change in the future. Except for those white space +characters, every other character of the underlying set of 256 +characters is transmitted verbatim. + +Output format is further controlled by the following options. + +@table @samp + +@item -g @var{number} +@itemx --gap-size=@var{number} + +Select the size of the minimum white space gap between the fields on the +output line. + +@item -w @var{number} +@itemx --width=@var{number} + +Select the maximum output width of each final line. If references are +used, they are included or excluded from the maximum output width +depending on the value of option @samp{-R}. If this option is not +selected, that is, when references are output before the left context, +the maximum output width takes into account the maximum length of all +references. If this option is selected, that is, when references are +output after the right context, the maximum output width does not take +into account the space taken by references, nor the gap that precedes +them. + +@item -A +@itemx --auto-reference + +Select automatic references. Each input line will have an automatic +reference made up of the file name and the line ordinal, with a single +colon between them. However, the file name will be empty when standard +input is being read. If both @samp{-A} and @samp{-r} are selected, then +the input reference is still read and skipped, but the automatic +reference is used at output time, overriding the input reference. 
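Automatic references can be sketched as follows. This is only an illustration: the file name @file{words.txt}, its contents, and the variable names are assumptions made up for the example.

```shell
# Assumption: force the C locale for predictable word matching.
export LC_ALL=C

# Create a small input file in a scratch directory.
# The name "words.txt" is hypothetical.
workdir=$(mktemp -d)
printf 'alpha beta\ngamma\n' > "$workdir/words.txt"

# -A builds each reference from the file name and the line number.
auto=$(cd "$workdir" && ptx -A words.txt)
rm -r "$workdir"

printf '%s\n' "$auto"
```

Because @samp{-R} is not given, each output line should begin with a reference such as @samp{words.txt:1} followed by the extra colon described earlier.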
+ +@item -R +@itemx --right-side-refs + +In the default output format, when option @samp{-R} is not used, any +references produced by the effect of options @samp{-r} or @samp{-A} are +placed at the beginning of output lines, before the left context. With +default output format, when the @samp{-R} option is specified, references +are rather placed to the far right of each output line, after the right +context. For any other output format, option @samp{-R} is +ignored, with one exception: with @samp{-R} the width of references +is @emph{not} taken into account in total output width given by @samp{-w}. + +This option is automatically selected whenever @sc{gnu} extensions are +disabled. + +@item -F @var{string} +@itemx --flag-truncation=@var{string} + +This option will request that any truncation in the output be reported +using the string @var{string}. Most output fields theoretically extend +towards the beginning or the end of the current line, or current +sentence, as selected with option @samp{-S}. But there is a maximum +allowed output line width, changeable through option @samp{-w}, which is +further divided into space for various output fields. When a field has +to be truncated because it cannot extend beyond the beginning or the end of +the current line to fit in, then a truncation occurs. By default, +the string used is a single slash, as in @samp{-F /}. + +@var{string} may have more than one character, as in @samp{-F ...}. +Also, in the particular case when @var{string} is empty (@samp{-F ""}), +truncation flagging is disabled, and no truncation marks are appended in +this case. + +As a matter of convenience to the user, many usual backslashed escape +sequences, as found in the C language, are recognized and converted to +the corresponding characters by @code{ptx} itself.
+ +@item -M @var{string} +@itemx --macro-name=@var{string} + +Select another @var{string} to be used instead of @samp{xx}, while +generating output suitable for @code{nroff}, @code{troff} or @TeX{}. + +@item -O +@itemx --format=roff + +Choose an output format suitable for @code{nroff} or @code{troff} +processing. Each output line will look like: + +@smallexample +.xx "@var{tail}" "@var{before}" "@var{keyword_and_after}" "@var{head}" "@var{ref}" +@end smallexample + +so it will be possible to write a @samp{.xx} roff macro to take care of +the output typesetting. This is the default output format when @sc{gnu} +extensions are disabled. Option @samp{-M} can be used to change +@samp{xx} to another macro name. + +In this output format, each non-graphical character, like newline and +tab, is merely changed to exactly one space, with no special attempt to +compress consecutive spaces. Each quote character: @kbd{"} is doubled +so it will be correctly processed by @code{nroff} or @code{troff}. + +@item -T +@itemx --format=tex + +Choose an output format suitable for @TeX{} processing. Each output +line will look like: + +@smallexample +\xx @{@var{tail}@}@{@var{before}@}@{@var{keyword}@}@{@var{after}@}@{@var{head}@}@{@var{ref}@} +@end smallexample + +@noindent +so it will be possible to write a @code{\xx} definition to take care of +the output typesetting. Note that when references are not being +produced, that is, neither option @samp{-A} nor option @samp{-r} is +selected, the last parameter of each @code{\xx} call is inhibited. +Option @samp{-M} can be used to change @samp{xx} to another macro +name. + +In this output format, some special characters, like @kbd{$}, @kbd{%}, +@kbd{&}, @kbd{#} and @kbd{_} are automatically protected with a +backslash. Curly brackets @kbd{@{}, @kbd{@}} are protected with a +backslash and a pair of dollar signs (to force mathematical mode). The +backslash itself produces the sequence @code{\backslash@{@}}. 
+Circumflex and tilde diacritics produce the sequence @code{^\@{ @}} and +@code{~\@{ @}} respectively. Other diacriticized characters of the +underlying character set produce an appropriate @TeX{} sequence as far +as possible. The other non-graphical characters, like newline and tab, +and all other characters which are not part of @sc{ascii}, are merely +changed to exactly one space, with no special attempt to compress +consecutive spaces. Let me know how to improve this special character +processing for @TeX{}. + +@end table + + +@node Compatibility in ptx +@subsection The @sc{gnu} extensions to @code{ptx} + +This version of @code{ptx} contains a few features which do not exist in +System V @code{ptx}. These extra features are suppressed by using the +@samp{-G} command line option, unless overridden by other command line +options. Some @sc{gnu} extensions cannot be recovered by overriding, so the +simple rule is to avoid @samp{-G} if you care about @sc{gnu} extensions. +Here are the differences between this program and System V @code{ptx}. + +@itemize @bullet + +@item +This program can read many input files at once, it always writes the +resulting concordance on standard output. On the other hand, System V +@code{ptx} reads only one file and sends the result to standard output +or, if a second @var{file} parameter is given on the command, to that +@var{file}. + +Having output parameters not introduced by options is a dangerous +practice which @sc{gnu} avoids as far as possible. So, for using @code{ptx} +portably between @sc{gnu} and System V, you should always use it with a +single input file, and always expect the result on standard output. You +might also want to automatically configure in a @samp{-G} option to +@code{ptx} calls in products using @code{ptx}, if the configurator finds +that the installed @code{ptx} accepts @samp{-G}. 
+ +@item +The only options available in System V @code{ptx} are options @samp{-b}, +@samp{-f}, @samp{-g}, @samp{-i}, @samp{-o}, @samp{-r}, @samp{-t} and +@samp{-w}. All other options are @sc{gnu} extensions and are not repeated in +this enumeration. Moreover, some options have a slightly different +meaning when @sc{gnu} extensions are enabled, as explained below. + +@item +By default, concordance output is not formatted for @code{troff} or +@code{nroff}. It is rather formatted for a dumb terminal. @code{troff} +or @code{nroff} output may still be selected through option @samp{-O}. + +@item +Unless @samp{-R} option is used, the maximum reference width is +subtracted from the total output line width. With @sc{gnu} extensions +disabled, width of references is not taken into account in the output +line width computations. + +@item +All 256 characters, even @kbd{NUL}s, are always read and processed from +input file with no adverse effect, even if @sc{gnu} extensions are disabled. +However, System V @code{ptx} does not accept 8-bit characters, a few +control characters are rejected, and the tilde @kbd{~} is also rejected. + +@item +Input line length is only limited by available memory, even if @sc{gnu} +extensions are disabled. However, System V @code{ptx} processes only +the first 200 characters in each line. + +@item +The break (non-word) characters default to be every character except all +letters of the underlying character set, diacriticized or not. When @sc{gnu} +extensions are disabled, the break characters default to space, tab and +newline only. + +@item +The program makes better use of output line width. If @sc{gnu} extensions +are disabled, the program rather tries to imitate System V @code{ptx}, +but still, there are some slight disposition glitches this program does +not completely reproduce. + +@item +The user can specify both an Ignore file and an Only file. This is not +allowed with System V @code{ptx}. 
+ +@end itemize + + +@node Operating on fields within a line +@chapter Operating on fields within a line + +@menu +* cut invocation:: Print selected parts of lines. +* paste invocation:: Merge lines of files. +* join invocation:: Join lines on a common field. +@end menu + + +@node cut invocation +@section @code{cut}: Print selected parts of lines + +@pindex cut +@code{cut} writes to standard output selected parts of each line of each +input file, or standard input if no files are given or for a file name of +@samp{-}. Synopsis: + +@example +cut [@var{option}]@dots{} [@var{file}]@dots{} +@end example + +In the table which follows, the @var{byte-list}, @var{character-list}, +and @var{field-list} are one or more numbers or ranges (two numbers +separated by a dash) separated by commas. Bytes, characters, and +fields are numbered starting at 1. Incomplete ranges may be +given: @samp{-@var{m}} means @samp{1-@var{m}}; @samp{@var{n}-} means +@samp{@var{n}} through end of line or last field. + +The program accepts the following options. Also see @ref{Common +options}. + +@table @samp + +@item -b @var{byte-list} +@itemx --bytes=@var{byte-list} +@opindex -b +@opindex --bytes +Print only the bytes in positions listed in @var{byte-list}. Tabs and +backspaces are treated like any other character; they take up 1 byte. + +@item -c @var{character-list} +@itemx --characters=@var{character-list} +@opindex -c +@opindex --characters +Print only characters in positions listed in @var{character-list}. +The same as @samp{-b} for now, but internationalization will change +that. Tabs and backspaces are treated like any other character; they +take up 1 character. + +@item -f @var{field-list} +@itemx --fields=@var{field-list} +@opindex -f +@opindex --fields +Print only the fields listed in @var{field-list}. Fields are +separated by a TAB character by default. 
+Also print any line that contains no delimiter character, unless +the @samp{--only-delimited} (@samp{-s}) option is specified. + +@item -d @var{input_delim_byte} +@itemx --delimiter=@var{input_delim_byte} +@opindex -d +@opindex --delimiter +For @samp{-f}, fields are separated in the input by the first character +in @var{input_delim_byte} (default is TAB). + +@item -n +@opindex -n +Do not split multi-byte characters (no-op for now). + +@item -s +@itemx --only-delimited +@opindex -s +@opindex --only-delimited +For @samp{-f}, do not print lines that do not contain the field separator +character. + +@item --output-delimiter=@var{output_delim_string} +@opindex --output-delimiter +For @samp{-f}, output fields are separated by @var{output_delim_string}. +The default is to use the input delimiter. + + +@end table + + +@node paste invocation +@section @code{paste}: Merge lines of files + +@pindex paste +@cindex merging files + +@code{paste} writes to standard output lines consisting of sequentially +corresponding lines of each given file, separated by a TAB character. +Standard input is used for a file name of @samp{-} or if no input files +are given. + +Synopsis: + +@example +paste [@var{option}]@dots{} [@var{file}]@dots{} +@end example + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp + +@item -s +@itemx --serial +@opindex -s +@opindex --serial +Paste the lines of one file at a time rather than one line from each +file. + +@item -d @var{delim-list} +@itemx --delimiters=@var{delim-list} +@opindex -d +@opindex --delimiters +Consecutively use the characters in @var{delim-list} instead of +TAB to separate merged lines. When @var{delim-list} is +exhausted, start again at its beginning.
+ +@end table + + +@node join invocation +@section @code{join}: Join lines on a common field + +@pindex join +@cindex common field, joining on + +@code{join} writes to standard output a line for each pair of input +lines that have identical join fields. Synopsis: + +@example +join [@var{option}]@dots{} @var{file1} @var{file2} +@end example + +@vindex LC_COLLATE +Either @var{file1} or @var{file2} (but not both) can be @samp{-}, +meaning standard input. @var{file1} and @var{file2} should be already +sorted in increasing textual order on the join fields, using the +collating sequence specified by the @env{LC_COLLATE} locale. Unless +the @samp{-t} option is given, the input should be sorted ignoring blanks at +the start of the join field, as in @code{sort -b}. If the +@samp{--ignore-case} option is given, lines should be sorted without +regard to the case of characters in the join field, as in @code{sort -f}. + +The defaults are: the join field is the first field in each line; +fields in the input are separated by one or more blanks, with leading +blanks on the line ignored; fields in the output are separated by a +space; each output line consists of the join field, the remaining +fields from @var{file1}, then the remaining fields from @var{file2}. + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp + +@item -a @var{file-number} +@opindex -a +Print a line for each unpairable line in file @var{file-number} (either +@samp{1} or @samp{2}), in addition to the normal output. + +@item -e @var{string} +@opindex -e +Replace those output fields that are missing in the input with +@var{string}. + +@item -i +@itemx --ignore-case +@opindex -i +@opindex --ignore-case +Ignore differences in case when comparing keys. +With this option, the lines of the input files must be ordered in the same way. +Use @samp{sort -f} to produce this ordering. 
+ +@item -1 @var{field} +@itemx -j1 @var{field} +@opindex -1 +@opindex -j1 +Join on field @var{field} (a positive integer) of file 1. + +@item -2 @var{field} +@itemx -j2 @var{field} +@opindex -2 +@opindex -j2 +Join on field @var{field} (a positive integer) of file 2. + +@item -j @var{field} +Equivalent to @samp{-1 @var{field} -2 @var{field}}. + +@item -o @var{field-list}@dots{} +Construct each output line according to the format in @var{field-list}. +Each element in @var{field-list} is either the single character @samp{0} or +has the form @var{m.n} where the file number, @var{m}, is @samp{1} or +@samp{2} and @var{n} is a positive field number. + +A field specification of @samp{0} denotes the join field. +In most cases, the functionality of the @samp{0} field spec +may be reproduced using the explicit @var{m.n} that corresponds +to the join field. However, when printing unpairable lines +(using either of the @samp{-a} or @samp{-v} options), there is no way +to specify the join field using @var{m.n} in @var{field-list} +if there are unpairable lines in both files. +To give @code{join} that functionality, @sc{posix} invented the @samp{0} +field specification notation. + +The elements in @var{field-list} +are separated by commas or blanks. Multiple @var{field-list} +arguments can be given after a single @samp{-o} option; the values +of all lists given with @samp{-o} are concatenated together. +All output lines -- including those printed because of any -a or -v +option -- are subject to the specified @var{field-list}. + +@item -t @var{char} +Use character @var{char} as the input and output field separator. + +@item -v @var{file-number} +Print a line for each unpairable line in file @var{file-number} +(either @samp{1} or @samp{2}), instead of the normal output. + +@end table + +In addition, when @sc{gnu} @code{join} is invoked with exactly one argument, +options @samp{--help} and @samp{--version} are recognized. @xref{Common +options}. 
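As a rough illustration of the options above, here is a small sketch that assumes @sc{gnu} @code{join} and a @code{mktemp} that supports @samp{-d}. The file names @file{left.txt} and @file{right.txt} and their contents are invented for this example. It first shows the default pairing, then uses @samp{-a}, @samp{-e} and @samp{-o} together so that the unpairable line from the first file is printed with a placeholder for the missing field.

```shell
# Work in a scratch directory (hypothetical sample data, not from the manual).
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Two inputs, already sorted on their first field, which is the join field.
printf '%s\n' '1 apple' '2 banana' '4 cherry' > left.txt
printf '%s\n' '1 red' '3 green' '4 dark-red' > right.txt

# Default: one output line per pair of input lines with identical join fields.
paired=$(LC_ALL=C join left.txt right.txt)

# -a 1 also prints unpairable lines from file 1; -o selects the output fields
# (0 is the join field) and -e supplies the text for fields that are missing.
filled=$(LC_ALL=C join -a 1 -e NONE -o 0,1.2,2.2 left.txt right.txt)

printf '%s\n' "$paired" '---' "$filled"
```

Setting @env{LC_ALL} to @samp{C} here simply keeps the collating sequence predictable for the sample data; real inputs should be sorted in the same locale that @code{join} runs in.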
+ + +@node Operating on characters +@chapter Operating on characters + +@cindex operating on characters + +These commands operate on individual characters. + +@menu +* tr invocation:: Translate, squeeze, and/or delete characters. +* expand invocation:: Convert tabs to spaces. +* unexpand invocation:: Convert spaces to tabs. +@end menu + + +@node tr invocation +@section @code{tr}: Translate, squeeze, and/or delete characters + +@pindex tr + +Synopsis: + +@example +tr [@var{option}]@dots{} @var{set1} [@var{set2}] +@end example + +@code{tr} copies standard input to standard output, performing +one of the following operations: + +@itemize @bullet +@item +translate, and optionally squeeze repeated characters in the result, +@item +squeeze repeated characters, +@item +delete characters, +@item +delete characters, then squeeze repeated characters from the result. +@end itemize + +The @var{set1} and (if given) @var{set2} arguments define ordered +sets of characters, referred to below as @var{set1} and @var{set2}. These +sets are the characters of the input that @code{tr} operates on. +The @samp{--complement} (@samp{-c}) option replaces @var{set1} with its +complement (all of the characters that are not in @var{set1}). + +@menu +* Character sets:: Specifying sets of characters. +* Translating:: Changing one character to another. +* Squeezing:: Squeezing repeats and deleting. +* Warnings in tr:: Warning messages. +@end menu + + +@node Character sets +@subsection Specifying sets of characters + +@cindex specifying sets of characters + +The format of the @var{set1} and @var{set2} arguments resembles +the format of regular expressions; however, they are not regular +expressions, only lists of characters. Most characters simply +represent themselves in these strings, but the strings can contain +the shorthands listed below, for convenience. Some of them can be +used only in @var{set1} or @var{set2}, as noted below.
+ +@table @asis + +@item Backslash escapes +@cindex backslash escapes + +A backslash followed by a character not listed below causes an error +message. + +@table @samp +@item \a +Control-G. +@item \b +Control-H. +@item \f +Control-L. +@item \n +Control-J. +@item \r +Control-M. +@item \t +Control-I. +@item \v +Control-K. +@item \@var{ooo} +The character with the value given by @var{ooo}, which is 1 to 3 +octal digits. +@item \\ +A backslash. +@end table + +@item Ranges +@cindex ranges + +The notation @samp{@var{m}-@var{n}} expands to all of the characters +from @var{m} through @var{n}, in ascending order. @var{m} should +collate before @var{n}; if it doesn't, an error results. As an example, +@samp{0-9} is the same as @samp{0123456789}. + +@sc{gnu} @code{tr} does not support the System V syntax that uses square +brackets to enclose ranges. Translations specified in that format +sometimes work as expected, since the brackets are often transliterated +to themselves. However, they should be avoided because they sometimes +behave unexpectedly. For example, @samp{tr -d '[0-9]'} deletes brackets +as well as digits. + +Many historically common and even accepted uses of ranges are not +portable. For example, on @sc{ebcdic} hosts using the @samp{A-Z} +range will not do what most would expect because @samp{A} through @samp{Z} +are not contiguous as they are in @sc{ascii}. +If you can rely on a @sc{posix} compliant version of @code{tr}, then +the best way to work around this is to use character classes (see below). +Otherwise, it is most portable (and most ugly) to enumerate the members +of the ranges. + +@item Repeated characters +@cindex repeated characters + +The notation @samp{[@var{c}*@var{n}]} in @var{set2} expands to @var{n} +copies of character @var{c}. Thus, @samp{[y*6]} is the same as +@samp{yyyyyy}. The notation @samp{[@var{c}*]} in @var{set2} expands +to as many copies of @var{c} as are needed to make @var{set2} as long as +@var{set1}.
If @var{n} begins with @samp{0}, it is interpreted in +octal, otherwise in decimal. + +@item Character classes +@cindex character classes + +The notation @samp{[:@var{class}:]} expands to all of the characters in +the (predefined) class @var{class}. The characters expand in no +particular order, except for the @code{upper} and @code{lower} classes, +which expand in ascending order. When the @samp{--delete} (@samp{-d}) +and @samp{--squeeze-repeats} (@samp{-s}) options are both given, any +character class can be used in @var{set2}. Otherwise, only the +character classes @code{lower} and @code{upper} are accepted in +@var{set2}, and then only if the corresponding character class +(@code{upper} and @code{lower}, respectively) is specified in the same +relative position in @var{set1}. Doing this specifies case conversion. +The class names are given below; an error results when an invalid class +name is given. + +@table @code +@item alnum +@opindex alnum +Letters and digits. +@item alpha +@opindex alpha +Letters. +@item blank +@opindex blank +Horizontal whitespace. +@item cntrl +@opindex cntrl +Control characters. +@item digit +@opindex digit +Digits. +@item graph +@opindex graph +Printable characters, not including space. +@item lower +@opindex lower +Lowercase letters. +@item print +@opindex print +Printable characters, including space. +@item punct +@opindex punct +Punctuation characters. +@item space +@opindex space +Horizontal or vertical whitespace. +@item upper +@opindex upper +Uppercase letters. +@item xdigit +@opindex xdigit +Hexadecimal digits. +@end table + +@item Equivalence classes +@cindex equivalence classes + +The syntax @samp{[=@var{c}=]} expands to all of the characters that are +equivalent to @var{c}, in no particular order. Equivalence classes are +a relatively recent invention intended to support non-English alphabets. +But there seems to be no standard way to define them or determine their +contents. 
Therefore, they are not fully implemented in @sc{gnu} @code{tr}; +each character's equivalence class consists only of that character, +which is of no particular use. + +@end table + + +@node Translating +@subsection Translating + +@cindex translating characters + +@code{tr} performs translation when @var{set1} and @var{set2} are +both given and the @samp{--delete} (@samp{-d}) option is not given. +@code{tr} translates each character of its input that is in @var{set1} +to the corresponding character in @var{set2}. Characters not in +@var{set1} are passed through unchanged. When a character appears more +than once in @var{set1} and the corresponding characters in @var{set2} +are not all the same, only the final one is used. For example, these +two commands are equivalent: + +@example +tr aaa xyz +tr a z +@end example + +A common use of @code{tr} is to convert lowercase characters to +uppercase. This can be done in many ways. Here are three of them: + +@example +tr abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ +tr a-z A-Z +tr '[:lower:]' '[:upper:]' +@end example + +@noindent +But note that using ranges like @code{a-z} above is not portable. + +When @code{tr} is performing translation, @var{set1} and @var{set2} +typically have the same length. If @var{set1} is shorter than +@var{set2}, the extra characters at the end of @var{set2} are ignored. + +On the other hand, making @var{set1} longer than @var{set2} is not +portable; @sc{posix.2} says that the result is undefined. In this situation, +BSD @code{tr} pads @var{set2} to the length of @var{set1} by repeating +the last character of @var{set2} as many times as necessary. System V +@code{tr} truncates @var{set1} to the length of @var{set2}. + +By default, @sc{gnu} @code{tr} handles this case like BSD @code{tr}. When +the @samp{--truncate-set1} (@samp{-t}) option is given, @sc{gnu} @code{tr} +handles this case like the System V @code{tr} instead. This option is +ignored for operations other than translation. 
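The translation rules above can be tried out with a few one-line pipelines. This is only a sketch and assumes @sc{gnu} @code{tr}: the padding of @var{set2} and the @samp{-t} option are the @sc{gnu} behaviors described above, and other implementations may give different results. The input strings are made up for illustration.

```shell
# Case conversion with character classes, which avoids non-portable ranges.
upper=$(printf 'hello world\n' | tr '[:lower:]' '[:upper:]')

# 'a' appears three times in set1, so only its final mapping (a -> z) is used.
last=$(printf 'banana\n' | tr aaa xyz)

# set1 longer than set2: by default set2 is padded with its last character,
# so c, d and e are all translated to y (the BSD-style behavior).
padded=$(printf 'abcde\n' | tr abcde xy)

# With -t (--truncate-set1), set1 is cut down to "ab" and c, d, e pass through
# unchanged (the System V-style behavior).
truncated=$(printf 'abcde\n' | tr -t abcde xy)

printf '%s\n' "$upper" "$last" "$padded" "$truncated"
```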
+ +Acting like System V @code{tr} in this case breaks the relatively common +BSD idiom: + +@example +tr -cs A-Za-z0-9 '\012' +@end example + +@noindent +because it converts only zero bytes (the first element in the +complement of @var{set1}), rather than all non-alphanumerics, to +newlines. + +@noindent +By the way, the above idiom is not portable because it uses ranges. +Assuming a @sc{posix} compliant @code{tr}, here is a better way to write it: + +@example +tr -cs '[:alnum:]' '[\n*]' +@end example + + +@node Squeezing +@subsection Squeezing repeats and deleting + +@cindex squeezing repeat characters +@cindex deleting characters + +When given just the @samp{--delete} (@samp{-d}) option, @code{tr} +removes any input characters that are in @var{set1}. + +When given just the @samp{--squeeze-repeats} (@samp{-s}) option, +@code{tr} replaces each input sequence of a repeated character that +is in @var{set1} with a single occurrence of that character. + +When given both @samp{--delete} and @samp{--squeeze-repeats}, @code{tr} +first performs any deletions using @var{set1}, then squeezes repeats +from any remaining characters using @var{set2}. + +The @samp{--squeeze-repeats} option may also be used when translating, +in which case @code{tr} first performs translation, then squeezes +repeats from any remaining characters using @var{set2}. + +Here are some examples to illustrate various combinations of options: + +@itemize @bullet + +@item +Remove all zero bytes: + +@example +tr -d '\000' +@end example + +@item +Put all words on lines by themselves. This converts all +non-alphanumeric characters to newlines, then squeezes each string +of repeated newlines into a single newline: + +@example +tr -cs '[:alnum:]' '[\n*]' +@end example + +@item +Convert each sequence of repeated newlines to a single newline: + +@example +tr -s '\n' +@end example + +@item +Find doubled occurrences of words in a document. 
+For example, people often write ``the the'' with the duplicated words +separated by a newline. The Bourne shell script below works first +by converting each sequence of punctuation and blank characters to a +single newline. That puts each ``word'' on a line by itself. +Next it maps all uppercase characters to lower case, and finally it +runs @code{uniq} with the @samp{-d} option to print out only the words +that were adjacent duplicates. + +@example +#!/bin/sh +cat "$@@" \ + | tr -s '[:punct:][:blank:]' '\n' \ + | tr '[:upper:]' '[:lower:]' \ + | uniq -d +@end example + +@item +Deleting a small set of characters is usually straightforward. For example, +to remove all @samp{a}s, @samp{x}s, and @samp{M}s you would do this: + +@example +tr -d axM +@end example + +However, when @samp{-} is one of those characters, it can be tricky because +@samp{-} has special meanings. Performing the same task as above but also +removing all @samp{-} characters, we might try @code{tr -d -axM}, but +that would fail because @code{tr} would try to interpret @samp{-a} as +a command-line option. Alternatively, we could try putting the hyphen +inside the string, @code{tr -d a-xM}, but that wouldn't work either because +it would make @code{tr} interpret @code{a-x} as the range of characters +@samp{a}@dots{}@samp{x} rather than the three characters @samp{a}, +@samp{-} and @samp{x}. +One way to solve the problem is to put the hyphen at the end of the list +of characters: + +@example +tr -d axM- +@end example + +More generally, use the equivalence class notation @code{[=c=]} +with @samp{-} (or any other character) in place of the @samp{c}: + +@example +tr -d '[=-=]axM' +@end example + +Note how single quotes are used in the above example to protect the +square brackets from interpretation by a shell.
+ +@end itemize + + +@node Warnings in tr +@subsection Warning messages + +@vindex POSIXLY_CORRECT +Setting the environment variable @env{POSIXLY_CORRECT} turns off the +following warning and error messages, for strict compliance with +@sc{posix.2}. Otherwise, the following diagnostics are issued: + +@enumerate + +@item +When the @samp{--delete} option is given but @samp{--squeeze-repeats} +is not, and @var{set2} is given, @sc{gnu} @code{tr} by default prints +a usage message and exits, because @var{set2} would not be used. +The @sc{posix} specification says that @var{set2} must be ignored in +this case. Silently ignoring arguments is a bad idea. + +@item +When an ambiguous octal escape is given. For example, @samp{\400} +is actually @samp{\40} followed by the digit @samp{0}, because the +value 400 octal does not fit into a single byte. + +@end enumerate + +@sc{gnu} @code{tr} does not provide complete BSD or System V compatibility. +For example, it is impossible to disable interpretation of the @sc{posix} +constructs @samp{[:alpha:]}, @samp{[=c=]}, and @samp{[c*10]}. Also, @sc{gnu} +@code{tr} does not delete zero bytes automatically, unlike traditional +Unix versions, which provide no way to preserve zero bytes. + + +@node expand invocation +@section @code{expand}: Convert tabs to spaces + +@pindex expand +@cindex tabs to spaces, converting +@cindex converting tabs to spaces + +@code{expand} writes the contents of each given @var{file}, or standard +input if none are given or for a @var{file} of @samp{-}, to standard +output, with tab characters converted to the appropriate number of +spaces. Synopsis: + +@example +expand [@var{option}]@dots{} [@var{file}]@dots{} +@end example + +By default, @code{expand} converts all tabs to spaces. It preserves +backspace characters in the output; they decrement the column count for +tab calculations. The default action is equivalent to @samp{-8} (set +tabs every 8 columns). + +The program accepts the following options. 
Also see @ref{Common options}. + +@table @samp + +@item -@var{tab1}[,@var{tab2}]@dots{} +@itemx -t @var{tab1}[,@var{tab2}]@dots{} +@itemx --tabs=@var{tab1}[,@var{tab2}]@dots{} +@opindex -@var{tab} +@opindex -t +@opindex --tabs +@cindex tabstops, setting +If only one tab stop is given, set the tabs @var{tab1} spaces apart +(default is 8). Otherwise, set the tabs at columns @var{tab1}, +@var{tab2}, @dots{} (numbered from 0), and replace any tabs beyond the +last tabstop given with single spaces. If the tabstops are specified +with the @samp{-t} or @samp{--tabs} option, they can be separated by +blanks as well as by commas. + +@item -i +@itemx --initial +@opindex -i +@opindex --initial +@cindex initial tabs, converting +Only convert initial tabs (those that precede all non-space or non-tab +characters) on each line to spaces. + +@end table + + +@node unexpand invocation +@section @code{unexpand}: Convert spaces to tabs + +@pindex unexpand + +@code{unexpand} writes the contents of each given @var{file}, or +standard input if none are given or for a @var{file} of @samp{-}, to +standard output, with strings of two or more space or tab characters +converted to as many tabs as possible followed by as many spaces as are +needed. Synopsis: + +@example +unexpand [@var{option}]@dots{} [@var{file}]@dots{} +@end example + +By default, @code{unexpand} converts only initial spaces and tabs (those +that precede all non space or tab characters) on each line. It +preserves backspace characters in the output; they decrement the column +count for tab calculations. By default, tabs are set at every 8th +column. + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp + +@item -@var{tab1}[,@var{tab2}]@dots{} +@itemx -t @var{tab1}[,@var{tab2}]@dots{} +@itemx --tabs=@var{tab1}[,@var{tab2}]@dots{} +@opindex -@var{tab} +@opindex -t +@opindex --tabs +If only one tab stop is given, set the tabs @var{tab1} spaces apart +instead of the default 8. 
Otherwise, set the tabs at columns +@var{tab1}, @var{tab2}, @dots{} (numbered from 0), and leave spaces and +tabs beyond the tabstops given unchanged. If the tabstops are specified +with the @samp{-t} or @samp{--tabs} option, they can be separated by +blanks as well as by commas. This option implies the @samp{-a} option. + +@item -a +@itemx --all +@opindex -a +@opindex --all +Convert all strings of two or more spaces or tabs, not just initial +ones, to tabs. + +@end table + + +@node Directory listing +@chapter Directory listing + +This chapter describes the @code{ls} command and its variants @code{dir} +and @code{vdir}, which list information about files. + +@menu +* ls invocation:: List directory contents. +* dir invocation:: Briefly ls. +* vdir invocation:: Verbosely ls. +* dircolors invocation:: Color setup for ls, etc. +@end menu + + +@node ls invocation +@section @code{ls}: List directory contents + +@pindex ls +@cindex directory listing + +The @code{ls} program lists information about files (of any type, +including directories). Options and file arguments can be intermixed +arbitrarily, as usual. + +For non-option command-line arguments that are directories, by default +@code{ls} lists the contents of directories, not recursively, and +omitting files with names beginning with @samp{.}. For other non-option +arguments, by default @code{ls} lists just the file name. If no +non-option argument is specified, @code{ls} operates on the current +directory, acting as if it had been invoked with a single argument of @samp{.}. + +@vindex LC_COLLATE +By default, the output is sorted alphabetically, according to the locale +settings in effect. @footnote{If you have arranged to use a non-@sc{posix} +locale (e.g., by setting @env{LC_ALL} to @samp{en_US}), then @code{ls} may +produce output that is sorted differently than you're accustomed to. 
+In that case, set the @env{LC_COLLATE} environment variable to @samp{C}.} +If standard output is +a terminal, the output is in columns (sorted vertically) and control +characters are output as question marks; otherwise, the output is listed +one per line and control characters are output as-is. + +Because @code{ls} is such a fundamental program, it has accumulated many +options over the years. They are described in the subsections below; +within each section, options are listed alphabetically (ignoring case). +The division of options into the subsections is not absolute, since some +options affect more than one aspect of @code{ls}'s operation. + +Also see @ref{Common options}. + +@menu +* Which files are listed:: +* What information is listed:: +* Sorting the output:: +* More details about version sort:: +* General output formatting:: +* Formatting file timestamps:: +* Formatting the file names:: +@end menu + + +@node Which files are listed +@subsection Which files are listed + +These options determine which files @code{ls} lists information for. +By default, any files and the contents of any directories on the command +line are shown. + +@table @samp + +@item -a +@itemx --all +@opindex -a +@opindex --all +List all files in directories, including files that start with @samp{.}. + +@item -A +@itemx --almost-all +@opindex -A +@opindex --almost-all +List all files in directories except for @file{.} and @file{..}. + +@item -B +@itemx --ignore-backups +@opindex -B +@opindex --ignore-backups +@cindex backup files, ignoring +Do not list files that end with @samp{~}, unless they are given on the +command line. + +@item -d +@itemx --directory +@opindex -d +@opindex --directory +List just the names of directories, as with other types of files, rather +than listing their contents. 
+ +@item -H +@itemx --dereference-command-line +@opindex -H +@opindex --dereference-command-line +@cindex symbolic links, dereferencing +If a command line argument specifies a symbolic link, show information +for the file the link references rather than for the link itself. + +@item -I PATTERN +@itemx --ignore=PATTERN +@opindex -I +@opindex --ignore=@var{pattern} +Do not list files whose names match the shell pattern (not regular +expression) @var{pattern} unless they are given on the command line. As +in the shell, an initial @samp{.} in a file name does not match a +wildcard at the start of @var{pattern}. Sometimes it is useful +to give this option several times. For example, + +@smallexample +$ ls --ignore='.??*' --ignore='.[^.]' --ignore='#*' +@end smallexample + +The first option ignores names of length 3 or more that start with @samp{.}, +the second ignores all two-character names that start with @samp{.} +except @samp{..}, and the third ignores names that start with @samp{#}. + +@item -L +@itemx --dereference +@opindex -L +@opindex --dereference +@cindex symbolic links, dereferencing +When showing file information for a symbolic link, show information +for the file the link references rather than the link itself. + +@item -R +@itemx --recursive +@opindex -R +@opindex --recursive +@cindex recursive directory listing +@cindex directory listing, recursive +List the contents of all directories recursively. + +@end table + + +@node What information is listed +@subsection What information is listed + +These options affect the information that @code{ls} displays. By +default, only file names are shown. 
+ +@table @samp + +@item -D +@itemx --dired +@opindex -D +@opindex --dired +@cindex dired Emacs mode support +With the long listing (@samp{-l}) format, print an additional line after +the main output: + +@example +//DIRED// @var{beg1 end1 beg2 end2 @dots{}} +@end example + +@noindent +The @var{begN} and @var{endN} are unsigned integers that record the +byte position of the beginning and end of each file name in the output. +This makes it easy for Emacs to find the names, even when they contain +unusual characters such as space or newline, without fancy searching. + +If directories are being listed recursively (@code{-R}), output a similar +line after each subdirectory: +@example +//SUBDIRED// @var{format} @var{beg1 end1 @dots{}} +@end example + +Finally, output a line of the form: +@example +//DIRED-OPTIONS// --quoting-style=@var{word} +@end example +where @var{word} is the quoting style (@pxref{Formatting the file names}). + +@item --full-time +@opindex --full-time +Produce long format directory listings, and list times in full. It is +equivalent to using @option{--format=long} with +@option{--time-style=full-iso} (@pxref{Formatting file timestamps}). + +@item -g +@opindex -g +Produce long format directory listings, but don't display owner information. + +@item -G +@itemx --no-group +@opindex -G +@opindex --no-group +Inhibit display of group information in a long format directory listing. +(This is the default in some non-@sc{gnu} versions of @code{ls}, so we +provide this option for compatibility.) + +@item -h +@itemx --human-readable +@opindex -h +@opindex --human-readable +@cindex human-readable output +Append a size letter such as @samp{M} for megabytes to each size. +Powers of 1024 are used, not 1000; @samp{M} stands for 1,048,576 bytes. +Use the @samp{--si} option if you prefer powers of 1000. 
+ +@item -i +@itemx --inode +@opindex -i +@opindex --inode +@cindex inode number, printing +Print the inode number (also called the file serial number and index +number) of each file to the left of the file name. (This number +uniquely identifies each file within a particular filesystem.) + +@item -l +@itemx --format=long +@itemx --format=verbose +@opindex -l +@opindex --format +@opindex long ls @r{format} +@opindex verbose ls @r{format} +In addition to the name of each file, print the file type, permissions, +number of hard links, owner name, group name, size in bytes, and +timestamp (@pxref{Formatting file timestamps}), normally +the modification time. + +For each directory that is listed, preface the files with a line +@samp{total @var{blocks}}, where @var{blocks} is the total disk allocation +for all files in that directory. The block size currently defaults to 1024 +bytes, but this can be overridden (@pxref{Block size}). +The @var{blocks} computed counts each hard link separately; +this is arguably a deficiency. + +@cindex permissions, output by @code{ls} +The permissions listed are similar to symbolic mode specifications +(@pxref{Symbolic Modes}). But @code{ls} combines multiple bits into the +third character of each set of permissions as follows: +@table @samp +@item s +If the setuid or setgid bit and the corresponding executable bit +are both set. + +@item S +If the setuid or setgid bit is set but the corresponding executable bit +is not set. + +@item t +If the sticky bit and the other-executable bit are both set. + +@item T +If the sticky bit is set but the other-executable bit is not set. + +@item x +If the executable bit is set and none of the above apply. + +@item - +Otherwise. +@end table + +Following the permission bits is a single character that specifies +whether an alternate access method applies to the file. When that +character is a space, there is no alternate access method. 
When it +is a printing character (e.g., @samp{+}), then there is such a method. + +@item -n +@itemx --numeric-uid-gid +@opindex -n +@opindex --numeric-uid-gid +@cindex numeric uid and gid +Produce long format directory listings, but +display numeric UIDs and GIDs instead of the owner and group names. + +@item -o +@opindex -o +Produce long format directory listings, but don't display group information. +It is equivalent to using @samp{--format=long} with @samp{--no-group}. + +@item -s +@itemx --size +@opindex -s +@opindex --size +@cindex disk allocation +@cindex size of files, reporting +Print the disk allocation of each file to the left of the file name. +This is the amount of disk space used by the file, which is usually a +bit more than the file's size, but it can be less if the file has holes. + +Normally the disk allocation is printed in units of +1024 bytes, but this can be overridden (@pxref{Block size}). + +@cindex NFS mounts from BSD to HP-UX +For files that are NFS-mounted from an HP-UX system to a BSD system, +this option reports sizes that are half the correct values. On HP-UX +systems, it reports sizes that are twice the correct values for files +that are NFS-mounted from BSD systems. This is due to a flaw in HP-UX; +it also affects the HP-UX @code{ls} program. + +@item --si +@opindex --si +@cindex SI output +Append a size letter such as @samp{M} for megabytes to each size. (SI +is the International System of Units, which defines these letters as +prefixes.) Powers of 1000 are used, not 1024; @samp{M} stands for +1,000,000 bytes. Use the @samp{-h} or @samp{--human-readable} option if +you prefer powers of 1024. + +@end table + + +@node Sorting the output +@subsection Sorting the output + +@cindex sorting @code{ls} output +These options change the order in which @code{ls} sorts the information +it outputs. By default, sorting is done by character code (e.g., ASCII +order). 
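The sort options in this subsection can be tried on a scratch directory. The following is a minimal sketch, not part of the original manual: the directory, the file names @file{a}, @file{b}, @file{c}, and the variable names are hypothetical and chosen only so that name, size, and time order all differ.

```shell
#!/bin/sh
# Hypothetical scratch directory; names, sizes and dates are arbitrary demo values.
dir=$(mktemp -d)
printf 'x'        > "$dir/a"   # 1 byte
printf 'xxxxxxxx' > "$dir/b"   # 8 bytes
printf 'xxxx'     > "$dir/c"   # 4 bytes

# Give each file a distinct modification time (format: CCYYMMDDhhmm).
touch -t 200001010000 "$dir/a"
touch -t 199901010000 "$dir/b"
touch -t 200101010000 "$dir/c"

# LC_ALL=C keeps the default ordering independent of the user's locale.
by_name=$(LC_ALL=C ls "$dir")          # a b c  (character code order)
by_size=$(LC_ALL=C ls -S "$dir")       # b c a  (largest first)
by_time=$(LC_ALL=C ls -t "$dir")       # c a b  (newest first)
by_size_rev=$(LC_ALL=C ls -rS "$dir")  # a c b  (-r reverses the size order)

printf '%s\n' "default:" "$by_name" "-S:" "$by_size" \
              "-t:" "$by_time" "-rS:" "$by_size_rev"

rm -rf "$dir"
```

Because the output goes to a pipe rather than a terminal, each listing is printed one name per line, as described in the introduction to @code{ls}.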
+ +@table @samp + +@item -c +@itemx --time=ctime +@itemx --time=status +@opindex -c +@opindex --time +@opindex ctime@r{, printing or sorting by} +@opindex status time@r{, printing or sorting by} +If the long listing format (e.g., @samp{-l}, @samp{-o}) is being used, +print the status change time (the @samp{ctime} in the inode) instead of +the modification time. +When explicitly sorting by time (@samp{--sort=time} or @samp{-t}) +or when not using a long listing format, +sort according to the status change time. + +@item -f +@opindex -f +@cindex unsorted directory listing +@cindex directory order, listing by +Primarily, like @samp{-U}---do not sort; list the files in whatever +order they are stored in the directory. But also enable @samp{-a} (list +all files) and disable @samp{-l}, @samp{--color}, and @samp{-s} (if they +were specified before the @samp{-f}). + +@item -r +@itemx --reverse +@opindex -r +@opindex --reverse +@cindex reverse sorting +Reverse whatever the sorting method is---e.g., list files in reverse +alphabetical order, youngest first, smallest first, or whatever. + +@item -S +@itemx --sort=size +@opindex -S +@opindex --sort +@opindex size of files@r{, sorting files by} +Sort by file size, largest first. + +@item -t +@itemx --sort=time +@opindex -t +@opindex --sort +@opindex modification time@r{, sorting files by} +Sort by modification time (the @samp{mtime} in the inode), newest first. + +@item -u +@itemx --time=atime +@itemx --time=access +@itemx --time=use +@opindex -u +@opindex --time +@opindex use time@r{, printing or sorting files by} +@opindex atime@r{, printing or sorting files by} +@opindex access time@r{, printing or sorting files by} +If the long listing format (e.g., @samp{--format=long}) is being used, +print the last access time (the @samp{atime} in the inode). 
+When explicitly sorting by time (@samp{--sort=time} or @samp{-t}) +or when not using a long listing format, sort according to the access time. + +@item -U +@itemx --sort=none +@opindex -U +@opindex --sort +@opindex none@r{, sorting option for @code{ls}} +Do not sort; list the files in whatever order they are +stored in the directory. (Do not do any of the other unrelated things +that @samp{-f} does.) This is especially useful when listing very large +directories, since not doing any sorting can be noticeably faster. + +@item -v +@itemx --sort=version +@opindex -v +@opindex --sort +@opindex version@r{, sorting option for @code{ls}} +Sort by version name and number, lowest first. It behaves like a default +sort, except that each sequence of decimal digits is treated numerically +as an index/version number. (@xref{More details about version sort}.) + +@item -X +@itemx --sort=extension +@opindex -X +@opindex --sort +@opindex extension@r{, sorting files by} +Sort directory contents alphabetically by file extension (characters +after the last @samp{.}); files with no extension are sorted first. + +@end table + + +@node More details about version sort +@subsection More details about version sort + +The version sort takes into account the fact that file names frequently include +indices or version numbers. Standard sorting functions usually do not produce +the ordering that people expect because comparisons are made on a +character-by-character basis. 
The version +sort addresses this problem, and is especially useful when browsing +directories that contain many files with indices/version numbers in their +names: + +@example + > ls -1 > ls -1v + foo.zml-1.gz foo.zml-1.gz + foo.zml-100.gz foo.zml-2.gz + foo.zml-12.gz foo.zml-6.gz + foo.zml-13.gz foo.zml-12.gz + foo.zml-2.gz foo.zml-13.gz + foo.zml-25.gz foo.zml-25.gz + foo.zml-6.gz foo.zml-100.gz +@end example + +Note also that numeric parts with leading zeroes are treated as +fractional parts: + +@example + > ls -1 > ls -1v + abc-1.007.tgz abc-1.007.tgz + abc-1.012b.tgz abc-1.01a.tgz + abc-1.01a.tgz abc-1.012b.tgz +@end example + +@node General output formatting +@subsection General output formatting + +These options affect the appearance of the overall output. + +@table @samp + +@item -1 +@itemx --format=single-column +@opindex -1 +@opindex --format +@opindex single-column @r{output of files} +List one file per line. This is the default for @code{ls} when standard +output is not a terminal. + +@item -C +@itemx --format=vertical +@opindex -C +@opindex --format +@opindex vertical @r{sorted files in columns} +List files in columns, sorted vertically. This is the default for +@code{ls} if standard output is a terminal. It is always the default +for the @code{dir} and @code{d} programs. +@sc{gnu} @code{ls} uses variable width columns to display as many files as +possible in the fewest lines. + +@item --color [=@var{when}] +@opindex --color +@cindex color, distinguishing file types with +Specify whether to use color for distinguishing file types. @var{when} +may be omitted, or one of: +@itemize @bullet +@item none +@vindex none @r{color option} +- Do not use color at all. This is the default. +@item auto +@vindex auto @r{color option} +@cindex terminal, using color iff +- Only use color if standard output is a terminal. +@item always +@vindex always @r{color option} +- Always use color. 
+@end itemize +Specifying @samp{--color} and no @var{when} is equivalent to +@samp{--color=always}. +Piping a colorized listing through a pager like @code{more} or +@code{less} usually produces unreadable results. However, using +@code{more -f} does seem to work. + +@item -F +@itemx --classify +@itemx --indicator-style=classify +@opindex -F +@opindex --classify +@opindex --indicator-style +@cindex file type and executables, marking +@cindex executables and file type, marking +Append a character to each file name indicating the file type. Also, +for regular files that are executable, append @samp{*}. The file type +indicators are @samp{/} for directories, @samp{@@} for symbolic links, +@samp{|} for FIFOs, @samp{=} for sockets, and nothing for regular files. + +@item --indicator-style=@var{word} +@opindex --indicator-style +Append a character indicator with style @var{word} to entry names, +as follows: +@table @samp +@item none +Do not append any character indicator; this is the default. +@item file-type +Append @samp{/} for directories, @samp{@@} for symbolic links, @samp{|} +for FIFOs, @samp{=} for sockets, and nothing for regular files. This is +the same as the @samp{-p} or @samp{--file-type} option. +@item classify +Append @samp{*} for executable regular files, otherwise behave as for +@samp{file-type}. This is the same as the @samp{-F} or +@samp{--classify} option. +@end table + +@item -k +@itemx --kilobytes +@opindex -k +@opindex --kilobytes +Print file sizes in 1024-byte blocks, overriding the default block +size (@pxref{Block size}). + +@item -m +@itemx --format=commas +@opindex -m +@opindex --format +@opindex commas@r{, outputting between files} +List files horizontally, with as many as will fit on each line, +separated by @samp{, } (a comma and a space). 
+ +@item -p +@itemx --file-type +@itemx --indicator-style=file-type +@opindex --file-type +@opindex --indicator-style +@cindex file type, marking +Append a character to each file name indicating the file type. This is +like @samp{-F}, except that executables are not marked. + +@item -x +@itemx --format=across +@itemx --format=horizontal +@opindex -x +@opindex --format +@opindex across@r{, listing files} +@opindex horizontal@r{, listing files} +List the files in columns, sorted horizontally. + +@item -T @var{cols} +@itemx --tabsize=@var{cols} +@opindex -T +@opindex --tabsize +Assume that each tabstop is @var{cols} columns wide. The default is 8. +@code{ls} uses tabs where possible in the output, for efficiency. If +@var{cols} is zero, do not use tabs at all. + +@item -w @var{cols} +@itemx --width=@var{cols} +@opindex -w +@opindex --width +@vindex COLUMNS +Assume the screen is @var{cols} columns wide. The default is taken +from the terminal settings if possible; otherwise the environment +variable @env{COLUMNS} is used if it is set; otherwise the default +is 80. + +@end table + + +@node Formatting file timestamps +@subsection Formatting file timestamps + +By default, file timestamps are output in abbreviated form. For files +with a time more than six months old or in the future, the timestamp +contains the year instead of the time of day. If the timestamp contains +today's date with the year rather than a time of day, the file's time is +in the future, which means you probably have clock skew problems which +may break programs like @command{make} that rely on file times. + +The following option changes how file timestamps are printed. + +@table @samp +@item --time-style=@var{word} +@opindex --time-style +@cindex time style +Use style @var{word} to output file timestamps. The @var{word} should +be one of the following: + +@table @samp +@item full-iso +List timestamps in full, rather than using the standard abbreviation +heuristics. 
The format is @sc{iso} 8601 date, time, and time zone +format with nanosecond precision, e.g., @samp{2001-05-14 +23:45:56.477817180 -0700}. It's not possible to change the format, but +you can extract out the date string with @code{cut} and then pass the +result to @code{date -d}. @xref{date invocation, @code{date} +invocation, , sh-utils, Shell utilities}. + +This is useful because the time output includes all the information that +is available from the operating system. For example, this can help +when you have a Makefile that is not regenerating files properly. + +@item iso +Use @sc{iso}-style time stamps like @samp{2001-05-14@ } and @samp{05-14 +23:45}. + +@item locale +@vindex LC_ALL +@vindex LC_TIME +@vindex LANG +Use locale-dependent dates like @samp{touko@ @ 14 2001} and @samp{touko@ +@ 14 23:45}, time stamps that might occur in a Finnish locale. The +locale for formatting timestamps is specified by the first of three +environment variables @env{LC_ALL}, @env{LC_TIME}, @env{LANG} that is +set. + +@item posix-iso +Use traditional @sc{posix}-locale dates like @samp{May 14@ @ 2001} and +@samp{May 14 23:45} unless the user specifies a non-@sc{posix} locale, +in which case use @sc{iso}-style dates. This is the default. +@end table +@end table + +@vindex TIME_STYLE +You can specify the default value of the @option{--time-style} option +with the environment variable @env{TIME_STYLE}. @sc{gnu} Emacs 21 and +later can parse @sc{iso} dates, but older Emacs versions do not, so if +you are using an older version of Emacs and specify a non-@sc{posix} +locale, you may need to set @samp{TIME_STYLE="locale"}. + + +@node Formatting the file names +@subsection Formatting the file names + +These options change how file names themselves are printed. 
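As a quick illustration of what this subsection covers, the sketch below lists a file whose name contains a tab character, once with @samp{-q} and once with @samp{-b}. It is not part of the original manual; the scratch directory, the file name, and the variable names are hypothetical, and the @samp{-b} result assumes an @code{ls} that implements the escape style described below (as @sc{gnu} @code{ls} does).

```shell
#!/bin/sh
# Hypothetical scratch directory holding one file whose name contains a tab.
dir=$(mktemp -d)
tab=$(printf '\t')
: > "$dir/a${tab}b"

# -q: replace each nongraphic character with a question mark.
hidden=$(LC_ALL=C ls -q "$dir")
# -b: show nongraphic characters as C-style backslash escapes.
escaped=$(LC_ALL=C ls -b "$dir")

# printf '%s\n' prints the captured strings without interpreting backslashes.
printf '%s\n' "$hidden" "$escaped"

rm -rf "$dir"
```

Setting @env{LC_ALL} to @samp{C} here only makes the classification of the tab as nongraphic independent of the caller's locale.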
+ +@table @samp + +@item -b +@itemx --escape +@itemx --quoting-style=escape +@opindex -b +@opindex --escape +@opindex --quoting-style +@cindex backslash sequences for file names +Quote nongraphic characters in file names using alphabetic and octal +backslash sequences like those used in C. + +@item -N +@itemx --literal +@opindex -N +@opindex --literal +Do not quote file names. + +@item -q +@itemx --hide-control-chars +@opindex -q +@opindex --hide-control-chars +Print question marks instead of nongraphic characters in file names. +This is the default if the output is a terminal and the program is +@code{ls}. + +@item -Q +@itemx --quote-name +@itemx --quoting-style=c +@opindex -Q +@opindex --quote-name +@opindex --quoting-style +Enclose file names in double quotes and quote nongraphic characters as +in C. + +@item --quoting-style=@var{word} +@opindex --quoting-style +@cindex quoting style +Use style @var{word} to quote output names. The @var{word} should +be one of the following: +@table @samp +@item literal +Output names as-is. +@item shell +Quote names for the shell if they contain shell metacharacters or would +cause ambiguous output. +@item shell-always +Quote names for the shell, even if they would normally not require quoting. +@item c +Quote names as for a C language string; this is the same as the +@samp{-Q} or @samp{--quote-name} option. +@item escape +Quote as with @samp{c} except omit the surrounding double-quote +characters; this is the same as the @samp{-b} or @samp{--escape} option. +@item clocale +Quote as with @samp{c} except use quotation marks appropriate for the +locale. +@item locale +@c Use @t instead of @samp to avoid duplicate quoting in some output styles. +Like @samp{clocale}, but quote @t{`like this'} instead of @t{"like +this"} in the default C locale. This looks nicer on many displays. +@end table + +You can specify the default value of the @samp{--quoting-style} option +with the environment variable @env{QUOTING_STYLE}. 
If that environment +variable is not set, the default value is @samp{literal}, but this +default may change to @samp{shell} in a future version of this package. + +@item --show-control-chars +@opindex --show-control-chars +Print nongraphic characters as-is in file names. +This is the default unless the output is a terminal and the program is +@code{ls}. + +@end table + + +@node dir invocation +@section @code{dir}: Briefly list directory contents + +@pindex dir +@cindex directory listing, brief + +@code{dir} (also installed as @code{d}) is equivalent to @code{ls -C +-b}; that is, by default files are listed in columns, sorted vertically, +and special characters are represented by backslash escape sequences. + +@xref{ls invocation, @code{ls}}. + + +@node vdir invocation +@section @code{vdir}: Verbosely list directory contents + +@pindex vdir +@cindex directory listing, verbose + +@code{vdir} (also installed as @code{v}) is equivalent to @code{ls -l +-b}; that is, by default files are listed in long format and special +characters are represented by backslash escape sequences. + +@node dircolors invocation +@section @code{dircolors}: Color setup for @code{ls} + +@pindex dircolors +@cindex color setup +@cindex setup for color + +@code{dircolors} outputs a sequence of shell commands to set up the +terminal for color output from @code{ls} (and @code{dir}, etc.). +Typical usage: + +@example +eval `dircolors [@var{option}]@dots{} [@var{file}]` +@end example + +If @var{file} is specified, @code{dircolors} reads it to determine which +colors to use for which file types and extensions. Otherwise, a +precompiled database is used. For details on the format of these files, +run @samp{dircolors --print-database}. + +@vindex LS_COLORS +@vindex SHELL @r{environment variable, and color} +The output is a shell command to set the @env{LS_COLORS} environment +variable. 
You can specify the shell syntax to use on the command line, +or @code{dircolors} will guess it from the value of the @env{SHELL} +environment variable. + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp +@item -b +@itemx --sh +@itemx --bourne-shell +@opindex -b +@opindex --sh +@opindex --bourne-shell +@cindex Bourne shell syntax for color setup +@cindex @code{sh} syntax for color setup +Output Bourne shell commands. This is the default if the @env{SHELL} +environment variable is set and does not end with @samp{csh} or +@samp{tcsh}. + +@item -c +@itemx --csh +@itemx --c-shell +@opindex -c +@opindex --csh +@opindex --c-shell +@cindex C shell syntax for color setup +@cindex @code{csh} syntax for color setup +Output C shell commands. This is the default if @code{SHELL} ends with +@code{csh} or @code{tcsh}. + +@item -p +@itemx --print-database +@opindex -p +@opindex --print-database +@cindex color database, printing +@cindex database for color setup, printing +@cindex printing color database +Print the (compiled-in) default color configuration database. This +output is itself a valid configuration file, and is fairly descriptive +of the possibilities. + +@end table + + +@node Basic operations +@chapter Basic operations + +@cindex manipulating files + +This chapter describes the commands for basic file manipulation: +copying, moving (renaming), and deleting (removing). + +@menu +* cp invocation:: Copy files. +* dd invocation:: Convert and copy a file. +* install invocation:: Copy files and set attributes. +* mv invocation:: Move (rename) files. +* rm invocation:: Remove files or directories. +* shred invocation:: Remove files more securely. +@end menu + + +@node cp invocation +@section @code{cp}: Copy files and directories + +@pindex cp +@cindex copying files and directories +@cindex files, copying +@cindex directories, copying + +@code{cp} copies files (or, optionally, directories). 
The copy is +completely independent of the original. You can either copy one file to +another, or copy arbitrarily many files to a destination directory. +Synopsis: + +@example +cp [@var{option}]@dots{} @var{source} @var{dest} +cp [@var{option}]@dots{} @var{source}@dots{} @var{directory} +@end example + +If the last argument names an existing directory, @code{cp} copies each +@var{source} file into that directory (retaining the same name). +Otherwise, if only two files are given, it copies the first onto the +second. It is an error if the last argument is not a directory and more +than two non-option arguments are given. + +Generally, files are written just as they are read. For exceptions, +see the @samp{--sparse} option below. + +By default, @command{cp} does not copy directories. However, the +@option{-R}, @option{-a}, and @option{-r} options cause @command{cp} to +copy recursively by descending into source directories and copying files +to corresponding destination directories. + +By default, @command{cp} follows symbolic links only when not copying +recursively. This default can be overridden with the +@option{--no-dereference} (@option{-d}), @option{--dereference} +(@option{-L}), and @option{-H} options. If more than one of these +options is specified, the last one silently overrides the others. + +@cindex self-backups +@cindex backups, making only +@code{cp} generally refuses to copy a file onto itself, with the +following exception: if @samp{--force --backup} is specified with +@var{source} and @var{dest} identical, and referring to a regular file, +@code{cp} will make a backup file, either regular or numbered, as +specified in the usual ways (@pxref{Backup options}). This is useful when +you simply want to make a backup of an existing file before changing it. + +The program accepts the following options. Also see @ref{Common options}. 
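The two synopsis forms and the error case described above can be sketched as follows. This is an illustration under stated assumptions rather than text from the original manual: the scratch directory and the names @file{f1}, @file{f2}, @file{copy}, and @file{dest} are hypothetical.

```shell
#!/bin/sh
# Hypothetical scratch area with two source files and one target directory.
dir=$(mktemp -d)
mkdir "$dir/dest"
printf 'one\n' > "$dir/f1"
printf 'two\n' > "$dir/f2"

# First form: two file arguments, the last is not an existing directory,
# so f1 is copied onto "copy".
cp "$dir/f1" "$dir/copy"
copied=$(cat "$dir/copy")

# Second form: the last argument names an existing directory, so each
# source is copied into it under the same name.
cp "$dir/f1" "$dir/f2" "$dir/dest"
in_dest=$(LC_ALL=C ls "$dir/dest")

# Error case: more than two non-option arguments, last one not a directory.
if cp "$dir/f1" "$dir/f2" "$dir/copy" 2>/dev/null; then
  status=ok
else
  status=error
fi

printf '%s\n' "copy contains: $copied" "dest holds:" "$in_dest" "three-file copy: $status"

rm -rf "$dir"
```

The self-backup exception (@samp{--force --backup} with identical @var{source} and @var{dest}) is left out of the sketch because it depends on the @sc{gnu} backup options; the script under @samp{--backup} in the option list covers that case.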
+ +@table @samp +@item -a +@itemx --archive +@opindex -a +@opindex --archive +Preserve as much as possible of the structure and attributes of the +original files in the copy (but do not attempt to preserve internal +directory structure; i.e., @samp{ls -U} may list the entries in a copied +directory in a different order). +Equivalent to @samp{-dpR}. + +@item -b +@itemx @w{@kbd{--backup}[=@var{method}]} +@opindex -b +@opindex --backup +@vindex VERSION_CONTROL +@cindex backups, making +@xref{Backup options}. +Make a backup of each file that would otherwise be overwritten or removed. +As a special case, @code{cp} makes a backup of @var{source} when the force +and backup options are given and @var{source} and @var{dest} are the same +name for an existing, regular file. One useful application of this +combination of options is this tiny Bourne shell script: + +@example +#!/bin/sh +# Usage: backup FILE... +# Create a @sc{gnu}-style backup of each listed FILE. +for i in "$@"; do + cp --backup --force "$i" "$i" +done +@end example + +@item -d +@itemx --no-dereference +@opindex -d +@opindex --no-dereference +@cindex symbolic links, copying +@cindex hard links, preserving +Copy symbolic links as symbolic links rather than copying the files that +they point to, and preserve hard links between source files in the +copies. + +@item -f +@itemx --force +@opindex -f +@opindex --force +When copying without this option and an existing destination file cannot +be opened for writing, the copy fails. However, with @samp{--force}, +when a destination file cannot be opened, @code{cp} then unlinks it and +tries to open it again. Contrast this behavior with that enabled by +@samp{--link} and @samp{--symbolic-link}, whereby the destination file +is never opened but rather is unlinked unconditionally. Also see the +description of @samp{--remove-destination}. 
+ +@item -H +@opindex -H +If a command line argument specifies a symbolic link, then copy the +file it points to rather than the symbolic link itself. However, +copy (preserving its nature) any symbolic link that is encountered +via recursive traversal. + +@item -i +@itemx --interactive +@opindex -i +@opindex --interactive +Prompt whether to overwrite existing regular destination files. + +@item -l +@itemx --link +@opindex -l +@opindex --link +Make hard links instead of copies of non-directories. + +@item -L +@itemx --dereference +@opindex -L +@opindex --dereference +Always follow symbolic links. + +@item -p +@itemx --preserve +@opindex -p +@opindex --preserve +@cindex file information, preserving +Preserve the original files' owner, group, permissions, and timestamps. +In the absence of this option, each destination file is created with the +permissions of the corresponding source file, minus the bits set in the +umask. @xref{File permissions}. + +@item -P +@itemx --parents +@opindex -P +@opindex --parents +@cindex parent directories and @code{cp} +Form the name of each destination file by appending to the target +directory a slash and the specified name of the source file. The last +argument given to @code{cp} must be the name of an existing directory. +For example, the command: + +@example +cp --parents a/b/c existing_dir +@end example + +@noindent +copies the file @file{a/b/c} to @file{existing_dir/a/b/c}, creating +any missing intermediate directories. + +Warning: the meaning of @option{-P} will change in the future to conform +to @sc{posix}. Use @option{--parents} for the old meaning, and +@option{--no-dereference} for the new. 
+ +@item -r +@cindex directories, copying recursively +@cindex copying directories recursively +@cindex recursively copying directories +@cindex non-directories, copying as special files +Copy directories recursively, copying any non-directories and special +files (e.g., symbolic links, FIFOs and device files) as if they were +regular files. This means trying to read the data in each source +file and writing it to the destination. It is usually a mistake to +apply @code{cp -r} to special files like FIFOs and the ones typically +found in the @file{/dev} directory. In most cases, @code{cp -r} +will hang indefinitely trying to read from FIFOs and special files +like @file{/dev/console}, and it will fill up your destination disk +if you use it to copy @file{/dev/zero}. +Use the @samp{--recursive} (@samp{-R}) option instead if you want +to copy special files, preserving their special nature +rather than reading from them to copy their contents. + +@item -R +@itemx --recursive +@opindex -R +@opindex --recursive +Copy directories recursively, preserving non-directories (contrast with +@samp{-r} just above). + +@item --remove-destination +@opindex --remove-destination +Remove each existing destination file before attempting to open it +(contrast with @option{-f} above). + +@item --sparse=@var{when} +@opindex --sparse=@var{when} +@cindex sparse files, copying +@cindex holes, copying files with +@findex read @r{system call, and holes} +A @dfn{sparse file} contains @dfn{holes}---a sequence of zero bytes that +does not occupy any physical disk blocks; the @samp{read} system call +reads these as zeroes. This can both save considerable disk space and +increase speed, since many binary files contain lots of consecutive zero +bytes. By default, @code{cp} detects holes in input source files via a crude +heuristic and makes the corresponding output file sparse as well. 
+ +The @var{when} value can be one of the following: +@table @samp +@item auto +The default behavior: the output file is sparse if the input file is sparse. + +@item always +Always make the output file sparse. This is useful when the input +file resides on a filesystem that does not support sparse files (the +most notable example is @samp{efs} filesystems in SGI IRIX 5.3 and +earlier), but the output file is on another type of filesystem. + +@item never +Never make the output file sparse. +This is useful in creating a file for use with the @code{mkswap} command, +since such a file must not have any holes. +@end table + +@itemx @w{@kbd{--strip-trailing-slashes}} +@opindex --strip-trailing-slashes +@cindex stripping trailing slashes +Remove any trailing slashes from each @var{source} argument. +@xref{Trailing slashes}. + +@item -s +@itemx --symbolic-link +@opindex -s +@opindex --symbolic-link +@cindex symbolic links, copying with +Make symbolic links instead of copies of non-directories. All source +file names must be absolute (starting with @samp{/}) unless the +destination files are in the current directory. This option merely +results in an error message on systems that do not support symbolic links. + +@item -S @var{suffix} +@itemx --suffix=@var{suffix} +@opindex -S +@opindex --suffix +Append @var{suffix} to each backup file made with @samp{-b}. +@xref{Backup options}. + +@itemx @w{@kbd{--target-directory}=@var{directory}} +@opindex --target-directory +@cindex target directory +@cindex destination directory +Specify the destination @var{directory}. +@xref{Target directory}. + +@item -v +@itemx --verbose +@opindex -v +@opindex --verbose +Print the name of each file before copying it. + +@item -V @var{method} +@itemx --version-control=@var{method} +@opindex -V +@opindex --version-control +Change the type of backups made with @samp{-b}. 
The @var{method} +argument can be @samp{none} (or @samp{off}), @samp{numbered} (or +@samp{t}), @samp{existing} (or @samp{nil}), or @samp{never} (or +@samp{simple}). @xref{Backup options}. + +@item -x +@itemx --one-file-system +@opindex -x +@opindex --one-file-system +@cindex filesystems, omitting copying to different +Skip subdirectories that are on different filesystems from the one that +the copy started on. +However, mount point directories @emph{are} copied. + +@end table + + +@node dd invocation +@section @code{dd}: Convert and copy a file + +@pindex dd +@cindex converting while copying a file + +@code{dd} copies a file (from standard input to standard output, by +default) with a changeable I/O block size, while optionally performing +conversions on it. Synopsis: + +@example +dd [@var{option}]@dots{} +@end example + +The program accepts the following options. Also see @ref{Common options}. + +@cindex multipliers after numbers +The numeric-valued options below (@var{bytes} and @var{blocks}) can be +followed by a multiplier: @samp{b}=512, @samp{c}=1, +@samp{w}=2, @samp{x@var{m}}=@var{m}, or any of the +standard block size suffixes like @samp{k}=1024 (@pxref{Block size}). + +Use different @command{dd} invocations to use different block sizes for +skipping and I/O. For example, the following shell commands copy data +in 512 kB blocks between a disk and a tape, but do not save or restore a +4 kB label at the start of the disk: + +@example +disk=/dev/rdsk/c0t1d0s2 +tape=/dev/rmt/0 + +# Copy all but the label from disk to tape. +(dd bs=4k skip=1 count=0 && dd bs=512k) <$disk >$tape + +# Copy from tape back to disk, but leave the disk label alone. +(dd bs=4k seek=1 count=0 && dd bs=512k) <$tape >$disk +@end example + +@table @samp + +@item if=@var{file} +@opindex if +Read from @var{file} instead of standard input. + +@item of=@var{file} +@opindex of +Write to @var{file} instead of standard output. 
Unless +@samp{conv=notrunc} is given, @code{dd} truncates @var{file} to zero +bytes (or the size specified with @samp{seek=}). + +@item ibs=@var{bytes} +@opindex ibs +@cindex block size of input +@cindex input block size +Read @var{bytes} bytes at a time. + +@item obs=@var{bytes} +@opindex obs +@cindex block size of output +@cindex output block size +Write @var{bytes} bytes at a time. + +@item bs=@var{bytes} +@opindex bs +@cindex block size +Both read and write @var{bytes} bytes at a time. This overrides +@samp{ibs} and @samp{obs}. + +@item cbs=@var{bytes} +@opindex cbs +@cindex block size of conversion +@cindex conversion block size +Convert @var{bytes} bytes at a time. + +@item skip=@var{blocks} +@opindex skip +Skip @var{blocks} @samp{ibs}-byte blocks in the input file before copying. + +@item seek=@var{blocks} +@opindex seek +Skip @var{blocks} @samp{obs}-byte blocks in the output file before copying. + +@item count=@var{blocks} +@opindex count +Copy @var{blocks} @samp{ibs}-byte blocks from the input file, instead +of everything until the end of the file. + +@item conv=@var{conversion}[,@var{conversion}]@dots{} +@opindex conv +Convert the file as specified by the @var{conversion} argument(s). +(No spaces around any comma(s).) + +Conversions: + +@table @samp + +@item ascii +@opindex ascii@r{, converting to} +Convert EBCDIC to ASCII. + +@item ebcdic +@opindex ebcdic@r{, converting to} +Convert ASCII to EBCDIC. + +@item ibm +@opindex alternate ebcdic@r{, converting to} +Convert ASCII to alternate EBCDIC. + +@item block +@opindex block @r{(space-padding)} +For each line in the input, output @samp{cbs} bytes, replacing the +input newline with a space and padding with spaces as necessary. + +@item unblock +@opindex unblock +Replace trailing spaces in each @samp{cbs}-sized input block with a +newline. + +@item lcase +@opindex lcase@r{, converting to} +Change uppercase letters to lowercase. 
+ +@item ucase +@opindex ucase@r{, converting to} +Change lowercase letters to uppercase. + +@item swab +@opindex swab @r{(byte-swapping)} +@cindex byte-swapping +Swap every pair of input bytes. @sc{gnu} @code{dd}, unlike others, works +when an odd number of bytes are read---the last byte is simply copied +(since there is nothing to swap it with). + +@item noerror +@opindex noerror +@cindex read errors, ignoring +Continue after read errors. + +@item notrunc +@opindex notrunc +@cindex truncating output file, avoiding +Do not truncate the output file. + +@item sync +@opindex sync @r{(padding with nulls)} +Pad every input block to the size of @samp{ibs} with trailing zero bytes. +When used with @samp{block} or @samp{unblock}, pad with spaces instead of +zero bytes. +@end table + +@end table + + +@node install invocation +@section @code{install}: Copy files and set attributes + +@pindex install +@cindex copying files and setting attributes + +@code{install} copies files while setting their permission modes and, if +possible, their owner and group. Synopses: + +@example +install [@var{option}]@dots{} @var{source} @var{dest} +install [@var{option}]@dots{} @var{source}@dots{} @var{directory} +install -d [@var{option}]@dots{} @var{directory}@dots{} +@end example + +In the first of these, the @var{source} file is copied to the @var{dest} +target file. In the second, each of the @var{source} files is copied +to the destination @var{directory}. In the last, each @var{directory} +(and any missing parent directories) is created. + +@cindex Makefiles, installing programs in +@code{install} is similar to @code{cp}, but allows you to control the +attributes of destination files. It is typically used in Makefiles to +copy programs into their destination directories. It refuses to copy +files onto themselves. + +The program accepts the following options. Also see @ref{Common options}. 
+ +@table @samp + +@item -b +@itemx @w{@kbd{--backup}[=@var{method}]} +@opindex -b +@opindex --backup +@vindex VERSION_CONTROL +@cindex backups, making +@xref{Backup options}. +Make a backup of each file that would otherwise be overwritten or removed. + +@item -c +@opindex -c +Ignored; for compatibility with old Unix versions of @code{install}. + +@item -d +@itemx --directory +@opindex -d +@opindex --directory +@cindex directories, creating with given attributes +@cindex parent directories, creating missing +@cindex leading directories, creating missing +Create each given directory and any missing parent directories, setting +the owner, group and mode as given on the command line or to the +defaults. It also gives any parent directories it creates those +attributes. (This is different from the SunOS 4.x @code{install}, which +gives directories that it creates the default attributes.) + +@item -g @var{group} +@itemx --group=@var{group} +@opindex -g +@opindex --group +@cindex group ownership of installed files, setting +Set the group ownership of installed files or directories to +@var{group}. The default is the process' current group. @var{group} +may be either a group name or a numeric group id. + +@item -m @var{mode} +@itemx --mode=@var{mode} +@opindex -m +@opindex --mode +@cindex permissions of installed files, setting +Set the permissions for the installed file or directory to @var{mode}, +which can be either an octal number, or a symbolic mode as in +@code{chmod}, with 0 as the point of departure (@pxref{File +permissions}). The default mode is @samp{u=rwx,go=rx}---read, write, +and execute for the owner, and read and execute for group and other. 
+ +@item -o @var{owner} +@itemx --owner=@var{owner} +@opindex -o +@opindex --owner +@cindex ownership of installed files, setting +@cindex appropriate privileges +@vindex root @r{as default owner} +If @code{install} has appropriate privileges (is run as root), set the +ownership of installed files or directories to @var{owner}. The default +is @code{root}. @var{owner} may be either a user name or a numeric user +ID. + +@item -p +@itemx --preserve-timestamps +@opindex -p +@opindex --preserve-timestamps +@cindex timestamps of installed files, preserving +Set the time of last access and the time of last modification of each +installed file to match those of each corresponding original file. +When a file is installed without this option, its last access and +last modification times are both set to the time of installation. +This option is useful if you want to use the last modification times +of installed files to keep track of when they were last built as opposed +to when they were last installed. + +@item -s +@itemx --strip +@opindex -s +@opindex --strip +@cindex symbol table information, stripping +@cindex stripping symbol table information +Strip the symbol tables from installed binary executables. + +@item -S @var{suffix} +@itemx --suffix=@var{suffix} +@opindex -S +@opindex --suffix +Append @var{suffix} to each backup file made with @samp{-b}. +@xref{Backup options}. + +@itemx @w{@kbd{--target-directory}=@var{directory}} +@opindex --target-directory +@cindex target directory +@cindex destination directory +Specify the destination @var{directory}. +@xref{Target directory}. + +@item -v +@itemx --verbose +@opindex -v +@opindex --verbose +Print the name of each file before copying it. + +@item -V @var{method} +@itemx --version-control=@var{method} +@opindex -V +@opindex --version-control +Change the type of backups made with @samp{-b}. 
The @var{method} +argument can be @samp{none} (or @samp{off}), @samp{numbered} (or +@samp{t}), @samp{existing} (or @samp{nil}), or @samp{never} (or +@samp{simple}). @xref{Backup options}. + +@end table + + +@node mv invocation +@section @code{mv}: Move (rename) files + +@pindex mv + +@code{mv} moves or renames files (or directories). Synopsis: + +@example +mv [@var{option}]@dots{} @var{source} @var{dest} +mv [@var{option}]@dots{} @var{source}@dots{} @var{directory} +@end example + +If the last argument names an existing directory, @code{mv} moves each +other given file into a file with the same name in that directory. +Otherwise, if only two files are given, it renames the first as +the second. It is an error if the last argument is not a directory +and more than two files are given. + +@code{mv} can move any type of file from one filesystem to another. +Prior to version @code{4.0} of the fileutils, +@code{mv} could move only regular files between filesystems. +For example, now @code{mv} can move an entire directory hierarchy +including special device files from one partition to another. It first +uses some of the same code that's used by @code{cp -a} to copy the +requested directories and files, then (assuming the copy succeeded) +it removes the originals. If the copy fails, then the part that was +copied to the destination partition is removed. If you were to move +three directories from one partition to another and the copy of the first +directory succeeded, but the second didn't, the first would be left on +the destination partition and the second and third would be left on the +original partition. + +@cindex prompting, and @code{mv} +If a destination file exists but is normally unwritable, standard input +is a terminal, and the @samp{-f} or @samp{--force} option is not given, +@code{mv} prompts the user for whether to replace the file. (You might +own the file, or have write permission on its directory.) 
If the +response does not begin with @samp{y} or @samp{Y}, the file is skipped. + +@emph{Warning}: If you try to move a symlink that points to a directory, +and you specify the symlink with a trailing slash, then @code{mv} +doesn't move the symlink but instead moves the directory referenced +by the symlink. @xref{Trailing slashes}. + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp + +@item -b +@itemx @w{@kbd{--backup}[=@var{method}]} +@opindex -b +@opindex --backup +@vindex VERSION_CONTROL +@cindex backups, making +@xref{Backup options}. +Make a backup of each file that would otherwise be overwritten or removed. + +@item -f +@itemx --force +@opindex -f +@opindex --force +@cindex prompts, omitting +Do not prompt the user before removing an unwritable destination file. + +@item -i +@itemx --interactive +@opindex -i +@opindex --interactive +@cindex prompts, forcing +Prompt whether to overwrite each existing destination file, regardless +of its permissions. If the response does not begin with @samp{y} or +@samp{Y}, the file is skipped. + +@item -u +@itemx --update +@opindex -u +@opindex --update +@cindex newer files, moving only +Do not move a nondirectory that has an existing destination with the +same or newer modification time. + +@item -v +@itemx --verbose +@opindex -v +@opindex --verbose +Print the name of each file before moving it. + +@itemx @w{@kbd{--strip-trailing-slashes}} +@opindex --strip-trailing-slashes +@cindex stripping trailing slashes +Remove any trailing slashes from each @var{source} argument. +@xref{Trailing slashes}. + +@item -S @var{suffix} +@itemx --suffix=@var{suffix} +@opindex -S +@opindex --suffix +Append @var{suffix} to each backup file made with @samp{-b}. +@xref{Backup options}. + +@itemx @w{@kbd{--target-directory}=@var{directory}} +@opindex --target-directory +@cindex target directory +@cindex destination directory +Specify the destination @var{directory}. +@xref{Target directory}. 
+ +@item -V @var{method} +@itemx --version-control=@var{method} +@opindex -V +@opindex --version-control +Change the type of backups made with @samp{-b}. The @var{method} +argument can be @samp{none} (or @samp{off}), @samp{numbered} (or +@samp{t}), @samp{existing} (or @samp{nil}), or @samp{never} (or +@samp{simple}). @xref{Backup options}. + +@end table + + +@node rm invocation +@section @code{rm}: Remove files or directories + +@pindex rm +@cindex removing files or directories + +@code{rm} removes each given @var{file}. By default, it does not remove +directories. Synopsis: + +@example +rm [@var{option}]@dots{} [@var{file}]@dots{} +@end example + +@cindex prompting, and @code{rm} +If a file is unwritable, standard input is a terminal, and the @samp{-f} +or @samp{--force} option is not given, or the @samp{-i} or +@samp{--interactive} option @emph{is} given, @code{rm} prompts the user +for whether to remove the file. If the response does not begin with +@samp{y} or @samp{Y}, the file is skipped. + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp + +@item -d +@itemx --directory +@opindex -d +@opindex --directory +@cindex directories, removing with @code{unlink} +@findex unlink +@pindex fsck +Attempt to remove directories with @code{unlink} instead of @code{rmdir}, and +don't require a directory to be empty before trying to unlink it. This works +only if you have appropriate privileges and if your operating system supports +@code{unlink} for directories. Because unlinking a directory causes any files +in the deleted directory to become unreferenced, it is wise to @code{fsck} the +filesystem after doing this. + +@item -f +@itemx --force +@opindex -f +@opindex --force +Ignore nonexistent files and never prompt the user. +Ignore any previous @samp{--interactive} (@samp{-i}) option. + +@item -i +@itemx --interactive +@opindex -i +@opindex --interactive +Prompt whether to remove each file. 
If the response does not begin +with @samp{y} or @samp{Y}, the file is skipped. +Ignore any previous @samp{--force} (@samp{-f}) option. + +@item -r +@itemx -R +@itemx --recursive +@opindex -r +@opindex -R +@opindex --recursive +@cindex directories, removing (recursively) +Remove the contents of directories recursively. + +@item -v +@itemx --verbose +@opindex -v +@opindex --verbose +Print the name of each file before removing it. + +@end table + +@cindex files beginning with @samp{-}, removing +@cindex @samp{-}, removing files beginning with +One common question is how to remove files whose names begin with a +@samp{-}. @sc{gnu} @code{rm}, like every program that uses the @code{getopt} +function to parse its arguments, lets you use the @samp{--} option to +indicate that all following arguments are non-options. To remove a file +called @file{-f} in the current directory, you could type either: + +@example +rm -- -f +@end example + +@noindent +or: + +@example +rm ./-f +@end example + +@opindex - @r{and Unix @code{rm}} +The Unix @code{rm} program's use of a single @samp{-} for this purpose +predates the development of the getopt standard syntax. + + +@node shred invocation +@section @code{shred}: Remove files more securely + +@pindex shred +@cindex data, erasing +@cindex erasing data + +@code{shred} overwrites devices or files, to help prevent even +very expensive hardware from recovering the data. + +Ordinarily when you remove a file (@pxref{rm invocation}), the data is +not actually destroyed. Only the index listing where the file is +stored is destroyed, and the storage is made available for reuse. +There are undelete utilities that will attempt to reconstruct the index +and can bring the file back if the parts were not reused. + +On a busy system with a nearly-full drive, space can get reused in a few +seconds. But there is no way to know for sure. 
If you have sensitive +data, you may want to be sure that recovery is not possible by actually +overwriting the file with non-sensitive data. + +However, even after doing that, it is possible to take the disk back +to a laboratory and use a lot of sensitive (and expensive) equipment +to look for the faint ``echoes'' of the original data underneath the +overwritten data. If the data has only been overwritten once, it's not +even that hard. + +The best way to remove something irretrievably is to destroy the media +it's on with acid, melt it down, or the like. For cheap removable media +like floppy disks, this is the preferred method. However, hard drives +are expensive and hard to melt, so the @code{shred} utility tries +to achieve a similar effect non-destructively. + +This uses many overwrite passes, with the data patterns chosen to +maximize the damage they do to the old data. While this will work on +floppies, the patterns are designed for best effect on hard drives. +For more details, see the source code and Peter Gutmann's paper +@cite{Secure Deletion of Data from Magnetic and Solid-State Memory}, +from the proceedings of the Sixth USENIX Security Symposium (San Jose, +California, 22--25 July, 1996). The paper is also available online +@url{http://www.cs.auckland.ac.nz/~pgut001/pubs/secure_del.html}. + +@strong{Please note} that @code{shred} relies on a very important assumption: +that the filesystem overwrites data in place. This is the traditional +way to do things, but many modern filesystem designs do not satisfy this +assumption. Exceptions include: + +@itemize @bullet + +@item +Log-structured or journaled filesystems, such as those supplied with +AIX and Solaris. + +@item +Filesystems that write redundant data and carry on even if some writes +fail, such as RAID-based filesystems. + +@item +Filesystems that make snapshots, such as Network Appliance's NFS server. + +@item +Filesystems that cache in temporary locations, such as NFS version 3 +clients. 
+ +@item +Compressed filesystems. +@end itemize + +If you are not sure how your filesystem operates, then you should assume +that it does not overwrite data in place, which means that shred cannot +reliably operate on regular files in your filesystem. + +Generally speaking, it is more reliable to shred a device than a file, +since this bypasses the problem of filesystem design mentioned above. +However, even shredding devices is not always completely reliable. For +example, most disks map out bad sectors invisibly to the application; if +the bad sectors contain sensitive data, @code{shred} won't be able to +destroy it. + +@code{shred} makes no attempt to detect or report this problem, just as +it makes no attempt to do anything about backups. However, since it is +more reliable to shred devices than files, @code{shred} by default does +not truncate or remove the output file. This default is more suitable +for devices, which typically cannot be truncated and should not be +removed. + +Finally, consider the risk of backups and mirrors. +File system backups and remote mirrors may contain copies of the +file that cannot be removed, and that will allow a shredded file +to be recovered later. So if you keep any data you may later want +to destroy using @code{shred}, be sure that it is not backed up or mirrored. + +@example +shred [@var{option}]@dots{} @var{file}[@dots{}] +@end example + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp + +@item -f +@itemx --force +@opindex -f +@opindex --force +@cindex force deletion +Override file permissions if necessary to allow overwriting. + +@item -@var{NUMBER} +@itemx -n @var{NUMBER} +@itemx --iterations=@var{NUMBER} +@opindex -n @var{NUMBER} +@opindex --iterations=@var{NUMBER} +@cindex iterations, selecting the number of +By default, @code{shred} uses 25 passes of overwrite. This is enough +for all of the useful overwrite patterns to be used at least once. 
+You can reduce this to save time, or increase it if you have a lot of +time to waste. + +@item -s @var{BYTES} +@itemx --size=@var{BYTES} +@opindex -s @var{BYTES} +@opindex --size=@var{BYTES} +@cindex size of file to shred +Shred the first @var{BYTES} bytes of the file. The default is to shred +the whole file. @var{BYTES} can be followed by a size specification like +@samp{k}, @samp{M}, or @samp{G} to specify a multiple. @xref{Block size}. + +@item -u +@itemx --remove +@opindex -u +@opindex --remove +@cindex removing files after shredding +After shredding a file, truncate it (if possible) and then remove it. +If a file has multiple links, only the named links will be removed. + +@item -v +@itemx --verbose +@opindex -v +@opindex --verbose +Display status updates as sterilization proceeds. + +@item -x +@itemx --exact +@opindex -x +@opindex --exact +Normally, shred rounds the file size up to the next multiple of +the filesystem block size to fully erase the last block of the file. +This option suppresses that behavior. +Thus, by default if you shred a 10-byte file on a system with 512-byte +blocks, the resulting file will be 512 bytes long. With this option, +shred does not increase the size of the file. + +@item -z +@itemx --zero +@opindex -z +@opindex --zero +Normally, the last pass that @code{shred} writes is made up of +random data. If this would be conspicuous on your hard drive (for +example, because it looks like encrypted data), or you just think +it's tidier, the @samp{--zero} option adds an additional overwrite pass with +all zero bits. This is in addition to the number of passes specified +by the @samp{--iterations} option. + +@item - +@opindex - +Shred standard output. + +This argument is considered an option. If the common @samp{--} option has +been used to indicate the end of options on the command line, then @samp{-} +will be interpreted as an ordinary file name. + +The intended use of this is to shred a removed temporary file. 
+For example: + +@example +i=`tempfile -m 0600` +exec 3<>"$i" +rm -- "$i" +echo "Hello, world" >&3 +shred - >&3 +exec 3>&- +@end example + +Note that the shell command @samp{shred - >file} does not shred the +contents of @var{file}, since the shell truncates @var{file} before invoking +@code{shred}. Use the command @samp{shred file} or (if using a +Bourne-compatible shell) the command @samp{shred - 1<>file} instead. + +@end table + +You might use the following command to erase all trace of the +file system you'd created on the floppy disk in your first drive. +This command takes about 20 minutes to erase a 1.44MB floppy. + +@example +shred --verbose /dev/fd0 +@end example + +Similarly, to erase all data on a selected partition of +your hard disk, you could give a command like this: + +@example +shred --verbose /dev/sda5 +@end example + +@node Special file types +@chapter Special file types + +@cindex special file types +@cindex file types, special + +This chapter describes commands which create special types of files (and +@code{rmdir}, which removes directories, one special file type). + +@cindex special file types +@cindex file types +Although Unix-like operating systems have markedly fewer special file +types than others, not @emph{everything} can be treated only as the +undifferentiated byte stream of @dfn{normal files}. For example, when a +file is created or removed, the system must record this information, +which it does in a @dfn{directory}---a special type of file. Although +you can read directories as normal files, if you're curious, in order +for the system to do its job it must impose a structure, a certain +order, on the bytes of the file. Thus it is a ``special'' type of file. + +Besides directories, other special file types include named pipes +(FIFOs), symbolic links, sockets, and so-called @dfn{special files}. + +@menu +* ln invocation:: Make links between files. +* mkdir invocation:: Make directories. +* mkfifo invocation:: Make FIFOs (named pipes). 
+* mknod invocation:: Make block or character special files. +* rmdir invocation:: Remove empty directories. +@end menu + + +@node ln invocation +@section @code{ln}: Make links between files + +@pindex ln +@cindex links, creating +@cindex hard links, creating +@cindex symbolic (soft) links, creating +@cindex creating links (hard or soft) + +@cindex filesystems and hard links +@code{ln} makes links between files. By default, it makes hard links; +with the @samp{-s} option, it makes symbolic (or @dfn{soft}) links. +Synopses: + +@example +ln [@var{option}]@dots{} @var{target} [@var{linkname}] +ln [@var{option}]@dots{} @var{target}@dots{} @var{directory} +@end example + +@itemize @bullet + +@item If the last argument names an existing directory, @code{ln} creates a +link to each @var{target} file in that directory, using the +@var{target}s' names. (But see the description of the +@samp{--no-dereference} option below.) + +@item If two filenames are given, @code{ln} creates a link from the +second to the first. + +@item If one @var{target} is given, @code{ln} creates a link to that +file in the current directory. + +@item It is an error if the last argument is not a directory and more +than two files are given. Without @samp{-f} or @samp{-i} (see below), +@code{ln} will not remove an existing file. Use the @samp{--backup} +option to make @code{ln} rename existing files. + +@end itemize + +@cindex hard link, defined +@cindex inode, and hard links +A @dfn{hard link} is another name for an existing file; the link and the +original are indistinguishable. Technically speaking, they share the +same inode, and the inode contains all the information about a +file---indeed, it is not incorrect to say that the inode @emph{is} the +file. On all existing implementations, you cannot make a hard link to +a directory, and hard links cannot cross filesystem boundaries. (These +restrictions are not mandated by @sc{posix}, however.) 
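The statement that a hard link and the original share one inode can be checked from the shell. The following is a minimal sketch and not part of the original manual: the scratch directory and the names @file{orig} and @file{alias} are hypothetical, and it assumes a filesystem that supports hard links and an @code{ls} that accepts @samp{-i}.

```shell
# Work in a throwaway directory (all names here are hypothetical).
dir=$(mktemp -d)
echo "some data" > "$dir/orig"

# Give the same file a second name.
ln "$dir/orig" "$dir/alias"

# ls -i prints the inode number first; both names should report the same one.
ino_orig=$(ls -i "$dir/orig" | awk '{print $1}')
ino_alias=$(ls -i "$dir/alias" | awk '{print $1}')

# Removing one name leaves the data reachable through the other.
rm -- "$dir/orig"
contents=$(cat "$dir/alias")
```

Because neither name is privileged over the other, removing @file{orig} only drops one directory entry; the data should remain readable through @file{alias} until that name is removed as well.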
+ +@cindex dereferencing symbolic links +@cindex symbolic link, defined +@dfn{Symbolic links} (@dfn{symlinks} for short), on the other hand, are +a special file type (which not all kernels support: System V release 3 +(and older) systems lack symlinks) in which the link file actually +refers to a different file, by name. When most operations (opening, +reading, writing, and so on) are passed the symbolic link file, the +kernel automatically @dfn{dereferences} the link and operates on the +target of the link. But some operations (e.g., removing) work on the +link file itself, rather than on its target. @xref{Symbolic Links,,, +library, The GNU C Library Reference Manual}. + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp + +@item -b +@itemx @w{@kbd{--backup}[=@var{method}]} +@opindex -b +@opindex --backup +@vindex VERSION_CONTROL +@cindex backups, making +@xref{Backup options}. +Make a backup of each file that would otherwise be overwritten or removed. + +@item -d +@itemx -F +@itemx --directory +@opindex -d +@opindex -F +@opindex --directory +@cindex hard links to directories +Allow the super-user to make hard links to directories. + +@item -f +@itemx --force +@opindex -f +@opindex --force +Remove existing destination files. + +@item -i +@itemx --interactive +@opindex -i +@opindex --interactive +@cindex prompting, and @code{ln} +Prompt whether to remove existing destination files. + +@item -n +@itemx --no-dereference +@opindex -n +@opindex --no-dereference +When given an explicit destination that is a symlink to a directory, +treat that destination as if it were a normal file. + +When the destination is an actual directory (not a symlink to one), +there is no ambiguity. The link is created in that directory. +But when the specified destination is a symlink to a directory, +there are two ways to treat the user's request. @code{ln} can +treat the destination just as it would a normal directory and create +the link in it. 
On the other hand, the destination can be viewed as a +non-directory---as the symlink itself. In that case, @code{ln} +must delete or back up that symlink before creating the new link. +The default is to treat a destination that is a symlink to a directory +just like a directory. + +@item -s +@itemx --symbolic +@opindex -s +@opindex --symbolic +Make symbolic links instead of hard links. This option merely produces +an error message on systems that do not support symbolic links. + +@item -S @var{suffix} +@itemx --suffix=@var{suffix} +@opindex -S +@opindex --suffix +Append @var{suffix} to each backup file made with @samp{-b}. +@xref{Backup options}. + +@itemx @w{@kbd{--target-directory}=@var{directory}} +@opindex --target-directory +@cindex target directory +@cindex destination directory +Specify the destination @var{directory}. +@xref{Target directory}. + +@item -v +@itemx --verbose +@opindex -v +@opindex --verbose +Print the name of each file before linking it. + +@item -V @var{method} +@itemx --version-control=@var{method} +@opindex -V +@opindex --version-control +Change the type of backups made with @samp{-b}. The @var{method} +argument can be @samp{none} (or @samp{off}), @samp{numbered} (or +@samp{t}), @samp{existing} (or @samp{nil}), or @samp{never} (or +@samp{simple}). @xref{Backup options}. + +@end table + +Examples: + +@smallexample +ln -s /some/name # creates link ./name pointing to /some/name +ln -s /some/name myname # creates link ./myname pointing to /some/name +ln -s a b .. # creates links ../a and ../b; each target resolves in .., not to ./a or ./b +@end smallexample + + +@node mkdir invocation +@section @code{mkdir}: Make directories + +@pindex mkdir +@cindex directories, creating +@cindex creating directories + +@code{mkdir} creates directories with the specified names. 
Synopsis: + +@example +mkdir [@var{option}]@dots{} @var{name}@dots{} +@end example + +If a @var{name} is an existing file but not a directory, @code{mkdir} prints a +warning message on stderr and will exit with a status of 1 after +processing any remaining @var{name}s. The same is done when a @var{name} is an +existing directory and the @samp{-p} option is not given. If a @var{name} is an +existing directory and the @samp{-p} option is given, @code{mkdir} will ignore it. +That is, @code{mkdir} will not print a warning, raise an error, or change +the mode of the directory (even if the @samp{-m} option is given), and will +move on to processing any remaining @var{name}s. + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp + +@item -m @var{mode} +@itemx --mode=@var{mode} +@opindex -m +@opindex --mode +@cindex modes of created directories, setting +Set the mode of created directories to @var{mode}, which is symbolic as +in @code{chmod} and uses @samp{a=rwx} (read, write and execute allowed for +everyone) minus the bits set in the umask as the point of +departure. @xref{File permissions}. + +@item -p +@itemx --parents +@opindex -p +@opindex --parents +@cindex parent directories, creating +Make any missing parent directories for each argument. The mode for parent +directories is set to the umask modified by @samp{u+wx}. +Ignore arguments corresponding to existing directories. + +@item -v +@itemx --verbose +@opindex -v +@opindex --verbose +Print a message for each created directory. This is most useful with +@samp{--parents}. +@end table + + +@node mkfifo invocation +@section @code{mkfifo}: Make FIFOs (named pipes) + +@pindex mkfifo +@cindex FIFOs, creating +@cindex named pipes, creating +@cindex creating FIFOs (named pipes) + +@code{mkfifo} creates FIFOs (also called @dfn{named pipes}) with the +specified names.
Synopsis: + +@example +mkfifo [@var{option}] @var{name}@dots{} +@end example + +A @dfn{FIFO} is a special file type that permits independent processes +to communicate. One process opens the FIFO file for writing, and +another for reading, after which data can flow as with the usual +anonymous pipe in shells or elsewhere. + +The program accepts the following option. Also see @ref{Common options}. + +@table @samp + +@item -m @var{mode} +@itemx --mode=@var{mode} +@opindex -m +@opindex --mode +@cindex modes of created FIFOs, setting +Set the mode of created FIFOs to @var{mode}, which is symbolic as in +@code{chmod} and uses @samp{a=rw} (read and write allowed for everyone) minus +the bits set in the umask for the point of departure. @xref{File permissions}. + +@end table + + +@node mknod invocation +@section @code{mknod}: Make block or character special files + +@pindex mknod +@cindex block special files, creating +@cindex character special files, creating + +@code{mknod} creates a FIFO, character special file, or block special +file with the specified name. Synopsis: + +@example +mknod [@var{option}]@dots{} @var{name} @var{type} [@var{major} @var{minor}] +@end example + +@cindex special files +@cindex block special files +@cindex character special files +Unlike the phrase ``special file type'' above, the term @dfn{special +file} has a technical meaning on Unix: something that can generate or +receive data. Usually this corresponds to a physical piece of hardware, +e.g., a printer or a disk. (These files are typically created at +system-configuration time.) The @code{mknod} command is what creates +files of this type. Such devices can be read either a character at a +time or a ``block'' (many characters) at a time, hence we say there are +@dfn{block special} files and @dfn{character special} files. 
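As an illustration of the file types described above, the following shell sketch classifies a file with the @code{test} operators @samp{-b}, @samp{-c}, and @samp{-p}. The helper name @code{kind} is hypothetical. The sketch also assumes that @file{/dev/null} exists and is a character special file, as on typical Unix systems, and that @code{mktemp -d} is available.

```shell
# kind FILE: hypothetical helper that prints the type of FILE.
kind() {
  if test -b "$1"; then
    echo "block special"
  elif test -c "$1"; then
    echo "character special"
  elif test -p "$1"; then
    echo "FIFO"
  else
    echo "other"
  fi
}

kind /dev/null            # assumed to be a character special file

# A FIFO can typically be created without special privileges,
# unlike block and character special files.
tmpdir=$(mktemp -d)       # assumes mktemp is available
mknod "$tmpdir/pipe" p    # type p: make a FIFO
kind "$tmpdir/pipe"
rm -r "$tmpdir"
```

The FIFO is created in a scratch directory so that the sketch leaves nothing behind.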
+ +The arguments after @var{name} specify the type of file to make: + +@table @samp + +@item p +@opindex p @r{for FIFO file} +for a FIFO + +@item b +@opindex b @r{for block special file} +for a block special file + +@item c +@c Don't document the `u' option -- it's just a synonym for `c'. +@c Do *any* versions of mknod still use it? +@c @itemx u +@opindex c @r{for character special file} +@c @opindex u @r{for character special file} +for a character special file + +@end table + +When making a block or character special file, the major and minor +device numbers must be given after the file type. + +The program accepts the following option. Also see @ref{Common options}. + +@table @samp + +@item -m @var{mode} +@itemx --mode=@var{mode} +@opindex -m +@opindex --mode +Set the mode of created files to @var{mode}, which is symbolic as in +@code{chmod} and uses @samp{a=rw} minus the bits set in the umask as the point +of departure. @xref{File permissions}. + +@end table + + +@node rmdir invocation +@section @code{rmdir}: Remove empty directories + +@pindex rmdir +@cindex removing empty directories +@cindex directories, removing empty + +@code{rmdir} removes empty directories. Synopsis: + +@example +rmdir [@var{option}]@dots{} @var{directory}@dots{} +@end example + +If any @var{directory} argument does not refer to an existing empty +directory, it is an error. + +The program accepts the following option. Also see @ref{Common options}. + +@table @samp + +@item --ignore-fail-on-non-empty +@opindex --ignore-fail-on-non-empty +@cindex directory deletion, ignoring failures +Ignore each failure to remove a directory that is solely because +the directory is non-empty. + +@item -p +@itemx --parents +@opindex -p +@opindex --parents +@cindex parent directories, removing +Remove @var{directory}, then try to remove each component of @var{directory}. +So, for example, @samp{rmdir -p a/b/c} is similar to @samp{rmdir a/b/c a/b a}. 
+As such, it fails if any of those directories turns out not to be empty. +Use the @samp{--ignore-fail-on-non-empty} option so that such +a failure does not evoke a diagnostic and does not cause @code{rmdir} to +exit unsuccessfully. + +@item -v +@itemx --verbose +@opindex -v +@opindex --verbose +@cindex directory deletion, reporting +Give a diagnostic for each @var{directory} that is successfully removed. + +@end table + +@xref{rm invocation}, for how to remove non-empty directories (recursively). + + +@node Changing file attributes +@chapter Changing file attributes + +@cindex changing file attributes +@cindex file attributes, changing +@cindex attributes, file + +A file is not merely its contents, a name, and a file type +(@pxref{Special file types}). A file also has an owner (a userid), a +group (a group id), permissions (what the owner can do with the file, +what people in the group can do, and what everyone else can do), various +timestamps, and other information. Collectively, we call these a file's +@dfn{attributes}. + +These commands change file attributes. + +@menu +* chown invocation:: Change file owners and groups. +* chgrp invocation:: Change file groups. +* chmod invocation:: Change access permissions. +* touch invocation:: Change file timestamps. +@end menu + + +@node chown invocation +@section @code{chown}: Change file owner and group + +@pindex chown +@cindex file ownership, changing +@cindex group ownership, changing +@cindex changing file ownership +@cindex changing group ownership + +@code{chown} changes the user and/or group ownership of each given @var{file} +to @var{new-owner} or to the user and group of an existing reference file.
+Synopsis: + +@example +chown [@var{option}]@dots{} @{@var{new-owner} | --reference=@var{ref_file}@} @var{file}@dots{} +@end example + +If used, @var{new-owner} specifies the new owner and/or group as follows +(with no embedded white space): + +@example +[@var{owner}] [ [:] [@var{group}] ] +@end example + +Specifically: + +@table @var +@item owner +If only an @var{owner} (a user name or numeric user id) is given, that +user is made the owner of each given file, and the files' group is not +changed. + +@itemx owner@samp{:}group +If the @var{owner} is followed by a colon and a @var{group} (a +group name or numeric group id), with no spaces between them, the group +ownership of the files is changed as well (to @var{group}). + +@itemx owner@samp{:} +If a colon but no group name follows @var{owner}, that user is +made the owner of the files and the group of the files is changed to +@var{owner}'s login group. + +@itemx @samp{:}group +If the colon and following @var{group} are given, but the owner +is omitted, only the group of the files is changed; in this case, +@code{chown} performs the same function as @code{chgrp}. + +@end table + +You may use @samp{.} in place of the @samp{:} separator. This is a +@sc{gnu} extension for compatibility with older scripts. +New scripts should avoid the use of @samp{.} because @sc{gnu} @code{chown} +may fail if @var{owner} contains @samp{.} characters. + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp + +@item -c +@itemx --changes +@opindex -c +@opindex --changes +@cindex changed owners, verbosely describing +Verbosely describe the action for each @var{file} whose ownership +actually changes. + +@item -f +@itemx --silent +@itemx --quiet +@opindex -f +@opindex --silent +@opindex --quiet +@cindex error messages, omitting +Do not print error messages about files whose ownership cannot be +changed. 
+ +@item @w{@kbd{--from}=@var{old-owner}} +@opindex --from +@cindex symbolic links, changing owner +Change a @var{file}'s ownership only if it has current attributes specified +by @var{old-owner}. @var{old-owner} has the same form as @var{new-owner} +described above. +This option is useful primarily from a security standpoint in that +it narrows considerably the window of potential abuse. +For example, to reflect a UID numbering change for one user's files +without an option like this, @code{root} might run + +@smallexample +find / -user OLDUSER -print0 | xargs -0 chown NEWUSER +@end smallexample + +But that is dangerous because the interval between when the @code{find} +tests the existing file's owner and when the @code{chown} is actually run +may be quite large. +One way to narrow the gap would be to invoke @code{chown} for each file +as it is found: + +@example +find / -user OLDUSER -exec chown NEWUSER @{@} \; +@end example + +But that is very slow if there are many affected files. +With this option, it is safer (the gap is narrower still) +though still not perfect: + +@example +chown -R --from=OLDUSER NEWUSER / +@end example + +@item --dereference +@opindex --dereference +@cindex symbolic links, changing owner +@findex lchown +Do not act on symbolic links themselves but rather on what they point to. + +@item -h +@itemx --no-dereference +@opindex -h +@opindex --no-dereference +@cindex symbolic links, changing owner +@findex lchown +Act on symbolic links themselves instead of what they point to. +This is the default. +This mode relies on the @code{lchown} system call. +On systems that do not provide the @code{lchown} system call, +@code{chown} fails when a file specified on the command line +is a symbolic link. +By default, no diagnostic is issued for symbolic links encountered +during a recursive traversal, but see @samp{--verbose}.
+ +@item --reference=@var{ref_file} +@opindex --reference +Change the user and group of each @var{file} to be the same as those of +@var{ref_file}. If @var{ref_file} is a symbolic link, do not use the +user and group of the symbolic link, but rather those of the file it +refers to. + +@item -v +@itemx --verbose +@opindex -v +@opindex --verbose +Output a diagnostic for every file processed. +If a symbolic link is encountered during a recursive traversal +on a system without the @code{lchown} system call, and @samp{--no-dereference} +is in effect, then issue a diagnostic saying neither the symbolic link nor +its referent is being changed. + +@item -R +@itemx --recursive +@opindex -R +@opindex --recursive +@cindex recursively changing file ownership +Recursively change ownership of directories and their contents. + +@end table + + +@node chgrp invocation +@section @code{chgrp}: Change group ownership + +@pindex chgrp +@cindex group ownership, changing +@cindex changing group ownership + +@code{chgrp} changes the group ownership of each given @var{file} +to @var{group} (which can be either a group name or a numeric group id) +or to the group of an existing reference file. Synopsis: + +@example +chgrp [@var{option}]@dots{} @{@var{group} | --reference=@var{ref_file}@} @var{file}@dots{} +@end example + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp + +@item -c +@itemx --changes +@opindex -c +@opindex --changes +@cindex changed files, verbosely describing +Verbosely describe the action for each @var{file} whose group actually +changes. + +@item -f +@itemx --silent +@itemx --quiet +@opindex -f +@opindex --silent +@opindex --quiet +@cindex error messages, omitting +Do not print error messages about files whose group cannot be +changed. + +@item --dereference +@opindex --dereference +@cindex symbolic links, changing owner +@findex lchown +Do not act on symbolic links themselves but rather on what they point to. 
+ +@item -h +@itemx --no-dereference +@opindex -h +@opindex --no-dereference +@cindex symbolic links, changing group +@findex lchown +Act on symbolic links themselves instead of what they point to. +This is the default. +This mode relies on the @code{lchown} system call. +On systems that do not provide the @code{lchown} system call, +@code{chgrp} fails when a file specified on the command line +is a symbolic link. +By default, no diagnostic is issued for symbolic links encountered +during a recursive traversal, but see @samp{--verbose}. + +@item --reference=@var{ref_file} +@opindex --reference +Change the group of each @var{file} to be the same as that of +@var{ref_file}. If @var{ref_file} is a symbolic link, do not use the +group of the symbolic link, but rather that of the file it refers to. + +@item -v +@itemx --verbose +@opindex -v +@opindex --verbose +Output a diagnostic for every file processed. +If a symbolic link is encountered during a recursive traversal +on a system without the @code{lchown} system call, and @samp{--no-dereference} +is in effect, then issue a diagnostic saying neither the symbolic link nor +its referent is being changed. + +@item -R +@itemx --recursive +@opindex -R +@opindex --recursive +@cindex recursively changing group ownership +Recursively change the group ownership of directories and their contents. + +@end table + + +@node chmod invocation +@section @code{chmod}: Change access permissions + +@pindex chmod +@cindex changing access permissions +@cindex access permissions, changing +@cindex permissions, changing access + +@code{chmod} changes the access permissions of the named files. Synopsis: + +@example +chmod [@var{option}]@dots{} @{@var{mode} | --reference=@var{ref_file}@} @var{file}@dots{} +@end example + +@cindex symbolic links, permissions of +@code{chmod} never changes the permissions of symbolic links, since +the @code{chmod} system call cannot change their permissions. 
+This is not a problem since the permissions of symbolic links are +never used. However, for each symbolic link listed on the command +line, @code{chmod} changes the permissions of the pointed-to file. +In contrast, @code{chmod} ignores symbolic links encountered during +recursive directory traversals. + +If used, @var{mode} specifies the new permissions. +For details, see the section on @ref{File permissions}. + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp + +@item -c +@itemx --changes +@opindex -c +@opindex --changes +Verbosely describe the action for each @var{file} whose permissions +actually change. + +@item -f +@itemx --silent +@itemx --quiet +@opindex -f +@opindex --silent +@opindex --quiet +@cindex error messages, omitting +Do not print error messages about files whose permissions cannot be +changed. + +@item -v +@itemx --verbose +@opindex -v +@opindex --verbose +Verbosely describe the action or non-action taken for every @var{file}. + +@item --reference=@var{ref_file} +@opindex --reference +Change the mode of each @var{file} to be the same as that of @var{ref_file}. +@xref{File permissions}. +If @var{ref_file} is a symbolic link, do not use the mode +of the symbolic link, but rather that of the file it refers to. + +@item -R +@itemx --recursive +@opindex -R +@opindex --recursive +@cindex recursively changing access permissions +Recursively change permissions of directories and their contents. + +@end table + + +@node touch invocation +@section @code{touch}: Change file timestamps + +@pindex touch +@cindex changing file timestamps +@cindex file timestamps, changing +@cindex timestamps, changing file + +@code{touch} changes the access and/or modification times of the +specified files.
Synopsis: + +@example +touch [@var{option}]@dots{} @var{file}@dots{} +@end example + +If the first @var{file} would be a valid argument to the @samp{-t} +option and no timestamp is given with any of the @samp{-d}, @samp{-r}, +or @samp{-t} options and the @samp{--} argument is not given, that +argument is interpreted as the time for the other files instead of +as a file name. Warning: this usage is obsolescent, and future versions +of @sc{posix} will require that support for it be withdrawn. Use +@option{-t} instead. + +@cindex empty files, creating +Any @var{file} that does not exist is created empty. + +@cindex permissions, for changing file timestamps +If changing both the access and modification times to the current +time, @code{touch} can change the timestamps for files that the user +running it does not own but has write permission for. Otherwise, the +user must own the files. + +Although @code{touch} provides options for changing two of the times -- +the times of last access and modification -- of a file, there is actually +a third one as well: the inode change time. This is often referred to +as a file's @code{ctime}. +The inode change time represents the time when the file's meta-information +last changed. One common example of this is when the permissions of a +file change. Changing the permissions doesn't access the file, so +the atime doesn't change, nor does it modify the file, so the mtime +doesn't change. Yet, something about the file itself has changed, +and this must be noted somewhere. This is the job of the ctime field. +This is necessary, so that, for example, a backup program can make a +fresh copy of the file, including the new permissions value. +Another operation that modifies a file's ctime without affecting +the others is renaming. In any case, it is not possible, in normal +operations, for a user to change the ctime field to a user-specified value. + +The program accepts the following options. Also see @ref{Common options}. 
+ +@table @samp + +@item -a +@itemx --time=atime +@itemx --time=access +@itemx --time=use +@opindex -a +@opindex --time +@opindex atime@r{, changing} +@opindex access @r{time, changing} +@opindex use @r{time, changing} +Change the access time only. + +@item -c +@itemx --no-create +@opindex -c +@opindex --no-create +Do not create files that do not exist. + +@item -d +@itemx --date=time +@opindex -d +@opindex --date +@opindex time +Use @var{time} instead of the current time. It can contain month names, +time zones, @samp{am} and @samp{pm}, etc. @xref{Date input formats}. + +@item -f +@opindex -f +@cindex BSD @code{touch} compatibility +Ignored; for compatibility with BSD versions of @code{touch}. + +@item -m +@itemx --time=mtime +@itemx --time=modify +@opindex -m +@opindex --time +@opindex mtime@r{, changing} +@opindex modify @r{time, changing} +Change the modification time only. + +@item -r @var{file} +@itemx --reference=@var{file} +@opindex -r +@opindex --reference +Use the times of the reference @var{file} instead of the current time. + +@item -t [[CC]YY]MMDDhhmm[.ss] +Use the argument (optional four-digit or two-digit years, months, +days, hours, minutes, optional seconds) instead of the current time. +If the year is specified with only two digits, then @var{CC} +is 20 for years in the range 0 @dots{} 68, and 19 for years in +69 @dots{} 99. If no digits of the year are specified, +the argument is interpreted as a date in the current year. + +@end table + + +@node Disk usage +@chapter Disk usage + +@cindex disk usage + +No disk can hold an infinite amount of data. These commands report on +how much disk storage is in use or available. (This has nothing much to +do with how much @emph{main memory}, i.e., RAM, a program is using when +it runs; for that, you want @code{ps} or @code{pstat} or @code{swap} +or some such command.) + +@menu +* df invocation:: Report filesystem disk space usage. +* du invocation:: Estimate file space usage. 
+* sync invocation:: Synchronize memory and disk. +@end menu + + +@node df invocation +@section @code{df}: Report filesystem disk space usage + +@pindex df +@cindex filesystem disk usage +@cindex disk usage by filesystem + +@code{df} reports the amount of disk space used and available on +filesystems. Synopsis: + +@example +df [@var{option}]@dots{} [@var{file}]@dots{} +@end example + +With no arguments, @code{df} reports the space used and available on all +currently mounted filesystems (of all types). Otherwise, @code{df} +reports on the filesystem containing each argument @var{file}. + +Normally the disk space is printed in units of +1024 bytes, but this can be overridden (@pxref{Block size}). + +@cindex disk device file +@cindex device file, disk +If an argument @var{file} is a disk device file containing a mounted +filesystem, @code{df} shows the space available on that filesystem +rather than on the filesystem containing the device node (i.e., the root +filesystem). @sc{gnu} @code{df} does not attempt to determine the disk usage +on unmounted filesystems, because on most kinds of systems doing so +requires extremely nonportable intimate knowledge of filesystem +structures. + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp + +@item -a +@itemx --all +@opindex -a +@opindex --all +@cindex automounter filesystems +@cindex ignore filesystems +Include in the listing filesystems that have a size of 0 blocks, which +are omitted by default. Such filesystems are typically special-purpose +pseudo-filesystems, such as automounter entries. Also, filesystems of +type ``ignore'' or ``auto'', supported by some operating systems, are +only included if this option is specified. + +@item -h +@itemx --human-readable +@opindex -h +@opindex --human-readable +@cindex human-readable output +Append a size letter such as @samp{M} for megabytes to each size. +Powers of 1024 are used, not 1000; @samp{M} stands for 1,048,576 bytes. 
+Use the @samp{-H} or @samp{--si} option if you prefer powers of 1000. + +@item -H +@itemx --si +@opindex -H +@opindex --si +@cindex SI output +Append a size letter such as @samp{M} for megabytes to each size. (SI +is the International System of Units, which defines these letters as +prefixes.) Powers of 1000 are used, not 1024; @samp{M} stands for +1,000,000 bytes. Use the @samp{-h} or @samp{--human-readable} option if +you prefer powers of 1024. + +@item -i +@itemx --inodes +@opindex -i +@opindex --inodes +@cindex inode usage +List inode usage information instead of block usage. An inode (short +for index node) contains information about a file such as its owner, +permissions, timestamps, and location on the disk. + +@item -k +@itemx --kilobytes +@opindex -k +@opindex --kilobytes +@cindex kilobytes for filesystem sizes +Print sizes in 1024-byte blocks, overriding the default block size +(@pxref{Block size}). + +@item -l +@itemx --local +@opindex -l +@opindex --local +@cindex filesystem types, limiting output to certain +Limit the listing to local filesystems. By default, remote filesystems +are also listed. + +@item -m +@itemx --megabytes +@opindex -m +@opindex --megabytes +@cindex megabytes for filesystem sizes +Print sizes in megabyte (that is, 1,048,576-byte) blocks. + +@item --no-sync +@opindex --no-sync +@cindex filesystem space, retrieving old data more quickly +Do not invoke the @code{sync} system call before getting any usage data. +This may make @code{df} run significantly faster on systems with many +disks, but on some systems (notably SunOS) the results may be slightly +out of date. This is the default. + +@item -P +@itemx --portability +@opindex -P +@opindex --portability +@cindex one-line output format +@cindex @sc{posix} output format +@cindex portable output format +@cindex output format, portable +Use the @sc{posix} output format. 
This is like the default format except +for the following: + +@enumerate +@item +The information about each filesystem is always printed on exactly +one line; a mount device is never put on a line by itself. This means +that if the mount device name is more than 20 characters long (e.g., for +some network mounts), the columns are misaligned. + +@item +Non-integer values are rounded up, instead of being rounded down or +rounded to the nearest integer. + +@item +The labels in the header output line are changed to conform to @sc{posix}. +@end enumerate + +@item --sync +@opindex --sync +@cindex filesystem space, retrieving current data more slowly +Invoke the @code{sync} system call before getting any usage data. On +some systems (notably SunOS), doing this yields more up to date results, +but in general this option makes @code{df} much slower, especially when +there are many or very busy filesystems. + +@item -t @var{fstype} +@itemx --type=@var{fstype} +@opindex -t +@opindex --type +@cindex filesystem types, limiting output to certain +Limit the listing to filesystems of type @var{fstype}. Multiple +filesystem types can be specified by giving multiple @samp{-t} options. +By default, nothing is omitted. + +@item -T +@itemx --print-type +@opindex -T +@opindex --print-type +@cindex filesystem types, printing +Print each filesystem's type. The types printed here are the same ones +you can include or exclude with @samp{-t} and @samp{-x}. The particular +types printed are whatever is supported by the system. Here are some of +the common names (this list is certainly not exhaustive): + +@table @samp + +@item nfs +@cindex NFS filesystem type +An NFS filesystem, i.e., one mounted over a network from another +machine. This is the one type name which seems to be used uniformly by +all systems. 
+ +@item 4.2@r{, }ufs@r{, }efs@dots{} +@cindex Linux filesystem types +@cindex local filesystem types +@opindex 4.2 @r{filesystem type} +@opindex ufs @r{filesystem type} +@opindex efs @r{filesystem type} +A filesystem on a locally-mounted hard disk. (The system might even +support more than one type here; Linux does.) + +@item hsfs@r{, }cdfs +@cindex CD-ROM filesystem type +@cindex High Sierra filesystem +@opindex hsfs @r{filesystem type} +@opindex cdfs @r{filesystem type} +A filesystem on a CD-ROM drive. HP-UX uses @samp{cdfs}, most other +systems use @samp{hsfs} (@samp{hs} for ``High Sierra''). + +@item pcfs +@cindex PC filesystem +@cindex DOS filesystem +@cindex MS-DOS filesystem +@cindex diskette filesystem +@opindex pcfs +An MS-DOS filesystem, usually on a diskette. + +@end table + +@item -x @var{fstype} +@itemx --exclude-type=@var{fstype} +@opindex -x +@opindex --exclude-type +Limit the listing to filesystems not of type @var{fstype}. +Multiple filesystem types can be eliminated by giving multiple +@samp{-x} options. By default, no filesystem types are omitted. + +@item -v +Ignored; for compatibility with System V versions of @code{df}. + +@end table + + +@node du invocation +@section @code{du}: Estimate file space usage + +@pindex du +@cindex file space usage +@cindex disk usage for files + +@code{du} reports the amount of disk space used by the specified files +and for each subdirectory (of directory arguments). Synopsis: + +@example +du [@var{option}]@dots{} [@var{file}]@dots{} +@end example + +With no arguments, @code{du} reports the disk space for the current +directory. Normally the disk space is printed in units of +1024 bytes, but this can be overridden (@pxref{Block size}). + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp + +@item -a +@itemx --all +@opindex -a +@opindex --all +Show counts for all files, not just directories. 
+ +@item -b +@itemx --bytes +@opindex -b +@opindex --bytes +Print sizes in bytes, overriding the default block size (@pxref{Block size}). + +@item -c +@itemx --total +@opindex -c +@opindex --total +@cindex grand total of disk space +Print a grand total of all arguments after all arguments have +been processed. This can be used to find out the total disk usage of +a given set of files or directories. + +@item -D +@itemx --dereference-args +@opindex -D +@opindex --dereference-args +Dereference symbolic links that are command line arguments. +Does not affect other symbolic links. This is helpful for finding +out the disk usage of directories, such as @file{/usr/tmp}, which +are often symbolic links. + +@item -h +@itemx --human-readable +@opindex -h +@opindex --human-readable +@cindex human-readable output +Append a size letter such as @samp{M} for megabytes to each size. +Powers of 1024 are used, not 1000; @samp{M} stands for 1,048,576 bytes. +Use the @samp{-H} or @samp{--si} option if you prefer powers of 1000. + +@item -H +@itemx --si +@opindex -H +@opindex --si +@cindex SI output +Append a size letter such as @samp{M} for megabytes to each size. (SI +is the International System of Units, which defines these letters as +prefixes.) Powers of 1000 are used, not 1024; @samp{M} stands for +1,000,000 bytes. Use the @samp{-h} or @samp{--human-readable} option if +you prefer powers of 1024. + +@item -k +@itemx --kilobytes +@opindex -k +@opindex --kilobytes +Print sizes in 1024-byte blocks, overriding the default block size +(@pxref{Block size}). + +@item -l +@itemx --count-links +@opindex -l +@opindex --count-links +@cindex hard links, counting in @code{du} +Count the size of all files, even if they have appeared already (as a +hard link). 
+ +@item -L +@itemx --dereference +@opindex -L +@opindex --dereference +@cindex symbolic links, dereferencing in @code{du} +Dereference symbolic links (show the disk space used by the file +or directory that the link points to instead of the space used by +the link). + +@item --max-depth=@var{DEPTH} +@opindex --max-depth=@var{DEPTH} +@cindex limiting output of @code{du} +Show the total for each directory (and file, if @samp{--all} is given) that is at +most @var{DEPTH} levels down from the root of the hierarchy. The root +is at level 0, so @code{du --max-depth=0} is equivalent to @code{du -s}. + +@item -m +@itemx --megabytes +@opindex -m +@opindex --megabytes +@cindex megabytes for filesystem sizes +Print sizes in megabyte (that is, 1,048,576-byte) blocks. + +@item -s +@itemx --summarize +@opindex -s +@opindex --summarize +Display only a total for each argument. + +@item -S +@itemx --separate-dirs +@opindex -S +@opindex --separate-dirs +Report the size of each directory separately, not including the sizes +of subdirectories. + +@item -x +@itemx --one-file-system +@opindex -x +@opindex --one-file-system +@cindex one filesystem, restricting @code{du} to +Skip directories that are on different filesystems from the one that +the argument being processed is on. + +@item --exclude=@var{PAT} +@opindex --exclude=@var{PAT} +@cindex excluding files from @code{du} +When recursing, skip subdirectories or files matching @var{PAT}. +For example, @code{du --exclude='*.o'} excludes files whose names +end in @samp{.o}. + +@item -X @var{FILE} +@itemx --exclude-from=@var{FILE} +@opindex -X @var{FILE} +@opindex --exclude-from=@var{FILE} +@cindex excluding files from @code{du} +Like @samp{--exclude}, except take the patterns to exclude from @var{FILE}, +one per line. If @var{FILE} is @samp{-}, take the patterns from standard +input.
+ +@end table + +@cindex NFS mounts from BSD to HP-UX +On BSD systems, @code{du} reports sizes that are half the correct +values for files that are NFS-mounted from HP-UX systems. On HP-UX +systems, it reports sizes that are twice the correct values for +files that are NFS-mounted from BSD systems. This is due to a flaw +in HP-UX; it also affects the HP-UX @code{du} program. + + +@node sync invocation +@section @code{sync}: Synchronize data on disk with memory + +@pindex sync +@cindex synchronize disk and memory + +@cindex superblock, writing +@cindex inodes, written buffered +@code{sync} writes any data buffered in memory out to disk. This can +include (but is not limited to) modified superblocks, modified inodes, +and delayed reads and writes. This must be implemented by the kernel; +the @code{sync} program does nothing but exercise the @code{sync} system +call. + +@cindex crashes and corruption +The kernel keeps data in memory to avoid doing (relatively slow) disk +reads and writes. This improves performance, but if the computer +crashes, data may be lost or the filesystem corrupted as a +result. @code{sync} ensures everything in memory is written to disk. + +Any arguments are ignored, except for a lone @samp{--help} or +@samp{--version} (@pxref{Common options}). + +@node Printing text +@chapter Printing text + +@cindex printing text, commands for +@cindex commands for printing text + +This section describes commands that display text strings. + +@menu +* echo invocation:: Print a line of text. +* printf invocation:: Format and print data. +* yes invocation:: Print a string until interrupted. +@end menu + + +@node echo invocation +@section @code{echo}: Print a line of text + +@pindex echo +@cindex displaying text +@cindex printing text +@cindex text, displaying +@cindex arbitrary text, displaying + +@code{echo} writes each given @var{string} to standard output, with a +space between each and a newline after the last one.
Synopsis: + +@example +echo [@var{option}]@dots{} [@var{string}]@dots{} +@end example + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp +@item -n +@opindex -n +Do not output the trailing newline. + +@item -e +@opindex -e +@cindex backslash escapes +Enable interpretation of the following backslash-escaped characters in +each @var{string}: + +@table @samp +@item \a +alert (bell) +@item \b +backspace +@item \c +suppress trailing newline +@item \f +form feed +@item \n +new line +@item \r +carriage return +@item \t +horizontal tab +@item \v +vertical tab +@item \\ +backslash +@item \@var{nnn} +the character whose ASCII code is @var{nnn} (octal); if @var{nnn} is not +a valid octal number, it is printed literally. +@end table + +@end table + + +@node printf invocation +@section @code{printf}: Format and print data + +@pindex printf +@code{printf} does formatted printing of text. Synopsis: + +@example +printf @var{format} [@var{argument}]@dots{} +@end example + +@code{printf} prints the @var{format} string, interpreting @samp{%} +directives and @samp{\} escapes in the same way as the C @code{printf} +function. The @var{format} argument is re-used as necessary to convert +all of the given @var{argument}s. + +@code{printf} has one additional directive, @samp{%b}, which prints its +argument string with @samp{\} escapes interpreted in the same way as in +the @var{format} string. + +@kindex \0ooo +@kindex \0xhhh +@code{printf} interprets @samp{\0ooo} in @var{format} as an octal number +(if @var{ooo} is 0 to 3 octal digits) specifying a character to print, +and @samp{\xhhh} as a hexadecimal number (if @var{hhh} is 1 to 3 hex +digits) specifying a character to print. 
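The format re-use and the @samp{%b} directive described above can be sketched as follows. This is only an illustration: the variable names `pairs` and `tabbed` and the argument values are made up for the example, and a shell built-in `printf` may run in place of the program documented here, although both should treat these two cases alike.

```shell
# The format holds two %s directives, so it is applied once per pair of
# arguments until all four arguments have been consumed.
pairs=$(printf '%s=%s\n' user alice shell sh)

# %b interprets backslash escapes that appear in the argument itself,
# so the \t inside the argument becomes a real tab character.
tabbed=$(printf '%b' 'left\tright')

printf '%s\n' "$pairs" "$tabbed"
```

With plain @samp{%s} instead of @samp{%b}, the second call would leave the backslash and the `t` in the output unchanged.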
+ +@kindex \uhhhh +@kindex \Uhhhhhhhh +@code{printf} interprets two character syntaxes introduced in ISO C 99: +@samp{\u} for 16-bit Unicode characters, specified as 4 hex digits +@var{hhhh}, and @samp{\U} for 32-bit Unicode characters, specified as 8 hex +digits @var{hhhhhhhh}. @code{printf} outputs the Unicode characters +according to the LC_CTYPE part of the current locale, i.e. depending +on the values of the environment variables @code{LC_ALL}, @code{LC_CTYPE}, +@code{LANG}. + +The processing of @samp{\u} and @samp{\U} requires a full-featured +@code{iconv} facility. It is activated on systems with glibc 2.2 (or newer), +or when @code{libiconv} is installed prior to the sh-utils. Otherwise the +use of @samp{\u} and @samp{\U} will give an error message. + +@kindex \c +An additional escape, @samp{\c}, causes @code{printf} to produce no +further output. + +The only options are a lone @samp{--help} or +@samp{--version}. @xref{Common options}. + +The Unicode character syntaxes are useful for writing strings in a locale +independent way. For example, a string containing the Euro currency symbol + +@example +$ /usr/local/bin/printf '\u20AC 14.95' +@end example + +@noindent +will be output correctly in all locales supporting the Euro symbol +(ISO-8859-15, UTF-8, and others). Similarly, a Chinese string + +@example +$ /usr/local/bin/printf '\u4e2d\u6587' +@end example + +@noindent +will be output correctly in all Chinese locales (GB2312, BIG5, UTF-8, etc). + +Note that in these examples, the full pathname of @code{printf} has been +given, to distinguish it from the GNU @code{bash} builtin function +@code{printf}. + +For larger strings, you don't need to look up the hexadecimal code +values of each character one by one. ASCII characters mixed with \u +escape sequences is also known as the JAVA source file encoding. You can +use GNU recode 3.5c (or newer) to convert strings to this encoding. 
Here +is how to convert a piece of text into a shell script which will output +this text in a locale-independent way: + +@smallexample +$ LC_CTYPE=zh_CN.big5 /usr/local/bin/printf \ + '\u4e2d\u6587\n' > sample.txt +$ recode BIG5..JAVA < sample.txt \ + | sed -e "s|^|/usr/local/bin/printf '|" -e "s|$|\\\\n'|" \ + > sample.sh +@end smallexample + + +@node yes invocation +@section @code{yes}: Print a string until interrupted + +@pindex yes +@cindex repeated output of a string + +@code{yes} prints the command line arguments, separated by spaces and +followed by a newline, forever until it is killed. If no arguments are +given, it prints @samp{y} followed by a newline forever until killed. + +The only options are a lone @samp{--help} or @samp{--version}. +@xref{Common options}. + + +@node Conditions +@chapter Conditions + +@cindex conditions +@cindex commands for exit status +@cindex exit status commands + +This section describes commands that are primarily useful for their exit +status, rather than their output. Thus, they are often used as the +condition of shell @code{if} statements, or as the last command in a +pipeline. + +@menu +* false invocation:: Do nothing, unsuccessfully. +* true invocation:: Do nothing, successfully. +* test invocation:: Check file types and compare values. +* expr invocation:: Evaluate expressions. +@end menu + + +@node false invocation +@section @code{false}: Do nothing, unsuccessfully + +@pindex false +@cindex do nothing, unsuccessfully +@cindex failure exit status +@cindex exit status of @code{false} + +@code{false} does nothing except return an exit status of 1, meaning +@dfn{failure}. It can be used as a place holder in shell scripts +where an unsuccessful command is needed. + +By default, @code{false} honors the @samp{--help} and @samp{--version} +options. 
However, that is contrary to @sc{posix}, so when the environment +variable @env{POSIXLY_CORRECT} is set, @code{false} ignores @emph{all} +command line arguments, including @samp{--help} and @samp{--version}. + +This version of @code{false} is implemented as a C program, and is thus +more secure and faster than a shell script implementation, and may safely +be used as a dummy shell for the purpose of disabling accounts. + + +@node true invocation +@section @code{true}: Do nothing, successfully + +@pindex true +@cindex do nothing, successfully +@cindex no-op +@cindex successful exit +@cindex exit status of @code{true} + +@code{true} does nothing except return an exit status of 0, meaning +@dfn{success}. It can be used as a place holder in shell scripts +where a successful command is needed, although the shell built-in +command @code{:} (colon) may do the same thing faster. +In most modern shells, @code{true} is a built-in command, so when +you use @samp{true} in a script, you're probably using the built-in +command, not the one documented here. + +By default, @code{true} honors the @samp{--help} and @samp{--version} +options. However, that is contrary to @sc{posix}, so when the environment +variable @env{POSIXLY_CORRECT} is set, @code{true} ignores @emph{all} +command line arguments, including @samp{--help} and @samp{--version}. + +This version of @code{true} is implemented as a C program, and is thus +more secure and faster than a shell script implementation, and may safely +be used as a dummy shell for the purpose of disabling accounts. + +@node test invocation +@section @code{test}: Check file types and compare values + +@pindex test +@cindex check file types +@cindex compare values +@cindex expression evaluation + +@code{test} returns a status of 0 (true) or 1 (false) depending on the +evaluation of the conditional expression @var{expr}. Each part of the +expression must be a separate argument. 
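A minimal sketch of why each part of the expression has to be its own argument. The variable names `a`, `b`, `separate` and `fused` are illustrative only, and a shell built-in `test` may run instead of this program; the two cases shown should behave the same either way.

```shell
a=10
b=3

# Three separate arguments: a genuine numeric comparison (10 < 3 is false).
if test "$a" -lt "$b"; then separate=true; else separate=false; fi

# One fused argument, "10-lt3": test sees a single non-null string,
# which counts as true regardless of the numbers inside it.
if test "$a-lt$b"; then fused=true; else fused=false; fi

echo "separate=$separate fused=$fused"
```

Quoting the operands individually, as in the first call, also keeps an empty variable from silently removing an argument.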
+ +@code{test} has file status checks, string operators, and numeric +comparison operators. + +@cindex conflicts with shell built-ins +@cindex built-in shell commands, conflicts with +Because most shells have a built-in command by the same name, using the +unadorned command name in a script or interactively may get you +different functionality than that described here. + +Besides the options below, @code{test} accepts a lone @samp{--help} or +@samp{--version}. @xref{Common options}. A single non-option argument +is also allowed: @code{test} returns true if the argument is not null. + +@menu +* File type tests:: -[bcdfhLpSt] +* Access permission tests:: -[gkruwxOG] +* File characteristic tests:: -e -s -nt -ot -ef +* String tests:: -z -n = != +* Numeric tests:: -eq -ne -lt -le -gt -ge +* Connectives for test:: ! -a -o +@end menu + + +@node File type tests +@subsection File type tests + +@cindex file type tests + +These options test for particular types of files. (Everything's a file, +but not all files are the same!) + +@table @samp + +@item -b @var{file} +@opindex -b +@cindex block special check +True if @var{file} exists and is a block special device. + +@item -c @var{file} +@opindex -c +@cindex character special check +True if @var{file} exists and is a character special device. + +@item -d @var{file} +@opindex -d +@cindex directory check +True if @var{file} exists and is a directory. + +@item -f @var{file} +@opindex -f +@cindex regular file check +True if @var{file} exists and is a regular file. + +@item -h @var{file} +@itemx -L @var{file} +@opindex -L +@opindex -h +@cindex symbolic link check +True if @var{file} exists and is a symbolic link. + +@item -p @var{file} +@opindex -p +@cindex named pipe check +True if @var{file} exists and is a named pipe. + +@item -S @var{file} +@opindex -S +@cindex socket check +True if @var{file} exists and is a socket. + +@item -t [@var{fd}] +@opindex -t +@cindex terminal check +True if @var{fd} is opened on a terminal. 
If @var{fd} is omitted, it +defaults to 1 (standard output). + +@end table + + +@node Access permission tests +@subsection Access permission tests + +@cindex access permission tests +@cindex permission tests + +These options test for particular access permissions. + +@table @samp + +@item -g @var{file} +@opindex -g +@cindex set-group-id check +True if @var{file} exists and has its set-group-id bit set. + +@item -k @var{file} +@opindex -k +@cindex sticky bit check +True if @var{file} has its @dfn{sticky} bit set. + +@item -r @var{file} +@opindex -r +@cindex readable file check +True if @var{file} exists and is readable. + +@item -u @var{file} +@opindex -u +@cindex set-user-id check +True if @var{file} exists and has its set-user-id bit set. + +@item -w @var{file} +@opindex -w +@cindex writable file check +True if @var{file} exists and is writable. + +@item -x @var{file} +@opindex -x +@cindex executable file check +True if @var{file} exists and is executable. + +@item -O @var{file} +@opindex -O +@cindex owned by effective uid check +True if @var{file} exists and is owned by the current effective user id. + +@item -G @var{file} +@opindex -G +@cindex owned by effective gid check +True if @var{file} exists and is owned by the current effective group id. + +@end table + +@node File characteristic tests +@subsection File characteristic tests + +@cindex file characteristic tests + +These options test other file characteristics. + +@table @samp + +@item -e @var{file} +@opindex -e +@cindex existence-of-file check +True if @var{file} exists. + +@item -s @var{file} +@opindex -s +@cindex nonempty file check +True if @var{file} exists and has a size greater than zero. + +@item @var{file1} -nt @var{file2} +@opindex -nt +@cindex newer-than file check +True if @var{file1} is newer (according to modification date) than +@var{file2}. 
+ +@item @var{file1} -ot @var{file2} +@opindex -ot +@cindex older-than file check +True if @var{file1} is older (according to modification date) than +@var{file2}. + +@item @var{file1} -ef @var{file2} +@opindex -ef +@cindex same file check +@cindex hard link check +True if @var{file1} and @var{file2} have the same device and inode +numbers, i.e., if they are hard links to each other. + +@end table + + +@node String tests +@subsection String tests + +@cindex string tests + +These options test string characteristics. Strings are not quoted for +@code{test}, though you may need to quote them to protect characters +with special meaning to the shell, e.g., spaces. + +@table @samp + +@item -z @var{string} +@opindex -z +@cindex zero-length string check +True if the length of @var{string} is zero. + +@item -n @var{string} +@itemx @var{string} +@opindex -n +@cindex nonzero-length string check +True if the length of @var{string} is nonzero. + +@item @var{string1} = @var{string2} +@opindex = +@cindex equal string check +True if the strings are equal. + +@item @var{string1} != @var{string2} +@opindex != +@cindex not-equal string check +True if the strings are not equal. + +@end table + + +@node Numeric tests +@subsection Numeric tests + +@cindex numeric tests +@cindex arithmetic tests + +Numeric relationals. The arguments must be entirely numeric (possibly +negative), or the special expression @w{@code{-l @var{string}}}, which +evaluates to the length of @var{string}. + +@table @samp + +@item @var{arg1} -eq @var{arg2} +@itemx @var{arg1} -ne @var{arg2} +@itemx @var{arg1} -lt @var{arg2} +@itemx @var{arg1} -le @var{arg2} +@itemx @var{arg1} -gt @var{arg2} +@itemx @var{arg1} -ge @var{arg2} +@opindex -eq +@opindex -ne +@opindex -lt +@opindex -le +@opindex -gt +@opindex -ge +These arithmetic binary operators return true if @var{arg1} is equal to, +not equal to, less than, less than or equal to, greater than, or +greater than or equal to @var{arg2}, respectively.
+ +@end table + +For example: + +@example +test -1 -gt -2 && echo yes +@result{} yes +test -l abc -gt 1 && echo yes +@result{} yes +test 0x100 -eq 1 +@error{} test: integer expression expected before -eq +@end example + + +@node Connectives for test +@subsection Connectives for @code{test} + +@cindex logical connectives +@cindex connectives, logical + +The usual logical connectives. + +@table @samp + +@item ! @var{expr} +@opindex ! +True if @var{expr} is false. + +@item @var{expr1} -a @var{expr2} +@opindex -a +@cindex logical and operator +@cindex and operator +True if both @var{expr1} and @var{expr2} are true. + +@item @var{expr1} -o @var{expr2} +@opindex -o +@cindex logical or operator +@cindex or operator +True if either @var{expr1} or @var{expr2} is true. + +@end table + + +@node expr invocation +@section @code{expr}: Evaluate expressions + +@pindex expr +@cindex expression evaluation +@cindex evaluation of expressions + +@code{expr} evaluates an expression and writes the result on standard +output. Each token of the expression must be a separate argument. + +Operands are either numbers or strings. @code{expr} converts +anything appearing in an operand position to an integer or a string +depending on the operation being applied to it. + +Strings are not quoted for @code{expr} itself, though you may need to +quote them to protect characters with special meaning to the shell, +e.g., spaces. + +@cindex parentheses for grouping +Operators may be given as infix symbols or prefix keywords. Parentheses +may be used for grouping in the usual manner (you must quote parentheses +to avoid the shell evaluating them, however). + +@cindex exit status of @code{expr} +Exit status: + +@display +0 if the expression is neither null nor 0, +1 if the expression is null or 0, +2 for invalid expressions.
+@end display + +@menu +* String expressions:: <colon> match substr index length quote +* Numeric expressions:: + - * / % +* Relations for expr:: | & < <= = == != >= > +* Examples of expr:: Examples. +@end menu + + +@node String expressions +@subsection String expressions + +@cindex string expressions +@cindex expressions, string + +@code{expr} supports pattern matching and other string operators. These +have higher precedence than both the numeric and relational operators (in +the next sections). + +@table @samp + +@item @var{string} : @var{regex} +@cindex pattern matching +@cindex regular expression matching +@cindex matching patterns +Perform pattern matching. The arguments are converted to strings and the +second is considered to be a (basic, a la GNU @code{grep}) regular +expression, with a @code{^} implicitly prepended. The first argument is +then matched against this regular expression. + +If the match succeeds and @var{regex} uses @samp{\(} and @samp{\)}, the +@code{:} expression returns the part of @var{string} that matched the +subexpression; otherwise, it returns the number of characters matched. + +If the match fails, the @code{:} operator returns the null string if +@samp{\(} and @samp{\)} are used in @var{regex}, otherwise 0. + +@kindex \( @r{regexp operator} +Only the first @samp{\( @dots{} \)} pair is relevant to the return +value; additional pairs are meaningful only for grouping the regular +expression operators. + +@kindex \+ @r{regexp operator} +@kindex \? @r{regexp operator} +@kindex \| @r{regexp operator} +In the regular expression, @code{\+}, @code{\?}, and @code{\|} are +operators which respectively match one or more, zero or one, or separate +alternatives. SunOS and other @code{expr}'s treat these as regular +characters. (@sc{posix} allows either behavior.) +@xref{Top, , Regular Expression Library, regex, Regex}, for details of +regular expression syntax. Some examples are in @ref{Examples of expr}.
+ +@item match @var{string} @var{regex} +@findex match +An alternative way to do pattern matching. This is the same as +@w{@samp{@var{string} : @var{regex}}}. + +@item substr @var{string} @var{position} @var{length} +@findex substr +Returns the substring of @var{string} beginning at @var{position} +with length at most @var{length}. If either @var{position} or +@var{length} is negative, zero, or non-numeric, returns the null string. + +@item index @var{string} @var{charset} +@findex index +Returns the first position in @var{string} where the first character in +@var{charset} was found. If no character in @var{charset} is found in +@var{string}, returns 0. + +@item length @var{string} +@findex length +Returns the length of @var{string}. + +@item quote @var{token} +@findex quote +Interpret @var{token} as a string, even if it is a keyword like @code{match} +or an operator like @code{/}. +This makes it possible to test @code{expr length quote "$x"} or +@code{expr quote "$x" : '.*/\(.\)'} and have it do the right thing even if +the value of @code{$x} happens to be (for example) @code{/} or @code{index}. +This operator is a GNU extension. It is disabled when +the environment variable @env{POSIXLY_CORRECT} is set. + +@end table + +To make @code{expr} interpret keywords as strings, you must use the +@code{quote} operator. + + +@node Numeric expressions +@subsection Numeric expressions + +@cindex numeric expressions +@cindex expressions, numeric + +@code{expr} supports the usual numeric operators, in order of increasing +precedence. The string operators (previous section) have higher precedence, +the connectives (next section) have lower. + +@table @samp + +@item + - +@kindex + +@kindex - +@cindex addition +@cindex subtraction +Addition and subtraction. Both arguments are converted to numbers; +an error occurs if this cannot be done. + +@item * / % +@kindex * +@kindex / +@kindex % +@cindex multiplication +@cindex division +@cindex remainder +Multiplication, division, remainder.
Both arguments are converted to +numbers; an error occurs if this cannot be done. + +@end table + + +@node Relations for expr +@subsection Relations for @code{expr} + +@cindex connectives, logical +@cindex logical connectives +@cindex relations, numeric or string + +@code{expr} supports the usual logical connectives and relations. These +have lower precedence than both the string and numeric operators +(previous sections). Here is the list, lowest-precedence operator first. + +@table @samp + +@item | +@kindex | +@cindex logical or operator +@cindex or operator +Returns its first argument if that is neither null nor 0, otherwise its +second argument. + +@item & +@kindex & +@cindex logical and operator +@cindex and operator +Returns its first argument if neither argument is null or 0, otherwise +0. + +@item < <= = == != >= > +@kindex < +@kindex <= +@kindex = +@kindex == +@kindex > +@kindex >= +@cindex comparison operators +Compare the arguments and return 1 if the relation is true, 0 otherwise. +@code{==} is a synonym for @code{=}. @code{expr} first tries to convert +both arguments to numbers and do a numeric comparison; if either +conversion fails, it does a lexicographic comparison. + +@end table + + +@node Examples of expr +@subsection Examples of using @code{expr} + +@cindex examples of @code{expr} +Here are a few examples, including quoting for shell metacharacters. + +To add 1 to the shell variable @code{foo}, in Bourne-compatible shells: +@example +foo=`expr $foo + 1` +@end example + +To print the non-directory part of the file name stored in +@code{$fname}, which need not contain a @code{/}:
+@example +expr $fname : '.*/\(.*\)' '|' $fname +@end example + +An example showing that @code{\+} is an operator: +@example +expr aaa : 'a\+' +@result{} 3 +@end example + +@example +expr abc : 'a\(.\)c' +@result{} b +expr index abcdef cz +@result{} 3 +expr index index a +@error{} expr: syntax error +expr index quote index a +@result{} 0 +@end example + + +@node Redirection +@chapter Redirection + +@cindex redirection +@cindex commands for redirection + +Unix shells commonly provide several forms of @dfn{redirection}---ways +to change the input source or output destination of a command. But one +useful redirection is performed by a separate command, not by the shell; +it's described here. + +@menu +* tee invocation:: Redirect output to multiple files. +@end menu + + +@node tee invocation +@section @code{tee}: Redirect output to multiple files + +@pindex tee +@cindex pipe fitting +@cindex destinations, multiple output +@cindex read from stdin and write to stdout and files + +The @code{tee} command copies standard input to standard output and also +to any files given as arguments. This is useful when you want not only +to send some data down a pipe, but also to save a copy. Synopsis: + +@example +tee [@var{option}]@dots{} [@var{file}]@dots{} +@end example + +If a file being written to does not already exist, it is created. If a +file being written to already exists, the data it previously contained +is overwritten unless the @code{-a} option is used. + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp +@item -a +@itemx --append +@opindex -a +@opindex --append +Append standard input to the given files rather than overwriting +them. + +@item -i +@itemx --ignore-interrupts +@opindex -i +@opindex --ignore-interrupts +Ignore interrupt signals.
+ +@end table + + +@node File name manipulation +@chapter File name manipulation + +@cindex file name manipulation +@cindex manipulation of file names +@cindex commands for file name manipulation + +This section describes commands that manipulate file names. + +@menu +* basename invocation:: Strip directory and suffix from a file name. +* dirname invocation:: Strip non-directory suffix from a file name. +* pathchk invocation:: Check file name portability. +@end menu + + +@node basename invocation +@section @code{basename}: Strip directory and suffix from a file name + +@pindex basename +@cindex strip directory and suffix from file names +@cindex directory, stripping from file names +@cindex suffix, stripping from file names +@cindex file names, stripping directory and suffix +@cindex leading directory components, stripping + +@code{basename} removes any leading directory components from +@var{name}. Synopsis: + +@example +basename @var{name} [@var{suffix}] +@end example + +If @var{suffix} is specified and is identical to the end of @var{name}, +it is removed from @var{name} as well. @code{basename} prints the +result on standard output. + +The only options are @samp{--help} and @samp{--version}. @xref{Common +options}. + + +@node dirname invocation +@section @code{dirname}: Strip non-directory suffix from a file name + +@pindex dirname +@cindex directory components, printing +@cindex stripping non-directory suffix +@cindex non-directory suffix, stripping + +@code{dirname} prints all but the final slash-delimited component of +a string (presumably a filename). Synopsis: + +@example +dirname @var{name} +@end example + +If @var{name} is a single component, @code{dirname} prints @samp{.} +(meaning the current directory). + +The only options are @samp{--help} and @samp{--version}. @xref{Common +options}. 
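As a hedged illustration of how `basename` and `dirname` complement each other, the sketch below splits one path into its parts. The path `/usr/local/lib/libexample.so` and the variable names are hypothetical, chosen only for the example; no such file needs to exist, since both commands work on the string alone.

```shell
path=/usr/local/lib/libexample.so   # hypothetical path, for illustration only

dir=$(dirname "$path")              # all but the final component
base=$(basename "$path")            # the final component
stem=$(basename "$path" .so)        # final component with the suffix removed
bare=$(dirname libexample.so)       # a single component has no directory part

printf '%s\n' "$dir" "$base" "$stem" "$bare"
```

Joining `$dir` and `$base` with a slash gives back the original path in this case, which is a convenient check when a script takes file names apart.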
+ + +@node pathchk invocation +@section @code{pathchk}: Check file name portability + +@pindex pathchk +@cindex file names, checking validity and portability +@cindex valid file names, checking for +@cindex portable file names, checking for + +@code{pathchk} checks portability of filenames. Synopsis: + +@example +pathchk [@var{option}]@dots{} @var{name}@dots{} +@end example + +For each @var{name}, @code{pathchk} prints a message if any of +these conditions is true: +@enumerate +@item +one of the existing directories in @var{name} does not have search +(execute) permission, +@item +the length of @var{name} is larger than its filesystem's maximum +file name length, +@item +the length of one component of @var{name}, corresponding to an +existing directory name, is larger than its filesystem's maximum +length for a file name component. +@end enumerate + +The program accepts the following option. Also see @ref{Common options}. + +@table @samp + +@item -p +@itemx --portability +@opindex -p +@opindex --portability +Instead of performing length checks on the underlying filesystem, +test the length of each file name and its components against the +@sc{posix.1} minimum limits for portability. Also check that the file +name contains no characters not in the portable file name character set. + +@end table + +@cindex exit status of @code{pathchk} +Exit status: + +@display +0 if all specified file names passed all of the tests, +1 otherwise. +@end display + + +@node Working context +@chapter Working context + +@cindex working context +@cindex commands for printing the working context + +This section describes commands that display or alter the context in +which you are working: the current directory, the terminal settings, and +so forth. See also the user-related commands in the next section. + +@menu +* pwd invocation:: Print working directory. +* stty invocation:: Print or change terminal characteristics. +* printenv invocation:: Print environment variables. 
+* tty invocation:: Print file name of terminal on standard input. +@end menu + + +@node pwd invocation +@section @code{pwd}: Print working directory + +@pindex pwd +@cindex print name of current directory +@cindex current working directory, printing +@cindex working directory, printing + +@cindex symbolic links and @code{pwd} +@code{pwd} prints the fully resolved name of the current directory. +That is, all components of the printed name will be actual directory +names---none will be symbolic links. + +@cindex conflicts with shell built-ins +@cindex built-in shell commands, conflicts with +Because most shells have a built-in command by the same name, using the +unadorned command name in a script or interactively may get you +different functionality than that described here. + +The only options are a lone @samp{--help} or +@samp{--version}. @xref{Common options}. + + +@node stty invocation +@section @code{stty}: Print or change terminal characteristics + +@pindex stty +@cindex change or print terminal settings +@cindex terminal settings +@cindex line settings of terminal + +@code{stty} prints or changes terminal characteristics, such as baud rate. +Synopses: + +@example +stty [@var{option}] [@var{setting}]@dots{} +stty [@var{option}] +@end example + +If given no line settings, @code{stty} prints the baud rate, line +discipline number (on systems that support it), and line settings +that have been changed from the values set by @samp{stty sane}. +By default, mode reading and setting are performed on the tty line +connected to standard input, although this can be modified by the +@samp{--file} option. + +@code{stty} accepts many non-option arguments that change aspects of +the terminal line operation, as described below. + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp +@item -a +@itemx --all +@opindex -a +@opindex --all +Print all current settings in human-readable form. 
This option may not +be used in combination with any line settings. + +@item -F @var{device} +@itemx --file=@var{device} +@opindex -F +@opindex --file +Set the line opened by the filename specified in @var{device} instead of +the tty line connected to standard input. This option is necessary +because opening a @sc{posix} tty requires use of the @code{O_NONDELAY} flag to +prevent a @sc{posix} tty from blocking until the carrier detect line is high if +the @code{clocal} flag is not set. Hence, it is not always possible +to allow the shell to open the device in the traditional manner. + +@item -g +@itemx --save +@opindex -g +@opindex --save +@cindex machine-readable @code{stty} output +Print all current settings in a form that can be used as an argument to +another @code{stty} command to restore the current settings. This option +may not be used in combination with any line settings. + +@end table + +Many settings can be turned off by preceding them with a @samp{-}. +Such arguments are marked below with ``May be negated'' in their +description. The descriptions themselves refer to the positive +case, that is, when @emph{not} negated (unless stated otherwise, +of course). + +Some settings are not available on all @sc{posix} systems, since they use +extensions. Such arguments are marked below with ``Non-@sc{posix}'' in their +description. On non-@sc{posix} systems, those or other settings also may not +be available, but it's not feasible to document all the variations: just +try it and see. + +@menu +* Control:: Control settings +* Input:: Input settings +* Output:: Output settings +* Local:: Local settings +* Combination:: Combination settings +* Characters:: Special characters +* Special:: Special settings +@end menu + + +@node Control +@subsection Control settings + +@cindex control settings +Control settings: + +@table @samp +@item parenb +@opindex parenb +@cindex two-way parity +Generate parity bit in output and expect parity bit in input. +May be negated. 
+ +@item parodd +@opindex parodd +@cindex odd parity +@cindex even parity +Set odd parity (even if negated). May be negated. + +@item cs5 +@itemx cs6 +@itemx cs7 +@itemx cs8 +@opindex cs@var{n} +@cindex character size +@cindex eight-bit characters +Set character size to 5, 6, 7, or 8 bits. + +@item hup +@itemx hupcl +@opindex hup[cl] +Send a hangup signal when the last process closes the tty. May be +negated. + +@item cstopb +@opindex cstopb +@cindex stop bits +Use two stop bits per character (one if negated). May be negated. + +@item cread +@opindex cread +Allow input to be received. May be negated. + +@item clocal +@opindex clocal +@cindex modem control +Disable modem control signals. May be negated. + +@item crtscts +@opindex crtscts +@cindex hardware flow control +@cindex flow control, hardware +@cindex RTS/CTS flow control +Enable RTS/CTS flow control. Non-@sc{posix}. May be negated. +@end table + + +@node Input +@subsection Input settings + +@cindex input settings + +@table @samp +@item ignbrk +@opindex ignbrk +@cindex breaks, ignoring +Ignore break characters. May be negated. + +@item brkint +@opindex brkint +@cindex breaks, cause interrupts +Make breaks cause an interrupt signal. May be negated. + +@item ignpar +@opindex ignpar +@cindex parity, ignoring +Ignore characters with parity errors. May be negated. + +@item parmrk +@opindex parmrk +@cindex parity errors, marking +Mark parity errors (with a 255-0-character sequence). May be negated. + +@item inpck +@opindex inpck +Enable input parity checking. May be negated. + +@item istrip +@opindex istrip +@cindex eight-bit input +Clear high (8th) bit of input characters. May be negated. + +@item inlcr +@opindex inlcr +@cindex newline, translating to return +Translate newline to carriage return. May be negated. + +@item igncr +@opindex igncr +@cindex return, ignoring +Ignore carriage return. May be negated. 
+ +@item icrnl +@opindex icrnl +@cindex return, translating to newline +Translate carriage return to newline. May be negated. + +@item ixon +@opindex ixon +@kindex C-s/C-q flow control +@cindex XON/XOFF flow control +Enable XON/XOFF flow control (that is, @kbd{CTRL-S}/@kbd{CTRL-Q}). May +be negated. + +@item ixoff +@itemx tandem +@opindex ixoff +@opindex tandem +@cindex software flow control +@cindex flow control, software +Enable sending of @code{stop} character when the system input buffer +is almost full, and @code{start} character when it becomes almost +empty again. May be negated. + +@item iuclc +@opindex iuclc +@cindex uppercase, translating to lowercase +Translate uppercase characters to lowercase. Non-@sc{posix}. May be +negated. + +@item ixany +@opindex ixany +Allow any character to restart output (only the start character +if negated). Non-@sc{posix}. May be negated. + +@item imaxbel +@opindex imaxbel +@cindex beeping at input buffer full +Enable beeping and not flushing input buffer if a character arrives +when the input buffer is full. Non-@sc{posix}. May be negated. +@end table + + +@node Output +@subsection Output settings + +@cindex output settings +These arguments specify output-related operations. + +@table @samp +@item opost +@opindex opost +Postprocess output. May be negated. + +@item olcuc +@opindex olcuc +@cindex lowercase, translating to output +Translate lowercase characters to uppercase. Non-@sc{posix}. May be +negated. + +@item ocrnl +@opindex ocrnl +@cindex return, translating to newline +Translate carriage return to newline. Non-@sc{posix}. May be negated. + +@item onlcr +@opindex onlcr +@cindex newline, translating to crlf +Translate newline to carriage return-newline. Non-@sc{posix}. May be +negated. + +@item onocr +@opindex onocr +Do not print carriage returns in the first column. Non-@sc{posix}. +May be negated. + +@item onlret +@opindex onlret +Newline performs a carriage return. Non-@sc{posix}. May be negated. 
+ +@item ofill +@opindex ofill +@cindex pad instead of timing for delaying +Use fill (padding) characters instead of timing for delays. Non-@sc{posix}. +May be negated. + +@item ofdel +@opindex ofdel +@cindex pad character +Use delete characters for fill instead of null characters. Non-@sc{posix}. +May be negated. + +@item nl1 +@itemx nl0 +@opindex nl@var{n} +Newline delay style. Non-@sc{posix}. + +@item cr3 +@itemx cr2 +@itemx cr1 +@itemx cr0 +@opindex cr@var{n} +Carriage return delay style. Non-@sc{posix}. + +@item tab3 +@itemx tab2 +@itemx tab1 +@itemx tab0 +@opindex tab@var{n} +Horizontal tab delay style. Non-@sc{posix}. + +@item bs1 +@itemx bs0 +@opindex bs@var{n} +Backspace delay style. Non-@sc{posix}. + +@item vt1 +@itemx vt0 +@opindex vt@var{n} +Vertical tab delay style. Non-@sc{posix}. + +@item ff1 +@itemx ff0 +@opindex ff@var{n} +Form feed delay style. Non-@sc{posix}. +@end table + + +@node Local +@subsection Local settings + +@cindex local settings + +@table @samp +@item isig +@opindex isig +Enable @code{interrupt}, @code{quit}, and @code{suspend} special +characters. May be negated. + +@item icanon +@opindex icanon +Enable @code{erase}, @code{kill}, @code{werase}, and @code{rprnt} +special characters. May be negated. + +@item iexten +@opindex iexten +Enable non-@sc{posix} special characters. May be negated. + +@item echo +@opindex echo +Echo input characters. May be negated. + +@item echoe +@itemx crterase +@opindex echoe +@opindex crterase +Echo @code{erase} characters as backspace-space-backspace. May be +negated. + +@item echok +@opindex echok +@cindex newline echoing after @code{kill} +Echo a newline after a @code{kill} character. May be negated. + +@item echonl +@opindex echonl +@cindex newline, echoing +Echo newline even if not echoing other characters. May be negated. + +@item noflsh +@opindex noflsh +@cindex flushing, disabling +Disable flushing after @code{interrupt} and @code{quit} special +characters. May be negated. 
+ +@item xcase +@opindex xcase +@cindex case translation +Enable input and output of uppercase characters by preceding their +lowercase equivalents with @samp{\}, when @code{icanon} is set. +Non-@sc{posix}. May be negated. + +@item tostop +@opindex tostop +@cindex background jobs, stopping at terminal write +Stop background jobs that try to write to the terminal. Non-@sc{posix}. +May be negated. + +@item echoprt +@itemx prterase +@opindex echoprt +@opindex prterase +Echo erased characters backward, between @samp{\} and @samp{/}. +Non-@sc{posix}. May be negated. + +@item echoctl +@itemx ctlecho +@opindex echoctl +@opindex ctlecho +@cindex control characters, using @samp{^@var{c}} +@cindex hat notation for control characters +Echo control characters in hat notation (@samp{^@var{c}}) instead +of literally. Non-@sc{posix}. May be negated. + +@item echoke +@itemx crtkill +@opindex echoke +@opindex crtkill +Echo the @code{kill} special character by erasing each character on +the line as indicated by the @code{echoprt} and @code{echoe} settings, +instead of by the @code{echoctl} and @code{echok} settings. Non-@sc{posix}. +May be negated. +@end table + + +@node Combination +@subsection Combination settings + +@cindex combination settings +Combination settings: + +@table @samp +@item evenp +@opindex evenp +@itemx parity +@opindex parity +Same as @code{parenb -parodd cs7}. May be negated. If negated, same +as @code{-parenb cs8}. + +@item oddp +@opindex oddp +Same as @code{parenb parodd cs7}. May be negated. If negated, same +as @code{-parenb cs8}. + +@item nl +@opindex nl +Same as @code{-icrnl -onlcr}. May be negated. If negated, same as +@code{icrnl -inlcr -igncr onlcr -ocrnl -onlret}. + +@item ek +@opindex ek +Reset the @code{erase} and @code{kill} special characters to their default +values. + +@item sane +@opindex sane +Same as: +@c This is too long to write inline. 
+@example +cread -ignbrk brkint -inlcr -igncr icrnl -ixoff +-iuclc -ixany imaxbel opost -olcuc -ocrnl onlcr +-onocr -onlret -ofill -ofdel nl0 cr0 tab0 bs0 vt0 +ff0 isig icanon iexten echo echoe echok -echonl +-noflsh -xcase -tostop -echoprt echoctl echoke +@end example +@noindent and also sets all special characters to their default values. + +@item cooked +@opindex cooked +Same as @code{brkint ignpar istrip icrnl ixon opost isig icanon}, plus +sets the @code{eof} and @code{eol} characters to their default values +if they are the same as the @code{min} and @code{time} characters. +May be negated. If negated, same as @code{raw}. + +@item raw +@opindex raw +Same as: +@example +-ignbrk -brkint -ignpar -parmrk -inpck -istrip +-inlcr -igncr -icrnl -ixon -ixoff -iuclc -ixany +-imaxbel -opost -isig -icanon -xcase min 1 time 0 +@end example +@noindent May be negated. If negated, same as @code{cooked}. + +@item cbreak +@opindex cbreak +Same as @code{-icanon}. May be negated. If negated, same as +@code{icanon}. + +@item pass8 +@opindex pass8 +@cindex eight-bit characters +Same as @code{-parenb -istrip cs8}. May be negated. If negated, +same as @code{parenb istrip cs7}. + +@item litout +@opindex litout +Same as @code{-parenb -istrip -opost cs8}. May be negated. +If negated, same as @code{parenb istrip opost cs7}. + +@item decctlq +@opindex decctlq +Same as @code{-ixany}. Non-@sc{posix}. May be negated. + +@item tabs +@opindex tabs +Same as @code{tab0}. Non-@sc{posix}. May be negated. If negated, same +as @code{tab3}. + +@item lcase +@itemx LCASE +@opindex lcase +@opindex LCASE +Same as @code{xcase iuclc olcuc}. Non-@sc{posix}. May be negated. + +@item crt +@opindex crt +Same as @code{echoe echoctl echoke}. + +@item dec +@opindex dec +Same as @code{echoe echoctl echoke -ixany intr ^C erase ^? kill C-u}. 
+@end table + + +@node Characters +@subsection Special characters + +@cindex special characters +@cindex characters, special + +The special characters' default values vary from system to system. +They are set with the syntax @samp{name value}, where the names are +listed below and the value can be given either literally, in hat +notation (@samp{^@var{c}}), or as an integer which may start with +@samp{0x} to indicate hexadecimal, @samp{0} to indicate octal, or +any other digit to indicate decimal. + +@cindex disabling special characters +@kindex u@r{, and disabling special characters} +For GNU stty, giving a value of @code{^-} or @code{undef} disables that +special character. (This is incompatible with Ultrix @code{stty}, +which uses a value of @samp{u} to disable a special character. GNU +@code{stty} treats a value @samp{u} like any other, namely to set that +special character to @key{U}.) + +@table @samp + +@item intr +@opindex intr +Send an interrupt signal. + +@item quit +@opindex quit +Send a quit signal. + +@item erase +@opindex erase +Erase the last character typed. + +@item kill +@opindex kill +Erase the current line. + +@item eof +@opindex eof +Send an end of file (terminate the input). + +@item eol +@opindex eol +End the line. + +@item eol2 +@opindex eol2 +Alternate character to end the line. Non-@sc{posix}. + +@item swtch +@opindex swtch +Switch to a different shell layer. Non-@sc{posix}. + +@item start +@opindex start +Restart the output after stopping it. + +@item stop +@opindex stop +Stop the output. + +@item susp +@opindex susp +Send a terminal stop signal. + +@item dsusp +@opindex dsusp +Send a terminal stop signal after flushing the input. Non-@sc{posix}. + +@item rprnt +@opindex rprnt +Redraw the current line. Non-@sc{posix}. + +@item werase +@opindex werase +Erase the last word typed. Non-@sc{posix}. + +@item lnext +@opindex lnext +Enter the next character typed literally, even if it is a special +character. Non-@sc{posix}. 
+@end table + + +@node Special +@subsection Special settings + +@cindex special settings + +@table @samp +@item min @var{n} +@opindex min +Set the minimum number of characters that will satisfy a read until +the time value has expired, when @code{-icanon} is set. + +@item time @var{n} +@opindex time +Set the number of tenths of a second before reads time out if the minimum +number of characters have not been read, when @code{-icanon} is set. + +@item ispeed @var{n} +@opindex ispeed +Set the input speed to @var{n}. + +@item ospeed @var{n} +@opindex ospeed +Set the output speed to @var{n}. + +@item rows @var{n} +@opindex rows +Tell the tty kernel driver that the terminal has @var{n} rows. Non-@sc{posix}. + +@item cols @var{n} +@itemx columns @var{n} +@opindex cols +@opindex columns +Tell the kernel that the terminal has @var{n} columns. Non-@sc{posix}. + +@item size +@opindex size +@vindex LINES +@vindex COLUMNS +Print the number of rows and columns that the kernel thinks the +terminal has. (Systems that don't support rows and columns in the kernel +typically use the environment variables @env{LINES} and @env{COLUMNS} +instead; however, GNU @code{stty} does not know anything about them.) +Non-@sc{posix}. + +@item line @var{n} +@opindex line +Use line discipline @var{n}. Non-@sc{posix}. + +@item speed +@opindex speed +Print the terminal speed. + +@item @var{n} +@cindex baud rate, setting +@c FIXME: Is this still true that the baud rate can't be set +@c higher than 38400? +Set the input and output speeds to @var{n}. @var{n} can be one +of: 0 50 75 110 134 134.5 150 200 300 600 1200 1800 2400 4800 9600 +19200 38400 @code{exta} @code{extb}. @code{exta} is the same as +19200; @code{extb} is the same as 38400. 0 hangs up the line if +@code{-clocal} is set. 
+@end table + + +@node printenv invocation +@section @code{printenv}: Print all or some environment variables + +@pindex printenv +@cindex printing all or some environment variables +@cindex environment variables, printing + +@code{printenv} prints environment variable values. Synopsis: + +@example +printenv [@var{option}] [@var{variable}]@dots{} +@end example + +If no @var{variable}s are specified, @code{printenv} prints the value of +every environment variable. Otherwise, it prints the value of each +@var{variable} that is set, and nothing for those that are not set. + +The only options are a lone @samp{--help} or @samp{--version}. +@xref{Common options}. + +@cindex exit status of @code{printenv} +Exit status: + +@display +0 if all variables specified were found +1 if at least one specified variable was not found +2 if a write error occurred +@end display + + +@node tty invocation +@section @code{tty}: Print file name of terminal on standard input + +@pindex tty +@cindex print terminal file name +@cindex terminal file name, printing + +@code{tty} prints the file name of the terminal connected to its standard +input. It prints @samp{not a tty} if standard input is not a terminal. +Synopsis: + +@example +tty [@var{option}]@dots{} +@end example + +The program accepts the following option. Also see @ref{Common options}. + +@table @samp + +@item -s +@itemx --silent +@itemx --quiet +@opindex -s +@opindex --silent +@opindex --quiet +Print nothing; only return an exit status. + +@end table + +@cindex exit status of @code{tty} +Exit status: + +@display +0 if standard input is a terminal +1 if standard input is not a terminal +2 if given incorrect arguments +3 if a write error occurs +@end display + + +@node User information +@chapter User information + +@cindex user information, commands for +@cindex commands for printing user information + +This section describes commands that print user-related information: +logins, groups, and so forth. 
+ +@menu +* id invocation:: Print real and effective uid and gid. +* logname invocation:: Print current login name. +* whoami invocation:: Print effective user id. +* groups invocation:: Print group names a user is in. +* users invocation:: Print login names of users currently logged in. +* who invocation:: Print who is currently logged in. +@end menu + + +@node id invocation +@section @code{id}: Print real and effective uid and gid + +@pindex id +@cindex real uid and gid, printing +@cindex effective uid and gid, printing +@cindex printing real and effective uid and gid + +@code{id} prints information about the given user, or the process +running it if no user is specified. Synopsis: + +@example +id [@var{option}]@dots{} [@var{username}] +@end example + +By default, it prints the real user id, real group id, effective user id +if different from the real user id, effective group id if different from +the real group id, and supplemental group ids. + +Each of these numeric values is preceded by an identifying string and +followed by the corresponding user or group name in parentheses. + +The options cause @code{id} to print only part of the above information. +Also see @ref{Common options}. + +@table @samp +@item -g +@itemx --group +@opindex -g +@opindex --group +Print only the group id. + +@item -G +@itemx --groups +@opindex -G +@opindex --groups +Print only the supplementary groups. + +@item -n +@itemx --name +@opindex -n +@opindex --name +Print the user or group name instead of the ID number. Requires +@code{-u}, @code{-g}, or @code{-G}. + +@item -r +@itemx --real +@opindex -r +@opindex --real +Print the real, instead of effective, user or group id. Requires +@code{-u}, @code{-g}, or @code{-G}. + +@item -u +@itemx --user +@opindex -u +@opindex --user +Print only the user id. 
+ +@end table + + +@node logname invocation +@section @code{logname}: Print current login name + +@pindex logname +@cindex printing user's login name +@cindex login name, printing +@cindex user name, printing + +@flindex /etc/utmp +@flindex utmp + +@code{logname} prints the calling user's name, as found in the file +@file{/etc/utmp}, and exits with a status of 0. If there is no +@file{/etc/utmp} entry for the calling process, @code{logname} prints +an error message and exits with a status of 1. + +The only options are @samp{--help} and @samp{--version}. @xref{Common +options}. + + +@node whoami invocation +@section @code{whoami}: Print effective user id + +@pindex whoami +@cindex effective UID, printing +@cindex printing the effective UID + +@code{whoami} prints the user name associated with the current +effective user id. It is equivalent to the command @samp{id -un}. + +The only options are @samp{--help} and @samp{--version}. @xref{Common +options}. + + +@node groups invocation +@section @code{groups}: Print group names a user is in + +@pindex groups +@cindex printing groups a user is in +@cindex supplementary groups, printing + +@code{groups} prints the names of the primary and any supplementary +groups for each given @var{username}, or the current process if no names +are given. If names are given, the name of each user is printed before +the list of that user's groups. Synopsis: + +@example +groups [@var{username}]@dots{} +@end example + +The group lists are equivalent to the output of the command @samp{id -Gn}. + +The only options are @samp{--help} and @samp{--version}. @xref{Common +options}. + + +@node users invocation +@section @code{users}: Print login names of users currently logged in + +@pindex users +@cindex printing current usernames +@cindex usernames, printing current + +@cindex login sessions, printing users with +@code{users} prints on a single line a blank-separated list of user +names of users currently logged in to the current host. 
Each user name +corresponds to a login session, so if a user has more than one login +session, that user's name will appear the same number of times in the +output. Synopsis: + +@example +users [@var{file}] +@end example + +@flindex /etc/utmp +@flindex /etc/wtmp +With no @var{file} argument, @code{users} extracts its information from +the file @file{/etc/utmp}. If a file argument is given, @code{users} +uses that file instead. A common choice is @file{/etc/wtmp}. + +The only options are @samp{--help} and @samp{--version}. @xref{Common +options}. + + +@node who invocation +@section @code{who}: Print who is currently logged in + +@pindex who +@cindex printing current user information +@cindex information, about current users + +@code{who} prints information about users who are currently logged on. +Synopsis: + +@example +@code{who} [@var{option}] [@var{file}] [am i] +@end example + +@cindex terminal lines, currently used +@cindex login time +@cindex remote hostname +If given no non-option arguments, @code{who} prints the following +information for each user currently logged on: login name, terminal +line, login time, and remote hostname or X display. + +@flindex /etc/utmp +@flindex /etc/wtmp +If given one non-option argument, @code{who} uses that instead of +@file{/etc/utmp} as the name of the file containing the record of +users logged on. @file{/etc/wtmp} is commonly given as an argument +to @code{who} to look at who has previously logged on. + +@opindex am i +@opindex who am i +If given two non-option arguments, @code{who} prints only the entry +for the user running it (determined from its standard input), preceded +by the hostname. Traditionally, the two arguments given are @samp{am +i}, as in @samp{who am i}. + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp +@item -m +@opindex -m +Same as @samp{who am i}. 
+ +@item -q +@itemx --count +@opindex -q +@opindex --count +Print only the login names and the number of users logged on. +Overrides all other options. + +@item -s +@opindex -s +Ignored; for compatibility with other versions of @code{who}. + +@item -i +@itemx -u +@itemx --idle +@opindex -i +@opindex -u +@opindex --idle +@cindex idle time +After the login time, print the number of hours and minutes that the +user has been idle. @samp{.} means the user was active in the last minute. +@samp{old} means the user was idle for more than 24 hours. + +@item -l +@itemx --lookup +@opindex -l +@opindex --lookup +Attempt to canonicalize hostnames found in utmp through a DNS lookup. This +is not the default because it can cause significant delays on systems with +automatic dial-up internet access. + +@item -H +@itemx --heading +@opindex -H +@opindex --heading +Print a line of column headings. + +@item -w +@itemx -T +@itemx --mesg +@itemx --message +@itemx --writable +@opindex -w +@opindex -T +@opindex --mesg +@opindex --message +@opindex --writable +@cindex message status +@pindex write@r{, allowed} +After each login name print a character indicating the user's message status: + +@display +@samp{+} allowing @code{write} messages +@samp{-} disallowing @code{write} messages +@samp{?} cannot find terminal device +@end display + +@end table + + +@node System context +@chapter System context + +@cindex system context +@cindex context, system +@cindex commands for system context + +This section describes commands that print or change system-wide +information. + +@menu +* date invocation:: Print or set system date and time. +* uname invocation:: Print system information. +* hostname invocation:: Print or set system name. +* hostid invocation:: Print numeric host identifier.
+@end menu + + +@node date invocation +@section @code{date}: Print or set system date and time + +@pindex date +@cindex time, printing or setting +@cindex printing the current time + +Synopses: + +@example +date [@var{option}]@dots{} [+@var{format}] +date [-u|--utc|--universal] @c this avoids a newline in the output +[ MMDDhhmm[[CC]YY][.ss] ] +@end example + +Invoking @code{date} with no @var{format} argument is equivalent to invoking +@samp{date '+%a %b %e %H:%M:%S %Z %Y'}. + +@findex strftime @r{and @code{date}} +@cindex time formats +@cindex formatting times +If given an argument that starts with a @samp{+}, @code{date} prints the +current time and date (or the time and date specified by the +@code{--date} option, see below) in the format defined by that argument, +which is the same as in the @code{strftime} function. Except for +directives, which start with @samp{%}, characters in the format string +are printed unchanged. The directives are described below. + +@menu +* Time directives:: %[HIklMprsSTXzZ] +* Date directives:: %[aAbBcdDhjmUwWxyY] +* Literal directives:: %[%nt] +* Padding:: Pad with zeroes, spaces (%_), or nothing (%-). +* Setting the time:: Changing the system clock. +* Options for date:: Instead of the current time. +* Examples of date:: Examples. +@end menu + +@node Time directives +@subsection Time directives + +@cindex time directives +@cindex directives, time + +@code{date} directives related to times. + +@table @samp +@item %H +hour (00@dots{}23) +@item %I +hour (01@dots{}12) +@item %k +hour ( 0@dots{}23) +@item %l +hour ( 1@dots{}12) +@item %M +minute (00@dots{}59) +@item %p +locale's AM or PM +@item %r +time, 12-hour (hh:mm:ss [AP]M) +@item %s +@cindex epoch, seconds since +@cindex seconds since the epoch +@cindex beginning of time +seconds since the epoch, i.e., 1 January 1970 00:00:00 UTC (a +GNU extension). +Note that this value is the number of seconds between the epoch +and the current date as defined by the localtime system call. 
+It isn't changed by the @samp{--date} option. +@item %S +second (00@dots{}60) +@item %T +time, 24-hour (hh:mm:ss) +@item %X +locale's time representation (%H:%M:%S) +@item %z +RFC-822 style numeric time zone (e.g., -0600 or +0100), or nothing if no +time zone is determinable. This value reflects the @emph{current} time +zone. It isn't changed by the @samp{--date} option. +@item %Z +time zone (e.g., EDT), or nothing if no time zone is +determinable. +Note that this value reflects the @emph{current} time zone. +It isn't changed by the @samp{--date} option. +@end table + + +@node Date directives +@subsection Date directives + +@cindex date directives +@cindex directives, date + +@code{date} directives related to dates. + +@table @samp +@item %a +locale's abbreviated weekday name (Sun@dots{}Sat) +@item %A +locale's full weekday name, variable length (Sunday@dots{}Saturday) +@item %b +locale's abbreviated month name (Jan@dots{}Dec) +@item %B +locale's full month name, variable length (January@dots{}December) +@item %c +locale's date and time (Sat Nov 04 12:02:33 EST 1989) +@item %C +century (year divided by 100 and truncated to an integer) (00@dots{}99) +@item %d +day of month (01@dots{}31) +@item %D +date (mm/dd/yy) +@item %h +same as %b +@item %j +day of year (001@dots{}366) +@item %m +month (01@dots{}12) +@item %U +week number of year with Sunday as first day of week (00@dots{}53). +Days in a new year preceding the first Sunday are in week zero. +@item %V +week number of year with Monday as first day of the week as a decimal +(01@dots{}53). If the week containing January 1 has four or more days in +the new year, then it is considered week 1; otherwise, it is week 53 of +the previous year, and the next week is week 1. (See the ISO 8601: 1988 +standard.) +@item %w +day of week (0@dots{}6) with 0 corresponding to Sunday +@item %W +week number of year with Monday as first day of week (00@dots{}53). +Days in a new year preceding the first Monday are in week zero. 
+@item %x +locale's date representation (mm/dd/yy) +@item %y +last two digits of year (00@dots{}99) +@item %Y +year (1970@dots{}.) +@end table + + +@node Literal directives +@subsection Literal directives + +@cindex literal directives +@cindex directives, literal + +@code{date} directives that produce literal strings. + +@table @samp +@item %% +a literal % +@item %n +a newline +@item %t +a horizontal tab +@end table + + +@node Padding +@subsection Padding + +@cindex numeric field padding +@cindex padding of numeric fields +@cindex fields, padding numeric + +By default, @code{date} pads numeric fields with zeroes, so that, for +example, numeric months are always output as two digits. GNU @code{date} +recognizes the following numeric modifiers between the @samp{%} and the +directive. + +@table @samp +@item - +(hyphen) do not pad the field; useful if the output is intended for +human consumption. +@item _ +(underscore) pad the field with spaces; useful if you need a fixed +number of characters in the output, but zeroes are too distracting. +@end table + +@noindent +These are GNU extensions. + +Here is an example illustrating the differences: + +@example +date +%d/%m -d "Feb 1" +@result{} 01/02 +date +%-d/%-m -d "Feb 1" +@result{} 1/2 +date +%_d/%_m -d "Feb 1" +@result{} 1/ 2 +@end example + + +@node Setting the time +@subsection Setting the time + +@cindex setting the time +@cindex time setting +@cindex appropriate privileges + +If given an argument that does not start with @samp{+}, @code{date} sets +the system clock to the time and date specified by that argument (as +described below). You must have appropriate privileges to set the +system clock. The @samp{--date} and @samp{--set} options may not be +used with such an argument. The @samp{--universal} option may be used +with such an argument to indicate that the specified time and date are +relative to Coordinated Universal Time rather than to the local time +zone. 
+ +The argument must consist entirely of digits, which have the following +meaning: + +@table @samp +@item MM +month +@item DD +day within month +@item hh +hour +@item mm +minute +@item CC +first two digits of year (optional) +@item YY +last two digits of year (optional) +@item ss +second (optional) +@end table + +The @samp{--set} option also sets the system clock; see the next section. + + +@node Options for date +@subsection Options for @code{date} + +@cindex @code{date} options +@cindex options for @code{date} + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp + +@item -d @var{datestr} +@itemx --date=@var{datestr} +@opindex -d +@opindex --date +@cindex parsing date strings +@cindex date strings, parsing +@cindex arbitrary date strings, parsing +@opindex yesterday +@opindex tomorrow +@opindex next @var{day} +@opindex last @var{day} +Display the time and date specified in @var{datestr} instead of the +current time and date. @var{datestr} can be in almost any common +format. It can contain month names, time zones, @samp{am} and @samp{pm}, +@samp{yesterday}, @samp{ago}, @samp{next}, etc. @xref{Date input formats}. + +@item -f @var{datefile} +@itemx --file=@var{datefile} +@opindex -f +@opindex --file +Parse each line in @var{datefile} as with @samp{-d} and display the +resulting time and date. If @var{datefile} is @samp{-}, use standard +input. This is useful when you have many dates to process, because the +system overhead of starting up the @code{date} executable many times can +be considerable. + +@item -I[@var{timespec}] +@itemx --iso-8601[=@var{timespec}] +@opindex -I[@var{timespec}] +@opindex --iso-8601[=@var{timespec}] +Display the date using the ISO 8601 format, @samp{%Y-%m-%d}. + +The optional argument @var{timespec} specifies the number of additional +terms of the time to include. It can be one of the following: +@table @samp +@item auto +The default behavior: print just the date. 
+ +@item hours +Append the hour of the day to the date. + +@item minutes +Append the hours and minutes. + +@item seconds +Append the hours, minutes, and seconds. +@end table + +If showing any time terms, then include the time zone using the format +@samp{%z}. + +@item -R +@itemx --rfc-822 +@opindex -R +@opindex --rfc-822 +Display the time and date using the RFC-822-conforming +format, @samp{%a, %_d %b %Y %H:%M:%S %z}. + +@item -r @var{file} +@itemx --reference=@var{file} +@opindex -r +@opindex --reference +Display the time and date reference according to the last modification +time of @var{file}, instead of the current time and date. + +@item -s @var{datestr} +@itemx --set=@var{datestr} +@opindex -s +@opindex --set +Set the time and date to @var{datestr}. See @samp{-d} above. + +@item -u +@itemx --utc +@itemx --universal +@opindex -u +@opindex --utc +@opindex --universal +@cindex Coordinated Universal Time +@cindex UTC +@cindex Greenwich Mean Time +@cindex GMT +Use Coordinated Universal Time (@sc{utc}) by operating as if the +@env{TZ} environment variable were set to the string @samp{UTC0}. +Normally, @command{date} operates in the time zone indicated by +@env{TZ}, or the system default if @env{TZ} is not set. Coordinated +Universal Time is often called ``Greenwich Mean Time'' (@sc{gmt}) for +historical reasons. +@end table + + +@node Examples of date +@subsection Examples of @code{date} + +@cindex examples of @code{date} + +Here are a few examples. Also see the documentation for the @samp{-d} +option in the previous section. 
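As a further sketch, the @samp{-f} and @samp{-u} options described in the previous section can be combined to convert several date strings in a single invocation. This assumes GNU @code{date}; the variable names @code{secs} and @code{back} are only illustrative and do not come from the manual.

```shell
# Assumes GNU date. -u makes the results independent of the local time zone.
# Convert two date strings to seconds since the epoch in one invocation,
# reading them from standard input with "-f -".
secs=$(printf '%s\n' '1970-01-01 00:02:00' '2000-01-01' | date -u -f - +%s)
echo "$secs"

# Convert one of those values back to a readable form.
back=$(date -u --date='1970-01-01 UTC 946684800 seconds' '+%Y-%m-%d %T %z')
echo "$back"
```

With a GNU @code{date} that behaves as documented here, the first command should print 120 and 946684800 on separate lines, and the second should print 2000-01-01 00:00:00 +0000. Because of @samp{-u}, neither result depends on the local time zone.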
+ +@itemize @bullet + +@item +To print the date of the day before yesterday: + +@example +date --date='2 days ago' +@end example + +@item +To print the date of the day three months and one day hence: +@example +date --date='3 months 1 day' +@end example + +@item +To print the day of year of Christmas in the current year: +@example +date --date='25 Dec' +%j +@end example + +@item +To print the current full month name and the day of the month: +@example +date '+%B %d' +@end example + +But this may not be what you want because for the first nine days of +the month, the @samp{%d} expands to a zero-padded two-digit field; +for example, @samp{date -d 1may '+%B %d'} will print @samp{May 01}. + +@item +To print a date without the leading zero for one-digit days +of the month, you can use the (GNU extension) @code{-} modifier to suppress +the padding altogether. +@example +date -d 1may '+%B %-d' +@end example + +@item +To print the current date and time in the format required by many +non-GNU versions of @code{date} when setting the system clock: +@example +date +%m%d%H%M%Y.%S +@end example + +@item +To set the system clock forward by two minutes: +@example +date --set='+2 minutes' +@end example + +@item +To print the date in the format specified by RFC-822, +use @samp{date --rfc}. I just did and saw this: + +@example +Mon, 25 Mar 1996 23:34:17 -0600 +@end example + +@item +To convert a date string to the number of seconds since the epoch +(which is 1970-01-01 00:00:00 UTC), use the @samp{--date} option with +the @samp{%s} format. That can be useful in sorting and/or graphing +and/or comparing data by date. The following command outputs the +number of seconds since the epoch for the time two minutes after the +epoch: + +@example +date --date='1970-01-01 00:02:00 +0000' +%s +120 +@end example + +If you do not specify time zone information in the date string, +@command{date} uses your computer's idea of the time zone when +interpreting the string.
For example, if your computer's time zone is +that of Cambridge, Massachusetts, which was then 5 hours (i.e., 18,000 +seconds) behind UTC: + +@example +# local time zone used +date --date='1970-01-01 00:02:00' +%s +18120 +@end example + +@item +If you're sorting or graphing dated data, your raw date values may be +represented as seconds since the epoch. But few people can look at +the date @samp{946684800} and casually note ``Oh, that's the first second +of the year 2000 in Greenwich, England.'' + +@example +date --date='2000-01-01 UTC' +%s +946684800 +@end example + +To convert such an unwieldy number of seconds back to +a more readable form, use a command like this: + +@smallexample +# local time zone used +date -d '1970-01-01 UTC 946684800 seconds' +"%Y-%m-%d %T %z" +1999-12-31 19:00:00 -0500 +@end smallexample + +@end itemize + + +@node uname invocation +@section @code{uname}: Print system information + +@pindex uname +@cindex print system information +@cindex system information, printing + +@code{uname} prints information about the machine and operating system +it is run on. If no options are given, @code{uname} acts as if the +@code{-s} option were given. Synopsis: + +@example +uname [@var{option}]@dots{} +@end example + +If multiple options or @code{-a} are given, the selected information is +printed in this order: + +@example +@var{sysname} @var{nodename} @var{release} @var{osversion} @var{machine} +@end example + +The @var{osversion}, at least, may well be multiple words. For example: + +@example +uname -a +@result{} Linux hayley 1.0.4 #3 Thu May 12 18:06:34 1994 i486 +@end example + + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp + +@item -a +@itemx --all +@opindex -a +@opindex --all +Print all of the below information. + +@item -m +@itemx --machine +@opindex -m +@opindex --machine +@cindex machine type +@cindex hardware type +Print the machine (hardware) type. 
+ +@item -n +@itemx --nodename +@opindex -n +@opindex --nodename +@cindex hostname +@cindex node name +@cindex network node name +Print the machine's network node hostname. + +@item -p +@itemx --processor +@opindex -p +@opindex --processor +@cindex host processor type +Print the machine's processor type. + +@item -r +@itemx --release +@opindex -r +@opindex --release +@cindex operating system release +@cindex release of operating system +Print the operating system release. + +@item -s +@itemx --sysname +@opindex -s +@opindex --sysname +@cindex operating system name +@cindex name of operating system +Print the operating system name. + +@item -v +@opindex -v +@cindex operating system version +@cindex version of operating system +Print the operating system version. + +@end table + +@node hostname invocation +@section @code{hostname}: Print or set system name + +@pindex hostname +@cindex setting the hostname +@cindex printing the hostname +@cindex system name, printing +@cindex appropriate privileges + +With no arguments, @code{hostname} prints the name of the current host +system. With one argument, it sets the current host name to the +specified string. You must have appropriate privileges to set the host +name. Synopsis: + +@example +hostname [@var{name}] +@end example + +The only options are @samp{--help} and @samp{--version}. @xref{Common +options}. + + +@node hostid invocation +@section @code{hostid}: Print numeric host identifier + +@pindex hostid +@cindex printing the host identifier + +@code{hostid} prints the numeric identifier of the current host +in hexadecimal. This command accepts no arguments. +The only options are @samp{--help} and @samp{--version}. +@xref{Common options}. + +For example, here's what it prints on one system I use: + +@example +$ hostid +1bac013d +@end example + +On that system, the 32-bit quantity happens to be closely +related to the system's Internet address, but that isn't always +the case. 
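To get a feel for what ``closely related'' might look like, the following sketch splits an eight-digit host identifier into its four bytes and prints them in decimal, dotted-quad style. The helper name @code{hostid_bytes} is hypothetical and not part of @code{hostid}. The byte order shown (most significant byte first) is only an assumption: a given system may store the address bytes in another order, or derive the identifier from something else entirely.

```shell
# Hypothetical helper (not part of hostid): print the four bytes of an
# 8-digit hexadecimal host id in decimal, most significant byte first.
# Assumes the argument is exactly eight hex digits.
hostid_bytes() {
  id=$1
  b1=$(printf '%s' "$id" | cut -c1-2)
  b2=$(printf '%s' "$id" | cut -c3-4)
  b3=$(printf '%s' "$id" | cut -c5-6)
  b4=$(printf '%s' "$id" | cut -c7-8)
  # Shell arithmetic accepts hexadecimal constants written as 0x...
  echo "$((0x$b1)).$((0x$b2)).$((0x$b3)).$((0x$b4))"
}

# The sample value from the example above.
hostid_bytes 1bac013d
```

For the sample value this prints @samp{27.172.1.61}. Whether those bytes match an interface address, in that order or a shuffled one, has to be checked against the actual network configuration of the system in question.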
+ + +@node Modified command invocation +@chapter Modified command invocation + +@cindex modified command invocation +@cindex invocation of commands, modified +@cindex commands for invoking other commands + +This section describes commands that run other commands in some context +different than the current one: a modified environment, as a different +user, etc. + +@menu +* chroot invocation:: Modify the root directory. +* env invocation:: Modify environment variables. +* nice invocation:: Modify scheduling priority. +* nohup invocation:: Immunize to hangups. +* su invocation:: Modify user and group id. +@end menu + + +@node chroot invocation +@section @code{chroot}: Run a command with a different root directory + +@pindex chroot +@cindex running a program in a specified root directory +@cindex root directory, running a program in a specified + +@code{chroot} runs a command with a specified root directory. +On many systems, only the super-user can do this. +Synopses: + +@example +chroot @var{newroot} [@var{command} [@var{args}]@dots{}] +chroot @var{option} +@end example + +Ordinarily, filenames are looked up starting at the root of the +directory structure, i.e., @file{/}. @code{chroot} changes the root to +the directory @var{newroot} (which must exist) and then runs +@var{command} with optional @var{args}. If @var{command} is not +specified, the default is the value of the @env{SHELL} environment +variable or @code{/bin/sh} if not set, invoked with the @samp{-i} option. + +The only options are @samp{--help} and @samp{--version}. @xref{Common +options}. + +Here are a few tips to help avoid common problems in using chroot. +To start with a simple example, make @var{command} refer to a statically +linked binary. If you were to use a dynamically linked executable, then +you'd have to arrange to have the shared libraries in the right place under +your new root directory. 
+ +For example, if you create a statically linked `ls' executable, +and put it in /tmp/empty, you can run this command as root: + +@example +$ chroot /tmp/empty /ls -Rl / +@end example + +Then you'll see output like this: + +@example +/: +total 1023 +-rwxr-xr-x 1 0 0 1041745 Aug 16 11:17 ls +@end example + +If you want to use a dynamically linked executable, say @code{bash}, +then first run @samp{ldd bash} to see what shared objects it needs. +Then, in addition to copying the actual binary, also copy the listed +files to the required positions under your intended new root directory. +Finally, if the executable requires any other files (e.g., data, state, +device files), copy them into place, too. + + +@node env invocation +@section @code{env}: Run a command in a modified environment + +@pindex env +@cindex environment, running a program in a modified +@cindex modified environment, running a program in a +@cindex running a program in a modified environment + +@code{env} runs a command with a modified environment. Synopses: + +@example +env [@var{option}]@dots{} [@var{name}=@var{value}]@dots{} @c +[@var{command} [@var{args}]@dots{}] +env +@end example + +Arguments of the form @samp{@var{variable}=@var{value}} set +the environment variable @var{variable} to value @var{value}. +@var{value} may be empty (@samp{@var{variable}=}). Setting a variable +to an empty value is different from unsetting it. + +@vindex PATH +The first remaining argument specifies the program name to invoke; it is +searched for according to the @env{PATH} environment variable. Any +remaining arguments are passed as arguments to that program. + +@cindex environment, printing + +If no command name is specified following the environment +specifications, the resulting environment is printed. This is like +specifying a command name of @code{printenv}. + +The program accepts the following options. Also see @ref{Common options}. 
+ +@table @samp + +@item -u @var{name} +@itemx --unset=@var{name} +@opindex -u +@opindex --unset +Remove variable @var{name} from the environment, if it was in the +environment. + +@item - +@itemx -i +@itemx --ignore-environment +@opindex - +@opindex -i +@opindex --ignore-environment +Start with an empty environment, ignoring the inherited environment. + +@end table + + +@node nice invocation +@section @code{nice}: Run a command with modified scheduling priority + +@pindex nice +@cindex modifying scheduling priority +@cindex scheduling priority, modifying +@cindex priority, modifying +@cindex appropriate privileges + +@code{nice} prints or modifies the scheduling priority of a job. +Synopsis: + +@example +nice [@var{option}]@dots{} [@var{command} [@var{arg}]@dots{}] +@end example + +If no arguments are given, @code{nice} prints the current scheduling +priority, which it inherited. Otherwise, @code{nice} runs the given +@var{command} with its scheduling priority adjusted. If no +@var{adjustment} is given, the priority of the command is incremented by +10. You must have appropriate privileges to specify a negative +adjustment. The priority can be adjusted by @code{nice} over the range +of -20 (the highest priority) to 19 (the lowest). + +@cindex conflicts with shell built-ins +@cindex built-in shell commands, conflicts with +Because most shells have a built-in command by the same name, using the +unadorned command name in a script or interactively may get you +different functionality than that described here. + +The program accepts the following option. Also see @ref{Common options}. + +@table @samp +@item -n @var{adjustment} +@itemx -@var{adjustment} +@itemx --adjustment=@var{adjustment} +@opindex -n +@opindex --adjustment +@opindex -@var{adjustment} +Add @var{adjustment} instead of 10 to the command's priority. 
+@end table + + +@node nohup invocation +@section @code{nohup}: Run a command immune to hangups + +@pindex nohup +@cindex hangups, immunity to +@cindex immunity to hangups +@cindex logging out and continuing to run + +@flindex nohup.out +@code{nohup} runs the given @var{command} with hangup signals ignored, +so that the command can continue running in the background after you log +out. Synopsis: + +@example +nohup @var{command} [@var{arg}]@dots{} +@end example + +@flindex nohup.out +@code{nohup} increases the scheduling priority of @var{command} by 5, so +it has a slightly smaller chance to run. If standard output is a terminal, +it and standard error are redirected so that they are appended to the +file @file{nohup.out}; if that cannot be written to, they are appended +to the file @file{$HOME/nohup.out}. If that cannot be written to, the +command is not run. + +If @code{nohup} creates either @file{nohup.out} or +@file{$HOME/nohup.out}, it creates it with no ``group'' or ``other'' +access permissions. It does not change the permissions if the output +file already existed. + +@code{nohup} does not automatically put the command it runs in the +background; you must do that explicitly, by ending the command line +with an @samp{&}. + +The only options are @samp{--help} and @samp{--version}. @xref{Common +options}. + + +@node su invocation +@section @code{su}: Run a command with substitute user and group id + +@pindex su +@cindex substitute user and group ids +@cindex user id, switching +@cindex super-user, becoming +@cindex root, becoming + +@code{su} allows one user to temporarily become another user. It runs a +command (often an interactive shell) with the real and effective user +id, group id, and supplemental groups of a given @var{user}. 
Synopsis: + +@example +su [@var{option}]@dots{} [@var{user} [@var{arg}]@dots{}] +@end example + +@cindex passwd entry, and @code{su} shell +@flindex /bin/sh +@flindex /etc/passwd +If no @var{user} is given, the default is @code{root}, the super-user. +The shell to use is taken from @var{user}'s @code{passwd} entry, or +@file{/bin/sh} if none is specified there. If @var{user} has a +password, @code{su} prompts for the password unless run by a user with +effective user id of zero (the super-user). + +@vindex HOME +@vindex SHELL +@vindex USER +@vindex LOGNAME +@cindex login shell +By default, @code{su} does not change the current directory. +It sets the environment variables @env{HOME} and @env{SHELL} +from the password entry for @var{user}, and if @var{user} is not +the super-user, sets @env{USER} and @env{LOGNAME} to @var{user}. +By default, the shell is not a login shell. + +Any additional @var{arg}s are passed as additional arguments to the +shell. + +@cindex @samp{-su} +GNU @code{su} does not treat @file{/bin/sh} or any other shells specially +(e.g., by setting @code{argv[0]} to @samp{-su}, passing @code{-c} only +to certain shells, etc.). + +@findex syslog +@code{su} can optionally be compiled to use @code{syslog} to report +failed, and optionally successful, @code{su} attempts. (If the system +supports @code{syslog}.) However, GNU @code{su} does not check if the +user is a member of the @code{wheel} group; see below. + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp +@item -c @var{command} +@itemx --command=@var{command} +@opindex -c +@opindex --command +Pass @var{command}, a single command line to run, to the shell with +a @code{-c} option instead of starting an interactive shell. + +@item -f +@itemx --fast +@opindex -f +@opindex --fast +@flindex .cshrc +@cindex file name pattern expansion, disabled +@cindex globbing, disabled +Pass the @code{-f} option to the shell. 
This probably only makes sense +if the shell run is @code{csh} or @code{tcsh}, for which the @code{-f} +option prevents reading the startup file (@file{.cshrc}). With +Bourne-like shells, the @code{-f} option disables file name pattern +expansion (globbing), which is not likely to be useful. + +@item - +@itemx -l +@itemx --login +@opindex - +@opindex -l +@opindex --login +@c other variables already indexed above +@vindex TERM +@vindex PATH +@cindex login shell, creating +Make the shell a login shell. This means the following. Unset all +environment variables except @env{TERM}, @env{HOME}, and @env{SHELL} +(which are set as described above), and @env{USER} and @env{LOGNAME} +(which are set, even for the super-user, as described above), and set +@env{PATH} to a compiled-in default value. Change to @var{user}'s home +directory. Prepend @samp{-} to the shell's name, intended to make it +read its login startup file(s). + +@item -m +@itemx -p +@itemx --preserve-environment +@opindex -m +@opindex -p +@opindex --preserve-environment +@cindex environment, preserving +@flindex /etc/shells +@cindex restricted shell +Do not change the environment variables @env{HOME}, @env{USER}, +@env{LOGNAME}, or @env{SHELL}. Run the shell given in the environment +variable @env{SHELL} instead of the shell from @var{user}'s passwd +entry, unless the user running @code{su} is not the superuser and +@var{user}'s shell is restricted. A @dfn{restricted shell} is one that +is not listed in the file @file{/etc/shells}, or in a compiled-in list +if that file does not exist. Parts of what this option does can be +overridden by @code{--login} and @code{--shell}. + +@item -s @var{shell} +@itemx --shell=@var{shell} +@opindex -s +@opindex --shell +Run @var{shell} instead of the shell from @var{user}'s passwd entry, +unless the user running @code{su} is not the superuser and @var{user}'s +shell is restricted (see @samp{-m} just above). 
+ +@end table + +@cindex wheel group, not supported +@cindex group wheel, not supported +@cindex fascism +@heading Why GNU @code{su} does not support the @samp{wheel} group + +(This section is by Richard Stallman.) + +@cindex Twenex +@cindex MIT AI lab +Sometimes a few of the users try to hold total power over all the +rest. For example, in 1984, a few users at the MIT AI lab decided to +seize power by changing the operator password on the Twenex system and +keeping it secret from everyone else. (I was able to thwart this coup +and give power back to the users by patching the kernel, but I +wouldn't know how to do that in Unix.) + +However, occasionally the rulers do tell someone. Under the usual +@code{su} mechanism, once someone learns the root password who +sympathizes with the ordinary users, he or she can tell the rest. The +``wheel group'' feature would make this impossible, and thus cement the +power of the rulers. + +I'm on the side of the masses, not that of the rulers. If you are +used to supporting the bosses and sysadmins in whatever they do, you +might find this idea strange at first. + + +@node Delaying +@chapter Delaying + +@cindex delaying commands +@cindex commands for delaying + +@c Perhaps @code{wait} or other commands should be described here also? + +@menu +* sleep invocation:: Delay for a specified time. +@end menu + + +@node sleep invocation +@section @code{sleep}: Delay for a specified time + +@pindex sleep +@cindex delay for a specified time + +@code{sleep} pauses for an amount of time specified by the sum of +the values of the command line arguments. +Synopsis: + +@example +sleep @var{number}[smhd]@dots{} +@end example + +@cindex time units +Each argument is a number followed by an optional unit; the default +is seconds. The units are: + +@table @samp +@item s +seconds +@item m +minutes +@item h +hours +@item d +days +@end table + +Historical implementations of @code{sleep} have required that +@var{number} be an integer. 
However, GNU @code{sleep} accepts +arbitrary floating point numbers. + +The only options are @samp{--help} and @samp{--version}. @xref{Common +options}. + + +@node Numeric operations +@chapter Numeric operations + +@cindex numeric operations +These programs do numerically related operations. + +@menu +* factor invocation:: Show factors of numbers. +* seq invocation:: Print sequences of numbers. +@end menu + + +@node factor invocation +@section @code{factor}: Print prime factors + +@pindex factor +@cindex prime factors + +@code{factor} prints prime factors. Synopses: + +@example +factor [@var{number}]@dots{} +factor @var{option} +@end example + +If no @var{number} is specified on the command line, @code{factor} reads +numbers from standard input, delimited by newlines, tabs, or spaces. + +The only options are @samp{--help} and @samp{--version}. @xref{Common +options}. + +The algorithm it uses is not very sophisticated, so for some inputs +@code{factor} runs for a long time. The hardest numbers to factor are +the products of large primes. Factoring the product of the two largest 32-bit +prime numbers takes over 10 minutes of CPU time on a 400MHz Pentium II. + +@example +$ p=`echo '4294967279 * 4294967291'|bc` +$ factor $p +18446743979220271189: 4294967279 4294967291 +@end example + +In contrast, @code{factor} factors the largest 64-bit number in just +over a tenth of a second: + +@example +$ factor `echo '2^64-1'|bc` +18446744073709551615: 3 5 17 257 641 65537 6700417 +@end example + +@node seq invocation +@section @code{seq}: Print numeric sequences + +@pindex seq +@cindex numeric sequences +@cindex sequence of numbers + +@code{seq} prints a sequence of numbers to standard output. Synopsis: + +@example +seq [@var{option}]@dots{} [@var{first} [@var{increment}]] @var{last}@dots{} +@end example + +@code{seq} prints the numbers from @var{first} to @var{last} by +@var{increment}. 
By default, @var{first} and @var{increment} are both 1, +and each number is printed on its own line. All numbers can be reals, +not just integers. + +The program accepts the following options. Also see @ref{Common options}. + +@table @samp +@item -f @var{format} +@itemx --format=@var{format} +@opindex -f @var{format} +@opindex --format=@var{format} +@cindex formatting of numbers in @code{seq} +Print all numbers using @var{format}; default @samp{%g}. +@var{format} must contain exactly one of the floating point +output formats @samp{%e}, @samp{%f}, or @samp{%g}. + +@item -s @var{string} +@itemx --separator=@var{string} +@cindex separator for numbers in @code{seq} +Separate numbers with @var{string}; default is a newline. +The output always terminates with a newline. + +@item -w +@itemx --equal-width +Print all numbers with the same width, by padding with leading zeroes. +(To have other kinds of padding, use @samp{--format}). + +@end table + +If you want to use @code{seq} to print sequences of large integer values, +don't use the default @samp{%g} format since it can result in +loss of precision: + +@example +$ seq 1000000 1000001 +1e+06 +1e+06 +@end example + +Instead, you can use the format, @samp{%1.f}, +to print large decimal numbers with no exponent and no decimal point. + +@example +$ seq --format=%1.f 1000000 1000001 +1000000 +1000001 +@end example + +If you want hexadecimal output, you can use @code{printf} +to perform the conversion: + +@example +$ printf %x'\n' `seq -f %1.f 1048575 1024 1050623` +fffff +1003ff +1007ff +@end example + +For very long lists of numbers, use xargs to avoid +system limitations on the length of an argument list: + +@example +$ seq -f %1.f 1000000 | xargs printf %x'\n' |tail -3 +f423e +f423f +f4240 +@end example + +To generate octal output, use the printf @code{%o} format instead +of @code{%x}. 
Note however that using printf works only for numbers +smaller than @code{2^32}: + +@example +$ printf "%x\n" `seq -f %1.f 4294967295 4294967296` +ffffffff +bash: printf: 4294967296: Numerical result out of range +@end example + +On most systems, seq can produce whole-number output for values up to +@code{2^53}, so here's a more general approach to base conversion that +also happens to be more robust for such large numbers. It works by +using @code{bc} and setting its output radix variable, @var{obase}, +to @samp{16} in this case to produce hexadecimal output. + +@example +$ (echo obase=16; seq -f %1.f 4294967295 4294967296)|bc +FFFFFFFF +100000000 +@end example + +Be careful when using @code{seq} with a fractional @var{increment}, +otherwise you may see surprising results. Most people would expect to +see @code{0.3} printed as the last number in this example: + +@example +$ seq -s' ' 0 .1 .3 +0 0.1 0.2 +@end example + +But that doesn't happen on most systems because @code{seq} is +implemented using binary floating point arithmetic (via the C +@code{double} type) -- which means some decimal numbers like @code{.1} +cannot be represented exactly. That in turn means some nonintuitive +conditions like @code{.1 * 3 > .3} will end up being true. + +To work around that in the above example, use a slightly larger number as +the @var{last} value: + +@example +$ seq -s' ' 0 .1 .31 +0 0.1 0.2 0.3 +@end example + +In general, when using an @var{increment} with a fractional part, where +(@var{last} - @var{first}) / @var{increment} is (mathematically) a whole +number, specify a slightly larger (or smaller, if @var{increment} is negative) +value for @var{last} to ensure that @var{last} is the final value printed +by seq. + +@node File permissions +@chapter File permissions +@include perm.texi + +@include getdate.texi + +@c What's GNU? 
+@c Arnold Robbins +@node Opening the software toolbox +@chapter Opening the Software Toolbox + +This chapter originally appeared in @cite{Linux Journal}, volume 1, +number 2, in the @cite{What's GNU?} column. It was written by Arnold +Robbins. + +@menu +* Toolbox introduction:: Toolbox introduction +* I/O redirection:: I/O redirection +* The who command:: The @command{who} command +* The cut command:: The @command{cut} command +* The sort command:: The @command{sort} command +* The uniq command:: The @command{uniq} command +* Putting the tools together:: Putting the tools together +@end menu + + +@node Toolbox introduction +@unnumberedsec Toolbox Introduction + +This month's column is only peripherally related to the GNU Project, in +that it describes a number of the GNU tools on your GNU/Linux system and how they +might be used. What it's really about is the ``Software Tools'' philosophy +of program development and usage. + +The software tools philosophy was an important and integral concept +in the initial design and development of Unix (of which Linux and GNU are +essentially clones). Unfortunately, in the modern day press of +Internetworking and flashy GUIs, it seems to have fallen by the +wayside. This is a shame, since it provides a powerful mental model +for solving many kinds of problems. + +Many people carry a Swiss Army knife around in their pants pockets (or +purse). A Swiss Army knife is a handy tool to have: it has several knife +blades, a screwdriver, tweezers, toothpick, nail file, corkscrew, and perhaps +a number of other things on it. For the everyday, small miscellaneous jobs +where you need a simple, general purpose tool, it's just the thing. + +On the other hand, an experienced carpenter doesn't build a house using +a Swiss Army knife. Instead, he has a toolbox chock full of specialized +tools---a saw, a hammer, a screwdriver, a plane, and so on. 
And he knows +exactly when and where to use each tool; you won't catch him hammering nails +with the handle of his screwdriver. + +The Unix developers at Bell Labs were all professional programmers and trained +computer scientists. They had found that while a one-size-fits-all program +might appeal to a user because there's only one program to use, in practice +such programs are + +@enumerate a +@item +difficult to write, + +@item +difficult to maintain and +debug, and + +@item +difficult to extend to meet new situations. +@end enumerate + +Instead, they felt that programs should be specialized tools. In short, each +program ``should do one thing well.'' No more and no less. Such programs are +simpler to design, write, and get right---they only do one thing. + +Furthermore, they found that with the right machinery for hooking programs +together, the whole was greater than the sum of the parts. By combining +several special purpose programs, you could accomplish a specific task +that none of the programs was designed for, and accomplish it much more +quickly and easily than if you had to write a special purpose program. +We will see some (classic) examples of this further on in the column. +(An important additional point was that, if necessary, you should take a +detour and first build any software tools you need, if you don't already +have something appropriate in the toolbox.) + +@node I/O redirection +@unnumberedsec I/O Redirection + +Hopefully, you are familiar with the basics of I/O redirection in the +shell, in particular the concepts of ``standard input,'' ``standard output,'' +and ``standard error.'' Briefly, ``standard input'' is a data source, where +data comes from. A program should not need to either know or care if the +data source is a disk file, a keyboard, a magnetic tape, or even a punched +card reader. Similarly, ``standard output'' is a data sink, where data goes +to. The program should neither know nor care where this might be. 
+Programs that only read their standard input, do something to the data, +and then send it on, are called @dfn{filters}, by analogy to filters in a +water pipeline. + +With the Unix shell, it's very easy to set up data pipelines: + +@smallexample +program_to_create_data | filter1 | .... | filterN > final.pretty.data +@end smallexample + +We start out by creating the raw data; each filter applies some successive +transformation to the data, until by the time it comes out of the pipeline, +it is in the desired form. + +This is fine and good for standard input and standard output. Where does the +standard error come in to play? Well, think about @command{filter1} in +the pipeline above. What happens if it encounters an error in the data it +sees? If it writes an error message to standard output, it will just +disappear down the pipeline into @command{filter2}'s input, and the +user will probably never see it. So programs need a place where they can send +error messages so that the user will notice them. This is standard error, +and it is usually connected to your console or window, even if you have +redirected standard output of your program away from your screen. + +For filter programs to work together, the format of the data has to be +agreed upon. The most straightforward and easiest format to use is simply +lines of text. Unix data files are generally just streams of bytes, with +lines delimited by the @sc{ascii} @sc{lf} (Line Feed) character, +conventionally called a ``newline'' in the Unix literature. (This is +@code{'\n'} if you're a C programmer.) This is the format used by all +the traditional filtering programs. (Many earlier operating systems +had elaborate facilities and special purpose programs for managing +binary data. Unix has always shied away from such things, under the +philosophy that it's easiest to simply be able to view and edit your +data with a text editor.) + +OK, enough introduction. 
Let's take a look at some of the tools, and then +we'll see how to hook them together in interesting ways. In the following +discussion, we will only present those command line options that interest +us. As you should always do, double check your system documentation +for the full story. + +@node The who command +@unnumberedsec The @command{who} Command + +The first program is the @command{who} command. By itself, it generates a +list of the users who are currently logged in. Although I'm writing +this on a single-user system, we'll pretend that several people are +logged in: + +@example +$ who +@print{} arnold console Jan 22 19:57 +@print{} miriam ttyp0 Jan 23 14:19(:0.0) +@print{} bill ttyp1 Jan 21 09:32(:0.0) +@print{} arnold ttyp2 Jan 23 20:48(:0.0) +@end example + +Here, the @samp{$} is the usual shell prompt, at which I typed @samp{who}. +There are three people logged in, and I am logged in twice. On traditional +Unix systems, user names are never more than eight characters long. This +little bit of trivia will be useful later. The output of @command{who} is nice, +but the data is not all that exciting. + +@node The cut command +@unnumberedsec The @command{cut} Command + +The next program we'll look at is the @command{cut} command. This program +cuts out columns or fields of input data. For example, we can tell it +to print just the login name and full name from the @file{/etc/passwd} +file. The @file{/etc/passwd} file has seven fields, separated by +colons: + +@example +arnold:xyzzy:2076:10:Arnold D. Robbins:/home/arnold:/bin/bash +@end example + +To get the first and fifth fields, we would use @command{cut} like this: + +@example +$ cut -d: -f1,5 /etc/passwd +@print{} root:Operator +@dots{} +@print{} arnold:Arnold D. Robbins +@print{} miriam:Miriam A. Robbins +@dots{} +@end example + +With the @option{-c} option, @command{cut} will cut out specific characters +(i.e., columns) in the input lines. 
This command looks like it might be +useful for data filtering. + + +@node The sort command +@unnumberedsec The @command{sort} Command + +Next we'll look at the @command{sort} command. This is one of the most +powerful commands on a Unix-style system; one that you will often find +yourself using when setting up fancy data plumbing. + +The @command{sort} +command reads and sorts each file named on the command line. It then +merges the sorted data and writes it to standard output. It will read +standard input if no files are given on the command line (thus +making it into a filter). The sort is based on the character collating +sequence or based on user-supplied ordering criteria. + + +@node The uniq command +@unnumberedsec The @command{uniq} Command + +Finally (at least for now), we'll look at the @command{uniq} program. When +sorting data, you will often end up with duplicate lines, lines that +are identical. Usually, all you need is one instance of each line. +This is where @command{uniq} comes in. The @command{uniq} program reads its +standard input, which it expects to be sorted. It only prints out one +copy of each duplicated line. It does have several options. Later on, +we'll use the @option{-c} option, which prints each unique line, preceded +by a count of the number of times that line occurred in the input. + + +@node Putting the tools together +@unnumberedsec Putting the Tools Together + +Now, let's suppose this is a large ISP server system with dozens of users +logged in. The management wants the system administrator to write a program that will +generate a sorted list of logged in users. Furthermore, even if a user +is logged in multiple times, his or her name should only show up in the +output once. + +The administrator could sit down with the system documentation and write a C +program that did this. It would take perhaps a couple of hundred lines +of code and about two hours to write it, test it, and debug it. 
+However, knowing the software toolbox, the administrator can instead start out +by generating just a list of logged on users: + +@example +$ who | cut -c1-8 +@print{} arnold +@print{} miriam +@print{} bill +@print{} arnold +@end example + +Next, sort the list: + +@example +$ who | cut -c1-8 | sort +@print{} arnold +@print{} arnold +@print{} bill +@print{} miriam +@end example + +Finally, run the sorted list through @command{uniq}, to weed out duplicates: + +@example +$ who | cut -c1-8 | sort | uniq +@print{} arnold +@print{} bill +@print{} miriam +@end example + +The @command{sort} command actually has a @option{-u} option that does what +@command{uniq} does. However, @command{uniq} has other uses for which one +cannot substitute @samp{sort -u}. + +The administrator puts this pipeline into a shell script, and makes it available for +all the users on the system (@samp{#} is the system administrator, +or @code{root}, prompt): + +@example +# cat > /usr/local/bin/listusers +who | cut -c1-8 | sort | uniq +^D +# chmod +x /usr/local/bin/listusers +@end example + +There are four major points to note here. First, with just four +programs, on one command line, the administrator was able to save about two +hours worth of work. Furthermore, the shell pipeline is just about as +efficient as the C program would be, and it is much more efficient in +terms of programmer time. People time is much more expensive than +computer time, and in our modern ``there's never enough time to do +everything'' society, saving two hours of programmer time is no mean +feat. + +Second, it is also important to emphasize that with the +@emph{combination} of the tools, it is possible to do a special +purpose job never imagined by the authors of the individual programs. + +Third, it is also valuable to build up your pipeline in stages, as we did here. 
+This allows you to view the data at each stage in the pipeline, which helps +you acquire the confidence that you are indeed using these tools correctly. + +Finally, by bundling the pipeline in a shell script, other users can use +your command, without having to remember the fancy plumbing you set up for +them. In terms of how you run them, shell scripts and compiled programs are +indistinguishable. + +After the previous warm-up exercise, we'll look at two additional, more +complicated pipelines. For them, we need to introduce two more tools. + +The first is the @command{tr} command, which stands for ``transliterate.'' +The @command{tr} command works on a character-by-character basis, changing +characters. Normally it is used for things like mapping upper case to +lower case: + +@example +$ echo ThIs ExAmPlE HaS MIXED case! | tr '[A-Z]' '[a-z]' +@print{} this example has mixed case! +@end example + +There are several options of interest: + +@table @code +@item -c +work on the complement of the listed characters, i.e., +operations apply to characters not in the given set + +@item -d +delete characters in the first set from the output + +@item -s +squeeze repeated characters in the output into just one character. +@end table + +We will be using all three options in a moment. + +The other command we'll look at is @command{comm}. The @command{comm} +command takes two sorted input files as input data, and prints out the +files' lines in three columns. The output columns are the data lines +unique to the first file, the data lines unique to the second file, and +the data lines that are common to both. The @option{-1}, @option{-2}, and +@option{-3} command line options @emph{omit} the respective columns. (This is +non-intuitive and takes a little getting used to.) 
For example: + +@example +$ cat f1 +@print{} 11111 +@print{} 22222 +@print{} 33333 +@print{} 44444 +$ cat f2 +@print{} 00000 +@print{} 22222 +@print{} 33333 +@print{} 55555 +$ comm f1 f2 +@print{}         00000 +@print{} 11111 +@print{}                 22222 +@print{}                 33333 +@print{} 44444 +@print{}         55555 +@end example + +A single dash given as a filename tells @command{comm} to read standard input +instead of a regular file. + +Now we're ready to build a fancy pipeline. The first application is a word +frequency counter. This helps an author determine if he or she is over-using +certain words. + +The first step is to change the case of all the letters in our input file +to one case. ``The'' and ``the'' are the same word for counting purposes. + +@example +$ tr '[A-Z]' '[a-z]' < whats.gnu | ... +@end example + +The next step is to get rid of punctuation. Quoted words and unquoted words +should be treated identically; it's easiest to just get the punctuation out of +the way. + +@smallexample +$ tr '[A-Z]' '[a-z]' < whats.gnu | tr -cd '[A-Za-z0-9_ \012]' | ... +@end smallexample + +The second @command{tr} command operates on the complement of the listed +characters, which are all the letters, the digits, the underscore, and +the blank. The @samp{\012} represents the newline character; it has to +be left alone. (The @sc{ascii} tab character should also be included for +good measure in a production script.) + +At this point, we have data consisting of words separated by blank space. +The words only contain alphanumeric characters (and the underscore). The +next step is to break the data apart so that we have one word per line. This +makes the counting operation much easier, as we will see shortly. + +@smallexample +$ tr '[A-Z]' '[a-z]' < whats.gnu | tr -cd '[A-Za-z0-9_ \012]' | +> tr -s '[ ]' '\012' | ... +@end smallexample + +This command turns blanks into newlines. The @option{-s} option squeezes +multiple newline characters in the output into just one. This helps us +avoid blank lines.
(The @samp{>} is the shell's ``secondary prompt.'' +This is what the shell prints when it notices you haven't finished +typing in all of a command.) + +We now have data consisting of one word per line, no punctuation, all one +case. We're ready to count each word: + +@smallexample +$ tr '[A-Z]' '[a-z]' < whats.gnu | tr -cd '[A-Za-z0-9_ \012]' | +> tr -s '[ ]' '\012' | sort | uniq -c | ... +@end smallexample + +At this point, the data might look something like this: + +@example + 60 a + 2 able + 6 about + 1 above + 2 accomplish + 1 acquire + 1 actually + 2 additional +@end example + +The output is sorted by word, not by count! What we want is the most +frequently used words first. Fortunately, this is easy to accomplish, +with the help of two more @command{sort} options: + +@table @code +@item -n +do a numeric sort, not a textual one + +@item -r +reverse the order of the sort +@end table + +The final pipeline looks like this: + +@smallexample +$ tr '[A-Z]' '[a-z]' < whats.gnu | tr -cd '[A-Za-z0-9_ \012]' | +> tr -s '[ ]' '\012' | sort | uniq -c | sort -nr +@print{} 156 the +@print{} 60 a +@print{} 58 to +@print{} 51 of +@print{} 51 and +@dots{} +@end smallexample + +Whew! That's a lot to digest. Yet, the same principles apply. With six +commands, on two lines (really one long one split for convenience), we've +created a program that does something interesting and useful, in much +less time than we could have written a C program to do the same thing. + +A minor modification to the above pipeline can give us a simple spelling +checker! To determine if you've spelled a word correctly, all you have to +do is look it up in a dictionary. If it is not there, then chances are +that your spelling is incorrect. So, we need a dictionary. +The conventional location for a dictionary is @file{/usr/dict/words}. +On my GNU/Linux system,@footnote{Red Hat Linux 6.1, for the November 2000 +revision of this article.} +this is a sorted, 45,402-word dictionary.
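Because @command{comm} expects both of its inputs to be sorted, it can be worth confirming that the dictionary on a given system really is in order before relying on it. What follows is only a sketch and not part of the original column: the helper name @code{is_sorted_wordlist} is made up for illustration, and a three-word temporary file stands in for a real dictionary, whose path varies between systems (many newer ones use @file{/usr/share/dict/words}). It relies on @samp{sort -c}, which checks the ordering of a file without printing its contents.

```shell
#!/bin/sh
# Hypothetical helper (an illustration, not from the original article):
# succeed if the file named by $1 is already sorted.
# sort -c only checks the order; it reports the result through its exit
# status. Both sort and comm follow the current locale's collating order,
# so run the check under the same locale settings as the later comm call.
is_sorted_wordlist() {
    sort -c "$1" 2>/dev/null
}

# Demonstration on a throwaway word list standing in for a dictionary.
words=$(mktemp) || exit 1
printf '%s\n' apple banana cherry > "$words"

if is_sorted_wordlist "$words"; then
    echo "sorted"
else
    echo "not sorted"
fi

rm -f "$words"
```

If the check fails for a real dictionary, running the file through @command{sort} before handing it to @command{comm} is a reasonable fallback.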
+ +Now, how to compare our file with the dictionary? As before, we generate +a sorted list of words, one per line: + +@smallexample +$ tr '[A-Z]' '[a-z]' < whats.gnu | tr -cd '[A-Za-z0-9_ \012]' | +> tr -s '[ ]' '\012' | sort -u | ... +@end smallexample + +Now, all we need is a list of words that are @emph{not} in the +dictionary. Here is where the @command{comm} command comes in. + +@smallexample +$ tr '[A-Z]' '[a-z]' < whats.gnu | tr -cd '[A-Za-z0-9_ \012]' | +> tr -s '[ ]' '\012' | sort -u | +> comm -23 - /usr/dict/words +@end smallexample + +The @option{-2} and @option{-3} options eliminate lines that are only in the +dictionary (the second file), and lines that are in both files. Lines +only in the first file (standard input, our stream of words), are +words that are not in the dictionary. These are likely candidates for +spelling errors. This pipeline was the first cut at a production +spelling checker on Unix. + +There are some other tools that deserve brief mention. + +@table @command +@item grep +search files for text that matches a regular expression + +@item egrep +like @command{grep}, but with more powerful regular expressions + +@item wc +count lines, words, characters + +@item tee +a T-fitting for data pipes, copies data to files and to standard output + +@item sed +the stream editor, an advanced tool + +@item awk +a data manipulation language, another advanced tool +@end table + +The software tools philosophy also espoused the following bit of +advice: ``Let someone else do the hard part.'' This means, take +something that gives you most of what you need, and then massage it the +rest of the way until it's in the form that you want. + +To summarize: + +@enumerate 1 +@item +Each program should do one thing well. No more, no less. + +@item +Combining programs with appropriate plumbing leads to results where +the whole is greater than the sum of the parts. It also leads to novel +uses of programs that the authors might never have imagined. 
+ +@item +Programs should never print extraneous header or trailer data, since these +could get sent on down a pipeline. (A point we didn't mention earlier.) + +@item +Let someone else do the hard part. + +@item +Know your toolbox! Use each program appropriately. If you don't have an +appropriate tool, build one. +@end enumerate + +As of this writing, all the programs we've discussed are available via +anonymous @command{ftp} from: @* +@uref{ftp://gnudist.gnu.org/textutils/textutils-1.22.tar.gz}. (There may +be more recent versions available now.) + +None of what I have presented in this column is new. The Software Tools +philosophy was first introduced in the book @cite{Software Tools}, by +Brian Kernighan and P.J. Plauger (Addison-Wesley, ISBN 0-201-03669-X). +This book showed how to write and use software tools. It was written in +1976, using a preprocessor for FORTRAN named @command{ratfor} (RATional +FORtran). At the time, C was not as ubiquitous as it is now; FORTRAN +was. The last chapter presented a @command{ratfor} to FORTRAN +processor, written in @command{ratfor}. @command{ratfor} looks an awful +lot like C; if you know C, you won't have any problem following the +code. + +In 1981, the book was updated and made available as @cite{Software Tools +in Pascal} (Addison-Wesley, ISBN 0-201-10342-7). The first book is +still in print; the second, alas, is not. Both books are well worth +reading if you're a programmer. They certainly made a major change in +how I view programming. + +Initially, the programs in both books were available (on 9-track tape) +from Addison-Wesley. Unfortunately, this is no longer the case, +although the @command{ratfor} versions are available from +@uref{http://cm.bell-labs.com/who/bwk, Brian Kernighan's home page}, +and you might be able to find copies of the Pascal versions floating +around the Internet.
For a number of years, there was an active +Software Tools Users Group, whose members had ported the original +@command{ratfor} programs to essentially every computer system with a +FORTRAN compiler. The popularity of the group waned in the middle 1980s +as Unix began to spread beyond universities. + +With the current proliferation of GNU code and other clones of Unix programs, +these programs now receive little attention; modern C versions are +much more efficient and do more than these programs do. Nevertheless, as +exposition of good programming style, and evangelism for a still-valuable +philosophy, these books are unparalleled, and I recommend them highly. + +Acknowledgment: I would like to express my gratitude to Brian Kernighan +of Bell Labs, the original Software Toolsmith, for reviewing this column. + +@include doclicense.texi + +@node Index +@unnumbered Index + +@printindex cp + +@shortcontents +@contents +@bye + +@c Local variables: +@c texinfo-column-for-description: 32 +@c End: |