summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2010-07-13maint: reorganize latest NEWSPádraig Brady
* NEWS: Add another blank line before the previous version. (Bug fixes): Move to the start. (Changes in behavior): Add the item about the du mem usage change from the "New features" section.
2010-07-13maint: heap.c: simplify dynamic allocationsPádraig Brady
* gl/lib/heap.c (heap_alloc): Use the fact that the xalloc routines will not return NULL. Also remove the redundant temporary variables. (heap_insert): From Jim Meyering, use x2nrealloc() which is simpler while handling overflow and increasing the size more efficiently. This reallocation is currently unused by coreutils in any case as it preallocates enough.
2010-07-13sort: parallelize internal sortChen Guo
This patch is by Gene Auyeung, Chris Dickens, Chen Guo, and Mike Nichols, based off of a patch by Paul Eggert, Glen Lenker, et. al., with a basic heap implementation based off of the GDSL heap, originally by Nicolas Darnis. The number of sorts done in parallel is limited to the number of available processors by default, or can be further restricted with the --parallel option. On a dual-die, 8 core Intel Xeon, results show sorting with 8 threads is almost 4 times faster than using a single thread. Timings when sorting a 96MB file: THREADS TIME (s) 1 5.10 2 2.87 4 1.75 8 1.31 Single threaded sorting has also been improved, especially for cheaper comparison operations: COMMAND BEFORE (s) AFTER (s) sort 8.822 8.716 sort -g 10.336 10.222 sort -n 3.077 2.961 LANG=C sort 2.169 2.066 * bootstrap.conf: Add heap, pthread. * coreutils.texi (sort): Describe the new --parallel option. * gl/lib/heap.c: New file. Very basic heap implementation. * gl/lib/heap.h: New file. * gl/modules/heap: New file. * src/Makefile.am: Add LIB_PTHREAD. * src/sort.c: Include heap.h, nproc.h, pthread.h. (MAX_MERGE): New macro. (SUBTHREAD_LINES_HEURISTIC, PARALLEL_OPTION): New constants. (MERGE_END, MERGE_ROOT): New constants. (struct merge_node): New struct. (struct merge_node_queue): New struct. (sortlines temp): Remove declaration. (usage, long_options, main): New option, --parallel. (specify_nthreads): New function. (mergelines): New signature, to emphasize the fact that the HI area must be part of the destination. All callers changed. (sequential_sort): New function, renamed from sortlines. Merge in the functionality of sortlines_temp. (compare_nodes): New function. (lock_node, unlock_node): New functions. (queue_destroy): New function. (queue_init): New function. (queue_insert): New function. (queue_pop): New function. (write_unique): New function. (mergelines_node): New function. (check_insert): New function. (update_parent): New function. (merge_loop): New function. (sortlines): Rewrite to support and use parallelism, with a new signature. All callers changed. (struct thread_args): New struct. (sortlines_thread): New function. (sortlines_temp): Remove. (sort): New argument NTHREADS. All uses changed. Output moved to mergelines_node. (main): disable threading if we are sorting at random. * tests/Makefile.am (TESTS): Add misc/sort-benchmark-random. * tests/misc/sort-benchmark-random: New file. Signed-off-by: Pádraig Brady <P@draigBrady.com>
2010-07-13di-set, ino-map: adjust a type, improve readabilityJim Meyering
* gl/lib/ino-map.c (ino_hash): Declare "i" as unsigned int. Use an intermediate variable for the for-loop upper bound, so it's a little more readable. Adjust comment. * gl/lib/di-set.c (di_ent_hash): Likewise.
2010-07-12dd: also spell out size on memory exhaustionPaul R. Eggert
* src/dd.c (dd_copy): Use requested blocksize (not adjusted) in diagnostic, to forestall user complaints that the numbers don't match exactly. Report both exact and human-readable sizes, using a message format that is consistent with both "BBBB bytes (N XB) copied" in dd.c and "memory exhausted" in lib/xmalloc.c.
2010-07-09chcon, chmod, chown, du: don't translate "%s"Paul Eggert
* src/chcon.c (process_file): Replace _("%s") with "%s". * src/chmod.c (process_file): Likewise. * src/chown-core.c (change_file_owner): Likewise. * src/du.c (process_file): Likewise.
2010-07-06du: avoid spurious warnings with 64-bit gcc -WPaul Eggert
Problem reported by Jim Meyering in: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=6524#74 * gl/lib/di-set.c (di_ent_hash): Rework so that the compiler does not incorrectly warn about shifting by 64-bits in unreachable code. * gl/lib/ino-map.c (ino_hash): Likewise.
2010-07-06du: Hash with a mechanism that's simpler and takes less memory.Paul Eggert
* gl/lib/dev-map.c, gl/lib/dev-map.h, gl/modules/dev-map: Remove. * gl/lib/ino-map.c, gl/lib/ino-map.h, gl/modules/ino-map: New files. * gl/modules/dev-map-tests, gl/tests/test-dev-map.c: Remove. * gl/modules/ino-map-tests, gl/tests/test-ino-map.c: New files. * gl/lib/di-set.h (struct di_set): Renamed from struct di_set_state, and now private. All uses changed. (_ATTRIBUTE_NONNULL_): Don't assume C99. (di_set_alloc): Renamed from di_set_init, with no size arg. Now allocates the object rather than initializing it. For now, this no longer takes an initial size; we can put this back later if it is needed. * gl/lib/di-set.c: Include hash.h, ino-map.h, and limits.h instead of stdio.h, assert.h, stdint.h, sys/types.h (di-set.h includes that now), sys/stat.h, and verify.h. (N_DEV_BITS_4, N_INO_BITS_4, N_DEV_BITS_8, N_INO_BITS_8): Remove. (struct dev_ino_4, struct dev_ino_8, struct dev_ino_full): Remove. (enum di_mode): Remove. (hashint): New typedef. (HASHINT_MAX, LARGE_INO_MIN): New macros. (struct di_ent): Now maps a dev_t to a inode set, instead of containing a union. (struct dev_map_ent): Remove. (struct di_set): New type. (is_encoded_ptr, decode_ptr, di_ent_create): Remove. (di_ent_hash, di_ent_compare, di_ent_free, di_set_alloc, di_set_free): (di_set_insert): Adjust to new representation. (di_ino_hash, map_device, map_inode_number): New functions. * gl/modules/di-set (Depends-on): Replace dev-map with ino-map. Remove 'verify'. * gl/tests/test-di-set.c: Adjust to the above changes to API. * src/du.c (INITIAL_DI_SET_SIZE): Remove. (hash_ins, main): Adjust to new di-set API.
2010-07-05stat: getfilecon failure now evokes nonzero exit statusJim Meyering
Add comments and adjust interfaces to allow low-level failure to propagate out to callers. * src/stat.c (out_file_context): Return bool, not void, so we can tell callers about failure. (print_statfs, print_stat, print_it): Propagate failure to caller. (do_statfs): Propagate print_it failure to caller. (do_stat): Likewise. I nearly forgot to update do_stat to propagate print_it failure, and it compiled just fine in spite of that. To prevent possibility of a repeat, I've marked each function that returns non-void with ATTRIBUTE_WARN_UNUSED_RESULT.
2010-07-05system.h: define ATTRIBUTE_WARN_UNUSED_RESULTJim Meyering
* src/system.h (ATTRIBUTE_WARN_UNUSED_RESULT): Define.
2010-07-05tests: make tests requiring a delay to pass, more robustPádraig Brady
* tests/init.cfg: Introduce a retry_delay_() function to repeatedly call a test function that requires a delay. This delay can now be shorter for the common case on fast systems, but will double until a configurable limit it reached before failing on slower systems. * tests/dd/reblock: Use retry_delay_. * tests/misc/cat-buf: Likewise. * tests/misc/stdbuf: Likewise. * tests/tail-2/F-vs-rename: Likewise. * tests/tail-2/flush-initial: Likewise. * tests/tail-2/tail-n0f: Likewise. * tests/tail-2/wait: Likewise. * test/dd/misc: Comment that delay is needed to trigger failure.
2010-07-04doc: Add advice about ChangeLogs and synchronizing submodulesPaul Eggert
* README-hacking: Update accordingly.
2010-07-04du: increase the initial dev-inode set sizeJim Meyering
* src/du.c (INITIAL_DI_SET_SIZE): Increase to the prime just under 1024. This gives a speed-up of about 2% when processing a tree containing 100,000 files, each with a link count greater than 1, all pointing to files in some other tree.
2010-07-04du: use less than half as much memory when tracking hard linksJim Meyering
When processing a hard-linked file, du must keep track of the file's device and inode numbers in order to avoid counting its storage more than once. When du would process many hard linked files -- as are created by some backup tools -- the amount of memory required for the supporting data structure could become prohibitively large. This patch takes advantage of the fact that the amount of information in the numbers of the typical dev,inode pair is far less than even 32 bits, and hence usually fits in the space of a pointer, be it 32 or 64 bits wide. A typical du traversal examines files on no more than a handful of distinct devices, so the device number can be encoded in just a few bits. Similarly, few inode numbers use all of the high bits in an ino_t. Before, we would represent the dev,inode pair using a naive struct, and allocate space for each. Thus, an entry in the hash table consisted of a pointer (to that struct) and a "next" pointer. With this change, we encode the dev,inode information and put those bits in place of the pointer, and thus do away with the need to allocate additional space for each dev,inode pair. * src/du.c: Include "di-set.h". Don't include "hash.h"; it's no longer used. (INITIAL_DI_SET_SIZE): Define. (di_set): New global, to replace "htab". (entry_hash, entry_compare, hash_init): Remove functions. (hash_ins): Use di-set functions, rather than ones from the hash module. (main): Likewise. * bootstrap.conf (gnulib_modules): Add the new di-set module. * NEWS (New features): Mention it.
2010-07-04di-set: manipulate sets of dev/inode pairs efficientlyJim Meyering
* gl/lib/di-set.c: Implementation. * gl/lib/di-set.h: Declarations. * gl/modules/di-set: Define module. * gl/modules/di-set-tests: Define test module. * gl/tests/test-di-set.c: Likewise.
2010-07-04dev-map: map device number to small non-negativeJim Meyering
* gl/lib/dev-map.c: New file. * gl/lib/dev-map.h: Declarations. * gl/modules/dev-map: Define primary modules. * gl/modules/dev-map-tests: Define test module. * gl/tests/test-dev-map.c: Test it.
2010-07-03du: don't miscount duplicate directories or link-count-1 filesPaul Eggert
* NEWS: Mention this. * src/du.c (hash_all): New static var. (process_file): Use it. (main): Set it. * tests/du/hard-link: Add a couple of test cases to help make sure this bug stays squashed. * tests/du/files0-from: Adjust existing tests to reflect change in semantics with duplicate arguments.
2010-07-02build: update gnulib submodule to latest; update bootstrap, tooJim Meyering
* gnulib: Update to latest, for hash.c improvement. * bootstrap: Update from gnulib, too.
2010-07-01cp: add an option to only copy the file attributesPádraig Brady
* src/copy.c (copy_attr): A new function which merges copy_attr_by_fd and copy_attr_by_name. Also display all errors when --attributes-only * src/copy.c (copy_reg): Skip copying the file contents if specified. Refactor the SELinux error handling code a little and display all SELinux errors when only copying attributes. * src/copy.h (struct cp_options): Add a data_copy_required boolean * src/cp.c (main): Default to copying data but don't if specified * src/install.c: Default to copying data * src/mv.c: Likewise tests/cp/reflink-perm: Add a test to check that --attributes-only does not copy data * tests/cp/acl: Likewise. Also refactor to remove redundant acl manipulation * doc/coreutils.texi (cp invocation): Describe the new option * NEWS: Mention the new feature
2010-07-01ls: use the POSIX date style when the locale does not specify onePádraig Brady
Previously we defaulted to "long-iso" format in locales without specific format translations, like the en_* locales for example. This reverts part of commit 6837183d, 08-11-2005, "ls ... acts like --time-style='posix-long-iso' if the locale settings are messed up" * src/ls.c (decode_switches): Only use the ISO format when specified. * NEWS: Mention the change in behavior. Reported by Daniel Qarras at http://bugzilla.redhat.com/525134
2010-06-30tests: fail rather than infloop in tail's inotify-rotate testJim Meyering
* tests/tail-2/inotify-rotate: Switch to new init.sh-based framework. (grep_timeout): New function. Use it in place of open-coded loops that might infloop. This was prompted by my encountering an inexplicable, and so far unreproducible, infloop in the code that was waiting for "b" to appear in "out".
2010-06-30tests: move most helper functions from test-lib.sh to new init.cfgJim Meyering
From there, they will be used by both test-lib.sh (as we phase it out) and the newer init.sh, to which all tests will migrate. * tests/test-lib.sh: Move most functions from here, ... * tests/init.cfg: ...to here. New file. * tests/Makefile.am (EXTRA_DIST): Add init.cfg.
2010-06-30tests: sync tests/init.sh from gnulibJim Meyering
* tests/init.sh: Update from gnulib.
2010-06-29doc: df, du, ls: improve --blocksize description in --helpJim Meyering
* src/du.c (usage): Print better --blocksize description. Prompted by Samuel Thibault in <http://bugs.debian.org/353100>. * src/df.c (usage): Likewise. * src/ls.c (usage): Likewise.
2010-06-29doc: reference POS syntax in --help for sort --keyPádraig Brady
* src/sort.c (usage): Reference the additional description of the POS syntax.
2010-06-27ln: print a clearer error message when linking failsBenno Schulenberg
2010-06-22stat: remove support for deprecated --context (-Z) optionJim Meyering
* src/stat.c (main): Remove support for the --context (-Z) option. In upstream releases this option has always been a no-op. It was first ignored for compatibility, and since the June 2008 commit, 574f7614 (coreutils-7.0), its use has evoked a warning. * NEWS (Changes in behavior): Mention it.
2010-06-21doc: fix comm's --help output so we generate a better man pageJim Meyering
* src/comm.c (usage): Don't align example comments in --help output, since the extra space (sequence of two spaces) there would be interpreted by help2man and induce an unwanted line break in the resulting man page. Reported by Jari Aalto.
2010-06-14maint: ignore another gnulib generated fileEric Blake
* .gnulib: Exclude lib/c++defs.h.
2010-06-14cat: improve documentationEric Blake
* src/cat.c (usage): Clarify that -b overrides -n. * doc/coreutils.texi (cat invocation): Likewise. * THANKS: Update. Suggested by Chas. Owens, in bug 6383.
2010-06-14doc: dd: discourage use of very large block sizesJim Meyering
* doc/coreutils.texi (dd invocation): Warn against using a very large block size. Suggested by Imre Péntek.
2010-06-14dd: print a better diagnostic for an invalid block sizeJim Meyering
* src/dd.c (dd_copy): Give a better diagnostic than "dd: memory exhausted" for an over-large bs= block size setting. Same for ibs= and obs=. Reported by Imre Péntek in http://bugs.launchpad.net/ubuntu/+source/coreutils/+bug/591969
2010-06-11build: update gnulib submodule to latestJim Meyering
2010-06-11build: don't let a large sparse temporary file cause "make dist" to failJim Meyering
* bootstrap.conf (bootstrap_epilogue): Replace the offending grep command from po/Makefile.in.in's $(DOMAIN).pot-update rule.
2010-06-10ls: avoid just-introduced buffer overrunJim Meyering
* src/ls.c (gobble_file): Revert part of my preceding change, to avoid clobbering stack.
2010-06-08maint: adjust INT_BUFSIZE_BOUND usage for maintainabilityJim Meyering
* src/tail.c (xlseek): Give INT_BUFSIZE_BOUND a variable name, not a type name. * src/ls.c (gobble_file, format_user_or_group_width): Likewise. * src/head.c (elide_tail_bytes_pipe): Likewise. (elide_tail_lines_seekable, main): Likewise. [This change is not complete -- there are doubtless other uses that can be updated in the same way.]
2010-06-08sort: avoid unnecessary use of sprintfJim Meyering
sprintf is relatively heavy-weight. * src/sort.c (key_warnings): Use umaxtostr and stpcpy rather than sprintf. Also, replace each INT_BUFSIZE_BOUND "type_name" argument with the equivalent variable name. More maintainable that way.
2010-06-08dirname: tweak summary wordingEric Blake
* doc/coreutils.texi (dirname invocation): Reword to be more precise. * src/dirname.c (usage): Likewise. * THANKS: Update. Reported by Filipus Klutiero, bug 6175.
2010-06-02doc: mention the new coreutils@ mailing listsJim Meyering
* README-package-renamed-to-coreutils: Mention the new mailing list and a mirror.
2010-06-02touch: remove support for --file=REF_FILE optionJim Meyering
* src/touch.c (main): Remove support for the deprecated, long-named --file option, which is an alternate name for --reference (-r). That option was undocumented with the arrival of --reference, in the 1995-10-29 commit, 8b92864e1d. Since the 2009-02-09 commit, ed85df444a, use of --file has elicited a warning. Not only was this code due for removal, but the long-name-use-detecting code was buggy in that it would use a stale or uninitialized "long_idx", as reported by Robin H. Johnson in http://bugs.gentoo.org/322421. * NEWS (Changes in behavior): Mention it.
2010-05-31maint: make spacing around "=" consistent, even in IF_LINTJim Meyering
E.g., - size_t desired_width IF_LINT (= 0); + size_t desired_width IF_LINT ( = 0); Use this command: g grep -l IF_LINT | grep '\.[ch]$' \ | xargs perl -pi -e 's/(IF_LINT \()= /$1 = /'
2010-05-31maint: replace each "for (;;)" with "while (true)"Jim Meyering
Run this command: git ls-files | grep '\.[ch]$' \ | xargs perl -pi -e 's/for \(;;\)/while (true)/g' ...except for randint.c, which does not include stdbool.h. In that case, use "while (1)". * gl/lib/randint.c (randint_genmax): Use "while (1)" for infloops. * src/cat.c (simple_cat, cat): Use "while (true)" for infloops. * gl/lib/randread.c (readsource, readisaac): Likewise. * src/copy.c (copy_reg): Likewise. * src/csplit.c (record_line_starts, process_regexp): Likewise. * src/cut.c (set_fields): Likewise. * src/dd.c (iread, parse_symbols): Likewise. * src/df.c (find_mount_point, main): Likewise. * src/du.c (main): Likewise. * src/expand.c (expand): Likewise. * src/factor.c (factor_using_division, do_stdin): Likewise. * src/fmt.c (get_space): Likewise. * src/ls.c (decode_switches): Likewise. * src/od.c (main): Likewise. * src/pr.c (main, read_line): Likewise. * src/shred.c (dopass, genpattern): Likewise. * src/sort.c (initbuf, fillbuf, getmonth, keycompare): Likewise. * src/split.c (bytes_split, lines_split): Likewise. * src/tac.c (tac_seekable): Likewise. * src/test.c (and, or): Likewise. * src/tr.c (squeeze_filter, main): Likewise. * src/tsort.c (search_item): Likewise. * src/unexpand.c (unexpand): Likewise. * src/uniq.c (main): Likewise. * src/yes.c (main): Likewise.
2010-05-31maint: correct indentation of case_GETOPT_* macro usesJim Meyering
* src/base64.c (main): Correct indentation of syntactically questionable case_GETOPT_HELP_CHAR and case_GETOPT_VERSION_CHAR macros. * src/who.c (main): Likewise.
2010-05-31stat: use gnulib's alignof moduleJim Meyering
* src/stat.c (alignof): Remove definition. Instead, include "alignof.h", and sort the #include directives. And get its definition from the gnulib module by that name: * bootstrap.conf (gnulib_modules): Add alignof.
2010-05-31tests: remove unnecessary single quotes in perl hash use: ->{'SYM'}Jim Meyering
Run this command: git grep -l "limits->{'" \ | xargs perl -pi -e "s/limits->{'(.*?)'}/limits->{\$1}/g" * cfg.mk (sc_prohibit_perl_hash_quotes): New rule to match. * tests/misc/join: Remove quotes. * tests/misc/sort: Likewise. * tests/misc/sort-merge: Likewise. * tests/misc/test: Likewise. * tests/misc/unexpand: Likewise. * tests/misc/uniq: Likewise.
2010-05-29truncate: improve handling of non regular filesPádraig Brady
Previously we copied `dd` and suppressed error messages when truncating neither regular files or shared mem objects. This was valid for `dd`, as truncation is ancillary to copying it may also do, but for `truncate` we should display all errors. Also we used the st_size from non regular files which is undefined, so we display an error when the user tries this. * src/truncate (do_truncate): Error when referencing the size of non regular files or non shared memory objects. Display all errors returned by ftruncate(). (main): Error when referencing the size of non regular files or non shared memory objects. Don't suppress error messages for any file types that can't be opened for writing. * tests/misc/truncate-dir-fail: Check that referencing the size of a directory is not supported. * tests/misc/truncate-fifo: Ensure the test doesn't hang by using the `timeout` command. Don't test the return from running ftruncate on the fifo as it's system dependent as to whether this fails or not. NEWS: Mention the change in behavior. Reported by Jim Meyering.
2010-05-28truncate: support sizes relative to an existing filePádraig Brady
* doc/coreutils.texi (truncate invocation): Mention that --reference bases the --size rather than just setting it. * src/truncate.c (usage): Likewise. Also remove the clause describing --size and --reference as being mutually exclusive. (do_truncate): Add an extra parameter to hold the size of a referenced file, and use it if positive. (main): Pass the size of a referenced file to do_truncate(). * tests/misc/truncate-parameters: Adjust for the new combinations. * NEWS: Mention the change Suggested by Richard W.M. Jones
2010-05-26tests: update help-version to work with parted, tooJim Meyering
* tests/misc/help-version: Add init code for GNU Parted.
2010-05-25maint: don't emit an extra newline in each of two diagnosticsJim Meyering
* src/shuf.c (main): Remove a stray newline in a diagnostic. * src/od.c (main): Likewise. Detected via these: git grep -A1 'error *(.*,$' | grep -C1 '\\n"[,)]' git grep 'error *(.*;$' | grep '\\n"[,)]'
2010-05-25maint: remove unneeded double quotes on RHS of shell assignmentsJim Meyering
Run this command: git grep -l 'LC_[A-Z]*="' \ | xargs perl -pi -e 's/(LC_[A-Z]*)="(.*?)"/$1=$2/' * src/Makefile.am: Write LC_ALL=$$locale, not LC_ALL="$$locale". * src/date.c (main): Similar, in a comment. * tests/misc/sort-month: Write LC_ALL=$LOC, not LC_ALL="$LOC".