Age | Commit message (Collapse) | Author |
|
* src/sort.c (keycompare): Use xmemcoll0, as it avoids
a couple of stores.
(write_bytes): Leave the buffer the way we found it,
as it might be used again for a later comparison,
if -u is used.
|
|
* src/sort.c (fillbuf): The previous commit incorrectly
terminated the buffer when the last line of input
didn't contain a terminating character.
|
|
Don't write NUL after the comparison buffers on each compare,
which increases performance by about 3% for short lines
on a pentium-m with gcc-4.4.1
* src/sort.c: (fillbuf): Delimit input items with NUL.
(write_bytes): Restore the item delimiter char which was
replaced with NUL in fillbuf().
|
|
* src/Makefile.am (printf_LDADD, seq_LDADD, sleep_LDADD, sort_LDADD):
(tail_LDADD, uptime_LDADD): Omit $(POW_LIB), as it's no longer
needed due to recent gnulib changes, where the strtod module no
longer uses the pow function. strtold needs pow only because it's
sometimes aliased to strtod. See
http://lists.gnu.org/archive/html/bug-gnulib/2010-07/msg00076.html
|
|
This patch is by Gene Auyeung, Chris Dickens, Chen Guo, and Mike
Nichols, based off of a patch by Paul Eggert, Glen Lenker, et. al.,
with a basic heap implementation based off of the GDSL heap,
originally by Nicolas Darnis.
The number of sorts done in parallel is limited to the number
of available processors by default, or can be further restricted
with the --parallel option.
On a dual-die, 8 core Intel Xeon, results show sorting with
8 threads is almost 4 times faster than using a single thread.
Timings when sorting a 96MB file:
THREADS TIME (s)
1 5.10
2 2.87
4 1.75
8 1.31
Single threaded sorting has also been improved,
especially for cheaper comparison operations:
COMMAND BEFORE (s) AFTER (s)
sort 8.822 8.716
sort -g 10.336 10.222
sort -n 3.077 2.961
LANG=C sort 2.169 2.066
* bootstrap.conf: Add heap, pthread.
* coreutils.texi (sort): Describe the new --parallel option.
* gl/lib/heap.c: New file. Very basic heap implementation.
* gl/lib/heap.h: New file.
* gl/modules/heap: New file.
* src/Makefile.am: Add LIB_PTHREAD.
* src/sort.c: Include heap.h, nproc.h, pthread.h.
(MAX_MERGE): New macro.
(SUBTHREAD_LINES_HEURISTIC, PARALLEL_OPTION): New constants.
(MERGE_END, MERGE_ROOT): New constants.
(struct merge_node): New struct.
(struct merge_node_queue): New struct.
(sortlines temp): Remove declaration.
(usage, long_options, main): New option, --parallel.
(specify_nthreads): New function.
(mergelines): New signature, to emphasize the fact that the HI area
must be part of the destination. All callers changed.
(sequential_sort): New function, renamed from sortlines. Merge in
the functionality of sortlines_temp.
(compare_nodes): New function.
(lock_node, unlock_node): New functions.
(queue_destroy): New function.
(queue_init): New function.
(queue_insert): New function.
(queue_pop): New function.
(write_unique): New function.
(mergelines_node): New function.
(check_insert): New function.
(update_parent): New function.
(merge_loop): New function.
(sortlines): Rewrite to support and use parallelism, with a new
signature. All callers changed.
(struct thread_args): New struct.
(sortlines_thread): New function.
(sortlines_temp): Remove.
(sort): New argument NTHREADS. All uses changed. Output moved to
mergelines_node.
(main): disable threading if we are sorting at random.
* tests/Makefile.am (TESTS): Add misc/sort-benchmark-random.
* tests/misc/sort-benchmark-random: New file.
Signed-off-by: Pádraig Brady <P@draigBrady.com>
|
|
* src/dd.c (dd_copy): Use requested blocksize (not adjusted) in
diagnostic, to forestall user complaints that the numbers don't
match exactly. Report both exact and human-readable sizes, using
a message format that is consistent with both "BBBB bytes (N XB)
copied" in dd.c and "memory exhausted" in lib/xmalloc.c.
|
|
* src/chcon.c (process_file): Replace _("%s") with "%s".
* src/chmod.c (process_file): Likewise.
* src/chown-core.c (change_file_owner): Likewise.
* src/du.c (process_file): Likewise.
|
|
* gl/lib/dev-map.c, gl/lib/dev-map.h, gl/modules/dev-map: Remove.
* gl/lib/ino-map.c, gl/lib/ino-map.h, gl/modules/ino-map: New files.
* gl/modules/dev-map-tests, gl/tests/test-dev-map.c: Remove.
* gl/modules/ino-map-tests, gl/tests/test-ino-map.c: New files.
* gl/lib/di-set.h (struct di_set): Renamed from struct di_set_state,
and now private. All uses changed.
(_ATTRIBUTE_NONNULL_): Don't assume C99.
(di_set_alloc): Renamed from di_set_init, with no size arg.
Now allocates the object rather than initializing it.
For now, this no longer takes an initial size; we can put this
back later if it is needed.
* gl/lib/di-set.c: Include hash.h, ino-map.h, and limits.h instead of
stdio.h, assert.h, stdint.h, sys/types.h (di-set.h includes that
now), sys/stat.h, and verify.h.
(N_DEV_BITS_4, N_INO_BITS_4, N_DEV_BITS_8, N_INO_BITS_8): Remove.
(struct dev_ino_4, struct dev_ino_8, struct dev_ino_full): Remove.
(enum di_mode): Remove.
(hashint): New typedef.
(HASHINT_MAX, LARGE_INO_MIN): New macros.
(struct di_ent): Now maps a dev_t to a inode set, instead of
containing a union.
(struct dev_map_ent): Remove.
(struct di_set): New type.
(is_encoded_ptr, decode_ptr, di_ent_create): Remove.
(di_ent_hash, di_ent_compare, di_ent_free, di_set_alloc, di_set_free):
(di_set_insert): Adjust to new representation.
(di_ino_hash, map_device, map_inode_number): New functions.
* gl/modules/di-set (Depends-on): Replace dev-map with ino-map.
Remove 'verify'.
* gl/tests/test-di-set.c: Adjust to the above changes to API.
* src/du.c (INITIAL_DI_SET_SIZE): Remove.
(hash_ins, main): Adjust to new di-set API.
|
|
Add comments and adjust interfaces to allow low-level failure
to propagate out to callers.
* src/stat.c (out_file_context): Return bool, not void,
so we can tell callers about failure.
(print_statfs, print_stat, print_it): Propagate failure to caller.
(do_statfs): Propagate print_it failure to caller.
(do_stat): Likewise.
I nearly forgot to update do_stat to propagate print_it failure,
and it compiled just fine in spite of that. To prevent possibility
of a repeat, I've marked each function that returns non-void with
ATTRIBUTE_WARN_UNUSED_RESULT.
|
|
* src/system.h (ATTRIBUTE_WARN_UNUSED_RESULT): Define.
|
|
* src/du.c (INITIAL_DI_SET_SIZE): Increase to the prime just under
1024. This gives a speed-up of about 2% when processing a tree
containing 100,000 files, each with a link count greater than 1,
all pointing to files in some other tree.
|
|
When processing a hard-linked file, du must keep track of the file's
device and inode numbers in order to avoid counting its storage
more than once. When du would process many hard linked files --
as are created by some backup tools -- the amount of memory required
for the supporting data structure could become prohibitively large.
This patch takes advantage of the fact that the amount of information
in the numbers of the typical dev,inode pair is far less than even
32 bits, and hence usually fits in the space of a pointer, be it
32 or 64 bits wide. A typical du traversal examines files on no
more than a handful of distinct devices, so the device number can
be encoded in just a few bits. Similarly, few inode numbers use
all of the high bits in an ino_t. Before, we would represent the
dev,inode pair using a naive struct, and allocate space for each.
Thus, an entry in the hash table consisted of a pointer (to that
struct) and a "next" pointer. With this change, we encode the
dev,inode information and put those bits in place of the pointer,
and thus do away with the need to allocate additional space for
each dev,inode pair.
* src/du.c: Include "di-set.h".
Don't include "hash.h"; it's no longer used.
(INITIAL_DI_SET_SIZE): Define.
(di_set): New global, to replace "htab".
(entry_hash, entry_compare, hash_init): Remove functions.
(hash_ins): Use di-set functions, rather than ones from the hash module.
(main): Likewise.
* bootstrap.conf (gnulib_modules): Add the new di-set module.
* NEWS (New features): Mention it.
|
|
* NEWS: Mention this.
* src/du.c (hash_all): New static var.
(process_file): Use it.
(main): Set it.
* tests/du/hard-link: Add a couple of test cases to help make
sure this bug stays squashed.
* tests/du/files0-from: Adjust existing tests to reflect
change in semantics with duplicate arguments.
|
|
* src/copy.c (copy_attr): A new function which merges copy_attr_by_fd
and copy_attr_by_name. Also display all errors when --attributes-only
* src/copy.c (copy_reg): Skip copying the file contents if specified.
Refactor the SELinux error handling code a little and display all
SELinux errors when only copying attributes.
* src/copy.h (struct cp_options): Add a data_copy_required boolean
* src/cp.c (main): Default to copying data but don't if specified
* src/install.c: Default to copying data
* src/mv.c: Likewise
tests/cp/reflink-perm: Add a test to check that --attributes-only
does not copy data
* tests/cp/acl: Likewise. Also refactor to remove redundant
acl manipulation
* doc/coreutils.texi (cp invocation): Describe the new option
* NEWS: Mention the new feature
|
|
Previously we defaulted to "long-iso" format in locales without
specific format translations, like the en_* locales for example.
This reverts part of commit 6837183d, 08-11-2005, "ls ... acts like
--time-style='posix-long-iso' if the locale settings are messed up"
* src/ls.c (decode_switches): Only use the ISO format when specified.
* NEWS: Mention the change in behavior.
Reported by Daniel Qarras at http://bugzilla.redhat.com/525134
|
|
* src/du.c (usage): Print better --blocksize description.
Prompted by Samuel Thibault in <http://bugs.debian.org/353100>.
* src/df.c (usage): Likewise.
* src/ls.c (usage): Likewise.
|
|
* src/sort.c (usage): Reference the additional description
of the POS syntax.
|
|
|
|
* src/stat.c (main): Remove support for the --context (-Z) option.
In upstream releases this option has always been a no-op. It was
first ignored for compatibility, and since the June 2008 commit,
574f7614 (coreutils-7.0), its use has evoked a warning.
* NEWS (Changes in behavior): Mention it.
|
|
* src/comm.c (usage): Don't align example comments in --help output,
since the extra space (sequence of two spaces) there would be
interpreted by help2man and induce an unwanted line break
in the resulting man page. Reported by Jari Aalto.
|
|
* src/cat.c (usage): Clarify that -b overrides -n.
* doc/coreutils.texi (cat invocation): Likewise.
* THANKS: Update.
Suggested by Chas. Owens, in bug 6383.
|
|
* src/dd.c (dd_copy): Give a better diagnostic than
"dd: memory exhausted" for an over-large bs= block size setting.
Same for ibs= and obs=. Reported by Imre Péntek in
http://bugs.launchpad.net/ubuntu/+source/coreutils/+bug/591969
|
|
* src/ls.c (gobble_file): Revert part of my preceding change,
to avoid clobbering stack.
|
|
* src/tail.c (xlseek): Give INT_BUFSIZE_BOUND a variable name,
not a type name.
* src/ls.c (gobble_file, format_user_or_group_width): Likewise.
* src/head.c (elide_tail_bytes_pipe): Likewise.
(elide_tail_lines_seekable, main): Likewise.
[This change is not complete -- there are doubtless other uses
that can be updated in the same way.]
|
|
sprintf is relatively heavy-weight.
* src/sort.c (key_warnings): Use umaxtostr and stpcpy rather
than sprintf.
Also, replace each INT_BUFSIZE_BOUND "type_name" argument
with the equivalent variable name. More maintainable that way.
|
|
* doc/coreutils.texi (dirname invocation): Reword to be more
precise.
* src/dirname.c (usage): Likewise.
* THANKS: Update.
Reported by Filipus Klutiero, bug 6175.
|
|
* src/touch.c (main): Remove support for the deprecated, long-named
--file option, which is an alternate name for --reference (-r).
That option was undocumented with the arrival of --reference, in
the 1995-10-29 commit, 8b92864e1d. Since the 2009-02-09 commit,
ed85df444a, use of --file has elicited a warning. Not only was
this code due for removal, but the long-name-use-detecting code
was buggy in that it would use a stale or uninitialized "long_idx",
as reported by Robin H. Johnson in http://bugs.gentoo.org/322421.
* NEWS (Changes in behavior): Mention it.
|
|
E.g.,
- size_t desired_width IF_LINT (= 0);
+ size_t desired_width IF_LINT ( = 0);
Use this command:
g grep -l IF_LINT | grep '\.[ch]$' \
| xargs perl -pi -e 's/(IF_LINT \()= /$1 = /'
|
|
Run this command:
git ls-files | grep '\.[ch]$' \
| xargs perl -pi -e 's/for \(;;\)/while (true)/g'
...except for randint.c, which does not include stdbool.h.
In that case, use "while (1)".
* gl/lib/randint.c (randint_genmax): Use "while (1)" for infloops.
* src/cat.c (simple_cat, cat): Use "while (true)" for infloops.
* gl/lib/randread.c (readsource, readisaac): Likewise.
* src/copy.c (copy_reg): Likewise.
* src/csplit.c (record_line_starts, process_regexp): Likewise.
* src/cut.c (set_fields): Likewise.
* src/dd.c (iread, parse_symbols): Likewise.
* src/df.c (find_mount_point, main): Likewise.
* src/du.c (main): Likewise.
* src/expand.c (expand): Likewise.
* src/factor.c (factor_using_division, do_stdin): Likewise.
* src/fmt.c (get_space): Likewise.
* src/ls.c (decode_switches): Likewise.
* src/od.c (main): Likewise.
* src/pr.c (main, read_line): Likewise.
* src/shred.c (dopass, genpattern): Likewise.
* src/sort.c (initbuf, fillbuf, getmonth, keycompare): Likewise.
* src/split.c (bytes_split, lines_split): Likewise.
* src/tac.c (tac_seekable): Likewise.
* src/test.c (and, or): Likewise.
* src/tr.c (squeeze_filter, main): Likewise.
* src/tsort.c (search_item): Likewise.
* src/unexpand.c (unexpand): Likewise.
* src/uniq.c (main): Likewise.
* src/yes.c (main): Likewise.
|
|
* src/base64.c (main): Correct indentation of syntactically
questionable case_GETOPT_HELP_CHAR and case_GETOPT_VERSION_CHAR macros.
* src/who.c (main): Likewise.
|
|
* src/stat.c (alignof): Remove definition.
Instead, include "alignof.h", and sort the #include directives.
And get its definition from the gnulib module by that name:
* bootstrap.conf (gnulib_modules): Add alignof.
|
|
Previously we copied `dd` and suppressed error messages
when truncating neither regular files or shared mem objects.
This was valid for `dd`, as truncation is ancillary to copying
it may also do, but for `truncate` we should display all errors.
Also we used the st_size from non regular files which is undefined,
so we display an error when the user tries this.
* src/truncate (do_truncate): Error when referencing the size
of non regular files or non shared memory objects. Display all
errors returned by ftruncate().
(main): Error when referencing the size of non regular files or
non shared memory objects. Don't suppress error messages for
any file types that can't be opened for writing.
* tests/misc/truncate-dir-fail: Check that referencing the
size of a directory is not supported.
* tests/misc/truncate-fifo: Ensure the test doesn't hang
by using the `timeout` command. Don't test the return from
running ftruncate on the fifo as it's system dependent as
to whether this fails or not.
NEWS: Mention the change in behavior.
Reported by Jim Meyering.
|
|
* doc/coreutils.texi (truncate invocation): Mention that --reference
bases the --size rather than just setting it.
* src/truncate.c (usage): Likewise. Also remove the clause
describing --size and --reference as being mutually exclusive.
(do_truncate): Add an extra parameter to hold the size
of a referenced file, and use it if positive.
(main): Pass the size of a referenced file to do_truncate().
* tests/misc/truncate-parameters: Adjust for the new combinations.
* NEWS: Mention the change
Suggested by Richard W.M. Jones
|
|
* src/shuf.c (main): Remove a stray newline in a diagnostic.
* src/od.c (main): Likewise.
Detected via these:
git grep -A1 'error *(.*,$' | grep -C1 '\\n"[,)]'
git grep 'error *(.*;$' | grep '\\n"[,)]'
|
|
Run this command:
git grep -l 'LC_[A-Z]*="' \
| xargs perl -pi -e 's/(LC_[A-Z]*)="(.*?)"/$1=$2/'
* src/Makefile.am: Write LC_ALL=$$locale, not LC_ALL="$$locale".
* src/date.c (main): Similar, in a comment.
* tests/misc/sort-month: Write LC_ALL=$LOC, not LC_ALL="$LOC".
|
|
* src/sort.c (key_warnings): Always warn about significant leading
blanks when character offsets are specified, unless they key is
possibly a line offset, i.e. of the form -k1.x,1.y. Also suppress
this warning if the user could be sorting right aligned indexes.
|
|
* NEWS (New features): Mention it.
* src/du.c (DEBUG_OPT): Remove. Use long-named ---debug instead.
Commented out.
(MAX_DEPTH_OPTION): Remove. Use 'd' instead.
(main): Insert literal "d:"; remove DEBUG_OPT.
* doc/coreutils.texi (du invocation): Add -d to indices.
* tests/du/max-depth: Exercise -d, too.
|
|
* src/Makefile.am (fs-def): Sort definitions.
This is required by fs-magic-compare's use of join.
|
|
* src/sort.c (mark_key): Add a cast-to-int of a printf field width,
to placate -Wformat.
|
|
* src/sort.c (usage): Mention --debug can output warnings to stderr.
Also split the translatable string to aid translation.
(default_key_compare): A new function refactored from main(),
and now also called from the new key_warnings() function.
(key_to_opts): A new function refactored from incompatible_options(),
and now also called from the new key_warnings() function.
(key_numeric): A new function refactored to test if key is numeric.
(key_warnings): A new function to output warnings to stderr,
about questionable use of various options. Currently it warns
about zero length keys and ineffective global options.
(incompatible_options): Refactor out key_to_opts()
(main): Use key_init() to initialize gkey. Refactor out
default_key_compare(). Call key_warnings() in debug mode.
* doc/coreutils.texi (sort invocation): Mention that warnings
are output by --debug.
* tests/misc/sort-debug-warn: A new test for debug warnings.
* tests/Makefile.am: Reference the new test.
* NEWS: Mention the new feature
|
|
* src/sort (usage): Add description for --debug.
(write_bytes): Pass a line structure so it can subsequently
be passed to compare to highlight the keys when in debug mode.
Also transform TAB and NUL characters written to stdout so
that the highlighting in debug mode aligns correctly.
(human_numcompare): Pass an "endptr" so we can record the extent
of the number matched.
(general_numcompare): Likewise.
(find_unit_order): Likewise.
(getmonth): Likewise.
(numcompare): Likewise. Note we reuse find_unit_order() for this,
which is a good enough approximation, and means we don't need to
change the strnumcmp() interface.
(check_mixed_SI_IEC): Return whether iec_present, so that can be
used to set the "endptr" in find_unit_order. Also make the key
parameter optional, which will be the case from numcompare().
(count_tabs): A new function to determine how much to adjust
the mbswidth() values by (TABs don't have a width).
(mark_key): A new function to output the key highlighting to stdout.
(debug_key): A new function to determine the offset and width
of the key highlighting.
(key_compare): Pass the show_debug parameter so the key highlighting
is only displayed when explicitly called. For each key type, set
the length (lena) and whether leading blanks are auto skipped (skipb)
which are then used by debug_key() to highlight the portion of the
key used in the comparison.
(compare): Pass the show_debug parameter so the key highlighting
is only displayed when explicitly called. Call debug_key() to
highlight the last resort comparison.
(check): Output highlighting for disorder line to stdout.
(main): Process the --debug option and make it mutually exlusive
with the -o option as I don't see it useful there, even potentially
harmful if someone left a --debug in by mistake when updating a file.
Also restricting debug output to stdout, simplifies the logic
for dealing with temporary files.
* doc/coreutils.texi (sort invocation): Describe the --debug option,
and reference it from the --key description.
* tests/misc/sort-debug-keys: A new test for highlighting keys.
* tests/Makefile.am: Reference the new test.
* NEWS: Mention the new feature.
|
|
* src/dd.c (SA_NODEFER, SA_RESETHAND): Remove definitions,
now that gnulib guarantees they are defined in <signal.h>.
* src/ls.c (SA_RESTART): Likewise.
|
|
* src/timeout.c (WIFSIGNALED, WTERMSIG): Remove definitions,
now that gnulib guarantees they are defined in <sys/wait.h>.
* src/operand2sig.c: Likewise.
* src/kill.c: Likewise.
|
|
* src/sort.c (general_numcompare): Don't use long double if strtold
is not available, as it may introduce needless overhead.
|
|
* src/sort.c (general_numcompare): Use long doubles unconditionally,
and strtold when available, to convert numbers with greater range and
precision. Performance was seen to be on par with standard doubles.
* doc/coreutils.texi (sort invocation): Amend the -g description to
mention long double rather than double, and strtold rather than strtod.
* src/getlimits.c (main): Output floating point limits for use in tests.
* tests/misc/sort-float: A new test to ensure sort is using long
doubles when possible, and that locale specific floats are handled.
* tests/Makefile.am: Reference the new test.
* tests/test-lib.sh (getlimits_): Normalize indenting.
* NEWS: Mention the new behaviour.
Reported by Nelson Beebe.
|
|
* .x-sc_prohibit_always_true_header_tests: New file.
* Makefile.am (syntax_check_exceptions): Add it.
* src/cat.c: Remove #if HAVE_SYS_IOCTL_H test.
* src/copy.c: Likewise.
* src/ls.c: Likewise.
* src/stty.c: Likewise.
* src/install.c: Remove #if HAVE_SYS_WAIT_H test.
* src/kill.c: Likewise.
* src/operand2sig.c: Likewise.
* src/timeout.c: Likewise.
* src/pathchk.c: Remove #if HAVE_WCHAR_H test.
* src/stat.c: Remove #if HAVE_NETINET_IN_H test.
|
|
The value of `$?' on entrance to signal handlers in shell scripts
cannot be relied upon, so set the exit code explicitly.
* cfg.mk (sc_always_defined_macros, sc_system_h_headers): Set
the exit code in signal handler explicitly to 128 + SIG<SIGNAL>.
* src/Makefile.am (sc_tight_scope): Likewise.
* tests/test-lib.sh: Likewise.
|
|
Necessary for cygwin. Technically, this patch is not correct,
in that it clobbers O_APPEND, but it is no different than any
other use of xfreopen to force binary mode, so all such uses
should be fixed at once in a later patch.
* src/base64.c (main): Open input in binary mode.
* THANKS: Update.
Reported by Yutaka Amanai.
|
|
This regression was introduced in commit 224a69b5, 2009-02-24,
"sort: Fix two bugs with determining the end of field".
The specific regression being that we include 1 field too many when
an end field is specified using obsolescent key syntax (+POS -POS).
* src/sort.c (struct keyfield): Clarify the description of the eword
member, as suggested by Alan Curry.
(main): When processing obsolescent format key specifications,
normalize eword to a zero based count when no specific end char is given
for an end field. This matches what's done when keys are specified with -k.
* tests/misc/sort: Add a few more tests for the obsolescent key formats,
with test 07i being the particular failure addressed by this change.
* THANKS: Add Alan Curry who precisely identified the issue.
* NEWS: Mention the fix.
Reported by Santiago Rodríguez
|
|
* src/copy.c (copy_reg): Copy xattrs _after_ setting file ownership
so that capabilities are not cleared when setting ownership.
* tests/cp/capability: A new root test.
* tests/Makefile.am (root_tests): Reference the new test.
* NEWS: Mention the fix.
|