Age | Commit message (Collapse) | Author |
|
* src/sort.c (INPUT_FILE_SIZE_GUESS): Decrease initial allocation
factor used to size buffer used when reading a non-regular file.
For motivation, see discussion here:
http://thread.gmane.org/gmane.comp.gnu.coreutils.general/878/focus=887
|
|
* src/sort.c (SUBTHREAD_LINES_HEURISTIC): Do not spawn a new thread
for every 4 lines. Increase this from 4 to 128K. 128K lines seems
appropriate for a 5-year-old dual-core laptop, but it is too low for
some common combinations of short lines and/or newer systems.
* NEWS (Bug fixes): Mention it.
|
|
This case was overlooked in commit bdde34f9, 2010-08-05,
"sort: tune and refactor --debug code, and fix minor underlining bug"
* src/sort.c (debug_key): Don't adjust the key end when
it's before the key start.
* tests/misc/sort-debug-keys: Add a test case.
|
|
* src/pr.c (char_to_clump): Remove a dead store.
* src/remove.c (fts_skip_tree): Likewise.
* src/sort.c (key_warnings): Likewise.
(sort): Suppress an uninitialized pointer warning.
|
|
Run "make update-copyright".
|
|
* cfg.mk (LINE_LEN_MAX, FILTER_LONG_LINES): New macros.
(sc_long_lines): New rule.
* HACKING: Use shorter URLs to the same material.
* doc/Makefile.am, doc/coreutils.texi, m4/boottime.m4:
* man/help2man, man/stdbuf.x, src/Makefile.am, src/cat.c, src/copy.c:
* src/cp.c, src/dd.c, src/df.c, src/du.c, src/groups.c, src/install.c:
* src/ls.c, src/md5sum.c, src/mv.c, src/od.c, src/pinky.c, src/ptx.c:
* src/readlink.c, src/remove.c, src/rmdir.c, src/setuidgid.c:
* src/sort.c, src/tail.c, src/touch.c, tests/Coreutils.pm:
* tests/cp/existing-perm-race, tests/cp/perm, tests/cp/preserve-gid:
* tests/du/2g, tests/du/long-from-unreadable, tests/init.sh:
* tests/install/basic-1, tests/ls/nameless-uid:
* tests/ls/readdir-mountpoint-inode, tests/misc/chroot-credentials:
* tests/misc/cut, tests/misc/date, tests/misc/join, tests/misc/md5sum:
* tests/misc/sha1sum, tests/misc/sha224sum, tests/misc/sort:
* tests/misc/sort-continue, tests/misc/sort-files0-from:
* tests/misc/sort-rand, tests/misc/stdbuf, tests/misc/tr:
* tests/misc/uniq, tests/mv/atomic, tests/mv/part-fail:
* tests/mv/part-symlink, tests/mv/sticky-to-xpart, tests/pr/pr-tests:
* tests/rm/fail-2eperm, tests/rm/interactive-always:
Reformat to fit within 80 columns.
* doc/Makefile.am (BAD_POSIX_PERL): New macro.
* doc/coreutils.texi: Reword slightly, to make menus and
index lines shorter.
* src/md5sum.c: Redo --help output so that it fits within 79
columns, since that's a bit more portable and all the other --help
strings fit in 79 columns.
|
|
* src/sort.c (main): Don't invoke num_processors twice.
|
|
Also fix up Chen Guo's contacts
* src/sort.c (usage): Add a missing "of"
* THANKS: Add Chen Guo
* .mailmap: Add Chen Guo's UCLA address
|
|
* src/sort.c (main): If --parallel isn't specified,
restrict the number of threads to 8 by default.
If the --parallel option is specified, then
allow any number of threads to be set, independent
of the number of processors on the system.
* doc/coreutils.texi (sort invocation): Document the changes
to determining the number of threads to use.
Mention the memory overhead when using multiple threads.
* tests/misc/sort-spinlock-abuse: Allow single core
systems that support pthreads.
* tests/misc/sort-stale-thread-mem: Likewise.
* tests/misc/sort-unique-segv: Likewise.
* NEWS: Mention the change in behaviour.
|
|
Without this change, tests/misc/sort-compress-hang would consume
more than 10,000 process slots on my RHEL 5.5 x86-64 server,
making it likely for other applications to fail due to lack of
process slots. With this change, the same benchmark causes 'sort'
to consume at most 19 process slots. The change also improved
wall-clock time by 2% and user+system time by 14% on that benchmark.
* NEWS: Document this.
* src/sort.c (MAX_PROCS_BEFORE_REAP): Remove.
(reap_exited): Renamed from reap_some; this is a more accurate name,
since "some" incorrectly implies that it reaps at least one process.
All uses changed.
(reap_some): New function: it *does* reap at least one process.
(pipe_fork): Do not allow more than NMERGE + 2 subprocesses.
(mergefps, sort): Omit check for exited processes: no longer needed,
and anyway the code consumed too much CPU per line when 2 < nprocs.
|
|
* NEWS: Document this.
* src/sort.c (UNCOMPRESSED, UNREAPED, REAPED): New constants.
(struct tempnode): New member 'state', to hold these constants.
The pid member is now undefined if state == UNCOMPRESSED.
(struct sortfile): Replace member 'pid' with member 'temp'.
(uintptr): Remove.
(proctab_hasher, proctab_comparator, register_proc, delete_proc):
Proctab entries are now struct tempnode *, not pid_t, to handle
the case where multiple tempnode objects correspond to the same
pid. This avoids a race condition that can cause a hang.
(register_proc): Arg is now struct tempnode *, not pid_t. All
callers changed.
(delete_proc): Set tempnode state to REAPED.
(create_temp_file): No need to set pid member here; it's now
done when the pid is known.
(maybe_create_temp, create_temp): Remove PPID arg. Return struct
tempnode *, not char *. All callers changed.
(maybe_create_temp): Set node state to UNCOMPRESSED or UNREAPED.
No need to set node->pid to 0.
(open_temp): Replace NAME and PID args with a single TEMP arg.
All callers changed. Wait only for unreaped children.
(zaptemp): Wait for decompressor to finish before removing its
temporary-file input. This avoids .nfsXXXX hassles with NFS
and fixes a race (leading to a hang) regardless of NFS.
(open_input_files): Adjust to new way of dealing with temp files
and their subprocesses.
* tests/Makefile.am (TESTS): Add misc/sort-compress-hang.
* tests/misc/sort-compress-hang: New file.
|
|
* NEWS: Document this.
* src/sort.c (avoid_trashing_input): The previous fix to this
function didn't fix all the problems with this code. Replace it
with something simpler: just copy the input file. This doesn't
change the number of files, so return void instead of the updated
file count. Caller changed.
* tests/misc/sort-merge-fdlimit: Test for the bug.
|
|
* src/sort.c (avoid_trashing_input): Fix a typo that could cause a
buffer overrun in theory. In practice this is extremely unlikely,
as it requires running out of file descriptors in a small merge,
presumably because some other process is hogging all the OS's file
descriptors.
|
|
* src/sort.c (uintptr): New type.
(enum procstate, struct procnode, update_proc): Remove.
(proctab_hasher, proctab_comparator, register_proc, wait_proc):
(reap_some): The proctab is now simply a hash of process-IDs
rather than of pointers to objects with reference counts and
states; this is smaller and faster and easier to understand.
(nprocs): Now pid_t, not size_t, since one cannot have more than
PID_MAX children.
(reap): If the argument is -1, wait; if 0 (a new value), do not.
Delete pid from proctab as needed. Ignore children that are not
in proctab, as they are from the program that exec'ed us and are
irrelevant to our success or failure.
(delete_proc, reap_all): New functions.
(open_temp): Register the child.
(sort): Clean up all children afterwards; without this patch,
'sort' sometimes missed failures in children due to race conditions.
* tests/Makefile.am (TESTS): Add misc/sort-compress-proc.
* tests/misc/sort-compress-proc: New file, to test for the
bugs fixed above.
|
|
* src/sort.c (xfopen, debug_key, sortlines, sort, main): Adjust
formatting: fix misplaced braces, use consistent spacing,
split a 2-stmt line.
|
|
* src/sort.c (specify_nthreads, merge_tree_init, init_node):
(queue_init, sortlines, struct thread_args, sort, main):
Use size_t, not unsigned long int, for thread counts, since thread
counts are now used to compute sizes.
(specify_nthreads): Check for size_t overflow.
(merge_tree_init, sort): Shorten name of local variable, for
readability.
(merge_tree_init): Move constants next to each other in product,
so that the constant folding is easier to see.
(init_node): Now static. Add 'restrict' only where it might
be helpful for compiler optimization.
(queue_init): 2nd arg is now nthreads, not "reserve", which is
a bit harder to follow. All uses changed.
(struct thread_args): Rename lo_child to is_lo_child, so that
it's obvious to the reader when we're talking about this boolean
as opposed to the new lo_child member of the other structure.
All uses changed.
(sort): Remove unused local variable end_node.
(main): Don't allow large thread counts to cause undefined behavior
later, due to integer overflow.
|
|
* src/sort.c: (merge_tree_init) New function. Allocates memory for
merge tree nodes.
(merge_tree_destory) New function.
(init_node) New function.
(sortlines) Refactor node creation code to init_node. Remove now
superfluous arguments. All callers changed.
(sort) Initialize/destory merge tree. Refactor root node creation
to merge_tree_init.
|
|
* src/sort.c: Comment fix re spin locks.
|
|
Running a command like this on a multi-core system
sort < big-file | less
would peg all processors at near 100% utilization.
* src/sort.c: (struct merge_node) Change member lock to mutex.
All uses changed.
* tests/Makefile.am (XFAIL_TESTS): Remove definition, now that
this test passes once again. I.e., the sort-spinlock-abuse test
no longer fails.
* NEWS (Bug reports): Mention this.
Reported by DJ Lucas in http://debbugs.gnu.org/7489.
|
|
* src/sort.c (struct thread_args, sortlines_thread, sortlines, sort):
Rename "merge_queue" to "queue", for consistency with other functions
that just use the name "queue" for these things.
|
|
* src/sort.c (queue_check_insert): Clarify body a bit, and remove
no-longer-needed comment.
|
|
* src/sort.c (mergelines_node): Return void, not size_t. All
callers changed. Change *node->dest here, not in caller.
Do not change node->dest: it's not needed and could cause problems
on (mostly theoretical) hosts that do not allow adding integers to
null pointers.
(queue_check_insert_parent): Omit MERGED parameter; no longer needed.
All callers changed.
|
|
* src/sort.c (write_unique): Simplify slightly so that there is
just one call to write_line, not two.
|
|
* src/sort.c (queue_check_insert, queue_check_insert_parent): Make
the queue arg first, for consistency with other functions such as
queue_insert that put the queue arg first. Rename from
check_insert and update_parent, respectively. All callers
changed.
|
|
* src/sort.c (struct merge_node): 'lock' is now the actual lock,
not a pointer to the lock; there's no need for indirection here.
Make 'level' unsigned int instead of size_t, since it is a
bit-shift count; also, move it next to a bool so that it's more
likely to take less space. All uses changed.
(sortlines, sort): Spell out initialization instead of using an
initializer. This makes the initializer a bit easier to understand,
and avoids unnecessary stores into the spin lock.
|
|
* src/sort.c: Improve comments a bit.
|
|
* src/sort.c (MAX_MERGE): Avoid integer overflow when on a machine
with (say) 32-bit int and 64-bit size_t and when level == 15.
Without this fix, on such a machine with 32768 or more processors,
the level computation could overflow on large input, and this
would result in division by zero.
|
|
* src/sort.c (write_unique): Save the entire "struct line", not
just a pointer to one. Otherwise, with a multi-thread run,
sometimes, with some inputs, fillbuf would would win a race
and clobber a "saved->text" pointer in one thread just before
it was dereferenced in a comparison in another thread.
* NEWS (Bug fixes): Mention it.
|
|
obsolete keys
* src/sort.c (key_warnings): Local buffer should be of size
INT_BUFSIZE_BOUND (uintmax_t), not INT_BUFSIZE_BOUND (sword).
This bug was discovered by running 'make check' on a 32-bit
Solaris 8 sparc host, using Sun cc. I saw several other instances
of invoking umaxtostr on a buffer declared to be of size
INT_BUFSIZE_BOUND (VAR), and these instances should at some point
be replaced by INT_BUFSIZE_BOUND (uintmax_t) too, as that's a
less error-prone style.
|
|
* src/sort.c (fold_toupper): Change this back from char to
unsigned char, fixing a portability issue introduced in commit
59e2e55d0f154a388adc9bac37d2b45f2ba971f8 dated February 26, as the
C Standard doesn't let you convert from unsigned char * to char *
without a cast, and the (in theory more portable) style here is to
convert char values, not pointer values.
(getmonth): Convert char to unsigned char when needed for
comparison.
|
|
* src/sort.c (sortlines, sort): Use pthread_spin_destroy when a
spin lock is no longer used. This isn't needed on GNU/Linux or
Solaris, but POSIX says it may free up resources on some platforms.
|
|
This change was prompted by the previous one: I audited the code
looking for similar examples. Too bad valgrind doesn't catch this.
* src/sort.c (check, mergefps): xrealloc -> free + xmalloc
* src/who.c (print_user): Likewise.
|
|
* src/sort.c (compare_random): Use free/xmalloc rather than
xrealloc, since the old buffer contents need not be preserved.
Also, don't fail if the guessed-sized malloc fails. Suggested by
Bruno Haible.
|
|
* src/sort.c: Include "ignore-value.h".
(debug_key): Use ignore_value.
|
|
* src/sort.c (compare_random): Guess that the output will be
3X the input. This avoids the overhead of calling strxfrm
twice on typical implementations. Suggested by Bruno Haible.
|
|
* NEWS: Document this.
* src/sort.c (getmonth): Omit LEN arg, as MONTH is now null-terminated.
(compare_random): Don't null-terminate keys, as caller now does that.
(compare_version): Remove.
(debug_key): Null-terminate string for getmonth.
(keycompare): Support combining -R with any of -d, -f, -i, -V.
Also, support combining -V with any of -d, -i.
(check_ordering_compatibility): Allow newly-supported combinations.
* tests/misc/sort (02q, 02r, 02s): New tests, for new combinations.
(incompat2): Now test -nR, since -fR are now compatible.
|
|
Formerly, the 'compare' function and some of its subroutines had a
debugging flag, which caused them to output underlines. This
change refactors the code so that debugging output is
more-separated from the actual sorting. In the process, the
change fixes a minor error in the debugging output. The change
shortens the source code and executable size a tad, and improves
CPU performance by 2.4% on my platform with a simple benchmark (C
locale, line sorting, no debug).
* src/sort.c (long_double, strtold): Move back to prelude, since
they're now used by multiple functions again.
(unit_order): Move to file scope, since it's now used by two functions.
(find_unit_order, human_numcompare, numcompare, general_numcompare):
Remove endptr parameter. All callers changed.
(human_numcompare): Args are now const pointers.
(getmonth): Endptr is now non-const.
(key_numeric): Move up, since it's needed earlier.
(debug_key): Take a line and a key as argument, instead of having
the caller figure out where the field is.
(debug_line): New function.
(keycompare, compare): Omit debug parameter; debug output now done
elsewhere. All callers changed.
(write_line): Renamed from write_bytes; all callers changed.
Use debug_line (not 'compare') to output debug info.
Use a slightly faster check for whether output file is stdout.
(check): Don't do debugging output; it's not that useful here,
and it confuses the code.
(main): Check for incompatibility between -c and --debug.
Use standard diagnostic for incompatible options.
* tests/misc/sort-debug-keys: Fix test case: "--Mi-1" is not
a number, so its first character should not be underlined when
debugging a numeric sort.
|
|
* lib/Makefile.am (libcoreutils_a_SOURCES): Remove xmemxfrm.c,
xmemxfrm.h.
* lib/memxfrm.c, lib/memxfrm.h, lib/xmemxfrm.c, lib/xmemxfrm.h: Remove.
* m4/memxfrm.m4: Likewise.
* m4/prereq.m4 (gl_PREREQ): Remove gl_MEMXFRM.
* po/POTFILES.in: Remove lib/xmemxfrm.c.
* src/sort.c: Don't include xmemxfrm.h.
(cmp_hashes): Remove.
(xstrxfrm): New function.
(compare_random): If a line contains NULs, don't create a big
buffer that contains the strxfrm output of each string in the line.
Instead, accumulate checksums and differences as we go, so that
at any one time we have to store at most the output of a single
strxfrm call when processing the line. This removes the need for
an memxfrm function.
|
|
* src/sort.c (debug_width): New function, which does not stop
counting tabs at \0, and also invokes mbsnwidth. Stamp out strnlen!
(count_tabs): Remove.
(debug_key): Use debug_width instead of mbsnwidth and count_tabs.
* tests/misc/sort-debug-keys: Check that \0 and \t intermix.
|
|
* NEWS: Document changes to sort -h, which are now minor with
respect to the pre-July-30th version.
* doc/coreutils.texi (sort invocation): Likewise. The
documentation now describes how -h comparison is done rather than
being vague with border cases.
* src/sort.c (long_double, strtold): Move back to general_numcompare.
(LD, compute_human): Remove.
(find_unit_order): Remove THOU_SEP parameter, since thousands
separators are now allowed by all callers. Revert to previous
behavior of sorting by suffix, and returning the order rather than
2 * order + binary, since we no longer care whether binary powers
are being used. However, treat all zeros the same, instead of
sorting 0M before 0G; this is more consistent with the desired
behavior of sorting -1G before -1M.
* tests/misc/sort (h1, h3, h6): Adjust to match mostly-reverted
behavior. However, check that all zeros sort together.
* tests/misc/sort-debug-keys: Omit a "_", since the trailing "i"
in "1234Gi" is no longer part of the key.
|
|
* NEWS: Document changes to sort -h.
* doc/coreutils.texi (sort invocation): Likewise.
* src/sort.c (long_double, strtold): Move to prelude, since they're
now used by multiple functions.
(LD): New macro.
(struct keyfield.iec_present): Remove this member. All uses removed.
(check_mixed_SI_IEC): Remove. This code was busted in the presence
of multiple threads, as it had a race condition.
(find_unit_order): Remove arg KEY; add arg THOU_SEP; arg ENDPTR is
now char ** rather than char const **. Return an integer that
distinguishes decimal from binary powers. Parse the number
consistently with the intersection of strtold and strnumcmp.
Set *ENDPTR unconditionally.
(compute_human): New static function.
(human_numcompare): Remove arg KEY. Remove 'const' from other args.
Use strnumcmp if possible, but fall back on floating point if not.
(numcompare, general_numcompare): Arg EA is now char ** rather
than char const **.
(numcompare): Adjust to new find_unit_order signature and behavior.
(keycompare): Adjus to new human_numcompare signature.
* tests/misc/sort (h1, h3, h4, h6): Adjust to new behavior.
* tests/misc/sort-debug-keys: Likewise.
|
|
* src/sort.c (mark_key): Don't assume offset <= INT_MAX.
Make the code a bit clearer when width != 0.
|
|
* src/sort.c (fillbuf): Don't append eol unless the line is nonempty.
This fixes a bug that was partly but not completely fixed by
the aadc67dfdb47f28bb8d1fa5e0fe0f52e2a8c51bf commit (dated July 15).
* tests/misc/sort (realloc-buf-2): New test, which catches this
bug on 64-bit hosts.
|
|
* src/sort.c (mergelines, queue_destroy, queue_init, queue_insert):
(queue_pop, write_unique, mergelines_node, check_insert):
(update_parent): No longer inline; these uses of "inline"
seemed unlikely to help performance much.
|
|
* src/sort.c (find_unit_order): Don't assume ASCII.
|
|
* src/sort.c (queue_pop): Omit unnecessary unlock+lock after
pthread_cond_wait returns. Don't access "count" member of the
heap; any efficiency gains should be quite minor, the access
complicates this code, and "count" should be private anyway.
|
|
* src/sort.c (lock_node, unlock_node, queue_destroy, queue_init):
(queue_pop):
Omit 'restrict'; it shouldn't help here, as these functions have just
one pointer parameter and don't access static storage.
(queue_insert, check_insert, update_parent): Omit 'restrict', as
the pointer types differ, and are not char * or unsigned char *,
and therefore can't alias.
(write_unique): Omit 'restrict', as the pointer types are all
read-only.
(merge_loop, sortlines): Omit 'restrict', as any performance
advantages are extremely unlikely and it's not worth cluttering
the code for that.
(struct thread_args): Omit 'restrict': this seems to be incorrect.
It's unlikely for 'restrict' to be correct inside a typedef.
|
|
* src/sort.c (inittables, general_numcompare, compare_nodes):
(queue_init, queue_pop): Omit casts that are not needed, typically
because they are between void * and some other pointer type.
|
|
* src/sort.c (proctab_hasher, proctab_comparator, stream_open, xfopen):
(open_temp, zaptemp, struct_month_cmp, begfield, limfield):
(find_unit_order, human_numcompare, numcompare, general_numcompare):
(count_tabs, keycompare, compare, compare_nodes, lock_node):
(unlock_node, queue_destroy, queue_init, queue_insert, queue_pop):
(write_unique, mergelines_node, check_insert, update_parent):
(merge_loop, sortlines, struct thread_args, set_ordering):
Prefer the style "T const" to "const T".
* gl/lib/heap.h (struct heap, heap_alloc): Likewise.
* gl/lib/heap.c (heap_default_compare, heapify_down, heapify_up):
(heap_alloc): Likewise.
|
|
* bootstrap.conf: Include the new module
* gl/lib/fadvise.c: Provide a simpler interface to posix_fadvise.
(fadvise): Provide hint to the whole file associated with a stream.
(fdadvise): Provide hint to the specific portion of a file
associated with a file descriptor.
* gl/lib/fadvise.h: Redefine POSIX_FADV_* to FADVISE_* enums.
* gl/modules/fadvise: New file.
* m4/jm-macros.m4: Remove the no longer needed posix_fadvise check.
* .x-sc_program_name: Exclude test-fadvise.c from this check.
* gl/tests/test-fadvise (main): New test program.
* gl/modules/fadvise-testss: A new index to reference the tests.
* src/sort.c (stream_open): Use the new interface.
* src/dd.c (iwrite): Likewise.
|