summaryrefslogtreecommitdiff
path: root/src/sort.c
AgeCommit message (Collapse)Author
2015-04-30doc: standardize messages about the '-' stdin FILEPádraig Brady
* src/system.h (emit_stdin_note): A new function, refactoring the usage note about the '-' FILE implying stdin. * src/base64.c (usage): Use the new function to emit the note in a standard location and with standard separation. * src/cat.c (usage): Likewise. * src/csplit.c (usage): Likewise. * src/cut.c (usage): Likewise. * src/expand.c (usage): Likewise. * src/fmt.c (usage): Likewise. * src/head.c (usage): Likewise. * src/md5sum.c (usage): Likewise. * src/nl.c (usage): Likewise. * src/od.c (usage): Likewise. * src/paste.c (usage): Likewise. * src/pr.c (usage): Likewise. * src/ptx.c (usage): Likewise. * src/shred.c (usage): Likewise. * src/shuf.c (usage): Likewise. * src/sort.c (usage): Likewise. * src/sum.c (usage): Likewise. * src/tac.c (usage): Likewise. * src/tail.c (usage): Likewise. * src/tsort.c (usage): Likewise. * src/unexpand.c (usage): Likewise. * src/wc.c (usage): Likewise. * src/join.c (usage): Adjust the separation used for the message referring to FILE1 or FILE2 as stdin. * src/comm.c (usage): Add a message using the same wording (translation) as used in join. * src/split.c (usage): Reword to using FILE rather than INPUT, allowing use of emit_stdin_note(). Also remove the mention of "fixed-size" pieces as this isn't now always the case. Fixes http://pad.lv/1450179
2015-01-01maint: update all copyright year number rangesPádraig Brady
Run "make update-copyright" and then... * tests/sample-test: Adjust to use the single most recent year. * tests/du/bind-mount-dir-cycle-v2.sh: Fix case in copyright message, so that year is updated automatically in future.
2014-09-19doc: output correct --help references with --program-prefixPádraig Brady
* src/system.h (emit_ancillary_info): Take the invariant PROGRAM_NAME as a parameter, so that consistent references are made to online docs and texinfo nodes, when a --program-prefix is in place. Note the man pages don't need this fix as they're generated before the program prefix is used. * NEWS: Mention the improvements in references to online documentation.
2014-09-08maint: prefer 'return status;' to 'exit (status);' in 'main'Paul Eggert
* build-aux/gen-single-binary.sh: Don't use ATTRIBUTE_NORETURN for main functions. * src/base64.c, src/basename.c, src/cat.c, src/chcon.c, src/chgrp.c: * src/chmod.c, src/chown.c, src/chroot.c, src/cksum.c, src/comm.c: * src/cp.c, src/csplit.c, src/cut.c, src/date.c, src/dd.c, src/df.c: * src/dircolors.c, src/dirname.c, src/du.c, src/echo.c, src/env.c: * src/expand.c, src/expr.c, src/factor.c, src/fmt.c, src/fold.c: * src/getlimits.c, src/groups.c, src/head.c, src/hostid.c: * src/hostname.c, src/id.c, src/install.c, src/join.c, src/kill.c: * src/link.c, src/ln.c, src/logname.c, src/ls.c, src/make-prime-list.c: * src/md5sum.c, src/mkdir.c, src/mkfifo.c, src/mknod.c, src/mktemp.c: * src/mv.c, src/nice.c, src/nl.c, src/nohup.c, src/nproc.c: * src/numfmt.c, src/od.c, src/paste.c, src/pathchk.c, src/pinky.c: * src/pr.c, src/printenv.c, src/printf.c, src/ptx.c, src/pwd.c: * src/readlink.c, src/realpath.c, src/rm.c, src/rmdir.c, src/runcon.c: * src/seq.c, src/shred.c, src/shuf.c, src/sleep.c, src/sort.c: * src/split.c, src/stat.c, src/stdbuf.c, src/stty.c, src/sum.c: * src/sync.c, src/tac.c, src/tail.c, src/tee.c, src/timeout.c: * src/touch.c, src/tr.c, src/true.c, src/truncate.c, src/tsort.c: * src/tty.c, src/uname.c, src/unexpand.c, src/uniq.c, src/unlink.c: * src/uptime.c, src/users.c, src/wc.c, src/who.c, src/whoami.c: In 'main' functions, Prefer 'return status;' to 'exit (status);'. * src/coreutils-arch.c (_single_binary_main_uname) (_single_binary_main_arch): * src/coreutils-dir.c, src/coreutils-vdir.c (_single_binary_main_ls) (_single_binary_main_dir, _single_binary_main_vdir): Omit ATTRIBUTE_NORETURN. Return a value. * src/coreutils.c (SINGLE_BINARY_PROGRAM): Omit ATTRIBUTE_NORETURN. (launch_program): Now static. * src/dd.c (finish_up): New function. (quit, main): Use it. * src/getlimits.c (main): Return a proper exit status. * src/test.c (test_main_return): New macro. (main): Use it. * src/logname.c, src/nohup.c, src/whoami.c: Use 'error' to simplify exit status in 'main' function. * src/yes.c (main): Use 'return' rather than 'error' to exit, so that GCC doesn't suggest ATTRIBUTE_NORETURN.
2014-07-13sort: avoid undefined operation with destroying locked mutexPádraig Brady
This didn't seem to cause any invalid operation on GNU/Linux at least, but depending on the implementation, mutex deadlocks could occur. For example this might be the cause of lockups seen on Solaris: http://lists.gnu.org/archive/html/coreutils/2013-03/msg00048.html This was identified with valgrind 3.9.0 with this setup: seq 200000 > file.sort valgrind --tool=drd src/sort file.sort -o file.sort With that, valgrind would _intermittently_ report the following: Destroying locked mutex: mutex 0x5419548, recursion count 1, owner 2. at 0x4C2E3F0: pthread_mutex_destroy(in vgpreload_drd-amd64-linux.so) by 0x409FA2: sortlines (sort.c:3649) by 0x409E26: sortlines (sort.c:3621) by 0x40AA9E: sort (sort.c:3955) by 0x40C5D9: main (sort.c:4739) mutex 0x5419548 was first observed at: at 0x4C2DE82: pthread_mutex_init(in vgpreload_drd-amd64-linux.so) by 0x409266: init_node (sort.c:3276) by 0x4092F4: init_node (sort.c:3286) by 0x4090DD: merge_tree_init (sort.c:3234) by 0x40AA5A: sort (sort.c:3951) by 0x40C5D9: main (sort.c:4739) Thread 2: The object at address 0x5419548 is not a mutex. at 0x4C2F4A4: pthread_mutex_unlock(in vgpreload_drd-amd64-linux.so) by 0x4093CA: unlock_node (sort.c:3323) by 0x409C85: merge_loop (sort.c:3531) by 0x409F8F: sortlines (sort.c:3644) by 0x409CE3: sortlines_thread (sort.c:3574) by 0x4E44F32: start_thread (in /usr/lib64/libpthread-2.18.so) by 0x514EEAC: clone (in /usr/lib64/libc-2.18.so) * src/sort.c (sortlines): Move pthread_mutex_destroy() out to merge_tree_destroy(), so that we don't overlap mutex destruction with threads still operating on the nodes. (sort): Call the destructors only with "lint" defined, as the memory used will be deallocated implicitly at process end. * NEWS: Mention the bug fix.
2014-07-13sort: fix two threading issues reported by valgrindShayan Pooya
Neither issue impacts on the correct operation of sort. The issues were detected by both valgrind 3.8.1 and 3.9.0 using: seq 200000 > file.sort valgrind --tool=drd src/sort file.sort -o file.sort For tool usage and error details see: http://valgrind.org/docs/manual/drd-manual.html * src/sort.c (queue_insert): Unlock mutex _after_ signalling the associated condition variable. Valgrind flags this with: "Probably a race condition: condition variable 0xffeffffb0 has been signaled but the associated mutex 0xffeffff88 is not locked by the signalling thread." The explanation at the above URL is: "Sending a signal to a condition variable while no lock is held on the mutex associated with the condition variable. This is a common programming error which can cause subtle race conditions and unpredictable behavior." This should at least give more defined scheduling behavior. (merge_tree_destroy): Make symmetrical with merge_tree_init() thus destroying the correct mutex. Valgrind flags this with: "The object at address 0x5476cf8 is not a mutex."
2014-05-26doc: clarify --zero-terminated optionPádraig Brady
* src/join.c (usage): Reword to avoid implication that the NUL byte is only generated as the output delimeter. * src/sort.c (usage): Likewise. * src/shuf.c (usage): Likewise. Also since we're changing the translation string take the opportunity to separate out the description to a separate string to reduce translation overhead. * src/uniq.c (usage): Likewise. * src/stty.c (usage): s/null/NUL/ for consistency. * src/basename.c (usage): Reword for accuracy/consistency. * src/dirname.c (usage): Likewise. * src/du.c (usage): Likewise. * src/env.c (usage): Likewise. * src/printenv.c (usage): Likewise. * src/readlink.c (usage): Likewise. * src/realpath.c (usage): Likewise. * doc/coreutils.texi: Consolidate/share the descriptions of --null, --zero and --zero-terminated.
2014-01-02maint: update all copyright year number rangesBernhard Voelker
Run "make update-copyright", but then also run this, perl -pi -e 's/2\d\d\d-//' tests/sample-test to make that one script use the single most recent year number.
2013-11-27sort: avoid issues when issuing diagnostics from child processesPádraig Brady
* src/sort.c: (async_safe_die): A new limited version of error(), that outputs fixed strings and unconverted errnos to stderr. This is safe to call in the limited context of a signal handler, or in this particular case, between the fork() and exec() of a multithreaded process. (move_fd_or_die): Use the async_safe_die() rather than error(). (maybe_create_temp): Likewise. (open_temp): Likewise. Fixes http://bugs.gnu.org/15970
2013-09-01maint: update out of date confusing commentsPádraig Brady
* src/copy.c (copy_internal): Change mention of the removed --reply=no option, to the similar in this context --no-clobber. * src/sort.c: SI and IEC suffixes can now be mixed when --human-numeric.
2013-05-18maint: port --enable-gcc-warnings to clangPaul Eggert
* configure.ac: If clang, add -Wno-format-extra-args and -Wno-tautological-constant-out-of-range-compare. * gl/lib/rand-isaac.c (ind): * gl/lib/randread.c (readisaac): * src/ls.c (dev_ino_push, dev_ino_pop): * src/sort.c (buffer_linelim): * src/system.h (is_nul): * src/tail.c (tail_forever_inotify): Rewrite to avoid casts that clang dislikes. It's good to avoid casts anyway. * src/expr.c (integer_overflow): Declare only if it exists. (die): Remove; unused. * src/ls.c (dev_ino_push): New function, replacing ... (DEV_INO_PUSH): ... this removed macro. All uses changed. (decode_switches): Rewrite "str"+i to &str[i].
2013-01-23maint: define usage note about mandatory args centrallyBernhard Voelker
Each program with at least one long option which is marked as 'required_argument' and which has also a short option for that option, should print a note about mandatory arguments. Define that well-known note centrally and use it rather than literal printf/fputs, and add it where it was missing. * src/system.h (emit_mandatory_arg_note): Add new function. * src/cp.c (usage): Use it rather than literal printf/fputs. * src/csplit.c, src/cut.c, src/date.c, src/df.c, src/du.c: * src/expand.c, src/fmt.c, src/fold.c, src/head.c, src/install.c: * src/kill.c, src/ln.c, src/ls.c, src/mkdir.c, src/mkfifo.c: * src/mknod.c, src/mv.c, src/nl.c, src/od.c, src/paste.c: * src/pr.c, src/ptx.c, src/shred.c, src/shuf.c, src/sort.c: * src/split.c, src/stdbuf.c, src/tac.c, src/tail.c, src/timeout.c: * src/touch.c, src/truncate.c, src/unexpand.c, src/uniq.c: Likewise. * src/base64.c (usage): Add call of the above new function because at least one long option has a required argument. * src/basename.c, src/chcon.c, src/date.c, src/env.c: * src/nice.c, src/runcon.c, src/seq.c, src/stat.c, src/stty.c: Likewise.
2013-01-01maint: update all copyright year number rangesJim Meyering
Run "make update-copyright", but then also run this, perl -pi -e 's/2\d\d\d-//' tests/sample-test to make that one script use the single most recent year number.
2012-08-18sort: simpler fix for sort -u data-loss bug, and for a FMR bugPaul Eggert
This also fixes a free-memory-read (FMR) bug: when fillbuf's realloc of buf->buf frees the buffer into which saved_line.text points, the processing of that just-read longer line includes comparison against the saved line in freed memory. * src/sort.c (overlap): Remove. (fillbuf): Do not try to copy saved lines, as that is too risky in the presence of parallelism, reallocated buffers, etc. (sort): Invalidate any saved line before sorting a new batch.
2012-08-17sort: sort --unique (-u) could cause data lossJim Meyering
sort -u could omit one or more lines of expected output. This bug arose because sort recorded the most recently printed line via reference, and if you were unlucky, the storage for that line would be reused (overwritten) as additional input was read into memory. If you were doubly unlucky, the new value of the "saved" line would not only match the very next line, but if that next line were also the first in a series of identical, not-yet-printed lines, then the corrupted "saved" line value would result in the omission of all matching lines. * src/sort.c (saved_line): New static/global, renamed and moved from... (write_unique): ...here. Old name was "saved", which was too generic for its new role as file-scoped global. (fillbuf): With --unique, when we're about to read into a buffer that overlaps the saved "preceding" line (saved_line), copy the line's .text member to a realloc'd-as-needed temporary buffer and adjust the line's key-defining members if they're set. (overlap): New function. * tests/misc/sort: New tests. * NEWS (Bug fixes): Mention it. * THANKS.in: Update. Bug introduced via commit v8.5-89-g9face83. Reported by Rasmus Borup Hansen in http://thread.gmane.org/gmane.comp.gnu.coreutils.bugs/23173/focus=24647
2012-08-16maint: correct a stale comment in sort.cJim Meyering
* src/sort.c (fillbuf): Fix comment typo. x2nrealloc no longer doubles the size of its input buffer.
2012-07-10sort: by default, do not exceed 3/4 of physical memoryPaul Eggert
* src/sort.c (default_sort_size): Do not exceed 3/4 of total memory. See Jeff Janes's bug report in <http://lists.gnu.org/archive/html/coreutils/2012-06/msg00018.html>.
2012-07-02sort: fix exit-status typoPaul Eggert
* src/sort.c (stream_open): EXIT_FAILURE -> SORT_FAILURE. Suggested by Pádraig Brady in <http://bugs.gnu.org/11816#34>.
2012-07-02sort: simplify -o handling to avoid fdopen, assertPaul Eggert
* src/sort.c (outfd): Remove. All uses replaced by STDOUT_FILENO. (stream_open): When writing, use stdout rather than fdopen. (move_fd_or_die): Renamed from dup2_or_die, with the added functionality of closing its first argument. All uses changed. (avoid_trashing_input): Special case for !outfile no longer needed. (check_output): Arrange for standard output to go to the file, rather than storing the fd in outfd.
2012-07-02sort: avoid redundant processing with inaccessible inputs or outputPádraig Brady
* src/sort.c (check_inputs): A new function to verify all inputs are accessible before further processing. (check_output): A new function to open or create a specified output file, before futher processing. (stream_open): Adjust to truncating the previously opened output file rather than opening directly. (avoid_trashing_input): Optimize to stat the output file descriptor, rather than the file name. (main): Call the new functions to check accessibility of inputs and output, before processing starts. * tests/misc/sort: Adjust to the changed error message. * tests/misc/sort-merge-fdlimit: Account for the earlier opened file descriptor of the specified output file. * tests/misc/sort-exit-early: A new test to exercise the improvements. * tests/Makefile.am: Reference the new test. * NEWS: Mention the improvement. Suggested-by: Bernhard Voelker
2012-06-20maint: sort: style adjustment to help clarify size determinationBernhard Voelker
* src/sort.c (default_sort_size): Move physmem code "down" to first use.
2012-05-16maint: add assertions to placate static analysis toolsJim Meyering
A static analysis tool (http://labs.oracle.com/projects/parfait/) produced some false positive diagnostics. Add assertions to help it understand that the code is correct. * src/stty.c: Include <assert.h>. (display_changed): Add an assertion to placate parfait. (display_all): Likewise. * src/sort.c: Include <assert.h>. (main): Add an assertion to placate parfait. * src/fmt.c: Include <assert.h>. (get_paragraph): Add an assertion to placate parfait.
2012-05-04maint: rely on gnulib's new sys_resource moduleJim Meyering
* bootstrap.conf (gnulib_modules): Add sys_resource. * src/sort.c: Remove #if HAVE_SYS_RESOURCE_H guard around inclusion of <sys/resource.h> and move the inclusion "up" into the alphabetized list of its peers. This also avoids a failure of the sc_prohibit_always_true_header_tests syntax-check rule. * m4/jm-macros.m4 (gl_CHECK_ALL_HEADERS): Remove sys/resource.h.
2012-02-25sort: default to physmem/8, not physmem/16Paul Eggert
* src/sort.c (default_sort_size): Don't divide advice by 2. Just divide the hard limits by 2. This matches the comments. Reported by Rogier Wolff in http://bugs.gnu.org/10877
2012-01-30maint: sort: remove the last uses of "'%s'" in diagnosticsJim Meyering
* src/sort.c (key_warnings): Use quote (quote_n, since there are two) rather than literal single quotes ('%s') in diagnostic.
2012-01-27maint: use single copyright year rangeJim Meyering
Run "make update-copyright".
2012-01-22maint: quote 'like this' or "like this", not `like this'Paul Eggert
* doc/coreutils.texi (Formatting the file names): coreutils now quotes 'like this'. * man/help2man: * src/timeout.c (usage): Quote 'like this' in diagnostics. * HACKING, Makefile.am, NEWS, README, README-hacking, TODO, cfg.mk: * doc/Makefile.am, doc/coreutils.texi, m4/jm-macros.m4: * man/Makefile.am, man/help2man, src/Makefile.am, src/copy.h: * src/extract-magic, src/ls.c, src/pinky.c, src/pr.c, src/sort.c: * src/split.c, src/timeout.c, src/who.c, tests/dd/skip-seek-past-file: * tests/pr/pr-tests: Quote 'like this' in commentary. * cfg.mk (old_NEWS_hash): Update due to changed old NEWS.
2012-01-09maint: src/*.[ch]: convert more `...' to '...'Jim Meyering
Run this (twice): git grep -E -l '`.+'\' src/*.[ch] \ |xargs perl -pi -e 's/`(.+?'\'')/'\''$1/'
2012-01-09maint: src/*.c: change remaining quotes (without embedded spaces)Jim Meyering
Run this (twice): git grep -E -l '`[^ ]+'\' src/*.c \ |xargs perl -pi -e 's/`([^ ]+'\'')/'\''$1/'
2012-01-09maint: convert `...' to '...' in --help outputJim Meyering
All affected lines end with \ or \n\, so run this command until it produces no new changes (4 times): git grep -E -l '`[^ ]+'\''.*\\' src \ |xargs perl -pi -e 's/`([^ ]+'\''.*\\)/'\''$1/'
2012-01-09maint: adjust quoting: emit '...', not `...' in diagnosticsJim Meyering
* src/csplit.c (parse_repeat_count, extract_regexp): As above. * src/date.c (main): Likewise. * src/ls.c (decode_switches): Likewise. * src/od.c (decode_one_format, main): Likewise. * src/pathchk.c (no_leading_hyphen): Likewise. * src/pr.c (main, getoptarg): Likewise. * src/rm.c (diagnose_leading_hyphen): Likewise. * src/sort.c (key_warnings, incompatible_options, main): Likewise. * src/stat.c (print_esc_char): Print '\x', not `\x' in diagnostic. * src/test.c (main): Likewise. * src/touch.c (main): Likewise. * src/tr.c (build_spec_list, validate, append_range): Likewise. * tests/misc/mktemp: This is an unusual case, since the affected string contains only the ` of an `...' string. So we change the long ` to a lone '. * tests/pr/pr-tests: Manual quote adapting fix-up. * tests/ln/hard-to-sym: Likewise. * tests/split/suffix-length: Likewise. * tests/mv/part-fail: Likewise. * tests/misc/chcon: Likewise. * tests/misc/stat-printf: Likewise.
2012-01-07maint: use new emit_try_help in place of equivalent fprintfJim Meyering
Run this command: perl -0777 -pi -e \ 's/fprintf \(stderr, _\("Try `%s --help.*\n.*;/emit_try_help ();/m'\ src/*.c
2012-01-01maint: update all copyright year number rangesJim Meyering
Run "make update-copyright".
2011-12-05maint: sort, stat: remove unused parametersJim Meyering
* src/sort.c (struct thread_args) [is_lo_child]: Remove member. (sortlines): Remove unused parameter, "is_lo_child". Update callers. * src/stat.c (out_epoch_sec): Mark unused parameter. (do_statfs, do_stat): Remove unused parameter, "terse". Update callers.
2011-11-16sort: clarify wording on -k syntaxEric Blake
* src/sort.c (usage): Use KEYDEF instead of POS, and call out the specific OPTS that can occur in KEYDEF. Based on a report by Lars Noodén, http://bugs.gnu.org/10019
2011-09-27sort: avoid a NaN-induced infloopJim Meyering
These commands would fail to terminate: yes -- -nan | head -156903 | sort -g > /dev/null echo nan > F; sort -m -g F F That can happen with any strtold implementation that includes uninitialized data in its return value. The problem arises in the mergefps function when bubble-sorting the two or more lines, each from one of the input streams being merged: compare(a,b) returns 64, yet compare(b,a) also returns a positive value. With a broken comparison function like that, the bubble sort never terminates. Why do the long-double bit strings corresponding to two identical "nan" strings not compare equal? Because some parts of the result are uninitialized and thus depend on the state of the stack. For more details, see http://bugs.gnu.org/9612. * src/sort.c (nan_compare): New function. (general_numcompare): Use it rather than bare memcmp. Reported by Aaron Denney in http://bugs.debian.org/642557. * NEWS (Bug fixes): Mention it. * tests/misc/sort-NaN-infloop: New file. * tests/Makefile.am (TESTS): Add it.
2011-07-02maint: use "const" and "pure" function attributes where possibleJim Meyering
* configure.ac (WARN_CFLAGS): Add -Wsuggest-attribute=const, -Wsuggest-attribute=pure and -Wsuggest-attribute=noreturn. (GNULIB_WARN_CFLAGS): But do not add them here... yet. * src/chown-core.h (chopt_free, uid_to_name): Add function attribute(s). * src/copy.c (is_ancestor, valid_options): Likewise. * src/copy.h (chown_failure_ok): Likewise. * src/dd.c (operand_matches, operand_is): Likewise. * src/df.c (selected_fstype, excluded_fstype): Likewise. * src/expr.c (null looks_like_integer): Likewise. * src/md5sum.c (hex_digits): Likewise. * src/od.c (get_lcm): Likewise. * src/pathchk.c (component_start, component_len): Likewise. * src/pinky.c (count_ampersands): Likewise. * src/pr.c (cols_ready_to_print): Likewise. * src/ptx.c (search_table): Likewise. * src/sort.c (find_unit_order): Likewise. * src/stty.c (mode_type_flag, string_to_baud, baud_to_value): Likewise. * src/system.h (gcd, lcm): Likewise. * src/tr.c (is_char_class_member, look_up_char_class): Likewise. (star_digits_closebracket): Likewise. * src/uniq.c (find_field): Likewise. * src/wc.c (compute_number_width): Likewise. * lib/xfts.h (cycle_warning_required): Likewise. * gl/lib/randint.h (randint_get_source): Likewise. * gl/lib/randperm.c (ceil_lg): Likewise. * gl/lib/randperm.h (randperm_bound): Likewise. * lib/strnumcmp.h (strintcmp): Likewise.
2011-05-19maint: correct typos involving misuse of "a" and "an"Jim Meyering
* NEWS: "an misleading" * src/expr.c: "a integer * src/ptx.c (find_occurs_in_text): "a end" * src/shred.c (do_wipefd): "a infinite" * src/sort.c (SUBTHREAD_LINES_HEURISTIC): "an dual-core" (compare_random): "an checksum" * cfg.mk (old_NEWS_hash): Update, since the typo was in old news.
2011-05-06sort: fix a contradictory --debug warningPádraig Brady
* src/sort.c (key_warn): `sort -k2,1n --debug` would output warnings about being both "zero width" and "spanning multiple fields". Suppress the latter one. * tests/misc/sort-debug-warn: Add a couple of test cases.
2011-03-16sort: avoid memory pressure of 130MB/thread when reading from pipeJim Meyering
* src/sort.c (INPUT_FILE_SIZE_GUESS): Decrease initial allocation factor used to size buffer used when reading a non-regular file. For motivation, see discussion here: http://thread.gmane.org/gmane.comp.gnu.coreutils.general/878/focus=887
2011-03-13sort: spawn fewer threads for small inputsJim Meyering
* src/sort.c (SUBTHREAD_LINES_HEURISTIC): Do not spawn a new thread for every 4 lines. Increase this from 4 to 128K. 128K lines seems appropriate for a 5-year-old dual-core laptop, but it is too low for some common combinations of short lines and/or newer systems. * NEWS (Bug fixes): Mention it.
2011-02-03sort: fix --debug key highlighting when key start after key endPádraig Brady
This case was overlooked in commit bdde34f9, 2010-08-05, "sort: tune and refactor --debug code, and fix minor underlining bug" * src/sort.c (debug_key): Don't adjust the key end when it's before the key start. * tests/misc/sort-debug-keys: Add a test case.
2011-01-07maint: suppress some clang scan-build warningsPádraig Brady
* src/pr.c (char_to_clump): Remove a dead store. * src/remove.c (fts_skip_tree): Likewise. * src/sort.c (key_warnings): Likewise. (sort): Suppress an uninitialized pointer warning.
2011-01-01maint: update all copyright year number rangesJim Meyering
Run "make update-copyright".
2010-12-28coreutils: keep lines within 80-column limitsPaul Eggert
* cfg.mk (LINE_LEN_MAX, FILTER_LONG_LINES): New macros. (sc_long_lines): New rule. * HACKING: Use shorter URLs to the same material. * doc/Makefile.am, doc/coreutils.texi, m4/boottime.m4: * man/help2man, man/stdbuf.x, src/Makefile.am, src/cat.c, src/copy.c: * src/cp.c, src/dd.c, src/df.c, src/du.c, src/groups.c, src/install.c: * src/ls.c, src/md5sum.c, src/mv.c, src/od.c, src/pinky.c, src/ptx.c: * src/readlink.c, src/remove.c, src/rmdir.c, src/setuidgid.c: * src/sort.c, src/tail.c, src/touch.c, tests/Coreutils.pm: * tests/cp/existing-perm-race, tests/cp/perm, tests/cp/preserve-gid: * tests/du/2g, tests/du/long-from-unreadable, tests/init.sh: * tests/install/basic-1, tests/ls/nameless-uid: * tests/ls/readdir-mountpoint-inode, tests/misc/chroot-credentials: * tests/misc/cut, tests/misc/date, tests/misc/join, tests/misc/md5sum: * tests/misc/sha1sum, tests/misc/sha224sum, tests/misc/sort: * tests/misc/sort-continue, tests/misc/sort-files0-from: * tests/misc/sort-rand, tests/misc/stdbuf, tests/misc/tr: * tests/misc/uniq, tests/mv/atomic, tests/mv/part-fail: * tests/mv/part-symlink, tests/mv/sticky-to-xpart, tests/pr/pr-tests: * tests/rm/fail-2eperm, tests/rm/interactive-always: Reformat to fit within 80 columns. * doc/Makefile.am (BAD_POSIX_PERL): New macro. * doc/coreutils.texi: Reword slightly, to make menus and index lines shorter. * src/md5sum.c: Redo --help output so that it fits within 79 columns, since that's a bit more portable and all the other --help strings fit in 79 columns.
2010-12-22sort: minor performance tweak with num_processorsPaul Eggert
* src/sort.c (main): Don't invoke num_processors twice.
2010-12-20maint: fix a typo in sort --parallel help messagePádraig Brady
Also fix up Chen Guo's contacts * src/sort.c (usage): Add a missing "of" * THANKS: Add Chen Guo * .mailmap: Add Chen Guo's UCLA address
2010-12-19sort: use at most 8 threads by defaultPádraig Brady
* src/sort.c (main): If --parallel isn't specified, restrict the number of threads to 8 by default. If the --parallel option is specified, then allow any number of threads to be set, independent of the number of processors on the system. * doc/coreutils.texi (sort invocation): Document the changes to determining the number of threads to use. Mention the memory overhead when using multiple threads. * tests/misc/sort-spinlock-abuse: Allow single core systems that support pthreads. * tests/misc/sort-stale-thread-mem: Likewise. * tests/misc/sort-unique-segv: Likewise. * NEWS: Mention the change in behaviour.
2010-12-16sort: do not generate thousands of subprocesses for 16-way mergePaul Eggert
Without this change, tests/misc/sort-compress-hang would consume more than 10,000 process slots on my RHEL 5.5 x86-64 server, making it likely for other applications to fail due to lack of process slots. With this change, the same benchmark causes 'sort' to consume at most 19 process slots. The change also improved wall-clock time by 2% and user+system time by 14% on that benchmark. * NEWS: Document this. * src/sort.c (MAX_PROCS_BEFORE_REAP): Remove. (reap_exited): Renamed from reap_some; this is a more accurate name, since "some" incorrectly implies that it reaps at least one process. All uses changed. (reap_some): New function: it *does* reap at least one process. (pipe_fork): Do not allow more than NMERGE + 2 subprocesses. (mergefps, sort): Omit check for exited processes: no longer needed, and anyway the code consumed too much CPU per line when 2 < nprocs.
2010-12-16sort: fix hang with sort --compressPaul Eggert
* NEWS: Document this. * src/sort.c (UNCOMPRESSED, UNREAPED, REAPED): New constants. (struct tempnode): New member 'state', to hold these constants. The pid member is now undefined if state == UNCOMPRESSED. (struct sortfile): Replace member 'pid' with member 'temp'. (uintptr): Remove. (proctab_hasher, proctab_comparator, register_proc, delete_proc): Proctab entries are now struct tempnode *, not pid_t, to handle the case where multiple tempnode objects correspond to the same pid. This avoids a race condition that can cause a hang. (register_proc): Arg is now struct tempnode *, not pid_t. All callers changed. (delete_proc): Set tempnode state to REAPED. (create_temp_file): No need to set pid member here; it's now done when the pid is known. (maybe_create_temp, create_temp): Remove PPID arg. Return struct tempnode *, not char *. All callers changed. (maybe_create_temp): Set node state to UNCOMPRESSED or UNREAPED. No need to set node->pid to 0. (open_temp): Replace NAME and PID args with a single TEMP arg. All callers changed. Wait only for unreaped children. (zaptemp): Wait for decompressor to finish before removing its temporary-file input. This avoids .nfsXXXX hassles with NFS and fixes a race (leading to a hang) regardless of NFS. (open_input_files): Adjust to new way of dealing with temp files and their subprocesses. * tests/Makefile.am (TESTS): Add misc/sort-compress-hang. * tests/misc/sort-compress-hang: New file.