Age | Commit message (Collapse) | Author |
|
* .x-sc_program_name: Exclude all current and future
c files in gl/tests from this check
* gl/tests/test-di-set.c: Remove the hack to work around
the set_program_name syntax-check
* gl/tests/test-ino-map.c: Likewise
* gl/tests/test-rand-isaac.c: Likewise
|
|
* doc/coreutils.texi (sort invocation): Use @pxref inside parentheses.
|
|
* doc/coreutils.texi (md5sum invocation): Mention currently known
security problems. Don't recommend SHA-1 as alternative.
* man/md5sum.x (BUGS): Warn about the vulnerabilities and
reference the SHA-2 based alternatives.
Reported by Simon Josefsson
|
|
|
|
This change was prompted by the previous one: I audited the code
looking for similar examples. Too bad valgrind doesn't catch this.
* src/sort.c (check, mergefps): xrealloc -> free + xmalloc
* src/who.c (print_user): Likewise.
|
|
* src/sort.c (compare_random): Use free/xmalloc rather than
xrealloc, since the old buffer contents need not be preserved.
Also, don't fail if the guessed-sized malloc fails. Suggested by
Bruno Haible.
|
|
Fixes bug 6053 for NFS on HP-UX not recognizing ACL operations.
* gnulib: Update to latest version.
|
|
* src/sort.c: Include "ignore-value.h".
(debug_key): Use ignore_value.
|
|
* gl/tests/test-rand-isaac.c (main): Avoid warnings from
syntax-check.
|
|
Resolves bug#6793.
* src/dircolors.hin: Add screen.rxvt & terminator.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
|
|
* tests/ls/readdir-mountpoint-inode: Check to see if skip_test_ is
called in a helper function via $() instead of mistakenly failing.
* THANKS: Update.
|
|
* src/sort.c (compare_random): Guess that the output will be
3X the input. This avoids the overhead of calling strxfrm
twice on typical implementations. Suggested by Bruno Haible.
|
|
* NEWS: Document this.
* src/sort.c (getmonth): Omit LEN arg, as MONTH is now null-terminated.
(compare_random): Don't null-terminate keys, as caller now does that.
(compare_version): Remove.
(debug_key): Null-terminate string for getmonth.
(keycompare): Support combining -R with any of -d, -f, -i, -V.
Also, support combining -V with any of -d, -i.
(check_ordering_compatibility): Allow newly-supported combinations.
* tests/misc/sort (02q, 02r, 02s): New tests, for new combinations.
(incompat2): Now test -nR, since -fR are now compatible.
|
|
Formerly, the 'compare' function and some of its subroutines had a
debugging flag, which caused them to output underlines. This
change refactors the code so that debugging output is
more-separated from the actual sorting. In the process, the
change fixes a minor error in the debugging output. The change
shortens the source code and executable size a tad, and improves
CPU performance by 2.4% on my platform with a simple benchmark (C
locale, line sorting, no debug).
* src/sort.c (long_double, strtold): Move back to prelude, since
they're now used by multiple functions again.
(unit_order): Move to file scope, since it's now used by two functions.
(find_unit_order, human_numcompare, numcompare, general_numcompare):
Remove endptr parameter. All callers changed.
(human_numcompare): Args are now const pointers.
(getmonth): Endptr is now non-const.
(key_numeric): Move up, since it's needed earlier.
(debug_key): Take a line and a key as argument, instead of having
the caller figure out where the field is.
(debug_line): New function.
(keycompare, compare): Omit debug parameter; debug output now done
elsewhere. All callers changed.
(write_line): Renamed from write_bytes; all callers changed.
Use debug_line (not 'compare') to output debug info.
Use a slightly faster check for whether output file is stdout.
(check): Don't do debugging output; it's not that useful here,
and it confuses the code.
(main): Check for incompatibility between -c and --debug.
Use standard diagnostic for incompatible options.
* tests/misc/sort-debug-keys: Fix test case: "--Mi-1" is not
a number, so its first character should not be underlined when
debugging a numeric sort.
|
|
* lib/Makefile.am (libcoreutils_a_SOURCES): Remove xmemxfrm.c,
xmemxfrm.h.
* lib/memxfrm.c, lib/memxfrm.h, lib/xmemxfrm.c, lib/xmemxfrm.h: Remove.
* m4/memxfrm.m4: Likewise.
* m4/prereq.m4 (gl_PREREQ): Remove gl_MEMXFRM.
* po/POTFILES.in: Remove lib/xmemxfrm.c.
* src/sort.c: Don't include xmemxfrm.h.
(cmp_hashes): Remove.
(xstrxfrm): New function.
(compare_random): If a line contains NULs, don't create a big
buffer that contains the strxfrm output of each string in the line.
Instead, accumulate checksums and differences as we go, so that
at any one time we have to store at most the output of a single
strxfrm call when processing the line. This removes the need for
an memxfrm function.
|
|
* tests/init.sh (setup_): Move exit trap outside of shell function.
This patch is imported from gnulib.
|
|
* src/sort.c (debug_width): New function, which does not stop
counting tabs at \0, and also invokes mbsnwidth. Stamp out strnlen!
(count_tabs): Remove.
(debug_key): Use debug_width instead of mbsnwidth and count_tabs.
* tests/misc/sort-debug-keys: Check that \0 and \t intermix.
|
|
* NEWS: Document changes to sort -h, which are now minor with
respect to the pre-July-30th version.
* doc/coreutils.texi (sort invocation): Likewise. The
documentation now describes how -h comparison is done rather than
being vague with border cases.
* src/sort.c (long_double, strtold): Move back to general_numcompare.
(LD, compute_human): Remove.
(find_unit_order): Remove THOU_SEP parameter, since thousands
separators are now allowed by all callers. Revert to previous
behavior of sorting by suffix, and returning the order rather than
2 * order + binary, since we no longer care whether binary powers
are being used. However, treat all zeros the same, instead of
sorting 0M before 0G; this is more consistent with the desired
behavior of sorting -1G before -1M.
* tests/misc/sort (h1, h3, h6): Adjust to match mostly-reverted
behavior. However, check that all zeros sort together.
* tests/misc/sort-debug-keys: Omit a "_", since the trailing "i"
in "1234Gi" is no longer part of the key.
|
|
* NEWS: Document changes to sort -h.
* doc/coreutils.texi (sort invocation): Likewise.
* src/sort.c (long_double, strtold): Move to prelude, since they're
now used by multiple functions.
(LD): New macro.
(struct keyfield.iec_present): Remove this member. All uses removed.
(check_mixed_SI_IEC): Remove. This code was busted in the presence
of multiple threads, as it had a race condition.
(find_unit_order): Remove arg KEY; add arg THOU_SEP; arg ENDPTR is
now char ** rather than char const **. Return an integer that
distinguishes decimal from binary powers. Parse the number
consistently with the intersection of strtold and strnumcmp.
Set *ENDPTR unconditionally.
(compute_human): New static function.
(human_numcompare): Remove arg KEY. Remove 'const' from other args.
Use strnumcmp if possible, but fall back on floating point if not.
(numcompare, general_numcompare): Arg EA is now char ** rather
than char const **.
(numcompare): Adjust to new find_unit_order signature and behavior.
(keycompare): Adjus to new human_numcompare signature.
* tests/misc/sort (h1, h3, h4, h6): Adjust to new behavior.
* tests/misc/sort-debug-keys: Likewise.
|
|
* src/sort.c (mark_key): Don't assume offset <= INT_MAX.
Make the code a bit clearer when width != 0.
|
|
* src/sort.c (fillbuf): Don't append eol unless the line is nonempty.
This fixes a bug that was partly but not completely fixed by
the aadc67dfdb47f28bb8d1fa5e0fe0f52e2a8c51bf commit (dated July 15).
* tests/misc/sort (realloc-buf-2): New test, which catches this
bug on 64-bit hosts.
|
|
* src/sort.c (mergelines, queue_destroy, queue_init, queue_insert):
(queue_pop, write_unique, mergelines_node, check_insert):
(update_parent): No longer inline; these uses of "inline"
seemed unlikely to help performance much.
|
|
* src/sort.c (find_unit_order): Don't assume ASCII.
|
|
* gl/lib/heap.c (struct heap): Move this here...
* gl/lib/heap.h (struct heap): ... from here, as outside code no
longer needs to access any of these members.
|
|
* src/sort.c (queue_pop): Omit unnecessary unlock+lock after
pthread_cond_wait returns. Don't access "count" member of the
heap; any efficiency gains should be quite minor, the access
complicates this code, and "count" should be private anyway.
|
|
* src/sort.c (lock_node, unlock_node, queue_destroy, queue_init):
(queue_pop):
Omit 'restrict'; it shouldn't help here, as these functions have just
one pointer parameter and don't access static storage.
(queue_insert, check_insert, update_parent): Omit 'restrict', as
the pointer types differ, and are not char * or unsigned char *,
and therefore can't alias.
(write_unique): Omit 'restrict', as the pointer types are all
read-only.
(merge_loop, sortlines): Omit 'restrict', as any performance
advantages are extremely unlikely and it's not worth cluttering
the code for that.
(struct thread_args): Omit 'restrict': this seems to be incorrect.
It's unlikely for 'restrict' to be correct inside a typedef.
|
|
* src/sort.c (inittables, general_numcompare, compare_nodes):
(queue_init, queue_pop): Omit casts that are not needed, typically
because they are between void * and some other pointer type.
|
|
* src/sort.c (proctab_hasher, proctab_comparator, stream_open, xfopen):
(open_temp, zaptemp, struct_month_cmp, begfield, limfield):
(find_unit_order, human_numcompare, numcompare, general_numcompare):
(count_tabs, keycompare, compare, compare_nodes, lock_node):
(unlock_node, queue_destroy, queue_init, queue_insert, queue_pop):
(write_unique, mergelines_node, check_insert, update_parent):
(merge_loop, sortlines, struct thread_args, set_ordering):
Prefer the style "T const" to "const T".
* gl/lib/heap.h (struct heap, heap_alloc): Likewise.
* gl/lib/heap.c (heap_default_compare, heapify_down, heapify_up):
(heap_alloc): Likewise.
|
|
|
|
* src/du.c (process_file): Avoid recalculation of hashes
and of file-exclusion for directories. Do not descend into
the same directory more than once, unless -l is given; this is faster.
Calculate stat buffer lazily, since it
need not be computed at all for excluded files.
Count space if FTS_ERR, since stat buffer is always valid then.
No need for 'print' local variable.
(main): Use FTS_NOSTAT. Use FTS_TIGHT_CYCLE_CHECK only when not
hashing everything, since process_file finds cycles on its own
when hashing everything.
* tests/du/deref: Add test cases for -L bugs.
|
|
* tests/misc/sort-merge-fdlimit: This test was written assuming that
-R typically opens /dev/urandom, but that's no longer the case.
Redo test to specify a random source; this resurrects the point of
checking for file descriptor exhaustion. Also try plain -R, since
that implementation may change in the future too.
|
|
* gl/lib/rand-isaac.c: Remove the I/O; this belongs elsewhere.
Add support for ISAAC64. Port to hosts with padding bits.
Add self to author list. Include <limits.h>, for CHAR_BIT.
Don't include string.h, sys/time.h, unistd.h.
(min, just): New functions.
(IF32): New macros.
(ind, ISAAC_STEP, isaac_refill, mix, isaac_init, isaac_seed):
Add support for ISAAC64. Port to hosts with padding bits.
(ind): Now an inline function rather than a macro; no need for it
to be a macro with modern compilers.
(ISAAC_STEP): Renamed from isaac_step, since it's not function-like.
Don't bother to pass args that are always the same. All uses changed.
(ISAAC_STEP, ISAAC_SEED): Move to inside the only function body
that can use it.
(ISAAC_MIX): Renamed from isaac_mix, since it's now a macro and is
no longer function-like. Don't bother saving and restoring state;
no longer needed now that we're not a function. All uses changed.
(isaac_seed_start, isaac_seed_data, isaac_seed_finish): Remove.
(isaac_seed): Take just the one arg; the caller now sets s->m.
* gl/lib/rand-isaac.h: Use _GL_RAND_ISAAC_H to protect, instead
of RAND_ISAAC_H. Try out " #" rather than "# " for indenting.
(ISAAC_BITS_LOG, ISAAC_BITS): New macros.
(ISAAC_WORDS_LOG): Renamed from ISAAC_LOG.
(isaac_word): New type. All uses of uint32_t changed to isaac_word,
to support ISAAC64.
(struct isaac_state): Rename member MM to M, and make it public.
(isaac_seed, isaac_refill): Adjust to new API.
* gl/lib/randread.c: Include sys/time.h.
(get_nonce): New function, containing the nonce stuff that used
to be in rand-isaac.c but better belongs here.
(randread_new): Use it.
* gl/modules/randread (Depends-on): Add inline.
* gl/modules/randread-tests: New file.
* gl/tests/test-rand-isaac.c: New file.
|
|
Following on from commit dae35bac, 01-03-2010,
"sort: inform the system about our input access pattern"
apply the same hint to all appropriate utils.
This currently gives around a 5% speedup for reading
large files from fast flash devices on GNU/Linux.
* src/base64.c: Call fadvise (..., FADVISE_SEQUENTIAL);
* src/cat.c: Likewise.
* src/cksum.c: Likewise.
* src/comm.c: Likewise.
* src/cut.c: Likewise.
* src/expand.c: Likewise.
* src/fmt.c: Likewise.
* src/fold.c: Likewise.
* src/join.c: Likewise.
* src/md5sum.c: Likewise.
* src/nl.c: Likewise.
* src/paste.c: Likewise.
* src/pr.c: Likewise.
* src/ptx.c: Likewise.
* src/shuf.c: Likewise.
* src/sum.c: Likewise.
* src/tee.c: Likewise.
* src/tr.c: Likewise.
* src/tsort.c: Likewise.
* src/unexpand.c: Likewise.
* src/uniq.c: Likewise.
* src/wc.c: Likewise, unless we don't actually read().
|
|
* bootstrap.conf: Include the new module
* gl/lib/fadvise.c: Provide a simpler interface to posix_fadvise.
(fadvise): Provide hint to the whole file associated with a stream.
(fdadvise): Provide hint to the specific portion of a file
associated with a file descriptor.
* gl/lib/fadvise.h: Redefine POSIX_FADV_* to FADVISE_* enums.
* gl/modules/fadvise: New file.
* m4/jm-macros.m4: Remove the no longer needed posix_fadvise check.
* .x-sc_program_name: Exclude test-fadvise.c from this check.
* gl/tests/test-fadvise (main): New test program.
* gl/modules/fadvise-testss: A new index to reference the tests.
* src/sort.c (stream_open): Use the new interface.
* src/dd.c (iwrite): Likewise.
|
|
* configure.ac (optional_pkglib_progs): Only update
after the main programs have been selected, so that
libstdbuf.so can be excluded if stdbuf also is.
|
|
* gl/lib/rand-isaac.c (isaac_seed_start): New arg SEEDED.
(isaac_seed): New args FD and BYTES_BOUND. Read from FD if possible.
Don't bother with low-quality sources if FD has enough bytes.
* gl/lib/rand-isaac.h: New size_t arg for isaac_seed.
* gl/lib/randread.c: Include fcntl.h, unistd.h.
(NAME_OF_NONCE_DEVICE): New #define.
(nonce_device): New static var.
(randread_new): Use nonce device if available.
|
|
* src/sort.c (random_md5_state): New static var.
(random_md5_state_init): New function, to initialize random_md5_state.
(random_state, randread_source): Remove.
(cmp_hashes): Use random_md5_state rather than random_state.
Break ties using memcmp, not by getting more randomness.
If MD5 collisions turn into a problem in practice, we should
simply use a better checksum.
(main): If -R is given, call random_md5_state_init rather than
going single-threaded.
|
|
Programs like 'sort' were linking to -lrt in order to get
clock_gettime, but this was misguided: it wasted considerable
resources while gaining at most 10 bits of entropy. Almost nobody
needs the entropy, and there are better ways to get much better
entropy for people who do need it.
* gl/lib/rand-isaac.c (isaac_seed): Include <sys/time.h> not
"gethrxtime.h".
(isaac_seed): Use gettimeofday rather than gethrxtime.
* gl/modules/randread (Depends-on): Depend on gettimeofday
and not gethrxtime.
* src/Makefile.am (mktemp_LDADD, shred_LDADD, shuf_LDADD, sort_LDADD):
(tac_LDADD): Omit $(LIB_GETHRXTIME); no longer needed.
|
|
* tests/Makefile.am (TESTS): Add misc/sort-unique.
* tests/misc/sort-unique: New file.
|
|
* src/sort.c (keycompare): Use xmemcoll0, as it avoids
a couple of stores.
(write_bytes): Leave the buffer the way we found it,
as it might be used again for a later comparison,
if -u is used.
|
|
* src/sort.c (fillbuf): The previous commit incorrectly
terminated the buffer when the last line of input
didn't contain a terminating character.
|
|
Don't write NUL after the comparison buffers on each compare,
which increases performance by about 3% for short lines
on a pentium-m with gcc-4.4.1
* src/sort.c: (fillbuf): Delimit input items with NUL.
(write_bytes): Restore the item delimiter char which was
replaced with NUL in fillbuf().
|
|
* gl/lib/heap.c (heap_alloc): Use xnmalloc rather than xmalloc,
to avoid pathological overflow.
|
|
* src/Makefile.am (printf_LDADD, seq_LDADD, sleep_LDADD, sort_LDADD):
(tail_LDADD, uptime_LDADD): Omit $(POW_LIB), as it's no longer
needed due to recent gnulib changes, where the strtod module no
longer uses the pow function. strtold needs pow only because it's
sometimes aliased to strtod. See
http://lists.gnu.org/archive/html/bug-gnulib/2010-07/msg00076.html
|
|
* gnulib: Update for xmemcoll0 and fix for systems without pthreads
* bootstrap: Update for translation project sync fix
|
|
* NEWS: Add another blank line before the previous version.
(Bug fixes): Move to the start.
(Changes in behavior): Add the item about the du mem usage change
from the "New features" section.
|
|
* gl/lib/heap.c (heap_alloc): Use the fact that the xalloc
routines will not return NULL. Also remove the redundant
temporary variables.
(heap_insert): From Jim Meyering, use x2nrealloc() which
is simpler while handling overflow and increasing the
size more efficiently. This reallocation is currently
unused by coreutils in any case as it preallocates enough.
|
|
This patch is by Gene Auyeung, Chris Dickens, Chen Guo, and Mike
Nichols, based off of a patch by Paul Eggert, Glen Lenker, et. al.,
with a basic heap implementation based off of the GDSL heap,
originally by Nicolas Darnis.
The number of sorts done in parallel is limited to the number
of available processors by default, or can be further restricted
with the --parallel option.
On a dual-die, 8 core Intel Xeon, results show sorting with
8 threads is almost 4 times faster than using a single thread.
Timings when sorting a 96MB file:
THREADS TIME (s)
1 5.10
2 2.87
4 1.75
8 1.31
Single threaded sorting has also been improved,
especially for cheaper comparison operations:
COMMAND BEFORE (s) AFTER (s)
sort 8.822 8.716
sort -g 10.336 10.222
sort -n 3.077 2.961
LANG=C sort 2.169 2.066
* bootstrap.conf: Add heap, pthread.
* coreutils.texi (sort): Describe the new --parallel option.
* gl/lib/heap.c: New file. Very basic heap implementation.
* gl/lib/heap.h: New file.
* gl/modules/heap: New file.
* src/Makefile.am: Add LIB_PTHREAD.
* src/sort.c: Include heap.h, nproc.h, pthread.h.
(MAX_MERGE): New macro.
(SUBTHREAD_LINES_HEURISTIC, PARALLEL_OPTION): New constants.
(MERGE_END, MERGE_ROOT): New constants.
(struct merge_node): New struct.
(struct merge_node_queue): New struct.
(sortlines temp): Remove declaration.
(usage, long_options, main): New option, --parallel.
(specify_nthreads): New function.
(mergelines): New signature, to emphasize the fact that the HI area
must be part of the destination. All callers changed.
(sequential_sort): New function, renamed from sortlines. Merge in
the functionality of sortlines_temp.
(compare_nodes): New function.
(lock_node, unlock_node): New functions.
(queue_destroy): New function.
(queue_init): New function.
(queue_insert): New function.
(queue_pop): New function.
(write_unique): New function.
(mergelines_node): New function.
(check_insert): New function.
(update_parent): New function.
(merge_loop): New function.
(sortlines): Rewrite to support and use parallelism, with a new
signature. All callers changed.
(struct thread_args): New struct.
(sortlines_thread): New function.
(sortlines_temp): Remove.
(sort): New argument NTHREADS. All uses changed. Output moved to
mergelines_node.
(main): disable threading if we are sorting at random.
* tests/Makefile.am (TESTS): Add misc/sort-benchmark-random.
* tests/misc/sort-benchmark-random: New file.
Signed-off-by: Pádraig Brady <P@draigBrady.com>
|
|
* gl/lib/ino-map.c (ino_hash): Declare "i" as unsigned int.
Use an intermediate variable for the for-loop upper bound,
so it's a little more readable. Adjust comment.
* gl/lib/di-set.c (di_ent_hash): Likewise.
|
|
* src/dd.c (dd_copy): Use requested blocksize (not adjusted) in
diagnostic, to forestall user complaints that the numbers don't
match exactly. Report both exact and human-readable sizes, using
a message format that is consistent with both "BBBB bytes (N XB)
copied" in dd.c and "memory exhausted" in lib/xmalloc.c.
|