path: root/src/sort.c
Age | Commit message | Author
2005-04-11 | Include unistd-safer.h. (create_temp_file): Use fd_safer. (xfclose): Don't assume fileno (stdin) == STDIN_FILENO, etc. | Paul Eggert
2005-04-09 | (SA_NOCLDSTOP): Define to 0 if not defined. All uses changed. (siginterrupt) [! HAVE_SIGINTERRUPT]: New macro. (main) [! SA_NOCLDSTOP]: Use it. | Paul Eggert
2005-03-28 | (long_options, mergefps): Use NULL, not `0'. | Jim Meyering
2005-03-06 | Remove `register' keyword. | Jim Meyering
2005-02-14 | (mergefps): Use binary search rather than linear one when comparing new line to lines already in main memory. | Paul Eggert
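The binary-search idea in the entry above can be sketched as follows. This is only an illustration under stated assumptions: the function name `find_insert_pos` and the plain `int` keys are hypothetical and are not taken from sort.c, whose mergefps compares lines with its own comparison routine.

```c
#include <stddef.h>

/* Hypothetical helper, not from sort.c: return the index at which KEY
   should be inserted into the ascending array A[0..N-1] so that the
   array stays sorted.  A key equal to existing elements is placed
   after them.  Needs O(log N) comparisons instead of O(N).  */
size_t
find_insert_pos (int const *a, size_t n, int key)
{
  size_t lo = 0;
  size_t hi = n;

  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (a[mid] <= key)
        lo = mid + 1;   /* KEY belongs to the right of A[mid].  */
      else
        hi = mid;       /* KEY belongs at or to the left of A[mid].  */
    }
  return lo;
}
```

Placing equal keys after the existing ones is one way to keep the order of equal lines unchanged; whether sort.c makes the same choice is not stated in the log entry.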
2004-12-02 | (C_DECIMAL_POINT): Remove. Use changed to '.'. Assume setlocale exists. (thousands_sep): Renamed from th_sep. (IS_THOUSANDS_SEP): Remove. All uses replaced by comparisons. (NONZERO): Parenthesize use of arg. (numcompare): Avoid duplicate loads. Use ISDIGIT as boolean, for consistency. Avoid unnecessary negation by reversing fraccompare args. (main): Rewrite localeconv call to match seq.c. | Paul Eggert
2004-11-14 | (zaptemp): Mark new diagnostic for translation. | Jim Meyering
2004-11-13 | (zaptemp): Warn if a temporary file is not removed. Prune unnecessary accesses to volatile locations, and take some code out of the critical section that didn't need to be in it. | Paul Eggert
2004-11-13 | Make the newly-introduced critical section a bit smaller. | Paul Eggert
2004-11-13 | Avoid O(N**2) behavior when there are many temporary files. (temptail): New variable, so that we can easily append to list. (create_temp_file): Create new files at end of list, so that searching the list has O(NMERGE) behavior instead of O(N) behavior. (zaptemp): Update temptail if needed. (mergefps, merge): Accept new arg that counts temp files, and keep it up to date as we create and remove temporaries. This is for efficiency, so that we don't call zaptemp so often. All callers changed. (sort): Don't create array in reverse order, since the list of temporaries is now in the correct order. (zaptemp): Protect against race condition: if 'sort' is interrupted in the middle of zaptemp, it might unlink the temporary file twice, and the second time this happens the file might already have been created by some other process. (create_temp_file): Use offsetof for clarity. (die): Move it up earlier, to clean up the code a bit. | Paul Eggert
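The tail-pointer technique that this entry describes (append at the end of a singly linked list, and repair the tail when the last node is removed) can be sketched as below. All names here (`struct node`, `struct list`, `list_append`, `list_remove`) are hypothetical; the real list of temporary files in sort.c uses its own types and does its own error handling.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node type; sort.c's real structure differs.  */
struct node
{
  struct node *next;
  int id;
};

struct list
{
  struct node *head;
  struct node **tail;   /* Address of the final `next' field, or of `head'.  */
};

void
list_init (struct list *l)
{
  l->head = NULL;
  l->tail = &l->head;
}

/* Append a node in O(1) time.  Return it, or NULL if malloc fails.  */
struct node *
list_append (struct list *l, int id)
{
  struct node *n = malloc (sizeof *n);
  if (!n)
    return NULL;
  n->id = id;
  n->next = NULL;
  *l->tail = n;
  l->tail = &n->next;
  return n;
}

/* Unlink and free the first node whose id is ID.  If that node was
   the last one, move the tail pointer back.  Return 1 if found.  */
int
list_remove (struct list *l, int id)
{
  struct node **pp;

  for (pp = &l->head; *pp; pp = &(*pp)->next)
    if ((*pp)->id == id)
      {
        struct node *victim = *pp;
        *pp = victim->next;
        if (!*pp)
          l->tail = pp;   /* The removed node was the last one.  */
        free (victim);
        return 1;
      }
  return 0;
}
```

With new nodes appended at the end, the oldest entries sit at the front of the list, so a search for one of them ends after a few steps instead of walking the whole list.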
2004-11-07 | (merge): Remove declarations of now-unused variables. | Jim Meyering
2004-11-06 | (first_same_file): Remove. Move most of the code to.... (avoid_trashing_input): New function. (merge): Avoid some silly merges, e.g., copying a single file to a temporary file when there are exactly 17 input files to merge. Take a count of temporary files rather than a max_merge arg. All uses changed. | Paul Eggert
2004-11-06 | (xfclose): Don't close stdout here (just flush it), since close_stdout now closes stdout unconditionally. | Jim Meyering
2004-11-05 | (inittables, sort_buffer_size, getmonth, mergefps, first_same_file, merge, sort, main): Use size_t for indexes into arrays. This fixes some unlikely havoc-wreaking bugs (e.g., more than INT_MAX temporary files). (getmonth, keycompare, compare): Rewrite to avoid need for alloca, thus avoiding unchecked stack overflow in some cases. As a side effect this improves the performance of "sort -M" by a factor of 4 on my benchmarks. | Paul Eggert
2004-09-21 | Don't include "long-options.h". | Paul Eggert
2004-09-07 | (main): Emulate Solaris 8 and 9 "sort -y", so that "sort -y abc" is like "sort abc" whereas "sort -y 100" is like plain "sort". | Paul Eggert
2004-08-10 | (die, xfopen, mergefps, first_same_file, merge): A null file arg means standard output. (main): "-o -" means to write to a file named "-", not to standard output. | Paul Eggert
2004-07-30 | Improve comment for first_same_file. | Paul Eggert
2004-07-30 | (UCHAR): Remove; all uses changed to to_uchar. (IS_THOUSANDS_SEP): Use bool when appropriate. (numcompare, main): Use char, not int, when the value is always a char. (numcompare): Remove "register"; compilers are smart enough these days. | Paul Eggert
2004-06-21 | (main): Standardize on the diagnostics given when someone gives too few operands ("missing operand after `xxx'") or too many operands ("extra operand `xxx'"). Include "quote.h" and/or "error.h" if it wasn't already being included. | Jim Meyering
2004-06-01 | (main): Prefer the notation `STREQ (a, b)' over `strcmp (a, b) == 0'. | Jim Meyering
2004-06-01 | (main, sort_buffer_size): Use STREQ (a, b) rather than `strcmp (a, b) == 0' | Jim Meyering
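For readers unfamiliar with the STREQ notation mentioned in the two entries above, a minimal sketch follows. The macro definition shown is an assumption written for illustration (the project's own header may define it differently), and `is_dash` is a hypothetical example function.

```c
#include <string.h>

/* Assumed definition, for illustration only.  */
#define STREQ(a, b) (strcmp (a, b) == 0)

/* Hypothetical example: return 1 if ARG is exactly "-", else 0.  */
int
is_dash (char const *arg)
{
  return STREQ (arg, "-");
}
```

The macro states the intent (string equality) directly, which avoids the easy mistake of reading a bare `strcmp (a, b)` as "true when equal".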
2004-05-14 | Improve performance of `sort -m' on large files, at the cost of making some contrived examples unsafe. POSIX allows this optimization. Performance problem reported by Jonathan Baker in <http://mail.gnu.org/archive/html/bug-coreutils/2004-05/msg00071.html>. (first_same_file): Do not treat input pipes differently from other files. | Jim Meyering
2004-04-26 | (limfield): Make a comment clearer. | Jim Meyering
2004-04-26 | Fix POSIX-conformance bug: "sort -k 3,3.5b" is supposed to skip leading blanks when computing the location of the field end; it is not supposed to skip trailing blanks. Solaris 8 "sort" does conform to POSIX. Also fix the documentation to clarify this and related issues. (limfield): Use skipeblanks, not skipsblanks, to decide whether to skip leading blanks. (trailing_blanks): Remove. (fillbuf, getmonth, keycompare): Don't trim trailing blanks. | Jim Meyering
2004-04-20 | (main): Rewrite signal-catching code to make it similar to other coreutils programs. When processing signals, block all signals that we catch, but do not block signals that we don't catch. Avoid problems with unsigned int warnings. (sighandler) [defined SA_NOCLDSTOP]: Use simpler "signal (sig, SIG_DFL)" rather than sigaction equivalent. | Jim Meyering
2004-02-17 | (usage) [-u]: Add punctuation so that the description in the help2man-generated (line-joined) man page is more readable. Reported by Tim Waugh. [-T]: Add a semicolon, for the same reason. | Jim Meyering
2004-01-22 | (usage): Use EXIT_SUCCESS, not 0, for clarity. (main): Use initialize_exit_failure rather than setting exit_failure directly; this optimizes away redundant assignments. Don't include <assert.h>. (SORT_OUT_OF_ORDER, SORT_FAILURE): Now enums, not macros. (usage): Don't use 'assert'. (main): Remove redundant assignment to exit_failure. | Jim Meyering
2004-01-04 | (add_temp_dir): Use x2nrealloc rather than xrealloc. (fillbuf): Use x2nrealloc rather than xrealloc. (sort): Use xnmalloc rather than xmalloc. (main): Likewise. | Jim Meyering
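The entry above swaps plain reallocation for helpers that take an item count and an item size. The general idea, growing an array by doubling while checking the size arithmetic for overflow, can be sketched as below. `grow_array` is a hypothetical stand-in written for this note; it is not the x2nrealloc implementation, and unlike the x* allocation functions it returns NULL rather than exiting on failure.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the idea, not the real x2nrealloc.
   P points to an array of *PN items, each S bytes (P may be NULL when
   *PN is 0).  Double the item count (or start at 1), reallocate, and
   store the new count in *PN.  Return the new block, or NULL, leaving
   P and *PN untouched, if S is 0, the size would overflow, or
   realloc fails.  */
void *
grow_array (void *p, size_t *pn, size_t s)
{
  size_t n = *pn;
  void *q;

  if (n == 0)
    n = 1;
  else
    {
      if (n > SIZE_MAX / 2)
        return NULL;          /* Doubling the count would overflow.  */
      n *= 2;
    }

  if (s == 0 || n > SIZE_MAX / s)
    return NULL;              /* N * S would overflow (or S is 0).  */

  q = realloc (p, n * s);
  if (!q)
    return NULL;
  *pn = n;
  return q;
}
```

Doubling keeps the total copying cost proportional to the final size, and passing count and size separately lets the helper check the multiplication itself, which is harder to do when the caller passes a byte count that has already been computed.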
2003-11-04 | (new_key): Use xzalloc, not xcalloc. | Jim Meyering
2003-11-02 | (inittables): Use `sizeof *var' rather than `sizeof EXPLICIT_TYPE'. The former is more maintainable and usually shorter. (sort): Split a long line. | Jim Meyering
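A small sketch of the `sizeof *var' idiom named in the entry above. The structure `struct line_info` and the function `alloc_lines` are hypothetical and exist only for this example.

```c
#include <stdlib.h>

/* Hypothetical type used only for this illustration.  */
struct line_info
{
  char *text;
  size_t length;
};

/* Allocate N zero-filled elements.  The element size is written as
   `sizeof *p', so the call stays correct if P's type is changed
   later; `sizeof (struct line_info)' would have to be edited by hand.  */
struct line_info *
alloc_lines (size_t n)
{
  struct line_info *p = calloc (n, sizeof *p);
  return p;
}
```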
2003-10-18 | Most .c files (AUTHORS): Revert the WRITTEN_BY/AUTHORS change of 2003-09-19. Now, AUTHORS is a comma-separated list of strings. Update the call to parse_long_options so that `AUTHORS, NULL' are the last parameters. * src/true.c (main): Append NULL to version_etc argument list. * src/sys2.h (case_GETOPT_VERSION_CHAR): Likewise. | Jim Meyering
2003-10-15 | (parse_field_count): Handle the case where overflow and invalid suffix char are both reported. | Jim Meyering
2003-09-28 | Remove unnecessary casts of alloca, since now it's guaranteed to be (void *). | Jim Meyering
2003-09-18 | (WRITTEN_BY): Rename from AUTHORS. Begin each WRITTEN_BY string with `Written by ' and end it with `.'. Mark each WRITTEN_BY string as translatable. | Jim Meyering
2003-09-18 | revert previous change | Jim Meyering
2003-09-18 | Update AUTHORS definition to be a comma-separated list of strings and/or update the call to parse_long_options so that `AUTHORS, NULL' are the last parameters. | Jim Meyering
2003-09-18 | (numcompare): Rename local, logb, to log_b to avoid shadowing the math function name. Also rename loga to log_a. | Jim Meyering
2003-09-05 | Don't ignore -S if input is a pipe. Bug report by Michael McFarland in <http://mail.gnu.org/archive/html/bug-coreutils/2003-09/msg00008.html>. (sort_buffer_size): Omit SIZE_BOUND arg. Compute the size_bound ourselves. If an input file is a pipe and the user specified a size, use that size instead of trying to guess the pipe size. This has the beneficial side effect of avoiding the overhead of default_sort_size in that case. All callers changed. (sort): Remove static var size; now done by sort_buffer_size. | Jim Meyering
2003-09-04 | (usage): Say "blanks" instead of "whitespace". Similar fixes for many comments. (TAB_DEFAULT): New constant, so that we can support NUL as the field separator. (tab): Now int, not char. Initialize to TAB_DEFAULT. (specify_sort_size): If multiple sizes are specified, use the largest. (begfield, limfield): Support NUL tab char. (set_ordering): Do not let -i override -d. (main): Report an error if incompatible -o or -t options are given. Report an error for "-t ''". Allow "-t '\0'" to specify a NUL tab. | Jim Meyering
2003-08-04 | (main): Use unsigned int instead of int for `nsigs' and for the indices to iterate through nsigs. | Jim Meyering
2003-08-03 | Minor code cleanups, mostly to use more accurate types and to remove unnecessary casts. (min, max): Remove. All uses changed to MIN and MAX. (hard_lc_collate, hard_LC_TIME, struct buffer.eof, struct keyfield.skipsblanks, struct keyfield.skipeblanks, struct keyfield.numeric, struct keyfield.general_numeric, struct keyfield.month, struct keyfield.reverse, reverse, unique, have_read_stdin): Now bool, not int. All uses changed. (eolchar): Now char, not int. (struct keyfield.ignore): Now bool const *, not int *. (struct keyfield.translate): Now char const *, not char *. (struct month.name): Likewise. (blanks, nonprinting, nondictionary): Now bool[], not int[]. (cleanup, inittables, keycompare, check, mergefps, first_same_file, check, sort, main): Use const * pointers when possible. (month_cmp): Rewrite to avoid casts. (inittables): Initialize tables unconditionally, to avoid branches. (fillbuf): Return bool, not int. All uses changed. (fillbuf, keycompare, new_key, main): Use SIZE_MAX rather than (size_t) -1. (trailing_blanks): Renamed from trim_trailing_blanks. Return the number of blanks to trim. All uses changed. (getmonth): Use trailing_blanks rather than open code. (keycompare): Do not cast char * to unsigned char *; not needed. CMP_WITH_IGNORE converts args to UCHAR, so no need to convert it ourselves. (compare, main): Use | rather than || to avoid jumps. Replace "diff = NONZERO (alen)" with "diff = 1", since alen must be nonzero there. (check, first_same_file, sort, main): Use bool instead of int local vars when possible. (check): Merge the old 'checkfp' and 'check' into a single function, that returns a boolean (true if the file was ordered). All uses changed. (main): Use int instead of unsigned for iterating through nsigs. Rename local var "posix_pedantic" to "posixly_correct". | Jim Meyering
2003-08-02 | (sortlines): Add description and references. From Paul Eggert. | Jim Meyering
2003-07-28 | (sortlines_temp): Undo previous change. | Jim Meyering
2003-07-27 | (sortlines_temp): Declare local `swap' to be `int', not `bool'. Otherwise, at least one buggy compiler (alpha gcc-2.95.4) would cause lines[-1 - swap] (with swap = false) to evaluate to lines[4294967295]. | Jim Meyering
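The arithmetic behind the entry above can be sketched as follows. The function names are hypothetical and the code does not reproduce the compiler bug; it only shows what a conforming compiler computes for `-1 - swap`, and where the number 4294967295 comes from (it is -1 reinterpreted as an unsigned 32-bit value).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only.  */

/* With an int flag the offset is plainly -1 or -2.  */
ptrdiff_t
offset_with_int (int swap)
{
  return -1 - swap;
}

/* A bool operand is promoted to int before the subtraction, so a
   conforming compiler yields the same -1 or -2 here.  */
ptrdiff_t
offset_with_bool (bool swap)
{
  return -1 - swap;
}

/* Converting -1 to a 32-bit unsigned type wraps modulo 2**32.  */
uint32_t
wrap_u32 (int v)
{
  return (uint32_t) v;
}
```

Here `wrap_u32 (-1)` is 4294967295, the index that the log entry reports, so the miscompiled expression evidently treated the result as unsigned instead of as a negative offset.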
2003-07-27 | remove trailing blanks | Jim Meyering
2003-07-27 | (sort): Don't require two `struct line's per text line; the new sort algorithm requires just 1.5. | Jim Meyering
2003-07-27 | This change was inspired by a similar proposal by Stepan Kasal. (mergelines, sortlines_temp): New functions. (sortlines): Use them, to reduce the number of times that we need to copy 'struct line' values. This improved CPU performance by about 30% on one 18 MB test. (sort): Don't invoke sortlines unless we have 2 or more lines. | Jim Meyering
2003-07-23 | Don't include headers already included by system.h: Don't include closeout.h. | Jim Meyering
2003-07-19 | Include "exitfail.h". (main): Set exit_failure rather than calling close_stdout_set_status. | Jim Meyering