path: root/src/sort.c
Age | Commit message | Author
2000-11-30 | Port GNU "sort" to hosts where sizes don't fit in "int", e.g. 64-bit Solaris (sparc). ("human.h", "xstrtol.h"): Include. (struct line): length member is now size_t, not int. (struct lines): Likewise for used, alloc, limit members. (struct buffer): Likewise for used, alloc, left, newline_free members. (struct keyfield): Likewise for sword, schar, eword, echar members. (sortalloc, mergealloc, linelength): Now size_t, not int. (initbuf, fillbuf, initlines, begfield, limfield, findlines, numcompare, getmonth, keycompare, compare, checkfp, mergefps, sortlines, sort): Accept, return, and use size_t for sizes, not int. (fillbuf, initlines, findlines, checkfp, sort): Check for overflow when computing buffer sizes. (begfield, limfield): Do not index past end of array. (checkfp): Return a boolean, not a line number, as the line number may not fit in int. All callers changed. Use uintmax_t for line numbers, not int. (sort): Don't allocate tmp until we need it (and know the right size). (parse_field_count): New function. (main): Use it to check for overflow in field counts. "outfile" is now a pointer to const. | Jim Meyering
2000-10-22 | add missing backslash | Jim Meyering
2000-10-21 | (SORT_OUT_OF_ORDER): Define. (main): Use it instead of hard-coding the `1'. | Jim Meyering
2000-10-21 | (main): Use EXIT_SUCCESS rather than 0. Fail when checking (-c) with more than one file argument, rather than simply ignoring the extra arguments. | Jim Meyering
2000-08-11 | (usage): Describe -d and -i in a locale-independent way. | Jim Meyering
2000-08-07 | (usage): Warn more succinctly about the effects of the locale on sort order. | Jim Meyering
2000-08-05 | (main): Rename local `t' to `tmp_dir' to avoid shadowing a previous local by that name. (usage): Warn that GNU sort is now locale-aware, and suggest people put LC_ALL=POSIX in their environment. | Jim Meyering
2000-07-29 | (temp_dir): Remove. (temp_dirs, temp_dir_count, temp_dir_alloc): New vars. (process_id): New var. (usage): Describe new use of -T. (add_temp_dir): New function. (tempname): Use new temp_dirs array. Do not discard information from the process-id or sequence number, unless we have short file names. (sighandle): Use process_id instead of getpid. (main): Initialize process_id. Add support for the new use of -T. | Jim Meyering
2000-05-20 | Arrange to call close_stdout upon exit. Don't close stdout explicitly (but set exit status and file name, too). | Jim Meyering
2000-03-06 | (struct buffer.newline_free): New member. (initbuf, findlines): Set it. (fillbuf): Do not double the size of a full buffer to append a newline unless the buffer is known to be newline free. | Jim Meyering
2000-03-03 | (fillbuf): Move declaration of local, cc, into scope of `while' loop where it's used. | Jim Meyering
2000-03-03 | Big performance improvement when sorting many small files, building on a suggestion by Charles Randall. (fillbuf): Skip memmove if it would be a no-op, as many memmove implementations are slow in that case. Don't examine leftover bytes for eolchar, since they may be left over from a previous file, and we want to read from this file. (sort): At end of file, if there is more input and buffer room, concatenate the next input file. | Jim Meyering
2000-01-22 | (keycompare): Use global, hard_LC_COLLATE in place of local that is sometimes undeclared. | Jim Meyering
2000-01-19 | Tweak sort performance. (hard_LC_CTYPE): Remove. (keylist): Renamed from keyhead. Now a pointer, not a mostly-unused struct. All uses changed. (findlines, keycompare, CMP_WITH_IGNORE, compare, checkfp, mergefps, sort): Tune and use a more consistent style for reallocation. (keycompare, main): Don't worry about LC_CTYPE; it's buggy with multibyte chars anyway. (compare): Invoke alloca (0) after each call to keycompare, not just the ones that return nonzero. This avoids a memory leak on architectures without builtin alloca that occurs sometimes when a file contains all duplicate lines. | Jim Meyering
2000-01-18 | (sighandler, main): Don't use SA_INTERRUPT to decide whether to call sigaction, as POSIX.1 doesn't require SA_INTERRUPT and some systems (e.g. Solaris 7) don't define it. Use SA_NOCLDSTOP instead; it's been part of POSIX.1 since day 1 (in 1988). | Jim Meyering
2000-01-13 | (fillbuf): Avoid quadratic behavior with long lines. Also, stop worrying about ancient memchr bug (misbehavior when size is zero), since other code doesn't worry either. | Jim Meyering
1999-11-05 | (SORTALLOC): New macro. (sortalloc, mergealloc, LINEALLOC): Use it. (sortalloc, mergealloc, linelength): Now const. (sortalloc): Increase from 0.5 to 8 MB. (mergealloc): Increase from 16 to 256 kB. (LINEALLOC): Increase from 0.25 to 4 MB. | Jim Meyering
1999-11-04 | (begfield, limfield, findlines, keycompare, compare): Do not consider newline to be part of a line when comparing lines in `sort' and `comm'. POSIX.2 requires that we consider newline, but this is a bug in the spec and the bug will likely be fixed. | Jim Meyering
1999-09-02 | Remove xstrdup declaration. | Jim Meyering
1999-08-22 | (checkfp): Use IF_LINT macro instead of #ifdef lint... (mergefps): Likewise. | Jim Meyering
1999-08-06 | Include file name in `write error' diagnostics. (write_bytes): Add output_file parameter and use it. Update callers. (mergefps): Likewise. (merge): Likewise. (sort): Likewise. Reported by John Summerfield. | Jim Meyering
1999-07-04 | Include hard-locale.h, memcoll.h. (hard_LC_COLLATE, hard_LC_CTYPE, hard_LC_TIME): New variables, replacing `need_locale'. (memcoll): Move to lib/memcoll.c. (keycompare): No need to alloca (0), since our caller now does it. (compare): alloca (0) before returning. (my_setlocale): Remove; hard_locale now does this. (main): Invoke setlocale, bindtextdomain, and textdomain before invoking anything that might print an error. Use hard_locale to determine which locales are hard. | Jim Meyering
1999-05-22 | (general_numcompare): Put exceptional cases first, not last, to be consistent with -M. | Jim Meyering
1999-05-22 | (strtod): Declare if STDC_HEADERS is not defined. (general_numcompare): Use strtod, not xstrtod. Do not consider partial conversions to be errors. Put -infinity at the start, and +infinity at the end; follow +infinity with NaNs (sorted by bit pattern), and finally by conversion errors. | Jim Meyering
1999-05-21 | Treat the trailing newline as part of the line, as required by POSIX.2. (struct line, findlines, compare, checkfp, mergefps, sort): A line now includes its trailing newline. (findlines): Do not replace newline with NUL. (memcoll, keycompare): Work even if the data to be compared are adjacent strings; this is possible now that lines contain the trailing newline. (fillbuf): Always have an unused byte at the end of the buffer, since memcoll and keycompare want to modify a byte after the last line. (sortalloc, mergealloc): Increase by 1, for trailing byte. | Jim Meyering
1999-05-20 | (keycompare): Ignore any length difference if the localized comparison says the strings are equal. | Jim Meyering
1999-05-20 | (memcoll, keycompare, compare): Handle NUL characters properly when comparing with LC_COLLATE semantics. (NLS_MEMCMP): Remove. (memcoll): Renamed from strncoll. Take separate lengths for each string. This function is now invoked only when need_locale. (keycompare): Don't copy strings when ignore and translate are both NULL. | Jim Meyering
1999-05-20 | (MONTHTAB_CONST): Renamed from NLS_CONST; the use is also changed. Define to const also if !HAVE_NL_LANGINFO. (usage): `,' -> `;' (English typo). | Jim Meyering
1999-05-16 | Don't autodetect the locale of numbers and months, as this conflicts with POSIX.2 and is tricky to boot. (FLOATING_COMMA, NLS_STRNCMP, NLS_MAX_GROUPS, NLS_ONE_CHARACTER_STRING): Remove macros no longer used. (nls_grouping, nls_fraction_found, nls_month_found, nls_monthtab, nls_months_collide, nls_keyhead, us_monthtab): Remove variables no longer used. (struct nls_keyfield): Remove types no longer used. (strncoll_s2_readonly, nls_set_fraction, look_for_fraction, nls_month_is_either_locale, nls_numeric_format): Remove functions no longer used. (monthtab): Now has the role that us_monthtab had, but it's const only if ENABLE_NLS is not defined. (C_DECIMAL_POINT): Renamed from FLOATING_POINT. All uses changed. (MONTHS_PER_YEAR): Renamed from NLS_NUM_MONTHS. All uses changed. (struct_month_cmp): Renamed from nls_sort_month_comp. All uses changed. Use strcmp, not strcoll, since the user doesn't care about collating here. (inittables): Read locale data into monthtab, rather than modifying a separate month table and futzing with indirection. Do not worry about colliding months, since we no longer autodetect month locale. (fraccompare): Don't set no-longer-used variable nls_fraction_found. (getmonth): Use strncmp to compare months, since user doesn't care about collating here. Fix bug where code incorrectly assumed that strlen (monthtab[lo].name) == strlen (monthtab[ix].name). (keycompare, main): Don't autodetect month locale. (compare): Don't use NLS_MEMCMP in code that can't be executed if need_locale is false, as NLS_MEMCMP is equivalent to memcmp in that case. (sort, insertkey, main): Don't autodetect numeric locale. | Jim Meyering
1999-05-15 | (usage): Whoops. | Jim Meyering
1999-05-12 | (usage): Split the --help message into two pieces so that neither is longer than 2048. For Irix4's cc. Reported by Kaveh Ghazi. | Jim Meyering
1999-05-09 | (fraccompare, numcompare): Merge the NLS and non-NLS versions into a single function. (decimal_point): Now char, since we no longer convert to unsigned char. (th_sep): Now int, since we use a value out of char range to denote the absence of a thousands separator. (IS_THOUSANDS_SEP): New macro. (USE_NEW_FRAC_COMPARE): Remove. (nls_set_fraction): Arg is now char, not unsigned char. Set th_sep to CHAR_MAX + 1 if there is no thousands separator. (numcompare): Don't convert to unsigned char unless necessary. (main): Turn off decimal points and thousand separators if they are multibyte characters, as we don't support that yet. | Jim Meyering
1999-05-06 | (numcompare): Handle comparison of two negative numbers correctly in the ENABLE_NLS case. | Jim Meyering
1999-05-05 | add missing backslash-before-newline in usage message | Jim Meyering
1999-05-05 | add missing backslash-before-newline in usage message | Jim Meyering
1999-05-01 | (usage): Document the differences between the obsolescent, +POS1[-POS2] form, and the POSIX -k option. | Jim Meyering
1999-04-19 | (tempname): Wrap after 99999 only for length-impaired file systems. | Jim Meyering
1999-04-18 | (tempname): Add a FIXME comment. | Jim Meyering
1999-04-18 | (NAME_MAX_IN_DIR): Rename from PATH_MAX_IN_DIR. Use _POSIX_NAME_MAX, not _POSIX_PATH_MAX. Guard with #if HAVE_PATHCONF rather than #if HAVE_UNISTD_H. | Jim Meyering
1999-04-18 | Rename global: s/temp_file_prefix/temp_dir/. | Jim Meyering
1999-04-18 | (usage): s/DIRECT/DIRECTORY/g | Jim Meyering
1999-04-03 | Use AUTHORS in place of string in parse_long_options call. | Jim Meyering
1999-04-03 | Insert AUTHORS definition. | Jim Meyering
1999-04-03 | Use PROGRAM_NAME in place of string in parse_long_options call. | Jim Meyering
1999-04-03 | define PROGRAM_NAME | Jim Meyering
1999-03-07 | (main): Use a `%s' format in error call, in case the argument string contains a `%'. | Jim Meyering
1999-03-04 | (main): Include author name argument in call to parse_long_options. | Jim Meyering
1999-02-16 | update copyright dates | Jim Meyering
1999-01-14 | Don't prototype usage as static. | Jim Meyering
1999-01-01 | (PATH_MAX_IN_DIR) [HAVE_UNISTD_H]: New macro, for max file name characters in a given directory. (tempname): Make sure the temp file name is unique even if long file names aren't supported. | Jim Meyering
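
Note on the 2000-11-30 entry: it says that several functions now "check for overflow when computing buffer sizes" after sizes moved from int to size_t. The log does not show how sort.c does this, so the following is only a minimal sketch of one common way to write such a check in C. The helper names size_mul_checked and grown_alloc are hypothetical and do not come from sort.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from sort.c): store A * B in *RESULT and
   return true, or return false if the product would exceed SIZE_MAX.  */
bool
size_mul_checked (size_t a, size_t b, size_t *result)
{
  assert (result != NULL);
  if (b != 0 && a > SIZE_MAX / b)
    return false;               /* a * b would wrap around */
  *result = a * b;
  return true;
}

/* Hypothetical helper: return the element count obtained by doubling
   a buffer of ALLOC elements of ELEM_SIZE bytes each, or 0 if either
   the doubled count or its size in bytes would overflow size_t.  */
size_t
grown_alloc (size_t alloc, size_t elem_size)
{
  size_t new_alloc;
  size_t bytes;

  if (!size_mul_checked (alloc, 2, &new_alloc))
    return 0;
  if (!size_mul_checked (new_alloc, elem_size, &bytes))
    return 0;
  return new_alloc;
}
```

In this sketch a caller would treat a return value of 0 from grown_alloc as "request too large" and report an error instead of calling realloc with a wrapped-around size. Dividing SIZE_MAX by one operand before multiplying avoids relying on the wrapped result itself.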