author Cojocaru Alexandru <xojoc@gmx.com> 2012-12-09 10:43:10 +0100
committer Pádraig Brady <P@draigBrady.com> 2013-04-29 17:54:27 +0100
commit 3e466ad05181d95057e6612ff11059c91396cd0e
tree 2110ad15ceb663c914eb61edb50d0df5408f4866 /tests/misc/cut-huge-range.sh
parent e414ff4c4c3fe029a9702c9909bf4eccbef68c21
cut: make memory allocation independent of range width

The current implementation of cut uses a bit array,
an array of `struct range_pair's, and (when --output-delimiter
is specified) a hash_table.  The new implementation will use
only an array of `struct range_pair's.
The old implementation is memory inefficient because:
 1. When -b with a big num is specified, it allocates a lot of
    memory for `printable_field'.
 2. When --output-delimiter is specified, it will allocate 31 buckets,
    even if only a few ranges are specified.

Note CPU overhead is increased to determine if an item is to be printed,
as shown by:

    $ yes abcdfeg | head -n1MB > big-file
    $ for c in with-bitarray without-bitarray; do
        src/cut-$c 2>/dev/null
        echo -ne "\n== $c =="
        time src/cut-$c -b1,3 big-file > /dev/null
      done

    == with-bitarray ==
    real    0m0.084s
    user    0m0.078s
    sys     0m0.006s

    == without-bitarray ==
    real    0m0.111s
    user    0m0.108s
    sys     0m0.002s

Subsequent patches will reduce this overhead.

* src/cut.c (set_fields): Set and initialize RP
instead of printable_field.
* src/cut.c (is_range_start_index): Use CURRENT_RP rather than a hash.
* tests/misc/cut.pl: Check if `eol_range_start' is set correctly.
* tests/misc/cut-huge-range.sh: Rename from cut-huge-to-eol-range.sh,
and add a test to verify large amounts of mem aren't allocated.
Fixes http://bugs.gnu.org/13127
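
The idea of answering "is this position printed?" from a sorted array of ranges alone, with no bit array, can be sketched as follows. This is a minimal illustration and not the code in src/cut.c: the struct layout, `struct range_cursor`, `print_position` and `cursor_reset` are hypothetical names, and the sketch assumes the ranges are already sorted and non-overlapping and that positions are queried in increasing order within a line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for cut's `struct range_pair': an inclusive,
   1-based interval of byte or field positions.  */
struct range_pair
{
  uintmax_t lo;
  uintmax_t hi;
};

/* Cursor over a sorted, non-overlapping array of ranges.  CUR plays
   the role the commit message gives to CURRENT_RP (an assumption).  */
struct range_cursor
{
  const struct range_pair *rp;
  size_t n;
  size_t cur;
};

/* Return true if position POS falls inside one of the ranges.
   POS must not decrease between calls until the cursor is reset.
   Memory use depends on the number of ranges, not on their width.  */
static bool
print_position (struct range_cursor *c, uintmax_t pos)
{
  assert (pos >= 1);
  /* Skip ranges that end before POS; they cannot match again.  */
  while (c->cur < c->n && c->rp[c->cur].hi < pos)
    c->cur++;
  return c->cur < c->n && c->rp[c->cur].lo <= pos;
}

/* Rewind to the first range; call at the start of each input line.  */
static void
cursor_reset (struct range_cursor *c)
{
  c->cur = 0;
}
```

With the ranges {1,1}, {3,3} and {2147483647, UINTMAX_MAX}, the array holds three entries regardless of how large the last bound is, which is the property the commit subject describes. The extra comparisons per position are consistent with the CPU overhead noted above.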
Diffstat (limited to 'tests/misc/cut-huge-range.sh')
-rwxr-xr-x tests/misc/cut-huge-range.sh 34
1 files changed, 34 insertions, 0 deletions
diff --git a/tests/misc/cut-huge-range.sh b/tests/misc/cut-huge-range.sh
new file mode 100755
index 000000000..8783e96ad
--- /dev/null
+++ b/tests/misc/cut-huge-range.sh
@@ -0,0 +1,34 @@
+#!/bin/sh
+# Ensure that cut does not allocate mem for a range like -b9999999999999-
+
+# Copyright (C) 2012-2013 Free Software Foundation, Inc.
+
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+. "${srcdir=.}/tests/init.sh"; path_prepend_ ./src
+print_ver_ cut
+require_ulimit_v_
+getlimits_
+
+# From coreutils-8.10 through 8.20, this would make cut try to allocate
+# a 256MiB bit vector. With a 20MB limit on VM, the following would fail.
+(ulimit -v 20000; : | cut -b$INT_MAX- > err 2>&1) || fail=1
+
+# Up to and including coreutils-8.21, cut would allocate possibly needed
+# memory upfront. Subsequently memory is allocated as required.
+(ulimit -v 20000; : | cut -b1-$INT_MAX > err 2>&1) || fail=1
+
+compare /dev/null err || fail=1
+
+Exit $fail
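
The figures in the test comments can be checked with a little arithmetic. The sketch below is not part of the patch; `bit_array_bytes` and `ulimit_v_bytes` are hypothetical helpers, and it assumes INT_MAX is 2147483647 and that `ulimit -v` counts in units of 1024 bytes, as it does in bash.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Bytes needed for a bit array with one bit per position
   0..MAX_POS (an assumed layout, for illustration only).  */
static uintmax_t
bit_array_bytes (uintmax_t max_pos)
{
  return max_pos / CHAR_BIT + 1;
}

/* Convert a `ulimit -v' argument to bytes, assuming the shell
   interprets it in units of 1024 bytes.  */
static uintmax_t
ulimit_v_bytes (uintmax_t kib)
{
  return kib * 1024;
}
```

Under those assumptions, `bit_array_bytes (2147483647)` is 268435456 bytes, that is 256 MiB, while `ulimit_v_bytes (20000)` is 20480000 bytes, roughly 20 MB. A bit vector sized for INT_MAX positions therefore cannot fit under the limit the test sets, so either invocation of cut can only succeed if no such vector is allocated.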