nice machine with 2 gb of ram, 800 megabytes in 2 logfiles. single word as search phrase. polish utf-8 locale (pl_PL.UTF-8), gnu grep 2.5.1. results?
=> time grep -in reloading postgresql-2007-10-22_000000.log postgresql-2007-10-22_120909.log
postgresql-2007-10-22_000000.log:40001:2007-10-22 10:50:13.528 CEST @ 24681 LOG: received SIGHUP, reloading configuration files
postgresql-2007-10-22_120909.log:1215696:2007-10-22 12:15:21.769 CEST @ 24681 LOG: received SIGHUP, reloading configuration files
real 1m21.212s
user 1m20.909s
sys 0m0.284s
same, check without -i:
=> time grep -n reloading postgresql-2007-10-22_000000.log postgresql-2007-10-22_120909.log
postgresql-2007-10-22_000000.log:40001:2007-10-22 10:50:13.528 CEST @ 24681 LOG: received SIGHUP, reloading configuration files
postgresql-2007-10-22_120909.log:1215696:2007-10-22 12:15:21.769 CEST @ 24681 LOG: received SIGHUP, reloading configuration files
real 0m1.147s
user 0m0.868s
sys 0m0.268s
after setting locale to C:
=> time grep -in reloading postgresql-2007-10-22_000000.log postgresql-2007-10-22_120909.log
postgresql-2007-10-22_000000.log:40001:2007-10-22 10:50:13.528 CEST @ 24681 LOG: received SIGHUP, reloading configuration files
postgresql-2007-10-22_120909.log:1215696:2007-10-22 12:15:21.769 CEST @ 24681 LOG: received SIGHUP, reloading configuration files
real 0m1.209s
user 0m0.896s
sys 0m0.316s
all tests were repeated many times to get all data in memory, and check for extreme values.
does anybody need another proof that locale “thing" is broken? of course it might be that only locale handling in grep is bad, but anyway – it's still locale issue.