ongrep

A cleaned up fork of ngrep for OpenBSD
git clone git://git.sgregoratto.me/ongrep

commit 1874e13047af2dc8724bc261be5f544c81ed3b20
parent e147909a9d67f323e4f4e61fa37362196e38be88
Author: Jordan Ritter <jpr5@darkridge.com>
Date:   Tue, 22 Feb 2005 06:01:09 +0000

updated with tests against pcre 5.0

Diffstat:
M doc/PCRE.txt | 69 ++++++++++++++++++++++++++++++++++++++++++++++++---------------------
1 file changed, 48 insertions(+), 21 deletions(-)

diff --git a/doc/PCRE.txt b/doc/PCRE.txt
@@ -1,32 +1,59 @@
 $Id$
 
-A quick note on PCRE vs GNU regex:
+Date: 2/21/05
 
-    I ran several tests comparing GNU regex to the PCRE library
-    using 10 million loop iterations: optimized vs. non-optimized,
-    match vs. non-match. The conclusion I came to was that an
-    unoptimized PCRE program is almost double the match and
-    non-match times of the GNU regex library yields, and when using
-    optimization a PCRE program would perform almost the same in the
-    non-match case, but again almost twice that of the match case.
+A note about PCRE vs. GNU regex:
+
+    I ran several tests comparing GNU regex 0.12 to the PCRE 5.0
+    library using 100 million loop iterations: optimized
+    vs. non-optimized, match vs. non-match. The obvious conclusion
+    is that GNU regex is the reigning king of speed, and that with
+    regular expression engines optimization matters significantly.
+
+    (Please note that I tried other third-party regex libraries like
+    RxSpencer's and libhackerlab's, and none came close to
+    comparing.)
 
     The test subject was "how now brown cow", and the pattern we
     were searching for in the match case was "now brown", and in the
     non-match case "not brown". Obviously, the speed of matches is
     directly related to the actual regex itself, and a
     well-formulated regex certainly performs more efficiently than a
-    simple substring match. However, this test is indicative of how
-    most people use ngrep, so the test results are still important.
+    simple substring match. However, this test is reasonably
+    indicative of how most people use ngrep, so the test results are
+    still important.
 
     Granted, on the single-match level the time difference is
-    absolutely unnoticeable (it took 10 million loop iterations to
-    compute it), so this may not mean anything to you. Likewise,
-    the stripped binary sizes are also within 10k of each other on
-    the test compile box.
-
-    If absolute speed is not the issue, then compile against PCRE
-    since it has better licensing. If you're after the fastest you
-    can get (for you netops and netadmins out there, you know who
-    you are), then compile against GNU regex. The speed really
-    helps when piping those 500MB pcap dump files through ngrep over
-    and over.
+    absolutely unnoticeable (it took 100 million loop iterations to
+    compute something worthwhile), so this may not mean anything to
+    you. Likewise, the stripped binary sizes are also within 10k of
+    each other on the test compile box.
+
+    If licensing terms are more sensitive for you than speed, then
+    compile against PCRE which is available under the Artistic
+    License (Free as in Beer). Otherwise, in all other cases the
+    GNU regex library is the best candidate, and the speed can
+    really helps when piping those 500MB pcap dump files through
+    ngrep over and over for analysis.
+
+
+Test results:
+
+    CPU: Intel Pentium-M 2GHz
+         L1 I cache: 32K, L1 D cache: 32K
+         L2 cache: 2048K
+
+    Iterations: 100M
+
+                          match              nomatch
+
+    [-O0]
+      GNU regex-0.12      17.369s/17.385s    32.656s/32.069s
+      PCRE-5.0            35.840s/35.795s    25.340s/25.344s
+
+    [-O2]
+      GNU regex-0.12      12.240s/12.280s    19.512s/19.489s
+      PCRE-5.0            24.580s/24.578s    17.235s/17.238s
+
+
+
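
The test described in doc/PCRE.txt (one fixed subject string, a matching and a non-matching pattern, timed over many loop iterations) could be reproduced with a harness along the following lines. This is only a sketch and not the author's actual test program: it uses the POSIX regex.h interface (regcomp/regexec) as a stand-in for the GNU regex 0.12 and PCRE 5.0 APIs, and the function name bench_regex and its parameters are hypothetical.

```c
/* Hypothetical micro-benchmark sketch. POSIX regex.h stands in for the
 * GNU regex 0.12 and PCRE 5.0 APIs that the original tests used. */
#include <regex.h>
#include <stddef.h>
#include <time.h>

/* Run `iterations` searches of `pattern` against `subject`.
 * Returns the number of successful matches, or -1 if the pattern
 * fails to compile. Elapsed CPU seconds are stored in *seconds
 * when it is non-NULL. */
long bench_regex(const char *pattern, const char *subject,
                 long iterations, double *seconds)
{
    regex_t re;
    long hits = 0;
    long i;
    clock_t start, end;

    /* Compile once, outside the timed loop, so only matching is timed. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    start = clock();
    for (i = 0; i < iterations; i++) {
        /* regexec returns 0 on a match and REG_NOMATCH otherwise. */
        if (regexec(&re, subject, 0, NULL, 0) == 0)
            hits++;
    }
    end = clock();

    regfree(&re);

    if (seconds != NULL)
        *seconds = (double)(end - start) / CLOCKS_PER_SEC;
    return hits;
}
```

Calling bench_regex with the patterns "now brown" and "not brown" against the subject "how now brown cow" mirrors the match and non-match cases from the note. Absolute timings depend on the regex library, the compiler flags (-O0 vs. -O2) and the hardware, so they should not be expected to reproduce the figures in the table.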