updated with tests against pcre 5.0 - ongrep - A cleaned up fork of ngrep for OpenBSD

commit 1874e13047af2dc8724bc261be5f544c81ed3b20
parent e147909a9d67f323e4f4e61fa37362196e38be88
Author: Jordan Ritter <jpr5@darkridge.com>
Date:   Tue, 22 Feb 2005 06:01:09 +0000

updated with tests against pcre 5.0

Diffstat:
M doc/PCRE.txt  | 69 ++++++++++++++++++++++++++++++++++++++++++++++++---------------------

1 file changed, 48 insertions(+), 21 deletions(-)
diff --git a/doc/PCRE.txt b/doc/PCRE.txt
@@ -1,32 +1,59 @@
 $Id$
 
-A quick note on PCRE vs GNU regex:
+Date: 2/21/05
 
-      I ran several tests comparing GNU regex to the PCRE library
-      using 10 million loop iterations: optimized vs. non-optimized,
-      match vs. non-match.  The conclusion I came to was that an
-      unoptimized PCRE program is almost double the match and
-      non-match times of the GNU regex library yields, and when using
-      optimization a PCRE program would perform almost the same in the
-      non-match case, but again almost twice that of the match case.
+A note about PCRE vs. GNU regex:
+
+      I ran several tests comparing GNU regex 0.12 to the PCRE 5.0
+      library using 100 million loop iterations: optimized
+      vs. non-optimized, match vs. non-match.  The obvious conclusion
+      is that GNU regex is the reigning king of speed, and that with
+      regular expression engines optimization matters significantly.
+
+      (Please note that I tried other third-party regex libraries like
+      RxSpencer's and libhackerlab's, and none came close to
+      comparing.)
 
       The test subject was "how now brown cow", and the pattern we
       were searching for in the match case was "now brown", and in the
       non-match case "not brown".  Obviously, the speed of matches is
       directly related to the actual regex itself, and a
       well-formulated regex certainly performs more efficiently than a
-      simple substring match.  However, this test is indicative of how
-      most people use ngrep, so the test results are still important.
+      simple substring match.  However, this test is reasonably
+      indicative of how most people use ngrep, so the test results are
+      still important.
 
       Granted, on the single-match level the time difference is
-      absolutely unnoticeable (it took 10 million loop iterations to
-      compute it), so this may not mean anything to you.  Likewise,
-      the stripped binary sizes are also within 10k of each other on
-      the test compile box.
-
-      If absolute speed is not the issue, then compile against PCRE
-      since it has better licensing.  If you're after the fastest you
-      can get (for you netops and netadmins out there, you know who
-      you are), then compile against GNU regex.  The speed really
-      helps when piping those 500MB pcap dump files through ngrep over
-      and over.
+      absolutely unnoticeable (it took 100 million loop iterations to
+      compute something worthwhile), so this may not mean anything to
+      you.  Likewise, the stripped binary sizes are also within 10k of
+      each other on the test compile box.
+
+      If licensing terms matter more to you than speed, then compile
+      against PCRE, which is available under the Artistic License
+      (Free as in Beer).  Otherwise, the GNU regex library is the
+      best candidate, and the speed really helps when piping those
+      500MB pcap dump files through ngrep over and over for
+      analysis.
+
+
+Test results:
+
+     CPU: Intel Pentium-M 2GHz
+          L1 I cache: 32K, L1 D cache: 32K
+          L2 cache: 2048K
+
+     Iterations: 100M
+
+                           match             nomatch
+
+         [-O0]
+     GNU regex-0.12    17.369s/17.385s    32.656s/32.069s
+     PCRE-5.0          35.840s/35.795s    25.340s/25.344s
+
+         [-O2]
+     GNU regex-0.12    12.240s/12.280s    19.512s/19.489s
+     PCRE-5.0          24.580s/24.578s    17.235s/17.238s
+
+
+
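The commit does not include the benchmark harness itself. As a rough illustration of the kind of loop the note describes (one subject string, one pattern, many iterations, timed as a whole), here is a minimal sketch in C. It is an assumption, not the author's code: it uses the POSIX regcomp()/regexec() interface rather than the native GNU regex 0.12 or PCRE 5.0 APIs, and the function names count_matches() and time_run() are hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>
#include <time.h>

/* Run regexec() `iterations` times against `subject` and count the hits.
 * Returns -1 if the pattern fails to compile.
 * Assumption: POSIX regex API, not the library-native APIs benchmarked
 * in the note. */
long count_matches(const char *pattern, const char *subject, long iterations)
{
    regex_t re;
    long hits = 0;

    assert(iterations >= 0);

    /* Compile once, outside the loop, so only matching is measured. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    for (long i = 0; i < iterations; i++)
        if (regexec(&re, subject, 0, NULL, 0) == 0)
            hits++;

    regfree(&re);
    return hits;
}

/* Time one run with clock() and print the CPU seconds it took.
 * The subject string is the one quoted in the note. */
double time_run(const char *label, const char *pattern, long iterations)
{
    clock_t start = clock();
    long hits = count_matches(pattern, "how now brown cow", iterations);
    double secs = (double)(clock() - start) / CLOCKS_PER_SEC;

    printf("%-8s %ld/%ld matched, %.3fs\n", label, hits, iterations, secs);
    return secs;
}
```

Calling time_run("match", "now brown", n) and time_run("nomatch", "not brown", n) from a main() would mirror the two columns of the table above. Building the same file once with -O0 and once with -O2 would mirror its two row groups, although timings from this sketch are not comparable to the published numbers.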