Hacker News

Okay but think:

If you are searching a very large file for a very few occurrences of the expected match then this optimization is not so bad.

If you are running line by line through a very large log file to extract just those two pieces of information per line, then skip the first N characters of each line (where N is the hopefully constant length of your timestamp plus that space char) and start the regex engine at the beginning of the expected match. Then it doesn't have to waste any time passing over those chars.
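A minimal sketch of that idea in Python, assuming a made-up log layout: the timestamp format, the `user=`/`status=` fields, and the names `PREFIX_LEN`, `FIELDS`, and `extract` are all hypothetical and only there for illustration. It relies on the `pos` argument of a compiled pattern's `match` method to start the attempt after the timestamp.

```python
import re

# Assumed (hypothetical) log layout: "YYYY-MM-DD HH:MM:SS " followed by the payload.
# 19 timestamp characters plus one space gives a constant prefix of 20.
PREFIX_LEN = len("2024-01-01 12:00:00 ")

# Hypothetical pattern that pulls two fields out of the payload.
FIELDS = re.compile(r"user=(\w+) status=(\d+)")


def extract(line):
    """Return (user, status) as strings, or None if the payload does not match."""
    # Pattern.match(string, pos) begins the match attempt at index pos,
    # so the engine never looks at the timestamp characters.
    m = FIELDS.match(line, PREFIX_LEN)
    return m.groups() if m else None
```

Passing `pos` avoids building a new string per line, which slicing with `line[PREFIX_LEN:]` would do. This only works if the prefix length really is constant; a variable-width timestamp would need a different approach.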

Even if the exact details above aren't quite right, the principle is (and it is well known): Avoid premature optimization! (And the corollary: Measure it. Profile your code, don't guess, you're probably wrong.)
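For the "measure it" part, something like the following could be a starting point. It is only a sketch: the sample line, both patterns, and the function names are invented for illustration, and the numbers will depend on your data and your regex engine.

```python
import re
import timeit

# Hypothetical sample line and patterns, for illustration only.
LINE = "2024-01-01 12:00:00 user=alice status=200"
PREFIX_LEN = 20  # assumed constant timestamp length plus one space

WHOLE = re.compile(r"^\S+ \S+ user=(\w+) status=(\d+)")  # scans the timestamp too
TAIL = re.compile(r"user=(\w+) status=(\d+)")            # matched from PREFIX_LEN


def whole_line():
    return WHOLE.match(LINE).groups()


def skip_prefix():
    return TAIL.match(LINE, PREFIX_LEN).groups()


def measure(number=100_000):
    """Time both variants and return {function name: total seconds}."""
    return {
        f.__name__: timeit.timeit(f, number=number)
        for f in (whole_line, skip_prefix)
    }
```

Calling `measure()` returns the total time per variant. Run it against lines from your real log before deciding whether the trick is worth the extra fragility.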



There's a higher order solution. Read the AWK book, do the exercises, then ignore blog posts about regexps. Make an exception when it's Russ Cox demonstrating how often this wheel has been reinvented in square form.


Which AWK book? The one by A, W, and K?


Yes.



