We are often told that regexps are slow and should be avoided whenever possible.
However, taking into account the overhead of doing the string manipulation oneself (leaving aside algorithmic mistakes, which are a different matter), especially in \[code\]PHP\[/code\] or \[code\]Perl\[/code\] (maybe \[code\]Java\[/code\]), where is the limit? In which cases can hand-written string manipulation be considered the better alternative? Which regexps are particularly CPU-greedy?
For instance, for the following, in \[code\]C++\[/code\], \[code\]Java\[/code\], \[code\]PHP\[/code\] or \[code\]Perl\[/code\], what would you recommend? The regexps would probably be faster:
- \[code\]s/abc/def/g\[/code\] or a \[code\]... while (($i = index($x, "abc")) >= 0) ... $y .= substr() ...\[/code\] based solution?
- \[code\]s/(\d)+/N/g\[/code\] or a scanning algorithm
- an email validation regexp?
- \[code\]s/((0|\w)+?[xy]*[^xy]){2,7}/u/g\[/code\]
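To make the first two comparisons concrete, here is a minimal sketch in Java of what I mean by "regexp versus hand-written". The class name `ReplaceSketch`, the method names and the `nanosFor` timing helper are all made up for illustration. The timing loop is deliberately crude (no JIT warm-up, no statistics), so treat any numbers it prints as a rough hint, not as a benchmark. Note also that Java's `\d` matches only ASCII digits by default, which may differ from Perl on Unicode strings.

```java
import java.util.function.UnaryOperator;
import java.util.regex.Pattern;

public class ReplaceSketch {
    // Patterns are compiled once; compiling inside the loop would distort the timing.
    private static final Pattern ABC = Pattern.compile("abc");
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Regex version of s/abc/def/g.
    public static String regexReplaceAbc(String input) {
        return ABC.matcher(input).replaceAll("def");
    }

    // Hand-written version of s/abc/def/g: find each hit with indexOf and copy the gaps.
    public static String manualReplaceAbc(String input) {
        StringBuilder out = new StringBuilder(input.length());
        int from = 0;
        int hit;
        while ((hit = input.indexOf("abc", from)) >= 0) {
            out.append(input, from, hit).append("def");
            from = hit + 3; // continue just past the matched "abc"
        }
        return out.append(input, from, input.length()).toString();
    }

    // Regex version of s/(\d)+/N/g (the unused capture group is dropped).
    public static String regexReplaceDigits(String input) {
        return DIGITS.matcher(input).replaceAll("N");
    }

    // Hand-written scanner: each run of ASCII digits becomes a single 'N'.
    public static String manualReplaceDigits(String input) {
        StringBuilder out = new StringBuilder(input.length());
        boolean inRun = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c >= '0' && c <= '9') {
                if (!inRun) {
                    out.append('N');
                }
                inRun = true;
            } else {
                out.append(c);
                inRun = false;
            }
        }
        return out.toString();
    }

    // Hypothetical timing helper: total nanoseconds for `rounds` calls of f.
    public static long nanosFor(UnaryOperator<String> f, String input, int rounds) {
        long sink = 0; // use the results so the calls are not optimised away
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            sink += f.apply(input).length();
        }
        long elapsed = System.nanoTime() - start;
        if (sink < 0) {
            System.out.println(sink); // never true in practice; keeps `sink` live
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Build an arbitrary test input; the content is only an example.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append("xxabc12abcabc 007 abc ");
        }
        String text = sb.toString();
        int rounds = 2000;
        System.out.println("regex  abc:    " + nanosFor(ReplaceSketch::regexReplaceAbc, text, rounds));
        System.out.println("manual abc:    " + nanosFor(ReplaceSketch::manualReplaceAbc, text, rounds));
        System.out.println("regex  digits: " + nanosFor(ReplaceSketch::regexReplaceDigits, text, rounds));
        System.out.println("manual digits: " + nanosFor(ReplaceSketch::manualReplaceDigits, text, rounds));
    }
}
```

For the last two items in the list the hand-written equivalent is far less obvious, which is exactly why I am asking where the break-even point lies.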
In recent Perl / PHP implementations, for instance, what is known to be rather slow and should be avoided?
I am hoping for answers from people who have already done their own research (profiling, etc.) and who can provide general guidelines about what is recommended and what should be avoided.