I’ve spent most of the afternoon working on a complex regex to parse command line argument forms (for lack of a better term). If you’ve ever run the man command you’ll know what I’m talking about. Take the tar command as an example:
If you’re already familiar with regular expressions you’ll know that doing something like:
trying to match:
won’t accomplish what you think. Instead of getting just the first set of brackets, you’ll end up with the whole remainder of the string. This is because of a feature, we’ll give it that title, called greedy matching. Greedy matching means the regex takes the largest possible chunk it can match, which in this case runs all the way to the ‘]’ on the end of pathname.
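Here’s a minimal sketch of that behaviour in Python. The usage string and the pattern are stand-ins I made up for illustration, not the actual tar synopsis or the regex from my afternoon of work, but the effect is the same.

```python
import re

# Hypothetical usage line (an assumption for this demo, not real tar output).
usage = "cmd [-v] [-f file] [pathname]"

# Greedy: .* first swallows everything to the end of the string, then backs
# off one character at a time until the trailing \] can match. That happens
# at the LAST ']' in the string, not the first.
greedy = re.search(r"\[.*\]", usage)
print(greedy.group(0))  # [-v] [-f file] [pathname]
```

The match starts at the first ‘[’ as expected, but it only stops at the final ‘]’, so all three bracketed groups come back as one chunk.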
I was aware of what was going on, but not being a particular master of regular expressions, I wasn’t sure how to get it to stop being greedy. As it turns out, it’s quite easy.
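- .* - Greedy matching (zero or more characters, as many as possible)
- .+ - Greedy matching (one or more characters, as many as possible)
- .*? - Non-greedy matching (zero or more characters, as few as possible)
- .+? - Non-greedy matching (one or more characters, as few as possible)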
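To see the non-greedy forms in action, here’s a small Python sketch. Again, the usage string is a made-up stand-in rather than a real man page synopsis, and the patterns are just one plausible way to pull out the bracketed pieces.

```python
import re

# Hypothetical usage line (an assumption for this demo).
usage = "cmd [-v] [-f file] [pathname]"

# Non-greedy: .*? expands one character at a time and stops at the
# first ']' that lets the whole pattern match.
first = re.search(r"\[.*?\]", usage).group(0)
print(first)  # [-v]

# With a capture group, findall returns the contents of every bracketed group.
groups = re.findall(r"\[(.+?)\]", usage)
print(groups)  # ['-v', '-f file', 'pathname']
```

Using .+? instead of .*? in the second pattern means an empty pair of brackets won’t count as a match, which seemed like the sensible choice for argument forms.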
It could not be easier, once you know about it of course.