[ale] Regex question

Fri Mar 12 10:43:32 EST 2004

There is one thing I forgot to mention.  After every error code there is
a space.  That may help me simplify the RE.
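
A minimal sketch of how the trailing space might be used, assuming each code is "XMI" plus exactly three digits. The sample lines are made up for illustration; only the "XMI" prefix comes from this thread. The pattern repeats [0-9] three times and then requires a literal space, so it needs neither \d nor {3} and should behave the same in any awk:

```shell
# Hypothetical sample input; only the "XMI" prefix comes from the thread.
sample='XMI313 link down
XMI0123 bogus
XMI31 short'

# Exactly three digits followed by a literal space.
# The space stops XMI0123 from matching on its first three digits.
matches=$(printf '%s\n' "$sample" | awk '/XMI[0-9][0-9][0-9] / {print $0}')
printf '%s\n' "$matches"
```

Note that this only pins down the end of the code. Something like "AXMI313 " would still match, so the start may need its own anchor depending on what the stream looks like.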

On Fri, 2004-03-12 at 02:41, Joe Knapka wrote:
> Chris Fowler <cfowler at outpostsentinel.com> writes:
> 
> > On Thu, 2004-03-11 at 23:34, Mike Murphy wrote:
> > > ah, yeah, stuff like awk won't know what \d is. Try this:
> > > 
> > > echo XMI313 | awk '/XMI[0-9][0-9][0-9]/ {print $0}'
> > > 
> > I think this is the first one that would be required.
> > 
> > > this is a little neater:
> > > 
> > > echo XMI333 | awk '/XMI[0-9]+/ {print $0}'
> > > 
> > 
> > If I did XMI0 then the string would hit.  The remaining 01 would be
> > ignored by the program. 
> 
> No. REs are (almost always) greedy by default. However, it would match
> the lone string "XMI0", as well as the string "XMI0123", which you
> apparently don't want. To get exactly what you want you can use
> 
> (XMI([0-9]{3})?)
> 
> Which says, "XMI, possibly followed by exactly three occurrences of a
> decimal digit." Or if your regexp engine doesn't understand {}, just
> explicitly repeat the [0-9] three times:
> 
> (XMI([0-9][0-9][0-9])?)
> 
> The above RE should work everywhere REs are spoken. Both of these
> have the effect of creating an extraneous capture group, which might
> cause problems in some contexts. Perl-compatible REs have a syntax for
> a non-capturing group, (?:...), which here gives
> 
> (XMI(?:[0-9]{3})?)
> 
> -- Joe
> 
> > That would be an unanticipated consequence.
> > So an error code is basically a few letters followed by numbers.  The
> > letters represent a group and the numbers represent a specific error
> > condition.  I'm trying to group them to reduce load on the program.  I
> > have to search for 1000 possible errors.
> > 
> > 
> > > but could have unanticipated consequences. For instance, if you do that 
> > > with an input of something like 'XMI3334', it's going to find that, but 
> > > that's also true of the first example (because the substring matched). 
> > > That's probably ok for your purposes. If not, you might try anchoring 
> > > that with a ^ and a $ if necessary (assuming that would work for your 
> > > stream).
> > > 
> > > Mike
> > > 
> > > 
> > > Christopher Fowler wrote:
> > > > On Thu, Mar 11, 2004 at 11:13:04PM -0500, Mike Murphy wrote:
> > > > 
> > > >>unless I'm missing something, something like this:
> > > >>
> > > >>=~ /(XMI\d\d\d)/ should work. The entire string matched will show up in 
> > > >>$1 afterward. This presumes that no other characters will show in the 
> > > >>string.
> > > > 
> > > > 
> > > > I must be doing something wrong then.  I'm using awk to validate the regex.
> > > > The Perl Expect module will actually do the matching based on the regular
> > > > expression, so I do not think anything that is Perl-specific will work.  That
> > > > is why I'm testing with awk.
> > > > 
> > > > echo XMI | awk '/XMI\d\d\d/ {print $0}' 
> > > > 
> > 
> > 
> >
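
To make the difference between the patterns discussed above concrete, here is a rough sketch using grep -E rather than awk, since some awk builds do not accept {3} without extra flags. The three input lines are hypothetical, and the anchored form assumes each code sits alone on its line, which may not hold for the real stream:

```shell
# Hypothetical inputs, one candidate code per line.
codes='XMI0
XMI313
XMI3334'

# Unanchored "one or more digits": every line contains a match,
# so grep -c reports all three lines.
loose=$(printf '%s\n' "$codes" | grep -E -c 'XMI[0-9]+')

# Anchored "exactly three digits": XMI0 is too short, and XMI3334
# has a fourth digit where $ expects the end of the line.
strict=$(printf '%s\n' "$codes" | grep -E '^XMI[0-9]{3}$')

printf 'loose count: %s\nstrict match: %s\n' "$loose" "$strict"
```

If the codes appear in the middle of longer lines, the ^ and $ anchors would have to be replaced by whatever delimits a code in practice, such as the trailing space mentioned at the top of this message.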