[ale] hmm. yer never too old to trip on Grep Reg Expressions
DJ-Pfulio
djpfulio at jdpfu.com
Sun Sep 25 09:12:32 EDT 2016
Good explanation Charles, but something looks funny. I created some files to test:
$ touch aaaa bb ccc i
# Then tried the regex:
$ ls -1 |grep "[abc]+"
<nothing returned>
$ ls -1 |grep "[abc]"
aaaa
bb
ccc
$ ls -1 |grep "[abc]*"
aaaa
bb
ccc
i
# But if we used egrep (or grep -E if you like)
$ ls -1 |egrep "[abc]+"
aaaa
bb
ccc
Gotta know which regex engine is being used. ;)
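For reference, a minimal sketch of why the results above differ, assuming a POSIX-style grep such as GNU grep. In basic regex (plain grep) an unescaped "+" is a literal plus sign, and "*" allows zero repetitions, so "[abc]*" matches every line, including "i". The variable names are only for illustration, and printf stands in for ls so no files are needed.

```shell
# Same four names as the touch example, fed through printf instead of ls.
names=$(printf '%s\n' aaaa bb ccc i)

# BRE: "+" is a literal plus sign, so nothing matches.
bre_plus=$(printf '%s\n' "$names" | grep '[abc]+' || true)

# BRE, portable spelling of "one or more": the class, then class-star.
bre_portable=$(printf '%s\n' "$names" | grep '[abc][abc]*')

# ERE (grep -E, a.k.a. egrep): "+" means one or more of the preceding item.
ere_plus=$(printf '%s\n' "$names" | grep -E '[abc]+')

# "*" allows zero repetitions, so the empty match succeeds on "i" too.
bre_star=$(printf '%s\n' "$names" | grep '[abc]*')

printf 'BRE "+":      [%s]\n' "$bre_plus"
printf 'BRE portable: [%s]\n' "$bre_portable"
printf 'ERE "+":      [%s]\n' "$ere_plus"
printf 'BRE "*":      [%s]\n' "$bre_star"
```

The "[abc][abc]*" form is handy when a script has to run under a grep whose flavor is not known in advance.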
In perl, I've used numbers after an [] group to say exactly how many of _those
things_ I needed. That didn't work with grep/egrep. Don't know why, just know it
didn't.
$ ls -1 |grep "[abc]c"
ccc
# and
$ ls -1 |grep "^[abc]c"
ccc
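On the repeat counts: it is not clear from the above which syntax was tried, but POSIX grep does support counts, spelled as interval expressions rather than a bare number. A hedged sketch, assuming a POSIX-style grep: ERE uses {n}, BRE uses \{n\}. The -x flag (whole line must match) is added here so the count is exact; the variable names are made up for the example.

```shell
# Same four names as the earlier test files.
names=$(printf '%s\n' aaaa bb ccc i)

# ERE interval: exactly three characters from the class, whole line (-x).
ere_three=$(printf '%s\n' "$names" | grep -E -x '[abc]{3}')

# BRE interval: same meaning, but the braces are backslash-escaped.
bre_three=$(printf '%s\n' "$names" | grep -x '[abc]\{3\}')

# Range form: between two and four characters from the class.
ere_range=$(printf '%s\n' "$names" | grep -E -x '[abc]{2,4}')

printf 'ERE {3}:   [%s]\n' "$ere_three"
printf 'BRE \\{3\\}: [%s]\n' "$bre_three"
printf 'ERE {2,4}: [%s]\n' "$ere_range"
```

Without -x or anchors, "[abc]{3}" would also match "aaaa", since any three-character run inside the line is enough.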
Should also mention that piping ls output into grep is just to avoid bash
globbing, so grep is what actually does the matching. That ls option is -1
(one), not -l (el).
For a few years, I had to create some very nasty regexes to match patterns in govt
documents ... so we could hyperlink the ToC and Index entries into the document
at the correct page/paragraph. Nothing like experience to teach. About 200 docs
per flight, so lots of variability. Adobe Type 3 fonts really screwed with our
regexes since they aren't really letters to the computer. ;(
On 09/23/2016 09:01 AM, Charles Shapiro wrote:
>
> Ah, regex golf. Try 'def.*buff.*for.*ALTPLAN'. Use "grep -i" to ignore case.
> Your initial regexp used *file glob* syntax, where "*" means any characters, any
> length. In the proper formal dialects, "*" merely means any number of the
> preceding RE, and the "." means any character. Hence, "foo*" in the shell
> matches "fooa", "foob", et cetera. But in regex, it matches only "fo", "foo",
> "fooo", et cetera. Watch out for quoting in the shell also; that's why I used
> single-quotes. Knowing just a few REs can carry you a surprising distance.
> [abc] matches a single character: a, b, or c. So "[abc]+" matches aaaa, bb, or
> ccc but not i.
>
>
> This worked for me on the following file:
>
> define buffer snort for ALTPLAN
> DEFINE BUFFER BOOF for ALTPLAN
> FOO
>
> !:/home/cshapiro/Mapping_Contracts/forsythco> grep -i '^def.*buf.*for ALTPLAN' foo.txt
> define buffer snort for ALTPLAN
> DEFINE BUFFER BOOF for ALTPLAN
>
> For extra fun, try the regex golf site ( http://www.regex.alf.nu/ ).
>
> -- CHS
>
>
> On Thu, Sep 22, 2016 at 8:35 PM, DJ-Pfulio <DJPfulio at jdpfu.com
> <mailto:DJPfulio at jdpfu.com>> wrote:
>
> I'd use perl. Trivial to read a file, find the lines matching any
> complex regex you like, back up 3 lines and print the following 14 lines.
> Don't forget to handle lines that happen inside the group to be
> exported. Would be good to show file:linenum:LINE so it is clear -
> perhaps highlight the actual line with << >> - idunno.
>
> I like Leam's regex except the leading ^ and trailing $ - these things
> don't need to start in col-1 or end of line. Otherwise, probably
> restrictive enough to minimize unwanted output.
>
> On 09/22/2016 07:30 PM, Leam Hall wrote:
> > Why not "^def*buff*altplan$"? Then "grep -v" out things you don't want.
> >
> > On 09/22/16 14:46, Neal Rhodes wrote:
> >> So, I need to look in about a bazillion source files for variants of
> >>
> >> DEFINE BUFFER SNORT FOR ALTPLAN.
> >> Define Buffer Blech for AltPlan.
> >> Def Buff Blurf for AltPlan.
> >> Def Buff Blurf for AltPlan.
> >> def buff blurf for altplan.
> >> define buff blurf for altplan.
> >> define buffer blorf for
> >> altplan.
> >> define new shared buffer blorf for altplan.
> >>
> >> And grab 3 lines before, 10 lines afterwards, source file and line#.
> >>
> >> I was thinking this would do it:
> >>
> >> grep -i -B 3 -A 10 -H -n -r -f buf-grep.inp * > buf.grep.out
> >>
> >> Where buf-grep.inp was
> >>
> >> def*buff*for*ALTPLAN
> >>
> >> def*buff*for*ARM
> >>
> >> def*buff*for*ARMNOTE
> >>
> >> Alas it is not thus, and the more I study the reg exp notes the more I
> >> see the error of my ways, and the less I see an expression that would
> >> work.
> >>
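
Pulling the thread together, here is a rough sketch of the pattern file rewritten in regex syntax, as suggested earlier in the thread, assuming a POSIX-style grep. The scratch directory and the file name sample.p are made up for the demo; the sample lines come from the original question. Since grep matches one line at a time, the statement split across two lines is not found by this approach.

```shell
# Scratch directory with one hypothetical source file (sample.p is a made-up name).
work=$(mktemp -d)
cat > "$work/sample.p" <<'EOF'
DEFINE BUFFER SNORT FOR ALTPLAN.
Def Buff Blurf for AltPlan.
define new shared buffer blorf for altplan.
define buffer blorf for
altplan.
FOO
EOF

# Pattern file in regex syntax: ".*" means "any run of characters".
cat > "$work/buf-grep.inp" <<'EOF'
def.*buff.*for.*ALTPLAN
def.*buff.*for.*ARM
def.*buff.*for.*ARMNOTE
EOF

# -i ignore case, -H print file name, -n print line number,
# -f read one pattern per line from the file.
hits=$(grep -i -H -n -f "$work/buf-grep.inp" "$work/sample.p")
printf '%s\n' "$hits"

# The glob-style original, read as a regex, means "de", zero or more "f",
# "buf", zero or more "f", "fo", zero or more "r", "ALTPLAN": no line has that.
glob_hits=$(grep -i -c 'def*buff*for*ALTPLAN' "$work/sample.p" || true)
printf 'glob-style pattern matched %s lines\n' "$glob_hits"

rm -r "$work"
```

The -B 3, -A 10 and -r options from the original command can be added back unchanged; they are left out here only to keep the output short.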