[ale] Regex Assistance

Alex Carver agcarver+ale at acarver.net
Mon May 13 20:41:04 EDT 2019


It's going to depend on the regex engine.

If you search on the topic of matching a capture group multiple times
you'll find that some engines can return every repetition of a repeated
group from a pattern like this (Perl syntax):

qr/(?:(pattern).*?)+/;

Other engines keep only the last repetition and require explicit group
notation, or a global match, to extract each one.
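
To illustrate the second case, here is a minimal sketch using Python's
standard re module. Python is my choice for illustration only; nobody
in the thread named an engine, and the sample text and variable names
are made up. With this engine a repeated group reports only its final
repetition, while a global match returns each one.

```python
import re

sample = "AA;BB;CC;"  # made-up sample, not the data from the thread

# A capture group inside a repeated non-capturing group: Python's re
# keeps only the last repetition in group 1.
m = re.match(r"(?:([A-Z]+);)+", sample)
last_only = m.group(1)

# A global match (findall) returns every repetition instead.
all_fields = re.findall(r"([A-Z]+);", sample)

print(last_only)
print(all_fields)
```

So with an engine that behaves like this, the repeated-group form alone
is not enough, and a global match or explicit groups are the fallback.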

So, in theory, with the right engine you could do this with your
subprocessed string (and possibly without it, too).

qr/(?:([0-9A-Z]+)\x1D.*?)+/;


All of this is untested.
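
For what it's worth, here is one way the whole thing might look with a
global match, sketched in Python's re module (again my assumption, not
an engine anyone in the thread specified). The payload is transcribed
from the quoted message below with the readability whitespace removed,
which involves a guess about how the wrapped lines join. The names
FIELD, fields and fields_two_step are hypothetical.

```python
import re

# Payload transcribed from the quoted message, with the readability
# whitespace removed (an assumption about how the wrapped lines join).
data = (
    "[)>\x1e06\x1dY7130700000000Y\x1dP84469826\x1d12V654663145"
    "\x1dT1118360000100078\x1dS100078\x1d2D1226181\x1dPCXMG29N04D"
    "\x1e\x04\r\n"
)

# One expression: a field is a run of characters that are not
# separators or terminators, sitting between two GS/RS bytes.
FIELD = re.compile(r"(?<=[\x1d\x1e])[^\x1d\x1e\x04\r\n]+(?=[\x1d\x1e])")
fields = FIELD.findall(data)

# Two steps, for comparison: grab the envelope, then split on GS.
envelope = re.match(r"\[\)>\x1e(.+)\x1e\x04", data)
fields_two_step = envelope.group(1).split("\x1d")

for field in fields:
    print(field)
```

Note that the one-expression version does not check for the "[)>"
start marker; it only relies on the separators. If validating the
envelope matters, the two-step version is probably the safer route.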


On 2019-05-13 17:02, Calvin Harrigan via Ale wrote:
> On 5/13/2019 7:14 PM, Byron Jeff wrote:
>> sed -e 's//Actual question about regex?/'
>>
>> BAJ
>>
>> On Mon, May 13, 2019 at 07:01:12PM -0400, Calvin Harrigan via Ale wrote:
>>> _______________________________________________
>>> Ale mailing list
>>> Ale at ale.org
>>> https://mail.ale.org/mailman/listinfo/ale
>>> See JOBS, ANNOUNCE and SCHOOLS lists at
>>> http://mail.ale.org/mailman/listinfo
> 
> I know right?  Sorry...
> 
> The source string seems to be getting sanitized by my email client.
> There are some special characters in it, so I've improvised.  RS =
> Record Separator (0x1E), GS = Group Separator (0x1D), EOT = End of
> Transmission (0x04), CR = Carriage Return (0x0D), LF = Line Feed
> (0x0A).  There is no whitespace in the actual data; I've only added
> it for readability.
> 
> Some assembly required...  I've also attached a file with the correct
> contents.
> 
> "[)>" Can be considered a start marker. Everything else I want to 
> capture into separate groups.
> 
> [)> RS
> 06 GS
> Y7130700000000Y GS
> P84469826 GS
> 12V654663145 GS
> T1118360000100078 GS
> S100078 GS
> 2D1226181 GS
> PCXMG29N04D RS EOT CR LF
> 
> So far I've been able to create a group that contains everything between
> the two RS tags/bytes/chars.  After that extraction I can split it on
> the GS boundaries, but I would like to be able to do it all in one
> expression.
> 
> Group set extraction = ^\[\)\>\x1e(.+)\x1e\x04$
> 
> SubGroup Split =
> 
> OneExpressionToRuleThemAll =