Bash: Regex matching on multiple lines simultaneously and extracting captured content


I have an XML file in the following format:

<starttag name="aaa" >
    <innertag name="xxx" value="xxx"/>
    <innertag name="xxx" value="xxx"/>
    <innertag name="xxx" value="yyy"/>
</starttag>
<starttag name="bbb" >
    <innertag name="xxx" value="xxx"/>
    <innertag name="xxx" value="xxx"/>
    <innertag name="xxx" value="xxx"/>
</starttag>
<starttag name="ccc" >
    <innertag name="xxx" value="xxx"/>
    <innertag name="xxx" value="xxx"/>
    <innertag name="xxx" value="yyy"/>
</starttag>
..
..
..

I want to extract the name attribute of every starttag that has an innertag with the value yyy.

So for the file above, the output should be aaa and ccc. I can use any regex matching tool. I suppose this is possible using lookaheads, but I am not able to create regex patterns that span multiple lines. I know how to use regex on a single line and tried the same approach here, but I am not getting the expected output. Any headway on this would be appreciated.

Edit: Though I have used XML as the example, what I am trying to learn is multiline regex matching, and my attempts on this file keep failing. Please avoid XML-parsing-related solutions.

Update: As per Steven's suggestion, the following worked:

pcregrep -M '<starttag name="([^"]*)"[^>]*>(\s|<innertag[^>]*>)*<innertag name="[^"]*" value="yyy"\/>(\s|<innertag[^>]*>)*<\/starttag>' file.xml

grep -Pzo '<starttag name="([^"]*)"[^>]*>(\s|<innertag[^>]*>)*<innertag name="[^"]*" value="yyy"\/>(\s|<innertag[^>]*>)*<\/starttag>' file.xml
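The two commands above print every matching starttag block in full. If only the name values are wanted, a rough sketch along the following lines might do. It assumes GNU grep built with PCRE support; the temporary sample file and the variable names are made up for illustration. `\K` discards the part of the match before the name, and a lookahead checks for a yyy innertag without including it in the output.

```shell
#!/usr/bin/env bash
# Hypothetical sample file, mirroring the format shown in the question.
xml_file=$(mktemp)
cat > "$xml_file" <<'EOF'
<starttag name="aaa" >
    <innertag name="xxx" value="xxx"/>
    <innertag name="xxx" value="yyy"/>
</starttag>
<starttag name="bbb" >
    <innertag name="xxx" value="xxx"/>
</starttag>
<starttag name="ccc" >
    <innertag name="xxx" value="xxx"/>
    <innertag name="xxx" value="yyy"/>
</starttag>
EOF

# \K drops everything matched so far, so only the name value is reported.
# The lookahead (?=...) requires an innertag with value="yyy" in the same
# block, but does not make it part of the printed match.
pattern='<starttag name="\K[^"]*(?="[^>]*>(\s|<innertag[^>]*>)*<innertag name="[^"]*" value="yyy"/>)'

# -P: Perl-compatible regex; -z: read the file as one NUL-terminated record,
# so the pattern can span newlines; -o: print only the matched text.
# With -z each match ends in a NUL byte, which tr turns into a newline.
names=$(grep -Pzo "$pattern" "$xml_file" | tr '\0' '\n')
printf '%s\n' "$names"

rm -f "$xml_file"
```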

An XML parser, especially one that supports XPath, is going to be far easier and more stable. But if you must insist on using regex, here's a pattern that will work with the sample input you provided:

<starttag name="([^"]*)"[^>]*>(\s|<innertag[^>]*>)*<innertag name="[^"]*" value="yyy"\/>(\s|<innertag[^>]*>)*<\/starttag>

It's not going to work with all variations of well-formed XML documents, but as long as your files are consistently formatted like your example, it should be "okay".
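To pull out just the captured name rather than the whole block, a Bash loop along these lines could work. Treat it as a sketch under a few assumptions: the `content` variable and its sample data are made up, the pattern is rewritten in POSIX ERE because that is what `[[ =~ ]]` uses (so `[[:space:]]` stands in for `\s`), and the capture group is written as `([^"]*)` so that it holds the full name.

```shell
#!/usr/bin/env bash
# Hypothetical sample input held in a variable; a real file could be loaded
# with content=$(<file.xml) instead.
content='<starttag name="aaa" >
    <innertag name="xxx" value="xxx"/>
    <innertag name="xxx" value="yyy"/>
</starttag>
<starttag name="bbb" >
    <innertag name="xxx" value="xxx"/>
</starttag>
<starttag name="ccc" >
    <innertag name="xxx" value="yyy"/>
</starttag>'

# The pattern stops at the yyy innertag; the rest of the block is not needed
# to decide whether the starttag qualifies.
re='<starttag name="([^"]*)"[^>]*>([[:space:]]|<innertag[^>]*>)*<innertag name="[^"]*" value="yyy"/>'

names=()
while [[ $content =~ $re ]]; do
    # BASH_REMATCH[1] is the first capture group: the name value.
    names+=("${BASH_REMATCH[1]}")
    # Remove everything up to and including this match, then search again.
    content=${content#*"${BASH_REMATCH[0]}"}
done

printf '%s\n' "${names[@]}"
```

Because `[[ =~ ]]` only reports one match at a time, the loop trims the already-matched prefix from `content` before each retry.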

By default, a regex matches across multiple lines. There is an option that tells it to process one line at a time, but that's not usually turned on by default. The real trick is that the . pattern will not match new-line characters, so if you want to match any character, including new-lines, you need to use .|\n or a negative character class such as [^>].
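The difference can be seen with a small experiment. This is only a sketch, assuming GNU grep with PCRE support (`-P`); the two-line sample tag and the variable names are invented for the demonstration.

```shell
#!/usr/bin/env bash
# Hypothetical tag split over two lines, used only to show the difference.
sample=$(mktemp)
printf '<starttag\nname="aaa">\n' > "$sample"

# '.' stops at the newline, so this pattern never reaches the closing '>'.
dot_match=$(grep -Pzo '<starttag.*>' "$sample" | tr '\0' '\n')

# A negated character class matches anything except '>', newlines included.
class_match=$(grep -Pzo '<starttag[^>]*>' "$sample" | tr '\0' '\n')

# (.|\n) explicitly allows either a non-newline character or a newline.
alt_match=$(grep -Pzo '<starttag(.|\n)*?>' "$sample" | tr '\0' '\n')

echo "dot:   [${dot_match}]"
echo "class: [${class_match}]"
echo "alt:   [${alt_match}]"

rm -f "$sample"
```

Only the last two patterns should find the tag; the first comes back empty because `.` cannot cross the line break.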

