Regex to extract multiple URLs
I need to extract the category values from DoubleClick URLs that sit within the web source code across a full site:
<script type="text/javascript"> var axel = Math.random() + ""; var a = axel * 10000000000000; document.write('<iframe src="https://1234567.fls.doubleclick.net/activityi;src=1234567;type=examp123;cat=examp999;ord=1;num=' + a + '?" width="1" height="1" frameborder="0" style="display:none"></iframe>'); </script> <noscript><iframe src="https://1234567.fls.doubleclick.net/activityi;src=1234567;type=examp456;cat=examp888;ord=1;num=1?" width="1" height="1" frameborder="0" style="display:none"></iframe></noscript>
What I want to extract is as follows:
- cat 1 = examp999
- cat 2 = examp888
I have tried the below, following @ankitmishra's answer:
https:\/\/(?:.*.doubleclick.net).*cat=([^;]*);
This returns both values, but the tool I am using to crawl the pages of the website returns only 1 match per regex. It doesn't support multiple values; it returns the first match. How can I create a 2nd string that captures the 2nd cat value?
Also, if 1 cat value is a default, i.e. defau123, can I use this within the rule above to ignore cat values of defau123 and pass anything else?
^((?!defau123).)*$
Any help is appreciated!
If you are trying to be more selective in your regex, you need to expand the regex to match more data. If you can use a lookbehind, ensure that ><iframe src=" comes immediately before the URL:
(?<=\>\<iframe\ src\=\")https:\/\/(?:.*.doubleclick.net).*cat=([^;]*);
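The lookbehind pattern above can be tried out with a short script. This is only a sketch: it uses Python's re module as a stand-in, since the crawl tool's regex flavor is not stated, and PAGE is a trimmed copy of the sample source from the question. STRICT_PATTERN and the helper first_cat are illustrative additions, not part of the original answer. The stricter variant escapes the dots and keeps the match inside one src attribute.

```python
import re

# Trimmed copy of the page source from the question, kept on a single line.
PAGE = (
    "<script> document.write('<iframe src=\"https://1234567.fls.doubleclick.net/"
    "activityi;src=1234567;type=examp123;cat=examp999;ord=1;num=' + a + '?\">"
    "</iframe>'); </script> "
    '<noscript><iframe src="https://1234567.fls.doubleclick.net/'
    'activityi;src=1234567;type=examp456;cat=examp888;ord=1;num=1?">'
    "</iframe></noscript>"
)

# The lookbehind pattern from the answer, unchanged.
ANSWER_PATTERN = r'(?<=\>\<iframe\ src\=\")https:\/\/(?:.*.doubleclick.net).*cat=([^;]*);'

# Hypothetical stricter variant: escaped dots, and [^"]* instead of .* so the
# match cannot run past the closing quote of the src attribute.
STRICT_PATTERN = r'(?<=><iframe src=")https://[^"]*\.doubleclick\.net[^"]*?;cat=([^;"]*)'


def first_cat(pattern, page):
    """Return capture group 1 of the first match, or None if nothing matches."""
    match = re.search(pattern, page)
    return match.group(1) if match else None


print(first_cat(ANSWER_PATTERN, PAGE))  # examp888
print(first_cat(STRICT_PATTERN, PAGE))  # examp888
```

In this sample, only the iframe inside noscript is preceded by ><iframe src=", because the one written by document.write is preceded by a single quote. That is why the lookbehind picks out the 2nd cat value.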
If lookbehind is not available, or if ><iframe src=" is not reliable, you will need to find your own reliable anchor.
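One way to work without lookbehind is to let the regex consume the anchor text and read the value from the capture group. The sketch below assumes the two anchors seen in the sample, document.write('<iframe src=" and <noscript><iframe src=", and assumes the tool reports capture group 1. The names extract_cat, SCRIPT_ANCHOR and NOSCRIPT_ANCHOR are made up for illustration. The optional negative lookahead after cat= is one possible way to skip a default value such as defau123. Python's re module is used only as a test bed.

```python
import re

# Trimmed copy of the page source from the question, kept on a single line.
PAGE = (
    "<script> document.write('<iframe src=\"https://1234567.fls.doubleclick.net/"
    "activityi;src=1234567;type=examp123;cat=examp999;ord=1;num=' + a + '?\">"
    "</iframe>'); </script> "
    '<noscript><iframe src="https://1234567.fls.doubleclick.net/'
    'activityi;src=1234567;type=examp456;cat=examp888;ord=1;num=1?">'
    "</iframe></noscript>"
)

# Assumed anchors: the literal text that precedes each URL in the sample.
SCRIPT_ANCHOR = "document.write('<iframe src=\""
NOSCRIPT_ANCHOR = '<noscript><iframe src="'


def extract_cat(page, anchor, skip=None):
    """Return the cat value of the DoubleClick URL right after `anchor`.

    Returns None when there is no match, or when the value equals `skip`.
    """
    # Negative lookahead placed directly after "cat=": the match fails if the
    # value is exactly the default (followed by ';' or the closing quote).
    guard = rf'(?!{re.escape(skip)}[;"])' if skip else ""
    pattern = (
        re.escape(anchor)  # consumed anchor, so no lookbehind is needed
        + r'https://[^"]*\.doubleclick\.net[^"]*?;cat='
        + guard
        + r'([^;"]*)'
    )
    match = re.search(pattern, page)
    return match.group(1) if match else None


print(extract_cat(PAGE, SCRIPT_ANCHOR))                     # examp999
print(extract_cat(PAGE, NOSCRIPT_ANCHOR, skip="defau123"))  # examp888
```

With a tool that returns one match per regex, each anchor would become its own rule: one rule for cat 1 and one for cat 2.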