Regex to extract multiple URLs
I want to extract the category (cat) values from the DoubleClick URLs that sit within the web source code across a full site:
<script type="text/javascript">
var axel = Math.random() + "";
var a = axel * 10000000000000;
document.write('<iframe src="https://1234567.fls.doubleclick.net/activityi;src=1234567;type=examp123;cat=examp999;ord=1;num=' + a + '?" width="1" height="1" frameborder="0" style="display:none"></iframe>');
</script>
<noscript><iframe src="https://1234567.fls.doubleclick.net/activityi;src=1234567;type=examp456;cat=examp888;ord=1;num=1?" width="1" height="1" frameborder="0" style="display:none"></iframe></noscript>

What I want to extract is as follows:
- cat 1 = examp999
- cat 2 = examp888
I have tried the regex below, from @ankitmishra's answer:

https:\/\/(?:.*.doubleclick.net).*cat=([^;]*);

This returns both values, but the tool I am using to crawl the pages of the website returns only one match per regex. It doesn't support multiple values, so it returns only the first match. How can I create a second string that captures the second cat value?

Also, if one cat value is a default, e.g. defau123, can I use the following within the rule above to ignore cat values of defau123 and pass anything else?
^((?!defau123).)*$
Any help is appreciated!
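One way the default-value exclusion might be folded into the rule itself is to put a negative lookahead directly after cat=, rather than using the whole-line ^((?!defau123).)*$ form. The sketch below uses Python's re module as a stand-in for the crawler's regex engine, which is an assumption; the sample markup and the variable names are made up for illustration.

```python
import re

# Hypothetical sample: one tag carrying the default cat value, one real tag.
html = (
    '<iframe src="https://1234567.fls.doubleclick.net/activityi;'
    'src=1234567;type=examp123;cat=defau123;ord=1;num=1?"></iframe>'
    '<iframe src="https://1234567.fls.doubleclick.net/activityi;'
    'src=1234567;type=examp456;cat=examp888;ord=1;num=1?"></iframe>'
)

# (?!defau123;) rejects a cat value that is exactly defau123.
# [^"]* keeps each match inside a single src attribute, so the regex
# cannot jump from a rejected URL into the next one.
pattern = re.compile(
    r'https://[^"/]*\.doubleclick\.net/[^"]*?;cat=(?!defau123;)([^;"]*);'
)

match = pattern.search(html)
first_cat = match.group(1) if match else None
all_cats = pattern.findall(html)
print(first_cat, all_cats)
```

Because the lookahead sits inside the rule, a tool that only reports the first match would skip the default tag and report the first non-default value, provided its engine supports lookaheads.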
If you are trying to be more selective in your regex, you need to expand the regex to match more data. If you can use a lookbehind, you can ensure that ><iframe src=" comes directly before the URL:

(?<=\>\<iframe\ src\=\")https:\/\/(?:.*.doubleclick.net).*cat=([^;]*);

If lookbehind is not available, or if ><iframe src=" is not reliable, you will need to find your own reliable anchor.
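As a rough illustration of the lookbehind idea, here is a minimal Python sketch with two separate rules, one per cat value. It assumes the crawler's engine behaves like Python's re (fixed-width lookbehind supported), which may not hold for your tool. The names URL, first_rule and second_rule are hypothetical, and the URL part is tightened with [^"'] so a match cannot run past the end of one src attribute.

```python
import re

# Page source from the question, reduced to the relevant markup.
# The JavaScript variable name `a` is assumed.
script_part = (
    "<script type=\"text/javascript\"> document.write('<iframe src=\""
    "https://1234567.fls.doubleclick.net/activityi;src=1234567;"
    "type=examp123;cat=examp999;ord=1;num=' + a + '?\"></iframe>'); </script> "
)
noscript_part = (
    '<noscript><iframe src="https://1234567.fls.doubleclick.net/activityi;'
    'src=1234567;type=examp456;cat=examp888;ord=1;num=1?"></iframe></noscript>'
)
html = script_part + noscript_part

# Shared URL pattern: stays inside one quoted attribute and captures cat.
URL = r"""https://[^"']*\.doubleclick\.net/[^"']*?;cat=([^;"']*);"""

# Rule 1: no anchor, so the first DoubleClick URL in the source wins.
first_rule = re.compile(URL)

# Rule 2: the lookbehind requires ><iframe src=" directly before the URL.
# In this sample that is only true for the <noscript> copy, because the
# scripted iframe is preceded by a single quote instead of ">".
second_rule = re.compile(r'(?<=><iframe src=")' + URL)

cat_1 = first_rule.search(html).group(1)
cat_2 = second_rule.search(html).group(1)
print(cat_1, cat_2)
```

Each rule yields exactly one match on this sample, so a tool limited to the first match per regex could be given the two rules as two separate extractions.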