Regex to find anchor tags without http or https in the href attribute
I have some sample text on which I want to run a regex that pulls out the anchor tags whose href doesn't contain http or https in the address part.
I have been trying a regex, but it is not complete yet: I am not able to pluck out an anchor when its href does not start with http or https.
Link to it on the gskinner site - http://regexr.com?34ev0
<a.*?href=[""|'](http|https:\/\/)(?<link>[^""|']*)[""|'].*?>
Here is the sample string:
<br /><span style="font-size: 16px;"><strong><a target="_blank" href="http://www.yahoo.com">good link (yahoo)</a><br /><br /><a target="_blank" href="www.bbc.com">bad link (bbc)</a><br /><br /><a href="" id="anchorsocialmedia" onclick="showmodalpopup('anchorsocialmedia','/events/popup/socialmediasharemodal.aspx','650px','500px');">share event</a><br />badge perf testing<br /><br /></strong></span>
thx.
Using JavaScript regex methods (there are equivalents in pretty much all languages):
<your string>.match(/<a\s[^>]*href\s*=\s*"[^"]*"[^>]*>/g) .join('') .match(/href\s*=\s*"(?!https?:\/\/)[^"]*"/g);
or
<your string>.match(/<a\s[^>]*href\s*=\s*"(?!https?:\/\/)[^"]*"[^>]*>/g) .map(function(x){return x.replace(/.*(href\s*=\s*"[^"]*").*/,'$1');})
you choose!
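For reference, here is a minimal sketch of the second approach wrapped in a function and run against a shortened version of the sample string. The function name `findNonHttpHrefs` and the `|| []` guard are my own additions for illustration, not part of the answer above. Like the patterns above, it assumes double-quoted href values.

```javascript
// Hypothetical helper (name is illustrative): returns the href="..." attributes
// of <a> tags whose URL does not start with http:// or https://.
function findNonHttpHrefs(html) {
  // Match whole opening <a ...> tags; the negative lookahead (?!https?:\/\/)
  // rejects href values that begin with http:// or https://.
  const tagPattern = /<a\s[^>]*href\s*=\s*"(?!https?:\/\/)[^"]*"[^>]*>/g;
  // match() returns null when nothing matches, so fall back to an empty array.
  const tags = html.match(tagPattern) || [];
  // Reduce each matched tag to just its href attribute.
  return tags.map(function (tag) {
    return tag.replace(/.*(href\s*=\s*"[^"]*").*/, '$1');
  });
}

// Shortened version of the sample string from the question.
const sample =
  '<a target="_blank" href="http://www.yahoo.com">good link (yahoo)</a><br />' +
  '<a target="_blank" href="www.bbc.com">bad link (bbc)</a><br />' +
  '<a href="" id="anchorsocialmedia">share event</a>';

console.log(findNonHttpHrefs(sample));
// logs: [ 'href="www.bbc.com"', 'href=""' ]
```

Note that the empty href is reported too, since it does not start with http or https. If you want to skip empty values, changing `[^"]*` to `[^"]+` after the lookahead should do it.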