I have a paragraph of text which may contain some links in plain text, or some links which are actually links.

For example:

Posting a link: http://test., posting an image <img src="http://test./2.jpg" />. Posting an actual A tag: <a href="http://test./test.html">http://test./test.html</a>

I need to fish out the unformatted links from this piece of text, so I'm looking for a regular expression that matches the first case but not the second or third, because those are already well-formatted links.

I've managed to fish out all the links with this regex: ((http:|https:)\/\/[a-zA-Z0-9&#=.\/\-?_]+), but I am still having trouble distinguishing between the cases.

This needs to be in JavaScript, so I don't think negative lookbehind is allowed.

Any help would be appreciated.

EDIT: I'm trying to wrap the fished-out unformatted links in an <a> tag.

Asked Apr 20, 2015 at 12:21 by l3utterfly; edited Apr 20, 2015 at 12:29 by ketan. Comments:
  • can't you normalise the data server-side? – epoch Commented Apr 20, 2015 at 12:23
  • stackoverflow./questions/1732348/… – epascarello Commented Apr 20, 2015 at 12:27

1 Answer

You can use this regex to get URLs outside of tags:

(?![^<]*>|[^<>]*<\/)((http:|https:)\/\/[a-zA-Z0-9&#=.\/\-?_]+)
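To see why the lookahead skips the tagged cases, here is a minimal sketch that runs the pattern above against one input per case. The helper name findPlainUrls and the example.org URLs are made up for illustration; only the regex comes from the answer.

```javascript
// Regex from the answer. The negative lookahead has two branches:
//   [^<]*>     a '>' comes before any '<', so the URL sits inside a tag (attribute value)
//   [^<>]*<\/  the next tag is a closing tag, so the URL sits inside an element's text
const PLAIN_URL = /(?![^<]*>|[^<>]*<\/)((http:|https:)\/\/[a-zA-Z0-9&#=.\/\-?_]+)/g;

// Hypothetical helper: returns every URL the pattern accepts as "unformatted".
function findPlainUrls(html) {
  return html.match(PLAIN_URL) || [];
}

console.log(findPlainUrls('Visit http://example.org/page for details'));
// [ 'http://example.org/page' ]
console.log(findPlainUrls('<img src="http://example.org/a.jpg" />'));
// []
console.log(findPlainUrls('<a href="http://example.org/x">http://example.org/x</a>'));
// []
console.log(findPlainUrls('<p>See http://example.org/x now</p>'));
// []
```

The last call also returns an empty array: the second branch cannot tell </a> from </p>, so a bare URL inside any element that is followed by a closing tag is skipped as well.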


We can shorten it a bit, too, by writing the scheme as https? and using the case-insensitive i flag:

(?![^<]*>|[^<>]*<\/)((https?:)\/\/[a-z0-9&#=.\/\-?_]+)
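The two patterns are not strictly equivalent: with the i flag the scheme itself becomes case-insensitive, which the longer pattern does not allow. A small check, assuming a made-up input string with example.org URLs:

```javascript
// Longer pattern, no flags other than g: the scheme must be lowercase.
const original = /(?![^<]*>|[^<>]*<\/)((http:|https:)\/\/[a-zA-Z0-9&#=.\/\-?_]+)/g;
// Shortened pattern with the i flag: scheme and body are both case-insensitive.
const shortened = /(?![^<]*>|[^<>]*<\/)((https?:)\/\/[a-z0-9&#=.\/\-?_]+)/gi;

// Hypothetical sample text, not from the question.
const text = 'Mixed case: HTTPS://Example.ORG/Path and http://example.org/x';

const originalMatches = text.match(original) || [];
const shortenedMatches = text.match(shortened) || [];

console.log(originalMatches);  // [ 'http://example.org/x' ]
console.log(shortenedMatches); // [ 'HTTPS://Example.ORG/Path', 'http://example.org/x' ]
```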


Sample code:

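// Global, case-insensitive: matches URLs that are neither inside a tag nor directly before a closing tag
var re = /(?![^<]*>|[^<>]*<\/)((https?:)\/\/[a-z0-9&#=.\/\-?_]+)/gi;
var str = 'Posting a link: http://test., posting an image <img src="http://test./2.jpg" />. Posting an actual A tag: <a href="http://test./test.html">http://test./test.html</a>';
var out = document.getElementById("res");
// exec() returns null when nothing matches, so guard before reading the capture group
var val = re.exec(str);
out.innerHTML = "<b>URL Found</b>: " + (val ? val[1] : "none");
// Wrap every matched plain-text URL in an <a> tag
var subst = '<a href="$1">$1</a>';
var result = str.replace(re, subst);
out.innerHTML += "<br><b>Replacement Result</b>: " + result;

<!-- HTML for the snippet -->
<div id="res"></div>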

Update:

To also match URLs in the text of specific elements (here <p> and <pre>), you can whitelist their closing tags like this:

var re = /(?![^<]*>|[^<>]*<\/(?!(?:p|pre)>))((https?:)\/\/[a-z0-9&#=.\/\-?_]+)/gi;
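
A minimal sketch of the whitelisted pattern applied to the wrapping the question asks about. The function name autolink and the sample strings are assumptions for illustration, not part of the original answer.

```javascript
// Whitelisted pattern from the update: a following closing tag no longer
// blocks the match when that tag is </p> or </pre>.
const URL_OUTSIDE_TAGS = /(?![^<]*>|[^<>]*<\/(?!(?:p|pre)>))((https?:)\/\/[a-z0-9&#=.\/\-?_]+)/gi;

// Hypothetical helper: wraps each accepted URL in an <a> tag.
function autolink(html) {
  return html.replace(URL_OUTSIDE_TAGS, '<a href="$1">$1</a>');
}

console.log(autolink('<p>See http://example.org/x now</p>'));
// <p>See <a href="http://example.org/x">http://example.org/x</a> now</p>

console.log(autolink('<p><a href="http://example.org/x">http://example.org/x</a></p>'));
// unchanged: the URL text is followed by </a>, which is not whitelisted
```

This is still a heuristic over raw markup rather than a real HTML parse, so it is worth testing against the kinds of input you actually expect before relying on it.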

Tags for this post: jquery, Autolink URL with javascript Regex, Stack Overflow