This is the text:

https://www.google.com/url?rct=3Dj\u0026sa=3Dt\u0026url=3Dhttps://rivesjournal.com/inside-track-trading-focus-on-shares-of-adobe-systems-inc-adbe/48453/\u0026ct=3Dga\u0026cd=3DCAEYASoTOT

I want to get the actual link:

https://rivesjournal.com/inside-track-trading-focus-on-shares-of-adobe-systems-inc-adbe/48453/

The regex /=3Dhttps.*\//g matches the link, but the match includes the leading =3D, and I want the link without it. How can I do this?

Asked Feb 20, 2017 at 20:37 by AmazingDayToday

5 Answers

One option is to prevent the first http.* substring from being matched by using a negative lookahead with a ^ anchor:

(?!^)https:.*\/

This essentially matches https:.*\/ as long as it isn't at the beginning of the string.

Snippet:

const string = 'https://www.google.com/url?rct=3Dj\u0026sa=3Dt\u0026url=3Dhttps://rivesjournal.com/inside-track-trading-focus-on-shares-of-adobe-systems-inc-adbe/48453/\u0026ct=3Dga\u0026cd=3DCAEYASoTOT';

// Skip the https: at index 0, then match greedily up to the last '/'
console.log(string.match(/(?!^)https:.*\//)[0]);


However, the expression above won't cover all edge cases, so the better option is to use a capturing group:

=3D(https.*\/)

Snippet:

const string = 'https://www.google.com/url?rct=3Dj\u0026sa=3Dt\u0026url=3Dhttps://rivesjournal.com/inside-track-trading-focus-on-shares-of-adobe-systems-inc-adbe/48453/\u0026ct=3Dga\u0026cd=3DCAEYASoTOT';

// Group 1 holds the link without the leading "=3D"
console.log(string.match(/=3D(https.*\/)/)[1]);
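
As a rough illustration of the kind of edge case where the two patterns differ, the sketch below puts a leading space in front of the outer URL. The sample string is a shortened, made-up stand-in for the one in the question, and extractWrappedUrl is a hypothetical helper name, not something from the answer.

```javascript
// Hypothetical helper: returns the wrapped link, or null when nothing matches.
function extractWrappedUrl(text) {
  const match = text.match(/=3D(https.*\/)/);
  return match ? match[1] : null;
}

// Shortened, assumed sample; note the leading space before the outer URL.
const padded = ' https://www.google.com/url?rct=3Dj&url=3Dhttps://example.com/a/b/&ct=3Dga';

// (?!^) only rules out index 0, so the outer URL is now part of the match.
console.log(padded.match(/(?!^)https:.*\//)[0]);
// → https://www.google.com/url?rct=3Dj&url=3Dhttps://example.com/a/b/

// The capture group keys on "=3D" and is unaffected by the leading space.
console.log(extractWrappedUrl(padded));         // → https://example.com/a/b/
console.log(extractWrappedUrl('no link here')); // → null
```

Returning null instead of indexing the result directly avoids a TypeError when the input contains no wrapped link.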


You can also use a negated character class, such as [^\\]+, to match one or more non-backslash characters:

=3D(https[^\\]+)
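
One thing to watch with this pattern in JavaScript: it only stops where the input really contains a backslash. The sketch below uses a shortened, made-up sample string. String.raw keeps \u0026 as literal characters, whereas an ordinary string literal decodes it to &. The [^\\&]+ variant at the end is an assumption added here for illustration, not part of the answer.

```javascript
// String.raw keeps "\u0026" as six literal characters, like the raw text in the question.
const rawText = String.raw`https://www.google.com/url?rct=3Dj\u0026url=3Dhttps://example.com/a/b/\u0026ct=3Dga`;
// In an ordinary string literal, "\u0026" is decoded to "&".
const decoded = 'https://www.google.com/url?rct=3Dj\u0026url=3Dhttps://example.com/a/b/\u0026ct=3Dga';

const stopAtBackslash = /=3D(https[^\\]+)/;
console.log(rawText.match(stopAtBackslash)[1]); // → https://example.com/a/b/
console.log(decoded.match(stopAtBackslash)[1]); // → https://example.com/a/b/&ct=3Dga

// Assumed variant: stop at either a backslash or an ampersand.
const stopAtEither = /=3D(https[^\\&]+)/;
console.log(decoded.match(stopAtEither)[1]);    // → https://example.com/a/b/
```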

Make =3D a positive lookbehind:

(?<==3D)https.*\/

Demo: https://regex101.com/r/sHvRMA/2
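
Lookbehind assertions were added to JavaScript in ES2018, so engines that implement that edition should accept the pattern directly. A minimal sketch, assuming such an engine and a shortened, made-up sample string:

```javascript
// Assumes an ES2018+ engine; older engines reject the (?<=...) syntax.
const wrapped = 'https://www.google.com/url?rct=3Dj&url=3Dhttps://example.com/a/b/&ct=3Dga';

// "=3D" must precede the match but is not part of it.
const lookbehindMatch = wrapped.match(/(?<==3D)https.*\//);
console.log(lookbehindMatch ? lookbehindMatch[0] : null); // → https://example.com/a/b/
```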

Update:

For JavaScript-specific code, use a capture group instead:

const str = 'https://www.google.com/url?rct=3Dj\u0026sa=3Dt\u0026url=3Dhttps://rivesjournal.com/inside-track-trading-focus-on-shares-of-adobe-systems-inc-adbe/48453/\u0026ct=3Dga\u0026cd=3DCAEYASoTOT';
const reg = /=3D(https.*\/)/;
// Index 1 is the captured link; index 0 would still include "=3D"
console.log(str.match(reg)[1]);

This is a great resource for figuring out regex matches: http://regexr.com/

I have never used regex in JavaScript, but I have used it extensively in bash, sh, ps, and C#, and from what I understand this is what you are looking for:

/=3D(http.*\/)\\/

https://regex101.com/r/bupG3W/1

And for capturing the group inside the match:

var myString = "something format_abc";
var myRegexp = /(?:^|\s)format_(.*?)(?:\s|$)/g;
var match = myRegexp.exec(myString);
console.log(match[1]); // abc

For those using regex in a language that supports only a limited subset of features (like CMake), none of the other answers may work. In that case, one option is to capture the preceding string too (=3D in the OP's case) and then use a string operation to remove it from the match afterwards. It's not elegant, but it works.
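
The idea carries over to any language. The sketch below shows it in JavaScript only to stay consistent with the rest of the thread; it is not CMake code. The sample string and the matchThenStrip helper are assumptions made for illustration.

```javascript
const PREFIX = '=3D';

// Hypothetical helper: match the prefix together with the link, then cut the prefix off.
function matchThenStrip(text) {
  const match = text.match(/=3Dhttps.*\//); // no groups or lookarounds needed
  return match ? match[0].slice(PREFIX.length) : null;
}

console.log(matchThenStrip('https://www.google.com/url?rct=3Dj&url=3Dhttps://example.com/a/b/&ct=3Dga'));
// → https://example.com/a/b/
```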

Article tags: javascript · Regex Match, but don't include part of matched · Stack Overflow