I did not find a good answer to this question, so I am sharing what I found that works.
If you want to remove all the Google Analytics (`utm_*`) parameters from a URL, you usually want to keep the other parameters and end up with a clean, valid URL.
url = url.replace(/(\&|\?)utm([_a-z0-9=+\-]+)/igm, "$1");
With a URL like this:
https://www.somewebsite.fr/produit/yi-camera-3600-noir-vr-33705370/offre-81085802?utm_source=325483&utm_medium=affiliation&utm_content=catalogue-RDC&awc=6901_1530705916_88ef12642ad61dfc5239ba01bbbe5249
you will get this:
https://www.somewebsite.fr/produit/yi-camera-3600-noir-vr-33705370/offre-81085802?&&&awc=6901_1530705916_88ef12642ad61dfc5239ba01bbbe5249
This URL is still valid, but it has some duplicate `&` signs. If you remove the `$1` from the first replacement, you are left with only a `&` sign and not the `?` that should start the query string.
So the next cleanup step keeps the first `?` sign (`$1`) and removes the `&` signs that directly follow it:
url = url.replace(/(\?)\&+/igm, "$1");
Now we have a nice, clean URL.
Full version:
url = url.replace(/(\&|\?)utm([_a-z0-9=+\-]+)/igm, "$1");
url = url.replace(/(\?)\&+/igm, "$1");
If you can find a one-liner, you're welcome to share it.
Edit: the resulting URL should be this one: https://www.somewebsite.fr/produit/yi-camera-3600-noir-vr-33705370/offre-81085802?awc=6901_1530705916_88ef12642ad61dfc5239ba01bbbe5249
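The two-step cleanup above can be wrapped in a small helper to see what it does on a few inputs. This is only a sketch: the function name `removeUtmTwoStep` and the `example.com` URLs are made up for illustration, while the two regexes are copied unchanged from the question.

```javascript
// Hypothetical helper wrapping the two replacements from the question.
function removeUtmTwoStep(url) {
  // Step 1: drop each utm parameter but keep its leading "?" or "&".
  url = url.replace(/(\&|\?)utm([_a-z0-9=+\-]+)/igm, "$1");
  // Step 2: remove any "&" signs left directly after the "?".
  return url.replace(/(\?)\&+/igm, "$1");
}

// utm parameters at the start of the query string: cleaned up fully.
console.log(removeUtmTwoStep("https://example.com/p?utm_source=a&utm_medium=b&awc=1"));
// → https://example.com/p?awc=1

// utm parameter in the middle: a doubled "&" is left behind.
console.log(removeUtmTwoStep("https://example.com/p?t=55&utm_source=a&awc=1"));
// → https://example.com/p?t=55&&awc=1

// utm parameter at the end: a trailing "&" is left behind.
console.log(removeUtmTwoStep("https://example.com/p?awc=1&utm_tt=78"));
// → https://example.com/p?awc=1&
```

So the approach handles the sample URL, but the second replacement only tidies `&` signs that come right after the `?`.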
asked Jul 5, 2018 at 9:13 by benraay (edited Jul 5, 2018 at 10:04)
- What would be the resulting correct url? – Jorge.V Commented Jul 5, 2018 at 9:36
- Good point I have edited my question – benraay Commented Jul 5, 2018 at 10:05
- @benraay `/(?<=&|\?)utm_.*?(&|$)/igm` does not remove the trailing `?` if the query string only contains `utm` params. It is also not portable across JS environments that do not support ECMAScript 2018 standard. – Wiktor Stribiżew Commented Jul 5, 2018 at 10:19
2 Answers
I think it could be as simple as:
url = url.replace(/(?<=&|\?)utm_.*?(&|$)/igm, "");
You do not need to escape `&`.
`(?<=&|\?)` = positive lookbehind
`.*?` = everything, but "not greedy"
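To see how this one-liner behaves, including the cases raised in the comment by Wiktor Stribiżew, here is a quick check. It is a sketch that assumes an engine with ES2018 lookbehind support (for example a recent Node.js); the function name `removeUtmLookbehind` and the `example.com` URLs are made up for illustration.

```javascript
// The regex from the answer above, unchanged. Requires lookbehind support (ES2018).
const LOOKBEHIND_RX = /(?<=&|\?)utm_.*?(&|$)/igm;

// Hypothetical helper name, used only for this demonstration.
function removeUtmLookbehind(url) {
  return url.replace(LOOKBEHIND_RX, "");
}

// utm parameters followed by another parameter: clean result.
console.log(removeUtmLookbehind("https://example.com/p?utm_source=a&utm_medium=b&awc=1"));
// → https://example.com/p?awc=1

// Only utm parameters: the "?" stays behind.
console.log(removeUtmLookbehind("https://example.com/p?utm_source=a&utm_medium=b"));
// → https://example.com/p?

// utm parameter at the end: the preceding "&" stays behind.
console.log(removeUtmLookbehind("https://example.com/p?awc=1&utm_tt=78"));
// → https://example.com/p?awc=1&
```

The leftover `?` or `&` still gives a usable URL, but it is not the fully clean form the question asks for.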
You may use a single regex compatible with all JS versions that will
- match and capture `?` that is followed by 1 or more `utm` params that are followed with a param other than a `utm` one, and replace with `$1` to restore that `?` since it is necessary
- or, match any `?` with 1 or more `utm` params in the query string where no params other than `utm` are present (so, `$1` will be empty, and `?` will get removed)
- or, just match all `utm` params to remove them.
The regex will look like
.replace(/(\?)utm[^&]*(?:&utm[^&]*)*&(?=(?!utm[^\s&=]*=)[^\s&=]+=)|\?utm[^&]*(?:&utm[^&]*)*$|&utm[^&]*/gi, '$1')
See the regex demo
Details
- `(\?)utm[^&]*(?:&utm[^&]*)*&(?=(?!utm[^\s&=]*=)[^\s&=]+=)` - `?utm` (with `?` inside a capturing group later referenced with `$1`), 0+ chars other than `&`, then 0 or more repetitions of `&utm` followed with 0+ chars other than `&`, and then a `&` that is followed with 1+ chars other than whitespace, `&` and `=`, and then `=`, as long as that param is not a `utm` param
- `|` - or
- `\?utm[^&]*(?:&utm[^&]*)*$` - `?utm`, 0+ chars other than `&`, then 0 or more repetitions of `&utm` followed with 0+ chars other than `&`, and then the end of the string
- `|` - or
- `&utm[^&]*` - a `&`, `utm` and then 0+ chars other than `&`
JS demo:
var urls = ['https://www.somewebsite.fr/produit/yi-camera-3600-noir-vr-33705370/offre-81085802?utm_source=325483&utm_medium=affiliation&utm_content=catalogue-RDC&awc=6901_1530705916_88ef12642ad61dfc5239ba01bbbe5249', 'https://www.somewebsite.fr/produit/yi-camera-3600-noir-vr-33705370/offre-81085802?t=55&utm_source=325483&utm_medium=affiliation&utm_content=catalogue-RDC&awc=6901_1530705916_88ef12642ad61dfc5239ba01bbbe5249','https://www.somewebsite.fr/produit/yi-camera-3600-noir-vr-33705370/offre-81085802?awc=6901_1530705916_88ef12642ad61dfc5239ba01bbbe5249&utm_tt=78', 'https://www.somewebsite.fr/produit/yi-camera-3600-noir-vr-33705370/offre-81085802?utm=6901_1530705916_88ef12642ad61dfc5239ba01bbbe5249&utm=ewe'];
var u = 'utm[^&]*';
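// Backslashes are doubled because the pattern is built from string literals; a single "\s" would collapse to a plain "s".
var rx = new RegExp("(\\?)"+u+"(?:&"+u+")*&(?=(?!utm[^\\s&=]*=)[^\\s&=]+=)|\\?"+u+"(?:&"+u+")*$|&"+u, "ig");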
for (var url of urls) {
console.log(url, "=>", url.replace(rx, '$1'));
}
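For reference, the results one would expect from the single regex can be checked on shortened inputs. This is a sketch only: the helper name `stripUtmParams` and the `example.com` URLs are assumptions made for illustration, and the pattern is the regex literal given earlier in this answer.

```javascript
// The single regex from the answer, written as a literal.
const UTM_RX = /(\?)utm[^&]*(?:&utm[^&]*)*&(?=(?!utm[^\s&=]*=)[^\s&=]+=)|\?utm[^&]*(?:&utm[^&]*)*$|&utm[^&]*/gi;

// Hypothetical helper name, used only for this demonstration.
function stripUtmParams(url) {
  // "$1" restores the "?" only when the first alternative matched; otherwise it is empty.
  return url.replace(UTM_RX, "$1");
}

// [input, expected output] pairs covering the three alternatives.
const cases = [
  ["https://example.com/p?utm_source=a&utm_medium=b&awc=1", "https://example.com/p?awc=1"],
  ["https://example.com/p?t=55&utm_source=a&awc=1", "https://example.com/p?t=55&awc=1"],
  ["https://example.com/p?awc=1&utm_tt=78", "https://example.com/p?awc=1"],
  ["https://example.com/p?utm=a&utm=b", "https://example.com/p"],
];

for (const [input, expected] of cases) {
  const actual = stripUtmParams(input);
  console.log(input, "=>", actual, actual === expected ? "(ok)" : "(unexpected)");
}
```

Note that the pattern removes any parameter whose name starts with `utm`, not only `utm_`, and that `[^&]*` would also swallow a `#fragment` following a removed parameter, so URLs with fragments may need separate handling.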