I'm trying to extract the 'abc-def' part of the URLs below. I can do it with two different regex patterns (see the examples). Is it possible to write one regex that works for all of the cases below?
(BigQuery doesn't seem to support lookbehind.)
SELECT REGEXP_EXTRACT('https://www.example/post/abc-def/', r'([^/]+)/?$')
UNION ALL
SELECT REGEXP_EXTRACT('https://www.example/post/abc-def', r'([^/]+)/?$')
UNION ALL
SELECT REGEXP_EXTRACT('https://www.example/post/abc-def?p=294', r'([^/]+)/?[$|\?]')
UNION ALL
SELECT REGEXP_EXTRACT('https://www.example/post/abc-def/?p=294', r'([^/]+)/?[$|\?]')
UNION ALL
SELECT REGEXP_EXTRACT('http://www.example/abc-def/?p=294', r'([^/]+)/?[$|\?]')
Expected output 'abc-def'
Asked Mar 26 at 22:35 by David Fricks; edited Mar 26 at 22:39 by Barmar.
2 Answers
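Note that [$|\?] matches a $, | or ? character, since [...] specifies a character class. Inside the brackets, $ is a literal dollar sign rather than the end-of-string anchor, and | is a literal pipe rather than alternation.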
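The character-class behaviour is easy to confirm locally. The sketch below uses Python's re module as a stand-in for BigQuery's RE2 engine; that substitution is an assumption made for illustration only, although the two treat [...] the same way for this pattern.

```python
import re

# Inside [...], `$` and `|` lose their special meaning and become literals.
char_class = re.compile(r'[$|\?]')

literals_found = char_class.findall('a$b|c?d')
print(literals_found)  # the three literal characters $, | and ?

# A plain path segment contains none of those characters, so there is no match.
plain_match = char_class.search('abc-def')
print(plain_match)

# Consequence for the question's second pattern: a URL without a query
# string has no literal $, | or ? in it, so nothing can be extracted.
question_pattern = re.compile(r'([^/]+)/?[$|\?]')
no_query_match = question_pattern.search('https://www.example/post/abc-def')
print(no_query_match)
```

This is why the question needed two patterns: the second one only works when the URL actually contains a ? character.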
Since REGEXP_EXTRACT only returns the first match from the given input string, you may use the ([^/?]+)/?(?:$|\?) regex:
REGEXP_EXTRACT(col, r'([^/?]+)/?(?:$|\?)')
Details
([^/?]+) - Group 1: one or more chars other than / and ?
/? - an optional / char
(?:$|\?) - a non-capturing group matching either end of string or a ? char.
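As a rough local check of the pattern described above, the following sketch runs it against the five sample URLs from the question. It uses Python's re module rather than BigQuery itself (an assumption: for this pattern the leftmost-match behaviour is the same), and the helper name last_segment is hypothetical.

```python
import re

# Pattern from the answer; Group 1 holds the last path segment.
pattern = re.compile(r'([^/?]+)/?(?:$|\?)')

# The five sample URLs from the question.
urls = [
    'https://www.example/post/abc-def/',
    'https://www.example/post/abc-def',
    'https://www.example/post/abc-def?p=294',
    'https://www.example/post/abc-def/?p=294',
    'http://www.example/abc-def/?p=294',
]

def last_segment(url):
    """Return Group 1 of the first match, or None (like REGEXP_EXTRACT's NULL)."""
    match = pattern.search(url)
    return match.group(1) if match else None

segments = [last_segment(url) for url in urls]
for url, segment in zip(urls, segments):
    print(url, '->', segment)
```

Because only the first (leftmost) match is taken, the query string p=294, which also ends at the end of the string, is never returned: the abc-def match starts earlier.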
If you want to test the pattern at regex101, make sure you test against each input individually, not a multiline string.
Another solution is using the ^(?:.*/)?([^/?]+) pattern (add \n into the negated character class when testing at regex101).
^ - start of string
(?:.*/)? - an optional sequence of any zero or more chars, as many as possible, followed by a / char
([^/?]+) - Group 1: one or more chars other than / and ?.
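The anchored variant can be exercised the same way. This is a minimal sketch using Python's re module as a stand-in for RE2 (an assumption; the greedy backtracking that matters here works the same in both), with a hypothetical helper named extract.

```python
import re

# Anchored alternative: a greedy optional prefix ending in '/', then Group 1.
pattern = re.compile(r'^(?:.*/)?([^/?]+)')

urls = [
    'https://www.example/post/abc-def/',
    'https://www.example/post/abc-def',
    'https://www.example/post/abc-def?p=294',
    'https://www.example/post/abc-def/?p=294',
    'http://www.example/abc-def/?p=294',
]

def extract(url):
    """Return Group 1, or None when the pattern does not match."""
    match = pattern.search(url)
    return match.group(1) if match else None

results = [extract(url) for url in urls]
print(results)

# The prefix group is optional, so a bare segment without any '/' still matches.
bare = extract('abc-def')
print(bare)
```

When the last / is followed directly by ? or by the end of the string, the greedy .* gives back characters until it stops at the previous /, which is what puts abc-def into Group 1.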
Try
REGEXP_EXTRACT(url, r'.*/([^/?]+)')
.*/ skips over everything up to the / before the last word.
([^/?]+) captures the last word: everything up to a following / or ?, or the end of the string.
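The quantifier on the capture group matters for this approach, which a quick local comparison shows. The sketch again relies on Python's re module as a stand-in for RE2 (an assumption), and the names greedy and lazy are only illustrative.

```python
import re

url = 'https://www.example/post/abc-def/?p=294'

# Greedy capture: consumes the whole segment up to the next '/' or '?'.
greedy = re.search(r'.*/([^/?]+)', url).group(1)

# Lazy capture with nothing after it: one character already satisfies the
# pattern, so the group stops after the first character of the segment.
lazy = re.search(r'.*/([^/?]+?)', url).group(1)

print(greedy)
print(lazy)
```

Here the greedy form yields the full segment, while the lazy form yields only its first character, so the greedy + is the one to use when nothing follows the group in the pattern.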