Here is some code that retrieves information about package imports in a LaTeX
file. It fails to capture the optional dates in square brackets. How can I do this?
import re
test_str = r"""
\RequirePackage[
top = 2.5cm,
bottom = 2.5cm,
left = 2.5cm,
right = 2.5cm,
marginparwidth = 2cm,
marginparsep = 2mm,
heightrounded
]{geometry}%
[2020-01-02]
\RequirePackage{tocbasic}
\RequirePackage[svgnames]%
{xcolor}%
[2023/11/15]
\RequirePackage[raggedright]% OK?
{titlesec}
\RequirePackage{xcolor}%
[2022/06/12]
\RequirePackage{hyperref}% To load after titlesec!
[2023-02-07]
"""
pattern = re.compile(
    r"\\RequirePackage(\[(.*?)\])?([^{]*?)?{(.*?)}",
    flags = re.S
)
matches = pattern.finditer(test_str)
for m in matches:
    print('---')
    for i in [0, 2, 4]:
        print(f"m.group({i}):")
        print(m.group(i))
        print()
Here is the actual output.
---
m.group(0):
\RequirePackage[
top = 2.5cm,
bottom = 2.5cm,
left = 2.5cm,
right = 2.5cm,
marginparwidth = 2cm,
marginparsep = 2mm,
heightrounded
]{geometry}
m.group(2):
top = 2.5cm,
bottom = 2.5cm,
left = 2.5cm,
right = 2.5cm,
marginparwidth = 2cm,
marginparsep = 2mm,
heightrounded
m.group(4):
geometry
---
m.group(0):
\RequirePackage{tocbasic}
m.group(2):
None
m.group(4):
tocbasic
---
m.group(0):
\RequirePackage[svgnames]%
{xcolor}
m.group(2):
svgnames
m.group(4):
xcolor
---
m.group(0):
\RequirePackage[raggedright]% OK?
{titlesec}
m.group(2):
raggedright
m.group(4):
titlesec
---
m.group(0):
\RequirePackage{xcolor}
m.group(2):
None
m.group(4):
xcolor
---
m.group(0):
\RequirePackage{hyperref}
m.group(2):
None
m.group(4):
hyperref
Asked Nov 19, 2024 at 23:17 by projetmbc; edited Nov 20, 2024 at 8:43.
2 Answers
You could update the pattern to use negated character classes and omit flags = re.S:
\\RequirePackage(\[([^][]*)\])?([^{]*){([^{}]*)}.*(?:\n\s*\[([^][]*)])?
The pattern matches:
\\RequirePackage : match \RequirePackage
(\[([^][]*)\])? : optionally capture [...]
([^{]*) : capture optional chars other than {
{([^{}]*)} : capture what is between {...}
.* : match the rest of the line
(?: : start a non-capture group
\n\s*\[([^][]*)] : match a newline, optional whitespace chars, and then capture what is between [...]
)? : close the non-capture group and make it optional
See a regex 101 demo and a Python demo.
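As a minimal sketch of how this pattern behaves in Python, the snippet below runs it over a shortened version of the test string from the question (the `geometry` options are trimmed for brevity, and the name `results` is only illustrative). Note that `re.S` is deliberately not passed, so `.*` stops at the end of the line.

```python
import re

# Shortened variant of the question's test string (illustrative only).
test_str = r"""
\RequirePackage[
  top = 2.5cm,
  heightrounded
]{geometry}%
[2020-01-02]
\RequirePackage{tocbasic}
\RequirePackage[svgnames]%
{xcolor}%
[2023/11/15]
\RequirePackage[raggedright]% OK?
{titlesec}
\RequirePackage{hyperref}% To load after titlesec!
[2023-02-07]
"""

# Pattern from the answer; no re.S, so "." does not cross line breaks.
pattern = re.compile(
    r"\\RequirePackage(\[([^][]*)\])?([^{]*){([^{}]*)}.*(?:\n\s*\[([^][]*)])?"
)

# Group 4 is the package name, group 5 the optional date (None if absent).
results = [(m.group(4), m.group(5)) for m in pattern.finditer(test_str)]
for name, date in results:
    print(f"{name}: {date}")
```

Packages that have no date line end up with `None` in group 5, because the trailing non-capture group is optional.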
If you are only interested in groups 2 and 4 and the added group 5, you can drop the two capture groups you don't need and use three capture groups in total:
\\RequirePackage(?:\[([^][]*)\])?[^{]*{([^{}]*)}.*(?:\n\s*\[([^][]*)])?
See the group values in the regex101 demo and another Python demo
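One way the three-group variant might be wrapped for reuse is sketched below. The helper name `find_packages`, the whitespace normalisation of the options, and the `sample` string are assumptions added for illustration; they are not part of the answer.

```python
import re

# Three-group pattern from the answer: options, package name, date.
PATTERN = re.compile(
    r"\\RequirePackage(?:\[([^][]*)\])?[^{]*{([^{}]*)}.*(?:\n\s*\[([^][]*)])?"
)

def find_packages(source):
    """Return (options, name, date) tuples; options and date may be None."""
    found = []
    for m in PATTERN.finditer(source):
        options, name, date = m.groups()
        if options is not None:
            # Collapse line breaks and runs of spaces inside multi-line options.
            options = " ".join(options.split())
        found.append((options, name, date))
    return found

# Hypothetical sample input, modelled on the question's test string.
sample = r"""
\RequirePackage[svgnames]%
{xcolor}%
[2023/11/15]
\RequirePackage[raggedright]% OK?
{titlesec}
\RequirePackage{hyperref}% To load after titlesec!
[2023-02-07]
"""

packages = find_packages(sample)
for entry in packages:
    print(entry)
```

With only the needed groups captured, `m.groups()` unpacks directly into the three values, which avoids indexing by group number.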
I've added ?\[([0-9\-\/]*?)\] to the end of your regex, so the final result is:
r"\\RequirePackage(\[(.*?)\])?([^{]*?)?{(.*?)}?\[([0-9\-\/]*?)\]"
The date then shows up in group 0 (the whole match) and in the new group 5. I don't know whether you need it in group 0.
import re
test_str = r"""
\RequirePackage[
top = 2.5cm,
bottom = 2.5cm,
left = 2.5cm,
right = 2.5cm,
marginparwidth = 2cm,
marginparsep = 2mm,
heightrounded
]{geometry}%
[2020-01-02]
\RequirePackage{tocbasic}
\RequirePackage[svgnames]%
{xcolor}%
[2023/11/15]
\RequirePackage[raggedright]% OK?
{titlesec}
\RequirePackage{xcolor}%
[2022/06/12]
\RequirePackage{hyperref}% To load after titlesec!
[2023-02-07]
"""
pattern = re.compile(
    r"\\RequirePackage(\[(.*?)\])?([^{]*?)?{(.*?)}?\[([0-9\-\/]*?)\]", # edited this line
    flags = re.S
)
matches = pattern.finditer(test_str)
for m in matches:
    print('---')
    for i in [0, 2, 4, 5]: # Added 5
        print(f"m.group({i}):")
        print(m.group(i))
        print()
Here's the output:
---
m.group(0):
\RequirePackage[
top = 2.5cm,
bottom = 2.5cm,
left = 2.5cm,
right = 2.5cm,
marginparwidth = 2cm,
marginparsep = 2mm,
heightrounded
]{geometry}%
[2020-01-02]
m.group(2):
top = 2.5cm,
bottom = 2.5cm,
left = 2.5cm,
right = 2.5cm,
marginparwidth = 2cm,
marginparsep = 2mm,
heightrounded
m.group(4):
geometry}%
m.group(5):
2020-01-02
---
m.group(0):
\RequirePackage{tocbasic}
\RequirePackage[svgnames]%
{xcolor}%
[2023/11/15]
m.group(2):
None
m.group(4):
tocbasic}
\RequirePackage[svgnames]%
{xcolor}%
m.group(5):
2023/11/15
---
m.group(0):
\RequirePackage[raggedright]% OK?
{titlesec}
\RequirePackage{xcolor}%
[2022/06/12]
m.group(2):
raggedright
m.group(4):
titlesec}
\RequirePackage{xcolor}%
m.group(5):
2022/06/12
---
m.group(0):
\RequirePackage{hyperref}% To load after titlesec!
[2023-02-07]
m.group(2):
None
m.group(4):
hyperref}% To load after titlesec!
m.group(5):
2023-02-07
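In the output above, group 4 runs past the closing brace (for example `geometry}%`), and a package without a date is merged into the next import. This happens because `}?` makes the brace optional while the date bracket is mandatory. A possible adjustment, offered as a sketch and not as part of the answer, keeps the brace mandatory and wraps the rest of the line plus the date in an optional non-capturing group. The `sample` string and the name `rows` are made up for illustration.

```python
import re

# Assumed variant of the pattern above: mandatory "}", optional date part.
pattern = re.compile(
    r"\\RequirePackage(\[(.*?)\])?([^{]*?)?{(.*?)}"
    r"(?:[^\n]*\n\s*\[([0-9/-]+)\])?",
    flags=re.S,
)

# Hypothetical sample input, modelled on the question's test string.
sample = r"""
\RequirePackage[svgnames]%
{xcolor}%
[2023/11/15]
\RequirePackage{tocbasic}
\RequirePackage{hyperref}% To load after titlesec!
[2023-02-07]
"""

# Groups 2, 4 and 5 hold the options, the package name and the date.
rows = [(m.group(2), m.group(4), m.group(5)) for m in pattern.finditer(sample)]
for row in rows:
    print(row)
```

Here `[^\n]*` stands in for "the rest of the line", since `.` would also match line breaks while `re.S` is set.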
Tags: python 3.x, Catching optional content just after a new line, Stack Overflow