I have been trying to write a tool to look at Word documents in a specified path for broken links.
I gave up on having it search a folder for now, thinking I should get it working on a single document first. With my limited skills, some reading, and a few suggestions from Copilot (breaking the problem down into individual tasks), I put this together:

import docx
import os
import requests

doc = docx.Document('C:/Users/demo.docx')
allText = []

def find_hyperlinks(doc):
    hyperlinks = []
    rels = doc.part.rels
    for rel in rels:
        if "hyperlink" in rels[rel].target_ref:
            hyperlinks.append(rels[rel].target_ref)
    return hyperlinks

def find_broken_links_in_docx(doc):
    broken_links = []
    hyperlinks = find_hyperlinks(doc)
    for url in hyperlinks:
        try:
            response = requests.head(url, allow_redirects=True)
            if response.status_code >= 400:
                broken_links.append(url)
        except requests.RequestException:
            broken_links.append(url)
    return broken_links

def write_report(report, output_file):
    with open(output_file, 'w') as f:
        for file_path, links in report.items():
            f.write(f"File: {file_path}\n")
            for link in links:
                f.write(f" Broken link: {link}\n")
            f.write("\n")

if __name__ == "__main__":
    output_file = "C:/Results/broken_links_report.txt"
    report = find_broken_links_in_docx(doc)
    write_report(report, output_file)
    print(f"Report written to {output_file}")

Here is the error:

  File "c:\Users\Scripts\playground\openinganddocx.py", line 41, in <module>
    write_report(report, output_file)
  File "c:\Users\Scripts\playground\openinganddocx.py", line 31, in write_report
    for file_path, links in report.items():
AttributeError: 'list' object has no attribute 'items'

For reference:
Line 31: f.write(f"File: {file_path}\n")
Line 41: print(f"Report written to {output_file}")

1 Answer

The issue is that you are treating a list (report) like a dict, which it is not. That's why you are getting AttributeError: 'list' object has no attribute 'items'.

What you want is a dictionary that has the structure {'filepath': [<urls>]}. So start there:
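
import docx
import requests

def find_hyperlinks(doc_path: str) -> dict:
    doc = docx.Document(doc_path)
    hyperlinks = []
    rels = doc.part.rels
    for rel in rels:
        # match on the relationship type; target_ref holds the URL itself
        if "hyperlink" in rels[rel].reltype:
            hyperlinks.append(rels[rel].target_ref)
    # here is an example of how I might return that value
    return {doc_path: hyperlinks}

def link_works(url: str) -> bool:
    # same check as in the question: HEAD request, 4xx/5xx or a request error counts as broken
    try:
        return requests.head(url, allow_redirects=True).status_code < 400
    except requests.RequestException:
        return False

links_dict = find_hyperlinks('C:/Users/demo.docx')

# from here, prune the hyperlinks that work
broken_links = {}
for doc_path, links in links_dict.items():
    broken = []
    for link in links:
        if link_works(link):
            continue
        broken.append(link)
    broken_links[doc_path] = broken
# etc
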
This isn't perfect, but it will get you on the path to success.
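To connect this back to the question's write_report, here is a minimal sketch of how a {filepath: [urls]} dictionary could be pruned and then written out. It assumes the link checker is passed in as a callable (named link_works here, like the placeholder in the answer). The name prune_working_links, the sample file name, and the sample URLs are hypothetical and only there for illustration; the fake checker exists so the sketch runs without network access.

```python
import os
import tempfile
from typing import Callable, Dict, List


def prune_working_links(
    links_by_file: Dict[str, List[str]],
    link_works: Callable[[str], bool],
) -> Dict[str, List[str]]:
    """Return {file_path: [broken urls]} for every file in links_by_file."""
    return {
        file_path: [url for url in urls if not link_works(url)]
        for file_path, urls in links_by_file.items()
    }


def write_report(report: Dict[str, List[str]], output_file: str) -> None:
    """Write one block per file; report must be a dict, not a bare list."""
    with open(output_file, "w", encoding="utf-8") as f:
        for file_path, links in report.items():
            f.write(f"File: {file_path}\n")
            for link in links:
                f.write(f"  Broken link: {link}\n")
            f.write("\n")


if __name__ == "__main__":
    # Hypothetical data standing in for what find_hyperlinks() would return.
    links_by_file = {
        "demo.docx": ["https://example.com/ok", "https://example.com/gone"],
    }

    # Stand-in checker so the sketch runs offline; swap in a real HTTP check.
    def fake_link_works(url: str) -> bool:
        return not url.endswith("/gone")

    report = prune_working_links(links_by_file, fake_link_works)
    output_file = os.path.join(tempfile.gettempdir(), "broken_links_report.txt")
    write_report(report, output_file)
    print(f"Report written to {output_file}")
```

Passing the checker in as an argument keeps the dictionary logic separate from the network call, so the data structure can be verified first and the HTTP check plugged in afterwards.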
report is a list, which does not support .items(). I'm assuming the error message says AttributeError: 'list' object has no attribute 'items' – C.Nivs Commented Mar 6 at 21:38

{str: list}. Tackle the problem with that in mind. The keys to your dict are filepaths, the values are the urls per file. Don't worry about writing the file just yet, parse the links out and see if you can fill out your data structure correctly. I'll add an answer, but I won't solve the whole problem – C.Nivs Commented Mar 6 at 21:49
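
The {str: list} shape described in the comments also fits the original goal of scanning a whole folder: one key per document, one list of URLs per key. Below is a rough sketch under stated assumptions. collect_links and extract_links are hypothetical names, and extract_links is assumed to be any function that takes a file path and returns a list of URLs (for example, a variant of find_hyperlinks that returns only the list).

```python
from pathlib import Path
from typing import Callable, Dict, List


def collect_links(
    folder: str,
    extract_links: Callable[[str], List[str]],
) -> Dict[str, List[str]]:
    """Map each .docx path under folder to the URLs extracted from it."""
    links_by_file: Dict[str, List[str]] = {}
    # rglob walks subfolders too; sorted() keeps the report order stable.
    for path in sorted(Path(folder).rglob("*.docx")):
        # Skip Word's temporary lock files, whose names start with "~$".
        if path.name.startswith("~$"):
            continue
        links_by_file[str(path)] = extract_links(str(path))
    return links_by_file
```

Filling this dictionary and printing it before doing any link checking or file writing makes it easy to confirm that the structure is right.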