Let me preface this by saying I am aware that there are a lot of resources online for this. So many, in fact, that I have trouble finding a solution for my particular issue.
The goal:
We have a folder full of Word and Excel docs that point to each other and also to other Word documents. The network share they were on has been changed, so links pointing to file://\\old_share\folder\document.doc no longer work. My intention was to go through every document and change the Address of the links to the new share, e.g. file://\\new_share\folder\document.doc.
These Word documents are a mix of versions, so some are docx, some are doc, etc.
What I have so far:
I've tried using Python, PowerShell, and C#. The implementation is almost identical in each. I open a handle to word.application, then I open the document, and then I try to iterate through the Hyperlinks field and do my work.
For now I would be happy to just print them to the screen, so this is my minimal example of what I have so far.
$word = New-Object -ComObject word.application
$word.Visible = $false
$word.ScreenUpdating = $false
$folder = "\\new_share\folder"
$docs = Get-ChildItem -Recurse -LiteralPath $folder -file -include '*.doc*'
foreach($doc in $docs){
    $thisDoc = $word.Documents.Open($_.FullName)
    $link = [pscustomobject]@{
        FileName = $doc.FullName
        HyperLink = $thisDoc.Address
    }
    Write-Host "$($link.FileName) => $($link.HyperLink)"
    $thisDoc.Close()
}
$Word.Quit()
# cleanup com objects
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($word) | Out-Null
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
From this I expect to see the addresses, so a bunch of (\old_share\folder...) paths. What I get instead is a bunch of Hyperlinks with $null Address fields. I'm sure that I am doing something wrong, and I'd really appreciate any ideas.
So far I have only tried using a COM object to open the file and read the Hyperlinks field.
I also considered just unzipping the docx files and changing the links directly inside the XML, but not all documents are docx, and I feel that it would be much slower and more complex.
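For the docx-only route mentioned above, here is a minimal Python sketch of what editing the links inside the package could look like. It is not taken from the thread, and it rests on several assumptions: external hyperlink targets are stored in the package's .rels parts, those parts are UTF-8 encoded, and the old prefix contains no characters that XML would escape (such as &). The function name rewrite_docx_links is hypothetical. It does nothing for legacy .doc files, which are not zip packages.

```python
import io
import re
import zipfile


def rewrite_docx_links(data: bytes, old_prefix: str, new_prefix: str) -> bytes:
    """Return a copy of a .docx (as bytes) with old_prefix replaced in its .rels parts."""
    # Literal, case-insensitive match; share names are assumed not to be case-sensitive.
    pattern = re.compile(re.escape(old_prefix), re.IGNORECASE)
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for name in src.namelist():
            payload = src.read(name)
            if name.endswith(".rels"):
                # Assumption: relationship parts are UTF-8 encoded XML.
                text = payload.decode("utf-8")
                # A function replacement keeps the backslashes in new_prefix literal.
                text = pattern.sub(lambda _match: new_prefix, text)
                payload = text.encode("utf-8")
            # Every other part is copied through unchanged.
            dst.writestr(name, payload)
    return out.getvalue()
```

The function works on bytes, so it can be tried on a copy first: read a file, pass it through with r"\\old_share\folder" and r"\\new_share\folder", and write the result under a new name. Only the .rels parts are touched, so visible link text that happens to contain the old path is left as it is.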
Asked Nov 21, 2024 at 15:14 by user2268909; edited Nov 21, 2024 at 16:09 by Tim Williams.
1 Answer
To search for and change the hyperlinks' .Address properties in Word documents, you need to loop over each document's .Hyperlinks collection, as I have already commented.
The code below will do that for you.
$oldPath = '\\old_share\folder\'
$newPath = '\\new_share\folder\'
$word = New-Object -ComObject Word.Application
$word.Visible = $false
$word.ScreenUpdating = $false
$folder = "\\new_share\folder"
$docs = Get-ChildItem -Recurse -LiteralPath $folder -File -Filter '*.doc*'
foreach($doc in $docs){
    $updated = $false
    $thisDoc = $word.Documents.Open($doc.FullName)
    # each item in the Hyperlinks collection has its own Address property
    $thisDoc.HyperLinks | ForEach-Object {
        if ($_.Address -like "*$oldPath*") {
            Write-Host ("Replaced hyperlink '{0}' in document {1}" -f $_.TextToDisplay, $doc.FullName)
            # escape the old path so its backslashes are matched literally
            $_.Address = $_.Address -replace [regex]::Escape($oldPath), $newPath
            $updated = $true
        }
    }
    # save the document only if at least one hyperlink was changed
    $thisDoc.Close($updated)
}
$word.Quit()
# cleanup com objects
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($thisDoc)
$null = [System.Runtime.Interopservices.Marshal]::ReleaseComObject($word)
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
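Since the question mentions trying Python as well, the replacement step of the code above can be sketched in Python as a pure function. This is an illustration rather than part of the answer. The name rewrite_address is hypothetical, and the case-insensitive matching is chosen to mirror the default behaviour of PowerShell's -replace operator. Hyperlinks that point inside the same document have an empty Address, so empty values are passed through untouched.

```python
import re


def rewrite_address(address, old_prefix, new_prefix):
    """Replace old_prefix with new_prefix in a hyperlink address, ignoring case."""
    if not address:
        # Nothing to rewrite for empty or missing addresses.
        return address
    # re.escape makes the backslashes in the UNC path match literally.
    pattern = re.compile(re.escape(old_prefix), re.IGNORECASE)
    # A function replacement avoids backslash escapes being interpreted in new_prefix.
    return pattern.sub(lambda _match: new_prefix, address)


if __name__ == "__main__":
    old = r"\\old_share\folder"
    new = r"\\new_share\folder"
    print(rewrite_address(r"\\OLD_share\folder\sub\document.doc", old, new))
    # prints \\new_share\folder\sub\document.doc
```

Passing the new path through a function rather than as a plain replacement string matters in Python, because re.sub would otherwise try to interpret sequences such as \n or \f in the replacement.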
Tags: powershell. Original title: "Trouble parsing and changing Hyperlinks in a Word document" (Stack Overflow).
Comment: -Filter is much more efficient than -Include. [2] You loop over the docs with foreach($doc in $docs), but inside the loop you are trying to load the file using $_.FullName. ($_ would only represent the file when looping with $docs | Foreach-Object; in your loop you need to use $doc.FullName.) [3] To find the hyperlinks in a Word document, you need to loop over the $doc.Hyperlinks collection, where each of these objects would have an Address property. – Theo, commented Nov 21, 2024 at 15:42