
I have a script that compares file information to a previously saved database file containing the same file information plus the calculated hash of each file. If the current file's properties match what is already in the database (modified date, size, etc.), I reuse the stored hash instead of recalculating it.

However, this runs for thousands of files, and the foreach loop that searches the database for a matching file entry to retrieve the corresponding hash ends up taking a while (about 60 ms per file).

I assume there is a way to compare the array of file information against the saved database as a whole (instead of looping over each file) and associate the corresponding hash from the database in a single command, but it is not clear to me how to do that.

Example code below (I've stripped out some of the progress-update callbacks for clarity, and I've confirmed those are not adding substantial delay). The loop slows down where the corresponding hash is pulled from `$AllOldSrcProps`, which is the database I mentioned earlier.

I just really want to compare `$AllFiles` to `$AllOldSrcProps` and copy the `Hash` property from `$AllOldSrcProps` to `$AllFiles` when the other properties match.

    foreach ($file in $AllFiles) {
        if ($file.FullName.StartsWith($SrcPath)) {
            $file.FullPotLength = $file.FullName.Length - $SrcLen + $ModLen
            $file.LocKey = $SrcKey
            # If we're not rebuilding the hash table, try to reuse the stored hash
            if (-not $RebuildSrcHashTblFlag -and $AllOldSrcProps) {
                $MatchingFile = @($AllOldSrcProps | Where-Object {
                    ($_.FullName -eq $file.FullName) -and
                    ($_.Length -eq $file.Length) -and
                    ($_.LastWriteTime -eq $file.LastWriteTime.ToString())
                })
                if ($MatchingFile.Count -eq 1) {
                    $file.Hash = $MatchingFile[0].Hash
                    $MatchedHash[0] = $MatchedHash[0] + 1
                }
            }
        }
    }

asked by diytechy, edited by Santiago Squarzon

  • Related: stackoverflow.com/a/72793529 – iRon

1 Answer

The performance issue in your code is most likely this linear comparison:

    $MatchingFile = @($AllOldSrcProps | Where-Object { ($_.FullName -eq $file.FullName) -and ... })

You can improve performance by using a dictionary type, like a hashtable. The issue, however, is that you're comparing 3 properties, so you need a key structure that implements IEquatable&lt;T&gt;. One way to solve that is to use ValueTuple as your hash keys; tuples are inherently comparable and equatable. So, before your loop, you could build the lookup like this:

    $map = @{}
    foreach ($src in $AllOldSrcProps) {
        $key = [System.ValueTuple[string, long, datetime]]::new(
            $src.FullName, $src.Length, $src.LastWriteTime)

        $map[$key] = $src
    }

Then, in your actual code, you would replace that Where-Object with an index into your $map hashtable, like so:

    foreach ($file in $AllFiles) {
        if ($file.FullName.StartsWith($SrcPath)) {
            $file.FullPotLength = $file.FullName.Length - $SrcLen + $ModLen
            $file.LocKey = $SrcKey

            # If we're not rebuilding the hash table, reuse the stored hash
            if (-not $RebuildSrcHashTblFlag -and $AllOldSrcProps) {
                $key = [System.ValueTuple[string, long, datetime]]::new(
                    $file.FullName, $file.Length, $file.LastWriteTime)

                $MatchingFile = $map[$key]

                if ($MatchingFile) {
                    $file.Hash = $MatchingFile.Hash
                    $MatchedHash[0]++
                }
            }
        }
    }
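
If you'd rather avoid the ValueTuple syntax, a plain composite string key is a common alternative, since hashtable keys only need consistent equality. This is a minimal sketch, not tested against your data; the `|` separator is an arbitrary choice, and the timestamp is stringified with `ToString()` on the `$AllFiles` side to mirror the comparison your original `Where-Object` performed (so it only matches if your database stores `LastWriteTime` in that same string form):

    # Build a lookup keyed on "FullName|Length|LastWriteTime"
    $map = @{}
    foreach ($src in $AllOldSrcProps) {
        $map["$($src.FullName)|$($src.Length)|$($src.LastWriteTime)"] = $src
    }

    # Then, inside the loop over $AllFiles:
    $key = "$($file.FullName)|$($file.Length)|$($file.LastWriteTime.ToString())"
    $MatchingFile = $map[$key]
    if ($MatchingFile) {
        $file.Hash = $MatchingFile.Hash
        $MatchedHash[0]++
    }

Note that the default `ToString()` format is culture-dependent; that's fine here because both the database and the comparison were produced the same way, but an invariant round-trip format like `ToString('o')` on both sides would be more robust.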
