First time asker here. Please be kind :)

I'm attempting to recursively get all directories in parallel, in hopes of decreasing the time it takes to traverse a drive. Below is the code I've tried. Essentially, I want to input a folder and do the same, in parallel, for its subfolders, and their subfolders, and so on, but the function is not recognized inside the parallel block.

function New-RecursiveDirectoryList {
    [CmdletBinding()]
    param (
        # Specifies a path to one or more locations.
        [Parameter(Mandatory = $true,
            Position = 0,
            ValueFromPipeline = $true,
            ValueFromPipelineByPropertyName = $true,
            HelpMessage = 'Path to one or more locations.')]
        [Alias('PSPath')]
        [ValidateNotNullOrEmpty()]
        [string[]]
        $Path
    )
    process {
        foreach ($aPath in $Path) {
            Get-Item $aPath

            Get-ChildItem -Path $aPath -Directory |
                # Recursively call itself in Parallel block not working
                # Getting error "The term 'New-RecursiveDirectoryList' is not recognized as a name of a cmdlet"
                # Without -Parallel switch this works as expected
                ForEach-Object -Parallel {
                    $_ | New-RecursiveDirectoryList
                }
        }
    }
}

Error:

New-RecursiveDirectoryList: 
Line |
   2 |                      $_ | New-RecursiveDirectoryList
     |                           ~~~~~~~~~~~~~~~~~~~~~~~~~~
     | The term 'New-RecursiveDirectoryList' is not recognized as a name of a cmdlet, function, script file, or executable program.
Check the spelling of the name, or if a path was included, verify that the path is correct and try again.

I've also attempted to use the solution provided by mklement0 here, but no luck. Below is my attempt:

function CustomFunction {
    [CmdletBinding()]
    param (
        # Specifies a path to one or more locations.
        [Parameter(Mandatory = $true,
            Position = 0,
            ValueFromPipeline = $true,
            ValueFromPipelineByPropertyName = $true,
            HelpMessage = 'Path to one or more locations.')]
        [Alias('PSPath')]
        [ValidateNotNullOrEmpty()]
        [string[]]
        $Path
    )

    begin {
        # Get the function's definition *as a string*
        $funcDef = $function:CustomFunction.ToString()
    }

    process {
        foreach ($aPath in $Path) {
            Get-Item $aPath

            Get-ChildItem -Path $aPath -Directory |
                # Recursively call itself in Parallel block not working
                # Getting error "The term 'New-RecursiveDirectoryList' is not recognized as a name of a cmdlet"
                # Without -Parallel switch this works as expected
                ForEach-Object -Parallel {
                    $function:CustomFunction = $using:funcDef
                    $_ | CustomFuction
                }
        }
    }
}

Error

CustomFuction: 
Line |
   3 |                      $_ | CustomFuction
     |                           ~~~~~~~~~~~~~
     | The term 'CustomFuction' is not recognized as a name of a cmdlet, function, script file, or executable program.
Check the spelling of the name, or if a path was included, verify that the path is correct and try again.

Does anybody know how this can be accomplished, or a different way of doing this?

There are 3 answers below.

Accepted answer

So, this worked for me; it obviously doesn't look pretty. One thing to note: the foreach ($aPath in $Path) {...} in your script is unnecessary when the paths come in via the pipeline, since the process {...} block already runs once per pipeline item.

Code:

function Test {
    [CmdletBinding()]
    param (
        # Specifies a path to one or more locations.
        [Parameter(
            Mandatory,
            ParameterSetName = 'LiteralPath',
            ValueFromPipelineByPropertyName,
            Position = 0)]
        [Alias('PSPath')]
        [string[]] $LiteralPath
    )

    begin {
        $scriptblock = $MyInvocation.MyCommand.ScriptBlock.ToString()
    }

    process {
        # Get-Item $Path <= This will slow down the script
        $LiteralPath | Get-ChildItem -Directory | ForEach-Object -Parallel {
            $sb = $using:scriptblock
            $def = [scriptblock]::Create($sb)
            $_ # You can do this instead
            $_ | & $def
        }
    }
}
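For reference, a minimal way to invoke the function above (the drive path is only an example; any existing directory works):

```powershell
# List every directory under C:\Windows, recursing in parallel,
# and show the first few full paths as they stream out.
Test -LiteralPath 'C:\Windows' |
    Select-Object -First 10 -ExpandProperty FullName
```

Because ForEach-Object -Parallel streams its output, results arrive as they are discovered rather than in sorted order.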

Looking back at this answer, what I would recommend today is to not use recursion and instead use a ConcurrentStack<T>; this is far more efficient and consumes less memory. Also worth noting: as mklement0 pointed out in his comment, your code was correct to begin with, and the issue was just a typo: $_ | CustomFuction should have been $_ | CustomFunction.

function Test {
    [CmdletBinding()]
    param (
        [Parameter(
            Mandatory,
            ParameterSetName = 'LiteralPath',
            ValueFromPipelineByPropertyName,
            Position = 0)]
        [Alias('PSPath')]
        [string[]] $LiteralPath,

        [Parameter()]
        [ValidateRange(1, 64)]
        [int] $ThrottleLimit = 5
    )

    begin {
        $stack = [System.Collections.Concurrent.ConcurrentStack[System.IO.DirectoryInfo]]::new()
        $dir = $null
    }

    process {
        $stack.PushRange($LiteralPath)
        while ($stack.TryPop([ref] $dir)) {
            $dir | Get-ChildItem -Directory | ForEach-Object -Parallel {
                $stack = $using:stack
                $stack.Push($_)
                $_
            } -ThrottleLimit $ThrottleLimit
        }
    }
}
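A usage sketch for the stack-based version (the path and throttle value are only illustrative):

```powershell
# Enumerate every directory under C:\Temp using up to 8 runspaces.
# Each level of the tree is expanded in parallel; discovered folders
# are pushed back onto the shared ConcurrentStack for the next pass.
$dirs = Test -LiteralPath 'C:\Temp' -ThrottleLimit 8
$dirs.Count
```

The ConcurrentStack is safe to share across runspaces, which is what allows the parallel blocks to feed new work back into the while loop without any explicit locking.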
Answer 2

From what I can see, you declared a mandatory parameter, which means you need to supply it when you call your function. For example, in your case, you can load the code into memory and run it manually: open a PowerShell session and simply copy/paste the code you posted here. Once the code is loaded into memory, you can call the function:

CustomFunction -Path "TheTargetPathYouWant"
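If the function lives in a script file, dot-sourcing is another way to load it into the current session (the file name here is hypothetical):

```powershell
# Dot-source the script so its function definitions enter the
# current session's scope, then call the function with the
# mandatory -Path parameter.
. .\CustomFunction.ps1
CustomFunction -Path 'C:\SomeFolder'
```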
Answer 3

I did something similar at the time. I did it with a non-recursive function, using .NET runspaces instead. For this, you will need to install the PoshRSJob module and create a list of subfolders to process in dir.txt. Then you run this:

Install-Module PoshRsJob -Scope CurrentUser
function ParallelDir {
    param (
        $Folders,
        $Throttle = 8
    )
    $batch = 'ParallelDir'
    $jobs = Get-RSJob -Batch $batch
    if ($jobs | Where-Object State -eq 'Running') {
        Write-Warning ("Some jobs are still running. Stop them before running this job.
        > Stop-RSJob -Batch $batch")
        return
    }

    $Folders | Start-RSJob -Throttle $Throttle -Batch $batch -ScriptBlock {
        Param ($fullname)
        $name = Split-Path -Path $fullname -Leaf
        Get-ChildItem $fullname -Recurse | Select-Object * | Export-Clixml ('c:\temp\{0}.xml' -f $name)
    } | Wait-RSJob -ShowProgress | Receive-RSJob

    if (!(Get-RSJob -Batch $batch | Where-Object {$_.HasErrors -and $_.Completed})) {
        Remove-RSJob -Batch $batch
    } else {
        Write-Warning ("The copy process has finished with ERROR. You can check:
        > Get-RsJob -Batch $batch
        To consolidate the results from each copy run:
        > Get-ChildItem 'c:\temp\*.xml' | Import-Clixml")
    }
}
$dir = Get-Content .\dir.txt
ParallelDir -Folders $dir
Get-ChildItem c:\temp\*.xml | Import-Clixml | Select-Object Name, Length