If I start a process in PowerShell using Start-Process, how can I check that it has finished?


I have a script to process videos, in which I first start a download and then a video conversion. The purpose of the conversion is not to change the container format but to clean up the downloaded video frames, reduce the audio from two channels to one, etc.

Both steps take some time, so I thought I could use a kind of parallel processing. The download would run with the -Wait parameter, while the conversion would be started in another window, running in parallel. Something like this:

$files_to_process |
  ForEach-Object {

    <# check if $conversion started in previous cycle iteration 
       has already finished or else wait #>

    <# prepare some parameters from the object #>

    Start-Process yt-dlp -ArgumentList $ytdlp_args -Wait -NoNewWindow
    $conversion = Start-Process ffmpeg -ArgumentList $ff_args
  }

However, the conversion takes longer than the download, so if I want to process a larger list of videos, without control I would end up with ffmpeg processes taking all CPU and memory before the list could be finished.

I tried to find out whether I can get any useful information from the object returned by Start-Process, but it does not even seem to be mentioned in the Microsoft docs for Start-Process; in one example I could only see that there is a property called "Id".

There is the call Get-Process -Id $id, but it does not seem to return any information about whether the process is still running or has finished.

How can I manage this? The same issue applies to other scenarios, e.g. downloading data with Invoke-WebRequest and then processing it and writing it into a database, where in some cases this kind of timing issue could also arise.


There are 3 best solutions below

0
On

In the end I tried this:

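$processList = @()   # IDs of step2 processes that may still be running
$files_to_process | ForEach-Object {
  # Allow at most 2 background step2 processes: wait for the oldest one first.
  while( $processList.Count -ge 2 ) {
    $pending, [array]$other = $processList
    # A process that has already exited can no longer be found by ID; ignore that error.
    Wait-Process -Id $pending -ErrorAction SilentlyContinue
    $processList = $other
  }
  <# parameter preparation into $step1args, $step2args #>
  Start-Process step1 -ArgumentList $step1args -Wait -NoNewWindow
  # -PassThru makes Start-Process return the process object.
  $p = Start-Process step2 -ArgumentList $step2args -PassThru
  $processList += $p.Id
}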

and it works.

It is specific to the workflow in this use case, but the same approach could be used in similar situations.
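
A variation on the same idea, as a minimal sketch rather than a tested drop-in: keeping the Process objects returned by -PassThru instead of their IDs lets the loop drop finished processes via the HasExited property and block only when the limit is reached. The function name Invoke-Throttled and its parameters are made up for illustration, and the demo launches the currently running PowerShell executable as a stand-in for step2 (this assumes a console host).

```powershell
# Hypothetical helper: starts $FilePath once per argument string, keeping at
# most $Limit of the started processes alive at any time.
function Invoke-Throttled {
    param(
        [string]   $FilePath,
        [string[]] $ArgumentStrings,
        [int]      $Limit = 2
    )
    $running = @()
    $all = foreach ($argString in $ArgumentStrings) {
        # Forget processes that have already finished.
        $running = @($running | Where-Object { -not $_.HasExited })
        # Still at the limit: block until the oldest one exits, then re-check.
        while ($running.Count -ge $Limit) {
            $running[0].WaitForExit()
            $running = @($running | Where-Object { -not $_.HasExited })
        }
        $p = Start-Process $FilePath -ArgumentList $argString -PassThru
        $running += $p
        $p   # emit the process object so it is collected in $all
    }
    # Wait for the remaining processes before returning.
    $all | ForEach-Object { $_.WaitForExit() }
    $all
}

# Demo: three short-lived child processes, at most two at a time.
$exe   = (Get-Process -Id $PID).Path   # the PowerShell executable running this script
$procs = Invoke-Throttled -FilePath $exe -Limit 2 -ArgumentStrings @(
    '-NoProfile -Command "Start-Sleep -Milliseconds 300"',
    '-NoProfile -Command "Start-Sleep -Milliseconds 300"',
    '-NoProfile -Command "Start-Sleep -Milliseconds 300"'
)
```

Working with the objects also avoids looking up a process by an ID that no longer exists.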

5
On

If it is sufficient to process the downloaded files one by one, you could do it as follows:

$files_to_process | ForEach-Object {

    <# check if $conversion started in previous cycle iteration 
       has already finished or else wait #>
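    # The null checks skip the wait on the first iteration, when nothing has been started yet.
    while (($ytdlpproc -and -not $ytdlpproc.HasExited) -or
           ($ffmpegproc -and -not $ffmpegproc.HasExited)) {
        Start-Sleep -Seconds 10
    }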

    <# prepare some parameters from the object #>

    $ytdlpproc = Start-Process yt-dlp -ArgumentList $ytdlp_args -NoNewWindow -PassThru
    Wait-Process -InputObject $ytdlpproc

    Start-Sleep -Seconds 1

    $ffmpegproc = Start-Process ffmpeg -ArgumentList $ff_args -PassThru
    Wait-Process -InputObject $ffmpegproc 
}

There is a caveat when using Wait-Process with bootstrapper executables: some of them exit as soon as they have launched the application's main GUI, so Wait-Process returns while the actual application is still running.
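
One possible workaround for that caveat, as a rough sketch only: after the launcher has exited, poll by process name until no matching process is left. The function name Wait-ProcessByName and the names in the usage comment are placeholders, not real tools, and matching by name also catches unrelated instances that were already running.

```powershell
# Hypothetical helper: blocks until no process with the given name exists.
function Wait-ProcessByName {
    param(
        [string] $Name,
        [int]    $PollSeconds = 2
    )
    # Get-Process -Name reports an error when nothing matches; suppress it
    # and treat "no match" as "finished".
    while (Get-Process -Name $Name -ErrorAction SilentlyContinue) {
        Start-Sleep -Seconds $PollSeconds
    }
}

# Usage sketch (both names are placeholders):
#   Start-Process 'launcher.exe' -Wait    # returns once the bootstrapper exits
#   Start-Sleep -Seconds 5                # give the real application time to appear
#   Wait-ProcessByName -Name 'MainApp'
```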

2
On

Your own solution is a custom implementation for throttling parallelism, i.e. for ensuring that no more than N operations run in parallel at a time.

  • With Start-Process, a custom implementation is indeed required, and the same goes for Start-Job.

  • However, the Start-ThreadJob cmdlet and the PowerShell (Core) 7+ -Parallel feature of ForEach-Object have this ability built in:

    • Both mechanisms are thread-based (whereas Start-Job heavy-handedly uses child processes and has no throttling mechanism) and throttle by default, allowing at most 5 threads to run at a given time, and offer a -ThrottleLimit parameter (for CPU-bound operations, it's best not to exceed the number of available CPU cores).

    • As an aside: If Start-ThreadJob is available (by default in PowerShell (Core) 7+, and installable on demand in Windows PowerShell, see below), it is almost always a better choice than Start-Job - see this answer for background.


Therefore, you can use either of these mechanisms with synchronous calls, and let -ThrottleLimit 2 ensure that the parallelism is throttled as desired:

  • If you're running Windows PowerShell, you can install Start-ThreadJob from the PowerShell Gallery, e.g. with Install-Module -Scope CurrentUser ThreadJob, which then enables the following solution:

    $files_to_process | 
      ForEach-Object {
        Start-ThreadJob -ThrottleLimit 2 {
          $filePath = ($using:_).FullName
          <# parameter preparation into $step1args, $step2args, based on $filePath #>
          Start-Process -Wait step1 -ArgumentList $step1args -NoNewWindow
          Start-Process -Wait step2 -ArgumentList $step2args
        }
      } |
      Receive-Job -Wait -AutoRemoveJob
    
  • If you're running PowerShell (Core) 7+, the solution is much simpler, via ForEach-Object -Parallel:

    $files_to_process | 
      ForEach-Object -ThrottleLimit 2 -Parallel {
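        # Inside -Parallel, the current pipeline object is $_ itself ($using: is only for the caller's variables).
        $filePath = $_.FullName
        <# parameter preparation into $step1args, $step2args, based on $filePath #>
        Start-Process -Wait step1 -ArgumentList $step1args -NoNewWindow
        Start-Process -Wait step2 -ArgumentList $step2args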
      }
    

Note:

  • If what you're invoking are console applications, you don't need Start-Process with -Wait, and can simply invoke them directly - see this answer for background.

  • More work is needed if you want to check the exit codes of the child processes launched, namely by outputting them from the script block and having the caller collect and analyze them.
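
To illustrate both notes, here is a minimal sketch that invokes a console application directly and outputs each exit code for the caller to collect. It runs sequentially to keep it short, and it uses the currently running PowerShell executable as a stand-in for a real console tool (an assumption that holds in a console host); the property names Requested and ExitCode are made up for the example.

```powershell
# Stand-in for a console application: the PowerShell executable running this script.
$exe = (Get-Process -Id $PID).Path

$results = 1, 0, 3 | ForEach-Object {
    $code = $_
    # Direct invocation via the call operator (&) is synchronous for console
    # applications; the child's output is discarded to keep the pipeline clean.
    & $exe -NoProfile -Command "exit $code" | Out-Null
    # $LASTEXITCODE holds the exit code of the most recent external program.
    [pscustomobject]@{ Requested = $code; ExitCode = $LASTEXITCODE }
}

# The caller collects the objects and analyzes them afterwards.
$failed = @($results | Where-Object { $_.ExitCode -ne 0 })
"{0} of {1} calls failed" -f $failed.Count, @($results).Count
```

The same pattern (emit one object per input, inspect them after the pipeline has finished) should carry over to a ForEach-Object -Parallel or Start-ThreadJob script block, though that is not shown here.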