I have a script that processes videos: it first downloads each video and then converts it. The purpose of the conversion is not to change the container format but to clean up the downloaded video frames, mix the audio down from two channels to one, etc.
Both steps take some time, so I thought I could use a kind of parallel processing: the download would run with the -Wait parameter, while the conversion would be started in another window, running in parallel. Something like this:
$files_to_process |
    ForEach-Object {
        <# check if the $conversion started in the previous iteration
           has already finished, or else wait #>
        <# prepare some parameters from the object #>
        Start-Process yt-dlp -ArgumentList $ytdlp_args -Wait -NoNewWindow
        # -PassThru is needed, otherwise Start-Process returns nothing
        $conversion = Start-Process ffmpeg -ArgumentList $ff_args -PassThru
    }
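The placeholder check at the top of the loop could be filled in roughly like this, assuming the ffmpeg process handle from the previous iteration was captured with -PassThru (the variable name $conversion is taken from the snippet above):

```powershell
# Hypothetical fill-in for the placeholder: before starting the next
# download, wait for the previous conversion (if any) to finish.
if ($null -ne $conversion -and -not $conversion.HasExited) {
    $conversion.WaitForExit()   # blocks until the previous ffmpeg exits
}
```

This keeps at most one ffmpeg running at a time; for more parallelism you would keep a list of process objects instead of a single variable.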
However, the conversion takes longer than the download, so when processing a larger list of videos, without any throttling I would end up with ffmpeg processes consuming all CPU and memory before the list was finished.
I was trying to find out whether I can get any useful information from the object returned by Start-Process, but a return object does not even seem to be mentioned in the Microsoft docs for Start-Process; in one example I could only see that it has a property called "Id".
There is a cmdlet, Get-Process -Id $id,
but it does not tell me whether the process is still running or has finished.
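For what it's worth, Start-Process does return an object when called with the -PassThru switch: a System.Diagnostics.Process instance, which exposes exactly this information via HasExited and WaitForExit(). A minimal sketch (notepad is just a stand-in executable):

```powershell
# Start-Process returns nothing unless -PassThru is given;
# with it, you get a System.Diagnostics.Process object back.
$p = Start-Process notepad -PassThru
$p.HasExited       # $false while the process is still running
$p.WaitForExit()   # blocks until it exits; Wait-Process -Id $p.Id works too
$p.HasExited       # $true after the process has finished
```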
How can I manage this? The same issue applies to other scenarios, e.g. downloading data with Invoke-WebRequest and then processing it and writing it into a database, where in some cases the same kind of timing problem can appear.
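For that second scenario, one common pattern is to run the downloads as background jobs and throttle how many run at once. A sketch, assuming a $urls collection and a hypothetical limit of 4 concurrent jobs:

```powershell
# Throttle background jobs so at most $max run at the same time.
$max = 4
foreach ($url in $urls) {
    while ((Get-Job -State Running).Count -ge $max) {
        Start-Sleep -Milliseconds 200   # wait for a slot to free up
    }
    Start-Job -ScriptBlock {
        param($u)
        $data = Invoke-WebRequest -Uri $u
        # ...process $data and write it into the database...
    } -ArgumentList $url
}
Get-Job | Wait-Job | Receive-Job   # collect the results at the end
```

On PowerShell 7+, ForEach-Object -Parallel with -ThrottleLimit achieves the same effect more concisely.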
In the end I tried this:
and it works. It is specific to the workflow in this use case, but it could be adapted to similar situations.