Sunday, May 12, 2019

PowerShell Script - MassDownloader - Efficient, Automated, Fault-Tolerant, Idempotent Downloader with Real-Time Metrics

So it's been exactly one year since I published here. Life has been beyond busy. I'm hoping to get back on the bandwagon of updating my blog. I have dozens of functions and scripts that I've been building for various needs and would like to publish. I'll also hopefully get some how-tos posted about running a Windows Server 2019 VM host with a NAS and other VMs.

Recently, I needed to download a ton of files, and I found that doing it with a browser or wget wasn't going to work. I wanted a robust, lightweight, efficient tool that I could easily add custom URL parsing logic to. I wanted it to be idempotent while supporting resumable downloads that run in the background. I decided to leverage BITS (Background Intelligent Transfer Service), which is really the best-of-breed downloading method on Windows.
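To make the idempotent part concrete, here is a minimal sketch of the idea, not the published script itself. The function names Get-PendingDownload and Start-PendingDownload are hypothetical, and so is the rule that a file counts as done when a file with the same name already exists in the destination folder.

```powershell
# Hypothetical helper: returns only the URLs whose target file does not exist yet.
# Assumption: the file name is the last segment of the URL path.
function Get-PendingDownload {
    param(
        [string[]]$Url,
        [string]$Destination
    )
    foreach ($u in $Url) {
        $name   = [System.IO.Path]::GetFileName(([uri]$u).AbsolutePath)
        $target = Join-Path $Destination $name
        if (-not (Test-Path -LiteralPath $target)) {
            [pscustomobject]@{ Source = $u; Target = $target }
        }
    }
}

# Hypothetical wrapper: queues each pending file as a background BITS job.
# Re-running it skips files that are already on disk or already queued.
function Start-PendingDownload {
    param(
        [string[]]$Url,
        [string]$Destination
    )
    Import-Module BitsTransfer
    # Assumption: the target path doubles as the job's display name,
    # so an existing job for the same file can be recognized.
    $queued = Get-BitsTransfer -ErrorAction SilentlyContinue |
        Select-Object -ExpandProperty DisplayName
    foreach ($item in Get-PendingDownload -Url $Url -Destination $Destination) {
        if ($queued -contains $item.Target) { continue }
        Start-BitsTransfer -Source $item.Source -Destination $item.Target `
            -DisplayName $item.Target -Asynchronous | Out-Null
    }
}
```

Jobs started with -Asynchronous keep running in the background and stay in the BITS queue until Complete-BitsTransfer is called on them once they reach the Transferred state. Only then does the file appear under its final name, which is why the sketch checks the queue as well as the disk.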

I also wanted to see real-time statistics as it churned through hundreds of downloads, and I wanted the function to run independently of the scraper function that was populating the download list file with all the URLs to download. This gave me a means of throttling downloads that took full advantage of my bandwidth without oversaturating it and causing excess overhead and packet loss.
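As a rough illustration of the real-time statistics, the sketch below polls the BITS queue and prints an aggregate transfer rate. It is not taken from the script: Get-TransferRate and Watch-Download are hypothetical names, and the two-second polling interval is an arbitrary assumption.

```powershell
# Hypothetical helper: converts two byte-count samples into MB/s.
# A negative delta (for example after finished jobs leave the queue) is clamped to 0.
function Get-TransferRate {
    param(
        [long]$PreviousBytes,
        [long]$CurrentBytes,
        [double]$Seconds
    )
    if ($Seconds -le 0) { return 0.0 }
    $delta = [math]::Max(0, $CurrentBytes - $PreviousBytes)
    [math]::Round($delta / 1MB / $Seconds, 2)
}

# Hypothetical monitor: samples the BITS queue until it is empty.
function Watch-Download {
    param([int]$IntervalSeconds = 2)
    Import-Module BitsTransfer
    $previous = $null
    while ($true) {
        $jobs = @(Get-BitsTransfer -ErrorAction SilentlyContinue)
        if ($jobs.Count -eq 0) { break }
        # Total bytes received so far across every job still in the queue.
        $current = [long]($jobs | Measure-Object -Property BytesTransferred -Sum).Sum
        if ($null -ne $previous) {
            $rate = Get-TransferRate -PreviousBytes $previous -CurrentBytes $current -Seconds $IntervalSeconds
            Write-Host ('{0} jobs queued, {1} MB/s' -f $jobs.Count, $rate)
        }
        $previous = $current
        Start-Sleep -Seconds $IntervalSeconds
    }
}
```

Because the monitor only reads the queue, it can run in a separate session from whatever is adding URLs to the list, which matches the goal of keeping the downloader independent of the scraper.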

The solution I came up with leaves out some more advanced features, like auto-throttling based on TCP statistics. I also took a stab at calculating the estimated total time to completion, but I found that without better accounting the results were garbage. I have a few other ideas I might implement in a future version, but I figured I'd just publish what I used to successfully download hundreds of files, over 200 GB in total.

The script also includes a way to stop all downloads should you need to. Just run "Stop-DownloadFiles".

Run from the ISE:


Run from a standard prompt:



Here's the source code: