Split-Job 0.93

Wow, it has been a long time since I had time to post anything on this blog. However, I have been making some improvements to the Split-Job script over the last few months and the results are below. I use this script myself a lot and I hope this is useful for some of you as well. Please see the previous version for more information about the concept and usage of Split-Job.

The changes are listed in the script comments. One aspect that deserves extra attention is the runspace environment. As before, the runspaces that process the pipeline objects do not necessarily have the same variables, functions etc. that your main PowerShell session contains. If you do need those, you now have two options:

  • Specify the -UseProfile switch. This will load your PS profile, which should give you (most of) your snapins, functions and aliases. This is most useful when using Split-Job interactively at the console.
  • Import the specific functions, variables etc. you need by using the SnapIn, Function, Alias and Variable parameters. Each of these can take multiple arguments and wildcards. This is more efficient when using Split-Job inside a script.
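
To illustrate what importing a function into a runspace involves, here is a minimal sketch of the same technique the script uses: read the function's definition from the Function: drive and recreate it in a fresh runspace. Get-Square is a made-up example function, and this is a simplified illustration rather than a copy of the Split-Job code.

```powershell
# Hypothetical helper that exists only in the main session
function Get-Square ($n) { $n * $n }

# Capture the provider path and the body text of the function
$item = Get-Item Function:Get-Square | Select-Object PSPath, Definition

# A new runspace starts out without Get-Square
$runspace = [System.Management.Automation.Runspaces.RunspaceFactory]::CreateRunspace()
$runspace.Open()

# Recreate the function inside the runspace from its definition
$import = $runspace.CreatePipeline(
    'foreach ($i in $Input) { $null = New-Item -Path $i.PSPath -Value $i.Definition }')
$null = $import.Input.Write($item)
$import.Input.Close()
$null = $import.Invoke()
$import.Dispose()

# Code running in that runspace can now call the imported function
$call = $runspace.CreatePipeline('Get-Square 7')
$result = $call.Invoke()
$call.Dispose()
$runspace.Close()
$result
```

With Split-Job itself you would not write any of this; passing -Function Get-Square should have the same effect for every runspace it creates.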

I am actually considering making -UseProfile the default and creating the opposite -NoProfile switch instead. This will make it even easier to use Split-Job at the command line. In my case this does not slow things down considerably. If you want, you can exclude portions of your profile (e.g. PowerTab) by testing whether $SplitJobRunSpace exists. Feedback is welcome!
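
As a rough sketch of that last point, a profile could be arranged like this. The function, alias and flag names are invented for the example; the only thing taken from the script is that it sets $SplitJobRunSpace to $True in every runspace it opens.

```powershell
# Sketch of a profile script ($PROFILE); all names below are hypothetical.

# Lightweight definitions that the worker runspaces should get as well
function Get-ShortDate { Get-Date -Format 'yyyy-MM-dd' }
Set-Alias gsd Get-ShortDate

# The variable only exists when the profile is being loaded by Split-Job,
# so this block is skipped inside worker runspaces.
if (-not (Test-Path Variable:SplitJobRunSpace)) {
    # Interactive-only setup goes here: PowerTab, prompt tweaks, banners, ...
    $InteractiveSetupDone = $true
}
```

Using Test-Path on the Variable: drive rather than reading $SplitJobRunSpace directly keeps the check quiet even if strict mode is enabled in the profile.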

Enjoy,

Arnoud


#requires -version 1.0
################################################################################
## Run commands in multiple concurrent pipelines
##   by Arnoud Jansveld - www.jansveld.net/powershell
## Version History
## 0.93   Improve error handling: errors originating in the Scriptblock now
##        have more meaningful output
##        Show additional info in the progress bar (thanks Stephen Mills)
##        Add SnapIn parameter: imports (registered) PowerShell snapins
##        Add Function parameter: imports functions
##        Add SplitJobRunSpace variable; allows scripts to test if they are 
##        running in a runspace 
##        Add seconds remaining to progress bar (experimental)
## 0.92   Add UseProfile switch: imports the PS profile
##        Add Variable parameter: imports variables
##        Add Alias parameter: imports aliases
##        Restart pipeline if it stops due to an error
##        Set the current path in each runspace to that of the calling process
## 0.91   Revert to v 0.8 input syntax for the script block
##        Add error handling for empty input queue
## 0.9    Add logic to distinguish between scriptblocks and cmdlets or scripts:
##        if a ScriptBlock is specified, a foreach {} wrapper is added
## 0.8    Adds a progress bar
## 0.7    Stop adding runspaces if the queue is already empty
## 0.6    First version. Inspired by Gaurhoth's New-TaskPool script
################################################################################

function Split-Job {
    param (
        $Scriptblock = $(throw 'You must specify a command or script block!'),
        [int]$MaxPipelines=10,
        [switch]$UseProfile,
        [string[]]$Variable,
        [string[]]$Function = @(),
        [string[]]$Alias = @(),
        [string[]]$SnapIn
    ) 

    function Init ($InputQueue){
        # Create the shared thread-safe queue and fill it with the input objects
        $Queue = [Collections.Queue]::Synchronized([Collections.Queue]@($InputQueue))
        $QueueLength = $Queue.Count
        # Do not create more runspaces than input objects
        if ($MaxPipelines -gt $QueueLength) {$MaxPipelines = $QueueLength}
        # Create the script to be run by each runspace
        $Script  = "Set-Location '$PWD'; "
        $Script += {
            $SplitJobQueue = $($Input)
            & {
                trap {continue}
                while ($SplitJobQueue.Count) {$SplitJobQueue.Dequeue()}
            } |
        }.ToString() + $Scriptblock

        # Create an array to keep track of the set of pipelines
        $Pipelines = New-Object System.Collections.ArrayList

        # Collect the functions and aliases to import
        $ImportItems = ($Function -replace '^','Function:') +
            ($Alias -replace '^','Alias:') |
            Get-Item | select PSPath, Definition
        $stopwatch = New-Object System.Diagnostics.Stopwatch
        $stopwatch.Start()
    }

    function Add-Pipeline {
        # This creates a new runspace and starts an asynchronous pipeline with our script.
        # It will automatically start processing objects from the shared queue.
        $Runspace = [System.Management.Automation.Runspaces.RunspaceFactory]::CreateRunspace($Host)
        $Runspace.Open()
        $Runspace.SessionStateProxy.SetVariable('SplitJobRunSpace', $True)

        function CreatePipeline {
            param ($Data, $Scriptblock)
            $Pipeline = $Runspace.CreatePipeline($Scriptblock)
            if ($Data) {
                $Null = $Pipeline.Input.Write($Data, $True)
                $Pipeline.Input.Close()
            }
            $Null = $Pipeline.Invoke()
            $Pipeline.Dispose()
        }

        # Optionally import profile, variables, functions and aliases from the main runspace
        if ($UseProfile) {
            CreatePipeline -Script "`$PROFILE = '$PROFILE'; . `$PROFILE"
        }
        if ($Variable) {
            foreach ($var in (Get-Variable $Variable -Scope 2)) {
                trap {continue}
                $Runspace.SessionStateProxy.SetVariable($var.Name, $var.Value)
            }
        }
        if ($ImportItems) {
            CreatePipeline $ImportItems {
                foreach ($item in $Input) {New-Item -Path $item.PSPath -Value $item.Definition}
            }
        }
        if ($SnapIn) {
            CreatePipeline (Get-PSSnapin $Snapin -Registered) {$Input | Add-PSSnapin}
        }
        $Pipeline = $Runspace.CreatePipeline($Script)
        $Null = $Pipeline.Input.Write($Queue)
        $Pipeline.Input.Close()
        $Pipeline.InvokeAsync()
        $Null = $Pipelines.Add($Pipeline)
    }

    function Remove-Pipeline ($Pipeline) {
        # Remove a pipeline and runspace when it is done
        $Pipeline.RunSpace.Close()
        $Pipeline.Dispose()
        $Pipelines.Remove($Pipeline)
    }

    # Main 
    # Initialize the queue from the pipeline
    . Init $Input
    # Start the pipelines
    while ($Pipelines.Count -lt $MaxPipelines -and $Queue.Count) {Add-Pipeline} 

    # Loop through the runspaces and pass their output to the main pipeline
    while ($Pipelines.Count) {
        # Only update the progress bar once a second
        if (($stopwatch.ElapsedMilliseconds - $LastUpdate) -gt 1000) {
            $Completed = $QueueLength - $Queue.Count - $Pipelines.count
            $LastUpdate = $stopwatch.ElapsedMilliseconds
            $SecondsRemaining = $(if ($Completed) {
                (($Queue.Count + $Pipelines.Count)*$LastUpdate/1000/$Completed)
            } else {-1})
            Write-Progress 'Split-Job' ("Queues: $($Pipelines.Count)  Total: $($QueueLength)  " +
            "Completed: $Completed  Pending: $($Queue.Count)")  `
            -PercentComplete ([Math]::Max((100-[Int]($Queue.Count+$Pipelines.Count)/$QueueLength*100),0)) `
            -CurrentOperation "Next item: $(trap {continue}; if ($Queue.Count) {$Queue.Peek()})" `
            -SecondsRemaining $SecondsRemaining
        }
        foreach ($Pipeline in @($Pipelines)) {
            if ( -not $Pipeline.Output.EndOfPipeline -or -not $Pipeline.Error.EndOfPipeline ) {
                $Pipeline.Output.NonBlockingRead()
                $Pipeline.Error.NonBlockingRead() | Out-Default
            } else {
                # Pipeline has stopped; if there was an error show info and restart it
                if ($Pipeline.PipelineStateInfo.State -eq 'Failed') {
                    $Pipeline.PipelineStateInfo.Reason.ErrorRecord |
                        Add-Member NoteProperty writeErrorStream $True -PassThru |
                            Out-Default
                    # Restart the runspace
                    if ($Queue.Count -lt $QueueLength) {Add-Pipeline}
                }
                Remove-Pipeline $Pipeline
            }
        }
        Start-Sleep -Milliseconds 100
    }
}
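
Stripped of the progress bar, the imports and the restart logic, the core idea of the script above can be sketched as follows: several runspaces drain one synchronized queue, and whatever comes out of the queue is piped into the user's command. The work items (the numbers 1 to 20), the fixed count of three runspaces and the doubling command are placeholders chosen for this example only.

```powershell
# Shared, thread-safe queue holding the work items
$queue = [Collections.Queue]::Synchronized([Collections.Queue]@(1..20))

# Worker script: take items off the shared queue and pipe them onward.
# The part after the pipe stands in for the script block given to Split-Job.
$script = @'
$q = $($Input)
& {
    trap {continue}
    while ($q.Count) {$q.Dequeue()}
} | ForEach-Object { $_ * 2 }
'@

# Start three runspaces that all read from the same queue
$pipelines = foreach ($n in 1..3) {
    $runspace = [System.Management.Automation.Runspaces.RunspaceFactory]::CreateRunspace()
    $runspace.Open()
    $pipeline = $runspace.CreatePipeline($script)
    $null = $pipeline.Input.Write($queue)
    $pipeline.Input.Close()
    $pipeline.InvokeAsync()
    $pipeline
}

# Collect the output of each worker, then clean up
$results = foreach ($pipeline in $pipelines) {
    $pipeline.Output.ReadToEnd()   # waits for this pipeline to finish
    $pipeline.Runspace.Close()
    $pipeline.Dispose()
}
```

Because the user's script block is appended directly after the pipe character, it has to start with a command such as ForEach-Object (%) or a cmdlet. A bare expression in that position produces the "Expressions are only allowed as the first element of a pipeline" parser error.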

23 thoughts on “Split-Job 0.93”

  1. I’m trying to use this with Exchange 2007, but it isn’t aware of Exchange specific cmdlets. Could I add “Add-PSSnapin Microsoft.Exchange.Management.PowerShell.Admin” somewhere in this script to make it Exchange aware? If so, where would I add it? Thanks.

  2. Hi Nick, I can’t test this myself but can you try adding the following parameter when you call the Split-Job function:

    -SnapIn Microsoft.Exchange.Management.PowerShell.Admin

    Regards,
    Arnoud

  3. Hi Arnoud, that worked like a charm. My script now completes so much faster thanks to you. Thanks for your quick response and time.

  4. Can you give an example of how you would include functions? An example or two showing how to use the switches would be great. Thanks!

  5. Nick,
    The script rocks! I am using it to collect inventory on ~1200 Pcs. The vbscripts handling it prior were getting out of control. It replaced 2 scripts, one of them with around 2500 lines of code.

    Thanks!

  6. Pardon my ignorance, but how do you register the Split-Job function so you don’t have to reference its script location when you pipe the contents of machines.txt to it through PowerShell?

    (ie Get-Content machines.txt | SPLIT-JOB {Get-WmiObject …..)

    Thank you in advance for the help.

    1. Ralph, you can either include the function in your PowerShell profile or put it in a .ps1 file and dot-source it, e.g.:
      . .\MySplitJob.ps1

      Regards,
      Arnoud

  7. Hi Ralph,

    I’ve got another question for you on this script. With PS 2.0 Microsoft is starting to use modules instead of snapins. For example, the new Active Directory cmdlets (like Get-ADUser) were released via a module with Server 2008 R2, which I’m having a hard time using with this function. I use Split-Job all the time and am wondering if it may be updated to work with modules. Thanks.

  8. I am looking into doing a complete rewrite for PowerShell 2.0, and will definitely include support for modules. No ETA though, as I am very busy both in my professional and private life these days.

    Glad to hear the script is of some use to people out there!

    Arnoud

  9. Hello Arnoud,

    Thanks for the reply and for the work you’ve done on the script so far. I understand being busy and look forward to a rewrite down the road. Thanks again.

  10. Your script rocks! We had a dual quad core server that’s idle most of the time and now I can spike out all cores with this script and jobs complete much much faster.
    I’ve had to add two snapins with
    -snapin quest.activeroles.admanagement,Microsoft.Exchange.Management.PowerShell.Admin
    and things work perfectly.
    I’m even incorporating it into some other scripts I have that use a foreach command with extremely large script blocks within and it still works fine.

  11. Hi, I’m trying to make a PowerShell script using your Split-Job to mail-bomb my company servers. This is purely for testing purposes, but I’m having some issues working out some snags. I’m no PowerShell expert by any means.

    I’m using the Net.Mail.SmtpClient for this script and my pipeline looks something like this:
    Get-Content c:\mailaddresses.txt | Split-Job {$smtp.Send($_, $emailTo, $subject, $body) }
    I’ve set all the variables and also have $smtp = new-object Net.Mail.SmtpClient($smtpServer) in my script. Any help towards why I get this error?

    Expressions are only allowed as the first element of a pipeline.
    At line:7 char:51
    + $smtp.Send($_, $emailTo, $subject, $body) <<<<
    + CategoryInfo : ParserError: (:) [], ParentContainsErrorRecordException
    + FullyQualifiedErrorId : ExpressionsMustBeFirstInPipeline

    The first 155 lines of this script are the source for split-job.ps1.

  12. Hello…

    I just found your site by chance while looking for help trying to multi-thread a PowerShell script. I am still rather unfamiliar with PowerShell syntax and need a quick fix. Here is my code below:

    ##
    # Start of script
    ##

    # Helper Function – convert WMI date to TimeDate object
    function WMIDateStringToDate($Bootup) {
        [System.Management.ManagementDateTimeConverter]::ToDateTime($Bootup)
    }

    # Main script
    $Computer = $args[0] # adjust as needed
    $computers = Get-WMIObject -class Win32_OperatingSystem -Impersonation 3 -computer $computer -authentication 6
    #$computers = Get-WMIObject -class Win32_OperatingSystem -computer $computer

    foreach ($system in $computers) {
        $Bootup = $system.LastBootUpTime
        $LastBootUpTime = WMIDateStringToDate($Bootup)
        $now = Get-Date
        $Uptime = $now - $lastBootUpTime
        $d = $Uptime.Days
        $h = $Uptime.Hours
        $m = $uptime.Minutes
        $ms= $uptime.Milliseconds

        "System Up for: {0} days, {1} hours, {2}.{3} minutes" -f $d,$h,$m,$ms
    }
    # End script

    Any ideas how I could use “start-job” for multi-threading? Let’s say I have 20 servers I need to run this script against, and I want them all running in parallel.

    ANY help would be greatly appreciated!!!! :)

  13. Would you mind if I posted this (well, an updated version of it) to poshcode.org? I was just at TEC 2011 and the PowerShell Deep Dive, and some of the people there are interested in it. I could always send them a copy, but it would be nice to have it somewhere where everyone could get it easily and potentially post updates to it.

    Thanks,

    Stephen

  14. Thanks for uploading it. I’ve uploaded a version 1.2 that actually only works with PowerShell V2, but also works with powershell_ise. The 1.0 version would leave the pipelines running if you exited the script early (by hitting Escape in powershell_ise or Ctrl-C in powershell.exe). It also adds various other improvements like InitializeScript, MaxDuration, time remaining, and a few other minor things.

    http://poshcode.org/2621

  15. @Charles

    Here’s a quick way to get what you seem to be looking for.

    'computer1', 'computer2', 'computer3' | Split-Job { % { GWMI Win32_OperatingSystem -ComputerName $_ -Property CSName, LocalDateTime, LastBootUpTime }} | Select @{n='ComputerName';e={ $_.CSName}}, @{n='UpTime';e={ [datetime]$_.LocalDateTime - [datetime]$_.LastBootUpTime}}

  16. Stephen or Arnoud,

    How would I deal with variables within functions that I would like to Split-Job?

    Here is a snip of my script:

    Import-Module -Name .\_Modules\Split-Job.psm1

    Function GetSystemState
    { # Begin Function GetSystemState()
        # Pings each of the system names found to check if online
        # Writes findings to files.
        $intCounter++
        Write-Host "[$intCounter/$intCounterMax] Checking Computer: $_" -BackgroundColor Cyan -ForegroundColor White
        If (Test-Connection -ComputerName $_ -Quiet -Count 1)
        { # If Computer is alive
            # Adds Computer Name to Alive File
            $strAliveList += $_
        } # End of True Test-Connection
        Else # Test-Connection
        { # Begin of Else for Test-Connection
            $strDeadList += $_
        } # End of Else for Test-Connection
    } # End Function GetSystemState()

    # Function Variables
    $GLOBAL:strAliveList = @()
    $GLOBAL:strDeadList = @()

    # Get Computer Status
    # Determines if Computer is On-Line
    Write-Host "Getting Computer Status……………………………………………." -BackgroundColor Black -ForegroundColor White
    $GLOBAL:intCounter = 0
    $GLOBAL:intCounterMax = $computerList.Length
    $computerList | Split-Job { % {GetSystemState}} -Function GetSystemState

    It does not update my Alive or Dead variables, nor does it increment my counter. I also have Split-Job running another part of my script where it searches the alive computers and puts information inside variables as well, with the same result of not passing back to the main script.

    Any and all help would be greatly appreciated!!! I’m fairly new to PowerShell and all self-taught with Google search help. That’s where I came across your script. I am running v1.2.

  17. Hi guys, I am trying to use this with multiple parameters.

    My function has 3 parameters: Server by pipeline, plus Param2 and Param3.

    $Param2 = "Something2"
    $param3 = "Something3"
    get-content c:\servers.txt | split-job {myFunction -param2 $param2 -param3 $param3}

    It looks like Param2 and Param3 are not being passed to my function.

    Any ideas ? Thanks

    1. You can pass local variables to your script with the Variable parameter; in your case you would use this:

      get-content c:\servers.txt | split-job {myFunction -param2 $param2 -param3 $param3} -Variable param2,param3

      Regards,
      Arnoud
