April 24, 2019

PowerShell :: Parsing text output

(Last Updated On: 11th September 2018)

I came across a really great way of parsing output from command line tools within PowerShell, so I had to write a quick blog post about it to share it.  Normally I like to cite my sources, but I’ve lost track of the original Stack Overflow post that led me to this point… Sorry!

This technique revolves around the switch statement which, if you haven’t come across it before, is much like a chain of if statements.  When provided a value, it invokes the code block for each condition that evaluates to $true.  It is even more powerful in PowerShell, as switch can do simple matching, wildcard matching and regex matching!  There is a great resource on the switch statement here which is worth a read.  We will be using some regexes later on and, as always, regexr is an awesome place to try out your regex patterns before incorporating them in your code.  Let’s take a quick look at switch in action before we dive into parsing some command line output in PowerShell, namely netstat.  I’ve wrapped it in a function just so I get cleaner output to paste here.

function switcher ($function_variable)
{
    switch ($function_variable)
    {
    1 {"One"}
    2 {"Two"}
    3 {"Three"}
    4 {"Four"}
    5 {"Five"}
    default {"Hmmmm..... Tricky!"}
    }
}

---------- output -------------
PS C:\> switcher 1
One

PS C:\> switcher 2
Two

PS C:\> switcher 25
Hmmmm..... Tricky!

PS C:\> switcher 5
Five

As you can see, the code above behaves much like the chain of if statements below.

function iffer ($function_variable)
{
    if($function_variable -eq 1){"One"}
    elseif($function_variable -eq 2){"Two"}
    elseif($function_variable -eq 3){"Three"}
    elseif($function_variable -eq 4){"Four"}
    elseif($function_variable -eq 5){"Five"}
    else{"Hmmmm..... Tricky!"}
}

---------- output -----------
PS C:\> iffer 1
One

PS C:\> iffer 2
Two

PS C:\> iffer 3
Three

PS C:\> iffer 25
Hmmmm..... Tricky!
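
The wildcard and regex matching mentioned earlier work the same way and are selected with the -Wildcard and -Regex parameters.  Here is a minimal sketch, where the function name (matcher) and the sample inputs are made up for illustration:

```powershell
function matcher ($value)
{
    # -Wildcard compares the value against each pattern like the -like operator.
    switch -Wildcard ($value)
    {
        'TCP*'  {"wildcard: starts with TCP"}
        '*.exe' {"wildcard: ends with .exe"}
    }
    # -Regex compares like -match and fills $Matches with any capture groups.
    switch -Regex ($value)
    {
        '^(TCP|UDP)\s+(\d+)$' {"regex: protocol $($Matches[1]), number $($Matches[2])"}
    }
}

matcher "TCP 445"
matcher "svchost.exe"
```

The first call should hit both the TCP* wildcard and the regex; the second should only hit the *.exe wildcard.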

Hopefully you have followed that and are now comfortable with the switch statement.  In most languages it just makes for more readable code but, in PowerShell, you can do awesome stuff with it.  Let’s get into netstat and regexing!  The admins amongst you probably know of Get-NetTCPConnection (a PowerShell netstat equivalent), but this article is intended as an instruction for all manner of binaries, not just netstat.  Plus, Get-NetTCPConnection isn’t available on all systems.  The command we are going to run is netstat -abno, which lists all connections and listening ports (-a) in numeric form (-n) along with the owning process ID (-o) and executable (-b).  The -b switch needs an elevated prompt.  I’ll post some example lines below so we can see what we need to parse out.

  Proto  Local Address          Foreign Address        State           PID
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING       1188
  RpcSs
 [svchost.exe]
  TCP    0.0.0.0:445            0.0.0.0:0              LISTENING       4
 Can not obtain ownership information
  TCP    127.0.0.1:4014         127.0.0.1:1736         ESTABLISHED     15700
 [NVIDIA Share.exe]
  UDP    127.0.0.1:48201        *:*                                    11080
 [NVIDIA Web Helper.exe]
  UDP    127.0.0.1:49668        *:*                                    4796
  iphlpsvc
 [svchost.exe]

Hopefully that doesn’t look too horrible on mobile devices!  Let’s get stuck in and write some regexes to carve that data out into useful PowerShell objects.  First we will build the regexes with regexr.  The first line can be handled with…

(TCP|UDP)\W*(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}):(\d{1,5})\W*(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}):(\d{1,5})\W*(LISTENING|ESTABLISHED)\W*(\d{1,5})
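
If you want to check a pattern like this in PowerShell itself rather than in regexr, the -match operator fills the automatic $Matches variable with the capture groups.  A quick sketch against one of the sample lines above, with the dots escaped so they only match a literal "." (the variable names are just for illustration):

```powershell
# Sample row copied from the netstat output above.
$line = '  TCP    127.0.0.1:4014         127.0.0.1:1736         ESTABLISHED     15700'
$pattern = '(TCP|UDP)\W*(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}):(\d{1,5})\W*(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}):(\d{1,5})\W*(LISTENING|ESTABLISHED)\W*(\d{1,5})'

$groups = @()
if ($line -match $pattern)
{
    # $Matches[0] is the whole match; entries 1 to 7 are the capture groups.
    $groups = 1..7 | ForEach-Object { "{0}: {1}" -f $_, $Matches[$_] }
}
$groups
```

Group 1 should come back as the protocol and group 7 as the PID, which is the numbering the switch block relies on.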

… but we have an issue: we are missing the important process name information.  As I wrote this switch block I realised how complicated this gets quite quickly, but also what a great example this binary is for parsing output.  The first snag we hit is that the process names are on a new line.  Luckily the output is ordered, so we can simply tag those process names onto the object we created for the line directly before them in the output.  As I wrote the regexes for this (trial and error and trial and error) I realised that there are only two connection line formats to accommodate here, both of which are shown in the sample above.  Sometimes we have a foreign address and a state and sometimes we do not.  With this in mind, we can write our switch block in an order so that the first matching pattern instructs PowerShell to continue to the next line and therefore not act on any further matches, much like a firewall rule set or a routing lookup.  Let’s look at the code (this works by the way so feel free to use it).

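[System.Collections.ArrayList] $output = @()
switch -regex (& netstat -abno)
{
    # Junk: blank lines, the "Active Connections" banner, ownership warnings and the column header.
    "^\W*$" {continue}
    "^\W*Act.*" {continue}
    "Can not obtain" {continue}
    "Proto" {continue}
    # Lines with a foreign address and a state.
    "(TCP|UDP)\W*(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}):(\d{1,5})\W*(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}):(\d{1,5})\W*(LISTENING|ESTABLISHED|TIME_WAIT|CLOSE_WAIT)\W*(\d{1,5})"
    {
        $output.Add((New-Object -TypeName PSObject -Property @{
            "Protocol" = $Matches[1];
            "LocalIP" = $Matches[2];
            "LocalPort" = $Matches[3];
            "RemoteIP" = $Matches[4];
            "RemotePort" = $Matches[5];
            "State" = $Matches[6];
            "ProcessID" = $Matches[7];
            "ProcessName" = $false;
            "ProcessName2" = $false;
        })) | Out-Null; continue
    }
    # Lines without a foreign address or state; the \W* between port and PID skips the "*:*" column.
    "(TCP|UDP)\W*(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}):(\d{1,5})\W*(\d{1,5})"
    {
        $output.Add((New-Object -TypeName PSObject -Property @{
            "Protocol" = $Matches[1];
            "LocalIP" = $Matches[2];
            "LocalPort" = $Matches[3];
            "RemoteIP" = $false;
            "RemotePort" = $false;
            "State" = $false;
            "ProcessID" = $Matches[4];
            "ProcessName" = $false;
            "ProcessName2" = $false;
        })) | Out-Null; continue
    }
    # Executable name in square brackets: attach it to the most recent object.
    # Note: IPv6 rows such as [::]:135 are not covered by the IPv4 patterns above and would land here.
    "\W*\[([^\]]*)\].*" {$output[$output.Count-1].ProcessName = $Matches[1]; continue}
    # Anything else is taken to be a service or component name for the most recent object.
    "\W*(.*)\W*" {$output[$output.Count-1].ProcessName2 = $Matches[1]; continue}
    # Lines no pattern has claimed are written out so a new rule can be added for them.
    default {$_}
}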

So what is this doing?  Well, it looks more complicated than it is.  This code iterates through the output of netstat -abno and then (in this order):
* Discards some lines we know are junk by continuing to the next line.
* Matches and parses out the lines that fit the two connection regexes.  It uses the regex groups to build a new PSObject with the relevant properties, including placeholders for fields not present.
* If nothing has matched so far, we can assume we are looking at process names, so we extract the name and add it as the ProcessName or ProcessName2 property of the last object in the list (that’s what the count is for).
* Finally, the default case runs for any line that no earlier pattern has matched and returns the line to the screen (so we can write an extra regex rule for it).

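That default rule should be there from the start, as it is what returns unmatched lines back to the screen.  You can use this as you regex more and more lines until you get nothing back, at which point you are done 🙂  Note that the final process name pattern matches any remaining line, so add it last: once it is in place nothing reaches the default rule.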
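
To illustrate that workflow, here is a small sketch with deliberately incomplete rules run against three of the sample lines from earlier.  The simplified patterns and the variable names ($sample, $unmatched) are assumptions for the demo, not the rules from the full script:

```powershell
$sample = @(
    '  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING       1188',
    '  RpcSs',
    ' [svchost.exe]'
)

# Only two rules so far; anything they do not claim falls through to default.
$unmatched = switch -regex ($sample)
{
    '^\s*(TCP|UDP)\s'    {continue}
    '^\s*\[[^\]]+\]\s*$' {continue}
    default {$_}
}
$unmatched
```

Only the RpcSs line should come back, which tells us the next rule to write is one for those service name lines.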

So what’s the output like?  It’s a list of our custom objects which we can manipulate at our leisure 🙂
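
As a rough idea of what that manipulation can look like, the sketch below builds a few objects by hand with the same property names the parser uses (the values are taken from the sample lines, and the hand-built list is only a stand-in for the real $output), then filters and sorts them.  The captured values are strings, hence the [int] cast when sorting by port:

```powershell
# Hand-built stand-in for the parsed list, using values from the sample output.
$output = @(
    [PSCustomObject]@{ Protocol = 'TCP'; LocalIP = '127.0.0.1'; LocalPort = '4014'; State = 'ESTABLISHED'; ProcessID = '15700'; ProcessName = 'NVIDIA Share.exe' }
    [PSCustomObject]@{ Protocol = 'TCP'; LocalIP = '0.0.0.0'; LocalPort = '445'; State = 'LISTENING'; ProcessID = '4'; ProcessName = $false }
    [PSCustomObject]@{ Protocol = 'TCP'; LocalIP = '0.0.0.0'; LocalPort = '135'; State = 'LISTENING'; ProcessID = '1188'; ProcessName = 'svchost.exe' }
)

# Listening ports only, ordered numerically by port.
$listening = $output | Where-Object { $_.State -eq 'LISTENING' } | Sort-Object { [int]$_.LocalPort }
$listening | Format-Table Protocol, LocalIP, LocalPort, ProcessID, ProcessName
```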

We can now get really fancy by letting PowerShell do some leg work for us.  How about looking for connections where the owning process does not have a path containing system32 or program?  These might be worth a look, right?
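
One way this could be done is to look up each PID with Get-Process and test its Path property.  This is only a sketch: the single hand-built entry (pointing at the current PowerShell session via $PID) stands in for the parsed list so the snippet runs on its own, and $suspects is a made-up variable name.

```powershell
# Stand-in for the parsed list; in practice $output comes from the switch block.
$output = @([PSCustomObject]@{ Protocol = 'TCP'; LocalPort = '4014'; ProcessID = "$PID" })

$suspects = foreach ($connection in $output)
{
    $process = Get-Process -Id $connection.ProcessID -ErrorAction SilentlyContinue
    # Skip entries whose process has gone away or whose path cannot be read.
    if (-not $process -or -not $process.Path) { continue }
    if ($process.Path -notmatch 'system32|program')
    {
        [PSCustomObject]@{
            LocalPort = $connection.LocalPort
            ProcessID = $connection.ProcessID
            Path      = $process.Path
        }
    }
}
$suspects
```

Processes whose path cannot be read (PID 4, or anything you lack rights to inspect) are skipped here rather than flagged, so run it elevated if you want fuller coverage.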

From here, we can do more digging.  What is the OneDrive process path?  Can we obtain process information for those PIDs?
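
Both questions can be answered with Get-Process.  A minimal sketch, where $PID (the current session) stands in for a ProcessID value taken from the parsed list:

```powershell
# Path of any running OneDrive process (returns nothing if it is not running).
Get-Process -Name 'OneDrive' -ErrorAction SilentlyContinue | Select-Object Id, ProcessName, Path

# Details for a single PID.
$details = Get-Process -Id $PID | Select-Object Id, ProcessName, Path, StartTime
$details
```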

That’s it folks.  Hopefully you can see how powerful it is now we have extracted all that information from that horrible output.  I have used this to parse everything from VSSadmin to the free capacity of a storage array from vendor CLI tools.  Happy hacking 🙂


Simon is a sysadmin for a global financial organisation and specialises in Windows, security and automation.
