Piping and the Pipeline

It’s inevitable: no sooner do you get Windows PowerShell installed than you start hearing and reading about “piping” and “the pipeline.” In turn, that leads to two questions: 1) What’s a pipeline? and 2) Do I even need to know about piping and pipelines?

Let’s answer the second question first: Do you even need to know about piping and pipelines? Yes, you do. There’s no doubt that you can write Windows PowerShell scripts without using pipelines. The question, however, is whether you’d want to write PowerShell scripts without using pipelines. You can order a banana split and ask them to hold the ice cream; that’s fine, but at that point you don’t really have a banana split, do you? The same is true of the PowerShell pipeline: you can write scripts without a pipeline. But, at that point, do you really have a PowerShell script?

Note. OK, we’re exaggerating a little bit: some scripts don’t need a pipeline. In general, however, that’s not going to be true of longer and more complicated PowerShell scripts.

Assembling a Pipeline

So then what is piping and the pipeline? To begin with, in some ways the term “pipeline” can be considered something of a misnomer. Suppose you have an oil pipeline, like the Alaska Pipeline. You put oil in one end of the pipeline; what do you suppose comes out the other end? You got it: oil. And that’s the way a pipeline is supposed to work: it’s just a way of moving something, unchanged, from one place to another. But that’s not the way the pipeline works in Windows PowerShell.

Note. There’s some debate about this. For example, when you “pipe” an object from one part of a command to another, you’re simply passing that object – unchanged – from one part of the command to another. There have been some interesting discussions around the scripting water cooler about this very topic that involve Kool-Aid and waste management facilities (don’t ask), as well as mentions of marketing pipelines and other industry-specific terminology, but we won’t confuse you with all that. You’re probably confused enough already, and since our goal is to un-confuse you, we’ll move on.

Instead, think of the Windows PowerShell pipeline as being more like an assembly line. With an assembly line you start with a particular thing; for example, you start with a car. However, you don’t start with a finished car; instead, you start with a sort of framework that will eventually be a car. As the car travels down the assembly line it passes through various stations; at each station workers make some sort of modification: welding on doors, adding windows, installing seats. When you’re all done you’ll still have a car; you won’t have a tube of toothpaste or a barrel of oil. But thanks to all the changes that were made along the way, you’ll have a very different car than the “car” you started with.

Start with a Cmdlet

A similar process takes place when you use the pipeline in Windows PowerShell. For example, suppose there happened to be a cmdlet named Get-Shapes; when you run this cmdlet it returns a collection of all the geometric shapes found on your computer. To call this hypothetical cmdlet you use a command similar to this:

Get-Shapes

In return, you get back a collection like this one:

The Filtering Station

That’s pretty cool – except for one thing. As it turns out, we’re only interested in the orange shapes. Unfortunately, though, our hypothetical Get-Shapes cmdlet doesn’t allow us to filter out items that fail to meet specified criteria. Oh, well; guess we’re out of luck, right?

Right.

No, wait, we mean wrong. Granted, Get-Shapes doesn’t know how to filter out unwanted items. But that’s not a problem, because PowerShell’s Where-Object cmdlet does know how to filter out unwanted items. Because of that, all we have to do is use Get-Shapes to retrieve all the shapes, then hand that collection of shapes over to Where-Object and let it filter out everything but the orange shapes. In other words:

Get-Shapes | Where-Object {$_.Color -eq "Orange"}

Don’t worry about the syntax of the Where-Object cmdlet for now; for an overview of using Where-Object you can see this article. The important thing to note for now is the pipe separator character (|) that separates our two commands (Get-Shapes and Where-Object). When we use the pipeline in Windows PowerShell that typically means that we use a cmdlet to retrieve a collection of objects. However, we don’t do anything with those objects, at least not right away. Instead, we hand that collection over to a second cmdlet, one that does some further processing (filtering, grouping, sorting, etc.). That’s what the pipeline is for.

And in our hypothetical example, the pipeline provides a way for us to filter out everything except the orange-colored shapes:

The Sorting Station

That’s cool, but what’s even cooler is the fact that you aren’t limited to just two stations on your assembly line. For example, suppose we want to sort the orange shapes by size. Where-Object doesn’t know how to sort things. But Sort-Object does:

Get-Shapes | Where-Object {$_.Color -eq "Orange"} | Sort-Object Size

Does this really work? Of course it does:
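If you’d like to see it for yourself, here’s a minimal sketch. Remember that Get-Shapes doesn’t really exist, so the function below is a hypothetical stand-in that we made up, along with its sample shapes, colors, and sizes. The pipeline on the last lines is the same one shown above.

```powershell
# Hypothetical stand-in for the imaginary Get-Shapes cmdlet.
# The shapes, colors, and sizes are made-up sample data.
function Get-Shapes {
    $shapes = @(
        @{ Name = 'Circle';   Color = 'Orange'; Size = 3 },
        @{ Name = 'Square';   Color = 'Blue';   Size = 1 },
        @{ Name = 'Triangle'; Color = 'Orange'; Size = 1 },
        @{ Name = 'Star';     Color = 'Green';  Size = 2 },
        @{ Name = 'Hexagon';  Color = 'Orange'; Size = 2 }
    )

    # Emit one object per shape so later stations can read Color and Size.
    foreach ($shape in $shapes) {
        New-Object PSObject -Property $shape
    }
}

# Station 1 retrieves, station 2 filters, station 3 sorts.
$orangeBySize = Get-Shapes | Where-Object { $_.Color -eq 'Orange' } | Sort-Object Size
$orangeBySize | Format-Table Name, Color, Size
```

With this made-up data the table lists Triangle, Hexagon, and Circle, in that order: only the orange shapes, smallest first.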

A Real-Life Pipeline

Here’s a somewhat more practical use of the PowerShell pipeline. The command we’re about to show you uses the Get-ChildItem cmdlet to retrieve a list of all the items found in the folder C:\Scripts. The command then hands that collection over to the Where-Object cmdlet; in turn, Where-Object filters out any item (file or folder) that is 200 KB or smaller. After it finishes filtering, Where-Object hands the remaining items over to the Sort-Object cmdlet, which sorts those items by file size.

The command itself looks like this:

Get-ChildItem C:\Scripts | Where-Object {$_.Length -gt 200KB} | Sort-Object Length

And when we run the command we get back something along these lines:

Directory: Microsoft.PowerShell.Core\FileSystem::C:\Scripts

Mode                LastWriteTime     Length Name
----                -------------     ------ ----
-a---         2/19/2007   7:42 PM     266240 scores.mdb
-a---         5/19/2007   9:23 AM     328620 wordlist.txt
-a---         12/1/2002   3:35 AM     333432 6of12.txt
-a---         5/18/2007   8:12 AM     708608 test.mdb

That’s pretty slick, but some of you seem a little skeptical. “OK, that is nice, but it’s not that big of a deal,” you say. “After all, if I write a WMI query I can do filtering right in my query. And if I write an ADSI script I can add a filter that limits my collection to, say, user accounts. I’m already doing all this stuff.”

Depending on how you want to look at it, that’s true; after all, you can use filtering in either a WMI or an ADSI script. However, the approach used when writing a filter in WMI is typically very different from the approach used when writing a filter in ADSI. In turn, writing an ADSI filter is different from writing a filter using the FileSystemObject. The advantage to Windows PowerShell, and to using the pipeline, is that it doesn’t matter what kind of data or what kind of object you’re working with; you just hand everything off to Where-Object and let Where-Object take care of everything.
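To make that concrete, here’s a rough sketch that hands four very different kinds of data to Where-Object. The variable names, the 50 MB and 1 KB thresholds, and the sample strings are our own choices for illustration; the only thing that changes from one line to the next is the test inside the braces.

```powershell
# Numbers: keep only the values greater than 7.
$bigNumbers = 1..10 | Where-Object { $_ -gt 7 }

# Strings: keep only the names that start with the letter S.
$sNames = 'Sort-Object', 'Where-Object', 'Select-Object' | Where-Object { $_ -like 'S*' }

# Processes: keep only the processes using more than 50 MB of memory.
$bigProcesses = Get-Process | Where-Object { $_.WorkingSet64 -gt 50MB }

# Files: keep only the items in the current folder that are larger than 1 KB.
$bigFiles = Get-ChildItem | Where-Object { $_.Length -gt 1KB }

# Display two of the filtered collections.
$bigNumbers
$sNames
```

The last two lines display 8, 9, and 10, followed by Sort-Object and Select-Object. What ends up in $bigProcesses and $bigFiles depends on your computer.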

Or take sorting, to name another commonly-used operation. If you’re doing a database query (including ADO queries against Active Directory) you don’t need a pipeline; you can specify sort options as part of the query. But what if you’re doing a WQL query against a WMI class? That’s a problem: WQL doesn’t allow you to specify sort options. If you’re a VBScripter that means you have to do something crazy, like write your own sort function or rely on a workaround like disconnected recordsets, just so you can do something as seemingly-simple as sorting data.

Is that the case in PowerShell? You already know the answer to that, don’t you? Of course that’s not the case; in PowerShell you just pipe your data to the Sort-Object cmdlet, sit back, and relax. For example, say you want to retrieve information about the services running on a computer, then sort the returned collection by service status (running, stopped, etc.). Okey-doke:

Get-Service | Sort-Object Status | Format-Table

Note. You might note that as a bonus we took the sorted data and piped it to the Format-Table cmdlet; that means the final onscreen display ends up as a table rather than a list.
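The same pattern works for other kinds of data and other sort orders. As a sketch, the command below swaps in Get-Process, sorts by memory use with the largest first (the Descending switch), and keeps only the top five with Select-Object. The choice of WorkingSet64 as the sort property and five as the cutoff are our own assumptions for the example.

```powershell
# Sort processes by memory use, largest first, and keep the top five.
$top = Get-Process | Sort-Object WorkingSet64 -Descending | Select-Object -First 5

# Show just the columns we care about, as a table.
$top | Format-Table Name, Id, WorkingSet64
```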

Don’t Get Carried Away

Yes, this is easy, isn’t it? In fact, about the only time you’ll ever run into a problem is if you get carried away and try pipelining everything. Remember, you can’t pipeline something unless it makes sense to use a pipeline. It makes sense to pipeline service information to Sort-Object; after all, Sort-Object can pretty much sort anything. It also makes sense to pipe the sorted information to Format-Table; after all, Format-Table can take pretty much any information and display it as a table.

But consider this command:

Sort-Object | Get-Process

What’s this command going to do? Absolutely nothing. Nor should we expect it to do anything. After all, Sort-Object is designed to sort information and there’s nothing here to sort. (Incidentally, that’s a hint that Sort-Object should typically appear on the right-hand side of a pipeline. You need to first grab some information and then sort that information.)

Note. Are there exceptions to this rule? Sort of. For example, suppose you have a variable $a that contains a collection of data. Sort-Object has an InputObject parameter, so you can hand it that data, and sidestep the pipeline altogether, by using a command like this:

Sort-Object -InputObject $a

Be warned, though: when a whole collection is passed through InputObject, Sort-Object treats it as a single object and hands it back unsorted. To actually sort the contents of $a you still need to pipe it to Sort-Object. Someday you might have to use an approach similar to this; as a beginner, however, you shouldn’t worry about it. Instead, you should get into the habit of using piping and the pipeline. Learn the rules first; later on, there will be plenty of time to learn the exceptions.

But even if there was something for Sort-Object to sort, this command still wouldn’t make much sense. After all, the Get-Process cmdlet is designed to retrieve information about the processes running on a computer; what exactly would Get-Process do with any sorted information handed over the pipeline? For the most part, you first acquire something (a collection, an object, whatever) and then hand that data over the pipeline. The cmdlet on the right-hand side of the pipeline will then proceed to do some additional processing and formatting of the items handed to it.

As we implied above, when you do hand data over the pipeline make sure there’s a cmdlet waiting for it on the other side. The more you use PowerShell the more you’re going to be tempted to do something like this:

$a = Get-Process | $a

Admittedly, that looks OK – it looks like you want to assign the output of Get-Process to the variable $a, then display $a. However, it’s not going to work; instead you’re going to get an error message similar to this:

Expressions are only permitted as the first element of a pipeline.
At line:1 char:21
+ $a = Get-Process | $a <<<<

We’ll concede that this can be a difficult distinction to make, but pipelines are used to string multiple commands into a single command, with data being passed from one portion of the pipeline to the next. Furthermore, as that data gets passed from one section to another it gets transformed in some way: filtered, sorted, grouped, formatted, whatever. In the invalid command we just showed you, we’re not passing any data. We’ve really got two totally separate commands here: we want to use Get-Process to return information about the processes running on a computer and then, without transforming that data in any way, we want to display the information. Because we really have two independent commands, we need two lines of code:

$a = Get-Process
$a

All right, if you’re bound and determined to do this all on a single line of code, separate the commands using a semicolon rather than the pipe separator:

$a = Get-Process; $a

But this isn’t pipelining; this is just putting multiple commands on one line.

Bonus Tip

OK, but suppose you wanted to get process information, sort that information by process ID, and then – instead of displaying that information – store the data in a variable named $a. Can you do that? Yes you can, just like this:

$a = (Get-Process | Sort-Object ID)

What we’re doing here is assigning a value to $a. Which value are we assigning it? Well, we’re assigning it the value we get back when we call the Get-Process cmdlet and then pipe the returned information to Sort-Object. The parentheses around our Get-Process/Sort-Object command make the order explicit: any time PowerShell parses a command, it carries out the instructions enclosed in parentheses before it does anything else. In this case, that means PowerShell first gets and sorts process information, then assigns that data to $a. (As it happens, an assignment captures the output of the entire pipeline to its right, so this particular command also works without the parentheses.) Display the value of $a and see for yourself.
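One way to check, sketched below, is to pull the process IDs back out of $a and look at the first few. The $ids variable is just a helper name we picked for the example.

```powershell
# Get the processes, sort them by ID, and store the sorted collection in $a.
$a = (Get-Process | Sort-Object Id)

# Pull out just the IDs, in the order they are stored in $a.
$ids = $a | ForEach-Object { $_.Id }

# Show the first five; they should appear from lowest to highest.
$ids | Select-Object -First 5
```

The exact numbers depend on what happens to be running on your computer, but they should climb steadily from one line to the next.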

But if you’re a beginner, don’t worry too much about this bonus example. Get used to using pipelines in the “traditional” way, then come back here and start playing around with parentheses.