Filtering Get-ChildItem Output with Regular Expressions

In my last article I produced a skeleton application that had a bit more than the Empty ASP.NET project but much less than the Starter ASP.NET application – a sort of halfway house. It had the scaffolding for Bower, Gulp and NPM, plus ECMAScript 6 transpiling, ESLint and Less all wired up.

I’m currently writing a short PowerShell cmdlet to clone that project into a new project. Instead of using the Visual Studio New -> Project… workflow, I’m going to go into PowerShell and run Clone-VSProject (my new cmdlet), then import the project into Visual Studio. This will save me time because I don’t have to spend the first half hour of the project wiring up the bits I need.

The first problem I ran into was cloning the project. I had thought to do the following:

Copy-Item -Path $src -Destination $dest `
    -Recurse -Exclude "node_modules","bower_components"

However, there is a bug that prevents -Exclude from working with -Recurse in this manner. Instead, I have to do it the long way around. First, let’s define what I want to copy:

$Source = ".\BaseAspNetApplication"
$Destination = ".\MyNewProject"
$Exclude = @( "node_modules", "bower_components" )

To get a list of files, use Get-ChildItem -Path $Source -Recurse. This has two issues. First, it doesn’t exclude anything (more on that in a minute). Second, it doesn’t handle overly long paths and prints an error for each one instead. Let’s get rid of the errors first: the offending paths are going to be under node_modules or bower_components anyway, and we don’t want those.

Get-ChildItem -Path $Source -Recurse -ErrorAction SilentlyContinue

Now for the filtering. First, I need to turn my exclude list into a regular expression. You can write it by hand if you like, but it’s easy enough to generate:

$regexp = "("+(($Exclude | %{ "\\$_\\" }) -join "|") + ")"

This will turn our exclude list into this:

$regexp = "(\\node_modules\\|\\bower_components\\)"

The way to read this is “contains either \node_modules\ or \bower_components\”.
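One caveat with building the pattern by plain string concatenation: a folder name that contains a regex metacharacter (a dot, for instance) would be interpreted as regex syntax rather than literal text. The following is a minimal sketch of a more defensive variant using [regex]::Escape. The ".vs" entry and the sample paths are made up purely for illustration.

```powershell
# Hypothetical exclude list; ".vs" is added here only because its dot is a regex metacharacter.
$Exclude = @( "node_modules", "bower_components", ".vs" )

# Escape the separator and each name so every character is matched literally.
$sep = [regex]::Escape("\")
$regexp = "(" + (($Exclude | ForEach-Object { $sep + [regex]::Escape($_) + $sep }) -join "|") + ")"

# Made-up relative paths to try the pattern against.
$samples = @(
    "\node_modules\foo.js",
    "\src\app.js",
    "\.vs\config\applicationhost.config",
    "\avs\notes.txt"
)

# Keep only the paths that do not contain an excluded folder.
$kept = $samples | Where-Object { $_ -notmatch $regexp }
$kept
```

Only \src\app.js and \avs\notes.txt survive the filter. Without the escaping, the unescaped dot in ".vs" would also match the "a" in \avs\ and wrongly exclude that file.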

Now, back to getting the child items:

$Source = Resolve-Path -Path $Source
Get-ChildItem -Path $Source.Path -Recurse -ErrorAction SilentlyContinue | Where-Object {
    $_.FullName.Substring($Source.Path.Length) -notmatch $regexp }

Take a look at the script block for a moment. $_ is filled in with each item that Get-ChildItem produces. $_.FullName is the full path to that item (so something like C:\Users\Adrian\Source\GitHub\blog-code\BaseAspNetApplication\node_modules\foo.js). $Source.Path is the full path to the BaseAspNetApplication directory. Taking the substring from that length onwards is an effective way of saying “strip off the source path”. What is left is something like \node_modules\foo.js. I then match that against my regular expression and only pass the object on if it does not match.
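To make the substring step concrete, here is a small sketch using plain strings in place of the real objects. The paths are hypothetical, and $root and $full simply stand in for $Source.Path and $_.FullName.

```powershell
# Hypothetical stand-ins for $Source.Path and $_.FullName.
$root = "C:\Code\BaseAspNetApplication"
$full = "C:\Code\BaseAspNetApplication\node_modules\foo.js"

# Strip the source path off the front; the leading backslash is kept.
$relative = $full.Substring($root.Length)    # \node_modules\foo.js

$regexp = "(\\node_modules\\|\\bower_components\\)"
$fileKept = $relative -notmatch $regexp      # False: the file is filtered out

# The directory entry itself has no trailing backslash, so it passes this filter.
$dirKept = "\node_modules" -notmatch $regexp # True

# One possible tightening (an assumption, not part of the original script):
# also match when the folder name is the last path segment.
$strict = "\\(node_modules|bower_components)(\\|$)"
$dirKeptStrict = "\node_modules" -notmatch $strict   # False
```

The point of the last few lines is that the excluded folders' own directory entries slip through the original pattern even though their contents do not. Whether that matters depends on what is done with the results afterwards.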

This will print out just the files we need and nothing more. I now need to copy them to their new location, so I capture that output in a variable ($files) instead of printing it. This brings me to my next problem: if you just pipe these to Copy-Item, they all get placed directly in the destination and the hierarchy is destroyed. I need to reconstruct the directory structure within the new directory:

$files | ForEach-Object {
    Copy-Item -Path $_.FullName -Destination (Join-Path $Destination $_.FullName.Substring($Source.Path.Length))
}

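I use the same trick for the copying that I used for the filtering to get the relative path, but then I join it to the destination, which gives me the full target path. The directories themselves are part of the Get-ChildItem output and come through the pipeline ahead of the files they contain, so Copy-Item creates each directory before anything is copied into it, and the directory structure is rebuilt along the way.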
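Putting the pieces together, the whole filter-and-copy step might be wrapped up roughly as follows. This is only a sketch under a few assumptions: the function name Copy-FilteredTree is hypothetical (it is not the Clone-VSProject cmdlet), it copies files only (so empty directories are not recreated), and it creates each parent directory explicitly with New-Item rather than relying on the order in which items arrive.

```powershell
# Hypothetical helper: copy a directory tree, skipping any folder named in $Exclude.
function Copy-FilteredTree {
    param(
        [Parameter(Mandatory)] [string]   $Source,
        [Parameter(Mandatory)] [string]   $Destination,
        [Parameter(Mandatory)] [string[]] $Exclude
    )

    # Full path of the source, without a trailing separator.
    $root = (Resolve-Path -Path $Source).Path.TrimEnd('\', '/')

    # Build a pattern that matches an excluded name as a whole path segment.
    $sep    = [regex]::Escape([string][System.IO.Path]::DirectorySeparatorChar)
    $names  = ($Exclude | ForEach-Object { [regex]::Escape($_) }) -join "|"
    $regexp = $sep + "(" + $names + ")(" + $sep + '|$)'

    Get-ChildItem -Path $root -Recurse -File -ErrorAction SilentlyContinue |
        Where-Object { $_.FullName.Substring($root.Length) -notmatch $regexp } |
        ForEach-Object {
            # Same relative-path trick as before, joined onto the destination.
            $target = Join-Path $Destination $_.FullName.Substring($root.Length)
            $parent = Split-Path -Path $target -Parent

            # Make sure the target directory exists before copying into it.
            if (-not (Test-Path -LiteralPath $parent)) {
                New-Item -ItemType Directory -Path $parent -Force | Out-Null
            }
            Copy-Item -LiteralPath $_.FullName -Destination $target
        }
}
```

With the variables defined earlier, it would be called as Copy-FilteredTree -Source $Source -Destination $Destination -Exclude $Exclude.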

This is only the first part of the Clone-VSProject cmdlet I am producing. In the next part, I have to alter all the references to BaseAspNetApplication to my new project name. But that’s the subject of another article.