Sitecore PowerShell Extensions (SPE) is absolutely one of my favorite Sitecore modules (I'll never be able to thank Adam Najmanowicz and Michael West enough). I routinely use it for creating reports and querying data out of the content tree. In a few cases, I've automated reports via scheduled tasks and written the output to CSV files on disk.
Most reports I write in PowerShell end up returning lots of items - sometimes hundreds or thousands of items. One client's content tree has roughly 200,000 items, and it's not uncommon for me to report on 2000 to 4000 items at a time. What's the best way of loading and filtering on thousands of items in SPE?
There are three common methods of getting items from the Sitecore content tree, each with pros and cons, especially where performance is concerned.
Get-ChildItem: Filter & Iterate
This is the most direct and obvious way to get items in SPE. For example:
# Get items by path; don't forget to recurse
$allItems = Get-ChildItem -Path 'master://sitecore/content/sites/spark/homepage' -Recurse
The -Recurse parameter is important to call out: without it you only get the direct children of the parent item, while with it you get every descendant under that item.
With this list of items, you can iterate over all items and pick out what you need:
$allItems | ForEach-Object {
    # If the current item has a certain template, print out its path in the tree
    if ($_.TemplateName -eq 'SitePage') {
        Write-Host $_.FullPath
    }
}
Alternatively, the Where-Object cmdlet does something similar with less code:
$filteredItems = $allItems | Where-Object { $_.TemplateName -eq 'SitePage' }
And to simplify the loop even further, iterate over the filtered results with the foreach statement:
foreach ($item in $filteredItems) {
    Write-Host $item.FullPath
}
Pros
Quick code to write
Easy to understand and parse
Cons
Iteration with ForEach-Object is slow
Filtering with Where-Object is also slow and very verbose
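If you want to see how much the iteration style matters on your own data, Measure-Command is a quick way to check. The following is only a rough sketch: $sampleItems is a made-up stand-in for real Sitecore items (so it runs in any PowerShell session), and the 5,000 count and the 'Folder' template name are arbitrary assumptions. Timings will vary by machine, so treat the numbers as relative, not absolute.

```powershell
# Stand-in data (hypothetical): 5,000 fake "items" so this runs outside Sitecore.
# In SPE, swap this for the $allItems returned by Get-ChildItem.
$sampleItems = foreach ($i in 1..5000) {
    [pscustomobject]@{
        Name         = "Item$i"
        TemplateName = if ($i % 4 -eq 0) { 'SitePage' } else { 'Folder' }
    }
}

# Approach 1: filter through the pipeline with Where-Object
$pipelineTime = Measure-Command {
    $viaPipeline = $sampleItems | Where-Object { $_.TemplateName -eq 'SitePage' }
}

# Approach 2: filter with the foreach statement (no pipeline involved)
$statementTime = Measure-Command {
    $viaStatement = foreach ($item in $sampleItems) {
        if ($item.TemplateName -eq 'SitePage') { $item }
    }
}

# Report elapsed time for each approach and confirm both found the same number of items
Write-Host ("Where-Object: {0:N0} ms, foreach: {1:N0} ms" -f $pipelineTime.TotalMilliseconds, $statementTime.TotalMilliseconds)
Write-Host "Matches: $($viaPipeline.Count) vs $($viaStatement.Count)"
```

Measure-Command runs its script block in the current scope, which is why $viaPipeline and $viaStatement are still available after the timing calls.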
Get-Item: Sitecore Query
Sitecore's tried-and-true query methods provide a decent way of getting items from the content tree by using Sitecore Query Language. Using the previous scenario:
# Get items of a certain template using a query
$allItems = Get-Item -Path master:// -Query "/sitecore/content/sites/spark/homepage//*[@@TemplateName = 'SitePage']"
This is generally performant for hundreds of items, but it scales badly once 1000 or so items need to be parsed. For my content tree of 200,000 items, I might as well make and eat lunch while I wait for results.
Pros
Straightforward and concise code
Generally performant when target tree node is under 1000 items
Cons
The default Sitecore query limit (usually 100 or 260, depending on the version of Sitecore) applies here, so result sets are limited
Raising the query limit will only hurt performance
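Because a capped result set looks exactly like a complete one, it can help to compare the number of results against the configured limit. Here is a minimal sketch of that check, assuming it runs inside an SPE session and that the limit lives in the Query.MaxItems setting; the fallback value of 100 is only a guess at the default for your version.

```powershell
# Read the configured cap on query results; 100 is an assumed fallback
$maxItems = [Sitecore.Configuration.Settings]::GetIntSetting('Query.MaxItems', 100)

# Same query as above, wrapped in @() so a single result still has a Count
$results = @(Get-Item -Path master:// -Query "/sitecore/content/sites/spark/homepage//*[@@TemplateName = 'SitePage']")

# Hitting the cap exactly is a strong hint that the results were cut off
if ($maxItems -gt 0 -and $results.Count -ge $maxItems) {
    Write-Host "Got $($results.Count) items, which matches the query limit of $maxItems; the result set is probably truncated."
} else {
    Write-Host "Got $($results.Count) items (query limit: $maxItems)."
}
```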
Find-Item: Search All the Things
Both the Get-ChildItem and Get-Item cmdlets read directly from the Sitecore content databases, which is a major factor in their performance issues with large item counts. By utilizing search indexes (via Lucene, Solr, Azure Search, etc.), the Sitecore Content Search API is much, much faster. This is where the Find-Item cmdlet comes into play:
# Search the master index for items whose template name matches the value
$results = Find-Item -Index sitecore_master_index -Criteria @{
    Filter = "Equals";
    Field = "_templatename";
    Value = "SiteBase"
}
Pros
Very performant, especially compared to Get-ChildItem iteration and Get-Item query
Utilizes search indexes; doesn't run expensive queries against the database
Cons
Indexes must be up-to-date to get accurate results
Search term must be indexed in a standard or computed field
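To mirror the earlier scenario (one template, under one branch of the tree), multiple criteria can be combined. The following is a sketch only: it assumes an SPE session, that your version of Find-Item supports the DescendantOf filter with an item as its value, and that the index name matches your environment. Check the Find-Item help in your SPE version before relying on it.

```powershell
# Root item used to scope the search (path reused from the earlier examples)
$root = Get-Item -Path 'master://sitecore/content/sites/spark/homepage'

# All criteria must match: template name AND location under the root item
$criteria = @(
    @{ Filter = 'Equals'; Field = '_templatename'; Value = 'SitePage' },
    @{ Filter = 'DescendantOf'; Value = $root }
)

# Find-Item returns search results from the index, not full Sitecore items;
# Initialize-Item converts them for when item fields or methods are needed
$items = Find-Item -Index sitecore_master_index -Criteria $criteria | Initialize-Item

Write-Host "Found $(@($items).Count) items"
```

Converting every result with Initialize-Item means loading each item from the database, so for large reports it may be worth skipping that step when the indexed fields already contain what you need.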
Bonus Round: Using the Links Database
If you are looking for hundreds of items that are related to one item (a one-to-many relationship), there is another option that I have turned to from time to time. The Sitecore Links Database becomes a great option for SPE via the Get-ItemReferrer cmdlet:
# Get template item
$templateItem = Get-Item -Path 'master://sitecore/templates/sites/spark/SitePage'
# Get referrers of the template item
$allItems = $templateItem | Get-ItemReferrer
The above code gives us every item that references the 'SitePage' template, which includes all items that utilize it.
Note that Get-ItemReferrer will return every item that is internally linked to the source item, not just the items based on the template, so you'll have to filter out some additional cruft (and we're back to Where-Object). The payoff: using the links database saves a lot of time and is fairly performant (it's certainly better than iterating over a list of thousands of items).
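One way to trim that cruft is to keep only the referrers whose own template is the template item. This is a rough sketch, assuming an SPE session; excluding the __Standard Values item by name is my own assumption about what you'd want out of the report.

```powershell
# Template item from the example above
$templateItem = Get-Item -Path 'master://sitecore/templates/sites/spark/SitePage'

# Keep only referrers that are built on this template. This drops items that
# merely link to it (for example, templates that inherit from it). The
# template's own standard values item is excluded by name.
$pageItems = $templateItem |
    Get-ItemReferrer |
    Where-Object { $_.TemplateID -eq $templateItem.ID -and $_.Name -ne '__Standard Values' }

Write-Host "Found $(@($pageItems).Count) items based on $($templateItem.Name)"
```

The Where-Object pass here only runs over the referrers, not the whole tree, so it stays cheap compared to filtering a full recursive Get-ChildItem.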
One final note: the Links Database must be up-to-date for this to be a reliable way of loading and reporting on content items. I'm calling this out, because I've seen my fair share of issues in Sitecore because of outdated Links Databases.