How can I make PowerShell parse XML faster or further optimize my script?

movielover85

I have a setup that contains 7 million XML files, varying in size from a few KB to multiple MB. All in all, it's about 180GB of XML files. The job I need performed is to analyze each XML file and determine whether it contains the string \[code\]<ref>\[/code\]. If it does not, the file should be moved out of the Chunk folder it is currently in and into the Referenceless folder.

The script I have created works well enough, but it's extremely slow for my purposes. At a rate of about 3 files per second, it's slated to finish analyzing all 7 million files in about 24 days. Is there anything I can change in my script to eke out more performance?

To make matters more complicated, I do not have the permissions on my server box to run .PS1 files, so the script needs to be runnable directly from the PowerShell prompt in one command. I would set the permissions if I had the authorization to.

\[code\]
# This script will iterate through the Chunk folders, removing pages that contain no
# references and putting them into the Referenceless folder.

# Change this variable to start the program on a different chunk. This is the first
# command to be run in Windows PowerShell.
$chunknumber = 1

# This while loop is the second command to be run in Windows PowerShell. It will stop
# after completing Chunk 113.
while($chunknumber -le 113){
    # Jumps the terminal to the correct folder.
    cd C:\Wiki_Pages
    # Creates an index for the chunk being worked on.
    $items = Get-ChildItem -Path "Chunk_$chunknumber"
    echo "Chunk $chunknumber Indexed"
    # Jumps to chunk folder.
    cd C:\Wiki_Pages\Chunk_$chunknumber
    # Loops through the index. Each entry is one of the pages.
    foreach ($page in $items){
        # Creates a variable holding the page's content.
        $content = Get-Content $page
        # If the page has a reference, then it's echoed.
        if($content | Select-String "<ref>" -quiet){echo "Referenced!"}
        # If the page doesn't have a reference, it's copied to Referenceless then deleted.
        else{
            Copy-Item $page C:\Wiki_Pages\Referenceless -force
            Remove-Item $page -force
            echo "Moved to Referenceless!"
        }
    }
    # The chunk number is increased by one and the cycle continues.
    $chunknumber = $chunknumber + 1
}
\[/code\]

I have very little knowledge of PowerShell; yesterday was the first time I had ever opened the program.
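For comparison, here is a minimal sketch of one direction that might reduce the per-file overhead. It is untested on the real data set, so treat it as a starting point rather than a known fix. It assumes the same C:\Wiki_Pages layout and an already existing Referenceless folder. The function name `Move-ReferencelessPages` and its parameters are hypothetical and only there for illustration. The idea is to read each file as a single string with .NET's `File.ReadAllText` instead of piping `Get-Content` output line by line, to use one `Move-Item` in place of the copy-then-delete pair, and to skip the per-file `echo`.

```powershell
# Hypothetical helper: the name and parameters are illustrative, not from the original script.
# Assumes <Root>\Chunk_<n> folders and an existing <Root>\Referenceless folder.
function Move-ReferencelessPages {
    param(
        [string]$Root = 'C:\Wiki_Pages',
        [int]$FirstChunk = 1,
        [int]$LastChunk = 113
    )
    $dest = Join-Path $Root 'Referenceless'
    foreach ($n in $FirstChunk..$LastChunk) {
        $chunk = Join-Path $Root "Chunk_$n"
        # GetFiles returns plain path strings, collected before any file is moved.
        foreach ($path in [System.IO.Directory]::GetFiles($chunk)) {
            # Read the whole file as one string instead of one object per line.
            $text = [System.IO.File]::ReadAllText($path)
            # Case-insensitive literal search, mirroring Select-String's default matching.
            if ($text.IndexOf('<ref>', [System.StringComparison]::OrdinalIgnoreCase) -lt 0) {
                # No reference found: a single move replaces Copy-Item plus Remove-Item.
                Move-Item -LiteralPath $path -Destination $dest -Force
            }
        }
        # One progress line per chunk rather than one per file.
        Write-Host "Chunk $n done"
    }
}
```

Since .PS1 files cannot be run, the function definition can be pasted straight into the PowerShell prompt and then started with `Move-ReferencelessPages -FirstChunk 1 -LastChunk 113`. Changing `-FirstChunk` resumes from a later chunk, like `$chunknumber` in the script above.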