How to Split Large Text Files with PowerShell?

When working with large text files, it can be necessary to split them into smaller files. This is a common requirement while working with log files. PowerShell provides different commands and methods to divide these types of large files. In this tutorial, we’ll explore different methods to split large text files using PowerShell.

To split a large text file in PowerShell, you can use the Get-Content cmdlet combined with the | (pipe) operator and the Add-Content cmdlet. By reading the file line by line and starting a new output file after a certain number of lines, you can divide a large file into smaller, more manageable chunks. Here’s a simple example script that splits a file by line count:

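$sourceFile = "largefile.txt"
$lineCount = 1000   # Lines per output file
$counter = 1        # Number of the current output file
$linesInFile = 0    # Lines written to the current output file so far
Get-Content $sourceFile | ForEach-Object {
    $fileName = "splitfile_" + $counter.ToString("000")
    Add-Content -Value $_ -Path $fileName
    $linesInFile++
    # Count lines in memory instead of re-reading the output file for every line
    if ($linesInFile -ge $lineCount) { $counter++; $linesInFile = 0 }
}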

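This script produces a series of new files (splitfile_001, splitfile_002, and so on), each containing up to 1000 lines from the original file. Because Add-Content appends to existing files, delete any output files from an earlier run before running the script again.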

Split Large Text Files with PowerShell

Now, let us check how to split large text files with PowerShell. The Get-Content cmdlet is the basic command for reading data from a file. It also accepts a -Delimiter parameter that divides the file into objects as it reads, which is useful when splitting files based on content.
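As a small illustration of the delimiter option, the following sketch writes a throwaway sample file and reads it back with -Delimiter. The temp-folder path and the "---" marker are assumptions made up for this demo.

```powershell
# Hypothetical sample file in the temp folder (not part of the original tutorial)
$samplePath = Join-Path ([System.IO.Path]::GetTempPath()) "delimiter_demo.txt"
Set-Content -Path $samplePath -Value "first---second---third" -NoNewline

# -Delimiter makes Get-Content split on the given string instead of on newlines
$parts = @(Get-Content -Path $samplePath -Delimiter "---")
$parts.Count   # number of pieces read from the file
```

Whether each piece still carries the trailing delimiter can differ between Windows PowerShell and newer PowerShell releases, so inspect $parts on your own system before depending on the exact strings.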

Method 1: Splitting by Line Count

One common requirement is to split a file into chunks containing a specific number of lines. Here’s a complete PowerShell script.

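$sourceFile = "C:\MyFolder\largefile.txt"
$lineCount = 500  # The number of lines each split file should contain
$splitFilePrefix = "C:\MyFolder\splitfile_"
$counter = 1      # Number of the current split file
$linesInFile = 0  # Lines written to the current split file so far

Get-Content $sourceFile | ForEach-Object {
    $fileName = $splitFilePrefix + $counter.ToString("000")+".txt"
    Add-Content -Value $_ -Path $fileName
    $linesInFile++

    # Move to the next file once the current one is full;
    # counting in memory avoids re-reading the output file for every line
    if ($linesInFile -ge $lineCount) {
        $counter++
        $linesInFile = 0
    }
}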

This script reads the source file line by line, adding lines to a new file until the line count reaches the specified limit. It then increments the counter and starts writing to the next file.
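An alternative worth considering is the -ReadCount parameter of Get-Content, which passes lines down the pipeline in batches rather than one at a time, so each output file can be written in a single call. The following is a minimal sketch under that assumption. The function name Split-ByLineCount, its parameters, and the temp-folder demo file are hypothetical and only serve the illustration.

```powershell
# Hypothetical helper (name and parameters are illustrative, not from a module)
function Split-ByLineCount {
    param(
        [string]$SourceFile,
        [string]$Prefix,
        [int]$LineCount = 500
    )
    $counter = 1
    # -ReadCount emits arrays of up to $LineCount lines at a time
    Get-Content -Path $SourceFile -ReadCount $LineCount | ForEach-Object {
        $fileName = $Prefix + $counter.ToString("000") + ".txt"
        # Set-Content writes the whole batch in one call and overwrites old output
        Set-Content -Path $fileName -Value $_
        $counter++
    }
}

# Demo: generate a 1,200-line file in a temp folder and split it
$demoDir = Join-Path ([System.IO.Path]::GetTempPath()) "split_lines_demo"
Remove-Item -Path $demoDir -Recurse -Force -ErrorAction SilentlyContinue
New-Item -ItemType Directory -Path $demoDir | Out-Null
$demoFile = Join-Path $demoDir "largefile.txt"
1..1200 | ForEach-Object { "line $_" } | Set-Content -Path $demoFile

Split-ByLineCount -SourceFile $demoFile -Prefix (Join-Path $demoDir "splitfile_") -LineCount 500
```

With 1,200 input lines and a batch size of 500, the demo writes three files holding 500, 500, and 200 lines. Since each batch is held in memory, choose a batch size that comfortably fits in RAM.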

I ran the above script on one large text file, and the screenshot below shows the output: the script split the large file into 4 files.

[Screenshot: Split Large Text Files with PowerShell]

Method 2: Splitting by File Size

Another approach is to split the file based on the desired file size. The following PowerShell script will help you split a file into multiple parts with the maximum size you specify:

$sourceFile = "C:\MyFolder\largefile.txt"
$maxSize = 10MB
$bufferSize = 1024 * 1024  # Read in 1MB chunks
$splitFilePrefix = "C:\MyFolder\splitfile_"
$counter = 1
$fileStream = [System.IO.File]::OpenRead($sourceFile)
$buffer = New-Object Byte[] $bufferSize
$destinationFile = $splitFilePrefix + $counter.ToString("000")+".txt"

while ($fileStream.Position -lt $fileStream.Length) {
    $destinationStream = [System.IO.File]::Create($destinationFile)
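    while ($destinationStream.Length -lt $maxSize -and $fileStream.Position -lt $fileStream.Length) {
        # Never read more than the space left in the current part,
        # so a part stays within $maxSize even if $maxSize is not a multiple of $bufferSize
        $bytesToRead = [int][Math]::Min($buffer.Length, $maxSize - $destinationStream.Length)
        $readLength = $fileStream.Read($buffer, 0, $bytesToRead)
        $destinationStream.Write($buffer, 0, $readLength)
    }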
    $destinationStream.Dispose()
    $counter++
    $destinationFile = $splitFilePrefix + $counter.ToString("000")+".txt"
}
$fileStream.Dispose()

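This script uses .NET file stream objects to read and write data in chunks. With the values shown, each split file is at most 10MB. Because the split happens at byte positions rather than at line boundaries, a line (or a multi-byte character) that straddles a boundary is cut across two files.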
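If each part must contain only whole lines, one option is to read the file line by line and start a new part whenever the next line would push the current part past the limit. The sketch below is one possible way to do that, not a drop-in replacement for the script above. The function name Split-BySizeOnLines, its parameters, the UTF-8 assumption, and the demo file are all hypothetical. Pass full paths, since .NET resolves relative paths against the process working directory, which can differ from the current PowerShell location. A single line longer than the limit still ends up in a part of its own that exceeds the limit.

```powershell
# Hypothetical helper: splits on line boundaries, keeping each part at or
# below $MaxBytes whenever individual lines are shorter than the limit
function Split-BySizeOnLines {
    param(
        [string]$SourceFile,
        [string]$Prefix,
        [long]$MaxBytes = 10MB
    )
    # Assumes UTF-8 text; output is written as UTF-8 without a BOM
    $encoding = New-Object System.Text.UTF8Encoding($false)
    $reader = [System.IO.StreamReader]::new($SourceFile, $encoding)
    $writer = $null
    $counter = 1
    $written = 0
    try {
        while ($null -ne ($line = $reader.ReadLine())) {
            $lineBytes = $encoding.GetByteCount($line) + 2   # +2 for the CRLF written below
            $partIsFull = ($written -gt 0) -and (($written + $lineBytes) -gt $MaxBytes)
            if ($null -eq $writer -or $partIsFull) {
                if ($writer) { $writer.Dispose() }
                $partName = $Prefix + $counter.ToString("000") + ".txt"
                $writer = [System.IO.StreamWriter]::new($partName, $false, $encoding)
                $writer.NewLine = "`r`n"
                $counter++
                $written = 0
            }
            $writer.WriteLine($line)
            $written += $lineBytes
        }
    }
    finally {
        # Always release the file handles, even if an error occurs
        if ($writer) { $writer.Dispose() }
        $reader.Dispose()
    }
}

# Demo: split a generated 100-line file into parts of at most 300 bytes
$sizeDemoDir = Join-Path ([System.IO.Path]::GetTempPath()) "split_size_demo"
Remove-Item -Path $sizeDemoDir -Recurse -Force -ErrorAction SilentlyContinue
New-Item -ItemType Directory -Path $sizeDemoDir | Out-Null
$sizeDemoFile = Join-Path $sizeDemoDir "source.txt"
1..100 | ForEach-Object { "line $_" } | Set-Content -Path $sizeDemoFile

Split-BySizeOnLines -SourceFile $sizeDemoFile -Prefix (Join-Path $sizeDemoDir "part_") -MaxBytes 300
Get-ChildItem -Path $sizeDemoDir -Filter "part_*.txt" | Select-Object Name, Length
```

The final Get-ChildItem call lists each part with its size, which is a quick way to confirm that no part is larger than the limit.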

Method 3: Split by Custom Delimiter

You can also split a file based on a specific delimiter, such as a special character or a string. The delimiter must actually be present in the large file. Here’s how you can do it with PowerShell:

$sourceFile = "C:\MyFolder\largefile.txt"
$delimiter = "YOUR_DELIMITER"
$splitFilePrefix = "C:\MyFolder\splitfile_"
$counter = 1
$content = Get-Content -Path $sourceFile -Raw
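# -split treats its pattern as a regular expression; escape it to match the delimiter literally
$chunks = $content -split [regex]::Escape($delimiter)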

foreach ($chunk in $chunks) {
    $fileName = $splitFilePrefix + $counter.ToString("000")+".txt"
    $chunk | Set-Content -Path $fileName
    $counter++
}

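This script reads the entire file content as a single string and then uses the -split operator to divide the content based on the specified delimiter. Each chunk is then written to a new file. Since the whole file is loaded at once, this method only suits files that fit comfortably in memory.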
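Because -split interprets its right-hand side as a regular expression, a delimiter containing characters such as | or . needs escaping to match literally, and a delimiter at the very start or end of the content produces an empty chunk. The short sketch below shows both points on an in-memory string. The "|END|" marker and the sample content are made up for the illustration.

```powershell
$delimiter = "|END|"   # made-up marker that contains regex metacharacters
$content = "alpha|END|beta|END|gamma|END|"

# Escape the delimiter so it is matched literally, then drop empty chunks
$chunks = @($content -split [regex]::Escape($delimiter) | Where-Object { $_ -ne "" })
$chunks.Count   # number of non-empty chunks
```

Filtering out empty chunks before writing them avoids creating empty output files when the source begins or ends with the delimiter.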

Method 4: Using External Modules

For even more functionality, you can leverage external modules like FileSplitter, which is available from the PowerShell Gallery. Here’s an example of how to use this module:

Install-Module -Name FileSplitter
$sourceFile = "C:\MyFolder\largefile.txt"
$maxSize = 10MB
$splitFilePrefix = "C:\MyFolder\splitfile_"

Split-File -InputFile $sourceFile -Size $maxSize -Destination $splitFilePrefix

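This script installs the FileSplitter module and then calls its Split-File cmdlet to split the source file into parts of the specified size. Module contents can change between versions, so confirm the cmdlet and parameter names on your system with Get-Command -Module FileSplitter and Get-Help Split-File before relying on them.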

Conclusion

I hope you now know how to split large text files using PowerShell. I have explained different methods to divide files by line count, file size, or a custom delimiter. I have also run the above PowerShell scripts and tested them on my system, and I hope they will help you.
