Powershell script regex

mizx

Original Poster:

1,570 posts

186 months

Thursday 30th July 2015
I have a Powershell script scheduled to run and download this txt file, but I also need to strip everything other than the IP addresses.

It's annoying, as the SIEM I'm trying to pull this into supports HTTP GET and POST to append stuff to lists, which might otherwise work, but it doesn't seem to like anything other than the value type you want imported. So I'm having to retrieve it from a share on the server that's downloading and preparing the file...

This works to remove the references after the IP addresses, but I need to remove the comments at the top too:

(Get-Content C:\SIEMtasks\files\droplist.txt) |
Foreach-Object {$_ -replace " ; ([A-Z])\w+", ""} |
Set-Content C:\SIEMtasks\files\droplist.txt

Can anyone help with the regex? Also, at the moment it's running as a separate script scheduled to run after the download. I tried a few things to merge it into the end of the download script below, but I have no idea what I'm doing with PowerShell.

$url = "http://www.spamhaus.org/drop/drop.txt"
$path = "C:\SIEMtasks\files\droplist.txt"
#param([string]$url, [string]$path)
# If the target folder doesn't exist, save into the current directory instead
if(!(Split-Path -parent $path) -or !(Test-Path -pathType Container (Split-Path -parent $path))) {
    $path = Join-Path $pwd (Split-Path -leaf $path)
}
"Downloading [$url]`nSaving at [$path]"
$client = New-Object System.Net.WebClient
$client.DownloadFile($url, $path)
#$client.DownloadData($url, $path)
$path

Edited by mizx on Thursday 30th July 11:03

lestag

4,614 posts

277 months

Thursday 30th July 2015
I hate regex...

I would:
skip 4 lines
http://britv8.com/remove-4-lines-from-the-start-of...

$par = Import-Csv -Path droplist.txt -delimiter ';'
Then trim the blanks off each field
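
A rough sketch of that approach, assuming the file starts with four comment lines and each data line has the shape "network ; reference". The sample lines and the column names Network and Reference are made up for illustration, and ConvertFrom-Csv stands in for Import-Csv so the example runs without a file on disk:

```powershell
# Stand-in for Get-Content droplist.txt; the values are illustrative only.
$raw = @(
    '; header comment 1',
    '; header comment 2',
    '; header comment 3',
    '; header comment 4',
    '1.10.16.0/20 ; SBL000001',
    '1.116.0.0/14 ; SBL000002'
)

# Skip the four leading comment lines, then split each remaining line on ';'.
# The column names are hypothetical; the file itself has no header row.
$rows = $raw |
    Select-Object -Skip 4 |
    ConvertFrom-Csv -Delimiter ';' -Header 'Network', 'Reference'

# Trim the padding around the first field to leave just the network.
$networks = @($rows | ForEach-Object { $_.Network.Trim() })
$networks
```

If the number of comment lines at the top ever changes, a fixed skip count will break, so filtering on the leading ';' may be the safer option.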


Surfr

629 posts

196 months

Thursday 30th July 2015
I can't speak for PowerShell, but the extended regex on Unix would be \b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b

e.g.

macdanosx:/tmp dof$ grep -o -E "\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b" drop.txt
1.4.0.0
1.10.16.0
1.116.0.0
5.34.242.0
5.72.0.0
14.4.0.0
14.129.0.0
14.245.0.0
27.122.32.0
27.126.160.0
etc....
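
For what it's worth, the same pattern should carry over to PowerShell more or less unchanged, since .NET regex understands \b and \d. A minimal sketch that mimics grep -o by emitting only the matched text; the sample lines are made up for illustration, and in practice $lines would come from Get-Content:

```powershell
# Same pattern as the grep example: four dot-separated groups of 1-3 digits.
$pattern = '\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b'

# Stand-in for Get-Content drop.txt; the values are illustrative only.
$lines = @(
    '; comment line without an address',
    '1.4.0.0/17 ; SBL000001',
    '1.10.16.0/20 ; SBL000002'
)

# Emit only the matched text from each line, like grep -o.
$addresses = @(
    foreach ($line in $lines) {
        foreach ($match in [regex]::Matches($line, $pattern)) {
            $match.Value
        }
    }
)
$addresses
```

Lines with no address in them (the comments at the top) simply produce no output, so there is no need to strip them separately.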


Edited by Surfr on Friday 31st July 08:38

lestag

4,614 posts

277 months

Thursday 30th July 2015
Cool. And what if the OP wanted to include the /xx?
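
One possible answer, as a sketch only: extend Surfr's pattern with an optional /NN group so the prefix length is kept when present. The extended pattern and the sample lines below are assumptions for illustration, not something taken from the feed's documentation:

```powershell
# Surfr's dotted-quad pattern plus an optional one- or two-digit /NN suffix.
$pattern = '\b\d{1,3}(?:\.\d{1,3}){3}(?:/\d{1,2})?\b'

# Illustrative lines, one with a prefix length and one without.
$samples = @(
    '1.10.16.0/20 ; SBL000002',
    '5.72.0.0 ; SBL000003'
)

# -match fills the automatic $Matches table; index 0 is the whole match.
$found = @(
    foreach ($line in $samples) {
        if ($line -match $pattern) { $Matches[0] }
    }
)
$found
```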

ETA:
Here's a simpler download script:
$a = Invoke-WebRequest -Uri "http://www.spamhaus.org/drop/drop.txt"
$b = $a.Content
$b | Set-Content C:\SIEMtasks\files\droplist.txt

or
(Invoke-WebRequest -uri "http://www.spamhaus.org/drop/drop.txt").content |Set-Content C:\SIEMtasks\files\droplist.txt


The problem is that the content is CR terminated, not CR/LF.


Edited by lestag on Thursday 30th July 13:04

lestag

4,614 posts

277 months

Thursday 30th July 2015
Here you go
http://britv8.com/powershell-parse-spamhaus-org-dr...


Edited by lestag on Thursday 30th July 13:34

mizx

Original Poster:

1,570 posts

186 months

Thursday 30th July 2015
Thanks for that. Unfortunately this is PS version 2.0, which doesn't understand the Invoke-WebRequest cmdlet; I'll have 3.0 put on in the meantime.

lestag

4,614 posts

277 months

Friday 31st July 2015
No worries, you could use your existing get of the web page and parse your result through from Line 8; the data is the same.
But really, the sooner you can get to a newer version of PowerShell the better! The PowerShell 4 ISE is great for developing code.
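
In the meantime, a minimal sketch of how the parsing could be folded into the existing WebClient download on PS 2.0, so no second scheduled script is needed. Get-DropNetwork is a hypothetical helper name, the sample text is made up, and the split pattern accepts CR/LF, bare LF or bare CR, so it should not matter which line ending the file really uses:

```powershell
# Hypothetical helper: takes the raw text of drop.txt and returns the networks.
function Get-DropNetwork {
    param([string]$Text)

    # Split on CR/LF, bare LF or bare CR, drop blank and ';' comment lines,
    # then keep only the part before the first ';' with the padding trimmed.
    $Text -split '\r\n|\n|\r' |
        Where-Object { $_.Trim() -ne '' -and $_ -notmatch '^\s*;' } |
        ForEach-Object { ($_ -split ';')[0].Trim() }
}

# Illustrative input; the real text would come from the download.
$sample = "; comment`n; another comment`n1.10.16.0/20 ; SBL000001`n1.116.0.0/14 ; SBL000002`n"
$networks = @(Get-DropNetwork $sample)
$networks

# Against the real feed (not run here), reusing $url and $path from the original script:
# $client = New-Object System.Net.WebClient
# Get-DropNetwork ($client.DownloadString($url)) | Set-Content $path
```

DownloadString returns the whole file as a single string, which is why the helper does its own line splitting rather than relying on Get-Content.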