Unix regular expression

Jake.

Original Poster:

1,195 posts

253 months

Saturday 23rd January 2010
Anyone know how I would select a range of files by numbers in the filenames?

(Simplified) I'm trying to select all files in a directory that have _39_ or _40_ in the name.
I've tried things like

ls *_[3,4][9,0]_*

but that also selects other files, like ones with _30_ in their name.


I know I could do it other ways, like ls *_39_* *_40_*

but it's the regular expression part I'm trying to understand.



Thanks for any pointers.

guffhoover

564 posts

204 months

Saturday 23rd January 2010
Only did Linux once in Uni many years ago but would the grep command do anything for you?

cs02rm0

13,816 posts

209 months

Saturday 23rd January 2010
Not sure you can do regexps with ls, just wildcards, would be happy to be corrected there though. You can use regexps with grep as suggested.

ls|grep "_39_\|_40_"


Edited by cs02rm0 on Saturday 23 January 22:03

ewenm

28,506 posts

263 months

Saturday 23rd January 2010
I think ^^^^^ is right - you'll need to pipe the ls output to another command to do the filtering. If you can, get hold of a copy of Unix in a Nutshell - very useful tome. I think I once found an online version, but can't locate it easily now.

Jake.

Original Poster:

1,195 posts

253 months

Saturday 23rd January 2010
Very interesting guys - thanks. I wasn't really expecting any replies at this time on a Saturday.


cs02rm0 said:
Not sure you can do regexps with ls, just wildcards, would be happy to be corrected there though. You can use regexps with grep as suggested.

ls|grep "_39_\|_40_"
I'm pretty sure regexps work with ls (only some, maybe?) because
ls *_[3,4][9,0]_* does work (it gives the wrong result for what I want, but it's a regexp, isn't it?)

Cheers






ps,
Any idea why I can't do more than one {{{ code tag in a post? Try it, very odd.


grumbledoak

32,242 posts

251 months

Saturday 23rd January 2010
No regexps for ls - you get the shell's wildcard expansion (the absence of quotes is a bit of a clue).
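
A minimal sketch of the point about quoting, assuming a POSIX-style shell with mktemp available. The file names (run_30_a.txt and so on) are made up for illustration. echo stands in for ls here, since either command only ever sees what the shell hands it.

```shell
# Work in a throwaway directory with hypothetical file names.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch run_30_a.txt run_39_a.txt run_40_a.txt run_49_a.txt

# Unquoted: the shell expands the wildcard first, so the command
# receives a list of matching file names, not the pattern.
unquoted=$(echo *_[3,4][9,0]_*)

# Quoted: no expansion happens, so the command receives the pattern text itself.
quoted=$(echo '*_[3,4][9,0]_*')

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

The unquoted form prints the four file names and the quoted form prints the pattern unchanged, which suggests the matching is done by the shell rather than by ls.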

Jake.

Original Poster:

1,195 posts

253 months

Saturday 23rd January 2010
OK, I'll give
_39_\|_40_
a try tomorrow then.

Thanks all, much appreciated.

cyberface

12,214 posts

275 months

Sunday 24th January 2010
Jake. said:
Any idea why I can't do more than one {{{ code tag in a post? Try it, very odd.
That's a well-known problem in the PH forum code; it hasn't been fixed and has been outstanding for ages.

There are lots of other bugs in the PH forum code too... but I won't divulge them because I like to still be able to own the forum if I want.

The inability to do two code blocks in one post is *very* irritating though... however since it's probably only an issue for the Computers & Stuff forum, it's probably not high on the list of priorities, sadly...

dcb

6,012 posts

283 months

Sunday 24th January 2010
Jake. said:
(Simplified) I'm trying to select all files in a directory that have _39_ or _40_ in the name.
ls -1 | egrep "_39_|_40_"

If you need further help, man egrep has all the detail.
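
For reference, a small sketch of the same filter using grep -E, which is the POSIX spelling of egrep. The directory and file names are hypothetical, and mktemp is assumed to be available.

```shell
# Throwaway directory with made-up file names.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch run_30_a.txt run_39_a.txt run_40_a.txt run_49_a.txt

# Extended regex: | means "or", so only names containing _39_ or _40_ pass.
matches=$(ls -1 | grep -E '_39_|_40_')
printf '%s\n' "$matches"

# Same filter with the shared underscores factored out of the alternation.
grouped=$(ls -1 | grep -E '_(39|40)_')
```

Single quotes around the expression keep the shell from touching the | and the parentheses, so grep receives them intact.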

TooLateForAName

4,900 posts

202 months

Sunday 24th January 2010
Jake. said:
(Simplified) I'm trying to select all files in a directory that have _39_ or _40_ in the name.
I've tried things like

ls *_[3,4][9,0]_*

it's the regular expression part I'm trying to understand.
The Unix in a Nutshell book is very good. I remember starting a project with a bunch of people who were all mainframe/mini people, and none of them had a clue about Unix. I bought a case of Unix in a Nutshell, which went down very well.

There is also a regex book, but you probably don't need the complexity.
Basically, each pair of square brackets matches a single character, so as you've found, you match a 3 or a 4 followed by a 9 or a 0, hence 39, 30, 49 and 40.
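
The bracket behaviour can be checked without touching any files, since the shell's case statement uses the same wildcard rules. This is only a sketch: the function names matches_glob and matches_wanted and the x_.._y test names are invented for illustration. Note that the commas inside the brackets are ordinary characters too, so a comma also matches.

```shell
# Hypothetical helper: does NAME match the pattern from the original post?
matches_glob() {
    case $1 in
        *_[3,4][9,0]_*) echo yes ;;
        *)              echo no  ;;
    esac
}

# Hypothetical helper: two whole alternatives instead of two
# independent one-character sets.
matches_wanted() {
    case $1 in
        *_39_*|*_40_*) echo yes ;;
        *)             echo no  ;;
    esac
}

# Compare both patterns on a few made-up names.
for n in 39 40 30 49 ,9 41; do
    name="x_${n}_y"
    echo "$name original=$(matches_glob "$name") wanted=$(matches_wanted "$name")"
done
```

Each bracket set is chosen independently, which is why 30 and 49 slip through. In bash (not plain sh), brace expansion such as ls *_{39,40}_* is another way to get the two whole alternatives, since the shell rewrites it to *_39_* *_40_* before matching.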

Depending on what you are trying to do, you might want to have a look at sed and awk, which are very handy little utilities.