Bit of help needed in some bash/shell script please

mystomachehurts

Original Poster:

11,669 posts

274 months

Saturday 13th October 2007
I need to chop up a string in bash/shell and put the substrings into variables for use later on.

Would someone mind helping out with the syntax a little for me.

The string has a header on it and then the parameters follow on. The header describes the start position and length of each parameter.

It's all basic stuff but my shell script knowledge is pants.

Here's the string with header



[0:7,7:1,8:1,9:4,13:21,34:186,220:38]add_rcs11eth0port 3330 and not tcpframe.time,frame.number,ip.src,ip.dst,tcp.srcport,tcp.dstport,udp.srcport,udp.dstport,ip.proto,frame.time,ip.src,ip.dst,tcp.srcport,tcp.dstport,udp.srcport,udp.dstport,ip.proto,bgan,cncpbgan.imsi == 12 || cncp.opcode == 1234



This is the header - enclosed in []
[0:7,7:1,8:1,9:4,13:21,34:186,220:38]

There are always seven parameters defined.
The header declares the start and length of each parameter in the string as an xx:yy pair; the pairs are delimited by commas.

The start of the string is 0-indexed and begins after the ']'
So
p1 starts at 0 has length 7 and is add_rcs
p2 starts at 7 has length 1 and is 1
p3 starts at 8 has length 1 and is 1
p4 starts at 9 has length 4 and is eth0
etc.

Please would someone give me a bit of help coding this in bash/shell?
Even just to get the start and length values would be a help.

Many thanks
Ade
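
In bash itself, start/length pairs like these map directly onto substring expansion, ${var:offset:length}, which is also 0-indexed. A minimal sketch with the first five pairs from the sample header typed in by hand; the names p1 to p5 follow the post, data is an assumed name for the text after the ']', and this needs bash rather than plain sh:

```bash
#!/bin/bash
# Sketch only: offsets are hard-coded from the first five header pairs
# (0:7, 7:1, 8:1, 9:4, 13:21). "data" is an assumed variable name holding
# the start of the text that follows the ']'.
data='add_rcs11eth0port 3330 and not tcp'

p1=${data:0:7}     # start 0,  length 7
p2=${data:7:1}     # start 7,  length 1
p3=${data:8:1}     # start 8,  length 1
p4=${data:9:4}     # start 9,  length 4
p5=${data:13:21}   # start 13, length 21

printf '%s\n' "$p1" "$p2" "$p3" "$p4" "$p5"
```

The remaining work is reading the numbers out of the header instead of typing them in.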







cyberface

12,214 posts

281 months

Saturday 13th October 2007
Are you limited to shell scripting for this? Something like Ruby with its String, Array and RegExp classes would be just the job for this. Perl's probably got similar capabilities.

Some Unix greybeard may be able to cook up a bunch of sed and awk commands with regular expressions to cut the string up, but it'll be impenetrable and hard to debug. I'd go with a scripting language with rich string / array support to do this job, it'd be much easier.

zaktoo

805 posts

231 months

Saturday 13th October 2007
cyberface said:
Are you limited to shell scripting for this? Something like Ruby with its String, Array and RegExp classes would be just the job for this. Perl's probably got similar capabilities.

Some Unix greybeard may be able to cook up a bunch of sed and awk commands with regular expressions to cut the string up, but it'll be impenetrable and hard to debug. I'd go with a scripting language with rich string / array support to do this job, it'd be much easier.
sed & awk are bog-standard ways to deal with these issues, ruby and to a lesser extent perl would make things more complicated, not less.

That sed (badum-tish) I haven't got a solution for the OP, but it should be pretty simple.

Zumbruk

7,848 posts

284 months

Saturday 13th October 2007
I'd never do anything like this any other way than in perl. I'm also a great believer in the following aphorism; "You have a problem. You decide to solve it with regular expressions. Now you have two problems."

In that spirit, here's a quick and dirty perl program to chop the data up (Lord knows what Pistonheads is going to do to the layout and brackets.);

#!/usr/bin/perl
while (<>)
{
    chomp;
    # Split the header and data in two by the ']'
    ($header,$data) = split(/\]/,$_);
    # Strip off the leading '['
    $header = substr ($header,1);
    # Split the header fields up by the ','. The @fields array now
    # contains 7 elements, each with the start & length of a data field
    @fields = split (/,/,$header);
    $i = 0;
    # For each field specifier
    foreach (@fields)
    {
        ($start,$length) = split (/:/,$_);
        # Store one element (a $ sigil, not an @ slice)
        $field_data[$i++] = substr ($data,$start,$length);
    }
    # The @field_data array now contains the individual fields
    # Print it out ...
    $i = 0;
    foreach (@field_data)
    {
        print $i++,":\t",$_,"\n";
    }
}
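
For anyone stuck with bash only, the same split-and-substring idea can be written without perl, sed or awk. This is a sketch, not a tested production script: split_record and FIELDS are made-up names, it assumes bash (arrays and ${var:offset:length} are not in plain sh), and unlike the perl it cuts only at the first ']', so a ']' later in the data is left alone.

```bash
#!/bin/bash
# Sketch: bash counterpart of the split/substr approach. split_record and
# FIELDS are hypothetical names; the record layout is taken from the thread.
split_record() {
    local line=$1
    local header=${line%%\]*}   # text before the first ']'
    header=${header#\[}         # strip the leading '['
    local data=${line#*\]}      # text after the first ']'
    local specs spec start length
    FIELDS=()                   # global array that receives the fields
    IFS=, read -ra specs <<< "$header"   # split the header on commas
    for spec in "${specs[@]}"; do
        start=${spec%%:*}       # number before the ':'
        length=${spec##*:}      # number after the ':'
        FIELDS+=("${data:start:length}")   # 0-indexed substring
    done
}

# Demo on a shortened record (four pairs instead of seven).
while IFS= read -r line; do
    split_record "$line"
    for i in "${!FIELDS[@]}"; do
        printf '%d:\t%s\n' "$i" "${FIELDS[i]}"
    done
done <<< '[0:7,7:1,8:1,9:4]add_rcs11eth0'
```

After a call, the parameters are available as ${FIELDS[0]}, ${FIELDS[1]} and so on for use later in the script. Header numbers with leading zeros would be read as octal by bash arithmetic, which the sample data does not contain.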

cyberface

12,214 posts

281 months

Saturday 13th October 2007
zaktoo said:
cyberface said:
Are you limited to shell scripting for this? Something like Ruby with its String, Array and RegExp classes would be just the job for this. Perl's probably got similar capabilities.

Some Unix greybeard may be able to cook up a bunch of sed and awk commands with regular expressions to cut the string up, but it'll be impenetrable and hard to debug. I'd go with a scripting language with rich string / array support to do this job, it'd be much easier.
sed & awk are bog-standard ways to deal with these issues, ruby and to a lesser extent perl would make things more complicated, not less.

That sed (badum-tish) I haven't got a solution for the OP, but it should be pretty simple.
It wouldn't, that's the whole point. Zumbruk's correct - the way you'd get it working with sed and awk would be a lot more complicated than using perl or Ruby. As soon as you get into regular expressions, you cut out a proportion of support staff who can help if something goes wrong in production - not everyone understands how they work, and they can get fiendishly complex.

I don't have the time but I could write a similar process to Zumbruk's perl script in Ruby that'd be more concise and elegant, but I assume he could too, and I'm not getting into a coding dick-waving contest. smile

Zumbruk

7,848 posts

284 months

Saturday 13th October 2007
cyberface said:
I don't have the time but I could write a similar process to Zumbruk's perl script in Ruby that'd be more concise and elegant, but I assume he could too, and I'm not getting into a coding dick-waving contest.
Quite so, in all respects.

smile


Edited by Zumbruk on Saturday 13th October 17:08

Pigeon

18,535 posts

270 months

Saturday 13th October 2007
Uninspired bash kludge...

{{{

#!/bin/bash
# Sample input line to extract THINGs from
LINE="[0:7,7:1,8:1,9:4,13:21,34:186,220:38]add_rcs11eth0port 3330 and not tcpframe.time,frame.number,ip.src,ip.dst,tcp.srcport,tcp.dstport,udp.srcport,udp.dstport,ip.proto,frame.time,ip.src,ip.dst,tcp.srcport,tcp.dstport,udp.srcport,udp.dstport,ip.proto,bgan,cncpbgan.imsi == 12 || cncp.opcode == 1234"

# Split line into HEADER and The Rest, using ] as delimiter
HDR=`echo "$LINE" | sed -e 's/^\[//' -e 's/\].*$//'`
LINE=`echo "$LINE" | sed -e 's/^.*\]//'`

# Iterate through the header
while [ -n "$HDR" ]; do
# Extract first set of parameters, using anything other than
# 0-9 or : as delimiter
P=`echo "$HDR" | sed -e 's/[^0-9:].*$//'`

# Chop that parameter set and its following delimiter
# off the start of the header
HDR=`echo "$HDR" | sed -e "s/$P//" -e 's/^[^0-9:]//'`

# Split the parameter set into START and LENGTH,
# using : as delimiter
START=`echo "$P" | sed -e 's/:.*$//'`
LENGTH=`echo "$P" | sed -e 's/^.*://'`

# Use START and LENGTH to extract THING from the input line
THING=`echo "$LINE" | sed -r -e "s/(^.{$START})(.{$LENGTH})(.*$)/\2/"`

# Output THING
echo "$THING"

done

}}}

gives



$ ./ex
add_rcs
1
1
eth0
port 3330 and not tcp
frame.time,frame.number,ip.src,ip.dst,tcp.srcport,tcp.dstport,udp.srcport,udp.dstport,ip.proto,frame.time,ip.src,ip.dst,tcp.srcport,tcp.dstport,udp.srcport,udp.dstport,ip.proto,bgan,cncp
bgan.imsi == 12 || cncp.opcode == 1234
$



looks right to me...

Pigeon

18,535 posts

270 months

Saturday 13th October 2007
FFS

Looks like PH's formatting code can't cope with shebangs...

Download

Zumbruk

7,848 posts

284 months

Saturday 13th October 2007
Pigeon said:
FFS

Looks like PH's formatting code can't cope with shebangs...
No, it can't. You get;

1. !/path/to/mumble

GreenV8S

30,999 posts

308 months

Saturday 13th October 2007
Zumbruk said:
"You have a problem. You decide to solve it with regular expressions. Now you have two problems."...
Zumbruk said:
split(/\]/,$_);
hehe

cyberface

12,214 posts

281 months

Saturday 13th October 2007
GreenV8S said:
Zumbruk said:
"You have a problem. You decide to solve it with regular expressions. Now you have two problems."...
Zumbruk said:
split(/\]/,$_);
hehe
Yeah, but compare it to Pidge's sed commands in his script above. wobble

Zumbruk

7,848 posts

284 months

Saturday 13th October 2007
GreenV8S said:
Zumbruk said:
"You have a problem. You decide to solve it with regular expressions. Now you have two problems."...
Zumbruk said:
split(/\]/,$_);
hehe
Blimey, that's trivial - I started with one that pattern matched the whole of the header, but I wanted to keep it simple for the non-perl types.


Edited by Zumbruk on Saturday 13th October 22:09

mystomachehurts

Original Poster:

11,669 posts

274 months

Saturday 13th October 2007
It hurts, it hurts...

split(/\]/,$_) is not programming, it's utter gobbledygook.








GreenV8S

30,999 posts

308 months

Saturday 13th October 2007
Zumbruk said:
Blimey, that's trivial - I started with one that pattern matched the whole of the header, but I wanted to keep it simple for the non-perl types.
Of course it is, I just found it amusing how you warned us against using RegExp and then used it yourself. hehe

Edited by GreenV8S on Saturday 13th October 22:18

mystomachehurts

Original Poster:

11,669 posts

274 months

Saturday 13th October 2007
GreenV8S said:
For what it's worth, I think it should be possible to solve the problem using four similarly trivial regular expressions.
Can we do this in bash? rofl



Tomorrow I will be mostly playing with Pigeon's script

cyberface

12,214 posts

281 months

Sunday 14th October 2007
mystomachehurts said:
GreenV8S said:
For what it's worth, I think it should be possible to solve the problem using four similarly trivial regular expressions.
Can we do this in bash? rofl

Tomorrow I will be mostly playing with Pigeon's script
FFS use Ruby and the String.chomp, String.chop, String.each or String.split, and String.scan methods. You could probably do it without regular expressions or even escape characters (though the slashes and square brackets sort of get in the way here).

Zumbruk's example wasn't even a proper complex regular expression - the slashes merely delimit the pattern and the backslash is just an escape character. If you think Pidge's example is less complex, then run with it, but if those big sed commands don't immediately make sense to you, then I'd seriously advise using perl or ruby.

If I have time tomorrow then I'll have a go at a ruby script without any regexps or escape characters to make it nice and simple...

Pigeon

18,535 posts

270 months

Sunday 14th October 2007
cyberface said:
those big sed commands
Those aren't big sed commands. Those are weeny little sed commands hehe

mystomachehurts

Original Poster:

11,669 posts

274 months

Sunday 14th October 2007
cyberface said:
FFS use Ruby
Can't guarantee it will be on the target platform, and it's highly likely that I won't be allowed to install it on the Customer's mission critical satellite monitoring kit.

Hence the requirement for Bash blah

Zumbruk

7,848 posts

284 months

Sunday 14th October 2007
Pigeon said:
cyberface said:
those big sed commands
Those aren't big sed commands. Those are weeny little sed commands hehe
Precisely.

You can regexp the header in perl with something like (off the top of my head, so likely wrong);


/\[(\d+)\:(\d+)\,(\d+)\:(\d+)\,(\d+)\:(\d+)\,(\d+)\:(\d+)\,(\d+)\:(\d+)\,(\d+)\:(\d+)\,(\d+)\:(\d+)\]/;

And the parameters are then available in the metavariables $1,$2,$3 ... $14, but that's hideous. And I couldn't think of a way to iterate over the metavariables. Much easier to use the "split" function and bung them in an array, like wot I did.
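
On iterating over the captures: bash has its own take on this, since [[ string =~ regex ]] puts the captures in an array, BASH_REMATCH, which can be looped over by index. A rough sketch assuming a bash with =~ support and a header of exactly seven pairs; parse_header, STARTS and LENGTHS are made-up names:

```bash
#!/bin/bash
# Sketch: match a seven-pair header with one regular expression and loop
# over the captures. parse_header, STARTS and LENGTHS are hypothetical names.
parse_header() {
    local pair='([0-9]+):([0-9]+)'
    # Seven pairs separated by commas, wrapped in a literal [ and ].
    local re="^\\[$pair,$pair,$pair,$pair,$pair,$pair,$pair\\]"
    STARTS=() LENGTHS=()
    [[ $1 =~ $re ]] || return 1      # not a well-formed seven-pair header
    local i
    # BASH_REMATCH[0] is the whole match; the captures sit at indices 1..14.
    for (( i = 1; i < ${#BASH_REMATCH[@]}; i += 2 )); do
        STARTS+=("${BASH_REMATCH[i]}")
        LENGTHS+=("${BASH_REMATCH[i+1]}")
    done
}

# Demo: only the header matters here, so the data part is cut short.
if parse_header '[0:7,7:1,8:1,9:4,13:21,34:186,220:38]add_rcs11eth0'; then
    for i in "${!STARTS[@]}"; do
        echo "p$((i + 1)): start=${STARTS[i]} length=${LENGTHS[i]}"
    done
fi
```

The pattern is still fourteen capture groups long, so this is no prettier than the perl one-liner; splitting on the commas stays the simpler route.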

(Oh, poo. Is there a way to do literal quoting in PH without it turning all your parentheses into smiley faces???)


Edited by Zumbruk on Sunday 14th October 11:14


Edited by Zumbruk on Sunday 14th October 11:50


Edited by PetrolTed on Monday 15th October 09:56

Zumbruk

7,848 posts

284 months

Sunday 14th October 2007
mystomachehurts said:
It hurts, it hurts...

split(/\]/,$_) is not programming, it's utter gobbledygook.
Pah. That ain't nothing.

This is the command that sets my shell prompt;


PS1='\[\e[$(($??1:0))m\][\u@\h \w]\[\e[0m\]: '
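
Reading that line as posted, the only moving part is $(($??1:0)): the first ? belongs to $? (the exit status of the last command) and the second starts an arithmetic ?: test, so the prompt gets \e[1m (bold) after a failure and \e[0m (normal) after a success. A small sketch of just that arithmetic, pulled out of PS1; status_attr is a made-up helper name:

```bash
#!/bin/bash
# Sketch: status_attr is a hypothetical helper that repeats the arithmetic
# from the prompt string outside PS1, with the exit status passed in.
status_attr() {
    local status=$1
    echo $(( status ? 1 : 0 ))   # 1 (bold) for non-zero, 0 (reset) for zero
}

status_attr 0     # exit status of a command that succeeded
status_attr 2     # exit status of a command that failed
```

The rest of the prompt is the usual \u@\h \w in square brackets, with \[ and \] marking the escape sequences as non-printing.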




Edited by PetrolTed on Monday 15th October 09:56