[GLLUG] Shell Scripting Question

Mike Rambo mrambo@lsd.k12.mi.us
Wed, 12 Mar 2003 12:11:40 -0500


It's amazing what an (explained) example can do - something that's often
missing from man pages. Actually, the search & replace part, at least, is
the same as what vi uses. I guess all the various slashes etc. that
are often present in the sed examples I've seen have been somewhat
intimidating. The reputation (for complexity) that regular expressions
have probably didn't help either.

Thanks for the explanation.



Matt Graham wrote:
> 
> On Wednesday 12 March 2003 07:34, after a long battle with technology,
> Mike Rambo wrote:
> > "Melson, Paul" wrote:
> > > cat logfile |sed -e 's/|/\`/4' |sed -e 's/|/\`/3' |sed -e
> > > 's/|/\`/2' |sed -e 's/|/\`/1' |sed -e 's/|/\|/g' | awk -F\`
> > > '{print("<tr><td>",$2,"</td><td>",$5,"</td></tr>")}'
> >
> > Now there (sed) is a potential subject for a GLLUG presentation!
> > Every time I've looked at sed my eyes glaze over and my brain
> > segfaults.
> 
> Paul didn't even use any of the *complicated* parts of sed.  Let's
> dissect this example:
> 
> sed -e 's/|/\`/4'
> -e 's/|/\`/3'
> -e 's/|/\`/2'
> -e 's/|/\`/1'
> -e 's/|/\|/g'
> 
> Each -e argument is a sed command built around a regular expression.
> Regular expressions are complicated, but these look more complicated
> than they really are.  The first one, 's/|/\`/4' , means:  Substitute
> the 4th occurrence of | on each input line with ` .  s means
> substitute, or 'find and replace'.  The part between the first and
> second '/' is the thing to find.  The part between the second and
> third '/' is the replacement.  The ` is preceded with a \ because ` is
> a special character in the shell (though inside single quotes, as
> here, the shell would leave it alone anyway).  Finally, the things
> after the last / are special flags:  a number N for 'only apply this
> to the Nth occurrence of the pattern', 'g' for 'apply this to every
> occurrence of the pattern', 'i' for case-insensitive matching (in GNU
> sed), and lots more besides.
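
A quick way to see the numeric flag in action is to run it on a made-up
line. This is only a sketch: the sample record below is hypothetical, and
the \ before the ` is left out because the whole expression sits inside
single quotes.

```shell
# Hypothetical sample record with six pipe-separated fields.
line='a|b|c|d|e|f'

# Replace only the 4th '|' with a backtick; every other pipe is untouched.
fourth=$(printf '%s\n' "$line" | sed -e 's/|/`/4')

# Several -e expressions run in order on each line.  Counting down
# (4, 3, 2, 1) means the pipes still waiting to be replaced keep their
# original positions, so the first four pipes all get converted.
all=$(printf '%s\n' "$line" | sed -e 's/|/`/4' -e 's/|/`/3' -e 's/|/`/2' -e 's/|/`/1')

printf '%s\n%s\n' "$fourth" "$all"
```

The first line printed has one backtick (between d and e); the second
has backticks in place of the first four pipes and still ends in |f.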
> 
> The number of things you can do with regular expressions is absolutely
> amazing.  Simple:
> 
> s/bob/fred/      (replace first 'bob' with 'fred')
> s/bob/fred/g     (replace every 'bob' with 'fred')
> s/bob/fred/gi    (replace every 'bob','BOB','BoB','bOb'... with 'fred')
> s/bob.*bill/fred/   (replace the first stretch of text that starts
> with 'bob', continues through any number of any characters, and ends
> with 'bill' with 'fred'.  This would replace 'bob joe bill' but not
> 'bob tom'.  In a regular expression, '.*' is like the shell glob '*'
> (matches any number of characters), while '.' is like the shell glob
> character '?' (matches one character).)
> s/^/>/         (replace the beginning of a line with '>'.  This really
> adds a '>' on to the beginning of a line.)
> s/$/./      (replace the end of a line with '.'.  This really adds a '.'
> to the end of a line.)
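
The simple expressions above can be tried directly from a shell. A
minimal sketch, using an invented input string; the variable names are
only for illustration, and the case-insensitive flag is left out since
not every sed has it.

```shell
# Invented input with two 'bob's and one 'bill'.
text='bob met bob and bill'

# No flag: only the first 'bob' is replaced.
first=$(printf '%s\n' "$text" | sed 's/bob/fred/')

# g flag: every 'bob' is replaced.
every=$(printf '%s\n' "$text" | sed 's/bob/fred/g')

# '.*' is greedy: the match runs from the first 'bob' through 'bill'.
greedy=$(printf '%s\n' "$text" | sed 's/bob.*bill/fred/')

# '^' and '$' match positions, so these insert rather than replace.
quoted=$(printf '%s\n' "$text" | sed 's/^/>/')
ended=$(printf '%s\n' "$text" | sed 's/$/./')

printf '%s\n' "$first" "$every" "$greedy" "$quoted" "$ended"
```

Note that the third expression swallows the whole line here, because the
line begins with 'bob' and ends with 'bill'.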
> 
> More complex:
> 
> s/(\d\d)-(\d\d)-(\d\d\d\d)/\3-\1-\2/
> 
> Converts a date in American format, like 06-26-1976, to a date in ISO
> standard format, like 1976-06-26 .  Lots of new things here.  The '\d'
> matches any digit [0-9].  Each set of ()s creates a 'group'.  The
> regular expression engine stores whatever matched each group in
> internal variables, so you can use them later.
> 
> So, if we feed this expression "06-26-1976", "06" matches the first
> group (\d\d).  This is stored in variable 1.  "26" matches the second
> (\d\d), and is stored in variable 2.  "1976" matches (\d\d\d\d) and is
> stored in variable 3.
> 
> Then, in the second ("replace") part of the regular expression, we
> replace whatever we "found" with the contents of variable 3 ("\3"),
> then a dash, then the contents of variable 1, then the contents of
> variable 2.  End result:  1976-06-26 .
> 
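> NOTE:  The above regular expression follows Perl syntax.  sed requires
> a \ before each ( and ) to get the grouping right (unless extended
> regular expressions are turned on), and it does not understand \d , so
> [0-9] has to be used instead.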
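
For anyone who wants to try the date example in sed itself, here is one
way it might be written. It is a sketch rather than the only spelling:
it assumes plain (basic) regular expressions, so the parentheses are
escaped and [0-9] stands in for \d. The function name us_to_iso is made
up for this example.

```shell
# Hypothetical helper: rewrite the first MM-DD-YYYY date on each input
# line as YYYY-MM-DD.  \( \) create groups; \1 \2 \3 refer back to them.
us_to_iso() {
    sed 's/\([0-9][0-9]\)-\([0-9][0-9]\)-\([0-9][0-9][0-9][0-9]\)/\3-\1-\2/'
}

iso=$(printf '%s\n' '06-26-1976' | us_to_iso)
printf '%s\n' "$iso"
```

Text around the date is left alone, so a line like
'born 06-26-1976 in Lansing' only has the date part rearranged.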
> 
> I hope this was useful; if I have made egregious errors, I'm sure
> someone will point them out shortly.
> 

-- 
Mike Rambo
mrambo@lsd.k12.mi.us