[GLLUG] tr
Matt Graham
danceswithcrows@usa.net
Mon, 30 Sep 2002 13:08:01 -0400
On Monday 30 September 2002 11:59, after a long battle with technology,
Melson, Paul wrote:
> According to my understanding of its syntax, this
> should work:
>
> tr -s '\134\156' '\040' < dumpfile.txt > newfile.txt
>
> In this instance, tr uses the octal values for '\' and 'n' and
> replaces them with the octal value for ' ' (space). It does what it
> should do - replaces each '\n' with a ' ', but it's also matching on
> each 'n' and replacing it with a ' ', which causes a whole new set of
> problems.
>
> It seems to me that someone must've run across something like this in
> the past. Ideas, suggestions, and slaps upside the head are all
> appreciated. Anybody?
tr doesn't match strings or regexps the way sed and perl do. It works on
sets of single characters, so in this case it matches either '\' or 'n'
and replaces each one with ' '. This is not what you want. I think you
want this:
perl -pe 's#\\n# #g' < dumpfile.txt > newfile.txt
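To see the difference on something small, here is a rough sketch. The
sample string is made up, and the sed line is only an alternative that I
would expect to behave the same as the perl one for this simple case:

```shell
# Made-up sample: a literal backslash followed by 'n' separates the words.
sample='one\nsunny\nday'

# tr reads '\134\156' as the set {\, n}. Every '\' and every 'n' becomes a
# space, and -s then squeezes each run of spaces down to a single one.
printf '%s\n' "$sample" | tr -s '\134\156' '\040'
# prints: o e su y day

# sed replaces only the two-character sequence backslash + n.
printf '%s\n' "$sample" | sed 's/\\n/ /g'
# prints: one sunny day
```

The first command loses every plain 'n' in the data, which is the
problem described above. The second leaves them alone.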
Perl handles "weird" characters like 0x0A and 0x0D in regexps in a more
consistent way than sed does. However, if this is a really large file
(more than a few hundred MB) that doesn't have any real newlines in it
at all, Perl might choke. The -p switch wraps the script in a loop that
reads STDIN with the normal <> operator, and <> splits its input on
newlines by default. With no newlines, the whole file arrives as one
"line", and sucking several hundred MB into RAM and then running the
normal regexp engine over it could cause obvious problems.
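If the file does turn out to be one giant line, one possible workaround
is to change what Perl considers a "line". This is only a sketch: the
helper name unescape_stream is made up, and it assumes perl is on the
PATH. It sets $/ (Perl's input record separator) to the two characters
backslash + n, so <> hands back one chunk per literal '\n' instead of
waiting for a real newline:

```shell
# Hypothetical helper, not from the original post: reads stdin, writes stdout.
# BEGIN runs before the -p read loop starts, so $/ is already set to the
# two-character string backslash + n when the first record is read.
unescape_stream() {
    # Each record ends with the literal '\n' (except possibly the last one),
    # so only the end of the record needs to be rewritten.
    perl -pe 'BEGIN { $/ = "\\n" } s/\\n\z/ /'
}

# Usage with the file names from this thread:
#   unescape_stream < dumpfile.txt > newfile.txt

# Small check on a made-up sample that contains no real newline at all.
printf '%s' 'one\ntwo\nthree' | unescape_stream; echo
# prints: one two three
```

Memory use is then bounded by the longest stretch between two '\n'
sequences rather than by the size of the whole file.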
--
...In Hong Kong action movies, they don't have Hollywood Guns with
infinite bullet supplies. Instead, they have Hong Kong Pants(tm) which
hold an infinite supply of loaded pistols. --M. Sphar, the Monastery
There is no Darkness in Eternity/But only Light too dim for us to see