[ILUG] script handling filenames with spaces
Brian Foster
blf at utvinternet.ie
Sun Aug 19 00:02:20 IST 2007
| Date: Sat, 18 Aug 2007 20:34:28 +0100
| From: "Ian Spillane" <iantheteacher at gmail.com>
|
| A while loop piped from a temporary text file (of one filename per line)
| works best, otherwise you have to mess around really with sed converting
| spaces to a temporary string before passing through a for loop. [ ... ]
indeed, albeit I'm not convinced you can write a
reliable sed(1) script in this instance. nonetheless,
I concur: a good alternative is to eliminate the
cause of (most of) the problems and simply not
use a “for”-loop:
    find ... -print | while IFS= read -r file; do
        ...    # "$file" is the filename
    done
(I've also eliminated the temporary file,
which (often) isn't needed.) the above trick
still requires one filename per line, so if
any of the filenames contains a newline, the
above is not correct. (the “-r” is needed only
if some filenames may contain a backslash (\),
and the “IFS=” only if some may begin or end
with a space or tab, which “read” otherwise
strips.)
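as a quick check of the loop above, here is a minimal sketch run against a scratch directory. the directory, the filenames, and the log file are all made up for illustration. note that in bash the loop body on the right of a pipe runs in a subshell, so variables set inside it are gone after “done”; the sketch therefore writes its results to a log and counts them afterwards.

```shell
# sketch only: scratch directory and filenames are hypothetical.
tmp=$(mktemp -d)
touch "$tmp/plain" "$tmp/two words" "$tmp/  padded  " "$tmp/back\\slash"

# IFS= keeps leading/trailing blanks, -r keeps backslashes.
# the log lives next to (not inside) the scratch directory,
# so find does not list it.
find "$tmp" -type f -print | while IFS= read -r file; do
    [ -f "$file" ] && printf 'ok: [%s]\n' "$file"
done > "$tmp.log"

# every name that survived the round trip produced one "ok:" line.
count=$(grep -c '^ok: ' "$tmp.log")
echo "found $count files"

rm -rf "$tmp" "$tmp.log"
```

dropping either the “IFS=” or the “-r” should make the “-f” test fail for the padded name or the backslash name respectively.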
a more robust solution, which works even if there
are newlines (not to mention spaces, backslashes,
and all(?) other weird cases) is shown below ....
| On 8/18/07, Pete McEvoy <pete at yerma.org> wrote:
| > Could anyone advise on the below script [ vertically compressed -blf]:
| >
| > FOLDERS=/tmp/folders/
| > MAILDIR=/tmp/maildir/
| > for i in `ls $FOLDERS`; do
| > for j in "`find $FOLDERS$i/mail/ -type f`"; do
| > mb2md -s "$j" -d "$MAILDIR$i/.$(basename $j)"
| > done; done
| >
| > The contents of /tmp/folders/$i/mail/ are mailboxes, which can have
| > spaces and perhaps odd characters in the name, as such I need to ensure
| > $j is quoted before being passed to mb2md.
| > Niall on irc recommended the use of print0 and xargs -0 , but at this
| > late stage of the day I'm unable to grok how I would work them into my
| > script.
the sensible suggestion to use “find ... -print0” and
“xargs -0 ...” replaces that problem inner-“for”-loop
with something similar to the following (NOT fully
tested!):
    export MAILDIR i    # the child bash sees only exported variables
    find "$FOLDERS$i/mail/" -type f -print0 | \
        xargs -0 -l1 bash -c 'mb2md -s "$1" -d "$MAILDIR$i/.$(basename "$1")"' --
both the “xargs -l1” and trailing “--” are critically
important! (so are the more obvious “find ... -print0”
and “xargs -0”.)
how the above works:
• the “find ... -print0” writes each filename to stdout
terminated with a nul (\0).
• the “xargs -0 -l1 ...” executes command “...” for
each nul-terminated filename read from stdin, with
the filename as an additional, last, argument.
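( without the “-l1” the command could be executed
with multiple filenames, of which the script would
use only the first ($1) and silently skip the rest.
GNU xargs documents “-l” as deprecated in favour of
the POSIX “-L 1”; “-n 1” has the same effect here. )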
• the command “bash -c '...' --” executes script “...”
with $0 set to the “--” and $1, $2, ... ($@) set to the
arguments following it; in this case, there is only one
such argument ($1), the filename, which may safely contain
spaces, newlines, backslashes, dollars, and other odd
characters.
( without the trailing “--” the argument added by xargs
(the filename) would become $0 rather than $1, and
the script would see an empty $1. any placeholder
word would do in place of “--”. )
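the three steps above can be tried out without mb2md. the following is a sketch under stated assumptions: “touch” stands in for mb2md, DEST and the scratch layout are hypothetical, and “-n 1” is used in place of “-l1” (one filename per invocation either way). it also needs a find and xargs that support “-print0” and “-0”.

```shell
# sketch only: "touch" stands in for mb2md, and DEST and the
# directory layout are hypothetical.
tmp=$(mktemp -d)
mkdir "$tmp/mail" "$tmp/out"
nl_name='odd
name'
touch "$tmp/mail/Inbox" "$tmp/mail/Sent Items" "$tmp/mail/$nl_name"

# the script run by "bash -c" is a separate process, so it sees
# only exported variables.
DEST="$tmp/out"
export DEST

# one bash per nul-terminated filename; "--" fills $0 and the
# filename arrives as $1.
find "$tmp/mail" -type f -print0 |
    xargs -0 -n 1 bash -c 'touch "$DEST/.${1##*/}"' --

# show how bash assigns the words that follow the script text.
params=$(bash -c 'printf "%s|%s" "$0" "$1"' -- "a b")
echo "$params"

# count how many of the three awkward names made it across.
created=0
for name in "Inbox" "Sent Items" "$nl_name"; do
    [ -e "$tmp/out/.$name" ] && created=$((created + 1))
done
echo "created $created of 3"

rm -rf "$tmp"
```

the “params” line prints “--|a b”, which shows the placeholder landing in $0 and the real argument in $1.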
hence, in this case, the “...” script is simply:
mb2md -s "$1" -d "$MAILDIR$i/.$(basename "$1")"
since we have arranged for $1 to be the filename.
incidentally, you can eliminate the basename(1) with:
mb2md -s "$1" -d "$MAILDIR$i/.${1##*/}"
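a small sketch comparing the two forms (the paths are made up, and “strip” is a hypothetical helper used only to apply the expansion to an argument). the one difference I know of is a trailing slash, which “find ... -type f” never produces, so it should not matter here.

```shell
# hypothetical helper: apply the ${1##*/} expansion to one argument.
strip() { printf '%s' "${1##*/}"; }

# ordinary case: both give the last path component.
a=$(strip "/tmp/folders/pete/mail/Sent Items")
b=$(basename "/tmp/folders/pete/mail/Sent Items")
echo "expansion: [$a]  basename: [$b]"

# trailing slash: the expansion leaves nothing, basename(1) still
# reports the last component.
c=$(strip "/tmp/dir/")
d=$(basename "/tmp/dir/")
echo "expansion: [$c]  basename: [$d]"
```

the expansion also saves one process per mailbox, since no external command is run.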
there are other ways of doing this (e.g., “xargs -0 -i”)
but I suspect the above is both the most robust and the
simplest. also, I strongly suspect all the above relies
on GNU versions of find(1), xargs(1), and the bash(1)
Bourne-ish shell.
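if the dependence on “-print0” and “xargs -0” is a worry, one alternative (not what is proposed above, but a different and long-standing idiom) is to let find run the shell itself with “-exec sh -c”. a minimal sketch, again with “touch” standing in for mb2md and a hypothetical scratch layout; here the destination is passed as an argument rather than through the environment:

```shell
# sketch only: "touch" stands in for mb2md; the layout is hypothetical.
tmp=$(mktemp -d)
mkdir "$tmp/mail" "$tmp/out"
touch "$tmp/mail/Sent Items" "$tmp/mail/it's \$5"

# find substitutes each pathname for {} as one whole argument, so
# nothing is ever split on spaces. the word "sh" fills $0.
find "$tmp/mail" -type f -exec sh -c '
    # $1 = destination directory, $2 = pathname supplied by find
    touch "$1/.${2##*/}"
' sh "$tmp/out" {} \;

portable_ok=no
[ -e "$tmp/out/.Sent Items" ] && [ -e "$tmp/out/.it's \$5" ] && portable_ok=yes
echo "both names handled: $portable_ok"

rm -rf "$tmp"
```

this starts one shell per file, just as the “xargs -l1 bash -c” form does.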
cheers!
-blf-
--
▶ ▶ I AM CURRENTLY LOOKING FOR A JOB! ◀ ◀ | Brian Foster
Experienced (>25 yrs) software engineer: | Montpellier, FRANCE
• Unix, Linux, embedded, design-for-test; | Stop E$$o (ExxonMobile)!
• Software/hardware co-design, debugging; | http:/www.stopesso.com
• Kernels, drivers, filesystems, &tc; Résumé (CV) & contact details:
• IDL, automated testing, process, &tc. http://www.blf.utvinternet.ie