There are a few answers on this topic already, but pretty much all of them say that it's bad to parse the output of ls -l
, and therefore suggest other methods.
However, I'm using ncftpls -l
, and so I can't use things like shell globs or find
– I think I have a genuine need to actually parse the ls -l
output. Don't worry if you're not familiar with ncftpls
, the output returns in exactly the same format as if you were just using ls -l
.
There is a list of files at a public remote ftp directory, and I don't want to burden the remote server by re-downloading each of the desired files every time my cronjob fires. I want to check, for each one of a subset of files within the ftp directory, whether the file exists locally; if not, download it.
That's easy enough, I just use
tdy=`date -u '+%Y%m%d'`_
# Today's files
for i in $(ncftpls 'ftp://theftpserver/path/to/files' | grep ${tdy}); do
if [ ! -f $i ]; then
ncftpget "ftp://theftpserver/path/to/files/${i}"
fi
done
But I came upon the issue that sometimes the cron job will download a file that hasn't finished uploading, and so when it fires next, it skips the partially downloaded file.
So I wanted to add a check to make sure that for each file that I already have, the local file size matches the size of the same file on the remote server.
I was thinking along the lines of parsing the output of ncftpls -l
and using awk, something like
for i in $(ncftpls -l 'ftp://theftpserver/path/to/files' | awk '{print $9, $5}'); do
...
x=filesize # somehow get the file size and the filename
y=filename # from $i on each iteration and store in variables
...
done
but I can't seem to get both the filename and the filesize from the server into local variables on the same iteration of the loop; $i alternates between $9 and $5 in the awk string with each iteration.
If I could manage to get the filename and filesize into separate variables with each iteration, I could simply use stat -c "%s" $i
to get the local size and compare it with the remote size. Then its a simple ncftpget
on each remote file that I don't already have. I tinkered with syncing programs like lftp
too, but didn't have much luck and would rather do it this way.
Any help is appreciated!
for loop splits when it sees any whitespace like space, tab, or newline. So, IFS is needed before loop, (there are a lot of questions about ...)
The better way I think is to use while not for, so