How do I pipe `dos2unix` into `while read` in bash?


Right now, I'm trying to fix an issue in my PrintBans.sh script.

The problem is that the program that generates this file saves it with \r\n line endings, so I need the while loop to handle \r\n lines. Otherwise there's an extra \r at the end of each line, which makes the arithmetic fail:

 - 621355968000000000")syntax error: invalid arithmetic operator (error token is "

I've tried these.

while read ban
do
    ...
done < dos2unix $file

while read ban
do
    ...
done < `dos2unix $file`

cat $file > dos2unix > while read ban
do
    ...
done

while read ban
do
    ...
done < dos2unix < $file

I also see that some people set IFS='\r\n', but this did not work for me.

Is it impossible to pipe files through dos2unix without overwriting the original file?


There are 2 best solutions below

4
On BEST ANSWER

Literal Answer: Pipe Through!

If you don't tell dos2unix the name of the file it's working with, it can't modify that file in-place.

while IFS= read -r line; do
  echo "No carriage returns here: <$line>"
done < <(dos2unix <"$file")

Redirections are performed by the shell before a program is started: when you invoke dos2unix <input.txt, the shell replaces file descriptor 0 with a read handle on input.txt and then invokes dos2unix with no arguments, so dos2unix never learns the file's name.
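
To illustrate the point without depending on dos2unix being installed, here is a minimal sketch using a stand-in function. The name show_args and the scratch file are assumptions made up for this demo; show_args simply reports how many arguments it received and then copies stdin to stdout, the way a filter would.

```shell
#!/usr/bin/env bash
# show_args is a hypothetical stand-in for a filter such as dos2unix.
# It prints its argument count, then copies stdin to stdout unchanged.
show_args() {
  echo "argument count: $#"
  cat
}

tmpfile=$(mktemp)              # scratch file, used only for this demo
printf 'hello\n' >"$tmpfile"

# The shell opens "$tmpfile" on file descriptor 0 first; show_args is then
# called with zero arguments and is never told the file name.
redirect_output=$(show_args <"$tmpfile")
printf '%s\n' "$redirect_output"

rm -f "$tmpfile"
```

This prints "argument count: 0" followed by "hello": the content arrives on stdin, but the command line carries no file name to modify in place.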

If you wanted to be really paranoid (and pay a performance cost for that paranoia), you could use <(cat <"$file" | dos2unix) instead. dos2unix then reads from a FIFO connected to the separate executable cat rather than straight from the input file, which would stop even a hypothetical (nonexistent) dos2unix that modified the file behind its stdin descriptor in place. Needless to say, I don't ever advise this in practice.


Better Answer: Don't

You don't need dos2unix (which -- with its default in-place modification behavior -- is meant for human interactive users, not scripts); the shell itself can strip carriage returns for you:

#!/usr/bin/env bash
#              ^^^^- not /bin/sh; needed for $'' syntax and [[ ]]

while IFS= read -r line || [[ $line ]]; do
  line=${line%$'\r'}
  echo "No carriage returns here: <$line>"
done <"$file"
  • ${var%expr} is a parameter expansion which strips any trailing instance of the glob expression expr from the contents of the variable var.
  • $'\r' is ANSI C-like string syntax for a carriage return. Using that syntax is important, because other things that look like they might refer to a carriage return don't.
    • \r outside any kind of quoting context is just the letter r.
    • "\r" and '\r' are each two characters (a backslash and then the letter r), not a single carriage return.
  • [[ $line ]] is a ksh extension adopted by bash, equivalent to [ -n "$line" ]; it checks whether the variable line is non-empty. read can return false while still populating line if the input ends with a partial line that has no terminator; as l0b0 points out, using newlines as line separators rather than terminators is common on Windows. The extra check ensures the last line of a file is processed even if it doesn't end in a newline.
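
As a quick check of the points above, the following sketch compares the three spellings and runs the loop over made-up input whose last line has no terminator. The helper name strip_crlf_lines and the sample text are assumptions for illustration only.

```shell
#!/usr/bin/env bash
# Only $'\r' is a single carriage-return character; the quoted forms are
# a backslash followed by the letter r.
cr=$'\r'
double_quoted="\r"
single_quoted='\r'
lengths="${#cr} ${#double_quoted} ${#single_quoted}"
echo "$lengths"                # prints: 1 2 2

# strip_crlf_lines is a hypothetical helper: it reads stdin line by line,
# strips one trailing carriage return, and prints each line inside <>.
strip_crlf_lines() {
  local line
  while IFS= read -r line || [[ $line ]]; do
    line=${line%$'\r'}
    printf '<%s>\n' "$line"
  done
}

# Invented sample: two CRLF-terminated lines, then a final unterminated one.
stripped=$(printf 'first\r\nsecond\r\nlast' | strip_crlf_lines)
printf '%s\n' "$stripped"
```

The loop prints <first>, <second> and <last>; without the || [[ $line ]] test, the unterminated final line would be dropped.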
2
On

Considering the context of your script on GitHub, and assuming none of the fields of your CSV file contain a CR character, you just have to put CR in IFS.

Change:

while read ban; do ...

to:

while IFS=$'\r' read -r ban; do ...

At no extra cost, you can also split ban into its six fields (the seventh variable, remaining, catches anything left over) with:

while IFS=$';\r' read -r name idip end_time reason admin start_time remaining; do
    name=${name/,/\\,}
    ticksToDateString "$end_time"
    ...
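
A self-contained sketch of that idea follows. The sample record is invented, and the tick conversion is only a guess modeled on the constant in the error message from the question; it is not the script's real ticksToDateString.

```shell
#!/usr/bin/env bash
# Invented sample record with a CRLF line ending; the real file layout
# is an assumption here.
sample_record='alice;10.0.0.1;0;spam;root;621355968600000000\r\n'

while IFS=$';\r' read -r name idip end_time reason admin start_time remaining; do
  # Because \r is in IFS, it is consumed as a delimiter: start_time (the
  # last field on the line) carries no trailing carriage return, so the
  # arithmetic below does not hit a syntax error.
  # Hypothetical conversion: 100 ns ticks since year 1 to Unix seconds.
  unix_seconds=$(( (start_time - 621355968000000000) / 10000000 ))
  summary="$name banned by $admin at $unix_seconds"
  leftover=$remaining
  printf '%s\n' "$summary"
done < <(printf "$sample_record")
```

With this input the loop prints "alice banned by root at 60", and remaining is empty because the line has exactly six fields.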