How to find line break style (CRLF, CR or LF) in a CSV-file using Mac


I need to find out which type of line break is used in a CSV file, on a Mac. I have exported a data set from SPSS (a statistical software package) to a CSV file. This CSV file will be sent to be run through a register, and I need to provide information about the file, such as which line-break style it uses.

When I open the CSV file in TextEdit on my Mac, I see no symbols corresponding to line breaks (it does not show \r\n, \r or \n). There is simply a new row with no symbol indicating the line break. I have not been able to find out what SPSS uses by default or how to customize it. I also tried opening the file in the Terminal app and in Visual Studio Code (what I had access to), but saw no symbols indicating line breaks. Does anyone know how to determine which line-break style is used in the CSV file in this case?


There are 3 best solutions below

AudioBubble On

You can open the file in Visual Studio using the Binary editor. You will see all characters.
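If no binary editor is at hand, the same idea can be sketched in a few lines of Python: read the file as raw bytes and return their escaped form, so the terminators show up literally as \r\n, \r or \n. The function name show_raw and the 200-byte default are illustrative assumptions, not part of any tool mentioned here.

```python
def show_raw(path, limit=200):
    """Return the first `limit` bytes of a file with control characters escaped."""
    # Binary mode: Python performs no newline translation on the data.
    with open(path, "rb") as f:
        chunk = f.read(limit)
    # repr() of a bytes object looks like b'...'; strip the b and the quotes.
    return repr(chunk)[2:-1]
```

Calling print(show_raw("file.csv")) on a CRLF file would show each row ending in \r\n, while an LF file would show only \n.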

Fravadona On

According to RFC 4180, CRLF is the standard record delimiter for CSV files, but LF is also frequently used. Forget about CR-delimited records, as that kind of CSV probably doesn't exist anymore.

Here's a solution that works in most cases:

awk '{print (/\r$/ ? "CRLF" : "LF"); exit}' file.csv

The problem with the previous approach is that it only looks at the first line, and a CSV record can span multiple lines, so encountering an LF doesn't guarantee that you have reached the end of the record. A workaround is to go to the end of the file and check how it is terminated.

You can use perl for that:

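perl -le '
    # "<" is double-quoted: a single quote here would end the shell string.
    open(F, "<", $ARGV[0]) or die $!."\n";
    seek(F, -2, 2);    # 2 bytes before the end of the file (whence 2 = SEEK_END)
    read(F, $e, 2);
    close(F);
    if("\r\n" eq $e) {print "CRLF"}
    elsif("\n" eq ($e = substr($e, -1))) {print "LF"}
    elsif($e eq "\r") {print "CR"}
' file.csv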
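
The tail check reports only the final terminator. If there is any doubt that the file uses one style throughout, a rough Python sketch can count every occurrence instead. The function name count_line_endings is hypothetical, and the sketch reads the whole file into memory. It also counts line breaks inside quoted fields, which may differ from the record delimiter.

```python
import re

NAMES = {b"\r\n": "CRLF", b"\r": "CR", b"\n": "LF"}

def count_line_endings(path):
    """Count CRLF, lone CR and lone LF terminators in a file."""
    with open(path, "rb") as f:
        data = f.read()
    counts = {"CRLF": 0, "CR": 0, "LF": 0}
    # \r\n is tried first so a CRLF pair is not counted as a CR plus an LF.
    for match in re.finditer(rb"\r\n|\r|\n", data):
        counts[NAMES[match.group()]] += 1
    return counts
```

A file that uses a single style consistently gives a nonzero count for exactly one key.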
dawg On

Given:

printf 'Line 1\r\nLine 2\r\n' >f1.txt
printf 'Line 1\nLine 2\n' >f2.txt 

You can use the file command on macOS to determine the line terminators (plain "ASCII text" with no qualifier means LF):

file f{1..2}.txt
f1.txt: ASCII text, with CRLF line terminators
f2.txt: ASCII text

Or awk:

awk 'FILENAME in fn{next}
{fn[FILENAME]; print FILENAME, /\r$/ ? "CRLF" : "LF"}' f{1..2}.txt

Or Ruby:

ruby -e 'ARGV.each{|fn| 
    puts "#{fn}: #{File.open(fn).readline[/\r\n$/] ? "CRLF" : "LF"}"}' f{1..2}.txt

Or Perl:

perl -E 'for $fn (@ARGV){
             open($fh, $fn); say "$fn: ", <$fh>=~/\r\n$/ ? "CRLF" : "LF"}' f{1..2}.txt

Or in the shell:

for fn in f{1..2}.txt; do
    # $'\r' is a literal carriage return (bash/zsh); grep has no portable \r escape
    if head -n 1 "$fn" | grep -q $'\r$'; then
        echo "$fn: CRLF"
    else
        echo "$fn: LF"
    fi
done

Any of those (other than file) prints:

f1.txt: CRLF
f2.txt: LF
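
The variants above inspect only the first line of each file. As a hedged alternative, Python's text-mode file objects record which terminators they encountered in their newlines attribute, so a small sketch can report every style seen in the whole file. The function name newline_styles is hypothetical, and decoding as latin-1 is an assumption made only so that no byte sequence raises a decoding error.

```python
NAMES = {"\r\n": "CRLF", "\r": "CR", "\n": "LF"}

def newline_styles(path):
    """Return the set of line-terminator styles found anywhere in the file."""
    # The default newline=None enables universal newlines, which tracks
    # the terminators seen so far in f.newlines.
    with open(path, encoding="latin-1") as f:
        for _ in f:          # read through the whole file
            pass
        seen = f.newlines    # None, a single string, or a tuple of strings
    if seen is None:
        return set()
    if isinstance(seen, str):
        seen = (seen,)
    return {NAMES[s] for s in seen}
```

For the two sample files, newline_styles("f1.txt") would give {"CRLF"} and newline_styles("f2.txt") would give {"LF"}; a file mixing styles yields more than one entry.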