Locate files which ONLY contain printable characters in bash script

Question

2.3k Views Asked by RikSaunderson At 01 July 2025 at 18:39

I'm trying to write a bash script that looks at a directory full of files and categorises them as either plaintext or binary. A file is plaintext if it ONLY contains plaintext characters, otherwise it is binary. So far I have tried the following permutations of grep:

#!/bin/bash
FILES=`ls`
for i in $FILES
do
    ########GREP SYNTAX###########
    if grep -qv -e[:cntrl:] $i
    ########/GREP SYNTAX##########
    then
        mv $i $i-plaintext.txt
    else
        mv $i $i-binary.txt
    fi
done

In the grep syntax line, I have also tried the same without the -v flag and with the branches of the if statement swapped, as well as both combinations of the same with [:alnum:] and [:print:]. All six of these variations produce some files labelled binary which consist solely of plaintext and some files labelled plaintext which contain at least one non-printable character.

I need to find a way to identify files that only contain printable characters, i.e. A-Z, a-z, 0-9, punctuation, spaces and new lines. All files containing any character that is not in this set should be classified as binary.

I've been bashing my head against a wall trying to sort this for half a day. Help! Thanks in advance, Rik

Bart Sas On 21 September 2010 at 09:33

You can use the -I option of grep which will treat binary files as files without a match and just use a regex that will always match (like the empty string):

if grep -qI -e '' "$i"
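
As a rough sketch of how that test might slot into the loop from the question: the function names is_text_per_grep and classify_dir are made up for illustration, and the directory is passed as an argument so nothing is renamed until the function is called. Note that this relies on grep's own idea of "binary", which is looser than the character set in the question (GNU grep, as far as I know, keys mainly on NUL bytes and on bytes that are invalid in the current locale), and that an empty file has no lines to match, so it ends up in the binary branch.

```shell
#!/bin/bash
# is_text_per_grep is a hypothetical helper name, not part of the answer.
# It succeeds when grep -I does not consider the file binary and the
# file has at least one line (the empty pattern matches every line).
is_text_per_grep() {
    grep -qI -e '' -- "$1"
}

# classify_dir is also hypothetical: it renames each regular file in the
# directory given as $1, using the suffixes from the question.
classify_dir() {
    local f
    for f in "$1"/*
    do
        [ -f "$f" ] || continue   # skip directories and an unmatched glob
        if is_text_per_grep "$f"
        then
            mv -- "$f" "$f-plaintext.txt"
        else
            mv -- "$f" "$f-binary.txt"
        fi
    done
}
```

Calling classify_dir . would then process the current directory.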

**Dennis Williamson** · Accepted Answer

First, you can and should use

for f in *

instead of putting the output of ls in a variable. The chief reason for doing this is to be able to handle filenames that include spaces.
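
To see the difference, here is a small throwaway demonstration. The temporary directory and the two file names are invented for the example; it only counts loop iterations and renames nothing.

```shell
#!/bin/bash
# Hypothetical demo: two files, one of which has a space in its name.
tmpdir=$(mktemp -d)
touch "$tmpdir/plain.txt" "$tmpdir/two words.txt"

# Unquoted command substitution is split on whitespace, so the name
# "two words.txt" becomes two separate loop items.
ls_count=0
for i in $(ls "$tmpdir")
do
    ls_count=$((ls_count + 1))
done

# A glob expands to exactly one word per file, whatever the name contains.
glob_count=0
for f in "$tmpdir"/*
do
    glob_count=$((glob_count + 1))
done

echo "ls: $ls_count iterations, glob: $glob_count iterations"
# prints "ls: 3 iterations, glob: 2 iterations"
rm -r "$tmpdir"
```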

Second, you need to enclose the character class in a second set of brackets, or grep is going to treat those characters as literals. I would also enclose the whole expression in single quotes to protect it from the shell. Instead of using -v, negate the print class, so that a match means the file contains a non-printable character, and see if that works for you.

if grep -aq -e '[^[:print:]]' "$f"

And as shown in that line, always quote variables when they contain filenames.

mv "$f" "$f-plaintext.txt"

To keep grep from complaining about binary files, use -a.

The variable i is often used for an integer or an index. Use f or file.

Finally:

#!/bin/bash
for f in *
do
    if grep -aq -e '[^[:print:]]' "$f"
    then
        mv "$f" "$f-binary.txt"
    else
        mv "$f" "$f-plaintext.txt"
    fi
done
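
If you want something a bit more defensive, a variant along these lines might do. The function names has_nonprintable and classify_printable are made up here, and forcing LC_ALL=C is my own assumption rather than part of the script above: it is meant to make [:print:] cover only printable ASCII regardless of the user's locale. Be aware that tabs and carriage returns are not in [:print:], so files containing them are labelled binary, which matches the character set listed in the question but may not be what everyone wants.

```shell
#!/bin/bash
# has_nonprintable is a hypothetical helper name. It succeeds when some
# line of the file contains a character outside [:print:]. Newlines are
# line separators for grep, so they are never tested against the pattern.
has_nonprintable() {
    LC_ALL=C grep -aq -e '[^[:print:]]' -- "$1"
}

# classify_printable is also hypothetical: it renames each regular file
# in the directory given as $1.
classify_printable() {
    local f
    for f in "$1"/*
    do
        [ -f "$f" ] || continue   # skip directories and an unmatched glob
        if has_nonprintable "$f"
        then
            mv -- "$f" "$f-binary.txt"
        else
            mv -- "$f" "$f-plaintext.txt"
        fi
    done
}
```

The -- before the file name keeps grep and mv from reading a name that starts with a dash as an option. Calling classify_printable . processes the current directory.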
