So I have gene files named gene_1.fa, gene_2.fa, ... gene_19500.fa and want to sort them into folders 200, 400, 600... 19600 for a downstream pipeline. I have an idea of how to do this but it's pretty gruesome:
for file in "${files[@]}"; do
    base_name=$(basename "$file")
    gene_number=$(echo "$base_name" | cut -d'_' -f2 | cut -d'.' -f1)
    to_path="(path to folder containing 200, 400, ... 19600 folders)"
    # if it's gene_200.fa, gene_400.fa etc., copy into that dir
    if (( gene_number % 200 == 0 )); then
        cp "$file" "$to_path/$gene_number/$base_name"
    elif (( gene_number < 200 )); then
        cp "$file" "$to_path/200/$base_name"
    elif (( gene_number > 19400 )); then
        cp "$file" "$to_path/19600/$base_name"
    # the endless pain of 200-400, 400-600, 600-800 ... 19200-19400
    elif (( gene_number > 200 && gene_number < 400 )); then
        cp "$file" "$to_path/400/$base_name"
    elif ....
My question is then: is there a less tedious way to do this without copying any one file into multiple folders? (E.g. if I only tested whether the gene number is less than the folder name, a file named gene_3.fa would be copied into all folders.)
You could do this: change the `for` to loop over the files, change the `delta` value to `200`, and add the `cp` or `mv` as you like.

The math works because bash does integer arithmetic, not floating point, so the part after the decimal point after the division is truncated.
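A minimal sketch of the truncating-division idea, not necessarily the exact code the answer had in mind. It assumes files are named `gene_N.fa` with no leading zeros in `N`, and that a file belongs in the folder named after the next multiple of `delta` at or above `N` (so 1..200 go to `200`, 201..400 go to `400`, and so on). The function names `bucket_for` and `sort_genes` are hypothetical, introduced only for illustration.

```shell
#!/usr/bin/env bash

# Folder step size; the question uses folders 200, 400, ... 19600.
delta=200

# Hypothetical helper: round a gene number up to the nearest multiple of delta.
# Bash integer division truncates, so adding (delta - 1) before dividing
# rounds up: 1..200 -> 200, 201..400 -> 400, 19401..19600 -> 19600.
bucket_for() {
    local n=$1
    echo $(( (n + delta - 1) / delta * delta ))
}

# Hypothetical helper: copy every gene_N.fa in src into to_path/<bucket>/.
sort_genes() {
    local src=$1 to_path=$2
    local file base n bucket
    for file in "$src"/gene_*.fa; do
        [ -e "$file" ] || continue      # glob matched nothing
        base=${file##*/}                # strip the directory part
        n=${base#gene_}                 # strip the "gene_" prefix
        n=${n%.fa}                      # strip the ".fa" suffix
        bucket=$(bucket_for "$n")
        mkdir -p "$to_path/$bucket"
        cp "$file" "$to_path/$bucket/$base"   # swap cp for mv if preferred
    done
}
```

Calling `sort_genes <source dir> <destination dir>` places each file in exactly one folder, because the arithmetic yields a single bucket per gene number and no chain of `elif` range checks is needed.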