How to subtitute the cat command in a bash file?

90 Views Asked by At

I'm working in testing my smt model using different solvers. In this moment I have the file .smt2 which contains the instruction of a model converted in smtlib.
I have created a file .sh which implements the linear search in order to test my model using different solvers.
This is the code of my bash file:

#!/bin/bash

# Usage:
#   ./smt-opt <SMT-LIB file> <obj.var name> <initial bound> <solver> <min/max>
# No (get-model) query should in the SMT-LIB input file.

TIMEOUT=300 # Timeout in seconds. TODO: Set timeout as a parameter.
in_file=$1
obj_var=$2
obj_val=$3

if
  [ "$4" = 'z3' ];
then
  solver_cmd="z3 -in "
elif
  [ "$4" = 'cvc4' ];
then
  solver_cmd="cvc4 --lang smt --produce-models --incremental "
elif
  [ "$4" = 'cvc5' ];
then
  solver_cmd="cvc5 --produce-models --incremental "
else
  echo "Unknown solver!"
  exit 1
fi
if [ "$5" = 'max' ]; then
  # Maximization problem.
  rel='>='
  next=1
else
  # Minimization problem.
  rel='<='
  next=-1
fi

# Creating input/output pipes.
in_pipe=pipe.in
out_pipe=pipe.out
rm $in_pipe $out_pipe pkill sleep 2>/dev/null
mkfifo $in_pipe
mkfifo $out_pipe

# Block input pipe for $TIMEOUT seconds.
sleep $TIMEOUT > $in_pipe &
# Send piped input data to solver, and the output of solver to output pipe.
$solver_cmd < $in_pipe > $out_pipe &
# Feed input pipe with the queries SMT-LIB specified in the input file.
cat $in_file > $in_pipe
# SOL = 0 iff solver hasn't find any solution yet.
SOL=0

while read line < $out_pipe
do
  if [ ! "$line" = 'sat' ]; then
    break;
  fi
  SOL=1
  echo -e "$obj_var = $obj_val\n----------"
  # Updating obj. value (linear search). TODO: Implement binary search.
  obj_val=`awk -v ov=$obj_val -v n=$next 'BEGIN {ov = int(ov) + n; print ov}'`
  if
    [ $obj_val -lt 0 ]
  then
    # Adjusting for unary minus.
    echo "(assert ($rel $obj_var (-"`echo $obj_val | tr '-' ' '`")))" > $in_pipe
  else
    echo "(assert ($rel $obj_var $obj_val))" > $in_pipe
  fi
  echo "(check-sat)" > $in_pipe
#  echo "%%% Solving with $obj_var $rel $obj_val"
done

if
  [[ $SOL -eq 0 ]]
then
  if [ "$line" = 'unsat' ]; then
    echo '=====UNSATISFIABLE====='
  else
    echo '=====UNKNOWN====='
  fi
elif [ "$line" = 'unsat' ]; then
  echo '=========='
fi

rm $in_pipe $out_pipe pkill sleep 2>/dev/null

In particular my problem is in this line:

cat $in_file > $in_pipe

The problem of this instruction is that it takes a lot of times (the dimension of my smtlib file is 80 megabytes). How can I substitute this instruction in order to have a more efficient instruction ?

I am expecting a more efficient instruction which works for this project.

1

There are 1 best solutions below

2
On

cat is not your performance problem here.

Really, it's not. The things on the output side of the pipe are much slower than the things on the input side; assuming reasonable I/O performance, cat is getting hung waiting for solver_cmd to run, and solver_cmd could well be getting hung waiting for read to run. (Remember, all parts of a pipeline run in parallel, and pipe buffer size is limited -- once the pipe is full, nothing more can be written into it on the input side until something is pulled off the output side).

...but you can replace it with a direct handle on your input file.

Instead of trying to replace cat with something that does a faster copy from a file to a FIFO, we can just attech your solver process direct to the file and not try to use any FIFO at all.

#!/usr/bin/env bash
#              ^^^^- bash only, not sh

solve_type=$4
case $solve_type in
  z3)   solver_cmd=( z3 -in ) ;;
  cvc4) solver_cmd=( cvc4 --lang smt --produce-models --incremental ) ;;
  cvc5) solver_cmd=( cvc5 --produce-models --incremental ) ;;
  *)    echo "ERROR: Unknown solver type" >&2; exit 1;;
esac
  
  
while IFS= read -r line; do
  echo "Processing $line" >&2
  # ...put your logic here as appropriate...
done < <("${solver_cmd[@]}" <"$in_file").

Note the <"$in_file" -- no mkfifo or cat needed. This will be much faster, in particular, when your solver can take advantage of a seekable file descriptor on input to read multiple parts of the input stream in parallel and hand them off to different threads or subprocesses; it'll be a considerably more marginal improvement otherwise.


None of this will help if you have a deadlock

...and it looks like your code does have one.

Remember that only one process can write into a FIFO at a time: if cat is writing into it, then echo can't. And if the cat can't finish until after the echo does (because the output side of your pipeline blocks on that echo's completion and there's enough data to fill the pipe buffer so the solver can't write more content until the echo is complete), then your program will never complete.