I am trying to prove the correctness of an algorithm that decides, in linear time, whether a list of integers can be split into two non-empty sublists of equal sum. Here you can see the algorithm I have chosen to do so.
I would like to get some feedback regarding:
- The convenience of my definition for the splitting function.
- The "induction" hypothesis to use in my situation.
Please, bear in mind that up to now I have only worked with apply-scripts and not with Isar proofs.
Here is a preliminary implementation of the algorithm and the correctness definition:
definition
"ex_balanced_sum xs = (∃ ys zs. sum_list ys = sum_list zs ∧
xs = ys @ zs ∧ ys ≠ [] ∧ zs ≠ [])"
fun check_list :: "int list ⇒ int ⇒ int ⇒ bool" where
"check_list [] n acc = False" |
"check_list (x#xs) n acc = (if n = acc then True else (check_list xs (n-x) (acc+x)))"
fun linear_split :: "int list ⇒ bool" where
"linear_split [] = False" |
"linear_split [x] = False" |
"linear_split (x # xs) = check_list xs (sum_list xs) x"
The theorem to prove is as follows:
lemma linear_correct: "linear_split xs ⟷ ex_balanced_sum xs"
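As a quick sanity check before attempting the proof, the function can be evaluated on a few small lists. This is only a sketch: it assumes the definitions above have been loaded, and the expected results in the comments were worked out by hand from the equations of check_list. ex_balanced_sum is left out because it quantifies over all lists, so I would not expect value to evaluate it directly.

```isabelle
(* Sanity checks for the two-function version defined above. *)
value "linear_split [1, 2, 3]"  (* should be True:  [1, 2] @ [3], both parts sum to 3 *)
value "linear_split [3, 1, 2]"  (* should be True:  [3] @ [1, 2], both parts sum to 3 *)
value "linear_split [1, 2, 4]"  (* should be False: no split point balances the sums *)
value "linear_split [0]"        (* should be False: a singleton has no two non-empty parts *)
```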
If I try, for instance, to prove the first implication with:
lemma linear_correct_1: "linear_split xs ⟹ ex_balanced_sum xs"
apply(induction xs rule: linear_split.induct)
Then I get a list of subgoals that I think are not appropriate:
- linear_split [] ⟹ ex_balanced_sum []
- ⋀x. linear_split [x] ⟹ ex_balanced_sum [x]
- ⋀x v va. linear_split (x # v # va) ⟹ ex_balanced_sum (x # v # va)
In particular, these subgoals don't have an induction hypothesis (am I right?). I tried a different induction by just writing apply(induction xs)
but then the goals look like this:
- linear_split [] ⟹ ex_balanced_sum []
- ⋀a xs. (linear_split xs ⟹ ex_balanced_sum xs) ⟹ linear_split (a # xs) ⟹ ex_balanced_sum (a # xs)
Here the hypothesis is also not an induction hypothesis since it is assuming an implication.
So, what is the best way to define this function to get a nice induction hypothesis?
Edit (a one-function version)
fun check :: "int list ⇒ int ⇒ int ⇒ bool" where
"check [] n acc = False" |
"check [x] n acc = False" |
"check (x # y # xs) n acc = (if n-x = acc+x then True else check (y # xs) (n-x) (acc+x))"
definition "linear_split xs = check xs (sum_list xs) 0"
Background
I was able to prove the theorem linear_correct for a function (splitl) that is very similar to the function check in the statement of the question. Unfortunately, I would prefer not to make any attempts to convert the proof into an apply script. The proof below is the first proof that came to my mind after I started investigating the question. Thus, there may exist better proofs.
Proof Outline
The proof is by induction on the length of the list. In particular, assume that
splitl xs (sum_list xs) 0 ⟹ ex_balanced_sum xs
holds for all lists of length less than l, and consider a list of length l. If l = 1, then the result is easy to show. Assume that l >= 2. Then the list can be expressed in the form x#v#xs. In this case, if it is possible to split the list using splitl, then it can be shown (splitl_reduce) that either
"splitl ((x + v)#xs) (sum_list ((x + v)#xs)) 0" (1)
or
"x = sum_list (v#xs)" (2).
Thus, the proof proceeds by cases for (1) and (2). For (1), the length of the list (x + v)#xs is l-1. Hence, by the induction hypothesis, ex_balanced_sum ((x + v)#xs). Therefore, by the definition of ex_balanced_sum, also ex_balanced_sum (x#v#xs), since any split ((x + v)#ys) @ zs of the shorter list gives the split (x#v#ys) @ zs of the original list with the same sums. For (2), it can be easily seen that the list can be expressed as [x]@(v#xs) and, in this case, given (2), it satisfies the conditions of ex_balanced_sum by definition.
The proof for the other direction is similar and based on the converse of the lemma associated with (1) and (2) above: if
"splitl ((x + v)#xs) (sum_list ((x + v)#xs)) 0"
or "x = sum_list (v#xs)", then
"splitl (x#v#xs) (sum_list (x#v#xs)) 0".
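The outline can be turned into an Isar skeleton along the following lines. This is only a sketch under stated assumptions: because the definition of splitl is not reproduced here, the skeleton is written for the function check from the question as a stand-in; the theory name Split_Sketch and the lemma names check_reduce and check_imp_balanced are made up for illustration; and both sorry placeholders mark steps that still have to be proved.

```isabelle
theory Split_Sketch
  imports Main
begin

(* Definitions copied from the question. *)
definition "ex_balanced_sum xs = (∃ys zs. sum_list ys = sum_list zs ∧
  xs = ys @ zs ∧ ys ≠ [] ∧ zs ≠ [])"

fun check :: "int list ⇒ int ⇒ int ⇒ bool" where
  "check [] n acc = False"
| "check [x] n acc = False"
| "check (x # y # xs) n acc =
     (if n - x = acc + x then True else check (y # xs) (n - x) (acc + x))"

(* Reduction step giving cases (1) and (2) of the outline.
   Stated for check instead of splitl (assumption); proof omitted. *)
lemma check_reduce:
  assumes "check (x # v # xs) (sum_list (x # v # xs)) 0"
  shows "check ((x + v) # xs) (sum_list ((x + v) # xs)) 0 ∨ x = sum_list (v # xs)"
  sorry

(* First implication, by induction on the length of the list. *)
lemma check_imp_balanced:
  "check xs (sum_list xs) 0 ⟹ ex_balanced_sum xs"
proof (induction "length xs" arbitrary: xs rule: less_induct)
  case less
  (* less.hyps is the statement for every strictly shorter list;
     less.prems is the assumption about xs itself.
     Remaining work: split on the shape of xs, then on check_reduce. *)
  then show ?case sorry
qed

end
```

The point of inducting on length xs rather than on xs itself is that the hypothesis then applies to (x + v) # xs, which is shorter than x # v # xs but is not its tail, so plain structural induction would not cover it.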