I have a very basic exposure to algorithms. I am a graduate in Mathematics. I was reading Halting Problem in the book Discrete Mathematics with applicationbs by Susanna Epp. It has a following theorem :

Theorem : There is no computer algorithm that will accept any algorithm X and data set D as input and then will output "halts" or "loops forever" to indicate whether or not X terminates in a finite number of steps when X is run with data set D.

Proof : Suppose there is an algorithm, call it CheckHalt, such that if an algorithm X and a data set D are input, then CheckHalt prints "halts" if X terminates in a finite number of steps when run with the data set D or "loops forever" if X does notterminate in a finite number of steps when run with data set D.

Now next lines are those which I don't understand in this proof

Observe that the sequence of characters making up an algorithm X can be regarded as a data set itself. Thus it is possible to consider running a CheckHalt with input (X,X).

So I have understood that CheckHalt essentially gets input as an algorithm X and a data set D. It tells whether if we run the algorithm X with that data set D as it's (X's) input, then X will halt or loop forever. Thus CheckHalt(X,D) seems good.

My question is how can an algorithm X have an input X itself i.e how can we call an algorithm as a data set?

What is the meaning of the sentence "sequence of characters making up an algorithm X"?

I can understand CheckHalt(X,D). But what is CheckHalt(X,X)?

Thanks.

2

There are 2 best solutions below

4
On

A Dataset is a pretty open definition, so it should definitely be conceivable that a dataset would consist of a string of characters. But I think an example will help.

Imagine that X is an algorithm for counting periods (.) in a string. X could be written any number of ways, but if you want imagine this particular way:

  1. Start a count at 0 and a position pointer at 0.
  2. Read the character at pointer position in the string.
  3. If the character is a ., increment our count.
  4. If the character is the last character in the string, return our count.
  5. Increment the position pointer
  6. Go back to step 2.

The six step list I just wrote is itself a string... and can thus be applied to itself as data (we get 12 I think). In this case the algorithm can be applied to itself as data.

In this case, CheckHalt(X,X) would return halt since the algorithm does not loop forever.

Of course, not every algorithm will be able to accept itself as data. For instance, the GCD algorithm needs integer input, so it could not be applied to itself. However, I presume the counter-example being constructed involves an algorithm that can be applied to itself as a string of characters.

0
On

Consider the following algorithm to reverse a string:

function reverse(s) {
    var ret = "";
    for (var i = s.length - 1; i >= 0; i--) {
        ret = ret + s[i];
    }
    return ret;
}

It takes a string as input, and returns a string. Now consider the following input dataset:

"function reverse(s) {\n"
+ "    var ret = \"\";\n"
+ "    for (var i = s.length - 1; i >= 0; i--) {\n"
+ "        ret = ret + s[i];\n"
+ "    }\n"
+ "    return ret;\n"
+ "}"

This is a string. It happens to be a string encoding the source of an algorithm. Because it is a string, it is a valid input to algorithms that accept strings; like the algorithm it happens to encode does. Indeed, if pass this algorithm ('s encoding) to itself, you get the following well-defined output:

"}"
+ ";ter nruter    "
+ "}    "
+ ";]i[s + ter = ter        "
+ "{ )--i ;0 => i ;1 - htgnel.s = i rav( rof    "
+ ";"" = ter rav    "
+ "{ )s(esrever noitcnuf"

In the same way, if you have a program X with a string encoding enc(X) and X accepts a string, you can pass enc(X) to X. If you have another algorithm that takes two strings, you can pass enc(X) as both of the parameters.