Why does OCaml have mandatory distinct float and int literal syntax?


OCaml has distinct syntax for BOTH:

  • float operations vs. int operations. Float operations end in a dot (+. instead of +).
  • float literals vs. int literals. Float literals end in a dot (3. instead of 3).
# 3 + 3;;
- : int = 6
# 3. +. 3.;;
- : float = 6.
# 3. + 3;;
Error: This expression has type float but an expression was expected of type
         int

I can see using one of these mechanisms to disambiguate, but why are both always required? For example, I could see a case where . is sometimes needed at the end of a literal, but not why it is needed in 3 +. 3, where OCaml could figure out that we want float arithmetic because the operator already says we want float arithmetic.

I'm looking for a specific technical justification based on interaction with other language features, not opinion or arguments from ergonomics.

There are 2 answers below.

The first answer:
In short, it is because (+.) and (+), just like 1 and 1., denote different values, despite the fact that they look alike. Those values have different types, different representations, and different semantics. The language feature that would make (+) and 1 work the same in different contexts is called ad-hoc polymorphism, and it is not present in OCaml. OCaml does not try to infer which value you mean from the textual representation of your program and its type; instead, it infers the type of your program from the values you use and how you use them. This is an important distinction between ML and languages that go the other way around, i.e., that start from a user-specified type, infer the values that match that type, and check that those values are used correctly. In ML the input is the program and the output is the type of that program, which is (a) used to prove that the program is well-typed and will not have any type errors at runtime, and (b) used to generate performant native code with all inferred types erased.

It is also important to understand that type inference in OCaml is not a convenience utility that lets you omit types when they can be inferred (like C++'s auto or local type inference in Scala). It is a fundamental step in the compilation process and in the semantics of the language, as OCaml must be able to infer the type of any program. The types we occasionally write in OCaml programs are used only as constraints and are never taken as inputs. Moreover, type annotations never change the behavior of the program (unless GADTs come into play, but that is a completely different story).
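To make the direction of inference concrete, here is a minimal illustration; the function names add_ints and add_floats are mine:

(* The operator in the body, not an annotation, determines the type. *)
let add_ints x y = x + y      (* inferred: int -> int -> int *)
let add_floats x y = x +. y   (* inferred: float -> float -> float *)
(* There is no way to write a single [add] whose body works at both
   types; that is exactly the ad-hoc polymorphism OCaml lacks. *)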

For a deeper insight, we should recall that the type inference algorithm underneath OCaml is syntax-driven and declarative. Syntax-driven means that the syntax of a program fully determines its type. Moreover, the algorithm ensures that if a type exists (i.e., the program is well-typed), then the inferred type is unique and principal: every other type that could be assigned to the program is an instance of the inferred one. Declarative means that the rules for assigning types to programs are described by declarative typing rules. This algorithm gives a formal translation of programs to types, which enables the Curry-Howard correspondence, a deep connection between computer programs and mathematical proofs. It lets us reason about program correctness and, in the other direction, use programs to prove that our theorems are correct. This brings us back to the history of OCaml/ML, which was originally a Meta Language (ML) for the LCF (Logic for Computable Functions) theorem prover.
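A small example of principality, assuming the standard toplevel:

let id x = x                 (* inferred: 'a -> 'a, the principal type *)
let id_int : int -> int = id (* fine: int -> int is an instance of it  *)
let plus1 x = x + 1          (* inferred: int -> int, fixed entirely by
                                the syntax, because (+) has type
                                int -> int -> int                       *)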

Given that we agree we don't want to lose this important property of the language, the question remains open: why couldn't we implement an ad-hoc polymorphism that is still syntax-driven and declarative? In fact, we can; Haskell does it with type classes, and there is ongoing work on adding modular implicits to OCaml. But it all comes with trade-offs, and the result would, in the end, be a rather different language. As it stands, each OCaml value has a type or a type scheme, and there is no type in the OCaml runtime system that could represent a + operator working on both ints and floats.
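To see what modular implicits would automate, here is a hedged sketch using today's first-class modules; the names NUM, Int_num, and Float_num are illustrative, not a standard API. The overloading works, but the caller must name the dictionary module by hand:

module type NUM = sig
  type t
  val add : t -> t -> t
end

module Int_num : NUM with type t = int = struct
  type t = int
  let add = ( + )
end

module Float_num : NUM with type t = float = struct
  type t = float
  let add = ( +. )
end

(* Polymorphic over the carrier type, but the module is passed
   explicitly; modular implicits would infer it from the arguments. *)
let add (type a) (module N : NUM with type t = a) (x : a) (y : a) : a =
  N.add x y

let six  = add (module Int_num) 3 3       (* : int   *)
let six' = add (module Float_num) 3. 3.   (* : float *)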

With that said, it would be unfair to OCaml not to mention that you can define your own (+) operator with type number -> number -> number, where number is an existential type, number = Num : 'a typeid * 'a -> number, with a separate table of operations for each typeid stored in some hidden hashtable. But this is dynamic typing (notice how the typeid is packed into each value), which is completely different from static typing, which gives you guarantees about your program before it runs: this (+) function can fail at runtime, and the typing rules are no longer declarative but intrinsic to the implementation of the (+) operator.
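Here is a minimal sketch of that construction. For brevity it replaces the hidden hashtable of operations with a closed GADT of type witnesses, but the runtime behavior is the same: the check happens when (+) runs, not when the program is compiled.

(* Runtime type witnesses; a full implementation would use an
   extensible typeid with a table of operations per id. *)
type _ typeid = Int : int typeid | Float : float typeid

(* The existential from the answer: a value packed with its typeid. *)
type number = Num : 'a typeid * 'a -> number

let ( + ) a b : number =
  match a, b with
  | Num (Int, x), Num (Int, y) -> Num (Int, Stdlib.( + ) x y)
  | Num (Float, x), Num (Float, y) -> Num (Float, Stdlib.( +. ) x y)
  | _ -> failwith "ill-typed addition, detected only at runtime"

let three  = Num (Int, 1) + Num (Int, 2)       (* Num (Int, 3) *)
let three' = Num (Float, 1.) + Num (Float, 2.) (* Num (Float, 3.) *)
(* Num (Int, 1) + Num (Float, 2.) type-checks but raises Failure. *)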

The second answer:

Each function has a type; this is what drives the OCaml syntax. If you pass an argument of an inappropriate type, the compiler rejects your code with an error.

This seems pretty obvious, but it is also the case for every function, including the basic math operators: the function named +. has the type:

val ( +. ) : float -> float -> float

That's all. If you give an int instead of a float, the compiler tells you your code is wrong. OCaml does not figure anything out: the compiler just checks that the arguments match the function's signature. If they don't? Error.

You could argue that the point is moot, because any C compiler will accept the equivalent line of code and produce a float as expected. But that is another language, and even if it could¹ work, it is not allowed in OCaml. You can't escape the types.

The language specification is pretty clear: each function is defined by the types of its parameters, and you can't pass a parameter that has more than one type.

This allows the compiler to optimize the code easily, but you have to explicitly convert each argument to the type expected by the function's signature.
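For example, with the standard conversion functions float_of_int and int_of_float:

# 3. +. float_of_int 3;;
- : float = 6.
# int_of_float 3. + 3;;
- : int = 6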

¹ It actually would not, because the memory representation of an int in OCaml is not exactly the same as that of an int in C (but that is another subject).