Δ ℚuantitative √ourney

Learning Lean (Part 2)

In part 1 we learned some basics about Lean and we defined the natural numbers as a type Nat and the addition function for Nat.

namespace Tutorial

inductive Nat : Type where
| zero : Nat
| succ : Nat → Nat

def add (a b : Nat) : Nat :=
  match b with
  | zero => a
  | succ c => succ (add a c)

end Tutorial

Now we will learn how to prove our first theorems about the natural numbers. [Note that all of the code in these posts are in the namespace Tutorial to avoid name clashes with types and functions in Lean's standard library.]

Here we declare our first theorem:

theorem add_zero_zero : add zero zero = zero := sorry

Almost every expression in Lean is either an inductive type or a function type (constructed using def), but again, Lean has a lot of syntactic sugar and aliases for things. The keyword theorem is just an alias for def, and indeed, theorems are just terms of a specified type (in fact, often theorems are just functions). We used the def keyword in the last post to define functions, but as the name suggests, it is a general way to define a named term of some specific type, or in other words, it lets you assign a type and expression or value to an identifier.

For example, def x : Nat := succ zero

We've defined a term named x of type Nat assigned to the value zero. Now you can use it in other expressions, such as: #reduce add x x

Back to the theorem.

theorem add_zero_zero : add zero zero = zero := sorry

So we are defining a theorem called add_zero_zero and this term has the type add zero zero = zero. We have assigned the term to the expression sorry, which is a keyword in Lean that serves as a placeholder when you want to be able to state a theorem without proving it yet.

It is interesting that this theorem has the type add zero zero = zero, what is going on here? We are trying to state the theorem that 0 + 0 = 0 (but we haven't built the nice + notation for our Nat type yet), and we do so by defining a term with that type. If we can actually specify an expression after the assignment operator := that constructs a term of type 0 + 0 = 0, then our term will type check as a valid term, or in this context, Lean will verify we have proved the theorem.

Firstly, it might seem ridiculous that we should prove 0 + 0 = 0, isn't that totally obvious? Well Lean is a powerful programming environment but it doesn't generate theorems automatically, it doesn't know what we care about, so even extremely trivial statements like this must be explicitly stated and proved. However, Lean does agree that proving this is obvious in the sense that the proof is trivial. And recall, by proof we mean an implementation of an expression that generates a term of this type.

Well what exactly is this type add zero zero = zero? We are trying to prove two things are equal, and recall the (mostly true) mantra everything in Lean is an inductive type or function type, so this expression must be one of those. Indeed, this is an equality type, and the equality type is defined in Lean as an inductive type. Here's how it is defined:

inductive Eq {α : Sort _} : α → α → Prop where
  | refl (a : α) : Eq a a

Okay there's a lot more going on here than our Nat type, and we will need to learn more features of Lean to understand it. First, notice this is defining an inductive type called Eq, but then there are these curly braces and an alpha symbol and sort... Confusing.

{α : Sort _} is an implicit input to this type.

Recall our identity function, def natId (a : Nat) : Nat := a

The a is an explicit input to this function; without supplying a term a, this function doesn't work. However, some functions have inputs that do not need to be supplied explicitly, but can be inferred by Lean automatically. When this is the case, these inputs can be put in curly braces instead of parentheses.

As an example, let's make a more general version of our identity function. Before our identity function only worked on terms of type Nat, but we can make a polymorphic identity function that works on terms of any type.

def id { α : Type _ } (a : α) := a

This function has one implicit (optional because it can be automatically inferred) input α (which you can type in Visual Studio Code by typing \a) and one explicit (required) a. If you apply the function id to a term of any type it will work and just return that term unchanged.

The implicit input α is assigned the type Type _ because it is itself representing some type, and we want to make sure our function works on all types in the type hierarchy Type 0, Type 1, Type 2 ... that we discussed in the last post. These numbered type levels are called universes, by the way. The underscore character _ is like a "wildcard" that tells lean to infer what should go there based on the context.

When we apply id to a term of type Nat,

#check id zero
--Output: id zero : Nat

Lean will infer that α must be Nat and Nat lives in the type universe Type 0, so the wildcard _ can be inferred to be 0. The underscore wildcard is quite useful as Lean can often infer simple holes we leave for it.

Okay, back to trying to understand the equality type.

inductive Eq {α : Sort _} : α → α → Prop where
  | refl (a : α) : Eq a a

We now know what the {α : Sort _} means, oh wait, no we don't, because it says Sort _ instead of Type _. Ok, so Type u (where u is some non-negative number) is actually an alias for Sort (u + 1), so e.g. Type 0 = Sort 1. What's the point of this alias? Well, because Sort 0 is special. Sort 0 itself has another alias, namely Prop. Prop (Sort 0) is a special type universe because it is where propositions live.

A proposition is a statement with a truth value, i.e. something that we think is either true or false, like 0 + 0 = 0. A proposition in Lean, therefore, is just a type that lives in the Prop universe, but everything else we've talked about with regard to "regular" types and universes like Nat and Type 0 still applies. And as mentioned earlier, a proof of a proposition (or theorem) is just an expression that generates or constructs a term of that specific type, and that type will itself have the type Prop.

So the type 0 + 0 = 0 is of type Prop (aka Sort 0), and a proof of 0 + 0 = 0 is an expression that constructs a term of type 0 + 0 = 0.

Oh, and the main reason Prop is special is that in Lean, any two terms of a given Prop are considered definitionally equal. So if we came up with two different ways of proving that 0 + 0 = 0, then Lean will consider them definitionally equal, unlike in other type universes Type 0, Type 1, etc... If we have proved a proposition to be true, it no logner matters in Prop how exactly it was proved, we just care that it was indeed proved.

And again, let's get back to trying to understand the equality type.

inductive Eq {α : Sort _} : α → α → Prop where
  | refl (a : α) : Eq a a

So far we have explained everything up to the colon. After the colon, we are defining the type of our new inductive type, and unlike Nat which has the (default) type Type 0, Eq is given the type α → α → Prop, which you should recognize as a function type. The input types to functions are all the types before the last type after a chain ... → ... → .... So in this case, this function has two inputs of type α, and remember, α is an implicit input type that Lean will infer for us.

Inductive types like this are actually defining a family of inductive types, each family is parameterized by the specific α given. If we parameterize Eq with Nat, then we get a family of inductive equality types where some expression of Nat (propositionally) equals some other expression of type Nat.

We can force implicit inputs to become explicit inputs to a function by prefixing the function name with the @ character, for example:

#check @Eq Nat
--Output: Eq : Nat → Nat → Prop

This defines the family of equality propositions between natural numbers. A specific proposition in this family of equality types Nat → Nat → Prop would be what we have been trying to prove, namely add zero zero = zero (aka 0 + 0 = 0).

#check Eq (add zero zero) zero
-- Which is the same as:
#check add zero zero = zero
--Output: add zero zero = zero : Prop

By supplying Eq with two terms of type Nat, we get a proposition, as the type signature Nat → Nat → Prop would predict.

Back to the equality type.

inductive Eq {α : Sort _} : α → α → Prop where
  | refl (a : α) : Eq a a

We should now understand the top line, now let's figure out what's going on in the one constructor for this type. The one constructor is called refl for reflexivity, which is a math term, and it allows us a construct a term of type Eq _ _ (there are those wildcards again) simply by giving it a term of type α.

For example,

#check Eq.refl zero
--Output: Eq.refl zero : zero = zero

The only things that are equal are those terms that are literally the same term, e.g. zero = zero. This defines syntactic equality, the notion that expressions in Lean are equal only when they represent the exact same string of characters. There is another kind of equality in Lean that is not an inductive type, namely, definitional equality. If I define def x := zero then x = zero by definition, because anywhere there is an x in an expression it can be replaced with zero.

Now we can get back to studying our first theorem.

theorem add_zero_zero : add zero zero = zero := sorry

We now know that in order to prove this theorem (replace the sorry with a valid expression of type add zero zero = zero), we need to construct an equality term, and we now know the way to do that is using equality's single constructor refl.

At long last, we are ready for the proof.

theorem add_zero_zero : add zero zero = zero := Eq.refl zero

We proved this theorem with Eq.refl zero. How do we know it is proved? Because Lean doesn't give us any errors and if we do #check add_zero_zero we don't get an error.

But wait, what just happened when we did Eq.refl zero? Well, Lean will automatically compute (or reduce) add zero zero, which reduces to zero, so the proposition becomes zero = zero, which is a type whose term is constructed with Eq.refl zero. What this illustrates is that proofs of equality where one side of the equality expression can be automatically reduced (by definition of the function) can be proved by Eq.refl _. This pattern occurs frequently enough that, yep, there is an alias for it called rfl.

So this does the same thing, and is more commonly used:

theorem add_zero_zero : add zero zero = zero := rfl

Here rfl is essentially an alias for Eq.refl _ where again, the wildcard _ tells Lean to infer, in this case, that _ needs to be zero for the expression to type check.

Okay, so now we understand how to state the proposition that 0 + 0 = 0 and how to prove it Lean.

Let's prove something slightly more general, namely that for any a : Nat, a + zero = a, or in math-y notation that \(\forall a \in \mathbb{N},\ a + 0 = a\), where the upside down A \(\forall\) means "for all", \(\in\) means "in", and \(\mathbb{N}\) is the standard notation for the set of natural numbers, if that isn't already familiar to you.

Actually, let me take some time to degress to talk about the difference between mathematics in Lean and mathematics in say a typical university-level mathematics course. One could think of mathematics in very broad strokes as starting with some rules that are assumed to be true (i.e. axioms) and then just seeing what you can create from there. In this sense, a game like Chess or Go are mathematical because we can study the space of things that can happen given the rules.

It is possible to come up with all sorts of different starting rules (axioms), but one must take care that they do not lead to inconsistencies such as being able to prove 0 = 1, or like a board game where no one can win. But there are many axiom systems that are consistent, and many of them can be proved to be equivalent to other axiom systems, in the sense that expressions in one system can be faithfully translated into another system.

But it seems that by historical happenstance, modern mathematics worldwide is dominated by a particular axiom system called Zermelo–Fraenkel (with Choice) set theory, typically abbreviated ZFC. In ZFC, everything in mathematics is a set, which is an unordered collection of things without repeats. Then there are a bunch of rules (axioms) about how sets work and what you can do with them. Sets are denoted using curly braces, for example, we can say \(\{1,2,3\}\) is a set of the numbers \(1,2,3\). However, in set theory, everything is a set, even the numbers. You denote that some element is in a set using the \(\in\) notation, e.g. \(1 \in \{1,2,3\}\).

Many propositions in set theory are stated as propositions for all members of a set, such as our proposition that \(\forall a \in \mathbb{N},\ a + 0 = a\). Another form of proposition is the existence of some element or set, for example, \(\forall a \in \mathbb{N}, \exists b \in \mathbb{N}, b > a\), which is read as "for all natural numbers \(a\), there exists another natural number \(b\) such that \(b\) is greater than \(a\).

My goal is not to explain set theory because I only know the basics myself and that's a whole massive subject in its own right. The point I have been leading up to is that Lean does not use ZFC set theory as it's logical/mathematical foundation (i.e. axiom system). Lean uses type theory, which is actually a family of axiom systems. Lean uses a particular type theory called the Calculus of Constructions, which doesn't matter for our purposes unless you're into that sort of thing.

In type theory, everything is not a set, everything is a type as I have already exclaimed earlier. Types and sets are very similar, and as far as I know, unless you're a set theorist or into studying axiom systems, we can mostly ignore the differences. It turns out that every expression in Lean's type theory can be translated into ZFC set theory and vice versa, so all the mathematics that has been developed on top of ZFC can be easily translated into Lean's type theory. Perhaps in parallel universe, modern mathematics was developed in type theory rather than set theory.

In any case, the main reason for that digression was to point out that if we state our theorem in Lean and check its type:

theorem add_zero (a : Nat) : add a zero = a := sorry
#check add_zero
-- Output: ∀ (a : Nat), add a zero = a

The type of add_zero is actually ∀ (a : Nat), add a zero = a, unlike what we actually typed in, which did not include the \(\forall\) "for all" symbol. Remember that theorem is just an alias for def, and in this particular theorem we have an explicit input a, so this theorem is actually a function that takes an a (of type Nat) and produces a proposition add a zero = a, which recall, is an equality type.

But remember that the equality type is actually parameterized by two inputs of the same type, and when it is parameterized the equality type has the type Prop instead of Type. Remember we said Lean is a dependently-typed functional programming language? Well here is our first introduction to dependent types. The concrete output type of our theorem (function in this case) add_zero depends on the first input type. Hence, it is a dependent function type.

We actually already saw an example of a dependent function type when we made a polymorphic identity type:

def id { α : Type _ } (a : α) := a
#check id
-- Output: {α : Type u_1} → α → α

This function is a dependent function type because the output type will depend on the input type. But there's none of that ∀ notation here... But watch what happens when we change Type _ to Prop

def id { α : Prop } (a : α) := a
#check id
-- Output: ∀ {α : Prop}, α → α

We get that ∀ notation again. There is nothing fundamenetally different going on in these two examples, we're just changing the type universe that α lives in. But by default, Lean will utilize the ∀ notation in place of the → notation in the Prop type level, because when we're restricting ourselves to Prop it's usually because we're doing mathematics, and then the ∀ notation is more convenient.

Now, this sort of type polymorphism is present in non-dependently typed functional programming languages, so the real power of dependent types comes from the fact that the output type can depend on specific values (terms), and not just types. For example, we can (and will in a later post), define lists of a specific length, so the type will depend on a particular term of type natural number.

Okay, that was another digression to explain some more Lean syntactic sugar. Back to our theorem.

theorem add_zero (a : Nat) : add a zero = a := sorry

If you look back at our definition for the add function, you will see that when the second input is zero, it will just return the first input. So Lean can automatically reduce add a zero to just a, so we will get a = a. How do we construct a term of type a = a? We use the one constructor for the equality type, Eq.refl. So we can easily prove this theorem:

theorem add_zero (a : Nat) : add a zero = a := Eq.refl a

Or equivalently:

theorem add_zero (a : Nat) : add a zero = a := Eq.refl _

Or equivalently:

theorem add_zero (a : Nat) : add a zero = a := rfl

Let's move on to proving something very similar appearing, but significantly more difficult to prove:

theorem zero_add : add zero a = a := sorry

So instead of proving a + 0 = a like we just did, we want to prove 0 + a = a. We know that in math a + b = b + a, because addition is commutative (the order of the inputs doesn't matter). But we haven't proved that yet, and in fact our proof of the commutativity of addition will use our proof of zero_add and add_zero.

Why is zero_add harder than proving add_zero? Because 0 + a is not reduced to a by definition. Look back at our addition function, if the first argument is zero and the second argument is a, then without knowing what a is, Lean can't just reduce it. So in this case, since we don't know what a is, we have to do a match on it, and we will have a branch of code for if a is zero and a separate branch of code for if a has the pattern succ b where b is another natural number.

I'm going to build up the proof incrementally and explain the steps.

theorem zero_add : add zero a = a := 
  match a with
  | zero => _

/- 
Proof state:
a: Nat
⊢ add zero zero = zero
-/

I've started a proof for our theorem, and I've left a wildcard _. In this case, I'm not asking Lean to infer what to put there, because it can't, but what Lean will do is tell me what data I have access to and what term I need to construct in place of the wildcard. You should be able to see this by putting your cursor after the _ in Visual Studio Code and looking at the Lean Infoview pane. This is what makes Lean an interactive theorem prover, because Visual Studio Code is continuously running Lean in the background and updating the state depending on where our cursor is.

The expression after the turnstile ⊢ symbol shows us that for the case where a = zero, we need to prove that add zero zero = zero. Well we already know how to do this, we can just use rfl.

Let's keep going.

theorem zero_add : add zero a = a := 
  match a with
  | zero => rfl
  | succ c => _

/-
Proof state:
a c: Nat
⊢ add zero (succ c) = succ c
-/

We've "closed the subgoal" where a = zero, but in order to prove the general case that 0 + a = a we also need to prove that it holds when a = succ c. We see that this sub-goal is add zero (succ c) = succ c. So now we are stuck. We cannot prove this with rfl. However, look back at our definition for add. Notice that by definition add a (succ b) => succ (add a b). So we can use this fact to change the (sub)goal expression a bit:

⊢ add zero (succ c) = succ c
-- By definition we can change it to:
⊢ succ (add zero c) = succ c

Let's take stock of what we already know. We have already proved a subgoal that add zero c = c when c = zero (the base case). Can we use that proof to produce a proof of succ (add zero c) = succ c? If we could somehow apply succ to both sides of the equation, we could, but Lean doesn't know if that is a valid move yet.

In order to solve this, we need to prove an intermediate proposition or theorem, that if a = b then that implies succ a = succ b. This should be obviously true, because a function has to map equal input terms to the same output term, otherwise it would be a non-deterministic function. But even the obvious things need to be stated and formally proved.

So here we go.

theorem congr {a b : Nat} (h : a = b) : (succ a = succ b) := sorry

We are stating a theorem named congr (for congruence), which takes two implicit parameters a and b and a term (called a hypothesis in this context) h : a = b and is of the type succ a = succ b.

If we check the type of this theorem, we get ∀ {a b : Nat}, a = b → succ a = succ b To translate this into English: "For all natural numbers a and b, if we have a proof that a = b then that implies that succ a = succ b."

In type theory, implication is simply a function type, so we can interpret the arrow as "... implies ..."

Let me show you the proof of this:

theorem congr {a b : Nat} (h : a = b) : (succ a = succ b) := by rw [h]

The proof is by rw [h], which is something we haven't seen before. This is opening a can of worms that we cannot fully explain in this post, but I will introduce it here. The keyword by puts us into tactic mode. Tactic mode is a high-level Lean language feature that lets us use tactics. Tactics are metaprograms, which is to say they are not regular language constructs but can read, construct and manipulate the regular language. These metaprograms can make proving theorems a lot easier. For one, they allow us to write proofs in a sequential manner, avoiding the hassle that comes with pure functional programming. Secondly, they can do a lot of automation behind the scenes for us. That's as much as I'll say about tactics in this post as that is not the focus of this post and we will learn more about tactics later.

The rw tactic stands for rewrite, as it allows us to rewrite expressions in terms of equalities that are known to us. In our case of proving ∀ {a b : Nat}, a = b → succ a = succ b, we rewrite the final goal succ a = succ b with the hypothesis h : a = b, which gives us a new goal succ b = succ b, which is true by rfl.

Now that we have this intermediate theorem congr available, we can get back to proving that 0 + a = a.

theorem zero_add : add zero a = a := 
  match a with
  | zero => rfl
  | succ c => _

/-
Proof state:
a c: Nat
⊢ add zero (succ c) = succ c
-/

The goal is technically add zero (succ c) = succ c but as we previously discussed, that is definitionally equal to succ (add zero c) = succ c, which Lean knows, so we can treat the goal as the latter. Since we proved the base case that add zero c = c when c = zero, we can use this as our hypothesis h in the congr theorem to get a proof that succ (add zero c) = succ c. Thus to prove this theorem we use recursion, which is going to be a common approach. As we will learn later, proving theorems using recursion is identical to the mathematical concept of proof by induction.

So here's the final proof.

theorem zero_add : add zero a = a := 
  match a with
  | zero => rfl
  | succ c => congr (zero_add c)

Let's work through a concrete input, because remember zero_add is a function, so we can apply it as a function.

Let's say we give it the input a = zero.succ.succ (which, recall is the number 2). It will match the pattern succ c, and the output will be congr (zero_add zero.succ). Now we do the recursive call with zero_add zero.succ, and again, we match the pattern succ c, so we get congr (zero_add zero). We do another recursive call, this time matching the base case zero => rfl, which will return zero = zero.

So now we can start subtituting these recursive outputs back into the first function call; here's the sequence of steps happening:

congr (zero_add zero.succ)
congr (congr (zero_add zero))
congr (congr (zero = zero))
congr (succ zero = succ zero)
succ succ zero = succ succ zero

And this will of course work the same way for any natural number, hence the theorem is proved for all natural numbers.

The theorem congr we proved only works for inputs of type Nat, but this theorem must be true for any type and any function on that type, because if it were not true then the function would not be deterministic, and all functions in Lean are pure, which implies they are deterministic. The Lean standard library includes a polymorphic version of this theorem called congrArg, and we could have used that in place of our own.

theorem zero_add (a : Nat) : add zero a = a := 
  match a with
  | zero => rfl
  | succ c => congrArg succ (zero_add c)

The only difference is that congrArg also requires the function succ as input, because it is a theorem that ∀ {α : Sort u_1} {β : Sort u_2} {a₁ a₂ : α} (f : α → β), a₁ = a₂ → f a₁ = f a₂, which is read as "for some types α and β, some terms a₁ a₂ : α, and a function f : α → β, if we have a proof that a₁ = a₂, then that implies f a₁ = f a₂. So we need to supply the function f which is succ in this case.

That's it for now. Next time we will take a brief pause from the natural numbers to discuss type classes, structures, custom notation, and construct some other kinds of types.

References

https://golem.ph.utexas.edu/category/2013/01/from_set_theory_to_type_theory.html
https://leanprover.github.io/lean4/doc/
https://leanprover.github.io/theorem_proving_in_lean4/
Friendly people in the Lean zulip.

posted at 22:35 by Brandon Brown · Lean · Lean

Science, Math, Statistics, Machine Learning ...

Jul 10, 2022

Learning Lean (Part 2)

References