# How to derive the Y combinator

I forgot my favorite method of deriving Y combinator for the n-th time, so this time I am forcing myself to write this up as a quick reminder.

For a proper tutorial, consider this: https://homes.cs.washington.edu/~sorawee/en/blog/2017/10-05-deriving-Y.html.

For each example here, verify with `fact(10) == 3628800`

.

## Version 0

This definition of `fact`

requires that its body can look up the name `fact`

in its parent environment, which is bad.

```
fact = lambda n: 1 if n == 0 else n * fact(n-1)
```

We want to be able to write `fact`

entirely in terms of anonymous functions which do not have free variables. These things have a fancy name: “combinators.”

Y combinator, which we will eventually reinvent, is of interest because it provides us with a primitive for building recursive functions in languages that provide higher-order functions but not recursion by default.

## Version 1

The trick is to pass in the recursion as an argument instead.

```
fact_base = lambda f: lambda n: 1 if n == 0 else n * f(f)(n-1)
fact = fact_base(fact_base) # this relies on currying/partial application
```

## Version 2

Because there are no circular dependencies, we can just inline everything. We can just stop here if we want.

```
fact = (lambda f: lambda n: 1 if n == 0 else n * f(f)(n-1))(lambda f: lambda n: 1 if n == 0 else n * f(f)(n-1))
```

## Version 3

I hate code duplication, so I am writing a helper function to call the function on itself instead.

```
fact = (lambda f: f(f))(lambda f: lambda n: 1 if n == 0 else n * f(f)(n-1))
```

Suppose I define `make_recur = lambda f: f(f)`

, then the user could make their own recursive function just by writing a function that takes in `f`

. To make a recursive call, write `f(f)(...)`

. Here is an example of how to use it.

```
make_recur = lambda f: f(f)
fib = make_recur(lambda f: lambda n: n if n <= 1 else f(f)(n-1)+f(f)(n-2))
# fib(10) == 55
```

## Version 4 (broken)

This is where I usually get stuck.

Right now, requiring the user to write `f(f)(...)`

is disgusting. We want to allow the user to write `f(...)`

instead.

```
user_fact_base = lambda f: lambda n: 1 if n == 0 else n * f(n-1)
fact_base = ???
fact = fact_base(fact_base)
```

Our task is to figure out the definition of `fact_base`

to wrap around `user_fact_base`

. Recall, `fact_base`

takes in a function `f`

as its argument. It should call `user_fact_base`

with `f(f)`

already prepped in that argument slot. This should work:

```
user_fact_base = lambda f: lambda n: 1 if n == 0 else n * f(n-1)
fact_base = lambda f: user_fact_base(f(f))
fact = fact_base(fact_base)
```

Actually, this results in stack overflow because Python uses strict evaluation.

```
fact = fact_base(fact_base)
= (lambda f: user_fact_base(f(f)))(lambda f: user_fact_base(f(f)))
= user_fact_base( (lambda f: user_fact_base(f(f)))(lambda f: user_fact_base(f(f))) )
= user_fact_base( user_fact_base( ... ) ) # infinite recursion
```

## Version 5

Realize that the issue is that Python evaluates `f(f)`

before passing into `user_fact_base`

. We do not want this to be evaluated unless it is necessary (i.e. when function body requires a recursive call). The trick is to simply wrap the expression `f(f)`

in a somewhat redundant lambda expression (make the argument explicit).

```
user_fact_base = lambda f: lambda n: 1 if n == 0 else n * f(n-1)
fact_base = lambda f: user_fact_base(lambda x: f(f)(x))
fact = fact_base(fact_base)
```

Now we can inline everything if we want.

## The Y combinator

What we discovered in version 4 was the Y combinator. Specifically:

```
user_fact_base = lambda f: lambda n: 1 if n == 0 else n * f(n-1)
Y = lambda f: (lambda x: f(x(x)))(lambda x: f(x(x)))
# Y = lambda f: (lambda x: x(x))(lambda x: f(x(x)))
fact = Y(user_fact_base)
```

$Y = \lambda f. (\lambda x. f (x x))(\lambda x. f (x x))$.

Some people might like to write: $Y = \lambda f. (\lambda x. x x)(\lambda x. f (x x))$.

## The Z combinator

What we discovered in version 5 was the Z combinator. Specifically:

```
user_fact_base = lambda f: lambda n: 1 if n == 0 else n * f(n-1)
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))
# Z = lambda f: (lambda x: x(x))(lambda x: f(lambda v: x(x)(v)))
fact = Z(user_fact_base)
```

$Y = \lambda f. (\lambda x. f (\lambda v. x x v))(\lambda x. f (\lambda v. x x v))$.

Again, you can condense this into $Y = \lambda f. (\lambda x. x x)(\lambda x. f (\lambda v. x x v))$.

This works in Python.

## Another way to derive the Y combinator

Here is another way I like to use to memorize how to write the Y combinator. This is probably not very understandable if you have not seen it before.

First, you need to know that $Y$ finds fixed point of a function, namely $Y f$ gives the input $x$ such that $f x = x$, i.e. $f (Y f) = Y f$.

We can repeatedly substitute $Y f$ with $f (Y f)$. So, $Y f = f (Y f) = f (f (Y f)) = f (f (f (f \dots)))$.

This equation, $Y f = f (f (f (f \dots)))$, is the goal we are trying to achieve. How do we apply $f$ repeatedly?

Well, let’s answer a simpler question first: How do you make something run infinitely?

This works: $(\lambda x. x x)(\lambda x. x x)$.

Notice that if you keep simplifying the expression, you are always stuck at $(\lambda x. x x)(\lambda x. x x)$.

We just need to sneak $f$ in there: $(\lambda x. f (x x))(\lambda x. f (x x))$.

Observe $(\lambda x. f (x x))(\lambda x. f (x x)) = f (\lambda x. f (x x))(\lambda x. f (x x)) = f (f (\lambda x. f (x x))(\lambda x. f (x x))) = f (f (f \dots))$.

We are done! $Y = \lambda f. (\lambda x. f (x x))(\lambda x. f (x x))$.

To get the Z combinator, just add a redundant $v$: $Z = \lambda f. (\lambda x. f (\lambda v. x x v))(\lambda x. f (\lambda v. x x v)) $.

Yay!