Introduction to the Theory of Computation: The Church-Turing Thesis

My favorite authors (David Deutsch, Roger Penrose, and Douglas Hofstadter) all delve into the Church-Turing Thesis of Computational Theory and, more importantly, its strongest interpretation: The Turing Principle. [1] In this post I will first explain the Church-Turing Thesis in layman’s terms.

Back when I was working on my computer science degree I studied Turing machines and the Church-Turing Thesis in my Intro to Computational Theory class. Back then I thought it was a big waste of time. I just wanted to program computers and I couldn’t care less about this long-dead Turing guy (nor this Church guy) nor his stupid theoretical machines.

Now that I understand the philosophical ramifications of the Church-Turing Thesis, I wish I had paid attention in class! Because the Church-Turing Thesis, if true, has some profound philosophical ramifications and it might also tell us something about the deep — and special — nature of reality.

Finite Automata

All texts and classes on the Theory of Computation start out with something called “Finite Automata.” The basic idea behind them is pretty easy. You just imagine a simple ‘machine’ that is able to make choices and move between states. Here is an example of a very simple one that represents the “logic” of a coin-operated turnstile.

A Finite Automata for a coin-operated turnstile

In plain English, this says that if you try to push through a turnstile that is locked, you can’t if you haven’t first put in a coin. If you have put in a coin but haven’t pushed through yet, additional coins just leave it in an unlocked state. If you have put in a coin, then you can push through. It then locks again for the next person.

It was probably easier to understand from the diagram than from the description.

Finite Automata are capable of performing far more complicated logic than this. But this should give you a basic feel for how finite automata works.

One thing to note is that a finite automaton, like the one above, is purely theoretical because it only exists as a bunch of bubbles and lines on a piece of paper. It’s not like there is some little “finite automaton machine” inside the turnstile that makes these decisions. Or perhaps I should instead say that the turnstile itself is the finite automaton machine.

If we really wanted to, we could probably build a machine in real life that would be a Finite Automaton.There is nothing stopping someone from building it as a real machine in real life and then installing it into the turnstile. That just isn’t the cheapest way to do it.

So any ‘program’ you make as a drawing of a finite automaton can be turned into a real life “computation” that really works. The importance of the difference between a computational machine that can actually exist (like a Finite Automaton) and one that is only hypothetical and violates the laws of physics becomes important in a moment.

More Powerful Machines

As a class in computational theory progresses, the students are introduced to increasingly complex ‘machines’ that are more powerful than Finite Automata. As the figure to the right shows, the next most powerful is the Pushdown Automata (PDA). A PDA is really just an DFA with the addition of a sort of “memory”. This memory allows a PDA to create and run computations (or programs) that a DFA can’t.

The key point is only that there are certain types of programs that can be written for a PDA that can’t be written on a Deterministic Finite Automaton (DFA). In other words PDAs are ‘more powerful’ than DFAs because they can express classes of “programs” that DFAs can’t.

So there is a relationship between DFAs and PDAs in terms of “computational power.” Namely it’s possible to prove that any program written on a DFA can also be written on a PDA, but that the reverse isn’t true.

The Proof is in the Proof

The proof that a PDA can run anything that a DFA can is done by coming up with a scheme by which the logic of an DFA can be mapped to a PDA. Since a PDA is just a DFA with memory, this isn’t hard to do — just don’t use the “memory feature”.

But what about the reverse? Can we prove that it’s impossible to take certain types of “programs” written for a PDA and translate them to an DFA? That is to say, is there a proof that Pushdown Automata can’t be mapped to a Finite Automata? Or are we just assuming a Finite Automata is less powerful than a Pushdown Automata because we don’t currently know of a way to map a PDA back to an FDA? Maybe there is a way to map PDAs to FDAs and maybe no one has discovered how to do that yet? Isn’t that at least a possibility?

As it turns out, it is possible to prove that a PDA can run certain types of programs that an DFA cannot. The way you’d do it is you’d find a computation (i.e. a program) that you can prove a DFA can’t compute and then demonstrate that a PDA can compute it.

Computational Power of a Machine

This fact — that there are more powerful (PDA) and less powerful (DFA) logic machines — is interesting in and of itself.

But it leads to a philosophical question: is there such a thing as a “most powerful computing machine?”

If there was such a “most powerful machine”, how would we know any specific proposed machine happens to be “the most powerful?” Or are there just different types of computing machines available and you have to pick the right one for the job?

Turing Machines

So what machine is more powerful than a PDA?

As history would have it, at about the same time two very different types of “machines” were proposed that were both provably more powerful than PDAs.

No, it’s not Sherlock — it’s Alan Turing!

One was Alan Turing’s Turing Machine. The other wasn’t so much a machine as a clever set of notations developed by Alonzo Church that served the same purpose as developing a machine. Of these two “machines” the Turing machine is conceptually easier to teach, so usually that’s the machine that is taught in a Computational Theory course.

Turing Machines are funny little theoretical machines that have a read/write head and a (hypothetical) paper tape that it can read from or write to. Based on what the Turing Machine reads it puts the Turing Machine into an action state that performs some sort of combination of tasks consisting of either moving the read/write head forward or backward, reading from a new position on the tape, or writing so a new position on the tape. A Turing Machine looks like this:

A Turing Machine

Turing Machines and Modern Computers

One thing of interest is that a Turing Machine is, despite surface appearances, actually quite similar to a modern computer. In a modern computer the Central Processing Unit (CPU) is equivalent to the read/write head of a Turing Machine. The memory chips (RAM or ROM) is very similar to the cells of the long paper tape that the turning machine can read from or write to. So modern computers seem to be roughly equivalent to a Turing Machine.

A modern computer does have one advantage over the Turing Machine. A modern computer does not have to move from one cell of its “memory” in sequential order like a Turing Machine does.

The fact that a Turing machine can only move one cell at a time seems like a significant limitation, doesn’t it? We just saw how some machines are logically ‘more powerful’ than others: a PDA can perform computational tasks that an FDA can’t. So perhaps there are machines that are more powerful than Turing machines that can perform tasks that Turing Machines can’t? And maybe modern computer — due to their ability to jump around memory rather than have to move from cell to cell sequentially — can run some programs that a Turing Machine can’t?

In fact modern computer actually have less expressive power than a Turing Machine by virtue of the fact that Turing Machines were conceived as having an infinitely long paper tape (i.e. infinite memory) where as a real life computer will always have finite memory. However, in general this makes very little difference in what types of computations one can perform since human beings are not generally all that interested in infinitely long computations that give out infinitely long results. That is why I say modern computers and Turing Machines are “roughly” equivalent. In fact, so long as you assume any arbitrary but finite size of memory, they are exactly equivalent in terms of what types of programs they can run.

The Church Turing Thesis: Turing Machine = Max Logical Power

But what about poor Alonzo Church? His poor little “machine” forgotten because Turing Machines are easier to teach. Is his machine maybe able to express some computations /programs that a Turing Machine can’t, or vice versa?

Wouldn’t it be a stellar coincidence if it just so happened that Alan Turing and Alonzo Church just so happened to create two entirely different types of theoretical computation machines and it just so happened that they were exactly identical in terms of what types of computations they could perform?

So imagine everyone’s surprise when Alan Turing was able to produce a proof that any program written for a Turing Machine could also be written for a Church machine and also a proof that any program written for a Church machine could also be written for a Turing machine.

In fact, there are a number of proposed types of theoretical computational machines. For example, theoreticians tried allowing a Turing Machine to have multiple tapes to read/write with. They even tried allowing a Turing Machine a 2-dimensional ‘sheet’ to read and write with. Theoreticians tried all sorts of improvements to Turing Machines (and Church machines).

And so far it’s been possible to produce a proof for every single one of them that they are equivalent to a simple Turing Machine.

That does seem like a wild coincidence, doesn’t it? And it would be a wild coincidence unless there is an upper limited to what types of computation can be performed.

If there is such an upper limit, then it would be no coincidence at all that the Turing Machine and Church Machine and all other computational machines proposed all just happen to have the same computational power, since in fact the reason why they are all equivalent is because we’ve reached the upper limit of computational power.

But can we prove that there is not some computational machine out there — one that we just haven’t discovered yet — that has the ability to perform computations that a Turing Machine can’t?

How, exactly, would we produce such a proof? The fact is that we cannot prove that there is nothing more powerful than a Turing Machine. So, who knows, maybe there is.

But the fact is that we can’t find (or invent) any such machines.

So after considerable effort trying and failing, to find a way to improve on the power of Turing Machines, finally the Church-Turing Thesis was accepted even though it was not proven to be true. The Church-Turing Thesis essentially says something like this:

It’s not possible to come up with any sort of computational machine that can perform a logic program that a plain old Turing Machine can’t.

Or in other words:

Turing Machines and their equivalents are the most powerful possible types of computational machines and there are no more powerful ones out there that we just don’t know about yet.

After years and years of research on this Thesis, this Thesis still basically holds. We’ll see later that there has been somewhat of a modification to the Thesis with the introduction of theoretical quantum computers. But, basically, the Thesis still holds true today. No one has ever come up with a way to outperform Turing machines when it comes to logical expressiveness. Turing Machines are still the reigning champion.

So now you understand the Church-Turing Thesis. However, the Church-Turing Thesis is not really quite equivalent to the Turing Principle. So in a future post I’ll develop what the difference is and what it’s philosophical ramifications are.


[1] The Turing Principle. So named by Roger Penrose, who does not believe in it (at least not in current form). It was developed into the Turing-Deutsch principle by David Deutsch, who does believe in it, at least in his form of it.) (See article in Wikipedia for more details)

Bruce is a Master's student specializing in Machine Learning and Artificial Intelligence at the Georgia Institute of Technology.

Bruce is a Master's student specializing in Machine Learning and Artificial Intelligence at the Georgia Institute of Technology.