CVE-2021-32471 – Input validation in Marvin Minsky 1967 Turing Machine (cve.mitre.org)

139 points by wsces 5 years ago | 37 comments

A better link is the research paper:

> The universal Turing machine is generally considered to be the simplest, most abstract model of a computer. This paper reports on the discovery of an accidental arbitrary code execution vulnerability in Marvin Minsky's 1967 implementation of the universal Turing machine. By submitting crafted data, the machine may be coerced into executing user-provided code. The article presents the discovered vulnerability in detail and discusses its potential implications. To the best of our knowledge, an arbitrary code execution vulnerability has not previously been reported for such a simple system.

> A common strategy for understanding a problem is to reduce it to its minimal form. In the field of computer security, we may ask the question: "What is the simplest system exploitable to arbitrary code execution?" In this article, we propose an answer to that question by reporting on the discovery that a well-established implementation of the universal Turing machine is vulnerable to a both unintentional and non-trivial form of arbitrary code execution.

OskarS 5 years ago | |

Haven't looked into this paper deeply, but this reads very strange to me:

> This paper reports on the discovery of an accidental arbitrary code execution vulnerability in Marvin Minsky's 1967 implementation of the universal Turing machine. By submitting crafted data, the machine may be coerced into executing user-provided code.

It's a universal Turing machine. Its whole purpose is running "user-provided code". That's what a Universal Turing machine does, it runs arbitrary Turing machines. This is a little bit like saying "we found a weakness in the Python interpreter whereby you can feed it specially crafted input that allows you to run arbitrary Python programs". Like... yeah... that's what it's supposed to do.

tromp 5 years ago | | |

If you do read the paper, you see that the universal machine U expects both a description of the simulated machine T, and a description of the input to T. The exploit is that they can cause arbitrary code execution not by changing the former but the latter. They also suggest a fix to U that makes it robust against exploits in the description of T's input.

segfaultbuserr 5 years ago | | |

> This is a little bit like saying "we found a weakness in the Python interpreter whereby you can feed it specially crafted input that allows you to run arbitrary Python programs".

If my understanding is correct, it's more like, "we found a weakness in the Python interpreter whereby you can feed a specially crafted input to a Python script that allows you to override that script and force the Python interpreter to do something else", which can be a reasonable point to make.

BTW, if we keep using the Python analogy, this scenario sounds like a Python script that loads its configuration data from a Pickle file on the hard drive. In Python, Pickle doesn't validate the input, and it's capable of modifying the internal state of the program. What the authors say is that malicious data in the Pickle file leads to code execution due to deserialization of untrusted data. Depending on your perspective, just as Python says a malicious Pickle file is not in its threat model, you can argue that the Universal Turing Machine was never meant to be protected from malicious data on the tape, and that this is not an exploit but an interesting thought experiment nevertheless.
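To make the Pickle comparison concrete, here is a minimal sketch of how loading a pickle can run a callable chosen by whoever wrote the file. The `Payload` class and the harmless `eval("6 * 7")` call are illustrative assumptions, not anything from the paper; a real attacker would pick a more damaging callable.

```python
import pickle

class Payload:
    # Hypothetical attacker-controlled class. pickle calls __reduce__ to learn
    # how to rebuild the object; the callable it returns is invoked on load.
    def __reduce__(self):
        return (eval, ("6 * 7",))

# What a malicious "configuration file" would contain.
blob = pickle.dumps(Payload())

# The loader only meant to read data, but it ends up calling eval("6 * 7").
result = pickle.loads(blob)
print(result)  # prints 42, not a Payload instance
```

The loading side never references `Payload` at all: the byte string alone decides what gets called, which is why the pickle documentation warns against unpickling untrusted data.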

The paper says,

> The universal machine, U, will be given just the necessary materials: a description, on its tape, of T and of [the initial configuration on T's own, simulated tape] s_x; some working space; and the built-in capacity to interpret correctly the rules of operation as given in the description of T. Its behavior will be very simple. U will simulate the behavior of T one step at a time [...]

> [...] There is one obvious trust boundary in a universal Turing machine, U: the initial string on the tape of the simulated Turing machine, T. That string corresponds to the user-provided data of an ordinary computer program. Because the potential users may be unknown to the developers and administrators of the computer and its programs, it is common to view this data as untrusted. In our explorations of the universal Turing machine, we will make the same assumption. Therefore, if it were possible to execute arbitrary code without manipulating the program of T, but only by providing crafted data on T’s simulated tape, that would constitute a vulnerability.

Basically the authors' reasoning is:

1. A universal Turing machine is an interpreter/simulator that is capable of executing code written for any Turing machine, provided by the user.

2. The user code takes an external input - the initial content on the tape.

3. It's possible to maliciously craft an input (the content on the tape) to the user code to hijack the simulation, without modifying the user code.
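The three steps above can be sketched with a deliberately tiny toy. This is not Minsky's encoding and not the exploit from the paper: the tape layout, the `Y` marker, and the rewrite-rule "program" format are all hypothetical, chosen only to show how data and program sharing one tape lets crafted data replace the program.

```python
# Toy model only. Hypothetical tape layout: <data for T> "Y" <program for T>.
# The interpreter finds the program by scanning for the FIRST "Y" marker.

def run(tape: str) -> str:
    data, _, program = tape.partition("Y")
    rules = {}
    for rule in program.split(","):
        if ">" in rule:
            old, new = rule.split(">", 1)
            rules.setdefault(old, new)  # the first rule for a symbol wins
    # A "program" here is just a set of single-symbol rewrites applied once.
    return "".join(rules.get(symbol, symbol) for symbol in data)

def load(program: str, user_input: str) -> str:
    # No validation: user_input is allowed to contain the marker symbol "Y".
    return user_input + "Y" + program

INVERT = "0>1,1>0"  # the trusted program T: flip every bit

honest = run(load(INVERT, "0011"))
hijacked = run(load(INVERT, "0011Y0>A,1>B,"))

print(honest)    # prints 1100
print(hijacked)  # prints AABB: the rules smuggled in via the input ran instead
```

The program `INVERT` is never modified; only the input changes, which is the trust boundary the paper describes.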

qsort 5 years ago | | |

Yes, that's the joke.

yabones 5 years ago |

> NOTE: the discoverer states "this vulnerability has no real-world implications."

At least they're honest about it with this CVE...

stevekemp 5 years ago | |

Issues like this, in obsolete code, are a lot of fun. Even if they are essentially meaningless.

I reported CVE-2014-3423 back in the day, relating to GNU Emacs using a predictable filename when talking to the Mosaic browser. No choice, as that was what the browser required, something that wouldn't exist these days.

buitreVirtual 5 years ago | |

I'm not so sure. Some banks and airlines might still be running those machines.

avibhu 5 years ago |

> NOTE: the discoverer states "this vulnerability has no real-world implications."

Not sure if declaring this is standard practice, but I had a good laugh.

zitterbewegung 5 years ago |

Sure, this CVE sounds like a joke, but if you create a programming language that is not Turing complete, it is much easier to secure than a Turing-complete language.

Making a language that has the expressive power of finite state machines could be an example.
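As a rough illustration of the finite-state idea, the sketch below runs a deterministic finite automaton. The `accepts` helper and the `EVEN_ONES` transition table are made-up examples; the point is only that such a "program" always stops after one step per input symbol and has nowhere to store injected code.

```python
def accepts(transitions, start, accepting, text):
    """Run a DFA over text; takes at most len(text) steps, then stops."""
    state = start
    for symbol in text:
        if (state, symbol) not in transitions:
            return False  # a symbol outside the alphabet is simply rejected
        state = transitions[(state, symbol)]
    return state in accepting

# Hypothetical example machine: accept bit strings with an even number of 1s.
EVEN_ONES = {
    ("even", "0"): "even",
    ("even", "1"): "odd",
    ("odd", "0"): "odd",
    ("odd", "1"): "even",
}

print(accepts(EVEN_ONES, "even", {"even"}, "1001"))  # prints True
print(accepts(EVEN_ONES, "even", {"even"}, "1011"))  # prints False
```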

qsort 5 years ago | |

We are doing this already.

- Configuration languages (JSON, YAML, XML) are pure combinatorial logic.

- Regular expressions are... mostly not actually regular, but you get the idea.

- Some templating languages are deliberately less powerful than Turing machines, e.g. ST4 is context-free.

- Prepared SQL statements are a similar idea on a different axis.

The real question is whether a non-TC language could be useful for general purpose programming. Such a language might come with very strong guarantees (termination, time complexity, even correctness or a limited form of correctness), but it might be extra-cumbersome for 'normal' workloads.
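For the prepared-statement item in the list above, a small sketch with Python's standard `sqlite3` module shows the separation of code and data. The table, the rows, and the hostile string are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("alice", 1), ("bob", 0)])

hostile = "nobody' OR '1'='1"

# String splicing: the input becomes part of the query text and is parsed as SQL,
# so the WHERE clause turns into: name = 'nobody' OR '1'='1'
spliced = conn.execute(
    f"SELECT name FROM users WHERE name = '{hostile}'"
).fetchall()

# Bound parameter: the query text is fixed and the input is only ever a value.
bound = conn.execute(
    "SELECT name FROM users WHERE name = ?", (hostile,)
).fetchall()

print(len(spliced))  # prints 2: every row matched
print(bound)         # prints []: no user has that literal name
conn.close()
```

The bound version is the "different axis": the query language stays as powerful as before, but the untrusted input is kept out of the channel that gets interpreted.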

ludamad 5 years ago | | |

A non-TC language can be achieved just by forcing proofs. Oh, you want that program to run? Just prove that it is bounded memory and can't halt, etc

tester34 5 years ago | |

What does Turing completeness even mean in practice when it comes to non-theoretical languages?

zitterbewegung 5 years ago | | |

You can't do anything like recursion.

al2o3cr 5 years ago |

Meh. The "exploit" seems to rely on passing input that contains cells with values that are valid for the universal machine's tape alphabet, but not the simulated machine's.

It's as if you could pass a "null terminator" in a Unicode string that didn't match the "normal" null character.
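If that reading is right, one generic mitigation is to check the input against the simulated machine's alphabet before it ever reaches the tape. The sketch below is an assumption about what such a check could look like, not the fix proposed in the paper; `validate_input` and `T_ALPHABET` are hypothetical names.

```python
def validate_input(user_input, simulated_alphabet):
    """Return user_input unchanged, or raise if it uses symbols T does not know."""
    illegal = sorted(set(user_input) - set(simulated_alphabet))
    if illegal:
        raise ValueError(
            f"symbols not in the simulated machine's alphabet: {illegal}"
        )
    return user_input

# Hypothetical alphabet of the simulated machine T. Any marker symbols that
# only the universal machine should write are deliberately not in this set.
T_ALPHABET = {"0", "1"}

print(validate_input("0011", T_ALPHABET))  # prints 0011

try:
    validate_input("00Y1", T_ALPHABET)
except ValueError as error:
    print("rejected:", error)
```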

mousepilot 5 years ago |

I'm going to forward this to our local Windows server admins, with the caveat that it only applies if the particular Windows server version is Turing complete.

SilasX 5 years ago |

@dang, this one should be merged here or vice versa:

https://news.ycombinator.com/item?id=27104125

userbinator 5 years ago |

I think this should be considered a satire of what the "security industry" has become: finding any little thing that they can claim is exploitable, regardless of actual significance, and all the while using paranoia to slowly destroy general-purpose computing and user freedom.