Programming languages for AI

Which programming language is best suited for LLM-based generation?

I recently asked readers to consider which programming language they would choose for a software system generated by one or more LLMs, and offered a short menu of options to choose from.

I deliberately made the menu unreasonable for a few reasons. I'm well aware that most people who dabble with LLM-based code generation use mainstream languages like Python, Rust, TypeScript, Go, etc. This makes sense for programmers. If you intend to keep an eye on the code that these systems generate, it's reasonable to pick a language you're familiar with. This enables you to review the code.

Code for machines #

A new sentiment is, however, on the rise. Vibe coding may only be a symptom. More broadly, people are speculating whether it even makes sense for LLMs to write in higher-level programming languages. After all, even assembly code was created for humans. If no human has to look at the code, then why not generate machine code?

I briefly discussed this in a conference keynote in 2024, but here I want to look at the question from a different perspective.

The underlying assumption behind the idea that LLMs might just as well generate machine code is that the only benefit gained from high-level languages is that they are human-readable. There are several flaws in that thinking.

The first is that machine code is not portable, and neither is assembly language. You could, of course, ask an LLM to produce the same app once for every combination of operating system and processor that you need to support. But since LLMs are non-deterministic, nothing guarantees that those variants would behave the same way, so this doesn't sound like a good idea.

You could, instead, ask an LLM to generate code in C. This is, after all, C's original claim to fame. That language is designed to be portable, and it is.

This, however, suggests that a programming language offers more than just being human-readable. C, for example, has the crucial feature of allowing compilation to many different platforms; a feature shared by many more modern languages.

Might there be other features of programming languages that would be useful, even if no human looks at the code?

Guardrails #

Some programming languages are easier to analyse than others. I'm aware of three kinds of analysis of code:

Linting, or static code analysis
Static type systems
Formal methods

I admit that I don't know much about formal methods, so I'll leave that topic to someone who does.

Linting is useful, but my experience suggests that human oversight is required to make such tools useful. They tend to produce false positives, and you need to understand code to judge whether a linter warning is a real problem, or something that can be ignored.

Static type systems are, on the other hand, much more rigorous. If a program doesn't type-check, it doesn't compile. While you could argue that type systems come with false negatives, this happens more rarely. Some languages allow 'overriding' the type system; for example, Java, C#, and other languages in the C family allow downcasting. Even so, my experience is that you don't need that language feature.

This may seem incomprehensible to some programmers, but I've written C# without resorting to dynamic type checks for more than a decade. F#, too.

If you're still doubtful that I'm speaking the truth, consider that Haskell doesn't have dynamic type-checking. You not only get by regardless; Haskell is a much more powerful language.

A typical joke is that if you can get your Haskell code to compile, it probably works. 'If' being the operative word.

Although funny, it's not true. No Turing-complete language could have that property, and even in languages that sacrifice Turing-completeness for provability, you are never safe from bugs arising from a misunderstanding of requirements. ("Oh, that was what the customer wanted!")

Even so, there's benefit from a strong static type system. It prevents whole categories of errors, where the infamous null-reference exceptions are only the tip of the iceberg.
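As a small illustration of one such category, here is a minimal sketch in Rust, chosen only because it's one of the mainstream languages mentioned earlier. The `User` type and the functions `find_email` and `describe` are hypothetical names invented for the example. Rust has no null references; a value that may be absent has the type `Option`, and the compiler rejects code that tries to use it without dealing with the `None` case.

```rust
/// Hypothetical record type, used only for illustration.
pub struct User {
    pub name: String,
    /// Absence is part of the type; there is no null to forget about.
    pub email: Option<String>,
}

/// Returns the e-mail address of the named user.
/// `None` covers both "no such user" and "user has no e-mail address".
pub fn find_email<'a>(users: &'a [User], name: &str) -> Option<&'a str> {
    users
        .iter()
        .find(|u| u.name == name)
        .and_then(|u| u.email.as_deref())
}

/// The caller has to handle both cases before it can touch the address.
pub fn describe(users: &[User], name: &str) -> String {
    match find_email(users, name) {
        Some(email) => format!("{} <{}>", name, email),
        None => format!("{} (no e-mail on record)", name),
    }
}
```

Whether the code was written by a person or a model, forgetting the `None` arm in `describe` is a compile-time error rather than a run-time surprise.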

There are languages with type systems more powerful than Haskell's. Idris is one of them. That's the reason I picked it for the above menu to choose from.

Constraints liberate #

As a thought experiment, imagine that you asked an LLM to produce a non-trivial, important software system where correctness matters. First, imagine that it produces this system directly in machine code. Or, if we wish to make it portable, in C. Would you trust this system?

C is a programming language infamous for being difficult to make correct. It's easy to forget to free memory, causing memory leaks. It's easy to get pointer arithmetic wrong, leading to segfaults or, worse, silent memory corruption. Try to make the code multi-threaded, and it becomes even harder.

In addition, C is infamous for being insecure. Most buffer overrun vulnerabilities are caused by C's ability to allow ad-hoc access to memory. Even with an LLM superior to today's systems, will you trust that the software it produces in C is safe if you put it on the internet?

I wouldn't. Even if you believe that testing is enough to demonstrate that a system works, it's much harder to convince yourself that a black box is secure.

I'd much rather trust a system programmed in a language that comes with fewer opportunities for writing insecure code. As usual, constraints liberate. With a statically typed, memory-safe programming language, backed by a robust compiler, there's a lot that you don't have to worry about. If the type system has no null references, you don't have to worry about null-reference exceptions. If the language checks every memory access, you don't have to worry about buffer overflows.
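To make the buffer-overflow point concrete, consider a sketch of a classic over-read scenario: a message whose first byte claims how many bytes follow. The function name `read_payload` and the message layout are assumptions made up for this example, and Rust is again used only for illustration. In safe Rust, a slice carries its own length, so a request for more bytes than exist can't silently read adjacent memory.

```rust
/// Reads a length-prefixed payload: the first byte states how many payload
/// bytes follow. Returns `None` if the buffer is empty or shorter than the
/// length byte claims.
pub fn read_payload(buffer: &[u8]) -> Option<&[u8]> {
    // `split_first` yields `None` for an empty buffer.
    let (&len, rest) = buffer.split_first()?;
    // `get` checks the range against the slice's actual length and yields
    // `None` instead of reading past the end.
    rest.get(..len as usize)
}
```

Had the last line been written as `&rest[..len as usize]`, an overlong length would make the program panic instead of returning `None`. Neither variant hands back memory that lies outside the buffer.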

And with most such languages, you can turn the dial to 11 by treating warnings as errors (as discussed in Code That Fits in Your Head). For languages with algebraic data types, for instance, this would mean that you wouldn't be able to compile code that doesn't handle all cases of a sum type.
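Here is a minimal sketch of what exhaustive handling of a sum type looks like, using Rust for illustration. The `PaymentResult` type and the `status_line` function are hypothetical. In Rust a `match` that misses a variant is already a compile error, so no extra setting is needed for this particular check; other warnings can be promoted to errors by passing `-D warnings` to the compiler.

```rust
/// Hypothetical sum type for the outcome of a payment attempt.
#[derive(Debug, Clone, PartialEq)]
pub enum PaymentResult {
    Approved { transaction_id: u64 },
    Declined { reason: String },
    Pending,
}

/// Every variant must be handled. Deleting one of the arms below, or adding
/// a new variant to `PaymentResult`, makes this function fail to compile.
pub fn status_line(result: &PaymentResult) -> String {
    match result {
        PaymentResult::Approved { transaction_id } => {
            format!("approved (transaction {})", transaction_id)
        }
        PaymentResult::Declined { reason } => format!("declined: {}", reason),
        PaymentResult::Pending => "pending".to_string(),
    }
}
```

The useful property for generated code is the second half of that comment: when the data model grows, the compiler points at every place that hasn't yet been updated.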

Programming languages for AI #

I listed Idris for two reasons: It has a type system even more expressive than (standard) Haskell, and the number of people who read it is small. I deliberately didn't want to list any popular language (like Haskell), because the purpose of the original 'poll' was to make you consider which kind of language would be best for LLMs, assuming that you had no bias.

Realistically, Idris is not going to be a standard programming language for AI. If things go as they usually go, it'll probably end up being JavaScript or Python (and if the choice is between those two, Python would be preferable).

What would make more sense, though, is a new language tailor-made for LLMs. And of course, people are already experimenting with that idea.

An AI-first language should, I think, have as many guardrails as possible built in. A powerful static type checker, perhaps refinement types, dependent types, or something else that I don't know of. Even if some of these features are inconvenient for human programmers, they may prove useful for machine-written code.
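To give a flavour of what a refinement type expresses, here is a rough approximation in Rust, which has neither refinement types nor dependent types. The `Percentage` type and the `remaining` function are hypothetical. The sketch checks the predicate once, at run time, in a constructor, whereas a language with real refinement types would check it at compile time; treat it as an analogy rather than the real thing.

```rust
/// An integer intended to lie in 0..=100. The field is private, so code
/// outside the defining module can only obtain a value through `new`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Percentage(u8);

impl Percentage {
    /// Checks the predicate once, at the boundary.
    pub fn new(value: u8) -> Option<Percentage> {
        if value <= 100 {
            Some(Percentage(value))
        } else {
            None
        }
    }

    pub fn get(self) -> u8 {
        self.0
    }
}

/// Relies on the invariant: for values created through `new`, the
/// subtraction cannot underflow, and the result is again in 0..=100.
pub fn remaining(done: Percentage) -> Percentage {
    Percentage(100 - done.get())
}
```

Functions such as `remaining` state their preconditions in their signatures, which is the kind of machine-checkable constraint that may be tedious for a human to write but costs an LLM very little.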

As Szymon Teżewski phrases it: Design by inconvenience. If LLMs are writing all the code, optimize for verifiability, not for how easy it is to write.

Conclusion #

Some people envision a future where AI writes all code. They believe that the implication is that the programming language no longer matters; that LLMs might just as well write directly in machine code.

In this article I've argued that readability by humans is only one property of a programming language. Other qualities, possibly independent of readability, are portability, security features, static and dynamic analysis, and verifiability. An AI-first programming language should, in my opinion, come loaded with as many such properties as we can think of.

If we no longer write code, we should at least have the ability to verify it.

Published: Monday, 30 March 2026 06:00:00 UTC

Programming languages for AI by Mark Seemann
