Martin's blog

Write-only code

Posted on 2020-05-13 under development python ramblings rust

The compiler as we know it is generally attributed to Grace Hopper, who also popularized the notion of machine-independent programming languages and served as technical consultant in 1959 in the project that would become the COBOL programming language. The second part is not important for today's post, but not enough people know how awesome Grace Hopper was and that's unfair.

It's been at least 60 years since we moved from assembly-only code into what we now call "good software engineering practices". Sure, punching assembly code into perforated cards was a lot of fun, and you could always add comments with a pen, right there on the cardboard like well-educated cavemen and cavewomen (cavepeople?). Or, and hear me out, we could use a well-designed programming language instead with fancy features like comments, functions, modules, and even a type system if you're feeling fancy.

None of these things will make our code run faster. But I'm going to let you in on a tiny secret: the time programmers spend actually coding pales in comparison to the time programmers spend thinking about what their code should do. And that time is dwarfed by the time programmers spend cursing other people who couldn't add a comment to save their life, using variables named var and cramming lines of code as tightly as possible because they think it's good for the environment.

The type of code that keeps other people from strangling you is what we call "good code". And we can't talk about "good code" without its antithesis: "write-only" code. The term is used to describe languages whose syntax is, according to Wikipedia, "sufficiently dense and bizarre that any routine of significant size is too difficult to understand by other programmers and cannot be safely edited". Perl was heralded for a long time as the most popular "write-only" language, and it's hard to argue against it:

open my $fh, '<', $filename or die "error opening $filename: $!";
my $data = do { local $/; <$fh> };

This is far from the worst that Perl has to offer, but it highlights the type of code you get when readability is put aside in favor of shorter, tighter code.
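For readers who don't speak Perl: the two lines above open a file and read all of it into one string (locally undefining $/ is what switches off line-by-line reading). Here is a rough sketch of the same idea in Python. The slurp function and the temporary demo file are names I made up for illustration, and raising SystemExit is only an approximation of Perl's die:

```python
import os
import tempfile


def slurp(filename):
    """Return the whole contents of a text file as a single string."""
    try:
        with open(filename, "r") as fh:
            return fh.read()
    except OSError as err:
        # Rough stand-in for Perl's `die`: stop with a readable message.
        raise SystemExit(f"error opening {filename}: {err}")


# Demo on a throwaway file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("line one\nline two\n")

data = slurp(tmp.name)
os.remove(tmp.name)
print(data)
```

It is longer than the Perl version, but nothing in it requires knowing what a punctuation variable does.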

Some languages are more prone to this problem than others. The International Obfuscated C Code Contest is a prime example of the type of code that can be written when you really, really want to write something badly. And yet, I am willing to give C (and sometimes even Perl) a pass for a few reasons:

C was always supposed to be a thin layer on top of assembly, and was designed to run on computers with limited capabilities. It is a language for people who really, really need to save a couple CPU cycles, readability be damned.
We do have good practices for writing C code. It is possible to write okay code in C, and it will run reasonably fast.
All modern C compilers have to remain backwards compatible. While some edge cases tend to go away with newer releases, C wouldn't be C without its wildest, foot-meet-gun features, and old code still needs to work.

Modern programming languages, on the other hand, don't get such an easy pass: if they are allowed to have as many abstraction layers and RAM as they want, have no backwards compatibility to worry about, and are free to follow 60+ years of research in good practices, then it's unforgivable to introduce the type of features that lead to write-only code.

Which takes us to our first stop: Rust. Take a look at the following code:

let f = File::open("hello.txt");
let mut f = match f {
    Ok(file) => file,
    Err(e) => return Err(e),
};

This code is relatively simple to understand: the first f holds the result of trying to open the hello.txt file, an operation that can either succeed or fail. If it succeeded, you can read the file's contents by extracting the file handle from Ok(file), and if it failed you can either do something with the error e or further propagate Err(e). If you have seen functional programming before, this concept may sound familiar to you. But more importantly: this code makes sense even if you have never programmed in Rust before.

But once we introduce the ? operator, all that clarity is thrown out the window:

let mut f = File::open("hello.txt")?;

All the explicit error handling that we saw before is now hidden from you. In order to save 3 lines of code, we have now put our error handling logic behind an easy-to-overlook, hard-to-google ? symbol. It's literally there to make the code easier to write, even if it makes it harder to read.

And let's not forget that the operator also facilitates the "hot potato" style of catching exceptions¹, in which you simply... don't:

File::open("hello.txt")?.read_to_string(&mut s)?;

Python is perhaps the poster child of "readability over conciseness". The Zen of Python explicitly states, among other things, that "readability counts" and that "sparse is better than dense". The Zen of Python is not only a great programming language design document, it is a great design document, period.

Which is why I'm still completely puzzled that both f-strings and the infamous walrus operator made it into Python, in versions 3.6 and 3.8 respectively.

I can probably be convinced to adopt f-strings. At their core, they are designed to bring variables closer to where they are used, which makes sense:

"Hello, {}. You are {}.".format(name, age)
f"Hello, {name}. You are {age}."

This seems to me like a perfectly sane idea, although not one without drawbacks. For instance, the fact that the f is both important and easy to overlook. Or that there's no way to know what the = here does:

some_string = "Test"
print(f"{some_string=}")

(for the record: it will print some_string='Test'). I also hate that you can now mix variables, functions, and formatting in a way that's almost designed to introduce subtle bugs:

print(f"Diameter {2 * r:.2f}")
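To make those complaints concrete, here is a small sketch of the three behaviors side by side, each next to a spelled-out equivalent. The values of name and r are made up for the example, and the = specifier needs Python 3.8 or later:

```python
name = "Ada"          # hypothetical value
some_string = "Test"
r = 1.2345            # hypothetical radius

# 1. Forget the f prefix and nothing is substituted: the braces stay literal.
with_prefix = f"Hello, {name}."
without_prefix = "Hello, {name}."

# 2. The "=" specifier echoes the expression text, then the repr() of its value.
debug_form = f"{some_string=}"
spelled_out = "some_string=" + repr(some_string)

# 3. The expression before ":" is evaluated first, then ".2f" formats the result.
diameter = f"Diameter {2 * r:.2f}"
diameter_explicit = "Diameter {:.2f}".format(2 * r)

print(with_prefix)
print(without_prefix)
print(debug_form)
print(diameter)
```

The spelled-out versions are longer, but each one says what it does without asking you to remember what a single character means inside the braces.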

But this all pales in comparison to the walrus operator, an operator designed to save one line of code²:

# Before
my_var = some_value
if my_var > 3:
    print("my_var is larger than 3")

# After
if (my_var := some_value) > 3:
    print("my_var is larger than 3")

And what an expensive line of code it was! In order to save one or two variables, you need a new operator that behaves unexpectedly if you forget parentheses, has enough edge cases that even the official documentation brings them up, and led to an infamous dispute that ended up with Python's creator taking a "permanent vacation" from his role. As a bonus, it also opens the door to questions like this one, which is answered with (paraphrasing) "those two cases behave differently, but in ways you wouldn't notice".
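As a minimal sketch of the parentheses problem (the variable names are made up, and it needs Python 3.8 or later): the walrus binds more loosely than the comparison, so dropping the parentheses quietly stores the result of the comparison instead of the value.

```python
some_value = 5  # hypothetical value

# With parentheses: the name is bound to some_value, then compared to 3.
if (with_parens := some_value) > 3:
    print("with_parens is", with_parens)

# Without parentheses: this parses as without_parens := (some_value > 3),
# so the name ends up holding a bool rather than the number.
if without_parens := some_value > 3:
    print("without_parens is", without_parens)
```

Both branches run and both conditions are true, which is exactly why the mistake is easy to miss until someone uses the variable later.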

I think software development is hard enough as it is. I cannot convince the Rust community that explicit error handling is a good thing, but I hope I can at least persuade you to really, really use these types of constructs only when they are the only alternative that makes sense.

Source code is not for machines - they are machines, and therefore they couldn't care less whether we use tabs, spaces, one operator, or ten. So let your code breathe. Make the purpose of your code obvious. Life is too short to figure out whatever it is that the K programming language is trying to do.

Footnotes

1: Or rather "exceptions", as mentioned in the RFC
2: If you're not familiar with the walrus operator, this link gives a comprehensive list of reasons both for and against.
