7c0h

I need your help liking Rust

If you're a software developer, you know the rules: new year, new programming language. For 2020 I chose Rust because it has a lot going for it:

  • Memory safe and high performance
  • Designed for low-level tasks
  • Backed by Mozilla, of which I'm a fan
  • Named "most loved programming language" for the fourth year in a row by the Stack Overflow Annual Survey

With all of these advantages in mind, I set out to build something concrete: an implementation of the Aho-Corasick algorithm. This algorithm, at its most basic, builds a Trie and then converts it into an automaton, with the final result being efficient search of sub-strings (why? I hope I can write about why in the near future). It also seemed like the type of problem you'd like to tackle with Rust: implementing a Trie in C requires some liberal use of pointers, a task for which I had expected Rust to be the right tool (memory safety!). And since I need to run a lot of text through it, I need it to be as fast as possible.
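To make the Trie part concrete, here is a minimal sketch of one common way to write it in Rust. It is not the implementation from this project: the names `Node`, `Trie`, `insert` and `contains` are made up for illustration, the nodes live in a flat `Vec` and refer to each other by index instead of by pointer, and the failure links that turn the Trie into an Aho-Corasick automaton are left out entirely.

```rust
use std::collections::HashMap;

/// One trie node. Children are indices into `Trie::nodes`, not pointers.
/// (Hypothetical layout, chosen for illustration only.)
#[derive(Default)]
struct Node {
    children: HashMap<char, usize>,
    is_word: bool,
}

/// A trie stored in a flat Vec; index 0 is always the root.
struct Trie {
    nodes: Vec<Node>,
}

impl Trie {
    fn new() -> Self {
        Trie { nodes: vec![Node::default()] }
    }

    /// Inserts `word`, creating any missing nodes along the way.
    fn insert(&mut self, word: &str) {
        let mut current = 0;
        for ch in word.chars() {
            // `copied()` ends the borrow of `self.nodes` before we mutate it.
            current = match self.nodes[current].children.get(&ch).copied() {
                Some(next) => next,
                None => {
                    let next = self.nodes.len();
                    self.nodes.push(Node::default());
                    self.nodes[current].children.insert(ch, next);
                    next
                }
            };
        }
        self.nodes[current].is_word = true;
    }

    /// Returns true only if `word` was inserted as a whole word.
    fn contains(&self, word: &str) -> bool {
        let mut current = 0;
        for ch in word.chars() {
            match self.nodes[current].children.get(&ch) {
                Some(&next) => current = next,
                None => return false,
            }
        }
        self.nodes[current].is_word
    }
}
```

After inserting "he" and "hers", `contains("he")` and `contains("hers")` are true while `contains("her")` is false, because "her" is only a prefix. Using indices sidesteps most of the pointer questions discussed below, at the cost of never freeing individual nodes.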

So how did I fare? Two weeks into this project, Rust and I have... issues. More specifically, I'm having real trouble figuring out what Rust is good for.

Part I: Pointers and Strings are too complicated

Dealing with pointers is straight up painful, because allocating a piece of memory and linking it to something else gets very difficult very fast. I followed this book, titled "Learn Rust With Entirely Too Many Linked Lists", and the opening alone warns me that programming a linked list requires learning "the following pointer types: &, &mut, Box, Rc, Arc, *const, and *mut". A Reddit thread, on the other hand, suggests that a doubly-linked list is straightforward - all you need to do is declare your type as Option<Weak<RefCell<Node<T>>>>. Please note that neither Option, Weak, nor RefCell is mentioned in the previous list of pointer types...
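For reference, this is roughly what that type looks like once it is put to work. Treat it as a sketch under assumptions, not as the code from the Reddit thread: the `Node` fields and the helpers `new_node` and `link` are hypothetical names. The forward link is a strong `Rc`, and the backward link is a `Weak` so that two neighbours don't keep each other alive forever.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

/// A node of a doubly-linked list (illustrative layout).
struct Node<T> {
    value: T,
    /// Strong reference: each node owns the one after it.
    next: Option<Rc<RefCell<Node<T>>>>,
    /// Weak reference: pointing backwards must not create an ownership cycle.
    prev: Option<Weak<RefCell<Node<T>>>>,
}

/// Allocates an unlinked node on the heap.
fn new_node<T>(value: T) -> Rc<RefCell<Node<T>>> {
    Rc::new(RefCell::new(Node { value, next: None, prev: None }))
}

/// Links `second` directly after `first`.
fn link<T>(first: &Rc<RefCell<Node<T>>>, second: &Rc<RefCell<Node<T>>>) {
    first.borrow_mut().next = Some(Rc::clone(second));
    second.borrow_mut().prev = Some(Rc::downgrade(first));
}
```

Reading a neighbour's value then takes a chain like `a.borrow().next.as_ref().unwrap().borrow().value`, and going backwards needs an extra `upgrade()` to turn the `Weak` into an `Rc` again.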

So, pointers are out as a killer feature. If optimizing memory usage is not its strong point, then maybe "regular" programming is? Could I do the rest of my text handling with Rust? Sadly, dealing with Strings is not great either. Sure, I get it, Unicode is weird. And I can understand why the difference between characters and graphemes is there. But if the Rust developers thought long and hard about this, why is "get me the first grapheme of this String" so difficult? And why isn't such a common operation part of the standard library?

For the record, this is a rhetorical question - the answer to "how do I iterate over graphemes" (found here) teaches us that...

  • ... the developers don't want to commit to a specific method of doing this, because Unicode is complicated and they don't want to have to support it forever. If you want to do it, you have to pick an external library. But it won't be part of the standard library anytime soon. At the same time, ...
  • ... they don't want to "play favorites" with any specific library over any other, meaning that no trace of a specific method is to be found in the official documentation.

The result, then, is puzzling: the experts who designed the system don't want to take care of it, the official doc won't tell you who is doing it right (or, more critically, who is doing it wrong and should be avoided), and you are essentially on your own.
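To see what the standard library does give you, here is a small sketch that uses only `std`. The helper `describe` is a made-up name. It shows that `chars()` iterates over Unicode scalar values, so the "first character" of a decomposed "é" (the letter e followed by the combining accent U+0301) is a bare 'e'. Getting the whole grapheme needs an external crate; the usual suggestion is `unicode-segmentation`, which I mention here as an assumption about what you would pick and not as an official recommendation.

```rust
/// Returns (bytes, scalar values, first scalar value) for `text`.
/// Only standard-library calls are used; nothing here is grapheme-aware.
fn describe(text: &str) -> (usize, usize, Option<char>) {
    (text.len(), text.chars().count(), text.chars().next())
}

fn decomposed_e_acute() -> &'static str {
    // 'e' followed by U+0301 COMBINING ACUTE ACCENT: one grapheme, two chars.
    "e\u{301}"
}

fn precomposed_e_acute() -> &'static str {
    // U+00E9 LATIN SMALL LETTER E WITH ACUTE: one grapheme, one char.
    "\u{e9}"
}
```

Both strings render as the same single letter, yet `describe` reports 3 bytes and 2 chars for the first and 2 bytes and 1 char for the second, which is exactly the gap that "get me the first grapheme" falls into.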

Part II: the community

If we've learned anything from the String case, it is that "just Google it" is a valid development strategy when dealing with Rust. This leads us inevitably to A sad day for Rust, an event that took place earlier this year and highlighted how bad the Reddit side of the community can be. To quote the previous article,

the Rust subreddit has ~87,000 subscribers (...) while Reddit is not official, and so not linked to by any official resources, it’s still a very large group of people, and so to suggest it’s "not the Rust community" in some way is both true and very not true.

So, why did I bring this up? Because the Reddit thread I mentioned above displays two hallmarks of the type of community I don't want to be a member of:

  • the attitude of "it's very simple, all you need to create a new node is self.last.as_ref().unwrap().borrow().next.as_ref().unwrap().clone()"
  • the other attitude, where the highest rated comment is the one that includes nice bits like "The only surprising thing about this blog post is that even though it's 2018 and there are many Rust resources available for beginners, people are still try to learn the language by implementing a high-performance pointer chasing trie data-structure". The fact that people may come to Rust because that's the type of projects a systems language is supposedly good for seems to escape them.

If you're a beginner like me, now you know: there is a good community out there. And it would be unfair for me to ignore that other forums, both official and not, are much more welcoming and understanding. But you need to double check.

Part III: minor annoyances

I really, really wish Rust would stop using new terms for concepts that already exist: abstract methods are "traits", static methods are "associated functions", "variables" are by default not-variable (and not to be confused with constants), and any non-trivial data type is actually a struct with implementation blocks.
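As a quick cheat sheet for the terms above, here is a tiny sketch with made-up names (`Shape`, `Square`, `SIDES`, `perimeter`). It is only meant to show where each word appears in actual code, and the mapping to concepts from other languages is approximate.

```rust
/// A "trait": roughly an interface made of abstract methods.
trait Shape {
    fn area(&self) -> f64;
}

/// A plain struct; behaviour is attached in separate `impl` blocks.
struct Square {
    side: f64,
}

impl Square {
    /// An "associated function": no `self`, called as `Square::new(..)`,
    /// i.e. what other languages call a static method.
    fn new(side: f64) -> Self {
        Square { side }
    }
}

impl Shape for Square {
    fn area(&self) -> f64 {
        self.side * self.side
    }
}

/// A real constant, evaluated at compile time.
const SIDES: u32 = 4;

fn perimeter(square: &Square) -> f64 {
    // `let` alone would be immutable; `mut` is needed to reassign.
    let mut total = 0.0;
    for _ in 0..SIDES {
        total += square.side;
    }
    total
}
```

With `let s = Square::new(2.0);`, `s.area()` is 4.0 and `perimeter(&s)` is 8.0.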

And down to the very, very end of the scale, the trailing commas at the end of match expressions, the 4-spaces indentation, and the official endorsement of 1TBS instead of K&R (namely, 4-spaces instead of Tabs) are just plain ugly. Unlike Python, however, Rust does get extra points for allowing other, less-wrong styles.

Part IV: not all is hopeless

Unlike previous rants, I want to point out something very important: I want to like Rust. I'm sure it's very good at something, and I really, really want to find what it is. It would be very sad if the answer to "what is Rust good for?" ended up being "writing Rust compilers".

Now, the official documentation (namely, the book) closes with a tutorial on how to build a multi-threaded web server, which is probably the key point: if Rust's claims that error handling and memory safety are its main priorities are true, then multi-threaded code could be the main use case. So there's hope that everything will get easier once I manage to get my strings inside my Trie, and iterating over GB of text will be amazingly easy.

I'll keep you updated.

A new and improved homeopathy

You might have heard of this new, alternative take on medicine called Homeopathy. If you haven't, the basic idea is that you take a (possibly active) substance, dilute it with alcohol or distilled water, and repeat the process until only the "vital energy" of the original substance remains. According to Hahnemann, the creator of Homeopathy (or, to be precise, according to the Wikipedia article), each dilution increases the potency of the preparation while ensuring that all traces of the original substance are effectively gone.

The efficacy of this practice has been called into question several times, which to me sounds less like a problem and more like an opportunity: how do we bring Homeopathy into the 21st century?

Enter the Nocebo effect. Unlike its big brother the Placebo effect, the nocebo effect is at play when a treatment has a negative effect simply because the patient believes it to be so - the common example being patients that suffer from "side effects" when receiving an inert substance. While precise numbers are impossible to obtain, around 5% of all patients are considered susceptible to this nocebo effect.

If a nocebo "weakens" a patient's positive response to a medication, and Homeopathy is based on diluting substances, we can combine them both! In what I have decided to call "Martinopathy" in honor of its creator (me), I suggest the following clinical procedure: when a patient is prescribed a Martinopathic treatment for (say) common cold, they are first directed to a standard pharmacy, where they buy a common, over the counter, non-homeopathic common cold drug. They are then sent to their physician. The Doctor will take a look at the medicine, repeat to the patient "this medicine will not work" around 20 times, after which the patient is free to continue their treatment with their now-martinopathic medication. In this way the effect of the regular medicine has been "diluted" down to homeopathic standards, but this time in a scientifically sound way.

There is still some room for improvement. If costs are an issue, they could be kept low by martinopathing the medicine at the source - instead of yelling at a patient, a medical professional could yell at the boxes directly on the factory floor. It is not entirely clear whether the medical professional would have to be certified in this new treatment or not. But those are small details that we can sort out after I get my Nobel prize.

Untitled Bash blog - source code is now online

As I mentioned before, I have now replaced my blogging engine with Pelican. Now that I'm mostly done dealing with ensuring that everything behaves more-or-less as before, it's time to talk about Bash.

I don't really remember why I wanted to write my blog using Bash. In general, I think it was a combination of the following factors:

  • I didn't want to install PHP. I have way too many memories of script-kiddies constantly probing my server, and that was never fun. Sure, I keep my server up-to-date, but why risk it? The fewer entry points for wannabe crackers, the better.
  • I didn't really think it would be possible. Sure, writing it at first was kind of fun, but once I started writing triple-nested-quotes (all of which needed to be escaped properly) things got weird. I probably should have called it a day at that point, but I was close enough to my goal to make it worth it to continue till the end.
  • It was a good conversation starter - if you want to get a conversation rolling, bringing your terrible idea up is not the worst way to do it.

Of course, just because it was not the right tool for the job doesn't mean that it was that bad. If anything, once I accepted that the heavy lifting should be done in a "proper" programming language (in this case, Perl), using Bash to glue everything together worked surprisingly well.

With all of that in mind, I have now finally published the source code. I also plan on some light editing to make it friendlier, and an installation guide in case you really want to blog in a system where you have no permissions to install anything. The current template (just as my current blog) is based on Yahoo's Pure.css library. In case you're not familiar with it, Pure.css is a set of CSS modules to make your project look good and responsive without a lot of effort, similar to that other library whose name escapes me right now (Bootstrap). I chose it specifically because I like to explore alternatives to the most popular projects, and Pure.css ended up being one of my favorites.

Music for programming

Like many programmers, I am a night owl. Also like many other programmers, I have a day job that forces me to be there at 8. These two characteristics interact badly with each other.

For most programmers, this is the type of problem normally solved with coffee. But not being a coffee drinker in general (I think it's just okay) and with what I can only assume is a natural immunity to caffeine, my go-to alternative solution is music: a good pair of headphones and epic, upbeat music work wonders for my concentration until lunch time, when all productivity dies.

2019 was a great year for me to both catch up with songs I didn't listen to in many years and to discover new ones. The following is a list of songs to which I return every week, divided into three sections: Full albums, Instrumental songs (no words), and Individual songs (with words).

Full albums

There are two full albums that I have often listened to in their entirety during long coding sessions, and that I definitely recommend:

  • To no one's surprise, Daft Punk's soundtrack for TRON: Legacy makes the list. Too bad the rest of the movie was not as good.
  • I haven't seen The Exorcist yet, so I never considered this album "creepy", but if you have seen it then you might recognize the opening of Tubular Bells. I found that the song's rhythm perfectly syncs with my internal rhythm, and it is not unusual for me to realize that I need to take a break right as the album comes to an end.

Instrumental Songs

It has been common knowledge for some time now that movie music is ideal for focusing on a task - you don't want the music to pull you out of a movie, the same way I don't want my music to pull me out of my work. For this reason alone, the first three items in this list are pulled straight out of Hollywood blockbusters:

Moving onto TV, the next two songs are taken from the Japanese series "Kill la Kill": Naming Sense Gata Boshi Gokuseifuku, which I could swear I never heard in the series itself, and Nui Harime's theme.

Finally, and cheating a little bit, the theme from "The Good, the bad, and the ugly" as performed by the Danish National Symphony Orchestra is the one piece of music that got me to actually, physically buy music in many years.

Individual Songs

Individual songs are always tricky, because it takes a lot of listening to them before you learn to ignore the lyrics and let them blend into the background. That said, if you are looking for songs to listen to over and over again, here's a bunch:

  • The least controversial songs in this list are The greatest show on Earth and Ghost Love Score, both by Nightwish. They have long instrumental-only sections, and they are epic enough to give you an extra push while working.
  • Both Heldenzeit and Guten Tag by the German band "Wir sind Helden" are the perfect example of a great band that you discover long after they have disbanded. If you are a geek, the videoclip for Analogpunk (performed by the singer of "Wir sind Helden") is full of easter eggs.
  • The theme of "Revolutionary Girl Utena", Rinbu Revolution, is really good. There are not that many series where seeing the opening over and over is a plus, but Utena manages it.

Honorable mentions

I feel John Butler's "Ocean" deserves a spot in this list. It didn't make it into the official selection simply because I couldn't decide which version to include. I'm partial to the live version because it's the first one I heard, but the 2012 studio version is not bad at all.

Blog update!

It is tradition to start every new year with a blog post lamenting why I haven't posted more. Instead, I have decided to kickstart 2020 with a list of the changes I've made to the blog to ensure I write more, why I've made them, and what interesting tools I've found along the way.

A big problem in my blog has always been how difficult it is to actually get something published. As I've mentioned in the past, my blog is powered by a bunch of command-line tools and Bash. This works fine when I want to work from home, but makes it very difficult when I want to blog something spontaneously: writing a post involves SSHing into my server, and there's exactly one computer from which I can do that. Converting an entry to the final HTML is not hard, either, but is friction enough that I need to be really motivated to get into it.

Enter 2020. I am more and more concerned about the state of the modern web, where "the internet" has become a synonym for Facebook, Google, and not much more. At the same time, I realize that I'm part of the problem: I may not be putting content in Facebook, but I'm not really putting content anywhere. The little content I'm putting out is not particularly useful, either. Clearly, something had to change.

And thus, a plan for 2020 was born. In order to simplify blogging, I have now thrown away my custom blog engine and moved to Pelican. I've also embraced Markdown as my document format, meaning I no longer have to worry about things looking ugly after I've written them down. All my previous content has been migrated from .html to .md using Pandoc.

On the content generation side, I've also decided to try something new. Rather than wait until inspiration strikes, I'll try to blog weekly about small problems and how I solved them. This will often involve talking about Bash and Python, so I'll have to pay special attention to other topics that could be interesting.

In the meantime, here are some changes you'll notice from the migration:

  • Some entries will look a bit different. My custom footnote CSS no longer works with Markdown, so I'll probably substitute them with regular, boring footnotes.
  • A couple entries with custom html will look weird for a while. It will take a bit to get them looking like before, and I'd rather avoid delaying the update until then.
  • The RSS feed will probably break a little bit. Seeing as the RSS will now be generated by Pelican, I imagine your RSS reader will panic a little bit.

Next in the pipeline: a showcase of my now-almost-defunct blog engine.