7c0h

Latest Post

Which apostrophe should I use?

As someone who regularly switches between keyboard layouts, I have a problem: I have at least three keys that can be used as an apostrophe, but I don't know which one is the correct one. Compare:

  • Backtick: Hamlet`s father
  • ASCII Apostrophe: Hamlet's father
  • Acute accent: Hamlet́s father
  • Single closing quote: Hamlet’s father

Do you know which one is the right one? If not, this small guide is for you. But you'll have to endure a lot of theory first.

The problem here is that using your keyboard requires mixing three different concepts: which key you pressed, which character it represents, and how it is visually represented. To explain that in clearer terms, let's take the backtick as an example.

The key itself can move around. The backtick key is located below the tilde (~) on a US keyboard, to the left of the backspace key on a German keyboard, and under the caret (^) on a Spanish keyboard.

Internally, this key is called "Grave accent" in ASCII but programmers know it as "backquote" or "backtick". When you press it you send a code to your computer that, if you were using ASCII, would be reflected as the 0x60 hexadecimal value. And here we make another distinction: if your computer is configured to do so, this code can be interpreted as a dead key that only exists to modify the next character. If you want to type the è in the French word très (very) you use the combination <Grave accent key> + <e key>. If your computer is not configured in this way then you simply get a backtick.

But here we have a very subtle difference between a backtick and a Grave accent. The Grave accent is a modifier, changing the sound of the letter underneath. Therefore, it cannot exist by itself. If you see the character ` alone then it is not a Grave accent, it's a backtick. They both look the same, but they have different meanings.
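To make the key/character distinction concrete, here is a minimal Python sketch that prints the code point and official Unicode name of the four candidates from the list above. It assumes the acute accent in that list is the combining form (U+0301); describe() is just an illustrative helper name. The last lines show a letter and a combining Grave accent merging into a single è, which is only loosely analogous to what a dead key does.

```python
import unicodedata

# The four candidates from the list above. The acute accent is assumed to be
# the combining form (U+0301), which is why it attaches to a neighbouring letter.
candidates = ["\u0060", "\u0027", "\u0301", "\u2019"]


def describe(char):
    """Return the code point and official Unicode name of a character."""
    return "U+%04X %s" % (ord(char), unicodedata.name(char))


descriptions = [describe(c) for c in candidates]
for line in descriptions:
    print(line)

# A letter followed by a combining grave accent composes into one character
composed = unicodedata.normalize("NFC", "e\u0300")
print(composed == "\u00e8")
```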

Your program is another factor: some programs may replace the character you are using because it's very likely that you are using it wrong. If I type the double quote character (") in Microsoft Word it may be replaced by an Opening double quote (“), a German opening double quote („), an English closing double quote (”), or a German closing double quote (“, which is the same character as the Opening double quote in English).

And then we have the issue of fonts. Some fonts may represent two different characters in the same way, or straight up ignore it. If you have been staring at some of the quotes I mentioned above and see no difference, well, maybe that's why.

So back to our original question: what is each key good for?

  • The backtick quote (`) has no meaning in typography at all. The only reason a non-programmer would use it would be to type a Grave accent.
  • The apostrophe (') is the one I need for daily use. In truth, Unicode defines a different character as a "true" apostrophe (we'll get to that soon), but not all keyboards can generate it. Therefore, and just like the double quote example above, it is okay to use it.
  • The Acute accent (́) should only be used as a modifier. You can see that it insists on modifying whichever character is next to it, because it has no meaning by itself.
  • The Single closing quote (’) can be generated under certain combinations of keyboard and local configuration. If your keyboard supports it, this is the preferred key to use for apostrophes.
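If your keyboard cannot produce the Single closing quote, one option is to fix the text afterwards. The sketch below is only an illustration (typographic_apostrophes() is a made-up name, not part of any library): it replaces an ASCII apostrophe with U+2019 only when it sits between two letters or digits, so quotes around a phrase are left alone.

```python
import re

# Matches an ASCII apostrophe that sits between two word characters
# (as in "Hamlet's"); quotes around a phrase are deliberately left alone.
_INNER_APOSTROPHE = re.compile(r"(?<=\w)'(?=\w)")


def typographic_apostrophes(text):
    """Swap in-word ASCII apostrophes (U+0027) for U+2019."""
    return _INNER_APOSTROPHE.sub("\u2019", text)


print(typographic_apostrophes("Hamlet's father"))
```

Trailing apostrophes (as in "the dogs' bowls") are not handled, because they cannot be told apart from a closing quote without more context.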

There are plenty of other versions of the quote character that can be used. This list of commonly confused characters has been of great help, while this technical description of the apostrophe character also provides more info than you thought you needed. Sure, it may sound like a lot of work. But you'll thank me next time you need to write a phrase like “In a Déjà vu, René corrected ‘It’s x′, not x″’” knowing full well that you are right.

Older Posts

Internet should be a utility

Have you heard of Parler? In case you haven't, Parler is (was?) a social network for ~~racists~~ the alt-right that gained notoriety this week. Having allegedly been used to coordinate the storming of the U.S. Capitol, its app was removed from both the Google Play Store and the Apple Store and then pulled from Amazon AWS the next day. With no hosting and no app, Parler has been effectively killed by the tech giants.

The swift removal of Parler from the internet is the incident that, I hope, will bring together the right and the left under a common cause: that the internet should be considered a public utility and that Internet Service Providers (ISP) should be regulated as such.

A public utility is a service that everyone needs (think water and electricity) and where regulation is needed because the high cost of entry discourages competition (so-called natural monopolies). It has been argued that the internet should be included in this list too - can you imagine your current daily life without internet? ISPs, on the other hand, are happy setting their own prices and policies, and have for years resisted efforts in this direction.

Parler was effectively removed from the internet by the tech giants under the argument that, as private companies, they have the right to refuse service to anyone they don't want to work with[1]. But ISPs are private companies too, and therefore free to do the same to you - if you live in a country with no ISP regulation, your provider has a right to stop giving you internet access and tank your business with little repercussions.

And here is where I hope both "the left" and "the right" will see that their interests overlap. The left should support ISP regulations (and net neutrality!) because they believe, as Germany and France put it, that free speech should be governed by law and not by tech giants. The right, on the other hand, should realize that they gave tech monopolies all the cards and that they are the only ones getting kicked out of their social media accounts. If the President of the United States himself can be banned from Twitter, Facebook, Snapchat and Youtube, then no one is safe.

Why ISPs?

You might have noticed that everyone I linked above talks about regulating Amazon, Google, and/or Apple. I, on the other hand, would suggest that we focus on ISPs instead for the following reasons.

First, because the internet parallels the history of the telephone almost perfectly: a communication technology that catches on and that, while not biologically required (unlike water and heating), plays a critical role for life in a society. And ISPs are not "like" the telephone companies, they are the telephone companies.

Second, because it makes sense that internet should be provided to everyone without discrimination: imagine a world in which your shower stops working because you said in public that you prefer bottled water, or where your telephone is disconnected because you bad-mouthed someone during a conversation.

And finally, because it's the last step down the technology chain at which you can still survive: there are alternatives to Amazon AWS, and if they won't have you then you could still plug in your own server and keep going (ask the Pirate Bay). But if the only ISP in town denies you service, what are you going to do, move your family to the closest town? Ask people to send you letters?

So here's my proposal: make the internet a public utility and, in exchange, give ISPs immunity from what their customers do with it. Let's bring the internet into the 21st century.

Footnotes

[1] Leaving illegal discrimination aside, which doesn't seem to be the case here.

Unity and Internet-Oriented Software Development

I am developing a simple HTML5 game with Unity. The experience reminded me of my post about Rust and led me to coin a new term that I'd like to discuss today.

Ladies and gentlemen, in the spirit of Object-Oriented Programming I present to you Internet-Oriented Software Development (IOSD): a style of software development in which the official way to program is by trying something, giving up, asking strangers on the internet, and hoping for the best.

You may wonder: how is this new, seeing as that's what we are all already doing? The keyword here is "official". If you want to program with (say) Keras during a long trip with unreliable internet, you could do it with an offline version of the API reference alone. Sure, getting the offline docs will take a bit of work, but at least there's an official repository of knowledge that you can always go to. Of course you can search on the wider internet for help too, but you don't have to.

IOSD is different: when you release IOSD software you publish an okay guide to your software, and that's it. There is no need for you to keep it up to date nor for it to be useful, because your documentation is Internet-Oriented: if someone has a problem, they can ask the internet, IRC channels, their co-workers, anyone but you.

Rust came to mind because, as I complained before, that's how their intended development cycle apparently works: if you don't know how to do something, you are encouraged to either search the forums or ask the developers. In the latter case, "we don't know, we won't do it, and we won't tell you who is doing it right" is a possible response. TensorFlow used to be like that too (ask me about the Seq2Seq tutorial!), but they reversed course and their current official suggestion is that you use Keras (no, seriously).

But Unity is an even better example. For starters, the official docs are essentially useless because they tell you what something does but neither why nor what to use it for. Can you guess in which situations a Sprite renderer is required, and what you need it for? Because I can't. One might argue that Unity Learn is where you should look for answers, in which case one would be wrong. Taking the first course in the "Game development" section, for instance, gets you this tutorial which is only valid for an outdated version of Unity.

No, the real source of answers is YouTube tutorials. Sure, sometimes they refer to windows that aren't there anymore and/or changed their name, but you can always add a "2019.4" to your search and try again.

I am not entirely a beginner with Unity, as I worked with it for my PhD projects. Even then, the list of resources I needed to complete my current project so far includes 5 YouTube tutorials, 2 forum threads, and zero links to official documentation. Is this a problem? Is IOSD better than the thick manuals we had before? Am I the only one getting outdated answers for trivial problems? I have no idea. So I propose a compromise: I will point at the situation and give it a name, let someone else answer the hard questions, and we will share the credit.

Animations in Gephi

Gephi is a tool for graph visualization that's great for plotting big datasets. If you have large, complex graphs to plot, Gephi has plenty of layouts that you can apply to your data and turn a mess of a graph into something that hopefully makes sense.

But there's something Gephi can't do: animations. The layout algorithms at work look awesome, and yet there's no easy way to create a video out of them. Capturing the screen is an alternative, but the final resolution is not really high.

It has been suggested that you could export images of the individual networks at every step and then put them together. Of course, the people who suggested this in the past have also...

Therefore, this post.

Note: this post is not an introduction to Gephi, and I'll assume that you know the basics. If you don't know them, these slides look quite good.

Requirements

  1. Gephi (obviously)
  2. ImageMagick (to post-process images in bulk)
  3. FFmpeg (to create the final video)

I am also assuming that you are using Linux. This is not a requirement, but it does make my life easier.

Preparing your data

The first step is to have some data to visualize. For the purposes of this exercise, you can download this zip file containing two files named nodes.csv and edges.csv. They form a simple directed graph that I generated from a sample of directories in my computer.
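If you would rather build a similar dataset from your own directories, the following Python sketch shows one way to do it. It is not the script I used for the zip file: directory_graph() is a hypothetical helper, and the Id/Label and Source/Target column headers are an assumption about what Gephi's spreadsheet importer will recognize, so adjust them if your import dialog says otherwise.

```python
import csv
import os


def directory_graph(root, nodes_path="nodes.csv", edges_path="edges.csv"):
    """Write a directed graph of the folders under `root` as two CSV files.

    Each directory becomes a node and each parent -> child relation an edge.
    The Id/Label and Source/Target headers are assumed column names.
    """
    ids = {}
    edges = []
    for current, subdirs, _files in os.walk(root):
        ids.setdefault(current, len(ids))
        for name in subdirs:
            child = os.path.join(current, name)
            ids.setdefault(child, len(ids))
            edges.append((ids[current], ids[child]))
    with open(nodes_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Id", "Label"])
        for path, node_id in sorted(ids.items(), key=lambda item: item[1]):
            writer.writerow([node_id, os.path.basename(path) or path])
    with open(edges_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Source", "Target"])
        writer.writerows(edges)
    return len(ids), len(edges)
```

Calling directory_graph("/some/folder") writes both files to the current directory and returns the number of nodes and edges it found.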

Now, load them into Gephi:

  1. Start a new Project
  2. Load the nodes table: File → Import spreadsheet → nodes.csv. Follow the steps, but remember to select "Append to existing workspace" in the last one.
  3. Repeat the previous step, this time using the edges.csv file.

Next, choose a layout. In my experience it is better to try them all first, see which one gives the best result, and then generate the final graph from scratch. Generating the final graph will take a while, so it's better to do it only once you are sure about which parameters to use.

Exporting all frames

It is now time to run our first script. If you don't have the Scripting Plugin installed, you can do it now via Tools → Plugins. We will use it to write a very simple Python script that does the following:

  1. Run a single step of a layout
  2. Take note of the relative size of the graph (we'll come back to this)
  3. Export it to a PNG file
  4. Return to step 1.

In case you want to copy-paste it, this is the script I used. Don't forget to remove the comments first, because Gephi doesn't like them.

def make_frames(iters, layout, outdir):
    # As many iterations as frames we want to generate
    for i in range(iters):
        # Run a single step of our layout
        runLayout(layout, iters=1)
        # Calculate the bounding box of this specific graph from the
        # actual node positions (starting the search at 0 would force
        # the box to always include the origin)
        xs = [node.x for node in g.nodes]
        ys = [node.y for node in g.nodes]
        width = max(xs) - min(xs)
        height = max(ys) - min(ys)
        # Generate a file and include the graph's bounding box
        # information in its filename
        exportGraph("%s/%05d-%d-%d.png" % (outdir, i, width, height))

Once you have copied this script into the console, you can generate all animation frames with the command make_frames(100, FruchtermanReingold, "/tmp/"). This will run the FruchtermanReingold layout for 100 iterations, and will save the generated frames in the /tmp/ directory. Of course, you can choose other layouts (see the documentation for more info) and you can run the script for a larger number of steps. You can also customize the layout parameters in the regular Layout tab, and the script will pick them up.

The script will block Gephi entirely, so don't go for a really high number of steps from the beginning. Start with 50-100, and only then move on.

For a nicer effect, make a single run of the "Random" algorithm first. This will put all your nodes in a very small space, and the final effect will be like an explosion of nodes.

Generating the animation

A further issue to deal with is the changing size of the image canvas. If we set Gephi to generate a 1024x1024 output image but we only have two nodes close to each other, those two nodes will look huge. If we have thousands of dispersed nodes, however, the individual nodes will be barely visible. Therefore, if you made a video with the images we generated in the previous section directly, you would almost certainly get a zoom effect where the nodes would get bigger and smaller as the graph gets denser or sparser respectively.

To avoid this, we need to scale all pictures proportionally. The Bash script below calculates the maximum theoretical size of our graph (based on those variables we added to the filenames before), scales all images down to the proper size (as defined by canvas_w and canvas_h), and places them in the center of a single-color canvas (see the rgb() call).

# Directory where all individual frames are
SOURCE_DIR=/tmp
# Directory where final frames will be stored (created if missing)
TARGET_DIR=/tmp/outdir
mkdir -p ${TARGET_DIR}

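# Obtain the theoretical maximum width and height in the PNG frames
# (parsed from the file names only, so a '-' in SOURCE_DIR does no harm)
max_width=`ls ${SOURCE_DIR}/*png | xargs -n 1 basename | cut -f 2 -d '-' | sort -n | tail -1`
max_height=`ls ${SOURCE_DIR}/*png | xargs -n 1 basename | cut -f 3 -d '-' | cut -f 1 -d '.' | sort -n | tail -1`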

# Give your desired canvas size
canvas_w=1024
canvas_h=1024

# Scaling factor for the frames, based on the largest theoretical dimension
if (( $max_width > $max_height ))
then
    factor=`bc -l <<< "$canvas_w/$max_width"`
else
    factor=`bc -l <<< "$canvas_h/$max_height"`
fi

# Generate the new frames
for file in ${SOURCE_DIR}/*png
do
    # Obtain the properties of the image from its file name
    name=`basename $file`
    frame=`echo $name | cut -f 1 -d '-'`
    width=`echo $name | cut -f 2 -d '-'`
    height=`echo $name | cut -f 3 -d '-' | cut -f 1 -d '.'`
    # Calculate how the image should be scaled down
    new_width=`bc -l <<< "scale=0; ($width*$factor)/1"`
    new_height=`bc -l <<< "scale=0; ($height*$factor)/1"`
    # Put it all together
    convert $file -scale ${new_width}x${new_height} png:- | \
        convert - -gravity Center -background "rgb(255,255,255)" \
        -auto-orient -extent ${canvas_w}x${canvas_h} \
        ${TARGET_DIR}/${frame}.png
done
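If Bash arithmetic is hard to follow, the same scaling logic can be sketched in Python. This is only an illustration of the calculation and does not touch the images: parse_frame() and scaled_sizes() are hypothetical helpers, and the file names are assumed to follow the frame-width-height.png pattern produced by the export script.

```python
import os


def parse_frame(filename):
    """Split '00012-340-125.png' into (frame, width, height)."""
    stem = os.path.splitext(os.path.basename(filename))[0]
    frame, width, height = stem.split("-")
    return frame, int(width), int(height)


def scaled_sizes(filenames, canvas_w=1024, canvas_h=1024):
    """Map each frame id to its (width, height) after scaling."""
    frames = [parse_frame(name) for name in filenames]
    max_width = max(width for _, width, _ in frames)
    max_height = max(height for _, _, height in frames)
    # Same rule as the Bash script: the larger dimension decides the factor
    if max_width > max_height:
        factor = canvas_w / max_width
    else:
        factor = canvas_h / max_height
    # Truncate to whole pixels, like the scale=0 division in bc
    return {frame: (int(width * factor), int(height * factor))
            for frame, width, height in frames}
```

The frame with the largest bounding box fills the canvas, and every other frame shrinks by the same factor, which is what removes the zoom effect.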

Once you have generated this second set of frames, you can generate your final video by going to your TARGET_DIR directory and running the command

ffmpeg -framerate 30 -i %05d.png -c:v libx265 -r 24 final_video.mp4
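As a rough guide, and assuming the -framerate value is the number of input frames shown per second, the length of the clip is simply the frame count divided by that value. A tiny sketch (clip_seconds() is a hypothetical helper, not part of FFmpeg):

```python
def clip_seconds(n_frames, framerate=30):
    """Length in seconds of a clip that shows `framerate` frames per second."""
    return n_frames / framerate


# 100 exported frames at the framerate used in the command above
print(round(clip_seconds(100, 30), 2))
```

With 100 frames the clip lasts a little over three seconds, so longer animations need more layout steps or a lower framerate.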

If your video is too slow, a higher framerate value will do the trick (and vice versa). The final result, heavily compressed for the internet, can be seen below:

Final thoughts

I hope you'll find this guide useful. I'm not going to say that it's easy to follow all these steps, but at least you can set them up once and forget about it.

Some final points:

  • For an alternative method of graph generation involving nodes with timestamps, this script looks like the way to go.
  • I'm interested in unifying everything under a single script - I chose Python because it was easier than Java, but maybe developing a proper multi-platform plugin is the way to go. I can't promise I'll do it, but I'll think about it. In that same vein, perhaps the script should center in a specific node?
  • If you plan to publish your video on the internet by yourself, this post gives a nice overview of which standards to use. If you want to tweak the video quality, this SE question provides some magic incantations for FFmpeg.
  • Finally, and speaking of magic incantations, special thanks to this post for providing the right ImageMagick parameters.

Wiki idea: what happens when you press Enter on your browser

Here is an idea that I had and that I don't have time to work on. I read somewhere about the following job interview question:

What happens after you write a URL in your browser and press Enter?

If you think about this for a moment, you might realize that what this question really means is "tell us everything you know about computers". I have yet to find a topic that wouldn't be involved in giving a full response. Off the top of my head, and in roughly chronological order, you would have to explain...

  • ... how your keyboard sends signals, including the difference between pressing and releasing a key. Also, how your computer display works.
  • ... how to turn a series of electric impulses into a character.
  • ... how to parse a URL, including the difference between Unicode and ASCII.
  • ... how the internet works: DNS, TCP/IP, IPv4 vs IPv6, routing, etc
  • ... how the browser and server negotiate the type of content they want. It might also include an introduction to the GZIP compression algorithm.
  • ... a primer on HTTPS, including cryptography and handling certificates.
  • ... what is a web server and how it works. Same for load balancers, proxies, and pretty much all modern server infrastructure.
  • ... how your operating system renders anything on screen.
  • ... how your web browser renders content.
  • ... the standards involved in receiving content: HTML, CSS, JavaScript, etc.
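To give an idea of how much hides behind a single bullet, here is a small Python sketch of the URL-parsing item. It uses the standard library's urlsplit and the built-in "idna" codec; url_parts() and ascii_hostname() are illustrative names of my own, and a real browser does considerably more than this.

```python
from urllib.parse import urlsplit


def url_parts(url):
    """Break a URL into the pieces a browser needs before connecting."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "path": parts.path,
        "query": parts.query,
    }


def ascii_hostname(hostname):
    """Encode a Unicode hostname into the ASCII form that DNS understands."""
    return hostname.encode("idna").decode("ascii")


print(url_parts("https://example.com:8443/a/b?x=1"))
print(ascii_hostname("m\u00fcnchen.de"))
```

The second function is where the Unicode vs ASCII difference shows up: the name you type may contain any character, but the name that is looked up is plain ASCII.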

I imagine that this would be a great idea for a Wiki: the main page would simply present the general question, and you could go deeper and deeper until you reach your motherboard's buses, your microprocessor's cache, the specifics of BGP, or pretty much anything that was ever used in an internet-connected computer.

I was never asked this question, which is a bit of a missed opportunity: I don't know exactly how many hours I could waste on this question, but I'm willing to bet it would be more than what any reasonable interviewer is willing to spend. More realistically, I imagine the point of the question is both to check whether you know about computers and, more importantly, whether you know when to stop talking about computers.
