It's not me, Spotify, it's you

Dear Spotify,

I think it's time to realize that you are not the service you once were. At first it was subtle, like the time you changed the shade of green in your logo to the ugly one you use now. Then there was that issue with offline mode, where I lost a whole playlist because your synchronization with Windows Phone didn't work. I guess I should have seen the signs back then.

But now... now you've changed. More specifically, you changed your Terms of Use, and I can only use you if I agree to let you collect my pictures and track my location, among other things. And that's where I have to draw the line. Spotify, I'm breaking up with you.

Wait, let me rephrase that: I already broke up with you 10 minutes ago, when I canceled my paid subscription. This is just me being polite.

Let's be honest here: I was not paying for music. I can get free music pretty much anywhere - be it YouTube, Vimeo, MP3 forums or torrents, finding free music is not particularly difficult. I was paying those €10 because I preferred that to paying with my data, as with so many other services. But if you are going to build a profile of me anyway, one related not to what I like to listen to (which is what you are supposed to care about) but to what I do in my daily life (which is none of your business), then what's the point? I was paying to get away from the claws of marketing, and that is now gone. And so am I.

I guess I'll just go back to the old ways, building my own music collection and listening to it wherever and however I want. I may even get back to my old idea of a streaming server. I know we had our issues before, like when I kept looking for videogame music and you kept showing me crappy piano versions of it. Or when you wouldn't change the title of Fabiana Cantilo's misspelled album even after I pointed it out repeatedly. But this time it's different. This time I'm gone for good.

Bye, Spotify. I'll show up one last time to collect the titles on my playlist, so I can download them somewhere else. You can keep my e-mail address. It was a throwaway anyway.

Guitar music and I

Here's a joke I heard once at a music academy:

How do you keep a pianist from playing? You take away their music sheet.

How do you keep a guitarist from playing? You give them a music sheet.

This joke rings oh-so-true because it highlights a key point for those of us who tried to learn guitar by ourselves: the typical amateur guitar player doesn't know how to read music, and doesn't care to learn. Somehow they manage, but how they do it remains a mystery to me.

Typical guitar tabs (those you buy at a shop or download from the internet) therefore contain little more than the lyrics of a song and the points at which you are supposed to switch from one chord to another. This works pretty well for your left hand, but what about the right one? Should I just move it up and down? And at what speed? "Well", says the guitar book, "you should just do whatever feels natural". This is of course useless - what am I, the song whisperer? What if nothing feels natural? Do I just sit there in silence?

Let's take the following example, which I borrowed from Ultimate-guitar.com:

    D                       G       A

D   G  A   [play twice]

You'll say
G      A           D
we got nothing in common
   G      A         D
no common ground to start from
    G       A      D     G  A
and we're falling apart

This is actually a fairly complete piece: it shows the lyrics and chords (lower half) along with an attempt at explaining how the strumming (i.e., what to do with your right hand) should be performed. But here's the thing: it's not clear at all which strokes should be "up" and which "down", nor the duration of and silences between them. You cannot derive rhythm from this information, which is pretty bad for a section titled "Rhythm for intro and verse". And here's an extra fact: the "D" section should actually be played exactly the same as the "G A" section, but good luck discovering that from this notation. This is a known bug of guitar tabs, and yet I have several books with songs that don't even include such a section, either because they don't care or because they realized it's useless.

This is one of those very few problems that is currently solved by, of all things, YouTube. It's not too hard to find a "How to play Breakfast at Tiffany's" video tutorial, where some dude will spend some time showing you the strumming in slow motion, so you can play the whole thing. But how come YouTubers have fixed this problem so fast, while guitar books have remained the same for decades? Why isn't everybody complaining? My theory is that the typical amateur guitarist picks up a guitar, downloads one of these tabs, fails to get anything out of it, and quits guitar forever saying "guitars are hard".

I don't really have a good solution, because any attempt at formalizing the strumming will undoubtedly require some knowledge about rhythm, and guitar players seem to hate that. Perhaps that's how we ended up in this mess in the first place. Or perhaps there's a super easy, totally intuitive method that I've always missed for one reason or another.

But then again: how can a method do any good if it's never taught?

The Semantic and Observational models

This article is the fourth of a series in which I explain what my research is about in (I hope) a simple and straightforward manner. For more details, feel free to check the Research section.

Let's continue with our idea of guiding people around, like I mentioned in the previous article. It turns out that people often make mistakes, either because the instruction we gave was confusing or because they weren't paying attention. How can I prevent those mistakes?

For my first research project at the University of Potsdam, we designed a system that took two things into account: how clear an instruction was, and what the player did after hearing it. Let's focus on those points.

For the first part, which we called the Semantic Model, a system tries to guess what the user will understand after hearing an instruction. If the instruction says "open the door", and there's only one door nearby, then you'll probably open that one. But what if I tell you "press the red button" and there are two red buttons? Which one will you press? In this case, the model tells us "this instruction is confusing, so I don't know what the user will do", and we can use that to produce a better instruction.

For the second part, which we called the Observational Model, a second system tries to guess what your intentions are based on what you are doing right now. For instance, if you are walking towards the door with your arm extended, then there's a good chance you are going to open that door. Similarly, if you were walking towards a button but then stopped, looked around and walked away, then I'm fairly sure you first wanted to press that button but changed your mind.

When we put both models together, they are pretty good at guessing what you are trying to do: when the first one says "I'm sure you'll press one of the red buttons" and the second one says "I'm sure you'll press either this blue button or that red one", we combine them and get "We are sure you'll press that red button". Even though neither of them was absolutely sure about what you'd do, together they can deduce the right answer.
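One simple way to picture this combination is to treat each model's guess as probabilities over the buttons, multiply them, and renormalize. This is just a toy sketch of the idea, not the actual system we built; the button names and numbers below are invented for illustration.

```python
def combine(semantic: dict, observational: dict) -> dict:
    """Multiply each button's probability under both models, then renormalize."""
    buttons = set(semantic) | set(observational)
    product = {b: semantic.get(b, 0.0) * observational.get(b, 0.0) for b in buttons}
    total = sum(product.values())
    return {b: p / total for b, p in product.items()}

# The Semantic Model thinks you'll press one of the two red buttons...
semantic = {"red_1": 0.5, "red_2": 0.5, "blue_1": 0.0}
# ...while the Observational Model sees you walking towards red_1 or blue_1.
observational = {"red_1": 0.5, "blue_1": 0.5, "red_2": 0.0}

print(combine(semantic, observational))  # red_1 ends up with probability 1.0
```

Notice how neither model assigns more than 0.5 to any single button, yet only "red_1" survives the multiplication: that's the deduction step in miniature.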

Each system takes different clues into account to make its guess. The Semantic Model pays attention mostly to what the instruction says: did I mention a color? Is there any direction, such as "in front of"? Did I mention just one thing or several? And which buttons were visible when you heard the instruction? The other model, on the other hand, takes into account what you are doing: how fast you are moving, in which direction, which buttons are getting closer, and which ones you are ignoring, among other things.

Something that both models like to consider is which buttons were more likely to catch your attention, either because you looked at them for a long time or because one of them is more interesting than the rest. But there's a catch: computers don't have eyes! They don't know what you are really looking at, right? Finding a way to solve this problem is what my next article will be about.

The Tapiz instruction-giving system

This article is the third of a series in which I explain what my research is about in (I hope) a simple and straightforward manner. For more details, feel free to check the Research section.

For my first research paper during my PhD, the basic idea was pretty simple. Imagine that, after recording several hours of people being guided around a room, I notice the following: every time a player stood in front of a door and someone told them "go straight", they walked through the door. So now I ask: if you are standing in front of a door, and I want you to walk through it, would it be enough for me to say "go straight", like before? My research team and I wanted to give this question an answer, so this is what we did.

We looked at our recorded data. Whenever we saw a player moving somewhere, we took notes on where the player was before, where they ended up, and which instruction convinced them to move from one place to the other. We then created a big dictionary, where each entry reads "to move the player from point A to point B, say this". Quite smart, right?
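The "big dictionary" really can be pictured as an ordinary lookup table. Here is a minimal sketch of the idea under my own simplifying assumptions; the positions, instructions, and function names are made up for illustration, not taken from the real system.

```python
# Each entry maps a (start, end) pair of player positions to the
# instruction that moved a recorded player between those positions.
instructions = {}

# "Learning" is just filling in the table from recorded games:
instructions[("in front of door", "through door")] = "go straight"
instructions[("facing hallway", "facing painting")] = "turn right"

def instruct(start: str, goal: str) -> str:
    # Generating an instruction is a plain lookup; the system never
    # needs to know what the words "turn" or "right" actually mean.
    return instructions.get((start, goal), "sorry, I've never seen that move")

print(instruct("in front of door", "through door"))  # -> "go straight"
```

The lookup's fallback line also hints at the method's main limitation: if no recorded player ever made a particular move, the dictionary simply has nothing to say about it.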

The most important part about this idea is that we don't need to teach our computer how to understand language - in fact, when our system reads "turn right" in our dictionary, it has no idea about what "turn" or "right" mean. All our system cares about is that saying "turn right", for some strange reason, causes people to look to the right. This makes our system a lot simpler than other systems that try to understand everything.

Now, let's complicate things a bit: let's say I tell you "walk through the door to your left". You turn left, walk through the door, take 7 steps, make a full turn to look at the room, and then wait for me to say something else. Which of those things did you do because I told you to, and which ones did you do because you felt like it?

Since we didn't really know the answer, we tried two ideas: in the first case, we decided that everything you did was a reaction to our instruction (including the final turn), while in the second one we only considered the first action (turning left) and nothing else. As you can see, neither approach is truly correct: one is too long, and the other one too short. But in research we like trying simple ideas first, so we decided to give these two a try.

Our results showed that the second approach works better: if you advance just one step, I can guide you to the next one, but if you do too many things at once there's a chance you'll get confused and lost. Also, since our system repeats what other humans said before, players felt the instructions didn't sound too artificial.

Not bad for my first project, right?

What is the GIVE Challenge?

This article is the second of a series in which I explain what my research is about in (I hope) a simple and straightforward manner. For more details, feel free to check the Research section.

The GIVE Challenge is a competition started at the University of Saarland, created to collect data about human behavior. Since most of my research is based on that data, this is a good moment to explain what it is about.

We all know GPS devices by now - whenever we drive somewhere new, we just type in the address and the GPS guides us. But have you ever thought about how hard it is to give instructions the way your GPS does? For instance, if we are at a roundabout and I say "take the third street to your right", does that mean you have to count all streets, or should you skip the ones going the wrong way? And how much time do you need to react to my directions? These are important questions, because they reveal a bit more about how humans act and think.

If we want answers, we need to collect data (reaction times, distances to other cars, misunderstandings, etc.), and that data is very difficult to get. For our example, you would have to drive while wearing special glasses and carrying a military-grade GPS, all while keeping track of every car and pedestrian around you. So you might wonder: couldn't we make something simpler, but still useful? My adviser and other researchers asked themselves this exact same question in 2007, and that is how the GIVE Challenge was born.

In GIVE, a person sits in front of a computer and plays a game. The game is pretty easy - all they have to do is walk around a virtual house and press some buttons in a certain order. Just like with a GPS, they receive instructions telling them where to go and what to do.

In the first variant of the GIVE Challenge, the instructions are given by another person using a computer in a different room. We then record all the information about how the player reacts to those instructions: if the instruction says "turn right", how much does the player turn? Do they just turn, or do they walk too? And how long does it take them? By recording every single movement the player makes inside the game, we can answer questions like these.

There's also a second variant: we can write a program that guides the person inside the game, and see how good (or how bad) its instructions are. While a common GPS only cares about streets, our programs have a harder job: humans are not limited to following streets like cars are, so the instructions are more complex. GIVE is a good way of testing how smart our computers are, and that's why we've been using it for many years now.

So far we've recorded over 340 hours of human movements, divided into 2,500 games. Believe it or not, this is not that much data, but it's a good start. We have extracted several interesting results from it, some of which I talk about in future articles.