This article is the first of a series in which I explain what my
research is about in (I hope) a simple and straightforward manner. For
more details, feel free to check the Research section.
In research, we often want to teach computers how to do a new task, but
that is difficult because computers are not too smart, and teaching them
even a simple task takes a lot of work. So let's say I want my computer
to tell me whether an e-mail is important or not. If I could teach my
computer that, then it could show me important e-mails first and save me
the trouble of sorting through them daily.
One way of teaching tasks to computers is by doing the job myself, and
then making the computer repeat what I did. This is something scientists
have been doing for a long time, and today we have a set of steps that
every researcher should follow.
The first step is to collect as many e-mails as possible, both important
and not. In science, such a big set of e-mails is called a corpus.
Now, just like you wouldn't know what kind of e-mails I consider
important, neither does a computer. So the second step is to go through
all those e-mails I collected, and mark which ones are important. I'll
create two groups, one called "training" and another one called
"testing". The first group will contain 4 out of 5 e-mails, picked at
random, while the second group will have the remaining ones.
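For the curious, here is a minimal Python sketch of that split. It assumes the corpus is simply a list of (text, is_important) pairs; the function name and the toy e-mails are made up for illustration.

```python
import random

def split_corpus(emails, seed=0):
    """Shuffle the labeled e-mails and put 4 out of 5 in the training group."""
    shuffled = list(emails)                 # copy, so the original order is kept
    random.Random(seed).shuffle(shuffled)   # a fixed seed makes the split repeatable
    cut = len(shuffled) * 4 // 5
    return shuffled[:cut], shuffled[cut:]

# Hypothetical corpus: ten e-mails, half of them marked as important.
corpus = [(f"e-mail number {i}", i % 2 == 0) for i in range(10)]
training, testing = split_corpus(corpus)
print(len(training), "for training,", len(testing), "for testing")
```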
The third step, unsurprisingly called the training stage, requires the
computer to analyze all the e-mails I put in the training group and
decide what makes an e-mail important. We would expect our computer to
understand, for instance, that since every e-mail containing the word
"SALE" was marked as unimportant, then it might be a good idea to mark
all e-mails with commercial offers as unimportant. This is by far the
hardest step, and there are many ways in which I can influence how well
the computer will learn.
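To make this less abstract, here is a toy sketch of one possible way of "learning" from the training group: counting which words show up in important and unimportant e-mails. It is only an illustration under that assumption, not the method I use in my research, and every name in it is invented.

```python
from collections import Counter

def train(training_group):
    """Count in how many important and unimportant e-mails each word appears."""
    counts = {True: Counter(), False: Counter()}
    for text, is_important in training_group:
        counts[is_important].update(set(text.upper().split()))
    return counts

def predict(counts, text):
    """Guess 'important' if the words were seen more often in important e-mails."""
    words = set(text.upper().split())
    score = sum(counts[True][w] - counts[False][w] for w in words)
    return score > 0

# Hypothetical training group of (text, is_important) pairs.
training_group = [
    ("Big SALE today", False),
    ("SALE ends soon", False),
    ("Meeting notes attached", True),
    ("Project meeting tomorrow", True),
]
model = train(training_group)
print(predict(model, "Summer SALE"))
print(predict(model, "Meeting agenda"))
```

Real systems are far more careful than this, but the idea is the same: the computer only knows what the marked examples tell it.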
The fourth and final step is to give our computer a test, to see whether
it learned something useful or not. For this step, called the testing
stage, I'll go through each e-mail from the testing group, show the
computer the e-mail's text, and ask whether it's important or not. Then
I compare the computer's answers with mine, and I'll use that result to
decide how well (or how badly) my computer learned the task. If the
results are not good enough I can always go back, change how the
e-mails are analyzed, and try again. If the results are good, on the other
hand, I can trust my program to sort my e-mail from now on.
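The comparison in the testing stage can be sketched in a few lines of Python. This is a simplified example: the classifier below is a stand-in that only looks for the word "SALE", and the testing e-mails are invented.

```python
def accuracy(classify, testing_group):
    """Fraction of testing e-mails where the computer's answer matches mine."""
    correct = sum(1 for text, label in testing_group if classify(text) == label)
    return correct / len(testing_group)

def toy_classifier(text):
    """Stand-in for a trained program: anything mentioning a sale is unimportant."""
    return "SALE" not in text.upper()

# Hypothetical testing group of (text, is_important) pairs.
testing_group = [
    ("Huge sale this weekend", False),
    ("Your flight is confirmed", True),
    ("Lunch on Friday?", False),
    ("Thesis deadline moved", True),
]
score = accuracy(toy_classifier, testing_group)
print(f"The computer agreed with me on {score:.0%} of the e-mails")
```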
This is pretty much half my daily work. Collecting enough data (e-mails)
is complicated, expensive, time-consuming, or all of those at once. And
remember I said there are several ways in which a computer
might learn? We have to try some of those alternatives too.
Finally, training is usually very slow - in my last project, it took
almost a week.
I usually dedicate that time to playing Solitaire.
I always associated the word tumbler both with Batman's car
in Nolan's movies and with that one website. It turns out that there's a
third definition: a tumbler is also a glass, and in particular it is the
name for those plastic cups for hot
drinks (e.g., coffee) that people carry with them sometimes. They are more
or less good for the environment, which is why I ended up with one.
The one I have allows me to change its decoration in a simple way - all
I have to do is unscrew the base, remove the current one and replace it
with something else. And here is where things become interesting: my
tumbler is not shaped like a cylinder, but more like half a cone. Wikipedia
calls this figure a frustum, and when you flatten it you end up with a template like this:

So let's say I want to put my own drawing in this format. One could
naively make the design, cut it out in the shape of the template, and put
it in. If I did this, the entire design would look weird, because the
base is smaller than the top, and that messes perspective up in several
ways. Luckily, with math we can make it look nice.
Now, before we start, I'm going to give you the summary: if
you want to use your own image, make sure that its size is 21.26cm x 17.43cm
(or proportional to that) and then use ImageMagick to run the following command:
convert input.jpg -virtual-pixel White -distort Arc 16.56 output.jpg
So that's out of the way. But where did those numbers come from? Well,
let's dust off our rulers and take some measurements. I'm going to assume
that the template is a slice of a circle with radius (r), like Figure a)
shows. Figure b) shows the measures I could obtain accurately. Those are
the numbers I got but, of course, if you have a different model your
numbers might be different:

These are easy measures to take, but they don't tell me anything about
the two main measures I need: the perimeter of the template, and the angle
between the non-parallel sides. Why do I need this?
- Measuring the perimeter will allow me to reconstruct the aspect ratio
of the original picture. In case you don't know, "aspect ratio" is the relation
between the width and the height of an image. Using the wrong one would
mean that I'd have to either add black bars to the final design or crop
its sides, and I don't want to do that.
- The angle between the sides will let me deduce at which point both
of them would intersect, which I need to calculate (r). Given that we are
assuming the template is part of a circle, that would tell me where the
center of the circle is.
So let's get down to it. By taking our measures, we can build a
trapezoid like the one shown in figure c), along with some names for
each side (le=long edge, se=short edge, s=side). Some simple math lets me
deduce the lengths for all the relevant sides in figure d), but no angles
yet. To get that, we'll need some trigonometry.

Let's cut a triangular slice of our figure. We know the length of two
sides (which we use to obtain the third via Pythagoras's theorem),
and we also know one of the angles is 90 degrees, or (\frac{\pi}{2}) radians.
Using my favorite identity, the Law of sines, we can deduce the angle (\alpha) (see Figure e), which is almost
the one we wanted: looking at Figure f), it is clear that (2\alpha) is the
angle between the non-parallel sides.
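As a rough sketch of that step in Python: the right triangle has the side (s) as its hypotenuse and half the difference between the two edges as the leg opposite (\alpha). The measurements used below (le = 23.70, se = 18.68, s = 17.43, all in cm) are assumed values, picked to be consistent with the results quoted in this article, and not necessarily what my ruler said; the function name is also made up.

```python
import math

def half_angle(le, se, s):
    """Angle alpha (in radians) between a slanted side and the template's axis."""
    opposite = (le - se) / 2   # horizontal leg of the right triangle
    # Law of sines, with the 90-degree angle opposite the hypotenuse s:
    #   opposite / sin(alpha) = s / sin(90 degrees)
    return math.asin(opposite / s)

# Assumed measurements in cm (illustrative, see the note above).
alpha = half_angle(le=23.70, se=18.68, s=17.43)
theta_degrees = math.degrees(2 * alpha)
print(f"alpha = {math.degrees(alpha):.2f} degrees, theta = {theta_degrees:.2f} degrees")
```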

So now let's see if we can get the length of the largest curved side
of our tumbler. There are some weird ways of obtaining this value, but
I think this is the simplest one: we know already that (2 \pi r) gives
us the full circumference of a circle. But we don't want the full circumference,
just a small piece of it - more precisely, a piece with an angle
(\theta = 2\alpha).
Now that you've got the basic idea, this is what I'm doing next:
- Make a triangle with angle (\alpha) similar to the previous one, but
one that goes all the way to the center of the circle. Note that the sides
are now larger, but the angle remains unchanged.
- For this new triangle, its length (s) equals the length (r) (which we
don't know yet) and the length of its shorter side is (\frac{le}{2}).
- Having two angles and one side, I'll use my favorite identity again to
calculate the length of the unknown side (s) (which equals (r)).
- The end result is that (r = 82.29cm).
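The steps above boil down to one more application of the Law of sines, which could be sketched like this. The angle (\alpha = 8.28) degrees is half of the 16.56 used in the ImageMagick command; the long edge le = 23.7cm is an assumption on my part (it matches the print width mentioned later in this article), and the function name is invented.

```python
import math

def circle_radius(le, alpha_degrees):
    """Radius r of the circle whose slice forms the template."""
    alpha = math.radians(alpha_degrees)
    # Law of sines in the big right triangle:
    #   (le / 2) / sin(alpha) = r / sin(90 degrees)
    return (le / 2) / math.sin(alpha)

r = circle_radius(le=23.7, alpha_degrees=8.28)   # le is an assumed measurement
print(f"r = {r:.2f}cm")
```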
Cool! Now that we have (r), we can obtain the circumference of our
circle. If we obtain the circumference only for an angle (\theta), we get
the length of the long curved side, which is 23.78cm, and the length of
the short side, which is the same calculation but subtracting (s) from (r).
That tells us the shorter curved side is 18.75cm, and now I have all the
measures I need to properly draw and deform my picture. Imagine for a
second that our original picture is made of rubber, and we deform it
until it looks like the shape of our tumbler's label. Then the top of our
picture would be stretched, while the bottom would be compressed. So if
we want to know how wide our original picture was, we want to check
right in the middle of the picture, which is the only part that
would not be deformed. That length is 21.26cm (i.e., the average of
both sides), and now that we know our original picture was 21.26cm x 17.43cm,
we can divide and get a frankly terrible ratio of 1.219, which is the
aspect ratio we need for our pictures to fit just right.
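The arithmetic in this paragraph can be double-checked with a few lines of Python. This is only a sketch: it reuses the values quoted in the text ((r = 82.29cm) and (\theta = 16.56) degrees) and assumes that (s) is the 17.43cm height of the picture. The function name is made up for illustration.

```python
import math

def arc_length(radius_cm, angle_degrees):
    """Length of the piece of circumference covered by the given angle."""
    return 2 * math.pi * radius_cm * angle_degrees / 360

r = 82.29       # radius of the big circle, from the text
s = 17.43       # slanted side, assumed equal to the picture's height
theta = 16.56   # angle between the non-parallel sides, in degrees

long_side = arc_length(r, theta)
short_side = arc_length(r - s, theta)
middle = (long_side + short_side) / 2   # width of the undeformed picture
aspect_ratio = middle / s

print(f"long: {long_side:.2f}cm, short: {short_side:.2f}cm")
print(f"middle: {middle:.2f}cm, aspect ratio: {aspect_ratio:.2f}")
```

The results should agree with the numbers above up to rounding in the last digit.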
So that's that. We now know how to properly set up our original design,
and we have all the numbers to deform it properly, but how do we actually
deform it? Well, as I'm a GNU/Linux guy, I'm going to go ahead and suggest
you use ImageMagick. More specifically, the command I mentioned at the
beginning of the article - although now you know why your picture needs
to be a certain size and where the value of (\theta) comes from.
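If you'd rather script this step, here's a minimal Python sketch that assembles the same ImageMagick call. The function name and the file names are placeholders, and it assumes the convert executable is installed and on your PATH.

```python
import shlex

def build_distort_command(source, target, angle_degrees=16.56):
    """Return the ImageMagick arguments that bend an image into an arc."""
    return [
        "convert", source,
        "-virtual-pixel", "White",   # fill the uncovered corners with white
        "-distort", "Arc", str(angle_degrees),
        target,
    ]

command = build_distort_command("input.jpg", "output.jpg")
print(shlex.join(command))
# To actually run it: import subprocess; subprocess.run(command, check=True)
```

Building the command as a list (instead of one long string) avoids quoting trouble if your file names contain spaces.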
And just like that, we have our picture ready to be printed. Make sure
that your picture is printed exactly 23.7cm wide, grab some scissors, and
you now have a perspective-corrected, formally verified new tumbler skin.
Let me start with two pieces of unrelated information.
Random fact number one: I'm a fan of cable-free environments,
especially when it comes to my house. I have a computer screen, speakers,
a PS3 and a laptop all connected together, with no cables lying around.
This presented a challenge, though, because the speakers only have one
3.5mm input (as in, the same one your headphones use) and I wanted to
plug in two things at the same time. The PS3 has an RCA adapter, while a
male-male cable can be used for the computer (that I plug into the
headphones socket). So here I have three male connectors that I need to
connect together, ideally without any intervention afterwards (I don't
want to plug and unplug things all the time).
Random fact number two: there's a German company which I've seen in
the Mauerpark Flea Market selling something they call Pokket Mixer.
This is a mixer reduced to its simplest form: you plug in two
devices through their headphone sockets and get one single output. It costs €90
(okay, € 89,95), and looks fairly nice.
Now, you would think that these two pieces of information complement each
other perfectly, right? Get a mixer, plug both sound inputs, and problem
solved. But alas, that's not the case - I cannot justify spending that
much for something that, let's face it, is more a result of laziness
than anything else. So I looked for alternatives, and lucky me, I found
this simple stereo mixer that does the same thing, with fewer features
and for a lot less money. I built one of those for around € 10, and now
I'm happy.
Then, an idea crossed my mind: I could add a sound control to this
thing. It would still be under € 10 (let's say € 20 for the really fancy
options), and I could sell it for € 60. As I'd do it as a hobby and I
already have a job, I can sell it at a lot less than the other guys, and
still make some easy income on the way. Since I don't have to pay any bills
or worry about the sustainability of my project, my definition of
success is fairly reachable.
But then again, there's only so much market for pocket mixers. After
all, it's not the kind of thing you buy over and over. Being significantly
cheaper, my project is likely to be good enough for the casual listener,
so I could have a chance at capturing that market, but that niche is
bound to dry up eventually. And then again, when that happens I can just
sell my stock at a discount and go back to my regular job, but how about
the guys who actually are making a living out of it? Is it fair for me
to make their business harder just because I'd like to buy more comics
per week? I know the free market is all about competition and taking
every chance, but is it ethical for me to do that?
So, as you can see, I'm more worried about the consequences of my
success than the consequences of my project being a dud. It is clear to
me now that I'll never be rich, because in order to do so I'd have to
want money for the money itself, and I just can't do it. I don't want to
be the kind of person that leads a market, simply because that would
require me to crush the competition, and I can't bring myself to crush
someone's job just for the sake of money, not even hypothetically.
Not that I'm complaining. I just wanted to point it out.