Martin's blog

How helping Latin American researchers cost me my GMail account

Posted on 2021-03-30 under GMail

Have you heard of the LatinX in AI social event? It is a social event organized by LXAI intended to bring together Latin American researchers in NLP and AI. I joined as a participant in their EMNLP 2020 edition, and I am now volunteering to run the 2021 version, which will take place soon in parallel with EACL 2021.

One of my tasks as part of the organizing committee is to send invitations to those who joined last year. And you can probably see where this is going: even though I have written permission from these 116 participants to contact them and even though I followed Google's best practices for sending emails, my 18-year-old GMail account was nonetheless blocked and it has remained so ever since.

I have now spent several days in Google support hell leaving no stone unturned and no link unclicked. If you have never tried to get support from Google, the following diagram illustrates all the maddening steps I have followed during the account recovery process with no luck so far:

Diagram showing several infinite loops when following support instructions

The "Number not accepted" box is particularly annoying: I have always refused to give Google my personal telephone number because there is no guarantee that they won't use it for tracking me like Facebook was caught doing, and you cannot enable two-factor authentication without providing one first (trust me, I tried). As a result, Google will not trust any number I give now - it is mildly funny to read that the telephone number of the Fortune Global 500 company where I work "has already been used too many times for verification" even though I had never used it before. Either whoever used my desk before me was a serious spammer, or Google is not being as honest as one would expect.

But you know what hurts the most? That all of this could have been avoided if I hadn't insisted on personalizing the emails. I hate emails addressed to "Dear sir or madam" and therefore went out of my way to write the script that would pull people's names and display it properly. If I had written a generic email instead and dumped 100+ addresses in the web interface I would probably still have my account. I know it isn't much, but that's all I could do to show people that I care about them receiving their invitation. No good deed goes unpunished.

Maybe in the future I will write about all my complaints. One particularly mean example is the emails I still get in my recovery account letting me know that someone has been trying to access my account but that I shouldn't worry because they didn't let them in. But for today, I want to leave you with two parting thoughts.

First: this story is not new, and if you have all your eggs in the Google basket it is only a matter of time before you lose something important with no recourse. Maybe they will remove your browser extension, ruin your startup, kill your game, terminate your Android app, delete your YouTube channel, or who knows what else. So be prepared. I can assure you that if I had not started using my own email domain years ago I would now be truly screwed with no way forward. If you are not willing to leave Google products for good, at the very least get a local copy of your data with Google Takeout and keep it safe.

And second: I still need that account to organize the event. So if you know someone who works for Google, please tell them to write me an email to get this sorted out. I would be slightly sad to lose the epic burn "Google closed my 18-year-old account forever for helping Latin American researchers", but I'll let it go if that means moving this event forward.

GPT-3 is blockchain

Posted on 2021-03-29 under nlp gpt3

I need to share with you an epiphany that occurred to me yesterday.

Have you heard of GPT-3? If not, I can tell you that it's a language model that has been showing up everywhere. Having been trained with a lot of data, it can generate text that people find impressive.

If you follow the hype, GPT-3 will revolutionize everything - people have been using it to generate plausible-looking creative fiction, pickup lines, SQL queries that are sometimes wrong, trivia answers, tweets, and so on.

But you know what no one has generated yet, as far as I know? Something useful. Or even better: something that people always wanted but current technology cannot provide.

People are excited about GPT-3 because it promises to "just work" - you give it the right prompt and you get the right answer. This would solve all of those pesky problems associated with NLP such as "these search terms make no sense", "I hate knowledge bases", "That question has multiple answers", or "I don't want to manually write all possible answers for my system". But this is not what GPT-3 can do, because GPT-3 will not bend to your so-called facts and therefore will not do what you want. As Robert Dale puts it when talking about GPT-2: "driven as it is by information that is ultimately about language use, rather than directly about the real world, it roams untethered to the truth". In other words, people are excited about GPT-3 because they think it solves a different problem than the one it actually does.

If you want a chatbot to tell a patient that the solution to their depression is talking to a professional instead of GPT-3's suggestion that they should just go ahead and kill themselves, you need a way to constrain the system's output. This means that you still need to write the code that interprets the patient's problems, the code that chooses the right solution to that problem, and the code that says exactly what you want, no more and no less. And while turning structured data into human-readable sentences is a valid possible use for GPT-3, the amount of work required to constrain its output to an acceptable error level is comparable to the effort required to write smart templates that guarantee you'll generate exactly what you want.
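
To make the "smart templates" idea a bit more concrete, here is a minimal sketch of what I mean. Everything in it is made up for illustration: the intent name, the wording of the response, and the render function are my own assumptions and not taken from any real system. The point is only that a template says exactly what you wrote, and fails loudly when data is missing instead of improvising.

```python
from string import Template

# Hypothetical, hand-reviewed responses keyed by a hypothetical intent name.
RESPONSES = {
    "appointment_reminder": Template(
        "Hello $name, your appointment is on $date at $time."
    ),
}


def render(intent: str, **fields: str) -> str:
    """Fill the reviewed template for this intent with structured data.

    Template.substitute raises KeyError when a placeholder has no value,
    so incomplete data produces an error rather than made-up text.
    """
    return RESPONSES[intent].substitute(**fields)


print(render("appointment_reminder", name="Ana", date="Monday", time="10:00"))
```

The trade-off is the obvious one: someone has to write and maintain every template, which is exactly the work people hope GPT-3 will spare them.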

And so, GPT-3 joins blockchain technology in being a solution searching for a problem. In fact, the parallels are kind of amazing: both technologies are hyped to the extreme, completely misunderstood by the general public, very expensive to run, and products based on them rarely make it out of the proof-of-concept stage.

I would like to leave you with two optimistic thoughts. First, I do think that it is only a matter of time before someone actually finds a good use for GPT-3. I predict it is going to be something marginal, with my best bet being something related to grammatical correctness. Abstractive summarization is also a good candidate, but my faith is lower because inserting unrelated facts is simultaneously what abstractive summarization tries to avoid and what GPT-3 does best.

And second, I want to let you know that there's a great business opportunity here. The blockchain craze reached the point where simply putting "blockchain" in your company name is enough to make your stock price rise by 289 percent. Therefore, if my prediction is correct then all you need to do is either name your own company "GPT-3" or invest in someone else doing it. Sure, their stock will probably tank once investors realize they invested for the wrong reasons, but by then you will have hopefully cashed out and moved on to something else.

Disclaimer: I am not an investment banker, this post does not constitute financial advice, I don't know why anyone would listen to me, and you shouldn't follow advice you find on random blog posts anyway.

Uber in Argentina

Posted on 2021-03-08 under uber

Uber arrived in Argentina working in a grey legal area, as usual. Word of mouth is that Uber refused to be classified as a transport company and insisted on being classified as a digital services company instead. These legal problems led to them being unable to accept Argentinean credit cards for payment. But Uber kept offering the service at a loss, allowing local drivers to accept cash and adding the debt to their driver profile. According to insiders, the drivers were expected to keep Uber's 25% cut aside and transfer it once in a while themselves. Although Uber eventually managed to get access to credit cards, they kept the cash option available.

The collateral damage of this policy is extensive.

Some drivers decided not to settle their debts with Uber, keeping 100% of the proceeds instead. If and when Uber closes the driver's account they get a new SIM card, send fake documentation, and start with a fresh account that lasts between a week and a month. These drivers accept only cash: they have no bank account data to provide because their data is fake, and they know that it's only a matter of time before their account gets banned anyway.

Because these drivers accept that their account is temporary, none of Uber's typical incentives work. When a passenger pays with credit card the money goes straight to Uber and the driver doesn't see a dime -- it all goes away to settle a debt they had no intention of paying. Therefore, drivers will often contact potential passengers asking how they intend to pay. If the passenger says "credit card", the driver either cancels the trip or straight up ignores the passenger, forcing them to cancel. You can take the time and report the driver, but few people do it and all it does is cause a mild inconvenience to the driver.

And while this is inconvenient for the passenger, it also opens the door to the really shady practices: once you have no way of verifying that the driver is who they claim to be, you are one step away from being robbed by a fake driver (in Spanish).

In short, Uber Argentina has become yet another dysfunctional taxi service. And rival local apps are catching up: not only do they have their paperwork up to date, but they have also incorporated apps into their daily routine. It would be no surprise if Uber were still operating in Argentina just for PR purposes. With a 43% drop in revenue for Latin America last year, and with Uber pinky swearing that they will achieve profitability any time now, the only reason I can see for Uber operating in Argentina is to keep the illusion of "one app for the entire world".

And sure, that's a fair point. But I have no reason to believe that these problems are exclusive to Argentina, and probably neither should you. I wrote this story because I found it interesting and I picked Argentina because that's what I know about, but if you are one of those tourists who blindly gets into an Uber believing that their drivers are more honest than taxi drivers you may be in for a rude awakening. Apps are not well known for solving deep, systemic social problems after all.

Sources

The information for this post came from these threads in Reddit's /r/argentina: Thread 1, Thread 2, Thread 3.

Why won't the music industry take my money?

Posted on 2021-03-01 under music rant

I have tried this week to buy the soundtrack for The Greatest Showman for a gift and let me tell you, it's really hard.

I started naively thinking that, since the album is available on Amazon as MP3, I could just click "Buy" and be done with it. But Amazon, as it turns out, doesn't want my money. Sure, they say they will sell me the album. But once I actually try they reject my credit and debit cards with a mysterious error that, after some digging, may be related to Amazon not having the rights to that album in Germany. I say "may" because Amazon doesn't give me any usable information - all they show is this error:

We were unable to process your purchase with your current payment information. Please enter a valid payment method and an address which are both local.

Seeing that my credit card is valid, my address is local, and the buy page doesn't mention any kind of restrictions, that's my first dead end.

My second stop is Warner Music, who owns the soundtrack. This is also a waste of time: they will gladly sell me physical copies in vinyl, but digital? No luck there.

Next: Apple, the first big company to offer DRM-free music downloads and self-professed champions of user experience. We were off to a rocky start: you can only buy music using iTunes, which is not available in Linux and forces me to boot my Windows 10 PC. One hour later, courtesy of Windows 10 deciding it's a good time for an update, I am faced with this screen:

iTunes screen showing gibberish

If you think this well-known and yet unresolved issue stopped me, you are mistaken - I have signed way too many contracts in languages I don't fully grasp to be afraid of what is clearly a credit card details form. Luckily, after giving my password like 6 times, converting m4a files to mp3, and almost two hours later, I am finally the proud temporary owner of this soundtrack.

So let's talk now about Spotify. I reluctantly started using it again because it's one of the few services with an offline mode for Android phones that doesn't require giving my phone number. Seeing as I still object to their collection of private data, I created a fake profile that I regularly renew with gift cards. But do you know what happens when your subscription is about to run out? The answer is "nothing": you get zero notifications, no e-mail, nothing.

What happens when my subscription runs out? First: all of my offline music is deleted, which is the one feature I'm paying for. Since I'm often in offline mode for work, that means no music for me for the rest of the day. And second: just like there is no notification about my balance running out, there is also no option in the app to give a new gift card code. I can easily give my credit card and subscribe forever, but gift cards require extra steps.

What these two infuriating stories have in common is that they are examples of the music industry working both badly and as intended. Amazon, Spotify, and Apple (up to a point) will gladly give me access to the music I'm trying to pay for, but only if I agree to set recurring payments to their walled gardens and access to my private data. Owning my music and keeping my privacy, however, is really hard.

Which brings me to my final point. There is a service with an extensive, high-quality music catalog that's easy to use, works on every platform, lets you keep your privacy, and will take your money but only if you really want to. It's called piracy. And even though it's been almost 10 years since Gabe Newell publicly pointed out how to effectively get rid of piracy for good, we are somehow still living in a world where buying a single music CD takes two hours, Windows, fluency in fictitious languages, and a computer science degree.

At least you can now order your vinyl records via e-mail. Take that, 1980s!

Which apostrophe should I use?

Posted on 2021-01-16 under language

As someone who regularly switches between keyboard layouts, I have a problem: I have at least three keys that can be used as an apostrophe, but I don't know which one is the correct one. Compare:

Backtick: Hamlet`s father
ASCII Apostrophe: Hamlet's father
Acute accent: Hamlet́s father
Single closing quote: Hamlet’s father

Do you know which one is the right one? If not, this small guide is for you. But you'll have to endure a lot of theory first.

The problem here is that using your keyboard requires mixing three different concepts: which key you pressed, which character it represents, and how it is visually represented. To explain that in clearer terms, let's take the backtick as an example.

The key itself can move around. The backtick key is located below the tilde (~) on a US keyboard, to the left of the backspace key on a German keyboard, and under the caret (^) on a Spanish keyboard.

Internally, this key is called "Grave accent" in ASCII but programmers know it as "backquote" or "backtick". When you press it you send a code to your computer that, if you were using ASCII, would be reflected as the 0x60 hexadecimal value. And here we make another distinction: if your computer is configured to do so, this code can be interpreted as a dead key that only exists to modify the next character. If you want to type the è in the French word très (very) you use the combination <Grave accent key> + <e key>. If your computer is not configured in this way then you simply get a backtick.
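
If you want to check the 0x60 claim yourself, a few lines of Python are enough. The second half is only an analogy and an assumption on my part: real dead keys are handled by your keyboard layout or input method, not by Unicode normalization. Still, composing a letter with the combining grave accent (U+0300) gives the same end result as the dead key would. The compose_grave helper is a name I made up for this sketch.

```python
import unicodedata

# The standalone backtick is the ASCII character with code 0x60.
BACKTICK = "`"
print(hex(ord(BACKTICK)))


def compose_grave(letter: str) -> str:
    """Attach a combining grave accent (U+0300) to a letter and compose it.

    NFC normalization merges the letter and the combining mark into a
    single precomposed character when Unicode defines one.
    """
    return unicodedata.normalize("NFC", letter + "\u0300")


word = "tr" + compose_grave("e") + "s"
print(word)
```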

But here we have a very subtle difference between a backtick and a Grave accent. The Grave accent is a modifier, changing the sound of the letter underneath. Therefore, it cannot exist by itself. If you see the character ` alone then it is not a Grave accent, it's a backtick. They both look the same, but they have different meanings.

Your program is another factor: some programs may replace the character you are using because it's very likely that you are using it wrong. If I type the double quote character (") in Microsoft Word it may be replaced by an Opening double quote (“), a German opening double quote („), an English closing double quote (”), or a German closing double quote (“, which is the same character as the Opening double quote in English).
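
To give an idea of what such a replacement could look like, here is a deliberately naive sketch. I have no idea how Word actually does it, so treat the alternating open/close logic and the function name as my own assumptions. It simply swaps every straight double quote for an opening or closing one in turn.

```python
def smarten_double_quotes(text: str, opening: str = "\u201c", closing: str = "\u201d") -> str:
    """Replace straight double quotes with alternating opening/closing quotes.

    The defaults are the English pair. Real word processors look at the
    surrounding characters too; this sketch only alternates.
    """
    result = []
    open_next = True
    for char in text:
        if char == '"':
            result.append(opening if open_next else closing)
            open_next = not open_next
        else:
            result.append(char)
    return "".join(result)


english = smarten_double_quotes('He said "hi"')
# German style: low opening quote (U+201E), closing quote U+201C.
german = smarten_double_quotes('Er sagte "hallo"', opening="\u201e", closing="\u201c")
print(english)
print(german)
```

Note how the German closing quote passed in the second call is the very same character that the English version uses for opening.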

And then we have the issue of fonts. Some fonts may represent two different characters in the same way, or straight up ignore it. If you have been staring at some of the quotes I mentioned above and see no difference, well, maybe that's why.

So back to our original question: what is each key good for?

The backtick quote (`) has no meaning in typography at all. The only reason a non-programmer would use it would be to type a Grave accent.
The apostrophe (') is the one I need for daily use. In truth, Unicode defines a different character as a "true" apostrophe (we'll get to that soon), but not all keyboards can generate it. Therefore, and just like the double quote example above, it is okay to use it.
The Acute accent (́) should only be used as a modifier. You can see that it insists on modifying whichever character is next to it, because it has no meaning by itself.
The Single closing quote (’) can be generated under certain combinations of keyboard and local configuration. If your keyboard supports it, this is the preferred key to use for apostrophes.
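
If your font makes these four look identical, you can ask Python which is which. This is a small sketch using the standard unicodedata module. I am assuming that the acute accent above is the combining form (U+0301), which is how it shows up in this text. The describe function is just a helper name I picked.

```python
import unicodedata

# The four look-alikes from the comparison at the top of this post.
CANDIDATES = ["\u0060", "\u0027", "\u0301", "\u2019"]


def describe(char: str) -> str:
    """Return the code point, official Unicode name, and modifier status."""
    # A non-zero combining class means the character attaches to its neighbor.
    kind = "modifier" if unicodedata.combining(char) else "standalone"
    return f"U+{ord(char):04X} {unicodedata.name(char)} ({kind})"


for char in CANDIDATES:
    print(describe(char))
```

Pasting a suspicious character from your own text into describe is a quick way to find out which of the four you actually typed.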

There are plenty of other versions of the quote character that can be used. This list of commonly confused characters has been of great help, while this technical description of the apostrophe character also provides more info than you thought you needed. Sure, it may sound like a lot of work. But you'll thank me next time you need to write a phrase like “In a Déjà vu, René corrected ‘It’s x′, not x″’” knowing full well that you are right.
