If you only have a passing interest in what I do, then these articles are probably for you. Each one is a very simple introduction to one of my research topics, and none of them requires special knowledge. The first two are the basis for the remaining articles, so you might want to start there.
- How to teach computers: an introduction to training and testing computers.
- What is the GIVE Challenge?: explains how we collect our data, where it comes from, and what it is good for.
- The Tapiz instruction-giving system: explains a system in which we give instructions without knowing what those instructions actually mean. It works better than expected. This was my first serious research project.
- The Semantic and Observational models: an overview of my current research, in which we try to predict people's reactions based on what they appear to be doing.
- Visual salience and eye-tracking (in progress): a description of our two main methods for guessing what a user is looking at. I did not create either, but they are worth mentioning.
- Corrective feedback (in progress): a description of how to correct a user once we know they have made a mistake.
If you are interested in the finer details, then you should definitely check the Published papers section, as those go much deeper. Or contact me directly. I don't bite.
Generating Contrastive Referring Expressions
- Authors: Martín Villalba, Christoph Teichmann and Alexander Koller
- Presented at: The 55th Annual Meeting of the Association for Computational Linguistics (ACL) (August 1, 2017)
The referring expressions (REs) produced by a natural language generation (NLG) system can be misunderstood by the hearer, even when they are semantically correct. In an interactive setting, the NLG system can try to recognize such misunderstandings and correct them. We present an algorithm for generating corrective REs that use contrastive focus ("no, the BLUE button") to emphasize the information the hearer most likely misunderstood. We show empirically that these contrastive REs are preferred over REs without contrast marking.
The Impact of Listener Gaze on Predicting Reference Resolution
- Authors: Nikolina Koleva, Martín Villalba, Maria Staudte and Alexander Koller
- Presented at: The 53rd Annual Meeting of the Association for Computational Linguistics (ACL) (July 27, 2015)
We investigate the impact of listener's gaze on predicting reference resolution in situated interactions. We extend an existing model that predicts to which entity in the environment listeners will resolve a referring expression (RE). Our model makes use of features that capture which objects were looked at and for how long, reflecting listeners' visual behavior. We improve a probabilistic model that considers a basic set of features for monitoring listeners' movements in a virtual environment. Particularly in complex referential scenes, where more objects next to the target are possible referents, gaze turns out to be beneficial and helps to decipher listeners' intentions. We evaluate performance at several prediction times before the listener performs an action, obtaining a highly significant accuracy gain.
Predicting the resolution of referring expressions from user behavior
- Authors: Nikos Engonopoulos, Martín Villalba, Ivan Titov and Alexander Koller
- Presented at: Conference on Empirical Methods in Natural Language Processing (EMNLP) (October 19, 2013)
We present a statistical model for predicting how the user of an interactive, situated NLP system resolved a referring expression. The model makes an initial prediction based on the meaning of the utterance, and revises it continuously based on the user's behavior. The combined model outperforms its components in predicting reference resolution and when to give feedback.
Interpreting Natural Language Instructions Using Language, Vision, and Behavior
- Authors: Luciana Benotti, Tessa Lau and Martín Villalba
- Published in: ACM Transactions on Interactive Intelligent Systems (TiiS) - Special Issue on Multiple Modalities in Interactive Systems and Robots. Volume 4 Issue 3 (October 2014)
We define the problem of automatic instruction interpretation as follows. Given a natural language instruction, can we automatically predict what an instruction follower, such as a robot, should do in the environment to follow that instruction? Previous approaches to automatic instruction interpretation have required either extensive domain-dependent rule writing or extensive manually annotated corpora. This article presents a novel approach that leverages a large amount of unannotated, easy-to-collect data from humans interacting in a game-like environment. (...) Our empirical analysis shows that machine learning classifiers achieve 77% accuracy on this task on available English corpora and 74% on similar German corpora. Finally, the inclusion of human feedback in the interpretation process is shown to boost performance to 92% for the English corpus and 90% for the German corpus.
Corpus-based Interpretation of Instructions in Virtual Environments
- Authors: Luciana Benotti, Tessa Lau, Julián Cerruti and Martín Villalba
- Presented at: The 50th Annual Meeting of the Association for Computational Linguistics (ACL) (April 23, 2012)
Previous approaches to instruction interpretation have required either extensive domain adaptation or manually annotated corpora. This paper presents a novel approach to instruction interpretation that leverages a large amount of unannotated, easy-to-collect data from humans interacting with a virtual world. We compare several algorithms for automatically segmenting and discretizing this data into (utterance, reaction) pairs and training a classifier to predict reactions given the next utterance. Our empirical analysis shows that the best algorithm achieves 70% accuracy on this task, with no manual annotation required.
Inference of Strategic Points in Virtual Worlds (Spanish)
- Authors: Luciana Benotti and Martín Villalba
- Presented at: Argentinean workshop on videogames (WAVI) (October 21, 2011)
Strategic points are a specific kind of waypoint inside a virtual or physical world that is vital to the successful completion of a task. These points are usually placed by hand by the world designer, often leading to unnatural movements and requiring the designer to redo the work every time the world changes.
In order to infer the location of strategic points, we studied the movements of human players, looking for common behavior patterns; by analysing their movements and performance on certain tasks, we were able to infer both relevant strategic points and movement patterns.
Aspect-Oriented web requirements engineering with model transformations (Spanish)
- Authors: Juan Durán and Martín Villalba
- Presented at: Latin-American Conference on Informatics (CLEI) (October 20, 2010)
Analysts usually describe requirements using notations full of technical concepts that clients don't understand. Expressing requirements in a non-technical notation (NLC) allows clients to understand and validate the requirements analysis.
While there is previous work in this area, most of it is directed towards the functional requirements of a system, while the area of non-functional requirements has been neglected. Given this situation, we propose a graphical notation (readable by non-technical users and oriented towards information systems) that allows analysts to express how the non-functional aspects of a system affect, on a global scale, its functional aspects.
As the whole process is based on models, we also present an implementation of a model transformation in ATL that transforms two well-known requirement models into aspect-oriented requirements models.
Besides the papers, I've also presented posters and/or given talks about my research. You can find a selection here.
Presentation for the Workshop on Computational Pragmatics from the 38th Annual Conference of the DGfS
The 38th Annual Conference of the German Linguistics Society (DGfS) took place in Konstanz in February 2016. I presented these slides about my group's work at the Computational Pragmatics workshop.
Presentations for the SFB632 group meetings
The SFB632 was a research project that funded our work during my first three years in Potsdam. These are the slides and poster I presented during that time.
Young Researchers' poster
The Young Researchers' Roundtable on Spoken Dialogue Systems (YRRSDS) is an event that takes place every year, usually co-located with SIGDial. This is the poster I presented the first time I attended.
Tapiz is the name of the system I worked on while I was still in Argentina. Here you can find the slides I presented both to IBM (who was paying for this research at the time) and to the University of Potsdam (who is paying me now).