Natural Language Processing | Doctoral Program - Information Engineering and Computer Science

Natural Language Processing

Flexible Dialogue Models for Conversational Agents and ChatBot

Vevake

Publications | vevake.balaraman [at] unitn.it (Email)

Conversational agents are designed to interact with users in multiple domains and on several topics using natural language. In practice, these applications usually work in a strictly limited domain with a clear, well-defined dialogue structure and have little ability to adapt to the contextual and social situation. We focus on developing techniques to improve the portability of dialogue models across both languages and application domains. The objective is to allow greater dialogue flexibility and re-planning capabilities when the conversational agent faces unknown or unexpected situations.

Deep Learning for Machine Translation

Ruchit Rajeshkumar Agrawal

Publications | ruchit.agrawal [at] unitn.it (Email)

Neural Machine Translation has emerged as the de facto standard for Machine Translation. It generates end-to-end translation, thereby removing the need for complex feature engineering and reducing human effort significantly. This project delves deeper into enhancing the efficiency of Neural Machine Translation by using multi-source training techniques.

Human Behaviour Understanding Using Mobile Phone and Social Media data

Gianni Barlacchi

Publications | gianni.barlacchi [at] unitn.it (Email) | Website

Human behavior understanding is a challenging task that aims at creating automatic models to capture how interactions among people affect the happiness, health and economic well-being of our society. The global growth of mobile phone usage has reinforced the need to study the psychological and social implications of this technology. This research investigates novel machine learning models that jointly use call detail records (CDRs) and social media data (mainly textual information) for human behavior understanding.

Encoding Structural Information in Deep Neural Networks

Daniele Bonadiman

Publications | d.bonadiman [at] unitn.it (Email)

Deep Neural Networks (NNs) have shown great promise for solving many different natural language processing tasks. Previous state-of-the-art NNs process text sequentially. This is a critical limitation, since sentence semantics is structured compositionally rather than sequentially and should therefore be represented using tree structures. In this research, I propose to encode trees into neural networks in three steps: (i) defining an improved Neural Tree Encoder (NTE); (ii) pre-training a neural network composed of two NTEs sharing parameters with approximated Tree Kernel functions; and (iii) continuing to train the model on the target tasks.

Open domain dialogue systems

Alessandra Cervone

Publications | alessandra.cervone [at] unitn.it (Email) | Website

My research focuses on open-domain spoken dialog systems. During the first year of my PhD, I was the team leader of Roving Mind, the University of Trento team selected to compete in the Alexa Prize, a conversational artificial intelligence challenge organised by Amazon.

Automatic Post-Editing for Machine Translation

Rajen Chatterjee

Publications | rajen.chatterjee [at] unitn.it (Email)

Automatic post-editing aims to correct the errors in machine-translated text. This automatic error correction mechanism can speed up the work of translators and will eventually increase the productivity of the translation industry.

Going deeper in neural machine translation

Mattia Antonino Di Gangi

Publications | mattia.digangi [at] unitn.it (Email)

The field of machine translation has seen spectacular improvements since the adoption of deep learning, which opened the way to the approach known as neural machine translation. The most surprising aspect is that the improvements are mostly due to the data processing capabilities of deep learning, while linguistic aspects make a relatively small contribution. This research focuses on the engineering side of machine translation. Data processing can be further improved to produce better systems with no additional data, leading to systems that are more useful for many practical tasks.

Online adaptive neural machine translation: from single- to multi-domain scenarios

Mohammad Amin Farajian

Publications | mohammad.farajian [at] unitn.it (Email)

Current neural machine translation (NMT) systems are generally sensitive to domain shifts and suffer a drop in performance when exposed to new domains. They are usually trained on specific domains by carefully selecting the training sets and applying proper domain adaptation methods. However, in real-world applications, maintaining several domain-specific systems is practically infeasible because usually: i) the target domain is not known in advance; ii) the application domains are diverse; iii) there is a limited amount of in-domain training data. In my PhD, I explore effective solutions for developing multi-domain NMT systems that perform equally well in all the application domains.

Neural Machine Translation in a Mixed-Language Ecosystem

Surafel Melaku Lakew

Publications | surafelmelaku.lakew [at] unitn.it (Email) | Website

Recently, NMT has been extended to multilingual settings, in which a single translation model is able to translate between multiple languages. My research focuses on finding new ways to enable and improve the translation of low-resource languages in a multilingual setting.

Semantic Parsing in Task Oriented Conversational Agents

Samuel Louvan

Publications | samuel.louvan [at] unitn.it (Email)

Semantic parsing in goal-oriented dialogue systems generally aims to transform a natural language utterance into a semantic frame. It comprises three subtasks: domain identification, intent classification, and slot filling. The conventional approach is to build a separate model for each subtask. Instead of building multiple models, we investigate how to train a joint model that integrates these subtasks and uses less labeled data to cope with new domains.
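As an illustration of what a semantic frame might look like, the minimal Python sketch below groups the outputs of the three subtasks into one structure. The class name `SemanticFrame`, the helper `slots_from_bio`, the BIO tagging scheme, and the example utterance and labels are all assumptions made for illustration; they are not taken from the project itself, and no model is trained here.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SemanticFrame:
    """Hypothetical container for the outputs of the three subtasks."""
    domain: str                                          # domain identification
    intent: str                                          # intent classification
    slots: Dict[str, str] = field(default_factory=dict)  # slot filling


def slots_from_bio(tokens: List[str], tags: List[str]) -> Dict[str, str]:
    """Collect slot values from BIO tags (an assumed, commonly used encoding)."""
    slots: Dict[str, str] = {}
    current = None  # name of the slot currently being extended, if any
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new slot value begins at this token.
            current = tag[2:]
            slots[current] = token
        elif tag.startswith("I-") and current == tag[2:]:
            # The token continues the slot value that is currently open.
            slots[current] += " " + token
        else:
            # "O" tag or an inconsistent "I-" tag: close any open slot.
            current = None
    return slots


# Made-up utterance with made-up labels, as a joint model might predict them.
tokens = "book a flight from Trento to Rome".split()
tags = ["O", "O", "O", "O", "B-origin", "O", "B-destination"]
frame = SemanticFrame(
    domain="travel",
    intent="book_flight",
    slots=slots_from_bio(tokens, tags),
)
```

A joint model would predict the domain, the intent, and the tag sequence together, rather than relying on three separately trained models.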

Automatic Analysis of Agreement and Disagreement in the Political Domain

Stefano Menini

Publications | stefano.menini [at] unitn.it (Email) | Website

To deal with the large number of political documents available, we need to integrate traditional humanistic approaches with computational ones. Political documents present a multitude of interconnected points of view and opinions. We focus on the automatic evaluation of ideological positions, detecting divergences and similarities between authors.

Social Annotation and User Profiling

Yaroslav Nechaev

Publications | yaroslav.nechaev [at] unitn.it (Email) | Website

Social Media will be the cornerstone of any future knowledge-based system. There is therefore a need to efficiently gather and process Social Media data and to learn how to use user-generated content to solve a wide variety of problems. This is what I define as Social Annotation: the process of enriching typical Computer Science problems (for example, Entity Linking, User Profiling, and Event Detection) with knowledge from Social Media. In my research, I introduce techniques that greatly simplify the processing of Social Media data. Specifically, I work on efficient user representations and novel social media-based evaluation approaches.

Supervised similarity in semantic tree kernels

Massimo Nicosia

Publications | massimo.nicosia [at] unitn.it (Email)

Semantic tree kernels compute the similarity between the structural and semantic representation of two pieces of text. Matches between words can be established by the similarity of their word embeddings. Our research aims at producing word representations that will be more effective in the kernel computation, by including contextual information through explicit modeling of the context around words, and adopting supervision in the word encoding mechanism.
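The idea of matching words through the similarity of their embeddings can be sketched as follows. This is only a minimal illustration of a soft lexical match, not the tree kernel itself: the three-dimensional vectors in `EMB` are made up, and the function name `soft_match` and the `threshold` parameter are assumptions introduced for this example.

```python
import math
from typing import Dict, List

# Toy, made-up embeddings; real word embeddings have hundreds of dimensions.
EMB: Dict[str, List[float]] = {
    "car": [0.9, 0.1, 0.0],
    "automobile": [0.8, 0.2, 0.1],
    "banana": [0.0, 0.1, 0.9],
}


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def soft_match(w1: str, w2: str, threshold: float = 0.7) -> float:
    """Score a word pair: 1.0 for identical words, otherwise the cosine
    similarity of their embeddings when it reaches the (assumed) threshold,
    and 0.0 when it does not or when a word has no embedding."""
    if w1 == w2:
        return 1.0
    if w1 not in EMB or w2 not in EMB:
        return 0.0
    sim = cosine(EMB[w1], EMB[w2])
    return sim if sim >= threshold else 0.0
```

In a semantic tree kernel, a score of this kind would replace the exact string match between leaf words, so that "car" and "automobile" contribute to the similarity of two trees while "car" and "banana" do not.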

Deep Knowledge Extraction From Text

Giulio Petrucci

Publications | giulio.petrucci [at] unitn.it (Email) | Website

Human knowledge is often available in large, unstructured textual sources. We tried to build an automatic system capable of scanning large document collections, distilling the valuable terminological knowledge, and translating it into a formal representation, so that it can be used by machines as well as by humans.

Large-scale biomedical literature summarization

Alan Ramponi

Publications | alan.ramponi [at] unitn.it (Email)

The growing body of scientific articles, clinical trials and patents in the biomedical domain has recently created a surge of interest in effective computational approaches that may help scientists and companies keep pace with advances in their fields. However, most current methods rely on trivial co-occurrence of terms or on weak machine learning models trained on news data. This research aims at producing novel NLP tools for biomedical entity recognition, event extraction and multi-document fact summarization, allowing a wide range of biomedical applications to be reliably accomplished (e.g., clinical meta-analyses, biomedical pathway discovery).

Event Detection and Classification for the Digital Humanities

Rachele Sprugnoli

Publications | rachele.sprugnoli [at] unitn.it (Email)

Our work aims at shedding light on the complex concept of events by adopting an interdisciplinary perspective. More specifically, we carry out theoretical and practical investigations of event detection and classification in historical texts, developing and releasing new annotation guidelines, new resources and new models for automatic annotation.

Neural Machine Translation

Amirhossein Tebbifakhr

Publications | a.tebbifakhr [at] unitn.it (Email)

Amirhossein Tebbifakhr is currently working on machine translation as part of his PhD research in the hlt-mt group at the FBK research institute and the University of Trento, Italy. He completed his BSc at Sharif University of Technology and his MSc at the University of Tehran, Iran. His research interests are in natural language processing and machine learning, and in particular in using deep learning to address the machine translation problem.

Exploring Sensorial Association of Words for Computational Linguistics Applications

Serra Sinem Tekiroglu

Publications | serrasinem.tekiroglu [at] unitn.it (Email)

Language is the main communication device for representing the environment and sharing an understanding of the world that we perceive through our sensory organs. Each language may therefore contain a large number of sensorial elements that express perceptions in both literal and figurative (creative) usage. To tackle the semantics of figurative language, we propose to use the sensorial affinity of words as a feature for metaphor identification. Additionally, we analyze the transition from perceptual to conceptual knowledge and conduct a creativity detection task on a multimodal dataset that contains both linguistic and visual dimensions of a given concept.

Beyond Factoid Question Answering

Text Snippet Interpretation in Social Media Communities

Antonio Uva

Publications | antonio.uva [at] unitn.it (Email)

Social Media applications, e.g., forums and social networks, allow users to pose questions about a given topic to a community of experts and/or users. Community Question Answering (cQA) is a branch of QA that aims at automatically answering user questions by (i) first looking at the questions most similar to the input question and (ii) then selecting the best answer for those questions. This way, users can easily retrieve answers to questions posted on Social Media from the community. In my research work, I explore methods for building automatic cQA systems. The resulting system can be used to automatically answer questions asked by customers about the products or services sold by a company.
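The two cQA steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the system developed in this research: question similarity is approximated with bag-of-words cosine similarity, the "best" answer is simply the one with the most community votes, and the archive contents and the names `ARCHIVE` and `answer` are made up.

```python
import math
from collections import Counter

# Hypothetical archive of community threads: a question and (answer, votes) pairs.
ARCHIVE = [
    {"question": "how do i reset my router password",
     "answers": [("Hold the reset button for ten seconds.", 12),
                 ("Buy a new router.", 1)]},
    {"question": "what is the best pizza in trento",
     "answers": [("Try the place near the station.", 5)]},
]


def bow(text):
    """Bag-of-words representation: lowercase token counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two bags of words."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def answer(query, archive=ARCHIVE, top_k=1):
    """(i) Rank archived questions by similarity to the query, then
    (ii) return the highest-voted answer among the top_k threads."""
    q = bow(query)
    ranked = sorted(archive,
                    key=lambda thread: cosine(q, bow(thread["question"])),
                    reverse=True)
    candidates = [a for thread in ranked[:top_k] for a in thread["answers"]]
    if not candidates:
        return None  # nothing to answer with
    return max(candidates, key=lambda pair: pair[1])[0]
```

Calling `answer("reset router password")` on this toy archive retrieves the router thread and returns its most-voted answer. A real system would replace both the similarity function and the vote-based selection with learned models.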