PhD Research Seminar: Image-captioning methods, retrieval-based dialogue systems
First talk: Image-Captioning Methods: Architecture, Difficulties, and Future Trends
Speaker: Ismail Kayali, third-year PhD student, Faculty of Computer Science
Machine vision systems, which combine image recognition with natural language processing, can help people who are visually impaired get an accurate description of an image. They are also useful for people who need information about an image but can’t look at it, for example while driving.
Image-captioning methods build on recognizing the objects in an image. We are working on systems that can automatically describe a series of images the way a human would, by focusing not just on the items in the picture but also on what is happening and how it might make a person feel. Captioning means taking concrete objects and combining them into a description, as one does when describing a person’s surroundings, reading a text, answering questions, or even identifying emotions on people’s faces.
Second talk: Retrieval-Based Dialogue Systems Based on Knowledge Model Implementation
Speaker: Elizaveta Goncharova, third-year PhD student, Faculty of Computer Science
In recent years, a large number of dialogue systems for product and service exploration have been proposed. They are widely used as assistants that help users formulate their requirements for the products they want to purchase. Such systems should provide an interactive search, which cannot be achieved without a knowledge model behind them. In our work, we are developing a concept-based knowledge model that encapsulates objects and their common descriptions. Leveraging the concept-based knowledge model significantly reduces information overload and improves the user’s search experience. In the first part of the talk, we will give an overview of the architecture of the proposed information retrieval dialogue system.
The second part of the talk will be dedicated to the module of the introduced model that is responsible for textual data processing. One of the richest sources of textual information about a product encapsulated in the knowledge model is user reviews. To reduce the number of source texts, we need to filter them and derive features only from the useful ones. The usefulness of a review is assessed with respect to its argumentative power. We analyze different techniques for argument mining and examine the influence of discourse features on the performance of discriminative models. We also examine transformer-based models, such as BERT, which have shown state-of-the-art results on different NLP tasks, and propose a data augmentation method that enables them to capture additional information about sub-sentential discourse units in the text.
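As a rough illustration of the review-filtering idea, the sketch below scores each review by the share of its tokens that are discourse markers and keeps only reviews above a threshold. This is not the speaker's method: the marker list, the scoring rule, the threshold, and the names `DISCOURSE_MARKERS`, `argumentative_score`, and `filter_reviews` are all assumptions made for the example, and a real argument-mining model would use far richer discourse features.

```python
import re

# Hypothetical, illustrative list of discourse markers (an assumption,
# not taken from the talk).
DISCOURSE_MARKERS = ("because", "since", "therefore", "however",
                     "although", "but", "so")


def argumentative_score(review: str) -> float:
    """Return the fraction of tokens in the review that are discourse markers."""
    tokens = re.findall(r"[a-z']+", review.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in DISCOURSE_MARKERS)
    return hits / len(tokens)


def filter_reviews(reviews, threshold=0.05):
    """Keep only reviews whose score reaches the (arbitrary) threshold."""
    return [r for r in reviews if argumentative_score(r) >= threshold]


if __name__ == "__main__":
    sample = [
        "Great phone.",
        "The battery is weak because the screen is large, but charging is fast.",
    ]
    for kept in filter_reviews(sample):
        print(kept)
```

In this toy setup the first sample review contains no markers and is dropped, while the second contains "because" and "but" and is kept. A crude proxy like this only hints at how discourse cues can signal argumentative content.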