multimodal translation

An empirical study on the effectiveness of images in Multimodal Neural Machine Translation

2017

Jean-Benoit Delbrouck Stéphane Dupont

In state-of-the-art Neural Machine Translation (NMT), an attention mechanism is used during decoding to enhance the translation. At every step, the decoder uses this mechanism to focus on different parts of the source sentence to gather the most useful information before outputting its target word. Recently, the effectiveness of the attention mechanism has also been explored for multimodal task...


CUNI System for the WMT17 Multimodal Translation Task

2017

Jindřich Helcl Jindřich Libovický

In this paper, we describe our submissions to the WMT17 Multimodal Translation Task. For Task 1 (multimodal translation), our best scoring system is a purely textual neural translation of the source image caption to the target language. The main feature of the system is the use of additional data that was acquired by selecting similar sentences from parallel corpora and by data synthesis with b...



Robust Understanding in Multimodal Interfaces

2009

Srinivas Bangalore Michael Johnston

Multimodal grammars provide an effective mechanism for quickly creating integration and understanding capabilities for interactive systems supporting simultaneous use of multiple input modalities. However, like other approaches based on hand-crafted grammars, multimodal grammars can be brittle with respect to unexpected, erroneous, or disfluent input. In this article, we show how the finite-sta...


Multimodal Neural Machine Translation With Weakly Labeled Images

Journal: IEEE Access, 2019


Subtitle Translation from the Perspective of Multimodal Discourse

Journal: International Journal of Education and Humanities, 2022

Film and television works are multimodal discourse composed of a variety of symbol systems such as text, sound, and image, so the audience's various senses can be mobilized at the same time when watching movies. Starting from the perspective of multimodal discourse analysis, this paper applies Delu Zhang’s theoretical framework of multimodal discourse analysis to analyze the subtitle translation of Harry Potter and the Philosopher's Stone from four aspects: culture, context, c...


Supervised Visual Attention for Multimodal Neural Machine Translation

Journal: Journal of Natural Language Processing, 2021


Supervised Visual Attention for Simultaneous Multimodal Machine Translation

Journal: Journal of Artificial Intelligence Research, 2022

Recently, there has been a surge in research on multimodal machine translation (MMT), where additional modalities such as images are used to improve the translation quality of textual systems. A particular use for such multimodal systems is the task of simultaneous machine translation, where visual context has been shown to complement the partial information provided by the source sentence, especially in the early phases of translation. In this paper, we propose the first Transf...


Word-Region Alignment-Guided Multimodal Neural Machine Translation

Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2022

We propose word-region alignment-guided multimodal neural machine translation (MNMT), a novel model for MNMT that links the semantic correlation between textual and visual modalities using word-region alignment (WRA). Existing studies on MNMT have mainly focused on the effect of integrating visual and textual modalities. However, they do not leverage the semantic relevance between the two modalities. We advance the semantic correlation between textual and visual modalities in MNMT by incorporating WRA as a bridge. This proposal has been impleme...


The NESPOLE! Multimodal Speech-to-Speech Translation System: User Based System Improvements

2003

Susanne Burger Erica Costantini Fabio Pianesi

This work discusses the results of two user studies aiming to evaluate the NESPOLE! speech-to-speech translation system, which provides for multilingual and multimodal communication in the tourism and in the medical domain, allowing users to interact through the Internet by sharing maps, web-pages and pen-based gestures. The purpose is to investigate the overall effectiveness of the combination...
