FAUST - Feedback Analysis for User adaptive Statistical Translation

FP7-ICT-2009-4

FAUST News


April 2012 EAMT 2012 Best Paper Award!

  • Can Automatic Post-Editing Make MT More Meaningful?
    Kristen Parton, Nizar Habash, Kathleen McKeown, Gonzalo Iglesias, Adrià de Gispert


April 2012 Papers to appear at EAMT 2012

  • Can Automatic Post-Editing Make MT More Meaningful?
    Kristen Parton, Nizar Habash, Kathleen McKeown, Gonzalo Iglesias, Adrià de Gispert


April 2012 Papers to appear at LREC 2012

  • A Corpus of Adequacy Assessments for Real-World Machine Translation Output
    Daniele Pighin, Lluís Màrquez, Lluís Formiga
  • An Analysis (and an Annotated Corpus) of User Responses to Machine Translation Output
    Daniele Pighin, Lluís Màrquez, Jonathan May


April 2012 Papers to appear at EACL 2012

  • Syntax-Based Word Ordering Incorporating a Large-Scale Language Model
    Yue Zhang, Graeme Blackwood and Stephen Clark
    • A fundamental problem in text generation is word ordering. Word ordering is a computationally difficult problem, which can be constrained to some extent for particular applications, for example by using synchronous grammars for statistical machine translation. There have been some recent attempts at the unconstrained problem of generating a sentence from a multi-set of input words. By using CCG and learning guided search, Zhang and Clark reported the highest scores on this task. One limitation of their system is the absence of an N-gram language model, which has been used by text generation systems to improve fluency. We take the Zhang and Clark system as the baseline, and incorporate an N-gram model by applying online large-margin training. Our system significantly improved on the baseline by 3.7 BLEU points.


March 2012 Marcus Tomalin seminar at University of Edinburgh

  • Marcus Tomalin gave a seminar titled "In Search of 'Natural' Speech: Grammaticality, Acceptability, and Speech Technology" at the Edinburgh Linguistics Circle, March 2012.
    • Abstract: Although state-of-the-art large vocabulary Automatic Speech Recognition (ASR) and Statistical Machine Translation (SMT) systems often achieve impressive Word Error Rates (WERs) and BLEU scores respectively, end-users frequently consider the word sequences output by such systems to be 'unnatural'. The perceived 'unnaturalness' usually results from the accumulation of many small linguistic errors (e.g., lack of subject-verb agreement, partially scrambled syntax, homophonic substitution). Consequently, in recent years there has been a renewed interest in improving the 'naturalness' of ASR and SMT output, even in systems that produce good WER and BLEU scores.

      In this talk, the perceived 'naturalness' of ASR and SMT transcriptions will be considered in the context of on-going debates about grammaticality and acceptability. An experimental framework for exploring these aspects of ASR/SMT transcriptions is described, and a methodology for improving the 'naturalness' of such outputs is presented. The simplest ways of modifying an input word sequence are insertion, permutation, deletion, and substitution, and the approach adopted in this work makes use of a Combinatory Categorial Grammar (CCG) text generation system which enables input word sequences to be modified so as to improve their 'naturalness'. It is shown that the output produced by the CCG-based system is considerably improved if the N-best generated hypotheses are rescored and reranked using N-gram-based techniques.
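As a rough illustration of what rescoring and reranking an N-best list with an N-gram model can look like, here is a minimal Python sketch. It is not the system described in the talk: the toy training sentences, the add-one-smoothed bigram model, and the function names `train_bigram`, `log_prob` and `rerank` are all assumptions made for this example.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count bigrams and their left-hand contexts in a toy corpus (hypothetical data)."""
    contexts, bigrams = Counter(), Counter()
    vocab = {"</s>"}
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens[1:])
        contexts.update(tokens[:-1])            # every token that starts a bigram
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams, len(vocab)


def log_prob(sentence, model):
    """Add-one-smoothed bigram log-probability of one hypothesis."""
    contexts, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (contexts[a] + vocab_size))
        for a, b in zip(tokens, tokens[1:])
    )


def rerank(hypotheses, model):
    """Sort hypotheses so the most fluent one (by bigram score) comes first."""
    return sorted(hypotheses, key=lambda h: log_prob(h, model), reverse=True)


# Toy usage: three orderings of the same multi-set of words.
model = train_bigram(["the cat sat on the mat", "the dog sat on the rug"])
n_best = [
    "cat the sat mat on the",
    "the cat sat on the mat",
    "sat the cat on mat the",
]
print(rerank(n_best, model)[0])
```

A real system would typically use a higher-order model trained on a large corpus and would combine the N-gram score with the generator's own score rather than replace it.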


February 2012 Yue Zhang appointed Assistant Professor

Dr. Yue Zhang will take up a position as Assistant Professor at Singapore University of Technology and Design with effect from July 2012. Yue has been a Research Associate at the Cambridge Computer Laboratory working on parsing and natural language generation for FAUST.


The Asiya Open Toolkit for Automatic MT (Meta-)Evaluation

is available for download at http://www.lsi.upc.edu/~nlp/Asiya/


19 July 2011 FAUST is featured on the BBC World Service Radio Programme 'Click'

BBC Programme website, with audio

World Service English has a total weekly reach of 43m listeners and its audience in the US has risen to 10m. This follows record audience figures in the UK of 1.79m for the first quarter of 2011 – a reach of 3.5% among all UK adults.

An extended version of the interview broadcast on Click on BBC World Service Radio, 19th July 2011, is available on The Open University website.


23 June 2011 FAUST has a successful first-year review in Luxembourg


20 June 2011 Reverso Labs is live! Visit http://labs.reverso.net/

After several months of development, we are very pleased to announce the release of Reverso Labs.

This website allows you to test new translation engines from prestigious research labs, and to compare, rate and give feedback on translations for several European language pairs.

The FAUST project aims to process user feedback on translations quickly, so that users can benefit from it almost immediately. Reverso Labs is now online. Test it and give us your feedback!

http://labs.reverso.net/


14 July 2010 New version of the Reverso website http://www.reverso.net

Users can now rate and edit translations, and provide comments and corrections.


2 July 2010 Softissimo Initial Analysis on Translation Feedback

Analysis of translation suggestions on Reverso translation engines: initial findings


2 June 2010 Translation Feedback Collection Systems in Operation at UPC

The Spanish-Catalan translation server at UPC is now collecting user feedback in response to translation requests. The collected responses will be used to develop models of user interaction with translation systems.

UPC Spanish-Catalan online translation service: http://www.n-ii.org/

Some examples of user feedback collected by the system:

| Translation Request | Automatic Translation | Feedback Supplied by the User |
| Porta'ls el fonendoscopi, si us plau | Porta les el fonendoscopio, por favor | Llevales el fonendoscopio, por favor |
| eixir | eixir | salir |
| La bruixa amb sis gats i un hàlit de por espanta les ombres,el sol s'encofurna i es gebra d'esglai en veure-li el nas. | La bruja con seis gatos y un hálito de miedo asusta a las sombras, el sol se encofurna y se escarcha de espanto en verle la nariz. | La bruja con seis gatos y un hálito de miedo asusta a las sombras, el sol se encofurna y se hiela de espanto en ver su nariz. |
| Des del país dels lotòfags, Odisseu i els seus companys van navegar fins a un indret de la costa d'Itàlia on vivien els cíclops i que avui és la badia de Nàpols. | Desde el país de los lotòfags, Odiseo y sus compañeros navegaron hasta un lugar de la costa de Italia donde vivían los cíclops y que hoy es la bahía de Nápoles. | Desde el país de los lotófagos, Odiseo y sus compañeros navegaron hasta un lugar de la costa de Italia donde vivían los cíclopes y que hoy es la bahía de Nápoles. |
| Tampoc no tenien àgores, és a dir, places públiques on discutir, parlar i prendre decisions perquè tothom les respectés. | Tampoco carecían àgores, es decir, plazas públicas donde discutir, hablar y tomar decisiones para que todo el mundo las respetara. | Tampoco tenían ágoras, es decir, plazas públicas donde discutir, hablar y tomar decisiones para que todo el mundo las respetase. |
| fixa't en la imatge | Fija te en la imagen | Fijate en la imagen |
| L'autocar ens ha deixat a Castellar, des d'on hem iniciat la marxa caminant. | El autocar nos ha dejado en Castellar, desde donde hemos iniciado su marcha andando. | El autocar nos ha dejado en Castellar, desde donde hemos iniciado la marcha andando. |
| Sis persones s'esperen per pujar en un ascensor. Busca dues maneres diferents d'agrupar-les. | Seis personas se esperan para subir en un ascensor. Busca dos maneras diferentes de agruparlas. | Seis personas esperan para subir en un ascensor. Busca dos maneras diferentes de agruparlas. |
| De quins cossos en podries apilar dos d'iguals? | De qué cuerpos en podrías apilar dos iguales? | De qué cuerpos podrías apilar dos iguales? |
| Aquest vespre anem a sopar a Estoril. | Esta noche vamos a cenar en Estoril. | Esta noche vamos a cenar a Estoril. |
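One simple way to see what a user changed in examples like these is to compare the automatic translation with the user's correction word by word. The sketch below uses Python's standard `difflib.SequenceMatcher`. It is only an illustration, not the analysis carried out in the project, and the helper name `word_edits` is hypothetical.

```python
from difflib import SequenceMatcher


def word_edits(automatic, corrected):
    """Return the word-level edits that turn the automatic translation into the user's correction."""
    a, b = automatic.split(), corrected.split()
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"  # keep only replace / delete / insert operations
    ]


# Two of the collected examples: (automatic translation, user feedback).
examples = [
    ("El autocar nos ha dejado en Castellar, desde donde hemos iniciado su marcha andando.",
     "El autocar nos ha dejado en Castellar, desde donde hemos iniciado la marcha andando."),
    ("Seis personas se esperan para subir en un ascensor. Busca dos maneras diferentes de agruparlas.",
     "Seis personas esperan para subir en un ascensor. Busca dos maneras diferentes de agruparlas."),
]
for automatic, corrected in examples:
    print(word_edits(automatic, corrected))
```

Splitting on whitespace keeps punctuation attached to words, which is crude but sufficient to show that the first correction replaces a single word and the second deletes one.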


22 March 2010 FAUST Presentation at Language Technology Days

Presentation (pdf)


11-12 February 2010 Project Kick-Off Meeting: Cambridge, UK


1 February 2010 Project Start Date


-- BillByrne - 14 June 2010