This release comprises entries drawn from the web logs at Reverso.net. It contains 6,346 log entries with suggestions for better translations, around 10% of the total feedback collected in the FAUST language pairs during the 3 years of the project. Please contact Theo Hoffenberg, theo -at- softissimo -dot- com, if you are interested in obtaining a larger portion of the collection.
This package contains raw test sets crawled from web material, cleaner versions of them, and their respective reference translations for 9 European language pairs.
ftp://mi.eng.cam.ac.uk/data/faust/FAUST-1.0.tgz
Translation Feedback
(1) Analysis and annotation of a corpus of open-domain, real-world automatic translations
The quality assessments provide relative rankings and absolute (satisfactory/unsatisfactory) adequacy judgements for ca. 12,000 translations generated from 2,000 English translation requests submitted to Softissimo's translation portal, http://reverso.net. These two layers of annotation are complementary, and they can be exploited to learn quality models for different applications, such as selecting among alternative translations or discarding unsatisfactory outputs. A professional translator corrected the most obvious typos in the input sentences and provided reference translations into Spanish for all of them. The corrected sentences were then automatically translated into Spanish with five different systems.
ftp://mi.eng.cam.ac.uk/data/faust//UPC-Oct2011-FAUST-quality-assessments.tgz
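As a rough illustration of how the two annotation layers in package (1) could be used, here is a minimal Python sketch. The on-disk format of the release is not described here, so the `Assessment` record, its field names, and the convention that rank 1 is best are all assumptions made for this example. It does not describe the package's actual layout.

```python
from dataclasses import dataclass


# Hypothetical record layout: field names and the "1 = best" rank
# convention are assumptions, not the actual format of the release.
@dataclass
class Assessment:
    request_id: str     # identifier of the English translation request
    system: str         # which system produced the translation
    rank: int           # relative ranking among alternatives (1 = best)
    satisfactory: bool  # absolute adequacy judgement


def select_best(assessments):
    """Use the relative layer: keep the top-ranked translation per request."""
    best = {}
    for a in assessments:
        current = best.get(a.request_id)
        if current is None or a.rank < current.rank:
            best[a.request_id] = a
    return best


def discard_unsatisfactory(assessments):
    """Use the absolute layer: keep only translations judged satisfactory."""
    return [a for a in assessments if a.satisfactory]


# Toy data, invented for illustration only.
sample = [
    Assessment("req1", "sysA", 2, True),
    Assessment("req1", "sysB", 1, True),
    Assessment("req2", "sysA", 1, False),
]
best = select_best(sample)
kept = discard_unsatisfactory(sample)
```

The two functions are kept separate because the layers answer different questions: the top-ranked translation for a request may still be unsatisfactory in absolute terms, as with `req2` in the toy data.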
(2) FFF and FFF+ corpora with annotations on the correction of human feedback post-editions
The FAUST Feedback Filtering corpora (FFF and FFF+) consist of quadruples of