Application of statistical machine translation to public health information: a feasibility study

Cited by: 41
Authors
Kirchhoff, Katrin [1]
Turner, Anne M. [2, 3]
Axelrod, Amittai [1]
Saavedra, Francisco [3]
Affiliations
[1] Univ Washington, Dept Elect Engn, Seattle, WA 98195 USA
[2] Univ Washington, NW Ctr Publ Hlth Practice, Seattle, WA 98195 USA
[3] Univ Washington, Dept Med Educ & Biomed Informat, Seattle, WA 98195 USA
Keywords
DISPARITIES; INTERNET; CARE
DOI
10.1136/amiajnl-2011-000176
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Objective: Accurate, understandable public health information is important for ensuring the health of the nation. The large portion of the US population with Limited English Proficiency is best served by translations of public-health information into other languages. However, a large number of health departments and primary care clinics face significant barriers to fulfilling federal mandates to provide multilingual materials to Limited English Proficiency individuals. This article presents a pilot study on the feasibility of using freely available statistical machine translation technology to translate health promotion materials.
Design: The authors gathered health-promotion materials in English from local and national public-health websites. Spanish versions were created by translating the documents using a freely available machine-translation website. Translations were rated for adequacy and fluency, analyzed for errors, manually corrected by a human posteditor, and compared with exclusively manual translations.
Results: Machine translation plus postediting took 15-53 min per document, compared to the reported days or even weeks for the standard translation process. A blind comparison of machine-assisted and human translations of six documents revealed overall equivalency between machine-translated and manually translated materials. The analysis of translation errors indicated that the most important errors were word-sense errors.
Conclusion: The results indicate that machine translation plus postediting may be an effective method of producing multilingual health materials with equivalent quality but lower cost compared to manual translations.
Pages: 473-478
Page count: 6