Validating multilingual hybrid automatic term extraction for search engine optimisation: the use case of EBM-GUIDELINES
Abstract
Tools that automatically extract terms and their equivalents in other languages from parallel corpora can contribute to multilingual professional communication in more than one way. By means of a use case with data from a medical web site with point of care evidence summaries (Ebpracticenet), we illustrate how hybrid multilingual automatic term extraction from parallel corpora works and how it can be used in a practical application such as search engine optimisation. The original aim was to use the result of the extraction to improve the recall of a search engine by allowing automated multilingual searches. Two additional possible applications were found while considering the data: searching via related forms and searching via strongly semantically related words. The second stage of this research was to find the most suitable format for the required manual validation of the raw extraction results and to compare the validation process when performed by a domain expert versus a terminologist.
Keywords
automatic terminology extraction, ATR, terminology