[Question] Machine translation - ouilyn - PTT (批踢踢實業坊)

Author: ouilyn (Si c'etait possible...) 2019-03-22 22:18:55

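I'm reading an article on machine translation, and there is one short passage I keep failing to understand. I'd appreciate any guidance from the more experienced members of this board.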
Error classification and annotation is carried out when the focus is on
understanding the types of errors produced by an MT system and their
frequency. An example of an error typology for the evaluation of MT
is proposed by Vilar, Xu, d'Haro, and Ney. This form of evaluation was
particularly useful when the dominant MT paradigm was rule based; that is,
it was possible to "code" linguistic rules for the transfer of words, phrases,
and grammatical structures from one source language into a target language.
The use of error typology for the more recent data-driven or statistical
machine translation is more limited because, in this case, the nature and
volume of the data, as opposed to formal linguistic rules, dictate the output
to a large extent.
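Error classification means sorting the parts the machine mistranslated into
categories such as missing word, incorrect word order, and so on. As I
understand it, this approach is not well suited to statistical machine
translation (SMT can learn: given a set of source texts paired with their
translations, the machine analyzes them, derives some kind of rules or
formulas, and can then translate similar material when it comes up next).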
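
To make "types of errors and their frequency" concrete, here is a minimal sketch of what an error-classification tally might look like. The annotations, the error labels, and the function name `error_frequencies` are all made up for illustration; this is not the typology from the article.

```python
from collections import Counter

# Hypothetical annotations: (segment id, error type) pairs that a human
# annotator might assign to MT output. The labels are illustrative only.
annotations = [
    (1, "missing word"),
    (1, "incorrect word order"),
    (2, "missing word"),
    (3, "incorrect word"),
    (3, "missing word"),
]


def error_frequencies(pairs):
    """Return each error type's share of all annotated errors."""
    counts = Counter(error_type for _, error_type in pairs)
    total = sum(counts.values())
    return {error_type: n / total for error_type, n in counts.items()}


freqs = error_frequencies(annotations)
# Print error types from most to least frequent.
for error_type, share in sorted(freqs.items(), key=lambda kv: -kv[1]):
    print(f"{error_type}: {share:.0%}")
```

With a rule-based system, a table like this could in principle point back to the rule that needs fixing; with a data-driven system there is no single rule to point to.
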
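The part I don't understand is "the nature and volume of the data dictate the
output to a large extent". Given how SMT works, the volume of data is what
the rules are induced from, so it clearly affects whether a translation can
be produced at all. But what does "nature" (the type of data?) have to do with it?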

Author: annisat 2019-04-02 16:01:00

I think "nature" here means the character or properties of the data? My first thought is how messy it is.
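
One possible way to picture it, as a toy sketch only: the same "learner" gives different output depending on what kind of data it saw, not just how much. This is not real SMT; the word pairs, the domains, and the names `train` and `translate` are assumptions made up for illustration.

```python
from collections import Counter, defaultdict


def train(aligned_pairs):
    """Count how often each source word is aligned to each target word."""
    table = defaultdict(Counter)
    for source_word, target_word in aligned_pairs:
        table[source_word][target_word] += 1
    return table


def translate(table, source_word):
    """Pick the most frequent target word, or None if the word is unseen."""
    if source_word not in table:
        return None
    return table[source_word].most_common(1)[0][0]


# Hypothetical English-French word alignments from two domains.
finance_data = [("bank", "banque")] * 3 + [("interest", "interet")] * 2
geography_data = [("bank", "rive")] * 3 + [("river", "riviere")] * 2

finance_model = train(finance_data)
geography_model = train(geography_data)

print(translate(finance_model, "bank"))    # most frequent choice in finance data
print(translate(geography_model, "bank"))  # most frequent choice in geography data
print(translate(finance_model, "river"))   # unseen in finance data
```

Both models saw the same amount of data, but the domain decides which translation of "bank" comes out, and a word missing from the data cannot be translated at all. Noisy or misaligned pairs would shift the counts in the same way.
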
