
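ChatGPT ⏐ Feeding It a Corpus to Improve Translation Quality

Powered by the GPT-4 engine, ChatGPT can read web page content and generate more fluent, natural translations based on context and user intent. How can we use this capability to have ChatGPT learn from a corpus and ultimately produce high-quality translations? This article explores a method in which a bilingual corpus is uploaded to Google Docs for ChatGPT to read and learn from, thereby improving translation quality. With this method, we can feed a bilingual corpus to ChatGPT and apply it to translation.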


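1. Feeding the corpus

Upload the corpus to Google Drive. Here we use the bilingual edition of the General Principles of the Civil Law of the People's Republic of China (《中华人民共和国民法通则》, 22,624 characters/words) as the corpus. After uploading, change the file permission in the upper right of the file to [Public] - [Anyone can edit] so that ChatGPT can read it.

[In essence]: since ChatGPT can only read web pages, turn the corpus into a web page and let it read that.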
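The article publishes the corpus through Google Docs. As an illustration of the same idea, the minimal sketch below renders aligned sentence pairs as a plain HTML page. The function name `corpus_to_html` and the sample pair are hypothetical placeholders, not part of the original workflow, and where the resulting page is hosted is left open.

```python
import html


def corpus_to_html(pairs, title="Bilingual corpus"):
    """Render (source, target) sentence pairs as a two-column HTML table.

    Hypothetical helper for illustration; the article itself uses Google Docs.
    """
    # Escape every cell so characters such as < and & survive as text.
    rows = "\n".join(
        f"<tr><td>{html.escape(src)}</td><td>{html.escape(tgt)}</td></tr>"
        for src, tgt in pairs
    )
    return (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="utf-8"><title>{html.escape(title)}</title></head>\n'
        f"<body><table>\n{rows}\n</table></body></html>\n"
    )


if __name__ == "__main__":
    # Placeholder pair; replace with the real aligned corpus.
    sample_pairs = [("源语句(占位)", "Target sentence (placeholder)")]
    print(corpus_to_html(sample_pairs))
```

Keeping one aligned pair per table row preserves the source-to-target correspondence when the page is read as text.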


2. Translation test text: Article 154 of the General Principles of the Civil Law of the People's Republic of China (note: this article has been removed from the corpus)

Official translation of the test text:
Article 154 Time periods referred to in the Civil Law shall be calculated by the Gregorian calendar in years, months, days and hours. When a time period is prescribed in hours, calculation of the period shall begin on the prescribed hour. When a time period is prescribed in days, months and years, the day on which the period begins shall not be counted as within the period; calculation shall begin on the next day. If the last day of a time period falls on a Sunday or an official holiday, the day after the holiday shall be taken as the last day. The last day shall end at 24:00 hours. If business hours are applicable, the last day shall end at closing time.

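3. Test step one:

With no corpus read and no prompt given, ask ChatGPT to translate. The result is as follows (note: the original post highlights in yellow the places that differ from the official translation):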


Article 154: The time period referred to in the Civil Code shall be calculated in years, months, days, and hours according to the Gregorian calendar.

If the time period is stipulated to be calculated in hours, it shall be calculated from the specified time. If the time period is stipulated to be calculated in days, months, or years, the first day shall not be counted, and the calculation shall begin from the next day. If the last day of the time period falls on a Sunday or other statutory holiday, the following day shall be deemed as the last day of the period. The deadline for the last day of the time period is 24:00. If there are business hours, the deadline is until the end of the business activities.
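The color highlighting from the original post does not survive in plain text. A minimal sketch using Python's standard `difflib` module shows one way to recover word-level differences between the official translation and a ChatGPT output. The helper name `word_diff` is an assumption made for illustration, and splitting on whitespace is a simplification.

```python
import difflib


def word_diff(reference, candidate):
    """List the word spans where candidate departs from reference.

    Returns tuples of (operation, reference_words, candidate_words).
    """
    ref_words = reference.split()
    cand_words = candidate.split()
    matcher = difflib.SequenceMatcher(a=ref_words, b=cand_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # keep only replace / delete / insert spans
            changes.append(
                (tag, " ".join(ref_words[i1:i2]), " ".join(cand_words[j1:j2]))
            )
    return changes


if __name__ == "__main__":
    official = ("Time periods referred to in the Civil Law shall be calculated "
                "by the Gregorian calendar in years, months, days and hours.")
    first_test = ("The time period referred to in the Civil Code shall be "
                  "calculated in years, months, days, and hours according to "
                  "the Gregorian calendar.")
    for change in word_diff(official, first_test):
        print(change)
```

Each printed tuple corresponds roughly to one highlighted span: the wording in the official translation and the wording ChatGPT produced instead.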

4. Test step two:

Add a prompt (legal English) and have ChatGPT polish the translation. The prompt comes from AIPRM for ChatGPT. The result is as follows (note: in the original post, green marks the polished content and yellow marks differences from the official translation):
Article 154 Civil Law stipulates that time periods in civil law shall be calculated in years, months, days, and hours according to the Gregorian calendar. If a time period is specified to be calculated in hours, it shall commence from the specified time. If a time period is specified to be calculated in days, months, or years, the first day shall not be counted, and the calculation shall commence from the next day. If the last day of a time period falls on a Sunday or another statutory holiday, the following day shall be deemed as the last day of the period. The deadline for the last day of a time period is 24:00. If business hours are in effect, the deadline extends to the cessation of business activities.

5. Test step three:
Have ChatGPT read the corpus, enter the prompt, and translate. The result is as follows:

(Enter the document link in the chat box to share it with ChatGPT)

(Ask ChatGPT to briefly analyze the language style of the source text, to make sure it has read the file correctly and understands the source text)

(Ask ChatGPT to translate)

Translation: Article 154: The periods referred to in the Civil Code shall be calculated in years, months, days, and hours in accordance with the Gregorian calendar. If a period is specified in hours, it shall be calculated from the designated time. If a period is specified in days, months, or years, the first day shall not be counted and the period shall be calculated from the following day. If the last day of a period falls on a Sunday or other legal holiday, the day after the holiday shall be deemed as the last day of the period. The deadline for the last day of a period shall be 24:00. If there are business hours, the deadline shall be the end of the business activity.

(Enter the prompt to polish the translation)

The periods mentioned in the Civil Code are calculated in years, months, days and hours according to the Gregorian calendar. The first day of the period is not counted for periods in days, months, or years, and the period begins from the following day. If the last day of a period falls on a Sunday or legal holiday, the last day is the day after the holiday. The deadline for the last day is the day after the holiday. The deadline for the last day of the period is 24:00, or the end of business activity if applicable.


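6. Use the BLEU machine translation evaluation algorithm to compare the four translations
The official translation serves as the reference. The scores can be computed directly in ChatGPT or with code. Here we simply ask ChatGPT to compute the values.

(The first test translation scored with BLEU)

(Data for the third test translation and the prompts for the other tests)

(Python instructions; use these if you choose to test with Python)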
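The Python instructions shown in the original screenshot are not reproduced in the text. As a stand-in, here is a minimal standard-library sketch of a sentence-level BLEU score. It is a simplified variant under stated assumptions (regex tokenization, a single reference, add-one smoothing on every n-gram order), and the function names are chosen for illustration, so its scores should not be expected to match the figures ChatGPT reported.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase, then split into word and punctuation tokens.
    return re.findall(r"\w+|[^\w\s]", text.lower())


def ngram_counts(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference, candidate, max_n=4):
    """Simplified single-reference BLEU with add-one smoothing."""
    ref, cand = tokenize(reference), tokenize(candidate)
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngram_counts(cand, n)
        ref_counts = ngram_counts(ref, n)
        total = sum(cand_counts.values())
        # Clipped matches: an n-gram is credited at most as often as it
        # occurs in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        # Add-one smoothing keeps one empty n-gram order from zeroing the score.
        log_precisions.append(math.log((matched + 1) / (total + 1)))
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)


if __name__ == "__main__":
    official = ("Time periods referred to in the Civil Law shall be calculated "
                "by the Gregorian calendar in years, months, days and hours.")
    first_test = ("The time period referred to in the Civil Code shall be "
                  "calculated in years, months, days, and hours according to "
                  "the Gregorian calendar.")
    print(round(sentence_bleu(official, first_test), 3))
```

Running the same function over all four translations with the full official text as the reference gives scores computed by one fixed procedure, which makes them directly comparable with each other.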

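The scores are as follows: first: 0.802; second: 0.892; third: 0.94; fourth: 0.77
The third translation has the highest similarity to the official translation, which indicates that having ChatGPT read the corpus does improve translation quality.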


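The ChatGPT task group of the Translation Technology Education Society (翻译技术教育研究会) is dedicated to exploring how AI tools such as ChatGPT can be applied at different stages of language services, so as to improve practitioners' efficiency. Its current main directions are exploring how changes to prompts and workflows improve interaction results, and learning about and recommending AI applications on platforms such as GitHub, including how AI applications can be embedded in language service workflows and how those workflows can be optimized. The group sets its topics through regular meetings and discussions and shares its findings through articles, videos, and other formats. Suggestions and comments are welcome via the article comments!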


Editor: Li Lin (李林)
