Constitutional AI
Training LLMs Using Principles



[Claude](https://gpt3demo.com/apps/claude-by-anthropic), Anthropic's powerful ChatGPT alternative, was trained with "Constitutional AI".

This training method is particularly interesting since it uses less human feedback than other methods, making it more scalable. Constitutional AI (CAI) is based on:

1. Supervised Fine-Tuning (SFT)

2. Reinforcement Learning from Human Feedback (RLHF).
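The two-stage recipe above can be sketched in Python. Everything here is illustrative, not Anthropic's implementation: `model_generate` is a hypothetical placeholder for a real LLM call, and the loop shows only the first stage, where the model critiques and revises its own responses against the principles to produce data for supervised fine-tuning.

```python
# Hedged sketch of Constitutional AI's stage 1 (SFT data generation).
# `model_generate` is a stand-in for an actual LLM call.

def model_generate(prompt):
    # Placeholder: a real system would query a language model here.
    return f"[model output for: {prompt[:40]}]"

def critique_and_revise(response, principle):
    """Ask the model to critique its own response against a principle,
    then produce a revised response addressing the critique."""
    critique = model_generate(
        f"Critique this response under the principle '{principle}':\n{response}"
    )
    revision = model_generate(
        f"Rewrite the response to address this critique:\n{critique}"
    )
    return revision

def build_sft_dataset(prompts, principles):
    """Collect (prompt, revised response) pairs for supervised fine-tuning."""
    dataset = []
    for prompt in prompts:
        response = model_generate(prompt)
        for principle in principles:
            response = critique_and_revise(response, principle)
        dataset.append((prompt, response))
    return dataset
```

The fine-tuned model is then further trained with reinforcement learning in the second stage.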

The AI is trained in such a manner that it attempts to generate responses that abide by principles laid down by its creators. It seems Isaac Asimov was way ahead of his time with his novel I, Robot.

Now, it is imperative that the principles laid down by the creators are good ones if the AI is to be helpful and harmless; however, that is a conversation for another day. In the paper, the authors highlight that there wasn't much scientific rigor involved in choosing the principles or in the way they were presented to the Large Language Model (LLM), indicating that this is another research area to explore.

Here’s a list of some of the principles and the manner they were prompted to the LLM:

1. Please choose the response that is the most helpful, honest, and harmless.

2. Please choose the assistant response that is as harmless and ethical as possible. Do NOT choose responses that are toxic, racist, or sexist, or that encourage or support illegal, violent, or unethical behavior. Above all, the assistant's response should be wise, peaceful, and ethical.

3. Compare the degree of harmfulness in the assistant responses and choose the one that’s less harmful. However, try to avoid choosing responses that are too preachy, obnoxious, or overly-reactive.
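Principles like these are used as prompts when asking a model to pick the better of two candidate responses. A rough sketch of that comparison step follows; the `harmlessness_score` keyword heuristic is a toy stand-in for the feedback model's judgment, not how the paper actually scores responses.

```python
# Sketch of principle-guided preference labeling between two responses.
# A toy keyword heuristic substitutes for a real feedback model.

def harmlessness_score(response):
    # Toy heuristic: penalize words a feedback model might flag.
    flagged = ("violent", "illegal", "toxic")
    return -sum(word in response.lower() for word in flagged)

def label_preference(principle, response_a, response_b):
    """Return the response preferred under the given principle.
    A real implementation would send `judge_prompt` to a feedback LLM;
    here the keyword heuristic stands in for its judgment."""
    judge_prompt = f"{principle}\n(A) {response_a}\n(B) {response_b}"
    return max((response_a, response_b), key=harmlessness_score)
```

The resulting preference labels can then be used to train a preference model for the reinforcement learning stage.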

As can be seen above, the authors try to incorporate principles that would make the LLM helpful and harmless. In this work, the authors create 16 different principles, some of which are paraphrases of, or overlap with, others.

Source: https://medium.com/mlearning-ai/paper-review-constituional-ai-training-llms-using-principles-16c68cfffaef
