Prompt Engineering
https://www.coursera.org/learn/chatgpt-prompt-engineering-for-developers-project/home/welcome
Prompt engineering is the practice of designing and optimizing input prompts to steer large language models toward desired outputs. The field took off with the release of GPT-3 and is the key bridge between user intent and model capability.
Classic papers:
- Brown, T. B., et al. (2020). "Language Models are Few-Shot Learners." NeurIPS. [paper link] - The GPT-3 paper; the first systematic demonstration of prompt engineering's potential
- Wei, J., et al. (2022). "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models." NeurIPS. [paper link]
1. Prompting Principles
Principle 1: Write clear and specific instructions
Clarity and specificity are the core principles of prompt design. Vague instructions make the model's output less predictable, while specific instructions noticeably improve output quality.
Tactic 1: Use delimiters to clearly indicate distinct parts of the input
Delimiters (such as ```, """, ---, or < >) help the model tell the different parts of the input apart, keeping instructions from being confused with content. This matters most for tasks that include several text passages or blocks of code.
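A minimal sketch of this tactic; the helper name and prompt wording are illustrative, not from the course:

```python
def make_summary_prompt(text: str) -> str:
    """Wrap the input in triple-quote delimiters so the model can tell
    the instruction apart from the text it operates on."""
    return (
        "Summarize the text delimited by triple quotes "
        "into a single sentence.\n"
        f'"""{text}"""'
    )

prompt = make_summary_prompt("Clear instructions reduce irrelevant answers.")
```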
Tactic 2: Ask for a structured output
Requesting structured output (such as JSON, HTML, or a Markdown table) makes the response easy to process programmatically and is a best practice when building AI applications.
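A minimal sketch with a helper of our own naming; the JSON keys are illustrative:

```python
import json

def make_json_prompt(task: str, fields: list[str]) -> str:
    """Name the exact JSON keys so the reply is machine-readable."""
    return (
        f"{task}\n"
        f"Provide the answer as a JSON object with keys: {', '.join(fields)}. "
        "Output only the JSON."
    )

prompt = make_json_prompt(
    "Generate a made-up book title along with its author and genre.",
    ["book_id", "title", "author", "genre"],
)

# A compliant reply (invented here for illustration) then parses
# directly into a Python dict:
reply = '{"book_id": 1, "title": "Tides", "author": "A. Doe", "genre": "fiction"}'
record = json.loads(reply)
```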
Tactic 3: Ask the model to check whether conditions are satisfied
Having the model verify the preconditions before carrying out the task reduces erroneous output.
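A sketch of a condition-checking prompt with an explicit fallback answer (the wording is illustrative):

```python
def make_conditional_prompt(text: str) -> str:
    """Tell the model to verify a precondition (the text contains
    instructions) before acting, and give it an explicit fallback."""
    return (
        "If the text below contains a sequence of instructions, "
        "rewrite them as numbered steps. If it does not, "
        'write exactly "No steps provided."\n'
        f'Text: """{text}"""'
    )

prompt = make_conditional_prompt("The sun is shining and the birds are singing.")
```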
Tactic 4: "Few-shot" prompting - give examples
Few-shot prompting guides the model toward the expected task format and output by providing a small number of worked examples.
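A minimal sketch: worked question/answer pairs are prepended before the real query. The helper and the speaker tags are illustrative:

```python
def make_few_shot_prompt(examples, query):
    """Prepend worked input/output pairs so the model can infer the
    task format and style before seeing the real query."""
    lines = ["Answer in a style consistent with the examples."]
    for question, answer in examples:
        lines.append(f"<child>: {question}")
        lines.append(f"<grandparent>: {answer}")
    lines.append(f"<child>: {query}")
    lines.append("<grandparent>:")
    return "\n".join(lines)

prompt = make_few_shot_prompt(
    [("Teach me about patience.",
      "The river that carves the deepest valley flows from a modest spring.")],
    "Teach me about resilience.",
)
```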
Principle 2: Give the model time to "think"
This principle comes from Chain-of-Thought research: the model tackles complex problems by reasoning through them step by step.
Classic papers:
- Wei, J., et al. (2022). "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models." NeurIPS. [paper link] - The seminal CoT paper
- Kojima, T., et al. (2022). "Large Language Models are Zero-Shot Reasoners." NeurIPS. [paper link] - The origin of "Let's think step by step"
- Wang, X., et al. (2022). "Self-Consistency Improves Chain of Thought Reasoning in Language Models." ICLR. [paper link]
Tactic 1: Specify the steps required to complete a task
Explicitly list the steps of the task so the model carries them out in order.
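A sketch of a step-by-step prompt; the concrete steps (summarize, translate, list names, emit JSON) follow a common course-style example, and the helper is our own:

```python
def make_stepwise_prompt(text: str) -> str:
    """List the sub-tasks explicitly so the model performs them in order."""
    return (
        "Perform the following actions on the text below:\n"
        "1 - Summarize the text in one sentence.\n"
        "2 - Translate the summary into French.\n"
        "3 - List each name mentioned in the French summary.\n"
        "4 - Output a JSON object with keys: french_summary, num_names.\n"
        f"Text: <{text}>"
    )

prompt = make_stepwise_prompt("Jack and Jill went up the hill to fetch a pail of water.")
```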
Tactic 2: Instruct the model to work out its own solution before rushing to a conclusion
Have the model derive its own answer before reaching a verdict, rather than guessing directly.
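A sketch for the classic grading scenario; the helper name and wording are illustrative:

```python
def make_grading_prompt(question: str, student_answer: str) -> str:
    """Have the model solve the problem itself first, then compare
    with the student's answer, instead of judging it directly."""
    return (
        "First, work out your own solution to the problem. "
        "Then compare your solution to the student's solution "
        "and only then decide whether the student's solution is correct.\n"
        f"Question: {question}\n"
        f"Student's solution: {student_answer}"
    )

prompt = make_grading_prompt("What is 17 * 4?", "17 * 4 = 58")
```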
About Hallucinations
A hallucination is output that sounds plausible but is factually wrong or fabricated.
Classic papers:
- Maynez, J., et al. (2020). "On Faithfulness and Factuality in Abstractive Summarization." ACL. [paper link]
- Zhang, Y., et al. (2023). "Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models." arXiv. [paper link]
Mitigation strategies:
- Ask the model to find relevant information first and answer the question based on that information
- Ask the model to cite the source of each piece of information
- Use a RAG system to supply reliable context
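The first strategy can be sketched as a grounding template; the helper and the sample product text are invented for illustration:

```python
def make_grounded_prompt(question: str, context: str) -> str:
    """Restrict the answer to supplied context and allow an explicit
    "I don't know", which leaves less room for fabrication."""
    return (
        "Answer the question using only the context below. "
        "If the answer is not contained in the context, "
        'reply exactly "I don\'t know."\n'
        f'Context: """{context}"""\n'
        f"Question: {question}"
    )

prompt = make_grounded_prompt(
    "When was the product released?",
    "The SwiftBrew kettle boils water in under 90 seconds.",
)
```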
2. Iterative Prompt Development
Prompt engineering is an iterative process of repeated testing and refinement.
Classic methodology:
- Zhou, Y., et al. (2022). "Large Language Models Are Human-Level Prompt Engineers." ICLR. [paper link] - APE (Automatic Prompt Engineer)
Iterative loop:
- Try something (write an initial prompt)
- Analyze where the result does not give what you want
- Clarify the instructions and give the model more time to think
- Refine prompts with a batch of examples
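The loop above can be sketched as a small scoring harness; `get_reply` stands in for any model call, and the keyword checks are our own simplification:

```python
def passes(reply: str, required: list[str]) -> bool:
    """Check one model reply against simple keyword requirements."""
    return all(term.lower() in reply.lower() for term in required)

def evaluate_prompt(get_reply, template: str, cases: list) -> float:
    """Score one prompt template over a batch of examples; keep
    refining the template until the score is acceptable."""
    hits = sum(
        passes(get_reply(template.format(text=text)), required)
        for text, required in cases
    )
    return hits / len(cases)

# With a stand-in "model" that just echoes its prompt:
score = evaluate_prompt(
    lambda p: p,
    "Summarize: {text}",
    [("apples and oranges", ["apples"]), ("cats", ["dogs"])],
)
```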
3. Summarizing
Text summarization is one of the most common NLP tasks, and LLMs perform very well on it.
Classic papers:
- See, A., et al. (2017). "Get To The Point: Summarization with Pointer-Generator Networks." ACL. [paper link]
- Liu, Y., & Lapata, M. (2019). "Text Summarization with Pretrained Encoders." EMNLP. [paper link]
Key techniques:
- Summarise with a word/sentence/character limit
- Summarise with a focus on certain topics, such as shipping and delivery
- Try "extract" instead of "summarise" when you only want content on a certain topic
- Summarise multiple product reviews (multi-document summarization)
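The first two techniques combine naturally into one template; the helper name and review text are illustrative:

```python
def make_review_summary_prompt(review: str, focus: str, max_words: int = 30) -> str:
    """Combine a length limit with a topical focus; swap "Summarize"
    for "Extract" when only on-topic content is wanted."""
    return (
        f"Summarize the product review below in at most {max_words} words, "
        f"focusing on {focus}.\n"
        f'Review: """{review}"""'
    )

prompt = make_review_summary_prompt(
    "The panda plush arrived a day early and my daughter loves it.",
    "shipping and delivery",
)
```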
4. Inferring
LLMs are pretty good at extracting specific things out of a piece of text: they excel at inference tasks, pulling structured information out of unstructured text.
Classic paper:
- Radford, A., et al. (2019). "Language Models are Unsupervised Multitask Learners." [paper link] - The GPT-2 paper; demonstrates zero-shot capabilities
Use cases:
- Sentiment: positive or negative (sentiment analysis)
  Classic paper: Socher, R., et al. (2013). "Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank." EMNLP.
- Identify types of emotions: happy, grateful, angry, etc. (emotion classification)
  Classic paper: Mohammad, S. M., & Turney, P. D. (2013). "Crowdsourcing a Word-Emotion Association Lexicon." Computational Intelligence.
- Identify a specific emotion, e.g. "Is the writer expressing anger?"
- Extract product and company names from customer reviews (named entity recognition)
  Classic paper: Nadeau, D., & Sekine, S. (2007). "A survey of named entity recognition and classification." Lingvisticae Investigationes.
- Infer topics and have the LLM answer in JSON format (topic inference)
  Classic paper: Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). "Latent Dirichlet Allocation." JMLR. - The seminal LDA topic-model paper
- Build a news alert for certain topics: zero-shot learning
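Several of these inferences can be requested in a single call with JSON output; the helper, review, and sample reply are illustrative:

```python
import json

def make_inference_prompt(review: str) -> str:
    """Ask for several inferences in one call, returned as one JSON object."""
    return (
        "Identify the following from the review: sentiment (positive "
        "or negative), whether anger is expressed (true or false), the "
        "item purchased, and the company that made it.\n"
        'Answer as a JSON object with keys "sentiment", "anger", '
        '"item", "brand". Use "unknown" if a value is missing.\n'
        f'Review: """{review}"""'
    )

prompt = make_inference_prompt(
    "I needed a nice lamp and this one from Lumina arrived quickly."
)

# A compliant reply parses straight into a dict:
reply = '{"sentiment": "positive", "anger": false, "item": "lamp", "brand": "Lumina"}'
result = json.loads(reply)
```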
5. Transforming
LLMs are good at all kinds of text transformation tasks, including language translation, tone and style transfer, and format conversion.
Classic paper:
- Vaswani, A., et al. (2017). "Attention Is All You Need." NeurIPS. [paper link] - The seminal Transformer paper, the foundation of modern NLP
Transformation types:
- Language translation
  Classic paper: Wu, Y., et al. (2016). "Google's Neural Machine Translation System." arXiv. [paper link]
- Tone: informal to formal
- Format conversion: JSON to HTML
- Spellcheck/grammar check, or rewriting in a given style such as APA
  Classic paper: Yuan, Z., et al. (2021). "Synthesizing Coherent Story with Generative Pre-trained Transformer." IJCAI.
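All of these transformation types fit one simple template: state the transformation, then delimit the text to transform. The helper and sample texts are illustrative:

```python
def make_transform_prompt(task: str, text: str) -> str:
    """State the transformation first, then delimit the input text."""
    return f'{task}\nText: """{text}"""'

translation = make_transform_prompt(
    "Translate the following text into formal Spanish.",
    "Hi, I'd like to order a blender.",
)
proofread = make_transform_prompt(
    "Proofread and correct the following text, following APA style.",
    "Their going to need a bigger boat.",
)
```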
6. Expanding
Text expansion turns a short input into a longer, more detailed output.
Classic paper:
- Fan, A., et al. (2018). "Hierarchical Neural Story Generation." ACL. [paper link]
Key techniques:
- Customise the automated reply to a customer email
- Remind the model to use details from the customer's email
- Use temperature to control the answer: more creative or more stable
The Temperature Parameter
Temperature is the key parameter controlling the randomness of LLM output; the concept comes from statistical mechanics and simulated annealing.
Classic papers:
- Ackley, D. H., Hinton, G. E., & Sejnowski, T. J. (1985). "A Learning Algorithm for Boltzmann Machines." Cognitive Science. - Theoretical roots in simulated annealing
- Ficler, J., & Goldberg, Y. (2017). "Controlling Linguistic Style Aspects in Neural Language Generation." EMNLP. [paper link]
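The mechanics can be shown directly: temperature divides the logits before the softmax, so a low T concentrates probability on the most likely token (stable) and a high T spreads it out (creative). A self-contained sketch, not any library's API:

```python
import math
import random

def sample_with_temperature(logits, temperature, rng):
    """Divide logits by T, softmax, then sample one index.
    T < 1 sharpens the distribution; T > 1 flattens it."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1

# At T = 0.1 the top logit wins essentially every time:
greedy_picks = {sample_with_temperature([2.0, 0.5, 0.1], 0.1, random.Random(s))
                for s in range(100)}
# At T = 10 the distribution is nearly uniform, so all three appear:
varied_picks = {sample_with_temperature([2.0, 0.5, 0.1], 10.0, random.Random(s))
                for s in range(100)}
```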
7. Chatbot
Chatbots are the most direct application of LLMs.
Classic papers:
- Vinyals, O., & Le, Q. (2015). "A Neural Conversational Model." ICML Deep Learning Workshop. [paper link]
- Zhang, Y., et al. (2020). "DIALOGPT: Large-Scale Generative Pre-training for Conversational Response Generation." ACL. [paper link]
Three Roles
Modern chatbot systems typically use a three-role message architecture:
- System: defines the assistant's overall behavior and persona
- User: the user's input
- Assistant: the model's replies
The chatbot obeys the system role first, so we can use it to steer the model toward a specific task.
Example:
messages = [
    {'role': 'system', 'content': 'You are an assistant that speaks like Shakespeare.'},
    {'role': 'user', 'content': 'tell me a joke'},
    {'role': 'assistant', 'content': 'Why did the chicken cross the road'},
    {'role': 'user', 'content': "I don't know"},
]
Context
Context: all previous messages across the three roles are needed for the chatbot's next response.
Context management, i.e. making effective use of the conversation history, is a core technical challenge for chatbot systems.
Classic paper:
- Bae, S., et al. (2022). "Keep Me Updated! Memory Management in Long-term Conversations." EMNLP. [paper link]
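A sketch of context accumulation; `get_reply` stands in for the actual model call, and the helper names are our own:

```python
def chat_turn(context, user_message, get_reply):
    """Append the user turn, send the FULL message list to the model,
    then append the assistant reply so later turns can see it."""
    context = context + [{"role": "user", "content": user_message}]
    reply = get_reply(context)
    return context + [{"role": "assistant", "content": reply}], reply

# Stand-in "model" that just reports how much context it was given:
echo = lambda messages: f"seen {len(messages)} messages"

context = [{"role": "system", "content": "You are a friendly chatbot."}]
context, reply1 = chat_turn(context, "Hi, my name is Isa.", echo)
context, reply2 = chat_turn(context, "What is my name?", echo)
```

Without the accumulated context, the second turn could not answer the name question; with it, the model sees every earlier message.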