Prompt Engineering

https://www.coursera.org/learn/chatgpt-prompt-engineering-for-developers-project/home/welcome

Prompt Engineering is the practice of designing and optimizing input prompts to steer large language models toward the desired output. The field took off with the release of GPT-3 and is the key bridge between user intent and model capability.

Key papers:

  • Brown, T. B., et al. (2020). "Language Models are Few-Shot Learners." NeurIPS. [paper link] - the GPT-3 paper, the first systematic demonstration of the potential of prompt engineering
  • Wei, J., et al. (2022). "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models." NeurIPS. [paper link]

1. Prompting Principles

Principle 1: Write clear and specific instructions

Clarity and specificity are the core principles of prompt design. Vague instructions make the model's output less predictable, while specific instructions significantly improve output quality.

  • Tactic 1: Use delimiters to clearly indicate distinct parts of the input

    Delimiters (such as ```, """, ---, <>) help the model identify the distinct parts of the input and avoid confusion. This matters most for tasks whose input contains multiple passages of text or code.

  • Tactic 2: Ask for a structured output

    Asking for structured output (such as JSON, HTML, or a Markdown table) makes downstream programmatic processing easier and is a best practice when building AI applications.

  • Tactic 3: Ask the model to check whether conditions are satisfied

    Having the model verify that the required conditions hold before performing the task reduces erroneous output.

  • Tactic 4: “Few-shot” prompting - give examples

    Few-shot prompting guides the model toward the expected task format and output by providing a small number of worked examples.

    Key papers:

    • Brown, T. B., et al. (2020). "Language Models are Few-Shot Learners." NeurIPS. [paper link] - established few-shot prompting for large language models
    • Min, S., et al. (2022). "Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?" EMNLP. [paper link]
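The four tactics above can be sketched as plain prompt strings. The template wording and the sample `text` below are illustrative, not the course's exact prompts:

```python
# Sketch of the Principle 1 tactics as plain prompt strings.

text = "Alice ordered a lamp. It arrived broken, and support never replied."

# Tactic 1: delimiters mark off the text to be processed.
delimited = (
    "Summarize the text delimited by triple backticks in one sentence.\n"
    f"```{text}```"
)

# Tactic 2: ask for structured (JSON) output.
structured = (
    "Extract the product and the complaint from the review below. "
    'Respond as JSON with keys "product" and "complaint".\n'
    f"Review: ```{text}```"
)

# Tactic 3: have the model check a condition before acting.
conditional = (
    "If the text below contains a sequence of instructions, rewrite them "
    'as numbered steps. If not, simply write "No steps provided."\n'
    f"```{text}```"
)

# Tactic 4: few-shot - show the model an example of the desired style.
few_shot = (
    "Answer in a consistent style.\n"
    "<child>: Teach me about patience.\n"
    "<grandparent>: The river that carves the deepest valley flows from a modest spring.\n"
    "<child>: Teach me about resilience."
)
```

Each string would be sent to the model as-is; only the instruction changes, while the delimited input text stays the same.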

Principle 2: Give the model time to "think"

This principle comes from Chain-of-Thought research: the model solves complex problems by reasoning step by step instead of answering immediately.

Key papers:

  • Wei, J., et al. (2022). "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models." NeurIPS. [paper link] - the original CoT paper

  • Kojima, T., et al. (2022). "Large Language Models are Zero-Shot Reasoners." NeurIPS. [paper link] - origin of the "Let's think step by step" prompt

  • Wang, X., et al. (2022). "Self-Consistency Improves Chain of Thought Reasoning in Language Models." ICLR. [paper link]

  • Tactic 1: Specify the steps required to complete a task

    Explicitly list the steps of the task so the model executes them in order.

  • Tactic 2: Instruct the model to work out its own solution before rushing to a conclusion

    Ask the model to derive its own solution first and only then draw a conclusion, rather than guessing directly.
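Both tactics can be sketched as prompt templates. The step list, the story, and the arithmetic problem below are illustrative examples, not from the course:

```python
# Tactic 1: spell out the steps the model should perform, in order.
story = "In a quiet village, siblings Jack and Jill set out to fetch water from a hilltop well."

step_prompt = (
    "Perform the following actions on the text delimited by triple backticks:\n"
    "1 - Summarize the text in one sentence.\n"
    "2 - Translate the summary into French.\n"
    "3 - List each name in the French summary.\n"
    "4 - Output a JSON object with keys: french_summary, num_names.\n"
    f"```{story}```"
)

# Tactic 2: make the model work out its own answer before judging another.
grading_prompt = (
    "First work out your own solution to the problem below. Then compare "
    "your solution to the student's solution, and only after that decide "
    "whether the student's solution is correct.\n"
    "Problem: What is 17 + 25?\n"
    "Student's solution: 17 + 25 = 41"
)
```

The second template deliberately withholds the verdict request until the model has produced its own derivation, which is the point of the tactic.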

About Hallucinations

A hallucination is model output that sounds plausible but is factually wrong or fabricated.

Key papers:

  • Maynez, J., et al. (2020). “On Faithfulness and Factuality in Abstractive Summarization.” ACL. [论文链接]
  • Zhang, Y., et al. (2023). “Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models.” arXiv. [论文链接]

Mitigation strategies:

  • Ask the model to find relevant information first and answer the question based on that information
  • Ask the model to cite the sources of its claims
  • Use a RAG system to supply reliable context
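The first strategy can be sketched as a grounding prompt in which the model must quote relevant evidence before answering. The document and question below are made-up examples:

```python
# Grounding sketch: gather evidence first, then answer only from it.

document = "The Model X blender has a 900 W motor and a 1.5 L jar."
question = "What is the motor wattage of the Model X blender?"

grounded_prompt = (
    "First find the sentences in the document below that are relevant to "
    "the question, and quote them. Then answer the question using only "
    "those quotes. If the document does not contain the answer, say "
    '"I don\'t know."\n'
    f"Document: ```{document}```\n"
    f"Question: {question}"
)
```

Giving the model an explicit "I don't know" escape hatch is part of the mitigation: without it, the model is pushed toward inventing an answer.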

2. Iterative Prompt Development

Prompt engineering is an iterative process: prompts must be tested and refined repeatedly.

Key methodology:

  • Zhou, Y., et al. (2022). "Large Language Models Are Human-Level Prompt Engineers." ICLR. [paper link] - APE (Automatic Prompt Engineer)

Iteration loop:

  • Try something (write an initial prompt)
  • Analyze where the result does not give what you want
  • Clarify the instructions and give the model more time to think
  • Refine the prompt against a batch of examples
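The loop above can be sketched in code. `call_model` is a stub standing in for a real LLM API call, and the length check is one illustrative failure criterion:

```python
# Sketch of the iterate-and-refine loop with a stubbed model call.

def call_model(prompt: str) -> str:
    # Placeholder: a real implementation would call an LLM API here.
    return "stub summary under 20 words"

def passes(output: str, max_words: int) -> bool:
    # One concrete failure mode to check for: the output is too long.
    return len(output.split()) <= max_words

prompt = "Summarize the review below."
examples = ["Review A ...", "Review B ..."]

for round_ in range(3):
    results = [call_model(f"{prompt}\n```{ex}```") for ex in examples]
    if all(passes(r, max_words=20) for r in results):
        break
    # Otherwise, clarify the instruction based on the observed failure.
    prompt += " Use at most 20 words."
```

The key point is that refinement is driven by a batch of examples, not a single anecdotal test.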

3. Summarizing

Text summarization is one of the most common NLP tasks, and LLMs perform very well at it.

Key papers:

  • See, A., et al. (2017). “Get To The Point: Summarization with Pointer-Generator Networks.” ACL. [论文链接]
  • Liu, Y., & Lapata, M. (2019). “Text Summarization with Pretrained Encoders.” EMNLP. [论文链接]

Key techniques:

  • Summarize with a word/sentence/character limit
  • Summarize with a focus on certain topics, such as shipping and delivery
  • Try "extract" instead of "summarize" when you want the literal information on a specific topic
  • Summarize multiple product reviews (multi-document summarization)
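The techniques above differ only in the instruction attached to the same input. A minimal sketch with an invented review:

```python
# Three summarization variants over the same review text.

review = ("Got this panda plush for my daughter. It arrived a day early, "
          "but it's a bit small for the price.")

# Variant 1: hard length limit.
limit_prompt = (
    "Summarize the review below in at most 20 words.\n"
    f"Review: ```{review}```"
)

# Variant 2: focus the summary on one topic.
focus_prompt = (
    "Summarize the review below, focusing on any aspects that mention "
    "shipping and delivery. At most 30 words.\n"
    f"Review: ```{review}```"
)

# Variant 3: extract rather than summarize.
extract_prompt = (
    "From the review below, extract only the information relevant to "
    "shipping and delivery.\n"
    f"Review: ```{review}```"
)
```

For multi-document summarization, the same templates are applied to each review in a loop and the results collected.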

4. Inferring

LLMs are very good at extracting specific information from a piece of text, turning unstructured text into structured data.

Key paper:

  • Radford, A., et al. (2019). "Language Models are Unsupervised Multitask Learners." [paper link] - the GPT-2 paper, which demonstrated zero-shot capabilities

Applications:

  • Sentiment: positive or negative (sentiment analysis)

    Key paper: Socher, R., et al. (2013). "Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank." EMNLP.

  • Identify types of emotions: happy, grateful, angry, etc. (emotion classification)

    Key paper: Mohammad, S. M., & Turney, P. D. (2013). "Crowdsourcing a Word-Emotion Association Lexicon." Computational Intelligence.

  • Identify a specific emotion, such as 'Is the writer expressing anger?'

  • Extract product and company names from customer reviews (named entity recognition)

    Key paper: Nadeau, D., & Sekine, S. (2007). "A survey of named entity recognition and classification." Lingvisticae Investigationes.

  • Infer topics and ask the LLM to answer in JSON format (topic inference)

    Key paper: Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). "Latent Dirichlet Allocation." JMLR. - the original LDA topic-model paper

  • Build a news alert for certain topics: zero-shot learning

    Key papers:

    • Radford, A., et al. (2021). "Learning Transferable Visual Models From Natural Language Supervision." ICML. - the CLIP paper [paper link]
    • Xian, Y., et al. (2017). “Zero-Shot Learning - A Comprehensive Evaluation of the Good, the Bad and the Ugly.” TPAMI. [论文链接]
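Several of these inference tasks can be combined into one prompt that returns a single JSON object. The key names and the review below are illustrative:

```python
# One prompt performing sentiment, emotion, anger, and entity inference.

review = "I love my new lamp from Lumina; fast shipping and great support!"

infer_prompt = (
    "Identify the following items from the review text:\n"
    "- Sentiment (positive or negative)\n"
    "- A list of emotions the writer is expressing\n"
    "- Is the writer expressing anger? (true or false)\n"
    "- Item purchased by the reviewer\n"
    "- Company that made the item\n"
    'Format your response as a JSON object with keys "sentiment", '
    '"emotions", "anger", "item", "brand".\n'
    f"Review: ```{review}```"
)
```

Packing several inferences into one call is cheaper than one call per task, and the JSON keys make the result directly machine-readable.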

5. Transforming

LLMs excel at text transformation tasks, including language translation, style transfer, and format conversion.

Key papers:

  • Vaswani, A., et al. (2017). "Attention Is All You Need." NeurIPS. [paper link] - the original Transformer paper, the foundation of modern NLP

Transformation types:

  • Language translation

    Key paper: Wu, Y., et al. (2016). "Google's Neural Machine Translation System." arXiv. [paper link]

  • Tone: informal to formal

  • Format conversion: JSON to HTML

  • Spellcheck/grammar check, or rewriting in a given style such as APA

    Key paper: Yuan, Z., et al. (2021). "Synthesizing Coherent Story with Generative Pre-trained Transformer." IJCAI.
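Each transformation type maps to a one-line instruction over the input. A minimal sketch with invented inputs:

```python
# One illustrative prompt per transformation type.

data = {"employees": [{"name": "Shyam", "email": "shyam@example.com"}]}

translate_prompt = (
    "Translate the following English text to French: "
    "```Hi, I would like to order a blender.```"
)
tone_prompt = (
    "Translate the following from slang to a formal business letter: "
    "```Dude, check out the spec on this standing lamp.```"
)
format_prompt = (
    "Translate the following Python dict to an HTML table with column "
    f"headers:\n{data}"
)
proof_prompt = (
    "Proofread and correct the following text, following APA style: "
    "```Their going to the store tomorrow.```"
)
```

Note that "translate" is used loosely here: the same verb covers language, tone, and format changes, which is part of why LLMs handle all three uniformly.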

6. Expanding

Expansion turns a short input into a longer, more detailed output.

Key paper:

  • Fan, A., et al. (2018). "Hierarchical Neural Story Generation." ACL. [paper link]

Key techniques:

  • Customize the automated reply to a customer email
  • Remind the model to use details from the customer's email
  • Use temperature to control the output: more creative or more stable
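The first two points can be sketched as a reply-generation prompt. The sentiment label and the review are invented examples:

```python
# Customized customer-service reply prompt, conditioned on sentiment.

sentiment = "negative"
review = "The blender's motor died after a month, and the warranty had expired."

reply_prompt = (
    "You are a customer service AI assistant. Write an email reply to the "
    f"customer review below. The review sentiment is {sentiment}; if it is "
    "negative, apologize and suggest that the customer reach out to "
    "customer service. Make sure to use specific details from the review. "
    "Sign the email as 'AI customer agent'.\n"
    f"Review: ```{review}```"
)
```

Embedding the review verbatim and explicitly instructing the model to use its details is what keeps the generated reply specific rather than generic.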

The Temperature Parameter

Temperature is the key parameter controlling the randomness of LLM output; the concept originates in statistical mechanics and simulated annealing.

Key papers:

  • Ackley, D. H., Hinton, G. E., & Sejnowski, T. J. (1985). "A Learning Algorithm for Boltzmann Machines." Cognitive Science. - theoretical basis of simulated annealing
  • Ficler, J., & Goldberg, Y. (2017). "Controlling Linguistic Style Aspects in Neural Language Generation." EMNLP. [paper link]
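Mechanically, temperature divides the logits before the softmax: T < 1 sharpens the distribution toward the top token, T > 1 flattens it toward uniform. A minimal sketch in pure Python (the logit values are arbitrary):

```python
import math

def softmax_with_temperature(logits, temperature):
    # Scale logits by 1/T, then apply a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max to avoid overflow in exp()
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
cold = softmax_with_temperature(logits, 0.2)  # near-greedy: mass piles on the top token
hot = softmax_with_temperature(logits, 2.0)   # flatter: closer to uniform sampling
```

At temperature 0 (in practice, greedy decoding) the model always picks the highest-logit token, which is why low temperatures give stable output and high temperatures give creative but less predictable output.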

7. Chatbot

Chatbots are the most direct application of LLMs.

Key papers:

  • Vinyals, O., & Le, Q. (2015). "A Neural Conversational Model." ICML Deep Learning Workshop. [paper link]
  • Zhang, Y., et al. (2020). "DIALOGPT: Large-Scale Generative Pre-training for Conversational Response Generation." ACL. [paper link]

Three Roles

Modern chat systems typically use a three-role message architecture:

  • System: defines the assistant's overall behavior and persona
  • User: the user's input
  • Assistant: the model's replies

The model gives the system role the highest priority, so the system message can be used to steer the chatbot toward specific tasks.

Example:

```python
messages = [
    {'role': 'system', 'content': 'You are an assistant that speaks like Shakespeare.'},
    {'role': 'user', 'content': 'tell me a joke'},
    {'role': 'assistant', 'content': 'Why did the chicken cross the road'},
    {'role': 'user', 'content': "I don't know"},
]
```

Context

Context: to generate its next response, the chatbot needs the full message history across all three roles. Managing this context, that is, making effective use of the conversation history within the model's limits, is a core technical challenge in chatbot systems.

Key paper:

  • Bae, S., et al. (2022). "Keep Me Updated! Memory Management in Long-term Conversations." EMNLP. [paper link]
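A minimal sketch of context accumulation: every turn is appended to the message list, so each new request carries the full history. `stub_model` stands in for a real chat API call:

```python
# Context accumulation for a multi-turn chat; the model call is stubbed.

def stub_model(messages):
    # Placeholder for a real API call; just echoes the last user turn.
    return f"(reply to: {messages[-1]['content']})"

context = [
    {'role': 'system', 'content': 'You are a friendly chatbot.'},
]

def ask(context, user_text):
    # Append the user turn, get a reply, append the assistant turn.
    context.append({'role': 'user', 'content': user_text})
    reply = stub_model(context)
    context.append({'role': 'assistant', 'content': reply})
    return reply

ask(context, "Hi, my name is Isa.")
ask(context, "Can you remind me what my name is?")
```

By the second call, `context` already contains the earlier exchange; that stored history, not any internal memory of the model, is what lets a real chatbot recall the name.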