
Code

import instructor
from openai import OpenAI
from pydantic import BaseModel, Field, field_validator, ValidationInfo

# Initialize the OpenAI client with Instructor
client = instructor.from_openai(
    OpenAI(
        api_key="your api key",
        base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
    )
)


class Label(BaseModel):
    chunk_id: str = Field(description="The unique identifier of the text chunk")
    chain_of_thought: str = Field(
        description="The reasoning process used to evaluate the relevance"
    )
    relevancy: int = Field(
        description="Relevancy score from 0 to 10, where 10 is most relevant",
        ge=0,
        le=10,
    )

    @field_validator("chunk_id")
    @classmethod
    def validate_chunk_id(cls, v: str, info: ValidationInfo) -> str:
        # The validation context carries the chunks passed to rerank_results
        context = info.context or {}
        chunks = context.get("chunks", [])
        if v not in [chunk["id"] for chunk in chunks]:
            raise ValueError(
                f"Chunk with id {v} not found, must be one of {[chunk['id'] for chunk in chunks]}"
            )
        return v


class RerankedResults(BaseModel):
    labels: list[Label] = Field(description="List of labeled and ranked chunks")

    # Named sort_labels so that it does not shadow BaseModel.model_validate
    @field_validator("labels")
    @classmethod
    def sort_labels(cls, v: list[Label]) -> list[Label]:
        return sorted(v, key=lambda x: x.relevancy, reverse=True)


def rerank_results(query: str, chunks: list[dict]) -> RerankedResults:
    return client.chat.completions.create(
        model="qwen-turbo",
        response_model=RerankedResults,
        messages=[
            {
                "role": "system",
                "content": """
                You are an expert search result ranker. Your task is to evaluate the relevance of each text chunk to the given query and assign a relevancy score.

                For each chunk:
                1. Analyze its content in relation to the query.
                2. Provide a chain of thought explaining your reasoning.
                3. Assign a relevancy score from 0 to 10, where 10 is most relevant.

                Be objective and consistent in your evaluations.
                """,
            },
            {
                "role": "user",
                "content": """
                <query>{{ query }}</query>

                <chunks_to_rank>
                {% for chunk in chunks %}
                <chunk chunk_id="{{ chunk.id }}">
                    {{ chunk.text }}
                </chunk>
                {% endfor %}
                </chunks_to_rank>

                Please provide a RerankedResults object with a Label for each chunk.
                """,
            },
        ],
        # Used both to render the Jinja templates above and as validation context
        context={"query": query, "chunks": chunks},
    )

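Code explanation

1. Imports and initialization

import instructor
from openai import OpenAI
from pydantic import BaseModel, Field, field_validator, ValidationInfo

client = instructor.from_openai(OpenAI(...))

instructor wraps the OpenAI client so that completions can be returned as structured objects
Pydantic handles data validation and serialization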

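2. The Label class

class Label(BaseModel):
    chunk_id: str = Field(...)
    chain_of_thought: str = Field(...)
    relevancy: int = Field(..., ge=0, le=10)

Defines the label model for a text chunk:

chunk_id: the unique identifier of the text chunk
chain_of_thought: the reasoning behind the relevance assessment
relevancy: a relevance score from 0 to 10

It includes one validator:

@field_validator("chunk_id")
@classmethod
def validate_chunk_id(cls, v: str, info: ValidationInfo) -> str:

It ensures that chunk_id exists in the list of input text chunks, so the model cannot return an id that was never supplied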
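The check above can be tried without calling any model. The following is a minimal sketch, assuming Pydantic v2, in which the context is passed directly to model_validate; the trimmed-down Label class and the sample ids are illustrative only.

```python
from pydantic import BaseModel, ValidationError, ValidationInfo, field_validator


class Label(BaseModel):
    chunk_id: str

    @field_validator("chunk_id")
    @classmethod
    def validate_chunk_id(cls, v: str, info: ValidationInfo) -> str:
        # info.context is whatever was passed as context= to model_validate
        chunks = (info.context or {}).get("chunks", [])
        known_ids = [chunk["id"] for chunk in chunks]
        if v not in known_ids:
            raise ValueError(f"Chunk with id {v} not found, must be one of {known_ids}")
        return v


# Illustrative context; in the article it is built from the real chunks
context = {"chunks": [{"id": "a1"}, {"id": "b2"}]}

# A known id passes validation
ok = Label.model_validate({"chunk_id": "a1"}, context=context)

# An id that is not among the chunks is rejected
try:
    Label.model_validate({"chunk_id": "zz"}, context=context)
    rejected = False
except ValidationError:
    rejected = True

print(ok.chunk_id, rejected)
```

When the same validator runs inside instructor, a failure like the second one is what gives the library a concrete error message to work with.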

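3. The RerankedResults class

class RerankedResults(BaseModel):
    labels: list[Label]

A container class that holds all the labels
It includes a validator that sorts the results by relevancy score in descending order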
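The sorting behavior can be checked in isolation. This is a small sketch, assuming Pydantic v2; the reduced Label fields, the validator name sort_labels, and the sample scores are illustrative choices rather than part of the original code.

```python
from pydantic import BaseModel, field_validator


class Label(BaseModel):
    chunk_id: str
    relevancy: int


class RerankedResults(BaseModel):
    labels: list[Label]

    @field_validator("labels")
    @classmethod
    def sort_labels(cls, v: list[Label]) -> list[Label]:
        # Runs after the items are validated, so v already holds Label objects
        return sorted(v, key=lambda x: x.relevancy, reverse=True)


results = RerankedResults(
    labels=[
        {"chunk_id": "a", "relevancy": 2},
        {"chunk_id": "b", "relevancy": 10},
        {"chunk_id": "c", "relevancy": 7},
    ]
)
order = [label.chunk_id for label in results.labels]
print(order)  # ['b', 'c', 'a']
```

Because the sort happens during validation, callers always receive the labels in ranked order regardless of the order in which the model emitted them.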

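4. The reranking function

def rerank_results(query: str, chunks: list[dict]) -> RerankedResults:

Core functionality:

Takes a query and a list of text chunks
Uses an AI model to score the relevance of each chunk
Returns the results sorted by score

System prompt:

Casts the model as an expert search result ranker
Specifies the evaluation criteria and scoring rules

User prompt template:

Uses Jinja2 template syntax
Inserts the query and text chunks dynamically
Formats them as structured XML

Main uses of this system:

Intelligent relevance ranking of text
A transparent reasoning process for each score
Consistent and verifiable results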
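Once the labels are ranked, a caller will usually want to map them back to the chunk text and keep only the useful ones. The helper below is a hypothetical sketch of such a step; top_chunks, its min_relevancy and k parameters, and the dataclass stand-in for Label are assumptions made for illustration and do not come from instructor.

```python
from dataclasses import dataclass


@dataclass
class Label:
    # Stand-in for the Pydantic Label model, reduced to the fields used here
    chunk_id: str
    relevancy: int


def top_chunks(labels, chunks, min_relevancy=5, k=3):
    """Return the text of up to k chunks scoring at least min_relevancy."""
    text_by_id = {chunk["id"]: chunk["text"] for chunk in chunks}
    kept = [label for label in labels if label.relevancy >= min_relevancy]
    kept.sort(key=lambda label: label.relevancy, reverse=True)
    return [text_by_id[label.chunk_id] for label in kept[:k]]


chunks = [
    {"id": "a", "text": "Exercise improves cardiovascular health."},
    {"id": "b", "text": "Gym prices vary."},
    {"id": "c", "text": "Exercise boosts mood."},
]
labels = [Label("a", 10), Label("b", 0), Label("c", 8)]
selected = top_chunks(labels, chunks)
print(selected)
```

Building the id-to-text dictionary once avoids rescanning the chunk list for every label.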

Example

def main():
    # Sample query and chunks
    query = "What are the health benefits of regular exercise?"
    chunks = [
        {
            "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
            "text": "Regular exercise can improve cardiovascular health and reduce the risk of heart disease.",
        },
        {
            "id": "b2c3d4e5-f6g7-8901-bcde-fg2345678901",
            "text": "The price of gym memberships varies widely depending on location and facilities.",
        },
        {
            "id": "c3d4e5f6-g7h8-9012-cdef-gh3456789012",
            "text": "Exercise has been shown to boost mood and reduce symptoms of depression and anxiety.",
        },
        {
            "id": "d4e5f6g7-h8i9-0123-defg-hi4567890123",
            "text": "Proper nutrition is essential for maintaining a healthy lifestyle.",
        },
        {
            "id": "e5f6g7h8-i9j0-1234-efgh-ij5678901234",
            "text": "Strength training can increase muscle mass and improve bone density, especially important as we age.",
        },
    ]

    # Rerank the results
    results = rerank_results(query, chunks)

    # Print the reranked results
    print("Reranked results:")
    for label in results.labels:
        print(f"Chunk {label.chunk_id} (Relevancy: {label.relevancy}):")
        print(f"Text: {next(chunk['text'] for chunk in chunks if chunk['id'] == label.chunk_id)}")
        print(f"Reasoning: {label.chain_of_thought}")
        print()


main()

Reranked results:
Chunk a1b2c3d4-e5f6-7890-abcd-ef1234567890 (Relevancy: 10):
Text: Regular exercise can improve cardiovascular health and reduce the risk of heart disease.
Reasoning: This chunk directly discusses the health benefits of exercise, specifically improving cardiovascular health and reducing heart disease risk.

Chunk c3d4e5f6-g7h8-9012-cdef-gh3456789012 (Relevancy: 8):
Text: Exercise has been shown to boost mood and reduce symptoms of depression and anxiety.
Reasoning: This chunk talks about how exercise can boost mood and reduce symptoms of depression and anxiety, which are health benefits.

Chunk e5f6g7h8-i9j0-1234-efgh-ij5678901234 (Relevancy: 7):
Text: Strength training can increase muscle mass and improve bone density, especially important as we age.
Reasoning: Strength training's effects on muscle mass and bone density are health benefits associated with exercise.

Chunk d4e5f6g7-h8i9-0123-defg-hi4567890123 (Relevancy: 2):
Text: Proper nutrition is essential for maintaining a healthy lifestyle.
Reasoning: While nutrition is important, this chunk does not discuss the health benefits of exercise itself.

Chunk b2c3d4e5-f6g7-8901-bcde-fg2345678901 (Relevancy: 0):
Text: The price of gym memberships varies widely depending on location and facilities.
Reasoning: This chunk is about gym membership prices, which is unrelated to the health benefits of exercise.

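A similar example

import instructor
from openai import OpenAI
from pydantic import BaseModel, Field, field_validator, ValidationInfo

# Initialize the OpenAI client
client = instructor.from_openai(
    OpenAI(
        api_key="your api key",
        base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
    )
)


class ReviewLabel(BaseModel):
    review_id: str = Field(description="The unique identifier of the review")
    chain_of_thought: str = Field(
        description="The reasoning process used to evaluate the relevance"
    )
    relevancy: int = Field(
        description="Relevancy score from 0 to 10, where 10 is most relevant",
        ge=0,
        le=10,
    )

    @field_validator("review_id")
    @classmethod
    def validate_review_id(cls, v: str, info: ValidationInfo) -> str:
        # The validation context carries the reviews passed to rank_reviews
        context = info.context or {}
        reviews = context.get("reviews", [])
        if v not in [review["id"] for review in reviews]:
            raise ValueError(
                f"Review with id {v} not found, must be one of {[review['id'] for review in reviews]}"
            )
        return v


class RankedReviews(BaseModel):
    labels: list[ReviewLabel] = Field(description="List of labeled and ranked reviews")

    # Named sort_labels so that it does not shadow BaseModel.model_validate
    @field_validator("labels")
    @classmethod
    def sort_labels(cls, v: list[ReviewLabel]) -> list[ReviewLabel]:
        return sorted(v, key=lambda x: x.relevancy, reverse=True)


def rank_reviews(movie_title: str, reviews: list[dict]) -> RankedReviews:
    return client.chat.completions.create(
        model="qwen-turbo",
        response_model=RankedReviews,
        messages=[
            {
                "role": "system",
                "content": """
                You are an expert movie review analyst. Your task is to evaluate how relevant each review is to the given movie and assign a relevancy score.

                For each review:
                1. Analyze how closely the review's content relates to the movie.
                2. Provide a chain of thought explaining your score.
                3. Assign a relevancy score from 0 to 10, where 10 is most relevant.

                Be objective and consistent.
                """,
            },
            {
                "role": "user",
                "content": """
                <movie>{{ movie_title }}</movie>

                <reviews_to_rank>
                {% for review in reviews %}
                <review review_id="{{ review.id }}">
                    {{ review.text }}
                </review>
                {% endfor %}
                </reviews_to_rank>

                Please provide a RankedReviews object with a label for each review.
                """,
            },
        ],
        context={"movie_title": movie_title, "reviews": reviews},
    )


def main():
    # Sample data
    movie_title = "Titanic"
    reviews = [
        {
            "id": "rev001",
            "text": "This film perfectly portrays the tragedy of the Titanic, and the acting is deeply moving.",
        },
        {
            "id": "rev002",
            "text": "Movie ticket prices have gone up a lot lately; going to the movies keeps getting more expensive.",
        },
        {
            "id": "rev003",
            "text": "The love story of Jack and Rose is unforgettable, and the classic scenes are still moving.",
        },
        {
            "id": "rev004",
            "text": "The popcorn at this cinema is delicious; I recommend trying it.",
        },
        {
            "id": "rev005",
            "text": "The film's special effects and set reconstruction are excellent, capturing the luxury of that era.",
        },
    ]

    # Rank the reviews
    results = rank_reviews(movie_title, reviews)

    # Print the ranked results
    print("Ranked reviews:")
    for label in results.labels:
        print(f"Review {label.review_id} (Relevancy: {label.relevancy}):")
        print(f"Text: {next(review['text'] for review in reviews if review['id'] == label.review_id)}")
        print(f"Reasoning: {label.chain_of_thought}")
        print()


main()

Ranked reviews:
Review rev001 (Relevancy: 10):
Text: This film perfectly portrays the tragedy of the Titanic, and the acting is deeply moving.
Reasoning: The review mentions the film Titanic directly and praises its portrayal of the tragedy and the acting, so it is clearly highly relevant to the movie.

Review rev003 (Relevancy: 9):
Text: The love story of Jack and Rose is unforgettable, and the classic scenes are still moving.
Reasoning: The review focuses on the love story and classic scenes in the film, which are closely tied to the themes of Titanic.

Review rev005 (Relevancy: 8):
Text: The film's special effects and set reconstruction are excellent, capturing the luxury of that era.
Reasoning: The review praises the film's special effects and set reconstruction, which relate directly to the content of Titanic.

Review rev002 (Relevancy: 2):
Text: Movie ticket prices have gone up a lot lately; going to the movies keeps getting more expensive.
Reasoning: The review discusses rising ticket prices and has nothing to do with the film Titanic specifically, so its relevance is low.

Review rev004 (Relevancy: 1):
Text: The popcorn at this cinema is delicious; I recommend trying it.
Reasoning: The review is about the cinema's popcorn and has no direct connection to the film itself, so its relevance is very low.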

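Jinja template syntax used in the examples

The examples rely on these core concepts of Jinja template syntax:

1. Variables

{{ variable_name }}

Inserts the value of a variable into the template, for example:

"Hello, {{ username }}"  # if username = "Xiao Ming", this renders as "Hello, Xiao Ming"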

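Variable substitution can be tried directly with the jinja2 package. This is a minimal sketch that uses jinja2's Template class on its own, outside instructor; the username value is just sample data.

```python
from jinja2 import Template

# {{ username }} is replaced by the keyword argument passed to render()
template = Template("Hello, {{ username }}")
greeting = template.render(username="Xiao Ming")
print(greeting)  # Hello, Xiao Ming
```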
2. Control structures

Conditionals

{% if condition %}
content 1
{% else %}
content 2
{% endif %}

Loops

{% for item in items %}
{{ item }}
{% endfor %}
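Both control structures can be combined in one template. The sketch below again uses jinja2 directly; the chunk data is made up, and the template is kept on single logical lines so that the rendered whitespace is easy to predict.

```python
from jinja2 import Template

template = Template(
    "{% if chunks %}"
    "{% for chunk in chunks %}[{{ chunk.id }}] {{ chunk.text }}\n{% endfor %}"
    "{% else %}no chunks{% endif %}"
)

# chunk.id falls back to chunk["id"] when chunk is a dict
with_chunks = template.render(
    chunks=[
        {"id": "a1", "text": "first"},
        {"id": "b2", "text": "second"},
    ]
)
# An empty list is falsy, so the else branch is rendered
without_chunks = template.render(chunks=[])

print(with_chunks)
print(without_chunks)
```

The dotted lookup on dictionaries is what lets the prompts in this article write chunk.id and chunk.text even though each chunk is a plain dict.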

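Main advantages of Jinja templates:

Code reuse
Separation of logic and presentation
Dynamic content generation
Safety (autoescaping, which must be enabled explicitly)
Flexible extensibility

These features make Jinja2 one of the most popular template engines in the Python ecosystem.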
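One caveat about the autoescaping point: in jinja2 it is off unless the environment is created with autoescape enabled. The sketch below contrasts the two settings; the payload string is just sample input.

```python
from jinja2 import Environment

payload = "<script>alert(1)</script>"

# Default environment: autoescaping is off, so the value is inserted verbatim
plain = Environment().from_string("{{ value }}").render(value=payload)

# With autoescape=True, HTML-special characters are escaped
escaped = Environment(autoescape=True).from_string("{{ value }}").render(value=payload)

print(plain)
print(escaped)
```

For prompt templates like the ones in this article, leaving the text unescaped is normally what is wanted, since the output is sent to a model rather than rendered as HTML.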

Example 1:

from instructor.templating import handle_templating
from instructor.mode import Mode

# Example input arguments
kwargs = {
    "messages": [
        {"role": "system", "content": "You are a professional {{ domain }} assistant"},
        {"role": "user", "content": "Please analyze questions about {{ topic }}"},
    ]
}
mode = Mode.TOOLS  # use the OpenAI message format
context = {"domain": "medical", "topic": "heart disease prevention"}

# Call the function
result = handle_templating(kwargs, mode, context)

# Print the result
print(result)

{'messages': [{'role': 'system', 'content': 'You are a professional medical assistant'}, {'role': 'user', 'content': 'Please analyze questions about heart disease prevention'}]}
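To make the transformation less opaque, here is a rough approximation of what the call above appears to do for OpenAI-style messages: render each string content as a Jinja template and dedent the result. The helper render_messages is hypothetical and is not instructor's actual implementation; it only assumes the jinja2 package.

```python
from textwrap import dedent

from jinja2.sandbox import SandboxedEnvironment


def render_messages(messages, context):
    """Render each message's string content as a Jinja template (hypothetical helper)."""
    env = SandboxedEnvironment()
    rendered = []
    for message in messages:
        content = message["content"]
        if isinstance(content, str):
            # dedent strips the common indentation left by triple-quoted prompts
            content = dedent(env.from_string(content).render(**context))
        rendered.append({**message, "content": content})
    return rendered


messages = [
    {"role": "system", "content": "You are a professional {{ domain }} assistant"},
    {"role": "user", "content": "Please analyze questions about {{ topic }}"},
]
result = render_messages(
    messages, {"domain": "medical", "topic": "heart disease prevention"}
)
print(result)
```

A sandboxed environment is a reasonable choice here because prompt templates may contain text that originates from users.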

Example 2:

query = "What are the health benefits of regular exercise?"
chunks = [
    {
        "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
        "text": "Regular exercise can improve cardiovascular health and reduce the risk of heart disease.",
    },
    {
        "id": "b2c3d4e5-f6g7-8901-bcde-fg2345678901",
        "text": "The price of gym memberships varies widely depending on location and facilities.",
    },
    {
        "id": "c3d4e5f6-g7h8-9012-cdef-gh3456789012",
        "text": "Exercise has been shown to boost mood and reduce symptoms of depression and anxiety.",
    },
    {
        "id": "d4e5f6g7-h8i9-0123-defg-hi4567890123",
        "text": "Proper nutrition is essential for maintaining a healthy lifestyle.",
    },
    {
        "id": "e5f6g7h8-i9j0-1234-efgh-ij5678901234",
        "text": "Strength training can increase muscle mass and improve bone density, especially important as we age.",
    },
]
kwargs = {
    "messages": [
        {
            "role": "system",
            "content": """
            You are an expert search result ranker. Your task is to evaluate the relevance of each text chunk to the given query and assign a relevancy score.

            For each chunk:
            1. Analyze its content in relation to the query.
            2. Provide a chain of thought explaining your reasoning.
            3. Assign a relevancy score from 0 to 10, where 10 is most relevant.

            Be objective and consistent in your evaluations.
            """,
        },
        {
            "role": "user",
            "content": """
            <query>{{ query }}</query>

            <chunks_to_rank>
            {% for chunk in chunks %}
            <chunk chunk_id="{{ chunk.id }}">
                {{ chunk.text }}
            </chunk>
            {% endfor %}
            </chunks_to_rank>

            Please provide a RerankedResults object with a Label for each chunk.
            """,
        },
    ]
}
context = {"query": query, "chunks": chunks}
mode = Mode.TOOLS  # use the OpenAI message format

# Call the function
handle_templating(kwargs, mode, context)

{'messages': [{'role': 'system','content': '\nYou are an expert search result ranker. Your task is to evaluate the relevance of each text chunk to the given query and assign a relevancy score.\n\nFor each chunk:\n1. Analyze its content in relation to the query.\n2. Provide a chain of thought explaining your reasoning.\n3. Assign a relevancy score from 0 to 10, where 10 is most relevant.\n\nBe objective and consistent in your evaluations.\n'},{'role': 'user','content': '\n<query>What are the health benefits of regular exercise?</query>\n\n<chunks_to_rank>\n\n<chunk chunk_id="a1b2c3d4-e5f6-7890-abcd-ef1234567890">\n    Regular exercise can improve cardiovascular health and reduce the risk of heart disease.\n</chunk>\n\n<chunk chunk_id="b2c3d4e5-f6g7-8901-bcde-fg2345678901">\n    The price of gym memberships varies widely depending on location and facilities.\n</chunk>\n\n<chunk chunk_id="c3d4e5f6-g7h8-9012-cdef-gh3456789012">\n    Exercise has been shown to boost mood and reduce symptoms of depression and anxiety.\n</chunk>\n\n<chunk chunk_id="d4e5f6g7-h8i9-0123-defg-hi4567890123">\n    Proper nutrition is essential for maintaining a healthy lifestyle.\n</chunk>\n\n<chunk chunk_id="e5f6g7h8-i9j0-1234-efgh-ij5678901234">\n    Strength training can increase muscle mass and improve bone density, especially important as we age.\n</chunk>\n\n</chunks_to_rank>\n\nPlease provide a RerankedResults object with a Label for each chunk.\n'}]}

Reference: https://github.com/instructor-ai/instructor/tree/main