[LLM] How to Use RAG in LlamaIndex - 育师

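What is LlamaIndex?

LlamaIndex is a data framework that helps LLM-based applications ingest, structure, and access private or domain-specific data.

How do you use LlamaIndex?

Basic usage is a five-step process that takes us from raw, unstructured data to an LLM generating content based on that data.

1. Load the documents
2. Parse the documents into LlamaIndex Node objects
3. Build the index
4. Query the index
5. Parse the response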
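To make the five steps concrete before bringing in the library, here is a minimal pure-Python sketch of the same flow. It is not LlamaIndex code: the function names (`load_documents`, `parse_nodes`, `build_index`, `query_index`, `parse_response`) are hypothetical, and word overlap stands in for real embeddings.

```python
import re
from collections import Counter


def load_documents(raw_texts):
    """Step 1: wrap raw strings as simple document dicts."""
    return [{"text": t, "metadata": {"doc_id": i}} for i, t in enumerate(raw_texts)]


def parse_nodes(documents):
    """Step 2: split each document into sentence-level nodes."""
    nodes = []
    for doc in documents:
        for sentence in re.split(r"(?<=[.!?])\s+", doc["text"].strip()):
            if sentence:
                nodes.append({"text": sentence, "metadata": dict(doc["metadata"])})
    return nodes


def tokenize(text):
    """Lowercase the text and keep only alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(nodes):
    """Step 3: store a bag-of-words vector next to each node."""
    return [(Counter(tokenize(n["text"])), n) for n in nodes]


def query_index(index, question, top_k=1):
    """Step 4: rank nodes by word overlap with the question."""
    q = Counter(tokenize(question))
    scored = [(sum((q & vec).values()), node) for vec, node in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def parse_response(results):
    """Step 5: pull the text and source metadata out of the results."""
    return [(node["text"], node["metadata"]["doc_id"]) for _, node in results]


docs = load_documents([
    "Face detection finds faces. Face recognition identifies people.",
    "Vector indexes store embeddings.",
])
index = build_index(parse_nodes(docs))
answer = parse_response(query_index(index, "What does face recognition identify?"))
print(answer)
```

In the real pipeline, an embedding model replaces the word counts and an LLM turns the retrieved text into the final answer.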

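Install the required packages

```python
# Run in a notebook cell (Colab/Jupyter).
# Assumption: the imports in this article use the pre-0.10 llama-index package layout,
# so the version is pinned here; later releases moved these modules under llama_index.core.
!pip install -q "llama-index<0.10"
!pip install -q openai
!pip install pypdf
!pip install docx2txt
!pip install -qU llama-cpp-python
!pip install transformers
!pip install accelerate
```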

Import the required dependencies

```python
import os
import openai
from getpass import getpass

import logging
import sys
from pprint import pprint

# Send log output to stdout so it shows up in the notebook
logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

from llama_index import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    load_index_from_storage,
    StorageContext,
    ServiceContext,
    Document,
)
from llama_index.llms import OpenAI, HuggingFaceLLM
from llama_index.prompts import PromptTemplate
from llama_index.text_splitter import SentenceSplitter
from llama_index.embeddings import OpenAIEmbedding, HuggingFaceEmbedding
from llama_index.schema import MetadataMode
from llama_index.postprocessor import MetadataReplacementPostProcessor
```

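What is a `Document`?

A Document is a container that holds data from a variety of sources, such as a PDF, API output, or data retrieved from a database.

```python
# Load every file found in ./Data/ as Document objects
documents = SimpleDirectoryReader('./Data/').load_data()
print(len(documents))
pprint(documents)
```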

After loading, the PDF is represented as a list of 12 Document objects.

```python
documents[0].get_content()
```

Response:

```text
Face Recognition System Using Python This article was published as a part of the Data Science Blogathon. Introduction Face recognition is different from face detection. In face detection, we had only detected the location of human faces, and we recognized the identity of faces in the face recognition task. In this article, we are going to build a face recognition system using python with the help of face recognition library . There are many algorithms available in the market for face recognition. This broad computer vision challenge is detecting faces from videos and pictures. Many applications can be built on top of recognition systems. Many big companies are adopting recognition systems for their security and authentication purposes. Use Cases of Recognition Systems Face recognition systems are widely used in the modern era, and many new innovative systems are built on top of recognition systems. There are a few used cases : Finding Missing Person Identifying accounts on social media Recognizing Drivers in Cars School Attendance System Several methods and algorithms implement facial recognition systems depending on the performance and accuracy. Traditional Face Recognition Algorithm Traditional face recognition algorithms don’t meet modern-day’s facial recognition standards. They were designed to recognize faces using old conventional algorithms. OpenCV provides some traditional facial Recognition Algorithms. Eigenfaces Scale Invariant Feature Transform (SIFT) Fisher faces Local Binary Patterns Histograms (LBPH) COMPUTER VISION IMAGE ANALYSIS INTERMEDIATE PYTHON
```

```python
documents[0].metadata
```

Response:

```text
{'file_path': 'Data/chinahistory.txt',
 'file_name': 'chinahistory.txt',
 'file_type': 'text/plain',
 'file_size': 977274,
 'creation_date': '2023-12-18',
 'last_modified_date': '2023-12-05',
 'last_accessed_date': '2023-12-18'}
```

Set up the LLM

```python
from llama_index.llms import HuggingFaceLLM
from llama_index.prompts import PromptTemplate

llm = HuggingFaceLLM(
    model_name="HuggingFaceH4/zephyr-7b-beta",
    tokenizer_name="HuggingFaceH4/zephyr-7b-beta",
    # Optional prompt wrapper in Zephyr's chat format (left disabled here):
    # query_wrapper_prompt=PromptTemplate("<|system|>Please check if the following pieces of context has any mention of the keywords provided in the question.If not then say that you do not know the answer.Please do not make up your own answer.</s>\n<|user|>\nQuestion:{query_str}</s>\n<|assistant|>\n"),
    # query_wrapper_prompt=PromptTemplate(template),
    context_window=4096,
    max_new_tokens=512,
    model_kwargs={'trust_remote_code': True},
    generate_kwargs={"temperature": 0.0},
    device_map="auto",
)
```

Configure the embedding model

```python
from llama_index.embeddings import resolve_embed_model
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# Alternative: embed_model = resolve_embed_model("local:BAAI/bge-large-en-v1.5")
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-large-en-v1.5")
```

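What is a `Node` in LlamaIndex?

A Node object in LlamaIndex represents a "chunk" or portion of a source document.

This may be a chunk of text, an image, or another type of data. Like Documents, Nodes also carry metadata and relationship information linking them to other nodes.

In LlamaIndex, Nodes are first-class citizens.

This means you can define Nodes and all of their attributes directly.

Alternatively, you can use the NodeParser classes to parse source Documents into Nodes. By default, every node derived from a document inherits the same metadata. For example, the file_name field of a document is propagated to each of its nodes.

Depending on your use case and data structure, it is up to you whether to send whole Document objects to the index or to convert Documents into Node objects before indexing.

Sending whole Document objects to the index: this approach suits cases where each document should be kept as a single unit. It can work better when your documents are relatively short or when the context between different parts matters.
Converting Documents into Node objects before indexing: this approach is practical when your documents are long and you want to split them into smaller chunks (nodes) before indexing. It can work better when you want to retrieve specific parts of a document rather than the whole document.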
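The metadata inheritance described above can be sketched without the library. The classes `ToyDocument` and `ToyNode` and the helper `split_into_nodes` are hypothetical stand-ins for illustration only; they are not LlamaIndex's own classes, and the fixed-size chunking is an assumption chosen for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDocument:
    text: str
    metadata: dict = field(default_factory=dict)


@dataclass
class ToyNode:
    text: str
    metadata: dict = field(default_factory=dict)


def split_into_nodes(document, chunk_size=40):
    """Cut the text into fixed-size chunks; each chunk gets a copy of the document metadata."""
    return [
        ToyNode(text=document.text[i:i + chunk_size], metadata=dict(document.metadata))
        for i in range(0, len(document.text), chunk_size)
    ]


doc = ToyDocument(
    text="Face recognition is different from face detection. " * 3,
    metadata={"file_name": "face-recognition-system-using-python.pdf"},
)
toy_nodes = split_into_nodes(doc)
for node in toy_nodes:
    print(node.metadata["file_name"], repr(node.text))
```

Every chunk reports the same file_name, which is the behavior the paragraph above attributes to nodes derived from a document.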

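Node parsing and indexing (the sentence window approach)

The SentenceWindowNodeParser class parses documents into nodes (sentences) and captures a window of surrounding sentences for each node.

This is useful for context-aware text processing, where understanding the context around a sentence can provide valuable insight.

Node: a unit of text, here a single sentence.
Window: the range of sentences surrounding a given sentence. For example, if the window size is 3 and the current sentence is sentence 5, the window captures sentences 2 through 8.
Metadata: additional information attached to a node, such as the window of surrounding sentences.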
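The window arithmetic above can be checked with a small pure-Python sketch. The helper `build_sentence_windows` is hypothetical and only mimics the idea; it borrows the metadata key names `window` and `original_text` used later in this article, and it is not the library's implementation.

```python
def build_sentence_windows(sentences, window_size=3):
    """Create one node per sentence and store the surrounding sentences as metadata."""
    nodes = []
    for i, sentence in enumerate(sentences):
        start = max(0, i - window_size)                    # clamp at the first sentence
        end = min(len(sentences), i + window_size + 1)     # clamp at the last sentence
        nodes.append({
            "text": sentence,
            "metadata": {
                "original_text": sentence,
                "window": " ".join(sentences[start:end]),
            },
        })
    return nodes


sentences = [f"Sentence {n}." for n in range(1, 11)]
window_nodes = build_sentence_windows(sentences, window_size=3)

fifth = window_nodes[4]  # sentence 5 (lists are zero-indexed)
print(fifth["metadata"]["window"])
```

For sentence 5 the stored window runs from "Sentence 2." to "Sentence 8."; near the start or end of the document the window is simply shorter.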

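How it works

When we create a SentenceWindowNodeParser instance with the from_defaults method, using a custom_sentence_splitter (which splits text on "\n", "\n-", or "\n") together with the parameters window_size=3, include_prev_next_rel=True, and include_metadata=True, we set up a parser that processes documents as follows:

The text of each document is split into sentences using the custom splitter.
A node is generated for each sentence.
Each node contains metadata capturing the three sentences on either side of it.
In addition, each node references the sentences immediately before and after it.
Calling get_nodes_from_documents with a list of documents returns a list of these nodes, each representing one sentence and enriched with the specified metadata and relationships.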

```python
# Create the sentence window node parser with default settings
from llama_index.node_parser import SentenceWindowNodeParser, SimpleNodeParser

sentence_node_parser = SentenceWindowNodeParser.from_defaults(
    window_size=3,
    window_metadata_key="window",
    original_text_metadata_key="original_text",
)
# base_node_parser = SentenceSplitter(llm=llm)
base_node_parser = SimpleNodeParser()

# Parse the same documents with both parsers
nodes = sentence_node_parser.get_nodes_from_documents(documents)
base_nodes = base_node_parser.get_nodes_from_documents(documents)

print(f"SENTENCE NODES :\n {nodes[10]}")
print(f"BASE NODES :\n {base_nodes[10]}")
```

```text
SENTENCE NODES :
 Node ID: 8418b939-dc08-42a6-8ee1-821e46f7a2a1
Text: Traditional Face Recognition Algorithm Traditional face recognition algorithms don’t meet modern-day’s facial recognition standards.
BASE NODES :
 Node ID: 7a94495b-2f49-4cc4-8fd4-87f5fb0f645e
Text: Now let’s test the model prediction using text in different languages. def predict(text): x = cv.transform([text]).toarray() # converting text to bag of words model (Vector) lang = model.predict(x) # predicting the language lang = le.inverse_transform(lang) # finding the language corresponding the the predicted value print("The langauge is in",l...
```

```python
dict(nodes[10])  # embedding is None because the nodes have not been indexed yet
```

```text
{'id_': '8418b939-dc08-42a6-8ee1-821e46f7a2a1',
 'embedding': None,
 'metadata': {'window': 'Many big companies are adopting recognition systems for their security and authentication\npurposes.\n Use Cases of Recognition Systems\nFace recognition systems are widely used in the modern era, and many new innovative systems are built on\ntop of recognition systems.\n There are a few used cases :\nFinding Missing Person\nIdentifying accounts on social media\nRecognizing Drivers in Cars\nSchool Attendance System\nSeveral methods and algorithms implement facial recognition systems depending on the performance and\naccuracy.\n Traditional Face Recognition Algorithm\nTraditional face recognition algorithms don’t meet modern-day’s facial recognition standards. They were\ndesigned to recognize faces using old conventional algorithms.\n OpenCV provides some traditional facial Recognition Algorithms.\n',
  'original_text': 'Traditional Face Recognition Algorithm\nTraditional face recognition algorithms don’t meet modern-day’s facial recognition standards. ',
  'page_label': '1',
  'file_name': 'face-recognition-system-using-python.pdf',
  'file_path': 'Data/face-recognition-system-using-python.pdf',
  'file_type': 'application/pdf',
  'file_size': 465666,
  'creation_date': '2023-12-21',
  'last_modified_date': '2023-12-21',
  'last_accessed_date': '2023-12-21'},
 'excluded_embed_metadata_keys': ['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date', 'window', 'original_text'],
 'excluded_llm_metadata_keys': ['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date', 'window', '.......' }
```

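What is an `IndexNode` in `LlamaIndex`?

IndexNode is a node object used in LlamaIndex.

It represents a chunk of the original document stored in the index. An index is a data structure that allows fast retrieval of context relevant to a user query, which is essential for retrieval-augmented generation (RAG) use cases.

Fundamentally, IndexNode inherits its attributes from TextNode, which means it primarily represents text content.

What distinguishes an IndexNode is its index_id attribute. The index_id acts as a unique identifier or a reference to another object, allowing the node to point to other entities in the system.

This referencing ability adds a layer of connectivity and relationship information on top of the text content.

For example, in recursive retrieval and node references, smaller chunks (represented as IndexNode objects) can point to larger parent chunks. At query time the smaller chunks are retrieved, but the references to the larger chunks are followed. This provides more context for synthesis.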
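The small-chunk-to-parent pattern described above can be sketched in plain Python. `ToyIndexNode` and `retrieve_parent` are hypothetical names for illustration, the sample texts are made up from phrases in the article's PDF, and word overlap stands in for embedding similarity; none of this is the library's own recursive retriever.

```python
import re
from dataclasses import dataclass


@dataclass
class ToyIndexNode:
    text: str       # small chunk that is matched against the query
    index_id: str   # reference to the larger parent chunk


def words(text):
    """Lowercased set of alphabetic words in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve_parent(query, small_nodes, parent_chunks):
    """Match the query against small chunks, then follow index_id to the parent text."""
    best = max(small_nodes, key=lambda node: len(words(query) & words(node.text)))
    return best, parent_chunks[best.index_id]


parent_chunks = {
    "parent-0": "Face recognition systems are widely used. Use cases include school attendance systems.",
    "parent-1": "OpenCV provides some traditional algorithms. Examples are Eigenfaces and Fisher faces.",
}
small_nodes = [
    ToyIndexNode("Use cases include school attendance systems.", "parent-0"),
    ToyIndexNode("Examples are Eigenfaces and Fisher faces.", "parent-1"),
]

matched, parent_text = retrieve_parent("Fisher faces examples", small_nodes, parent_chunks)
print(matched.text)
print(parent_text)
```

The query matches the small chunk, but the text handed on for synthesis is the larger parent chunk it points to.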

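What is a ServiceContext in LlamaIndex?

A ServiceContext is a bundle of resources commonly used during the indexing and querying stages of a LlamaIndex pipeline or application.

```python
# sentence_node_parser is the SentenceWindowNodeParser created above
ctx_sentence = ServiceContext.from_defaults(
    llm=llm,
    embed_model=embed_model,
    node_parser=sentence_node_parser,
)

# base_node_parser is the plain SimpleNodeParser created above
ctx_base = ServiceContext.from_defaults(
    llm=llm,
    embed_model=embed_model,
    node_parser=base_node_parser,
)
```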

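What is a `VectorStoreIndex` in `LlamaIndex`?

In LlamaIndex, VectorStoreIndex is an index type that uses vector representations of text to retrieve relevant context efficiently.

It is built on top of a VectorStore, a data structure that stores vectors and supports fast nearest-neighbor search.

VectorStoreIndex takes node objects (for example TextNode or IndexNode), which represent chunks of the original documents.

It uses an embedding model (specified in the ServiceContext) to convert the text content of these nodes into vector representations. These vectors are then stored in the VectorStore.

At query time, VectorStoreIndex can quickly retrieve the nodes most relevant to a given query. It does so by converting the query into a vector with the same embedding model and then running a nearest-neighbor search in the VectorStore.

```python
# Build one index per set of nodes
sentence_index = VectorStoreIndex(nodes, service_context=ctx_sentence)
base_index = VectorStoreIndex(base_nodes, service_context=ctx_base)
```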
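To show what "nearest-neighbor search over vectors" means in the description above, here is a minimal sketch using brute-force cosine similarity. `ToyVectorStore` is a hypothetical class, the three-dimensional vectors are made-up stand-ins for real embeddings, and production vector stores typically use approximate search rather than this exhaustive scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    def __init__(self):
        self._rows = []  # list of (node_id, vector) pairs

    def add(self, node_id, vector):
        self._rows.append((node_id, vector))

    def query(self, query_vector, top_k=2):
        """Return the ids of the top_k stored vectors most similar to the query."""
        scored = [(cosine_similarity(query_vector, v), node_id) for node_id, v in self._rows]
        scored.sort(reverse=True)
        return [node_id for _, node_id in scored[:top_k]]


store = ToyVectorStore()
store.add("node-a", [1.0, 0.0, 0.0])
store.add("node-b", [0.9, 0.1, 0.0])
store.add("node-c", [0.0, 0.0, 1.0])

top = store.query([1.0, 0.0, 0.0], top_k=2)
print(top)
```

The query vector points in the same direction as node-a and almost the same direction as node-b, so those two are returned and node-c is left out.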

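What is a `RetrieverQueryEngine` in `LlamaIndex`?

RetrieverQueryEngine in LlamaIndex is a query engine that uses a retriever to fetch relevant context from an index for a given user query.

It is mainly used together with a retriever, such as the VectorIndexRetriever created from a VectorStoreIndex.

RetrieverQueryEngine takes a retriever and a response synthesizer as input. The retriever is responsible for fetching the relevant nodes from the index, while the response synthesizer generates a natural-language response from the retrieved nodes and the user query.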
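The division of labor between retriever and response synthesizer can be sketched as follows. All three classes (`ToyRetriever`, `ToySynthesizer`, `ToyQueryEngine`) are hypothetical illustrations rather than LlamaIndex classes; the retriever uses word overlap instead of embeddings, and the synthesizer just formats a string where a real one would call an LLM.

```python
import re


def words(text):
    """Lowercased set of alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class ToyRetriever:
    def __init__(self, node_texts):
        self.node_texts = node_texts

    def retrieve(self, query, top_k=1):
        """Rank stored texts by word overlap with the query."""
        ranked = sorted(
            self.node_texts,
            key=lambda text: len(words(query) & words(text)),
            reverse=True,
        )
        return ranked[:top_k]


class ToySynthesizer:
    def synthesize(self, query, retrieved):
        """Stand-in for the LLM call: show the query and the context it would receive."""
        context = " ".join(retrieved)
        return f"Question: {query}\nContext used: {context}"


class ToyQueryEngine:
    def __init__(self, retriever, synthesizer):
        self.retriever = retriever
        self.synthesizer = synthesizer

    def query(self, query):
        retrieved = self.retriever.retrieve(query)
        return self.synthesizer.synthesize(query, retrieved)


engine = ToyQueryEngine(
    ToyRetriever([
        "OpenCV provides some traditional facial recognition algorithms.",
        "Many big companies are adopting recognition systems.",
    ]),
    ToySynthesizer(),
)
toy_response = engine.query("Which algorithms does OpenCV provide?")
print(toy_response)
```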

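What is `MetadataReplacementPostProcessor` in `LlamaIndex`?

MetadataReplacementPostProcessor replaces the content of a node with a field from the node's metadata. If the field is not present in the metadata, the node text is left unchanged. It works best in combination with SentenceWindowNodeParser.

```python
from llama_index.indices.postprocessor import MetadataReplacementPostProcessor

# The keyword is node_postprocessors (plural); a misspelled keyword is not applied
sentence_query_engine = sentence_index.as_query_engine(
    similarity_top_k=5,
    verbose=True,
    node_postprocessors=[
        MetadataReplacementPostProcessor(target_metadata_key="window")
    ],
)

# Base nodes carry no "window" metadata, so their text is left unchanged
base_query_engine = base_index.as_query_engine(
    similarity_top_k=5,
    verbose=True,
    node_postprocessors=[
        MetadataReplacementPostProcessor(target_metadata_key="window")
    ],
)
```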
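The replacement rule stated above (swap in the metadata field if it exists, otherwise keep the text) is simple enough to sketch directly. The function `replace_text_with_metadata` and the dict-based nodes are hypothetical and only illustrate the rule; they are not how the library represents nodes.

```python
def replace_text_with_metadata(nodes, target_metadata_key):
    """Return copies of the nodes whose text is replaced by metadata[target_metadata_key] when present."""
    replaced = []
    for node in nodes:
        new_text = node["metadata"].get(target_metadata_key, node["text"])
        replaced.append({**node, "text": new_text})
    return replaced


retrieved = [
    {
        "text": "Traditional Face Recognition Algorithm",
        "metadata": {"window": "Use Cases ... Traditional Face Recognition Algorithm ... OpenCV provides ..."},
    },
    {
        "text": "A base node without window metadata.",
        "metadata": {"file_name": "face-recognition-system-using-python.pdf"},
    },
]

processed = replace_text_with_metadata(retrieved, "window")
for node in processed:
    print(node["text"])
```

The first node is expanded to its window before it reaches the LLM, while the second passes through untouched, which is why applying the postprocessor to base nodes has no effect.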

Run the query with the sentence window query engine

```python
from IPython.display import display, Markdown

query = "Sample code to detect faces in an image using Python."
response = sentence_query_engine.query(query)

display(Markdown(f"<b>{response}</b>"))
```

Generated response

Here is sample code that uses Python and the OpenCV library to detect faces in an image:

```python
import cv2
import numpy as np

# Load the pre-trained face detection model
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

# Load the image
img = cv2.imread('image.jpg')

# Convert the image to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Detect faces in the grayscale image
faces = face_cascade.detectMultiScale(gray, 1.2, 5)

# Draw a rectangle around each face
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 2)

# Display the image with the detected faces
cv2.imshow('Face Detection', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
```

In this code, we first load the pre-trained face detection model with OpenCV's CascadeClassifier function. We then load the image, convert it to grayscale, and pass it to the face detection model's detectMultiScale function to detect faces. Next, OpenCV's rectangle function draws a rectangle around each face. Finally, OpenCV's imshow function displays the image with the detected faces.

Be sure to replace haarcascade_frontalface_default.xml with the path to your pre-trained face detection model.

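Run the query with the base node parser query engine

```python
response = base_query_engine.query(query)

display(Markdown(f"<b>{response}</b>"))
```

```python
import cv2
import numpy as np

img = cv2.imread('image.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
faces = face_cascade.detectMultiScale(gray, 1.2, 5)

for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 2)

cv2.imshow('img', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
```

This code uses the Haar cascade algorithm to detect faces in an image. The haarcascade_frontalface_default.xml file contains the trained classifier used for face detection. The detectMultiScale() function detects multiple faces in the image using a given scale factor and minimum-neighbors setting. The rectangle() function then draws the detected faces as rectangles on the original image. The imshow() function displays the image, and waitKey() waits for a key press before the window is closed. The destroyAllWindows() function destroys all windows created during program execution.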

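Save and reload the VectorStore

```python
from google.colab import drive

drive.mount('/content/drive')
```

Save to persistent storage

```python
# "location in Gdrive" is a placeholder; in practice each index needs its own directory
sentence_index.storage_context.persist(persist_dir="location in Gdrive")
base_index.storage_context.persist(persist_dir="location in Gdrive")
```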

Retrieve from storage

```python
# Rebuild the storage contexts from the persisted directories
SC_retrieved_sentence = StorageContext.from_defaults(persist_dir="location in Gdrive")
SC_retrieved_base = StorageContext.from_defaults(persist_dir="location in Gdrive")
```

Load the indexes

```python
retrieved_sentence_index = load_index_from_storage(
    SC_retrieved_sentence, service_context=ctx_sentence
)
retrieved_base_index = load_index_from_storage(
    SC_retrieved_base, service_context=ctx_base
)
```

Rebuild the query engines

```python
from llama_index.postprocessor import MetadataReplacementPostProcessor

# The keyword is node_postprocessors (plural); a misspelled keyword is not applied
sentence_query_engine = retrieved_sentence_index.as_query_engine(
    similarity_top_k=5,
    verbose=True,
    node_postprocessors=[
        MetadataReplacementPostProcessor(target_metadata_key="window")
    ],
)
base_query_engine = retrieved_base_index.as_query_engine(
    similarity_top_k=5,
    verbose=True,
)
```

Ask a question and get a response

```python
base_response = base_query_engine.query(query)

display(Markdown(f"<b>{base_response}</b>"))
```
