mirror of
https://github.com/NVIDIA/nv-ingest.git
synced 2025-01-05 18:58:13 +03:00
224 lines
6.6 KiB
Plaintext
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "7efe0f92-fdbb-4471-b74c-5bdaafed8102",
   "metadata": {},
   "source": [
    "# Multimodal RAG with LangChain"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "91ece9e3-155a-44f4-81e5-2f9492c62a2f",
   "metadata": {},
   "source": [
    "This notebook shows how to perform RAG on the table, chart, and text extraction results of NV-Ingest's PDF extraction tools using LangChain."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c6905d11-0ec3-43c8-961b-24cb52e36bfe",
   "metadata": {},
   "source": [
    "**Note:** To run this notebook, the NV-Ingest microservice and all of the other included microservices must be running. Make sure every service is uncommented in [docker-compose.yaml](https://github.com/NVIDIA/nv-ingest/blob/main/docker-compose.yaml), then follow the [quickstart guide](https://github.com/NVIDIA/nv-ingest?tab=readme-ov-file#quickstart) to start everything up. You'll also need the NV-Ingest Python client installed, as demonstrated [here](https://github.com/NVIDIA/nv-ingest?tab=readme-ov-file#step-2-installing-python-dependencies)."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "81014734-f765-48fc-8fc2-4c19f5f28eae",
   "metadata": {},
   "source": [
    "To start, make sure LangChain and pymilvus are installed and up to date:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bacbe052-4429-4c0a-8b1e-309ac55ad8fb",
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install -qU langchain langchain_community langchain-nvidia-ai-endpoints langchain_milvus pymilvus"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d888ba26-04cf-4577-81a3-5bcd537fc2f6",
   "metadata": {},
   "source": [
    "Then, we'll use NV-Ingest's Ingestor interface to extract the tables and charts from a test PDF, embed them, and upload them to our Milvus vector database (VDB)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "32017922-9b9c-48b9-86ab-6319377fcce8",
   "metadata": {},
   "outputs": [],
   "source": [
    "from nv_ingest_client.client import Ingestor\n",
    "\n",
    "# Build the ingestion job: extract, embed, then upload to Milvus\n",
    "ingestor = (\n",
    "    Ingestor(message_client_hostname=\"localhost\")\n",
    "    .files(\"../data/multimodal_test.pdf\")\n",
    "    .extract(\n",
    "        extract_text=False,\n",
    "        extract_tables=True,\n",
    "        extract_images=False,\n",
    "    ).embed(\n",
    "        text=False,\n",
    "        tables=True,\n",
    "    ).vdb_upload()\n",
    ")\n",
    "\n",
    "# Run the job and collect the extraction results\n",
    "results = ingestor.ingest()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "02131711-31bf-4536-81b7-8c464c7473e3",
   "metadata": {},
   "source": [
    "Now the table and chart content is extracted and stored in the Milvus VDB along with its embeddings. Next, we'll connect LangChain to Milvus and create a vector store so that we can query our extraction results."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "53957974-c688-4521-8c61-09f2649d5d53",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings\n",
    "from langchain_milvus import Milvus\n",
    "\n",
    "# Embed queries with the locally hosted embedding endpoint\n",
    "embedding_function = NVIDIAEmbeddings(base_url=\"http://localhost:8012/v1\")\n",
    "\n",
    "# Connect LangChain to the Milvus collection that holds the extraction results\n",
    "vectorstore = Milvus(\n",
    "    embedding_function=embedding_function,\n",
    "    collection_name=\"nv_ingest_collection\",\n",
    "    primary_field=\"pk\",\n",
    "    vector_field=\"vector\",\n",
    "    text_field=\"text\",\n",
    "    connection_args={\"uri\": \"http://localhost:19530\"},\n",
    ")\n",
    "retriever = vectorstore.as_retriever()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b87111b5-e5a8-45a0-9663-2ae6d9ea2ab6",
   "metadata": {},
   "source": [
    "Then, we'll create a RAG chain using [llama-3.1-405b-instruct](https://build.nvidia.com/meta/llama-3_1-405b-instruct) that we can use to query our PDF in natural language."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "b4c9e109-395c-40e2-a1a5-e0c0ef217e24",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "from langchain_nvidia_ai_endpoints import ChatNVIDIA\n",
    "\n",
    "# TODO: Add your NVIDIA API key\n",
    "os.environ[\"NVIDIA_API_KEY\"] = \"[YOUR NVIDIA API KEY HERE]\"\n",
    "\n",
    "llm = ChatNVIDIA(model=\"meta/llama-3.1-405b-instruct\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "77fd17f8-eac0-4457-b6fb-6e5c8ce90c84",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain_core.prompts import PromptTemplate\n",
    "from langchain_core.runnables import RunnablePassthrough\n",
    "from langchain_core.output_parsers import StrOutputParser\n",
    "\n",
    "template = (\n",
    "    \"You are an assistant for question-answering tasks. \"\n",
    "    \"Use the following pieces of retrieved context to answer \"\n",
    "    \"the question. If you don't know the answer, say that you \"\n",
    "    \"don't know. Keep the answer concise.\"\n",
    "    \"\\n\\n\"\n",
    "    \"{context}\\n\\n\"\n",
    "    \"Question: {question}\"\n",
    ")\n",
    "\n",
    "prompt = PromptTemplate.from_template(template)\n",
    "\n",
    "# The retriever fills {context}; the user's question passes through unchanged\n",
    "rag_chain = (\n",
    "    {\"context\": retriever, \"question\": RunnablePassthrough()}\n",
    "    | prompt\n",
    "    | llm\n",
    "    | StrOutputParser()\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cc2ee8fb-a154-46c9-9181-29a035fdcfbb",
   "metadata": {},
   "source": [
    "Now we can ask questions about our PDF."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "b547a19a-9ada-4a40-a246-6d7bc4d24482",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'The dog is chasing a squirrel in the front yard.'"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "rag_chain.invoke(\"What is the dog doing and where?\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b5b3f079-65a6-4d32-a190-1df96925c5c7",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.15"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}