nv-ingest/examples/langchain_multimodal_rag.ipynb

{
"cells": [
{
"cell_type": "markdown",
"id": "7efe0f92-fdbb-4471-b74c-5bdaafed8102",
"metadata": {},
"source": [
"# Multimodal RAG with LangChain"
]
},
{
"cell_type": "markdown",
"id": "91ece9e3-155a-44f4-81e5-2f9492c62a2f",
"metadata": {},
"source": [
"This notebook shows how to perform retrieval-augmented generation (RAG) with LangChain on the table, chart, and text content extracted by NV-Ingest's PDF extraction tools."
]
},
{
"cell_type": "markdown",
"id": "c6905d11-0ec3-43c8-961b-24cb52e36bfe",
"metadata": {},
"source": [
"**Note:** To run this notebook, the NV-Ingest microservice and all of the other included microservices must be running. Make sure all of the services are uncommented in [docker-compose.yaml](https://github.com/NVIDIA/nv-ingest/blob/main/docker-compose.yaml), then follow the [quickstart guide](https://github.com/NVIDIA/nv-ingest?tab=readme-ov-file#quickstart) to start everything up. You'll also need the NV-Ingest Python client installed, as demonstrated [here](https://github.com/NVIDIA/nv-ingest?tab=readme-ov-file#step-2-installing-python-dependencies)."
]
},
{
"cell_type": "markdown",
"id": "81014734-f765-48fc-8fc2-4c19f5f28eae",
"metadata": {},
"source": [
"To start, make sure LangChain and pymilvus are installed and up to date."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bacbe052-4429-4c0a-8b1e-309ac55ad8fb",
"metadata": {},
"outputs": [],
"source": [
"%pip install -qU langchain langchain_community langchain-nvidia-ai-endpoints langchain_milvus pymilvus"
]
},
{
"cell_type": "markdown",
"id": "d888ba26-04cf-4577-81a3-5bcd537fc2f6",
"metadata": {},
"source": [
"Then, we'll use NV-Ingest's Ingestor interface to extract the tables and charts from a test PDF, embed them, and upload them to our Milvus vector database (VDB)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "32017922-9b9c-48b9-86ab-6319377fcce8",
"metadata": {},
"outputs": [],
"source": [
"from nv_ingest_client.client import Ingestor\n",
"\n",
"ingestor = (\n",
"    Ingestor(message_client_hostname=\"localhost\")\n",
"    .files(\"../data/multimodal_test.pdf\")\n",
"    .extract(\n",
"        extract_text=False,\n",
"        extract_tables=True,\n",
"        extract_images=False,\n",
"    )\n",
"    .embed(\n",
"        text=False,\n",
"        tables=True,\n",
"    )\n",
"    .vdb_upload()\n",
")\n",
"\n",
"results = ingestor.ingest()"
]
},
{
"cell_type": "markdown",
"id": "02131711-31bf-4536-81b7-8c464c7473e3",
"metadata": {},
"source": [
"Now the table and chart content is extracted and stored in the Milvus VDB along with its embeddings. Next, we'll connect LangChain to Milvus and create a vector store so that we can query our extraction results."
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "53957974-c688-4521-8c61-09f2649d5d53",
"metadata": {},
"outputs": [],
"source": [
"from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings\n",
"from langchain_milvus import Milvus\n",
"\n",
"embedding_function = NVIDIAEmbeddings(base_url=\"http://localhost:8012/v1\")\n",
"\n",
"vectorstore = Milvus(\n",
"    embedding_function=embedding_function,\n",
"    collection_name=\"nv_ingest_collection\",\n",
"    primary_field=\"pk\",\n",
"    vector_field=\"vector\",\n",
"    text_field=\"text\",\n",
"    connection_args={\"uri\": \"http://localhost:19530\"},\n",
")\n",
"retriever = vectorstore.as_retriever()"
]
},
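{
"cell_type": "markdown",
"id": "e3a1c5d2-7b4f-4c8a-9d21-6f0b2a8c4e17",
"metadata": {},
"source": [
"Before building the chain, it can help to check what the retriever returns for a sample question. The cell below is a minimal sketch rather than part of the original workflow: `sample_question` and `retrieved_docs` are illustrative names, and it assumes the ingestion cell above has already populated `nv_ingest_collection`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4d9f0b6a-2c18-4e73-b5a0-8e1c7f3d2a94",
"metadata": {},
"outputs": [],
"source": [
"# Illustrative names; assumes the collection has already been populated\n",
"sample_question = \"What is the dog doing and where?\"\n",
"retrieved_docs = retriever.invoke(sample_question)\n",
"\n",
"# Print a short preview of each retrieved chunk\n",
"for doc in retrieved_docs:\n",
"    print(doc.page_content[:200])"
]
},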
{
"cell_type": "markdown",
"id": "b87111b5-e5a8-45a0-9663-2ae6d9ea2ab6",
"metadata": {},
"source": [
"Then, we'll create a RAG chain using [llama-3.1-405b-instruct](https://build.nvidia.com/meta/llama-3_1-405b-instruct) that we can use to query our PDF in natural language."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "b4c9e109-395c-40e2-a1a5-e0c0ef217e24",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from langchain_nvidia_ai_endpoints import ChatNVIDIA\n",
"\n",
"# TODO: Add your NVIDIA API key\n",
"os.environ[\"NVIDIA_API_KEY\"] = \"[YOUR NVIDIA API KEY HERE]\"\n",
"\n",
"llm = ChatNVIDIA(model=\"meta/llama-3.1-405b-instruct\")"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "77fd17f8-eac0-4457-b6fb-6e5c8ce90c84",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.prompts import PromptTemplate\n",
"from langchain_core.runnables import RunnablePassthrough\n",
"from langchain_core.output_parsers import StrOutputParser\n",
"\n",
"# The blank line after {context} keeps the question separate from the retrieved text\n",
"template = (\n",
"    \"You are an assistant for question-answering tasks. \"\n",
"    \"Use the following pieces of retrieved context to answer \"\n",
"    \"the question. If you don't know the answer, say that you \"\n",
"    \"don't know. Keep the answer concise.\"\n",
"    \"\\n\\n\"\n",
"    \"{context}\"\n",
"    \"\\n\\n\"\n",
"    \"Question: {question}\"\n",
")\n",
"\n",
"prompt = PromptTemplate.from_template(template)\n",
"\n",
"# The retriever fills {context}; the user's question is passed through unchanged\n",
"rag_chain = (\n",
"    {\"context\": retriever, \"question\": RunnablePassthrough()}\n",
"    | prompt\n",
"    | llm\n",
"    | StrOutputParser()\n",
")"
]
},
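{
"cell_type": "markdown",
"id": "9a7c2e41-3f5d-4b86-a0c9-5d2e1f7b8c30",
"metadata": {},
"source": [
"As written, the chain passes the retriever's list of `Document` objects directly into the prompt, so `{context}` is filled with their string representation, metadata included. A possible variant, sketched below, joins only the `page_content` of each document first. `format_docs` and `rag_chain_formatted` are illustrative names that are not part of the original notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c18e5f03-6a2b-47d9-8b4e-0f3a9d6c7e52",
"metadata": {},
"outputs": [],
"source": [
"def format_docs(docs):\n",
"    # Join the text of each retrieved document, separated by blank lines\n",
"    return \"\\n\\n\".join(doc.page_content for doc in docs)\n",
"\n",
"# Same chain as above, but the retrieved documents are reduced to plain text first\n",
"rag_chain_formatted = (\n",
"    {\"context\": retriever | format_docs, \"question\": RunnablePassthrough()}\n",
"    | prompt\n",
"    | llm\n",
"    | StrOutputParser()\n",
")"
]
},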
{
"cell_type": "markdown",
"id": "cc2ee8fb-a154-46c9-9181-29a035fdcfbb",
"metadata": {},
"source": [
"Now we can ask questions about our PDF."
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "b547a19a-9ada-4a40-a246-6d7bc4d24482",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'The dog is chasing a squirrel in the front yard.'"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"rag_chain.invoke(\"What is the dog doing and where?\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b5b3f079-65a6-4d32-a190-1df96925c5c7",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.15"
}
},
"nbformat": 4,
"nbformat_minor": 5
}