{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "813NspGRKhc2"
   },
   "source": [
    "# RAG Pipeline with LlamaIndex\n",
    "\n",
    "In this notebook we will build a basic RAG pipeline with LlamaIndex. The pipeline has the following steps:\n",
    "\n",
    "1. Set up the LLM and embedding model.\n",
    "2. Download the data.\n",
    "3. Load the data.\n",
    "4. Index the data.\n",
    "5. Create a query engine.\n",
    "6. Run a test query."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "qYHCYDecKDRZ"
   },
   "source": [
    "### Installation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "mLTzvn__ldjd"
   },
   "outputs": [],
   "source": [
    "!pip install llama-index\n",
    "!pip install llama-index-llms-anthropic\n",
    "!pip install llama-index-embeddings-huggingface"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Iu-wU44BKF9c"
   },
   "source": [
    "### Set Up API Keys"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "id": "ku5rkxtIlpCs"
   },
   "outputs": [],
   "source": [
    "import os\n",
    "os.environ['ANTHROPIC_API_KEY'] = 'YOUR Claude API KEY'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "jJTRVhkJKH4r"
   },
   "source": [
    "### Set Up LLM and Embedding Model\n",
    "\n",
    "We will use Anthropic's `Claude 3 Opus` model as the LLM and the `BAAI/bge-base-en-v1.5` model from Hugging Face for embeddings."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "id": "fQ0tqJL_mSe0"
   },
   "outputs": [],
   "source": [
    "from llama_index.llms.anthropic import Anthropic\n",
    "from llama_index.embeddings.huggingface import HuggingFaceEmbedding"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "id": "snc2jpj4nlXJ"
   },
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "74a676bbd1d04390b4449a42b98a4428",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "config.json: 0%| | 0.00/777 [00:00<?, ?B/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "779eefefc7dc49bb9244dc54822d2195",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "model.safetensors: 0%| | 0.00/438M [00:00<?, ?B/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "167e22b0490c4c17a6a2ac3c1861e75a",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "tokenizer_config.json: 0%| | 0.00/366 [00:00<?, ?B/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "7f0e9e699cfb47b2bbe92756e95131ac",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "vocab.txt: 0%| | 0.00/232k [00:00<?, ?B/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "2623fec06f4c4fcd8fbaa496a0652b5e",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "tokenizer.json: 0%| | 0.00/711k [00:00<?, ?B/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "33d6ff6e06214af08c3474e2c9daf830",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "special_tokens_map.json: 0%| | 0.00/125 [00:00<?, ?B/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "llm = Anthropic(temperature=0.0, model='claude-3-opus-20240229')\n",
    "embed_model = HuggingFaceEmbedding(model_name=\"BAAI/bge-base-en-v1.5\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "id": "YmN2FEQFnx6Y"
   },
   "outputs": [],
   "source": [
    "from llama_index.core import Settings\n",
    "Settings.llm = llm\n",
    "Settings.embed_model = embed_model\n",
    "Settings.chunk_size = 512"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "BqdSyD2_KK-m"
   },
   "source": [
    "### Download Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "zC7CD222n72t",
    "outputId": "42c54b8a-bec9-4c2e-9cd5-97387f7011eb"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "--2024-03-08 06:51:30-- https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt\n",
      "Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.109.133, 185.199.108.133, 185.199.110.133, ...\n",
      "Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.109.133|:443... connected.\n",
      "HTTP request sent, awaiting response... 200 OK\n",
      "Length: 75042 (73K) [text/plain]\n",
      "Saving to: ‘data/paul_graham/paul_graham_essay.txt’\n",
      "\n",
      "data/paul_graham/pa 100%[===================>] 73.28K --.-KB/s in 0.002s \n",
      "\n",
      "2024-03-08 06:51:30 (34.6 MB/s) - ‘data/paul_graham/paul_graham_essay.txt’ saved [75042/75042]\n",
      "\n"
     ]
    }
   ],
   "source": [
    "!mkdir -p 'data/paul_graham/'\n",
    "!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "id": "Ea9GbN2poO3V"
   },
   "outputs": [],
   "source": [
    "from llama_index.core import (\n",
    "    VectorStoreIndex,\n",
    "    SimpleDirectoryReader,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "2dOGl_SKKUNH"
   },
   "source": [
    "### Load Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "id": "Q4hWhnhxoUBj"
   },
   "outputs": [],
   "source": [
    "documents = SimpleDirectoryReader(\"./data/paul_graham\").load_data()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "XM3-kLhdKRk1"
   },
   "source": [
    "### Index Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "id": "yvblCnWYKSrh"
   },
   "outputs": [],
   "source": [
    "index = VectorStoreIndex.from_documents(\n",
    "    documents,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "a17Tz644KZ_P"
   },
   "source": [
    "### Create Query Engine"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "id": "LIT8kqYKoaRq"
   },
   "outputs": [],
   "source": [
    "query_engine = index.as_query_engine(similarity_top_k=3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "n6ODGRTxKd-u"
   },
   "source": [
    "### Test Query"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "id": "igdrBclbKYrJ"
   },
   "outputs": [],
   "source": [
    "response = query_engine.query(\"What did author do growing up?\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "ap-6RUvSozgE",
    "outputId": "37ba0327-afe7-4f1e-d862-0412085954c8"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Based on the information provided, the author worked on two main things outside of school before college: writing and programming.\n",
      "\n",
      "For writing, he wrote short stories as a beginning writer, though he felt they were awful, with hardly any plot and just characters with strong feelings.\n",
      "\n",
      "In terms of programming, in 9th grade he tried writing his first programs on an IBM 1401 computer that his school district used. He and his friend got permission to use it, programming in an early version of Fortran using punch cards. However, he had difficulty figuring out what to actually do with the computer at that stage given the limited inputs available.\n"
     ]
    }
   ],
   "source": [
    "print(response)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "ri47de5wpMTo"
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "colab": {
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  },
  "vscode": {
   "interpreter": {
    "hash": "b0fa6594d8f4cbf19f97940f81e996739fb7646882a419484c72d19e05852a7e"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}