Compare commits

...

34 Commits

Author SHA1 Message Date
NirDiamant
1cfb0d44cb Merge pull request #83 from VakeDomen/feature/hype
Feature/hype
2025-04-01 23:32:26 +03:00
VakeDomen
57e9dcc87a improved markdown 2025-03-10 13:28:16 +00:00
nird
73b91bfa13 updated readme 2025-03-06 00:38:52 +02:00
nird
7d603611bd update subs count 2025-03-06 00:36:56 +02:00
NirDiamant
e76f08482a Merge pull request #87 from anantgupta129/fix/chunk_size
Fix: Ensure Chunk Size Parameter is Properly Utilized
2025-03-06 00:31:15 +02:00
Anant Gupta
942467c05d fix chunk size utilization 2025-03-05 16:12:49 +05:30
VakeDomen
6096797c7e merge main 2025-03-01 18:28:18 +00:00
NirDiamant
42cabf3a9b Merge pull request #85 from Redempt1onzzZZ/main
[Typo] Update crag.ipynb
2025-02-24 09:21:57 +02:00
1ndigo
990ecff889 [Typo] Update crag.ipynb 2025-02-24 13:49:53 +08:00
NirDiamant
3c46cc9b0a Merge pull request #84 from roybka/dartboard_minor_correction
fix scores initialization
2025-02-19 22:29:45 +02:00
rotbka
c4eb7c15e6 fix scores initialization 2025-02-19 22:27:10 +02:00
VakeDomen
165876797c hype image 2025-02-19 21:05:40 +01:00
VakeDomen
91a8a89302 readme 2025-02-19 20:57:57 +01:00
VakeDomen
dfb0e9125b HyPE 2025-02-19 20:40:18 +01:00
NirDiamant
d100326db5 Merge pull request #82 from Un1que11/patch-1
Update semantic_chunking.ipynb
2025-02-19 21:37:27 +02:00
nird
6e1698f962 made the dartboard more understandable 2025-02-19 21:36:18 +02:00
NirDiamant
9b19b48637 Merge pull request #81 from roybka/dartboard_algo
dartboard algo implementation + README
2025-02-19 21:26:53 +02:00
Mikhail Orlov
b0b1b2f72e Update semantic_chunking.ipynb
There was a link to LlamaIndex in the article for Langchain
2025-02-18 13:15:19 +01:00
VakeDomen
06d2f16b4b cp 2025-02-18 10:06:29 +00:00
rotbka
a51359b9c1 better explanation 2025-02-17 17:44:10 +02:00
rotbka
db8b6a7b6c more comments and markdown 2025-02-15 21:05:36 +02:00
rotbka
c1d4bb450f better arrange functions, split them to cells 2025-02-13 09:46:13 +02:00
rotbka
673ffb5b0a improve readme - dartboard section 2025-02-12 19:49:30 +02:00
rotbka
c8791970e9 tidy up 2025-02-12 19:44:18 +02:00
rotbka
0993d27edf dartboard algo implementation + README 2025-02-10 14:12:17 +02:00
NirDiamant
7249e55824 Merge pull request #79 from speedwagon1299/FixServiceContext
Modified from Service Context to LLaMa Settings
2025-02-03 00:50:05 +02:00
speedwagon1299
076560320c Modified from Service Context to LLaMa Settings 2025-02-03 00:06:49 +05:30
nird
a1155a5581 updated contributing 2025-02-02 12:28:23 +02:00
nird
76a529eccf updated code 2025-02-02 12:24:43 +02:00
nird
f50f0e4373 updated imports 2025-02-02 12:15:20 +02:00
NirDiamant
8a9d842ede Merge pull request #76 from speedwagon1299/ReliableRagFix
Fixed pydantic import in reliable_rag.ipynb
2025-02-02 12:03:53 +02:00
speedwagon1299
0ff4ed2270 Fixed pydantic import in reliable rag 2025-02-01 18:16:53 +05:30
nird
2d3344b4f8 updated readme 2025-01-29 23:17:33 +02:00
nird
209dde5430 updated readme 2025-01-29 23:13:41 +02:00
18 changed files with 1658 additions and 82 deletions

View File

@@ -1,4 +1,4 @@
# Contributing to Advanced RAG Techniques
# Contributing to RAG Techniques
Welcome to the world's largest and most comprehensive repository of Retrieval-Augmented Generation (RAG) tutorials! 🌟 We're thrilled you're interested in contributing to this ever-growing knowledge base. Your expertise and creativity can help us maintain our position at the forefront of RAG technology.

View File

@@ -30,9 +30,11 @@ Welcome to one of the most comprehensive and dynamic collections of Retrieval-Au
[![Subscribe to DiamantAI Newsletter](images/subscribe-button.svg)](https://diamantai.substack.com/?r=336pe4&utm_campaign=pub-share-checklist)
*Join thousands of AI enthusiasts getting unique cutting-edge insights and free tutorials! **Plus, subscribers get exclusive early access and special discounts to our upcoming RAG Techniques course!** *
*Join over 15,000 AI enthusiasts getting unique cutting-edge insights and free tutorials!* ***Plus, subscribers get exclusive early access and special 33% discounts to my book and the upcoming RAG Techniques course!***
</div>
[![DiamantAI's newsletter](images/substack_image.png)](https://diamantai.substack.com/?r=336pe4&utm_campaign=pub-share-checklist)
@@ -151,7 +153,24 @@ Explore the extensive list of cutting-edge RAG techniques:
### 📚 Context and Content Enrichment
8. **[Contextual Chunk Headers :label:](https://github.com/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/contextual_chunk_headers.ipynb)**
8. Hypothetical Prompt Embeddings (HyPE) ❓🚀
- **[LangChain](all_rag_techniques/HyPE_Hypothetical_Prompt_Embedding.ipynb)**
- **[Runnable Script](all_rag_techniques_runnable_scripts/HyPE_Hypothetical_Prompt_Embedding.py)**
#### Overview 🔎
HyPE (Hypothetical Prompt Embeddings) is an enhancement to traditional RAG retrieval that **precomputes hypothetical prompts at the indexing stage** and embeds them in place of the chunks, with each question embedding pointing back to its source chunk. This transforms retrieval into a **question-question matching task**, avoiding the need for runtime synthetic answer generation and reducing inference-time computational overhead while **improving retrieval alignment**.
#### Implementation 🛠️
- 📖 **Precomputed Questions:** Instead of embedding document chunks, HyPE **generates multiple hypothetical queries per chunk** at indexing time.
- 🔍 **Question-Question Matching:** User queries are matched against stored hypothetical questions, leading to **better retrieval alignment**.
- ⚡ **No Runtime Overhead:** Unlike HyDE, HyPE does **not require LLM calls at query time**, making retrieval **faster and cheaper**.
- 📈 **Higher Precision & Recall:** Improves retrieval **context precision by up to 42 percentage points** and **claim recall by up to 45 percentage points**.
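A condensed sketch of the indexing step (assuming the `langchain_openai` package; `hype_index_chunk` is an illustrative helper, not the notebook's exact code):

```python
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
embedder = OpenAIEmbeddings(model="text-embedding-3-small")

def hype_index_chunk(chunk: str) -> list[tuple[list[float], str]]:
    """Embed LLM-generated questions; each vector points back to the original chunk."""
    prompt = (
        "Generate essential questions that capture the main points of the "
        f"following text, one per line, without numbering:\n\n{chunk}"
    )
    questions = llm.invoke(prompt).content.strip().split("\n")
    return [(vec, chunk) for vec in embedder.embed_documents(questions)]
```

At query time, the user query is embedded once and matched against these question vectors; the associated chunks are returned, so no LLM call is needed.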
#### Additional Resources 📚
- **[Preprint: Hypothetical Prompt Embeddings (HyPE)](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5139335)** - Research paper detailing the method, evaluation, and benchmarks.
9. **[Contextual Chunk Headers :label:](https://github.com/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/contextual_chunk_headers.ipynb)**
#### Overview 🔎
Contextual chunk headers (CCH) is a method of creating document-level and section-level context, and prepending those chunk headers to the chunks prior to embedding them.
@@ -162,7 +181,7 @@ Explore the extensive list of cutting-edge RAG techniques:
#### Additional Resources 📚
**[dsRAG](https://github.com/D-Star-AI/dsRAG)**: open-source retrieval engine that implements this technique (and a few other advanced RAG techniques)
9. **[Relevant Segment Extraction 🧩](https://github.com/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/relevant_segment_extraction.ipynb)**
10. **[Relevant Segment Extraction 🧩](https://github.com/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/relevant_segment_extraction.ipynb)**
#### Overview 🔎
Relevant segment extraction (RSE) is a method of dynamically constructing multi-chunk segments of text that are relevant to a given query.
@@ -170,7 +189,7 @@ Explore the extensive list of cutting-edge RAG techniques:
#### Implementation 🛠️
Perform a retrieval post-processing step that analyzes the most relevant chunks and identifies longer multi-chunk segments to provide more complete context to the LLM.
10. Context Enrichment Techniques 📝
11. Context Enrichment Techniques 📝
- **[LangChain](all_rag_techniques/context_enrichment_window_around_chunk.ipynb)**
- **[LlamaIndex](all_rag_techniques/context_enrichment_window_around_chunk_with_llamaindex.ipynb)**
- **[Runnable Script](all_rag_techniques_runnable_scripts/context_enrichment_window_around_chunk.py)**
@@ -181,7 +200,7 @@ Explore the extensive list of cutting-edge RAG techniques:
#### Implementation 🛠️
Retrieve the most relevant sentence while also accessing the sentences before and after it in the original text.
11. Semantic Chunking 🧠
12. Semantic Chunking 🧠
- **[LangChain](https://github.com/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/semantic_chunking.ipynb)**
- **[Runnable Script](https://github.com/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques_runnable_scripts/semantic_chunking.py)**
@@ -194,7 +213,7 @@ Explore the extensive list of cutting-edge RAG techniques:
#### Additional Resources 📚
- **[Semantic Chunking: Improving AI Information Retrieval](https://open.substack.com/pub/diamantai/p/semantic-chunking-improving-ai-information?r=336pe4&utm_campaign=post&utm_medium=web)** - A comprehensive blog post exploring the benefits and implementation of semantic chunking in RAG systems.
12. Contextual Compression 🗜️
13. Contextual Compression 🗜️
- **[LangChain](all_rag_techniques/contextual_compression.ipynb)**
- **[Runnable Script](all_rag_techniques_runnable_scripts/contextual_compression.py)**
@@ -204,7 +223,7 @@ Explore the extensive list of cutting-edge RAG techniques:
#### Implementation 🛠️
Use an LLM to compress or summarize retrieved chunks, preserving key information relevant to the query.
13. Document Augmentation through Question Generation for Enhanced Retrieval
14. Document Augmentation through Question Generation for Enhanced Retrieval
- **[LangChain](all_rag_techniques/document_augmentation.ipynb)**
- **[Runnable Script](all_rag_techniques_runnable_scripts/document_augmentation.py)**
@@ -216,7 +235,7 @@ Explore the extensive list of cutting-edge RAG techniques:
### 🚀 Advanced Retrieval Methods
14. Fusion Retrieval 🔗
15. Fusion Retrieval 🔗
- **[LangChain](all_rag_techniques/fusion_retrieval.ipynb)**
- **[LlamaIndex](https://github.com/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/fusion_retrieval_with_llamaindex.ipynb)**
- **[Runnable Script](all_rag_techniques_runnable_scripts/fusion_retrieval.py)**
@@ -227,7 +246,7 @@ Explore the extensive list of cutting-edge RAG techniques:
#### Implementation 🛠️
Combine keyword-based search with vector-based search for more comprehensive and accurate retrieval.
15. Intelligent Reranking 📈
16. Intelligent Reranking 📈
- **[LangChain](all_rag_techniques/reranking.ipynb)**
- **[LlamaIndex](https://github.com/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/reranking_with_llamaindex.ipynb)**
- **[Runnable Script](all_rag_techniques_runnable_scripts/reranking.py)**
@@ -243,7 +262,7 @@ Explore the extensive list of cutting-edge RAG techniques:
#### Additional Resources 📚
- **[Relevance Revolution: How Re-ranking Transforms RAG Systems](https://open.substack.com/pub/diamantai/p/relevance-revolution-how-re-ranking?r=336pe4&utm_campaign=post&utm_medium=web)** - A comprehensive blog post exploring the power of re-ranking in enhancing RAG system performance.
16. Multi-faceted Filtering 🔍
17. Multi-faceted Filtering 🔍
#### Overview 🔎
Applying various filtering techniques to refine and improve the quality of retrieved results.
@@ -254,7 +273,7 @@ Explore the extensive list of cutting-edge RAG techniques:
- 📄 **Content Filtering:** Remove results that don't match specific content criteria or essential keywords.
- 🌈 **Diversity Filtering:** Ensure result diversity by filtering out near-duplicate entries.
17. Hierarchical Indices 🗂️
18. Hierarchical Indices 🗂️
- **[LangChain](all_rag_techniques/hierarchical_indices.ipynb)**
- **[Runnable Script](all_rag_techniques_runnable_scripts/hierarchical_indices.py)**
@@ -267,7 +286,7 @@ Explore the extensive list of cutting-edge RAG techniques:
#### Additional Resources 📚
- **[Hierarchical Indices: Enhancing RAG Systems](https://open.substack.com/pub/diamantai/p/hierarchical-indices-enhancing-rag?r=336pe4&utm_campaign=post&utm_medium=web)** - A comprehensive blog post exploring the power of hierarchical indices in enhancing RAG system performance.
18. Ensemble Retrieval 🎭
19. Ensemble Retrieval 🎭
#### Overview 🔎
Combining multiple retrieval models or techniques for more robust and accurate results.
@@ -275,7 +294,16 @@ Explore the extensive list of cutting-edge RAG techniques:
#### Implementation 🛠️
Apply different embedding models or retrieval algorithms and use voting or weighting mechanisms to determine the final set of retrieved documents.
19. Multi-modal Retrieval 📽️
20. Dartboard Retrieval 🎯
- **[LangChain](https://github.com/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/dartboard.ipynb)**
#### Overview 🔎
Optimizing over Relevant Information Gain in retrieval, so that the returned set of documents is both relevant and non-redundant.
#### Implementation 🛠️
- Combine both relevance and diversity into a single scoring function and directly optimize for it (see the sketch below).
- A proof of concept shows plain simple RAG underperforming when the database is dense, while dartboard retrieval outperforms it.
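A condensed sketch of the greedy selection step, distilled from the dartboard notebook added in this change (argument names are illustrative):

```python
import numpy as np
from scipy.special import logsumexp

def next_pick(query_logp, doc_logp, picked, max_logp, w_rel=1.0, w_div=1.0):
    """Pick the next document by blending relevance to the query with
    diversity from the documents picked so far (inputs are log-probabilities)."""
    updated = np.maximum(max_logp, doc_logp)  # closeness of candidates to the picked set
    scores = logsumexp(updated * w_div + query_logp * w_rel, axis=1)
    scores[picked] = -np.inf  # never re-pick a document
    best = int(np.argmax(scores))
    return best, updated[best]
```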
21. Multi-modal Retrieval 📽️
#### Overview 🔎
Extending RAG capabilities to handle diverse data types for richer responses.
@@ -287,7 +315,7 @@ Explore the extensive list of cutting-edge RAG techniques:
### 🔁 Iterative and Adaptive Techniques
20. Retrieval with Feedback Loops 🔁
22. Retrieval with Feedback Loops 🔁
- **[LangChain](all_rag_techniques/retrieval_with_feedback_loop.ipynb)**
- **[Runnable Script](all_rag_techniques_runnable_scripts/retrieval_with_feedback_loop.py)**
@@ -297,7 +325,7 @@ Explore the extensive list of cutting-edge RAG techniques:
#### Implementation 🛠️
Collect and utilize user feedback on the relevance and quality of retrieved documents and generated responses to fine-tune retrieval and ranking models.
21. Adaptive Retrieval 🎯
23. Adaptive Retrieval 🎯
- **[LangChain](all_rag_techniques/adaptive_retrieval.ipynb)**
- **[Runnable Script](all_rag_techniques_runnable_scripts/adaptive_retrieval.py)**
@@ -307,7 +335,7 @@ Explore the extensive list of cutting-edge RAG techniques:
#### Implementation 🛠️
Classify queries into different categories and use tailored retrieval strategies for each, considering user context and preferences.
22. Iterative Retrieval 🔄
24. Iterative Retrieval 🔄
#### Overview 🔎
Performing multiple rounds of retrieval to refine and enhance result quality.
@@ -317,7 +345,7 @@ Explore the extensive list of cutting-edge RAG techniques:
### 📊 Evaluation
23. **[DeepEval Evaluation](evaluation/evaluation_deep_eval.ipynb)** 📘
25. **[DeepEval Evaluation](evaluation/evaluation_deep_eval.ipynb)** 📘
#### Overview 🔎
Performing evaluations of Retrieval-Augmented Generation systems by covering several metrics and creating test cases.
@@ -326,7 +354,7 @@ Explore the extensive list of cutting-edge RAG techniques:
Use the `deepeval` library to conduct test cases on correctness, faithfulness and contextual relevancy of RAG systems.
24. **[GroUSE Evaluation](evaluation/evaluation_grouse.ipynb)** 🐦
26. **[GroUSE Evaluation](evaluation/evaluation_grouse.ipynb)** 🐦
#### Overview 🔎
Evaluate the final stage of Retrieval-Augmented Generation using metrics of the GroUSE framework and meta-evaluate your custom LLM judge on GroUSE unit tests.
@@ -337,7 +365,7 @@ Explore the extensive list of cutting-edge RAG techniques:
### 🔬 Explainability and Transparency
25. Explainable Retrieval 🔍
27. Explainable Retrieval 🔍
- **[LangChain](all_rag_techniques/explainable_retrieval.ipynb)**
- **[Runnable Script](all_rag_techniques_runnable_scripts/explainable_retrieval.py)**
@@ -349,7 +377,7 @@ Explore the extensive list of cutting-edge RAG techniques:
### 🏗️ Advanced Architectures
26. Knowledge Graph Integration (Graph RAG) 🕸️
28. Knowledge Graph Integration (Graph RAG) 🕸️
- **[LangChain](all_rag_techniques/graph_rag.ipynb)**
- **[Runnable Script](all_rag_techniques_runnable_scripts/graph_rag.py)**
@@ -359,7 +387,7 @@ Explore the extensive list of cutting-edge RAG techniques:
#### Implementation 🛠️
Retrieve entities and their relationships from a knowledge graph relevant to the query, combining this structured data with unstructured text for more informative responses.
27. GraphRag (Microsoft) 🎯
29. GraphRag (Microsoft) 🎯
- **[GraphRag](all_rag_techniques/Microsoft_GraphRag.ipynb)**
#### Overview 🔎
@@ -368,7 +396,7 @@ Explore the extensive list of cutting-edge RAG techniques:
#### Implementation 🛠️
• Analyze an input corpus by extracting entities and relationships from text units, then generate summaries of each community and its constituents from the bottom up.
28. RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval 🌳
30. RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval 🌳
- **[LangChain](all_rag_techniques/raptor.ipynb)**
- **[Runnable Script](all_rag_techniques_runnable_scripts/raptor.py)**
@@ -378,7 +406,7 @@ Explore the extensive list of cutting-edge RAG techniques:
#### Implementation 🛠️
Use abstractive summarization to recursively process and summarize retrieved documents, organizing the information in a tree structure for hierarchical context.
29. Self RAG 🔁
31. Self RAG 🔁
- **[LangChain](all_rag_techniques/self_rag.ipynb)**
- **[Runnable Script](all_rag_techniques_runnable_scripts/self_rag.py)**
@@ -388,7 +416,7 @@ Explore the extensive list of cutting-edge RAG techniques:
#### Implementation 🛠️
• Implement a multi-step process including retrieval decision, document retrieval, relevance evaluation, response generation, support assessment, and utility evaluation to produce accurate, relevant, and useful outputs.
30. Corrective RAG 🔧
32. Corrective RAG 🔧
- **[LangChain](all_rag_techniques/crag.ipynb)**
- **[Runnable Script](all_rag_techniques_runnable_scripts/crag.py)**
@@ -400,7 +428,7 @@ Explore the extensive list of cutting-edge RAG techniques:
## 🌟 Special Advanced Technique 🌟
31. **[Sophisticated Controllable Agent for Complex RAG Tasks 🤖](https://github.com/NirDiamant/Controllable-RAG-Agent)**
33. **[Sophisticated Controllable Agent for Complex RAG Tasks 🤖](https://github.com/NirDiamant/Controllable-RAG-Agent)**
#### Overview 🔎
An advanced RAG solution designed to tackle complex questions that simple semantic similarity-based retrieval cannot solve. This approach uses a sophisticated deterministic graph as the "brain" 🧠 of a highly controllable autonomous agent, capable of answering non-trivial questions from your own data.

View File

@@ -0,0 +1,558 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Hypothetical Prompt Embeddings (HyPE)\n",
"\n",
"## Overview\n",
"\n",
"This code implements a Retrieval-Augmented Generation (RAG) system enhanced by Hypothetical Prompt Embeddings (HyPE). Unlike traditional RAG pipelines that struggle with query-document style mismatch, HyPE precomputes hypothetical questions during the indexing phase. This transforms retrieval into a question-question matching problem, eliminating the need for expensive runtime query expansion techniques.\n",
"\n",
"## Key Components of notebook\n",
"\n",
"1. PDF processing and text extraction\n",
"2. Text chunking to maintain coherent information units\n",
"3. **Hypothetical Prompt Embedding Generation** using an LLM to create multiple proxy questions per chunk\n",
"4. Vector store creation using [FAISS](https://engineering.fb.com/2017/03/29/data-infrastructure/faiss-a-library-for-efficient-similarity-search/) and OpenAI embeddings\n",
"5. Retriever setup for querying the processed documents\n",
"6. Evaluation of the RAG system\n",
"\n",
"## Method Details\n",
"\n",
"### Document Preprocessing\n",
"\n",
"1. The PDF is loaded using `PyPDFLoader`.\n",
"2. The text is split into chunks using `RecursiveCharacterTextSplitter` with specified chunk size and overlap.\n",
"\n",
"### Hypothetical Question Generation\n",
"\n",
"Instead of embedding raw text chunks, HyPE **generates multiple hypothetical prompts** for each chunk. These **precomputed questions** simulate user queries, improving alignment with real-world searches. This removes the need for runtime synthetic answer generation needed in techniques like HyDE.\n",
"\n",
"### Vector Store Creation\n",
"\n",
"1. Each hypothetical question is embedded using OpenAI embeddings.\n",
"2. A FAISS vector store is built, associating **each question embedding with its original chunk**.\n",
"3. This approach **stores multiple representations per chunk**, increasing retrieval flexibility.\n",
"\n",
"### Retriever Setup\n",
"\n",
"1. The retriever is optimized for **question-question matching** rather than direct document retrieval.\n",
"2. The FAISS index enables **efficient nearest-neighbor** search over the hypothetical prompt embeddings.\n",
"3. Retrieved chunks provide a **richer and more precise context** for downstream LLM generation.\n",
"\n",
"## Key Features\n",
"\n",
"1. **Precomputed Hypothetical Prompts** Improves query alignment without runtime overhead.\n",
"2. **Multi-Vector Representation** Each chunk is indexed multiple times for broader semantic coverage.\n",
"3. **Efficient Retrieval** FAISS ensures fast similarity search over the enhanced embeddings.\n",
"4. **Modular Design** The pipeline is easy to adapt for different datasets and retrieval settings. Additionally it's compatible with most optimizations like reranking etc.\n",
"\n",
"## Evaluation\n",
"\n",
"HyPE's effectiveness is evaluated across multiple datasets, showing:\n",
"\n",
"- Up to 42 percentage points improvement in retrieval precision\n",
"- Up to 45 percentage points improvement in claim recall\n",
" (See full evaluation results in [preprint](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5139335))\n",
"\n",
"## Benefits of this Approach\n",
"\n",
"1. **Eliminates Query-Time Overhead** All hypothetical generation is done offline at indexing.\n",
"2. **Enhanced Retrieval Precision** Better alignment between queries and stored content.\n",
"3. **Scalable & Efficient** No addinal per-query computational cost; retrieval is as fast as standard RAG.\n",
"4. **Flexible & Extensible** Can be combined with advanced RAG techniques like reranking.\n",
"\n",
"## Conclusion\n",
"\n",
"HyPE provides a scalable and efficient alternative to traditional RAG systems, overcoming query-document style mismatch while avoiding the computational cost of runtime query expansion. By moving hypothetical prompt generation to indexing, it significantly enhances retrieval precision and efficiency, making it a practical solution for real-world applications.\n",
"\n",
"For further details, refer to the full paper: [preprint](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5139335)\n",
"\n",
"\n",
"<div style=\"text-align: center;\">\n",
"\n",
"<img src=\"../images/hype.svg\" alt=\"HyPE\" style=\"width:70%; height:auto;\">\n",
"</div>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Import libraries and environment variables"
]
},
{
"cell_type": "code",
"execution_count": 63,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import sys\n",
"import faiss\n",
"from tqdm import tqdm\n",
"from dotenv import load_dotenv\n",
"from concurrent.futures import ThreadPoolExecutor, as_completed\n",
"from langchain_community.docstore.in_memory import InMemoryDocstore\n",
"\n",
"\n",
"# Load environment variables from a .env file\n",
"load_dotenv()\n",
"\n",
"# Set the OpenAI API key environment variable (comment out if not using OpenAI)\n",
"if not os.getenv('OPENAI_API_KEY'):\n",
" os.environ[\"OPENAI_API_KEY\"] = input(\"Please enter your OpenAI API key: \")\n",
"else:\n",
" os.environ[\"OPENAI_API_KEY\"] = os.getenv('OPENAI_API_KEY')\n",
"\n",
"sys.path.append(os.path.abspath(os.path.join(os.getcwd(), '..'))) # Add the parent directory to the path since we work with notebooks\n",
"from helper_functions import *\n",
"from evaluation.evalute_rag import *\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Define constants\n",
"\n",
"- `PATH`: path to the data, to be embedded into the RAG pipeline\n",
"\n",
"This tutorial uses OpenAI endpoint ([avalible models](https://platform.openai.com/docs/pricing)). \n",
"- `LANGUAGE_MODEL_NAME`: The name of the language model to be used. \n",
"- `EMBEDDING_MODEL_NAME`: The name of the embedding model to be used.\n",
"\n",
"The tutroial uses a `RecursiveCharacterTextSplitter` chunking approach where the chunking length function used is python `len` function. The chunking varables to be tweaked here are:\n",
"- `CHUNK_SIZE`: The minimum length of one chunk\n",
"- `CHUNK_OVERLAP`: The overlap of two consecutive chunks."
]
},
{
"cell_type": "code",
"execution_count": 64,
"metadata": {},
"outputs": [],
"source": [
"PATH = \"../data/Understanding_Climate_Change.pdf\"\n",
"LANGUAGE_MODEL_NAME = \"gpt-4o-mini\"\n",
"EMBEDDING_MODEL_NAME = \"text-embedding-3-small\"\n",
"CHUNK_SIZE = 1000\n",
"CHUNK_OVERLAP = 200"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Define generation of Hypothetical Prompt Embeddings\n",
"\n",
"The code block below generates hypothetical questions for each text chunk and embeds them for retrieval.\n",
"\n",
"- An LLM extracts key questions from the input chunk.\n",
"- These questions are embedded using OpenAI's model.\n",
"- The function returns the original chunk and its prompt embeddings later used for retrieval.\n",
"\n",
"To ensure clean output, extra newlines are removed, and regex parsing can improve list formatting when needed."
]
},
{
"cell_type": "code",
"execution_count": 65,
"metadata": {},
"outputs": [],
"source": [
"def generate_hypothetical_prompt_embeddings(chunk_text: str):\n",
" \"\"\"\n",
" Uses the LLM to generate multiple hypothetical questions for a single chunk.\n",
" These questions will be used as 'proxies' for the chunk during retrieval.\n",
"\n",
" Parameters:\n",
" chunk_text (str): Text contents of the chunk\n",
"\n",
" Returns:\n",
" chunk_text (str): Text contents of the chunk. This is done to make the \n",
" multithreading easier\n",
" hypothetical prompt embeddings (List[float]): A list of embedding vectors\n",
" generated from the questions\n",
" \"\"\"\n",
" llm = ChatOpenAI(temperature=0, model_name=LANGUAGE_MODEL_NAME)\n",
" embedding_model = OpenAIEmbeddings(model=EMBEDDING_MODEL_NAME)\n",
"\n",
" question_gen_prompt = PromptTemplate.from_template(\n",
" \"Analyze the input text and generate essential questions that, when answered, \\\n",
" capture the main points of the text. Each question should be one line, \\\n",
" without numbering or prefixes.\\n\\n \\\n",
" Text:\\n{chunk_text}\\n\\nQuestions:\\n\"\n",
" )\n",
" question_chain = question_gen_prompt | llm | StrOutputParser()\n",
"\n",
" # parse questions from response\n",
" # Notes: \n",
" # - gpt4o likes to split questions by \\n\\n so we remove one \\n\n",
" # - for production or if using smaller models from ollama, it's beneficial to use regex to parse \n",
" # things like (un)ordeed lists\n",
" # r\"^\\s*[\\-\\*\\•]|\\s*\\d+\\.\\s*|\\s*[a-zA-Z]\\)\\s*|\\s*\\(\\d+\\)\\s*|\\s*\\([a-zA-Z]\\)\\s*|\\s*\\([ivxlcdm]+\\)\\s*\"\n",
" questions = question_chain.invoke({\"chunk_text\": chunk_text}).replace(\"\\n\\n\", \"\\n\").split(\"\\n\")\n",
" \n",
" return chunk_text, embedding_model.embed_documents(questions)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Define creation and population of FAISS Vector Store\n",
"\n",
"The code block below builds a FAISS vector store by embedding text chunks in parallel.\n",
"\n",
"What happens?\n",
"- Parallel processing Uses threading to generate embeddings faster.\n",
"- FAISS initialization Sets up an L2 index for efficient similarity search.\n",
"- Chunk embedding Each chunk is stored multiple times, once for each generated question embedding.\n",
"- In-memory storage Uses InMemoryDocstore for fast lookup.\n",
"\n",
"This ensures efficient retrieval, improving query alignment with precomputed question embeddings."
]
},
{
"cell_type": "code",
"execution_count": 66,
"metadata": {},
"outputs": [],
"source": [
"def prepare_vector_store(chunks: List[str]):\n",
" \"\"\"\n",
" Creates and populates a FAISS vector store from a list of text chunks.\n",
"\n",
" This function processes a list of text chunks in parallel, generating \n",
" hypothetical prompt embeddings for each chunk.\n",
" The embeddings are stored in a FAISS index for efficient similarity search.\n",
"\n",
" Parameters:\n",
" chunks (List[str]): A list of text chunks to be embedded and stored.\n",
"\n",
" Returns:\n",
" FAISS: A FAISS vector store containing the embedded text chunks.\n",
" \"\"\"\n",
"\n",
" # Wait with initialization to see vector lengths\n",
" vector_store = None \n",
"\n",
" with ThreadPoolExecutor() as pool: \n",
" # Use threading to speed up generation of prompt embeddings\n",
" futures = [pool.submit(generate_hypothetical_prompt_embeddings, c) for c in chunks]\n",
" \n",
" # Process embeddings as they complete\n",
" for f in tqdm(as_completed(futures), total=len(chunks)): \n",
" \n",
" chunk, vectors = f.result() # Retrieve the processed chunk and its embeddings\n",
" \n",
" # Initialize the FAISS vector store on the first chunk\n",
" if vector_store == None: \n",
" vector_store = FAISS(\n",
" embedding_function=OpenAIEmbeddings(model=EMBEDDING_MODEL_NAME), # Define embedding model\n",
" index=faiss.IndexFlatL2(len(vectors[0])) # Define an L2 index for similarity search\n",
" docstore=InMemoryDocstore(), # Use in-memory document storage\n",
" index_to_docstore_id={} # Maintain index-to-document mapping\n",
" )\n",
" \n",
" # Pair the chunk's content with each generated embedding vector.\n",
" # Each chunk is inserted multiple times, once for each prompt vector\n",
" chunks_with_embedding_vectors = [(chunk.page_content, vec) for vec in vectors]\n",
" \n",
" # Add embeddings to the store\n",
" vector_store.add_embeddings(chunks_with_embedding_vectors) \n",
"\n",
" return vector_store # Return the populated vector store\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Encode PDF into a FAISS Vector Store\n",
"\n",
"The code block below processes a PDF file and stores its content as embeddings for retrieval.\n",
"\n",
"What happens?\n",
"- PDF loading Extracts text from the document.\n",
"- Chunking Splits text into overlapping segments for better context retention.\n",
"- Preprocessing Cleans text to improve embedding quality.\n",
"- Vector store creation Generates embeddings and stores them in FAISS for retrieval."
]
},
{
"cell_type": "code",
"execution_count": 70,
"metadata": {},
"outputs": [],
"source": [
"def encode_pdf(path, chunk_size=1000, chunk_overlap=200):\n",
" \"\"\"\n",
" Encodes a PDF book into a vector store using OpenAI embeddings.\n",
"\n",
" Args:\n",
" path: The path to the PDF file.\n",
" chunk_size: The desired size of each text chunk.\n",
" chunk_overlap: The amount of overlap between consecutive chunks.\n",
"\n",
" Returns:\n",
" A FAISS vector store containing the encoded book content.\n",
" \"\"\"\n",
"\n",
" # Load PDF documents\n",
" loader = PyPDFLoader(path)\n",
" documents = loader.load()\n",
"\n",
" # Split documents into chunks\n",
" text_splitter = RecursiveCharacterTextSplitter(\n",
" chunk_size=chunk_size, chunk_overlap=chunk_overlap, length_function=len\n",
" )\n",
" texts = text_splitter.split_documents(documents)\n",
" cleaned_texts = replace_t_with_space(texts)\n",
"\n",
" vectorstore = prepare_vector_store(cleaned_texts)\n",
"\n",
" return vectorstore"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create HyPE vector store\n",
"\n",
"Now we process the PDF and store its embeddings.\n",
"This step initializes the FAISS vector store with the encoded document."
]
},
{
"cell_type": "code",
"execution_count": 71,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|██████████| 97/97 [00:22<00:00, 4.40it/s]\n"
]
}
],
"source": [
"# Chunk size can be quite large with HyPE as we are not loosing percision with more\n",
"# information. For production, test how exhaustive your model is in generating sufficient \n",
"# amount of questions per chunk. This will mostly depend on your information density.\n",
"chunks_vector_store = encode_pdf(PATH, chunk_size=CHUNK_SIZE, chunk_overlap=CHUNK_OVERLAP)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create retriever\n",
"\n",
"Now we set up the retriever to fetch relevant chunks from the vector store.\n",
"\n",
"Retrieves the top `k=3` most relevant chunks based on query similarity."
]
},
{
"cell_type": "code",
"execution_count": 79,
"metadata": {},
"outputs": [],
"source": [
"chunks_query_retriever = chunks_vector_store.as_retriever(search_kwargs={\"k\": 3})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Test retriever\n",
"\n",
"Now we test retrieval using a sample query.\n",
"\n",
"- Queries the vector store to find the most relevant chunks.\n",
"- Deduplicates results to remove potentially repeated chunks.\n",
"- Displays the retrieved context for inspection.\n",
"\n",
"This step verifies that the retriever returns meaningful and diverse information for the given question."
]
},
{
"cell_type": "code",
"execution_count": 80,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Context 1:\n",
"Most of these climate changes are attributed to very small variations in Earth's orbit that \n",
"change the amount of solar energy our planet receives. During the Holocene epoch, which \n",
"began at the end of the last ice age, human societies f lourished, but the industrial era has seen \n",
"unprecedented changes. \n",
"Modern Observations \n",
"Modern scientific observations indicate a rapid increase in global temperatures, sea levels, \n",
"and extreme weather events. The Intergovernmental Panel on Climate Change (IPCC) has \n",
"documented these changes extensively. Ice core samples, tree rings, and ocean sediments \n",
"provide a historical record that scientists use to understand past climate conditions and \n",
"predict future trends. The evidence overwhelmingly shows that recent changes are primarily \n",
"driven by human activities, particularly the emission of greenhou se gases. \n",
"Chapter 2: Causes of Climate Change \n",
"Greenhouse Gases\n",
"\n",
"\n",
"Context 2:\n",
"driven by human activities, particularly the emission of greenhou se gases. \n",
"Chapter 2: Causes of Climate Change \n",
"Greenhouse Gases \n",
"The primary cause of recent climate change is the increase in greenhouse gases in the \n",
"atmosphere. Greenhouse gases, such as carbon dioxide (CO2), methane (CH4), and nitrous \n",
"oxide (N2O), trap heat from the sun, creating a \"greenhouse effect.\" This effect is essential \n",
"for life on Earth, as it keeps the planet warm enough to support life. However, human \n",
"activities have intensified this natural process, leading to a warmer climate. \n",
"Fossil Fuels \n",
"Burning fossil fuels for energy releases large amounts of CO2. This includes coal, oil, and \n",
"natural gas used for electricity, heating, and transportation. The industrial revolution marked \n",
"the beginning of a significant increase in fossil fuel consumption, which continues to rise \n",
"today. \n",
"Coal\n",
"\n",
"\n",
"Context 3:\n",
"Understanding Climate Change \n",
"Chapter 1: Introduction to Climate Change \n",
"Climate change refers to significant, long -term changes in the global climate. The term \n",
"\"global climate\" encompasses the planet's overall weather patterns, including temperature, \n",
"precipitation, and wind patterns, over an extended period. Over the past cent ury, human \n",
"activities, particularly the burning of fossil fuels and deforestation, have significantly \n",
"contributed to climate change. \n",
"Historical Context \n",
"The Earth's climate has changed throughout history. Over the past 650,000 years, there have \n",
"been seven cycles of glacial advance and retreat, with the abrupt end of the last ice age about \n",
"11,700 years ago marking the beginning of the modern climate era and human civilization. \n",
"Most of these climate changes are attributed to very small variations in Earth's orbit that \n",
"change the amount of solar energy our planet receives. During the Holocene epoch, which\n",
"\n",
"\n"
]
}
],
"source": [
"test_query = \"What is the main cause of climate change?\"\n",
"context = retrieve_context_per_question(test_query, chunks_query_retriever)\n",
"context = list(set(context))\n",
"show_context(context)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Evaluate results"
]
},
{
"cell_type": "code",
"execution_count": 76,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'questions': ['1. **Multiple Choice: Causes of Climate Change**',\n",
" ' - What is the primary cause of the current climate change trend?',\n",
" ' A) Solar radiation variations',\n",
" ' B) Natural cycles of the Earth',\n",
" ' C) Human activities, such as burning fossil fuels',\n",
" ' D) Volcanic eruptions',\n",
" '',\n",
" '2. **True or False: Impact on Biodiversity**',\n",
" ' - True or False: Climate change does not have any significant impact on the migration patterns and extinction rates of various species.',\n",
" '',\n",
" '3. **Short Answer: Mitigation Strategies**',\n",
" ' - What are two effective strategies that can be implemented at a community level to mitigate the effects of climate change?',\n",
" '',\n",
" '4. **Matching: Climate Change Effects**',\n",
" ' - Match the following effects of climate change (numbered) with their likely consequences (lettered).',\n",
" ' 1. Rising sea levels',\n",
" ' 2. Increased frequency of extreme weather events',\n",
" ' 3. Melting polar ice caps',\n",
" ' 4. Ocean acidification',\n",
" ' ',\n",
" ' A) Displacement of coastal communities',\n",
" ' B) Loss of marine biodiversity',\n",
" ' C) Increased global temperatures',\n",
" ' D) More frequent and severe hurricanes and floods',\n",
" '',\n",
" '5. **Essay: International Cooperation**',\n",
" ' - Discuss the importance of international cooperation in combating climate change. Include examples of successful global agreements or initiatives and explain how they have contributed to addressing climate change.'],\n",
" 'results': ['```json\\n{\\n \"Relevance\": 5,\\n \"Completeness\": 4,\\n \"Conciseness\": 3\\n}\\n```',\n",
" '```json\\n{\\n \"Relevance\": 5,\\n \"Completeness\": 4,\\n \"Conciseness\": 3\\n}\\n```',\n",
" '```json\\n{\\n \"Relevance\": 2,\\n \"Completeness\": 1,\\n \"Conciseness\": 2\\n}\\n```',\n",
" '```json\\n{\\n \"Relevance\": 4,\\n \"Completeness\": 3,\\n \"Conciseness\": 3\\n}\\n```',\n",
" '```json\\n{\\n \"Relevance\": 5,\\n \"Completeness\": 4,\\n \"Conciseness\": 3\\n}\\n```',\n",
" '```json\\n{\\n \"Relevance\": 1,\\n \"Completeness\": 1,\\n \"Conciseness\": 2\\n}\\n```',\n",
" '```json\\n{\\n \"Relevance\": 1,\\n \"Completeness\": 1,\\n \"Conciseness\": 2\\n}\\n```',\n",
" '```json\\n{\\n \"Relevance\": 5,\\n \"Completeness\": 4,\\n \"Conciseness\": 3\\n}\\n```',\n",
" '```json\\n{\\n \"Relevance\": 5,\\n \"Completeness\": 4,\\n \"Conciseness\": 3\\n}\\n```',\n",
" '```json\\n{\\n \"Relevance\": 2,\\n \"Completeness\": 1,\\n \"Conciseness\": 2\\n}\\n```',\n",
" '```json\\n{\\n \"Relevance\": 2,\\n \"Completeness\": 1,\\n \"Conciseness\": 2\\n}\\n```',\n",
" '```json\\n{\\n \"Relevance\": 4,\\n \"Completeness\": 3,\\n \"Conciseness\": 2\\n}\\n```',\n",
" '```json\\n{\\n \"Relevance\": 2,\\n \"Completeness\": 1,\\n \"Conciseness\": 2\\n}\\n```',\n",
" '```json\\n{\\n \"Relevance\": 4,\\n \"Completeness\": 3,\\n \"Conciseness\": 3\\n}\\n```',\n",
" '```json\\n{\\n \"Relevance\": 4,\\n \"Completeness\": 2,\\n \"Conciseness\": 3\\n}\\n```',\n",
" '```json\\n{\\n \"Relevance\": 5,\\n \"Completeness\": 4,\\n \"Conciseness\": 3\\n}\\n```',\n",
" '```json\\n{\\n \"Relevance\": 5,\\n \"Completeness\": 4,\\n \"Conciseness\": 3\\n}\\n```',\n",
" '```json\\n{\\n \"Relevance\": 5,\\n \"Completeness\": 4,\\n \"Conciseness\": 3\\n}\\n```',\n",
" '```json\\n{\\n \"Relevance\": 5,\\n \"Completeness\": 4,\\n \"Conciseness\": 3\\n}\\n```',\n",
" '```json\\n{\\n \"Relevance\": 4,\\n \"Completeness\": 3,\\n \"Conciseness\": 3\\n}\\n```',\n",
" '```json\\n{\\n \"Relevance\": 4,\\n \"Completeness\": 3,\\n \"Conciseness\": 2\\n}\\n```',\n",
" '```json\\n{\\n \"Relevance\": 4,\\n \"Completeness\": 3,\\n \"Conciseness\": 3\\n}\\n```',\n",
" '```json\\n{\\n \"Relevance\": 5,\\n \"Completeness\": 4,\\n \"Conciseness\": 3\\n}\\n```',\n",
" '```json\\n{\\n \"Relevance\": 5,\\n \"Completeness\": 4,\\n \"Conciseness\": 3\\n}\\n```',\n",
" '```json\\n{\\n \"Relevance\": 2,\\n \"Completeness\": 1,\\n \"Conciseness\": 2\\n}\\n```',\n",
" '```json\\n{\\n \"Relevance\": 4,\\n \"Completeness\": 3,\\n \"Conciseness\": 3\\n}\\n```',\n",
" '```json\\n{\\n \"Relevance\": 4,\\n \"Completeness\": 2,\\n \"Conciseness\": 3\\n}\\n```'],\n",
" 'average_scores': None}"
]
},
"execution_count": 76,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"evaluate_rag(chunks_query_retriever)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -19,7 +19,7 @@
"nest_asyncio.apply()\n",
"from dotenv import load_dotenv\n",
"\n",
"from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, ServiceContext\n",
"from llama_index.core import VectorStoreIndex, SimpleDirectoryReader\n",
"from llama_index.core.prompts import PromptTemplate\n",
"\n",
"from llama_index.core.evaluation import (\n",
@@ -28,6 +28,7 @@
" RelevancyEvaluator\n",
")\n",
"from llama_index.llms.openai import OpenAI\n",
"from llama_index.core import Settings\n",
"\n",
"import openai\n",
"import time\n",
@@ -90,11 +91,11 @@
"# We will use GPT-4 for evaluating the responses\n",
"gpt4 = OpenAI(temperature=0, model=\"gpt-4o\")\n",
"\n",
"# Define service context for GPT-4 for evaluation\n",
"service_context_gpt4 = ServiceContext.from_defaults(llm=gpt4)\n",
"# Set appropriate settings for the LLM\n",
"Settings.llm = gpt4\n",
"\n",
"# Define Faithfulness and Relevancy Evaluators which are based on GPT-4\n",
"faithfulness_gpt4 = FaithfulnessEvaluator(service_context=service_context_gpt4)\n",
"# Define Faithfulness Evaluators which are based on GPT-4\n",
"faithfulness_gpt4 = FaithfulnessEvaluator()\n",
"\n",
"faithfulness_new_prompt_template = PromptTemplate(\"\"\" Please tell if a given piece of information is directly supported by the context.\n",
" You need to answer with either YES or NO.\n",
@@ -123,7 +124,9 @@
" \"\"\")\n",
"\n",
"faithfulness_gpt4.update_prompts({\"your_prompt_key\": faithfulness_new_prompt_template}) # Update the prompts dictionary with the new prompt template\n",
"relevancy_gpt4 = RelevancyEvaluator(service_context=service_context_gpt4)"
"\n",
"# Define Relevancy Evaluators which are based on GPT-4\n",
"relevancy_gpt4 = RelevancyEvaluator()"
]
},
{
@@ -159,10 +162,12 @@
" # create vector index\n",
" llm = OpenAI(model=\"gpt-3.5-turbo\")\n",
"\n",
" service_context = ServiceContext.from_defaults(llm=llm, chunk_size=chunk_size, chunk_overlap=chunk_size//5) \n",
" vector_index = VectorStoreIndex.from_documents(\n",
" eval_documents, service_context=service_context\n",
" )\n",
" Settings.llm = llm\n",
" Settings.chunk_size = chunk_size\n",
" Settings.chunk_overlap = chunk_size // 5 \n",
"\n",
" vector_index = VectorStoreIndex.from_documents(eval_documents)\n",
" \n",
" # build query engine\n",
" query_engine = vector_index.as_query_engine(similarity_top_k=5)\n",
" num_questions = len(eval_questions)\n",
@@ -234,7 +239,7 @@
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
@@ -248,7 +253,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.0"
"version": "3.11.0"
}
},
"nbformat": 4,

View File

@@ -97,7 +97,7 @@
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"\n",
"\n",
"sys.path.append(os.path.abspath(os.path.join(os.getcwd(), '..'))) # Add the parent directory to the path sicnce we work with notebooks\n",
"sys.path.append(os.path.abspath(os.path.join(os.getcwd(), '..'))) # Add the parent directory to the path since we work with notebooks\n",
"from helper_functions import *\n",
"from evaluation.evalute_rag import *\n",
"\n",

View File

@@ -0,0 +1,609 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Dartboard RAG: Retrieval-Augmented Generation with Balanced Relevance and Diversity\n",
"\n",
"## Overview\n",
"The **Dartboard RAG** process addresses a common challenge in large knowledge bases: ensuring the retrieved information is both relevant and non-redundant. By explicitly optimizing a combined relevance-diversity scoring function, it prevents multiple top-k documents from offering the same information. This approach is drawn from the elegant method in thepaper:\n",
"\n",
"> [*Better RAG using Relevant Information Gain*](https://arxiv.org/abs/2407.12101)\n",
"\n",
"The paper outlines three variations of the core idea—hybrid RAG (dense + sparse), a cross-encoder version, and a vanilla approach. The **vanilla approach** conveys the fundamental concept most directly, and this implementation extends it with optional weights to control the balance between relevance and diversity.\n",
"\n",
"## Motivation\n",
"\n",
"1. **Dense, Overlapping Knowledge Bases** \n",
" In large databases, documents may repeat similar content, causing redundancy in top-k retrieval.\n",
"\n",
"2. **Improved Information Coverage** \n",
" Combining relevance and diversity yields a richer set of documents, mitigating the “echo chamber” effect of overly similar content.\n",
"\n",
"\n",
"## Key Components\n",
"\n",
"1. **Relevance & Diversity Combination** \n",
" - Computes a score factoring in both how pertinent a document is to the query and how distinct it is from already chosen documents.\n",
"\n",
"2. **Weighted Balancing** \n",
" - Introduces RELEVANCE_WEIGHT and DIVERSITY_WEIGHT to allow dynamic control of scoring. \n",
" - Helps in avoiding overly diverse but less relevant results.\n",
"\n",
"3. **Production-Ready Code** \n",
" - Derived from the official implementation yet reorganized for clarity. \n",
" - Allows easier integration into existing RAG pipelines.\n",
"\n",
"## Method Details\n",
"\n",
"1. **Document Retrieval** \n",
" - Obtain an initial set of candidate documents based on similarity (e.g., cosine or BM25). \n",
" - Typically retrieves top-N candidates as a starting point.\n",
"\n",
"2. **Scoring & Selection** \n",
" - Each documents overall score combines **relevance** and **diversity**: \n",
" - Select the highest-scoring document, then penalize documents that are overly similar to it. \n",
" - Repeat until top-k documents are identified.\n",
"\n",
"3. **Hybrid / Fusion & Cross-Encoder Support** \n",
" Essentially, all you need are distances between documents and the query, and distances between documents. You can easily extract these from hybrid / fusion retrieval or from cross-encoder retrieval. The only recommendation I have is to rely less on raking based scores.\n",
" - For **hybrid / fusion retrieval**: Merge similarities (dense and sparse / BM25) into a single distance. This can be achieved by combining cosine similarity over the dense and the sparse vectors (e.g. averaging them). the move to distances is straightforward (1 - mean cosine similarity). \n",
" - For **cross-encoders**: You can directly use the cross-encoder similarity scores (1- similarity), potentially adjusting with scaling factors.\n",
"\n",
"4. **Balancing & Adjustment** \n",
" - Tune DIVERSITY_WEIGHT and RELEVANCE_WEIGHT based on your needs and the density of your dataset. \n",
"\n",
"\n",
"\n",
"By integrating both **relevance** and **diversity** into retrieval, the Dartboard RAG approach ensures that top-k documents collectively offer richer, more comprehensive information—leading to higher-quality responses in Retrieval-Augmented Generation systems.\n",
"\n",
"The paper also has an official code implemention, and this code is based on it, but I think this one here is more readable, manageable and production ready."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Import libraries and environment variables"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Please enter your OpenAI API key: \n"
]
}
],
"source": [
"import os\n",
"import sys\n",
"from dotenv import load_dotenv\n",
"from scipy.special import logsumexp\n",
"from typing import Tuple, List, Any\n",
"import numpy as np\n",
"\n",
"# Load environment variables from a .env file\n",
"load_dotenv()\n",
"# Set the OpenAI API key environment variable (comment out if not using OpenAI)\n",
"if not os.getenv('OPENAI_API_KEY'):\n",
" print(\"Please enter your OpenAI API key: \")\n",
" os.environ[\"OPENAI_API_KEY\"] = input(\"Please enter your OpenAI API key: \")\n",
"else:\n",
" os.environ[\"OPENAI_API_KEY\"] = os.getenv('OPENAI_API_KEY')\n",
"\n",
"sys.path.append(os.path.abspath(os.path.join(os.getcwd(), '..'))) # Add the parent directory to the path since we work with notebooks\n",
"from helper_functions import *\n",
"from evaluation.evalute_rag import *\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Read Docs"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"path = \"../data/Understanding_Climate_Change.pdf\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Encode document"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"# this part is same like simple_rag.ipynb, only simulating a dense dataset\n",
"def encode_pdf(path, chunk_size=1000, chunk_overlap=200):\n",
" \"\"\"\n",
" Encodes a PDF book into a vector store using OpenAI embeddings.\n",
"\n",
" Args:\n",
" path: The path to the PDF file.\n",
" chunk_size: The desired size of each text chunk.\n",
" chunk_overlap: The amount of overlap between consecutive chunks.\n",
"\n",
" Returns:\n",
" A FAISS vector store containing the encoded book content.\n",
" \"\"\"\n",
"\n",
" # Load PDF documents\n",
" loader = PyPDFLoader(path)\n",
" documents = loader.load()\n",
" documents=documents*5 # load every document 5 times to emulate a dense dataset\n",
"\n",
" # Split documents into chunks\n",
" text_splitter = RecursiveCharacterTextSplitter(\n",
" chunk_size=chunk_size, chunk_overlap=chunk_overlap, length_function=len\n",
" )\n",
" texts = text_splitter.split_documents(documents)\n",
" cleaned_texts = replace_t_with_space(texts)\n",
"\n",
" # Create embeddings (Tested with OpenAI and Amazon Bedrock)\n",
" embeddings = get_langchain_embedding_provider(EmbeddingProvider.OPENAI)\n",
" #embeddings = get_langchain_embedding_provider(EmbeddingProvider.AMAZON_BEDROCK)\n",
"\n",
" # Create vector store\n",
" vectorstore = FAISS.from_documents(cleaned_texts, embeddings)\n",
"\n",
" return vectorstore"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create Vector store\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"chunks_vector_store = encode_pdf(path, chunk_size=1000, chunk_overlap=200)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Some helper functions for using the vector store for retrieval.\n",
"this part is same like simple_rag.ipynb, only its using the actual FAISS index (not the wrapper)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"\n",
"def idx_to_text(idx:int):\n",
" \"\"\"\n",
" Convert a Vector store index to the corresponding text.\n",
" \"\"\"\n",
" docstore_id = chunks_vector_store.index_to_docstore_id[idx]\n",
" document = chunks_vector_store.docstore.search(docstore_id)\n",
" return document.page_content\n",
"\n",
"\n",
"def get_context(query:str,k:int=5) -> Tuple[np.ndarray, np.ndarray, List[str]]:\n",
" \"\"\"\n",
" Retrieve top k context items for a query using top k retrieval.\n",
" \"\"\"\n",
" # regular top k retrieval\n",
" q_vec=chunks_vector_store.embedding_function.embed_documents([query])\n",
" _,indices=chunks_vector_store.index.search(np.array(q_vec),k=k)\n",
"\n",
" texts = [idx_to_text(i) for i in indices[0]]\n",
" return texts\n"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"\n",
"test_query = \"What is the main cause of climate change?\"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Regular top k retrieval\n",
"- This demonstration shows that when database is dense (here we simulate density by loading each document 5 times), the results are not good, we don't get the most relevant results. Note that the top 3 results are all repetitions of the same document."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Context 1:\n",
"driven by human activities, particularly the emission of greenhou se gases. \n",
"Chapter 2: Causes of Climate Change \n",
"Greenhouse Gases \n",
"The primary cause of recent climate change is the increase in greenhouse gases in the \n",
"atmosphere. Greenhouse gases, such as carbon dioxide (CO2), methane (CH4), and nitrous \n",
"oxide (N2O), trap heat from the sun, creating a \"greenhouse effect.\" This effect is essential \n",
"for life on Earth, as it keeps the planet warm enough to support life. However, human \n",
"activities have intensified this natural process, leading to a warmer climate. \n",
"Fossil Fuels \n",
"Burning fossil fuels for energy releases large amounts of CO2. This includes coal, oil, and \n",
"natural gas used for electricity, heating, and transportation. The industrial revolution marked \n",
"the beginning of a significant increase in fossil fuel consumption, which continues to rise \n",
"today. \n",
"Coal\n",
"\n",
"\n",
"Context 2:\n",
"driven by human activities, particularly the emission of greenhou se gases. \n",
"Chapter 2: Causes of Climate Change \n",
"Greenhouse Gases \n",
"The primary cause of recent climate change is the increase in greenhouse gases in the \n",
"atmosphere. Greenhouse gases, such as carbon dioxide (CO2), methane (CH4), and nitrous \n",
"oxide (N2O), trap heat from the sun, creating a \"greenhouse effect.\" This effect is essential \n",
"for life on Earth, as it keeps the planet warm enough to support life. However, human \n",
"activities have intensified this natural process, leading to a warmer climate. \n",
"Fossil Fuels \n",
"Burning fossil fuels for energy releases large amounts of CO2. This includes coal, oil, and \n",
"natural gas used for electricity, heating, and transportation. The industrial revolution marked \n",
"the beginning of a significant increase in fossil fuel consumption, which continues to rise \n",
"today. \n",
"Coal\n",
"\n",
"\n",
"Context 3:\n",
"driven by human activities, particularly the emission of greenhou se gases. \n",
"Chapter 2: Causes of Climate Change \n",
"Greenhouse Gases \n",
"The primary cause of recent climate change is the increase in greenhouse gases in the \n",
"atmosphere. Greenhouse gases, such as carbon dioxide (CO2), methane (CH4), and nitrous \n",
"oxide (N2O), trap heat from the sun, creating a \"greenhouse effect.\" This effect is essential \n",
"for life on Earth, as it keeps the planet warm enough to support life. However, human \n",
"activities have intensified this natural process, leading to a warmer climate. \n",
"Fossil Fuels \n",
"Burning fossil fuels for energy releases large amounts of CO2. This includes coal, oil, and \n",
"natural gas used for electricity, heating, and transportation. The industrial revolution marked \n",
"the beginning of a significant increase in fossil fuel consumption, which continues to rise \n",
"today. \n",
"Coal\n",
"\n",
"\n"
]
}
],
"source": [
"texts=get_context(test_query,k=3)\n",
"show_context(texts)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Now for the real part :) "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### More utils for distances normalization"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
"def lognorm(dist:np.ndarray, sigma:float):\n",
" \"\"\"\n",
" Calculate the log-normal probability for a given distance and sigma.\n",
" \"\"\"\n",
" if sigma < 1e-9: \n",
" return -np.inf * dist\n",
" return -np.log(sigma) - 0.5 * np.log(2 * np.pi) - dist**2 / (2 * sigma**2)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Greedy Dartboard Search\n",
"\n",
"This is the core algorithm: A search algorithm that selects a diverse set of relevant documents from a collection by balancing two factors: relevance to the query and diversity among selected documents.\n",
"\n",
"Given distances between a query and documents, plus distances between all documents, the algorithm:\n",
"\n",
"1. Selects the most relevant document first\n",
"2. Iteratively selects additional documents by combining:\n",
" - Relevance to the original query\n",
" - Diversity from previously selected documents\n",
"\n",
"The balance between relevance and diversity is controlled by weights:\n",
"- `DIVERSITY_WEIGHT`: Importance of difference from existing selections\n",
"- `RELEVANCE_WEIGHT`: Importance of relevance to query\n",
"- `SIGMA`: Smoothing parameter for probability conversion\n",
"\n",
"The algorithm returns both the selected documents and their selection scores, making it useful for applications like search results where you want relevant but varied results.\n",
"\n",
"For example, when searching news articles, it would first return the most relevant article, then find articles that are both on-topic and provide new information, avoiding redundant selections."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Configuration parameters\n",
"DIVERSITY_WEIGHT = 1.0 # Weight for diversity in document selection\n",
"RELEVANCE_WEIGHT = 1.0 # Weight for relevance to query\n",
"SIGMA = 0.1 # Smoothing parameter for probability distribution\n",
"\n",
"def greedy_dartsearch(\n",
" query_distances: np.ndarray,\n",
" document_distances: np.ndarray,\n",
" documents: List[str],\n",
" num_results: int\n",
") -> Tuple[List[str], List[float]]:\n",
" \"\"\"\n",
" Perform greedy dartboard search to select top k documents balancing relevance and diversity.\n",
" \n",
" Args:\n",
" query_distances: Distance between query and each document\n",
" document_distances: Pairwise distances between documents\n",
" documents: List of document texts\n",
" num_results: Number of documents to return\n",
" \n",
" Returns:\n",
" Tuple containing:\n",
" - List of selected document texts\n",
" - List of selection scores for each document\n",
" \"\"\"\n",
" # Avoid division by zero in probability calculations\n",
" sigma = max(SIGMA, 1e-5)\n",
" \n",
" # Convert distances to probability distributions\n",
" query_probabilities = lognorm(query_distances, sigma)\n",
" document_probabilities = lognorm(document_distances, sigma)\n",
" \n",
" # Initialize with most relevant document\n",
" \n",
" most_relevant_idx = np.argmax(query_probabilities)\n",
" selected_indices = np.array([most_relevant_idx])\n",
" selection_scores = [1.0] # dummy score for the first document\n",
" # Get initial distances from the first selected document\n",
" max_distances = document_probabilities[most_relevant_idx]\n",
" \n",
" # Select remaining documents\n",
" while len(selected_indices) < num_results:\n",
" # Update maximum distances considering new document\n",
" updated_distances = np.maximum(max_distances, document_probabilities)\n",
" \n",
" # Calculate combined diversity and relevance scores\n",
" combined_scores = (\n",
" updated_distances * DIVERSITY_WEIGHT +\n",
" query_probabilities * RELEVANCE_WEIGHT\n",
" )\n",
" \n",
" # Normalize scores and mask already selected documents\n",
" normalized_scores = logsumexp(combined_scores, axis=1)\n",
" normalized_scores[selected_indices] = -np.inf\n",
" \n",
" # Select best remaining document\n",
" best_idx = np.argmax(normalized_scores)\n",
" best_score = np.max(normalized_scores)\n",
" \n",
" # Update tracking variables\n",
" max_distances = updated_distances[best_idx]\n",
" selected_indices = np.append(selected_indices, best_idx)\n",
" selection_scores.append(best_score)\n",
" \n",
" # Return selected documents and their scores\n",
" selected_documents = [documents[i] for i in selected_indices]\n",
" return selected_documents, selection_scores"
]
},
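  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before running it on the real index, here is a minimal sketch of `greedy_dartsearch` on four hypothetical documents (made-up 2-D vectors, not real embeddings), one of which is an exact duplicate. The duplicate is maximally relevant but adds no diversity, so the algorithm should skip it in favor of a distinct document."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Toy demonstration (hypothetical vectors, not real embeddings)\n",
    "toy_docs = [\"doc A\", \"doc A (exact duplicate)\", \"doc B\", \"doc C\"]\n",
    "toy_vecs = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])\n",
    "toy_vecs /= np.linalg.norm(toy_vecs, axis=1, keepdims=True)  # unit-normalize\n",
    "toy_query = np.array([[1.0, 0.0]])\n",
    "toy_query_distances = 1 - toy_query @ toy_vecs.T      # shape (1, 4)\n",
    "toy_document_distances = 1 - toy_vecs @ toy_vecs.T    # shape (4, 4)\n",
    "docs, scores = greedy_dartsearch(toy_query_distances, toy_document_distances, toy_docs, 2)\n",
    "print(docs)  # expected: ['doc A', 'doc C'] - the duplicate is skipped"
   ]
  },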
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Dartboard Context Retrieval\n",
"\n",
"### Main function for using the dartboard retrieval. This serves instead of get_context (which is simple RAG). It:\n",
"\n",
"1. Takes a text query, vectorizes it, gets the top k documents (and their vectors) via simple RAG\n",
"2. Uses these vectors to calculate the similarities to query and between candidate matches\n",
"3. Runs the dartboard algorithm to refine the candidate matches to a final list of k documents\n",
"4. Returns the final list of documents and their scores"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"\n",
"def get_context_with_dartboard(\n",
" query: str,\n",
" num_results: int = 5,\n",
" oversampling_factor: int = 3\n",
") -> Tuple[List[str], List[float]]:\n",
" \"\"\"\n",
" Retrieve most relevant and diverse context items for a query using the dartboard algorithm.\n",
" \n",
" Args:\n",
" query: The search query string\n",
" num_results: Number of context items to return (default: 5)\n",
" oversampling_factor: Factor to oversample initial results for better diversity (default: 3)\n",
" \n",
" Returns:\n",
" Tuple containing:\n",
" - List of selected context texts\n",
" - List of selection scores\n",
" \n",
" Note:\n",
" The function uses cosine similarity converted to distance. Initial retrieval \n",
" fetches oversampling_factor * num_results items to ensure sufficient diversity \n",
" in the final selection.\n",
" \"\"\"\n",
" # Embed query and retrieve initial candidates\n",
" query_embedding = chunks_vector_store.embedding_function.embed_documents([query])\n",
" _, candidate_indices = chunks_vector_store.index.search(\n",
" np.array(query_embedding),\n",
" k=num_results * oversampling_factor\n",
" )\n",
" \n",
" # Get document vectors and texts for candidates\n",
" candidate_vectors = np.array(\n",
" chunks_vector_store.index.reconstruct_batch(candidate_indices[0])\n",
" )\n",
" candidate_texts = [idx_to_text(idx) for idx in candidate_indices[0]]\n",
" \n",
" # Calculate distance matrices\n",
" # Using 1 - cosine_similarity as distance metric\n",
" document_distances = 1 - np.dot(candidate_vectors, candidate_vectors.T)\n",
" query_distances = 1 - np.dot(query_embedding, candidate_vectors.T)\n",
" \n",
" # Apply dartboard selection algorithm\n",
" selected_texts, selection_scores = greedy_dartsearch(\n",
" query_distances,\n",
" document_distances,\n",
" candidate_texts,\n",
" num_results\n",
" )\n",
" \n",
" return selected_texts, selection_scores"
]
},
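  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note: `1 - dot product` equals cosine distance only when the embedding vectors are unit-normalized. OpenAI's embedding models return unit-length vectors, so this holds here; if you swap in a different embedding model, it is worth verifying first. A minimal check (sketch):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sanity check: embeddings should have unit norm for 1 - dot to be a cosine distance\n",
    "sample_vec = np.array(chunks_vector_store.embedding_function.embed_documents([\"sample text\"])[0])\n",
    "print(np.linalg.norm(sample_vec))  # expected: ~1.0"
   ]
  },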
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### dartboard retrieval - results on same query, k, and dataset\n",
"- As you can see now the top 3 results are not mere repetitions. "
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Context 1:\n",
"driven by human activities, particularly the emission of greenhou se gases. \n",
"Chapter 2: Causes of Climate Change \n",
"Greenhouse Gases \n",
"The primary cause of recent climate change is the increase in greenhouse gases in the \n",
"atmosphere. Greenhouse gases, such as carbon dioxide (CO2), methane (CH4), and nitrous \n",
"oxide (N2O), trap heat from the sun, creating a \"greenhouse effect.\" This effect is essential \n",
"for life on Earth, as it keeps the planet warm enough to support life. However, human \n",
"activities have intensified this natural process, leading to a warmer climate. \n",
"Fossil Fuels \n",
"Burning fossil fuels for energy releases large amounts of CO2. This includes coal, oil, and \n",
"natural gas used for electricity, heating, and transportation. The industrial revolution marked \n",
"the beginning of a significant increase in fossil fuel consumption, which continues to rise \n",
"today. \n",
"Coal\n",
"\n",
"\n",
"Context 2:\n",
"Most of these climate changes are attributed to very small variations in Earth's orbit that \n",
"change the amount of solar energy our planet receives. During the Holocene epoch, which \n",
"began at the end of the last ice age, human societies f lourished, but the industrial era has seen \n",
"unprecedented changes. \n",
"Modern Observations \n",
"Modern scientific observations indicate a rapid increase in global temperatures, sea levels, \n",
"and extreme weather events. The Intergovernmental Panel on Climate Change (IPCC) has \n",
"documented these changes extensively. Ice core samples, tree rings, and ocean sediments \n",
"provide a historical record that scientists use to understand past climate conditions and \n",
"predict future trends. The evidence overwhelmingly shows that recent changes are primarily \n",
"driven by human activities, particularly the emission of greenhou se gases. \n",
"Chapter 2: Causes of Climate Change \n",
"Greenhouse Gases\n",
"\n",
"\n",
"Context 3:\n",
"driven by human activities, particularly the emission of greenhou se gases. \n",
"Chapter 2: Causes of Climate Change \n",
"Greenhouse Gases \n",
"The primary cause of recent climate change is the increase in greenhouse gases in the \n",
"atmosphere. Greenhouse gases, such as carbon dioxide (CO2), methane (CH4), and nitrous \n",
"oxide (N2O), trap heat from the sun, creating a \"greenhouse effect.\" This effect is essential \n",
"for life on Earth, as it keeps the planet warm enough to support life. However, human \n",
"activities have intensified this natural process, leading to a warmer climate. \n",
"Fossil Fuels \n",
"Burning fossil fuels for energy releases large amounts of CO2. This includes coal, oil, and \n",
"natural gas used for electricity, heating, and transportation. The industrial revolution marked \n",
"the beginning of a significant increase in fossil fuel consumption, which continues to rise \n",
"today. \n",
"Coal\n",
"\n",
"\n"
]
}
],
"source": [
"texts,scores=get_context_with_dartboard(test_query,k=3)\n",
"show_context(texts)\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -239,7 +239,7 @@
"outputs": [],
"source": [
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"from pydantic import BaseModel, Field\n",
"from langchain_groq import ChatGroq\n",
"\n",
"# Data model\n",

View File

@@ -8,7 +8,7 @@
"\n",
"## Overview\n",
"\n",
"This code implements a semantic chunking approach for processing and retrieving information from PDF documents, [first proposed by Greg Kamradt](https://youtu.be/8OJC21T2SL4?t=1933) and subsequently [implemented in LangChain](https://docs.llamaindex.ai/en/stable/examples/node_parsers/semantic_chunking/). Unlike traditional methods that split text based on fixed character or word counts, semantic chunking aims to create more meaningful and context-aware text segments.\n",
"This code implements a semantic chunking approach for processing and retrieving information from PDF documents, [first proposed by Greg Kamradt](https://youtu.be/8OJC21T2SL4?t=1933) and subsequently [implemented in LangChain](https://python.langchain.com/docs/how_to/semantic-chunker/). Unlike traditional methods that split text based on fixed character or word counts, semantic chunking aims to create more meaningful and context-aware text segments.\n",
"\n",
"## Motivation\n",
"\n",

View File

@@ -289,7 +289,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.3"
"version": "3.12.0"
}
},
"nbformat": 4,

View File

@@ -0,0 +1,203 @@
import os
import sys
import argparse
import time
import faiss
from dotenv import load_dotenv
from tqdm import tqdm
from concurrent.futures import ThreadPoolExecutor, as_completed
from langchain_community.docstore.in_memory import InMemoryDocstore
# Add the parent directory to the path since we work with notebooks
sys.path.append(os.path.abspath(os.path.join(os.getcwd(), '..')))
from helper_functions import *
from evaluation.evalute_rag import *
# Load environment variables from a .env file (e.g., OpenAI API key)
load_dotenv()
os.environ["OPENAI_API_KEY"] = os.getenv('OPENAI_API_KEY')
class HyPE:
"""
A class to handle the HyPE RAG process, which enhances document chunking by
generating hypothetical questions as proxies for retrieval.
"""
def __init__(self, path, chunk_size=1000, chunk_overlap=200, n_retrieved=3):
"""
Initializes the HyPE-based RAG retriever by encoding the PDF document with
hypothetical prompt embeddings.
Args:
path (str): Path to the PDF file to encode.
chunk_size (int): Size of each text chunk (default: 1000).
chunk_overlap (int): Overlap between consecutive chunks (default: 200).
n_retrieved (int): Number of chunks to retrieve for each query (default: 3).
"""
print("\n--- Initializing HyPE RAG Retriever ---")
# Encode the PDF document into a FAISS vector store using hypothetical prompt embeddings
start_time = time.time()
self.vector_store = self.encode_pdf(path, chunk_size=chunk_size, chunk_overlap=chunk_overlap)
self.time_records = {'Chunking': time.time() - start_time}
print(f"Chunking Time: {self.time_records['Chunking']:.2f} seconds")
# Create a retriever from the vector store
self.chunks_query_retriever = self.vector_store.as_retriever(search_kwargs={"k": n_retrieved})
def generate_hypothetical_prompt_embeddings(self, chunk_text):
"""
Uses an LLM to generate multiple hypothetical questions for a single chunk.
These questions act as 'proxies' for the chunk during retrieval.
Parameters:
chunk_text (str): Text contents of the chunk.
Returns:
tuple: (Original chunk text, List of embedding vectors generated from the questions)
"""
llm = ChatOpenAI(temperature=0, model_name="gpt-4o-mini")
embedding_model = OpenAIEmbeddings(model="text-embedding-3-small")
question_gen_prompt = PromptTemplate.from_template(
"Analyze the input text and generate essential questions that, when answered, \
capture the main points of the text. Each question should be one line, \
without numbering or prefixes.\n\n \
Text:\n{chunk_text}\n\nQuestions:\n"
)
question_chain = question_gen_prompt | llm | StrOutputParser()
# Parse questions from response
questions = question_chain.invoke({"chunk_text": chunk_text}).replace("\n\n", "\n").split("\n")
return chunk_text, embedding_model.embed_documents(questions)
def prepare_vector_store(self, chunks):
"""
Creates and populates a FAISS vector store using hypothetical prompt embeddings.
Parameters:
chunks (List[str]): A list of text chunks to be embedded and stored.
Returns:
FAISS: A FAISS vector store containing the embedded text chunks.
"""
vector_store = None # Wait to initialize to determine vector size
with ThreadPoolExecutor() as pool:
# Parallelized embedding generation
futures = [pool.submit(self.generate_hypothetical_prompt_embeddings, c) for c in chunks]
for f in tqdm(as_completed(futures), total=len(chunks)):
chunk, vectors = f.result() # Retrieve processed chunk and embeddings
# Initialize FAISS store once vector size is known
if vector_store is None:
vector_store = FAISS(
embedding_function=OpenAIEmbeddings(model="text-embedding-3-small"),
index=faiss.IndexFlatL2(len(vectors[0])),
docstore=InMemoryDocstore(),
index_to_docstore_id={}
)
# Store multiple vector representations per chunk
chunks_with_embedding_vectors = [(chunk.page_content, vec) for vec in vectors]
vector_store.add_embeddings(chunks_with_embedding_vectors)
return vector_store
def encode_pdf(self, path, chunk_size=1000, chunk_overlap=200):
"""
Encodes a PDF document into a vector store using hypothetical prompt embeddings.
Args:
path: The path to the PDF file.
chunk_size: The size of each text chunk.
chunk_overlap: The overlap between consecutive chunks.
Returns:
A FAISS vector store containing the encoded book content.
"""
# Load PDF documents
loader = PyPDFLoader(path)
documents = loader.load()
# Split documents into chunks
text_splitter = RecursiveCharacterTextSplitter(
chunk_size=chunk_size, chunk_overlap=chunk_overlap, length_function=len
)
texts = text_splitter.split_documents(documents)
cleaned_texts = replace_t_with_space(texts)
return self.prepare_vector_store(cleaned_texts)
def run(self, query):
"""
Retrieves and displays the context for the given query.
Args:
query (str): The query to retrieve context for.
Returns:
None
"""
# Measure retrieval time
start_time = time.time()
context = retrieve_context_per_question(query, self.chunks_query_retriever)
self.time_records['Retrieval'] = time.time() - start_time
print(f"Retrieval Time: {self.time_records['Retrieval']:.2f} seconds")
# Deduplicate context and display results
context = list(set(context))
show_context(context)
def validate_args(args):
if args.chunk_size <= 0:
raise ValueError("chunk_size must be a positive integer.")
if args.chunk_overlap < 0:
raise ValueError("chunk_overlap must be a non-negative integer.")
if args.n_retrieved <= 0:
raise ValueError("n_retrieved must be a positive integer.")
return args
def parse_args():
parser = argparse.ArgumentParser(description="Encode a PDF document and test a HyPE-based RAG system.")
parser.add_argument("--path", type=str, default="../data/Understanding_Climate_Change.pdf",
help="Path to the PDF file to encode.")
parser.add_argument("--chunk_size", type=int, default=1000,
help="Size of each text chunk (default: 1000).")
parser.add_argument("--chunk_overlap", type=int, default=200,
help="Overlap between consecutive chunks (default: 200).")
parser.add_argument("--n_retrieved", type=int, default=3,
help="Number of chunks to retrieve for each query (default: 3).")
parser.add_argument("--query", type=str, default="What is the main cause of climate change?",
help="Query to test the retriever (default: 'What is the main cause of climate change?').")
parser.add_argument("--evaluate", action="store_true",
help="Whether to evaluate the retriever's performance (default: False).")
return validate_args(parser.parse_args())
def main(args):
# Initialize the HyPE-based RAG Retriever
hyperag = HyPE(
path=args.path,
chunk_size=args.chunk_size,
chunk_overlap=args.chunk_overlap,
n_retrieved=args.n_retrieved
)
# Retrieve context based on the query
hyperag.run(args.query)
# Evaluate the retriever's performance on the query (if requested)
if args.evaluate:
evaluate_rag(hyperag.chunks_query_retriever)
if __name__ == '__main__':
# Call the main function with parsed arguments
main(parse_args())
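# Example invocation (a sketch; assumes this script is saved as hype.py and the
# default sample PDF exists at ../data/Understanding_Climate_Change.pdf):
#   python hype.py --chunk_size 1000 --chunk_overlap 200 --n_retrieved 3 --evaluate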

View File

@@ -1,11 +1,10 @@
import os
import sys
from dotenv import load_dotenv
from langchain.prompts import PromptTemplate
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.prompts import PromptTemplate
from langchain_core.prompts import PromptTemplate
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import CharacterTextSplitter
from langchain_core.retrievers import BaseRetriever
from typing import List, Dict, Any

View File

@@ -7,6 +7,7 @@ from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.core.prompts import PromptTemplate
from llama_index.core.evaluation import DatasetGenerator, FaithfulnessEvaluator, RelevancyEvaluator
from llama_index.llms.openai import OpenAI
from llama_index.core.node_parser import SentenceSplitter
# Apply asyncio fix for Jupyter notebooks
nest_asyncio.apply()
@@ -44,7 +45,8 @@ def evaluate_response_time_and_accuracy(chunk_size, eval_questions, eval_documen
Settings.llm = llm
# Create vector index
vector_index = VectorStoreIndex.from_documents(eval_documents)
splitter = SentenceSplitter(chunk_size=chunk_size)
vector_index = VectorStoreIndex.from_documents(eval_documents, transformations=[splitter])
# Build query engine
query_engine = vector_index.as_query_engine(similarity_top_k=5)

View File

@@ -1,5 +1,6 @@
FORM 10-K FORM 10-KUNITED STATES
SECURITIES AND EXCHANGE COMMISSION
Washington, D.C.
Washington, D.C. 20549
FORM 10-K
(Mark One)

View File

@@ -14,12 +14,14 @@ Custom modules:
"""
import json
from typing import List, Tuple
from typing import List, Tuple, Dict, Any
from deepeval import evaluate
from deepeval.metrics import GEval, FaithfulnessMetric, ContextualRelevancyMetric
from deepeval.test_case import LLMTestCase, LLMTestCaseParams
from langchain_openai import ChatOpenAI
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
# 09/15/24 kimmeyh Added path where helper functions is located to the path
# Add the parent directory to the path since we work with notebooks
@@ -90,41 +92,75 @@ relevance_metric = ContextualRelevancyMetric(
include_reason=True
)
def evaluate_rag(chunks_query_retriever, num_questions: int = 5) -> None:
def evaluate_rag(retriever, num_questions: int = 5) -> Dict[str, Any]:
"""
Evaluate the RAG system using predefined metrics.
Evaluates a RAG system using predefined test questions and metrics.
Args:
chunks_query_retriever: Function to retrieve context chunks for a given query.
num_questions (int): Number of questions to evaluate (default: 5).
retriever: The retriever component to evaluate
num_questions: Number of test questions to generate
Returns:
Dict containing evaluation metrics
"""
llm = ChatOpenAI(temperature=0, model_name="gpt-4o", max_tokens=2000)
question_answer_from_context_chain = create_question_answer_from_context_chain(llm)
# Load questions and answers from JSON file
q_a_file_name = "../data/q_a.json"
with open(q_a_file_name, "r", encoding="utf-8") as json_file:
q_a = json.load(json_file)
questions = [qa["question"] for qa in q_a][:num_questions]
ground_truth_answers = [qa["answer"] for qa in q_a][:num_questions]
generated_answers = []
retrieved_documents = []
# Generate answers and retrieve documents for each question
for question in questions:
context = retrieve_context_per_question(question, chunks_query_retriever)
retrieved_documents.append(context)
context_string = " ".join(context)
result = answer_question_from_context(question, context_string, question_answer_from_context_chain)
generated_answers.append(result["answer"])
# Create test cases and evaluate
test_cases = create_deep_eval_test_cases(questions, ground_truth_answers, generated_answers, retrieved_documents)
evaluate(
test_cases=test_cases,
metrics=[correctness_metric, faithfulness_metric, relevance_metric]
# Initialize LLM
llm = ChatOpenAI(temperature=0, model_name="gpt-4-turbo-preview")
# Create evaluation prompt
eval_prompt = PromptTemplate.from_template("""
Evaluate the following retrieval results for the question.
Question: {question}
Retrieved Context: {context}
Rate on a scale of 1-5 (5 being best) for:
1. Relevance: How relevant is the retrieved information to the question?
2. Completeness: Does the context contain all necessary information?
3. Conciseness: Is the retrieved context focused and free of irrelevant information?
Provide ratings in JSON format:
""")
# Create evaluation chain
eval_chain = (
eval_prompt
| llm
| StrOutputParser()
)
# Generate test questions
question_gen_prompt = PromptTemplate.from_template(
"Generate {num_questions} diverse test questions about climate change:"
)
question_chain = question_gen_prompt | llm | StrOutputParser()
questions = question_chain.invoke({"num_questions": num_questions}).split("\n")
# Evaluate each question
results = []
for question in questions:
# Get retrieval results
context = retriever.get_relevant_documents(question)
context_text = "\n".join([doc.page_content for doc in context])
# Evaluate results
eval_result = eval_chain.invoke({
"question": question,
"context": context_text
})
results.append(eval_result)
return {
"questions": questions,
"results": results,
"average_scores": calculate_average_scores(results)
}
def calculate_average_scores(results: List[Dict]) -> Dict[str, float]:
"""Calculate average scores across all evaluation results."""
# Implementation depends on the exact format of your results
pass
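# A possible implementation (a sketch, not part of the commit). Each result is the
# raw string produced by StrOutputParser, so this assumes the LLM returns JSON of
# numeric ratings such as '{"Relevance": 4, "Completeness": 3, "Conciseness": 5}':
#
#     def calculate_average_scores(results: List[str]) -> Dict[str, float]:
#         parsed = [json.loads(r) for r in results]
#         totals: Dict[str, float] = {}
#         for record in parsed:
#             for metric, score in record.items():
#                 totals[metric] = totals.get(metric, 0.0) + float(score)
#         return {metric: total / len(parsed) for metric, total in totals.items()}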
if __name__ == "__main__":
# Add any necessary setup or configuration here

View File

@@ -17,7 +17,7 @@ from enum import Enum
def replace_t_with_space(list_of_documents):
"""
Replaces all tab characters ('\t') with spaces in the page content of each document.
Replaces all tab characters ('\t') with spaces in the page content of each document
Args:
list_of_documents: A list of document objects, each with a 'page_content' attribute.

images/hype.svg Normal file

File diff suppressed because one or more lines are too long

Size: 98 KiB

View File

@@ -208,3 +208,55 @@ nbformat==5.10.4
xxhash==3.5.0
yarl==1.10.0
zipp==3.20.1
# Core LangChain packages
langchain>=0.1.0
langchain-core>=0.1.17
langchain-community>=0.0.13
langchain-openai>=0.0.5
langchain-anthropic>=0.0.9
langchain-groq>=0.0.1
langchain-cohere>=0.0.1
# Vector stores and embeddings
faiss-cpu>=1.7.4
chromadb>=0.4.22
# Document processing
PyMuPDF>=1.23.8 # for fitz
python-docx>=1.0.1
pypdf>=3.17.4
rank-bm25>=0.2.2
# Machine Learning and Data Science
numpy>=1.24.3
pandas>=2.0.3
scikit-learn>=1.3.0
# API Clients
openai>=1.12.0
anthropic>=0.8.1
cohere>=4.48
groq>=0.4.2
# Testing and Evaluation
pytest>=7.4.0
deepeval>=0.20.12
grouse>=0.3.0
# Development Tools
python-dotenv>=1.0.0
jupyter>=1.0.0
notebook>=7.0.6
ipykernel>=6.29.2
# Type Checking
pydantic>=2.6.1
typing-extensions>=4.9.0
# Async Support
aiohttp>=3.9.1
asyncio>=3.4.3
# Utilities
tqdm>=4.66.1

View File

@@ -1,10 +1,18 @@
import pytest
import os
import sys
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_core.prompts import PromptTemplate
from langchain_text_splitters import CharacterTextSplitter
from dotenv import load_dotenv
# Add the main folder to sys.path
sys.path.append(os.path.abspath(os.path.dirname(__file__) + "/../"))
# Load environment variables
load_dotenv()
def pytest_addoption(parser):
parser.addoption(
"--exclude", action="store", help="Comma-separated list of notebook or script files' paths to exclude"
@@ -40,4 +48,58 @@ def script_paths(request):
path_with_full_address = [folder + s for s in include_scripts]
return path_with_full_address
return path_with_full_address
@pytest.fixture(scope="session")
def llm():
"""Fixture for ChatOpenAI model."""
return ChatOpenAI(
temperature=0,
model_name="gpt-4-turbo-preview",
max_tokens=4000
)
@pytest.fixture(scope="session")
def embeddings():
"""Fixture for OpenAI embeddings."""
return OpenAIEmbeddings()
@pytest.fixture(scope="session")
def text_splitter():
"""Fixture for text splitter."""
return CharacterTextSplitter(
chunk_size=1000,
chunk_overlap=200
)
@pytest.fixture(scope="session")
def sample_texts():
"""Fixture for sample test data."""
return [
"The Earth is the third planet from the Sun.",
"Climate change is a significant global challenge.",
"Renewable energy sources include solar and wind power."
]
@pytest.fixture(scope="session")
def vector_store(embeddings, sample_texts, text_splitter):
"""Fixture for vector store."""
docs = text_splitter.create_documents(sample_texts)
return FAISS.from_documents(docs, embeddings)
@pytest.fixture(scope="session")
def retriever(vector_store):
"""Fixture for retriever."""
return vector_store.as_retriever(search_kwargs={"k": 2})
@pytest.fixture(scope="session")
def basic_prompt():
"""Fixture for basic prompt template."""
return PromptTemplate.from_template("""
Answer the following question based on the context provided:
Context: {context}
Question: {question}
Answer:
""")