Update README.md - docs to docs site (#60)

* Update README.md - docs to docs site

* Update README.md

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>

---------

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
2024-09-08 19:13:11 +03:00 · 2024-08-27 12:56:17 -07:00
parent 7aa5cea7da
commit 383a2c22af
1 changed files with 7 additions and 203 deletions
--- a/README.md
+++ b/README.md
@@ -69,11 +69,15 @@ Optional:
 > [!TIP]
 > The simplest way to install Neo4j is via [Neo4j Desktop](https://neo4j.com/download/). It provides a user-friendly interface to manage Neo4j instances and databases.
-`pip install graphiti-core`
+```bash
 pip install graphiti-core
 ```
 or
-`poetry add graphiti-core`
+```bash
 poetry add graphiti-core
 ```
@@ -145,207 +149,7 @@ graphiti.close()
 ## Documentation
-### Adding Episodes
+Visit the Zep knowledge base for graphiti [Guides and API documentation](https://help.getzep.com/graphiti/graphiti).
 Episodes represent a single data ingestion event. An `episode` is itself a node, and any nodes identified while ingesting the
 episode are related to the episode via `MENTIONS` edges.
 Episodes enable querying for information at a point in time and understanding the provenance of nodes and their edge relationships.
 Supported episode types:
 - `text`: Unstructured text data
 - `message`: Conversational messages of the format `speaker: message...`
 - `json`: Structured data, processed distinctly from the other types
 The graph below was generated using the code in the [Quick Start](#quick-start). Each "podcast" is an individual episode.
 ![Simple Graph Visualization](images/simple_graph.svg)
 #### Adding a `text` or `message` Episode
 Using the `EpisodeType.text` type:
 ```python
 await graphiti.add_episode(
    name="tech_innovation_article",
    episode_body=(
        "MIT researchers have unveiled 'ClimateNet', an AI system capable of predicting "
        "climate patterns with unprecedented accuracy. Early tests show it can forecast "
        "major weather events up to three weeks in advance, potentially revolutionizing "
        "disaster preparedness and agricultural planning."
    ),
    source=EpisodeType.text,
    # A description of the source (e.g., "podcast", "news article")
    source_description="Technology magazine article",
    # The timestamp for when this episode occurred or was created
    reference_time=datetime(2023, 11, 15, 9, 30),
    # Additional metadata about the episode (optional)
    metadata={
        "author": "Zara Patel",
        "publication": "Tech Horizons Monthly",
        "word_count": 39
    }
 )
 ```
 Using the `EpisodeType.message` type supports passing in multi-turn conversations in the `episode_body`.
 The text should be structured in `{role/name}: {message}` pairs.
 ```python
 await graphiti.add_episode(
    name="Customer_Support_Interaction_1",
    episode_body=(
        "Customer: Hi, I'm having trouble with my Allbirds shoes. "
        "The sole is coming off after only 2 months of use.\n"
        "Support: I'm sorry to hear that. Can you please provide your order number?"
    ),
    source=EpisodeType.message,
    source_description="Customer support chat",
    reference_time=datetime(2024, 3, 15, 14, 45),
    metadata={
        "channel": "Live Chat",
        "agent_id": "SP001",
        "customer_id": "C12345"
    }
 )
 ```
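When the conversation is held as a list of turns rather than a single string, the `{role/name}: {message}` layout described above can be produced with a small helper. This is a minimal sketch: `format_message_episode` is a hypothetical name used for illustration and is not part of graphiti.

```python
def format_message_episode(turns):
    """Join (speaker, message) pairs into newline-separated 'speaker: message' lines.

    Hypothetical helper for illustration; graphiti does not ship this function.
    """
    return "\n".join(f"{speaker}: {message}" for speaker, message in turns)


turns = [
    ("Customer", "Hi, I'm having trouble with my Allbirds shoes."),
    ("Support", "I'm sorry to hear that. Can you please provide your order number?"),
]

# The resulting string can be passed as episode_body with source=EpisodeType.message
episode_body = format_message_episode(turns)
print(episode_body)
```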
 #### Adding an Episode using structured data in JSON format
 JSON documents can be arbitrarily nested. However, it's advisable to keep documents compact, as they must fit within your LLM's context window.
 > [!TIP]
 > For large data imports, consider using the `add_episode_bulk` API to efficiently add multiple episodes at once.
 ```python
 product_data = {
    "id": "PROD001",
    "name": "Men's SuperLight Wool Runners",
    "color": "Dark Grey",
    "sole_color": "Medium Grey",
    "material": "Wool",
    "technology": "SuperLight Foam",
    "price": 125.00,
    "in_stock": True,
    "last_updated": "2024-03-15T10:30:00Z"
 }
 # Add the episode to the graph
 await graphiti.add_episode(
    name="Product Update - PROD001",
    episode_body=product_data,  # Pass the Python dictionary directly
    source=EpisodeType.json,
    source_description="Allbirds product catalog update",
    reference_time=datetime.now(),
    metadata={
        "update_type": "product_info",
        "catalog_version": "v2.3"
    }
 )
 ```
 #### Loading Episodes in Bulk
 Graphiti offers `add_episode_bulk` for efficient batch ingestion of episodes, significantly outperforming `add_episode` for large datasets. This method is highly recommended for bulk loading.
 > [!WARNING]
 > Use `add_episode_bulk` only for populating empty graphs or when edge invalidation is not required. The bulk ingestion pipeline does not perform edge invalidation operations.
 ```python
 product_data = [
    {
        "id": "PROD001",
        "name": "Men's SuperLight Wool Runners",
        "color": "Dark Grey",
        "sole_color": "Medium Grey",
        "material": "Wool",
        "technology": "SuperLight Foam",
        "price": 125.00,
        "in_stock": True,
        "last_updated": "2024-03-15T10:30:00Z"
    },
    ...
    {
        "id": "PROD0100",
        "name": "Kids Wool Runner-up Mizzles",
        "color": "Natural Grey",
        "sole_color": "Orange",
        "material": "Wool",
        "technology": "Water-repellent",
        "price": 80.00,
        "in_stock": True,
        "last_updated": "2024-03-17T14:45:00Z"
    }
 ]
 # Prepare the episodes for bulk loading
 bulk_episodes = [
    RawEpisode(
        name=f"Product Update - {product['id']}",
        content=json.dumps(product),
        source=EpisodeType.json,
        source_description="Allbirds product catalog update",
        reference_time=datetime.now()
    )
    for product in product_data
 ]
 await graphiti.add_episode_bulk(bulk_episodes)
 ```
 ### Searching graphiti's graph
 The examples below demonstrate two search approaches in the graphiti library:
 1. **Hybrid Search:**
   ```python
   await graphiti.search(query)
   ```
   Combines semantic similarity and BM25 retrieval, reranked using Reciprocal Rank Fusion.
   Example: Retrieves a broad set of facts related to Allbirds Wool Runners and Jane's purchase.
 2. **Node Distance Reranking:**
   ```python
   await graphiti.search(query, focal_node_uuid)
   ```
   Extends Hybrid Search above by prioritizing results based on proximity to a specified node in the graph.
   Example: Focuses on Jane-specific information, highlighting her wool allergy.
 Node Distance Reranking is particularly useful for entity-specific queries, providing more contextually relevant results. It weights facts by their closeness to the focal node, emphasizing information directly related to the entity of interest.
 This dual approach allows for both broad exploration and targeted, entity-specific information retrieval from the knowledge graph.
 ```python
 query = "Can Jane wear Allbirds Wool Runners?"
 jane_node_uuid = "123e4567-e89b-12d3-a456-426614174000"
 def print_facts(edges):
    print("\n".join([edge.fact for edge in edges]))
 # Hybrid Search
 results = await graphiti.search(query)
 print_facts(results)
 # > The Allbirds Wool Runners are sold by Allbirds.
 # > Men's SuperLight Wool Runners - Dark Grey (Medium Grey Sole) has a runner silhouette.
 # > Jane purchased SuperLight Wool Runners.
 # Hybrid Search with Node Distance Reranking
 results = await graphiti.search(query, jane_node_uuid)
 print_facts(results)
 # > Jane purchased SuperLight Wool Runners.
 # > Jane is allergic to wool.
 # > The Allbirds Wool Runners are sold by Allbirds.
 ```
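For readers unfamiliar with Reciprocal Rank Fusion, the reranking step mentioned for Hybrid Search can be sketched in a few lines. This is a generic illustration of the technique, not graphiti's internal implementation: the function name, the sample fact ids, and the constant `k=60` (a commonly used default for RRF) are all assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one list, best first.

    Each item scores 1 / (k + rank) per list it appears in (rank starts at 1).
    k=60 is an assumed, conventional default, not a value taken from graphiti.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id so the order is deterministic
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical result ids from the two retrievers
semantic_hits = ["fact_a", "fact_b", "fact_c"]
bm25_hits = ["fact_b", "fact_d", "fact_a"]

fused = reciprocal_rank_fusion([semantic_hits, bm25_hits])
print(fused)
```

Items that rank well in both lists rise to the top, which is why a fact found by both semantic similarity and BM25 tends to outrank one found by only a single retriever.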
 ## Status and Roadmap