claude-cookbooks/third_party/Deepgram/prerecorded_audio.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Transcribe an audio file with Deepgram & use Anthropic to prepare interview questions!\n",
    "\n",
    "**Make a copy of this notebook into your own drive, and follow the instructions below!** 🥳🥳🥳\n",
    "\n",
    "----------------------------\n",
    "\n",
    "# Get started:\n",
    "Running the following three cells will allow you to transcribe any audio you wish. The comments below point out the variables you can manipulate to modify your output as you wish.\n",
    "\n",
    "Before running this notebook, you'll need to have a couple audio URLs to transcribe. You can use any audio files you wish.\n",
    "\n",
    "And by the way, if you haven't yet signed up for Deepgram, check out this link here: https://dpgr.am/prerecorded-notebook-signup"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Step 1: Dependencies\n",
    "\n",
    "Run this cell to download all necessary dependencies.\n",
    "\n",
    "Note: You can run a cell by clicking the play button on the left or by clicking on the cell and pressing `shift`+`ENTER` at the same time. (Or `shift` + `return` on Mac)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "! pip install requests ffmpeg-python\n",
    "! pip install deepgram-sdk --upgrade\n",
    "! pip install requests\n",
    "! pip install anthropic"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Step 2: Audio URL files\n",
    "\n",
    "Find some audio files hosted on a server so you can use this notebook. OR An example file is provided by Deepgram is code below. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Have you completed Step 2 above? 👀\n",
    "# Do you see your audio file in the folder on the left? 📂"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Step 3: Transcription\n",
    "\n",
    "Fill in the following variables:\n",
    "\n",
    "\n",
    "* `DG_KEY` = Your personal Deepgram API key\n",
    "* `AUDIO_FILE_URL` = a URL for an audio file you wish to transcribe.\n",
    "\n",
    "\n",
    "Now run the cell! (`Shift` + `Enter`)\n",
    "\n",
    "-----------\n",
    "\n",
    "\n",
    "\n",
    "And by the way, if you're already a Deepgram user, and you're getting an error in this cell the most common fixes are:\n",
    "\n",
    "1. You may need to update your installation of the deepgram-sdk.\n",
    "2. You may need to check how many credits you have left in your Deepgram account."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from deepgram import DeepgramClient, PrerecordedOptions, FileSource\n",
    "import requests\n",
    "\n",
    "# Deepgram API key\n",
    "DG_KEY = \"🔑🔑🔑 Your API Key here! 🔑🔑🔑\"\n",
    "\n",
    "# URL of the audio file\n",
    "AUDIO_FILE_URL = \"https://static.deepgram.com/examples/nasa-spacewalk-interview.wav\"\n",
    "\n",
    "# Path to save the transcript JSON file\n",
    "TRANSCRIPT_FILE = \"transcript.json\"\n",
    "\n",
    "def main():\n",
    "    try:\n",
    "        # STEP 1: Create a Deepgram client using the API key\n",
    "        deepgram = DeepgramClient(DG_KEY)\n",
    "\n",
    "        # Download the audio file from the URL\n",
    "        response = requests.get(AUDIO_FILE_URL)\n",
    "        if response.status_code == 200:\n",
    "            buffer_data = response.content\n",
    "        else:\n",
    "            print(\"Failed to download audio file\")\n",
    "            return\n",
    "\n",
    "        payload: FileSource = {\n",
    "            \"buffer\": buffer_data,\n",
    "        }\n",
    "\n",
    "        # STEP 2: Configure Deepgram options for audio analysis\n",
    "        options = PrerecordedOptions(\n",
    "            model=\"nova-2\",\n",
    "            smart_format=True,\n",
    "        )\n",
    "\n",
    "        # STEP 3: Call the transcribe_file method with the text payload and options\n",
    "        response = deepgram.listen.prerecorded.v(\"1\").transcribe_file(payload, options)\n",
    "\n",
    "        # STEP 4: Write the response JSON to a file\n",
    "        with open(TRANSCRIPT_FILE, \"w\") as transcript_file:\n",
    "            transcript_file.write(response.to_json(indent=4))\n",
    "\n",
    "        print(\"Transcript JSON file generated successfully.\")\n",
    "\n",
    "    except Exception as e:\n",
    "        print(f\"Exception: {e}\")\n",
    "\n",
    "if __name__ == \"__main__\":\n",
    "    main()\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If the cell above succeeds, you should see JSON output file(s) in the content directory. Note: There may be a small delay between when the cell finishes running and when the JSON file actually appears. This is normal. Just wait a few moments for the file(s) to appear."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Step 4: Check out your transcription\n",
    "\n",
    "The function below parses the output JSON and prints out the transcription of one of the files you just transcribed! (Make sure\n",
    "the file you're trying to examine is indeed already loaded into the content directory.)\n",
    "\n",
    "**Set the `OUTPUT` variable to the name of the file you wish to see the transcription of.**\n",
    "\n",
    "Then run this cell (`Shift`+`Enter`) to see a sentence-by-sentence transcription of your audio!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "# Set this variable to the path of the output file you wish to read\n",
    "OUTPUT = 'transcript.json'\n",
    "\n",
    "\n",
    "# The JSON is loaded with information, but if you just want to read the\n",
    "# transcript, run the code below!\n",
    "def print_transcript(transcription_file):\n",
    "  with open(transcription_file, \"r\") as file:\n",
    "        data = json.load(file)\n",
    "        result = data['results']['channels'][0]['alternatives'][0]['transcript']\n",
    "        result = result.split('.')\n",
    "        for sentence in result:\n",
    "          print(sentence + '.')\n",
    "\n",
    "print_transcript(OUTPUT)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "If the cell above succeeds you should see a plain text version of your audio transcription. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Step 5: Prepare Interview Questions using Anthropic\n",
    "\n",
    "Now we can send off our transcript to Anthropic for analysis to help us prepare some interview questions. Run the cell below (`Shift`+`Enter`) to get a suggested set of interview questions provided by Anthropic based on your audio transcript above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import anthropic\n",
    "import json\n",
    "\n",
    "transcription_file = \"transcript.json\"\n",
    "\n",
    "# Function to get the transcript from the JSON file\n",
    "def get_transcript(transcription_file):\n",
    "    with open(transcription_file, \"r\") as file:\n",
    "        data = json.load(file)\n",
    "        result = data['results']['channels'][0]['alternatives'][0]['transcript']\n",
    "        return result\n",
    "\n",
    "# Load the transcript from the JSON file\n",
    "message_text = get_transcript(transcription_file)\n",
    "\n",
    "# Initialize the Claude API client\n",
    "client = anthropic.Anthropic(\n",
    "    # Defaults to os.environ.get(\"CLAUDE_API_KEY\")\n",
    "    # Claude API key\n",
    "    api_key=\"🔑🔑🔑 Your API Key here! 🔑🔑🔑\"\n",
    ")\n",
    "\n",
    "# Prepare the text for the API request\n",
    "formatted_messages = [\n",
    "    {\n",
    "        \"role\": \"user\",\n",
    "        \"content\": message_text\n",
    "    }\n",
    "]\n",
    "\n",
    "# Generate thoughtful, open-ended interview questions\n",
    "response = client.messages.create(\n",
    "    model=\"claude-3-opus-20240229\",\n",
    "    max_tokens=1000,\n",
    "    temperature=0.5,\n",
    "    system=\"Your task is to generate a series of thoughtful, open-ended questions for an interview based on the given context. The questions should be designed to elicit insightful and detailed responses from the interviewee, allowing them to showcase their knowledge, experience, and critical thinking skills. Avoid yes/no questions or those with obvious answers. Instead, focus on questions that encourage reflection, self-assessment, and the sharing of specific examples or anecdotes.\",\n",
    "    messages=formatted_messages\n",
    ")\n",
    "\n",
    "# Print the generated questions\n",
    "\n",
    "# Join the text of each TextBlock into a single string\n",
    "content = ''.join(block.text for block in response.content)\n",
    "\n",
    "# Split the content by '\\n\\n'\n",
    "parts = content.split('\\n\\n')\n",
    "\n",
    "# Print each part with an additional line break\n",
    "for part in parts:\n",
    "    print(part)\n",
    "    print('\\n')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If this cell succeeded you should see a list of interview questions based on your original audio file. Now you can transcribe audio with Deepgram and use Anthropic to get a set of interview questions. "
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "sdk",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}