mirror of https://github.com/videosdk-community/ai-telephony-demo.git synced 2025-08-02 04:19:31 +03:00

sosumit001 50dcb7c9b4 telephony-agent:support[twilio]

2025-06-18 12:04:43 +05:30

AI Telephony Agent

Make INBOUND and OUTBOUND calls with AI agents using VideoSDK. Supports multiple SIP providers and AI agents with a clean, extensible architecture for VoIP telephony solutions.

Installation

Prerequisites

Python 3.11+
VideoSDK account
Twilio account (SIP trunking provider)
Google API key (for Gemini AI)

Setup

Clone the repository

git clone https://github.com/videosdk-community/ai-telephony-demo.git
cd ai-telephony-demo

Install dependencies

pip install -r requirements.txt

Configure environment variables

Create a .env file:

# VideoSDK Configuration
VIDEOSDK_AUTH_TOKEN=your_videosdk_token
VIDEOSDK_SIP_USERNAME=your_sip_username
VIDEOSDK_SIP_PASSWORD=your_sip_password

# AI Configuration
GOOGLE_API_KEY=your_google_api_key

# Twilio SIP Trunking Configuration
TWILIO_SID=your_twilio_sid
TWILIO_AUTH_TOKEN=your_twilio_auth_token
TWILIO_NUMBER=your_twilio_number

Run the server

python server.py

The server will start on http://localhost:8000

API Endpoints

Handle Inbound Calls (SIP User Agent Server)

POST /inbound-call

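Handles incoming calls from your SIP provider. The endpoint expects Twilio webhook parameters, so Twilio must be able to reach the server: either host it publicly or expose it through a tunnel such as ngrok. Point the provider's webhook at the URL below; the request carries the parameters listed after it: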

POST <server-url>/inbound-call

CallSid: Unique call identifier
From: Caller's phone number (CLI - Calling Line Identification)
To: Recipient's phone number (DID - Direct Inward Dialing)
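
A minimal sketch of how these parameters might be read, assuming Twilio's default webhook encoding (`application/x-www-form-urlencoded`). The helper names `parse_inbound_call` and `build_dial_twiml` and the sample SIP URI are hypothetical and are not taken from this repository:

```python
from urllib.parse import parse_qs
from xml.sax.saxutils import escape


def parse_inbound_call(body: str) -> dict:
    """Extract CallSid, From and To from a form-encoded webhook body."""
    fields = parse_qs(body, keep_blank_values=True)
    # parse_qs maps each key to a list of values; take the first one,
    # falling back to an empty string when the key is absent.
    return {key: fields.get(key, [""])[0] for key in ("CallSid", "From", "To")}


def build_dial_twiml(sip_endpoint: str) -> str:
    """Return TwiML that bridges the call to a SIP endpoint (XML-escaped)."""
    return f"<Response><Dial><Sip>{escape(sip_endpoint)}</Sip></Dial></Response>"


if __name__ == "__main__":
    # "+" in phone numbers arrives percent-encoded as %2B.
    body = "CallSid=CA123&From=%2B15550001111&To=%2B15550002222"
    print(parse_inbound_call(body))
    print(build_dial_twiml("sip:room-id@sip.example.com"))
```

Escaping the SIP URI matters because SIP URIs can carry `&`-separated header parameters, which would otherwise produce invalid XML.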

Initiate Outbound Calls (SIP User Agent Client)

POST /outbound-call
Content-Type: application/json

{
  "to_number": "+1234567890",
  "initial_greeting": "Hello from AI Agent!"
}
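
The same request can be built from Python with only the standard library. This is a sketch: `build_outbound_request` is a hypothetical helper, and because the response format is not documented here, the example builds the request without interpreting any reply.

```python
import json
from urllib.request import Request


def build_outbound_request(base_url: str, to_number: str, initial_greeting: str) -> Request:
    """Build a POST /outbound-call request with the JSON body shown above."""
    payload = json.dumps(
        {"to_number": to_number, "initial_greeting": initial_greeting}
    ).encode("utf-8")
    return Request(
        url=f"{base_url.rstrip('/')}/outbound-call",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_outbound_request(
        "http://localhost:8000", "+1234567890", "Hello from AI Agent!"
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

To send it against a running server, pass the returned object to `urllib.request.urlopen`.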

Configure SIP Provider

POST /configure-provider?provider_name=twilio

Switch SIP providers at runtime (currently supports: twilio).

Adding New SIP Providers

The modular architecture makes it easy to add new SIP providers and SIP trunking services. Here's how to add a new provider:

1. Create Provider Implementation

Create providers/your_provider.py:

from typing import Dict, Any
from .base import SIPProvider
from config import Config

class YourProvider(SIPProvider):
    def __init__(self):
        self.client = self.create_client()

    def create_client(self) -> Any:
        # YourProviderClient is a placeholder for your provider's SDK client class.
        return YourProviderClient(Config.YOUR_API_KEY)

    def generate_twiml(self, sip_endpoint: str, **kwargs) -> str:
        return f"<Response><Dial><Sip>{sip_endpoint}</Sip></Dial></Response>"

    def initiate_outbound_call(self, to_number: str, twiml: str) -> Dict[str, Any]:
        call = self.client.calls.create(
            to=to_number,
            from_=Config.YOUR_NUMBER,
            twiml=twiml
        )
        return {
            "call_sid": call.id,
            "status": call.status,
            "provider": "your_provider"
        }

    def get_provider_name(self) -> str:
        return "your_provider"
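
The contents of `providers/base.py` are not shown here, so the following is only an assumed shape of the `SIPProvider` interface, inferred from the four methods overridden above. `FakeProvider` is a hypothetical in-memory implementation that can be handy for exercising call flows without contacting a real provider:

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List
from xml.sax.saxutils import escape


class SIPProvider(ABC):
    """Assumed interface; the real base class may differ."""

    @abstractmethod
    def create_client(self) -> Any: ...

    @abstractmethod
    def generate_twiml(self, sip_endpoint: str, **kwargs) -> str: ...

    @abstractmethod
    def initiate_outbound_call(self, to_number: str, twiml: str) -> Dict[str, Any]: ...

    @abstractmethod
    def get_provider_name(self) -> str: ...


class FakeProvider(SIPProvider):
    """Hypothetical stand-in that records calls instead of placing them."""

    def __init__(self) -> None:
        self.calls: List[Dict[str, str]] = []

    def create_client(self) -> Any:
        return None  # no external SDK is needed for the fake

    def generate_twiml(self, sip_endpoint: str, **kwargs) -> str:
        return f"<Response><Dial><Sip>{escape(sip_endpoint)}</Sip></Dial></Response>"

    def initiate_outbound_call(self, to_number: str, twiml: str) -> Dict[str, Any]:
        self.calls.append({"to": to_number, "twiml": twiml})
        return {
            "call_sid": f"fake-{len(self.calls)}",
            "status": "queued",
            "provider": self.get_provider_name(),
        }

    def get_provider_name(self) -> str:
        return "fake"
```

With an abstract base like this, a provider that forgets to implement one of the methods fails at instantiation time rather than in the middle of a call.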

2. Update Provider Factory

Add to providers/__init__.py:

from .your_provider import YourProvider

def get_provider(provider_name: str = "twilio") -> SIPProvider:
    providers = {
        "twilio": TwilioProvider,
        "your_provider": YourProvider,
    }
    # ... rest of function
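
The elided remainder of the function is not reproduced in this README. One plausible way to finish it, shown here as a self-contained sketch with stub classes standing in for the real providers, is to look up the class and raise a clear error for unknown names:

```python
class SIPProvider:
    """Stub base class so the sketch runs on its own."""

    def get_provider_name(self) -> str:
        raise NotImplementedError


class TwilioProvider(SIPProvider):
    def get_provider_name(self) -> str:
        return "twilio"


class YourProvider(SIPProvider):
    def get_provider_name(self) -> str:
        return "your_provider"


def get_provider(provider_name: str = "twilio") -> SIPProvider:
    providers = {
        "twilio": TwilioProvider,
        "your_provider": YourProvider,
    }
    # Assumed behaviour: case-insensitive lookup, explicit error on a miss.
    try:
        provider_class = providers[provider_name.lower()]
    except KeyError:
        supported = ", ".join(sorted(providers))
        raise ValueError(
            f"Unsupported provider '{provider_name}'. Supported: {supported}"
        ) from None
    return provider_class()
```

Listing the supported names in the error message makes a mistyped `provider_name` in `/configure-provider` easier to diagnose.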

3. Add Configuration

Update config.py:

import os

class Config:
    YOUR_API_KEY = os.getenv("YOUR_API_KEY")
    YOUR_NUMBER = os.getenv("YOUR_NUMBER")

    @classmethod
    def validate(cls) -> None:
        required_vars = {
            # ... existing vars
            "YOUR_API_KEY": cls.YOUR_API_KEY,
            "YOUR_NUMBER": cls.YOUR_NUMBER,
        }
        # ... rest of validation
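
The validation body is elided above. A minimal sketch of what it could look like, assuming the intent is to fail fast when any required variable is unset or empty (the error type and message are illustrative choices, not the project's):

```python
import os


class Config:
    # Values are read once, when the class body executes, so any .env
    # loading has to happen before this module is imported.
    YOUR_API_KEY = os.getenv("YOUR_API_KEY")
    YOUR_NUMBER = os.getenv("YOUR_NUMBER")

    @classmethod
    def validate(cls) -> None:
        required_vars = {
            "YOUR_API_KEY": cls.YOUR_API_KEY,
            "YOUR_NUMBER": cls.YOUR_NUMBER,
        }
        # Treat both unset (None) and empty strings as missing.
        missing = [name for name, value in required_vars.items() if not value]
        if missing:
            raise ValueError(
                "Missing required environment variables: " + ", ".join(missing)
            )
```

Reporting every missing variable in one error avoids the fix-one-rerun-find-another loop during first-time setup.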

Adding New AI Agents

Similarly, you can add new AI agents for intelligent call handling:

1. Create AI Agent Implementation

Create ai/your_ai_agent.py:

from typing import Dict, Any
from videosdk.agents import AgentSession, RealTimePipeline
from .base_agent import AIAgent
from voice_agent import VoiceAgent
from config import Config

class YourAIAgent(AIAgent):
    def create_pipeline(self) -> RealTimePipeline:
        # YourAIModel is a placeholder for the real-time model class you integrate.
        model = YourAIModel(
            api_key=Config.YOUR_AI_API_KEY,
            model="your-model-name"
        )
        return RealTimePipeline(model=model)

    def create_session(self, room_id: str, context: Dict[str, Any]) -> AgentSession:
        pipeline = self.create_pipeline()
        agent_context = {
            "name": "Your AI Agent",
            "meetingId": room_id,
            "videosdk_auth": Config.VIDEOSDK_AUTH_TOKEN,
            **context
        }

        session = AgentSession(
            agent=VoiceAgent(context=agent_context),
            pipeline=pipeline,
            context=agent_context
        )
        return session

    def get_agent_name(self) -> str:
        return "your_ai_agent"
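
One detail of `create_session` worth noting is the order of keys in `agent_context`: because `**context` is unpacked last, caller-supplied keys override the defaults, including `meetingId`. The standalone sketch below isolates that merge; `build_agent_context` is a hypothetical helper, not a function from this repository:

```python
from typing import Any, Dict


def build_agent_context(
    room_id: str, context: Dict[str, Any], auth_token: str
) -> Dict[str, Any]:
    """Merge defaults with caller context; later keys win in a dict literal."""
    return {
        "name": "Your AI Agent",
        "meetingId": room_id,
        "videosdk_auth": auth_token,
        **context,  # unpacked last, so it overrides the defaults above
    }


if __name__ == "__main__":
    merged = build_agent_context(
        "room-1", {"name": "Support Bot", "caller": "+15550001111"}, "token"
    )
    print(merged["name"], merged["meetingId"], merged["caller"])
```

If the defaults should be protected instead, unpacking `**context` first and listing the fixed keys after it reverses the precedence.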

2. Update AI Agent Factory

Add to ai/__init__.py:

from .your_ai_agent import YourAIAgent

def get_ai_agent(agent_name: str = "gemini") -> AIAgent:
    agents = {
        "gemini": GeminiAgent,
        "your_ai_agent": YourAIAgent,
    }
    # ... rest of function

Testing

Health Check

curl "http://localhost:8000/health"

Outbound Call Test (SIP UAC)

curl -X POST "http://localhost:8000/outbound-call" \
  -H "Content-Type: application/json" \
  -d '{"to_number": "+1234567890", "initial_greeting": "Hello from AI Agent!"}'

Switch SIP Provider

curl -X POST "http://localhost:8000/configure-provider?provider_name=twilio"

🔧 Configuration

Environment Variables

| Variable | Description | Required |
| --- | --- | --- |
| `VIDEOSDK_AUTH_TOKEN` | VideoSDK authentication token | ✅ |
| `VIDEOSDK_SIP_USERNAME` | VideoSDK SIP username | ✅ |
| `VIDEOSDK_SIP_PASSWORD` | VideoSDK SIP password | ✅ |
| `GOOGLE_API_KEY` | Google API key for Gemini | ✅ |
| `TWILIO_SID` | Twilio account SID | ✅ |
| `TWILIO_AUTH_TOKEN` | Twilio auth token | ✅ |
| `TWILIO_NUMBER` | Twilio phone number | ✅ |

Provider-Specific Variables

For additional SIP providers, add their specific environment variables to config.py.

Features

SIP/VoIP Integration: Pluggable SIP providers (Twilio, and more) with session initiation protocol support
AI-Powered Voice Agents: Pluggable AI agents (Gemini, and more) for intelligent call handling
Real-time Voice Communication: AI agents with real-time transport protocol (RTP) capabilities
Modular Architecture: Clean separation of concerns for scalable telephony solutions
Runtime Configuration: Switch SIP providers and AI agents without restart
VideoSDK Integration: Seamless room creation and session management
Call Control: Advanced call routing, forwarding, and transfer capabilities
Codec Support: Multiple audio codecs for optimal voice quality

Use Cases

Customer Service (SIP-based)

AI agents handle customer inquiries via VoIP
24/7 availability with SIP trunking
Consistent service quality across PSTN and IP networks

Appointment Scheduling

Automated appointment booking via SIP calls
Reminder calls using SIP user agent client
Rescheduling assistance with DTMF support

Surveys and Feedback

Automated survey calls over SIP
Customer feedback collection via VoIP
Data collection with real-time transport protocol

Emergency Notifications

Automated emergency alerts via SIP trunking
Mass notification systems using PSTN integration
Status updates through IP multimedia subsystem (IMS)

Architecture Benefits

Separation of Concerns: Each component has a single responsibility
Extensibility: Easy to add new SIP providers and AI agents
Testability: Components can be tested in isolation
Maintainability: Clear structure makes code easier to understand
Reusability: Components can be reused across different projects
Configuration Management: Centralized configuration with validation
SIP Compliance: Full session initiation protocol support
VoIP Integration: Seamless integration with voice over internet protocol

Roadmap

Add support for multiple AI agents per session
Implement SIP-specific features (SBC, registrar, proxy server)
Add monitoring and metrics for SIP sessions
Create provider-specific webhook handlers
Add support for different voice codecs per AI agent
Implement call recording and transcription
Add sentiment analysis for call quality
Create web dashboard for call management
Support for H.323 protocol integration
Advanced call control features (forwarding, transfer, queue)

🤝 Contributing

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

Guidelines

Follow the existing code patterns
Add proper error handling
Include logging
Update documentation
Add tests if possible

License

This project is licensed under the MIT License - see the LICENSE file for details.

Made with ❤️ for the developer community
