Merge pull request #73 from casistack/feat--add-Docker-support-and-environment-configuration-updated

Feat: add Docker support and update environment configuration
warmshao
2025-01-10 19:14:23 +08:00
committed by GitHub
9 changed files with 496 additions and 88 deletions


@@ -17,5 +17,17 @@ ANONYMIZED_TELEMETRY=true
# LogLevel: Set to debug to enable verbose logging, set to result to get results only. Available: result | debug | info
BROWSER_USE_LOGGING_LEVEL=info
# Chrome settings
CHROME_PATH=
CHROME_USER_DATA=
CHROME_DEBUGGING_PORT=9222
CHROME_DEBUGGING_HOST=localhost
CHROME_PERSISTENT_SESSION=false # Set to true to keep browser open between AI tasks
# Display settings
RESOLUTION=1920x1080x24 # Format: WIDTHxHEIGHTxDEPTH
RESOLUTION_WIDTH=1920 # Width in pixels
RESOLUTION_HEIGHT=1080 # Height in pixels
# VNC settings
VNC_PASSWORD=youvncpassword
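
The `RESOLUTION` value uses a `WIDTHxHEIGHTxDEPTH` format, while `RESOLUTION_WIDTH` and `RESOLUTION_HEIGHT` carry the first two numbers separately. A minimal sketch of how such a value could be split and validated; `parse_resolution` and `Resolution` are hypothetical names for illustration and are not part of this commit:

```python
from typing import NamedTuple


class Resolution(NamedTuple):
    width: int
    height: int
    depth: int


def parse_resolution(value: str) -> Resolution:
    """Split a WIDTHxHEIGHTxDEPTH string (hypothetical helper, not in this commit)."""
    parts = value.strip().lower().split("x")
    if len(parts) != 3:
        raise ValueError(f"expected WIDTHxHEIGHTxDEPTH, got {value!r}")
    # int() raises ValueError on non-numeric components.
    width, height, depth = (int(part) for part in parts)
    if min(width, height, depth) <= 0:
        raise ValueError(f"resolution components must be positive: {value!r}")
    return Resolution(width, height, depth)


if __name__ == "__main__":
    print(parse_resolution("1920x1080x24"))
```

Checking the combined value this way would make it easier to keep it consistent with the separate width and height variables.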

82
Dockerfile Normal file

@@ -0,0 +1,82 @@
FROM python:3.11-slim
# Install system dependencies
RUN apt-get update && apt-get install -y \
wget \
gnupg \
curl \
unzip \
xvfb \
libgconf-2-4 \
libxss1 \
libnss3 \
libnspr4 \
libasound2 \
libatk1.0-0 \
libatk-bridge2.0-0 \
libcups2 \
libdbus-1-3 \
libdrm2 \
libgbm1 \
libgtk-3-0 \
libxcomposite1 \
libxdamage1 \
libxfixes3 \
libxrandr2 \
xdg-utils \
fonts-liberation \
dbus \
xauth \
xvfb \
x11vnc \
tigervnc-tools \
supervisor \
net-tools \
procps \
git \
python3-numpy \
&& rm -rf /var/lib/apt/lists/*
# Install noVNC
RUN git clone https://github.com/novnc/noVNC.git /opt/novnc \
&& git clone https://github.com/novnc/websockify /opt/novnc/utils/websockify \
&& ln -s /opt/novnc/vnc.html /opt/novnc/index.html
# Install Chrome
RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - \
&& echo "deb http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list \
&& apt-get update \
&& apt-get install -y google-chrome-stable \
&& rm -rf /var/lib/apt/lists/*
# Set up working directory
WORKDIR /app
# Copy requirements and install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Install Playwright and browsers with system dependencies
ENV PLAYWRIGHT_BROWSERS_PATH=/ms-playwright
RUN playwright install --with-deps chromium
RUN playwright install-deps
# Copy the application code
COPY . .
# Set environment variables
ENV PYTHONUNBUFFERED=1
ENV BROWSER_USE_LOGGING_LEVEL=info
ENV CHROME_PATH=/usr/bin/google-chrome
ENV ANONYMIZED_TELEMETRY=false
ENV DISPLAY=:99
ENV RESOLUTION=1920x1080x24
ENV VNC_PASSWORD=vncpassword
# Set up supervisor configuration
RUN mkdir -p /var/log/supervisor
COPY supervisord.conf /etc/supervisor/conf.d/supervisord.conf
EXPOSE 7788 6080 5900
CMD ["/usr/bin/supervisord", "-c", "/etc/supervisor/conf.d/supervisord.conf"]


@@ -17,9 +17,45 @@ We would like to officially thank [WarmShao](https://github.com/warmshao) for hi
**Custom Browser Support:** You can use your own browser with our tool, eliminating the need to re-login to sites or deal with other authentication challenges. This feature also supports high-definition screen recording.
<video src="https://github.com/user-attachments/assets/56bc7080-f2e3-4367-af22-6bf2245ff6cb" controls="controls" >Your browser does not support playing this video!</video>
**Persistent Browser Sessions:** You can choose to keep the browser window open between AI tasks, allowing you to see the complete history and state of AI interactions.
## Installation Guide
<video src="https://github.com/user-attachments/assets/56bc7080-f2e3-4367-af22-6bf2245ff6cb" controls="controls">Your browser does not support playing this video!</video>
## Installation Options
### Option 1: Docker Installation (Recommended)
1. **Prerequisites:**
- Docker and Docker Compose installed on your system
- Git to clone the repository
2. **Setup:**
```bash
# Clone the repository
git clone <repository-url>
cd browser-use-webui
# Copy and configure environment variables
cp .env.example .env
# Edit .env with your preferred text editor and add your API keys
```
3. **Run with Docker:**
```bash
# Build and start the container with default settings (browser closes after AI tasks)
docker compose up --build
# Or run with persistent browser (browser stays open between AI tasks)
CHROME_PERSISTENT_SESSION=true docker compose up --build
```
4. **Access the Application:**
- WebUI: `http://localhost:7788`
- VNC Viewer (to see browser interactions): `http://localhost:6080/vnc.html`
Default VNC password is "vncpassword". You can change it by setting the `VNC_PASSWORD` environment variable in your `.env` file.
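
Before opening the URLs above, it can help to confirm that the container is actually listening. A minimal sketch using only the standard library; `is_port_open` is a hypothetical helper, and the port numbers are the ones listed in step 4:

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Ports taken from the steps above: WebUI on 7788, noVNC on 6080.
    for name, port in (("WebUI", 7788), ("noVNC", 6080)):
        state = "reachable" if is_port_open("localhost", port) else "not reachable"
        print(f"{name} (port {port}): {state}")
```

A successful TCP connection only shows that something is listening; it does not confirm that the WebUI or noVNC has finished starting.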
### Option 2: Local Installation
Read the [quickstart guide](https://docs.browser-use.com/quickstart#prepare-the-environment) or follow the steps below to get started.
@@ -51,6 +87,59 @@ playwright install
## Usage
### Docker Setup
1. **Environment Variables:**
- All configuration is done through the `.env` file
- Available environment variables:
```
# LLM API Keys
OPENAI_API_KEY=your_key_here
ANTHROPIC_API_KEY=your_key_here
GOOGLE_API_KEY=your_key_here
# Browser Settings
CHROME_PERSISTENT_SESSION=true # Set to true to keep browser open between AI tasks
RESOLUTION=1920x1080x24 # Custom resolution format: WIDTHxHEIGHTxDEPTH
RESOLUTION_WIDTH=1920 # Custom width in pixels
RESOLUTION_HEIGHT=1080 # Custom height in pixels
# VNC Settings
VNC_PASSWORD=your_vnc_password # Optional, defaults to "vncpassword"
```
2. **Browser Persistence Modes:**
- **Default Mode (CHROME_PERSISTENT_SESSION=false):**
- Browser opens and closes with each AI task
- Clean state for each interaction
- Lower resource usage
- **Persistent Mode (CHROME_PERSISTENT_SESSION=true):**
- Browser stays open between AI tasks
- Maintains history and state
- Allows viewing previous AI interactions
- Set in `.env` file or via environment variable when starting container
3. **Viewing Browser Interactions:**
- Access the noVNC viewer at `http://localhost:6080/vnc.html`
- Enter the VNC password (default: "vncpassword" or what you set in VNC_PASSWORD)
- You can now see all browser interactions in real-time
4. **Container Management:**
```bash
# Start with persistent browser
CHROME_PERSISTENT_SESSION=true docker compose up -d
# Start with default mode (browser closes after tasks)
docker compose up -d
# View logs
docker compose logs -f
# Stop the container
docker compose down
```
### Local Setup
1. **Run the WebUI:**
```bash
python webui.py --ip 127.0.0.1 --port 7788
@@ -129,4 +218,4 @@ CHROME_USER_DATA="~/Library/Application Support/Google/Chrome/Profile 1"
## Changelog
- [x] **2025/01/06:** Thanks to @richard-devbot, a New and Well-Designed WebUI is released. [Video tutorial demo](https://github.com/warmshao/browser-use-webui/issues/1#issuecomment-2573393113).

51
docker-compose.yml Normal file

@@ -0,0 +1,51 @@
services:
  browser-use-webui:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "7788:7788"  # Gradio default port
      - "6080:6080"  # noVNC web interface
      - "5900:5900"  # VNC port
      - "9222:9222"  # Chrome remote debugging port
    environment:
      - OPENAI_ENDPOINT=${OPENAI_ENDPOINT:-https://api.openai.com/v1}
      - OPENAI_API_KEY=${OPENAI_API_KEY:-}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:-}
      - GOOGLE_API_KEY=${GOOGLE_API_KEY:-}
      - AZURE_OPENAI_ENDPOINT=${AZURE_OPENAI_ENDPOINT:-}
      - AZURE_OPENAI_API_KEY=${AZURE_OPENAI_API_KEY:-}
      - DEEPSEEK_ENDPOINT=${DEEPSEEK_ENDPOINT:-https://api.deepseek.com}
      - DEEPSEEK_API_KEY=${DEEPSEEK_API_KEY:-}
      - BROWSER_USE_LOGGING_LEVEL=${BROWSER_USE_LOGGING_LEVEL:-info}
      - ANONYMIZED_TELEMETRY=false
      - CHROME_PATH=/usr/bin/google-chrome
      - CHROME_USER_DATA=/app/data/chrome_data
      - CHROME_PERSISTENT_SESSION=${CHROME_PERSISTENT_SESSION:-false}
      - DISPLAY=:99
      - PLAYWRIGHT_BROWSERS_PATH=/ms-playwright
      - RESOLUTION=${RESOLUTION:-1920x1080x24}
      - RESOLUTION_WIDTH=${RESOLUTION_WIDTH:-1920}
      - RESOLUTION_HEIGHT=${RESOLUTION_HEIGHT:-1080}
      - VNC_PASSWORD=${VNC_PASSWORD:-vncpassword}
      - PERSISTENT_BROWSER_PORT=9222
      - PERSISTENT_BROWSER_HOST=localhost
      - CHROME_DEBUGGING_PORT=9222
      - CHROME_DEBUGGING_HOST=localhost
    volumes:
      - ./data:/app/data
      - ./data/chrome_data:/app/data/chrome_data
      - /tmp/.X11-unix:/tmp/.X11-unix
    restart: unless-stopped
    shm_size: '2gb'
    cap_add:
      - SYS_ADMIN
    security_opt:
      - seccomp=unconfined
    tmpfs:
      - /tmp
    healthcheck:
      test: ["CMD", "nc", "-z", "localhost", "5900"]
      interval: 10s
      timeout: 5s
      retries: 3
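
The `${VAR:-default}` entries in this file fall back to the default when the variable is unset or empty in the host environment or the `.env` file. A rough Python equivalent of that rule, for illustration only; `compose_default` is a hypothetical name and does not exist in this repository:

```python
from typing import Mapping


def compose_default(env: Mapping[str, str], name: str, default: str) -> str:
    """Mimic ${NAME:-default}: use the default when NAME is unset or empty."""
    value = env.get(name)
    return value if value else default


if __name__ == "__main__":
    # Example host environment (made-up values for the demonstration).
    host_env = {"CHROME_PERSISTENT_SESSION": "true", "VNC_PASSWORD": ""}
    print(compose_default(host_env, "CHROME_PERSISTENT_SESSION", "false"))
    print(compose_default(host_env, "VNC_PASSWORD", "vncpassword"))
    print(compose_default(host_env, "RESOLUTION", "1920x1080x24"))
```

One consequence is that leaving `VNC_PASSWORD=` blank in `.env` still results in the default password being used.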

30
src/browser/config.py Normal file

@@ -0,0 +1,30 @@
# -*- coding: utf-8 -*-
# @Time : 2025/1/6
# @Author : wenshao
# @ProjectName: browser-use-webui
# @FileName: config.py
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class BrowserPersistenceConfig:
    """Configuration for browser persistence"""

    persistent_session: bool = False
    user_data_dir: Optional[str] = None
    debugging_port: Optional[int] = None
    debugging_host: Optional[str] = None

    @classmethod
    def from_env(cls) -> "BrowserPersistenceConfig":
        """Create config from environment variables"""
        return cls(
            persistent_session=os.getenv("CHROME_PERSISTENT_SESSION", "").lower()
            == "true",
            user_data_dir=os.getenv("CHROME_USER_DATA"),
            debugging_port=int(os.getenv("CHROME_DEBUGGING_PORT", "9222")),
            debugging_host=os.getenv("CHROME_DEBUGGING_HOST", "localhost"),
        )
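
`from_env` enables persistence only when `CHROME_PERSISTENT_SESSION` equals `true`, ignoring case; values such as `1` or `yes` leave it disabled. A small sketch that mirrors that comparison outside the class, so the accepted values are easy to see (`is_persistent` is a hypothetical helper, not part of this commit):

```python
from typing import Mapping


def is_persistent(env: Mapping[str, str]) -> bool:
    """Mirror the check in from_env: case-insensitive match on the string 'true'."""
    return env.get("CHROME_PERSISTENT_SESSION", "").lower() == "true"


if __name__ == "__main__":
    for raw in ("true", "TRUE", "1", "yes", ""):
        print(repr(raw), is_persistent({"CHROME_PERSISTENT_SESSION": raw}))
```

Because the value is not stripped, surrounding whitespace (for example `" true "`) would also count as disabled.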


@@ -6,15 +6,45 @@
from browser_use.browser.browser import Browser
from browser_use.browser.context import BrowserContext, BrowserContextConfig
from playwright.async_api import BrowserContext as PlaywrightBrowserContext
import logging
from .config import BrowserPersistenceConfig
from .custom_context import CustomBrowserContext
logger = logging.getLogger(__name__)
class CustomBrowser(Browser):
_global_context = None
async def new_context(
self,
config: BrowserContextConfig = BrowserContextConfig(),
context: CustomBrowserContext = None,
) -> BrowserContext:
"""Create a browser context"""
context: PlaywrightBrowserContext = None,
) -> CustomBrowserContext:
"""Create a browser context with persistence support"""
persistence_config = BrowserPersistenceConfig.from_env()
if persistence_config.persistent_session:
if CustomBrowser._global_context is not None:
logger.info("Reusing existing persistent browser context")
return CustomBrowser._global_context
context_instance = CustomBrowserContext(config=config, browser=self, context=context)
CustomBrowser._global_context = context_instance
logger.info("Created new persistent browser context")
return context_instance
logger.info("Creating non-persistent browser context")
return CustomBrowserContext(config=config, browser=self, context=context)
async def close(self):
"""Override close to respect persistence setting"""
persistence_config = BrowserPersistenceConfig.from_env()
if not persistence_config.persistent_session:
if CustomBrowser._global_context is not None:
await CustomBrowser._global_context.close()
CustomBrowser._global_context = None
await super().close()
else:
logger.info("Skipping browser close due to persistent session")
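
The persistence logic in this hunk caches one context at class level and hands the same object back on later calls. A stripped-down sketch of that pattern, using stand-in classes rather than the real `browser_use` or Playwright types (`FakeContext` and `ContextFactory` are hypothetical names):

```python
import asyncio
from typing import Optional


class FakeContext:
    """Stand-in for a browser context (hypothetical, for illustration only)."""


class ContextFactory:
    """Caches one context at class level when persistence is enabled."""

    _shared: Optional[FakeContext] = None

    def __init__(self, persistent: bool) -> None:
        self.persistent = persistent

    async def new_context(self) -> FakeContext:
        if not self.persistent:
            return FakeContext()  # fresh context for every task
        if ContextFactory._shared is None:
            ContextFactory._shared = FakeContext()
        return ContextFactory._shared  # same object on every later call


async def demo() -> tuple[bool, bool]:
    persistent = ContextFactory(persistent=True)
    fresh = ContextFactory(persistent=False)
    same = (await persistent.new_context()) is (await persistent.new_context())
    different = (await fresh.new_context()) is not (await fresh.new_context())
    return same, different


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Because the cache lives on the class, every instance shares it; that matches the `_global_context` attribute in the diff, but it also means state leaks between instances unless the cache is reset explicitly.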


@@ -12,7 +12,9 @@ import os
from browser_use.browser.browser import Browser
from browser_use.browser.context import BrowserContext, BrowserContextConfig
from playwright.async_api import Browser as PlaywrightBrowser
from playwright.async_api import BrowserContext as PlaywrightBrowserContext
from .config import BrowserPersistenceConfig
logger = logging.getLogger(__name__)
@@ -21,18 +23,21 @@ class CustomBrowserContext(BrowserContext):
self,
browser: "Browser",
config: BrowserContextConfig = BrowserContextConfig(),
context: BrowserContext = None,
context: PlaywrightBrowserContext = None,
):
super(CustomBrowserContext, self).__init__(browser=browser, config=config)
self.context = context
self._persistence_config = BrowserPersistenceConfig.from_env()
async def _create_context(self, browser: PlaywrightBrowser):
async def _create_context(self, browser: PlaywrightBrowser) -> PlaywrightBrowserContext:
"""Creates a new browser context with anti-detection measures and loads cookies if available."""
# If we have a context, return it directly
if self.context:
return self.context
if self.browser.config.chrome_instance_path and len(browser.contexts) > 0:
# Connect to existing Chrome instance instead of creating new one
# Check if we should use existing context for persistence
if self._persistence_config.persistent_session and len(browser.contexts) > 0:
logger.info("Using existing persistent context")
context = browser.contexts[0]
else:
# Original code for creating new context
@@ -47,7 +52,7 @@ class CustomBrowserContext(BrowserContext):
bypass_csp=self.config.disable_security,
ignore_https_errors=self.config.disable_security,
record_video_dir=self.config.save_recording_path,
record_video_size=self.config.browser_window_size, # set record video size, same as windows size
record_video_size=self.config.browser_window_size,
)
if self.config.trace_path:
@@ -94,3 +99,8 @@ class CustomBrowserContext(BrowserContext):
)
return context
async def close(self):
"""Override close to respect persistence setting"""
if not self._persistence_config.persistent_session:
await super().close()

83
supervisord.conf Normal file

@@ -0,0 +1,83 @@
[supervisord]
nodaemon=true
logfile=/dev/stdout
logfile_maxbytes=0
loglevel=debug
[program:xvfb]
command=Xvfb :99 -screen 0 %(ENV_RESOLUTION)s -ac +extension GLX +render -noreset
autorestart=true
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
priority=100
startsecs=3
[program:vnc_setup]
command=bash -c "mkdir -p ~/.vnc && echo '%(ENV_VNC_PASSWORD)s' | vncpasswd -f > ~/.vnc/passwd && chmod 600 ~/.vnc/passwd && ls -la ~/.vnc/passwd"
autorestart=false
startsecs=0
priority=150
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
[program:x11vnc]
command=bash -c "sleep 3 && DISPLAY=:99 x11vnc -display :99 -forever -shared -rfbauth /root/.vnc/passwd -rfbport 5900 -bg -o /var/log/x11vnc.log"
autorestart=true
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
priority=200
startretries=5
startsecs=5
depends_on=vnc_setup
[program:x11vnc_log]
command=tail -f /var/log/x11vnc.log
autorestart=true
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
priority=250
[program:novnc]
command=bash -c "sleep 5 && cd /opt/novnc && ./utils/novnc_proxy --vnc localhost:5900 --listen 0.0.0.0:6080 --web /opt/novnc"
autorestart=true
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
priority=300
startretries=5
startsecs=3
depends_on=x11vnc
[program:persistent_browser]
command=bash -c 'if [ "%(ENV_CHROME_PERSISTENT_SESSION)s" = "true" ]; then mkdir -p /app/data/chrome_data && sleep 8 && google-chrome --user-data-dir=/app/data/chrome_data --window-position=0,0 --window-size=%(ENV_RESOLUTION_WIDTH)s,%(ENV_RESOLUTION_HEIGHT)s --start-maximized --no-sandbox --disable-dev-shm-usage --disable-gpu --disable-software-rasterizer --disable-setuid-sandbox --no-first-run --no-default-browser-check --no-experiments --ignore-certificate-errors --remote-debugging-port=9222 --remote-debugging-address=0.0.0.0 "data:text/html,<html><body style=\"background: #f0f0f0; margin: 0; display: flex; justify-content: center; align-items: center; height: 100vh; font-family: Arial;\"><h1>Browser Ready for AI Interaction</h1></body></html>"; else echo "Persistent browser disabled"; fi'
autorestart=%(ENV_CHROME_PERSISTENT_SESSION)s
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
priority=350
startretries=3
startsecs=3
depends_on=novnc
[program:webui]
command=python webui.py --ip 0.0.0.0 --port 7788
directory=/app
autorestart=true
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
priority=400
startretries=3
startsecs=3
depends_on=persistent_browser
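
supervisord starts programs in ascending `priority` order, so the values in this file determine the startup sequence (the `sleep` calls inside some commands add further delay). A short sketch that sorts the priorities copied from the sections above to show the resulting order; `start_order` is a hypothetical helper:

```python
# Priorities copied from the [program:*] sections above.
PRIORITIES = {
    "webui": 400,
    "novnc": 300,
    "xvfb": 100,
    "persistent_browser": 350,
    "x11vnc": 200,
    "vnc_setup": 150,
    "x11vnc_log": 250,
}


def start_order(priorities: dict[str, int]) -> list[str]:
    """Return program names sorted by ascending priority (lowest starts first)."""
    return sorted(priorities, key=priorities.__getitem__)


if __name__ == "__main__":
    print(" -> ".join(start_order(PRIORITIES)))
```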

173
webui.py

@@ -35,6 +35,16 @@ from src.browser.custom_context import BrowserContextConfig
from src.controller.custom_controller import CustomController
from src.utils import utils
from src.utils.utils import update_model_dropdown
from src.browser.config import BrowserPersistenceConfig
from src.browser.custom_browser import CustomBrowser
from src.browser.custom_context import CustomBrowserContext
from browser_use.browser.browser import BrowserConfig
from browser_use.browser.context import BrowserContextConfig, BrowserContextWindowSize
# Global variables for persistence
_global_browser = None
_global_browser_context = None
_global_playwright = None
async def run_browser_agent(
agent_type,
@@ -196,96 +206,107 @@ async def run_custom_agent(
max_actions_per_step,
tool_call_in_content
):
global _global_browser, _global_browser_context, _global_playwright
controller = CustomController()
playwright = None
browser_context_ = None
persistence_config = BrowserPersistenceConfig.from_env()
try:
# Initialize global browser if needed
if _global_browser is None:
_global_browser = CustomBrowser(
config=BrowserConfig(
headless=headless,
disable_security=disable_security,
extra_chromium_args=[f"--window-size={window_w},{window_h}"],
)
)
# Handle browser context based on configuration
if use_own_browser:
playwright = await async_playwright().start()
chrome_exe = os.getenv("CHROME_PATH", "")
chrome_use_data = os.getenv("CHROME_USER_DATA", "")
if _global_browser_context is None:
_global_playwright = await async_playwright().start()
chrome_exe = os.getenv("CHROME_PATH", "")
chrome_use_data = os.getenv("CHROME_USER_DATA", "")
if chrome_exe == "":
chrome_exe = None
elif not os.path.exists(chrome_exe):
raise ValueError(f"Chrome executable not found at {chrome_exe}")
if chrome_use_data == "":
chrome_use_data = None
browser_context_ = await playwright.chromium.launch_persistent_context(
user_data_dir=chrome_use_data,
executable_path=chrome_exe,
no_viewport=False,
headless=headless, # keep the browser window visible
user_agent=(
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
"(KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36"
),
java_script_enabled=True,
bypass_csp=disable_security,
ignore_https_errors=disable_security,
record_video_dir=save_recording_path if save_recording_path else None,
record_video_size={"width": window_w, "height": window_h},
)
else:
browser_context_ = None
browser = CustomBrowser(
config=BrowserConfig(
headless=headless,
disable_security=disable_security,
extra_chromium_args=[f"--window-size={window_w},{window_h}"],
)
)
async with await browser.new_context(
config=BrowserContextConfig(
trace_path=save_trace_path if save_trace_path else None,
save_recording_path=save_recording_path
if save_recording_path
else None,
browser_context = await _global_playwright.chromium.launch_persistent_context(
user_data_dir=chrome_use_data,
executable_path=chrome_exe,
no_viewport=False,
browser_window_size=BrowserContextWindowSize(
width=window_w, height=window_h
headless=headless,
user_agent=(
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
"(KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36"
),
),
context=browser_context_,
) as browser_context:
agent = CustomAgent(
task=task,
add_infos=add_infos,
use_vision=use_vision,
llm=llm,
browser_context=browser_context,
controller=controller,
system_prompt_class=CustomSystemPrompt,
max_actions_per_step=max_actions_per_step,
tool_call_in_content=tool_call_in_content
)
history = await agent.run(max_steps=max_steps)
java_script_enabled=True,
bypass_csp=disable_security,
ignore_https_errors=disable_security,
record_video_dir=save_recording_path if save_recording_path else None,
record_video_size={"width": window_w, "height": window_h},
)
_global_browser_context = await _global_browser.new_context(
config=BrowserContextConfig(
trace_path=save_trace_path if save_trace_path else None,
save_recording_path=save_recording_path if save_recording_path else None,
no_viewport=False,
browser_window_size=BrowserContextWindowSize(
width=window_w, height=window_h
),
),
context=browser_context,
)
else:
if _global_browser_context is None:
_global_browser_context = await _global_browser.new_context(
config=BrowserContextConfig(
trace_path=save_trace_path if save_trace_path else None,
save_recording_path=save_recording_path if save_recording_path else None,
no_viewport=False,
browser_window_size=BrowserContextWindowSize(
width=window_w, height=window_h
),
),
)
final_result = history.final_result()
errors = history.errors()
model_actions = history.model_actions()
model_thoughts = history.model_thoughts()
# Create and run agent
agent = CustomAgent(
task=task,
add_infos=add_infos,
use_vision=use_vision,
llm=llm,
browser_context=_global_browser_context,
controller=controller,
system_prompt_class=CustomSystemPrompt,
max_actions_per_step=max_actions_per_step,
tool_call_in_content=tool_call_in_content
)
history = await agent.run(max_steps=max_steps)
final_result = history.final_result()
errors = history.errors()
model_actions = history.model_actions()
model_thoughts = history.model_thoughts()
except Exception as e:
import traceback
traceback.print_exc()
final_result = ""
errors = str(e) + "\n" + traceback.format_exc()
model_actions = ""
model_thoughts = ""
finally:
# Explicitly close the persistent context
if browser_context_:
await browser_context_.close()
# Handle cleanup based on persistence configuration
if not persistence_config.persistent_session:
if _global_browser_context:
await _global_browser_context.close()
_global_browser_context = None
if _global_playwright:
await _global_playwright.stop()
_global_playwright = None
if _global_browser:
await _global_browser.close()
_global_browser = None
# Close the Playwright object
if playwright:
await playwright.stop()
await browser.close()
return final_result, errors, model_actions, model_thoughts
# Define the theme map globally