How to Set Up Browser-Use Web UI: Complete Beginner's Guide 2025
Published on Jun 12, 2025
Category: Automation
Want to run AI agents in your browser to automate web tasks but don't know where to start? This comprehensive guide walks you through every single step to set up Browser-Use Web UI and unleash the power of AI-driven browser automation on your computer.
Why Use Browser-Use Web UI?
Before diving into the setup, let's understand why Browser-Use Web UI is a game-changer for web automation:
- AI-Powered Automation: Let AI agents handle repetitive web tasks automatically
- Multi-LLM Support: Works with OpenAI, Anthropic, Google, DeepSeek, Ollama, and more
- Real Browser Integration: Use your actual browser with existing logins and sessions
- Visual Monitoring: Watch AI agents work in real-time through VNC viewer
- Persistent Sessions: Keep browser state between tasks for complex workflows
- User-Friendly Interface: Built on Gradio for easy interaction
What You'll Need
For Local Installation:
- A Windows, macOS, or Linux computer
- Python 3.11 or higher installed
- Git for cloning the repository
- API keys for your preferred LLM service
For Docker Installation:
- Docker and Docker Compose installed
- About 15-20 minutes of setup time
- API keys for your preferred LLM service
Part 1: Prerequisites Setup
Step 1: Install Python 3.11+ (Local Installation Only)
For Windows:
- Visit: https://www.python.org/downloads/
- Download Python 3.11 or higher (latest recommended)
- Run the installer with these important settings:
- ✅ Check "Add Python to PATH"
- ✅ Check "Install pip"
- Choose "Install for all users" if available
- Verify installation: Open Command Prompt, type
python --version
For macOS:
- Install via Homebrew (recommended):
brew install python@3.11
- Or download from: https://www.python.org/downloads/
- Verify installation: Open Terminal, type
python3 --version
For Linux (Ubuntu/Debian):
sudo apt update
sudo apt install python3.11 python3.11-venv python3-pip
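The version commands above differ by platform, so the same check can also be sketched in Python itself. This is a minimal illustration; the helper name `meets_minimum` is hypothetical and not part of Browser-Use.

```python
import sys

MINIMUM = (3, 11)  # minimum version stated in this guide


def meets_minimum(version=None, minimum=MINIMUM):
    """Return True if a (major, minor, ...) version satisfies the minimum."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if meets_minimum() else "too old for this guide"
    print(f"Python {current}: {status}")
```

Run it with the same interpreter you plan to use for the virtual environment, since several Python versions can coexist on one machine.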
Step 2: Install Git
For Windows:
- Download from: https://git-scm.com/download/win
- Run installer with default settings
- Verify: Open Command Prompt, type
git --version
For macOS:
- Install via Homebrew:
brew install git
- Or use Xcode Command Line Tools:
xcode-select --install
For Linux:
sudo apt install git # Ubuntu/Debian
sudo yum install git # CentOS/RHEL
Step 3: Install Docker (Docker Installation Only)
For Windows:
- Download Docker Desktop: https://www.docker.com/products/docker-desktop/
- Install and start Docker Desktop
- Verify: Open Command Prompt, type
docker --version
For macOS:
- Download Docker Desktop for Mac
- Install and start Docker Desktop
- Verify: Open Terminal, type
docker --version
For Linux:
# Ubuntu/Debian
sudo apt update
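sudo apt install docker.io docker-compose # if "docker compose" is unavailable afterwards, use the hyphenated docker-compose command this package provides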
sudo systemctl start docker
sudo systemctl enable docker
Part 2: Local Installation Method
Step 4: Clone the Repository
Open your terminal/command prompt and run:
git clone https://github.com/browser-use/web-ui.git
cd web-ui
Step 5: Set Up Python Virtual Environment
Using UV (Recommended Method):
First, install UV package manager:
pip install uv
Create virtual environment:
uv venv --python 3.11
Activate the virtual environment:
Windows (Command Prompt):
.venv\Scripts\activate
Windows (PowerShell):
.\.venv\Scripts\Activate.ps1
macOS/Linux:
source .venv/bin/activate
Alternative: Using Standard Python venv:
python -m venv .venv
# Activate (Windows)
.venv\Scripts\activate
# Activate (macOS/Linux)
source .venv/bin/activate
Step 6: Install Python Dependencies
With your virtual environment activated:
# If using UV (recommended)
uv pip install -r requirements.txt
# If using standard pip
pip install -r requirements.txt
This process takes 3-5 minutes depending on your internet connection.
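Once the install finishes, a quick way to confirm it worked is to ask Python whether the key packages can be found. The sketch below is an assumption-laden example: `gradio` and `playwright` are named in this guide, while `browser_use` is a guess at the core package's import name, and `missing_modules` is a hypothetical helper.

```python
import importlib.util

# "gradio" and "playwright" are mentioned in this guide;
# "browser_use" is an assumed import name for the core library.
EXPECTED_MODULES = ("gradio", "playwright", "browser_use")


def missing_modules(names=EXPECTED_MODULES):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All expected modules are importable.")
```

If anything is reported missing, the usual cause is that the virtual environment was not active when the requirements were installed.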
Step 7: Install Playwright Browsers
Browser-Use requires Playwright browsers for automation:
Install specific browser (recommended for faster setup):
playwright install --with-deps chromium
Install all browsers (if you need multiple browser support):
playwright install
Step 8: Configure Environment Variables
Create environment file:
Windows (Command Prompt):
copy .env.example .env
macOS/Linux/Windows (PowerShell):
cp .env.example .env
Edit the .env file:
Open .env in your preferred text editor and configure these essential settings:
# LLM API Keys (Add your keys here)
OPENAI_API_KEY=your_openai_key_here
ANTHROPIC_API_KEY=your_anthropic_key_here
GOOGLE_API_KEY=your_google_key_here
# Browser Settings
CHROME_PERSISTENT_SESSION=false # Set to true to keep browser open between tasks
RESOLUTION=1920x1080x24 # Custom resolution format: WIDTHxHEIGHTxDEPTH
RESOLUTION_WIDTH=1920 # Custom width in pixels
RESOLUTION_HEIGHT=1080 # Custom height in pixels
# VNC Settings (for watching browser interactions)
VNC_PASSWORD=your_vnc_password # Optional, defaults to "vncpassword"
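The RESOLUTION value packs three numbers into one string. As a rough illustration of the WIDTHxHEIGHTxDEPTH format (not the application's own parsing code, which this guide does not show), a hypothetical `parse_resolution` helper could validate a value before you save it:

```python
def parse_resolution(value):
    """Split 'WIDTHxHEIGHTxDEPTH' into a tuple of three positive integers."""
    parts = value.strip().lower().split("x")
    if len(parts) != 3:
        raise ValueError(f"expected WIDTHxHEIGHTxDEPTH, got {value!r}")
    try:
        width, height, depth = (int(part) for part in parts)
    except ValueError:
        raise ValueError(f"non-numeric component in {value!r}") from None
    if min(width, height, depth) <= 0:
        raise ValueError(f"all components must be positive in {value!r}")
    return width, height, depth


if __name__ == "__main__":
    print(parse_resolution("1920x1080x24"))
```

The first two components should agree with RESOLUTION_WIDTH and RESOLUTION_HEIGHT so the settings do not contradict each other.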
Important API Key Setup:
For OpenAI:
- Visit: https://platform.openai.com/api-keys
- Create new API key
- Copy and paste into
OPENAI_API_KEY=
For Anthropic (Claude):
- Visit: https://console.anthropic.com/
- Generate API key
- Copy and paste into
ANTHROPIC_API_KEY=
For Google (Gemini):
- Visit: https://aistudio.google.com/app/apikey
- Create API key
- Copy and paste into
GOOGLE_API_KEY=
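To see which providers are actually usable, you can check which key variables hold a real value. The sketch below assumes the variables are already present in the process environment (for example, exported in your shell), and it treats values starting with `your_` as unfilled placeholders; both the helper name `configured_providers` and that placeholder rule are illustrative assumptions.

```python
import os

# Variable names taken from the .env example in this guide.
PROVIDER_KEYS = {
    "OpenAI": "OPENAI_API_KEY",
    "Anthropic": "ANTHROPIC_API_KEY",
    "Google": "GOOGLE_API_KEY",
}


def configured_providers(env=None):
    """Return provider names whose key is set and is not a template placeholder."""
    env = os.environ if env is None else env
    ready = []
    for provider, variable in PROVIDER_KEYS.items():
        value = env.get(variable, "").strip()
        # Treat values like "your_openai_key_here" as unset (assumed convention).
        if value and not value.lower().startswith("your_"):
            ready.append(provider)
    return ready


if __name__ == "__main__":
    print("Providers with keys:", configured_providers() or "none")
```

You only need one provider configured to get started; the others can stay as placeholders.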
Step 9: Launch Browser-Use Web UI
Start the application:
python webui.py --ip 127.0.0.1 --port 7788
WebUI Command Options:
- --ip: IP address to bind to (default: 127.0.0.1)
- --port: Port number (default: 7788)
- --theme: UI theme (Ocean, Soft, Monochrome, Glass, Origin, Citrus)
- --dark-mode: Enable dark mode interface
Example with custom settings:
python webui.py --ip 0.0.0.0 --port 8080 --theme Glass --dark-mode
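To make the option semantics concrete, here is a small `argparse` stand-in that mirrors the flags and defaults listed above. It is not the source of webui.py, only a sketch of how such options are typically declared; `build_parser` is a hypothetical name.

```python
import argparse

THEMES = ["Ocean", "Soft", "Monochrome", "Glass", "Origin", "Citrus"]


def build_parser():
    """Build a parser that mirrors the options listed in this guide."""
    parser = argparse.ArgumentParser(description="Illustrative stand-in for the webui.py options")
    parser.add_argument("--ip", default="127.0.0.1", help="IP address to bind to")
    parser.add_argument("--port", type=int, default=7788, help="Port number")
    parser.add_argument("--theme", choices=THEMES, default="Ocean", help="UI theme")
    parser.add_argument("--dark-mode", action="store_true", help="Enable dark mode")
    return parser


# Parse the same arguments as the custom example above.
args = build_parser().parse_args(
    ["--ip", "0.0.0.0", "--port", "8080", "--theme", "Glass", "--dark-mode"]
)
print(args.ip, args.port, args.theme, args.dark_mode)
```

Note that binding to 0.0.0.0 makes the interface reachable from other machines on your network, while the default 127.0.0.1 keeps it local.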
Step 10: Access Your Browser-Use Interface
Web Interface:
- Open your browser and navigate to:
http://127.0.0.1:7788
- You should see the Browser-Use Web UI interface
Success Indicators:
- ✅ Gradio interface loads successfully
- ✅ LLM models appear in dropdown menu
- ✅ Browser settings are configurable
- ✅ No error messages in terminal
Part 3: Docker Installation Method
Step 11: Clone Repository for Docker
git clone https://github.com/browser-use/web-ui.git
cd web-ui
Step 12: Configure Docker Environment
Create environment file:
# Windows (Command Prompt)
copy .env.example .env
# macOS/Linux/Windows (PowerShell)
cp .env.example .env
Edit .env file with your API keys:
# LLM API Keys
OPENAI_API_KEY=your_key_here
ANTHROPIC_API_KEY=your_key_here
GOOGLE_API_KEY=your_key_here
# Browser Settings
CHROME_PERSISTENT_SESSION=true # Set to true to keep browser open
RESOLUTION=1920x1080x24
RESOLUTION_WIDTH=1920
RESOLUTION_HEIGHT=1080
# VNC Settings
VNC_PASSWORD=yourvncpassword
Step 13: Launch with Docker Compose
Start with default settings (browser closes after tasks):
docker compose up --build
Start with persistent browser (recommended for complex workflows; this inline VAR=value prefix is bash/zsh syntax, so on Windows set the value in .env instead):
CHROME_PERSISTENT_SESSION=true docker compose up --build
Run in background (detached mode):
docker compose up -d --build
Step 14: Access Docker-Based Interface
Web Interface:
- Navigate to:
http://localhost:7788
VNC Viewer (watch browser interactions):
- Navigate to:
http://localhost:6080/vnc.html
- Default VNC password: "youvncpassword" (or whatever you set as VNC_PASSWORD in .env)
Docker Container Management:
# View logs
docker compose logs -f
# Stop container
docker compose down
# Restart container
docker compose restart
Part 4: Advanced Configuration & Usage
Step 15: Using Your Own Browser (Optional but Powerful)
This feature lets you use your existing browser profile with saved logins and bookmarks:
Configure browser paths in .env:
Windows:
CHROME_PATH="C:\Program Files\Google\Chrome\Application\chrome.exe"
CHROME_USER_DATA="C:\Users\YourUsername\AppData\Local\Google\Chrome\User Data"
macOS:
CHROME_PATH="/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"
CHROME_USER_DATA="/Users/YourUsername/Library/Application Support/Google/Chrome"
Important Setup Steps:
- Close ALL Chrome windows completely
- Open Browser-Use Web UI in Firefox or Edge (not Chrome)
- Check "Use Own Browser" option in Browser Settings
- Start your automation tasks
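A wrong path is the most common reason "Use Own Browser" fails, so it can help to confirm the file exists before launching. The sketch below reuses the example locations from this step; the helper names are hypothetical, and your Chrome may be installed elsewhere.

```python
import platform
from pathlib import Path

# Example locations copied from this guide; your installation may differ.
EXAMPLE_CHROME_PATHS = {
    "Windows": r"C:\Program Files\Google\Chrome\Application\chrome.exe",
    "Darwin": "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome",
}


def example_chrome_path(system=None):
    """Return the guide's example CHROME_PATH for an OS name, or None if not listed."""
    system = platform.system() if system is None else system
    return EXAMPLE_CHROME_PATHS.get(system)


def path_exists(path):
    """Return True if a non-empty path points at an existing file or directory."""
    return bool(path) and Path(path).exists()


if __name__ == "__main__":
    candidate = example_chrome_path()
    print(f"{candidate!r} exists: {path_exists(candidate)}")
```

The same `path_exists` check works for the CHROME_USER_DATA directory once you have replaced YourUsername with your account name.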
Step 16: Browser Persistence Modes
Default Mode (CHROME_PERSISTENT_SESSION=false):
- ✅ Browser opens fresh for each task
- ✅ Clean state, no interference
- ✅ Lower resource usage
- ❌ No task history preservation
Persistent Mode (CHROME_PERSISTENT_SESSION=true):
- ✅ Browser stays open between tasks
- ✅ Maintains login sessions
- ✅ View complete interaction history
- ✅ Better for complex multi-step workflows
- ❌ Higher resource usage
Set in your .env file:
CHROME_PERSISTENT_SESSION=true
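Environment variables are always strings, so a flag like this has to be interpreted by whatever reads it. How Browser-Use Web UI does that is not shown in this guide; the sketch below is one common convention, with a hypothetical `env_flag` helper and an assumed set of accepted "true" spellings.

```python
import os

TRUE_VALUES = {"1", "true", "yes", "on"}  # assumed spellings, not confirmed by the project


def env_flag(name, default=False, env=None):
    """Interpret an environment variable as a boolean."""
    env = os.environ if env is None else env
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUE_VALUES


if __name__ == "__main__":
    print("Persistent session:", env_flag("CHROME_PERSISTENT_SESSION"))
```

Sticking to the lowercase `true` and `false` used in the examples avoids depending on how lenient the real parser is.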
Step 17: Monitoring Browser Interactions
VNC Viewer Access:
- URL:
http://localhost:6080/vnc.html
- Password: Your VNC_PASSWORD from .env
- Direct VNC: Connect to localhost:5900 with a VNC client
What You Can See:
- Real-time browser interactions
- AI agent decision-making process
- Form filling and navigation
- Error handling and recovery
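Before opening a viewer, you can confirm that the relevant ports are accepting connections. This is a minimal sketch using the port numbers mentioned in this guide (7788, 6080, 5900); the service labels and the `is_listening` helper are illustrative.

```python
import socket

# Ports named in this guide: web interface, browser-based VNC page, direct VNC.
SERVICES = {"Web UI": 7788, "VNC web page": 6080, "Direct VNC": 5900}


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "up" if is_listening("127.0.0.1", port) else "not reachable"
        print(f"{name} (port {port}): {state}")
```

A port that accepts connections only tells you something is listening there; it does not confirm the VNC password or that the browser has finished starting.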
Step 18: Theme Customization
Browser-Use Web UI offers multiple themes:
Available Themes:
- Ocean (default): Blue, calming ocean-inspired theme
- Soft: Gentle, muted colors for relaxed viewing
- Monochrome: Grayscale theme for minimal distraction
- Glass: Sleek, semi-transparent modern design
- Origin: Classic, retro-inspired nostalgic theme
- Citrus: Vibrant, bright citrus-inspired palette
Apply theme:
python webui.py --theme Citrus --dark-mode
Part 5: Testing Your Setup
Step 19: First Test Run
Basic functionality test:
- Access Web UI:
http://127.0.0.1:7788
- Select your LLM model from dropdown
- Enter a simple task: "Navigate to google.com and search for 'AI automation'"
- Click "Start Task"
- Watch the browser automation begin
Success Indicators:
- ✅ Browser window opens automatically
- ✅ AI agent navigates to Google
- ✅ Search is performed correctly
- ✅ Results are displayed
- ✅ Task completes successfully
Step 20: Advanced Test Scenarios
Form Filling Test:
- Task: "Go to a contact form and fill it with sample data"
- Verify: AI can identify form fields and populate them
Multi-Step Workflow Test:
- Task: "Search for a product on an e-commerce site, add to cart, and proceed to checkout (but don't complete purchase)"
- Verify: AI maintains context across multiple pages
Authentication Test (with persistent browser):
- Task: "Log into a social media platform and create a post"
- Verify: AI can handle login forms and authenticated actions
Part 6: Troubleshooting Common Issues
Problem: "Module not found" errors
Solution:
- Ensure virtual environment is activated
- Reinstall requirements:
pip install -r requirements.txt
- Check Python version (should be 3.11+):
python --version
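The first item in that list, whether the virtual environment is really active, can be checked from Python directly. In a venv-style environment `sys.prefix` differs from `sys.base_prefix`; the `in_virtualenv` wrapper below is just an illustrative name for that comparison.

```python
import sys


def in_virtualenv(prefix=None, base_prefix=None):
    """Return True when the interpreter runs inside a venv-style environment."""
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    return prefix != base_prefix


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Virtual environment active:", in_virtualenv())
```

If the interpreter path does not point inside the project's .venv folder, activate the environment again and reinstall the requirements there.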
Problem: Browser fails to start
Solution:
- Reinstall Playwright:
playwright install --with-deps chromium
- Check system permissions for browser installation
- Verify no other instances of Chrome are running
Problem: API key authentication fails
Solution:
- Verify API keys are correctly formatted in .env
- Check API key validity on respective platforms
- Ensure no extra spaces in environment variables
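Formatting slips in .env are easy to miss by eye. The following sketch scans the file's text for a few common ones (spaces around `=`, empty values, lines without `=`). Whether a given loader tolerates such spacing varies, so the hypothetical `find_env_issues` helper only flags candidates for you to review.

```python
def find_env_issues(text):
    """Return (line_number, message) pairs for common KEY=value formatting slips."""
    issues = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        if "=" not in stripped:
            issues.append((number, "missing '='"))
            continue
        key, value = stripped.split("=", 1)
        if key != key.strip():
            issues.append((number, "space before '='"))
        if value != value.lstrip():
            issues.append((number, "space after '='"))
        if not value.strip():
            issues.append((number, "empty value"))
    return issues


if __name__ == "__main__":
    sample = "OPENAI_API_KEY = sk-example\nGOOGLE_API_KEY=\n"
    for number, message in find_env_issues(sample):
        print(f"line {number}: {message}")
```

To check your real file, read it with `open(".env").read()` and pass the text in; the function never prints the key values themselves.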
Problem: Docker container won't start
Solution:
- Check Docker daemon is running:
docker ps
- Verify port 7788 isn't already in use:
netstat -an | grep 7788 # Windows: netstat -an | findstr 7788
- Review container logs:
docker compose logs
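The port check can also be done without platform-specific tools. As a rough sketch, trying to bind the port yourself reveals whether another process already holds it; `port_is_free` is a hypothetical helper, and it should be run while the container is stopped.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if this process can bind host:port, i.e. nothing else holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
    return True


if __name__ == "__main__":
    for port in (7788, 6080, 5900):
        print(f"port {port}:", "free" if port_is_free(port) else "already in use")
```

If a port is taken, either stop the process that owns it or map the container to a different host port.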
Problem: VNC viewer shows black screen
Solution:
- Wait 30-60 seconds for VNC to initialize
- Check that the VNC password matches your .env configuration
- Restart container:
docker compose restart
Problem: Browser automation gets stuck
Solution:
- Use more specific instructions in your task description
- Enable persistent session for complex workflows
- Monitor via VNC to understand where AI gets confused
- Break complex tasks into smaller, specific steps
Part 7: Best Practices for AI Browser Automation
Step 21: Writing Effective Task Instructions
Good Task Examples:
- ✅ "Navigate to amazon.com, search for 'wireless headphones', filter by 4+ star ratings, and add the first result to cart"
- ✅ "Go to LinkedIn.com, search for 'software engineers in San Francisco', and save the first 5 profiles"
- ✅ "Visit news.google.com, find articles about artificial intelligence from the past week, and summarize the top 3 headlines"
Poor Task Examples:
- ❌ "Buy something online" (too vague)
- ❌ "Do social media stuff" (unclear objective)
- ❌ "Fill out forms" (no context or specifics)
Step 22: Optimizing Performance
For Better Speed:
- Use specific selectors in instructions
- Enable persistent sessions for related tasks
- Close unnecessary browser tabs between tasks
- Monitor resource usage during long operations
For Better Accuracy:
- Provide clear, step-by-step instructions
- Specify exact text or button names when possible
- Include error handling instructions ("if X fails, try Y")
- Test with simple tasks before complex workflows
Step 23: Security Considerations
API Key Security:
- Never commit .env files to version control
- Use environment-specific API keys
- Regularly rotate API keys
- Monitor API usage for unusual activity
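One way to back up the first rule is to confirm that .env is covered by your .gitignore. The sketch below is a deliberately simple heuristic that recognizes only a few literal patterns, and `env_is_ignored` is a hypothetical helper; running `git check-ignore .env` inside the repository gives the authoritative answer.

```python
def env_is_ignored(gitignore_text):
    """Heuristic: True if a .gitignore has a plain pattern that covers '.env'."""
    covering = {".env", "/.env", ".env*"}
    ignored = False
    for line in gitignore_text.splitlines():
        pattern = line.strip()
        if not pattern or pattern.startswith("#"):
            continue  # skip blank lines and comments
        if pattern in covering:
            ignored = True
        elif pattern.startswith("!") and pattern[1:] in covering:
            ignored = False  # a later negation re-includes the file
    return ignored


if __name__ == "__main__":
    print(env_is_ignored("__pycache__/\n.env\n"))
```

Be aware that a broad pattern such as `.env*` would also ignore `.env.example`, which you normally do want to keep in the repository.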
Browser Security:
- Use dedicated browser profiles for automation
- Avoid storing sensitive credentials in persistent sessions
- Clear browser data after sensitive operations
- Monitor automated actions in real-time via VNC
Part 8: Real-World Use Cases
Content Management Automation
- Website Translation: Automate translation of content management systems
- Content Publishing: Schedule and publish posts across platforms
- Data Entry: Fill forms and update databases automatically
E-commerce Operations
- Price Monitoring: Track competitor pricing automatically
- Inventory Updates: Sync product information across platforms
- Order Processing: Automate routine order management tasks
Social Media Management
- Post Scheduling: Automate social media posting workflows
- Engagement Monitoring: Track mentions and respond appropriately
- Content Curation: Gather and organize social media content
Research and Data Collection
- Market Research: Gather competitive intelligence automatically
- Lead Generation: Collect contact information from websites
- Content Aggregation: Compile information from multiple sources
Conclusion: You're Ready to Automate!
Congratulations! You've successfully set up Browser-Use Web UI and learned how to harness the power of AI-driven browser automation. You now have:
- ✅ Complete development environment (local or Docker)
- ✅ LLM integration with multiple providers
- ✅ Browser automation capabilities
- ✅ Real-time monitoring through VNC
- ✅ Persistent session management
- ✅ Advanced configuration options
What's Next?
Immediate Steps:
- Start with simple automation tasks to understand capabilities
- Experiment with different LLM models to find the best fit
- Practice writing clear, specific task instructions
- Explore the VNC viewer for debugging complex workflows
Advanced Exploration:
- Integrate with your existing workflows and systems
- Develop custom automation scripts for specific use cases
- Combine with other AI tools for comprehensive automation
- Share successful automation patterns with the community
Remember These Key Points:
- Always monitor AI actions, especially on production websites
- Start with non-critical tasks while learning the system
- Use persistent sessions for complex, multi-step workflows
- Keep your API keys secure and monitor usage regularly
Pro Tips for Success
- Start Small: Begin with basic navigation and form-filling tasks
- Be Specific: Detailed instructions yield better results than vague requests
- Monitor Resources: Watch CPU and memory usage during intensive tasks
- Document Workflows: Keep notes on successful automation patterns
- Stay Updated: Follow the GitHub repository for new features and improvements
Need More Help?
Official Resources:
- GitHub Repository: https://github.com/browser-use/web-ui
- Documentation: Check the repo's wiki and README files
- Issue Tracker: Report bugs and request features on GitHub
Community Support:
- Join discussions in GitHub Issues section
- Share successful automation examples
- Contribute to the project's development
Troubleshooting Resources:
- Review browser-use core documentation
- Check Playwright documentation for browser-specific issues
- Consult LLM provider documentation for API-related problems
Happy automating, and enjoy exploring the possibilities of AI-powered browser automation!
Tags: #BrowserAutomation #AIAgents #WebUI #BrowserUse #Playwright #LLM #Automation #Python #Docker #AI