Try Demo 🤗HuggingFace | 🚀ModelScope
Join us on 🎮Discord | 💬WeChat
If you like WebQA Agent, please give us a ⭐ on GitHub!
🤖 WebQA Agent is an autonomous web browser agent that audits performance, functionality & UX for engineers and vibe-coding creators. ✨
- 🤖 AI-Powered Testing: Performs autonomous website testing with intelligent planning and reflection: it explores pages, plans actions, and executes end-to-end flows without manual scripting. Features a two-stage architecture (lightweight filtering + comprehensive planning) and dynamic test generation for newly appeared UI elements.
- 📊 Multi-Dimensional Observation: Covers functionality, performance, user experience, and basic security; evaluates load speed, design details, and links to surface issues. Uses multi-modal analysis (screenshots + DOM structure + text content) and DOM diff detection to discover new test opportunities.
- 🎯 Actionable Recommendations: Runs in real browsers with smart element prioritization and automatic viewport management. Provides concrete suggestions for improvement with adaptive recovery mechanisms for robust test execution.
- 📈 Visual Reports: Generates detailed HTML test reports with clear, multi-dimensional views for analysis and tracking.
- 🤖 Conversational UI: Autonomously plans goals and interacts with a dynamic chat interface.
- 🎨 Creative Page: Explores the page structure and identifies elements.
Try Demo: 🤗Hugging Face · 🚀ModelScope
🏎️ Recommended: uv (Python >= 3.11):
```bash
# 1) Create a project and install the package
uv init my-webqa && cd my-webqa
uv add webqa-agent
uv sync

# 2) Install the browser (required)
uv run playwright install chromium

# 3) Create a config file (auto-generated template)
uv run webqa-agent init  # creates config.yaml

# 4) Edit config.yaml
#    - target.url: your site
#    - llm_config.api_key: your OpenAI key (or set OPENAI_API_KEY)
#    For detailed configuration information, please refer to "Usage > Test Configuration"

# 5) Run
uv run webqa-agent run
```

Before starting, ensure Docker is installed. If not, please refer to the official installation guide: Docker Installation Guide.
Recommended versions: Docker >= 24.0, Docker Compose >= 2.32.
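To sanity-check those minimums, a small helper can parse the version strings that `docker --version` and `docker compose version` print. This is a hypothetical convenience snippet, not part of WebQA Agent:

```python
import re

def version_tuple(text: str) -> tuple[int, ...]:
    """Extract the first dotted version, e.g. 'Docker version 24.0.7, build ...' -> (24, 0, 7)."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError(f"no version number in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))

def meets_minimum(cli_output: str, minimum: str) -> bool:
    """Tuple comparison handles differing lengths: (24, 0, 7) >= (24, 0)."""
    return version_tuple(cli_output) >= version_tuple(minimum)
```

Feed it the output of `subprocess.run(["docker", "--version"], capture_output=True, text=True).stdout` and compare against `"24.0"` (and likewise `"2.32"` for Compose).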
```bash
mkdir -p config \
  && curl -fsSL https://raw.githubusercontent.com/MigoXLab/webqa-agent/main/config/config.yaml.example -o config/config.yaml

# Edit config.yaml
# Set target.url, llm_config.api_key and other parameters

curl -fsSL https://raw.githubusercontent.com/MigoXLab/webqa-agent/main/start.sh | bash
```

To run from source instead:

```bash
git clone https://github.com/MigoXLab/webqa-agent.git
cd webqa-agent
uv sync
uv run playwright install chromium
cp ./config/config.yaml.example ./config/config.yaml

# Edit config.yaml
# Set target.url, llm_config.api_key and other parameters

uv run webqa-agent run -c ./config/config.yaml
```

Performance (Lighthouse): `npm install lighthouse chrome-launcher` (Node.js ≥ 18)
Security (Nuclei):

```bash
brew install nuclei  # macOS
nuclei -ut           # update templates
# Linux/Windows: download a release from https://github.com/projectdiscovery/nuclei/releases
```

Example config.yaml:

```yaml
target:
  url: https://example.com  # Website URL to test
  description: Website QA testing
  # max_concurrent_tests: 2  # Optional, default 2

test_config:
  function_test:  # Functional testing
    enabled: True
    type: ai  # 'default' or 'ai'
    business_objectives: Test search functionality, generate 3 test cases
    dynamic_step_generation:
      enabled: True  # Enable dynamic step generation
      max_dynamic_steps: 10
      min_elements_threshold: 1
  ux_test:  # User experience testing
    enabled: True
  performance_test:  # Performance analysis (requires Lighthouse)
    enabled: False
  security_test:  # Security scanning (requires Nuclei)
    enabled: False

llm_config:
  model: gpt-4.1-2025-04-14  # Vision model; currently supports the OpenAI SDK-compatible format only
  filter_model: gpt-4o-mini  # Lightweight model for element filtering
  api_key: your_api_key  # Or use the OPENAI_API_KEY env var
  base_url: https://api.openai.com/v1  # Or use the OPENAI_BASE_URL env var
  temperature: 0.1

browser_config:
  viewport: {"width": 1280, "height": 720}
  headless: False  # Automatically True in Docker
  language: en-US
  cookies: []
  save_screenshots: False

report:
  language: en-US  # zh-CN or en-US

log:
  level: info  # debug, info, warning, error
```

- Functional Testing (AI mode): Two-stage planning. Stage 1 (`filter_model`) prioritizes elements for efficient analysis; Stage 2 (the primary `model`) generates comprehensive test cases. The agent may reflect and re-plan based on page state and coverage, so the number of executed cases can differ from the initial request. When `dynamic_step_generation` is enabled, new UI elements (e.g., dropdowns, modals) detected via DOM diff trigger additional generated steps.
- Functional Testing (default mode): Focuses on whether UI interactions (clicks, navigations) complete successfully.
- User Experience Testing: Multi-modal analysis (screenshots + DOM structure + text) to assess visual quality, detect typos/grammar issues, and validate layout rendering. Model outputs include best-practice suggestions for optimization.
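The DOM-diff trigger described above can be sketched in miniature. Everything here (the element identifiers, the helper name, the threshold constant) is illustrative, not WebQA Agent's actual API; only `min_elements_threshold` corresponds to a real config key:

```python
def new_element_ids(before: set[str], after: set[str]) -> set[str]:
    """Elements present in the post-action snapshot but not in the pre-action one."""
    return after - before

# Illustrative snapshots: clicking a menu button reveals two dropdown items
before = {"nav", "menu-button", "footer"}
after = {"nav", "menu-button", "footer", "dropdown-item-1", "dropdown-item-2"}

appeared = new_element_ids(before, after)
MIN_ELEMENTS_THRESHOLD = 1  # mirrors min_elements_threshold in config.yaml
if len(appeared) >= MIN_ELEMENTS_THRESHOLD:
    print(f"plan dynamic steps for: {sorted(appeared)}")
```

Newly appeared elements that clear the threshold become candidates for dynamically generated test steps, capped by `max_dynamic_steps`.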
```bash
# Create config.yaml in the current directory
webqa-agent init

# Create it at a custom path
webqa-agent init -o myconfig.yaml

# Overwrite an existing file
webqa-agent init --force
```

```bash
# Auto-discover the config (./config.yaml or ./config/config.yaml)
webqa-agent run

# Specify a config file
webqa-agent run -c /path/to/config.yaml
```

WebQA Agent provides a visual interface powered by Gradio:
```bash
# Install Gradio
uv add "gradio>=5.44.0"

# Launch the Web UI (English by default), served at http://localhost:7860
webqa-agent ui

# Launch with the Chinese interface
webqa-agent ui -l zh-CN

# Optional: custom host/port, without auto-opening a browser
webqa-agent ui --host 0.0.0.0 --port 9000
```

| Model | Recommendation |
|---|---|
| gpt-4.1-2025-04-14 | High accuracy and reliability |
| gpt-4.1-mini-2025-04-14 | Economical and practical |
| qwen3-vl-235b-a22b-instruct | Open-source model, preferred for on-premises deployment |
| doubao-seed-1-6-vision-250815 | Good web understanding, supports visual recognition |
Test reports are generated in the `reports/` directory. Open the HTML file to view detailed results.
- Continuous optimization of AI functional testing: Improve coverage and accuracy
- Functional traversal and page validation: Verify business logic correctness
- Interaction and visualization: Real-time reasoning process display
- Capability expansion: Multi-model integration and more evaluation dimensions
- natbot: Drive a browser with GPT-3
- Midscene.js: AI Operator for Web, Android, Automation & Testing
- browser-use: AI Agent for Browser control
This project is licensed under the Apache 2.0 License.