# OmniTool
Control a Windows 11 VM with OmniParser + your vision model of choice.
## Highlights
1. **OmniParser V2** is 60% faster than V1 and now understands a wide variety of OS, app, and in-app icons!
2. **OmniBox** uses 50% less disk space than other Windows VMs for agent testing, whilst providing the same computer use API
3. **OmniTool** supports the following vision models out of the box: OpenAI (4o/o1/o3-mini), DeepSeek (R1), Qwen (2.5VL), and Anthropic Computer Use
## Overview
There are three components:

| Component | Description |
| --- | --- |
| omniparserserver | FastAPI server running OmniParser V2. |
| omnibox | A Windows 11 VM running in a Docker container. |
| gradio | UI to provide commands and watch reasoning + execution on OmniBox. |
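
As a rough illustration of how the three components might fit together, here is a minimal Python sketch of an agent loop. It is not OmniTool's actual code: the function `run_agent_loop`, the `Step` record, and the four callables (`capture`, `parse`, `decide`, `execute`) are hypothetical names invented for this example, and the stubs stand in for the real VM, parser server, and vision model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One iteration of the loop (hypothetical record, for illustration only)."""
    screenshot: bytes
    elements: List[str]
    action: str


def run_agent_loop(
    capture: Callable[[], bytes],
    parse: Callable[[bytes], List[str]],
    decide: Callable[[List[str]], str],
    execute: Callable[[str], None],
    max_steps: int = 3,
) -> List[Step]:
    """Assumed control flow: screenshot -> parse -> decide -> execute."""
    history: List[Step] = []
    for _ in range(max_steps):
        screenshot = capture()        # assumed role of omnibox: grab the VM screen
        elements = parse(screenshot)  # assumed role of omniparserserver: find UI elements
        action = decide(elements)     # assumed role of the vision model: pick the next action
        if action == "done":
            break
        execute(action)               # assumed role of omnibox: perform the action in the VM
        history.append(Step(screenshot, elements, action))
    return history


# Stubs replace the real components so the sketch runs on its own.
actions = iter(["click Start", "type notepad", "done"])
executed: List[str] = []

history = run_agent_loop(
    capture=lambda: b"fake-png",
    parse=lambda shot: ["Start button", "Search box"],
    decide=lambda elements: next(actions),
    execute=executed.append,
)

for step in history:
    print(step.action)
```

In this sketch the gradio UI would be the place where the user's command enters and where each `Step` is displayed; passing the components in as callables keeps the loop independent of any particular model or server.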