From cb92e432b34b04a093b3078de5b8bae253d0f31f Mon Sep 17 00:00:00 2001
From: Thomas Dhome-Casanova
Date: Wed, 5 Feb 2025 11:11:08 -0800
Subject: [PATCH] Tweaking of information order

---
 omnitool/readme.md | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/omnitool/readme.md b/omnitool/readme.md
index 3a501aa..2a8b529 100644
--- a/omnitool/readme.md
+++ b/omnitool/readme.md
@@ -2,7 +2,13 @@
 
 # OmniTool
 
-Control a Windows 11 VM with OmniParser+X (OpenAI (4o/o1/o3-mini), DeepSeek (R1), Qwen (2.5VL)) or Anthropic Computer Use.
+Control a Windows 11 VM with OmniParser + your vision model of choice.
+
+## Highlights:
+
+1. **OmniParser V2** is 60% faster than V1 and now understands a wide variety of OS, app and inside app icons!
+2. **OmniBox** uses 50% less disk space than other Windows VMs for agent testing, whilst providing the same computer use API
+3. **OmniTool** supports out of the box the following vision models - OpenAI (4o/o1/o3-mini), DeepSeek (R1), Qwen (2.5VL) or Anthropic Computer Use
 
 ## Overview
 
@@ -26,11 +32,8 @@ There are three components:
-## Highlights:
-1. **OmniParser V2** is 60% faster than V1 and now understands a wide variety of OS, app and inside app icons!
-2. **OmniBox** uses 50% less disk space than other Windows VMs for agent testing, whilst providing the same computer use API
-
 ## Notes:
+
 1. Though **OmniParser V2** can run on a CPU, we have separated this out if you want to run it fast on a GPU machine
 2. The **OmniBox** Windows 11 VM docker is dependent on KVM so can only run quickly on Windows and Linux. This can run on a CPU machine (doesn't need GPU).
 3. The Gradio UI can also run on a CPU machine. We suggest running **omnibox** and **gradio** on the same CPU machine and **omniparserserver** on a GPU server.