Naming conventions
BIN
imgs/header_bar.png
Normal file
Binary file not shown. Size after: 210 KiB
@@ -26,11 +26,11 @@ CONFIG_DIR = Path("~/.anthropic").expanduser()
API_KEY_FILE = CONFIG_DIR / "api_key"
INTRO_TEXT = '''
🚀🤖✨ It's Play Time!
Welcome to OmniTool - the OmniParser+X Computer Use Demo! X = [OpenAI (4o/o1/o3-mini), DeepSeek (R1), Qwen (2.5VL) or Anthropic Computer Use (Sonnet)].
Welcome to the OmniParser+X Computer Use Demo! X = [GPT family (4o/o1/o3-mini), Claude, deepseek R1/V3, Qwen-2.5VL]. Let OmniParser turn your general purpose vision-language model to an AI agent.
OmniParser lets you turn any vision-language model into an AI agent.
Type a message and press submit to start OmniParser+X. Press stop to pause, and press the trash icon in the chat to clear the message history.
Type a message and press submit to start OmniTool. Press stop to pause, and press the trash icon in the chat to clear the message history.
'''
def parse_arguments():
@@ -271,7 +271,7 @@ with gr.Blocks(theme=gr.themes.Default()) as demo:
setup_state(state.value)
gr.Markdown("# OmniParser + ✖️ Demo")
gr.Markdown("# OmniTool")
if not os.getenv("HIDE_WARNING", False):
gr.Markdown(INTRO_TEXT)
Before Size: 3.1 KiB | After Size: 3.1 KiB
@@ -1,36 +1,38 @@
# OmniParser+X Computer Use Demo
Control a Windows 11 VM with OmniParser+X (X = [GPT family (4o/o1/o3-mini), Claude, deepseek R1/V3, Qwen-2.5VL]).
<p align="center">
<img src="../imgs/som_overlaid_omni.png" alt="OmniParser+X Computer Use Demo screenshot">
<img src="../imgs/header_bar.png" alt="OmniParser+X Computer Use Demo screenshot">
</p>
# OmniTool
Control a Windows 11 VM with OmniParser+X (OpenAI (4o/o1/o3-mini), DeepSeek (R1), Qwen (2.5VL)) or Anthropic Computer Use.
## Overview
There are three components:
1. **windowshost**: A Windows 11 VM running in a Docker container
2. **omniparserserver**: FastAPI server running OmniParser
1. **omnibox**: A Windows 11 VM running in a Docker container
2. **omniparserserver**: FastAPI server running OmniParser V2
3. **gradio**: UI where you can provide commands and watch OmniParser+X reasoning and executing on the Windows 11 VM
Notes:
1. The Windows 11 VM docker is dependent on KVM so can only run quickly on Windows and Linux. This can run on a CPU machine (doesn't need GPU).
2. Though OmniParser can run on a CPU, we have separated this out if you want to run it fast on a GPU machine
3. The Gradio UI can also run on a CPU machine.
3. The Gradio UI can also run on a CPU machine. We suggest running **omnibox** and **gradio** on the same CPU machine and **omniparserserver** on a GPU server.
## Setup
1. **windowshost**:
1. **omnibox**:
a. Install Docker Desktop
b. Visit [Microsoft Evaluation Center](https://info.microsoft.com/ww-landing-windows-11-enterprise.html), accept the Terms of Service, and download a **Windows 11 Enterprise Evaluation (90-day trial, English, United States)** ISO file [~6GB]. Rename the file to `custom.iso` and copy it to the directory `OmniParser/computer_use_demo/windowshost/vm/win11iso`
b. Visit [Microsoft Evaluation Center](https://info.microsoft.com/ww-landing-windows-11-enterprise.html), accept the Terms of Service, and download a **Windows 11 Enterprise Evaluation (90-day trial, English, United States)** ISO file [~6GB]. Rename the file to `custom.iso` and copy it to the directory `OmniParser/omnitool/omnibox/vm/win11iso`
c. Navigate to vm management script directory with `cd OmniParser/computer_use_demo/windowshost/scripts`
c. Navigate to vm management script directory with `cd OmniParser/omnitool/omnibox/scripts`
d. Build the docker container [400MB] and install the ISO to a storage folder [20GB] with `./manage_vm.sh create`
e. After creating the first time it will store a save of the VM state in `vm/win11storage`. You can then manage the VM with `./manage_vm.sh start` and `./manage_vm.sh stop`. To delete the VM, use `./manage_vm.sh delete` and delete the `OmniParser/computer_use_demo/windowshost/vm/win11storage` directory.
e. After creating the first time it will store a save of the VM state in `vm/win11storage`. You can then manage the VM with `./manage_vm.sh start` and `./manage_vm.sh stop`. To delete the VM, use `./manage_vm.sh delete` and delete the `OmniParser/omnitool/omnibox/vm/win11storage` directory.
2. **omniparserserver**:
@@ -51,13 +53,13 @@ Notes:
h. Ensure you have the weights downloaded in weights folder. If not download them with:
`for folder in icon_caption_florence icon_detect icon_detect_v1_5; do huggingface-cli download microsoft/OmniParser --local-dir weights/ --repo-type model --include "$folder/*"; done`
h. Navigate to the server directory with `cd OmniParser/computer_use_demo/omniparserserver`
h. Navigate to the server directory with `cd OmniParser/omnitool/omniparserserver`
i. Start the server with `python -m omniparserserver`
3. **gradio**:
a. Navigate to the gradio directory with `cd OmniParser/computer_use_demo/gradio`
a. Navigate to the gradio directory with `cd OmniParser/omnitool/gradio`
b. Ensure you have activated the conda python environment with `conda activate omni`
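
The omnibox setup steps in this README depend on files being at the renamed paths, so a small pre-flight check can help when following them. The following is only a sketch and is not part of the repository. The helper names `check_omnibox_layout` and `next_step` are hypothetical. The relative paths are taken from the README text. It also assumes that `vm/win11storage` is empty or absent until `./manage_vm.sh create` has run, which the README does not state explicitly.

```python
from pathlib import Path

# Relative paths as given in the README, under a local OmniParser checkout.
ISO_RELATIVE_PATH = Path("omnitool/omnibox/vm/win11iso/custom.iso")
STORAGE_RELATIVE_PATH = Path("omnitool/omnibox/vm/win11storage")


def check_omnibox_layout(repo_root: Path) -> dict[str, bool]:
    """Report which omnibox artifacts exist under repo_root (hypothetical helper)."""
    storage = repo_root / STORAGE_RELATIVE_PATH
    return {
        "iso_present": (repo_root / ISO_RELATIVE_PATH).is_file(),
        # Assumption: a non-empty storage directory means the VM was created.
        "vm_created": storage.is_dir() and any(storage.iterdir()),
    }


def next_step(repo_root: Path) -> str:
    """Suggest the next README step based on what is on disk."""
    status = check_omnibox_layout(repo_root)
    if not status["iso_present"]:
        return "download the ISO and save it as custom.iso"
    if not status["vm_created"]:
        return "./manage_vm.sh create"
    return "./manage_vm.sh start"
```

Calling `next_step(Path("OmniParser"))` from the directory that contains the checkout returns a short hint string. It does not run Docker or the management script itself.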