OmniBox setup additional details + debugging info
committed by
GitHub
parent
e40a461492
commit
3c4e77ec37
@@ -75,15 +75,18 @@ There are three components:
2. **omnibox**:
a. Install Docker Desktop
a. Ensure you have at least 30GB of free disk space (5GB for the ISO, 400MB for the Docker container, 20GB for the storage folder)
b. Visit [Microsoft Evaluation Center](https://info.microsoft.com/ww-landing-windows-11-enterprise.html), accept the Terms of Service, and download a **Windows 11 Enterprise Evaluation (90-day trial, English, United States)** ISO file [~6GB]. Rename the file to `custom.iso` and copy it to the directory `OmniParser/omnitool/omnibox/vm/win11iso`
b. Install Docker Desktop
c. Navigate to the VM management script directory with `cd OmniParser/omnitool/omnibox/scripts`
c. Visit [Microsoft Evaluation Center](https://info.microsoft.com/ww-landing-windows-11-enterprise.html), accept the Terms of Service, and download a **Windows 11 Enterprise Evaluation (90-day trial, English, United States)** ISO file [~6GB]. Rename the file to `custom.iso` and copy it to the directory `OmniParser/omnitool/omnibox/vm/win11iso`
d. Build the Docker container [400MB] and install the ISO to a storage folder [20GB] with `./manage_vm.sh create`
d. Navigate to the VM management script directory with `cd OmniParser/omnitool/omnibox/scripts`
e. After the first `create`, the VM state is saved in `vm/win11storage`. You can then manage the VM with `./manage_vm.sh start` and `./manage_vm.sh stop`. To delete the VM, run `./manage_vm.sh delete` and then delete the `OmniParser/omnitool/omnibox/vm/win11storage` directory.
e. Build the Docker container [400MB] and install the ISO to a storage folder [20GB] with `./manage_vm.sh create`. The process is shown in the screenshots below and takes 20-90 minutes depending on download speed (commonly around 60 minutes). When it completes, the terminal shows `VM + server is up and running!`. You can watch the apps being installed in the VM by looking at the desktop via the NoVNC viewer (http://localhost:8006/vnc.html?view_only=1&autoconnect=1&resize=scale). The terminal window shown in the NoVNC viewer closes once setup is done. If you can still see it, wait and don't click around!

f. After the first `create`, the VM state is saved in `vm/win11storage`. You can then manage the VM with `./manage_vm.sh start` and `./manage_vm.sh stop`. To delete the VM, run `./manage_vm.sh delete` and then delete the `OmniParser/omnitool/omnibox/vm/win11storage` directory.
3. **gradio**:
@@ -95,6 +98,20 @@ There are three components:
d. Open the URL in the terminal output, set your API Key and start playing with the AI agent!
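
Before running `./manage_vm.sh create` in the omnibox steps above, it can help to confirm that enough disk space is free and that the ISO is where the script expects it. The following is a minimal sketch, not part of the repository: `has_free_space` and `iso_in_place` are hypothetical helper names, and the 30GB threshold and ISO directory are taken from the steps above.

```shell
# Illustrative pre-flight helpers (hypothetical, not shipped with OmniParser).

# Return 0 if the filesystem holding directory $1 has at least $2 GB free.
has_free_space() {
  local dir="$1"
  local required_gb="$2"
  local available_kb
  # df -Pk prints POSIX-format output in 1024-byte blocks; field 4 of the
  # second line is the available space.
  available_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
  [ "$available_kb" -ge $((required_gb * 1024 * 1024)) ]
}

# Return 0 if custom.iso exists in directory $1.
iso_in_place() {
  [ -f "$1/custom.iso" ]
}
```

For example, from the directory that contains `OmniParser`, running `has_free_space . 30 && iso_in_place OmniParser/omnitool/omnibox/vm/win11iso && echo "ready to create"` prints the message only when both checks pass.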
## Common setup errors
## Validation errors: Windows Host is not responding
If you get this error in Gradio after clicking the submit button, the server running inside the VM, which accepts commands from Gradio and then drives the mouse and keyboard, is not available. You can verify this by running `curl http://localhost:5000/probe`. Make sure your `omnibox` has fully finished setting up (the terminal window should no longer be visible on the VM desktop); refer to the omnibox section for how long that takes. If your omnibox is already set up, you may just need to wait a little longer.
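
Rather than re-running the `curl` check by hand, the wait can be scripted. This is a minimal sketch under stated assumptions: `wait_for_probe` is a hypothetical helper name, and it treats any successful HTTP response from the probe URL as "server is up" because the exact response body is not documented here.

```shell
# Poll a URL until it responds successfully or the attempts run out.
# Arguments: URL, maximum number of attempts, seconds to sleep between attempts.
wait_for_probe() {
  local url="$1"
  local attempts="$2"
  local delay="$3"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # -s hides progress output, -f makes curl exit non-zero on HTTP errors,
    # --max-time bounds each request so a hung connection cannot stall the loop.
    if curl -sf --max-time 5 "$url" > /dev/null; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}
```

For example, `wait_for_probe http://localhost:5000/probe 60 10` retries every 10 seconds for roughly 10 minutes and returns a non-zero status if the server never answers.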
If waiting 10 minutes doesn't help, try stopping (`./manage_vm.sh stop`) and then starting (`./manage_vm.sh start`) your omnibox VM with the script commands.
If that doesn't work, delete your VM (`./manage_vm.sh delete`) but leave the storage folder in place, then run `./manage_vm.sh create` again. This is fast because it reuses the existing storage folder.
Finally, if that still doesn't work and you want to fully reset your VM to factory settings (create a new VM):
1. run `./manage_vm.sh delete`
2. delete the `vm/win11storage` folder
3. run `./manage_vm.sh create`
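
The three reset steps above can be wrapped in one function. This is only a sketch: `full_reset_vm` is a hypothetical name, it assumes the directory layout described in the omnibox section (`scripts/manage_vm.sh` and `vm/win11storage` under the omnibox directory), and it permanently deletes the saved VM state, so use it with care.

```shell
# Fully reset the omnibox VM.
# Argument: path to the omnibox directory (e.g. OmniParser/omnitool/omnibox).
full_reset_vm() {
  local omnibox_dir="$1"
  # Refuse to continue if the path does not look like the omnibox directory.
  if [ ! -x "$omnibox_dir/scripts/manage_vm.sh" ]; then
    echo "manage_vm.sh not found under $omnibox_dir/scripts" >&2
    return 1
  fi
  (
    # Run in a subshell so the caller's working directory is left unchanged.
    cd "$omnibox_dir/scripts" || exit 1
    ./manage_vm.sh delete || exit 1   # step 1: delete the VM
    rm -rf ../vm/win11storage         # step 2: remove the saved VM state
    ./manage_vm.sh create             # step 3: build a fresh VM
  )
}
```

Calling `full_reset_vm OmniParser/omnitool/omnibox` from the directory that contains `OmniParser` runs the steps in order and stops before removing the storage folder if the delete step fails.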
## Risks and Mitigations
To align with the Microsoft AI principles and Responsible AI practices, we mitigate risk by training the icon caption model with Responsible AI data, which helps the model avoid, as much as possible, inferring sensitive attributes (e.g. race, religion, etc.) of individuals who happen to appear in icon images. At the same time, we encourage users to apply OmniParser only to screenshots that do not contain harmful or violent content. For OmniTool, we conduct threat model analysis using the Microsoft Threat Modeling Tool. We advise keeping a human in the loop in order to minimize risk.
@@ -102,3 +119,4 @@ To align with the Microsoft AI principles and Responsible AI practices, we condu
## Acknowledgment
Kudos to the amazing resources that are invaluable in the development of our code: [Claude Computer Use](https://github.com/anthropics/anthropic-quickstarts/blob/main/computer-use-demo/README.md), [OS World](https://github.com/xlang-ai/OSWorld), [Windows Agent Arena](https://github.com/microsoft/WindowsAgentArena), and [computer_use_ootb](https://github.com/showlab/computer_use_ootb).
We are grateful for the helpful suggestions and feedback provided by Francesco Bonacci, Jianwei Yang, Dillon DuPont, Yue Wu, and Anh Nguyen.
Many thanks to @keyserjaya for screenshots on omnibox install.