diff --git a/README.md b/README.md
index c839130..4a3faea 100644
--- a/README.md
+++ b/README.md
@@ -12,6 +12,7 @@
 **OmniParser** is a comprehensive method for parsing user interface screenshots into structured and easy-to-understand elements, which significantly enhances the ability of GPT-4V to generate actions that can be accurately grounded in the corresponding regions of the interface.
 
 ## News
+- [2024/10] OmniParser is the #1 trending model on huggingface model hub (starting 10/29/2024).
 - [2024/10] Feel free to checkout our demo on [huggingface space](https://huggingface.co/spaces/microsoft/OmniParser)! (stay tuned for OmniParser + Claude Computer Use)
 - [2024/10] Both Interactive Region Detection Model and Icon functional description model are released! [Hugginface models](https://huggingface.co/microsoft/OmniParser)
 - [2024/09] OmniParser achieves the best performance on [Windows Agent Arena](https://microsoft.github.io/WindowsAgentArena/)!
@@ -40,6 +41,8 @@ To run gradio demo, simply run:
 python gradio_demo.py
 ```
 
+## Model Weights License
+For the model checkpoints on huggingface model hub, please note that icon_detect model is under AGPL license since it is a license inherited from the original yolo model. And icon_caption_blip2 & icon_caption_florence is under MIT license. Please refer to the LICENSE file in the folder of each model: https://huggingface.co/microsoft/OmniParser.
 
 ## 📚 Citation
 Our technical report can be found [here](https://arxiv.org/abs/2408.00203).
diff --git a/gradio_demo.py b/gradio_demo.py
index d2df8e0..f420b7a 100644
--- a/gradio_demo.py
+++ b/gradio_demo.py
@@ -14,6 +14,8 @@ from PIL import Image
 
 yolo_model = get_yolo_model(model_path='weights/icon_detect/best.pt')
 caption_model_processor = get_caption_model_processor(model_name="florence2", model_name_or_path="weights/icon_caption_florence")
+# caption_model_processor = get_caption_model_processor(model_name="blip2", model_name_or_path="weights/icon_caption_blip2")
+
 platform = 'pc'
 if platform == 'pc':
     draw_bbox_config = {
diff --git a/imgs/saved_image_demo.png b/imgs/saved_image_demo.png
index 9e158d1..feaff69 100644
Binary files a/imgs/saved_image_demo.png and b/imgs/saved_image_demo.png differ