support local data logging
@@ -3,6 +3,7 @@
<p align="center">
<img src="imgs/logo.png" alt="Logo">
</p>
<!-- <a href="https://trendshift.io/repositories/12975" target="_blank"><img src="https://trendshift.io/api/badge/repositories/12975" alt="microsoft%2FOmniParser | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a> -->
[arXiv paper](https://arxiv.org/abs/2408.00203)
[License: MIT](https://opensource.org/licenses/MIT)
@@ -12,6 +13,7 @@
**OmniParser** is a comprehensive method for parsing user interface screenshots into structured and easy-to-understand elements, which significantly enhances the ability of GPT-4V to generate actions that can be accurately grounded in the corresponding regions of the interface.
## News
- [2025/3] We support local logging of trajectories so that you can use OmniParser+OmniTool to build a training data pipeline for your favorite agent in your domain. [Documentation WIP]
- [2025/3] We are gradually adding multi-agent orchestration and improving the user interface in OmniTool for a better experience.
- [2025/2] We release OmniParser V2 [checkpoints](https://huggingface.co/microsoft/OmniParser-v2.0). [Watch Video](https://1drv.ms/v/c/650b027c18d5a573/EWXbVESKWo9Buu6OYCwg06wBeoM97C6EOTG6RjvWLEN1Qg?e=alnHGC)
- [2025/2] We introduce OmniTool: Control a Windows 11 VM with OmniParser + your vision model of choice. OmniTool supports the following large language models out of the box: OpenAI (4o/o1/o3-mini), DeepSeek (R1), Qwen (2.5VL), or Anthropic Computer Use. [Watch Video](https://1drv.ms/v/c/650b027c18d5a573/EehZ7RzY69ZHn-MeQHrnnR4BCj3by-cLLpUVlxMjF4O65Q?e=8LxMgX)