agentsea/paligemma-3b-ft-widgetcap-waveui-448

@@ -1,53 +1,58 @@
 ---
-library_name: transformers
-datasets:
-- agentsea/wave-ui-25k
-language:
-- en
 ---
-# Paligemma WaveUI
-Transformers [PaliGemma 3B 448-res weights](https://huggingface.co/google/paligemma-3b-pt-448), fine-tuned on the [WaveUI-25k](https://huggingface.co/datasets/agentsea/wave-ui-25k) dataset for object detection.
-## Model Details
-### Model Description
-This fine-tune was done on top of the [PaliGemma 448 Widgetcap](https://huggingface.co/google/paligemma-3b-ft-widgetcap-448) model, using the [WaveUI-25k](https://huggingface.co/datasets/agentsea/wave-ui-25k) dataset, which contains 25k examples of labeled UI elements.
-The fine-tune targeted the object detection task. Specifically, this model aims to perform well at UI element detection, as part of a wider effort to enable our open-source toolkit for building agents at [AgentSea](https://www.agentsea.ai/). However, this release is mainly intended as a proof of concept, and more details on this larger effort will be shared soon.
-- **Developed by:** https://agentsea.ai/
-- **Language(s) (NLP):** en
-- **Finetuned from model:** https://huggingface.co/google/paligemma-3b-ft-widgetcap-448
-### Demo
-You can find a **demo** for this model [here](https://huggingface.co/spaces/agentsea/paligemma-waveui).
-## Notes
-- This model was trained only on a subset of the entire WaveUI dataset. We will release a version using the full dataset soon.
-- Object detection was the only task used in the fine-tune, so the model might not perform well on other types of tasks.
-## Usage
-To start using this model, run the following:
-```python
-from transformers import AutoProcessor, PaliGemmaForConditionalGeneration
-model = PaliGemmaForConditionalGeneration.from_pretrained("agentsea/paligemma-3b-ft-widgetcap-waveui-448").eval()
-processor = AutoProcessor.from_pretrained("google/paligemma-3b-pt-448")
-```
-## Data
-We used the [WaveUI-25k](https://huggingface.co/datasets/agentsea/wave-ui-25k) dataset for this fine-tune. Before using it, we preprocessed the data into the PaliGemma bounding-box format and filtered out non-English examples.
-## Evaluation
-We will release a full evaluation report along with the full WaveUI dataset. Stay tuned! :)

 ---
+base_model: google/paligemma-3b-ft-widgetcap-448
+library_name: peft
+license: gemma
+tags:
+- generated_from_trainer
+model-index:
+- name: paligemma-3b-ft-widgetcap-waveui-448
+  results: []
 ---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/kentauros/paligemma-waveui/runs/hfa841vp)
+# paligemma-3b-ft-widgetcap-waveui-448
+This model is a fine-tuned version of [google/paligemma-3b-ft-widgetcap-448](https://huggingface.co/google/paligemma-3b-ft-widgetcap-448) on an unknown dataset.
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.0001
+- train_batch_size: 4
+- eval_batch_size: 8
+- seed: 42
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 16
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 2
+- num_epochs: 3
+### Training results
+### Framework versions
+- PEFT 0.11.1
+- Transformers 4.43.2
+- PyTorch 2.4.0+cu121
+- Datasets 2.20.0
+- Tokenizers 0.19.1
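
The hyperparameter list above implies two pieces of arithmetic that are easy to check by hand: the effective batch size and the shape of the linear learning-rate schedule. The following is a minimal sketch in plain Python, not the training code used for this model. The function names, the single-device assumption (inferred from 4 × 4 = 16), and the `total_steps` value in the usage note are hypothetical; the card does not state the number of optimizer steps.

```python
# Values copied from the "Training hyperparameters" list in the card.
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 4
GRADIENT_ACCUMULATION_STEPS = 4
WARMUP_STEPS = 2


def effective_batch_size(per_device, accumulation_steps, num_devices=1):
    """Samples contributing to one optimizer step.

    num_devices=1 is an assumption; it is the value that makes
    4 * 4 match the reported total_train_batch_size of 16.
    """
    return per_device * accumulation_steps * num_devices


def linear_lr(step, total_steps, base_lr=LEARNING_RATE, warmup_steps=WARMUP_STEPS):
    """Linear warmup from 0 to base_lr, then linear decay to 0.

    total_steps is hypothetical here: the card does not report it.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

For example, `effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS)` reproduces the reported total of 16, and with an illustrative `total_steps=100` the rate rises over the first two steps, peaks at step 2, and falls to zero by step 100.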