[[Paper]](https://arxiv.org/abs/2407.17331) [[GitHub]](https://github.com/deepglint/unicom)
## Usage
### A. Installation
```bash
git clone https://github.com/deepglint/unicom
cd unicom

# Upgrade pip and install the necessary dependencies
pip install --upgrade pip
pip install -e ".[train]"
```
### B. Inference
```bash
CUDA_VISIBLE_DEVICES=0 python infer.py --model_dir /path/to/your/model

# example:
# >> Enter 'exit' to end the conversation, 'reset' to clear the chat history.
# >> Enter image file paths (comma-separated): ./asserts/logo.png
# >> User: <image>What kind of animal is it in this picture?
# >> Assistant: The image features a stylized representation of a cat, characterized by its vibrant and abstract depiction.
# >> User: What color is this cat?
# >> Assistant: The cat in the image is primarily white with blue, orange and pink accents, creating a visually appealing and unique appearance.
```
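The interactive loop above takes a comma-separated list of image paths, then chat turns in which each attached image is referenced with an `<image>` token. A minimal sketch of that input handling, assuming nothing about the real `infer.py` internals (the helpers `parse_image_paths` and `build_turn` are illustrative names, not part of the repository):

```python
# Illustrative sketch of the prompt protocol shown in the transcript above.
# parse_image_paths and build_turn are hypothetical helpers, not infer.py APIs.

def parse_image_paths(line: str) -> list[str]:
    """Split the comma-separated image paths entered at the prompt."""
    return [p.strip() for p in line.split(",") if p.strip()]

def build_turn(user_text: str, num_images: int) -> str:
    """Prepend one <image> token per attached image if the user omitted them."""
    missing = num_images - user_text.count("<image>")
    return "<image>" * max(missing, 0) + user_text

paths = parse_image_paths("./asserts/logo.png")
prompt = build_turn("What kind of animal is it in this picture?", len(paths))
print(paths)   # ['./asserts/logo.png']
print(prompt)  # <image>What kind of animal is it in this picture?
```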
### C. Evaluation for Embodied Ability
#### Step 1
Download the raw data following [OpenEQA](https://github.com/facebookresearch/open-eqa/tree/main/data) and [RoboVQA](https://console.cloud.google.com/storage/browser/gdm-robovqa) (val split).
#### Step 2
Convert the raw data into the format required for model evaluation.
```bash
# Convert the OpenEQA benchmark. Note: replace the paths with your own.
python llava/benchmark/make_openeqa_bmk.py

# Convert the RoboVQA benchmark. Note: replace the paths with your own.
python llava/benchmark/make_robovqa_bmk.py
```
#### Step 3
Make sure your top-level directory structure looks like this:
```
|--/path/to/your/benchmarks
|  |--OpenEQA
|  |  |--openeqa_scannet.parquet
|  |  |--openeqa_hm3d.parquet
|  |--RoboVQA
|  |  |--robovqa.parquet
|--/path/to/your/images
|  |--openeqa_val
|  |  |--scannet-v0
|  |  |  |--002-scannet-scene0709_00
|  |  |  |--xxx-scannet-scenexxxx_xx
|  |  |--hm3d-v0
|  |  |  |--000-hm3d-BFRyYbPCCPE
|  |  |  |--xxx-hm3d-xxxxxxxxxxx
|  |--robovqa_val
|  |  |--robovqa_221911
|  |  |--robovqa_xxxxxx
```
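A small stdlib check for this layout can catch path mistakes before a long evaluation run. This is a sketch, assuming the benchmark files listed in the tree above; the root path is the same placeholder used there:

```python
# Quick check that the expected benchmark layout from Step 3 is in place.
from pathlib import Path

# Files taken from the directory tree above (relative to the benchmarks root).
EXPECTED = [
    "OpenEQA/openeqa_scannet.parquet",
    "OpenEQA/openeqa_hm3d.parquet",
    "RoboVQA/robovqa.parquet",
]

def missing_files(bmk_root: str) -> list[str]:
    """Return the expected benchmark files that are absent under bmk_root."""
    root = Path(bmk_root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]

# Replace the placeholder with your own path; an empty list means all present.
print(missing_files("/path/to/your/benchmarks"))
```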
#### Step 4
Run the evaluation script:
```bash
# Note: replace 'YOUR_API_KEY', 'YOUR_ENDPOINT', 'bmk_root', and 'image_folder' with your own.
bash scripts/eval/eval_robo.sh /path/to/your/model
```
### D. Evaluation for General Ability
Install the evaluation tool and execute the evaluation script:
```bash
pip install lmms-eval==0.2.0
bash eval.sh
```
## Embodied Ability Evaluation: Performance in RoboVQA and OpenEQA