zuxin-llm commited on
Commit
11838e7
1 Parent(s): 0243b19

Upload 5 files

Browse files
README.md CHANGED
@@ -26,9 +26,10 @@ tags:
26
  <img width="500px" alt="xLAM" src="https://huggingface.co/datasets/jianguozhang/logos/resolve/main/xlam-no-background.png">
27
  </p>
28
  <p align="center">
29
- <a href="">[Homepage]</a> |
30
- <a href="">[Paper]</a> |
31
- <a href="https://github.com/SalesforceAIResearch/xLAM">[Github]</a>
 
32
  <a href="https://huggingface.co/spaces/Tonic/Salesforce-Xlam-7b-r">[Community Demo]</a>
33
  </p>
34
  <hr>
@@ -419,12 +420,55 @@ Output:
419
  {"thought": "", "tool_calls": [{"name": "get_earthquake_info", "arguments": {"location": "California"}}]}
420
  ````
421
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
422
 
423
  ## License
424
  The model is distributed under the CC-BY-NC-4.0 license.
425
 
426
- <!-- ## Citation
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
427
 
428
- If you find this repo helpful, please cite our paper:
429
  ```bibtex
430
- ``` -->
 
 
 
 
 
 
 
26
  <img width="500px" alt="xLAM" src="https://huggingface.co/datasets/jianguozhang/logos/resolve/main/xlam-no-background.png">
27
  </p>
28
  <p align="center">
29
+ <a href="https://www.salesforceairesearch.com/projects/xlam-large-action-models">[Homepage]</a> |
30
+ <a href="https://arxiv.org/abs/2409.03215">[Paper]</a> |
31
+ <a href="https://github.com/SalesforceAIResearch/xLAM">[Github]</a> |
32
+ <a href="https://blog.salesforceairesearch.com/large-action-model-ai-agent/">[Blog]</a> |
33
  <a href="https://huggingface.co/spaces/Tonic/Salesforce-Xlam-7b-r">[Community Demo]</a>
34
  </p>
35
  <hr>
 
420
  {"thought": "", "tool_calls": [{"name": "get_earthquake_info", "arguments": {"location": "California"}}]}
421
  ````
422
 
423
+ ## Benchmark Results
424
+ Note: **Bold** and <u>Underline</u> results denote the best result and the second best result for Success Rate, respectively.
425
+
426
+ ### Berkeley Function-Calling Leaderboard (BFCL)
427
+ ![xlam-bfcl](media/xlam-bfcl.png)
428
+ *Table 1: Performance comparison on BFCL-v2 leaderboard (cutoff date 09/03/2024). The rank is based on the overall accuracy, which is a weighted average of different evaluation categories. "FC" stands for function-calling mode in contrast to using a customized "prompt" to extract the function calls.*
429
+
430
+ ### Webshop and ToolQuery
431
+ ![xlam-webshop_toolquery](media/xlam-webshop_toolquery.png)
432
+ *Table 2: Testing results on Webshop and ToolQuery. Bold and Underline results denote the best result and the second best result for Success Rate, respectively.*
433
+
434
+ ### Unified ToolQuery
435
+ ![xlam-unified_toolquery](media/xlam-unified_toolquery.png)
436
+ *Table 3: Testing results on ToolQuery-Unified. Bold and Underline results denote the best result and the second best result for Success Rate, respectively. Values in brackets indicate corresponding performance on ToolQuery*
437
+
438
+ ### ToolBench
439
+ ![xlam-toolbench](media/xlam-toolbench.png)
440
+ *Table 4: Pass Rate on ToolBench on three distinct scenarios. Bold and Underline results denote the best result and the second best result for each setting, respectively. The results for xLAM-8x22b-r are unavailable due to the ToolBench server being down between 07/28/2024 and our evaluation cutoff date 09/03/2024.*
441
 
442
  ## License
443
  The model is distributed under the CC-BY-NC-4.0 license.
444
 
445
+ ## Citation
446
+
447
+ If you find this repo helpful, please consider to cite our papers:
448
+
449
+ ```bibtex
450
+ @article{zhang2024xlam,
451
+ title={xLAM: A Family of Large Action Models to Empower AI Agent Systems},
452
+ author={Zhang, Jianguo and Lan, Tian and Zhu, Ming and Liu, Zuxin and Hoang, Thai and Kokane, Shirley and Yao, Weiran and Tan, Juntao and Prabhakar, Akshara and Chen, Haolin and others},
453
+ journal={arXiv preprint arXiv:2409.03215},
454
+ year={2024}
455
+ }
456
+ ```
457
+
458
+ ```bibtex
459
+ @article{liu2024apigen,
460
+ title={Apigen: Automated pipeline for generating verifiable and diverse function-calling datasets},
461
+ author={Liu, Zuxin and Hoang, Thai and Zhang, Jianguo and Zhu, Ming and Lan, Tian and Kokane, Shirley and Tan, Juntao and Yao, Weiran and Liu, Zhiwei and Feng, Yihao and others},
462
+ journal={arXiv preprint arXiv:2406.18518},
463
+ year={2024}
464
+ }
465
+ ```
466
 
 
467
  ```bibtex
468
+ @article{zhang2024agentohana,
469
+ title={AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning},
470
+ author={Zhang, Jianguo and Lan, Tian and Murthy, Rithesh and Liu, Zhiwei and Yao, Weiran and Tan, Juntao and Hoang, Thai and Yang, Liangwei and Feng, Yihao and Liu, Zuxin and others},
471
+ journal={arXiv preprint arXiv:2402.15506},
472
+ year={2024}
473
+ }
474
+ ```
media/xlam-bfcl.png ADDED
media/xlam-toolbench.png ADDED
media/xlam-unified_toolquery.png ADDED
media/xlam-webshop_toolquery.png ADDED