# This .py file stores constants for the leaderboard.
MODEL_INFO = ['Models', 'Ver.', 'Abilities']
TASK_INFO = ['Resolution', 'FPS', 'Open Source', 'Length', 'Speed', 'Motion', 'Camera', 'Final Sum Score', 'Motion Quality', 'Text-Video Alignment', 'Visual Quality', 'Temporal Consistency']
TASK_INFO_v2 = ['Final Sum Score', 'Motion Quality', 'Text-Video Alignment', 'Visual Quality', 'Temporal Consistency', 'Resolution', 'FPS', 'Open Source', 'Length', 'Speed', 'Motion', 'Camera']
AVG_INFO = ['Final Sum Score', 'Motion Quality', 'Text-Video Alignment', 'Visual Quality', 'Temporal Consistency']
DATA_TITILE_TYPE = ["markdown", "number", "number", "number", "number", "number", "number", "number", "number", "number", "number", "number", "number", "number", "number"]
CSV_DIR = "./file/result.csv"
# COLUMN_NAMES = MODEL_INFO + TASK_INFO
COLUMN_NAMES = MODEL_INFO + TASK_INFO_v2
DATA_NUM = [3158, 1831, 4649, 978, 2447, 657, 97, 331, 85, 1740, 2077, 1192]
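
# --- Illustrative usage sketch (not part of the original file) ---
# A minimal example, under assumptions, of how the constants above might be
# consumed by the leaderboard app: it assumes the CSV at CSV_DIR has one row per
# model and one column per entry in COLUMN_NAMES. The helper name
# `load_leaderboard_df` and the sort order are hypothetical.
def load_leaderboard_df(csv_path: str = CSV_DIR):
    """Read the result CSV and return rows ordered by 'Final Sum Score'."""
    import pandas as pd  # assumed dependency of the leaderboard app

    df = pd.read_csv(csv_path)
    # Keep only the columns the table displays, in display order.
    df = df[[c for c in COLUMN_NAMES if c in df.columns]]
    # Higher aggregate score first.
    return df.sort_values(by="Final Sum Score", ascending=False)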
LEADERBORAD_INTRODUCTION = """# EvalCrafter Leaderboard 🏆
Welcome to the cutting-edge leaderboard for text-to-video generation, where we meticulously evaluate state-of-the-art generative models using our comprehensive framework, ensuring high-quality results that align with user opinions. Join us in this exciting journey towards excellence! 🛫
More methods will be evalcrafted soon, stay tuned ❤️ Join our evaluation by sending an email 📧 ([email protected])! You may also read the [EvalCrafter paper](https://arxiv.org/abs/2310.11440) for more detailed information 🤗
"""
TABLE_INTRODUCTION = """The table below summarizes the performance of every model on each evaluation dimension. """
LEADERBORAD_INFO = """
Vision and language generative models have grown rapidly in recent years. For video generation,
various open-source models and publicly available services can now produce videos of high visual quality.
However, these methods are often evaluated with only a few academic metrics, e.g., FVD or IS. We argue that
such simple metrics are insufficient for judging large conditional generative models, since these models are often trained
on very large datasets and exhibit multi-aspect abilities. Thus, we propose a new framework and pipeline to exhaustively evaluate
the performance of the generated videos. To achieve this, we first construct a new prompt list for text-to-video generation
by analyzing real-world prompts with the help of a large language model. Then, we evaluate the state-of-the-art video
generative models on our carefully designed benchmarks, in terms of visual quality, content quality, motion quality, and
text-video alignment, with around 18 objective metrics. To obtain the final leaderboard, we also fit a series of
coefficients that align the objective metrics with users' opinions. Based on the proposed opinion-alignment method, our final score
shows a higher correlation with human judgment than simply averaging the metrics, demonstrating the effectiveness of the proposed evaluation method.
"""
CITATION_BUTTON_LABEL = "Copy the following snippet to cite these results"
CITATION_BUTTON_TEXT = r"""@inproceedings{Liu2023EvalCrafterBA,
  title={EvalCrafter: Benchmarking and Evaluating Large Video Generation Models},
  author={Yaofang Liu and Xiaodong Cun and Xuebo Liu and Xintao Wang and Yong Zhang and Haoxin Chen and Yang Liu and Tieyong Zeng and Raymond Chan and Ying Shan},
  year={2023},
  url={https://api.semanticscholar.org/CorpusID:264172222}
}""" | |