# gauntlet results

These are this model's outputs on my "summarization gauntlet". You can find more info about it [in my Dropbox folder](https://www.dropbox.com/sh/axu1xlscrrexy55/AADAm01-4Zs3POyHQrgbDAsda?dl=0) or in [this dataset](https://huggingface.co/datasets/pszemraj/summcomparer-gauntlet-v0p1).

- If you aren't familiar with it, note that some of the docs are **purposefully** "messy" (spelling errors, etc.).

Parameters:

```json
{
    "model_name_or_path": "pszemraj/long-t5-tglobal-base-synthsumm_direct",
    "use_cuda": true,
    "token_batch_length": 16384,
    "batch_stride": 16,
    "max_length_ratio": 0.25,
    "load_in_8bit": false,
    "compile_model": true,
    "optimum_onnx": false,
    "device": "cuda",
    "inference_params": {
        "min_length": 8,
        "max_length": 4096,
        "no_repeat_ngram_size": 3,
        "encoder_no_repeat_ngram_size": 4,
        "repetition_penalty": 2.5,
        "num_beams": 10,
        "num_beam_groups": 1,
        "length_penalty": 1.0,
        "early_stopping": true,
        "do_sample": false
    },
    "textsum_version": "0.2.0"
}
```
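To give a feel for what `token_batch_length` and `batch_stride` could mean for a long input, here is a minimal sketch. It assumes (this file does not confirm it) that the input is split into windows of at most `token_batch_length` tokens and that consecutive windows overlap by `batch_stride` tokens. The helper `window_spans` and the 40,000-token document length are hypothetical and only for illustration; this is not the summarization library's actual code.

```python
import json

# Subset of the parameters block above.
params = json.loads("""
{
  "token_batch_length": 16384,
  "batch_stride": 16
}
""")


def window_spans(n_tokens, window, overlap):
    """Return (start, end) token spans of at most `window` tokens.

    Assumption: consecutive windows share `overlap` tokens, so each new
    window starts `window - overlap` tokens after the previous one.
    """
    if not 0 <= overlap < window:
        raise ValueError("overlap must be non-negative and smaller than window")
    spans = []
    start = 0
    while True:
        end = min(start + window, n_tokens)
        spans.append((start, end))
        if end >= n_tokens:
            break
        start += window - overlap
    return spans


# Hypothetical document of 40,000 tokens.
spans = window_spans(40_000, params["token_batch_length"], params["batch_stride"])
for i, (start, end) in enumerate(spans):
    print(f"batch {i}: tokens [{start}, {end})")
```

Under this assumption, a document longer than one window yields several batches, which would be consistent with the entries below that list more than one summary paragraph and more than one score.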
- Created: `2023-11-28T18:50:46.039204`

## ASR-whisper-rpunctuated_Noam Chomsky, Fundam_1669853561_0_part1_summary

The speaker discusses the foundational issues in studying language and emphasizes the importance of focusing on computational operations that meet the conditions for genuine explanation. They also discuss the development of neural nets and the impact of resource restriction on computation efficiency.

---

Section Scores for ASR-whisper-rpunctuated_Noam Chomsky, Fundam_1669853561_0_part1_summary:

- -0.7767

---
## ASR-whisper-rpunctuated_Noam Chomsky, Fundam_1669853631_0_part2_summary

The speaker discusses the concept of merge, discussing its limitations and potential solutions. They emphasize the need for a comprehensive explanation of merge in terms of computational procedures and general conditions. They also discuss the challenges of dealing with unbounded or unstructured coordinates, as well as the implications of merging noun phrases into verb phrases.
The speaker discusses the use of Pair merge structures to unify adjunct island and coordination island problems, as well as paramerge and head movement. They also discuss the limitations of traditional adjunct operations and the need for a more principled approach in explaining these problems.

---

Section Scores for ASR-whisper-rpunctuated_Noam Chomsky, Fundam_1669853631_0_part2_summary:

- -1.0401

- -0.72

---
## ASRnlp_law_lecture_week_1_v_2_c_transcription_1_summary

The speaker is teaching a natural Language processing course at the University of Maryland, covering topics such as text documents, machine analysis, and legal applications. They emphasize the importance of understanding the social forces behind these documents and provide resources for self-motivated learning.

---

Section Scores for ASRnlp_law_lecture_week_1_v_2_c_transcription_1_summary:

- -0.8276

---
## ASRnlp_law_lecture_week_2_v_2_c_transcription_2_summary

The speaker discusses the start of room for new students to join in the computer science class, including questions about copy paste, homework emissions, and a final assignment. They also emphasize the importance of preprocessing documents and provide examples of projects that have been successful.

---

Section Scores for ASRnlp_law_lecture_week_2_v_2_c_transcription_2_summary:

- -0.7777

---
## ASRnlp_law_lecture_week_3_part_1_v_2_c_transcription_3_summary

The speaker discusses the use of phrase representations in documents, emphasizing the importance of using positive mutual information to identify distinctive phrases. They also discuss unsupervised learning methods, topic models, and clustering for dimension reduction.

---

Section Scores for ASRnlp_law_lecture_week_3_part_1_v_2_c_transcription_3_summary:

- -0.6562

---
## Emie_dissertation_cleansed_summary

The dissertation examines the movement of American and British film noir "Act of Violence" and "The Man Between." It explores the tension between individual and material reality in these films, emphasizing the importance of renegotiating identity through movement. It also discusses the role of the camera in capturing urban space and its impact on the characters' struggles to reconcile their identities.
The film "The Man Between" explores the relationship between characters in Berlin, focusing on their disengagement from material reality and its impact on their character development. It emphasizes the importance of renegotiating identity through movement and speed, emphasizing the need to embrace the painful materiality of urban space.

---

Section Scores for Emie_dissertation_cleansed_summary:

- -0.7842

- -0.8138

---
## OCR_ML4HLecture02image__summary

The lecture provides a comprehensive overview of machine learning for medical image analysis, covering topics such as image classification, segmentation, superpixels, Markov random fields, and convolutional networks. It also discusses the use of datasets large than previous studies to improve clinical decision support, and proposes a method to integrate it into the clinical workflow.

---

Section Scores for OCR_ML4HLecture02image__summary:

- -0.6902

---
## OCR_ML4HLecture04RepresentationLearning.pptx__summary

The lecture discusses machine learning for health care, covering topics such as computational patient representations, unsupervised time series representation learning, transformers ICU benchmarks, generative models, SOTA machine learning approaches, contrastive learning, and neighborhood contrastive loss. It concludes with a summary and take home messages.

---

Section Scores for OCR_ML4HLecture04RepresentationLearning.pptx__summary:

- -0.4887

---
## OCR_ML4HLecture05-NLP.pptx__summary

The lecture explores the use of natural language processing in health care, covering topics such as text features, bag of words, term frequency, latent representation, and speech tagging. It emphasizes the importance of preprocessing, normalization, and stop-word removal, as well as the usefulness of clinical texts for precision medicine. The presentation also discusses the concept of distributed representations of words and phrases and their compositionality, emphasizing the need for efficient estimation of word representations in vector space.

---

Section Scores for OCR_ML4HLecture05-NLP.pptx__summary:

- -0.6959

---
## OCR_PAPER_Hong et al. - 2022 - CogVideo Large-scale Pretraining for Text-to-Video Generation via Transformers-annotated__summary

The paper explores the use of large-scale PRETRAKED transformers for text and image generation, focusing on their ability to understand complex motion semantics. It introduces a multi-frame rate hierarchically training strategy to align text and videos clips, and proposes an efficient method Dual-Channel Attention to inherit knowledge from Pretrained Text-Image Models for Video Generation. The paper concludes with a summary of the results and acknowledges funding for the project.

---

Section Scores for OCR_PAPER_Hong et al. - 2022 - CogVideo Large-scale Pretraining for Text-to-Video Generation via Transformers-annotated__summary:

- -0.9629

---
## OCR_PAPER_Kandpal, Nieto, Jin - 2022 - Music Enhancement via Image Translation and Vocoding-annotated__summary

The paper explores the development of a solution for music enhancement using image translation and vocoder, focusing on low-quality audio recordings. It employs conditional Image Synthesis and Vocoding to improve the quality of these recordings, and compares the subjective listener scores with popular audio quality metrics. The study also evaluates the effectiveness of objective metrics in evaluating algorithms and suggests that this approach may be more performant than current objective metrics.

---

Section Scores for OCR_PAPER_Kandpal, Nieto, Jin - 2022 - Music Enhancement via Image Translation and Vocoding-annotated__summary:

- -0.9274

---
## OCR_PAPER_dall-e-2-annotated__summary

The paper explores the use of Contrastive Models like CLIP for image generation, using diffusion models and prior models to improve image diversity. It compares these models with other systems and shows that they are computationally efficient and produce high-quality images.

---

Section Scores for OCR_PAPER_dall-e-2-annotated__summary:

- -0.7825

---
## The Most Dangerous Game--Richard Connell_summary

The text is a collection of conversations and events from Richard Connell's 1893-1949 expedition to the ship-trap island in the Caribbean Sea. It explores the concept of fear, its impact on hunter behavior, and the importance of being aware of danger.

---

Section Scores for The Most Dangerous Game--Richard Connell_summary:

- -0.9024

---
## gpt_peter_testing_group_exemplars_summary

The text is a collection of conversations and interactions between various characters, covering topics such as mental health, technology, and personal struggles.

---

Section Scores for gpt_peter_testing_group_exemplars_summary:

- -0.7619

---
## navy seals copy pasta_summary

The speaker is a former navy seals sniper who has been involved in secret raids against Al-Qaeda. They are being targeted by a storm that will wipe them out with precision.

---

Section Scores for navy seals copy pasta_summary:

- -0.8165

---
## script_findingnemo_summary

The text is a transcript of the film "Finding Nemo" by Walt Disney Pictures. It includes dialogue from various characters, including Marlin and Coral, as well as references to sea turtles and sharks.
"Finding Nemo" is a Disney film about two sea turtles who search the ocean for their missing son. They encounter sharks, jellyfish, and sea cucumbers, leading to a quest to find his son. The story ends with Dory's return home.

---

Section Scores for script_findingnemo_summary:

- -0.7096

- -0.8317

---
## script_frozendisney_summary

"Frozen" is a screenplay by Jennifer Lee about a young Sami girl named Elsa who creates magical snowflakes for her father, Prince Hans. The story follows Anna, a princess with powers, as she prepares for her wedding to Prince Hans at the castle.
The text is a collection of scenes from the Disney film "The Little Mermaid." It explores the relationships and conflicts between characters, including Anna's attempt to bring summer back to her sister, Kristoff in a snowstorm, and Prince Hans' decision to kill Princess Anna. It also touches on themes of love, betrayal, and fear.
The text is a collection of scenes from the Disney film "The Little Mermaid," including Kristoff's confrontation with Sven, Anna's struggle to save her sister, and the aftermath of a winter storm.

---

Section Scores for script_frozendisney_summary:

- -0.8344

- -0.9199

- -0.6909

---
## script_strangersonatrain_summary

"Strangers on a Train" is a 1950 film about Bruno Anthony and Guy Haines, who plan to murder his wife Miriam after her divorce. The film explores the complex relationships between the characters, including sexual tension, family dynamics, and personal struggles.
The text is a screenplay about Anne Burton and Guy Haines, who are involved in a murder mystery. They encounter a mysterious figure named Bruno, who threatens to kill them. The characters confront each other with suspicion and fear, leading to a dramatic confrontation.
The text is a dramatic story about Anne and Guy Haines, who play tennis at an amusement park after a murder. They are confronted by the police and find their lighter on the island, leading to a series of confrontations and emotional moments.

---

Section Scores for script_strangersonatrain_summary:

- -0.7167

- -0.9529

- -0.8935

---
## script_sunsetblvd._summary

"Sunset Boulevard" is a screenplay by Billy Wilder about writer Joe Gillis who loses his car in Los Angeles. The story follows Gillis as he attempts to write a script for the movie "Salome." The film ends with a dramatic scene from an old silent picture.
The text is a collection of dialogue from the screenplay "Untitled Love Story" by Joe Gillis about a young actress named Norma who falls in love with Artie Green. It explores the complex relationships and conflicts between the characters, covering topics such as personal struggles, family dynamics, and professional challenges.
The text is a dramatic scene from the movie "Norma" about a young woman named Norma dealing with personal and professional challenges. She receives a phone call from a man named Joe, who offers her a job in a Hollywood movie theatre. However, she refuses to accept it, leading to a series of confrontations and murders.

---

Section Scores for script_sunsetblvd._summary:

- -0.851

- -0.8202

- -0.8761

---
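The "Section Scores" lists above can be collected programmatically when comparing documents. The following is a rough sketch that assumes the layout used in this file (a `Section Scores for <name>:` line followed by `- <number>` items, closed by `---`). The helper name `parse_section_scores` is hypothetical, and this file does not state how the scores themselves are computed, so the mean is shown only as a convenience.

```python
import re
from statistics import mean

# Excerpt in the same layout as the entries in this file.
SAMPLE = """\
Section Scores for script_frozendisney_summary:

- -0.8344

- -0.9199

- -0.6909

---
"""


def parse_section_scores(markdown_text):
    """Map each document name to its list of section scores."""
    scores = {}
    current = None
    for line in markdown_text.splitlines():
        header = re.match(r"^Section Scores for (.+):\s*$", line)
        if header:
            current = header.group(1)
            scores[current] = []
            continue
        item = re.match(r"^- (-?\d+(?:\.\d+)?)\s*$", line)
        if item and current is not None:
            scores[current].append(float(item.group(1)))
        elif line.strip() == "---":
            # A horizontal rule closes the current score list.
            current = None
    return scores


parsed = parse_section_scores(SAMPLE)
for name, values in parsed.items():
    print(name, len(values), round(mean(values), 4))
```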