--- language: en datasets: - ccdv/pubmed-summarization license: apache-2.0 --- ## Introduction [Google's LongT5: Efficient Text-To-Text Transformer for Long Sequences](https://arxiv.org/pdf/2112.07916.pdf) introduced as an extension of a successful [T5 model](https://arxiv.org/pdf/1910.10683.pdf). This is an unofficial *longt5-large-16384-pubmed-3k_steps* checkpoint. I.e., this is a large configuration of the LongT5 model with a `transient-global` attention fine-tuned on [pubmed summarization dataset](https://huggingface.co/datasets/ccdv/pubmed-summarization) for 3,000 training steps. It may be worth continuing in the fine-tuning as we did not train the model until the convergence. ## Results and Fine-tuning Details The fine-tuned model achieves the following results on the evaluation set using `beam_search=3` and without any specific calibration of generation parameters are presented below, altogether with the results from the original paper (the original scores are higher, very likely due to a higher number of training steps). | Metric | Score | Score (original paper) | --- | --- | --- | | Rouge-1 | 47.44 | 49.98 | | Rouge-2 | 22.68 | 24.69 | | Rouge-L | 29.83 | x | | Rouge-Lsum | 43.13 | 46.46 | The training parameters follow the ones specified in the paper. We accumulated batch size to 128 examples and used `Adafactor` optimizer with a constant learning rate `0.001`. The full training hyper-parameters and logs can be found via the following [W&B run](https://wandb.ai/stancld/LongT5/runs/1lwncl8a?workspace=user-stancld). The model was trained using the [HuggingFace's trainer](https://github.com/huggingface/transformers/blob/main/src/transformers/trainer_seq2seq.py). The only specific adjustment, I made for the training, was dropping very short input articles (less than 16 words (a bit of mistake, should be less than 16 tokens)) as this sequences do not contribute to gradient creation in the *transient-global* attention, which resulted in training crashes when DDP used. ## Usage ```python LONG_ARTICLE = """"anxiety affects quality of life in those living with parkinson 's disease ( pd ) more so than overall cognitive status , motor deficits , apathy , and depression [ 13 ] . although anxiety and depression are often related and coexist in pd patients , recent research suggests that anxiety rather than depression is the most prominent and prevalent mood disorder in pd [ 5 , 6 ] . yet , our current understanding of anxiety and its impact on cognition in pd , as well as its neural basis and best treatment practices , remains meager and lags far behind that of depression . overall , neuropsychiatric symptoms in pd have been shown to be negatively associated with cognitive performance . for example , higher depression scores have been correlated with lower scores on the mini - mental state exam ( mmse ) [ 8 , 9 ] as well as tests of memory and executive functions ( e.g. , attention ) [ 1014 ] . likewise , apathy and anhedonia in pd patients have been associated with executive dysfunction [ 10 , 1523 ] . however , few studies have specifically investigated the relationship between anxiety and cognition in pd . one study showed a strong negative relationship between anxiety ( both state and trait ) and overall cognitive performance ( measured by the total of the repeatable battery for the assessment of neuropsychological status index ) within a sample of 27 pd patients . furthermore , trait anxiety was negatively associated with each of the cognitive domains assessed by the rbans ( i.e. , immediate memory , visuospatial construction , language , attention , and delayed memory ) . two further studies have examined whether anxiety differentially affects cognition in patients with left - sided dominant pd ( lpd ) versus right - sided dominant pd ( rpd ) ; however , their findings were inconsistent . the first study found that working memory performance was worse in lpd patients with anxiety compared to rpd patients with anxiety , whereas the second study reported that , in lpd , apathy but not anxiety was associated with performance on nonverbally mediated executive functions and visuospatial tasks ( e.g. , tmt - b , wms - iii spatial span ) , while in rpd , anxiety but not apathy significantly correlated with performance on verbally mediated tasks ( e.g. , clock reading test and boston naming test ) . furthermore , anxiety was significantly correlated with neuropsychological measures of attention and executive and visuospatial functions . taken together , it is evident that there are limited and inconsistent findings describing the relationship between anxiety and cognition in pd and more specifically how anxiety might influence particular domains of cognition such as attention and memory and executive functioning . it is also striking that , to date , no study has examined the influence of anxiety on cognition in pd by directly comparing groups of pd patients with and without anxiety while excluding depression . given that research on healthy young adults suggests that anxiety reduces processing capacity and impairs processing efficiency , especially in the central executive and attentional systems of working memory [ 26 , 27 ] , we hypothesized that pd patients with anxiety would show impairments in attentional set - shifting and working memory compared to pd patients without anxiety . furthermore , since previous work , albeit limited , has focused on the influence of symptom laterality on anxiety and cognition , we also explored this relationship . seventeen pd patients with anxiety and thirty - three pd patients without anxiety were included in this study ( see table 1 ) . the cross - sectional data from these participants was taken from a patient database that has been compiled over the past 8 years ( since 2008 ) at the parkinson 's disease research clinic at the brain and mind centre , university of sydney . inclusion criteria involved a diagnosis of idiopathic pd according to the united kingdom parkinson 's disease society brain bank criteria and were confirmed by a neurologist ( sjgl ) . patients also had to have an adequate proficiency in english and have completed a full neuropsychological assessment . ten patients in this study ( 5 pd with anxiety ; 5 pd without anxiety ) were taking psychotropic drugs ( i.e. , benzodiazepine or selective serotonin reuptake inhibitor ) . patients were also excluded if they had other neurological disorders , psychiatric disorders other than affective disorders ( such as anxiety ) , or if they reported a score greater than six on the depression subscale of the hospital anxiety and depression scale ( hads ) . thus , all participants who scored within a depressed ( hads - d > 6 ) range were excluded from this study , in attempt to examine a refined sample of pd patients with and without anxiety in order to determine the independent effect of anxiety on cognition . this research was approved by the human research ethics committee of the university of sydney , and written informed consent was obtained from all participants . self - reported hads was used to assess anxiety in pd and has been previously shown to be a useful measure of clinical anxiety in pd . a cut - off score of > 8 on the anxiety subscale of the hads ( hads - a ) was used to identify pd cases with anxiety ( pda+ ) , while a cut - off score of < 6 on the hads - a was used to identify pd cases without anxiety ( pda ) . this criterion was more stringent than usual ( > 7 cut - off score ) , in effort to create distinct patient groups . the neurological evaluation rated participants according to hoehn and yahr ( h&y ) stages and assessed their motor symptoms using part iii of the revised mds task force unified parkinson 's disease rating scale ( updrs ) . in a similar way this was determined by calculating a total left and right score from rigidity items 3035 , voluntary movement items 3643 , and tremor items 5057 from the mds - updrs part iii ( see table 1 ) . processing speed was assessed using the trail making test , part a ( tmt - a , z - score ) . attentional set - shifting was measured using the trail making test , part b ( tmt - b , z - score ) . working memory was assessed using the digit span forward and backward subtest of the wechsler memory scale - iii ( raw scores ) . language was assessed with semantic and phonemic verbal fluency via the controlled oral word associated test ( cowat animals and letters , z - score ) . the ability to retain learned verbal memory was assessed using the logical memory subtest from the wechsler memory scale - iii ( lm - i z - score , lm - ii z - score , % lm retention z - score ) . the mini - mental state examination ( mmse ) demographic , clinical , and neuropsychological variables were compared between the two groups with the independent t - test or mann whitney u test , depending on whether the variable met parametric assumptions . chi - square tests were used to examine gender and symptom laterality differences between groups . all analyses employed an alpha level of p < 0.05 and were two - tailed . spearman correlations were performed separately in each group to examine associations between anxiety and/or depression ratings and cognitive functions . as expected , the pda+ group reported significant greater levels of anxiety on the hads - a ( u = 0 , p < 0.001 ) and higher total score on the hads ( u = 1 , p < 0.001 ) compared to the pda group ( table 1 ) . groups were matched in age ( t(48 ) = 1.31 , p = 0.20 ) , disease duration ( u = 259 , p = 0.66 ) , updrs - iii score ( u = 250.5 , p = 0.65 ) , h&y ( u = 245 , p = 0.43 ) , ledd ( u = 159.5 , p = 0.80 ) , and depression ( hads - d ) ( u = 190.5 , p = 0.06 ) . additionally , all groups were matched in the distribution of gender ( = 0.098 , p = 0.75 ) and side - affected ( = 0.765 , p = 0.38 ) . there were no group differences for tmt - a performance ( u = 256 , p = 0.62 ) ( table 2 ) ; however , the pda+ group had worse performance on the trail making test part b ( t(46 ) = 2.03 , p = 0.048 ) compared to the pda group ( figure 1 ) . the pda+ group also demonstrated significantly worse performance on the digit span forward subtest ( t(48 ) = 2.22 , p = 0.031 ) and backward subtest ( u = 190.5 , p = 0.016 ) compared to the pda group ( figures 2(a ) and 2(b ) ) . neither semantic verbal fluency ( t(47 ) = 0.70 , p = 0.49 ) nor phonemic verbal fluency ( t(47 ) = 0.39 , p = 0.70 ) differed between groups . logical memory i immediate recall test ( u = 176 , p = 0.059 ) showed a trend that the pda+ group had worse new verbal learning and immediate recall abilities than the pda group . however , logical memory ii test performance ( u = 219 , p = 0.204 ) and logical memory % retention ( u = 242.5 , p = 0.434 ) did not differ between groups . there were also no differences between groups in global cognition ( mmse ) ( u = 222.5 , p = 0.23 ) . participants were split into lpd and rpd , and then further group differences were examined between pda+ and pda. importantly , the groups remained matched in age , disease duration , updrs - iii , dde , h&y stage , and depression but remained significantly different on self - reported anxiety . lpda+ demonstrated worse performance on the digit span forward test ( t(19 ) = 2.29 , p = 0.033 ) compared to lpda , whereas rpda+ demonstrated worse performance on the digit span backward test ( u = 36.5 , p = 0.006 ) , lm - i immediate recall ( u = 37.5 , p = 0.008 ) , and lm - ii ( u = 45.0 , p = 0.021 ) but not lm % retention ( u = 75.5 , p = 0.39 ) compared to rpda. this study is the first to directly compare cognition between pd patients with and without anxiety . the findings confirmed our hypothesis that anxiety negatively influences attentional set - shifting and working memory in pd . more specifically , we found that pd patients with anxiety were more impaired on the trail making test part b which assessed attentional set - shifting , on both digit span tests which assessed working memory and attention , and to a lesser extent on the logical memory test which assessed memory and new verbal learning compared to pd patients without anxiety . taken together , these findings suggest that anxiety in pd may reduce processing capacity and impair processing efficiency , especially in the central executive and attentional systems of working memory in a similar way as seen in young healthy adults [ 26 , 27 ] . although the neurobiology of anxiety in pd remains unknown , many researchers have postulated that anxiety disorders are related to neurochemical changes that occur during the early , premotor stages of pd - related degeneration [ 37 , 38 ] such as nigrostriatal dopamine depletion , as well as cell loss within serotonergic and noradrenergic brainstem nuclei ( i.e. , raphe nuclei and locus coeruleus , resp . , which provide massive inputs to corticolimbic regions ) . over time , chronic dysregulation of adrenocortical and catecholamine functions can lead to hippocampal damage as well as dysfunctional prefrontal neural circuitries [ 39 , 40 ] , which play a key role in memory and attention . recent functional neuroimaging work has suggested that enhanced hippocampal activation during executive functioning and working memory tasks may represent compensatory processes for impaired frontostriatal functions in pd patients compared to controls . therefore , chronic stress from anxiety , for example , may disrupt compensatory processes in pd patients and explain the cognitive impairments specifically in working memory and attention seen in pd patients with anxiety . it has also been suggested that hyperactivation within the putamen may reflect a compensatory striatal mechanism to maintain normal working memory performance in pd patients ; however , losing this compensatory activation has been shown to contribute to poor working memory performance . anxiety in mild pd has been linked to reduced putamen dopamine uptake which becomes more extensive as the disease progresses . this further supports the notion that anxiety may disrupt compensatory striatal mechanisms as well , providing another possible explanation for the cognitive impairments observed in pd patients with anxiety in this study . noradrenergic and serotonergic systems should also be considered when trying to explain the mechanisms by which anxiety may influence cognition in pd . although these neurotransmitter systems are relatively understudied in pd cognition , treating the noradrenergic and serotonergic systems has shown beneficial effects on cognition in pd . selective serotonin reuptake inhibitor , citalopram , was shown to improve response inhibition deficits in pd , while noradrenaline reuptake blocker , atomoxetine , has been recently reported to have promising effects on cognition in pd [ 45 , 46 ] . overall , very few neuroimaging studies have been conducted in pd in order to understand the neural correlates of pd anxiety and its underlying neural pathology . future research should focus on relating anatomical changes and neurochemical changes to neural activation in order to gain a clearer understanding on how these pathologies affect anxiety in pd . to further understand how anxiety and cognitive dysfunction are related , future research should focus on using advanced structural and function imaging techniques to explain both cognitive and neural breakdowns that are associated with anxiety in pd patients . research has indicated that those with amnestic mild cognitive impairment who have more neuropsychiatric symptoms have a greater risk of developing dementia compared to those with fewer neuropsychiatric symptoms . future studies should also examine whether treating neuropsychiatric symptoms might impact the progression of cognitive decline and improve cognitive impairments in pd patients . previous studies have used pd symptom laterality as a window to infer asymmetrical dysfunction of neural circuits . for example , lpd patients have greater inferred right hemisphere pathology , whereas rpd patients have greater inferred left hemisphere pathology . thus , cognitive domains predominantly subserved by the left hemisphere ( e.g. , verbally mediated tasks of executive function and verbal memory ) might be hypothesized to be more affected in rpd than lpd ; however , this remains controversial . it has also been suggested that since anxiety is a common feature of left hemisphere involvement [ 48 , 49 ] , cognitive domains subserved by the left hemisphere may also be more strongly related to anxiety . results from this study showed selective verbal memory deficits in rpd patients with anxiety compared to rpd without anxiety , whereas lpd patients with anxiety had greater attentional / working memory deficits compared to lpd without anxiety . although these results align with previous research , interpretations of these findings should be made with caution due to the small sample size in the lpd comparison specifically . recent work has suggested that the hads questionnaire may underestimate the burden of anxiety related symptomology and therefore be a less sensitive measure of anxiety in pd [ 30 , 50 ] . in addition , our small sample size also limited the statistical power for detecting significant findings . based on these limitations , our findings are likely conservative and underrepresent the true impact anxiety has on cognition in pd . additionally , the current study employed a very brief neuropsychological assessment including one or two tests for each cognitive domain . future studies are encouraged to collect a more complex and comprehensive battery from a larger sample of pd participants in order to better understand the role anxiety plays on cognition in pd . another limitation of this study was the absence of diagnostic interviews to characterize participants ' psychiatric symptoms and specify the type of anxiety disorders included in this study . future studies should perform diagnostic interviews with participants ( e.g. , using dsm - v criteria ) rather than relying on self - reported measures to group participants , in order to better understand whether the type of anxiety disorder ( e.g. , social anxiety , phobias , panic disorders , and generalized anxiety ) influences cognitive performance differently in pd . one advantage the hads questionnaire provided over other anxiety scales was that it assessed both anxiety and depression simultaneously and allowed us to control for coexisting depression . although there was a trend that the pda+ group self - reported higher levels of depression than the pda group , all participants included in the study scored < 6 on the depression subscale of the hads . controlling for depression while assessing anxiety has been identified as a key shortcoming in the majority of recent work . considering many previous studies have investigated the influence of depression on cognition in pd without accounting for the presence of anxiety and the inconsistent findings reported to date , we recommend that future research should try to disentangle the influence of anxiety versus depression on cognitive impairments in pd . considering the growing number of clinical trials for treating depression , there are few if any for the treatment of anxiety in pd . anxiety is a key contributor to decreased quality of life in pd and greatly requires better treatment options . moreover , anxiety has been suggested to play a key role in freezing of gait ( fog ) , which is also related to attentional set - shifting [ 52 , 53 ] . future research should examine the link between anxiety , set - shifting , and fog , in order to determine whether treating anxiety might be a potential therapy for improving fog .""" import torch from transformers import LongT5ForConditionalGeneration, AutoTokenizer tokenizer = LongT5ForConditionalGeneration.from_pretrained("Stancld/longt5-tglobal-large-16384-pubmed-3k_steps") input_ids = tokenizer(LONG_ARTICLE, return_tensors="pt").input_ids.to("cuda") model = LEDForConditionalGeneration.from_pretrained("Stancld/longt5-tglobal-large-16384-pubmed-3k_steps", return_dict_in_generate=True).to("cuda") sequences = model.generate(input_ids, global_attention_mask=global_attention_mask).sequences summary = tokenizer.batch_decode(sequences) ```