---
language:
- en
tags:
- t5
datasets:
- kmfoda/booksum
metrics:
- rouge
widget:
- text: "A large drop of sun lingered on the horizon and then dripped over and was gone, and the sky was brilliant over the spot where it had gone, and a torn cloud, like a bloody rag, hung over the spot of its going. And dusk crept over the sky from the eastern horizon, and darkness crept over the land from the east."
  example_title: "grapes of wrath"
- text: "The year was 2081, and everybody was finally equal. They weren’t only equal before God and the law. They were equal every which way. Nobody was smarter than anybody else. Nobody was better looking than anybody else. Nobody was stronger or quicker than anybody else. All this equality was due to the 211th, 212th, and 213th Amendments to the Constitution, and to the unceasing vigilance of agents of the United States Handicapper General."
  example_title: "Harrison Bergeron"
- text: "The ledge, where I placed my candle, had a few mildewed books piled up in one corner; and it was covered with writing scratched on the paint. This writing, however, was nothing but a name repeated in all kinds of characters, large and small—Catherine Earnshaw, here and there varied to Catherine Heathcliff, and then again to Catherine Linton. In vapid listlessness I leant my head against the window, and continued spelling over Catherine Earnshaw—Heathcliff—Linton, till my eyes closed; but they had not rested five minutes when a glare of white letters started from the dark, as vivid as spectres—the air swarmed with Catherines; and rousing myself to dispel the obtrusive name, I discovered my candle wick reclining on one of the antique volumes, and perfuming the place with an odour of roasted calf-skin."
  example_title: "Wuthering Heights"
inference:
  parameters:
    no_repeat_ngram_size: 2
    max_length: 32
    early_stopping: True
---

# literary analysis with t5-base

- t5 sort-of learning to do literary analysis.
It was trained on the booksum dataset with `chapter` (original text) as input and `summary_analysis` as the output text, where `summary_analysis` is the SparkNotes/CliffsNotes/etc. analysis
- It was trained for 8 epochs
- Testing may need to be completed in Colab as it seems to be CPU-intensive. A link to an example notebook is [here](https://colab.research.google.com/gist/pszemraj/fe495bc0225ef0c00c9f8445b64672a6/example-t5_1_1-base-writing-analysis.ipynb)

# Example

```
# the "!" lines are notebook shell commands (Colab/Jupyter)
!pip install -U -q transformers
!pip install -U -q sentencepiece

from transformers import pipeline

analyzer = pipeline("text2text-generation", "pszemraj/t5_1_1-base-writing-analysis")
```

- enter text and compute.

```
text = "text to be analyzed goes here"
result = analyzer(
    text,
    max_length=int(len(text) * 1.2),  # scales with the character count of the input
    no_repeat_ngram_size=2,
    repetition_penalty=2.4,
    num_beams=4,
    early_stopping=True,
)
```

# sample results

- inputs:

> The ledge, where I placed my candle, had a few mildewed books piled up in one corner; and it was covered with writing scratched on the paint. This writing, however, was nothing but a name repeated in all kinds of characters, large and small—Catherine Earnshaw, here and there varied to Catherine Heathcliff, and then again to Catherine Linton. In vapid listlessness I leant my head against the window, and continued spelling over Catherine Earnshaw—Heathcliff—Linton, till my eyes closed; but they had not rested five minutes when a glare of white letters started from the dark, as vivid as spectres—the air swarmed with Catherines; and rousing myself to dispel the obtrusive name, I discovered my candle wick reclining on one of the antique volumes, and perfuming the place with an odour of roasted calf-skin.

- output:

> In this chapter, Catherine Heathcliff and Linton are the only characters in the novel who have been able to escape from their lives. The candle wick is placed on one of the antique books that were used as a lamp for the first time.
It is also important to note that she has no idea what her name means. She does not know how to pronounce it, but she knows that there is something wrong with his name. He cannot understand why he should be called Catherine Earnshaw-Heathcliff; however, I do not want to make him feel comfortable. This is an example of remarkstrayeshadowed by the reader's own mind. As we learn more about the story, we realize that Catherine earnshaw is unable to find out of any kind of anything else. At the end of these chapters, at the beginning of Chapter 1, we see that they are all too much different from each other.
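
# reusing the generation settings

The pipeline call in the Example section returns a list with one dict per input, and a `text2text-generation` pipeline stores the generated string under the `generated_text` key. As a minimal sketch, the settings from that example could be wrapped so they are reused across inputs. The helper names `build_generation_kwargs` and `analyze` are hypothetical and not part of the model or of transformers. The `max_length` heuristic is copied from the example: it is derived from the character count, while `max_length` itself counts tokens, so it should be read as a loose upper bound only.

```python
def build_generation_kwargs(text: str) -> dict:
    """Return the generation settings used in the Example section."""
    return {
        # Heuristic from the example: based on characters, although
        # max_length is measured in tokens.
        "max_length": int(len(text) * 1.2),
        "no_repeat_ngram_size": 2,
        "repetition_penalty": 2.4,
        "num_beams": 4,
        "early_stopping": True,
    }


def analyze(text: str, analyzer=None) -> str:
    """Run the analyzer on `text` and return only the generated string.

    `analyzer` is any callable with the pipeline's call signature; if it
    is omitted, the checkpoint from the Example section is loaded.
    """
    if analyzer is None:
        # Imported lazily because loading transformers and the
        # checkpoint is slow.
        from transformers import pipeline

        analyzer = pipeline(
            "text2text-generation", "pszemraj/t5_1_1-base-writing-analysis"
        )
    result = analyzer(text, **build_generation_kwargs(text))
    # text2text-generation pipelines return [{"generated_text": "..."}]
    return result[0]["generated_text"]
```

Passing an already loaded pipeline as `analyzer` avoids reloading the checkpoint on every call, for example `analyze(text, analyzer=analyzer)` with the `analyzer` object created in the Example section.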