
# Translation[[translation]]

[[open-in-colab]]

๋ฒˆ์—ญ์€ ํ•œ ์–ธ์–ด๋กœ ๋œ ์‹œํ€€์Šค๋ฅผ ๋‹ค๋ฅธ ์–ธ์–ด๋กœ ๋ณ€ํ™˜ํ•ฉ๋‹ˆ๋‹ค. ๋ฒˆ์—ญ์ด๋‚˜ ์š”์•ฝ์€ ์ž…๋ ฅ์„ ๋ฐ›์•„ ์ผ๋ จ์˜ ์ถœ๋ ฅ์„ ๋ฐ˜ํ™˜ํ•˜๋Š” ๊ฐ•๋ ฅํ•œ ํ”„๋ ˆ์ž„์›Œํฌ์ธ ์‹œํ€€์Šค-ํˆฌ-์‹œํ€€์Šค ๋ฌธ์ œ๋กœ ๊ตฌ์„ฑํ•  ์ˆ˜ ์žˆ๋Š” ๋Œ€ํ‘œ์ ์ธ ํƒœ์Šคํฌ์ž…๋‹ˆ๋‹ค. ๋ฒˆ์—ญ ์‹œ์Šคํ…œ์€ ์ผ๋ฐ˜์ ์œผ๋กœ ๋‹ค๋ฅธ ์–ธ์–ด๋กœ ๋œ ํ…์ŠคํŠธ ๊ฐ„์˜ ๋ฒˆ์—ญ์— ์‚ฌ์šฉ๋˜์ง€๋งŒ, ์Œ์„ฑ ๊ฐ„์˜ ํ†ต์—ญ์ด๋‚˜ ํ…์ŠคํŠธ-์Œ์„ฑ ๋˜๋Š” ์Œ์„ฑ-ํ…์ŠคํŠธ์™€ ๊ฐ™์€ ์กฐํ•ฉ์—๋„ ์‚ฌ์šฉ๋  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

์ด ๊ฐ€์ด๋“œ์—์„œ ํ•™์Šตํ•  ๋‚ด์šฉ์€:

  1. ์˜์–ด ํ…์ŠคํŠธ๋ฅผ ํ”„๋ž‘์Šค์–ด๋กœ ๋ฒˆ์—ญํ•˜๊ธฐ ์œ„ํ•ด T5 ๋ชจ๋ธ์„ OPUS Books ๋ฐ์ดํ„ฐ์„ธํŠธ์˜ ์˜์–ด-ํ”„๋ž‘์Šค์–ด ํ•˜์œ„ ์ง‘ํ•ฉ์œผ๋กœ ํŒŒ์ธํŠœ๋‹ํ•˜๋Š” ๋ฐฉ๋ฒ•๊ณผ
  2. ํŒŒ์ธํŠœ๋‹๋œ ๋ชจ๋ธ์„ ์ถ”๋ก ์— ์‚ฌ์šฉํ•˜๋Š” ๋ฐฉ๋ฒ•์ž…๋‹ˆ๋‹ค.
์ด ํƒœ์Šคํฌ ๊ฐ€์ด๋“œ๋Š” ์•„๋ž˜ ๋ชจ๋ธ ์•„ํ‚คํ…์ฒ˜์—๋„ ์‘์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

BART, BigBird-Pegasus, Blenderbot, BlenderbotSmall, Encoder decoder, FairSeq Machine-Translation, GPTSAN-japanese, LED, LongT5, M2M100, Marian, mBART, MT5, MVP, NLLB, NLLB-MOE, Pegasus, PEGASUS-X, PLBart, ProphetNet, SwitchTransformers, T5, XLM-ProphetNet

Before you begin, make sure you have all the necessary libraries installed:

```bash
pip install transformers datasets evaluate sacrebleu
```

๋ชจ๋ธ์„ ์—…๋กœ๋“œํ•˜๊ณ  ์ปค๋ฎค๋‹ˆํ‹ฐ์™€ ๊ณต์œ ํ•  ์ˆ˜ ์žˆ๋„๋ก Hugging Face ๊ณ„์ •์— ๋กœ๊ทธ์ธํ•˜๋Š” ๊ฒƒ์ด ์ข‹์Šต๋‹ˆ๋‹ค. ์ƒˆ๋กœ์šด ์ฐฝ์ด ํ‘œ์‹œ๋˜๋ฉด ํ† ํฐ์„ ์ž…๋ ฅํ•˜์—ฌ ๋กœ๊ทธ์ธํ•˜์„ธ์š”.

```py
>>> from huggingface_hub import notebook_login

>>> notebook_login()
```
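
If you are working in a terminal rather than a notebook, you can log in with the Hugging Face CLI instead:

```bash
huggingface-cli login
```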

## Load OPUS Books dataset[[load-opus-books-dataset]]

Start by loading the English-French subset of the OPUS Books dataset from the 🤗 Datasets library:

```py
>>> from datasets import load_dataset

>>> books = load_dataset("opus_books", "en-fr")
```

๋ฐ์ดํ„ฐ์„ธํŠธ๋ฅผ [~datasets.Dataset.train_test_split] ๋ฉ”์„œ๋“œ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ํ›ˆ๋ จ ๋ฐ ํ…Œ์ŠคํŠธ ๋ฐ์ดํ„ฐ๋กœ ๋ถ„ํ• ํ•˜์„ธ์š”.

```py
>>> books = books["train"].train_test_split(test_size=0.2)
```

ํ›ˆ๋ จ ๋ฐ์ดํ„ฐ์—์„œ ์˜ˆ์‹œ๋ฅผ ์‚ดํŽด๋ณผ๊นŒ์š”?

```py
>>> books["train"][0]
{'id': '90560',
 'translation': {'en': 'But this lofty plateau measured only a few fathoms, and soon we reentered Our Element.',
  'fr': 'Mais ce plateau élevé ne mesurait que quelques toises, et bientôt nous fûmes rentrés dans notre élément.'}}
```

๋ฐ˜ํ™˜๋œ ๋”•์…”๋„ˆ๋ฆฌ์˜ translation ํ‚ค๊ฐ€ ํ…์ŠคํŠธ์˜ ์˜์–ด, ํ”„๋ž‘์Šค์–ด ๋ฒ„์ „์„ ํฌํ•จํ•˜๊ณ  ์žˆ๋Š” ๊ฒƒ์„ ๋ณผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

## Preprocess[[preprocess]]

๋‹ค์Œ ๋‹จ๊ณ„๋กœ ์˜์–ด-ํ”„๋ž‘์Šค์–ด ์Œ์„ ์ฒ˜๋ฆฌํ•˜๊ธฐ ์œ„ํ•ด T5 ํ† ํฌ๋‚˜์ด์ €๋ฅผ ๊ฐ€์ ธ์˜ค์„ธ์š”.

```py
>>> from transformers import AutoTokenizer

>>> checkpoint = "t5-small"
>>> tokenizer = AutoTokenizer.from_pretrained(checkpoint)
```

The preprocessing function you create needs to:

  1. Prefix the input with a prompt so T5 knows this is a translation task. Some models capable of multiple NLP tasks require prompting for specific tasks like this.
  2. Tokenize the source language (English) and the target language (French) separately, because you can't tokenize French text with a tokenizer pretrained on an English vocabulary.
  3. Truncate sequences to be no longer than the maximum length set by the `max_length` parameter.

```py
>>> source_lang = "en"
>>> target_lang = "fr"
>>> prefix = "translate English to French: "


>>> def preprocess_function(examples):
...     inputs = [prefix + example[source_lang] for example in examples["translation"]]
...     targets = [example[target_lang] for example in examples["translation"]]
...     model_inputs = tokenizer(inputs, text_target=targets, max_length=128, truncation=True)
...     return model_inputs
```

์ „์ฒด ๋ฐ์ดํ„ฐ์„ธํŠธ์— ์ „์ฒ˜๋ฆฌ ํ•จ์ˆ˜๋ฅผ ์ ์šฉํ•˜๋ ค๋ฉด ๐Ÿค— Datasets์˜ [~datasets.Dataset.map] ๋ฉ”์„œ๋“œ๋ฅผ ์‚ฌ์šฉํ•˜์„ธ์š”. map ํ•จ์ˆ˜์˜ ์†๋„๋ฅผ ๋†’์ด๋ ค๋ฉด batched=True๋ฅผ ์„ค์ •ํ•˜์—ฌ ๋ฐ์ดํ„ฐ์„ธํŠธ์˜ ์—ฌ๋Ÿฌ ์š”์†Œ๋ฅผ ํ•œ ๋ฒˆ์— ์ฒ˜๋ฆฌํ•˜๋Š” ๋ฐฉ๋ฒ•์ด ์žˆ์Šต๋‹ˆ๋‹ค.

```py
>>> tokenized_books = books.map(preprocess_function, batched=True)
```
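
Each tokenized example now carries the original columns alongside the new model inputs (`input_ids`, `attention_mask`, and `labels` holding the French token ids). A quick sanity check, assuming the preprocessing above:

```py
>>> sorted(tokenized_books["train"][0].keys())
['attention_mask', 'id', 'input_ids', 'labels', 'translation']
```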

Now create a batch of examples using [`DataCollatorForSeq2Seq`]. It's more efficient to dynamically pad the sentences to the longest length in a batch during collation, instead of padding the whole dataset to the maximum length.

```py
>>> from transformers import DataCollatorForSeq2Seq

>>> data_collator = DataCollatorForSeq2Seq(tokenizer=tokenizer, model=checkpoint)
```

In TensorFlow, pass `return_tensors="tf"` instead:

```py
>>> from transformers import DataCollatorForSeq2Seq

>>> data_collator = DataCollatorForSeq2Seq(tokenizer=tokenizer, model=checkpoint, return_tensors="tf")
```
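
To see the dynamic padding in action, here is a minimal sketch (PyTorch collator assumed) that batches two tokenized examples of different lengths. The collator pads `input_ids` to the longest sequence in this batch rather than in the whole dataset, and pads `labels` with `-100` so the padded positions are ignored by the loss:

```py
>>> # Keep only the tokenized fields; the collator can't pad the raw text columns.
>>> columns = ["input_ids", "attention_mask", "labels"]
>>> features = [{k: tokenized_books["train"][i][k] for k in columns} for i in range(2)]
>>> batch = data_collator(features)
>>> batch["input_ids"].shape  # padded to the longest sequence in this batch, not the dataset
```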

## Evaluate[[evaluate]]

Including a metric during training is often helpful for evaluating your model's performance. You can quickly load an evaluation method with the 🤗 Evaluate library. For this task, load the SacreBLEU metric (see the 🤗 Evaluate quick tour to learn more about how to load and compute a metric):

```py
>>> import evaluate

>>> metric = evaluate.load("sacrebleu")
```
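
SacreBLEU expects each prediction to be a plain string and each reference to be a *list* of strings, since a prediction may have multiple valid references; that is why the postprocessing function below wraps every label in a list. A quick sanity check of the format:

```py
>>> # An exact match scores 100.
>>> metric.compute(predictions=["hello there general kenobi"], references=[["hello there general kenobi"]])["score"]
100.0
```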

๊ทธ๋Ÿฐ ๋‹ค์Œ [~evaluate.EvaluationModule.compute]์— ์˜ˆ์ธก๊ฐ’๊ณผ ๋ ˆ์ด๋ธ”์„ ์ „๋‹ฌํ•˜์—ฌ SacreBLEU ์ ์ˆ˜๋ฅผ ๊ณ„์‚ฐํ•˜๋Š” ํ•จ์ˆ˜๋ฅผ ์ƒ์„ฑํ•˜์„ธ์š”:

```py
>>> import numpy as np


>>> def postprocess_text(preds, labels):
...     preds = [pred.strip() for pred in preds]
...     labels = [[label.strip()] for label in labels]

...     return preds, labels


>>> def compute_metrics(eval_preds):
...     preds, labels = eval_preds
...     if isinstance(preds, tuple):
...         preds = preds[0]
...     decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)

...     labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
...     decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

...     decoded_preds, decoded_labels = postprocess_text(decoded_preds, decoded_labels)

...     result = metric.compute(predictions=decoded_preds, references=decoded_labels)
...     result = {"bleu": result["score"]}

...     prediction_lens = [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in preds]
...     result["gen_len"] = np.mean(prediction_lens)
...     result = {k: round(v, 4) for k, v in result.items()}
...     return result
```

Your `compute_metrics` function is ready to go now; you'll come back to it when you set up your training.

## Train[[train]]

If you aren't familiar with finetuning a model with the [`Trainer`], take a look at the basic tutorial here first!

๋ชจ๋ธ์„ ํ›ˆ๋ จ์‹œํ‚ฌ ์ค€๋น„๊ฐ€ ๋˜์—ˆ๊ตฐ์š”! [AutoModelForSeq2SeqLM]์œผ๋กœ T5๋ฅผ ๋กœ๋“œํ•˜์„ธ์š”:

```py
>>> from transformers import AutoModelForSeq2SeqLM, Seq2SeqTrainingArguments, Seq2SeqTrainer

>>> model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)
```

At this point, only three steps remain:

  1. Define your training hyperparameters in [`Seq2SeqTrainingArguments`]. The only required parameter is `output_dir`, which specifies where to save your model. Set `push_to_hub=True` to push the model to the Hub (you need to be signed in to Hugging Face to upload your model). At the end of each epoch, the [`Trainer`] will evaluate the SacreBLEU metric and save the training checkpoint.
  2. Pass the training arguments to [`Seq2SeqTrainer`], along with the model, dataset, tokenizer, data collator, and `compute_metrics` function.
  3. Call [`~Trainer.train`] to finetune your model.

```py
>>> training_args = Seq2SeqTrainingArguments(
...     output_dir="my_awesome_opus_books_model",
...     evaluation_strategy="epoch",
...     learning_rate=2e-5,
...     per_device_train_batch_size=16,
...     per_device_eval_batch_size=16,
...     weight_decay=0.01,
...     save_total_limit=3,
...     num_train_epochs=2,
...     predict_with_generate=True,
...     fp16=True,
...     push_to_hub=True,
... )

>>> trainer = Seq2SeqTrainer(
...     model=model,
...     args=training_args,
...     train_dataset=tokenized_books["train"],
...     eval_dataset=tokenized_books["test"],
...     tokenizer=tokenizer,
...     data_collator=data_collator,
...     compute_metrics=compute_metrics,
... )

>>> trainer.train()
```
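
Once training finishes, you can also report a final SacreBLEU score on the held-out split with [`~Trainer.evaluate`]; because `predict_with_generate=True`, the predictions are produced with `generate` rather than a single forward pass (exact scores will vary from run to run):

```py
>>> trainer.evaluate()  # returns a dict including eval_loss, eval_bleu, and eval_gen_len
```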

ํ•™์Šต์ด ์™„๋ฃŒ๋˜๋ฉด [~transformers.Trainer.push_to_hub] ๋ฉ”์„œ๋“œ๋กœ ๋ชจ๋ธ์„ Hub์— ๊ณต์œ ํ•˜์„ธ์š”. ์ด๋Ÿฌ๋ฉด ๋ˆ„๊ตฌ๋‚˜ ๋ชจ๋ธ์„ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ๊ฒŒ ๋ฉ๋‹ˆ๋‹ค:

```py
>>> trainer.push_to_hub()
```

If you aren't familiar with finetuning a model with Keras, take a look at the basic tutorial here first!

To finetune a model in TensorFlow, start by setting up an optimizer function, learning rate schedule, and some training hyperparameters:

```py
>>> from transformers import AdamWeightDecay

>>> optimizer = AdamWeightDecay(learning_rate=2e-5, weight_decay_rate=0.01)
```

Then you can load T5 with [`TFAutoModelForSeq2SeqLM`]:

```py
>>> from transformers import TFAutoModelForSeq2SeqLM

>>> model = TFAutoModelForSeq2SeqLM.from_pretrained(checkpoint)
```

Convert your datasets to the `tf.data.Dataset` format with [`~transformers.TFPreTrainedModel.prepare_tf_dataset`]:

```py
>>> tf_train_set = model.prepare_tf_dataset(
...     tokenized_books["train"],
...     shuffle=True,
...     batch_size=16,
...     collate_fn=data_collator,
... )

>>> tf_test_set = model.prepare_tf_dataset(
...     tokenized_books["test"],
...     shuffle=False,
...     batch_size=16,
...     collate_fn=data_collator,
... )
```

Configure the model for training with `compile`. Note that 🤗 Transformers models all have a default task-relevant loss function, so you don't need to specify one unless you want to:

```py
>>> import tensorflow as tf

>>> model.compile(optimizer=optimizer)
```

ํ›ˆ๋ จ์„ ์‹œ์ž‘ํ•˜๊ธฐ ์ „์— ์˜ˆ์ธก๊ฐ’์œผ๋กœ๋ถ€ํ„ฐ SacreBLEU ๋ฉ”ํŠธ๋ฆญ์„ ๊ณ„์‚ฐํ•˜๋Š” ๋ฐฉ๋ฒ•๊ณผ ๋ชจ๋ธ์„ Hub์— ์—…๋กœ๋“œํ•˜๋Š” ๋ฐฉ๋ฒ• ๋‘ ๊ฐ€์ง€๋ฅผ ๋ฏธ๋ฆฌ ์„ค์ •ํ•ด๋‘ฌ์•ผ ํ•ฉ๋‹ˆ๋‹ค. ๋‘˜ ๋‹ค Keras callbacks๋กœ ๊ตฌํ˜„ํ•˜์„ธ์š”.

Pass your `compute_metrics` function to [`~transformers.KerasMetricCallback`]:

```py
>>> from transformers.keras_callbacks import KerasMetricCallback

>>> metric_callback = KerasMetricCallback(metric_fn=compute_metrics, eval_dataset=tf_test_set)
```

๋ชจ๋ธ๊ณผ ํ† ํฌ๋‚˜์ด์ €๋ฅผ ์—…๋กœ๋“œํ•  ์œ„์น˜๋ฅผ [~transformers.PushToHubCallback]์—์„œ ์ง€์ •ํ•˜์„ธ์š”:

```py
>>> from transformers.keras_callbacks import PushToHubCallback

>>> push_to_hub_callback = PushToHubCallback(
...     output_dir="my_awesome_opus_books_model",
...     tokenizer=tokenizer,
... )
```

Then bundle your callbacks together:

```py
>>> callbacks = [metric_callback, push_to_hub_callback]
```

Finally, you're ready to start training your model! Call `fit` with your training and validation datasets, the number of epochs, and your callbacks to finetune the model:

```py
>>> model.fit(x=tf_train_set, validation_data=tf_test_set, epochs=3, callbacks=callbacks)
```

ํ•™์Šต์ด ์™„๋ฃŒ๋˜๋ฉด ๋ชจ๋ธ์ด ์ž๋™์œผ๋กœ Hub์— ์—…๋กœ๋“œ๋˜๊ณ , ๋ˆ„๊ตฌ๋‚˜ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ๊ฒŒ ๋ฉ๋‹ˆ๋‹ค!

๋ฒˆ์—ญ์„ ์œ„ํ•ด ๋ชจ๋ธ์„ ํŒŒ์ธํŠœ๋‹ํ•˜๋Š” ๋ฐฉ๋ฒ•์— ๋Œ€ํ•œ ๋ณด๋‹ค ์ž์„ธํ•œ ์˜ˆ์ œ๋Š” ํ•ด๋‹น PyTorch ๋…ธํŠธ๋ถ ๋˜๋Š” TensorFlow ๋…ธํŠธ๋ถ์„ ์ฐธ์กฐํ•˜์„ธ์š”.

## Inference[[inference]]

Great, now that you've finetuned a model, you can use it for inference!

Come up with some text you'd like to translate to another language. For T5, you need to prefix your input with the desired task. For translation from English to French, the prefix looks like this:

```py
>>> text = "translate English to French: Legumes share resources with nitrogen-fixing bacteria."
```

ํŒŒ์ธํŠœ๋‹๋œ ๋ชจ๋ธ๋กœ ์ถ”๋ก ํ•˜๊ธฐ์— ์ œ์ผ ๊ฐ„๋‹จํ•œ ๋ฐฉ๋ฒ•์€ [pipeline]์„ ์‚ฌ์šฉํ•˜๋Š” ๊ฒƒ์ž…๋‹ˆ๋‹ค. ํ•ด๋‹น ๋ชจ๋ธ๋กœ ๋ฒˆ์—ญ pipeline์„ ๋งŒ๋“  ๋’ค, ํ…์ŠคํŠธ๋ฅผ ์ „๋‹ฌํ•˜์„ธ์š”:

```py
>>> from transformers import pipeline

>>> translator = pipeline("translation", model="my_awesome_opus_books_model")
>>> translator(text)
[{'translation_text': 'Legumes partagent des ressources avec des bactéries azotantes.'}]
```

You can also manually replicate the results of the `pipeline` if you'd like.

Tokenize the text and return the `input_ids` as PyTorch tensors:

```py
>>> from transformers import AutoTokenizer

>>> tokenizer = AutoTokenizer.from_pretrained("my_awesome_opus_books_model")
>>> inputs = tokenizer(text, return_tensors="pt").input_ids
```

Use the [`~transformers.generation_utils.GenerationMixin.generate`] method to create the translation. For more details about the different text generation strategies and parameters for controlling generation, check out the Text Generation API.

```py
>>> from transformers import AutoModelForSeq2SeqLM

>>> model = AutoModelForSeq2SeqLM.from_pretrained("my_awesome_opus_books_model")
>>> outputs = model.generate(inputs, max_new_tokens=40, do_sample=True, top_k=30, top_p=0.95)
```

์ƒ์„ฑ๋œ ํ† ํฐ ID๋“ค์„ ๋‹ค์‹œ ํ…์ŠคํŠธ๋กœ ๋””์ฝ”๋”ฉํ•˜์„ธ์š”:

```py
>>> tokenizer.decode(outputs[0], skip_special_tokens=True)
'Les lignées partagent des ressources avec des bactéries enfixant l'azote.'
```
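
Since `do_sample=True` samples from the output distribution, the translation can differ from run to run. If you'd rather have deterministic output, here is a sketch of the same call using beam search decoding instead (standard `generate` parameters):

```py
>>> # Beam search keeps the 5 highest-scoring candidate sequences and is deterministic.
>>> outputs = model.generate(inputs, max_new_tokens=40, num_beams=5, early_stopping=True)
>>> tokenizer.decode(outputs[0], skip_special_tokens=True)
```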

Tokenize the text and return the `input_ids` as TensorFlow tensors:

```py
>>> from transformers import AutoTokenizer

>>> tokenizer = AutoTokenizer.from_pretrained("my_awesome_opus_books_model")
>>> inputs = tokenizer(text, return_tensors="tf").input_ids
```

Use the [`~transformers.generation_tf_utils.TFGenerationMixin.generate`] method to create the translation. For more details about the different text generation strategies and parameters for controlling generation, check out the Text Generation API.

```py
>>> from transformers import TFAutoModelForSeq2SeqLM

>>> model = TFAutoModelForSeq2SeqLM.from_pretrained("my_awesome_opus_books_model")
>>> outputs = model.generate(inputs, max_new_tokens=40, do_sample=True, top_k=30, top_p=0.95)
```

์ƒ์„ฑ๋œ ํ† ํฐ ID๋“ค์„ ๋‹ค์‹œ ํ…์ŠคํŠธ๋กœ ๋””์ฝ”๋”ฉํ•˜์„ธ์š”:

```py
>>> tokenizer.decode(outputs[0], skip_special_tokens=True)
'Les lugumes partagent les ressources avec des bactéries fixatrices d'azote.'
```