Introduction
Text summarization is an essential part of natural language processing (NLP) that condenses large amounts of text into shorter, more legible summaries while preserving crucial information. Given the growth of online content, good summarization techniques are essential for applications such as academic research, content generation, and news digests. This article explains how to build a text summarizer using the T5-base transformer model on the CNN/DailyMail dataset. It also covers preprocessing the data, loading the model, fine-tuning it, and evaluating it.
Learning Objectives
- Understand the key concepts of text summarization and its applications in NLP.
- Know the characteristics and architecture of the T5 model.
- Discover how text summarization tasks are performed with the CNN/DailyMail dataset.
- Discover how to prepare text data for the T5 model.
- Understand how to fine-tune a pre-trained T5-base model on a dataset.
- Examine ways to evaluate model performance and produce summaries on unseen data, our test set.
What approach are we adopting?
Let’s look at our text summarization approach using T5-base on the CNN/DailyMail dataset.
T5 model and tokenizer
The T5 model and its tokenizer are critical components for text summarization. The tokenizer converts text into sequences of tokens, which are numerical representations the model can process. The T5 model then uses these token sequences to generate summaries. In this project, we use the T5-base variant of the model, which balances performance and computational efficiency.
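As a quick illustration of what the tokenizer does, here is a minimal sketch (assuming the transformers and sentencepiece packages are installed; the sample sentence is purely illustrative) that encodes a sentence into token IDs and decodes it back:
from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")

# Convert text into the numerical token IDs the model consumes
token_ids = tokenizer("The quick brown fox jumps over the lazy dog.").input_ids
print(token_ids)  # a list of integers ending with the end-of-sequence token

# Convert the token IDs back into readable text
print(tokenizer.decode(token_ids, skip_special_tokens=True))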
Dataset
The CNN/DailyMail dataset is a widely used benchmark for summarization tasks. It contains news articles and their corresponding summaries (highlights), making it ideal for training and evaluating summarization models. The dataset is divided into training, validation, and test sets, ensuring a robust evaluation of the model.
Preprocessing
Preprocessing involves tokenizing the articles and summaries to prepare them as input for the T5 model. This step includes truncating text to fit the model’s length constraints and padding sequences to ensure uniform input lengths. The preprocess_function handles these tasks, creating matched model inputs and labels.
Training and evaluation
Fine-tuning the T5 model involves training it on the preprocessed dataset. We set up training arguments to control various aspects of the training process, such as the learning rate, batch size, and number of epochs. The Trainer class from the Transformers library simplifies this process, handling model training and evaluation seamlessly.
Inference
After fine-tuning, the model is evaluated on the test set to assess its performance. We then generate summaries for unseen data using the fine-tuned model. The generate_summary function encodes input articles, generates summaries, and decodes the output into readable text.
What is the T5 model?
The T5 architecture comprises a stack of transformer encoder-decoder layers, each capable of processing the input text to capture contextual information and produce meaningful representations. These interconnected layers allow efficient information flow and hierarchical representation learning. T5 delivers state-of-the-art performance on various NLP benchmarks while retaining a simple, scalable architecture.
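To make the text-to-text idea concrete, here is a minimal sketch (assumptions: the pre-trained t5-base checkpoint and the "summarize: " task prefix it was pre-trained with; the fine-tuning code later in this article omits the prefix) of how the encoder-decoder stack maps an input sequence to a generated summary:
from transformers import T5ForConditionalGeneration, T5Tokenizer

model = T5ForConditionalGeneration.from_pretrained("t5-base")
tokenizer = T5Tokenizer.from_pretrained("t5-base")

text = ("summarize: The Eiffel Tower was completed in 1889 as the entrance arch "
        "to the World's Fair and remains one of the most visited monuments in the world.")

# The encoder reads the prefixed input; the decoder generates the summary token by token
input_ids = tokenizer(text, return_tensors="pt", max_length=512, truncation=True).input_ids
output_ids = model.generate(input_ids, max_length=50)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))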
T5-base architecture
Let’s now look at the architecture of the T5-base model.

Comparison with other T5 models
Now let’s compare it with other T5 models.

Code for text summarization using T5-base
Here is the code that will help us implement text summarization using T5-base on the CNN/DailyMail dataset.
Installation and configuration
First, we install the required libraries and import the necessary modules:
!pip install transformers datasets
!pip install accelerate -U
!pip install transformers[torch]
from transformers import T5ForConditionalGeneration, T5Tokenizer, Trainer, TrainingArguments
from datasets import load_dataset
Loading the dataset
We load the CNN/DailyMail dataset:
dataset = load_dataset("cnn_dailymail", "3.0.0")
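As an optional sanity check, you can inspect the splits and fields; each example contains an article, its highlights (the reference summary), and an id:
print(dataset)  # shows the train/validation/test splits and their sizes

sample = dataset["train"][0]
print(sample["article"][:300])  # the first few hundred characters of a news article
print(sample["highlights"])     # its reference summary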
Model and Tokenizer
We load the pre-trained T5 model and tokenizer:
model_name = "t5-base"
model = T5ForConditionalGeneration.from_pretrained(model_name)
tokenizer = T5Tokenizer.from_pretrained(model_name)
Preprocessing of the data
The preprocess_function prepares the data for the model:
def preprocess_function(examples):
    # Tokenize the articles (model inputs), truncating/padding to 512 tokens
    inputs = [doc for doc in examples['article']]
    model_inputs = tokenizer(inputs, max_length=512, truncation=True, padding="max_length")
    # Tokenize the highlights (targets), truncating/padding to 128 tokens
    with tokenizer.as_target_tokenizer():
        labels = tokenizer(examples['highlights'], max_length=128, truncation=True, padding="max_length")
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs
encoded_dataset = dataset.map(preprocess_function, batched=True)
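After mapping, each example keeps its original text fields and gains the tokenized ones; a quick optional check of the new columns:
example = encoded_dataset["train"][0]
print(example.keys())             # now includes 'input_ids', 'attention_mask' and 'labels'
print(len(example["input_ids"]))  # 512, the padded article length
print(len(example["labels"]))     # 128, the padded summary length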
Splitting the dataset
We select a subset for training and a subset (drawn from the validation split) for testing:
train_dataset = encoded_dataset["train"].shuffle(seed=42).select(range(2000))
test_dataset = encoded_dataset["validation"].shuffle(seed=42).select(range(1000))
Training the model
We set up the training arguments and fine-tune the model:
training_args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="epoch",
    learning_rate=3e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    num_train_epochs=3,
    weight_decay=0.01,
    save_total_limit=3,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
)

trainer.train()
Evaluation of the model
We evaluate the fine-tuned model:
trainer.evaluate()
Generating summaries
Finally, we generate summaries for the test set:
import torch
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
def generate_summary(example):
    # Encode the article, generate a summary, and decode it back to text
    input_ids = tokenizer.encode(example["article"], return_tensors="pt", max_length=512, truncation=True).to(device)
    output = model.generate(input_ids)
    summary = tokenizer.decode(output[0], skip_special_tokens=True)
    return {"summary": summary}
summaries = test_dataset.map(generate_summary, batched=False)
Showing examples
We display a few examples to compare reference and generated summaries (using the unseen test dataset):
for i in range(3):
    print("Article:", test_dataset[i]["article"])
    print("\nReference Summary:", test_dataset[i]["highlights"])
    print("\nGenerated Summary:", summaries[i]["summary"])
    print("\n")

The current summarization output captures the essence of the original text. However, we can try a few things to make the summaries deeper and more consistent. To improve performance, different fine-tuning and hyperparameter tuning strategies can be investigated. This includes fine-tuning on a larger, more diverse dataset, and changing the learning rate, batch size, and number of training epochs to improve the model’s convergence and generalization.
In addition, experimenting with other transformer models and architectural adjustments, such as adding layers or attention heads, could help improve the summarization process. The text summarization system can produce more complete and informative summaries by improving the model and experimenting with different hyperparameters.
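For example, one inexpensive experiment is to change only the decoding strategy. The sketch below (a hedged variant of the earlier generate_summary function; the parameter values are illustrative, not tuned) reuses the fine-tuned model, tokenizer, and device from above and swaps greedy decoding for beam search with a length constraint:
def generate_summary_beam(example):
    input_ids = tokenizer.encode(
        example["article"], return_tensors="pt", max_length=512, truncation=True
    ).to(device)
    # Beam search with a minimum length and a length penalty often produces
    # fuller, more coherent summaries than the default greedy decoding
    output = model.generate(
        input_ids,
        num_beams=4,
        min_length=30,
        max_length=128,
        length_penalty=2.0,
        early_stopping=True,
    )
    return {"summary": tokenizer.decode(output[0], skip_special_tokens=True)}

summaries_beam = test_dataset.map(generate_summary_beam, batched=False)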
Evaluating the summaries
We will use ROUGE for our evaluation. First, make sure the library is installed by running “pip install rouge”.
from rouge import Rouge
def calculate_rouge(reference_list, generated_list):
    rouge = Rouge()
    scores = rouge.get_scores(generated_list, reference_list)
    # Average the F1 scores across all summary pairs
    rouge_1 = sum(score['rouge-1']['f'] for score in scores) / len(scores)
    rouge_2 = sum(score['rouge-2']['f'] for score in scores) / len(scores)
    rouge_l = sum(score['rouge-l']['f'] for score in scores) / len(scores)
    return rouge_1, rouge_2, rouge_l
# Initialize lists to store reference and generated summaries
reference_summaries = [example["highlights"] for example in test_dataset]
generated_summaries = [example["summary"] for example in summaries]
# Calculate ROUGE scores
rouge_1, rouge_2, rouge_l = calculate_rouge(reference_summaries, generated_summaries)
print("Average ROUGE-1:", rouge_1)
print("Average ROUGE-2:", rouge_2)
print("Average ROUGE-L:", rouge_l)

These average ROUGE scores indicate the quality of the generated summaries compared to the reference summaries across the dataset. ROUGE uses both precision and recall to compare model-generated summaries with references. Here’s what each score means:
ROUGE-1 measures the overlap of unigrams (individual words) between the generated and reference summaries. An average ROUGE-1 score of 0.2347 shows that, on average, 23.47% of the unigrams in the generated summaries match the reference summaries.
ROUGE-2 measures how closely the generated and reference summaries overlap in bigrams (pairs of neighboring words). An average ROUGE-2 score of 0.0959 means that about 9.59% of the bigrams in the generated summaries match those in the reference summaries.
ROUGE-L measures the longest common subsequence of words shared between the reference and generated summaries. An average ROUGE-L score of 0.2238 means that approximately 22.38% of the longest common subsequence of terms in the generated summaries matches that of the reference summaries.
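To make these numbers concrete, here is a small toy example (the sentences are invented purely for illustration) of how the rouge library scores a single generated/reference pair:
from rouge import Rouge

rouge = Rouge()
generated = "the cat sat on the mat"
reference = "the cat lay quietly on the mat"

scores = rouge.get_scores(generated, reference)[0]
print(scores["rouge-1"]["f"])  # unigram overlap (F1)
print(scores["rouge-2"]["f"])  # bigram overlap (F1)
print(scores["rouge-l"]["f"])  # longest common subsequence overlap (F1)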
Conclusion
In conclusion, text summarization with the T5-base model on the CNN/DailyMail dataset shows the effectiveness of transformer-based architectures for condensing long texts into short summaries. We can produce high-quality summaries by following an organized strategy, starting with data loading and preprocessing and ending with fine-tuning and evaluating the model. This approach demonstrates the adaptability of the T5 model and the importance of rigorous preprocessing and careful model training.
Frequently asked questions
Q. What makes the T5 model different from other transformer models?
A. The T5 model is unusual because it treats all NLP tasks as text-to-text problems. Translation, summarization, and question answering are all cast as text generation tasks. This makes it a very adaptable model that can be fine-tuned for multiple tasks using the same architecture, unlike other transformer models that may require task-specific structures or adjustments.
Q. How does the Trainer class simplify training and evaluation?
A. The Trainer class in the Transformers library facilitates model training and evaluation by providing a high-level interface for specifying training parameters, handling data collation, and running the training loop. It automates procedures such as gradient accumulation, checkpointing, and metric computation, making it easier to fine-tune and evaluate transformer models without extensive boilerplate code.
Q. What are ROUGE scores, and how are they used to evaluate summaries?
A. ROUGE (Recall-Oriented Understudy for Gisting Evaluation) scores are common evaluation metrics for summarization models. They measure the overlap of n-grams, word sequences, and word pairs between the reference summary and the generated summary. Common variants include ROUGE-1, ROUGE-2, and ROUGE-L. These metrics help quantitatively evaluate the quality and relevance of the summaries generated by the model. Human evaluation can also help judge the quality of summaries and model performance.