In Browser
	StumbleUpon
	del.icio.us
	Google
	Google Buzz
	reddit
	LinkedIn

	Facebook
	Twitter
	Linkedin
	E-Mail

Generative AI > Hugging Face > Text Summarization with Hugging Face BART

Text Summarization with Hugging Face BART

Author: Venkata Sudhakar

Text summarization condenses long documents into shorter, coherent summaries. This is especially useful in e-commerce settings where product specifications, user manuals, and customer feedback can be lengthy. Hugging Face provides a summarization pipeline backed by models like facebook/bart-large-cnn which is a BART model fine-tuned on the CNN/Daily Mail dataset for abstractive summarization. At ShopMax India, the product team uses summarization to condense lengthy vendor specifications into concise bullet points for the website.

The summarization pipeline accepts raw text and optional parameters like max_length and min_length to control the summary size. The model reads the full input and generates a shorter version in its own words, unlike extractive summarization which picks sentences directly from the source text.

The below example shows how to summarize a product specification document using the Hugging Face summarization pipeline.

from transformers import pipeline

# Load summarization pipeline with BART model
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

# ShopMax India - summarize a long product specification
spec_text = """
The ShopMax UltraBook Pro is a premium business laptop designed for
professionals in Mumbai, Bangalore, and Delhi. It features the latest
13th generation Intel Core i9 processor with 32GB of DDR5 RAM and a
2TB NVMe SSD. The display is a 15.6-inch 4K OLED panel with 120Hz
refresh rate, DCI-P3 100% color coverage, and 500 nits of brightness.
The laptop includes Thunderbolt 4 ports, Wi-Fi 6E, Bluetooth 5.3,
and an integrated fingerprint reader. Battery life is rated at 14 hours
with the 86Wh battery and 100W USB-C fast charging. The magnesium
alloy chassis weighs just 1.6kg and is MIL-SPEC 810H certified for
durability. Price starts at Rs 1,45,000 for the base configuration.
"""

result = summarizer(
    spec_text,
    max_length=80,
    min_length=30,
    do_sample=False
)

print("Original length:", len(spec_text.split()), "words")
print("Summary:", result[0]["summary_text"])

It gives the following output,

Original length: 119 words
Summary: The ShopMax UltraBook Pro is a premium business laptop
for professionals. It features a 13th gen Intel Core i9, 32GB RAM,
2TB SSD, and a 15.6-inch 4K OLED display. It weighs 1.6kg and
starts at Rs 1,45,000.

The do_sample=False setting uses beam search decoding which produces more factual and consistent summaries compared to sampling. The max_length and min_length parameters are token counts, not word counts, so the actual word count of the summary may differ slightly. For batch processing at ShopMax India, you can pass a list of texts to the pipeline and it will return a list of summaries, significantly reducing the time needed to process large product catalogs.

Send your comments, suggestions or queries regarding this site to [email protected].