Sentence Embeddings with Hugging Face Sentence Transformers
Author: Venkata Sudhakar
Sentence embeddings convert text into dense numerical vectors that capture semantic meaning. Two sentences with similar meaning will have embeddings that are close together in vector space, even if the exact words differ. The Hugging Face sentence-transformers library provides easy access to models such as all-MiniLM-L6-v2, which produces high-quality 384-dimensional sentence embeddings. At ShopMax India, embeddings power the semantic product search engine, which returns relevant results even when customers use different words than those in the product catalog.

Cosine similarity is the standard metric for comparing embeddings. It measures the cosine of the angle between two vectors and returns a value between -1 and 1; a value close to 1 means the sentences are semantically very similar. The sentence-transformers library includes utility functions for computing cosine similarity directly. The example below shows how to generate sentence embeddings and find the most similar product to a customer query.
The example produces the following output:
Query: I need a good laptop for office work
Best match: ShopMax ProBook laptop with i7 processor and 16GB RAM
(score: 0.712)
Query: looking for wireless headphones
Best match: ShopMax TurboCharge wireless noise-cancelling earbuds
(score: 0.681)
Query: fitness tracker that monitors health
Best match: ShopMax SmartWatch with heart rate monitor and GPS
(score: 0.743)
The all-MiniLM-L6-v2 model is only 80MB in size and runs efficiently on CPU, making it suitable for deployment on standard web servers without GPU hardware. For production at ShopMax India, product embeddings should be pre-computed and stored in a vector database so that search queries only require encoding the query text at runtime, not the full catalog on every request.