Generating Embeddings with Ollama for Semantic Search
Author: Venkata Sudhakar
Ollama supports generating text embeddings locally using models such as nomic-embed-text, a compact model optimized specifically for producing high-quality embeddings. Embeddings are numerical vector representations of text that capture semantic meaning, enabling similarity comparisons between pieces of text. At ShopMax India, locally generated embeddings power offline product search in the warehouse management system, where internet connectivity is unreliable.

The ollama.embeddings() function takes a model name and a text prompt and returns a response whose embedding field holds a vector of floating-point numbers. You can then compute cosine similarity between vectors to find the most semantically related items. The entire pipeline runs on local hardware with no data leaving the machine, which is critical for applications involving confidential inventory and pricing data. The example below shows how to generate embeddings with Ollama and perform semantic search over a product list.
First pull the embedding model with ollama pull nomic-embed-text, then use the embeddings in Python as shown below.
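A minimal sketch of the search pipeline is shown below. It assumes the Ollama server is running locally with nomic-embed-text already pulled; the product list is an illustrative sample, and the cosine-similarity helper is written in plain Python to keep the example dependency-free.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def embed(text):
    # ollama.embeddings() returns a response dict; the vector itself
    # is stored under the "embedding" key.
    import ollama
    return ollama.embeddings(model="nomic-embed-text", prompt=text)["embedding"]


if __name__ == "__main__":
    # Illustrative sample catalog; a real deployment would load this
    # from the warehouse management system.
    products = [
        "ShopMax TurboCharge wireless earbuds 30hr battery",
        "ShopMax steel water bottle 1 litre",
        "ShopMax laptop backpack with USB charging port",
    ]
    query = "I want noise cancelling headphones for travel"

    query_vec = embed(query)
    # Score every product against the query and keep the best match.
    scored = [(cosine_similarity(query_vec, embed(p)), p) for p in products]
    best_score, best_product = max(scored)

    print(f"Query: {query}")
    print(f"Best match: {best_product}")
    print(f"Score: {best_score:.4f}")
```

Embedding every product on each query is fine for a small catalog; the caching note at the end of this article covers the scalable approach.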
Running the script gives output similar to the following (the exact score varies with the model version):
Query: I want noise cancelling headphones for travel
Best match: ShopMax TurboCharge wireless earbuds 30hr battery
Score: 0.8134
For better performance at scale, pre-compute and cache all product embeddings rather than generating them on every query. The nomic-embed-text model produces 768-dimensional vectors that offer a good balance of quality and speed on CPU. ShopMax India can store these embeddings in a local SQLite database and refresh them nightly as the product catalog changes, enabling fast offline semantic search without any external API dependency.
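The caching approach described above could be sketched as follows. The table and column names (product_embeddings, product_id) are illustrative assumptions, and vectors are packed into compact binary blobs with the standard-library struct module; a nightly refresh job would simply re-run cache_embedding for each changed product.

```python
import sqlite3
import struct


def vec_to_blob(vec):
    # Pack a list of floats into a binary blob of 32-bit floats.
    return struct.pack(f"{len(vec)}f", *vec)


def blob_to_vec(blob):
    # Each 32-bit float occupies 4 bytes.
    return list(struct.unpack(f"{len(blob) // 4}f", blob))


def init_db(path=":memory:"):
    # Use a file path (e.g. "embeddings.db") for persistent storage.
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS product_embeddings ("
        "  product_id TEXT PRIMARY KEY,"
        "  embedding  BLOB NOT NULL)"
    )
    return conn


def cache_embedding(conn, product_id, vec):
    # Upsert so a nightly refresh overwrites stale vectors in place.
    conn.execute(
        "INSERT INTO product_embeddings (product_id, embedding) VALUES (?, ?) "
        "ON CONFLICT(product_id) DO UPDATE SET embedding = excluded.embedding",
        (product_id, vec_to_blob(vec)),
    )
    conn.commit()


def load_embeddings(conn):
    # Load the whole cache into memory for fast similarity search.
    rows = conn.execute("SELECT product_id, embedding FROM product_embeddings")
    return {pid: blob_to_vec(blob) for pid, blob in rows}
```

At query time, only the query text needs a fresh call to Ollama; the cached product vectors are compared against it directly, keeping search latency low even when the server is busy.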