Embeddings are the backbone of modern AI applications — from search to recommendations to RAG systems. This technical guide covers everything from theory to production Python code.
If you build AI applications, you need to understand embeddings. They power semantic search, RAG systems, recommendation engines, clustering, anomaly detection, and dozens of other applications. Yet most developers treat them as a black box.
Let's open that box.
What Are Embeddings?
An embedding is a list of numbers (vector) that represents the meaning of text. Similar meanings produce similar vectors. Different meanings produce different vectors.
# "The cat sat on the mat" → [0.23, -0.45, 0.67, ...] (1536 numbers)
# "A feline rested on a rug" → [0.21, -0.43, 0.65, ...] (similar!)
# "Stock prices rose sharply" → [-0.89, 0.12, -0.34, ...] (very different!)
The mathematical distance between vectors corresponds to the semantic distance between concepts. This is what makes them useful — you can search by meaning, not just keywords.
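A toy example makes this concrete. The vectors below are made up (real embeddings have hundreds or thousands of dimensions), but the geometry works the same way:

```python
import numpy as np

# Toy 3-dimensional "embeddings" (invented numbers for illustration;
# real models produce hundreds or thousands of dimensions)
cat = np.array([0.9, 0.1, 0.0])      # "the cat sat on the mat"
feline = np.array([0.8, 0.2, 0.1])   # "a feline rested on a rug"
stocks = np.array([0.0, 0.1, 0.95])  # "stock prices rose sharply"

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity: near 1.0 = same direction, near 0.0 = unrelated
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(cat, feline))  # high: close meanings point the same way
print(cosine(cat, stocks))  # low: unrelated meanings point elsewhere
```

The absolute numbers don't matter; what matters is that "cat" and "feline" land close together while "stocks" lands far away.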
Generating Embeddings in Python
Option 1: OpenAI Embeddings
from openai import OpenAI
client = OpenAI()
def get_embedding(text: str, model: str = "text-embedding-3-large") -> list[float]:
    response = client.embeddings.create(
        input=text,
        model=model,
    )
    return response.data[0].embedding
# Generate embeddings
embedding = get_embedding("How do neural networks learn?")
print(f"Dimensions: {len(embedding)}") # 3072 for text-embedding-3-large
Option 2: Open-Source with sentence-transformers
from sentence_transformers import SentenceTransformer
# Free, runs locally, no API key needed
model = SentenceTransformer('all-MiniLM-L6-v2')
texts = [
    "How do neural networks learn?",
    "What is backpropagation in deep learning?",
    "Best restaurants in New York City",
]
embeddings = model.encode(texts)
print(f"Shape: {embeddings.shape}") # (3, 384)
Similarity Search
The core operation: given a query, find the most similar items.
import numpy as np
from numpy.linalg import norm
def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return np.dot(a, b) / (norm(a) * norm(b))
# Compare similarity
query = model.encode("machine learning algorithms")
documents = [
    model.encode("supervised learning classification methods"),
    model.encode("best pizza delivery services"),
    model.encode("neural network training techniques"),
]
for i, doc in enumerate(documents):
    sim = cosine_similarity(query, doc)
    print(f"Document {i}: {sim:.4f}")
# Illustrative output (exact scores vary by model):
# Document 0: 0.7823 (related!)
# Document 1: 0.1234 (not related)
# Document 2: 0.8156 (very related!)
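Looping one document at a time works for a handful of items. In practice you stack the document embeddings into a matrix and rank everything with a single matrix-vector product. A minimal sketch (the toy 4-dimensional vectors are made up, and `top_k` is an illustrative name, not a library function):

```python
import numpy as np

def top_k(query: np.ndarray, doc_matrix: np.ndarray, k: int = 5) -> list[tuple[int, float]]:
    # Normalize rows so a plain dot product equals cosine similarity
    q = query / np.linalg.norm(query)
    docs = doc_matrix / np.linalg.norm(doc_matrix, axis=1, keepdims=True)
    sims = docs @ q                    # one matrix-vector product scores every doc
    order = np.argsort(-sims)[:k]      # indices of the k highest scores
    return [(int(i), float(sims[i])) for i in order]

# Toy embeddings: rows are documents
docs = np.array([
    [0.9, 0.1, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.2],
    [0.8, 0.3, 0.1, 0.0],
])
query = np.array([1.0, 0.2, 0.0, 0.0])
print(top_k(query, docs, k=2))  # best-matching document indices with scores
```

This brute-force scan is fine up to roughly a few hundred thousand vectors; beyond that you want the approximate indexes a vector database provides.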
Vector Databases
For production use at scale, you'll want a vector database that can index and search millions (or billions) of vectors efficiently.
Pinecone Example
from pinecone import Pinecone
pc = Pinecone(api_key="your-key")
index = pc.Index("my-index")
# Upsert vectors
# embedding1, embedding2, and query_embedding are vectors you generated earlier
vectors = [
    {"id": "doc1", "values": embedding1, "metadata": {"title": "ML Basics"}},
    {"id": "doc2", "values": embedding2, "metadata": {"title": "Pizza Guide"}},
]
index.upsert(vectors=vectors)
# Query
results = index.query(
    vector=query_embedding,
    top_k=5,
    include_metadata=True,
)
Supabase pgvector (Free Option)
-- Enable the extension
CREATE EXTENSION IF NOT EXISTS vector;
-- Create a table with a vector column
CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    content TEXT,
    embedding vector(1536)
);
-- Insert with embedding
INSERT INTO documents (content, embedding)
VALUES ('Machine learning basics', '[0.1, 0.2, ...]');
-- Similarity search (<=> is pgvector's cosine distance operator,
-- so 1 - distance gives cosine similarity)
SELECT content, 1 - (embedding <=> '[0.15, 0.25, ...]') AS similarity
FROM documents
ORDER BY embedding <=> '[0.15, 0.25, ...]'
LIMIT 5;
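pgvector accepts vectors as text literals like `'[0.1,0.2,...]'`. If you're inserting from Python, a small serializer is enough; the helper name `to_pgvector` and the commented psycopg-style cursor are illustrative (drivers also ship dedicated pgvector adapters you may prefer):

```python
def to_pgvector(embedding: list[float]) -> str:
    # pgvector's text input format: comma-separated floats in square brackets
    return "[" + ",".join(f"{x:g}" for x in embedding) + "]"

# Parameterized insert keeps the vector out of the SQL string itself.
# Assumes a psycopg-style cursor `cur` and the `documents` table defined above:
# cur.execute(
#     "INSERT INTO documents (content, embedding) VALUES (%s, %s::vector)",
#     ("Machine learning basics", to_pgvector(embedding)),
# )
print(to_pgvector([0.1, 0.2, 0.3]))  # → [0.1,0.2,0.3]
```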
Practical Applications
1. Semantic Search Engine
Replace keyword search with meaning-based search. Users can search "how to fix slow website" and find documents about "performance optimization" even if those words don't appear.
2. Duplicate Detection
Find near-duplicate content by comparing embedding similarity. Useful for content moderation, plagiarism detection, and deduplication.
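The whole check reduces to one pairwise-similarity matrix plus a threshold. A sketch with made-up vectors (the threshold of 0.95 is a typical starting point, not a universal constant):

```python
import numpy as np

def find_near_duplicates(embeddings: np.ndarray, threshold: float = 0.95):
    # Pairwise cosine similarity via a normalized matrix product
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = normed @ normed.T
    pairs = []
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            if sims[i, j] >= threshold:
                pairs.append((i, j, float(sims[i, j])))
    return pairs

# Toy vectors: items 0 and 2 are nearly identical, item 1 is unrelated
emb = np.array([
    [1.00, 0.00, 0.10],
    [0.00, 1.00, 0.00],
    [0.99, 0.01, 0.11],
])
print(find_near_duplicates(emb))  # flags the (0, 2) pair
```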
3. Recommendation Systems
Embed user preferences and item descriptions. Recommend items whose embeddings are closest to the user's preference vector.
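One simple way to build the preference vector is to average the embeddings of items the user liked, then rank the catalog against it. A sketch under that assumption (toy 2-dimensional vectors, illustrative names):

```python
import numpy as np

def recommend(liked: np.ndarray, catalog: np.ndarray, k: int = 3) -> list[int]:
    # Preference vector = mean of the embeddings of items the user liked
    pref = liked.mean(axis=0)
    pref /= np.linalg.norm(pref)
    items = catalog / np.linalg.norm(catalog, axis=1, keepdims=True)
    scores = items @ pref                  # cosine similarity to the preference
    return [int(i) for i in np.argsort(-scores)[:k]]

liked = np.array([[1.0, 0.0], [0.9, 0.1]])            # items the user enjoyed
catalog = np.array([[1.0, 0.05], [0.0, 1.0], [0.8, 0.2]])
print(recommend(liked, catalog, k=3))  # catalog indices, best match first
```

Real systems refine this with recency weighting or learned ranking, but the nearest-embedding core is the same.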
4. Clustering and Classification
Group similar items automatically using k-means or HDBSCAN on embeddings. No labels needed — the structure emerges from the data.
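To show the idea without any extra dependencies, here is a minimal k-means over embedding vectors in plain numpy; a production pipeline would normally reach for scikit-learn's `KMeans` or HDBSCAN instead:

```python
import numpy as np

def kmeans(points: np.ndarray, k: int, iters: int = 20, seed: int = 0) -> np.ndarray:
    # Minimal k-means: returns a cluster label for each point
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest center
        dists = np.linalg.norm(points[:, None] - centers[None, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its assigned points
        for c in range(k):
            if (labels == c).any():
                centers[c] = points[labels == c].mean(axis=0)
    return labels

# Toy embeddings with two obvious groups
points = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
labels = kmeans(points, k=2)
print(labels)  # points 0-1 share one label, points 2-3 the other
```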
Choosing an Embedding Model
- OpenAI text-embedding-3-large: Best quality, $0.13/million tokens
- OpenAI text-embedding-3-small: Good quality, $0.02/million tokens
- Cohere embed-v4: Competitive quality, good for multilingual
- all-MiniLM-L6-v2: Free, runs locally, 384 dimensions, great for prototyping
- BGE-large-en-v1.5: Free, runs locally, 1024 dimensions, production-ready quality
People Also Ask
How many dimensions should embeddings have?
More dimensions capture more nuance but cost more to store and search. For most applications, 768-1536 dimensions is the sweet spot. 384 is fine for prototyping.
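OpenAI's text-embedding-3 models were trained so embeddings can be shortened: keep the first N components and renormalize (the API exposes this as the `dimensions` parameter). The same idea in numpy, with a random stand-in for a real embedding:

```python
import numpy as np

def shorten(embedding: np.ndarray, dims: int) -> np.ndarray:
    # Keep the first `dims` components, then renormalize to unit length
    cut = embedding[:dims]
    return cut / np.linalg.norm(cut)

# Random stand-in for a 3072-dim text-embedding-3-large vector
full = np.random.default_rng(0).standard_normal(3072)
short = shorten(full, 256)
print(short.shape)            # (256,)
print(np.linalg.norm(short))  # ~1.0 after renormalizing
```

This lets you trade retrieval quality for storage and search speed without re-embedding your corpus at a different size.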
Can I use embeddings for images?
Yes — CLIP and similar models create embeddings for both text and images in the same vector space. You can search for images using text queries and vice versa.
How much does vector storage cost?
Pinecone: ~$0.33/million vectors/month. Supabase pgvector: included in Supabase pricing. Self-hosted Qdrant or Chroma: just your server costs.
Written by
Promptium Team
Expert contributor at WOWHOW. Writing about AI, development, automation, and building products that ship.