
The Complete Guide to AI Embeddings (With Python Code)


Promptium Team

16 March 2026

11 min read · 1,700 words
embeddings · vector-database · python · similarity-search · pinecone · rag

Embeddings are the backbone of modern AI applications — from search to recommendations to RAG systems. This technical guide covers everything from theory to production Python code.

If you build AI applications, you need to understand embeddings. They power semantic search, RAG systems, recommendation engines, clustering, anomaly detection, and dozens of other applications. Yet most developers treat them as a black box.

Let's open that box.


What Are Embeddings?

An embedding is a list of numbers (a vector) that represents the meaning of a piece of text. Texts with similar meanings map to nearby vectors; unrelated texts map to distant ones.

# "The cat sat on the mat" → [0.23, -0.45, 0.67, ...]  (1536 numbers)
# "A feline rested on a rug" → [0.21, -0.43, 0.65, ...]  (similar!)
# "Stock prices rose sharply" → [-0.89, 0.12, -0.34, ...] (very different!)

The mathematical distance between vectors corresponds to the semantic distance between concepts. This is what makes them useful — you can search by meaning, not just keywords.


Generating Embeddings in Python

Option 1: OpenAI Embeddings

from openai import OpenAI

client = OpenAI()

def get_embedding(text: str, model: str = "text-embedding-3-large") -> list[float]:
    response = client.embeddings.create(
        input=text,
        model=model
    )
    return response.data[0].embedding

# Generate embeddings
embedding = get_embedding("How do neural networks learn?")
print(f"Dimensions: {len(embedding)}")  # 3072 for text-embedding-3-large

Option 2: Open-Source with sentence-transformers

from sentence_transformers import SentenceTransformer

# Free, runs locally, no API key needed
model = SentenceTransformer('all-MiniLM-L6-v2')

texts = [
    "How do neural networks learn?",
    "What is backpropagation in deep learning?",
    "Best restaurants in New York City"
]

embeddings = model.encode(texts)
print(f"Shape: {embeddings.shape}")  # (3, 384)

Similarity Search

The core operation: given a query, find the most similar items.

import numpy as np
from numpy.linalg import norm

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return np.dot(a, b) / (norm(a) * norm(b))

# Compare similarity
query = model.encode("machine learning algorithms")
documents = [
    model.encode("supervised learning classification methods"),
    model.encode("best pizza delivery services"),
    model.encode("neural network training techniques"),
]

for i, doc in enumerate(documents):
    sim = cosine_similarity(query, doc)
    print(f"Document {i}: {sim:.4f}")

# Example output (exact scores vary by model):
# Document 0: 0.7823  (related)
# Document 1: 0.1234  (unrelated)
# Document 2: 0.8156  (very related)
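Looping over documents works for a handful of items; beyond that, it is faster to normalize everything once and use a single matrix product. A minimal numpy sketch, using toy 4-dimensional vectors in place of real embeddings:

```python
import numpy as np

def top_k(query: np.ndarray, docs: np.ndarray, k: int = 3) -> list[tuple[int, float]]:
    """Return the k most similar rows of `docs` to `query` by cosine similarity."""
    # Normalize so the dot product equals cosine similarity
    q = query / np.linalg.norm(query)
    d = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    sims = d @ q                         # one matrix-vector product for all docs
    order = np.argsort(-sims)[:k]        # indices of the highest scores
    return [(int(i), float(sims[i])) for i in order]

# Toy 4-dimensional "embeddings": docs 0 and 1 point the same way as the query
docs = np.array([[1.0, 0.0, 0.0, 0.0],
                 [0.9, 0.1, 0.0, 0.0],
                 [0.0, 0.0, 1.0, 0.0]])
query = np.array([1.0, 0.05, 0.0, 0.0])
print(top_k(query, docs, k=2))  # docs 0 and 1 rank above doc 2
```

Because the rows are normalized up front, ranking thousands of documents is a single matrix product and one `argsort`, which is essentially what a vector database does before it adds approximate indexing on top.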

Vector Databases

For production use, you need a vector database that can index and search millions (or billions) of vectors efficiently.

Pinecone Example

from pinecone import Pinecone

pc = Pinecone(api_key="your-key")
index = pc.Index("my-index")

# Upsert vectors
vectors = [
    {"id": "doc1", "values": embedding1, "metadata": {"title": "ML Basics"}},
    {"id": "doc2", "values": embedding2, "metadata": {"title": "Pizza Guide"}},
]
index.upsert(vectors=vectors)

# Query
results = index.query(
    vector=query_embedding,
    top_k=5,
    include_metadata=True
)

Supabase pgvector (Free Option)

-- Enable the extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Create a table with a vector column
CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    content TEXT,
    embedding vector(1536)  -- match your embedding model's dimensionality
);

-- Insert with embedding
INSERT INTO documents (content, embedding)
VALUES ('Machine learning basics', '[0.1, 0.2, ...]');

-- Similarity search
SELECT content, 1 - (embedding <=> '[0.15, 0.25, ...]') AS similarity
FROM documents
ORDER BY embedding <=> '[0.15, 0.25, ...]'
LIMIT 5;

Practical Applications

1. Semantic Search Engine

Replace keyword search with meaning-based search. Users can search "how to fix slow website" and find documents about "performance optimization" even if those words don't appear.

2. Duplicate Detection

Find near-duplicate content by comparing embedding similarity. Useful for content moderation, plagiarism detection, and deduplication.
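A sketch of the idea in numpy with toy 3-dimensional vectors (real embeddings would come from any of the models above); the 0.95 threshold is an assumption you would tune per model and domain:

```python
import numpy as np

def find_near_duplicates(embs: np.ndarray, threshold: float = 0.95):
    """Return (i, j, similarity) for every pair above the cosine threshold."""
    normed = embs / np.linalg.norm(embs, axis=1, keepdims=True)
    sims = normed @ normed.T             # full pairwise similarity matrix
    pairs = []
    n = len(embs)
    for i in range(n):
        for j in range(i + 1, n):        # upper triangle only, no self-pairs
            if sims[i, j] >= threshold:
                pairs.append((i, j, float(sims[i, j])))
    return pairs

# Toy vectors: items 0 and 1 are near-identical, item 2 is unrelated
embs = np.array([[1.00, 0.00, 0.10],
                 [0.98, 0.01, 0.11],
                 [0.00, 1.00, 0.00]])
print(find_near_duplicates(embs, threshold=0.95))  # flags only the (0, 1) pair
```

The full pairwise matrix is O(n²) in memory, so for large corpora you would query a vector index for each item's nearest neighbor instead.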

3. Recommendation Systems

Embed user preferences and item descriptions. Recommend items whose embeddings are closest to the user's preference vector.
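A minimal sketch, assuming the user's preference vector is simply the mean of the embeddings of items they liked (toy 2-dimensional vectors for illustration; a real system would weight by recency or rating):

```python
import numpy as np

def recommend(liked: np.ndarray, catalog: np.ndarray, k: int = 2) -> list[int]:
    """Rank catalog items by cosine similarity to the mean liked-item embedding."""
    pref = liked.mean(axis=0)
    pref /= np.linalg.norm(pref)
    cat = catalog / np.linalg.norm(catalog, axis=1, keepdims=True)
    return [int(i) for i in np.argsort(-(cat @ pref))[:k]]

liked = np.array([[1.0, 0.0], [0.9, 0.1]])   # user liked two "tech-ish" items
catalog = np.array([[0.95, 0.05],            # tech item
                    [0.00, 1.00],            # cooking item
                    [0.80, 0.20]])           # another tech item
print(recommend(liked, catalog))             # the two tech items rank first
```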

4. Clustering and Classification

Group similar items automatically using k-means or HDBSCAN on embeddings. No labels needed — the structure emerges from the data.
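To make this concrete, here is a bare-bones k-means over toy 2-dimensional points; in practice you would run scikit-learn's KMeans or HDBSCAN on real embeddings rather than hand-rolling the loop:

```python
import numpy as np

def kmeans(points: np.ndarray, k: int, iters: int = 20, seed: int = 0) -> np.ndarray:
    """Bare-bones k-means; returns a cluster label for each point."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest center
        dists = np.linalg.norm(points[:, None] - centers[None, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center as the mean of its assigned points
        for c in range(k):
            if np.any(labels == c):
                centers[c] = points[labels == c].mean(axis=0)
    return labels

# Two well-separated groups of toy "embeddings"
pts = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
labels = kmeans(pts, k=2)
print(labels)  # first three points share one label, last three the other
```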


Choosing an Embedding Model

  • OpenAI text-embedding-3-large: Best quality, $0.13/million tokens
  • OpenAI text-embedding-3-small: Good quality, $0.02/million tokens
  • Cohere embed-v4: Competitive quality, good for multilingual
  • all-MiniLM-L6-v2: Free, runs locally, 384 dimensions, great for prototyping
  • BGE-large-en-v1.5: Free, runs locally, 1024 dimensions, production-ready quality

People Also Ask

How many dimensions should embeddings have?

More dimensions capture more nuance but cost more to store and search. For most applications, 768-1536 dimensions is the sweet spot. 384 is fine for prototyping.

Can I use embeddings for images?

Yes — CLIP and similar models create embeddings for both text and images in the same vector space. You can search for images using text queries and vice versa.

How much does vector storage cost?

Pinecone: ~$0.33/million vectors/month. Supabase pgvector: included in Supabase pricing. Self-hosted Qdrant or Chroma: just your server costs.
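To sanity-check those prices, it helps to know the raw storage footprint. A quick back-of-envelope calculation, assuming float32 storage and ignoring index overhead:

```python
# Back-of-envelope storage for float32 embedding vectors
dims = 1536                                        # e.g. text-embedding-3-small
bytes_per_vector = dims * 4                        # 4 bytes per float32
gb_per_million = bytes_per_vector * 1_000_000 / 1e9
print(f"{bytes_per_vector} bytes per vector")      # 6144 bytes, about 6 KB
print(f"{gb_per_million:.1f} GB per million vectors")
```

Roughly 6 GB of raw vectors per million documents at 1536 dimensions, which is why dropping to 384 dimensions for prototyping cuts storage (and search cost) by 4x.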


Want to skip months of trial and error? We've distilled thousands of hours of prompt engineering into ready-to-use prompt packs that deliver results on day one. Our packs at wowhow.cloud include battle-tested prompts for marketing, coding, business, writing, and more — each one refined until it consistently produces professional-grade output.

Blog reader exclusive: Use code BLOGREADER20 for 20% off your entire cart. No minimum, no catch.

Browse Prompt Packs →

