Embedding
An embedding is a dense numerical vector representation of text (or other data) that encodes semantic meaning such that similar concepts are positioned close together in vector space.
Understanding Embedding
Embeddings are the bridge between human language and mathematical computation. A word like 'meeting' is meaningless to a computer as a string. As a 768- or 1536-dimensional vector, it can be compared mathematically to other vectors. Embeddings encode meaning so that 'meeting' and 'conference' are close in vector space, while 'meeting' and 'database' are far apart.

The power of embeddings is semantic similarity search. Given a query like 'emails about the product launch,' an embedding model converts the query to a vector, then finds all stored email embeddings that are mathematically similar — surfacing relevant emails without requiring exact keyword matches. This captures semantics, not just text patterns.

Embedding models are trained separately from language models and optimized specifically for representation quality. OpenAI's text-embedding-3 models, Cohere's embed models, and open-source models like sentence-transformers are popular choices. Embeddings typically have 768 to 3072 dimensions. Applications using embeddings store content in a vector database (ChromaDB, Pinecone, Weaviate) that enables fast approximate nearest-neighbor search over large embedding collections.
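The geometry described above can be sketched with cosine similarity over toy vectors. The values below are invented purely for illustration (real models emit 768+ dimensions from learned weights), but the comparison logic is the same one similarity search relies on:

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: values near 1.0 mean
    # the vectors point the same way, i.e. the meanings are similar.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy 4-dimensional "embeddings" with hand-picked (hypothetical) values.
embeddings = {
    "meeting":    [0.9, 0.8, 0.1, 0.0],
    "conference": [0.8, 0.9, 0.2, 0.1],
    "database":   [0.1, 0.0, 0.9, 0.8],
}

print(cosine_similarity(embeddings["meeting"], embeddings["conference"]))  # close to 1.0
print(cosine_similarity(embeddings["meeting"], embeddings["database"]))    # much lower
```

Related terms score near 1.0 while unrelated terms score far lower, which is exactly what lets a query vector surface semantically similar content.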
How GAIA Uses Embedding
GAIA embeds all ingested content — emails, tasks, calendar events, documents — into ChromaDB, its vector database. When GAIA needs to find relevant context (e.g., 'what have we discussed about the Q4 budget?'), it converts the query to an embedding and searches ChromaDB for semantically similar content rather than keyword matching, surfacing relevant items regardless of exact phrasing.
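The retrieval step can be sketched as nearest-neighbor search over an in-memory store of pre-computed vectors. This is a stdlib-only approximation of what a vector database like ChromaDB does internally (ChromaDB's actual API and indexing differ; the item texts and vector values here are hypothetical):

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical embeddings for three ingested items (emails, tasks, docs).
store = {
    "email: Q4 budget draft attached": [0.9, 0.7, 0.1],
    "task: book flights for offsite":  [0.1, 0.2, 0.9],
    "doc: Q4 budget review notes":     [0.8, 0.8, 0.2],
}

def search(query_embedding, k=2):
    # Rank every stored item by similarity to the query vector
    # and return the k closest matches.
    ranked = sorted(store.items(),
                    key=lambda item: cosine_similarity(query_embedding, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]

# Pretend the query "what have we discussed about the Q4 budget?"
# was converted to this vector by the embedding model.
query_vec = [0.85, 0.75, 0.15]
print(search(query_vec))  # the two budget-related items rank first
```

Note that neither budget item contains the query's exact wording; they match because their vectors are close, which is the point of semantic rather than keyword search.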
Related Concepts
Vector Embeddings
Vector embeddings are numerical representations of text, images, or other data that capture semantic meaning, enabling machines to understand similarity and relationships between pieces of information.
Vector Database
A vector database is a database system designed to store, index, and query high-dimensional vector embeddings at scale, enabling fast similarity search across large collections of embedded data.
Semantic Search
Semantic search is a search technique that understands the meaning and intent behind a query, returning results based on conceptual relevance rather than exact keyword matches.
Retrieval-Augmented Generation (RAG)
Retrieval-Augmented Generation (RAG) is a technique that enhances LLM responses by first retrieving relevant documents or data from an external knowledge base and injecting that context into the model's prompt.
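The retrieve-then-inject flow can be sketched as simple prompt assembly. The retrieved snippets below are hypothetical stand-ins for what a vector-database similarity search would return; in a real pipeline this prompt would then be sent to an LLM:

```python
# Hypothetical snippets returned by a similarity search over embedded documents.
retrieved = [
    "Q4 budget draft: marketing spend capped at $50k.",
    "Meeting notes: Q4 budget review moved to Friday.",
]

question = "What is the marketing cap in the Q4 budget?"

# Inject the retrieved context into the model's prompt so the answer
# is grounded in external data rather than the model's training alone.
prompt = (
    "Answer using only the context below.\n\n"
    "Context:\n"
    + "\n".join(f"- {snippet}" for snippet in retrieved)
    + f"\n\nQuestion: {question}"
)
print(prompt)
```

Grounding the model in retrieved context is what distinguishes RAG from asking the LLM the question directly.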
Graph-Based Memory
Graph-based memory is an AI memory architecture that stores information as interconnected nodes and relationships, enabling rich contextual understanding and persistent knowledge across interactions.


