Vector Search with Monggregate
MongoDB Atlas provides powerful vector search capabilities through the $vectorSearch stage, enabling approximate nearest neighbor (aNN) search on vector embeddings. Monggregate makes these advanced vector search features accessible through an intuitive Python interface.
What is Vector Search?
Vector search finds documents whose vector embeddings are similar to a query vector, enabling semantic search, recommendations, and AI-powered applications.
Atlas Vector Search offers:
- Semantic similarity search using vector embeddings
- Approximate nearest neighbor (aNN) algorithms for efficient vector comparison
- Fast retrieval of similar items from large collections
- Pre-filtering to narrow search scope and improve relevance
- Integration with AI models like OpenAI, Hugging Face, and others
Vector search is particularly useful for applications like:
- Semantic text search that understands meaning, not just keywords
- Image similarity search
- Recommendation systems
- AI-powered chatbots and RAG (Retrieval Augmented Generation)
Prerequisites for Vector Search
Before using vector search with Monggregate, you need to:
- Create an Atlas Vector Search index on your collection
- Generate vector embeddings for your documents using an embedding model
- Store these embeddings in your MongoDB documents
Vector search is only available on MongoDB Atlas clusters running MongoDB v6.0.11, v7.0.2, or later.
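As an illustration of the first prerequisite, here is a minimal sketch of an Atlas Vector Search index definition written as a plain Python dictionary. The field names `embedding`, `category`, and `price`, the dimension count of 1536, and the choice of cosine similarity are placeholder assumptions, not values taken from this guide; adjust them to your embedding model and schema.

```python
# Hypothetical index definition: all paths and numbers are placeholders.
index_definition = {
    "fields": [
        {
            "type": "vector",          # the field holding the embeddings
            "path": "embedding",
            "numDimensions": 1536,     # must match your embedding model's output size
            "similarity": "cosine",    # alternatives: "euclidean", "dotProduct"
        },
        # Fields you want to pre-filter on must be indexed with the "filter" type.
        {"type": "filter", "path": "category"},
        {"type": "filter", "path": "price"},
    ]
}

# Small sanity checks on the definition before submitting it to Atlas.
vector_fields = [f for f in index_definition["fields"] if f["type"] == "vector"]
filter_paths = [f["path"] for f in index_definition["fields"] if f["type"] == "filter"]
```

A definition of this shape can be submitted through the Atlas UI or a driver; the exact creation call depends on your tooling and is not shown here.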
Basic Vector Search
Creating a vector search query with Monggregate is straightforward:
This query will find the 10 documents whose embedding vectors are most similar to your query vector, considering 100 candidate nearest neighbors during the search.
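The query described above corresponds to a raw `$vectorSearch` stage along the following lines. This sketch builds the stage as a plain Python dictionary rather than through Monggregate's builder; the index name `vector_index`, the field name `embedding`, and the five-element query vector are placeholder assumptions.

```python
# Placeholder query vector; a real one comes from your embedding model
# and must match the dimensions of the indexed field.
query_vector = [0.1, 0.2, 0.3, 0.4, 0.5]

vector_search_stage = {
    "$vectorSearch": {
        "index": "vector_index",      # assumed name of the vector search index
        "path": "embedding",          # assumed field containing the embeddings
        "queryVector": query_vector,
        "numCandidates": 100,         # candidates considered during the search
        "limit": 10,                  # documents returned
    }
}

# $vectorSearch must be the first stage of the pipeline.
pipeline = [vector_search_stage]
```

With PyMongo, a pipeline like this would typically be run with `collection.aggregate(pipeline)`.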
Filtering Vector Search Results
You can narrow your vector search with filters:
This search only considers products in the "electronics" category priced below 1000.
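A filter like the one described above can be expressed in the raw stage's `filter` option. The helper below is a hypothetical sketch (the function name, index name, and field names are assumptions for illustration), and it presumes that `category` and `price` are indexed as filter fields.

```python
def filtered_vector_search(query_vector, category, max_price,
                           limit=10, num_candidates=100):
    """Build a $vectorSearch stage restricted by category and price.

    Hypothetical helper: index and field names are placeholders.
    """
    return {
        "$vectorSearch": {
            "index": "vector_index",
            "path": "embedding",
            "queryVector": list(query_vector),
            "numCandidates": num_candidates,
            "limit": limit,
            # The filter is applied before the similarity search runs.
            "filter": {
                "$and": [
                    {"category": {"$eq": category}},
                    {"price": {"$lt": max_price}},
                ]
            },
        }
    }


stage = filtered_vector_search([0.1, 0.2, 0.3], "electronics", 1000)
```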
Retrieving Search Scores
To include the similarity score in your results:
Atlas Vector Search assigns a score between 0 and 1 to each result, with higher scores indicating greater similarity.
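In raw aggregation terms, the score is exposed through the `vectorSearchScore` metadata field, which a later stage can project into the output. The sketch below uses plain dictionaries; the index name and the `name` and `embedding` field names are placeholder assumptions.

```python
vector_search_stage = {
    "$vectorSearch": {
        "index": "vector_index",          # assumed index name
        "path": "embedding",              # assumed embedding field
        "queryVector": [0.1, 0.2, 0.3],   # placeholder query vector
        "numCandidates": 100,
        "limit": 10,
    }
}

# Keep the (assumed) "name" field and attach the similarity score.
score_stage = {
    "$project": {
        "_id": 0,
        "name": 1,
        "score": {"$meta": "vectorSearchScore"},
    }
}

pipeline = [vector_search_stage, score_stage]
```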
Complete Example: Semantic Search
Here's a comprehensive example that uses vector search for a semantic search application:
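One way such an application's query could be assembled is sketched below as a raw aggregation pipeline in plain Python. The helper name `build_semantic_search_pipeline`, the index and field names, and the `candidate_factor` default of 20 are all assumptions for illustration; the query vector would normally come from the same embedding model used for the stored documents.

```python
def build_semantic_search_pipeline(query_vector, limit=5, category=None,
                                   candidate_factor=20):
    """Return aggregation stages for a semantic search query.

    Hypothetical helper: index name, field names, and defaults are placeholders.
    """
    vector_search = {
        "index": "vector_index",
        "path": "embedding",
        "queryVector": list(query_vector),
        # Consider more candidates than results to improve recall.
        "numCandidates": limit * candidate_factor,
        "limit": limit,
    }
    if category is not None:
        # Requires "category" to be indexed as a filter field.
        vector_search["filter"] = {"category": {"$eq": category}}

    return [
        {"$vectorSearch": vector_search},
        {
            "$project": {
                "_id": 0,
                "title": 1,
                "category": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


# Placeholder embedding standing in for the output of an embedding model.
query_vector = [0.12, -0.03, 0.44, 0.08]
pipeline = build_semantic_search_pipeline(query_vector, limit=5,
                                          category="electronics")
# With PyMongo: results = list(collection.aggregate(pipeline))
```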
Technical Details
- Vector dimensions: Your query vector must have the same number of dimensions as the vectors in your indexed field
- numCandidates: Should be greater than the limit for better accuracy, typically 10 to 20 times the limit for good recall
- Performance tuning: Adjust numCandidates to balance search quality against speed
- Filtering: Filters apply only to fields indexed with the "filter" type in your vector search index
- Scoring: For cosine and dotProduct similarities, scores are normalized to the 0 to 1 range using the formula:
score = (1 + cosine(v1, v2)) / 2 for cosine, or score = (1 + dotProduct(v1, v2)) / 2 for dotProduct
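To make the scoring formula and the dimension requirement concrete, the following sketch computes the normalized cosine score locally in pure Python. The function names are illustrative and this is not how Atlas computes scores internally; it only reproduces the arithmetic of the formula above.

```python
import math


def cosine_similarity(v1, v2):
    """Cosine similarity of two vectors of equal, non-zero length."""
    if len(v1) != len(v2):
        # Mirrors the requirement that query and stored vectors share dimensions.
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(b * b for b in v2))
    return dot / (norm1 * norm2)


def normalized_score(similarity):
    """Map a similarity in [-1, 1] to a score in [0, 1]."""
    return (1 + similarity) / 2


same = normalized_score(cosine_similarity([1.0, 0.0], [1.0, 0.0]))        # 1.0
orthogonal = normalized_score(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.5
opposite = normalized_score(cosine_similarity([1.0, 0.0], [-1.0, 0.0]))   # 0.0
```

Identical directions score 1, orthogonal vectors 0.5, and opposite directions 0, which is why higher scores indicate greater similarity.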
Next Steps
- Explore the full range of MongoDB operators for additional data manipulation
- Learn how to build complex aggregation pipelines combining vector search with other stages
- Discover Atlas Search capabilities for traditional text search and faceting