MongoDB Atlas
This notebook covers how to MongoDB Atlas vector search in LangChain, using the langchain-mongodb
package.
MongoDB Atlas is a fully-managed cloud database available in AWS, Azure, and GCP. It supports native Vector Search, full text search (BM25), and hybrid search on your MongoDB document data.
MongoDB Atlas Vector Search allows to store your embeddings in MongoDB documents, create a vector search index, and perform KNN search with an approximate nearest neighbor algorithm (
Hierarchical Navigable Small Worlds
). It uses the $vectorSearch MQL Stage.
Setup
*An Atlas cluster running MongoDB version 6.0.11, 7.0.2, or later (including RCs).
To use MongoDB Atlas, you must first deploy a cluster. We have a Forever-Free tier of cluster on a cloud of your choice available. To get started head over to Atlas here: quick start.
You'll need to install langchain-mongodb
and pymongo
to use this integration.
pip install -qU langchain-mongodb pymongo
Credentials
For this notebook you will need to find your MongoDB cluster URI.
For information on finding your cluster URI read through this guide.
import getpass
MONGODB_ATLAS_CLUSTER_URI = getpass.getpass("MongoDB Atlas Cluster URI:")
If you want to get best in-class automated tracing of your model calls you can also set your LangSmith API key by uncommenting below:
# os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")
# os.environ["LANGSMITH_TRACING"] = "true"
Initialization
pip install -qU langchain-openai
import getpass
import os
if not os.environ.get("OPENAI_API_KEY"):
os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter API key for OpenAI: ")
from langchain_openai import OpenAIEmbeddings
embeddings = OpenAIEmbeddings(model="text-embedding-3-large")
from langchain_mongodb import MongoDBAtlasVectorSearch
from pymongo import MongoClient
# initialize MongoDB python client
client = MongoClient(MONGODB_ATLAS_CLUSTER_URI)
DB_NAME = "langchain_test_db"
COLLECTION_NAME = "langchain_test_vectorstores"
ATLAS_VECTOR_SEARCH_INDEX_NAME = "langchain-test-index-vectorstores"
MONGODB_COLLECTION = client[DB_NAME][COLLECTION_NAME]
vector_store = MongoDBAtlasVectorSearch(
collection=MONGODB_COLLECTION,
embedding=embeddings,
index_name=ATLAS_VECTOR_SEARCH_INDEX_NAME,
relevance_score_fn="cosine",
)
# Create vector search index on the collection
# Since we are using the default OpenAI embedding model (ada-v2) we need to specify the dimensions as 1536
vector_store.create_vector_search_index(dimensions=1536)
[OPTIONAL] Alternative to the vector_store.create_vector_search_index
command above, you can also create the vector search index using the Atlas UI with the following index definition:
{
"fields":[
{
"type": "vector",
"path": "embedding",
"numDimensions": 1536,
"similarity": "cosine"
}
]
}