Create or Update a Collection
This guide shows how to create a new collection from an existing one and how to update an existing collection.
As in the previous how-to guide, we will use the listingsAndReviews collection from the sample_airbnb database.
What Do We Want to Achieve?
We want to separate the reviews from the listings and create a new reviews collection that contains all the reviews, while keeping the relationship between the reviews and the listings.
How?
Creating the New Collection
This is very similar to what we did previously, except that this time we save the result of the pipeline to a new collection.
# /!\ /!\ /!\
# Imports
# and boilerplate code
# to get the db object
# are not included
# /!\ /!\ /!\
from monggregate import Pipeline
# Building the pipeline
reviews = "reviews"
pipeline = Pipeline()
pipeline.unwind(
    reviews
).replace_root(
    reviews
).out(
    reviews
)
# Executing the pipeline
db["listingsAndReviews"].aggregate(pipeline=pipeline.export())
# The call returns no documents: $out writes the results to the reviews collection
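To make the effect of these stages more concrete, here is a minimal pure-Python sketch that mimics them on in-memory dictionaries. It does not use MongoDB or monggregate: the sample listings and the helper names unwind and replace_root are made up for illustration, and the sketch ignores most edge cases that the real stages handle.

```python
# Illustrative only: plain-Python stand-ins for $unwind and $replaceRoot.
# The sample data and helper names are assumptions, not part of monggregate.

def unwind(documents, field):
    """Yield one copy of each document per element of documents[field]."""
    for doc in documents:
        # Like $unwind by default, skip documents whose array is missing or empty
        for element in doc.get(field) or []:
            yield {**doc, field: element}


def replace_root(documents, field):
    """Promote the sub-document stored under `field` to the top level."""
    for doc in documents:
        yield doc[field]


listings = [
    {
        "_id": "listing-1",
        "name": "Cosy flat",
        "reviews": [
            {"_id": "review-1", "listing_id": "listing-1", "comments": "Great"},
            {"_id": "review-2", "listing_id": "listing-1", "comments": "Noisy"},
        ],
    },
    {"_id": "listing-2", "name": "Empty house", "reviews": []},
]

# Equivalent of unwind -> replace_root; $out would then store this result.
reviews_collection = list(replace_root(unwind(listings, "reviews"), "reviews"))

for review in reviews_collection:
    print(review["_id"], "->", review["listing_id"])
```

In this made-up sample, every review carries a listing_id field, so each document in the new collection can still be traced back to its listing.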
We have now created our reviews collection. However, the reviews now live in two places: the listingsAndReviews collection and the reviews collection.
In the listingsAndReviews collection, we want to keep only a reference to each review in the reviews collection.
Updating the "listingsAndReviews" Collection
We want to replace the listingsAndReviews collection with a new one that contains references to the reviews instead of the full review documents.
We have two options here: we can either create a new collection and drop the old one, or overwrite the existing collection and rename it. Pick one of the two; they are alternatives, not successive steps.
# /!\ /!\ /!\
# Imports
# and boilerplate code
# to get the db object
# are not included
# /!\ /!\ /!\
from monggregate import Pipeline
# Useful variables
new_field = "review_ids"
new_collection = "listings"
old_collection = "listingsAndReviews"
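# Option 1: create a new collection, then drop the old one
pipeline = Pipeline()
pipeline.add_fields(
    {new_field: "$reviews._id"}
).add_fields(
    {"reviews": f"${new_field}"}
).unset(
    new_field
).out(new_collection)
# Run the pipeline BEFORE dropping the old collection,
# otherwise the data is lost
db[old_collection].aggregate(pipeline=pipeline.export())
db.drop_collection(old_collection)

# Option 2 (instead of option 1): overwrite the existing collection, then rename it
# A fresh Pipeline is used so that stages do not accumulate
pipeline = Pipeline()
pipeline.add_fields(
    {new_field: "$reviews._id"}
).add_fields(
    {"reviews": f"${new_field}"}
).unset(
    new_field
).out(old_collection)
db[old_collection].aggregate(pipeline=pipeline.export())
db[old_collection].rename(new_collection)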
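The key expression in both options is "$reviews._id": applied to an array of sub-documents, it evaluates to the array of their _id values. The following pure-Python sketch reproduces that transformation on a single made-up listing; the function name keep_review_ids is hypothetical and nothing here talks to MongoDB.

```python
# Illustrative only: what replacing "reviews" by "$reviews._id" does to
# one document. The sample listing is made up for this sketch.

def keep_review_ids(listing):
    """Return a copy of the listing whose reviews are reduced to their _id."""
    review_ids = [review["_id"] for review in listing.get("reviews", [])]
    return {**listing, "reviews": review_ids}


listing = {
    "_id": "listing-1",
    "name": "Cosy flat",
    "reviews": [
        {"_id": "review-1", "comments": "Great"},
        {"_id": "review-2", "comments": "Noisy"},
    ],
}

slim_listing = keep_review_ids(listing)
print(slim_listing["reviews"])
```

The listing keeps all its other fields; only the embedded review documents are replaced by their identifiers.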
You should now have two distinct collections: reviews and listings.
Separating the reviews makes it convenient to retrieve a particular review document: you can now do so by querying the reviews collection with MQL.
Conversely, if you want to query a given listing together with its reviews, you will have to perform a join using the aggregation framework.
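As a rough sketch of such a join, the stages below are written as plain dictionaries in the raw aggregation syntax rather than with monggregate. They assume that each document in listings stores review _id values in its reviews field, as produced above; the function name and the listing identifier are placeholders.

```python
# Raw aggregation stages written as plain dictionaries (pymongo style).
# Assumes listings.reviews holds _id values from the reviews collection.
# The function name and the identifier below are hypothetical.

def listing_with_reviews_pipeline(listing_id):
    """Build a pipeline that fetches one listing and joins its reviews."""
    return [
        # Select the listing we are interested in
        {"$match": {"_id": listing_id}},
        # Replace the array of ids with the matching review documents
        {
            "$lookup": {
                "from": "reviews",
                "localField": "reviews",
                "foreignField": "_id",
                "as": "reviews",
            }
        },
    ]


stages = listing_with_reviews_pipeline("some-listing-id")
print(stages)
# With a live connection, one would run: db["listings"].aggregate(stages)
```

When localField is an array, $lookup matches each of its elements against foreignField, so no $unwind is needed beforehand. The joined reviews are not guaranteed to come back in the same order as the stored ids.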
Generalization
The $out stage is very useful for creating new collections or overwriting existing ones.
Alternatively, you can use the $merge stage to update an existing collection with more control over what happens in case of conflicts.
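To illustrate that extra control, here is a small sketch that builds a raw $merge stage as a plain dictionary. The helper merge_stage is hypothetical and not part of monggregate; it only covers the string values of whenMatched and leaves out the variant where whenMatched is itself a pipeline.

```python
# Hypothetical helper that builds a raw $merge stage as a dictionary.
# Only the string forms of whenMatched are handled in this sketch.

ALLOWED_WHEN_MATCHED = {"replace", "keepExisting", "merge", "fail"}
ALLOWED_WHEN_NOT_MATCHED = {"insert", "discard", "fail"}


def merge_stage(into, on="_id", when_matched="merge", when_not_matched="insert"):
    """Return a $merge stage writing into the collection named `into`."""
    if when_matched not in ALLOWED_WHEN_MATCHED:
        raise ValueError(f"Unsupported whenMatched value: {when_matched}")
    if when_not_matched not in ALLOWED_WHEN_NOT_MATCHED:
        raise ValueError(f"Unsupported whenNotMatched value: {when_not_matched}")
    return {
        "$merge": {
            "into": into,
            "on": on,
            "whenMatched": when_matched,
            "whenNotMatched": when_not_matched,
        }
    }


# Keep reviews that already exist in the target, insert the missing ones
stage = merge_stage("reviews", when_matched="keepExisting")
print(stage)
```

Used as the last stage instead of $out, such a stage would add new reviews to the reviews collection without replacing the whole collection.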