A little-known fact about MongoDB ObjectIds is that they encode the document’s creation date. This means that
you don’t have to create an index on a dedicated date field to filter by creation time, because the _id
field is automatically indexed. Let’s generate an ObjectId with nothing more than the command line, sprinkled with some black magic:
# 1. get the date 3 months ago
$ date=$(date --date="3 months ago" +%s)
# 2. generate the ObjectId by converting the date to hex and fill the rest with 0s
$ echo "$(printf "%x" ${date})0000000000000000"
5ca378720000000000000000
# 3. query the DB for documents according to creation date
$ mongo
>> use mydb
>> db.collection.find({_id:{$gt: ObjectId("5ca378720000000000000000")}})
- date has a nifty feature: you can give it a human-readable, relative expression called a date string (here, "3 months ago"), and it resolves it to the corresponding date. The +%s format prints that date as seconds since the epoch.
- MongoDB ObjectIds are 12 bytes long and are written as 24-character hexadecimal strings. The first 4 bytes hold the number of seconds since the epoch (the star of our show), the next 5 bytes are a random value, and the last 3 bytes are a counter. Since only the first 4 bytes store the date, we can fill the rest with zeros. We convert our date (expressed in seconds since the epoch) to hex using the built-in printf command and the %x (unsigned hexadecimal) format.
- We use the generated ObjectId in a $gt (greater than) query, which returns all the documents created after the given date.
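The conversion also works in reverse, which is handy for checking a boundary value. Below is a minimal sketch that wraps both directions in two small shell functions. The function names objectid_from_epoch and objectid_to_epoch are made up for this illustration, and the sketch assumes a printf that accepts 0x-prefixed hex numbers (as in bash and dash).

```shell
# Hypothetical helper: build a boundary ObjectId from seconds since the epoch.
# %08x zero-pads the timestamp to 4 bytes (8 hex digits); the remaining
# 8 bytes (16 hex digits) are filled with zeros.
objectid_from_epoch() {
  printf '%08x0000000000000000\n' "$1"
}

# Hypothetical helper: read the creation time back out of an ObjectId.
# Only the first 8 hex characters (4 bytes) carry the timestamp.
objectid_to_epoch() {
  printf '%d\n' "0x$(printf '%s' "$1" | cut -c1-8)"
}

objectid_from_epoch 1554217074                 # prints 5ca378720000000000000000
objectid_to_epoch 5ca378720000000000000000     # prints 1554217074
```

With GNU date, the two pieces combine into objectid_from_epoch "$(date --date="3 months ago" +%s)", which should give the same kind of value as step 2 above.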
Now, why would you need this?
It comes in handy when you don’t have an index on a date field but need to filter documents by a date boundary. For example:
- deleting all documents older than 30 days
- archiving all documents older than 6 months and then removing the data to save MongoDB disk space [1]
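For "older than" use cases like these, the comparison flips from $gt to $lt. Here is a rough sketch that only prints the delete statement so it can be reviewed before anything is removed. The function name delete_older_than_stmt is invented for this example, and the collection name is the placeholder used in the query above.

```shell
# Hypothetical helper: print a mongo shell statement that deletes every
# document created before the given time (seconds since the epoch).
delete_older_than_stmt() {
  # Zero-filled boundary ObjectId: 4-byte timestamp followed by 8 zero bytes.
  boundary=$(printf '%08x0000000000000000' "$1")
  # Single quotes keep $lt literal; %s is replaced by the boundary ID.
  printf 'db.collection.deleteMany({_id: {$lt: ObjectId("%s")}})\n' "$boundary"
}

delete_older_than_stmt 1554217074
# prints: db.collection.deleteMany({_id: {$lt: ObjectId("5ca378720000000000000000")}})
```

Assuming GNU date, the 30-day boundary would come from delete_older_than_stmt "$(date --date="30 days ago" +%s)". Running the same boundary through find first is a cheap way to see what would be deleted.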
[1] Deleting documents will not free up disk space on the host: MongoDB keeps the newly freed space for new documents. Starting with MongoDB 3.2, compacting a collection returns the free space to the OS.