Getting a Random Record From a MongoDB Collection

One of my issues with MongoDB is that, as of this writing, there is no built-in way to retrieve a random record. In SQL, you can simply use something like ORDER BY RAND() (the exact syntax varies with your flavor) to retrieve random records, at a slightly expensive query cost. There is not yet an equivalent in MongoDB because of its sequential access nature. There is a purely JavaScript method in the MongoDB cookbook here. If you are really interested, I would also read the Jira ticket thread #533 on this issue.

Although it feels a little dirty and kind of hackish, here is how I got a random record using the Mongo-Ruby driver. Part of this is documented in the cookbook article I linked to above, but I reiterate bits and pieces of it here. This is essentially the same thing that any ORDER BY RAND() statement is doing; it's just not doing it "on the fly".

The first thing you’ll have to do is add an additional field to every document in the collection; we’ll call it random. For ease of use, we’ll also say that every value stored in this field is at least 0 and less than 1 (and can therefore be generated via Kernel.rand()). This is important because we are going to use it as our criterion for finding a random record.
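
As a rough illustration of that step, here is a minimal sketch of stamping the random field onto a document before it is saved. The helper name with_random and the sample url value are my own placeholders, not part of the Mongo-Ruby driver, and the example works on a plain Ruby hash so it runs without a database.

```ruby
# Hypothetical helper (not part of the Mongo-Ruby driver): returns a copy of
# the document hash with a 'random' field holding a float in [0, 0.999...].
def with_random(doc)
  doc.merge('random' => Kernel.rand)
end

# Placeholder document; the url value is made up for illustration.
doc = with_random('url' => 'http://example.com', 'created_at' => Time.now)

# This hash is what you would then pass to the driver when inserting.
puts doc['random']
```

Documents that are already in the collection would need the same field backfilled, otherwise they can never be matched by the lookup.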

First, initialize the connection to the database and bind an instance variable to a collection. Then generate the random number that you are going to use to find a random record. Now we try to find_one document whose random value is greater than or equal to our random number. In case we miss, we then try less than or equal to. This means that as long as we have at least 1 document in our collection, we will return a record. The more documents in the collection, the better the randomness of the returned document.

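require 'mongo'

# Connect with the Mongo-Ruby driver and select the collection
@@mongodb = Mongo::Connection.new("localhost", 27017).db("test_db")
@collection = @@mongodb["collection_name"]

# Pick a threshold, look at or above it, then fall back to at or below it
@rand = Kernel.rand()
@random_record = @collection.find_one({ 'random' => { '$gte' => @rand } })
if @random_record.nil?
    @random_record = @collection.find_one({ 'random' => { '$lte' => @rand } })
end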
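
To see the two-step lookup in isolation, here is a small in-memory sketch that needs no MongoDB connection. The helper pick_random and the sample documents are hypothetical and exist only for this illustration. Note one deliberate difference: the sketch sorts by random so the closest document wins, whereas an unsorted find_one returns whichever matching document the server encounters first.

```ruby
# In-memory stand-in for the collection: an array of hashes, each carrying
# a 'random' field. The values are made up for illustration.
sample_docs = [
  { 'name' => 'a', 'random' => 0.12 },
  { 'name' => 'b', 'random' => 0.45 },
  { 'name' => 'c', 'random' => 0.87 }
]

# Hypothetical helper mirroring the two queries: first look for the nearest
# document at or above the threshold, then fall back to the nearest one at
# or below it. Returns nil only when docs is empty.
def pick_random(docs, threshold = Kernel.rand)
  sorted = docs.sort_by { |d| d['random'] }
  sorted.find { |d| d['random'] >= threshold } ||
    sorted.reverse.find { |d| d['random'] <= threshold }
end

puts pick_random(sample_docs, 0.30)['name']  # first branch matches 'b'
puts pick_random(sample_docs, 0.95)['name']  # nothing >= 0.95, falls back to 'c'
```

With only three documents the gaps between values are uneven, so some documents are far more likely to be picked than others, which is why a larger collection gives better randomness.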

For reference, a document in a MongoDB collection with a random field may look like this:

    {
        "_id"        : ObjectId("4c5c710e41b89d657d000001"),
        "url"        : "",
        "created_at" : "Fri Aug 06 2010 16:32:15 GMT-0400 (EDT)",
        "random"     : 0.45929463868260356
    }

