Monday, 1 July 2019

Importance of Using the $elemMatch Operator When Working With Arrays in MongoDB

In MongoDB, it's common to filter documents based on the contents of an array of subdocuments. There are a couple of ways to query documents with arrays, and which one to use depends on the developer's use case.

1. Using the dot operator (for example, comments.rating).
2. Using the $elemMatch operator.

The main difference is that the two behave differently when the filter contains multiple conditions. This post shows that difference and how each approach works with arrays.

Let's understand this with an example:

Suppose we have a restaurant collection with a document like the following. It has a comments array in which each entry holds a customer rating and a customer category (Bronze/Gold). Ratings are stored as numbers so that they can be compared with $gt.

    {
        "name": "Fly High",
        "restaurant_id": "30075444",
        "street": "Baner",
        "zipcode": "10462",
        "comments": [
            { "rating": 3, "message": "Nice Ambience Only", "customer": "Gold" },
            { "rating": 3.5, "message": "Ok Food" },
            { "rating": 5, "message": "Nice Ambience & Food", "customer": "Bronze" },
            { "rating": 2, "message": "Bad Service", "customer": "Gold" }
        ]
    }

This is just a single restaurant. Suppose we have information for multiple restaurants (documents) and need to find all restaurants that have been rated greater than 4 by a Gold customer.

Let's evaluate both of the supported ways of querying documents with arrays.

1. Using the dot (.) operator:

Query Statement: Find all restaurants that have been rated greater than 4 by a "Gold" customer.

Query: {"comments.rating": {$gt: 4}, "comments.customer": "Gold"}

Suppose the document above is the only one in the collection and we run this query. The result count should be 0, because no Gold customer has given a rating greater than 4. Instead, the document is returned.

As we can see in the document, the only rating greater than 4 comes from a "Bronze" customer, yet the document still appears in the result.

So how does the dot operator work with multiple conditions? It checks each condition against the comments array independently. If every condition is satisfied by some element of the array, not necessarily the same element, the document is returned.

For the current query:

1. It checks whether any comment has a rating greater than 4. (Yes, the one from the Bronze customer.)
2. It checks whether any comment comes from a "Gold" customer. (Yes, there are several.)

That is, each condition is matched somewhere in the array, so the document is returned.
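The two checks above can be imitated in plain JavaScript, without a MongoDB server. This is only a sketch of the matching behaviour, not of how MongoDB evaluates queries internally. The helper name `matchesDotNotation` is made up for illustration, and the sketch assumes that ratings are stored as numbers and that the category is spelled "Gold".

```javascript
// Sample document from above (ratings assumed to be numbers).
const restaurant = {
  name: "Fly High",
  comments: [
    { rating: 3, message: "Nice Ambience Only", customer: "Gold" },
    { rating: 3.5, message: "Ok Food" },
    { rating: 5, message: "Nice Ambience & Food", customer: "Bronze" },
    { rating: 2, message: "Bad Service", customer: "Gold" },
  ],
};

// Hypothetical helper imitating the filter
// {"comments.rating": {$gt: 4}, "comments.customer": "Gold"}.
// Each condition is checked against the whole array on its own.
function matchesDotNotation(doc) {
  const anyHighRating = doc.comments.some((c) => c.rating > 4);
  const anyGoldCustomer = doc.comments.some((c) => c.customer === "Gold");
  return anyHighRating && anyGoldCustomer;
}

console.log(matchesDotNotation(restaurant)); // true
```

The result is true even though the rating of 5 and the "Gold" category sit in different comments.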

Note: This is fine when it does not matter which array element satisfies each condition. When all conditions must hold for the same element ("AND" matching within one comment), it can return unexpected results.

The alternative is the "$elemMatch" operator, which behaves as expected.

2. Using the $elemMatch operator:

Let's rewrite the same query using the "$elemMatch" operator and validate the results.

Query Statement: Find all restaurants that have been rated greater than 4 by a "Gold" customer.

Query: {"comments": {$elemMatch: {"rating": {$gt: 4}, "customer": "Gold"}}}

If the document above is the only one in the collection and we execute the $elemMatch query, the result count is 0, because no single comment comes from a "Gold" customer with a rating greater than 4. That is the expected result.
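The same behaviour can be sketched in plain JavaScript. As before, this only imitates the matching rule and is not MongoDB's implementation. The helper `matchesElemMatch` and the second restaurant, "Example Diner", are invented for illustration, and ratings are assumed to be numbers.

```javascript
// Sample document from above (ratings assumed to be numbers).
const restaurant = {
  name: "Fly High",
  comments: [
    { rating: 3, message: "Nice Ambience Only", customer: "Gold" },
    { rating: 3.5, message: "Ok Food" },
    { rating: 5, message: "Nice Ambience & Food", customer: "Bronze" },
    { rating: 2, message: "Bad Service", customer: "Gold" },
  ],
};

// Hypothetical helper imitating the filter
// {"comments": {$elemMatch: {"rating": {$gt: 4}, "customer": "Gold"}}}.
// Both conditions must hold for the same array element.
function matchesElemMatch(doc) {
  return doc.comments.some((c) => c.rating > 4 && c.customer === "Gold");
}

console.log(matchesElemMatch(restaurant)); // false

// Invented second document in which one Gold comment has a rating above 4.
const exampleDiner = {
  name: "Example Diner",
  comments: [{ rating: 4.5, message: "Great", customer: "Gold" }],
};

console.log(matchesElemMatch(exampleDiner)); // true
```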

Hopefully this helps in understanding the difference between the two approaches. The dot operator can still be used when the query has only a single condition on the array.
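One way to see why a single condition is safe is to write both matching rules as generic functions and compare them. This is a rough sketch in plain JavaScript. The names `dotStyle` and `elemMatchStyle` are hypothetical, and the conditions are ordinary predicate functions rather than MongoDB query operators.

```javascript
// Comments from the sample document (ratings assumed to be numbers).
const comments = [
  { rating: 3, message: "Nice Ambience Only", customer: "Gold" },
  { rating: 3.5, message: "Ok Food" },
  { rating: 5, message: "Nice Ambience & Food", customer: "Bronze" },
  { rating: 2, message: "Bad Service", customer: "Gold" },
];

// Dot-operator style: every condition must be met by some element,
// possibly a different element for each condition.
const dotStyle = (items, conditions) =>
  conditions.every((cond) => items.some((item) => cond(item)));

// $elemMatch style: some single element must meet every condition.
const elemMatchStyle = (items, conditions) =>
  items.some((item) => conditions.every((cond) => cond(item)));

const highRating = (c) => c.rating > 4;
const isGold = (c) => c.customer === "Gold";

// One condition: both rules reduce to "some element meets it".
console.log(dotStyle(comments, [highRating]), elemMatchStyle(comments, [highRating])); // true true

// Two conditions: the rules diverge.
console.log(dotStyle(comments, [highRating, isGold]), elemMatchStyle(comments, [highRating, isGold])); // true false
```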