DEV Community

Peter Eysermans
Peter Eysermans

Posted on • Originally published at eysermans.com

Things I wish I knew before I got started with CosmosDB

Lately I've been working on a project using Cosmos DB on Azure. I really like the simplicity of a document DB. However there were some things I learned while on the job which I wished somebody told me before starting with it.

Cosmos DB?

Cosmos DB is Microsoft's flavoured document DB, you could call it Microsoft's Mongo DB that runs on Azure. On contrary to a normal SQL database, a document DB stores documents in the database. There is no schema, so you don't need to define columns and data types. Instead you pass a document to the document database and it will store it in JSON for you. This also means that you can add and remove properties on a document without having to change other documents which store similar information. An example: if there are 2 documents in the database which represent a User, the first User can have an address and it's perfectly possible for the second User to not have an address.

First user:

{
    id: "41bb417e-c942-4df2-96f0-cd2c3e1b2f91",
    name: "User 1",
    street: "My street",
    number: 55,
    postalCode: 1234,
    city: "Antwerp"
}
Enter fullscreen mode Exit fullscreen mode

Second user:

{
    id: "cc4f3a5b-5e1c-4cdf-8167-dbd363538e26",
    name: "User 2"
}
Enter fullscreen mode Exit fullscreen mode

Now we got the basics, let's look into some operations.

Parameterized query

Let's start with the very basic. If you'd like to run a query using parameters with C#, the SqlQuerySpec class can be used and passed to the DocumentClient. Both classes can be found in the Microsoft.Azure.DocumentDB package.

var querySpec = new SqlQuerySpec {
    QueryText = "select * from c where c.id = @id",
    Parameters = new SqlParameterCollection {
        new SqlParameter { 
            Name = "@id",
            Value = userId
        }
    }
}

// These properties can be found in the Azure portal.
var documentDbUri = "cosmosdb-uri";
var databaseId = "cosmosdb-database";
var collectionId = "cosmosdb-collection";

var documentClient = new DocumentClient(new Uri(documentDbUri), documentDbKey);
var database = documentClient.CreateDatabaseQuery().FirstOrDefault(d => d.Id == databaseId);
var collection = documentClient.CreateDocumentCollectionQuery().FirstOrDefault(c => c.Id == collectionId);
var queryResult = documentClient.CreateQuery(collection.DocumentsLink, query);
Enter fullscreen mode Exit fullscreen mode

ARRAY_CONTAINS

This function's name pretty much speaks for itself. If a document has a property which is an array, ARRAY_CONTAINS can be used in a query to check if the array contains a certain value.

An example, this is our document:

{
    id: "9476d2bb-6f19-483f-9a59-446ddc693050",
    name: "Jane Doe",
    roles: [
        "manager",
        "reporting"
    ]
}
Enter fullscreen mode Exit fullscreen mode

To query all the users that have the manager role:

select * from c where ARRAY_CONTAINS(c.roles, "manager")
Enter fullscreen mode Exit fullscreen mode

If the property is an array of objects, you can also pass an object to the ARRAY_CONTAINS function. More info can be found in this Stackoverflow question

SELECT  *
FROM    food as f
WHERE   ARRAY_CONTAINS(f.servings, {"description":"bar"}, true)
Enter fullscreen mode Exit fullscreen mode

Another great use of this function is using it as an alternative for the traditional SQL in keyword. If you would want to check if an ID is in an array that is passed as a parameter you would write something like this:

SELECT *
FROM Orders
WHERE CustomerId in (@customerIds)
Enter fullscreen mode Exit fullscreen mode

@customerIds would then be an array of ids which is passed from C#.

To do a similar query on a CosmosDB instance, ARRAY_CONTAINS can be used. The array is passed as a parameter and ARRAY_CONTAINS can be used to check if the id is in the given parameter.

var customerIds = new List<int> { 5, 6, 7, 8 };

var querySpec = new SqlQuerySpec {
    QueryText = "select * from c where ARRAY_CONTAINS(@customerIds, c.customerId)",
    Parameters = new SqlParameterCollection {
        new SqlParameter { 
            Name = "@customerIds",
            Value = customerIds
        }
    }
}
Enter fullscreen mode Exit fullscreen mode

Count

If you want to know how many records are in the result of a given query, there are two ways of getting the result. Using the LINQ Count method.

// These properties can be found in the Azure portal.
var documentDbUri = "cosmosdb-uri";
var databaseId = "cosmosdb-database";
var collectionId = "cosmosdb-collection";

var documentClient = new DocumentClient(new Uri(documentDbUri), documentDbKey);
var database = documentClient.CreateDatabaseQuery().FirstOrDefault(d => d.Id == databaseId);
var collection = documentClient.CreateDocumentCollectionQuery().FirstOrDefault(c => c.Id == collectionId);

var count = documentClient.CreateDocumentQuery<User>(collection.DocumentsLink)
                          .Where(u => u.city == "Antwerp")
                          .Count();
Enter fullscreen mode Exit fullscreen mode

Or you can use the COUNT keyword in a SQL query, this query can be ran using a parameterized query.

SELECT VALUE COUNT(d) FROM d where city = 'Antwerp'
Enter fullscreen mode Exit fullscreen mode

Read more about count in Cosmos DB

Client

Although the Azure portal allows to do any kind of operation on the document database, it's a very clunky way to work. I'm a big fan of using a dedicated client. Although the Microsoft Azure Storage Explorer still has some rough edges, the Cosmos DB support is still in preview, it works well. And besides managing documents in a Cosmos DB on Azure you can also manage your storage account.

Another option is the open source Cosmos DB explorer but it has less options than the Storage Explorer.

Like keyword

In traditional SQL the like keyword can be used to filter a string field on a certain value. CosmosDB foresees in equivalents for the like keyword. Let's go over the different options.

Show all rows where a specific column contains a certain value

SQL:

select * from Table where Column like '%value%'
Enter fullscreen mode Exit fullscreen mode

CosmosDB:

select * from d where CONTAINS(d.Column, 'value')
Enter fullscreen mode Exit fullscreen mode

Show all rows where a specific column starts with a certain value

SQL:

select * from Table where Column like 'value%'
Enter fullscreen mode Exit fullscreen mode

CosmosDB:

select * from d where STARTSWITH(d.Column, 'value')
Enter fullscreen mode Exit fullscreen mode

Show all rows where a specific column ends with a certain value

SQL:

select * from Table where Column like '%value'
Enter fullscreen mode Exit fullscreen mode

CosmosDB:

select * from d where ENDSWITH(d.Column, 'value')
Enter fullscreen mode Exit fullscreen mode




Conclusion

As you can see if you compare CosmosDB with a relational SQL database there are some things that work quite different. I hope this gives you a head start with Cosmos DB. Feel free to ping me if you have any other tips or remarks.

Top comments (4)

Collapse
 
khrome83 profile image
Zane Milakovic

This is great!

I actually stared to play with it a few months ago, and found myself abandoning the use of it.

To put it simple, I had a hard time with the documentation, much of which is outdated. Simple things like deleting a record or appending to an array was not easy to find. I ended up switching to FireBase.

I am going to try Fauna DB for my next project though.

Collapse
 
petereysermans profile image
Peter Eysermans

Thanks, there is indeed outdated documentation out there. That's one of the reasons why I wrote this post. But once I had the basic operations handled it was fun to work with. If I have the time I'll have a look at Fauna DB, looks interesting.

Collapse
 
tanishahudson1 profile image
tanishahudson1

Hello, I am trying to access values within CosmosDB and have hit a roadblock. Any insight is appreciated:

I wish to locate all the nested fields within my JSON document that contain the "state" = "new" and where the "id" length > 4

      {
          "id": "123"
          "feedback" : {
             "Features" : [
                              {
                                "state":"new"
                                "id": "12345"

                              }


                          ]
     }

This is what I have tried to do:

Since this is a nested document. My query looks like this:

 SELECT
 c.feedback
 ARRAY(SELECT Features.state FROM c IN c.Features) AS Features FROM 
 c
 WHERE length(Features.id) >15

However, the syntax is not correct and I am currently researching and looking for examples for this case

Any help is appreciated

Collapse
 
petereysermans profile image
Peter Eysermans • Edited

Hello, this is the query you are looking for:

 SELECT c.feedback from c
 join a in c.feedback.Features
 where a.state = 'new'
and length(a.id) > 4

As you can see there is a join on the Features array and then the fields of that array can be queried.

There are also additional options on the ARRAY_CONTAINS which are interesting to look at if you are working with nested objects and nested arrays.