Integrating OData and DocumentDb with Azure Functions

This is the finale in a series of posts that aimed to provide an alternative to Azure Mobile Apps based on Azure Functions and DocumentDB. I started by discussing how to run a CRUD HTTP API, then moved on to DocumentDB and handled inserts and replacements. Now it’s time to fetch data. Azure Mobile Apps uses a modified OData v3 query string to perform the offline sync and online querying of data. This is mostly because ASP.NET (which was the basis for the original service) has a nice OData library. OData is painful to use in our context, however. Firstly, there are some necessary renames: the updatedAt field is actually the DocumentDB timestamp, for example. Secondly, there is no ready-made library for turning an OData query string into a DocumentDB SQL statement. So I don’t have an “easy” way of fulfilling the requirement.

Fortunately, the Azure Mobile Apps Node SDK has split off a couple of libraries for more general use. The first is azure-query-js. This is a library for converting between a set of OData query parameters and an internal query structure. The second is azure-odata-sql, which is for turning a normalized OData query into SQL, based on Microsoft SQL or SQLite syntax. Neither of these libraries is particularly well documented, but they are relatively easy to use based on the examples used within the Azure Mobile Apps SDKs. We are going to need to modify the azure-odata-sql library to generate appropriate SQL statements for DocumentDB, so I’ve copied the source to the library into my project (in the directory odata-sql). My first stab at the getAllItems() method looks like this:

var OData = require('azure-query-js').Query.Providers.OData;
var formatSql = require('../odata-sql').format;

function getAllItems(req, res) {
    // DocumentDB doesn't support SKIP yet, so we can't do TOP either without some problems
    var query = OData.fromOData(
        settings.table,
        req.query.$filter,
        req.query.$orderby,
        undefined, //parseInt(req.query.$skip),
        undefined, //parseInt(req.query.$top),
        req.query.$select,
        req.query.$inlinecount === 'allpages',
        !!req.query.__includeDeleted);

    var sql = formatSql(OData.toOData(query), {
        containerName: settings.table,
        flavor: 'documentdb'
    });
    
    res.status(200).json({ query: req.query, sql: sql, message: 'getAll' });
}

As noted here, DocumentDB hasn’t added full support for SKIP/TOP statements, so we can’t use those elements. Once that support is available within DocumentDB, I just need to handle it in the odata-sql library and change the two parameters in the fromOData() call.
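Until then, one possible stop-gap (not something this project does) is to apply $skip and $top in memory after the query returns. The sketch below assumes the documents are already in an array; applyPaging is a hypothetical helper name. It does nothing for the cost of fetching every record, it only trims what goes back to the client.

```javascript
// Hypothetical helper: apply $skip and $top to an already-fetched array.
// skipParam and topParam are the raw query string values (possibly undefined).
function applyPaging(documents, skipParam, topParam) {
    const skip = Number.parseInt(skipParam, 10);
    const top = Number.parseInt(topParam, 10);

    // Ignore missing, non-numeric or negative values.
    const start = Number.isInteger(skip) && skip > 0 ? skip : 0;
    const end = Number.isInteger(top) && top >= 0 ? start + top : undefined;

    return documents.slice(start, end);
}
```

For example, applyPaging(documents, req.query.$skip, req.query.$top) returns the whole array unchanged when neither parameter is supplied.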

So, what does this do? Well, first, it converts the request from the browser (or client SDK) from the jumble of valid OData query params into a Query object. That Query object is actually a set of functions to do the parsing. Then we use the toOData() method (from the azure-query-js library) to convert that Query object into a normalized OData query. Finally, we use a custom SQL formatter (based on the azure-odata-sql library) to convert it to a SQL statement. If you run this, you should get something like the following out of it:

[Image: getall-1 (JSON response showing the parsed query and the generated SQL)]

I can now see the SQL statements being generated. The only problem is that they are not actually valid SQL statements for DocumentDB. They are actually perfectly valid for Microsoft SQL Server or SQL Azure. We need to adjust the odata-sql library for our needs. There are a couple of things needed here. Our first requirement is around the updatedAt field. This is not updatedAt in DocumentDB – it’s _ts, and it’s a number. We can do this using regular expressions like this:

if (req.query.$filter) {
    while (/updatedAt [a-z]+ '[^']+'/.test(req.query.$filter)) {
        var re = new RegExp(/updatedAt ([a-z]+) '([^']+)'/);
        var results = re.exec(req.query.$filter);
        var newDate = moment(results[2]).unix();
        var newString = `_ts ${results[1]} ${newDate}`;
        req.query.$filter = req.query.$filter.replace(results[0], newString);
    }
}

I could probably have shrunk this code somewhat, but it’s clear what is going on. We loop around the filter while there is still an updatedAt clause, convert the date, then replace the old string with the new string. We need to do similar things with the $select and $orderby clauses as well – left out because I’m trying to make this simple.
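For comparison, here is a more compact sketch of the same rewrite as a pure function, plus one way the $orderby rename might look. Both function names are hypothetical. Note that it uses the built-in Date.parse() instead of moment so that it runs without dependencies; the two can differ on non-ISO date strings.

```javascript
// Rewrite every "updatedAt <op> '<date>'" clause to "_ts <op> <epoch seconds>".
function rewriteUpdatedAtFilter(filter) {
    return filter.replace(/updatedAt ([a-z]+) '([^']+)'/g, (match, op, date) => {
        const millis = Date.parse(date);
        // Leave the clause untouched if the date cannot be parsed.
        if (Number.isNaN(millis)) {
            return match;
        }
        // _ts is stored in whole seconds, so drop the milliseconds.
        return `_ts ${op} ${Math.floor(millis / 1000)}`;
    });
}

// Rename updatedAt to _ts in an $orderby value such as "updatedAt desc,text".
// The word boundaries stop it from touching names that merely contain updatedAt.
function rewriteUpdatedAtOrderBy(orderby) {
    return orderby.replace(/\bupdatedAt\b/g, '_ts');
}
```

The global flag on the regular expression replaces the while loop: every matching clause is handled in a single pass.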

In terms of the odata-sql library, most of what we want is in the helpers.js file. Specifically, in the case of DocumentDB, we don’t need the square brackets. That means the formatMember() and formatTableName() methods must be adjusted to compensate.
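To illustrate the kind of change involved, here is a rough sketch. These functions are hypothetical stand-ins, not the real helpers.js code, and the real signatures may differ. Qualifying each property with the collection name is also an assumption on my part, based on DocumentDB SQL requiring property references to be rooted at the name used in the FROM clause.

```javascript
// Hypothetical stand-ins; the real helpers.js signatures may differ.

// SQL Server style: identifiers are wrapped in square brackets.
function formatMemberSqlServer(memberName) {
    return `[${memberName}]`;
}

// DocumentDB style (assumption): no brackets, and the property is
// qualified with the collection name used in the FROM clause.
function formatMemberDocumentDb(containerName, memberName) {
    return `${containerName}.${memberName}`;
}

// DocumentDB style: the collection name is emitted as-is.
function formatTableNameDocumentDb(tableName) {
    return tableName;
}
```

With these, a member such as text in the todoitem collection is emitted as todoitem.text rather than [text].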

I found it easier to step through the code by writing a small test program to test this logic out. You can find it in todoitem\test.js. With Visual Studio Code, you can set breakpoints, watch variables and do all the normal debugging things to really understand where the code is going and what it is doing.

Now that the SQL looks good, I need to execute the SQL commands. I’ve got a version of queryDocuments() in the driver:

    queryDocuments: function (client, collectionRef, query, callback) {
        client.queryDocuments(collectionRef._self, query).toArray(callback);
    },
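The listing above takes a callback, while the getAllItems() trigger chains .then() and .catch() on the result. One way to reconcile the two is a promise-returning variant. This is a minimal sketch and not necessarily what the driver in the repository does; it assumes toArray() invokes its callback with an error first and the documents second.

```javascript
// Hypothetical promise-returning variant of the driver method.
function queryDocuments(client, collectionRef, query) {
    return new Promise((resolve, reject) => {
        // toArray() is assumed to call back with (error, documents).
        client.queryDocuments(collectionRef._self, query).toArray((err, documents) => {
            if (err) {
                reject(err);
            } else {
                resolve(documents);
            }
        });
    });
}
```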

This is then used in the HTTP trigger getAllItems() method. I’ve included the whole method here for you:

function getAllItems(req, res) {
    // Adjust the query parameters for DocumentDB
    if (req.query.$filter) {
        while (/updatedAt [a-z]+ '[^']+'/.test(req.query.$filter)) {
            var re = new RegExp(/updatedAt ([a-z]+) '([^']+)'/);
            var results = re.exec(req.query.$filter);
            var newDate = moment(results[2]).unix();
            var newString = `_ts ${results[1]} ${newDate}`;
            req.query.$filter = req.query.$filter.replace(results[0], newString);
        }
    }
    // Remove the updatedAt from the request
    if (req.query.$select) {
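        // Drop the field by name so no stray comma is left behind when it is listed first
        req.query.$select = req.query.$select.split(',').filter((field) => field.trim() !== 'updatedAt').join(',');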
    }

    // DocumentDB doesn't support SKIP yet, so we can't do TOP either
    var query = OData.fromOData(
        settings.table,
        req.query.$filter,
        req.query.$orderby,
        undefined, //parseInt(req.query.$skip),
        undefined, //parseInt(req.query.$top),
        req.query.$select,
        req.query.$inlinecount === 'allpages',
        !!req.query.__includeDeleted);

    var sql = formatSql(OData.toOData(query), {
        containerName: settings.table,
        flavor: 'documentdb'
    });

    // Fix up the object so that the SQL object matches what DocumentDB expects
    sql[0].query = sql[0].sql;
    sql[0].parameters.forEach((value, index) => {
        sql[0].parameters[index].name = `@${value.name}`;
    });

    // Execute the query
    console.log(JSON.stringify(sql[0], null, 2));
    driver.queryDocuments(refs.client, refs.table, sql[0])
    .then((documents) => {
        documents.forEach((value, index) => {
            documents[index] = convertItem(value);
        });

        if (sql.length === 2) {
            // We requested $inlinecount == allpages.  This means we have
            // to adjust the output to include a count/results field.  It's
            // used for paging, which DocumentDB doesn't support yet.  As
            // a result, this is a hacky way of doing this.
            res.status(200).json({
                results: documents,
                count: documents.length
            });
        } else {
            res.status(200).json(documents);
        }
    })
    .catch((error) => {
        res.status(400).json(error);
    });
}
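The “fix up” step in the middle of that method can also be pulled out into a small pure function, which makes it easier to test in isolation. The sketch below builds the { query, parameters } object that DocumentDB expects without mutating the formatter output. The name toQuerySpec is hypothetical, and it assumes each parameter carries name and value properties.

```javascript
// Hypothetical helper: convert one formatter statement into a DocumentDB query spec.
// Assumes the statement looks like { sql: '...', parameters: [{ name, value }] }.
function toQuerySpec(statement) {
    return {
        query: statement.sql,
        parameters: (statement.parameters || []).map((parameter) => ({
            // DocumentDB parameter names must start with '@'.
            name: parameter.name.startsWith('@') ? parameter.name : `@${parameter.name}`,
            value: parameter.value
        }))
    };
}
```

The trigger could then pass toQuerySpec(sql[0]) to the driver instead of patching sql[0] in place.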

Wrapping Up

So, there you have it. A version of the Azure Mobile Apps service written with DocumentDB and executing in dynamic compute on Azure Functions.

Of course, I wouldn’t actually use this code in production. Firstly, I have not written any integration tests for it, and there are a bunch of corner cases that I would definitely want to test. DocumentDB doesn’t have good paging support yet, so you get all the records every time. I also haven’t checked that every OData method that can be converted into a SQL statement is supported by DocumentDB. Finally, and this is a biggie, the service has a “cold start” time. It’s not very much, but it can be significant. With a dedicated service, you spend that cold start time once. With a dynamic compute Azure Function, you can spend that time continually. This isn’t actually a problem with DocumentDB, since I am mostly passing through the REST calls (adjusted). However, it can become a problem when using other sources. One final note is that I keep all the records in memory, which can drive up the memory requirements (and hence cost) of the Azure Function on a per-execution basis.

Until next time, you can find the source code for this project on my GitHub repository.
