Working with DocumentDb

In my last post, I introduced working with HTTP CRUD APIs with Azure Functions. My intent in all this is to create a proof of concept service that emulates the Azure Mobile Apps service, but using Azure Functions and the dynamic (or consumption-based) SKU. This means that you pay for the API only when it is being used, but it scales seamlessly as your needs grow. In addition, I’m going to make the backing store for this API a NoSQL store based on another Azure resource – DocumentDb.

Fortunately for me, DocumentDb has a nice Node.js driver. I’m going to promisify the callback-based SDK with bluebird. There are a number of samples available for the DocumentDb driver. For instance, here is my docdb-driver/database.js file:

module.exports = {
    createDatabase: function (client, databaseId, callback) {
        client.createDatabase({ id: databaseId }, callback);
    },

    deleteDatabase: function (client, databaseId, callback) {
        client.deleteDatabase(`dbs/${databaseId}`, callback);
    },

    findDatabaseById: function (client, databaseId, callback) {
        var qs = {
            query: 'SELECT * FROM root r WHERE r.id = @id',
            parameters: [
                { name: '@id', value: databaseId }
            ]
        };

        client.queryDatabases(qs).toArray(function (err, results) {
            if (err) {
                callback(err, null);
            } else {
                callback(null, (results.length === 0) ? null : results[0]);
            }
        });
    },

    listDatabases: function (client, callback) {
        client.readDatabases().toArray(callback);
    },

    readDatabase: function (client, database, callback) {
        client.readDatabase(database._self, callback);
    },

    readDatabases: function (client, databaseId, callback) {
        client.readDatabase(`dbs/${databaseId}`, callback);
    }
};

These helpers are callback-based rather than promise-based, so my docdb-driver/index.js file wraps each one with bluebird's Promise.promisify:

var Promise = require('bluebird');
var collection = require('./collection');
var database = require('./database');
var docops = require('./document');

var dbCache = {};

var createDatabase = Promise.promisify(database.createDatabase);
var findDatabaseById = Promise.promisify(database.findDatabaseById);

function ensureDatabaseExists(client, database) {
    if (database in dbCache) {
        return Promise.resolve(dbCache[database]);
    }

    return findDatabaseById(client, database).then((dbRef) => {
        if (dbRef == null) {
            return createDatabase(client, database).then((result) => {
                dbCache[database] = result;
                return result;
            });
        }
        dbCache[database] = dbRef;
        return dbRef;
    });
}

module.exports = {
    createCollection: Promise.promisify(collection.createCollection),
    listCollections: Promise.promisify(collection.listCollections),
    readCollection: Promise.promisify(collection.readCollection),
    readCollectionById: Promise.promisify(collection.readCollectionById),
    getOfferType: Promise.promisify(collection.getOfferType),
    changeOfferType: Promise.promisify(collection.changeOfferType),
    deleteCollection: Promise.promisify(collection.deleteCollection),

    createDatabase: createDatabase,
    deleteDatabase: Promise.promisify(database.deleteDatabase),
    ensureDatabaseExists: ensureDatabaseExists,
    findDatabaseById: findDatabaseById,
    listDatabases: Promise.promisify(database.listDatabases),
    readDatabase: Promise.promisify(database.readDatabase),
    readDatabases: Promise.promisify(database.readDatabases),

    createDocument: Promise.promisify(docops.createDocument)
};

I’m going to extend this driver package over time. Sometimes I use the straight API from the DocumentDb driver (see the readDatabase() method). Sometimes, however, I want to do something extra. The ensureDatabaseExists() method is an example of this. I want to find the database in the service and create it only if it doesn’t exist.
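To see the find-then-create flow of ensureDatabaseExists() without a live DocumentDb account, here is a minimal sketch that runs against an in-memory stand-in. Two things here are assumptions rather than part of the driver above: makeFakeClient() is a hypothetical stub that only mimics the two client calls used, and Node's built-in util.promisify is swapped in for bluebird's Promise.promisify so the snippet runs with no packages installed.

```javascript
// Sketch only: util.promisify stands in for bluebird's Promise.promisify,
// and makeFakeClient() is a hypothetical in-memory stand-in for DocumentClient.
const { promisify } = require('util');

function makeFakeClient() {
    const databases = [];
    return {
        createCalls: 0,
        // Mimics client.createDatabase(body, callback)
        createDatabase(body, callback) {
            this.createCalls += 1;
            const db = { id: body.id, _self: `dbs/${body.id}/` };
            databases.push(db);
            setImmediate(() => callback(null, db));
        },
        // Mimics client.queryDatabases(querySpec).toArray(callback)
        queryDatabases(querySpec) {
            const wanted = querySpec.parameters[0].value;
            return {
                toArray(callback) {
                    const results = databases.filter((db) => db.id === wanted);
                    setImmediate(() => callback(null, results));
                }
            };
        }
    };
}

// Callback-style helpers in the same shape as docdb-driver/database.js
function createDatabaseCb(client, databaseId, callback) {
    client.createDatabase({ id: databaseId }, callback);
}

function findDatabaseByIdCb(client, databaseId, callback) {
    const qs = {
        query: 'SELECT * FROM root r WHERE r.id = @id',
        parameters: [{ name: '@id', value: databaseId }]
    };
    client.queryDatabases(qs).toArray((err, results) => {
        if (err) return callback(err, null);
        callback(null, results.length === 0 ? null : results[0]);
    });
}

const createDatabase = promisify(createDatabaseCb);
const findDatabaseById = promisify(findDatabaseByIdCb);
const dbCache = {};

// Look in the cache, then in the service, and create only as a last resort
function ensureDatabaseExists(client, databaseId) {
    if (databaseId in dbCache) {
        return Promise.resolve(dbCache[databaseId]);
    }
    return findDatabaseById(client, databaseId)
        .then((dbRef) => dbRef || createDatabase(client, databaseId))
        .then((dbRef) => {
            dbCache[databaseId] = dbRef;
            return dbRef;
        });
}

// Usage: the second call is answered from dbCache, so nothing is created twice
const demoClient = makeFakeClient();
ensureDatabaseExists(demoClient, 'AzureMobile')
    .then(() => ensureDatabaseExists(demoClient, 'AzureMobile'))
    .then((db) => console.log(`database ${db.id}, create calls: ${demoClient.createCalls}`));
```

The cache is keyed by database id, which matches the dbCache object in the driver. Against the real service the same flow applies, but the stub's return shapes are simplified.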

Back to the Azure Function I’m developing. DocumentDb mainly stores “documents” – JSON blobs of associated data. It organizes these documents into “collections” and collections into a “database”. In the Azure Mobile Apps equivalent, the collection would be a table and the individual rows or entities would be documents. My first requirement is to ensure that the database and collection are initialized properly (in todoitem/index.js):

var DocumentDb = require('documentdb');
var driver = require('../docdb-driver');

/**
 * Global Settings Object
 */
var settings = {
    host: process.env['DocumentDbHost'],
    accountKey: process.env['DocumentDbAccountKey'],
    database: 'AzureMobile',
    connectionPolicy: undefined,
    consistencyLevel: 'Session',
    pricingTier: 'S1',
    table: 'todoitem'
};

// Store any references we receive here as a cache
var refs = {
    initialized: false
};

/**
 * Routes the request to the table controller to the correct method.
 *
 * @param {Function.Context} context - the table controller context
 * @param {Express.Request} req - the actual request
 */
function tableRouter(context, req) {
    var res = context.res;
    var id = context.bindings.id;

    initialize(context).then(() => {
        switch (req.method) {
            case 'GET':
                if (id) {
                    getOneItem(req, res, id);
                } else {
                    getAllItems(req, res);
                }
                break;

            case 'POST':
                insertItem(req, res);
                break;

            case 'PUT':
                replaceItem(req, res, id);
                break;

            case 'DELETE':
                deleteItem(req, res, id);
                break;

            default:
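                res.status(405).json({ error: "Operation not supported", message: `Method ${req.method} not supported` });
        }
    }).catch((err) => {
        // Report initialization or handler failures instead of leaving the request hanging
        context.log(`[tableRouter] Request failed: ${err.message}`);
        res.status(500).json({ error: "Request failed" });
    });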
}

/**
 * Initialize the DocumentDb Driver
 * @param {Function.Context} context - the table controller context
 * @param {function} context.log - used for logging
 * @returns {Promise}
 */
function initialize(context) {
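    if (refs.initialized) {
        context.log('[initialize] Already initialized');
        // Short-circuit: the client, database, and collection are already cached in refs
        return Promise.resolve();
    }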

    // Log the host only; the account key is a secret and should not appear in logs
    context.log(`[initialize] Creating DocumentDb client ${settings.host}`);
    refs.client = new DocumentDb.DocumentClient(
        settings.host,
        { masterKey: settings.accountKey },
        settings.connectionPolicy,
        settings.consistencyLevel
    );

    context.log(`[initialize] EnsureDatabaseExists ${settings.database}`);
    return driver.ensureDatabaseExists(refs.client, settings.database)
        .then((dbRef) => {
            context.log(`[initialize] Initialized Database ${settings.database}`);
            refs.database = dbRef;
            return driver.listCollections(refs.client, refs.database);
        })
        .then((collections) => {
            context.log(`[initialize] Found ${collections.length} collections`);
            const collection = collections.find(c => { return (c.id === settings.table); });
            context.log(`[initialize] Collection = ${JSON.stringify(collection)}`);
            if (typeof collection !== 'undefined') return collection;
            context.log(`[initialize] Creating collection ${settings.table}`);
            return driver.createCollection(refs.client, settings.pricingTier, refs.database, settings.table);
        })
        .then((collectionRef) => {
            context.log(`[initialize] Found collection`);
            refs.table = collectionRef;
            refs.initialized = true;
            context.log('[initialize] Finished Initializing Driver');
        });
}

Let's take this in steps. Firstly, I set up the settings. The important things here are the DocumentDbHost and the DocumentDbAccountKey. If you have created a DocumentDb within the Azure Portal, click on the Keys menu item. The DocumentDbHost is the URI field and the DocumentDbAccountKey is the PRIMARY KEY field. If you are running the Azure Function locally, then you will need to set these as environment variables before starting the func host. If you are running the Azure Function within Azure, you need to make these App Settings. An example of setting these locally in PowerShell:

$env:DocumentDbHost = "https://mydocdb.documents.azure.com:443/"
$env:DocumentDbAccountKey = "fuCZuSomeLongStringaLNKjIiMSEyaojsP05ywmevI7K2yCY9dYLRuCQPd3dMnvg=="
func run test-func --debug
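If either variable is missing, the DocumentDb client is constructed with undefined values and the failure only shows up later, on the first request. A small guard can surface the problem immediately. This is a sketch under my own assumptions: readSettings() is a hypothetical helper, not something the function above already has.

```javascript
// Sketch: fail fast when the DocumentDb settings are missing.
// readSettings is a hypothetical helper, not part of the original code.
function readSettings(env) {
    const required = ['DocumentDbHost', 'DocumentDbAccountKey'];
    const missing = required.filter((name) => !env[name]);
    if (missing.length > 0) {
        throw new Error(`Missing environment variables: ${missing.join(', ')}`);
    }
    return {
        host: env.DocumentDbHost,
        accountKey: env.DocumentDbAccountKey
    };
}
```

Calling readSettings(process.env) at the top of todoitem/index.js would work the same way locally and in Azure, since App Settings are exposed to the function as environment variables.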

When you send a request with Postman (for example, a GET to http://localhost:7071/tables/todoitem), you will see the initialize() method get called. This method returns a Promise; the request continues only once that promise resolves. In the initialize() method, I short-circuit the work if initialization has already happened. If it has not, I fill in the refs object. This object is used by the individual CRUD operations, so it needs to be filled in. The client, database, and collection that we need are found or created. At the end, the promise resolves after the initialized flag has been set to true, so future calls are short-circuited.

There is a race condition here. If two requests come in to a “cold” function, they will both go through the initialization together, and the “create database” and “create collection” calls may be duplicated, causing an exception in one of the requests. I’m sure I could fix this, but it’s a relatively rare case. Once the database and collection are created, the possibility of the condition goes away.
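One possible fix, sketched here rather than taken from the code above, is to cache the in-flight promise so that concurrent requests share a single initialization. The names shareInFlight and doInitialize are hypothetical. This only helps requests handled by the same Node.js process; two separate function hosts starting cold at the same moment could still race.

```javascript
// Sketch: let concurrent callers share one in-flight initialization.
// shareInFlight and doInitialize are hypothetical names.
function shareInFlight(doInitialize) {
    let pending = null;

    return function initializeOnce() {
        if (pending === null) {
            pending = Promise.resolve()
                .then(doInitialize)
                .catch((err) => {
                    // Forget the failed attempt so a later request can retry
                    pending = null;
                    throw err;
                });
        }
        return pending;
    };
}
```

The wrapper keeps the promise itself rather than a boolean flag, so a second request that arrives while the first is still creating the database waits on the same promise instead of starting its own create calls.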

If you run this (either locally or within Azure Functions), you will see the following output in the log window:

[Screenshot: function-docdb]

If you’ve done something wrong, you will see the exception and you can debug it using the normal methods. Want a primer? I’ve written a blog post about it.

In the next post, I’ll cover inserting, deleting and updating records. Until then, check out the code on my GitHub repository.