Halfway through my expected list of posts on Azure Mobile Apps, I’m finally getting to Offline Sync. I thought I would split offline sync over a few posts. The first post will cover what offline sync is, how it works and some of the terminology. Next time, I will update my existing UWP app to use Offline Sync, including looking at conflict resolution. I’ll cover other platforms as we move forward.
Let’s start with some information first.
What is Offline Sync?
Great question. Normally, you will do LINQ (or similar) queries to get information from a backend OData service hosted in Azure Mobile Apps. You can do some really sophisticated queries. However, it suffers from two basic problems:
- You can’t do queries when there is no network available.
- The queries are slow and that results in a poor user experience.
Offline sync is designed to alleviate those concerns. To do that, it uses a local SQL store – CoreData on the iOS platform and SQLite on most other platforms. You perform queries against the local SQL store instead of the remote store. In the background (or via a user interaction), you perform a synchronization process to push changes to the server and bring down updated records.
To implement the synchronization logic, the client SDK uses a temporary table (called MS_TableOperations) in the local SQL store to store the information pending to be transmitted to the server.
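To make the idea concrete, here is a minimal sketch in Python rather than the SDK's own code. It assumes a hypothetical todoitem table and a pending_operations table standing in for MS_TableOperations; the column layout is my invention for illustration, not the SDK's real schema.

```python
import json
import sqlite3

# In-memory database standing in for the on-device SQLite store.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE todoitem (id TEXT PRIMARY KEY, text TEXT)")
# Hypothetical stand-in for the SDK's MS_TableOperations table.
db.execute(
    "CREATE TABLE pending_operations ("
    "seq INTEGER PRIMARY KEY AUTOINCREMENT, "
    "table_name TEXT, item_id TEXT, kind TEXT, payload TEXT)"
)

def insert_item(item_id, text):
    """Write the record locally and queue the change for a later push."""
    db.execute("INSERT INTO todoitem VALUES (?, ?)", (item_id, text))
    db.execute(
        "INSERT INTO pending_operations (table_name, item_id, kind, payload) "
        "VALUES (?, ?, ?, ?)",
        ("todoitem", item_id, "insert",
         json.dumps({"id": item_id, "text": text})),
    )

insert_item("a", "Buy milk")
insert_item("b", "Walk dog")

# Queries run against the local table, so they work with no network.
local_rows = db.execute("SELECT id, text FROM todoitem ORDER BY id").fetchall()
# The queue records what still has to reach the server, in order.
queued = db.execute(
    "SELECT item_id, kind FROM pending_operations ORDER BY seq"
).fetchall()
```

Every local write lands in two places: the table you query and the queue that is drained on the next push.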
What is Incremental Sync?
To perform this magical synchronization process, the server relies on a specific field – the UpdatedAt field in the models – to determine which records have been updated. Incremental Sync saves the UpdatedAt value for the last synchronized record and uses that to pull the latest records down. Only records with an UpdatedAt value greater than the last synchronized record will be retrieved from the backend.
For this reason, it is important that the updatedAt field (a DateTimeOffset field in each table that is to be synchronized) is indexed. The server needs to execute ORDER BY and WHERE clauses against it.
You, of course, don’t need to use incremental sync. If you don’t use it, all records are retrieved from the server during the sync process. This is great when you want to wipe out the table on the client and replace it with the server version, but in general it sucks for performance.
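Here is a rough sketch of the difference between an incremental pull and a full pull, again in Python. The SERVER_ROWS list, the query_server function and the LocalCache class are all hypothetical names for illustration; the real SDK tracks the last synchronized value for you.

```python
from datetime import datetime, timezone

def ts(day):
    """Helper: a UTC timestamp on the given day of January 2016."""
    return datetime(2016, 1, day, tzinfo=timezone.utc)

# Hypothetical server-side table; every row carries an updatedAt stamp.
SERVER_ROWS = [
    {"id": "a", "text": "Buy milk", "updatedAt": ts(1)},
    {"id": "b", "text": "Walk dog", "updatedAt": ts(2)},
    {"id": "c", "text": "Pay rent", "updatedAt": ts(3)},
]

def query_server(since=None):
    """Mimics WHERE updatedAt > since ORDER BY updatedAt."""
    rows = [r for r in SERVER_ROWS if since is None or r["updatedAt"] > since]
    return sorted(rows, key=lambda r: r["updatedAt"])

class LocalCache:
    def __init__(self):
        self.rows = {}
        self.last_sync = None  # updatedAt of the newest record pulled so far

    def pull(self, incremental=True):
        since = self.last_sync if incremental else None
        fetched = query_server(since)
        for row in fetched:
            self.rows[row["id"]] = row
            if self.last_sync is None or row["updatedAt"] > self.last_sync:
                self.last_sync = row["updatedAt"]
        return len(fetched)

cache = LocalCache()
first = cache.pull()                   # initial sync fetches everything
SERVER_ROWS[1] = {"id": "b", "text": "Walk the dog", "updatedAt": ts(4)}
second = cache.pull()                  # only the changed record comes down
full = cache.pull(incremental=False)   # no stored stamp: every record again
```

With three records the saving is trivial, but the second pull transfers one record where the full pull transfers all of them, and that gap grows with the table.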
What is Optimistic Concurrency?
When there are conflicts between the client and server, the server returns a 409 or 412 response and the developer has to handle conflict resolution. In the case of offline sync, we can’t test for conflicts until the sync happens, so we are optimistic about our chances – we just write the new record to the offline store as if the server isn’t there and ignore the conflict. When you finally push those changes to the server, you may have conflicts and have to handle them at that point. That’s Optimistic Concurrency.
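A small Python sketch of the idea follows. It assumes the server compares a version counter on each record and rejects a stale write with a 412; the Conflict class, the server_update function and the "client wins" policy are hypothetical choices for illustration, not the SDK's API.

```python
class Conflict(Exception):
    """Raised when the (simulated) server answers 412 Precondition Failed."""
    def __init__(self, server_item):
        super().__init__("412 Precondition Failed")
        self.server_item = server_item

# Hypothetical server state: one record with a version counter.
server = {"a": {"id": "a", "text": "Buy milk", "version": 1}}

def server_update(item):
    current = server[item["id"]]
    if item["version"] != current["version"]:
        raise Conflict(dict(current))   # the caller's copy is stale
    server[item["id"]] = {**item, "version": current["version"] + 1}
    return dict(server[item["id"]])

# The client edits its cached copy offline; nothing can fail at this point.
local = {"id": "a", "text": "Buy oat milk", "version": 1}

# Meanwhile another client changes the same record on the server.
server_update({"id": "a", "text": "Buy soy milk", "version": 1})

# The conflict only surfaces at push time.
try:
    server_update(local)
    outcome = "accepted"
except Conflict as conflict:
    # One possible policy: client wins. Adopt the server version and retry.
    local["version"] = conflict.server_item["version"]
    server_update(local)
    outcome = "resolved"
```

The local write never fails; all the conflict handling lives in the push path, which is why that is where your error handling needs to be.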
What is Collapse Logic?
Love the jargon yet? Let’s say you have a table with three records in it – A, B and C. You change A and then change B. When you push these changes to the server, first A will be changed and then B will be changed – the client SDK pushes the changes in the order that they were made. This helps when you want to deal with foreign keys. If you have two tables and make the changes in the right order on the client, that order is preserved on the server, so dependent keys get updated in the right order.
What happens if I change A, then change B, then change A again? In this case, the client SDK collapses the two changes to A into one change. This has a benefit for performance (you are doing 2 updates instead of 3), but it may destroy your nice foreign key ordering.
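The A, B, A example can be sketched in a few lines of Python. The enqueue function is hypothetical, and I am assuming the merged change keeps the queue position of the first change to A; the SDK's actual placement may differ.

```python
def enqueue(queue, item_id, change):
    """Append an update, or merge it into a pending update for the same record."""
    for op in queue:
        if op["id"] == item_id:
            op["changes"].update(change)  # collapse: reuse the existing slot
            return
    queue.append({"id": item_id, "changes": dict(change)})

queue = []
enqueue(queue, "A", {"text": "first edit"})
enqueue(queue, "B", {"text": "edit to B"})
enqueue(queue, "A", {"complete": True})   # collapses into the first A entry

push_order = [op["id"] for op in queue]
```

Two operations are pushed instead of three, but the second change to A now travels ahead of the change to B, which is exactly the ordering hazard described above.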
The Offline Sync Process
You actually know all the operations since it’s a standard OData process. First, the changes in the MS_TableOperations are pushed to the server. This is done in single operations. If you have 500 changes, that means 500 separate connections are made. If you are doing bulk operations, this is not the most performant method of doing changes. I’ll talk about the right way of doing bulk changes later on.
Each push operation is identical to its “live” non-offline-sync version. If you are creating a record, then a POST of the record is done and the full record (with the Id) is received and placed in the offline cache. Ditto for updates and deletes (but I’ll be talking about soft delete a little later).
Once the pushes are done, the pull is initiated. The client will send a request akin to the following:
GET /tables/todoitem?$filter=updatedAt%20ge%20value&$skip=0&$take=50&$orderBy=updatedAt%20asc&$includeDeleted=false HTTP/1.1
Here the string ‘value’ will be replaced by a standard ISO 8601 date-time stamp – something like
Let’s say you have thousands of records. If you execute the query without paging, then it is likely you will tie up your client process on the phone for a considerable period of time as you receive and process the data. To alleviate that and allow your mobile application to remain responsive, the client SDK implements paging. By default, 50 records will be requested for each paged operation.
In reality, this means that you will see one more request than you expect. Let’s take an example. Let’s say you are doing an initial synchronization of 120 records. What you will see is:
- Request #1 with $skip=0&$take=50 returning 50 records
- Request #2 with $skip=50&$take=50 returning 50 records
- Request #3 with $skip=100&$take=50 returning 20 records
- Request #4 with $skip=120&$take=50 returning 0 records
Why not stop at the third request? We expect this to be a live system. The OData subsystem is allowed to return fewer records than requested and will do so for a variety of reasons. For example, it may be configured with a maximum transfer size and the records won’t fit into the transfer buffer. The only sure way of knowing that you have received all the records is to request more and be told there are no more.
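The paging loop can be sketched like this in Python. The fetch_page function is a hypothetical stand-in for the server, and its max_page argument simulates a server that caps how many records it will return per request.

```python
def fetch_page(records, skip, take, max_page=None):
    """Hypothetical server: may return fewer rows than were requested."""
    limit = take if max_page is None else min(take, max_page)
    return records[skip:skip + limit]

def pull_all(records, page_size=50, max_page=None):
    received, requests = [], 0
    while True:
        # Skip exactly as many records as we have actually received.
        page = fetch_page(records, len(received), page_size, max_page)
        requests += 1
        if not page:          # only an empty page proves we are done
            break
        received.extend(page)
    return received, requests

records = list(range(120))
items, requests = pull_all(records)
capped_items, capped_requests = pull_all(records, max_page=30)
```

With 120 records and a page size of 50 the loop makes four requests, matching the list above. If the server caps pages at 30, a short page is normal and stopping at the first short page would lose data, so the loop keeps going until it gets nothing back.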
If you do a Pull operation with a dirty cache – i.e. there are some records in the MS_TableOperations table that have yet to be pushed – then the system will automatically push those records to the server for you. This is called Implicit Push. You should not rely on this as you are likely not expecting nor capturing conflict exceptions on a Pull operation – conflicts only occur on a Push. So do yourself a favor and always explicitly push.
You can also be proactive about pushing. Not everywhere has ubiquitous bandwidth. Sometimes, you only want to push your changes up to the server (and get their corresponding updates back). You don’t want to pull down new records unless you are on wifi, for example. Because the push and the pull are explicit, you can do the push without the pull.
I’ve mentioned soft delete previously but not really defined it. Let’s say you have two clients and both of them implement offline sync. CLIENT1 downloads items A, B, C and so does CLIENT2. CLIENT1 now deletes item A. It’s gone from the cache in CLIENT1 and it’s gone from the server. But what about CLIENT2? Since no updates are received when CLIENT2 does a sync, the record still exists in the cache of CLIENT2.
There are two ways to fix this:
- Do a full sync of the table without the benefit of incremental sync
- Implement soft delete
Soft delete adds another field to the table – the deleted field. This field is a boolean. Records that are deleted just have the deleted field set to true (and the updatedAt field is updated). Let’s take a look at what happens now:
- CLIENT1 sets deleted=true on record A
- CLIENT1 syncs – record A is sent to the server
- CLIENT2 syncs – record A is received with an updated deleted flag
- CLIENT2 removes the record from the local cache
When you request records from the server, records marked as deleted are automatically excluded from the results unless you explicitly ask for them. If you are accessing the data and not using the OData service (e.g. you are using SQL directly), then ensure you add AND deleted=false to your queries.
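Here is the soft delete flow as a Python sketch. The server dictionary and the soft_delete, query and sync functions are hypothetical, updatedAt is a plain counter, and I am assuming the sync pull explicitly asks for deleted rows, since otherwise CLIENT2 would never see the flag.

```python
# Hypothetical server table; updatedAt is a simple counter here.
server = {
    "A": {"id": "A", "deleted": False, "updatedAt": 1},
    "B": {"id": "B", "deleted": False, "updatedAt": 1},
    "C": {"id": "C", "deleted": False, "updatedAt": 1},
}

def soft_delete(item_id, stamp):
    """Flag the row instead of removing it, and bump updatedAt."""
    server[item_id].update(deleted=True, updatedAt=stamp)

def query(since=0, include_deleted=False):
    """Deleted rows are hidden unless the caller asks for them."""
    return [dict(row) for row in server.values()
            if row["updatedAt"] > since
            and (include_deleted or not row["deleted"])]

def sync(cache, since):
    """Incremental pull; returns the newest updatedAt seen."""
    newest = since
    for row in query(since, include_deleted=True):
        if row["deleted"]:
            cache.pop(row["id"], None)   # drop it from the local cache
        else:
            cache[row["id"]] = row
        newest = max(newest, row["updatedAt"])
    return newest

client2 = {}
mark = sync(client2, 0)          # CLIENT2 downloads A, B and C
soft_delete("A", stamp=2)        # CLIENT1's delete reaches the server
mark = sync(client2, mark)       # CLIENT2 receives A with deleted=true
remaining = sorted(client2)
visible = sorted(row["id"] for row in query())
```

Record A is still physically present on the server after all of this, which is why the flagged rows pile up until somebody purges them.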
With soft delete you still enjoy incremental sync and the performance and network bandwidth benefits that come with it, but at a price. You have to purge deleted records yourself. I’ll have another post on dealing with this eventuality.
Next time I’ll be implementing offline sync in my UWP app, so watch for that.