Nick's .NET Travels

Continually looking for the yellow brick road so I can catch me a wizard....

Thinking about Synchronisation in a Cloud-First World

As a mobile applications enthusiast from way back before Windows Mobile was a thing (yes, I mean the first time), I’ve always found data synchronisation to be one of the challenges. The challenge comes down to a trade-off between stale data and user experience. Simple applications don’t worry about any form of data synchronisation, instead relying on pulling whatever data they need, when they need it. Unfortunately, these applications leave the user feeling like they are always waiting on data or, worse, seeing no data or error messages when there’s no connection. In order to fix what is essentially a user experience issue, application developers often look at caching, or synchronising, data so that it can be made available offline. This both improves performance, since data is read from what’s cached on the device, and means the data can be accessed when offline. The inevitable question is then how much data should be cached, and how it should be synchronised.

There is no silver bullet when it comes to identifying what data needs to be cached on the device. Some applications only cache data that the user has chosen to look at, allowing them to come back and review it at a later point without having to request it again. Other applications proactively cache all data related to the current user. Whilst this seems like a good idea initially, as the data related to a user grows, so does the time taken to complete the initial, and every subsequent, synchronisation. In addition, the logic to retrieve all data related to a user can grow in complexity, often resulting in more data than necessary being retrieved to ensure nothing is omitted.

Data synchronisation used to be an important topic, with several attempts being made by Microsoft to assist developers. For example, there was a Windows Mobile client for Merge Replication, there was the Microsoft Sync Framework and, more recently, the Mobile App Service has a limited form of data synchronisation. Unfortunately, none of these methods is well supported, nor are they optimised to take advantage of some of the benefits of the cloud.

Let’s look at the most recent attempt by Microsoft to provide a synchronisation framework for mobile apps – Offline Data Sync for Azure Mobile Apps. The basic premise is that the application defines a table, or a query on a table, that will be pulled into a local SQLite database. Local changes can be made to the data. Data can then be synchronised by pulling any server-side changes and pushing local changes to the server. In this architecture the mobile application connects to a service, which then either retrieves data from the database or submits changes to it.
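The pull and push cycle described above can be illustrated with a minimal sketch. This is not the Azure Mobile Apps API: `FakeServer`, `LocalStore`, the `dirty` flag and the version counter are all hypothetical names invented for illustration, and conflict handling is deliberately left out.

```python
import sqlite3


class FakeServer:
    """Hypothetical stand-in for the remote service and its database."""

    def __init__(self):
        self.rows = {}    # id -> (text, version)
        self.version = 0  # increases by one with every server-side change

    def upsert(self, row_id, text):
        self.version += 1
        self.rows[row_id] = (text, self.version)

    def changes_since(self, version):
        return [(i, t, v) for i, (t, v) in self.rows.items() if v > version]


class LocalStore:
    """Local SQLite cache on the device (in-memory here for the sketch)."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE items (id TEXT PRIMARY KEY, text TEXT, dirty INTEGER)"
        )
        self.last_version = 0  # highest server version already pulled

    def edit(self, row_id, text):
        # A local change is written to SQLite and flagged for the next push.
        self.db.execute(
            "INSERT OR REPLACE INTO items VALUES (?, ?, 1)", (row_id, text)
        )

    def push(self, server):
        # Send every locally changed row to the server, then clear the flags.
        dirty = self.db.execute(
            "SELECT id, text FROM items WHERE dirty = 1"
        ).fetchall()
        for row_id, text in dirty:
            server.upsert(row_id, text)
        self.db.execute("UPDATE items SET dirty = 0")
        return len(dirty)

    def pull(self, server):
        # Fetch only rows changed since the last pull and store them locally.
        changes = server.changes_since(self.last_version)
        for row_id, text, version in changes:
            self.db.execute(
                "INSERT OR REPLACE INTO items VALUES (?, ?, 0)", (row_id, text)
            )
            self.last_version = max(self.last_version, version)
        return len(changes)

    def all(self):
        return dict(self.db.execute("SELECT id, text FROM items").fetchall())
```

In this sketch a sync would call `push` before `pull`, so that pulled rows never overwrite local edits that have not yet been sent. Note that every call still goes through the service to the database, which is exactly the coupling discussed next.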

Thinking about how a cloud application should scale, there are a number of issues with this architecture:

  • If you look at how to scale cloud applications, having services that connect directly to the database can be a source of bottlenecks. I’m definitely not advocating doing away with the relational database, just a bit of separation between the service tier and the database, such that most service calls don’t block waiting for the database to be available.
  • Connecting directly to services is a very slow way to retrieve data – it involves querying the database, processing the data into the format to be returned, and then returning it directly from the service instance. Compare this to retrieving the same data, pre-fetched and available via a CDN. The latter is going to be significantly faster, will cut down on bandwidth costs, and will reduce load on both services and database, since they no longer have to do work for every request for data.
  • Changes submitted to the services have to be applied directly to the database – if the database is offline, or otherwise unavailable, the service call will fail. Even if the data can be written immediately, there is a latency involved which increases the execution time of the service. This means more scaling of the service and slower responses back to the mobile application.
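
To make the CDN comparison in the second point a little more concrete, here is a minimal sketch of pre-fetching data into a file that could be served statically. Naming the file after a hash of its content is an assumption of this sketch, not something the post prescribes, and `publish_snapshot` and the local output directory are hypothetical stand-ins for writing to blob storage.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def publish_snapshot(records, out_dir):
    """Write records to a file named after the hash of its content.

    Identical data always maps to the same file name, and changed data gets
    a new name, so a published file never has to be modified in place.
    """
    body = json.dumps(records, sort_keys=True, separators=(",", ":")).encode("utf-8")
    digest = hashlib.sha256(body).hexdigest()[:16]
    path = Path(out_dir) / f"snapshot-{digest}.json"
    if not path.exists():  # already published: nothing to do
        path.write_bytes(body)
    return path


if __name__ == "__main__":
    out_dir = tempfile.mkdtemp()
    first = publish_snapshot([{"id": 1, "name": "a"}], out_dir)
    second = publish_snapshot([{"id": 1, "name": "b"}], out_dir)
    print(first.name, second.name)  # two different file names
```

Because a changed dataset produces a new file rather than an update to an old one, neither the CDN nor the client ever holds a stale copy under a name that is still in use.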

Reading between the lines you might notice that there are two areas that I want to investigate to see if there’s an opportunity to improve data synchronisation for cloud applications:

  1. Data retrieval: If we take the stance that the services won’t retrieve data directly from the database and return it to the mobile application, we need to think about how we can pre-fetch data and publish it out so it can be retrieved via the CDN. Of course, we don’t want to be publishing out the entire dataset into a single file which updates every time there is a change to the database, as this would undo any benefits of using the CDN. CDNs are optimised for immutable data; if you try to change a file that’s already been published, you’ll run into a bunch of issues around caching of the data both in the CDN and on the client. Additionally, you don’t want the mobile application to download the entire dataset every time it needs to update. I’ll come back to talk more about the structure of the data, but for the moment the assumption is that data will be retrieved and saved into an immutable file that will be placed in blob storage, making it accessible via the CDN.
  2. Data changes: Changes to data are a little harder, as the initial standpoint for any developer is that changes should be sent to the database immediately, otherwise the database will be out of date. One of the lessons learnt by a lot of social sites (which hold some of the largest datasets available) is that they can rely on eventual consistency. In some data synchronisation cases eventual consistency isn’t an option, but for the other scenarios it’s important to consider the use of queues to break up the change workflow. For example, when a mobile app sends a change to the service, the service can simply queue the change and respond with a successful acknowledgement. When the change is popped off the queue and applied to the database, a new message can go on a queue to indicate the result. When this message is popped off the queue, a notification is sent back to the mobile application indicating that the data has been saved to the database.
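
The queued change workflow in the second point can be sketched as three small steps. This is only an illustration under stated assumptions: Python’s in-process `queue.Queue` stands in for a cloud queue, a plain dictionary stands in for the database, and the function names and message shapes are hypothetical.

```python
import queue


def accept_change(change, changes):
    """Service endpoint: queue the change and acknowledge straight away."""
    changes.put(change)
    return {"status": "accepted", "id": change["id"]}


def apply_next_change(changes, results, database):
    """Worker: pop one change, apply it to the database, queue the outcome."""
    change = changes.get_nowait()
    database[change["id"]] = change["value"]
    results.put({"id": change["id"], "saved": True})


def notify_next_result(results, send):
    """Notifier: pop one outcome and tell the mobile app the data was saved."""
    send(results.get_nowait())


if __name__ == "__main__":
    changes, results, database = queue.Queue(), queue.Queue(), {}
    print(accept_change({"id": "42", "value": "hello"}, changes))
    apply_next_change(changes, results, database)
    notify_next_result(results, print)
```

The point of the split is that `accept_change` never touches the database, so the service can respond even when the database is slow or unavailable; the database simply catches up later, which is where eventual consistency comes in.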

It’s been a while since I’ve reviewed synchronisation logic but it’s become evident that there are some interesting options that are now possible with the Azure cloud platform. I want to visit both of these areas in much more detail.
