Thinking about Synchronisation in a Cloud-First World

As a mobile applications enthusiast from way back before Windows Mobile was a
thing (yeah, I mean the first time), I’ve always found data synchronisation to
be one of the challenges. It comes down to a trade-off between stale data and
user experience. Simple applications don’t worry about any form of data
synchronisation, instead relying on pulling whatever data they need, when they
need it. Unfortunately, these applications feel like the user is always waiting
on data or, worse, seeing no data or error messages when there’s no connection.
To fix what is essentially a user experience issue, application developers
often look at caching, or synchronising, data so that it can be made available
offline. This both improves performance, since data is read from what’s cached
on the device, and means the data can be accessed when offline. The inevitable
question is then how much data should be cached, and how that data should be
synchronised.

There is no silver bullet when it comes to identifying what data needs to be
cached on the device. Some applications only cache data that the user has chosen
to look at, allowing them to come back and review the data at a later point
without having to request it again. Other applications will proactively cache
all data related to the current user. Whilst this seems like a good idea
initially, as the data related to a user increases, so does the time taken to
complete the initial, and every subsequent, synchronisation. In addition, the
logic to retrieve all data related to a user can grow in complexity, often
resulting in more data than necessary being retrieved to ensure no data is
omitted.

Data synchronisation used to be an important topic, with several attempts
being made by Microsoft to assist developers. For example, there was a Windows
Mobile client for Merge Replication, then the Microsoft Sync Framework, and
more recently the Mobile App Service has a limited form of data synchronisation.
Unfortunately none of these methods is well supported, nor are they optimised
to take advantage of some of the benefits of the cloud.

Let’s look at the most recent attempt by Microsoft to provide a
synchronisation framework for mobile apps: Offline Data Sync for Azure Mobile
Apps. The basic premise is that the application
defines a table, or a query on a table, that will be pulled into a local SQLite
database. Local changes can be made to the data. Data can then be synchronised
by pulling any server side changes and pushing local changes to the server. The
architecture has the mobile application connecting to a service, which then
either retrieves data from the database, or submits changes to the database.
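
To make the pull/push premise a bit more concrete, here’s a rough sketch of the
idea. It is not the Azure Mobile Apps client API: `FakeServer`, `LocalStore`,
the `dirty` flag and the version counter are all hypothetical names I’ve made
up for illustration, and there is no conflict handling (the last write simply
wins).

```python
import sqlite3


class FakeServer:
    """Hypothetical stand-in for the remote service and its database."""

    def __init__(self):
        self.rows = {}    # id -> (text, version)
        self.version = 0  # bumped on every server-side write

    def upsert(self, row_id, text):
        self.version += 1
        self.rows[row_id] = (text, self.version)

    def changes_since(self, version):
        return [(i, t, v) for i, (t, v) in self.rows.items() if v > version]


class LocalStore:
    """Local SQLite cache; 'dirty' marks edits not yet sent to the server."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE items (id TEXT PRIMARY KEY, text TEXT, dirty INTEGER)")
        self.last_pulled = 0  # highest server version seen so far

    def edit(self, row_id, text):
        # Local change: stored straight away and flagged for a later push.
        self.db.execute(
            "INSERT OR REPLACE INTO items VALUES (?, ?, 1)", (row_id, text))

    def sync(self, server):
        # Push first so the pull below never overwrites an unsent local edit.
        dirty = self.db.execute(
            "SELECT id, text FROM items WHERE dirty = 1").fetchall()
        for row_id, text in dirty:
            server.upsert(row_id, text)
            self.db.execute(
                "UPDATE items SET dirty = 0 WHERE id = ?", (row_id,))
        # Pull only what changed on the server since the previous pull.
        for row_id, text, version in server.changes_since(self.last_pulled):
            self.db.execute(
                "INSERT OR REPLACE INTO items VALUES (?, ?, 0)", (row_id, text))
            self.last_pulled = max(self.last_pulled, version)

    def all(self):
        return dict(self.db.execute("SELECT id, text FROM items").fetchall())
```

Note that every call to `sync` still goes through the service to the database,
in both directions, which is the part of the architecture worth questioning.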

Thinking about how a cloud application should scale, there are a number of
issues with this architecture:

  • If you look at how to scale cloud applications, having any services which
    connect directly to the database can be a source of bottlenecks. I’m
    definitely not advocating for having no relational database, just a bit of separation
    between the service tier and the database, such that most service calls don’t
    block waiting for the database to be available.
  • Connecting directly to services is a comparatively slow way to retrieve
    data – it involves querying the database, processing the data into the
    format to be returned, and then returning it directly from the service
    instance. Compare this to retrieving the same data, pre-fetched and
    available via a CDN. The latter is going to be significantly faster, will
    cut down on bandwidth costs, and will reduce load on both services and
    database, since they no longer have to do work for every request for data.
  • Changes submitted to the services have to be applied directly to the
    database – if the database is offline, or otherwise unavailable, the service
    call will fail. Even if the data can be written immediately, there is latency
    involved which increases the execution time of the service. This means more
    scaling of the service and slower responses back to the mobile application.
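
As a rough picture of what “pre-fetched and available via a CDN” could look
like, here’s a minimal sketch. The assumptions are mine: a plain dict stands in
for blob storage, and each file is named after a hash of its content so that a
published file never changes. `publish_snapshot` and `fetch_snapshot` are
hypothetical helpers, not part of any Azure SDK.

```python
import hashlib
import json


def publish_snapshot(rows, blob_store):
    """Serialise query results once and store them under a content-derived name."""
    body = json.dumps(rows, sort_keys=True, separators=(",", ":")).encode("utf-8")
    name = "snapshots/" + hashlib.sha256(body).hexdigest()[:16] + ".json"
    # Same content always maps to the same name, so an existing file is never
    # rewritten; changed data simply results in a new file.
    blob_store.setdefault(name, body)
    return name


def fetch_snapshot(name, blob_store):
    """What a client (via the CDN) would do: read the file, no database involved."""
    return json.loads(blob_store[name])
```

The service and database do the work once per change to the data rather than
once per request. The client still needs some way to discover the latest file
name, which this sketch leaves out.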

Reading between the lines you might notice that there are two areas that I
want to investigate to see if there’s an opportunity to improve data
synchronisation for cloud applications:

  1. Data retrieval: If we take the stance that the services won’t retrieve data
    directly from the database and return it to the mobile application, we need to
    think about how we can pre-fetch data and publish it out so it can be retrieved
    via the CDN. Of course, we don’t want to be publishing out the entire dataset
    into a single file which updates every time there is a change to the database,
    as this would undo any benefits of using the CDN. CDNs are optimised for
    immutable data; if you go trying to make changes to a file that’s already been
    published, you’ll run into a bunch of issues around caching of the data both in
    the CDN and on the client. Additionally, you don’t want the mobile application
    to download the entire dataset every time it needs to update. I’ll come back to
    talk more about the structure of the data but for the moment the assumption is
    that data will be retrieved and saved into an immutable file that will be placed
    in blob storage, making it accessible via the CDN.
  2. Data changes: Changes to data are a little harder, as the initial standpoint
    for any developer is that changes should be immediately sent to the database,
    otherwise the database will be out of date. One of the lessons learnt by a
    lot of social sites (as they represent some of the largest datasets available)
    is that they can rely on Eventual Consistency. In some data synchronisation
    cases eventual consistency isn’t an option, but for the other scenarios it’s
    important to consider the use of queues to break up the change workflow. For
    example, when a mobile app sends a change to the service, the service can
    simply queue the change and respond with a successful acknowledgement. When
    the change gets popped off the queue and applied to the database, a new
    message can go on the queue to indicate the result. When this message is
    popped off the queue, a notification is sent back to the mobile application
    indicating that the data has been saved to the database.
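
The queued change workflow in the second point can be sketched roughly as
follows. This is an in-memory illustration only: `ChangePipeline` and its
method names are hypothetical, two `deque`s stand in for real cloud queues, and
I’ve assumed a separate queue for the result messages, which is just one way to
read the description.

```python
from collections import deque


class ChangePipeline:
    def __init__(self):
        self.changes = deque()   # changes waiting to be applied
        self.results = deque()   # outcome messages waiting to be sent
        self.database = {}       # stand-in for the real database
        self.notifications = []  # what would be pushed back to devices

    def submit(self, device_id, key, value):
        """Service endpoint: queue the change and acknowledge immediately."""
        self.changes.append((device_id, key, value))
        return {"status": "accepted"}  # the database has not been touched yet

    def apply_next_change(self):
        """Background worker: apply one change, then queue its outcome."""
        device_id, key, value = self.changes.popleft()
        self.database[key] = value
        self.results.append((device_id, key, "saved"))

    def notify_next_result(self):
        """Background worker: tell the device the change reached the database."""
        device_id, key, outcome = self.results.popleft()
        self.notifications.append((device_id, f"{key}: {outcome}"))
```

The service call returns as soon as the change is queued, so it keeps working
while the database is slow or offline. The app has to treat “accepted” and
“saved” as two separate states.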

It’s been a while since I’ve reviewed synchronisation logic but it’s become
evident that there are some interesting options that are now possible with the
Azure cloud platform. I want to visit both of these areas in much more
detail.
