Wednesday, October 15, 2008
Safely parsing JSON in JavaScript
I love me some JSON. It saves me tons of parsing headaches when exchanging data between web services because it maps so well to concepts shared among most common programming languages. It’s super easy to take a PHP object, convert it to JSON, and then push it to a JavaScript (or a Ruby, or a Python) app.
Because JSON is valid JavaScript code, the most common method for converting it into native JS objects is to just eval the JSON. This is an extremely bad idea, because it opens your app up to all sorts of code injection attacks. Even with “trusted” sources, a security failure on your source’s end, or just a disgruntled employee, could wreak havoc on your apps and your users. I’d recommend reading Douglas Crockford’s “JSON and Browser Security”. Go ahead; I’ll wait. 
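To make the risk concrete, here is a minimal sketch comparing the two approaches on a hostile response. It assumes an environment that provides JSON.parse() (natively, or through a library such as json2.js). The payload string and the sideEffects array are made up for illustration.

```javascript
// Records anything the "data" manages to do while being evaluated.
var sideEffects = [];

// Hypothetical hostile response: valid JavaScript, but not valid JSON,
// because the "id" value is a function call rather than a literal.
var payload = '{"user": "mallory", "id": sideEffects.push("injected code ran")}';

// eval() executes the embedded call while building the object.
var viaEval = eval('(' + payload + ')');

// A real JSON parser refuses the same text instead of running it.
var parseError = null;
try {
  JSON.parse(payload);
} catch (e) {
  parseError = e;
}
```

After this runs, sideEffects holds one entry (the injected call executed under eval), while JSON.parse() threw a SyntaxError without running anything.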
jQuery, which we’ll use for all our examples because it’s awesome, will in many cases automatically parse JSON responses for you. This, as we learned above, is a Bad Thing. The following Ajax methods will automatically parse JSON in jQ (as of 1.2):
- jQuery.getJSON() – always
- jQuery.ajax() – if dataType is ‘json’
- jQuery.get() – if type is ‘json’
- jQuery.post() – if type is ‘json’
So my rules of thumb are:
- never, ever use $.getJSON()
- never, ever set the type option (dataType in $.ajax()) to ‘json’
To force the issue, I set my type to ‘text’ in my ajax calls. For example:
<script type="text/javascript" charset="utf-8" src="/js/jquery.js"></script>
<script type="text/javascript" charset="utf-8">
$.get('http://twitter.com/statuses/public_timeline.json', function(data, textStatus) {
  alert('Status is '+textStatus);
  alert('JSON data string is: '+data);
}, 'text');
</script>
In the example above, we’re including the jQuery library with the first <script> tag, and then calling the jQuery.get() method in the second. We’re passing three parameters:
- the URL we’re pulling the JSON string from. In this case, it’s the Twitter public timeline
- an anonymous function that’s called when the request is successful
- the type of data we’re getting, as a string. Using ‘text’ ensures no extra processing is done on the response string
So this is great, but all we’ve got is a string of serialized data, which isn’t terribly useful. Thankfully, there’s a handy library at JSON.org that takes care of parsing JSON without using eval() on non-JSON code [1]. The library gives us two methods: JSON.parse() for turning a JSON string into a JS object, and JSON.stringify() for turning a JS object into a JSON string. So let’s use JSON.parse() in our code, and actually do something with that data:
<script type="text/javascript" charset="utf-8" src="/js/jquery.js"></script>
<script type="text/javascript" charset="utf-8" src="/js/JSON2.js"></script>
<script type="text/javascript" charset="utf-8">
$.get('http://twitter.com/statuses/public_timeline.json', function(data, textStatus) {
  alert('Status is '+textStatus);
  alert('JSON data string is: '+data);
  // this will give us an array of objects
  var public_tweets = JSON.parse(data);
  // iterate over public_tweets
  for(var x=0; x < public_tweets.length; x++) {
    var twt = public_tweets[x];
    var elm = '<div class="tweet" id="'+twt.id+'"> \
      <a href="'+twt.user.url+'"><img src="'+twt.user.profile_image_url+'" /></a> \
      <div class="tweet-text">'+twt.text+'</div> \
      </div>';
    $('BODY').prepend(elm);
  }
}, 'text');
</script>
In the modified example above, the second script tag loads the JSON2 library. We then use the JSON.parse() method to turn the data string into a JavaScript object: in this case, an array of Twitter message objects. Then we iterate over the array, building a string of HTML for each entry and prepending it to the <body> element (so the newest item is on top).
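One caveat about the loop above: JSON.parse() guarantees that the response is only data, but the strings inside that data still come from strangers, and concatenating them straight into HTML lets a tweet that contains markup inject elements into the page. Here is a minimal sketch of one way to reduce that risk. The escapeHtml() and buildTweetHtml() helpers are hypothetical names of my own, not part of jQuery or the JSON library, and the sample tweet is made up.

```javascript
// Hypothetical helper: replace the characters that are special in HTML
// text and attribute values with their entity equivalents.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical helper: build the same markup as the loop above, but with
// every field escaped first. Note that escaping alone does not vet URL
// schemes, so a javascript: link in user.url would still need checking.
function buildTweetHtml(twt) {
  return '<div class="tweet" id="' + escapeHtml(twt.id) + '">' +
    '<a href="' + escapeHtml(twt.user.url) + '">' +
    '<img src="' + escapeHtml(twt.user.profile_image_url) + '" /></a>' +
    '<div class="tweet-text">' + escapeHtml(twt.text) + '</div>' +
    '</div>';
}

// Made-up tweet whose text tries to smuggle in a script tag.
var sampleTweet = {
  id: 1,
  text: '<script>alert("x")</script>',
  user: { url: 'http://example.com/', profile_image_url: 'http://example.com/a.png' }
};
var safeHtml = buildTweetHtml(sampleTweet);
```

In the loop, the string built by buildTweetHtml(twt) would take the place of elm before it is passed to prepend().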
Note: If you’re using this code on a remotely-hosted HTML page (or loading it as a local file under Firefox 3), it won’t work, and if you check your error console you’ll probably see a security warning. That’s because our $.get() call directly accesses the Twitter API hosted on Twitter.com, which is almost certainly not the domain your files are hosted on. When we try to do so, it violates the same-origin policy enforced by browsers. The only workaround that I think is safe is to set up some sort of proxy on your domain to pass requests. Other approaches like JSONP rely on executing the response as a script, which is what we’re trying to avoid here. I’ll try to cover setting up a local domain proxy in a future post.
By handling JSON with a parser rather than just using eval(), we mitigate the risk of code injection. This helps us protect both our application and our users.
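For reference, the two methods can be exercised on their own, outside of any Ajax call. This is a minimal sketch that assumes JSON.parse() and JSON.stringify() are available (natively, or via the JSON library); safeParse() is a hypothetical convenience wrapper of my own, and the tweet object is made up.

```javascript
// Hypothetical helper: return the parsed value, or the given fallback
// when the text is not valid JSON (JSON.parse() throws in that case).
function safeParse(text, fallback) {
  try {
    return JSON.parse(text);
  } catch (e) {
    return fallback;
  }
}

var tweet = { id: 42, text: 'hello', user: { url: 'http://example.com/' } };

var serialized = JSON.stringify(tweet);              // JS object -> JSON string
var restored = safeParse(serialized, null);          // JSON string -> JS object
var rejected = safeParse('alert("not JSON")', null); // code, not data -> fallback
```

Wrapping the parse in try/catch means a malformed or hostile response ends up as a value you can test for, rather than an uncaught exception in the middle of a callback.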
[1] Basically, the library’s JSON.parse() uses regular expressions to check that the string contains nothing but JSON tokens: no function definitions, no prototype changes, nothing beyond simple data. Only text that passes the check is handed to eval(); anything else makes it throw a SyntaxError.
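The regex check can be sketched roughly as follows. This is a simplified illustration modeled on that approach, not the library’s actual source; looksLikeJson() and checkedParse() are hypothetical names, and this sketch rejects suspicious text outright rather than trying to clean it up.

```javascript
// Hypothetical sketch: true when the text contains only JSON-style tokens.
function looksLikeJson(text) {
  var stripped = String(text)
    // 1. Neutralize backslash escapes so they cannot hide a quote.
    .replace(/\\(?:["\\\/bfnrt]|u[0-9a-fA-F]{4})/g, '@')
    // 2. Collapse strings, true/false/null and numbers into a ']' token.
    .replace(/"[^"\\\n\r]*"|true|false|null|-?\d+(?:\.\d*)?(?:[eE][+\-]?\d+)?/g, ']')
    // 3. Drop opening brackets that follow a colon, a comma, or the start.
    .replace(/(?:^|:|,)(?:\s*\[)+/g, '');
  // 4. Only structural punctuation and whitespace may remain.
  return /^[\],:{}\s]*$/.test(stripped);
}

// Hypothetical wrapper: refuse anything that fails the token check
// before handing the text to the real parser.
function checkedParse(text) {
  if (!looksLikeJson(text)) {
    throw new SyntaxError('Text does not look like JSON');
  }
  return JSON.parse(text);
}
```

Anything that needs identifiers, parentheses or operators, such as a function definition, leaves characters behind in step 4 and is rejected before it ever reaches a parser.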

