Tuesday, December 23, 2008

PHP Advent 2008 Post: JSON is Everybody’s Friend

Security

Reposting something you wrote is a lot easier than coming up with original content. This article was originally posted as part of the 2008 PHP Advent Calendar

Last year I wrote some kind of feel-good, hippie crap for the PHP Advent Calendar. This year I got a haircut, took a shower, and am ready to tell you about something practical: working with JavaScript Object Notation (JSON) in PHP.

“JSON?!” you cry. “But that’s for JavaScript fruitcakes!” And you’d be right. But JSON, while valid JavaScript code, is actually a very effective serialized data format for exchanging data between applications written in a variety of languages. And as of PHP 5.2, we have native support for JSON encoding and decoding.

But I Love XML!

Who doesn’t? The thing is, for moving data around—not documents, but data—I think JSON is superior to XML. This is mainly because JSON tends to map better to the data models in most programming languages—arrays, objects, etc. JSON is serialized JavaScript, so it’s easy peasy representing these kinds of things. Moving between JSON and your language’s internal data models is usually just a single function call. Smarter people than me can debate the pros and cons of XML and JSON in various scenarios, but for a lot of what I do, JSON does the same thing and requires less work.

Get Your PHP Objects On

In PHP 5.2+, there are two functions that handle JSON: json_encode(), which takes a PHP data structure and returns a JSON string, and json_decode(), which takes a JSON string and converts it into a PHP data structure. Let’s take a look at json_encode() and how it handles converting a PHP object into JSON:

<?php
// make a simple object
$obj->body           = 'another post';
$obj->id             = 21;
$obj->approved       = true;
$obj->favorite_count = 1;
$obj->status         = NULL;

// convert into JSON
$json_obj = json_encode($obj);
echo $json_obj;

This will output the following JSON string:

{"body":"another post","id":21,"approved":true,"favorite_count":1,"status":null}

Now, let’s take that JSON and turn it back into PHP:

// convert back into PHP
$new_obj = json_decode($json_obj);
var_dump($new_obj);

And we’ll have the following PHP object:

object(stdClass)#2 (5) {
  ["body"]=>
  string(12) "another post"
  ["id"]=>
  int(21)
  ["approved"]=>
  bool(true)
  ["favorite_count"]=>
  int(1)
  ["status"]=>
  NULL
}

Notice that the data types are retained in the process. JSON supports strings, numbers, objects, arrays (kinda), booleans and NULL. Things like methods, constants, and non-public properties are dropped in the encoding process, as shown in the example below:

<?php
class Foo {
  const     ERROR_CODE = '404';
  public    $public_ex = 'this is public';
  private   $private_ex = 'this is private!';
  protected $protected_ex = 'this should be protected';

  public function getErrorCode() {
    return self::ERROR_CODE;
  }
}

$foo = new Foo
$foo_json = json_encode($foo
echo $foo_json

This outputs the following JSON:

{"public_ex":"this is public"}

Also note that when you decode JSON into a PHP object, the object type is always stdClass. If we use $foo_json from above…

$foo_new = json_decode($foo_json);
var_dump($foo_new);

…we’ll get back the following object:

object(stdClass)#4 (1) {
 ["public_ex"]=>
 string(14) "this is public"
}

Arrays, Indexed & Associative

Array support in JSON is limited to indexed arrays—those with sequential, numeric keys. Associative arrays are converted into an object with corresponding properties.

<?php
// here's an indexed array. JSON natively handles these
$arr = array('orange', 'yellow', 'green', 122);
$json_arr = json_encode($arr);
echo $json_arr."\n";

That gives us this JSON:

["orange","yellow","green",122]

And we can convert it back into PHP…

$new_arr = json_decode($json_arr);
var_dump($new_arr);

…without losing anything in the process.

array(4) {
  [0]=>
  string(6) "orange"
  [1]=>
  string(6) "yellow"
  [2]=>
  string(5) "green"
  [3]=>
  int(122)
}

With associative arrays, the array is converted to an object. Also note that all associative keys are converted into strings in the process:

// associative PHP arrays are converted into objects
$assoc = array('orange'=>'yellow', 'green'=>122, 9=>122);
$json_assoc = json_encode($assoc);

The encoded JSON will be {"orange":"yellow","green":122,"9":122}. Decoding that gives us this PHP object:

object(stdClass)#5 (3) {
  ["orange"]=>
  string(6) "yellow"
  ["green"]=>
  int(122)
  ["9"]=>
  int(122)
}

You can, however, convert the JSON object back into a PHP associative array by passing TRUE as the second param to json_decode() (thanks Richard Orelup).

Playing Well with Others

That’s all fine and good, but it might help to actually do something useful with all this encodin’ and decodin’. JSON is a really good choice for moving data between client-side JavaScript applications and server-side applications, because JSON is JavaScript code, and maps perfectly onto JavaScript’s data structures.

So, let’s contrive a simple application that passes a “counter” object back and forth between the client and server. The counter object stores information on how often the PHP server application and the JavaScript client application “touch” it. First, we’ll make a PHP script to serve up some JSON code to our JavaScript client. It will look for an action parameter in $_GET, perform the requested action, and serve up some JSON in return.

<?php
if ($_GET['action'] === "getinit") {

  // make the PHP object and add a PHP touch
  $counter->php_touches    = 1;
  $counter->js_touches    = 0;
  send_as_json($counter);

} elseif ($_GET['action'] === "updatedata") {
  // decode JSON
  $counter = json_decode($_POST['counter']);

  // increment php touch counter
  $counter->php_touches++;

  // serve up as JSON
  send_as_json($counter);

} else {
  send_as_json(false);
}


function send_as_json($obj) {

  // convert object to JSON
  $json = json_encode($obj);

  // set appropriate headers
  header('Content-Type: application/json');

  // output JSON string as body
  echo $json;
}

The getinit action will create a $counter object, initialize it with a couple of variables to track “touches” by PHP and JSON, and serve up the object as JSON via the send_as_json() function. The updatedata action will take JSON from the $_POST array, decode it into the $counter object, increment the php_touches property, and serve up the modified object as JSON.

In the send_as_json() function, note the header() call that sets the Content-Type to application/json. Setting the content type correctly is good practice, and will sometimes avoid unexpected security issues.

On the client side, we’ll make a short HTML page with some embedded JavaScript to interact with our PHP service.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
  <title>JSON PHP Example</title>
  <!-- Load the Google JS libs hosting API -->
  <script src="http://www.google.com/jsapi"></script>
  <!-- Load the newest jquery major version 1 from Google -->
  <script type="text/javascript" charset="utf-8">
    google.load("jquery", "1");
  </script>
  <!-- load the JSON2.js parser -->
  <script src="JSON2.js"></script>

  <script type="text/javascript" charset="utf-8">
    var counter;

    function processData(data, textStatus) {
      counter = JSON.parse(data);
      counter.js_touches++;
      $('#js-count').val( parseInt(counter.js_touches));
      $('#php-count').val( parseInt(counter.php_touches));
    }

    // when the DOM is ready, do this stuff
    $(document).ready( function() {
      $.get('jsonphp.php?action=getinit', processData, 'text');
      $('#increment-button').click( function() {
        counter.js_touches = parseInt( $('#js-count').val() );
        counter.php_touches = parseInt( $('#php-count').val() );
        var json = JSON.stringify(counter);
        $.post('jsonphp.php?action=updatedata', { 'counter':json }, processData, 'text');
      });
    });
  </script>

  <style type="text/css" media="screen">
    body, input {
      font-family: Baskerville, Georgia, "Times New Roman", serif;
    }
    table {
      border:1px solid #CCC;
      padding:.2em;
      margin: .2em;
    }
    td.label {
      text-align:right;
      font-weight:bold;
    }
  </style>
</head>
<body>
  <table>
    <tr>
      <td class="label">JavaScript Touches:</td>
      <td><input type="text" id="js-count" value="0" /></td>
    </tr>
    <tr>
      <td class="label">PHP Touches:</td>
      <td><input type="text" id="php-count" value="0" /></td>
    </tr>
  </table>
  <form>
    <input type="button" name="increment" value="Increment!" id="increment-button" />
  </form>
</body>
</html>

Rendered in the browser, the page will look something like this:

Screenshot

Note that we’re using the jQuery library (loaded from the Google JavaScript Libraries API) to handle DOM manipulation and Ajax calls. We’re also using the JSON2.js parser, available at JSON.org, to safely parse and create JSON. The default method for decoding JSON into a JavaScript object or array is eval(), but that leaves us open to all sorts of code injection issues.

Jumping into our JavaScript code, the $().ready() is the most interesting portion. The ready() method called on $(document) takes the anonymous function passed into it, and executes the function as soon as the document is “ready”—that is, when the DOM is completed and can be safely messed with. So, when the DOM is ready, two things will happen:

  • A GET request is made to jsonphp.php?action=getinit, and the results are passed as text to the processData() function

  • An event listener is assigned to the click event for the button with the id of increment-button. This listener will fire a function that:

    1. Grabs the values for the touch counts out of the form fields

    2. Assigns them to the counter object

    3. Encodes the counter object as JSON

    4. POSTs counter to the server

So, each time we click the “Increment!” button, the object is sent from the JavaScript application to the PHP application as JSON. Each side will increment its own counter in the object, and the JavaScript application will display those values to us in the page. If we want, we can even change the values, and the JavaScript application will bind the new values to the object before sending it to the server.

What’s It All Mean, Dave?

Metaphysically, I have no idea. But practically, JSON support in PHP means that our favorite server-side language is an even better glue language— something it’s been good at for a long time. Now, go forth and make the next mega-ajax-crowdsourced-mashup masterpiece.

Posted in JavaScript, jQuery, PHP by funkatron on 12/23 at 04:34 PM
(9) CommentsPost a comment

Thursday, November 06, 2008

The Funkatron.com PHPMAX 2008 US Tour

The Funkatron.com PHPMAX 2008 US Tour

The coming weeks are gonna be a little hectic for me, as I fly to Atlanta on the 12th to attend php|works / PyWorks 2008. I’ll be doing the following talks:

I’m pretty excited about php|works / PyWorks 2008, especially because we can interact with the Python community and learn more about each other’s tech. Cross-pollination is a Good Thing.

I’m flying back to IND on the 15th, but then I’m leaving again on the 16th for San Francisco and the Adobe MAX 2008 conference. I’m giving the following talk:

Originally this talk was going to be given by John Resig, creator of the jQuery library, but he ran into some time constraints and thoughtfully suggested me. Of course, the description still mentions his name, which likely means I’ll be tarred and feathered by the audience. This also looks like it may be the largest audience I’ve ever given a talk before: there are currently about 70 folks listed as attending, with ~300 capacity. So if I wet myself on stage, please forgive me.

Posted in AIR, JavaScript, jQuery, Python, PHP by funkatron on 11/06 at 06:29 PM
(2) CommentsPost a comment

Wednesday, October 15, 2008

Safely parsing JSON in JavaScript

Wear safety shoes

I love me some JSON. It saves me tons of parsing headaches when exchanging data between web services because it maps so well to concepts shared among most common programming languages. It’s super easy to take a PHP object, convert it to JSON, and then push it to a Javascript (or a Ruby, or a Python) app.

Because JSON is valid JavaScript code, the most common method for converting it into native JS objects is to just eval the JSON. This is an extremely bad idea, because it opens your app up to all sorts of code injection attacks. Even with “trusted” sources, a security failure on your source’s end, or just a disgruntled employee, could wreak havoc on your apps and your users. I’d recommend reading Douglas Crockford’s “JSON and Browser Security”. Go ahead; I’ll wait. Rockford is impatient

jQuery, which we’ll use for all our examples because it’s awesome, will in many cases automatically parse JSON responses for you. This, as we learned above, is a Bad Thing. The following Ajax methods will automatically parse JSON in jQ (as of 1.2):

  • jQuery.getJSON()always
  • jQuery.ajax()if type is ‘json’
  • jQuery.get()if type is ‘json’
  • jQuery.post()if type is ‘json’

So my rules of thumbs are:

  1. never, ever use $.getJSON()
  2. never, ever set the type option to ‘json.’

To force the issue, I set my type to ‘text’ in my ajax calls. For example:

<script type="text/javascript" charset="utf-8" src="/js/jquery.js"></script>
<script type="text/javascript" charset="utf-8">
    $.ajax('http://twitter.com/statuses/public_timeline.json', function(data, textStatus) {
        alert('Status is '+textStatus);
        alert('JSON data string is: '+data);
    }, 'text');     
</script>

In the example above, we’re including the jquery library with the first <script> tag, and then calling the jQuery.ajax() method in the second. We’re passing three parameters:

  1. the URL we’re pulling the JSON string from. In this case, it’s the Twitter public timeline
  2. an anonymous function that’s called when the request is successful
  3. the type of data we’re getting, as a string. Using ‘text’ ensures no extra processing is done on the response string

So this is great, but all we’ve got is a string of serialized data, which isn’t terribly useful. Thankfully, there’s a handy library at JSON.org that takes care of parsing JSON without using eval without using eval on non-JSON code1. The library gives us two methods: JSON.parse() for turning a JSON string into a JS object, and JSON.stringify() for turning a JS object into a JSON string. So let’s utilize JSON.parse() in our code, and actually do something with that data:

In the modified example above, the second script tag loads the JSON2 library. We then use the JSON.parse() method to turn the data string into a JavaScript object – in this case, and array of Twitter message objects. Then we iterate over the array, building a string of HTML for each entry and prepending it to the <body> tag (so the newest item is on top).

Note: If you’re using this code on a remotely-hosted html page (or loading it as a local file under Firefox 3), it won’t work, and if you check in your error console you’ll probably see a security warning. That’s because our $.get() call directly accesses the Twitter API hosted on Twitter.com, which is almost certainly not the domain your files are hosted on. When we try to do so, it violates the same-origin policy enforced by browsers. The only workaround that I think is safe is to set up some sort of proxy on your domain to pass requests – other approaches like JSONP rely on eval()ing the result, which is what we’re trying to avoid here. I’ll try to cover setting up a local domain proxy in a future post.

By handling JSON with a parser rather than just using eval(), we mitigate the risk of code injection. This helps us protect both our application and our users.


  1. Basically, JSON.parse() runs a regex search for code that appears to be defining a function or redefining prototypes or other kinds of stuff beyond simple data transmission, and guts those parts.

Posted in AIR, JavaScript, jQuery, InfoSec by funkatron on 10/15 at 11:24 AM
(2) CommentsPost a comment

Friday, September 19, 2008

Slides and code from my ZendCon08 talk on AIR+PHP

The Doom Beer

Update: Audio is now available, thanks to Kevin Hoyt.

ZendCon was pretty fun, although I was feeling under the weather most of the time. Still, I was able to offend lots of folks with my DearZend project, which was also the source for one of my three examples in my talk. You can get the code for my examples, and the slides in PDF, from here or from SlideShare:

You can also get the code for the DearZend client app and server app on GitHub. For those curious, I wrote the server app in CodeIgniter.


Posted in AIR, JavaScript, jQuery, PHP by funkatron on 09/19 at 05:09 PM
(1) CommentsPost a comment

Friday, August 22, 2008

The ugly side of developing AIR HTML apps

i will take over the world.

I’ve been developing with AIR for over a year now. It’s all been HTML/JavaScript apps, but the complexity of most of them has been pretty low. Spaz, the Twitter client I created, definitely isn’t simple, though – mostly because I’m kind of a crappy coder, but also because it just does a lot of stuff.

Lately I’ve been getting a little bummed about AIR, because I’m starting to bump into limitations of the platform more and more. What’s particularly frustrating is that some of these limitations are specific to HTML AIR app dev. So what follows is a couple of issues that have been particularly troublesome when developing Spaz and other HTML/JS AIR apps.

AIR gets greedy under OS X

Julien Viet, who has contributed significantly to Spaz, made a post recently about the base resources used by the AIR runtime. In it he finds that the basic “Hello World” sample app provided by Adobe used about 20MB of RAM, and hovered at about 2% CPU usage even when idling. I replicated these tests on my Macbook (2.2Ghz, 4GB) under OS X 10.5.4 and WinXP SP2, and had similar results:

AIR HTML app - Base CPU and memory usage (WinXP SP2) AIR HTML app - Base CPU and memory usage

I also created a Flex-based “Hello World” application to compare base usage if the Webkit renderer isn’t being used. Memory usage was slightly lower, but otherwise it was comparable to the HTML app.

AIR Flex app - Base CPU and memory usage (WinXP) AIR Flex app - Base CPU and memory usage

The constant CPU usage seems to be specific to OS X: WinXP will jump back and forth between 0% and 2%. This is consistent with my observations of CPU usage in Spaz as well: in OS X, we get 10-15% CPU usage constantly, but under XP and Vista, CPU usage is 0% while idling. Initial memory usage is also about 30% higher with Spaz under OS X than it is under Windows. Something is goofy with AIR on OS X.

That aside, I don’t think the memory and CPU usage of AIR are surprising, considering the technology. AIR is a full Flash runtime, and HTML apps add a Webkit renderer on top of that. The memory and CPU usage is in line — or a bit lower — than most SSBs. Using traditional browser tech to build a desktop app is going to be a tradeoff in terms of memory and CPU usage, just like many other technologies that make development easier. But for one reason or another, the expectation seems to be that AIR is a lightweight technology, and it’s really not.

AIR HTML apps are leaky

What is particularly problematic is that there seem to be memory leaks within the Webkit engine as used in AIR – or at very least, it’s pretty easy to leak memory. Apps like Spaz run into this problem because it’s constantly pulling in large amounts of new data and images, and is often left running for hours or days on end.

Some of this is almost certainly because you can get away with that kind of thing on a web site, where users are unlikely to leave a single page open for days on end. Even if a web site does causes a big leakage, the browser almost always gets blamed. The libraries and tools available for HTML/JS devs, being browser-focused, typically don’t pay much attention to long-term performance issues.

Essential tools, like profilers and step debuggers, are non-existent

But memory leaks are a problem for most desktop app developers. That’s why memory profiling tools are common in desktop frameworks. And Adobe certainly understands this to be the case with Flex, as it’s FlexBuilder 3 IDE ships with memory and peformance profiling tools, as well as a full-featured debugger with breakpoints et al. That’s perfect if I was writing something in Flex, but it is completely useless for HTML/JavaScript AIR apps. I was able to get Spaz to run from within Flex Builder with a tiny bit of hackery, but the profiler gets no information about what’s going on within Webkit.

Other common choices for debugging web apps aren’t available either: Firebug is Mozilla-specifc, and the Drosera debugger and Web Inspector tools can’t hook into AIR’s Webkit instances. The information on Drosera does say:

In WebKit there is a WebScriptDebugServer which is started when WebKit starts and listens for a DCOM connection. Once a connection is made it will begin to interact with Drosera via Drosera’s ServerConnection class. WebScriptDebugServer informs Drosera about every line of JavaScript executed.

I would assume that WebScriptDebugServer exists inside AIR’s Webkit, but AFAIK this has not been utilized in any way. Adobe does have a nice AIR-specific tool called the AIR Introspector in the SDK, and it does offer some very useful functionality, but it too lacks step debugging and profiling capabilities.

That leaves us with good old console logging as our only option for debugging AIR HTML apps. That’s okay for simpler stuff, but tracking down execution bugs in more complex applications becomes extremely tedious. And console logging is basically useless for memory profiling, as there’s no insight into the underlying operations of Webkit.

I’m proud of the work I’ve done on Spaz, but as the application has grown in complexity, I’ve found the limitations of HTML app dev in AIR more and more problematic. Adobe has consistently stated that the HTML dev side of AIR is as important as the Flash/Flex side. To quote Mike Chambers:

Adobe AIR is as much about JavaScript, HTML, CSS, etc… as it is about Flash / Flex. #

If that truly is the case, Adobe needs to commit to providing a toolset for HTML/JS developers comparable to what Flex developers have. Until that’s changes, Mozilla’s XULRunner and Prism platforms are looking more and more appealing.

Posted in AIR, Development, JavaScript, OS X, Spaz, XULRunner by funkatron on 08/22 at 04:10 PM
(7) CommentsPost a comment
Page 1 of 124 pages  1 2 3 >  Last »