Performance

For the purposes of this discussion, there are two types of applications on Facebook:

  • FBML Canvas applications, which are rendered by Facebook, using FBML hosted by you, the developer, on your server.
  • IFrame Canvas applications and websites using Facebook, which are rendered by the developer’s server without using Facebook as an intermediary, although certain XFBML tags are rendered by Facebook as iframes. Although IFrame Canvas applications appear on Facebook.com, they are architecturally very similar to Facebook applications that appear on non-Facebook sites.

The difference between the two application types is mainly in the display mechanism. FBML applications pass markup to Facebook for rendering, while other applications do not. Both types of applications retrieve user data from Facebook’s servers, and that is one area in which speed and efficiency can be improved with a few simple techniques.


Getting Data Faster

Optimizing the way you fetch user data can make a dramatic difference in the load times of your application and the burden on Facebook’s servers.

Here’s a diagram of the progression of an FBML Canvas page load:

Figure 1 - FBML Canvas Applications Information Flow

API calls can, in the worst cases, cause several trips over the network for steps 3 and 4, and each call also incurs the overhead of starting a new request on Facebook’s servers.

Figure 2 - Websites and IFrame Canvas Applications Information Flow

Websites and IFrame applications seem a little more complicated, but the important thing to remember for now is that you have a choice as to when you fetch the data; you can do it server side (steps 4 and 5 in Figure 2), or client side (steps 7 and 8). There are different optimization techniques for each, and different reasons to choose one approach or the other. Choose the approach that fits best with your development practices. If both approaches are equally acceptable to you, this document can help you choose based on performance characteristics.

Getting data quickly involves three main principles:

  • Combining multiple requests into one.
  • Reducing network round trips.
  • Caching data.

Use FBML and XFBML when you can

FBML and XFBML are HTML extensions used in Facebook applications. FBML is used in traditional Canvas applications and is rendered directly by Facebook; XFBML is used in IFrame applications and websites and is rendered in the browser by JavaScript, in some cases inside iframes. Social Plugins can be rendered using XFBML or iframes.

For the simplest data-fetching scenarios, you can use FBML or XFBML markup tags which both get and display data. For example, <fb:name> gets the user’s name and profile link and displays it.

The good news is that if you’re using FBML or XFBML, Facebook optimizes your data fetching: all of the data needed for an entire document’s FBML or XFBML tags is retrieved in one call to Facebook’s servers. The exception is XFBML tags that generate iframes, as some Social Plugins do: each of those tags requests its data separately.

Take a look at Figure 1 - FBML Canvas Applications Information Flow. The retrieval of FBML data happens between steps 5 and 6, after Facebook gets the markup back from your application. If all of your data requests are made through FBML, steps 3 and 4 are eliminated, minimizing total network round trips.

XFBML is different; any data requests are made from your browser using Facebook’s JavaScript after the page has been downloaded to the user.

XFBML can still require a round trip, but in some cases it can use the browser’s cache to avoid going over the network at all. The client library batches all the requests together before sending them over the wire. In mid-2010, Facebook’s JavaScript library will support browser caching, so that users who visit your application multiple times will benefit from faster page loads.

For more advanced scenarios, where you want to get certain data and display it in your own way, you’ll need to be a bit more clever about getting data.

Use FBML.

The old REST-style API approach is inefficient

Let’s start with a snippet of a canvas FBML application written in PHP. It’s a naïve implementation of an application, written using the old REST API, that simply displays the hometowns of twenty-five of the user’s friends.

$facebook = new Facebook($appapikey, $appsecret);
$user_id = $facebook->require_login();

$friends = $facebook->api_client->friends_get();
$friends_locations =
    $facebook->api_client->users_getInfo($friends, 'hometown_location');

// Greet the currently logged-in user!
echo "<p>Hello, <fb:name uid=\"$user_id\" useyou=\"false\" />!</p>";

// Print the hometowns of at most 25 of the logged-in user's friends,
// using the results of the friends.get and users.getInfo calls above
echo "<p>Friends' Towns:<br>";

$friends_locations = array_slice($friends_locations, 0, 25);
foreach ($friends_locations as $location) {
    if (!empty($location['hometown_location'])) {
        echo $location['hometown_location']['city'] . ', ' .
        $location['hometown_location']['state'] . '<br>';
    }
}
echo "</p>";

The two Facebook API calls, friends.get and users.getInfo, can take more than 50% of the loading time of the canvas page for this simple application. Optimizing those two calls could make a big difference.

Combining API calls

If you aren't using FBML or XFBML for a given data request, be sure to batch the data fetches. The rule of thumb is:

Combine as many requests as possible.

Combining server-side data fetches

You can use Facebook’s query language, FQL, to combine multiple API calls into one using the fql.query method.

Here's an example of combining the two requests into one FQL query, sent as a single HTTP request:

https://api.facebook.com/method/fql.query?access_token=...&query=SELECT%20hometown_location%20from%20user%20where%20uid%20in%20%28SELECT%20uid2%20from%20friend%20where%20uid1=1285036001%29    

Here's an example of making the same FQL query using the PHP client library:

$friends_locations = $facebook->api_client->fql_query(
    'SELECT hometown_location from user where uid in ' .
    '(SELECT uid2 from friend where uid1=' . $user_id . ')');

Using FQL to combine queries avoids multiple round trips from your server to Facebook’s servers, as well as the overhead of processing each request on Facebook’s side. In addition, Facebook’s servers can sometimes do more optimization when you provide more information about which data you need. If the same data is used in multiple queries, running them on the same machine in the same request allows that data to be cached and reused.
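
The raw HTTP form of the combined query shown above can be assembled with standard PHP functions. The following is a minimal sketch, not part of any Facebook library: build_fql_query_url() is a hypothetical helper name, and the access token is a placeholder.

```php
<?php
// Sketch: build the URL for one fql.query request.
// build_fql_query_url() is a hypothetical helper, not a Facebook API.
function build_fql_query_url($query, $accessToken)
{
    $params = array(
        'access_token' => $accessToken,
        'query'        => $query,
    );
    // PHP_QUERY_RFC3986 encodes spaces as %20 rather than "+".
    return 'https://api.facebook.com/method/fql.query?'
        . http_build_query($params, '', '&', PHP_QUERY_RFC3986);
}

$userId = 1285036001; // the example user ID from the URL above
$query  = 'SELECT hometown_location FROM user WHERE uid IN '
        . '(SELECT uid2 FROM friend WHERE uid1=' . (int) $userId . ')';

echo build_fql_query_url($query, 'ACCESS_TOKEN_PLACEHOLDER'), "\n";
```

Casting the user ID to an integer before concatenating it keeps unexpected input out of the query text.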

Virtually all API methods that fetch data have an equivalent (and often more flexible) FQL query.

In situations where the queries do not depend on each other, or do not neatly nest in FQL’s syntax, you may use fql.multiquery to send multiple queries at once.
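
As a rough sketch of how independent queries might be bundled for fql.multiquery, the helper below turns a map of query names to FQL strings into one JSON string. The helper name, the query names, and the assumption that the method accepts a single JSON-encoded map (passed in a parameter such as "queries") are illustrative and should be checked against the API reference.

```php
<?php
// Sketch: bundle independent FQL queries into one JSON-encoded map.
// build_multiquery_param() is a hypothetical helper; the parameter it
// feeds (assumed here to be called "queries") is an assumption.
function build_multiquery_param(array $namedQueries)
{
    foreach ($namedQueries as $name => $fql) {
        if (!is_string($name) || !is_string($fql)) {
            throw new InvalidArgumentException('Expected name => FQL string pairs');
        }
    }
    return json_encode($namedQueries);
}

$userId = 12345; // placeholder user ID
$queries = array(
    'friends'   => 'SELECT uid2 FROM friend WHERE uid1=' . $userId,
    'interests' => 'SELECT interests FROM user WHERE uid=' . $userId,
);

echo build_multiquery_param($queries), "\n";
```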

If you are making API calls other than queries, you can use batch.run to process up to twenty requests in one round-trip with Facebook’s servers.
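
Because batch.run is limited to twenty requests per round trip, a longer list of calls has to be split into groups first. A minimal sketch of that grouping follows; chunk_calls_for_batch() is a hypothetical helper and the call descriptions are placeholders, not real API payloads.

```php
<?php
// Sketch: split a list of pending calls into groups that fit one batch.run.
const MAX_CALLS_PER_BATCH = 20; // limit stated in the text above

// chunk_calls_for_batch() is a hypothetical helper, not a Facebook API.
function chunk_calls_for_batch(array $calls, $limit = MAX_CALLS_PER_BATCH)
{
    if ($limit < 1) {
        throw new InvalidArgumentException('Batch limit must be at least 1');
    }
    return array_chunk($calls, $limit);
}

// Placeholder call descriptions standing in for real API requests.
$pending = array();
for ($i = 1; $i <= 45; $i++) {
    $pending[] = 'call-' . $i;
}

foreach (chunk_calls_for_batch($pending) as $index => $batch) {
    printf("batch %d: %d calls\n", $index + 1, count($batch));
}
```

Each group would then be sent in its own batch.run round trip.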

Using FQL instead of the two API calls described above lowers the page load time by upwards of one second.

Use FQL instead of other old REST API methods.

Combining client-side data fetches

This section applies only to IFrame Canvas applications and websites using Facebook.

Instead of making requests from your servers to Facebook, you might choose to request that data from the user’s browser. That can look like the slower option, but using the FB.Data abstraction from Facebook’s JavaScript client library can make browser-side requests a net win, because it makes use of the browser cache.

The FB.Data API allows you to combine multiple FQL queries into one multiquery. You can pass FQL queries into FB.Data.query.

FB.Data combines queries into one multiquery on your behalf, making it easy to do the right thing.

Use FB.Data.

Caching data

Both IFrame Canvas applications and websites using Facebook have the option of requesting data from your server or from the user’s browser. Thinking through the caching implications will help you make that decision.

Be aware of potential legal issues about caching information about your users. Which user data you can cache on your servers, and for how long, is governed by Facebook Platform’s Developer Principles & Policies. Read over that policy before implementing any of these suggestions.

Server-side data caching

Users get the best experience if the data displayed back to them is up to date in real time. One way to ensure this is, of course, to request data from Facebook each time you need it.

A better way is to cache the data, and then subscribe to push notifications for all of your users when their data changes, using real time updates.

Assuming user data changes less frequently than the user accesses your application, using caching with push notifications is a good idea. It is also a good idea if caching data brings down the latency for your application load.

If you can’t set up a server to take realtime updates, you should request new data each time, or cache the data for 24 hours or so if your users tend to access multiple pages on your app per day.

Facebook does not currently provide a client library for caching data on your server; however, it is an active contributor to the open source caching utility memcached, and many developers find this to be a good choice.
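
The caching options described above (a time limit such as 24 hours, plus dropping entries when a real-time update arrives) can be sketched as a small read-through cache. This is an illustration only: the UserDataCache class is hypothetical, an in-process array stands in for memcached so the example runs anywhere, and what you may cache is still governed by the policy mentioned earlier.

```php
<?php
// Sketch: read-through cache with a time-to-live.
// UserDataCache is a hypothetical class; an array stands in for memcached.
class UserDataCache
{
    private $entries = array();
    private $ttl;
    private $clock;

    // $clock is any callable returning the current Unix time (injectable for tests).
    public function __construct($ttlSeconds, $clock = 'time')
    {
        $this->ttl = $ttlSeconds;
        $this->clock = $clock;
    }

    // Return cached data for $uid; call $fetch($uid) only on a miss or after expiry.
    public function get($uid, $fetch)
    {
        $now = call_user_func($this->clock);
        if (isset($this->entries[$uid]) && $this->entries[$uid]['expires'] > $now) {
            return $this->entries[$uid]['data'];
        }
        $data = call_user_func($fetch, $uid);
        $this->entries[$uid] = array('data' => $data, 'expires' => $now + $this->ttl);
        return $data;
    }

    // Drop one user's entry, for example when a real-time update reports a change.
    public function invalidate($uid)
    {
        unset($this->entries[$uid]);
    }
}

$cache = new UserDataCache(24 * 60 * 60); // roughly 24 hours
$fetchFromFacebook = function ($uid) {
    // Placeholder for a real API or FQL call.
    return array('uid' => $uid, 'hometown' => 'unknown');
};
$profile = $cache->get(42, $fetchFromFacebook);
echo $profile['uid'], "\n";
```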

To cache data, subscribe to realtime data updates.

Client-side data caching

Requesting data from the user’s browser may, at first glance, seem like a bad move; the latency between the user’s browser and Facebook’s servers is likely to be higher than the latency between your server and Facebook’s.

However, using XFBML and FB.Data.query will, as of mid-2010, have the additional advantage of using the user’s browser cache, so users’ subsequent visits to your pages can result in much faster loads.

If the nature of your data and your users’ usage of your application is likely to benefit from caching, consider using the client-side library. Be sure to test the speed of your application both when the cache is primed and when it isn’t.

Minimizing fan-out

Fan-out queries are those that access many different servers. Since each user’s data is likely to be on a different server, the classic example of a fan-out query is getting information about all of a user’s friends. Facebook usually gathers information for each friend in parallel, so the response time of a query is based on the slowest response time for a given friend. You can increase the responsiveness of a query by limiting the number of users. This is often appropriate in scenarios when your application page is only going to display a few items anyway.

Many queries have implicit limits. For example, only up to 4500 rows are typically returned from a query of the event_member FQL table. For large events, this could mean intersecting those results with other sets would return fewer results than it should.

It is important to test your queries on nodes with lots of connections: groups with lots of members, users with lots of friends, etc. Find people with 2500 friends to stress your queries. If they fail or are remarkably slow, consider using one of the following techniques to improve performance, as well as correctness.

Limits

Use limits in your FQL query, and do it as early as you can. Example:

$friends_locations = $facebook->api_client->fql_query(
    'SELECT hometown_location ' .
    'FROM user ' .
    'WHERE uid in ' .
       '(SELECT uid2 FROM friend WHERE uid1=' . $user_id . ' LIMIT 40) ' .
    'LIMIT 25');

This query limits the sub-query to 40 friends and the final result to 25 rows. Since not everyone enters a hometown location, some of those 40 friends will have nothing to display; the hope is that 40 friends provide enough hometowns to fill out the final list of 25.

Depending on the query, the earlier LIMIT clause may or may not be necessary. Rather than guessing, time the performance to see how fast the queries are running, preferably on an object that has lots of connections.
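
One simple way to time a query, rather than guess, is to wrap the call and measure it with microtime(). The sketch below assumes nothing about the Facebook client: time_call() is a hypothetical helper, and a stand-in closure replaces the real FQL call so the example runs without network access.

```php
<?php
// Sketch: measure how long a query-running callable takes, in milliseconds.
// time_call() is a hypothetical helper, not a Facebook API.
function time_call($fn)
{
    $start = microtime(true);
    $result = call_user_func($fn);
    $elapsedMs = (microtime(true) - $start) * 1000.0;
    return array($result, $elapsedMs);
}

// Stand-in for a real FQL call. In an application this closure would
// wrap the fql_query() call shown above.
$fakeQuery = function () {
    usleep(2000); // pretend the round trip takes about 2 ms
    return array('row1', 'row2');
};

list($rows, $ms) = time_call($fakeQuery);
printf("%d rows in %.1f ms\n", count($rows), $ms);
```

Running the same measurement with and without the inner LIMIT, several times each, gives a fairer comparison than a single run.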

According to informal measurements, for a user with around 150 friends, page load time dropped by about 100ms after adding limits to the query.

Smallest query first

Another important technique is to order your queries properly, especially those that intersect (join) two sets. Let’s say you want to find the names of a user’s friends who attended a particular event. That event could have an extremely large number of users attending. There are several ways you could write this query:

1. SELECT first_name 
   FROM user 
   WHERE uid IN 
      (SELECT uid2 FROM friend WHERE uid1={*user*})
   AND uid IN 
      (SELECT uid FROM event_member WHERE eid={*event*})

In this first query, the two sub-clauses in parentheses are peers – neither depends on the other. Both are executed in full, and the results intersected.

2. SELECT first_name 
   FROM user 
   WHERE uid IN 
       (SELECT uid2 FROM friend WHERE uid1={*user*} 
        AND uid IN 
            (SELECT uid FROM event_member WHERE eid={*event*}))

3. SELECT first_name 
   FROM user 
   WHERE uid IN 
       (SELECT uid FROM event_member WHERE eid={*event*}
        AND uid IN 
            (SELECT uid2 FROM friend WHERE uid1={*user*}))

In the last two queries, the sub-clauses are nested, and therefore the FQL engine will execute the innermost query, and then individually check the outer query for each result.

Query 3 is the best query because it selects the friends first. Users average about 100 friends, and are limited to only a few thousand friends; events can be much larger. Query 3 checks on average 100 friends to see if they attended the event. Query 2, in contrast, checks a potentially large number of event attendees to see whether each one is a friend.

In fact, the first and second queries may not even be correct if the event has more than a few thousand members, because the FQL engine imposes arbitrary limits on the amount of data returned from queries.
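
The cost difference between queries 2 and 3 can be illustrated with a small model of the nested evaluation described above: the inner set is produced first, and each of its rows is then checked against the outer condition. This is only a sketch of that description, not the FQL engine itself; nested_intersect() and the set sizes are hypothetical.

```php
<?php
// Sketch: model of "run the inner query, then check each row against the outer one".
// nested_intersect() is a hypothetical helper, not how FQL is implemented.
function nested_intersect(array $innerIds, array $outerIds)
{
    $outerLookup = array_flip($outerIds); // id => position, for fast membership tests
    $checks = 0;
    $matches = array();
    foreach ($innerIds as $id) {
        $checks++; // one outer check per inner row
        if (isset($outerLookup[$id])) {
            $matches[] = $id;
        }
    }
    return array('matches' => $matches, 'checks' => $checks);
}

$friends   = range(1, 100);    // a user with 100 friends
$attendees = range(51, 50050); // an event with 50,000 attendees, 50 of them friends

$friendsFirst   = nested_intersect($friends, $attendees); // shape of query 3
$attendeesFirst = nested_intersect($attendees, $friends); // shape of query 2

printf("friends first:   %d checks\n", $friendsFirst['checks']);   // 100 checks
printf("attendees first: %d checks\n", $attendeesFirst['checks']); // 50000 checks
```

Both orderings find the same 50 people in this model; only the amount of work differs. The model ignores the row limits mentioned above, which in practice can also make the attendees-first form return incomplete results.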

Write queries to return as few results as possible.

Request as few columns as possible

There is a big difference in performance between

SELECT about_me, activities, affiliations, birthday, books, current_location, 
education_history, first_name, has_added_app, hometown_location, 
hs_info, interests, is_app_user, last_name, locale, meeting_for, meeting_sex, 
movies, music, name, notes_count, pic, pic_with_logo, pic_big, 
pic_big_with_logo, pic_small, pic_small_with_logo, pic_square, 
pic_square_with_logo, political, profile_update_time, profile_url, 
proxied_email, quotes, relationship_status, religion, sex, 
significant_other_id, status, timezone, tv, wall_count, work_history 
FROM user 
WHERE uid in (SELECT uid2 FROM friend WHERE uid1=X)

and

SELECT about_me, activities, affiliations, birthday, books, first_name, 
has_added_app, interests, is_app_user, last_name, movies, music, pic_big, 
pic_square, profile_url, quotes, sex
FROM user 
WHERE uid in (SELECT uid2 FROM friend WHERE uid1=X)

Each field takes additional time to process; notably, because of Facebook's granular privacy, each could incur additional overhead for privacy checks for each user in the query.

Also, not all fields are created equal; some take more time to retrieve than others. For example, for the user table, retrieving lists of interests (e.g. music, tv) will take longer if the user has many interests.

Write queries to select as few columns as possible.

Preload FQL Query and Multiquery

This section applies to FBML canvas pages, but not to websites or IFrame canvas pages.

If you will always (or almost always) need the results of one or more queries in your page, preload an FQL query or multiquery. To preload an FQL query or multiquery, write a script to register a particular query with Facebook for a certain set of pages for your application. After you run the script, Facebook will POST the results of that query to your application's URL.

The FQL still executes on each page load, but preloading avoids the cost of an additional network round trip between your servers and Facebook's servers, so the time you save is proportional to the network distance. Facebook's API traffic is currently served from the East and West coasts of the US. If your server is on the West coast and its API traffic is served from the East, a round trip currently averages around 70ms.

In short, using preloaded FQL does not make the FQL itself faster. It only cuts down the need for extra API calls and related latency.

Usage

You need to define a rule for the preloaded FQL query. The rule has three parts:

  • name - The name of this rule.
  • pattern - A regular expression to match to the URL.
  • query - A parameterized FQL query to run.

Your application can set as many rules as you like. Facebook processes them as follows. For each rule whose pattern matches a URL snippet:

new_query = query with GET parameters substituted
post("fb_sig_" . name, run_query(new_query))

For instance, in our example, you may have a rule (in JSON) as:

{"interests" : 
    {"pattern": "view.*", 
     "query" : "SELECT interests FROM user WHERE uid={*user*}"
    }
}

To set these rules, you need to call admin.setAppProperties. Here's how you can do this with PHP:

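$fetch = array('interests' =>
               array('pattern' => 'view.*',
                     'query' => 'SELECT interests FROM user WHERE uid="{*user*}"'));
// admin.setAppProperties expects string or number values, so JSON-encode the rules first.
$facebook->api_client->admin_setAppProperties(array('preload_fql' => json_encode($fetch)));
// Read the property back to confirm it was stored.
$res = $facebook->api_client->admin_getAppProperties(array('preload_fql'));
var_dump($res);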

Regarding the json_encode for the $fetch variable, admin.setAppProperties expects its arguments as strings or numbers. See Application Properties for more details.

Also note there is a special parameter {*user*} that gives the active user, so if you want to get the friends of the current user who have authorized the application, try:

{"appfriends" : 
    {"pattern": ".*", 
     "query" : "SELECT uid FROM user WHERE is_app_user=\"1\" AND uid IN (SELECT uid2 FROM friend WHERE uid1={*user*})"
    }
}

{*user*} is the only special parameter. You can use any GET parameter in your calls. Wrap the parameter in {curly braces}.
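
To make the rule processing concrete, here is a rough local model of the matching and substitution steps described above. It is not Facebook's implementation: expand_preload_rules() is a hypothetical helper, the unanchored regular-expression match is an assumption, and GET values are substituted verbatim here, whereas a real system would need to escape them.

```php
<?php
// Sketch: model of preload rule matching and parameter substitution.
// expand_preload_rules() is hypothetical; it returns the substituted query
// text per POST parameter name instead of running the query.
function expand_preload_rules(array $rules, $urlSnippet, $userId, array $getParams)
{
    $posted = array();
    foreach ($rules as $name => $rule) {
        // "~" is used as the regex delimiter; patterns containing "~" would need escaping.
        if (!preg_match('~' . $rule['pattern'] . '~', $urlSnippet)) {
            continue;
        }
        // {*user*} is the one special parameter: the active user's ID.
        $query = str_replace('{*user*}', (string) (int) $userId, $rule['query']);
        // Any other {name} placeholder is filled from the GET parameters.
        // Assumption: values are inserted as-is in this sketch (no escaping).
        foreach ($getParams as $key => $value) {
            $query = str_replace('{' . $key . '}', $value, $query);
        }
        $posted['fb_sig_' . $name] = $query;
    }
    return $posted;
}

$rules = array(
    'interests' => array(
        'pattern' => 'view.*',
        'query'   => 'SELECT interests FROM user WHERE uid={*user*}',
    ),
);

print_r(expand_preload_rules($rules, 'view.php', 42, array()));
```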

Applications receive preloaded FQL data through an HTTP POST when a page matching the pattern is loaded. The data comes to your server as a JSON-encoded POST parameter named signed_request. See the canvas authentication guide for details about the parameters.

Using the PHP client, you can preload FQL with a call to a client object's get_valid_fb_params($_POST, null, 'fb_sig') method, which should return an associative array with your preload FQL rule name as a key and result as a value. If the client method validate_fb_params() is called, preload FQL will be available through $(your client object)->fb_params[(your preload FQL rule name)].

To get the interests in PHP without using the client, we can look in $_POST['fb_sig_interests']. Since PHP can auto-escape quotes in POST parameters (the magic quotes setting), we need to strip the added slashes before decoding.

$interests = json_decode(stripslashes($_POST['fb_sig_interests']), true);

In other languages this should be as simple as:

interests = json_decode(_POST['fb_sig_interests']);
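
The decoding step can be wrapped in a small function that also copes with a missing or malformed parameter. This is a sketch under stated assumptions: read_preloaded_fql() is a hypothetical helper, the caller says whether slashes were added to the POST data, and no signature validation is performed (the client's validate_fb_params() mentioned above covers that).

```php
<?php
// Sketch: read one preloaded FQL result from POST data.
// read_preloaded_fql() is a hypothetical helper; it does NOT validate signatures.
function read_preloaded_fql(array $post, $ruleName, $slashesAdded = false)
{
    $key = 'fb_sig_' . $ruleName;
    if (!isset($post[$key])) {
        return null; // rule did not match this page, or nothing was preloaded
    }
    // Undo quote escaping only when the caller knows it was applied.
    $raw = $slashesAdded ? stripslashes($post[$key]) : $post[$key];
    $decoded = json_decode($raw, true);
    return is_array($decoded) ? $decoded : null;
}

// Example with a hand-written payload in place of a real request.
$samplePost = array('fb_sig_interests' => '[{"interests":"chess, hiking"}]');
$interests = read_preloaded_fql($samplePost, 'interests');
echo $interests[0]['interests'], "\n";
```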

Removing Preload FQL

To remove all preloaded FQL rules, set the preload_fql property to an empty rule set:

$facebook->api_client->admin_setAppProperties(array('preload_fql' => json_encode(array())));

Encoding the Preloaded FQL

admin.setAppProperties takes a map from keys to values where the values are strings or numbers. So to use preload FQL, you need to first encode the preload_fql structure as a string and then encode the whole property map.

For instance, this is incorrect:

"{ \"preload_fql\" : 
    {\"friends\" : 
        {\"pattern\": \"friends.aspx\", \"query\" : 
         \"SELECT uid2 FROM friend WHERE uid1={*user*}\"
        } 
     } 
}"

But this is correct:

{ "preload_fql" : "{\"friends\" : {\"pattern\": 
  \".*\", \"query\" : \"SELECT uid2 FROM friend WHERE uid1={*user*}\"} }" 
}
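
The two encoding steps can be written out explicitly in PHP. The sketch below only builds the string; encode_preload_property() is a hypothetical helper, and when you use the PHP client as shown earlier, the outer encoding is presumably handled by the client for you.

```php
<?php
// Sketch: encode the preload_fql rules as a string, then encode the property map.
// encode_preload_property() is a hypothetical helper, not a Facebook API.
function encode_preload_property(array $rules)
{
    // Step 1: the rule map becomes a single JSON string.
    $rulesAsString = json_encode($rules);
    // Step 2: that string is the value of the "preload_fql" key in the property map.
    return json_encode(array('preload_fql' => $rulesAsString));
}

$rules = array(
    'friends' => array(
        'pattern' => '.*',
        'query'   => 'SELECT uid2 FROM friend WHERE uid1={*user*}',
    ),
);

echo encode_preload_property($rules), "\n";
```

Decoding the result once yields a map whose preload_fql value is still a string; decoding that string a second time yields the rules.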

Optimizing Static Resources on Canvas Pages

Using a variety of pre-fetching and caching techniques, you can make your application run as quickly as possible. While there is general advice to follow (Steve Souders’ High Performance Web Sites is a good resource), there are several guidelines unique to Facebook.

Caching Facebook JavaScript (FBJS)

This section applies only to FBML Canvas pages.

FBJS is Facebook’s safe form of JavaScript for use with FBML tags. It looks like normal JavaScript, but there are a few differences.

Facebook’s application server parses your FBJS and converts it to sanitized JavaScript code. This process takes some time. Luckily, Facebook caches the results of this sanitization process. This happens in one of two ways:

  • For each inline <script> block in your HTML document, Facebook creates a signature from all of the contained FBJS code and inserts the sanitized JavaScript results into a cache, keyed on that signature and your application ID. If the FBJS code changes, the signature mismatch triggers a re-parsing of the code.
  • You can also include external FBJS files using a script tag. See the documentation on including files for instructions. For each <script src="http://foo.com/bar.js"> tag, Facebook will parse and sanitize the FBJS referenced by the tag, and keep a cached version keyed on the URL. This means that when you update the contents of that script, you’ll want to change the name of the file so that a new version may be cached.

In both cases, the FBJS script itself should not change dynamically. Use the same script across all users so that just one version can be cached and used by everyone. If for some reason the code itself must be dynamic, isolate the dynamic part in a separate script tag so that the rest of the script can be cached.

A script might take 90ms to parse and sanitize, but as little as 4ms to pull from cache.
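
Since externally referenced FBJS is cached by URL, one way to make sure an updated script gets a new name is to derive part of the file name from its contents. The following is a minimal sketch of that idea, not a Facebook feature: versioned_script_url() is a hypothetical helper, and your server would still need to serve the file under the generated name.

```php
<?php
// Sketch: derive a file name that changes whenever the script's contents change.
// versioned_script_url() is a hypothetical helper, not a Facebook API.
function versioned_script_url($baseUrl, $fileName, $contents)
{
    $info = pathinfo($fileName);
    $hash = substr(md5($contents), 0, 8); // short content fingerprint
    $ext  = isset($info['extension']) ? '.' . $info['extension'] : '';
    return rtrim($baseUrl, '/') . '/' . $info['filename'] . '.' . $hash . $ext;
}

$script = 'function hello() { return 1; }'; // placeholder FBJS source
echo versioned_script_url('http://foo.com', 'bar.js', $script), "\n";
```

The same contents always map to the same URL, so the cached version keeps being reused until the script actually changes.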

Getting static resources faster using Early Flush

This section applies to both FBML and IFrame Canvas pages, but not websites using Facebook.

On Facebook’s own web pages, references to static resources such as .JS, .SWF, and .CSS files are inside the <head> block so that they will begin downloading earlier, and in parallel with the rendering of the page.

Canvas applications are no exception, and can do the same thing using the early_flush property of admin.setAppProperties.

The concept here is similar to Preload FQL. Instead of queries, you put references to static resources that you are certain will be needed during the page load, in effect, pre-fetching them.

Each entry has a "name" attribute for organizational purposes, in case you have multiple categories of pages in your application. Each entry also has a "pattern" attribute that is a regular expression to match the URL, and a "resources" attribute that is a JSON object containing a list of the absolute URLs to resources you would like Early Flushed.

For example, use the regex .* to match all pages, and give it the name “All”. To Early Flush two resources (a resource.css and a flash.swf) you would pass:

{
    "name":"All", 
    "pattern":".*", 
    "resources":
        ["http://www.myapp.com/resource.css", 
         "http://www.myapp.com/flash.swf"]
}

You can define the JSON for your early_flush property via the App Settings Editor, under the Advanced tab. Note that you include references to these static resources as normal in your document; Facebook will generate an additional reference on your behalf.
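
If you prefer to generate the early_flush JSON rather than type it by hand, a sketch along these lines produces the entry shown above. build_early_flush_entry() is a hypothetical helper; it only builds and checks the JSON text and does not send anything to Facebook.

```php
<?php
// Sketch: build one Early Flush entry as JSON.
// build_early_flush_entry() is a hypothetical helper, not a Facebook API.
function build_early_flush_entry($name, $pattern, array $resources)
{
    foreach ($resources as $url) {
        // The text above asks for absolute URLs, so reject anything else.
        if (!preg_match('~^https?://~i', $url)) {
            throw new InvalidArgumentException("Resource must be an absolute URL: $url");
        }
    }
    return json_encode(
        array(
            'name'      => $name,
            'pattern'   => $pattern,
            'resources' => array_values($resources),
        ),
        JSON_UNESCAPED_SLASHES // keep URLs readable in the output
    );
}

echo build_early_flush_entry(
    'All',
    '.*',
    array('http://www.myapp.com/resource.css', 'http://www.myapp.com/flash.swf')
), "\n";
```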

These screenshots illustrate the type of improvement you might expect from using Early Flush. The developer had been loading CSS documents at the end of his HTML and moved them into the header using Early Flush.

Figure 3 - Loading of Static Resources without Early Flush

Figure 4 - Loading of Static Resources with Early Flush

IFrame Canvas applications benefit less from the Early Flush strategy. Because the content is in a separate iframe, the browser must reparse the data in the context of your iframe. However, it will be fetching it from the browser’s cache when it otherwise might not be, so there may still be a performance benefit to using Early Flush.

Please note: When you are viewing your own page as the application’s developer, Early Flush will not happen because Facebook renders all of the FBML in the markup for easier debugging. Please use a test account, or friend’s account, to confirm your Early Flush is working properly.

Optimizing JavaScript Delivery

When a web browser encounters a reference to JavaScript in the body of an HTML document, it is quite conservative; it stops rendering the page until it can load the JavaScript and see what effect it might have on the remainder of the document.

How you avoid this issue depends, as usual, on which type of application you have.

FBML Canvas Pages – Avoid the <fb:js-string> Tag.

Facebook automatically places the JavaScript generated from FBJS at the bottom of the page. It also places any JavaScript in the Canvas page’s chrome at the bottom of the page. However, it can only do this if the application does not use the fb:js-string tag. You may gain a performance benefit if you do not use this tag.

Non-blocking JavaScript

This section applies to websites using Facebook and IFrame Canvas applications.

The JavaScript SDK facilitates faster page loads by allowing you to load JavaScript asynchronously; that is, it will not block the downloading and rendering of the page.

Unlike with FBML pages, you don’t get this for free. To use this approach, follow the "asynchronous loading" code sample in the JavaScript SDK documentation.