March 06, 2012

Making mPACT API Better

By: Alexey Sadovoy

You have probably read the overview of our mPACT API™ written by Yuriy Gumen, mBLAST VP of Product Development, that explains why and how you should use the calls we provide.

We don’t believe in resting on our laurels, though, so as demand for our API has grown, we have been looking for ways to improve its performance and response time. After all, the API is designed to be a day-to-day tool for anyone who watches media and social media or tracks company presence (or has some other purpose entirely – we’re hearing about new uses from our users all the time!), so we want it to provide the data people need, exactly when they need it.

As a first step, we focused on what is probably the most challenging part of the mPACT API: social media search. We understand how important it is that our users get this information very quickly, ideally within minutes of it being posted on a social network. That’s a challenge, because the process of a post entering our system and appearing in our search results is actually very complicated and requires a lot of processing. To make our API calls even more responsive, we thoroughly reviewed and optimized all of these background processes, and we are now pleased to announce that you can receive social media posts less than 15 minutes after they first appear on the social network platform. Does that sound slow to you? Unfortunately, Twitter doesn’t just deliver all its posts packed in a present box with a gift card saying “here’s a post you can use”. Because of that, we have to grab social media posts from multiple sources, weed out all of the duplicates, format (or normalize) the data, index it, filter out spam and scams, and only then bring it to you. That’s a lot of work, but it means we present you with data that’s useful, curated and ready to go. We do this for millions of posts every hour, all processed into a format that you can search and use.
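
To make those stages a little more concrete, here is a minimal Python sketch of the deduplicate, normalize and filter steps. It is an illustration only, not our production pipeline: the Post fields, the SPAM_MARKERS list and every function name are assumptions made up for this example, and the indexing step is left out entirely.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    source: str   # where the post was collected from (hypothetical field)
    post_id: str  # identifier assigned by the social network
    text: str


# Hypothetical spam markers, for illustration only.
SPAM_MARKERS = ("free followers", "click here to win")


def deduplicate(posts):
    """Keep the first copy of each post, keyed by its network-assigned id."""
    seen = set()
    unique = []
    for post in posts:
        if post.post_id not in seen:
            seen.add(post.post_id)
            unique.append(post)
    return unique


def normalize(post):
    """Collapse runs of whitespace so every post has the same text layout."""
    clean_text = re.sub(r"\s+", " ", post.text).strip()
    return Post(post.source, post.post_id, clean_text)


def is_spam(post):
    """Flag a post whose text contains any of the spam markers."""
    lowered = post.text.lower()
    return any(marker in lowered for marker in SPAM_MARKERS)


def process(raw_posts):
    """Deduplicate, normalize, then drop spam, in that order."""
    unique = deduplicate(raw_posts)
    normalized = [normalize(post) for post in unique]
    return [post for post in normalized if not is_spam(post)]
```

Calling process() on a list that contains the same post collected from two sources returns a single normalized copy, provided it is not flagged as spam.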

Speaking of searching, we also improved the performance of our API search by adding new hosts to our infrastructure, moving drives to a new high-speed array, and tuning the performance of our operating systems. mPACT API search uses Lucene as its full-text search engine and Solr as the platform on top of it, slathered with mBLAST special sauce. These are, simply put, some of the most powerful search tools currently available, and our search implementation team has taken advantage of all the features these platforms give us.
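
For readers who have not worked with Solr, the sketch below shows roughly what a phrase query against Solr's standard /select handler looks like when built in Python. The q, rows and wt parameters are standard Solr request parameters; the host, the core name "posts" and the field name "content" are placeholders chosen for this example and say nothing about how our own index is laid out.

```python
from urllib.parse import urlencode


def build_solr_query_url(base_url, core, text, rows=10):
    """Build a URL for a phrase query against Solr's /select handler."""
    # Escape backslashes and double quotes so the phrase stays one quoted term.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    params = {
        "q": f'content:"{escaped}"',  # "content" is a hypothetical field name
        "rows": rows,                 # maximum number of documents to return
        "wt": "json",                 # ask Solr for a JSON response
    }
    return f"{base_url}/solr/{core}/select?{urlencode(params)}"
```

The function only builds the URL; sending the request and parsing the response is left to whichever HTTP client you prefer.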

After we did this work under the covers, we released a new version of the API itself, preserving the original functions while adding a bunch of new features. Why release a new version when the first version was announced less than two months ago? The answer is performance, stability and scalability. To give you some background, our API started as a side project while we were focused on releasing mPACT Pro™ a year and a half ago. Essentially, we created the API as a custom development project for one of our major clients, satisfying their need to get information from our system continuously, without using mPACT Pro’s Web-based front end. We had a lot of success with that custom project, so at the end of 2011 we decided to extend our API by adding a number of functions and releasing it to the public. Demand since the launch has been huge, so we immediately began asking ourselves how we could improve the API to support a large number of users and the even larger number of calls we realized we were going to have. Accordingly, we started developing this new version at the same time as we released the current API.

Writing this kind of software, where you know you will be supporting a very significant number of calls, is always a challenge. We’ve all probably experienced sites that can’t handle the load (and hated it), so one of our main goals for our API service is to be scalable, quick and responsive. Along those lines, we have chosen the ServiceStack platform as the core of our new API. It’s simple, it’s fast, it’s written in a very friendly way and it’s Open Source. Those are attributes we think everyone will appreciate. This new approach allowed us to build a robust and quick API that follows the conventions of modern Web APIs. It’s RESTful, of course, and requests are made in a neat and structured way.

This approach should be familiar to anyone who has used a Web API before; any developer (even a novice) will be able to understand it. Our call responses are more informative now as well, including a caching flag and extended error information. We are using Redis NoSQL storage for caching, with the ability to switch seamlessly to another solution, such as memcached. We have also added support for several formats that any application can benefit from: XML, JSON, JSV, CSV and HTML.
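
The idea of a swappable cache plus a caching flag in the response can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not our actual code: the Cache interface, the InMemoryCache stand-in, the cached_call helper and the "from_cache" field name are all invented for this example. A Redis-backed or memcached-backed class exposing the same two methods could be dropped in without touching the calling code.

```python
import time
from typing import Optional, Protocol


class Cache(Protocol):
    """Minimal interface that any cache backend would need to satisfy."""

    def get(self, key: str) -> Optional[str]: ...

    def set(self, key: str, value: str, ttl_seconds: float) -> None: ...


class InMemoryCache:
    """Stand-in backend for the example; stores values in a dict."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # The entry has expired, so drop it and report a miss.
            del self._items[key]
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._items[key] = (value, time.monotonic() + ttl_seconds)


def cached_call(cache, key, compute, ttl_seconds=60.0):
    """Return a response dict that says whether the result came from cache."""
    cached = cache.get(key)
    if cached is not None:
        return {"result": cached, "from_cache": True}
    value = compute()
    cache.set(key, value, ttl_seconds)
    return {"result": value, "from_cache": False}
```

Because cached_call only depends on get and set, changing the backend is a matter of passing in a different cache object.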

Perhaps the most exciting new feature is the ability to get results in HTML. Imagine the following scenario: you add an iframe element to your page and give it a link that points to our API call returning social media for the topic you are interested in. That’s it. We will provide you with recent posts inside a pretty HTML template that fits your page design. Your site visitors will see this information updated automatically, and the frame will even refresh itself if they stay on the page for a while. Just like Twitter, but more focused on what you need to display to your users.
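
As a rough illustration of that scenario, the Python helper below generates the iframe markup for a page. Treat it as a sketch: the endpoint URL and the "topic" and "format" parameter names are placeholders invented for this example, so check the API documentation for the real call and its parameters.

```python
from html import escape
from urllib.parse import urlencode


def build_embed_iframe(api_base_url, topic, width=400, height=600):
    """Return an iframe tag whose src points at an HTML-format API call."""
    # "topic" and "format" are hypothetical parameter names.
    query = urlencode({"topic": topic, "format": "html"})
    src = f"{api_base_url}?{query}"
    # Escape the URL so the ampersand is valid inside an HTML attribute.
    return (
        f'<iframe src="{escape(src, quote=True)}" '
        f'width="{int(width)}" height="{int(height)}"></iframe>'
    )
```

Paste the returned string into your page template wherever the social media feed should appear.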

Our third step (the last for now, but certainly not the last we’ll take to improve the mPACT API) is improving the quality of our data. We have a team dedicated to constantly reviewing publications, authors, media and social media, making sure that the results we provide are what users actually need. Not BS SEO-powered results, but real-world ones that have meaning and value. Our data team and team of processors do this as their daily job, all day, every day. We’ve learned that it’s almost impossible to have an automatic process handle this work entirely – if you try to be smart and rely only on scripts to do this job, you will definitely lose huge amounts of important data. We do, of course, use a set of services that prepares our data and filters out obvious junk, but most of this process simply prepares the data and allows mBLAST’s humans to do their job faster and more efficiently.

With these new features in place (and more that I don’t have room to mention), we feel that our mPACT API solution is the best on the market and is ready to keep up with the demand of even our highest-volume clients. And really, we’re just getting started, so stay tuned for more as we roll out our next set of improvements. I hope that you’ll try it today and – please – give us your feedback. We’re always listening.

Alexey is mBLAST's CTO and has been with the company since its inception. He is responsible for leading all of mBLAST's development efforts and architecting mBLAST's mPACT product line. You can follow him on Twitter.

Filed in Category: mPACT API, Tech Updates