Thursday, April 12, 2012

More quick links

More of what has caught my attention recently:
  • $1B for Instagram was silly and driven by fear ([1] [2] [3] [4]), but the scale Instagram reached with just three engineers is impressive ([5] [6])

  • Felix Salmon at Reuters writes that Twitter is under revenue pressure and will start doing things that make the site much less pleasant to use. I'd say Facebook is under similar pressure. Both will likely make increasingly aggressive attempts to sell their users to advertisers and may face a backlash. ([1] [2] [3])

  • Google has millions of machines ([1] [2] [3]), so many that "a performance improvement of even 1% can result in millions of dollars saved", which explains why they spend so much time on the details, like how threads run on cores and estimating disk space needed ([4] [5])

  • Great recent talk by Googler Jeff Dean on problems caused by occasional high latency in large-scale distributed systems; some surprising and useful advice here. ([1] [2])

  • While 89% of ad clicks are incremental (visit wouldn't have happened without the ad), only 50% of ad clicks on the top ad are incremental. Is that due to ads on navigational queries? And does Google effectively force companies to buy those ads (so competitors don't get them) even though the ads are not very effective? ([1])

  • "In this two-part blog post, we will open the doors of one of the most valued Netflix assets: our recommendation system." ([1])

  • "Yahoo's Chief Product Officer Blake Irving resigns" over disagreements on strategy, in particular he was "concerned about the massive engineering and research talent exodus of late, especially in Yahoo's vaunted Labs arm." ([1] [2])

  • The field of astronomy appears to be going through a major shift to large scale analysis of truly massive data sets ([1] [2])

  • Amazing to me that Walmart has taken this long to ramp up online against Amazon. Amazon has even been called the "Walmart of the Web"; you going to take that, Walmart? ([1])

  • A clever analysis deduces that Amazon has 450k machines in AWS. ([1] [2])

  • A video out of Microsoft Research shows how different interacting with a tablet would feel if touch response times could be made faster. Very compelling. ([1])

  • Other work out of Microsoft Research demos a Kinect-like gesture interface built using what is essentially echolocation via a laptop's built-in microphone and speaker, no other hardware required. (video [1] and CHI 2012 paper [2])

Sunday, March 18, 2012

Quick links

What has caught my attention lately:
  • Videos show Windows 8 is horribly painful for most people; it looks likely to be another Windows Vista-like flop. Really worth watching the videos or trying it yourself (videos [1] [2], try it [1] [2])

  • "38% of the ads are never in view to a user" and another 12% "of the ads are in view for less than 0.5 seconds" ([1])

  • "Many more ads" are coming on Facebook, "a lot more advertising ... [on] Facebook's traditionally clean interface." Could this mean Facebook is having revenue trouble already? ([1] [2] [3])

  • Not only is the iPhone over half of Apple's revenue, it is more than 70% of their profits. Apple really is a mobile phone manufacturer with a few other businesses attached. ([1])

  • Coming soon, a "voice-activated assistant that remembers everything you say ... systems that are more conversational, that have the ability to ask more sophisticated followup questions and adapt to the individual ... [with] short-term and long-term memory." ([1])

  • "Microsoft tries to find pockets of unrealized revenue and then figures out what to make. Apple is just the opposite: It thinks of great products, then sells them." ([1])

  • "The best way to get the most out of engineers is to surround them with other great engineers." ([1])

  • "It’s positively de-motivating to work for a company where your job is just to shut up and take orders. In tech startup land, we all understand instinctively that we have to hire super smart people, but we forget that we then have to organize the workforce so that those people can use their brains." ([1])

  • Programmers want to learn new skills and technology while working in a team of people they respect, and over 90% of programmers said they are willing to take a lower paying job to get that. ([1])

  • Netflix's streaming catalog continues to deteriorate and is now down to only 853 good movies, of which only 155 were released within the last five years ([1] [2] [3])

  • "This class is about setting you on the path to developing good taste as a programmer" (free, from Udacity, taught by Googler and AI guru Peter Norvig, starts Apr 16) ([1])

  • Could this be the business model for Udacity? Offer free classes online, then send companies candidates pre-screened for machine learning programming ability? ([1] [2])

  • If my blog is any indication, the only RSS feed reader still being used is Google Reader. Are all others dead now? ([1])

  • Huge and wide open opportunity in personalized advertising for online news. Amazes me Yahoo and Amazon haven't gone after this, and that Google hasn't done a better job going after it. ([1])

  • Paper with fascinating statistics on Groupon and other daily deal sites. Most dramatic, it costs restaurants half a star in their Yelp rating if they offer Groupon deals. ([1])

Friday, March 09, 2012

Ad targeting at Yahoo

A remarkably detailed paper to be presented at WWW 2012, "Web-Scale User Modeling for Targeting" (PDF), gives many insights into how Yahoo does personalized advertising.

In summary, the researchers describe a system used in production at Yahoo that does daily builds of large user profiles. Each profile contains tens of thousands of features that summarize the interests of each user from the web pages they have viewed, searches they made, and ads they have viewed, clicked on, and converted (bought something) on. They explain how important it is to use conversions, not just ad clicks, to train the system. They measure the importance of using recent history (what you did in the last couple days), of using fine-grained data (detailed categories and even some specific pages and queries), of using large profiles, and of including data about ad views (which is a huge and low quality data source since there are multiple ad views per page view), and find all those significantly help performance.

Some excerpts from the paper:
We present the experiences from building a web-scale user modeling platform for optimizing display advertising targeting at Yahoo .... Our work ... [looks] into understanding the effect of different user activities on prediction, [gives] insights about the temporal aspect of user behavior (recency vs. long-term trends), and [explores] different variants (user representation and target label) through large offline and online experiments .... We deployed our platform to production and achieved a [large] boost in online metrics, such as eCPA, compared to the old system.

Our objective is to refine the targeting constraints using the past behavior of the users ... [so] we can improve the number of conversions per ad impression without greatly increasing the number of impressions.

User profiles are aggregated logs from different systems/products (e.g. user logs of Yahoo News, Yahoo Finance, etc.) .... We consider several different events ... [including] pages visited ... the category of the page ... searches issued, clicks on search links, clicks on search advertising links ... [and] the category of the search query ... [and] views and clicks on ads ... [and] the ad category ... from an existing hierarchical ad categorizer.

Our results show [a] large performance loss incurred in favoring long-term history over short-term history. This is obvious as the recent history clearly communicates with a high probability the current interest of the user ... Although recent history is more important than older history, we still need to include older history to get the most complete idea about the user.

Results show ... many of our raw features are completely non-discriminative. However, a small percentage of these features are actually important ... [For example, just] ... dropping all raw ad views ... [or if] we drop all raw features and only keep categorical features ... [causes] dropping [of] the weighted AUC measure by 3.69% and 4.26%, respectively ... In production ... we apply a coarse feature selection through mutual information, then we apply a rigorous feature selection through l1 regularization.
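
As a concrete illustration of that last excerpt, here is a minimal sketch of two-stage feature selection, a coarse mutual information filter followed by l1 regularization (scikit-learn on toy data; the sizes, k, and regularization strength are my assumptions, not the paper's):

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.linear_model import LogisticRegression

# Toy stand-ins: rows are users, columns are profile features
# (pages viewed, queries, ad views); y is converted or not.
rng = np.random.default_rng(0)
X = rng.random((1000, 500))
y = rng.integers(0, 2, size=1000)

# Stage 1: coarse filter, keeping the features with the highest
# mutual information with conversion.
coarse = SelectKBest(mutual_info_classif, k=100).fit(X, y)
X_coarse = coarse.transform(X)

# Stage 2: rigorous selection via l1 regularization; the l1 penalty
# drives the weights of non-discriminative features to exactly zero.
model = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
model.fit(X_coarse, y)
print("features surviving l1 selection:", int((model.coef_ != 0).sum()))
```
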
Very interesting work. A couple of things I am left wondering:

First, they found recent history is very effective, yet they only update the profiles daily. Wouldn't their results on the value of recent behavior (which others have found too) suggest a benefit from hourly or, even better, real-time updates of the profiles (perhaps with a second in-memory, unreliable, partial-coverage system supplementing the more complete and more accurate older profiles)? That would let the system adapt immediately when someone starts looking at, for example, a vacation to Hawaii, and show relevant offers right away instead of the next day, when it is usually too late. Unfortunately, I suspect we're not going to see really big gains in the relevance and usefulness of ads without real-time updates to profiles of fine-grained interests; results showing that data only 24 hours old beats data a week old may only be a tease of the gains to be had with data only seconds old.
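
As a sketch of what I mean by a supplementary real-time layer (the class, half-life, and boost below are my assumptions, not anything from the paper):

```python
import time
from collections import defaultdict

class TwoTierProfile:
    """A complete-but-stale daily batch profile, supplemented by a
    partial, best-effort in-memory store of the last few hours."""

    def __init__(self, batch_profile, half_life_secs=4 * 3600):
        self.batch = batch_profile        # feature -> weight, rebuilt nightly
        self.recent = defaultdict(float)  # feature -> weight, updated live
        self.last_seen = {}
        self.half_life = half_life_secs

    def observe(self, feature, now=None):
        # Real-time update: decay the old weight toward zero, then bump it.
        now = now or time.time()
        age = now - self.last_seen.get(feature, now)
        self.recent[feature] *= 0.5 ** (age / self.half_life)
        self.recent[feature] += 1.0
        self.last_seen[feature] = now

    def score(self, feature, recent_boost=5.0):
        # Recent interest dominates; the batch profile fills in coverage.
        return (self.batch.get(feature, 0.0)
                + recent_boost * self.recent.get(feature, 0.0))

# Someone browsing Hawaii vacations gets relevant offers now,
# not tomorrow after the next nightly build.
profile = TwoTierProfile({"travel": 0.2})
profile.observe("travel/hawaii")
print(profile.score("travel/hawaii"), profile.score("travel"))
```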

Second, they find that features based on individual search queries and pages viewed ("raw features") usually have no value, but occasionally have enough value that it is important to include some. Wouldn't that suggest that the categorization scheme for pages viewed, searches made, and ads needs to be more fine-grained (e.g. not just the category "pants", but the category "men's boot cut jeans")? Or, better, perhaps more fine-grained while also correctly cross-correlated (interest in "men's boot cut jeans" not only shows a weak interest in all pants, but maybe has also been shown to indicate a fairly strong interest in "men's flannel shirts")?

If you are interested in this paper, you might also want to look at another recent paper out of Yahoo Research, "Learning to Target: What Works for Behavioral Advertising" (ACM), which is referenced multiple times by this paper and describes the features used in the user profile in a bit more detail, as well as the results of some other experiments.

Please see also my 2007 post, "What to advertise when there is no commercial intent?"

Friday, January 27, 2012

More quick links

More of what has caught my attention lately:
  • Laptops with Kinect sensors are coming. Worth paying attention to: gesturing in the air to issue commands is a very different UX, and much could be built on top of it ([1] [2])

  • "Each streaming subscriber is worth only $2.40 in profit each quarter to Netflix, compared to $17.32 for each DVD subscriber. The old business was very lucrative. The new business kind of sucks." ([1])

  • "You're not going to get content owners to license ... for less than what they get from the cable companies ... [if you will] use that cheap content to destroy the cable companies' business model." ([1])

  • "Federal officials approached Google with evidence of its employees' wrongdoing ... Google agreed to pay $500 million to ... ward off criminal charges against the company." ([1])

  • Google is spending nearly $1B every quarter buying new servers and data centers. That buys a lot of machines. ([1] [2])

  • Education startups are suddenly very, very hot. ([1] [2] [3] [4])

  • "Tiny directional antennas at the top of each rack ... send and receive data. A central controller monitors traffic patterns, finds network bottlenecks, configures the antennas and turns on the wireless links when more bandwidth is required ... The design sped up traffic by at least 45 percent." ([1])

  • "Wimpy cores are fine, but if you go down to the wimpiest range, your gains really have to be enormous if you want to consider all the aggravation -- and the hit to their productivity -- that your software engineers face." ([1])

  • A Facebook engineer explains why it is actually the right thing for Facebook to produce buggy code ([1])

  • "How sex, bombs, and burgers shaped our world" ([1])

  • "There is a monolithic view that this generation of technology I.P.O.'s is completely broken." ([1])

  • Just three engineers built and run Instagram, which has 14 million users, 150 million photos, several terabytes of data, and hundreds of machines. ([1] [2])

  • Startup founders "say that if they'd known when they were starting their company about the obstacles they'd have to overcome, they might never have started it." ([1])

  • Two 17-year-olds used a weather balloon to send a little Lego astronaut and a video camera 15 miles into the stratosphere. Very fun. ([1])

Tuesday, December 27, 2011

Quick links

Some of what has caught my attention recently:
  • Security guru Bruce Schneier predicts "smart phones are going to become the primary platform of attack for cybercriminals" soon ([1])

  • If, next, Amazon does a smartphone, I hope it is WiFi-based, like Steve Jobs originally wanted to do with the iPhone ([1] [2] [3])

  • iPhone owners love Siri despite its flaws ([1])

  • Valve, maker of Steam, talks about their pricing experiments (worked arithmetic after this list): "Without making announcements, we varied the price ... pricing was perfectly elastic ... Then we did this different experiment where we did a sale ... a highly promoted event ... a 75 percent price reduction ... gross revenue [should] remain constant. Instead what we saw was our gross revenue increased by a factor of 40. Not 40 percent, but a factor of 40 ... completely not predicted by our previous experience with silent price variation." ([1])

  • An idea whose time has come: profiling code based not on the execution time required, but on the power consumed ([1])

  • Grumpy about work and dreaming about doing a startup? Some food for thought for those romanticizing startup life. ([1] [2])

  • Yahoo discovers toolbar data (the urls people click on and browse to) helps a lot for web crawling ([1])

  • Google Personalized Search adds explanations. Explanations not only add credibility to recommendations, but also make people more accepting of recommendations they don't like. ([1])

  • "Until now, many education studies have been based on populations of a few dozen students. Online technology can capture every click: what students watched more than once, where they paused, what mistakes they made ... [massive] data ... for understanding the learning process and figuring out which strategies really serve students best." ([1])

  • Andrew Ng's machine learning class at Stanford was excellent; I highly recommend it. If you missed it the first time, it is being offered again (for free again) next quarter. ([1])

  • Microsoft giving up on its version of Hadoop? Surprising. ([1])

  • The NYT did a fun experiment crowdsourcing predictions. The results are worth a look. ([1] [2])

  • Web browsers (Firefox and Chrome) will be a gaming platform soon ([1] [2])
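
On the Valve pricing quote above, the arithmetic is worth working through. A quick back-of-the-envelope (the $10 list price is hypothetical):

```python
# A 75 percent price cut that multiplies gross revenue by 40 must have
# multiplied unit sales by 160, since revenue = price * units.
list_price = 10.00               # hypothetical list price
sale_price = list_price * 0.25   # after the 75 percent reduction
revenue_multiple = 40

units_multiple = revenue_multiple * list_price / sale_price
print(units_multiple)  # 160.0, versus the 4x unit increase that would
                       # merely have held gross revenue constant
```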

Wednesday, November 30, 2011

Browsing behavior for web crawling

A recent paper out of Yahoo, "Discovering URLs through User Feedback" (ACM), describes the value from using what pages people browse to and click on (which is in Yahoo's toolbar logs) to inform their web crawler about new pages to crawl and index.

From the paper:
Major commercial search engines provide a toolbar software that can be deployed on users' Web browsers. These toolbars provide additional functionality to users, such as quick search option, shortcuts to popular sites, and malware detection. However, from the perspective of the search engine companies, their main use is on branding and collecting marketing statistics. A typical toolbar tracks some of the actions that the user performs on the browser (e.g., typing a URL, clicking on a link) and reports these actions to the search engine, where they are stored in a log file.

A Web crawler continuously discovers new URLs and fetches their content ... to build an inverted index to serve [search] queries. Even though the basic mechanism of a crawler is simple, crawling efficiently and effectively is a difficult problem ... The crawler not only has to continuously enlarge its repository by expanding its frontier, but also needs to refresh previously fetched pages to incorporate in its index the changes on those pages. In practice, crawlers prioritize the pages to be fetched, taking into account various constraints: available network bandwidth, peak processing capacity of the backend system, and politeness constraints of Web servers ... The delay to discover a Web page can be quite long after its creation and some Web sites may be only partially crawled. Another important challenge is the discovery of hidden Web content ... often ... backed by a database.

Our work is the first to evaluate the benefits of using the URLs collected from a Web browser toolbar as a form of user feedback to the crawling process .... On average, URLs accessed by the users are more important than those found ... [by] the crawler ... The crawler has a significant delay in discovering URLs that are first accessed by the users ... Finally, we [show] that URL discovery via toolbar [has a] positive impact on search result quality, especially for queries seeking recently created content and tail content.
The paper goes on to quantify the surprisingly large number of URLs found by the toolbar that are useful, not private, and not excluded by robots.txt. Importantly, a lot of these are deep web pages, only visible by doing a query on a database, and hard to ferret out of that database any way except by looking at the pages people actually visit.
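
Conceptually, that filtering step is simple. A minimal sketch of screening toolbar URLs for a crawl frontier (the log format and the is_private predicate are hypothetical, not from the paper):

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

def useful_toolbar_urls(toolbar_log, already_known, is_private):
    """Yield toolbar-discovered URLs worth adding to the crawl frontier."""
    robots = {}  # cache one robots.txt parser per host
    for url in toolbar_log:
        # Skip URLs the crawler already knows and anything that looks
        # private (session ids, logged-in pages, and so on).
        if url in already_known or is_private(url):
            continue
        parts = urlparse(url)
        host = f"{parts.scheme}://{parts.netloc}"
        if host not in robots:
            rp = RobotFileParser(f"{host}/robots.txt")
            rp.read()  # fetch and parse the site's robots.txt
            robots[host] = rp
        if robots[host].can_fetch("*", url):
            yield url
```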

Also interesting are the metrics on pages the toolbar data finds first. People often send links to new web pages by e-mail or text message. Eventually, those links might appear on the web, but eventually can be a long time, and many of the URLs found first in the toolbar data ("more than 60%") are found way before the crawler manages to discover them ("at least 90 days earlier than the crawler").

Great paper out of Yahoo Research and a great example of how useful behavior data can be. It is using big data to help people help others find what they found.

Monday, November 28, 2011

What mobile location data looks like to Google

A recent paper out of Google, "Extracting Patterns From Location History" (PDF), is interesting not only for confirming that Google is studying using location data from mobile devices for a variety of purposes, but also for the description of the data they can get.

From the paper:
Google Latitude periodically sends [a user's] location to a server which shares it with his registered friends.

A user's location history can be used to provide several useful services. We can cluster the points to determine where he frequents and how much time he spends at each place. We can determine the common routes the user drives on, for instance, his daily commute to work. This analysis can be used to provide useful services to the user. For instance, one can use real-time traffic services to alert the user when there is traffic on the route he is expected to take and suggest an alternate route.

Much previous work assumes clean location data sampled at very high frequency ... [such as] one GPS reading per second. This is impractical with today's mobile devices due to battery usage ... [Inferring] locations by listening to RF-emissions from known wi-fi access points ... requires less power than GPS ... Real-world data ... [also] often has missing and noisy data.

17% of our data points are from GPS and these have an accuracy in the 10 meter range. Points derived from wifi signatures have an accuracy in the 100 meter range and represent 57% of our data. The remaining 26% of our points are derived from cell tower triangulation and these have an accuracy in the 1000 meter range.
The paper goes on to describe how they clean the data and pin noisy location trails to roads. But the most interesting tidbit for me was how few of their data points come from GPS and how much they have to rely on less accurate cell tower and WiFi hotspot triangulation.
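
As a toy illustration of handling such mixed-accuracy data, a standard trick is to weight each fix by the inverse variance of its source when estimating a frequented place (a sketch using the paper's accuracy numbers; the paper's actual cleaning and clustering are more sophisticated):

```python
import numpy as np

# Rough one-sigma accuracy by source, per the paper's numbers.
ACCURACY_METERS = {"gps": 10.0, "wifi": 100.0, "cell": 1000.0}

def weighted_center(fixes):
    """Estimate a frequented place from noisy (lat, lon, source) fixes,
    weighting each fix by the inverse variance of its source."""
    latlon = np.array([(lat, lon) for lat, lon, _ in fixes])
    w = np.array([1.0 / ACCURACY_METERS[src] ** 2 for _, _, src in fixes])
    return tuple(np.average(latlon, axis=0, weights=w))

fixes = [(47.6205, -122.3493, "gps"),
         (47.6210, -122.3480, "wifi"),
         (47.6290, -122.3400, "cell")]  # the cell fix barely moves the estimate
print(weighted_center(fixes))
```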

A lot of people have assumed mobile devices would provide nice trails of accurate and frequently sampled locations. But, if the Googlers' data is typical, it sounds like location data from mobile devices is going to be very noisy and very sparse for a long time.

Tuesday, November 15, 2011

Even more quick links

Even more of what has caught my attention recently:
  • Spooky but cool research: "Electrical pulses to the brain and muscles ... activate and deactivate the insect's flying mechanism, causing it to take off and land ... Stimulating certain muscles behind the wings ... cause the beetle to turn left or right on command." ([1])

  • Good rant: "Our hands feel things, and our hands manipulate things. Why aim for anything less than a dynamic medium that we can see, feel, and manipulate? ... Pictures Under Glass is old news ... Do you seriously think the Future Of Interaction should be a single finger?" ([1])

  • Googler absolutely shreds traditional QA and argues that the important thing is getting a good product, not implementing a bad product correctly to spec. Long talk; if you're short on time, the talk starts at 6:00, the meat starts at 13:00, and the don't-miss parts are at 17:00 and 21:00. ([1])

  • "There has been very little demand for Chromebooks since Acer and Samsung launched their versions back in June. The former company reportedly only sold 5,000 units by the end of July, and the latter Samsung was said to have sold even less than that in the same timeframe." ([1])

  • With the price change to offer Kindles at $79, Amazon is now selling them below cost ([1])

  • Personalization applied to education, using the "combined data power of millions of students to provide uniquely personalized learning to each." ([1] [2] [3] [4] [5] [6])

  • It is common to use human intuition to choose algorithms and tune parameters on algorithms, but this is the first I've ever heard of using games to crowdsource algorithm design and tuning ([1])

  • Great slides from a Recsys tutorial by Daniel Tunkelang, really captures the importance of UX and HCIR in building recommendation and personalization features ([1])

  • Bing finally figured out that when judges disagree with clicks, clicks are probably right ([1])

  • Easy to forget, but the vast majority of US mobile devices still are dumbphones ([1])

  • Finally, finally, Microsoft produces a decent mobile phone ([1])

  • Who needs a touch screen when any surface can be a touch interface? ([1])

  • Impressive augmented reality research demo using Microsoft Kinect technology ([1])

  • Very impressive new technique for adding objects to photographs, reproducing lighting, shadows, and reflections, and requiring just a few corrections and hints from a human about the geometry of the room. About as magical as the new technology for reversing camera shake to bring blurred pictures back into focus. ([1] [2])

  • Isolation isn't complete in the cloud -- your neighbors can hurt you by hammering the disk or network -- and some startups have decided to go old school back to owning the hardware ([1] [2])

  • "The one thing that Siri cannot do, apparently, is converse with Scottish people." ([1])

  • Amazon grew from under 25,000 employees to over 50,000 in two years ([1])

  • Google Chrome is pushing Mozilla into bed with Microsoft? Really? ([1])

  • Is advice Steve Jobs gave to Larry Page the reason Google is killing so many products lately? ([1])

  • Why does almost everyone use the default software settings? Research says it appears to be a combination of minimizing effort, an assumption of implied endorsement, and (bizarrely) loss aversion. ([1])

Friday, October 14, 2011

More quick links

More of what has caught my attention recently:
  • The first Kindle was so ugly because Jeff Bezos so loved his BlackBerry ([1])

  • "Sometimes it takes Bad Steve to bring products to market. Real artists ship." ([1])

  • "The Mac sleep indicator is timed to glow at the average breathing rate of an adult: 12 breaths per minute." Beautiful example of attention to design. ([1])

  • "A one-star increase in Yelp rating leads to a 5-9 percent increase in revenue" ([1])

  • Facebook games, rather than try to be fun, try to be addictive. They feed on the compulsive until they give up their cash. The most addicted spend $10k in one game in less than a year. ([1])

  • "The Like and Recommend buttons Facebook provides to other Web sites send information about your visit back to Facebook, even if you don't click on them ... Facebook can find out an awful lot about what you do online." ([1])

  • A new automated attack on CAPTCHAs that can break them in an average of three tries. Even so, paying people to break CAPTCHAs is so cheap that that is probably what the bad guys will continue to do. ([1] [2])

  • Online backup and storage is now basically free. I expect this to be part of operating systems soon (it nearly is in Windows and Ubuntu) and all profits in online backup to drop to zero. ([1])

  • Prices for Netflix acquiring their streaming content appear to be going way up. Netflix just paid $1B over eight years for some CW network shows, and Starz rejected $300M/year -- a 10x increase -- for their movies. ([1] [2])

  • Someone spun up a truly massive cluster on Amazon EC2, "30,472 cores, 26.7TB of RAM and 2PB (petabytes) of disk space." ([1])

  • "Google's brain [is] like a baby's, an omnivorous sponge that [is] always getting smarter from the information it [soaks] up." ([1])

Monday, September 19, 2011

Quick links

Some of what has caught my attention recently:
  • "60 percent of Netflix views are a result of Netflix's personalized recommendations" and "35 percent of [Amazon] product sales result from recommendations" ([1] [2])

  • When doing personalization and recommendations, implicit ratings (like clicks or purchases) are much less work and turn out to be highly correlated to what people would say their preferences are if you did ask ([1])

  • Good defaults are important. 95% won't change the default configuration even in cases where they clearly should. ([1])

  • MSR says 68% of mobile local searches occur while people are actually in motion, usually in a car or bus. Most are looking for the place they want to go, usually a restaurant. ([1])

  • Google paper on Tenzing, a SQL layer on top of MapReduce that appears similar in functionality to Microsoft's Scope or Michael Stonebraker's Vertica. Most interesting part is the performance optimizations. ([1])

  • Googler Luiz Barroso talks data centers, including giving no love to using flash storage and talking about upcoming networking tech that might change the game. ([1] [2])

  • High quality workers on MTurk are much cheaper than they should be ([1])

  • Most newspapers should focus on being the definitive source for local news and the primary channel to get to small local advertisers ([1] [2])

  • Text messaging charges are unsustainable. Only question is when and how they break. ([1])

  • "If you want to create an educational game focus on building a great game in the first place and then add your educational content to it. If the game does not make me want to come back and play another round to beat my high-score or crack the riddle, your educational content can be as brilliant as it can be. No one will care." ([1])

  • A few claims that it is not competitors' failures, but Apple's skillful dominance of supply chains, that prevents Apple's competitors from successfully copying Apple products. I'm not convinced, but worth reading nonetheless. ([1] [2] [3])

  • Surprising amount of detail about the current state of Amazon's supply chain in some theses out of MIT. Long reads, but good reads. ([1])

  • If you want to do e-commerce in a place like India, you have to build out your own delivery service. ([1])

  • Like desktop search in 2005, Dropbox and other cloud storage products exist because Microsoft's product is broken. Microsoft made desktop search go away in 2006 by launching desktop search that works, and it will make the cloud storage opportunity go away by launching a cloud drive that works. ([1] [2] [3])

  • Just like in 2005, merging two failing businesses doesn't make a working business. Getting AOL all over you isn't going to fix you, Yahoo. ([1] [2])

  • Good rant on how noreply@ e-mail addresses are bad customer service. And then the opposite point of view from Google's Sergey Brin. ([1] [2])

  • Google founder Sergey Brin proposed taking Google's entire marketing budget and allocating it "to inoculate Chechen refugees against cholera" ([1])

  • Brilliant XKCD comic on passwords and how websites should ask people to pick passwords ([1])

Wednesday, September 07, 2011

Blending machines and humans to get very high accuracy

A paper by six Googlers from the recent KDD 2011 conference, "Detecting Adversarial Advertisements in the Wild" (PDF), is a broadly useful example of how to succeed at tasks requiring very high accuracy using a combination of many different machine learning algorithms, high quality human experts, and lower quality human judges.

Let's start with an excerpt from the paper:
A small number of adversarial advertisers may seek to profit by attempting to promote low quality or untrustworthy content via online advertising systems .... [For example, some] attempt to sell counterfeit or otherwise fraudulent goods ... [or] direct users to landing pages where they might unwittingly download malware.

Unlike many data-mining tasks in which the cost of false positives (FP's) and false negatives (FN's) may be traded off, in this setting both false positives and false negatives carry extremely high misclassification cost ... [and] must be driven to zero, even for difficult edge cases.

[We present a] system currently deployed at Google for detecting and blocking adversarial advertisements .... At a high level, our system may be viewed as an ensemble composed of many large-scale component models .... Our automated ... methods include a variety of ... classifiers ... [including] a single, coarse model ... [to] filter out ... the vast majority of easy, good ads ... [and] a set of finely-grained models [trained] to detect each of [the] more difficult classes.

Human experts ... help detect evolving adversarial advertisements ... [through] margin-based uncertainty sampling ... [often] requiring only a few dozen hand-labeled examples ... for rapid development of new models .... Expert users [also] search for positive examples guided by their intuition ... [using a custom] tool ... [and they have] surprised us ... [by] developing hand-crafted, rule-based models with extremely high precision.

Because [many] models do not adapt over time, we have developed automated monitoring of the effectiveness of each ... model; models that cease to be effective are removed .... We regularly evaluate the [quality] of our [human experts] ... both to [assess] the performance of ... raters and measure our confidence in these assessments ... [We also use] an approach similar to crowd-sourcing ... [to] calibrate our understanding of real user perception and ensure that our system continues to protect the interest of actual users.
I love this approach, blending algorithms running over big data with the intuition of human experts who guide, assist, and correct those algorithms. These Googlers used an ensemble of classifiers, trained by experts who focused on labeling the edge cases, and ran them over features extracted from a massive data set of advertisements. They then built custom tools to make it easy for experts to search over the ads, follow their intuition, dig in deep, and fix the hardest cases the classifiers missed. Because the bad guys never quit, the Googlers not only constantly add new models and rules, but also constantly evaluate existing rules, models, and the human experts to make sure they are still useful. Excellent.
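
Margin-based uncertainty sampling, which they use to decide which ads are worth an expert's attention, is easy to sketch (hypothetical model and data, not their code):

```python
import numpy as np

def uncertainty_sample(model, unlabeled_X, n=50):
    """Pick the n examples the classifier is least sure about: the ones
    with the smallest margin between its top two class probabilities."""
    probs = model.predict_proba(unlabeled_X)
    top2 = np.sort(probs, axis=1)[:, -2:]
    margins = top2[:, 1] - top2[:, 0]  # small margin = uncertain
    return np.argsort(margins)[:n]     # indexes to send to human experts
```

Examples where the classifier barely prefers one class over another are exactly the edge cases where a few dozen expert labels buy the most.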

I think the techniques described here are applicable well beyond detecting naughty advertisers. For example, I suspect a similar technique could be applied to mobile advertising, a hard problem where limited screen space and attention make relevance critical, but we usually have very little data on each user's interests, each user's intent, and each advertiser. Combining human experts with machines like these Googlers have done could be particularly useful in bootstrapping and overcoming sparse and noisy data, two problems that make it so difficult for startups to succeed on problems like mobile advertising.

Tuesday, July 19, 2011

Quick links

Some of what has caught my attention recently:
  • Netflix may have been forced to change its pricing by the movie studios. It appears the studios may have made streaming more expensive for Netflix and, in particular, too costly to keep giving free access to DVD subscribers who rarely stream. ([1] [2] [3])

  • Really fun idea for communication between devices in the same room, without using radio waves, by using imperceptible fluctuations in the ambient lighting. ([1])

  • Games are big on mobile devices ([1] [2])

  • "Customers have a bad a taste in their mouths when it comes to Microsoft's mobile products, and few are willing to give them a try again." Ouch, that's going to be expensive to fix. ([1])

  • Microsoft's traditional strategy of owning the software on most PC-like devices may not be doing well in mobile, but they're stomping in consoles ([1]). On a related note, Microsoft now claims their effort on search is less about advertising revenue and more about improving interfaces on PC-like devices. ([2])

  • Many people have vulnerable computers and passwords. Why aren't more of them hacked? Maybe it just isn't worth it to hackers, just too hard to make money given the effort required. ([1])

  • In 2010, badges in Google Reader were an April Fools joke. In 2011, badges in Google News are an exciting new feature. ([1])

  • Good (and free) book chapter by a couple Googlers summarizing the technology behind indexing the Web ([1])

  • Most people dread when their companies ask them every year to set performance goals because it is impossible to do well and can impact raises the next year. Google's solution? Don't do that. Instead, set lightweight goals more frequently and expect people to not make some of their goals. ([1] [2])

  • 60% of business PCs are still running WinXP. Maybe this says that businesses are so fearful of changing anything that upstarts like Google are going to have an uphill battle getting people to switch to ChromeOS. Or maybe this says businesses consider it so painful to upgrade Microsoft software and train their people on all the changes that, when they do bite the bullet and upgrade, they might as well switch to something different like ChromeOS. ([1])

  • Fun interview with Amazon's first employee, Shel Kaphan ([1])

  • Thought-provoking speculation on the future of health care. Could be summarized as using big data, remote monitoring, and AI to do a lot of the work. ([1])

  • Unusually detailed slides on Twitter's architecture. Really surprising that they just use MySQL in a very simple way and didn't even partition at first. ([1])

  • Impressive demos; I didn't know these were possible so easily and fluidly using just SVG and JavaScript ([1] [2])

Monday, July 11, 2011

Google and suggesting friends

A timely paper out of Google at the recent ICML 2011 conference, "Suggesting (More) Friends Using the Implicit Social Graph" (PDF), not only describes the technology behind GMail's fun "Don't forget Bob!" and "Got the right Bob?" features, but also may be part of the friend suggestions in Google+ Circles.

An excerpt from the paper:
We use the implicit social graph to identify clusters of contacts who form groups that are meaningful and useful to each user.

The Google Mail implicit social graph is composed of billions of distinct nodes, where each node is an email address. Edges are formed by the sending and receiving of email messages ... A message sent from a user to a group of several contacts ... [is] a single edge ... [of] a directed hypergraph. We call the hypergraph composed of all the edges leading into or out of a single user node that user's egocentric network.

The weight of an edge is determined by the recency and frequency of email interactions .... Interactions that the user initiates are [considered] more significant .... We are actively working on incorporating other signals of importance, such as the percentage of emails from a contact that the user chooses to read.

"Don't forget Bob" ... [suggests] recipients that the user may wish to add to the email .... The results ... are very good - the ratio between the number of accepted suggestions and the number of times a suggestion was shown is above 0.8. Moreover, this precision comes at a good coverage ... more than half of email messages.

"Got the wrong Bob" ... [detects] inclusion of contacts in a message who are unlikely to be related to the other recipients .... Almost 70% of the time [it is shown] ... users accept both suggestions, deleting the wrong Bob and adding the correct one.
I like the idea of using e-mail, mobile, and messaging contacts as an implicit social network. One problem has always been that the implicit social network can be noisy in embarrassing ways. As this paper discusses, using it only for suggesting friends is forgiving and low-risk while still being quite helpful. Another possible application might be to make it easier to share content with people who might be interested.
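
As a rough sketch of recency-and-frequency edge weighting (the half-life and the boost for user-initiated mail are my assumptions, not the paper's exact formula):

```python
import time

def edge_weight(interactions, now=None, half_life_days=30.0):
    """Weight a contact-group edge by recency and frequency of mail.

    interactions: list of (timestamp_secs, user_initiated) pairs."""
    now = now or time.time()
    half_life_secs = half_life_days * 86400
    weight = 0.0
    for ts, user_initiated in interactions:
        decay = 0.5 ** ((now - ts) / half_life_secs)  # recency
        # Mail the user sent counts more than mail merely received.
        weight += decay * (2.0 if user_initiated else 1.0)
    return weight
```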

For more on what Google does with how you use e-mail to make useful features, you might also be interested in another Google paper, "The Learning Behind Gmail Priority Inbox" (PDF).

For more on implicit social networks using e-mail contacts, please see my 2008 post, "E-mail as the social network".

Thursday, June 09, 2011

Quick links

Some of what has caught my attention recently:
  • Oldest example I could find of the "PC is dead" claim in the press, a New York Times article from 1992. If people keep making this prediction for a few more decades, eventually it might be right. ([1])

  • Amazon CEO Jeff Bezos says to innovate, you have to try many things, fail but keep trying, and be "willing to be misunderstood for long periods of time". ([1])

  • Median tenure at Amazon and Facebook is a year or less (in part due to their massive recent hiring). Also, most people at Facebook have never worked anywhere other than Facebook. ([1])

  • Spooky research out of UW CS and Google that crowdsources surveillance, finding all the Flickr photos from a big event like a concert that happen to include a specific person (no matter at what angle or from what location the crowd of people at the event took the pictures). ([1])

  • You can scan someone's fingerprints from 6 feet away and copy their keys from 200 feet away. ([1] [2])

  • Pretty impressive valuations incubator Y Combinator is getting on its startups: "The combined value of the top 21 companies is $4.7 billion." ([1])

  • But even for some of the more attractive small startups to acquire, those out of Y Combinator, the odds of acquisition are still only about 8%, and most of those will be relatively low valuation talent acquisitions. Sometimes it can seem like everyone is getting bought, but it is only a fortunate few who have the right combination of product, team, timing, luck, and network. ([1])

  • Someone going solidly for the dumbphone market, which is by far the biggest market still, with a snazzy but simple and mostly dumb phone. That's smart. ([1] [2])

  • Google Scribe makes suggestions for what you are going to type next when you are writing documents. Try starting with "All work and" ([1]).

  • When I started my blog and called it "Geeking with Greg", the word "geek" still had pretty negative connotations, especially in the mainstream. A decade later, things have changed. ([1])

  • Not surprising people don't use privacy tools since the payoff is abstract and the tools require work for the average user to understand and use. What surprises me more is that more people don't use advertising blocking tools like AdBlock. ([1])

  • The sad story of why Google never launched GDrive. ([1])

  • Carriers are going to be upset about Apple's plans to disrupt text messaging. Those overpriced plans are a big business for carriers. ([1])

  • It would be great if the Skype acquisition were part of a plan to disrupt the mobile industry by launching a mobile phone that always picks the lowest cost data network (including free WiFi networks) available. Consumers would love that; it could lower their monthly bills by an order of magnitude. ([1] [2])

  • Social data is of limited use in web search because there isn't much data from your friends. Moreover, the best information about what is a good website for you almost certainly comes from people like you who you might not even know, not from the divergent tastes of your small group of friends. As Chris Anderson (author of The Long Tail) said, "No matter who you are, someone you don't know has found the coolest stuff." ([1] [2])

  • Customization (aka active personalization) is too much work. Most people won't do it. If you optimize for the early adopter tinkerer geeks who love twiddling knobs, you're designing a product that the mainstream will never use. ([1])

  • If you launch a feature that just makes your product more complicated and confusing to most customers, you would have been better off doing nothing at all. Success is not launching things, but launching things that help customers. ([1])

  • Google News shifts away from clustering and toward personalization. ([1] [2])

  • Crowdsourcing often works better when unpaid ([1])

  • Eli Pariser is still wrong. ([1])

Monday, June 06, 2011

Continuous profiling at Google

"Google-Wide Profiling: A Continuous Profiling Infrastructure for Data Centers" (PDF) has some fascinating details on how Google does profiling and looks for performance problems.

From the paper:
GWP collects daily profiles from several thousand applications running on thousands of servers .... At any moment, profiling occurs only on a small subset of all machines in the fleet, and event-based sampling is used at the machine level .... The system has been actively profiling nearly all machines at Google for several years.

Application owners won't tolerate latency degradations of more than a few percent .... We measure the event-based profiling overhead ... to ensure the overhead is always less than a few percent. The aggregated profiling overhead is negligible -- less than 0.01 percent.

GWP profiles revealed that the zlib library accounted for nearly 5 percent of all CPU cycles consumed ... [which] motivated an effort to ... evaluate compression alternatives ... Given the Google fleet's scale, a single percent improvement on a core routine could potentially save significant money per year. Unsurprisingly, the new informal metric, "dollar amount per performance change," has become popular among Google engineers.

GWP profiles provide performance insights for cloud applications. Users can see how cloud applications are actually consuming machine resources and how the picture evolves over time ... Infrastructure teams can see the big picture of how their software stacks are being used ... Always-on profiling ... collects a representative sample of ... [performance] over time. Application developers often are surprised ... when browsing GWP results ... [and find problems] they couldn't have easily located without the aggregated GWP results.

Although application developers already mapped major applications to their best [hardware] through manual assignment, we've measured 10 to 15 percent potential improvements in most cases. Similarly ... GWP data ... [can] identify how to colocate multiple applications on a single machine [optimally].
One thing I love about this work is how measurement provided visibility and motivated people. Just by making it easy for everyone to see how much money could be saved by making code changes, engineers started aggressively going after high value optimizations and measuring themselves on "dollar amount per performance change".
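
The "dollar amount per performance change" metric is just fleet arithmetic. A toy example (all numbers hypothetical except zlib's roughly 5 percent share of CPU cycles, which is from the paper):

```python
# All numbers hypothetical except zlib's ~5% CPU share from the paper,
# and crudely assuming fleet cost scales with CPU cycles.
annual_fleet_cost = 1_000_000_000  # dollars per year to run the fleet
cpu_fraction = 0.05                # zlib's share of all CPU cycles
speedup = 0.20                     # a 20% improvement to that routine

dollars_per_year = annual_fleet_cost * cpu_fraction * speedup
print(f"${dollars_per_year:,.0f} per year")  # $10,000,000 per year
```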

For more color on some of the impressive performance work done at Google, please see my earlier post, "Jeff Dean keynote at WSDM 2009".

Wednesday, May 18, 2011

Eli Pariser is wrong

In recent interviews and in his new book, "The Filter Bubble", Eli Pariser claims that personalization limits serendipity and discovery.

For example, in one interview, Eli says, "Basically, instead of doing what great media does, which is push us out of our comfort zone at times and show us things that we wouldn't expect to like, wouldn't expect to want to see, [personalization is] showing us sort of this very narrowly constructed zone of what is most relevant to you." In another, he claims, personalization creates a "distorted view of the world. Hearing your own views and ideas reflected back is comfortable, but it can lead to really bad decisions--you need to see the whole picture to make good decisions."

Eli has a fundamental misunderstanding of what personalization is, leading him to the wrong conclusion. The goal of personalization and recommendations is discovery. Recommendations help people find things they would have difficulty finding on their own.

If you know about something already, you use search to find it. If you don't know something exists, you can't search for it. And that is where recommendations and personalization come in. Recommendations and personalization enhance serendipity by surfacing useful things you might not know about.

That is the goal of Amazon's product recommendations, to help you discover things you did not know about in Amazon's store. It is like a knowledgeable clerk who walks you through the store, highlighting things you didn't know about, helping you find new things you might enjoy. Recommendations enhance discovery and provide serendipity.

It was also the goal of Findory's news recommendations. Findory explicitly sought out news you would not know about, news from a variety of viewpoints. In fact, one of the most common customer service complaints at Findory was that there was too much diversity of views, that people wanted to eliminate viewpoints that they disagreed with, viewpoints that pushed them out of their comfort zone.

Eli's confusion about personalization comes from a misunderstanding of its purpose. He talks about personalization as narrowing and filtering. But that is not what personalization does. Personalization seeks to enhance discovery, to help you find novel and interesting things. It does not seek to just show you the same things you could have found on your own.

Eli's proposed solution is more control. But, as Eli himself says, control is part of the problem: "People have always sought [out] news that fits their own views." Personalization and recommendations work to expand this bubble that people try to put themselves in, to help them see news they would not look at on their own.

Recommendations and personalization exist to enhance discovery. They improve serendipity. If you just want people to find things they already know about, use search or let them filter things themselves. If you want people to discover new things, use recommendations and personalization.

Update: Eli Pariser says he will respond to my critique. I will link to it when he does.

Friday, May 13, 2011

Taking small steps toward personalized search

Some very useful lessons in this paper from the recent WSDM 2011 conference, "Personalizing Web Search using Long Term Browsing History" (PDF).

First, they focused on a simple and low risk approach to personalization, reordering results below the first few. There are a lot of what are essentially ties in the ranking of results after the first 1-2 results; the ranker cannot tell the difference between the results and is ordering them arbitrarily. Targeting the results the ranker cannot differentiate is not only low risk, but more likely to yield easy improvements.

Second, they did a large-scale online evaluation of their personalization approach using click data as the judgment of quality. That's pretty rare but important, especially for personalized search, where some random offline human judge is unlikely to know the original searcher's intent.

Third, their goal was not to be perfect, but just help more often than hurt. And, in fact, that is what they did, with the best performing algorithm "improving 2.7 times more queries than it harms".

I think those are good lessons for others working on personalized search or even personalization in general. You can take baby steps toward personalization. You can start with minor reordering of pages. You can make low risk changes lower down to the page or only when the results are otherwise tied for quality. As you get more aggressive, with each step, you can verify that each step does more good than harm.
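
A minimal sketch of that kind of conservative reranking, holding the top results fixed and reordering only the tail by a personal score (the scoring function here is a made-up stand-in):

```python
def conservative_rerank(results, personal_score, keep_top=2):
    """Leave the first keep_top results alone; reorder only the tail,
    where the ranker's ordering is close to arbitrary anyway."""
    head, tail = results[:keep_top], results[keep_top:]
    return head + sorted(tail, key=personal_score, reverse=True)

# e.g. score pages by overlap with topics from long-term browsing history
user_topics = {"python", "machine-learning"}
score = lambda r: len(user_topics & set(r["topics"]))
results = [{"url": "a", "topics": ["news"]},
           {"url": "b", "topics": ["sports"]},
           {"url": "c", "topics": ["cooking"]},
           {"url": "d", "topics": ["python"]}]
print([r["url"] for r in conservative_rerank(results, score)])
# ['a', 'b', 'd', 'c'] -- the top two untouched, 'd' promoted in the tail
```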

One thing I don't like about the paper is that they only investigated using long-term history. There is a lot of evidence (e.g. [1] [2]) that very recent history, your last couple searches and clicks, can be important, since they may show frustration in an attempt to satisfy some task. But otherwise great lessons in this work out of Microsoft Research.

Monday, May 09, 2011

Quick links

Some of what has caught my attention recently:
  • Apple captured "a remarkable 50% value share of estimated Q1/11 handset industry operating profits among the top 8 OEMs with only 4.9% global handset unit market share." ([1]). The iPhone generates 50% of Apple's revenue and even more of their profits. To a large extent, the company is the iPhone company. ([2] [3]) But, Gartner predicts iPhone market share will peak in 2011. ([4])

  • Researchers find bugs in payment systems, order free stuff from Buy.com and JR.com. Disturbing that, when they contacted Buy.com to report the problem, Buy.com's accounting systems had the invoice as fully paid even though they never received the cash. ([1])

  • Eric Schmidt says, "The story of innovation has not changed. It has always been a small team of people who have a new idea, typically not understood by people around them and their executives." ([1])

  • Netflix randomly kills machines in its cluster all the time, just to make sure Netflix won't go down when something real kills their machines. Best part, they call this "The Chaos Monkey". ([1] [2])

  • Hello, Amazon, could I borrow 1,250 of your computers for 8 hours? ([1])

  • Felix Salmon says, "Eventually ... ad-serving algorithms will stop being dumb things based on keyword searches, and will start being able to construct a much more well-rounded idea of who we are and what kind of advertising we're likely to be interested in. At that point ... they probably won't feel nearly as creepy or intrusive as they do now. But for the time being, a lot of people are going to continue to get freaked out by these ads, and are going to think that the answer is greater 'online privacy'. When I'm not really convinced that's the problem at all." ([1])

  • Not sure which part of this story I'm more amazed by, that Google offered $10B for Twitter or that Twitter rejected $10B as not enough. ([1])

  • Apple may be crowdsourcing maps using GPS trail data. GPS trails can also be used for local recommendations, route planning, personalized recommendations, and highly targeted deals, coupons, and ads. ([1] [2] [3])

  • Management reorg at Google. Looks like it knocks back the influence of the PMs to me, but your interpretation may differ. ([1] [2])

  • If you use Google Chrome and go to google.com, you're using SPDY to talk to Google's web servers, not HTTP. Aggressive of Google and very cool. ([1] [2])

  • Shopping search engines (like product and travel) should look for good deals in their databases and then help people find good deals ([1])

  • When Apple's MobileMe execs started talking about what the poorly reviewed MobileMe was really supposed to do, Steve Jobs demanded, "So why the f*** doesn't it do that?", then dismissed the executives in charge and appointed new MobileMe leaders. ([1] [2])

Friday, May 06, 2011

The value of Google Maps directions logs

Ooo, this one is important. A clever and very fun paper, "Hyper-Local, Direction-Based Ranking of Places" (PDF), will be presented at VLDB 2011 later this year by a few Googlers.

The core idea is that, when people ask for directions from A to B, it shows that people are interested in B, especially if they happen to be at or near A.

Now, certain very large search engines have massive logs of people asking for directions from A to B, hundreds of millions of people and billions of A to B queries. And it appears this data may be as useful as, or more useful than, user reviews of businesses and maybe GPS trails for local search ranking, recommending nearby places, and perhaps local and personalized deals and advertising.

From the paper:
A query that asks for directions from a location A to location B is taken to suggest that a user is interested in traveling to B and thus is a vote that location B is interesting. Such user-generated direction queries are particularly interesting because they are numerous and contain precise locations.

Direction queries [can] be exploited for ranking of places ... At least 20% of web queries have local intent ... [and mobile] may be twice as high.

[Our] study shows that driving direction logs can serve as a strong signal, on par with reviews, for place ranking ... These findings are important because driving direction logs are orders of magnitude more frequent than user reviews, which are expensive to obtain. Further, the logs provide near real-time evidence of changing sentiment ... and are available for broader types of locations.
What is really cool is that, not only is this data easier and cheaper to obtain than customer reviews, but also there is so much more of it that the ranking is more timely (if, for example, ownership changes or a place closes) and coverage much more complete.
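
At its core, the ranking signal is just counting votes, perhaps with decay so the signal stays fresh, as the paper's point about changing sentiment suggests (a sketch; the decay and ignoring the origin are my simplifications):

```python
import time
from collections import Counter

def rank_places(direction_queries, now=None, half_life_days=90.0):
    """Rank places by an exponentially decayed count of direction
    queries ending at them: each "directions from A to B" request
    is a vote for B, with recent votes counting more."""
    now = now or time.time()
    votes = Counter()
    for origin, destination, timestamp in direction_queries:
        # The paper notes the vote is especially meaningful when the
        # user is at or near the origin A; that is ignored here.
        age = now - timestamp
        votes[destination] += 0.5 ** (age / (half_life_days * 86400))
    return [place for place, _ in votes.most_common()]
```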

I find it a little surprising that Google hasn't already been using this data heavily. In fact, the paper suggests that Google is only beginning to use it. At the end of the paper, the authors write that they hope to investigate what types of queries benefit the most from this data and then look at personalizing the ranking based on each person's specific search and location history.

Monday, April 25, 2011

Resurgence of interest in personalized information

There has been a lot of news about personalization and recommendations of information in the last week.

Google News launched additional implicit personalization of news based on your clickstream, bringing it yet another step closer to the Findory of seven years ago.

Yahoo reversed an earlier policy to keep data only for 90 days, now upping it to three years, because tossing data away was hurting their efforts to improve relevance and personalize their search, news, and advertising.

Hank Nothhaft writes at Mashable that the Facebook News Feed really needs to be personalized implicitly based on what you consume and surface news from across the web. He says it should not only deliver "the best and most relevant news and blog posts on my favorite topics and interests, but it also recommends deals and product information, things to do and even media like videos and podcasts, all picked just for me" (which, if implemented as described, might also make Facebook News Feed similar to Findory).

Finally, Mark Milian at CNN points to all the efforts at newspapers and startups on personalized news. What strikes me about these is how few focus on advertising, which is the core business of newspapers, and improving the usefulness and revenue of ads on news sites. Former Google CEO Eric Schmidt had some useful thoughts on that some time ago that are still worth reading for those working on personalizing information today.