Excelling Beyond the Spreadsheet: Hector Garcia-Molina Kicks Off 2008 Yahoo! Research Big Thinkers Series

In support of what he describes as a “biodiversity” project, Stanford University Professor Hector Garcia-Molina demonstrated PhotoSpread, a spreadsheet system for organizing and analyzing photo collections. A self-proclaimed amateur photographer, he welcomed the crowd at Yahoo!’s Mission College campus in Santa Clara and greeted a few of his former students from Stanford in the audience.

First, he introduced his group at Stanford, known as “Infolab.” The group focuses on obtaining, managing, and exploiting information, with work ranging from traditional database problems to Web-related issues and, most recently, social networking. Infolab is currently building a social networking site for recommending academic courses to students. The need for the photo spreadsheet system came from field biologists at Stanford University, whose research is based on a large collection of annotated photos.

Together with Sean Kandel, Eric Abelson, Andreas Paepcke, and Martin Theobald, Garcia-Molina worked with the biologists to collect, manage, and share biodiversity information, such as photographs of species in different environments that can be used to analyze changes in appearance linked to pollution or sunlight. The biologists are currently studying animal species in the Jasper Ridge preserve near Stanford University, using thousands of photographs to understand patterns, sequences, and environmental impacts on evolution.

The biologists had been using a traditional spreadsheet system to capture their data. They liked the spreadsheet for its ability to sort by different categories, but complained that they could not view the photos in it. Garcia-Molina saw this as an opportunity to create something more useful, and he immediately brought up his demo of PhotoSpread. Resembling a calendar in appearance, PhotoSpread is made up of cells that can each hold a set of photographs, with every photo carrying its own set of metadata. A third of the page is reserved for a close-up view of the photos in a selected cell.

A crescendo of laughter echoed in the room as photographs of Yahoo! researchers appeared in Garcia-Molina’s demo. Each researcher was classified using metadata tags for the person’s age, research area, and gender. He then showed the crowd how to extract different kinds of information, such as the average age and the number of women in certain research areas, by writing formulas in cells. Using the drag-and-drop function, Garcia-Molina moved photos around, both in sets and individually, to view other types of information. He also demonstrated the ability to divide original clusters of photos, correct misclassified photos, create exceptions for individual photos in a set, and add tags to create new classifications.
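The idea behind the demo, cells that hold sets of tagged photos and formulas that aggregate over their metadata, can be sketched in a few lines of Python. This is only an illustration of the concept, not PhotoSpread's actual formula language or code: the `Photo` class, the helper functions, and the sample data are all made up for this example.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Photo:
    """A photo plus its metadata tags (hypothetical structure)."""
    filename: str
    tags: dict = field(default_factory=dict)


def select(cell, **conditions):
    """Return the photos in a cell whose tags match every condition."""
    return [
        p for p in cell
        if all(p.tags.get(key) == value for key, value in conditions.items())
    ]


def average(cell, tag):
    """Average a numeric tag over all photos in a cell."""
    return mean(p.tags[tag] for p in cell)


def count(cell, **conditions):
    """Count the photos in a cell that match the given tag values."""
    return len(select(cell, **conditions))


# One "cell" holding a set of photos; the people and values are invented.
cell_a1 = [
    Photo("r1.jpg", {"age": 30, "area": "search", "gender": "F"}),
    Photo("r2.jpg", {"age": 40, "area": "search", "gender": "M"}),
    Photo("r3.jpg", {"age": 50, "area": "databases", "gender": "F"}),
]

# "Formulas" written against the cell, in the spirit of the demo.
average_age = average(cell_a1, "age")
women_in_search = count(cell_a1, area="search", gender="F")
print(average_age, women_in_search)
```

In this sketch, correcting a misclassified photo amounts to changing one of its tags, after which any formula that reads the cell would give an updated answer the next time it is evaluated.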

As he summarized his talk for an audience that was eager to learn more about PhotoSpread’s usability, Garcia-Molina reminded the crowd that the project is in its early stages, with a birth date of summer 2007. At this stage of development, PhotoSpread excels beyond the traditional spreadsheet in four ways: each cell can contain a set of objects, each object has metadata associated with it, extended formulas can aggregate different types of information, and objects can be dragged and dropped easily. What’s next for PhotoSpread? According to Garcia-Molina, there will be further field tests, possible pivot operations in which a group of tagged photos is spread out (converting one cell into an array of cells), and peer-to-peer photo sharing.
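The proposed pivot operation, converting one cell into an array of cells, might look something like the following sketch. It assumes that a pivot groups the photos of a single cell by the value of one tag; the talk did not spell out the exact semantics, and the `pivot` function and sample data here are hypothetical.

```python
from collections import defaultdict


def pivot(cell, tag):
    """Split one cell into several cells, one per distinct value of `tag`.

    Photos without the tag are collected under the key None.
    """
    cells = defaultdict(list)
    for photo in cell:
        cells[photo.get(tag)].append(photo)
    return dict(cells)


# A single cell of field photos; filenames and tags are invented.
field_photos = [
    {"file": "p1.jpg", "habitat": "grassland"},
    {"file": "p2.jpg", "habitat": "woodland"},
    {"file": "p3.jpg", "habitat": "grassland"},
    {"file": "p4.jpg"},  # not yet tagged
]

by_habitat = pivot(field_photos, "habitat")
for habitat, photos in by_habitat.items():
    print(habitat, [p["file"] for p in photos])
```

Keeping untagged photos in their own group, rather than dropping them, means nothing disappears from view when a cell is spread out.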

He grinned at his closing slide -- a scenic photograph of an oak tree “taken by an amateur photographer.”