NFPA Journal - Perspectives, July/August 2015

Crowd data. New research analyzes information on cell phone use to estimate crowd size.

Author(s): Jesse Roman. Published on July 1, 2015.

ACCORDING TO A STUDY published in the journal Disaster Medicine and Public Health Preparedness, just seven people pushing in a single direction can exert more than 1,000 pounds of force, enough to bend steel railings. Multiply that number of people by 1,000, or 10,000, or 100,000, and imagine being near the front of that crowd as it presses forward in anticipation of a concert, a religious ceremony, a sporting event, or even a bargain at a big-box store.

In the past few decades, thousands of people around the world have died or been seriously injured in this way, many of compressive asphyxia, where victims are unable to expand their lungs under the force of bodies surrounding them. A 2014 study, “An Analysis of Mass Casualty Incidents in the Setting of Mass Gatherings and Special Events,” found that, from 1982 to 2012, there were 162 mass casualty incidents involving “the movement of people under crowded conditions.” It’s common for dozens if not hundreds of people to die in a single event. In 2012, the Fire Protection Research Foundation published a report, “A Literature Review of Emergency and Non-Emergency Events,” to help inform the NFPA 101®, Life Safety Code®, community on the topic.

The incidents span continents, cultures, and events. In 1990, a crowd crush in a pedestrian tunnel leading out of Mecca during the Hajj led to the deaths of 1,426 Muslim pilgrims. (Additionally, hundreds have died in separate crowd crushes on the Jamarat Bridge in Mecca during the annual “Stoning of the Devil” ritual, including 251 pilgrim deaths in 2004 and 345 deaths in 2006.) In 2008, 224 people were killed in a crowd-crush event at the Chamunda Devi temple in Jodhpur, India, when a rumor spread that there was a bomb planted in the temple. In 1989, 96 soccer fans died, mostly from compressive asphyxia, as thousands of fans poured into overcrowded terraces to watch a match at Hillsborough Stadium in Sheffield, England. Earlier this year, 28 people were killed at a soccer game in Cairo, and at least 16 died during a festival in Haiti.

While crowd disasters are complex problems with no easy solutions, Federico Botta, a 27-year-old PhD candidate and researcher at the University of Warwick Business School in Coventry, England, believes he might have found a tool that can help. Botta was the lead author of a study, “Quantifying Crowd Size with Mobile Phone and Twitter Data,” published in May in the online journal Royal Society Open Science, that details a promising new method for accurately and quickly estimating the number of people at an event using real-time cell phone data. As part of the study, Botta and his team analyzed two months of cell phone calls, smartphone Internet connections, and Twitter posts emanating from a soccer stadium in Milan, Italy, during 10 matches. They then used that information and the known attendance figures of the matches to find correlations and develop a method of accurately predicting the number of people in the stadium using the cell phone data.

That model could help public safety officials make critical real-time crowd-control decisions and better plan for future events, Botta says. It could also provide information to help researchers learn more about the dynamics of crowds and what causes crowd disasters.

NFPA Journal spoke with Botta about his research and its potential for preventing future crowd disasters.

What were the goals of the study?

Being able to accurately estimate the number of people in a crowd has traditionally been a very hard problem to solve. There are many techniques to estimate the size of a crowd, but unfortunately many of them are slow or inaccurate because they rely on human judgment.

Our idea was that almost everyone at an event today has a smartphone. People are making phone calls, checking Facebook and Twitter, checking their email. We thought that simply knowing the volume of this activity could provide information about the number of people in a given location. If there was a strong enough correlation, you could potentially have an accurate estimate of the number of people at an event, and that information would be immediately or almost immediately available. If you’re estimating the size of a crowd in an emergency, you obviously want the estimate to be accurate but also fast so you know how to react. That motivation is what led to our analysis.

You used known attendance figures at soccer matches to try to find correlations between crowd size and the volume of phone calls, smartphone Internet connections, and tweets emanating from the stadium. What did you find?

We obtained this data for 10 football matches and constructed a correlation using nine of the football matches and then tried to estimate the number of people at the tenth match to see how accurate we could be. What we found was quite exciting, because the relationships between smartphone use and the number of attendees were really, really strong. Our crowd estimates for the tenth match were within 13 percent of the actual number of attendees within the football stadium. For us that was quite remarkable considering we only had data for 10 matches. We think if we had a larger data set over a longer period of time, our results would be even stronger.
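The hold-one-out check Botta describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes a simple straight-line relationship between phone activity and attendance, the function names are hypothetical, and the numbers are invented for the example rather than taken from the study, whose actual model may differ.

```python
# Illustrative sketch only. The figures below are invented placeholders,
# NOT data from the study, and the straight-line model is an assumption.

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_held_out(activity, attendance, held_out):
    """Fit on every match except `held_out`, then estimate that match."""
    train_x = [a for i, a in enumerate(activity) if i != held_out]
    train_y = [p for i, p in enumerate(attendance) if i != held_out]
    slope, intercept = fit_line(train_x, train_y)
    return slope * activity[held_out] + intercept


# Hypothetical Internet-connection counts and attendance for 10 matches.
activity = [5200, 6100, 4300, 7900, 6600, 3900, 7200, 5800, 4700, 6900]
attendance = [41000, 47500, 35000, 62000, 52000, 31000, 57000, 45500, 37500, 54000]

last = len(activity) - 1
estimate = estimate_held_out(activity, attendance, held_out=last)
relative_error = abs(estimate - attendance[last]) / attendance[last]
print(f"estimated {estimate:.0f}, actual {attendance[last]}, "
      f"off by {relative_error:.1%}")
```

The relative error printed at the end is the kind of figure behind a statement such as "within 13 percent," although here it describes only the made-up numbers above.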

What data had the strongest correlation to crowd size—tweets, cell calls, or Internet use?

The strongest was mobile phone access to the Internet. Our interpretation is that, in general, mobile phones are connecting to the Internet and downloading emails even if users are not actively using the phones, as they would if they were making calls.

Obviously not everyone has a smartphone. How do you account for that?

This is what’s exciting to us. Our results seem to suggest that even if only a fraction of people are using smartphones, this fraction seems to be large enough to provide a correlation to accurately estimate the size of the entire crowd.

What about in developing countries, where the percentage of people with smartphones is far lower than in Western Europe or the U.S.?

I would expect that the relationship [between smartphone use and crowd size] would still be present, but the model would have to be recalibrated to take into account that a higher or lower percentage of people own smartphones.

For the study, you relied on cell phone data provided after the fact by an Italian phone company. Would it be possible to obtain this information and estimate crowd sizes in real time?

If this idea were to be realized, it would require availability of data immediately and involve cooperation between the relevant phone company, event organizers, and public safety agencies. If that happened, this could be done and implemented in almost real time. Twitter is different because it’s something a user is actively deciding to share on the Internet. Some of the information coming from Twitter is freely available in real time, so that could provide additional resources.

If further study proves this method to be viable, how could it be used to help prevent crowd-related deaths and injuries?

In general, having an accurate estimate of crowd size would be useful in any emergency situation where organizers or police want people to leave a specific location in a short period of time, whether the result of a threat or some other factor. How they choose to evacuate the crowd might depend on how many people there are. If the number of people in a location is much larger than what can be safely evacuated quickly, it’s better to have that information so you can make a quick decision to alter the plan.

Also, there may be cases where authorities want to limit the number of people in a location. If I know how many people are there already, I might be able to prevent a crowd disaster by keeping more people out.

Knowing the crowd size is a useful tool, but it seems like how you use that information is really the key.

You’re absolutely right. Our study only shows that providing quick and accurate estimates of the number of people at a gathering is possible. That single figure cannot give you the plan on how to respond to an emergency, but it may help you make planning and policy choices about what to do in the particular situation you have.

What more needs to be learned before this method of estimating crowd size becomes a viable technique that public safety officials can use?

Work needs to be done to build on our results and see what changes with different contexts and scenarios. We need a lot more data to see if similar correlations exist across communities, countries, and types of events. Our study is only related to football matches in the city of Milan, but we expect further work could build on our results and hopefully help lead this method to be implemented in real contexts.

How do officials currently estimate crowd sizes at events?

There is a range of techniques. Some of them rely on officials or news reporters to take aerial photographs on the day of an event and then someone counts the number of heads they can see in the photo, and from that they can infer how many people there are in total. That technique is fairly slow and can also be very inaccurate because it relies on human judgment. There are other techniques such as dividing an area into small cells and counting the number of people in a few cells and then multiplying. But that assumes that the density of people at the event is constant. Again, this is slow and can be subject to human judgment. You can imagine that if this is information you need in an emergency, it’s just not feasible.
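The cell-counting approach, and the weakness Botta points out, can be shown with a short Python sketch. The function name and the head counts are hypothetical, chosen only to show what happens when density is not constant across the area.

```python
# Illustrative sketch only. The per-cell head counts are invented.

def estimate_from_sample(cell_counts, sample_indices):
    """Count people in a few sampled cells, then scale to all cells.

    This assumes every cell holds roughly as many people as the sampled
    ones, which is the constant-density assumption described above.
    """
    sampled = [cell_counts[i] for i in sample_indices]
    mean_per_cell = sum(sampled) / len(sampled)
    return mean_per_cell * len(cell_counts)


# A hypothetical area divided into 20 cells.
uniform = [4] * 20                # same density everywhere, 80 people
clustered = [12] * 5 + [2] * 15   # crowd packed near one end, 90 people

print(sum(uniform), estimate_from_sample(uniform, [0, 1, 2]))
print(sum(clustered), estimate_from_sample(clustered, [0, 1, 2]))
print(sum(clustered), estimate_from_sample(clustered, [17, 18, 19]))
```

With even density, the estimate matches the true total. With the clustered crowd, sampling only the packed cells gives 240 against a true 90, while sampling only the sparse cells gives 40, so the answer depends heavily on which cells someone happens to count.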

What are some other potential applications for your research?

One possibility is to obtain more accurate estimates of how many people attended a protest, rally, or some other important event. On a larger scale, there may be applications for public health, such as how a disease can spread. If we have a dynamic view of where people are, the modeling of how a disease or pandemic spreads could be updated to include where people are in real time. I think this knowledge could be of great importance in many situations.

Using the vast amount of data being generated in our world to spot trends and gain efficiencies seems to be a huge area of research at the moment.

The crowd size project is part of a larger project happening here at the Data Science lab in Warwick Business School. There are numerous projects looking at what all of this data from Twitter, searches on Google and Wikipedia, and more can tell us about our behavior. For example, researchers here are looking at whether we can use Google searches to estimate how many people have the flu in a given area. We’re trying to find out if this information can help us predict what will happen in certain situations in the near future. It’s a very exciting field of research. It’s information that could be used by a lot of different people, including policy makers, to make our lives safer and better.

JESSE ROMAN is the staff writer of NFPA Journal.