Step One: Acquiring the Knowledge

18th Jan 2012 | Graham Rhind | Addressing

When you step into a London black cab and give your destination address, it would be unusual if the driver spent the next five minutes consulting maps and working out routes.  This is because each driver must learn about London’s streets and routes in depth before obtaining their license.

When we work with data, we tend to do things the other way around.  We create database structures, input screens, reports, processes and analyses based on the knowledge we already have. As this knowledge is usually based on our own cultural background, it can work well with data coming from our own country, in our own language, for people of our own ethnic background.


When we apply the same structures and processes to data from other countries or regions, we start coming across issues which we don’t expect.  How do we fit that 80-character German company name into a field of only 40 characters?  What do we do with Islamic names that don’t fit into the Anglo-Saxon first name/last name fields in our database? How do we ensure that we collect the names in the right field for people who write their names starting with the surname or family name? How do we make that householding procedure work for data from countries where each member of a family has a different surname? How do we fit that seven-line address into a structure intended for two lines? Why doesn’t our de-duplication work as well in countries where each postal code covers a larger and more populous area?

Once we reach the point where the data and systems are in place, it is rare that an organisation recognises quickly that these structures are poorly designed and need overhauling.

More often, time and money are spent trying to shoehorn data into unsuitable structures and processes. I see many otherwise successful and go-ahead internationally active companies struggling with systems designed to work for data from only one cultural area. Ultimately, their attempts to firefight the issues fail. Either the system is retained with unacceptably low data quality, with the resultant consequences for the commercial health of the company and its international operations, or the whole system gets trashed and (sometimes) a new, improved system is implemented, at huge cost.

Read the map before you set out

We know about the people, addresses and places of the area where we were raised. But if you want to work internationally, you need to know about other languages, cultures and places too.

This does not mean that you need to create systems that will work everywhere (a hugely complex technical challenge). What is important is that the choices that you do make are informed. If you decide to create your company name field with only 40 characters, for example,  knowing in advance that there will be company names that don’t fit into the field, and how to tackle that issue, will be to the benefit of your data and to your commercial success.
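One way to make that kind of choice an informed one is to measure a sample of real data against the proposed field length before the structure is fixed. The sketch below is only an illustration: the function name `overflow_report`, the 40-character limit and the sample company names are all assumptions made up for this example, not taken from any real system.

```python
def overflow_report(names, max_len=40):
    """Report which names would not fit in a field of max_len characters.

    Hypothetical helper for illustration. len() counts Unicode code points,
    which may differ from the byte length a database column actually enforces.
    """
    too_long = [name for name in names if len(name) > max_len]
    share = len(too_long) / len(names) if names else 0.0
    return {
        "max_len": max_len,
        "checked": len(names),
        "too_long": too_long,
        "share": share,
    }


# Invented sample data: one short name and one long German-style company name.
sample_names = [
    "Acme Ltd",
    "Beispiel Maschinenbau und Anlagentechnik Gesellschaft mit beschränkter Haftung",
]

report = overflow_report(sample_names, max_len=40)
for name in report["too_long"]:
    print(f"Does not fit in {report['max_len']} characters: {name}")
```

Run against a representative sample from each country you plan to serve, a check like this shows in advance how many records will need a longer field, an agreed abbreviation rule or an overflow field.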

Acquire the knowledge. Before you start.

About The Author


Graham Rhind

Graham Rhind is an acknowledged expert in the field of data quality. He runs his own consultancy company, GRC Database Information, based in The Netherlands, where he researches postal code and addressing systems, collates international data, runs a busy postal link website and writes data management software. Graham speaks regularly on the subject and is the author of four books on the topic of international data management. You can find him on Twitter via @grahamrhind.

