GeoQR – Using QR Codes for Geotagging Video

This could so easily have been a lazyweb post: “Oh hai internets, you could encode a geotag in QR Code and then overlay it on video frames, making each frame uniquely tagged with its location, kthxbai” and left it at that. But instead, in the spirit of “Get Excited and Make Things” I decided it was time to make something instead of just talking about it.

Geotagging Video

You might imagine if you’re going to geotag video, you could start by using a similar system to EXIF tags (which store geodata into the metadata of a static image). Yes, there’s probably some space for metadata in a video file header, but all that will do is mark the beginning of the recording. You need location data per-frame (or at least periodically) to make it worthwhile.

Method 1 – In-frame metadata

I’m sure there’s some highly complex MPEG technical definition for encoding data into each frame of the video, but it started to make my brain melt reading about codecs, fields and multiplexing. And coming to implement it, you’d need to start messing with low-level coding, and that’s where I leave off.

Method 2 – Ready, steady, GO!

Press record simultaneously on your GPS logger and video recorder, then afterwards go back and post-process the video to make an index of location/timestamp/frame. Low-tech, but it works.
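As a rough sketch of that post-processing step: assuming both devices really did start at the same instant, the video runs at a constant frame rate, and the GPS log has been reduced to (seconds since start, lat, lon) tuples, you can estimate a position for any frame by interpolating between the two nearest fixes. The function name and data layout are made up for illustration.

```python
from bisect import bisect_right


def locate_frame(frame_number, fps, track):
    """Estimate (lat, lon) for a frame by interpolating between GPS fixes.

    track is a list of (seconds_since_start, lat, lon), sorted by time.
    """
    t = frame_number / float(fps)
    times = [point[0] for point in track]
    i = bisect_right(times, t)
    if i == 0:                 # frame is before the first fix
        return track[0][1:]
    if i == len(track):        # frame is after the last fix
        return track[-1][1:]
    t0, lat0, lon0 = track[i - 1]
    t1, lat1, lon1 = track[i]
    ratio = (t - t0) / (t1 - t0)
    return (lat0 + ratio * (lat1 - lat0), lon0 + ratio * (lon1 - lon0))


# Hypothetical one-fix-per-second log
track = [(0, 51.7130, -1.2130), (1, 51.7132, -1.2132), (2, 51.7134, -1.2134)]
print(locate_frame(37, 25, track))   # frame 37 at 25 fps is 1.48 s in
```

The weak point is the "simultaneously" part: any offset between the two record buttons shifts every frame by the same amount, which is exactly what the in-frame methods avoid.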

Method 3 – In-band Audio

Use the audio track. That way you can completely self-contain the data within the recording medium, making it more or less random-access. (This method I’m leaving to http://www.wilsonpymmay.com with their forthcoming Geovideo product which inspired me to cook this one up.)

Method 4 – In-frame Marks/Characters

Put an OMR field (Optical Mark Recognition – Like OCR only easier to do) on the screen where it can be read and decoded. Trying to parse overlaid text is a big hassle, but a fixed-size area of the frame would make it trivial to look for the mark. Because of the great job that QR-Decoding software on phones does with wonky images, awkward angles and fuzzy cameras, it might even be possible to read the QR Code image without any cropping of the image.

QR Codes

You might have seen them and wondered what they are – they are the slightly ominous bit patterns you see on lampposts and Pepsi cans which suggest a big conspiracy you’re not privy to, such as alien trig points for a forthcoming invasion.

Fig 1. Timestamp and LatLong encoded in QR Code

Actually, they’re essentially 2-d barcodes, capable of encoding around 4,000 western characters. They’re popular in countries where it’s a real pain in the arse to type in the local character set, such as Kanji in Japan.

The QR Code in Fig 1 represents the string “20090401712:24:03 loc:51.713198,-1.213124”, a timestamped latlong reading of the Littlemore sewage works. If you have a Nokia N95 you might have a Barcode Reader installed as standard. Run the program and hold it up to the picture. *Click*, you have a georeference.
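For anyone wanting to build and read back a string like that in code, here is a minimal sketch. It keeps the "timestamp, space, loc:lat,lon" layout from the example but assumes an ISO 8601 UTC timestamp, which is a choice made for this sketch rather than the exact format used in Fig 1. The function names and example values are illustrative only.

```python
from datetime import datetime

STAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"   # assumed timestamp layout (ISO 8601, UTC)


def make_payload(when, lat, lon):
    """Build a 'timestamp loc:lat,lon' string ready for QR encoding."""
    return "%s loc:%.6f,%.6f" % (when.strftime(STAMP_FORMAT), lat, lon)


def parse_payload(text):
    """Split a payload back into (datetime, lat, lon)."""
    stamp, _, location = text.partition(" loc:")
    lat, lon = location.split(",")
    return datetime.strptime(stamp, STAMP_FORMAT), float(lat), float(lon)


payload = make_payload(datetime(2009, 4, 1, 12, 24, 3), 51.713198, -1.213124)
print(payload)
print(parse_payload(payload))
```

Six decimal places of a degree is roughly 10 cm of latitude, which is far more precision than a consumer GPS deserves, so the payload stays short without losing anything real.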

Implementation

So that’s the easy bit done. Now is when the work starts. So I thought I’d have a stab at proving it could work.

First, I did what I always do – Google for it and see if anything exists already. I found Georeferencing with QR Codes by Marc Pfister, which kind of proved the concept for me, though if I’m honest I still don’t have a clue what his intended use was.

To do this I would either need two months of solid programming or to glue some tools together to prove the concept. I went for the easy way:

Capture + Encoding

GPS -> gpsd -> python -> qrcode image -> vlc -> video file

Decoding

video file -> ffmpeg -> individual frames (and their qr-codes)
video file -> python -> gpx/kml output file with a link to each image
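The last stage of the decoding chain, turning decoded fixes into a GPX file with a link to each frame image, might look something like the sketch below. It assumes the QR Codes have already been decoded into (timestamp, lat, lon, image filename) tuples; the function name, the creator string and the filenames are placeholders, not part of any existing tool.

```python
import xml.etree.ElementTree as ET

GPX_NS = "http://www.topografix.com/GPX/1/1"


def fixes_to_gpx(fixes):
    """Return a GPX 1.1 document with one waypoint per decoded frame.

    fixes is a list of (iso_timestamp, lat, lon, frame_image) tuples.
    """
    gpx = ET.Element("gpx", version="1.1", creator="geoqr-sketch", xmlns=GPX_NS)
    for stamp, lat, lon, frame_image in fixes:
        wpt = ET.SubElement(gpx, "wpt", lat="%.6f" % lat, lon="%.6f" % lon)
        ET.SubElement(wpt, "time").text = stamp        # when the frame was shot
        ET.SubElement(wpt, "name").text = frame_image  # label shown on the map
        ET.SubElement(wpt, "link", href=frame_image)   # link to the frame grab
    return ET.tostring(gpx, encoding="unicode")


# Hypothetical decoded frame
print(fixes_to_gpx([("2009-04-01T12:24:03Z", 51.713198, -1.213124, "frame_00001.png")]))
```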

As is often the case with these things it’s better to prove the key components first before trying to line them all up.

Step 1 – Overlay a QR Code on a video

VLC (VideoLAN Client) might seem like a fairly average media player, but it has a few tricks up its sleeve. It can do things like multiplexing and transcoding on the fly using the VideoLAN Manager, or VLM. The mosaic feature seemed to do what I wanted: overlay a smaller picture on a bigger one. The benefit of using VLC is that it works in Windows and Linux (possibly Mac OS too, but I’ll leave that to someone with a Mac).

If you want it to sit up and beg, you can feed it VLM files:

/usr/bin/vlc --vlm-conf ./overlay.vlm --no-media-library --no-video-title --mosaic-keep-picture --extraintf logger --logfile ./vlc.log

Here we’re calling vlc with a VLM configuration file called overlay.vlm, which you can download and read along with, or if you’re impatient just download it and skip to ‘Generating the QR Code Image’. After that, we’re turning off the user interface guff, doing something with the mosaic, and enabling the logger interface.

Ready for the VLM? It starts like this:

# clear previous config
del all
# Setup input 'bg' for background image
new bg broadcast enabled
# Receive the input from Video4Linux2 Device /dev/video0 (my crappy webcam)
setup bg input     "v4l2:///dev/video0"
# enable the mosaic
setup bg option    sub-filter=mosaic
# this is the science bit, concentrate...
setup bg output    #bridge-in{offset=100}:transcode{vcodec=mp2v,vb=768,scale=1,acodec=mpga,ab=128,sfilter=mosaic}:duplicate{dst=display,dst=std{access=file,mux=ts,dst=myfile.mpg}}

This clears all existing config and defines a new broadcast channel called ‘bg’ (short for background, but you can call it anything you like), then receives input from the webcam at /dev/video0. After that, the mosaic magic starts (bit sketchy on this stuff), and then the behemoth of an output channel is configured. From what I can figure out, the colon (:) pipes the output through each filter or multiplexer or something, so in our case we pass it through the bridge (dunno), then transcode it (to MPEG-2 video at 768 kbps, don’t rescale it, use an audio codec of MPEG audio at a bitrate of 128 kbps and add the mosaic filter, maybe), then pipe it on to a duplicator, which outputs to both the display (i.e. a window on-screen) and to a file called myfile.mpg. (It’s multiplexed into a Transport Stream (mux=ts) for some reason. This is probably bad, since we lose timecode/index data. Must try AVI at some point.)
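If the nesting in that output string makes your eyes cross, it can help to build it from parts. The sketch below is not a VLC API, just string assembly with made-up helper names: each module is a name with an optional {key=value,...} block, and the modules are joined with colons after a leading #.

```python
def module(name, options=()):
    """Render one stream-output module, e.g. transcode{vcodec=mp2v,vb=768}.

    options is a sequence of (key, value) pairs; a list is used rather than
    a dict because keys such as dst can legitimately repeat.
    """
    if not options:
        return name
    body = ",".join("%s=%s" % (key, value) for key, value in options)
    return "%s{%s}" % (name, body)


def chain(*modules):
    """Join modules into a chain: '#first:second:third'."""
    return "#" + ":".join(modules)


# Rebuild the output string from the VLM file, piece by piece
to_file = module("std", [("access", "file"), ("mux", "ts"), ("dst", "myfile.mpg")])
sout = chain(
    module("bridge-in", [("offset", 100)]),
    module("transcode", [("vcodec", "mp2v"), ("vb", 768), ("scale", 1),
                         ("acodec", "mpga"), ("ab", 128), ("sfilter", "mosaic")]),
    module("duplicate", [("dst", "display"), ("dst", to_file)]),
)
print(sout)
```

Laid out like this it is easier to see that the file output is itself a module nested inside the duplicator, so swapping mux=ts for something else means changing one tuple.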

So, on its own that would input a webcam image, overlay nothing and then save it to disk and show it on screen. We need an overlay to replace the nothing (called a Mosaic in VLM terms).

# Configure some mosaic options. Some are just the defaults explicitly defined.
setup bg option mosaic-alpha=255
setup bg option mosaic-align=5
setup bg option mosaic-xoffset=0
setup bg option mosaic-yoffset=0
setup bg option mosaic-vborder=5
setup bg option mosaic-hborder=10
setup bg option mosaic-position=1
setup bg option mosaic-rows=2
setup bg option mosaic-cols=2
# choose the order in which to display mosaic inputs
setup bg option mosaic-order=overlayimage,_,_,_
setup bg option mosaic-delay=0
setup bg option mosaic-keep-picture
setup bg option mosaic-keep-aspect-ratio

This configures the options for the overlay image to appear at default size in the top-left of the frame. (All I can suggest is you muck about with them. They’re not brilliantly documented, and although I’m a big fan of examples, there doesn’t seem to be anything much beyond example configs in the VideoLAN wiki.)
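One option that is less arbitrary than it looks is mosaic-align. As far as I can tell from the VLC documentation, the value is a sum of flags (1 = left, 2 = right, 4 = top, 8 = bottom, 0 = centre), which is why 5 lands the overlay in the top-left. Treat the flag values here as an assumption to check against your own VLC version; the helper name is made up.

```python
# Assumed mosaic-align flag values, as listed in the VLC documentation
LEFT, RIGHT, TOP, BOTTOM = 1, 2, 4, 8


def describe_align(value):
    """Turn a mosaic-align number into a readable position."""
    vertical = "top" if value & TOP else "bottom" if value & BOTTOM else ""
    horizontal = "left" if value & LEFT else "right" if value & RIGHT else ""
    return "-".join(part for part in (vertical, horizontal) if part) or "center"


for value in (0, 5, 6, 9, 10):
    print(value, describe_align(value))
```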

Then configure the overlay channel:

# Fake a png into a video stream
new   overlayimage broadcast enabled
setup overlayimage input     "fake://" option "fake-file=./qrcode.png" option "fake-file-reload=1"
setup overlayimage output #duplicate{dst=mosaic-bridge{id=overlayimage,width=64,height=64,chroma=YUVA},select=video,select=audio,dst=bridge-out{id=0}}

The fake:// bit is very useful. It takes a static image and fakes it into being a video stream. In this case I’m using our (as yet ungenerated) qrcode.png as the image. This might fail on the first go, until the file exists, though it tends to fail gracefully (i.e. the overlay just doesn’t appear). I have set a fake:// reload timer of 1 second since the GPS data generally will only be refreshed at 1 second intervals.

Next is another one of those hideous output strings. Mostly it duplicates the output to the mosaic-bridge filter and ‘bridge-out’, whatever that is. Importantly we set the width and height of the overlay image here (though it can be done elsewhere or by image properties), and set the chroma to match the one the camera is using. Using the wrong chroma (RGB vs YUV) will mean you get rainbow blobs all over your image. Match them up and it’s crisp and clear.

And finally play it all:

# Launch everything
control bg play
control overlayimage play

This all happens in an instant, when VLC starts, so you end up with a VLC player and an output like this:

Fig 2. VLC framegrab showing mosaic image

Generating the QR Code Image

Incidentally, well done if you read all the VLC bit, the hard part is over and you only really need to do it once.

So now we need something to make the lovely image you see in Fig 2. You could just grab an image from the Google Chart API (see Fig 1) and then save it as qrcode.png, but we may not be able to access the web from our survey vehicle, and hammering the API every second for an image might soon alert the Google police. Also, the latency of the HTTP transaction might exceed a second, and you’d end up with a real lag on the geotags. Plus, it’s easier than you might think to generate them locally.

[At this point I think the instructions will go out of date pretty quickly. I ended up using python because it had libraries that supported GPSd and QRcodes and could run the process for me. Python's useful because it's lightweight(ish) and can even be run on some phones. I did find an example of how to manually create QRCodes, which might still come in useful, since PyQrcodec won't add borders, meaning the code can't be read from the video frames without cropping.]
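On the border problem: the QR specification asks for a "quiet zone" of four light modules around the symbol, and readers rely on it to find the code. If you do end up generating codes by hand, adding that border is simple once you have the symbol as a grid of modules. This sketch works on a plain list-of-lists rather than on PyQrcodec's image object, whose internals I don't want to guess at; the function name is made up.

```python
def add_quiet_zone(modules, border=4):
    """Surround a QR module matrix with `border` light modules on every side.

    modules is a list of rows, each a list of 0 (light) or 1 (dark).
    """
    width = len(modules[0]) + 2 * border
    padded = [[0] * width for _ in range(border)]            # top margin
    for row in modules:
        padded.append([0] * border + list(row) + [0] * border)
    padded.extend([0] * width for _ in range(border))        # bottom margin
    return padded


tiny = [[1, 0], [0, 1]]      # stand-in for a real 21x21 (or larger) symbol
padded = add_quiet_zone(tiny)
print(len(padded), len(padded[0]))
```

With a real image the equivalent would be pasting the code onto a slightly larger white canvas before saving it.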

I used PyQrcodec, which also can be used under Windows, but don’t ask me how – there’s more info on the PyQrcodec pages.

Here’s some code:

# start python, initiate PyQrcodec library
import PyQrcodec
codec = PyQrcodec
 
# encode a string, save it to disk
size, image = codec.encode('www.example.com')
image.save('qrcode.png')

Told you it was easy! I will give you a link to the complete source after we’ve done the GPS bit.

Getting the GPS

gpsd – apparently the easiest way to connect to your GPS in Linux. Ignore the stupid logo and the slightly pompous tone of the site, it does what we need for now. Perhaps there’s a less fiddly alternative, but for me parsing NMEA sentences is fiddly, so this will do. Windows people, this is where we part company. I haven’t had the time to work out an alternative. For all I know there’s actually a Windows version too.
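To give a flavour of why raw NMEA is fiddly, here is a sketch of just two of the chores gpsd takes off your hands: checking a sentence's checksum (the XOR of every character between the $ and the *) and converting the ddmm.mmmm coordinate format into decimal degrees. The function names are illustrative, and a real parser has a lot more to worry about than this.

```python
from functools import reduce


def nmea_checksum_ok(sentence):
    """Check the XOR checksum of a '$...*hh' NMEA sentence."""
    body, _, checksum = sentence.strip().lstrip("$").partition("*")
    calculated = reduce(lambda acc, char: acc ^ ord(char), body, 0)
    return checksum != "" and calculated == int(checksum, 16)


def nmea_to_decimal(value, hemisphere):
    """Convert NMEA ddmm.mmmm (or dddmm.mmmm) to signed decimal degrees."""
    dot = value.index(".")
    degrees = int(value[:dot - 2])       # everything before the minutes
    minutes = float(value[dot - 2:])     # two digits of minutes plus fraction
    decimal = degrees + minutes / 60.0
    return -decimal if hemisphere in ("S", "W") else decimal


# Hypothetical readings near the location in Fig 1
print(nmea_to_decimal("5142.7919", "N"), nmea_to_decimal("00112.7874", "W"))
```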

The key is to get the GPS reading into python, then stick it in the PyQrcodec.encode() function. Avoid messing with NMEA sentences unless you have to, since they have various caveats depending on your GPS receiver, phase of the moon, etc. If you run gpsd you can connect your GPS(es) up, get a fix and run them as a server, independently of the client software which will use them. Then whenever you need a fix just connect the client to the server and get a value. You can enable this always-on mode with the ‘-n’ switch which helps with GPS devices which won’t even try to get a fix unless they have an active connection (most bluetooth GPSes including Nokia LD-1W). To get your bluetooth GPS working under Ubuntu, try these instructions, then come back and try this:

sudo gpsd -b -N -n -D 2 /dev/rfcomm4

Options explained:

-b = Broken-device-safety mode, i.e. read-only, so you don’t accidentally wipe your GPS
-n = Don’t wait for a client to connect before getting a fix (essential for bluetooth devices)
/dev/rfcomm4 is my bluetooth GPS com port (yours will be whatever you configured in /etc/bluetooth/rfcomm.conf)

Debugging Options

-N = Don’t daemonize, but run in foreground. Useful for debugging.
-D 2 = Debug level 2 – a bit too much, but you can at least watch your client connecting and see proof of a GPS fix

Run gpsd as root and you can set the system clock with GPS time. Useful, no?

What you need now is a GPS client. Install the python-gps libraries, which include a gpsd client.

sudo apt-get install python-gps

To use the library try this:

import time, gps
# Start a GPSd client session
session = gps.gps()
 
# Experiments with a single-shot GPSd connection lead to bad results
# for best results, stay connected and query in a loop.
while 1:
        # GPSd query flags:
        # a = altitude, d = date/time, m = mode,
        # o = position/fix, p = position, s = status, y = satellites
        session.query('admosyp')
 
        # concatenate the data together
        # you could verify the GPS fix status at this point, which avoids the ambiguity of 0,0 coordinates
        #  which are most likely false, but must be supported
        mytext = str(session.utc) + ' ' + str(session.fix.latitude) + ',' + str(session.fix.longitude)
        print mytext
 
        # wait for a bit, then do it again
        time.sleep(1)

Which should generate a series of time, lat and long stamps. There is obviously no error checking or anything here, but if you use this in the field you’ll need something to handle a lack of fix from gpsd, otherwise you’ll produce lots of 0,0 stamps. (The sea south of Nigeria was never so popular as when they gave GPS to the masses.)
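A minimal sketch of that missing check, written as a standalone function so it can be tested without a GPS attached. It assumes gpsd's usual fix modes (1 = no fix, 2 = 2D, 3 = 3D), and that your version of the python-gps client exposes the mode as session.fix.mode; check both against your own install.

```python
import math

# gpsd fix modes (assumed here): 1 = no fix, 2 = 2D fix, 3 = 3D fix
MODE_NO_FIX, MODE_2D, MODE_3D = 1, 2, 3


def usable_fix(mode, lat, lon):
    """Return True only when there is a real fix with sane coordinates."""
    if mode not in (MODE_2D, MODE_3D):
        return False
    if math.isnan(lat) or math.isnan(lon):
        return False
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


print(usable_fix(MODE_NO_FIX, 0.0, 0.0))          # no fix, so 0,0 is rejected
print(usable_fix(MODE_3D, 51.713198, -1.213124))  # a genuine reading
```

Testing the fix mode rather than the coordinates means a genuine reading at 0,0 would still get through, which is the "must be supported" case from the code comment.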

Encoding the GPS

Combine the GPS code with the following bits of QRencode action:

import time, gps, PyQrcodec
# Start a GPSd client session
session = gps.gps()
 
# Start a QrCode encoder
codec = PyQrcodec
 
# Experiments with a single-shot GPSd connection lead to bad results
# for best results, stay connected and query in a loop.
while 1:
        # GPSd query flags:
        # a = altitude, d = date/time, m = mode,
        # o = position/fix, p = position, s = status, y = satellites
        session.query('admosyp')
 
        # concatenate the data together
        # you could verify the GPS fix status at this point, which avoids the ambiguity of 0,0 coordinates
        #  which are most likely false, but must be supported
        mytext = str(session.utc) + ' ' + str(session.fix.latitude) + ',' + str(session.fix.longitude)
        print mytext
 
        # minimum image width for decoding is ~256pixels, but we can do that with image enlargement during extraction
        size, image = codec.encode(mytext, image_width=128)
        image.save('qrcode.png')
 
        # wait for a bit, then do it again
        time.sleep(1)

There, so what we’ve done is call codec.encode(mytext, image_width=128), which encodes the GPS data as a 128×128 image, and save the result as qrcode.png, once a second.
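One thing that might be worth guarding against, though it is an assumption rather than something observed here: since VLC re-reads qrcode.png on its own timer, it could in principle catch the file half-written. A common way round that is to write to a temporary file in the same directory and rename it into place. The sketch below uses Python 3's os.replace and a made-up helper name, and takes raw bytes so it doesn't depend on any particular image library.

```python
import os
import tempfile


def save_atomically(data, path):
    """Write bytes to a temp file, then rename it over `path` in one step."""
    directory = os.path.dirname(os.path.abspath(path))
    handle, temp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(handle, "wb") as temp_file:
            temp_file.write(data)
        # Rename is atomic on POSIX when both paths are on the same filesystem
        os.replace(temp_path, path)
    except Exception:
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise
```

In the loop above you would save the image to an in-memory buffer, then hand the bytes to save_atomically instead of calling image.save directly.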

This is immediately picked up by VLC which then combines it with the video to produce a video stream.

Job done!

Decoding the Video

I think I’ll leave this til Part 2…

Conclusions

It worked. As you’ll see in Part 2, the QR Codes decode successfully and produce geotags once a second…

This solution cost me nothing in equipment, though we don’t all have these lying around, I know:

Laptop running Ubuntu 8.10
Logitech Quickcam E3500
Nokia LD-1W Bluetooth GPS

To make this worthwhile you will need a slightly better camera than I used. This leads to new problems. If you use an old analogue camera you’ll need to use a video capture card, complicating the setup. If you use a digital camera, it has to be able to stream its video output. Some do, but this is very expensive at the moment (2009). Give it some time and they will be cheaper. However, given time they might make geolocation part of MPEG. At best this project will make a stopgap geocoding method until we can afford these things. [A possible budget hack would be a head-up-display a bit like a teleprompter, made of a bit of glass reflecting an image from a mobile phone into the lens of the camera, but that's physical stuff and I'm a bit cackhanded.]

Code/Downloads

gps2qrcode.py – Python code
geoqrencode.txt – Shell Script, rename to .sh to make it work.
overlay.vlm – for your VLC bit