Proposed format(s) for geotagging arbitrary types of media

Yet more thoughts on geotagging – here’s what I’ve come up with so far.

The format needs to handle only two fundamental data types – points and polygons. It also obviously needs to handle “lines” or tracks, but those are made of “points”. Polygon, for my purposes, might be unnecessary and I’m not sure if I should leave it in. I’m reluctant to leave it out – that way you could easily georeference media to a building or field’s outline, for example. On the other hand, I’m trying to keep this format terse and concise – I’m not trying to merely embed .gpx or .kml files in things.

A “point”, as I am thinking of defining it here, is made of up to seven attributes (more or less in order of importance): a latitude/longitude pair, elevation, timestamp, track-ID, heading, and angle. A polygon is the same, except that it contains a list of at least three lat/lon/optional-elevation sets. It still only has a single timestamp, though, just like a “point”. I suppose in some odd cases one could even define a track as a series of polygons – defining the field of view in a video taken from the bottom of an airplane that’s taking off, for example.

Leaving aside the question of polygons for now, I’m envisioning two possible formats which I will arbitrarily name “geotag” (XML-type) and “geostring” (simple text) for the moment.

I picture a geotag entry looking something like this:

<geotag:point lat="41.228063" lon="-115.058119" elev="1720.901m" datetime="20071115T143000-06" trackid="1" heading="340" angle="-5.0">Metropolis Hotel</geotag:point>

In this format, the optional description of the point goes between the opening and closing tags. “lat” and “lon” might be better as a single “latlon” or “coord” attribute, with the latitude and longitude separated by a comma (i.e. <geotag:point coord="41.228063,-115.058119">:</geotag:point>)
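For what it’s worth, here is a minimal Python sketch of how a reader might pull the fields out of a geotag entry like the one above. One wrinkle: the bare fragment uses a “geotag:” prefix that isn’t declared anywhere, so a strict XML parser will reject it on its own. The sketch gets around that by wrapping the fragment in a root element bound to a made-up namespace URI. That URI and the function name are purely my own placeholders, not part of any proposal.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI; nothing in the proposal defines one yet.
GEOTAG_NS = "http://example.org/geotag"


def parse_geotag_point(fragment):
    """Parse one <geotag:point> fragment into a dict of its fields."""
    # The fragment's "geotag" prefix is undeclared, so wrap it in a root
    # element that binds the prefix to the placeholder namespace.
    wrapped = '<root xmlns:geotag="%s">%s</root>' % (GEOTAG_NS, fragment)
    root = ET.fromstring(wrapped)
    point = root.find("{%s}point" % GEOTAG_NS)
    if point is None:
        raise ValueError("no geotag:point element found")

    fields = dict(point.attrib)
    # Accept either separate lat/lon attributes or a combined "coord".
    if "coord" in fields:
        lat, lon = fields.pop("coord").split(",")
        fields["lat"], fields["lon"] = lat, lon
    fields["lat"] = float(fields["lat"])
    fields["lon"] = float(fields["lon"])
    # The optional description is the element's text content.
    fields["description"] = (point.text or "").strip()
    return fields


example = ('<geotag:point lat="41.228063" lon="-115.058119" elev="1720.901m" '
           'datetime="20071115T143000-06" trackid="1" heading="340" '
           'angle="-5.0">Metropolis Hotel</geotag:point>')
parsed = parse_geotag_point(example)
```

The other attributes are left as raw strings here, since how to interpret them is a separate question from how to find them.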

A “geostring” point might look something like this instead:

geostring:point:41.228063:-115.058119:1720.901m:20071115T143000-06:1:340:-5.0:geostring

Not sure if the closing “geostring” is really necessary here, but it would make backwards-compatibility easier if fields were added to future revisions. As with the geotag, it might be better to treat the lat/lon pair (the only mandatory information for a minimal “point” definition) as a single field, so the minimal “geotag” example above done as a “geostring” would look something like: geostring:41.228063,-115.058119::::::geostring
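To convince myself that this really is easy to read back, here is a rough Python sketch of a geostring parser. It is only an illustration under my own assumptions: it accepts both layouts shown so far (separate lat and lon fields, or a single comma-separated pair), treats the “point” type field and the closing “geostring” as optional, and leaves the remaining fields as raw strings. The function and field names are made up for the example.

```python
# Optional fields that follow the coordinates, in the order used above.
OPTIONAL_FIELDS = ("elev", "datetime", "trackid", "heading", "angle")


def parse_geostring(text):
    """Split a geostring point into a dict; empty fields become None."""
    parts = text.strip().split(":")
    if parts[0] != "geostring":
        raise ValueError("not a geostring")
    parts = parts[1:]
    # The closing marker is treated as optional.
    if parts and parts[-1] == "geostring":
        parts = parts[:-1]
    # The data type field is optional too; only "point" is handled here.
    if parts and parts[0] == "point":
        parts = parts[1:]
    if not parts or not parts[0]:
        raise ValueError("latitude/longitude are required")

    if "," in parts[0]:
        # Combined "lat,lon" field.
        lat, lon = parts[0].split(",")
        rest = parts[1:]
    else:
        # Separate lat and lon fields.
        if len(parts) < 2:
            raise ValueError("longitude is missing")
        lat, lon = parts[0], parts[1]
        rest = parts[2:]

    result = {"type": "point", "lat": float(lat), "lon": float(lon)}
    for name in OPTIONAL_FIELDS:
        result[name] = None
    for name, value in zip(OPTIONAL_FIELDS, rest):
        result[name] = value or None
    return result


full = parse_geostring(
    "geostring:point:41.228063:-115.058119:1720.901m:"
    "20071115T143000-06:1:340:-5.0:geostring")
minimal = parse_geostring("geostring:41.228063,-115.058119::::::geostring")
```

Note that splitting on colons only works because the ISO 8601 basic format has no colons in the timestamp.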

Even as I write this, I find myself leaning towards combining the latitude and longitude into a single field, if for no other reason than it means each point only has one required field. Either way, I currently think the fields ought to be defined thus:

  • latitude and longitude are decimal degrees. Either may be prefixed by a + or - (latitude: + = Northern Hemisphere, - = Southern Hemisphere; longitude: + = East, - = West); if no sign is given, + is assumed. Latitude and longitude are required for every point.
  • Elevation may be suffixed by “m” or “f” (for “meters” or “feet”). If neither is specified, meters are assumed.
  • Timestamp is in the ISO 8601 “basic format”. If neither “Z” nor an offset from UTC is specified, “the viewer’s local time” should be assumed (which is kind of silly, but it would still allow one to synchronize a track with, say, an audio or video recording).
  • trackid is any arbitrary alphanumeric term with a maximum of, say, 16 characters (is that enough?) Any points with the same trackid are assumed to be part of the same track. If unspecified, the point is assumed to be unrelated to any other points (if any exist) that may be in the same file.
  • Heading is in decimal degrees from 0 to 360. This represents facing a particular (horizontal) direction from the point in question. “Which direction the camera was pointing” in the case of a photograph.
  • Angle is in decimal degrees from -90 to 90. This represents an angle above or below the current elevation at that point (for a picture, this would represent the upward or downward angle that the camera was pointing when the picture was taken.)
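The field rules above could be sketched in Python roughly like this. Treat it as a thinking-out-loud example rather than a reference implementation: the helper names are invented, I’m assuming the international foot (0.3048 m) for the “f” suffix, and a timestamp with no zone comes back as a “naive” datetime to stand in for “the viewer’s local time”.

```python
import re
from datetime import datetime, timedelta, timezone

FEET_TO_METERS = 0.3048  # assumption: international foot

# ISO 8601 basic format: YYYYMMDDThhmmss with optional Z or +hh / +hhmm.
_TIMESTAMP = re.compile(
    r"^(\d{4})(\d{2})(\d{2})T(\d{2})(\d{2})(\d{2})(Z|[+-]\d{2}(?:\d{2})?)?$")


def elevation_in_meters(text):
    """Convert an elevation field to meters; no suffix means meters."""
    if text.endswith("f"):
        return float(text[:-1]) * FEET_TO_METERS
    if text.endswith("m"):
        return float(text[:-1])
    return float(text)


def parse_timestamp(text):
    """Parse a basic-format timestamp; no zone gives a naive datetime."""
    match = _TIMESTAMP.match(text)
    if not match:
        raise ValueError("not an ISO 8601 basic-format timestamp: %r" % text)
    year, month, day, hour, minute, second = (int(g) for g in match.groups()[:6])
    zone = match.group(7)
    if zone is None:
        tz = None  # stands in for "the viewer's local time"
    elif zone == "Z":
        tz = timezone.utc
    else:
        sign = -1 if zone[0] == "-" else 1
        offset = timedelta(hours=int(zone[1:3]), minutes=int(zone[3:5] or 0))
        tz = timezone(sign * offset)
    return datetime(year, month, day, hour, minute, second, tzinfo=tz)


def check_heading(text):
    """Heading must be decimal degrees from 0 to 360."""
    heading = float(text)
    if not 0.0 <= heading <= 360.0:
        raise ValueError("heading out of range: %r" % text)
    return heading


def check_angle(text):
    """Angle must be decimal degrees from -90 to 90."""
    angle = float(text)
    if not -90.0 <= angle <= 90.0:
        raise ValueError("angle out of range: %r" % text)
    return angle
```

Latitude and longitude need no special handling for the sign rule, since float() already accepts “41.2”, “+41.2”, and “-41.2”.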

Hmmm, if I shorten “geostring” to “geostr” and either eliminate the “data type” field (“point”) or just reduce it to a single letter, that entire and complete “geostring” example would shrink from 87 characters to 75 or 77. That is still a little too long for a single tiny 64-character comment field, if there are any file formats still floating around limited to that kind of small metadata size, though a minimal lat/lon-only point would fit easily.

My main goal here is to make it easy to create files tagged with this information. So long as it’s easily read and not likely to get separated from the file it describes, using the data for anything ought to be easy, even if one has to do it “by hand”. As was mentioned on the “Into the Pudding” blog (found via the GeoRSS blog), having applications that can read metadata is useless if nobody’s putting the metadata in their files to begin with. If an acceptable format can be worked out, I intend to start making as much georeferenced information available as possible.

Who’s with me? Comments, suggestions, offers of patronage, anyone?

More on geotagging

Some good comments came up in the last post on georeferencing. I thought a followup post was merited.

The itch I’m trying to scratch here is that I want to be able to georeference just about any kind of data, and I want to be able to embed the georeference information directly in the data file, whether it’s a graphic, or audio, or video, or gene sequence data, or anything else. I want to have a standard form for tagging any of these files. And I don’t want to store the location metadata in a separate file.

What I think I need, then, is a standard, simple way of making geographic notations in a terse, concise format that is both easily parsed by and readily recognizable to a computer, is reasonably human readable, and can be made to fit just about anywhere that arbitrary text is allowed.

Right now, there are only two types of files that have some way of embedding geographic information into them that I know of. The obvious one is that EXIF data in JPEG files can contain “GPS” tags. For hardcore GIS people, GeoTIFF is the other one. Both are for photographs or other still-image data only. What about the rest?

A variation of one of the current geotagging XML formats like the W3C (“<geo:lat>41.4354840</geo:lat><geo:lon>-112.6660845</geo:lon>”) or GeoRSS is an obvious possibility. XML has two potential problems though, as I see it. First, it’s not very terse – the markup substantially increases the amount of space the information takes up. I think in most cases that wouldn’t necessarily be a problem, but I suspect there are a few file formats out there with only comparatively small spaces set aside for a “comment” or “description” field.

The second potential “problem” is something odd that occurred to me today: it’s hard to pronounce out loud. There are some popular audio formats (e.g. “.wav”) that as far as I know have no space whatsoever for arbitrary text…but if my little standard was something that could be distinctly spoken, someone making a recording could literally speak the metadata in a format that a speech-to-text engine (like Sphinx) might be able to recognize and convert to a compatible string of text which could be parsed just like data from anywhere else. This is something of a corner case, I admit, but I think it’s at least worth considering.

Another good point that came up was what you do if your data extends beyond a single point. For example, if I want to georeference an audio recording I might make while narrating what I’m seeing out the window of a speeding train, it makes good sense to at least try to store line segments rather than just a point. That way, if someone wants to find the spot within a several-mile stretch where I suddenly exclaim “Hey, wow, look at that!” they can. The ability to define areas with a polygon or a point-and-radius seems like it would be handy, too, though obviously much more optional.
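As a rough illustration of why a track is more useful than a single point, here is a small Python sketch that estimates where along a track something happened, given the time into the recording. It assumes the track is a list of (seconds, latitude, longitude) tuples sorted by time, and it uses plain linear interpolation between neighboring points, which ignores the curvature of the earth but should be fine over short segments. The function name and the sample coordinates are made up for the example.

```python
from bisect import bisect_right


def position_at(track, seconds):
    """Estimate (lat, lon) at a given time along a time-sorted track.

    track is a list of (time_in_seconds, lat, lon) tuples. Times before
    the first point or after the last are clamped to the track's ends.
    """
    if not track:
        raise ValueError("track is empty")
    times = [point[0] for point in track]
    if seconds <= times[0]:
        return track[0][1:]
    if seconds >= times[-1]:
        return track[-1][1:]

    # Find the segment whose end time is the first one after `seconds`.
    i = bisect_right(times, seconds)
    t0, lat0, lon0 = track[i - 1]
    t1, lat1, lon1 = track[i]
    fraction = (seconds - t0) / (t1 - t0)
    return (lat0 + fraction * (lat1 - lat0),
            lon0 + fraction * (lon1 - lon0))


# Illustrative track: two points recorded one minute apart.
sample_track = [(0, 41.0, -115.0), (60, 41.1, -115.2)]
halfway = position_at(sample_track, 30)
```

With something like this, the “Hey, wow, look at that!” moment at a known time in the audio maps to an estimated spot on the line rather than just “somewhere on the trip”.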

So, let’s see, I’m looking for a format with minimal markup, but which is easily recognized, is made of plain text which could be crammed into, say, a PNG tEXt chunk, an mp3 comment frame, a GenBank “Source” field, or any other field which allows arbitrary text. I want a form that’s minimally objectionable to anyone else who might be willing to use it. And I think I want it to be able to handle points consisting of at least latitude, longitude, optional elevation, optional timestamp, and possibly even an optional heading and angle, and to handle more than one point per file (for the case of lines). Am I forgetting anything?
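To show what “crammed into a PNG tEXt chunk” could look like in practice, here is a small Python sketch using only the standard library. It builds a tEXt chunk (keyword, a null byte, then Latin-1 text, with a CRC-32 over the chunk type and data) and slips it in just before the IEND chunk. Using the standard “Comment” keyword for this is my own assumption, as are the function names. The geostring itself is just the text payload.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type, data):
    """Build one PNG chunk: length, type, data, CRC-32 of type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return (struct.pack(">I", len(data)) + chunk_type + data
            + struct.pack(">I", crc))


def add_text_chunk(png_bytes, keyword, text):
    """Return a copy of png_bytes with a tEXt chunk inserted before IEND."""
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    # tEXt data is: keyword, a null separator, then the text (Latin-1).
    data = keyword.encode("latin-1") + b"\x00" + text.encode("latin-1")
    chunk = make_chunk(b"tEXt", data)

    # Walk the chunk list until IEND, then splice the new chunk in there.
    offset = len(PNG_SIGNATURE)
    while offset + 8 <= len(png_bytes):
        length, chunk_type = struct.unpack(
            ">I4s", png_bytes[offset:offset + 8])
        if chunk_type == b"IEND":
            return png_bytes[:offset] + chunk + png_bytes[offset:]
        offset += 12 + length  # 4 length + 4 type + data + 4 CRC
    raise ValueError("no IEND chunk found")
```

Usage would be along the lines of reading a file’s bytes, calling add_text_chunk(data, "Comment", "geostring:41.228063,-115.058119::::::geostring"), and writing the result back out. Since the payload is plain text, the same string could go into any other comment-style field without changes.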

Besides “going to bed before 3am”?

I want to geotag something besides photographs!

Cornelia - Queen of the Snow!

For no particular reason, here is a picture of The Dog in her natural habitat. This picture really has nothing to do with today’s blog post, but since this is supposed to be a happy time of year, I suppose a happy picture is in order.

In case anyone is wondering if I’ve forgotten the supposed microbiological emphasis on this blog, the answer is no. In fact, I’ve got a post on amateur yeast culture brewing, but I’m still researching it a bit.

Meanwhile, it seems reasonable to post about geolocation, which after all is an important and useful trick for associating information with its place in The Big Room.

Geolocation of photographs is well established, at least for JPEG images. There are standard ways of tagging a JPEG file with an ICBM address, and I’ve been having a lot of fun doing this with my own pictures. (If you’re bored, you can browse them on Panoramio, and perhaps in a few weeks may stumble on some of them in Google Earth.)

There doesn’t appear to be any standard way of tagging other forms of media files, though. What if I want to geotag an .mp3 or OGG/Vorbis audio file recorded at a particular spot? Or a “DivX/Xvid” or OGG/Theora video?

Irritatingly, it seems as though a few people have mused about it, but nobody seems to have addressed it. There are projects like The Freesound Project which do geolocate sounds, but the geographic information is not actually embedded into the sound files in any way. As far as I can tell, the location is tracked in their own server’s database only. A Google search turned up a post on the “Random Connections” Blog musing about this, but the only application mentioned is adding georss tags to the RSS for a podcast feed, not to the podcast’s audio file itself. Even the otherwise excellent Mapping Hacks book (written before O’Reilly’s decline over the last couple of years into yet another “Proprietary Product® How-To Guides” publisher) mentions the topic in Hack #59, but disappointingly the hack appears to have nothing to do with tagging files; it is really about “interpolating a position from a GPS track, given a timestamp”.

This all comes up because we’re about to go on a roadtrip to check out a part of the country where we seem likely to end up living next year. I’ve been told I’ve got a pretty good voice, so I was considering generating a travelogue series along the way. It appears to be relatively easy to generate a “narrated picture” as a standard mp3 file, the picture being loaded as though it were “album art”. The only aspect of the whole thing that’s missing is geolocation. For now, I’d settle for being able to easily obtain the ICBM address associated with the file while playing it, so that one could plug the coordinates into Google Maps to see where the recording was done, but ideally I’d like to do it in a way that could be considered standardized, so that later on people might be encouraged to add geolocalization plugins to their media-playing software.

Sure, I can just generate a .kml file with a track of where we were, with markers containing picture and audio links. In fact, I probably will, but I don’t want people to have to use Google Maps or Google Earth to make use of the geolocation information associated with the audio.

Any suggestions, anyone?

Why you really do or don’t want me as a student…

Of the classes I took this last semester, there’s only one I haven’t blogged about at least once.

Masochist that I am, I went and took “Applied Calculus”, even though I’d gotten approval to count my previous semester of calculus (about 8 years ago) as fulfilling the mathematics requirement for graduation. The “applied” in the title of the class caught my eye, and after speaking to the instructor before the semester to find out what the class was like I decided that if there was time and money left I’d take the class. So I did.

Although I’d rank it as only the second most useful “Mathematics” course I’ve taken so far, Dr. Wolper was one of the best mathematics instructors I’ve had up to this point, so I’ve got no regrets for having spent the time and money to take it. I suspect I’ll remember a lot more of it than I did of the previous calculus class.

Anyway, getting to the point of this post:

There are times when I am unable to restrain myself and answer homework or exam questions in a terse, boring manner, regardless of the subject. If you’re an instructor and are wondering if you want me in your class, here is something to judge by.

Calculus (for those who don’t know) is more or less the math you use to deal with when, how, and how fast things change. In practical terms, when dealing with real-world applications this often means dealing with a graph of some data. A number of homework (and exam) problems this semester dealt with questions along the lines of “what would a graph of such-and-such a situation look like and how would you interpret it?”. Here’s one from early in the semester:

This was my answer:

You may judge for yourself whether this is a good answer or not…

I can has graduation?

The last undergraduate final is over.

Everything is taken care of save for one overdue library book, which I intend to take care of tomorrow.

All the other fees are paid. All the paperwork is done. I’m pretty sure I got well above the F- that was the minimum I needed on the Philosophy final to achieve the minimum passing grade. In fact, my only current stress about my grades is whether I managed to finish my last undergraduate semester with a 4.0.

I FEEL BETTER THAN JAMES BROWN! WHEEEEEEEEEEEEEEEE!!!!!!!!!!!!!!!!!!!!!!!!!!

Let the wild, uncontrollable drunken orgiastic celebration begin!

After my nap

Make it stop!

Specifically, I think I’m getting a severe case of Noel poisoning.

One of the things I hate most about Christmas is the incessant “re-imaginings” of the same handful of accursed songs, generally done in the same awful forced pretend-emotional tone.

They’ve got “The First Noel” playing in a late-1950s/early-1960s Disney Choir style. On a loop. For the last half hour so far.

Ugh. Make it stop…

Thank the Noodly One for headphones, Amarok, and the collection of hard bouncy techno music that happens to be on Igor here…

I’m down to the last class of the last week prior to next week’s finals, so I should have time for a real post again soon…

I’m having too much fun with this.

I finally managed to get Hugin to work, as you can see from the picture of the Dead Fish Museum above.

Okay, it’s the visitor’s center at the Fossil Butte National Monument, but it really is a museum of dead fish. And other fossils. If you click the image to get to the Panoramio page, you can even see where it is on the map: in fact if you zoom in, the building itself is visible in the aerial photo imagery.

Between digiKam’s ability to handle geocorrelation with tracks from my GPS, Panoramio’s support for geolocation and mapping (and connection to Google Earth…), playing with High Dynamic Range digital photography, and now panoramas, I’m beginning to develop an increased urge to travel around and take pictures again…

Nerd Photography in the Big Room

Readers may have noticed by now that I have a cheap but serviceable digital camera that I’ve been using to take pictures which occasionally show up here on the blog. (Hey, there’s another thing that the External Deliverer, in Its benevolence, might bring me: a nicer digital camera.)

I’ve been playing with geolocation for a while now. Just recently, I started also doing some crude playing with High Dynamic Range digital photography. It’s obviously going to take me some work to get it figured out and get better results, but what I’m getting so far doesn’t look too bad, at least in my own opinion. Kind of surreal, like Mars Rover pictures…

I’ve discovered that my Handy-Dandy Linux box has access to a couple of tools that make these easy.

I noticed a few days ago that digiKam is actually able to read .gpx format files downloaded from my GPS and then correlate the track from the GPS with the timestamps on the photos automatically, so in what little spare time I have I’ve been going back through my archives of GPS tracks and timestamped photos and trying to find as many to correlate as I can. I managed to get geolocation tagged into pictures from as long ago as three years or so. I also tagged this more recent one. I saw this place half a decade ago and had been wondering if it was still there. Last week we finally had a chance to visit and sure enough, it was there. If you were wondering where one could go to learn to do the Squirrel Dance, here it is.

Landscape and Sign:Don't Trespass on the 'I'

Today after classes I trudged up to the top of the hill at one corner of the campus with my trusty GPS in hand and took a few pictures, as you can tell. Since Google Earth seems to get most of its photos from Panoramio, I’ve started uploading them there. I may also get around to uploading them to flickr one of these days, too. I kind of need some pleasant distraction – I’m starting to hit the “Am I there yet???” phase of the semester. Just another week-and-a-half of classes, then finals, then I’m finally done. At least with the undergraduate stuff.

If you’re bored, there are a couple of additional pictures on the Panoramio site, here. You can also get the ICBM address there, and a .kml file for Google Earth so my pictures will pop up if you happen to run past an area where one of them is while you’re browsing the globe.