Let’s modify our representation of addresses in the adr microformat

Microformats define a representation spec for addresses, called adr. This year, I made two distinct proposals to modify the current draft, but was turned down each time. In this post, I’m going to address the current problems and show how tiny enhancements can open new horizons in the retrieval of location-based information.

The problem

The current spec does not serve as a latitude and longitude carrier. Its properties are limited to post-office-box, extended-address, street-address, locality, region, postal-code and country-name, all text fields that together form an address. This schema was defined in vCard and migrated to the hCard microformat in 2004. Later, the need for address-based extraction led the microformats community to copy this format and call it adr. vCard’s final design spec came well before we had online maps. Nowadays, we have addresses all over the Web. Automatically linking these text addresses to locations on maps, or providing a preview on hover, would be the first basic attempts to improve our data representations. Unfortunately, maps speak mathematics rather than text addresses. In practice, there is a process that takes an address, transforms it into a (latitude, longitude) pair and pans the map to that location. This process is called geocoding, and it is far from perfect at today’s scale. Instead of depending on a geocoder to transform addresses into mathematical locations, I suggest that microformats enable built-in (lat, long) arrays in adr.

Extending adr with a set of latitudes and longitudes

What I’m going for is to extend adr with an optional list of (lat, long) values. So, in cases where coordinates are given, we can move to the location directly instead of asking a geocoder to land us there. But why use a list of coordinates and not a single point? Because in the spatial domain, different geometric structures are represented as different shapes. Examples are below.

  1. If you are talking about a city centre, it’s most likely to be a Point.
  2. The Mississippi River is a long, long Line.
  3. And a university campus is obviously a Polygon.
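To make the distinction concrete, here is a minimal Python sketch of how a consumer might guess the shape from a list of (lat, long) pairs. The function name `geometry_type` and the rule that a polygon is a ring of at least four pairs whose first and last entries match are assumptions made for illustration (the closed-ring rule is borrowed from GeoJSON); nothing like this is part of adr or geo.

```python
from typing import List, Tuple

Coordinate = Tuple[float, float]  # (latitude, longitude)


def geometry_type(coords: List[Coordinate]) -> str:
    """Guess which shape a list of (lat, long) pairs describes.

    Assumed convention (hypothetical, not from any microformat spec):
    one pair is a Point, a closed ring is a Polygon, anything else is a Line.
    """
    if not coords:
        raise ValueError("at least one coordinate is required")
    if len(coords) == 1:
        return "Point"
    # A ring that returns to its starting point encloses an area.
    if len(coords) >= 4 and coords[0] == coords[-1]:
        return "Polygon"
    return "Line"
```

With this convention, a single city-centre pair comes back as a Point, an open chain of pairs along a river as a Line, and a campus outline that ends where it started as a Polygon.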

In the image above, the British Museum is represented by 12 latitude and longitude pairs, which enclose its inner area. Alternatively, the museum can be represented by its centre point alone, (51.529038, -0.128403). Formally speaking, the museum is located at “British Museum, Great Russell Street, London, WC1B 3DG, UK”, and this translates to the coordinate I gave. What about using them together to form:

<div class="adr">

<div class="street-address">Great Russell Street</div>
<span class="locality">London</span>,
<span class="postal-code">WC1B 3DG</span>,
<div class="country-name">UK</div>

<div class="geo"> <!-- optional coordinates attribute from geo -->
<span class="latitude">51.529038</span>,
<span class="longitude">-0.128403</span>
</div>

</div>

In the example above, I’ve used geo to optionally include a single <lat, long> point that maps the address to a physical location. More useful structures could be defined within the standard to allow multiple point entries that form polygons, such as the 12-point representation of the British Museum in the image above. Or, more simply, multiple geo entries inside an adr may work.
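As a rough sketch of how a consumer could read an adr that carries one or more geo entries, the Python below uses only the standard library's `html.parser`. The class `AdrGeoParser`, the helper `parse_adr` and the second coordinate pair in `SAMPLE` are all hypothetical and made up for illustration; the nesting of geo inside adr is the proposal of this post, not something the current spec defines.

```python
from html.parser import HTMLParser

# Sample markup: an adr with two nested geo entries.
# The second pair is invented for illustration; it is not a real boundary point.
SAMPLE = """
<div class="adr">
<div class="street-address">Great Russell Street</div>
<span class="locality">London</span>,
<span class="postal-code">WC1B 3DG</span>,
<div class="country-name">UK</div>
<div class="geo">
<span class="latitude">51.529038</span>,
<span class="longitude">-0.128403</span>
</div>
<div class="geo">
<span class="latitude">51.529500</span>,
<span class="longitude">-0.127900</span>
</div>
</div>
"""


class AdrGeoParser(HTMLParser):
    """Collect address fields and (lat, long) pairs from adr markup."""

    ADDRESS_FIELDS = {"street-address", "locality", "postal-code", "country-name"}
    KNOWN = ADDRESS_FIELDS | {"latitude", "longitude"}

    def __init__(self):
        super().__init__()
        self.address = {}   # e.g. {"locality": "London"}
        self.points = []    # list of (lat, long) tuples, in document order
        self._field = None  # class name of the element currently being read
        self._lat = None    # latitude waiting for its matching longitude

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        self._field = next((c for c in classes if c in self.KNOWN), None)

    def handle_endtag(self, tag):
        self._field = None

    def handle_data(self, data):
        text = data.strip()
        if not text or self._field is None:
            return  # separators such as "," between elements are ignored
        if self._field == "latitude":
            self._lat = float(text)
        elif self._field == "longitude":
            if self._lat is not None:
                self.points.append((self._lat, float(text)))
                self._lat = None
        else:
            self.address[self._field] = text


def parse_adr(markup):
    """Return (address fields, list of (lat, long) points) for one adr block."""
    parser = AdrGeoParser()
    parser.feed(markup)
    return parser.address, parser.points
```

Calling `parse_adr(SAMPLE)` yields the text fields as a dictionary and the coordinates as a list, so a map client could pan to the first point directly, or draw a shape when several points are present, and fall back to a geocoder only when the list is empty.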

TODO: Write about the impact this usage can bring.