The offline geocoder we wanted

Posted by gipsyjaeger 4 hours ago

What is this?

This is an offline reverse geocoder written in Python. Given a latitude–longitude pair, it returns the correct administrative region such as country, state, or district without calling any external APIs. This avoids API costs, rate limits, and network dependency.

Why build another reverse geocoder?

Most offline reverse geocoders rely on nearest-neighbor lookups. While fast, this approach often fails near borders because the closest location is not always the correct administrative region. This project focuses on correctness over proximity by verifying which boundary a coordinate actually falls inside.
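To make the border problem concrete, here is a minimal sketch using only the standard library. The two rectangular regions, their names, and the helper functions are hypothetical and are not taken from the project; coordinates are treated as planar x/y rather than real latitude/longitude. It shows a point that lies inside a large region but is closer to the centre of its narrow neighbour.

```python
import math

# Two hypothetical regions (not real data); planar x/y coordinates.
REGIONS = {
    "A": [(0, 0), (10, 0), (10, 10), (0, 10)],    # large region
    "B": [(10, 0), (12, 0), (12, 10), (10, 10)],  # narrow neighbour
}

def centroid(polygon):
    """Average of the vertices (adequate for these rectangles)."""
    xs, ys = zip(*polygon)
    return sum(xs) / len(xs), sum(ys) / len(ys)

def contains(polygon, point):
    """Ray-casting point-in-polygon test."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def nearest_region(point):
    """Proximity-only lookup: pick the region with the closest centroid."""
    return min(REGIONS, key=lambda name: math.dist(point, centroid(REGIONS[name])))

def containing_region(point):
    """Boundary-aware lookup: pick the region whose polygon holds the point."""
    for name, polygon in REGIONS.items():
        if contains(polygon, point):
            return name
    return None

point = (9.5, 5.0)
print(nearest_region(point))     # B (closest centroid, but wrong region)
print(containing_region(point))  # A (the region the point is actually in)
```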

How does it work?

A KD-Tree is used to quickly shortlist nearby administrative boundaries. For those candidates, the system performs polygon containment checks to confirm the true region. It supports both single-process execution for small workloads and multiprocessing for large batch processing.
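The shortlist-then-verify idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: the tiny pure-Python KD-tree, the choice to index region centroids, the candidate count `k`, and every name here (`build_kdtree`, `k_nearest`, `locate`, `REGIONS`) are hypothetical, and the coordinates are planar rather than geographic.

```python
import heapq
import math

def build_kdtree(items, depth=0):
    """items: list of ((x, y), payload). Returns nested tuples or None."""
    if not items:
        return None
    axis = depth % 2
    items = sorted(items, key=lambda it: it[0][axis])
    mid = len(items) // 2
    return (items[mid],
            build_kdtree(items[:mid], depth + 1),
            build_kdtree(items[mid + 1:], depth + 1))

def k_nearest(node, point, k, depth=0, heap=None):
    """Return up to k (distance, payload) pairs, nearest first."""
    top = heap is None
    if top:
        heap = []
    if node is not None:
        (loc, payload), left, right = node
        d = math.dist(point, loc)
        # Max-heap via negated distance keeps the k best seen so far.
        if len(heap) < k:
            heapq.heappush(heap, (-d, payload))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, payload))
        diff = point[depth % 2] - loc[depth % 2]
        near, far = (left, right) if diff < 0 else (right, left)
        k_nearest(near, point, k, depth + 1, heap)
        # Search the far side only if it could hold a closer point.
        if len(heap) < k or abs(diff) < -heap[0][0]:
            k_nearest(far, point, k, depth + 1, heap)
    if top:
        return sorted((-nd, p) for nd, p in heap)
    return heap

def contains(polygon, point):
    """Ray-casting point-in-polygon test."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            if x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
                inside = not inside
    return inside

# Hypothetical regions (not real data).
REGIONS = {
    "A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "B": [(10, 0), (12, 0), (12, 10), (10, 10)],
    "C": [(0, 10), (12, 10), (12, 14), (0, 14)],
}

def centroid(polygon):
    xs, ys = zip(*polygon)
    return sum(xs) / len(xs), sum(ys) / len(ys)

# Assumption: the tree indexes one centroid per region.
TREE = build_kdtree([(centroid(poly), name) for name, poly in REGIONS.items()])

def locate(point, k=2):
    """Shortlist k nearby regions, then confirm by polygon containment."""
    for _, name in k_nearest(TREE, point, k):
        if contains(REGIONS[name], point):
            return name
    return None

print(locate((9.5, 5.0)))    # A (B's centroid is nearer, but B fails the check)
print(locate((6.0, 11.0)))   # C
print(locate((50.0, 50.0)))  # None (no candidate contains the point)
```

In this sketch a fixed `k` is a heuristic: if the true region is not among the `k` shortlisted candidates, the lookup returns `None`, so a real system would need a larger shortlist or a fallback. For large batches, a function like `locate` could in principle be spread across worker processes with the standard library's `multiprocessing.Pool.map`.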

Performance

The system processes 10,000 coordinates in under 2 seconds, with an average polygon validation time below 0.4 milliseconds per coordinate.

Who is this for?

Anyone who needs reverse geocoding, predictable costs, or large-scale batch processing.

Implementation notes

This started as a toy implementation to explore boundary-aware reverse geocoding, but it turned out to be reliable enough for real production use. The dataset covers more than 210 countries with over 145,000 administrative boundaries.

Links

Source code: https://github.com/SOORAJTS2001/gazetteer

Documentation: https://gazetteer.readthedocs.io/en/stable

Feedback is welcome, especially around the approach, performance trade-offs, and edge cases.

Comments

Comment by sixtyj 3 hours ago

Well done.

> it returns the correct administrative region such as country, state, or district

Do you have any plans to add some street level geocoding?

I know that database would be really heavy… but there could be a huge dataset with buildings, e.g. Global Building Atlas…

Comment by gipsyjaeger 3 hours ago

Hi. For a given location, the library returns its corresponding ADM2/ADM3, ADM1, and ADM0 levels, which are essentially county/city, state, and country.

As of now, I am planning to add more metadata to the location, such as pincode, population, etc.

Thanks!