Hey, there! Have you ever wondered how your phone magically finds and points you to the nearest coffee shop? Let’s dive into the world of location searches and maps. Imagine you’re in a new city, craving a good cup of coffee. How does your phone know where to find the nearest café? That’s the magic of geospatial algorithms at work!

In this blog, we’ll unravel the secrets behind these algorithms, exploring how they tackle the challenge of finding locations(let’s call them businesses) efficiently. Get ready for a simple yet fascinating journey into the tech that powers your digital exploration!

Two-Dimensional Search

The most intuitive but native way to search nearby businesses is to draw a circle with a predefined radius and find all the businesses within the circle as shown in the below image.

2dsearch

Let's say you have a business table that stores business details with latitude and longitude and you want to search all businesses in 2D search. The SQL query for this process would be like

SELECT id, latitude, longitude
FROM business
WHERE (latitude BETWEEN {:your\_lat} - radius AND {:your\_lat} + radius)
AND (longitude BETWEEN) {:your\_long} - radius AND {:your\_long} + radius)

This query is not efficient because we need to scan the whole table.

What if we build indexes on longitude and latitude columns? Would this improve the efficiency?

The answer is not by much. The problem is that we have 2D data and the dataset returned from each dimension could still be huge.

The problem with this approach is that the database index can only improve search speed in one dimension. So naturally, the next question is, can we map two-dimensional data to one dimension? The answer is yes.

This can be achieved by geospatial indexing. There are two types of geospatial indexing approaches.

table

The underlying implementations of those approaches are different, but the high-level idea is the same, i.e. to divide the map into smaller areas and build indexes for the fast search.

Let’s take a look at them one by one.

In this blog, we will cover the Hash algorithms.

1. Evenly Divided grid

This is a very simple approach. This approach divides the world into small grids. This way, one grid could have multiple businesses, and each business on the map belongs to one grid.

evenly

This approach works to some extent, but it has one major issue. The distribution of the businesses is not even. There could be a lot of business in big cities like Delhi, and Mumbai, while other grids in deserts or oceans have no business at all. By dividing in an even grid we produce a very uneven data distribution. Ideally, we want to use a more granular grid for dense areas and large grids in sparse areas. Another major issue with this approach is to find the neighboring grids.

2. Geohash

This approach is better than the evenly divided grid. It is based on the concept of dividing the Earth’s surface into a hierarchical grid of cells. Each cell is represented by a unique Geohash code, which is a string of characters. Geohashing is commonly used in geographic information systems (GIS), databases, and location-based services to quickly index and retrieve spatial data.

Let's go over how Geohash works at a high level.

table

First, let’s divide the planet into four quadrants as shown in above Fig(a). The division is like

  • The latitude range [-90,0] is represented by 0
  • The Latitude range [0,90] is represented by 1
  • The Longitude range [-180, 0] is represented by 0
  • The Longitude range [0, 180] is represented by 1

Second, divide each grid into four smaller grids. Each grid can be represented by alternating between the longitude bit and latitude bit.

Repeat this subdivision until the grid size is within the precision desired. Geohash uses base32 representation. Ex: 9q9hvu, 9q9jhr

Size of Grid

The size of the grid is determined by the precision factor. Geohash has 12 precisions(also called levels) as shown in the table.

table

3. Uber’s H3

Uber H3 is a geospatial indexing system created by Uber Technologies. In simpler terms, it’s a way to organize and identify locations on the Earth’s surface using a hierarchical hexagon grid.

Instead of using traditional square grids or latitude/longitude coordinates, Uber H3 divides the world into hexagons. Hexagons are useful because they can cover a spherical surface more efficiently than squares, and they provide a nice balance between size and shape.

table

Just like Geohash, H3 also has levels(called resolutions). At each level, hexagons are subdivided into smaller ones. This hierarchical structure allows for efficient representation of geographical areas at various scales.

table