
Knowing the vast, rural area of Tanzania is crucial to providing timely and effective help for girls during Female Genital Mutilation (FGM) “cutting seasons.” In recent years, we have managed to map millions of buildings which can help us determine the distribution of the population. Although areas of low population density in Tanzania are not sufficiently mapped yet, the initial steps have already been taken.

Goals of mapping

Crowd2Map Tanzania is a “crowdsourced mapping project aiming to put rural Tanzania on the map.” A primary goal is to help fight against FGM. Girls are rescued and taken to safe houses by local volunteers and police, and for this they need maps. But maps can do more than just show these rescue teams the way to remote villages. Spatial information can support development, increase commercial efficiency, and open up economic growth opportunities for businesses and entrepreneurs by letting them make better-informed decisions. Growing wealth improves the quality of life, creates more opportunities, and leads to a better quality of education.

The ArcGIS license, provided to us by GISCorps, was a great help to us in this work. In the last year, the GIS Service Pledge Program let me help and support Crowd2Map Tanzania in different assignments. Following the priorities of the previous year, our website has been completely revamped, and the different online maps, charts, and tables on the site were updated regularly. The GIS dataset and data visualisation helped project managers and field workers respond to emergencies more efficiently.

In the past few years, we have trained over 14,000 remote mappers from all over the world to map from satellite images and over 3,000 field mappers to add their local knowledge to these base maps, mostly using smartphone applications like Maps.Me. Since the project was founded, our volunteers have added more than 4,700,000 buildings and 265,000 km of road networks, along with thousands of settlement names and POIs, to OpenStreetMap.

New priorities – 1. Find the village

So, we now know where to find traces of human settlements, but how do we delineate each settlement and, more importantly, how do we know what the name of the settlement is? Delimiting settlements is not easy, because the structure of a settlement often depends on the region. What does this mean? In the Ruvuma region (southern Tanzania), the settlements are well separated on the map. In contrast, in agricultural areas of the Shinyanga region, delimitation sometimes seems an impossible task.

According to data created by Digital Globe and funded by the Gates Foundation, there are approximately 17 million buildings in Tanzania. At the time of this study, the OSM community had mapped circa 11 million buildings, so building data from OpenStreetMap alone was not enough, and we had to use a different source to identify the settlement pattern. To map populations, we used the previously mentioned data from the Gates Foundation and Facebook’s High-Resolution Population Density data for validation.


A building aggregation tool was prepared in ArcGIS to aggregate the building footprints (Polygon layer) to produce settlement layers.

  1. First, the building footprints were buffered by 50 m and the buffers were merged.
  2. Second, the merged multipolygons were divided into single polygons.
  3. The area of each resulting “building cluster” was measured, and the buildings it covers were counted. These two values let us calculate:
    • Building density (number of buildings / total area)
    • Number of buildings, which can help predict the size of the population
    • The building density and building count can help in the classification of settlements
      • urban or rural area
      • Settlement type: hamlet, village, or town
  4. During post-processing, sharp angles in the polygons were smoothed to improve aesthetic or cartographic quality. Finally, the unwanted “holes” in the polygons were eliminated.
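The aggregation steps can be illustrated with a minimal sketch. This is not the ArcGIS tool itself: it swaps polygon buffering for a simpler stand-in, treating each building as a centroid in projected coordinates (metres) and grouping centroids whose 50 m buffers would touch, which means they are at most 100 m apart. The function names and the classification thresholds are hypothetical and chosen only for illustration. Building density is left out because it needs the area of the merged polygon, which this sketch does not compute.

```python
import math
from collections import defaultdict

BUFFER_M = 50.0  # buffer distance taken from the text


def cluster_buildings(points, buffer_m=BUFFER_M):
    """Group building centroids whose circular buffers overlap (union-find).

    Two circles of radius buffer_m overlap when their centres are at most
    2 * buffer_m apart. This pairwise loop is O(n^2); a real dataset with
    millions of buildings would need a spatial index.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= 2 * buffer_m:
                parent[find(i)] = find(j)

    clusters = defaultdict(list)
    for i, point in enumerate(points):
        clusters[find(i)].append(point)
    return list(clusters.values())


def classify(building_count, hamlet_max=50, village_max=2000):
    """Hypothetical thresholds, not taken from the project."""
    if building_count <= hamlet_max:
        return "hamlet"
    if building_count <= village_max:
        return "village"
    return "town"


# Illustrative coordinates in metres, not real data.
buildings = [(0, 0), (60, 0), (130, 0), (1000, 1000), (1040, 1010)]
cluster_sizes = sorted(len(c) for c in cluster_buildings(buildings))
```

The first three buildings chain together even though the outer two are 130 m apart, which mirrors how merged buffers join buildings through their neighbours.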


Using this data, we can identify the settlement pattern, which helps us create settlement boundaries for bigger villages, especially those with high building density, and add them to OSM.


New priorities – 2. Name the village

And what about the names of the settlements? Local volunteers can help us identify the names of all of the circa 10,000 – 12,000 settlements in Tanzania, or we can try to find open-source data that contains this information. Recruiting hundreds of volunteers from all over the country is beyond our capacity, so in most places we need to focus on the second solution. Fortunately, open-source data is available from The United Republic of Tanzania – Government Basic Statistics Portal, such as health facilities, schools, and waterpoints located all over Tanzania.

Our project objective is to add the missing village names in Tanzania, using open source government data about water sources in Tanzania.

Water Points Location in Rural Water Supply – 2015-2016


Method for the estimation of village position

The shared database contains about 87,000 water sources, which can be lakes, rivers, machine drilled boreholes or springs. The database also contains the physical condition (quality, quantity) of the water sources as well as their spatial location, indicating, for example, the village name where the water source is, or the nearest village to it. This data helps us determine the name of the village in OSM.

To understand how the village POIs were created, and how they relate to the actual location of the potential settlement, it is necessary to describe each step of the spatial analysis and data management:

Step 1: Thiessen (Voronoi) polygons were calculated from the waterpoints layer, to get the influence zone of each waterpoint.


Step 2: The Voronoi polygons were merged by attribute, where the village name is the same.


Step 3: At the same time, the mean center of the waterpoints inside each merged Voronoi polygon was calculated as the potential position of the village. (Since in a few cases the name of a village occurs more than once in the country, a “village+district” column was used to help us find the real mean center.)
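The mean center calculation in Step 3 can be sketched in a few lines of Python. This is a minimal illustration, not the ArcGIS workflow: the field names (`village`, `district`, `lon`, `lat`) are assumptions about how the waterpoint records might be laid out, and the sample records are placeholders. Grouping on the (village, district) pair plays the role of the “village+district” column.

```python
from collections import defaultdict


def mean_centers(waterpoints):
    """Return {(village, district): (mean_lon, mean_lat)}.

    Field names are hypothetical. The arithmetic mean of degrees is a
    reasonable approximation over the small extent of one village.
    """
    groups = defaultdict(list)
    for wp in waterpoints:
        groups[(wp["village"], wp["district"])].append((wp["lon"], wp["lat"]))
    return {
        key: (
            sum(lon for lon, _ in pts) / len(pts),
            sum(lat for _, lat in pts) / len(pts),
        )
        for key, pts in groups.items()
    }


# Placeholder records: the same village name occurs in two districts.
sample = [
    {"village": "Village A", "district": "District X", "lon": 34.0, "lat": -2.0},
    {"village": "Village A", "district": "District X", "lon": 34.5, "lat": -2.5},
    {"village": "Village A", "district": "District Y", "lon": 36.0, "lat": -4.0},
]
centers = mean_centers(sample)
```

Keying on both columns keeps the two villages named “Village A” apart, so the mean center of one is not pulled towards the other.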


Data management

The created point data was split by attribute and saved to *.geojson files. Each file contains the potential village POIs for one district.
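As a rough sketch of this data management step, the snippet below splits a list of village POIs by district and writes one GeoJSON FeatureCollection per district. The record fields, the `name` and `place` properties, and the file naming are assumptions made for illustration, not a description of the project's actual files.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path


def write_district_geojson(pois, out_dir):
    """Write one <district>.geojson file per district and return the paths."""
    by_district = defaultdict(list)
    for poi in pois:
        by_district[poi["district"]].append({
            "type": "Feature",
            # GeoJSON coordinate order is [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [poi["lon"], poi["lat"]]},
            "properties": {"name": poi["village"], "place": "village"},
        })

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for district, features in sorted(by_district.items()):
        path = out_dir / f"{district}.geojson"
        collection = {"type": "FeatureCollection", "features": features}
        path.write_text(json.dumps(collection, indent=2), encoding="utf-8")
        paths.append(path)
    return paths


# Placeholder POIs, written to a temporary directory for demonstration.
sample_pois = [
    {"village": "Village A", "district": "district_x", "lon": 34.25, "lat": -2.25},
    {"village": "Village B", "district": "district_x", "lon": 34.40, "lat": -2.10},
    {"village": "Village C", "district": "district_y", "lon": 36.00, "lat": -4.00},
]
with tempfile.TemporaryDirectory() as tmp:
    written = write_district_geojson(sample_pois, tmp)
    file_names = [p.name for p in written]
    first = json.loads(written[0].read_text(encoding="utf-8"))
```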

In summary

The Voronoi polygon marks the area where the village is located (or has to be). The village POI marks the potential location of the settlement, but its accuracy depends on the number of water abstraction points and their location in and around the given settlement.
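By definition, a location lies in the Voronoi cell of its nearest waterpoint, so the merged polygon a location falls in can be looked up without constructing any polygons. The sketch below shows this equivalence. It assumes projected coordinates in metres and hypothetical field names (`x`, `y`, `village`, `district`).

```python
import math


def village_zone(location, waterpoints):
    """Return the (village, district) whose merged Voronoi zone contains location.

    The nearest waterpoint owns the Voronoi cell around the location, and the
    cell inherits that waterpoint's village name when cells are merged by name.
    """
    nearest = min(
        waterpoints,
        key=lambda wp: math.dist(location, (wp["x"], wp["y"])),
    )
    return nearest["village"], nearest["district"]


# Placeholder waterpoints: two belong to Village A, one to Village B.
waterpoints = [
    {"x": 0.0, "y": 0.0, "village": "Village A", "district": "District X"},
    {"x": 10.0, "y": 0.0, "village": "Village A", "district": "District X"},
    {"x": 100.0, "y": 0.0, "village": "Village B", "district": "District X"},
]
zone_near_a = village_zone((40.0, 0.0), waterpoints)
zone_near_b = village_zone((70.0, 0.0), waterpoints)
```

A check like this could be used during validation to confirm that a moved POI still sits inside the zone of the village it is named after.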

The mean center (village icon) of the waterpoints (red dots) sometimes clearly shows the center of the settlement, provided the waterpoints are evenly distributed within and around the settlement.


In a well-mapped area – where, moreover, the settlements can be easily separated from each other – we did not have a difficult time with validation (mean centers before validation).



In addition to the created village POIs, we must use many other datasets and imagery for validation.

  1. First, an imagery layer was created from the Voronoi polygons. This layer helps to determine the area where the village has to be. (We used ArcGIS Online as a WMTS server for this purpose.)
  2. OpenStreetMap Carto Standard imagery was used to identify traces of human activity where the area was well mapped. It also showed us whether the name of the settlement had already been added to OSM.
  3. Maxar satellite imagery was used for those areas that were not mapped yet.

Further datasets for validation

Waterpoints: The “Holy Grail”. This dataset can be useful during validation in some cases, e.g. if the position of the village’s POI is unusually far from any populated area. In this case, it is worth looking at how the individual waterpoints are located in the area. As another example, when a village consists of two sub-villages, the “SUBVILLAGE” attribute of the water database can help determine where the center of the village can be.

Health facilities data: The government data contains more than 7,000 health facilities like hospitals or clinics. The names of these facilities are usually, but not exclusively, the same as the name of the municipality where they are located.

Education data: The government data contains almost 7,000 schools. The village names are available in this data. 

Potential anomalies during data validation

Of course, some villages are already included in the OSM database, but their quality is sometimes insufficient. During the project, we quite often found that a village POI had been assigned to a neighbouring settlement. Because of this uncertainty, no existing settlement names have been removed.

Provisional results

In the past few months, experienced mappers validated the created village POIs and added them to OSM if:

  1. there was no village name, but the area was well mapped in OSM. Mappers moved the POI inside its Voronoi polygon to the place with the highest population density, that is, where the most houses are visible.
  2. there was no village name, and the area was poorly mapped in OSM. Mappers used the Maxar imagery to identify the crowded area inside the polygon and left the POI there.

Provisional results: By the end of August, more than 111 districts were validated (68% of all districts), and 3640 village POIs were added, which is 38% of all places in Tanzania.

OSM data retrieved from Tanzania 2020/08/21

tags['place'] IN ('village', 'town', 'hamlet')

Result: 9369
Number of hamlets: 613
Number of towns: 213
Number of villages: 8543
Villages added during the “Missing villages” project: 3640 (42.6% of villages)
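The figures above can be cross-checked with a short calculation. The sketch below also shows how the place filter might be applied to already-downloaded OSM elements. The element structure (a dict with a `tags` mapping) is an assumption about the export format, and the two sample elements are placeholders. The counts are the ones reported above.

```python
from collections import Counter

PLACE_VALUES = ("village", "town", "hamlet")


def count_places(elements):
    """Count elements whose 'place' tag is village, town, or hamlet."""
    return Counter(
        el["tags"]["place"]
        for el in elements
        if el.get("tags", {}).get("place") in PLACE_VALUES
    )


# Placeholder elements showing the filter; not real OSM data.
demo = count_places([
    {"tags": {"place": "village", "name": "Village A"}},
    {"tags": {"place": "suburb"}},
    {"tags": {"amenity": "school"}},
])

# Counts reported in the text (OSM data retrieved 2020/08/21).
counts = {"hamlet": 613, "town": 213, "village": 8543}
added = 3640

total_places = sum(counts.values())
share_of_villages = 100 * added / counts["village"]
share_of_places = 100 * added / total_places

print(f"total places: {total_places}")
print(f"share of villages: {share_of_villages:.1f}%")
print(f"share of all places: {share_of_places:.1f}%")
```

The two percentages in the text use different denominators: 42.6% is the share of villages, while the 38% figure is the share of all places (villages, towns, and hamlets together).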
