The Malaria Quarterly Report (QR) is the primary data resource in M-DIVE for understanding malaria incidence, burden, and progress metrics in PMI-supported countries. It includes data provided by countries each quarter, based on what they have available from their existing data collection systems. This document describes how the different geographic and administrative areas in each data submission are aligned to a standard set of geographic definitions used within M-DIVE, in order to produce more consistent maps and analyses.
Geographic Standardization
Data systems may name geographic areas differently from the official government-administered names, especially for lower-level administrative units that may or may not match the official administrative subdivisions. Through a geographic standardization process, also referred to as “Admin Alignment”, we match the geographic names submitted to a list of names in the M-DIVE database that has been collected from previous reports and other sources. This list of names is referred to as the M-DIVE standard geographies. This process ensures that data from different sources can be joined together, compared, or drawn on maps with the same boundaries, even when spelling and naming conventions differ between sources.
Since M-DIVE standard geographies might not match the exact geographies defined within each country’s existing data systems, the geographies submitted during each round of reporting may appear in one of the following ways:
- Submitted as an exact match of the standard geographies.
- Submitted with variations in spaces, capitalization, special characters, and accents compared to the M-DIVE standard geography, but otherwise matching the standard geographies.
- Submitted as a known alias that has been encountered previously, such as a confirmed alternate spelling, or a label that includes supplemental information about the type of area in the name.
- Submitted as newly aggregated or disaggregated geographies, which occur when standard geographies are combined into one new area or split into multiple areas.
- Submitted as a nonstandard geography that does not have an obvious match amongst the set of defined standard geographies.
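The first three cases, along with the no-match case, can be illustrated with a minimal Python sketch. It assumes that matching works by reducing each name to a comparison key that ignores spaces, capitalization, special characters, and accents. The function names `normalize_name` and `classify`, the sample standard list, and the alias entry are hypothetical illustrations, not the actual M-DIVE implementation. Aggregated or disaggregated geographies are not handled here, because they require information beyond name matching.

```python
import unicodedata


def normalize_name(name: str) -> str:
    """Build a comparison key that ignores case, accents, spaces, and punctuation."""
    # NFKD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # Keep only letters and digits, compared case-insensitively.
    return "".join(ch for ch in without_accents.casefold() if ch.isalnum())


def classify(submitted, standard_names, aliases):
    """Return (match_type, standard_name) for one submitted name.

    `aliases` maps a previously confirmed alternate label to a standard name.
    """
    if submitted in standard_names:
        return ("exact", submitted)
    key = normalize_name(submitted)
    standard_by_key = {normalize_name(s): s for s in standard_names}
    if key in standard_by_key:
        return ("variation", standard_by_key[key])
    alias_by_key = {normalize_name(a): s for a, s in aliases.items()}
    if key in alias_by_key:
        return ("alias", alias_by_key[key])
    # No obvious match: in practice this would need manual follow-up.
    return ("nonstandard", None)


# Hypothetical sample data, for illustration only.
STANDARD_NAMES = ["North District", "N'zi-Ifou", "Adja-Ouèrè"]
KNOWN_ALIASES = {"Nzi Ifou Region": "N'zi-Ifou"}

for name in ["North District", "Adja-Ouere", "Nzi Ifou Region", "Unknown Place"]:
    print(name, "->", classify(name, STANDARD_NAMES, KNOWN_ALIASES))
```

Comparing on a derived key, rather than editing the submitted text, leaves the original label available for review when a name falls into the nonstandard case.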
The submitted geographies can vary for each country from one period of submission to another. As such, the QR data process in M-DIVE includes a number of set “mappings” that describe how the submitted geographies relate to the standard geographies made available in the final M-DIVE data set.
These mappings define, in a formulaic and repeatable way, how the geographies in the submitted data need to be adjusted to create a unified output. This ensures that the QR data in M-DIVE have one consistent set of standard geographies for each country, which can be used to process data across time, create map attributes, and support other applications. Below is a diagram of the M-DIVE approach to mapping a submitted geography to a standard geography:
Challenges and Approaches
Aligning submitted geographies with standard geographies comes with a set of challenges for which we have designed approaches, outlined in the table below.
Challenges | Approaches |
Geographical boundaries are submitted with variations in spaces, capitalization, special characters, and accents. | Civis stores a standard list of names and maps all variations to the standard name. The list of standard geographies is pulled from the following data submission sources for each country, in this order of priority: DHIS2, Health Management Information System, Logistic Management Information System, QR metadata, and shapefile. We assume that variations in spaces, capitalization, special characters, and accents on characters are never important for distinguishing two names. For example, "North District" and "North district" would always refer to the same district, as would "N'zi-Ifou" and "Nzi Ifou", and "Adja-Ouèrè" and "Adja-Ouere". |
Submitted geographies do not have an obvious match to standard geographies. | Civis first contacts the country team and tries to determine a match between the submitted geographies and the standard geographies. If a match cannot be determined, the geography and its associated data are omitted from the QR dataset, dashboard, and similarly processed assets in M-DIVE. |
Standard geographical areas change over time, for example by being combined into one area or split into multiple areas. | The standard geographies in M-DIVE represent the current geographies in a place and time. The system is not designed to show changes in geography across time. Data (recent or historical) that cannot be reconciled under the current version are omitted from the QR. Civis updates the old standard names to the new standard names. Civis also asks the country teams to resubmit historical data with the previous geographical areas split or combined into the new geographical areas, along with new shapefiles containing the newly split or combined geographies. Alternatively, if a new shapefile is not available for areas that were combined, Civis asks country teams to provide a crosswalk, which is a table that matches the historical geographical areas to the new geographical areas and shows which areas need to be merged and how. If areas were split, Civis asks country teams to provide a crosswalk showing which lower-level areas make up each of the newly created areas. Civis can then update the existing shapefile accordingly. If historical data aligned with the new geographies cannot be submitted, we can aggregate the values from the pre-combined areas by summing the data, or use the crosswalk to align the lower-level areas to the newly split areas if the data are already available in the QR at a lower level. |
Map visualizations in M-DIVE require data to match up with geographical boundaries in shapefiles. | Each standard geographical name, and all of its mappings, is connected to a distinct geographical boundary from a country’s shapefile through a unique number. Shapefiles are typically sourced from Humanitarian Data Exchange, Database of Global Administrative Areas, or the PMI country teams, and the boundaries therein are also matched to the standard geography names. If a standard geographical name does not have a corresponding boundary in the shapefile, the standard name will be omitted from map visualizations. |
Some small changes to standard geographies (like spelling corrections or name changes) need to apply to previously processed data without completely re-processing historical data for every new update. | The mappings and standard geographical names are organized with a unique identifier number (geo id) so that the two can change independently. For example, if the spelling of a standard name needs to be corrected, it only needs to change in one place, and the mappings are untouched. |
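Two of the approaches in the table can be sketched in Python: the geo id indirection and the use of a crosswalk to sum data from areas that were combined. This is only an illustration under stated assumptions. The dictionary layouts, the function names `resolve` and `apply_crosswalk`, the area names, and the sample values are all hypothetical, and the sketch assumes the indicator is a count that can be meaningfully summed.

```python
from collections import defaultdict


def resolve(submitted, mapping_to_geo_id, standard_name_by_geo_id):
    """Look up a submitted name via its geo id to get the current standard name."""
    geo_id = mapping_to_geo_id[submitted]
    return standard_name_by_geo_id[geo_id]


def apply_crosswalk(rows, crosswalk):
    """Sum values from historical areas into the new combined areas.

    `rows` is an iterable of (area, period, value) tuples.
    `crosswalk` maps each historical area name to its new area name;
    areas not listed in the crosswalk are kept unchanged.
    """
    totals = defaultdict(int)
    for area, period, value in rows:
        new_area = crosswalk.get(area, area)
        totals[(new_area, period)] += value
    return dict(totals)


# Hypothetical geo id tables: mappings and standard names are stored separately.
mapping_to_geo_id = {"North district": 101, "NORTH DISTRICT": 101}
standard_name_by_geo_id = {101: "Nrth District"}  # misspelled standard name

# Correcting the spelling touches one entry; the mappings are unchanged.
standard_name_by_geo_id[101] = "North District"
print(resolve("NORTH DISTRICT", mapping_to_geo_id, standard_name_by_geo_id))

# Hypothetical historical data for two areas that were later combined.
historical_rows = [
    ("East A", "2023Q1", 10),
    ("East B", "2023Q1", 5),
    ("West", "2023Q1", 7),
]
crosswalk = {"East A": "East", "East B": "East"}
print(apply_crosswalk(historical_rows, crosswalk))
```

Because submitted names point to a geo id rather than directly to a standard name, a rename never requires rewriting the mappings or re-processing historical data.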
If you have any questions, please submit them to Megan Klinger (wvr1@cdc.gov) and Bryan Baird (bbaird@usaid.gov) on the PMI Surveillance & Informatics team, or to support@civisanalytics.com. If you would like to learn more about additional topics on M-DIVE, please click the links below for further reading.
Important Resources
- Malaria Quarterly Report Data Pipeline Overview: How Quarterly Report Data Are Processed in M-DIVE
- Malaria Quarterly Report Quality Control Processes Overview in M-DIVE
- Malaria Quarterly Report Time Standardization
- Malaria Quarterly Report Indicator Standardization
- Malaria Quarterly Report Format Standardization
- Malaria Quarterly Report (M-DIVE access required)
- M-DIVE Help Center (M-DIVE access required)