Background: The Swiss neighbourhood index of socioeconomic position (Swiss-SEP) is an area-based measure of socioeconomic status for Switzerland. The Swiss-SEP was constructed by Panczak et al. in 2012 (version 1.0) and re-validated in 2023 (versions 2.0 and 3.0). Each Swiss-SEP value is linked to a geocode in Switzerland. The Swiss-SEP dataset is available free of charge for research purposes and can be accessed after signing a contract with the Swiss National Cohort (doi.org/10.48620/110). Current solutions for geocoding addresses and linking the Swiss-SEP, such as the R package GeoSwiss, rely on the application programming interface (API) of the Federal Office of Topography (swisstopo), geo.admin.ch. We created an offline solution that tolerates spelling errors in addresses in order to link the Swiss-SEP to the addresses of patients in SwissPedHealth (www.swisspedhealth.com), a National Data Stream of the Swiss Personalized Health Network (SPHN). SwissPedHealth includes electronic health records of patients who visited seven large Swiss children’s hospitals in Basel, Bern, Geneva, Lausanne, Luzern, St. Gallen, and Zurich.

Standard operating procedure: The purpose of this standard operating procedure (SOP) is to standardize how the Swiss-SEP variables are linked to patients based on their addresses. It is intended to be used in the clinical data warehouses (CDWs) of hospitals participating in the SwissPedHealth project. The SwissPedHealth team at ISPM Bern created a reference dataset with all Swiss addresses from the Federal building and apartment registry and the corresponding Swiss-SEP values, see “Preparing reference dataset: Swiss-SEP and Swiss addresses”. The reference dataset contains both the Swiss-SEP values and the distance to the nearest Swiss-SEP. The “Transforming addresses to Swiss-SEP: Standard operating procedure” attempts to match patient addresses to addresses in the reference dataset and thereby identify the Swiss-SEP values for each patient. The SOP follows these steps: i) Split patient addresses into street, number, postal code (plz), and city; ii) Spell out common abbreviations; iii) Remove special characters from patient addresses; iv) Find patient addresses in the reference dataset with a) simple dataset matching and b) dataset matching that allows for spelling errors; v) Add Swiss-SEP variables to matched addresses. To allow for spelling errors, the SOP compares each patient address with the addresses in the reference dataset and calculates a score based on street, number, postal code, and city. For non-numeric tokens, the score is based on the Levenshtein distance between the characters of the patient address and those of the reference address. The programming code for this SOP is provided in R and in Python and is available on GitHub.

Results: The SOP was piloted at the Inselspital, Bern University Hospital, with 50,000 patient addresses. More than 98% of addresses were successfully matched to an address in the reference dataset and linked to a Swiss-SEP value. In SwissPedHealth, which includes more than 600,000 patients, three out of seven hospitals linked the Swiss-SEP to 95–98% of addresses. Two hospitals close to the border with neighboring countries had a lower linkage proportion of 88–90%, likely because of patients without a Swiss address. One hospital reported difficulties implementing the SOP and linked the Swiss-SEP to only 54% of addresses. One hospital was unable to implement the SOP due to limited resources. Many hospitals reported that the SOP was computationally demanding.

Conclusion: With this SOP, we provide an offline solution that geocodes Swiss addresses and links the Swiss-SEP while allowing for spelling errors. When the SOP was implemented in the SwissPedHealth project, five hospitals linked the Swiss-SEP to most patient addresses. However, the computational demand of the SOP is high and could be reduced in future projects.
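The matching steps ii) to v) of the SOP can be illustrated with a minimal Python sketch. This is not the project's published code: the function names, the abbreviation table, the normalized score, the exact comparison of numeric tokens, and the threshold of 0.85 are all assumptions made for illustration, and the Swiss-SEP values in the example are made up. The sketch assumes addresses have already been split into street, number, plz, and city (step i).

```python
import re

FIELDS = ("street", "number", "plz", "city")

# Hypothetical abbreviation table; the SOP's actual list is not given here.
ABBREVIATIONS = {"str.": "strasse", "av.": "avenue"}


def normalize(text):
    """Lowercase, spell out abbreviations, and remove special characters."""
    text = text.lower()
    for short, full in ABBREVIATIONS.items():
        text = text.replace(short, full)
    text = re.sub(r"[^a-z0-9äöüéèàâêîôûç ]", " ", text)
    return " ".join(text.split())


def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def field_score(value, reference):
    """Similarity in [0, 1]. Assumption: numeric tokens must match exactly."""
    if value.isdigit() or reference.isdigit():
        return 1.0 if value == reference else 0.0
    longest = max(len(value), len(reference))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(value, reference) / longest


def address_score(address, reference):
    """Mean field similarity over street, number, plz, and city."""
    scores = [field_score(normalize(address[f]), normalize(reference[f]))
              for f in FIELDS]
    return sum(scores) / len(scores)


def link_swiss_sep(address, reference_rows, threshold=0.85):
    """Return the Swiss-SEP value of the matched reference row, or None."""
    key = tuple(normalize(address[f]) for f in FIELDS)
    # Step iv a): simple matching on the normalized fields.
    for row in reference_rows:
        if tuple(normalize(row[f]) for f in FIELDS) == key:
            return row["swiss_sep"]
    # Step iv b): matching that tolerates spelling errors.
    best = max(reference_rows,
               key=lambda row: address_score(address, row),
               default=None)
    if best is not None and address_score(address, best) >= threshold:
        return best["swiss_sep"]  # step v): attach the Swiss-SEP value
    return None


# Made-up reference rows and patient address, for illustration only.
reference_rows = [
    {"street": "Bahnhofstrasse", "number": "12", "plz": "3000",
     "city": "Bern", "swiss_sep": 7},
    {"street": "Avenue de la Gare", "number": "5", "plz": "1003",
     "city": "Lausanne", "swiss_sep": 4},
]
patient = {"street": "Banhofstrasse", "number": "12", "plz": "3000",
           "city": "Bern"}
print(link_swiss_sep(patient, reference_rows))
```

Scoring every patient address against every reference row, as this sketch does, scales with the product of the two dataset sizes, which is consistent with the computational demand reported by the hospitals. Restricting candidates to reference rows with the same postal code would be one possible way to reduce the number of comparisons.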