Published February 22, 2016 | Version v1
Software
Open
Processing and similarity scoring WHO ICTRP data
Description
Source code for "Previously Unidentified Duplicate Registrations of Clinical Trials: an Exploratory Analysis of Registry Data Worldwide" (under review).
This code was used to process the WHO International Clinical Trials Registry Platform (ICTRP) dataset retrieved in April 2015 (see Related works). It imports the XML data into a SQL database and performs a number of standardizations. It also includes code to group records by the primary registry IDs they reference and to compute text-based similarity scores on registration fields.
The README file included with the code provides detailed instructions on dependencies and running the code.
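As a rough illustration of the two analysis steps described above (grouping records by referenced registry IDs, then scoring text similarity between registration fields), the following is a minimal sketch and not the code in the archive. It assumes records have already been loaded into plain Python structures; the input shapes, the field names `title` and `condition`, and the use of the standard library's `difflib.SequenceMatcher` as the similarity measure are all assumptions made for illustration.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def group_by_referenced_ids(records):
    """Group records linked, directly or transitively, by referenced IDs.

    `records` maps a record's own registry ID to the set of other registry
    IDs it references (a hypothetical input shape, not the archive's schema).
    Returns a list of sets of registry IDs.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for record_id, referenced in records.items():
        find(record_id)  # register records that reference nothing
        for other_id in referenced:
            union(record_id, other_id)

    groups = defaultdict(set)
    for record_id in list(parent):
        groups[find(record_id)].add(record_id)
    return list(groups.values())


def normalize(text):
    """Lowercase and collapse whitespace before comparing."""
    return " ".join(text.lower().split())


def field_similarity(a, b):
    """Similarity in [0, 1] between two field values (assumed measure)."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def record_similarity(rec_a, rec_b, fields=("title", "condition")):
    """Unweighted mean of per-field similarities (hypothetical field names)."""
    scores = [field_similarity(rec_a.get(f, ""), rec_b.get(f, "")) for f in fields]
    return sum(scores) / len(scores)
```

In this sketch, records with a high `record_similarity` that do not fall into the same group from `group_by_referenced_ids` would be the candidates for manual review as possible unidentified duplicates. Any threshold for "high" would have to be chosen empirically.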
Files
Name | Size |
---|---|
ictrp-source-code.zip (md5:29322bd38e9ed6e8099683fce28f24f8) | 47.3 MB |
Additional details
Related works
- Is supplemented by: 10.5281/zenodo.46392 (DOI)