Information and abstract
Keywords: Probabilistic Record Linkage; Deterministic Record Linkage; Survey Data; Administrative Data; Pharmaceutical Industry.
Objectives. This paper presents an empirical comparison of probabilistic and deterministic record linkage. We use both procedures to combine two different data sources, an administrative archive and a sample survey, in a setting where the true match status of each pair of linked records is known. The sample survey provides information on the prescribing behaviour of Italian doctors in 2007; the administrative archive provides personal and professional data for the population of Italian doctors in 2007. The linkage aims to enrich the information stored separately in the two data sources. Match rates and error rates for the two procedures are compared and discussed in order to suggest when each is appropriate.
Methods and Results. We first adopt a deterministic linkage procedure, which applies various decision criteria to establish when a pair of records should be linked. We then use a probabilistic approach to link the records of the two files. For both procedures, we calculate sensitivity and positive predictive value and use them to assess the accuracy of the linkage. Our findings suggest that the deterministic procedure is more accurate than the probabilistic one.
Conclusions. Overall, this paper confirms the feasibility of this kind of data linkage. Linked doctor records are now endowed with information on personal characteristics, professional features and prescribing behaviour. Since all pharmaceutical companies have access to the administrative data source used in our linkage exercise, and many of them conduct their own surveys, our results may serve as a basis for comparison in similar linkage exercises.
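
The two accuracy measures named in the abstract, sensitivity and positive predictive value, can be illustrated with a minimal Python sketch. It assumes the usual definitions: sensitivity is the share of true matches that the procedure links, and positive predictive value is the share of links that are true matches. The function name `linkage_accuracy` and the record identifiers are hypothetical and are not taken from the paper's data.

```python
def linkage_accuracy(linked_pairs, true_pairs):
    """Return (sensitivity, positive predictive value) for a set of links.

    linked_pairs: pairs (survey_id, archive_id) produced by a linkage procedure.
    true_pairs:   pairs known to be true matches (available here because the
                  true match status is assumed to be known).
    """
    linked = set(linked_pairs)
    truth = set(true_pairs)
    true_positives = len(linked & truth)  # links that are true matches

    # Sensitivity: fraction of true matches that were linked.
    sensitivity = true_positives / len(truth) if truth else float("nan")
    # Positive predictive value: fraction of links that are true matches.
    ppv = true_positives / len(linked) if linked else float("nan")
    return sensitivity, ppv


# Hypothetical toy data: four true matches, three links, one of them wrong.
true_pairs = {("s1", "a1"), ("s2", "a2"), ("s3", "a3"), ("s4", "a4")}
linked_pairs = {("s1", "a1"), ("s2", "a2"), ("s3", "a9")}

sensitivity, ppv = linkage_accuracy(linked_pairs, true_pairs)
print(f"sensitivity = {sensitivity:.2f}, PPV = {ppv:.2f}")
```

Applying the same function to the links produced by the deterministic and the probabilistic procedure would give the kind of side-by-side comparison the abstract describes.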