
Introduction and references


The ultimate aim of working with linked data should not be to produce a single, integrated dataset containing all the information from each table. 'Linkage' should not be confused with 'merging':

For records that can be linked, data merging refers to the process of combining individual records (or information in those records) into an integrated dataset. (National Statistical Service, 2015)

Linking separate datasets by a common identifier or key instead makes it possible to merge subsets of multiple datasets as required.
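The distinction can be illustrated with a minimal sketch in plain Python. The two tables, their column names, the key `site_id` and the helper `merge_subset` are all hypothetical and are not taken from the cited guides. The tables stay separate, and only the columns needed for one question are merged through the shared key.

```python
# Two hypothetical tables linked by the shared key "site_id".
sites = [
    {"site_id": "S1", "region": "North", "elevation_m": 120},
    {"site_id": "S2", "region": "South", "elevation_m": 45},
    {"site_id": "S3", "region": "South", "elevation_m": 300},
]
observations = [
    {"obs_id": 1, "site_id": "S1", "species": "A", "count": 4},
    {"obs_id": 2, "site_id": "S2", "species": "B", "count": 7},
    {"obs_id": 3, "site_id": "S2", "species": "A", "count": 2},
    {"obs_id": 4, "site_id": "S9", "species": "C", "count": 1},  # no matching site
]


def merge_subset(left, right, key, left_cols, right_cols):
    """Inner-join two lists of dicts on `key`, keeping only the named columns.

    Assumes `key` is unique in `right` (one row per identifier).
    """
    lookup = {row[key]: row for row in right}
    merged = []
    for row in left:
        match = lookup.get(row[key])
        if match is None:
            continue  # record cannot be linked, so it is not merged
        out = {key: row[key]}
        out.update({col: row[col] for col in left_cols})
        out.update({col: match[col] for col in right_cols})
        merged.append(out)
    return merged


# Merge only what one analysis needs: species counts with their region.
counts_by_region = merge_subset(
    observations, sites, "site_id", ["species", "count"], ["region"]
)
for record in counts_by_region:
    print(record)
```

Here `elevation_m` and `obs_id` never enter the merged result, and the observation with the unmatched key `S9` is left out, since only records that can be linked are merged. A different question would call for a different merge from the same linked tables.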


Borer, E.T., Seabloom, E.W., Jones, M.B., & Schildhauer, M. (2009). Some simple guidelines for effective data management. Bulletin of the Ecological Society of America, 90, 205–214.
Braun, M.T., Kuljanin, G., & DeShon, R.P. (2018). Special considerations for the acquisition and wrangling of Big Data. Organizational Research Methods, 21(3), 633–659.
Broman, K.W., & Woo, K.H. (2018). Data organization in spreadsheets. The American Statistician, 72(1), 2–10.
Ellis, S.E., & Leek, J.T. (2018). How to share data for collaboration. The American Statistician, 72(1), 53–57.
Hart, E.M., Barmby, P., LeBauer, D., Michonneau, F., Mount, S., Mulrooney, P., Poisot, T., Woo, K.H., Zimmerman, N.B., & Hollister, J.W. (2016). Ten simple rules for digital data storage. PLoS Computational Biology, 12(10), e1005097.
ORNL DAAC [Oak Ridge National Laboratory Distributed Active Archive Center]. (2018). Data Management - Best Practices for Data Management. Accessed 20 July 2018 from
Murrell, P. (2013). “Data Intended for Human Consumption, Not Machine Consumption,” in Bad Data Handbook, ed. McCallum, Q.E. Sebastopol, CA: O’Reilly Media, pp. 31–51.
National Statistical Service (2015). A Guide for Data Integration Projects Involving Commonwealth Data for Statistical and Research Purposes. Accessed 18 July 2018 from:
Schildhauer, M. (2018). “Data Integration: Principles and Practice,” in Ecological Informatics, ed. Recknagel, F. & Michener, W.K. Springer, Cham, pp. 129–157.
Strasser, C.A., Cook, R., Michener, W.K., & Budden, A. (2012). Primer on Data Management: What you always wanted to know. UC Office of the President: California Digital Library.
White, E.P., Baldridge, E., Brym, Z.T., Locey, K.J., McGlinn, D.J., & Supp, S.R. (2013). Nine simple ways to make it easier to (re)use your data. Ideas in Ecology and Evolution, 6(2), 1–10.
Wickham, H., & Grolemund, G. (2017). R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. Accessed 18 July 2018 from:
(Join diagrams reproduced here under Creative Commons licence  -
Wickham, H. (2014). Tidy data. Journal of Statistical Software, 59(10), 1–23.
Wilson, G., Bryan, J., Cranston, K., Kitzes, J., Nederbragt, L., & Teal, T.K. (2017). Good enough practices in scientific computing. PLoS Computational Biology, 13(6), e1005510.

Last updated: 9 November 2018