The data collected through surveys such as the Decennial Census are used by government analysts and social scientists alike to study population movements and changes. Access to some of the richest variables in these data, such as detailed geography, is often restricted. Full datasets are not publicly available because releasing personally identifiable information poses a risk to respondents, and sharing these datasets could violate the confidentiality and trust promised to them. As a result, repositories have established barriers to ensure that those seeking access to restricted data have a valid need and meet certain security requirements (United States Census Bureau, 2009; United States Department of Commerce, 2000; Williams & Pigeot, 2016). Researchers who want to analyze these data must, at each repository and for each dataset, request permission to access the full dataset. This access is generally granted only for that dataset in that repository. Should the researcher require access to a similar dataset at another institution, they must go through the access request process again, and there is no guarantee that access will be granted. While security and privacy concerns are valid, the barriers put in place hinder researchers' reuse of data (Bishop, 2009; Kim & Adler, 2015). These restrictions also limit the possibilities for research that would require multiple restricted datasets housed in different secure repositories.
This poster presents on-going research into the criteria that data repositories curating restricted data use to develop digital identities of access for users. The analysis of data-access-request policies, repository management documentation, and other security procedures will result in a data access credentialing model that can be used to validate a user’s identity against the access and security requirements of the repositories and datasets. This model has implications for data repositories handling restricted data beyond census data and other social science surveys. Data reusability is important for maximizing the productivity of investments in data collection. Removing the barriers that occur when trusted users are required to repeat, often redundantly, requests to access datasets will improve the data’s reuse potential.
Allison Tyler is a first-year PhD student at the University of Michigan School of Information. She received her master’s degree in library and information science from the University of Denver in 2016, a master’s degree in space studies-planetary science from the American Military University in 2013, and a bachelor’s degree in mathematics from the United States Naval Academy in 2007. Her research interests include the social and technical barriers to information access, with a focus on how those barriers hinder scientific data reuse and access. She was recently a co-author on a book chapter in Participatory Heritage (2017), describing the challenges of and solutions to the on-going preservation of oral histories. She is currently a research team member at the Inter-University Consortium for Political and Social Research, developing researcher credentials for accessing restricted data.