Increasingly, funders are requiring that the data underlying a publication be made publicly available. Making data available is also good practice within the research lifecycle, regardless of funding requirements. Most venues for making data public will give the researcher a persistent identifier, such as a Digital Object Identifier (DOI), for use in a citation and to help the researcher track the impact of the data. Published data can also be cited and can be included on your CV.
Should I upload my data to ScholarWorks or another site?
We encourage researchers to look for a discipline-specific data repository (see below) and email firstname.lastname@example.org to place a copy of the data record in the Boise State Institutional Repository. If a discipline-specific repository is not an option, ScholarWorks may be able to make the data available. See examples of Boise State data sets currently in ScholarWorks.
Who can access my data and what can they do with it?
ScholarWorks is an Open Access repository and strives to make the scholarly outputs of Boise State available worldwide without restrictions. Anyone with access to the Internet should be able to find and download your data. If someone re-uses your data, it is expected that they will credit you using the citation and DOI provided on the ScholarWorks page.
What kinds of data can be uploaded to ScholarWorks?
Data classified as Level Three by the Boise State University Data Classification Standard, and possibly data classified as Level Two, can be uploaded to ScholarWorks. If you do not have a classification for your data, please contact email@example.com. In general, these classification standards allow us to upload data that does not contain protected, private, or confidential information or human subjects data. If your data does not fit the Level Two or Level Three standard but you believe it has been properly anonymized, please contact firstname.lastname@example.org.
How should I send my data to ScholarWorks?
Please email email@example.com to start the data submission process. Include the word “data” in the subject line and attach the data file. We will reply with several questions to help us make the data as accessible and reusable as possible.
How do I get a DOI for my data?
DOI stands for Digital Object Identifier. DOIs allow for objects to be easily cited and discovered, giving the creator of the work credit. ScholarWorks is able to issue DOIs for many types of work including articles, books, images, and data sets. If the work does not already have a DOI and can be made publicly available on the Internet, we will work with you to collect the information needed for a DOI. Please contact firstname.lastname@example.org.
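As a rough illustration of how a DOI works as a link, the sketch below turns a DOI written in one of several common forms into a resolvable web address. It assumes the public resolver at https://doi.org/ and uses a deliberately loose pattern check rather than the full DOI syntax. The function name `doi_to_url` is hypothetical and is not part of ScholarWorks.

```python
import re

# Loose sanity check only (assumption): "10.", a registrant code, a slash,
# then a suffix with no whitespace. Real DOI syntax is more permissive.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Prefixes people commonly write in front of a DOI.
PREFIX_PATTERN = re.compile(
    r"^(?:https?://(?:dx\.)?doi\.org/|doi:\s*)", flags=re.IGNORECASE
)


def doi_to_url(raw: str) -> str:
    """Return an https://doi.org/ link for a DOI given in a common form."""
    doi = PREFIX_PATTERN.sub("", raw.strip())
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a recognizable DOI: {raw!r}")
    return "https://doi.org/" + doi


if __name__ == "__main__":
    # DOIs taken from the citation examples on this page.
    print(doi_to_url("doi: 10.5524/100040"))
    print(doi_to_url("http://dx.doi.org/10.5524/100006"))
```

Storing the bare DOI and building the link on demand keeps citations stable even if the preferred resolver address changes.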
There are hundreds of discipline-specific and interdisciplinary data repositories. Here are a few links to lead you to some of those repositories:
re3data: registry of research data repositories
Open Access Directory of Data Repositories (collected by Simmons College)
Consider expanding your data’s reach by publishing your data set in a data journal.
Questions to Consider:
- How will the service sustain itself? Is there a long-term funding stream?
- How will the service care for my data in the long term should the service fail? Is there a safety net?
- Can the service quickly maximize discoverability of my data? How?
- Does it have a large network of researchers and students seeking data? Will my data get used?
- Does the service understand international archiving standards?
- Does it provide a DOI, data citation, and version control for updating my files?
- Does the service have proven experience securing sensitive data upon intake and when sharing?
In 2013, the Amsterdam Manifesto on Data Citation Principles was published, detailing the concept of data as a citable product of research in 8 short statements.
- Data should be considered legitimate, citable products of research. Data citations should be accorded the same importance in the scholarly record as citations of other research objects, such as publications.
- Data citations should facilitate giving scholarly credit and normative and legal attribution to all contributors to the data, recognizing that a single style or mechanism of attribution may not be applicable to all data.
- In scholarly literature, whenever and wherever a claim relies upon data, the corresponding data should be cited.
- A data citation should include a persistent method for identification that is machine actionable, globally unique, and widely used by a community. [An example of a persistent identifier might be a DOI or digital object identifier]
- Data citations should facilitate access to the data themselves and to such associated metadata, documentation, code, and other materials, as are necessary for both humans and machines to make informed use of the referenced data.
- Unique identifiers, and metadata describing the data, and its disposition, should persist — even beyond the lifespan of the data they describe. [See #4]
- Data citations should facilitate identification of, access to, and verification of the specific data that support a claim. Citations or citation metadata should include information about provenance and fixity sufficient to facilitate verifying that the specific timeslice, version and/or granular portion of data retrieved subsequently is the same as was originally cited.
- Data citation methods should be sufficiently flexible to accommodate the variant practices among communities, but should not differ so much that they compromise interoperability of data citation practices across communities.
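One common way to support the "fixity" mentioned in the statements above is to record a checksum alongside the citation so that a later reader can confirm the file they retrieved is the same one that was cited. The following is a minimal sketch, assuming SHA-256 as the checksum algorithm. The statements themselves do not prescribe one, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large data files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_cited_copy(path: Path, cited_checksum: str) -> bool:
    """True if the retrieved file has the same checksum as the cited one."""
    return sha256_of(path) == cited_checksum.strip().lower()
```

In use, the data publisher would run `sha256_of` once at deposit time and publish the result with the citation metadata. Anyone who downloads the file later can pass that value to `matches_cited_copy` to check that nothing has changed.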
Basic Items to Include:
Author: Name of the individual, group, or organization responsible for the creation of the data set.
Title: Name of the data set.
Format: Notation that this is a data set as opposed to a journal article, book, or website.
Location: City/State of the organization/institution that produced the data set.
Date: Year the data set was released/published. Also, in some cases, the date you accessed the data set.
Version: If multiple versions of the data set are available, include the version number of the data set you used.
Unique Identifier: A unique identifier to link back to the specific data set (examples include: DOI, PURL, repository ID number, etc.).
Distributor: Name of the organization/site providing access to the data set.
The order and formatting of these pieces of information will vary according to different citation styles, journal publishers, and data repositories.
Author/Rightsholder. (Year). Title of data set (Version number) [Description of form]. Location: Name of producer.
Author/Rightsholder. (Year). Title of data set (Version number) [Description of form]. Retrieved from http://
Advanced Cooperative Arctic Data and Information Service (ACADIS). (2010). LiDAR (DEM) NIMS grid Barrow, Alaska 2010 [Data set]. Retrieved from https://www.aoncadis.org/
Zhang, G., Parker, P., Li, B., Li, H., & Wang, J. (2012). The genome of Darwin’s Finch (Geospiza fortis). GigaScience. [Data set]. doi: 10.5524/100040
See more information from the APA Style Blog here: http://blog.apastyle.org/
Note: Some data sources will provide additional citation information and help.
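To show how the basic items map onto the APA-style patterns above, here is a small sketch that assembles a citation string from individual fields. It is an illustration only, not an official APA implementation. The class and function names are hypothetical, the example record is invented, and rendering the version as "(Version N)" is an assumption about how the pattern's "(Version number)" slot should be filled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataSetCitation:
    author: str                      # Author/Rightsholder, already formatted
    year: int                        # Year the data set was released
    title: str                       # Title of the data set
    form: str = "Data set"           # Description of form
    version: Optional[str] = None    # Version, if more than one exists
    location: Optional[str] = None   # City/State of the producer
    producer: Optional[str] = None   # Name of the producer
    url: Optional[str] = None        # Where the data set was retrieved


def format_apa_style(c: DataSetCitation) -> str:
    """Assemble a citation following the two patterns shown above."""
    author = c.author if c.author.endswith(".") else c.author + "."
    title = c.title
    if c.version:
        title += f" (Version {c.version})"  # assumed rendering of the version
    parts = [author, f"({c.year}).", f"{title} [{c.form}]."]
    if c.url:
        # Second pattern: online source.
        parts.append(f"Retrieved from {c.url}")
    elif c.location and c.producer:
        # First pattern: location and producer.
        parts.append(f"{c.location}: {c.producer}.")
    return " ".join(parts)


if __name__ == "__main__":
    # Hypothetical record, invented for illustration only.
    example = DataSetCitation(
        author="Example, A. B.",
        year=2020,
        title="Hypothetical stream temperature readings",
        version="2",
        location="Boise, ID",
        producer="Example Research Lab",
    )
    print(format_apa_style(example))
```

Keeping the fields separate and formatting them only at the end makes it easier to produce the same citation in another style, since only the formatting function needs to change.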
MLA has not yet developed specific rules for dataset citations, so follow the rules for a general website.
Tweedie, Craig E., and Steven Oberbauer. Kite Aerial Photography NIMS Grid Barrow, Alaska 2013. (Data set). Barrow, AK: Advanced Cooperative Arctic Data and Information Service, 2010. Web. 7 Apr. 2014. <https://www.aoncadis.org/>
Zhang, G., D. Lambert, and J. Wang. Genomic Data from Adelie Penguin (Pygoscelis adeliae). (Data set). Gigascience, 2011. Web. 17 Apr. 2014. <http://dx.doi.org/10.5524/100006>
Note: Some data sources will provide additional citation information and help.
- Figshare is a free site/repository that allows users to “upload any file format to be previewed in the browser so that any research output, from posters and presentations to datasets and code, can be disseminated in a way that the current scholarly publishing model does not allow.”
Additional tools and resources can be found here.