Guide to self-preservation
Learn how to manage your data, keep it accessible and usable, and extend its lifespan for as long as possible.
About the guide
There are many risks that can negatively impact your digital records. These include:
- Software and hardware obsolescence or failure
- File degradation or corruption
- Physical damage to storage media, such as flash drives or hard drives
- Human error, such as accidental deletion of files
Considering these risks, your digital records have an average lifespan of 5-15 years, depending on the software and hardware you choose to maintain your data and a number of other variables.
The Libraries’ guide to self-preservation aims to help you extend the life of your digital records in your research, at work, and at home by providing a variety of general and format-specific strategies to create and maintain good data.
General strategies
Storage and back-ups
Consider where all of your records are stored: on your computer, your phone, your tablet, social media accounts, as e-mail attachments, in Dropbox, Google Drive, and other file storage services, or on external media such as flash drives, optical media, or external hard drives.
The copies you maintain and back up should consolidate important records from across these storage areas. File storage services and social media have changed their policies in the past to limit how much users can upload and have even deleted user content in response to these policy changes.
Third-party platforms are not required to maintain your data. External media has a shelf life and will become unreadable and inaccessible over time. Make sure to have a local copy of all of your important or meaningful records and refresh your hardware as needed.
Having copies of your data will ensure you have a back-up in the event of data loss. The 3-2-1 rule is a good approach to follow: Keep 3 copies, on 2 different types of storage media (example: 1 laptop, 2 external hard drives) and store one copy in a separate location.
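As a rough illustration, the 3-2-1 rule can be written as a simple check over an inventory of your copies. The sketch below is only a way of thinking through the rule, not a back-up tool. The Copy class, its fields, and the example values are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One copy of your records (all names and values are illustrative)."""
    label: str       # e.g. "laptop"
    media_type: str  # e.g. "internal-drive", "external-hdd"
    location: str    # e.g. "home", "office"


def meets_3_2_1(copies):
    """Return True if the copies satisfy the 3-2-1 rule."""
    media_types = {c.media_type for c in copies}
    locations = {c.location for c in copies}
    # 3 copies, 2 types of storage media, at least 1 copy stored elsewhere
    return len(copies) >= 3 and len(media_types) >= 2 and len(locations) >= 2


copies = [
    Copy("laptop", "internal-drive", "home"),
    Copy("drive-a", "external-hdd", "home"),
    Copy("drive-b", "external-hdd", "office"),
]
print(meets_3_2_1(copies))  # True
```

Removing the office copy, or keeping all three copies on the same type of media, would make the check fail.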
Copies should be checked routinely, including after files are transferred and/or backed up. If you open a copy of a file only to view it, do not save the file afterwards, so that the copy is not changed unintentionally. Consider the type of back-up to perform:
- Full back-ups facilitate comprehensive data recovery by creating a copy of all data, but they may require significant storage space.
- Incremental back-ups involve creating a copy of all data that is new or has been modified since the most recent back-up, whether full or incremental. They require less storage space, but a full back-up may still be required periodically to ensure comprehensive data recovery.
- Differential back-ups involve creating a copy of all data that is new or has been modified since the last full back-up. They require more storage space than incremental back-ups. For comprehensive data recovery, the initial full back-up and the last differential back-up are required.
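One way to see the difference between the three back-up types is to look at which reference time decides what gets copied. The sketch below only lists the files each type would select; it does not copy anything. It assumes that a file's modification time is a good enough sign that it changed, which real back-up software may not rely on. The function names and parameters are hypothetical.

```python
from pathlib import Path


def files_changed_since(root, since_timestamp):
    """List files under root modified after since_timestamp (seconds since the epoch)."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.stat().st_mtime > since_timestamp
    )


def select_for_backup(root, kind, last_full, last_backup):
    """Choose which files a back-up of the given kind would copy.

    last_full:   time of the last full back-up
    last_backup: time of the most recent back-up of any kind
    """
    if kind == "full":
        # Everything, regardless of when it last changed
        return files_changed_since(root, float("-inf"))
    if kind == "differential":
        # Everything new or modified since the last full back-up
        return files_changed_since(root, last_full)
    if kind == "incremental":
        # Only what is new or modified since the most recent back-up
        return files_changed_since(root, last_backup)
    raise ValueError(f"unknown back-up kind: {kind}")
```

Because a differential back-up always measures from the last full back-up, it grows over time, while each incremental back-up stays small but depends on the back-ups that came before it.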
Hardware, software, and format sustainability
Keeping your software and hardware up to date will help keep your data accessible by improving your ability to open and read your files. It also makes your data more easily accessible to others who may require access.
Replace your storage media every 5-7 years to stay ahead of technological changes and keep your data accessible.
Choosing commonly used and accessible formats will help ensure your files remain accessible. Where possible, avoid proprietary formats as copyright protections limit the accessibility of such formats outside of their designed software. Overall, file formats should be non-proprietary and/or commonly used, either more broadly or within a specific discipline or research community. File formats should also be uncompressed and unencrypted.
It’s also important to think about your long-term plans before choosing a format. Lossless formats, such as TIFF, are more stable, but can use up more storage space and be less portable, meaning that they are not ideal formats if you intend to share files as e-mail attachments or through other methods. JPEGs are lossy, but are still relatively stable and commonly used and will be better suited for file sharing.
Using software that runs on different operating systems and that is regularly upgraded to the latest stable version will further ensure that your files remain usable and can be more widely accessed.
Metadata and file naming
Metadata, or data about data, is descriptive information that tells you what a file is and helps you better manage, store, and find your data. It’s important to have a folder system and file naming system that is clear and intuitive and that is applied consistently across your records.
If you are organizing your research files, documentation such as readme files, data dictionaries, and code books can also be a useful method of recording metadata about the research data.
File names should be no longer than 25 characters, and should facilitate file identification and retrieval. How you name your files will also impact the order in which they appear in any given directory, as files typically sort alphabetically by file name.
Avoid special characters such as ampersands, asterisks, and most punctuation in your file names; hyphens and underscores are the exceptions. Spaces can also sometimes be problematic in file and folder names. Consider the following alternatives:
- kebab-case: where hyphens are used instead of spaces. Example: file-name
- camelCase: where the first letter of each word, with the exception of the first word, is capitalized. Example: fileName
- snake_case: where underscores are used instead of spaces. Example: file_name
- PascalCase: where the first letter of each word is capitalized. Example: FileName
You can also use a combination of these alternatives to separate values in a file name. For example, if your file name includes the author name and a date, snake case could be used as a separator between name and date, while kebab case could be used as a separator within each separate value. Example: smith-jane_2021-09-28
If you are keeping different versions of a document, you should also make sure the file name clearly expresses the version. Example: fileName-v1
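If you name many files, a small script can apply the convention consistently. The following is a minimal sketch, assuming the combined style described above (underscores between values, hyphens within a value) and the 25-character limit. The function names are hypothetical, the limit is applied to the name without its file extension, and appending the version as a separate underscore-delimited value is one possible choice rather than a rule from this guide.

```python
import re
from datetime import date

MAX_LENGTH = 25  # limit suggested in this guide, applied before the extension


def kebab(text):
    """Lowercase text and replace runs of other characters with single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_file_name(author, day, version=None):
    """Join values with underscores; hyphens separate words within a value."""
    parts = [kebab(author), day.isoformat()]
    if version is not None:
        parts.append(f"v{version}")
    name = "_".join(parts)
    if len(name) > MAX_LENGTH:
        raise ValueError(f"{name!r} is longer than {MAX_LENGTH} characters")
    return name


print(build_file_name("Smith, Jane", date(2021, 9, 28)))     # smith-jane_2021-09-28
print(build_file_name("Smith, Jane", date(2021, 9, 28), 1))  # smith-jane_2021-09-28_v1
```

Writing dates as year-month-day also means that files sort chronologically when a directory is sorted alphabetically by file name.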
Selection and retention
Periodically clean your files to ensure you’re using your space efficiently. Some questions that can help you decide what should be kept include:
- Is there a business or research purpose that requires me to keep these records?
- Are these records unique or meaningful?
- What resources in terms of time and money would be involved in replacing these files if they were lost?
- Are these files duplicated or available elsewhere?
- Do I have similar content, such as an earlier draft of the same record, or multiple photographs of the same thing? If yes, which is the best version to keep?
Keeping everything may increase your storage costs over time, and may make it harder to find and retrieve relevant data, particularly if you intend to deposit it with an institution for long-term preservation in the future.
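To help answer the question about duplicated files, exact duplicates can be found by comparing checksums of file contents. The sketch below is one possible approach, with a hypothetical function name. It reads each file fully into memory, which is fine for small collections but not for very large files, and it only finds byte-for-byte copies, not near-duplicates such as earlier drafts or similar photographs.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(root):
    """Group files under root whose contents are byte-for-byte identical."""
    by_checksum = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            # Identical contents produce the same SHA-256 checksum
            checksum = hashlib.sha256(path.read_bytes()).hexdigest()
            by_checksum[checksum].append(path)
    return [group for group in by_checksum.values() if len(group) > 1]
```

Each group in the result lists files that could be reduced to a single copy; deciding which one to keep is still up to you.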
File relationships
Avoid creating dependencies between files that might be lost when transferring records over. For example, if you hyperlink text in one document so that it links to another document on your computer, this link will only be functional on your local computer, and only as long as the files remain in the same location. If you save a copy of those files to a separate storage device, or move the file to a different directory on your computer, these links will break.
As an alternative, use a descriptive hyperlink so that you, or other potential users of your data, know how to retrieve the content if the link breaks. For example, include the title of the file and other contextual information so that you or a secondary user can retrieve the file.
Monitoring
Files can be altered inadvertently, both through human error and through other processes that may be operating in the background without your knowledge. You can monitor these changes in numerous ways.
Anti-virus software can alert you to malicious actions that may cause changes to your files. You can also install software that monitors and reports on file integrity. Free versions of these tools, such as Fixity by AVP, can be installed on your storage devices. Routinely running this software will help alert you to unintentional changes to your files.
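The general idea behind integrity monitoring can be sketched with checksums: record a checksum for every file, then compare a later scan against that record. The example below illustrates the concept only and is not a description of how Fixity or any other tool works. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_manifests(old, new):
    """Report files that changed, disappeared, or appeared between two scans."""
    return {
        "changed": sorted(k for k in old if k in new and old[k] != new[k]),
        "missing": sorted(k for k in old if k not in new),
        "added": sorted(k for k in new if k not in old),
    }
```

In practice, the first manifest would be saved somewhere outside the folder being monitored (for example as a JSON file) and compared with a fresh scan after each transfer or back-up.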
Data security
Ensuring your data is secure is particularly important for those working with sensitive and/or private data that could cause harm to others if unintentionally released.
Consider data security not only when storing your data on your computer or other storage devices, but also when sharing your data with relevant parties. If multiple people need access to the data for various reasons, make sure their access permissions are specific to their needs. For example, if a person only needs to view a file, make sure they do not have editing permissions.
Encrypting your data with a strong password can also help keep your data secure, but it is important to remember your encryption key to avoid being locked out of your own data.
For University of Manitoba researchers, the Libraries supports an instance of Dataverse, which can be used as a secure file share system, where specific user permissions can be assigned. For the wider University of Manitoba community, the Access and Privacy Office provides the Data Sharing and Storage Guidelines - University Records (PDF).
Content-specific considerations
Alongside general considerations, you should also consider content-specific issues that might impact the lifespan and preservability of your digital records. The resources below may help you to better format and maintain specific types of content.
Research data
The following resources may be useful in learning how to better maintain and manage research data, including raw data, curated data, published data, supporting documentation (e.g., consent forms, agreements, etc.), as well as related metadata.
UM resources
- Research data management at the Libraries
- Research lifecycle at UM
- Upcoming workshops or past sessions offered through the Libraries
- The Research Data Storage Finder tool
- Using SharePoint and OneDrive for Research
- Research computing
- IST's Information Security tips
Additional resources
- DMP Assistant
- Dataverse Curation Guide (NDRIO/Portage)
- Data Curation Network resources, including:
- FAIR Principles
- First Nations principles of OCAP® training resources
- CARE Principles for Indigenous Data Governance
University records
The following UM resources may be useful in learning how to better maintain and manage records created by UM employees in the course of their work.
- Data Sharing and Storage Guidelines – University Records and Data Quick Reference
- UM Common Records Authority Schedule
- UM records management policies and procedures
- Donating to the Archives: University Office Donor Guidelines
Digitized records
The following UM resources may be useful in learning how to better maintain records that have been digitized from paper-based formats.
- Digitization at the Libraries
- Imaging records for use as official records (UM records management policy)
Personal records
The following resources may be useful in learning how to manage personal digital records.