Guide to self-preservation

Libraries

About the guide

There are many risks that can negatively impact your digital records. These include:

Software and hardware obsolescence or failure
File degradation or corruption
Physical damage to storage media, such as flash drives or hard drives
Human error, such as accidental deletion of files

Considering these risks, your digital records have an average lifespan of 5-15 years, depending on the software and hardware you choose to maintain your data and a number of other variables.

The Libraries’ guide to self-preservation aims to help you extend the life of your digital records in your research, at work, and at home by providing a variety of general and format-specific strategies to create and maintain good data.

General strategies

Storage and back-ups

Consider where all of your records are stored: on your computer, your phone, your tablet, social media accounts, as e-mail attachments, in Dropbox, Google Drive, and other file storage services, or on external media such as flash drives, optical media, or external hard drives.

The copies you maintain and back up should consolidate important records from across these storage areas. File storage services and social media platforms have changed their policies in the past to limit how much users can upload, and have even deleted user content in response to these policy changes.

Third-party platforms are not required to maintain your data. External media have a limited shelf life and will become unreadable and inaccessible over time. Make sure to have a local copy of all of your important or meaningful records and refresh your hardware as needed.

Having copies of your data will ensure you have a back-up in the event of data loss. The 3-2-1 rule is a good approach to follow: Keep 3 copies, on 2 different types of storage media (example: 1 laptop, 2 external hard drives) and store one copy in a separate location.
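As a rough illustration, the 3-2-1 rule can be written as a simple check over an inventory of your copies. This is a minimal sketch: the Copy record, its fields, and the labels used below are hypothetical and only stand in for however you keep track of your own back-ups.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One copy of your records (hypothetical inventory entry)."""
    label: str       # e.g. "laptop"
    media_type: str  # e.g. "internal-drive", "external-hdd"
    location: str    # e.g. "home", "office"


def meets_3_2_1(copies):
    """Return True if there are 3+ copies, on 2+ media types, in 2+ locations."""
    media_types = {c.media_type for c in copies}
    locations = {c.location for c in copies}
    return len(copies) >= 3 and len(media_types) >= 2 and len(locations) >= 2


copies = [
    Copy("laptop", "internal-drive", "home"),
    Copy("backup-a", "external-hdd", "home"),
    Copy("backup-b", "external-hdd", "office"),
]
print(meets_3_2_1(copies))      # True: 3 copies, 2 media types, 2 locations
print(meets_3_2_1(copies[:2]))  # False: only 2 copies, all in one location
```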

Copies should be checked routinely, including after files are transferred and/or backed up. If you open a copy of your files only to view it, do not save the files afterwards, so that you avoid making unintentional changes to that copy. Consider the type of back-up to perform:

Full back-ups facilitate comprehensive data recovery by creating a copy of all data, but they may require significant storage space.
Incremental back-ups involve creating a copy of all data that is new or modified since the last back-up of any type. They require less storage space, but comprehensive data recovery requires the last full back-up and every incremental back-up made since, so a full back-up should still be performed periodically.
Differential back-ups involve creating a copy of all data that is new or modified since the last full back-up. They require more storage space than incremental back-ups. For comprehensive data recovery, the last full back-up and the last differential back-up are required.
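The difference between the three types comes down to which point in time a file's last modification is compared against. The sketch below illustrates this with made-up file names and simple numeric timestamps; the function and its parameters are hypothetical and do not reflect how any particular back-up tool works.

```python
def select_for_backup(mtimes, kind, last_full, last_backup):
    """Pick the files a back-up of the given kind would copy.

    mtimes:      dict mapping file name -> last-modified timestamp
    last_full:   timestamp of the most recent full back-up
    last_backup: timestamp of the most recent back-up of any type
    """
    if kind == "full":
        return sorted(mtimes)       # everything, regardless of age
    if kind == "incremental":
        cutoff = last_backup        # changes since the last back-up of any type
    elif kind == "differential":
        cutoff = last_full          # changes since the last full back-up
    else:
        raise ValueError(f"unknown back-up kind: {kind}")
    return sorted(name for name, t in mtimes.items() if t > cutoff)


# Hypothetical timeline: full back-up at t=3, incremental back-up at t=7.
mtimes = {"a.txt": 1, "b.txt": 5, "c.txt": 9}
print(select_for_backup(mtimes, "full", 3, 7))          # ['a.txt', 'b.txt', 'c.txt']
print(select_for_backup(mtimes, "incremental", 3, 7))   # ['c.txt']
print(select_for_backup(mtimes, "differential", 3, 7))  # ['b.txt', 'c.txt']
```

In this example the differential back-up copies b.txt again even though the earlier incremental back-up already captured it, which is why differential back-ups need more space but fewer pieces to restore from.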

Hardware, software, and format sustainability

Keeping your software and hardware up to date will help keep your data accessible by improving your ability to open and read your files. It also makes your data more easily accessible to others who may require access.

Replace your storage media every 5-7 years to stay ahead of technological changes and keep your data accessible.

Choosing commonly used and accessible formats will help ensure your files remain accessible. Where possible, avoid proprietary formats as copyright protections limit the accessibility of such formats outside of their designed software. Overall, file formats should be non-proprietary and/or commonly used, either more broadly or within a specific discipline or research community. File formats should also be uncompressed and unencrypted.

It’s also important to think about your long-term plans before choosing a format. Lossless formats, such as TIFF, are more stable, but can use up more storage space and be less portable, meaning that they are not ideal if you intend to share files as e-mail attachments or through other methods. JPEGs are lossy, but are still relatively stable and commonly used, and will be better suited for file sharing.

Using software that runs on different operating systems and that is regularly updated to the latest stable version will further ensure that your files remain usable and can be more widely accessed.

Metadata and file naming

Metadata, or data about data, is descriptive information that tells you what a file is and helps you better manage, store, and find your data. It’s important to have a folder system and file naming system that is clear and intuitive and that is applied consistently across your records.

If you are organizing your research files, documentation such as readme files, data dictionaries, and code books can also be a useful method of recording metadata about the research data.

File names should be no longer than 25 characters, and should facilitate file identification and retrieval. How you name your files will also impact the order in which they appear in any given directory, as files typically sort alphabetically by file name.
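Because sorting is usually character by character, numbers in file names can sort in an unexpected order unless they are zero-padded. The short sketch below shows the effect using Python's default string sorting as a stand-in for an alphabetical directory listing; some file managers sort numbers differently, and the pad_numbers helper and file names are hypothetical.

```python
import re


def pad_numbers(name, width=2):
    """Zero-pad every run of digits in a file name to at least `width` digits."""
    return re.sub(r"\d+", lambda m: m.group().zfill(width), name)


names = ["draft-v10", "draft-v2", "draft-v1"]

# Plain alphabetical order puts v10 before v2, because "1" sorts before "2".
print(sorted(names))  # ['draft-v1', 'draft-v10', 'draft-v2']

# After zero-padding, alphabetical order matches numeric order.
print(sorted(pad_numbers(n) for n in names))  # ['draft-v01', 'draft-v02', 'draft-v10']
```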

Avoid non-alphanumeric characters such as punctuation, ampersands, or asterisks in your file names; hyphens and underscores used as separators are the exception. Spaces can also sometimes be problematic in file and folder names. Consider the following alternatives:

kebab-case: where hyphens are used instead of spaces. Example: file-name
camelCase: where the first letter of each word, with the exception of the first word, is capitalized. Example: fileName
snake_case: where underscores are used instead of spaces. Example: file_name
PascalCase: where the first letter of each word is capitalized. Example: FileName
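If you need to rename many files, the four conventions above can be applied programmatically. The following is a minimal sketch with hypothetical helper names; it treats any run of letters and digits as a word and discards everything else, which may not suit every naming scheme.

```python
import re


def split_words(text):
    """Split text into words made of letters and digits only."""
    return re.findall(r"[A-Za-z0-9]+", text)


def kebab_case(text):
    return "-".join(w.lower() for w in split_words(text))


def snake_case(text):
    return "_".join(w.lower() for w in split_words(text))


def pascal_case(text):
    return "".join(w.capitalize() for w in split_words(text))


def camel_case(text):
    words = split_words(text)
    if not words:
        return ""
    return words[0].lower() + "".join(w.capitalize() for w in words[1:])


print(kebab_case("file name"))   # file-name
print(camel_case("file name"))   # fileName
print(snake_case("file name"))   # file_name
print(pascal_case("file name"))  # FileName
```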

You can also use a combination of these alternatives to separate values in a file name. For example, if your file name includes the author name and a date, snake case could be used as a separator between name and date, while kebab case could be used as a separator within each separate value. Example: smith-jane_2021-09-28

If you are keeping different versions of a document, you should also make sure the file name clearly expresses the version. Example: fileName-v1
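The naming guidance in this section can be combined into a small helper that builds a name and rejects anything that breaks the rules. This is a sketch under stated assumptions: the build_file_name function is hypothetical, the 25-character limit is applied to the name without its extension, and the version is appended as a separate underscore-delimited value, which is one possible convention rather than the only one.

```python
import re
from datetime import date

MAX_LENGTH = 25                           # limit suggested in this guide
ALLOWED = re.compile(r"[A-Za-z0-9_-]+")   # letters, digits, underscore, hyphen


def build_file_name(name_parts, when, version=None):
    """Join values with underscores and the words within a value with hyphens."""
    values = ["-".join(part.lower() for part in name_parts), when.isoformat()]
    if version is not None:
        values.append(f"v{version}")
    stem = "_".join(values)
    if not ALLOWED.fullmatch(stem):
        raise ValueError(f"unsupported characters in file name: {stem}")
    if len(stem) > MAX_LENGTH:
        raise ValueError(f"file name longer than {MAX_LENGTH} characters: {stem}")
    return stem


print(build_file_name(["smith", "jane"], date(2021, 9, 28)))             # smith-jane_2021-09-28
print(build_file_name(["smith", "jane"], date(2021, 9, 28), version=1))  # smith-jane_2021-09-28_v1
```

Writing the date as year-month-day also means that files with the same author sort chronologically in an alphabetical listing.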

Selection and retention

Periodically clean your files to ensure you’re using your space efficiently. Some questions that can help you decide what should be kept include:

Is there a business or research purpose that requires me to keep these records?
Are these records unique or meaningful?
What resources in terms of time and money would be involved in replacing these files if they were lost?
Are these files duplicated or available elsewhere?
Do I have similar content, such as an earlier draft of the same record, or multiple photographs of the same thing? If yes, which is the best version to keep?

Keeping everything may cost you more over time in storage, and may make it more difficult to find and retrieve relevant data, particularly if you intend to deposit it with an institution for long-term preservation in the future.
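To help answer the question about duplicated files, exact duplicates can be found by comparing a checksum of each file's contents. The sketch below is one possible approach using SHA-256; the function names are hypothetical, and it only detects byte-for-byte identical files, not similar content such as earlier drafts or near-identical photographs.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return groups of files under `root` whose contents are identical."""
    by_checksum = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            by_checksum[sha256_of(path)].append(path)
    return [group for group in by_checksum.values() if len(group) > 1]
```

Calling find_duplicates on a folder returns a list of groups, each listing the paths that share the same contents, so you can decide which copy to keep. Nothing is deleted automatically.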

File relationships

Avoid creating dependencies between files that might be lost when transferring records over. For example, if you hyperlink text in one document so that it links to another document on your computer, this link will only be functional on your local computer, and only as long as the files remain in the same location. If you save a copy of those files to a separate storage device, or move the file to a different directory on your computer, these links will break.

As an alternative, use a descriptive hyperlink so that you, or other potential users of your data, know how to retrieve the content if the link breaks. For example, include the title of the file and other contextual information so that you or a secondary user can retrieve the file.

Monitoring

Files can be altered inadvertently, both through human error, and through other processes that may be operating in the background without your knowledge. You can monitor these changes in numerous ways.

Anti-virus software can alert you to malicious actions that may cause changes to your files. You can also install software that monitors and reports on data integrity. Free versions of these tools, such as Fixity by AVP, can be installed on your storage devices. Routinely running this software will help alert you to unintentional changes to your files.
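The general idea behind integrity monitoring is to record a checksum for every file and later compare a fresh set of checksums against that record. The following is a minimal sketch of that idea only; it is not how Fixity or any other specific tool is implemented, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def checksum(path):
    """Return the SHA-256 checksum of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(1 << 20)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to `root`) to its checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): checksum(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_manifests(old, new):
    """Report files that changed, went missing, or were added between two runs."""
    return {
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
        "missing": sorted(old.keys() - new.keys()),
        "added": sorted(new.keys() - old.keys()),
    }
```

In practice you would save the first manifest somewhere safe, for example as a text file stored alongside each back-up, and compare against it after every transfer or on a regular schedule.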

Data security

Ensuring your data is secure is particularly important for those working with sensitive and/or private data that could cause harm to others if unintentionally released.

Consider data security not only when storing your data on your computer or other storage devices, but also when sharing your data with relevant parties. If multiple people need access to the data for various reasons, make sure their access permissions are specific to their needs. For example, if a person only needs to view a file, make sure they do not have editing permissions.
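For files kept on your own computer, one simple way to limit editing is to remove write permission from a file. The sketch below uses Python's standard library and hypothetical helper names. It only affects local file permissions: shared services manage access through their own settings, and on Windows this call only toggles the file's read-only flag.

```python
import os
import stat

# Write permission bits for the owner, the group, and everyone else.
WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def make_read_only(path):
    """Remove all write permission bits from a file, keeping the other bits."""
    current = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, current & ~WRITE_BITS)


def is_read_only(path):
    """Return True if no one has write permission on the file."""
    return (stat.S_IMODE(os.stat(path).st_mode) & WRITE_BITS) == 0
```

A read-only file can still be copied and viewed, and the owner can restore write permission later, so this guards against accidental edits rather than deliberate ones.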

Encrypting your data with a strong password can also help keep your data secure, but it is important to remember your encryption key to avoid being locked out of your own data.

For University of Manitoba researchers, the Libraries supports an instance of Dataverse, which can be used as a secure file share system where specific user permissions can be assigned. For the wider University of Manitoba community, the Access and Privacy Office has published the Data Sharing and Storage Guidelines - University Records (PDF).

Content-specific considerations

Alongside general considerations, you should also consider content-specific issues that might impact the lifespan and preservability of your digital records. The resources below may help you to better format and maintain specific types of content.

Research data

The following resources may be useful in learning how to better maintain and manage research data, including raw data, curated data, published data, supporting documentation (e.g., consent forms, agreements, etc.), as well as related metadata.

UM resources

Research data management at the Libraries
Research lifecycle at UM
Upcoming workshops or past sessions offered through the Libraries
The Research Data Storage Finder tool
Using SharePoint and OneDrive for Research
Research computing
IST's Information Security tips

Additional resources

University records

The following UM resources may be useful in learning how to better maintain and manage records created by UM employees in the course of their work.

Digitized records

The following UM resources may be useful in learning how to better maintain records that have been digitized from paper-based formats.

Personal records

The following resources may be useful in learning how to manage personal digital records.
