Open Source Psychology/Data Encryption

Some of the files you collect will need to be encrypted, whether because of your Institutional Review Board protocol, other regulations, or simply because you want to protect your data so that you can transfer it safely.

Encryption is the process of encoding your data so that, even if an unauthorized person intercepts it, they cannot read the raw data.

De-identifying Data

If you simply want to change your data so that individuals can no longer be linked to their records, while still keeping every data point under a unique user ID, you will need to de-identify your data.

Identifiable data is anything that can be used to link information back to a specific individual. This normally includes names, addresses, social security numbers, other identification numbers, etc., but it can also include much less obvious indicators of who the person is.
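
As a rough illustration of the idea, the sketch below splits a small data frame into a de-identified table and a separate linking key, using base R only. The data frame `raw`, the `score` column, and the `user_id` column are hypothetical and invented for this example; this is one simple way to do it, not a complete de-identification procedure.

```r
# Hypothetical raw data: 'first' and 'last' are identifying, 'score' is not.
raw <- data.frame(
  first = c("Aaron", "Brittany", "Chris"),
  last  = c("Moore", "Norren", "Oster"),
  score = c(12, 17, 9),
  stringsAsFactors = FALSE
)

# Assign each row a unique ID in random order, so that the ID
# does not simply mirror the row's position in the file.
raw$user_id <- sample(seq_len(nrow(raw)))

# Linking key: connects IDs to identities. Store it securely and
# separately from the data you analyze or share.
linking_key <- raw[, c("user_id", "first", "last")]

# De-identified data: the unique ID plus the non-identifying columns only.
deidentified <- raw[, c("user_id", "score")]
```

Only `deidentified` would be shared or analyzed. Keep in mind that columns that look harmless can still identify someone when combined, so which columns count as non-identifying is a judgment that depends on your data.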

...

In R, load the 'digest' package:

library(digest)

You will then want to 'salt' the unique values so that nobody can reverse the hash codes through brute force...

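# 'a' must exist before columns can be added to it, so create it first
a <- data.frame(first = c("Aaron", "Brittany", "Chris"), stringsAsFactors = FALSE)
a$last <- c("Moore", "Norren", "Oster")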
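
The salting and hashing step might look something like the following minimal sketch. It assumes the SHA-256 algorithm and a salt that is simply prepended to each value. The data frame `people`, the `salt` value, and the helper `hash_id` are hypothetical names chosen for this example and are not part of the 'digest' package.

```r
library(digest)

# Hypothetical example data, using the same names as above.
people <- data.frame(
  first = c("Aaron", "Brittany", "Chris"),
  last  = c("Moore", "Norren", "Oster"),
  stringsAsFactors = FALSE
)

# Assumption: a secret salt, kept private and stored apart from the data.
salt <- "replace-with-a-long-random-secret"

# Hash each salted string separately. serialize = FALSE hashes the text
# itself rather than R's serialized representation of the object.
hash_id <- function(x, salt) {
  vapply(
    paste0(salt, x),
    function(s) digest(s, algo = "sha256", serialize = FALSE),
    character(1),
    USE.NAMES = FALSE
  )
}

# Build one hashed identifier per person from the identifying fields.
people$id <- hash_id(paste(people$first, people$last), salt)

# Keep only the hashed identifier in the data set you share.
deidentified <- people[, "id", drop = FALSE]
```

Because the same input and salt always produce the same hash, records for the same person can still be matched across files, while someone without the salt cannot easily recover the names by hashing a list of guesses.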