Never Trust User Input for Generic Fields
Disclaimer: I am not a professional developer or database designer; this is a hobby for me.
I’ve written in the past about PHP unit testing and why you should always use example.com for your testing efforts. Now, in the wake of the Equifax data breach, I am taking a stab at expressing a thought of mine about safeguarding PII (Personally Identifiable Information) in a data application.
Anyone who has ever worked with data-driven web applications already knows that user input is never to be trusted. Data must always be sanitized before a data-driven web application works with it. There are many ways to do this, such as escaping special characters on input and using prepared statements. I am not going to get into the nuts and bolts of that right now. This article is an argument for treating input into generic fields as untrusted because it could contain PII.
For the purposes of this article, “generic field” is a term that I am using for any field that is not meant for a specific type of information. Fields labeled “Notes”, “Additional Information”, “Descriptions”, etc. fall under this term.
Development organizations often leave these generic fields unencrypted and instead trust that training will keep end users from entering PII into them. We should never trust user input. Handling PII through policy instead of through technical controls is like trying to stop a leak with a screen. Some of the water will stop, but it only takes one hole for a leak. Likewise, it only takes one person forgetting the policy to cause a catastrophic PII exposure in your application. This should be handled at the development level.
Consider an example. An HR organization has a database of employees. They need to make a note that Jon Doe has a peanut allergy and that there is an epi-pen in the first-aid kit for emergencies. There is no specific field for medical conditions, so they place the note in the “Additional Details” field. Furthermore, the person making the entry adds another emergency contact (name, phone, etc.) to the same field for this particular allergy case. Due to a security issue with the SQL server, hackers capture a dump of the database. Almost all of the data containing PII is encrypted, but the generic fields are not. Now the hackers know Jon’s medical condition without having to decrypt anything. Not only is this a privacy violation, it is also a potential HIPAA violation. Again, we should never trust user input.
The recent data breach at Equifax reminds us of what can happen with PII once it is released to the world. The Equifax breach probably involved a hacker or group of hackers gaining access to an account that had legitimate access to this information, and (hopefully) not direct access to unencrypted data in the databases at Equifax. Even so, my argument for protecting generic fields still applies.
In this case, the solution for storing data in these generic fields should simply be to encrypt them. With good database practices, such as holding the PII data fields in their own table and linking them with primary and foreign keys, encrypting them should have a minimal impact on performance.
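As a rough illustration of that idea, here is a minimal Python sketch using the standard sqlite3 module. The table and column names (employee, employee_note, note_cipher) and the helper functions are hypothetical, made up for this example. The encrypt and decrypt callables are assumed to come from a vetted encryption library chosen by the developer; the sketch does not implement any cipher itself.

```python
import sqlite3
from typing import Callable, List

# Hypothetical schema: the free-text note lives in its own table, linked to
# the employee row by a foreign key, and is stored only as ciphertext.
SCHEMA = """
CREATE TABLE employee (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE employee_note (
    id          INTEGER PRIMARY KEY,
    employee_id INTEGER NOT NULL REFERENCES employee(id),
    note_cipher BLOB NOT NULL
);
"""

# Assumption: the caller supplies bytes-in, bytes-out cipher functions.
Cipher = Callable[[bytes], bytes]


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database, enable foreign key checks, and create the tables."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def add_note(conn: sqlite3.Connection, employee_id: int, text: str,
             encrypt: Cipher) -> int:
    """Encrypt the note before it reaches the database; return the row id."""
    cipher_text = encrypt(text.encode("utf-8"))
    cur = conn.execute(
        "INSERT INTO employee_note (employee_id, note_cipher) VALUES (?, ?)",
        (employee_id, cipher_text),
    )
    conn.commit()
    return cur.lastrowid


def read_notes(conn: sqlite3.Connection, employee_id: int,
               decrypt: Cipher) -> List[str]:
    """Fetch and decrypt every note for one employee, oldest first."""
    rows = conn.execute(
        "SELECT note_cipher FROM employee_note "
        "WHERE employee_id = ? ORDER BY id",
        (employee_id,),
    )
    return [decrypt(bytes(row[0])).decode("utf-8") for row in rows]
```

One possible choice for the two callables is the encrypt and decrypt methods of a Fernet object from the third-party cryptography package, with the key kept somewhere other than the database so that a dump alone does not expose the notes. Because the note is encrypted as a whole, it does not matter whether the user typed PII into it or not.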