Never Trust User Input for Generic Fields

Disclaimer: I am not a professional developer or database designer; this is a hobby for me.

I’ve written in the past about PHP unit testing and why you should always use example.com in your tests. Now, in the wake of the Equifax data breach, I want to share a thought on safeguarding PII (Personally Identifiable Information) in a data application.

Anyone who has ever worked with data-driven web applications already knows that user input is never to be trusted. Sanitizing data is always necessary before working with it. There are many ways to handle untrusted input, such as escaping special characters and using prepared statements. I am not going to get into the nuts and bolts of that right now. This article argues for treating input into generic fields as untrusted in another sense: it could contain PII.

The Problem

For the purposes of this article, “generic field” is my term for any field that is not meant for a specific type of information. Fields labeled “Notes”, “Additional Information”, “Description”, and the like fall under this term.

Many development organizations do not encrypt these generic fields and instead trust that end users will be trained not to enter PII into them. We should never trust user input. Handling PII through policy instead of through technical controls is like trying to stop a leak with a screen: some of the water will stop, but it only takes one hole for a leak. Likewise, it only takes one person forgetting the policy to cause a catastrophic PII problem for your application. This should be handled at the development level.

Example Scenario

An HR organization has a database of employees. They need to make a note that Jon Doe has a peanut allergy and that there is an epi-pen in the first-aid kit for emergencies. There is no specific field for medical conditions, so they place the note in the “Additional Details” field. Furthermore, the person making the entry adds another emergency contact (name, phone, etc.) to the same field for this particular allergy case. Due to a security issue with the SQL server, hackers capture a dump of the database. Almost all of the data containing PII is encrypted; the generic fields are not. Now the hackers know Jon’s medical condition without having to decrypt anything. Not only is this a privacy violation, it is also potentially a HIPAA violation. Again, we should never trust user input.

The recent data breach at Equifax reminds us of what can happen once PII is released to the world. The Equifax breach probably involved a hacker or group of hackers gaining access to an account with legitimate access to this information, and (hopefully) not direct access to unencrypted data in Equifax’s databases. Even so, my argument for protecting generic fields still applies.

The Solution

In this case, the solution is simply to encrypt these generic fields. With good database practices, such as holding the PII fields in their own table and linking them with primary and foreign keys, encrypting them should have a minimal impact on performance.
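
As a rough illustration of that layout, here is a minimal Python sketch. It assumes SQLite through the standard sqlite3 module and the Fernet recipe from the third-party cryptography package for the encryption itself. The table names, column names, and helper functions (employee_note, make_cipher, save_note, load_notes) are hypothetical and only there for the example. Key management, meaning where the key lives and how it is rotated, is out of scope here even though it matters a great deal in practice.

```python
import sqlite3
from typing import List


def make_cipher(key: bytes):
    """Build a Fernet cipher from a key (assumes `pip install cryptography`)."""
    from cryptography.fernet import Fernet  # third-party, imported lazily
    return Fernet(key)


def create_schema(conn: sqlite3.Connection) -> None:
    """Keep free-text notes in their own table, linked by a foreign key."""
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE employee ("
        " id INTEGER PRIMARY KEY,"
        " display_name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE employee_note ("
        " id INTEGER PRIMARY KEY,"
        " employee_id INTEGER NOT NULL REFERENCES employee(id),"
        " note_ciphertext BLOB NOT NULL)"
    )


def save_note(conn: sqlite3.Connection, cipher, employee_id: int, note: str) -> int:
    """Encrypt the note before it ever reaches the database."""
    token = cipher.encrypt(note.encode("utf-8"))
    cur = conn.execute(
        "INSERT INTO employee_note (employee_id, note_ciphertext) VALUES (?, ?)",
        (employee_id, token),
    )
    return cur.lastrowid


def load_notes(conn: sqlite3.Connection, cipher, employee_id: int) -> List[str]:
    """Decrypt notes only at read time, for callers that hold the key."""
    rows = conn.execute(
        "SELECT note_ciphertext FROM employee_note"
        " WHERE employee_id = ? ORDER BY id",
        (employee_id,),
    )
    return [cipher.decrypt(token).decode("utf-8") for (token,) in rows]
```

With a key from Fernet.generate_key(), a call such as save_note(conn, make_cipher(key), 1, "Peanut allergy, epi-pen in first-aid kit") writes only ciphertext to the notes table. A raw dump of that table would then not reveal the note, provided the key is stored somewhere other than the database itself.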

Example.com: Always Use It for Testing

Background

I was looking over some software tests today, and they used different testing addresses such as test.com or test@test.com. This got me thinking: isn’t there a standard site or address that we should use for testing? It didn’t take me long to find my answer: example.com. More on that in a bit.

Security Concerns

A couple of questions came up while thinking about this: where is my information going when I test with made-up sites, and what kind of data am I sending? From a security standpoint, using unknown sites for testing may reveal flaws, sensitive data, or PII to parties that may not have the best intentions. Let me throw out a hypothetical. Suppose I see an opportunity to purchase the domain name tester.com. I am not buying it for legitimate reasons but as a honeypot. With that honeypot, I harvest information from the emails that come to that domain. Once that information is in hand, I could sell it on the dark web. Thankfully, my honor is paramount to me, so I will not do such a thing.

Real Life Examples

A quick whois search found the following. Test.com has a private registration in the United States, so we don’t know who owns this site. The question is, what are their intentions for the data they gather? Registration for somewhere.com is private in Panama. Nowhere.com redirects to a media outlet in Germany that looks like a simple front site. The last update for this site? 2012. I’m not saying that this one is malicious, but it’s suspicious at the very least. A web advertising agency owns the site test-site.com. The owner of test@test-site.com could potentially add emails gleaned from tests to spam lists. How would your clients feel about a sudden influx of spam?

Other Concerns

A less evil but realistic concern with using random sites is that some of them could be real and legitimate. Take, for example, a company named Pinacle Associates; I have no idea if such a company exists, so please don’t bombard them with emails. Suppose Tes Thompson is an SVP for Public Relations at this company, and the company decided on an email naming scheme of first name plus last initial. Tes’s email would then be test@pinacle-associates.com; again, I don’t know if this exists, so please be kind and don’t spam it. Imagine the amount of mail she would get if a test team decided to use her email address for testing.

The Solution: Example Domains

So what is the solution? The domains example.com, example.net, example.org and example.edu are set aside for the very purposes of testing and documentation. The Internet Corporation for Assigned Names and Numbers, or ICANN, owns and manages these domains. These are the folks that give out and manage domain names.
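
One way to hold a test suite to this rule is a small guard that rejects any fixture address outside the example domains. The following is only a sketch in Python; the function name uses_example_domain and the RESERVED_DOMAINS set are made up for illustration, and the set simply mirrors the four domains named above (RFC 2606 is the document that reserves the .com, .net and .org names).

```python
# Domains named in this article as set aside for testing and documentation.
RESERVED_DOMAINS = {"example.com", "example.net", "example.org", "example.edu"}


def uses_example_domain(address: str) -> bool:
    """Return True if an email address or hostname sits under an example domain."""
    # Take everything after the last "@", or the whole string for a bare hostname.
    host = address.rsplit("@", 1)[-1].strip().lower().rstrip(".")
    # Accept the domain itself or any subdomain of it, but nothing that
    # merely contains the name (such as notexample.com).
    return any(host == d or host.endswith("." + d) for d in RESERVED_DOMAINS)
```

A test helper could call this on every address in its fixtures and fail fast when something like test@test.com slips in.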

Conclusion

So the moral of the story is that you should always use one of the example domains. Using a domain such as example.com when testing software will help prevent inadvertently leaking PII. Your company or client values their data and wants it kept secure.