Errata Security
Advanced persistent cybersecurity
Brian Kemp is bad on cybersecurity
Sun, 04 Nov 2018 23:22:00 +0000
I'd prefer a Republican governor, but as a cybersecurity expert, I have to point out how bad Brian Kemp (candidate for Georgia governor) is on cybersecurity. When notified about vulnerabilities in election systems, his response has been to shoot the messenger rather than fix the vulnerabilities. This was the premise behind the cybercrime bill earlier this year that was ultimately vetoed by the current governor after vocal opposition from cybersecurity companies. More recently, he just announced that he's investigating the Georgia State Democratic Party for a "failed hacking attempt".


According to news stories, state elections websites are full of common vulnerabilities, the kind documented by the OWASP Top 10, such as "direct object references" that would allow any election registration information to be read or changed, allowing a hacker to cancel the registrations of voters from the other party.

Testing for such weaknesses is not a crime. Indeed, it's desirable that people can test for security weaknesses. Systems that aren't open to test are insecure. This concept is the basis for many policy initiatives at the federal level, to not only protect researchers probing for weaknesses from prosecution, but to even provide bounties encouraging them to do so. The DoD has a "Hack the Pentagon" initiative encouraging exactly this.

But the State of Georgia is stereotypically backwards and thuggish. Earlier this year, the legislature passed SB 315, which criminalized merely attempting to access a computer without permission in order to probe for possible vulnerabilities. To the ignorant and backwards person, this seems reasonable: of course this bad activity should be outlawed. But as we in the cybersecurity community have learned over the last decades, this only outlaws your friends from finding security vulnerabilities, and does nothing to discourage your enemies. Russian election-meddling hackers are not deterred by such laws, only Georgia residents concerned whether their government websites are secure.

It's your own users, and well-meaning security researchers, who are the primary source for improving security. Unless you live under a rock (like Brian Kemp, apparently), you'll have noticed that every month you have your Windows desktop or iPhone nagging you about updating the software to fix security issues. If you look behind the scenes, you'll find that most of these security fixes come from outsiders. They come from technical experts who accidentally come across vulnerabilities. They come from security researchers who specifically look for vulnerabilities.

It's because of this "research" that systems are mostly secure today. A few days ago was the 30th anniversary of the "Morris Worm" that took down the nascent Internet in 1988. The net of that time was hostile to security research, with major companies ignoring vulnerabilities. Systems then were laughably insecure, but vendors tried to address the problem by suppressing research. The Morris Worm exploited several vulnerabilities that were well-known at the time, but ignored by the vendor (in this case, primarily Sun Microsystems).

Since then, with a culture of outsiders disclosing vulnerabilities, vendors have been pressured into fixing them. This has led to vast improvements in security. I'm posting this from a public WiFi hotspot in a bar, for example, because computers are secure enough for this to be safe. Ten years ago, such activity wasn't safe.

The Georgia Democrats obviously have concerns about the integrity of election systems. They have every reason to thoroughly probe an elections website looking for vulnerabilities. This sort of activity should be encouraged, not suppressed as Brian Kemp is doing.

To be fair, the issue isn't so clear. The Democrats aren't necessarily the good guys. They are probably going to lose by a slim margin, and will cry foul, pointing to every election irregularity as evidence they were cheated. It's like how in several races where Republicans lost by slim numbers they claimed criminals and dead people voted, thus calling for voter ID laws. In this case, Democrats are going to point to any potential vulnerability, real or imagined, as disenfranchising their voters. There has already been hyping of potential election systems vulnerabilities out of proportion to their realistic threat for this reason.

But while not necessarily in completely good faith, such behavior isn't criminal. If an election website has vulnerabilities, then the state should encourage the details to be made public -- and fix them.

One of the principles we've learned since the Morris Worm is that of "full disclosure". It's not simply that we want such vulnerabilities found and fixed, we also want the complete details to be made public, even embarrassing details. Among the reasons for this is that it's the only way that everyone can appreciate the consequence of vulnerabilities.

In this case, without having the details, we have only the spin from both sides to go by. One side is spinning the fact that the website was wide open. The other side, as in the above announcement, claims the website was completely secure. Obviously, one side is lying, and the only way for us to know is if the full details of the potential vulnerability are fully disclosed.

By the way, it's common for researchers to overestimate the severity of the vulnerabilities they find. This is at least as common as the embarrassed side trying to cover them up. It's impossible to say which side is at fault here, whether the vulnerability is real or not, without full disclosure. Again, the wrong, backwards thinking is to believe that details of vulnerabilities should be controlled, to avoid exploitation by bad guys. In fact, they should be disclosed, even if it helps the bad guys.

But regardless if these vulnerabilities are real, we do know that criminal investigation and prosecution is the wrong way to deal with the situation. If the election site is secure, then the appropriate response is to document why.

With that said, there's a good chance the Democrats are right and Brian Kemp's office is wrong. In the very announcement declaring their websites are secure, Google Chrome marks their website, sos.ga.gov, as "Not Secure" in the address bar, because they don't use encryption.

Using Let's Encrypt to enable encryption on websites is such a standard security feature that we have to ask ourselves what else they are getting wrong. Normally, I'd run scanners against their systems in order to figure this out, but I'm afraid to, because they are jackbooted thugs who'll come after me, instead of honest people who care about what vulnerabilities I might find so they can fix them.

Conclusion

I'm Libertarian, so I'm going to hate a Democrat governor more than a Republican governor. However, I'm also a cybersecurity expert and somebody famous for scanning for vulnerabilities. As a Georgia resident, I'm personally threatened by this backwards thuggish behavior by Brian Kemp. He learned nothing from this year's fight over SB 315, and unlike the clueful outgoing governor who vetoed that bill, Kemp is likely to sign something similar, or worse, into law.

The integrity of election systems is an especially important concern. The only way to guarantee them is to encourage research, the probing by outsiders for vulnerabilities, and fully disclosing the results. Even if Georgia had the most secure systems, embarrassing problems are going to be found. Companies like Intel, Microsoft, and Apple are the leaders in cybersecurity, and even they have had embarrassing vulnerabilities in the last few months. They have responded by paying bounties to the security researchers who found those problems, not by criminally investigating them.

Why no cyber 9/11 for 15 years?
Fri, 02 Nov 2018 06:57:00 +0000
This article in The Atlantic asks why there hasn't been a cyber-terrorist attack in the last 15 years, or as it phrases it:
National-security experts have been warning of terrorist cyberattacks for 15 years. Why hasn’t one happened yet?
As a pen-tester who has broken into power grids and found 0days in control center systems, I thought I'd write up some comments.


Instead of asking why one hasn't happened yet, maybe we should instead ask why national-security experts keep warning about them.

One possible answer is that national-security experts are ignorant. I get the sense that "national" security experts have very little expertise in "cyber" security. That's why I include a brief resume at the top of this article: I've actually broken into a power grid and found 0days in critical power grid products (specifically, the ABB implementation of ICCP on AIX -- it's a rather obvious buffer-overflow, *cough* ASN.1 *cough*, and I don't know if they ever fixed it).

Another possibility is that they are fear mongering in order to support their agenda. That's the problem with "experts", they get their expertise by being employed to achieve some goal. The ones who know most about an issue are simultaneously the ones most biased. They have every incentive to make people be afraid, and little incentive to tell the truth.

The most likely answer, though, is simply because they can. Anybody can warn of "digital 9/11" and be taken seriously, regardless of expertise. It's always the Morally Right thing to say. You never have to back it up with evidence. Conversely, those who say the opposite don't get the same level of press, and are frequently challenged to defend their abnormal stance.

Indeed, that's how this article by The Atlantic works. Its entire premise is that the national security experts are still "right" even though their predictions haven't happened, and it's reality that's somehow "wrong".


Now let's consider the original question.

One good answer in the article is that terrorists want attacks that "cause certain types of fear and terror, that garner certain media attention, that galvanize followers". Blowing something up causes more fear in the target population than deleting some data.

But something similar is true of the terrorists themselves: they prefer violence. In other words, what motivates terrorists, the ends or the means? Is it the need to achieve a political goal? Or is it simply the search for an excuse to commit violence?

I suspect that it's the latter. It's not that terrorists are violent so much as violent people are attracted to terrorism. This can explain a lot, such as why they have such poor op-sec and encryption, as I've written about before. They enjoy learning how to shoot guns and trigger bombs, but they don't enjoy learning how to use a computer correctly.

I've explored the cyber Islamic dark web and come to a couple conclusions about it. The primary motivation of these hackers is gay porn. A frequent initiation rite to gain access to these forums is to post pictures of your, well, equipment. Such things are repressed in their native countries and societies, so hacking becomes a necessary skill in order to get it.

It's hard for us to understand their motivations. From our western perspective, we'd think gay young men would be on our side, motivated to fight against their own governments in defense of gay rights, in order to achieve marriage equality. None of them want that, as far as I can tell. Their goal is to get married and have children. Sure, they want gay sex and intimate relationships with men, but they also want a subservient wife who manages the household, and the deep family ties that come with spawning progeny. Thus, their motivation is still to defend the umma (the whole community of Muslims bound together by ties of religion) against the West, not pursue personal rights.

The point is, when asking why terrorists do and don't do types of attacks, their own personal motivations are probably pretty darn important.

Another explanation in that article is simply because Islamic hackers aren't good enough. This requires a more sophisticated discussion of what skills they need. As The Atlantic says in their article:
The most powerful likely barrier, though, is also the simplest. For all the Islamic State’s much-vaunted technical sophistication, the skills needed to tweet and edit videos are a far cry from those needed to hack.
It's indeed not just "editing videos". Most hacker attacks you read about use un-sophisticated means like phishing. They are only believed to be sophisticated because people confuse the results attackers achieve with the means used to achieve them. For example, much of the DNC hack, which had important consequences for our election, was done simply by phishing the password from people like John Podesta.

A convincing cyber terrorism attack, such as causing a power blackout, would take different skills -- much rarer skills. I refer to my pentests above. The techniques used were all painfully simple, such as SQL injection from the Internet, but at the same time, they require a much rarer skill. No matter how simple we think SQL injection is, it takes a different skillset than phishing. It takes people more interested in things like math. By the time people acquire such skills, they get gainfully employed at a technical job and no longer have free time to pursue the Struggle. Phishing skills won't land you a high paying job, but web programming (which you need to understand for SQL injection) will.

Lastly, I want to address the complexity of the problem. The Atlantic quotes Robert M. Lee of Dragos, a well-respected technical expert in this area, but I don't think they get the quote right. He points out the complexity of the power grid. What he means is not complex as in hard but complex as in diverse. There are 10,000 different companies involved in power production, long-haul transmission, distribution to homes, and so forth. Every state is different, every city is different, and even within cities there may be multiple small companies involved.

What this means is that while hacking any one of these entities would be easy, it'd only cause a small-scale effect. To cause big-scale effects would require a much larger hacking campaign, of a lot of targets, over a long period of time. Chances are high that before you hacked enough for a convincing terror effect, they'd catch on to you, and take moves to stop you. Thus while any individual target is easy, the campaign as a whole is complex.

In the end, if your goal is to cause major power blackouts, your best bet is to bomb power lines and distribution centers, rather than hack them.

Conclusion

I'm not sure if I have any better answers, just more complex perspectives.

I think there are lots of warnings from so-called "experts" who aren't qualified to make such warnings, and that the press errs on the side of giving those warnings credibility instead of challenging them.

I think the main reason cyberterrorism doesn't happen is that what motivates violent people is different from what motivates technical people, pulling apart the group that would want to commit cyberterrorism from the group that could.

At least for power grid blackouts, while small attacks would be easy, the ones large enough to grab people's attention would be difficult, due to our power grid's diversity.


Masscan and massive address lists
Thu, 01 Nov 2018 05:59:00 +0000
I saw this go by on my Twitter feed. I thought I'd blog on how masscan solves the same problem.


Both nmap and masscan are port scanners. The difference is that nmap does an intensive scan on a limited range of addresses, whereas masscan does a light scan on a massive range of addresses, including the range of 0.0.0.0 - 255.255.255.255 (all addresses). If you've got a 10-gbps link to the Internet, it can scan the entire thing in under 10 minutes, from a single desktop-class computer.

How masscan deals with exclude ranges is probably its defining feature. That seems kinda strange, since exclusion is a little-used feature in nmap. But when you scan the entire Internet, people will complain, with nasty emails, so you are going to build up a list of hundreds, if not thousands, of addresses to exclude from your scans.

Therefore, the first design choice is to combine the two lists, the list of targets to include and the list of targets to exclude. Other port scanners don't do this because they typically work from a large include list and a short exclude list, so they optimize for the larger thing. In mass scanning the Internet, the exclude list is the largest thing, so that's what we optimize for. It makes sense to just combine the two lists.

So the performance question now isn't how to look up an address in an exclude list efficiently, it's how to quickly choose a random address from a large target list.

Moreover, the decision is how to do it with as little state as possible. That's the trick for sending massive numbers of packets at rates of 10 million packets-per-second: not keeping any bookkeeping of what was scanned. I'm not sure exactly how nmap randomizes its addresses, but the documentation implies that it takes a block of addresses at a time and randomizes that block, keeping state on which addresses it has scanned and which ones it hasn't.

The way masscan works is not to randomly pick an IP address so much as to randomize the index.

To start with, we created a sorted list of IP address ranges, the targets. The total number of IP addresses in all the ranges is target_count (not the number of ranges but the number of all IP addresses). We then define a function pick() that returns one of those IP addresses given the index:

ip = pick(targets, index);

Where index is in the range [0..target_count].

This function is just a binary search. After the ranges have been sorted, a start_index value is added to each range, which is the total number of IP addresses up to that point. Thus, given a random index, we search the list of start_index values to find which range we've chosen, and then which IP address address within that range. The function is here, though reading it, I realize I need to refactor it to make it clearer. (I read the comments telling me to refactor it, and I realize I haven't gotten around to that yet :-).
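To make this concrete, here's a minimal sketch of such a pick() function (simplified, with illustrative names and an explicit count parameter, not the actual masscan source):

#include <stddef.h>
#include <stdint.h>

struct range {
    unsigned begin;        /* first IPv4 address in the range */
    unsigned end;          /* last IPv4 address in the range */
    uint64_t start_index;  /* count of all addresses before this range */
};

/* Binary-search the sorted ranges for the one containing this index,
 * then offset into that range to get the address. */
unsigned pick(const struct range *targets, size_t count, uint64_t index)
{
    size_t lo = 0, hi = count;
    while (lo + 1 < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (targets[mid].start_index <= index)
            lo = mid;
        else
            hi = mid;
    }
    return targets[lo].begin + (unsigned)(index - targets[lo].start_index);
}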

Given this system, we can now do an in-order (not randomized) port scan by doing the following:

for (index=0; index<target_count; index++) {
    ip = pick(targets, index);
    scan(ip);
}

Now, to scan in random order, we simply need to randomize the index variable.

for (index=0; index<target_count; index++) {
    xXx = shuffle(index);
    ip = pick(targets, xXx);
    scan(ip);
}

The clever bit is in that shuffle function (here). It has to take an integer in that range [0..target_count] and return another pseudo-random integer in the same range. It has to be a function that does a one-to-one mapping. Again, we are stateless. We can't create a table of all addresses, then randomize the order of the table, and then enumerate that table. We instead have to do it with an algorithm.

The basis of that algorithm, by the way, is DES, the Data Encryption Standard. That's how a block cipher works. It takes a 64-bit number (the blocksize for DES) and outputs another 64-bit block in a one-to-one mapping. In ECB mode, every block is encrypted to a unique other block. Two input blocks can't encrypt into the same output block, or you couldn't decrypt it.

The only problem is that the range isn't neat 64-bit blocks, or any number of bits. It's an inconveniently sized number. The cryptographer Phillip Rogaway wrote a paper on how to change DES to support arbitrary integer ranges instead. The upshot is that it uses integer division instead of shifts, which makes it more expensive.

So how we randomize that input variable is that we encrypt it, where the encrypted number is still in the same range.
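To illustrate the concept, here's a toy version of such a shuffle. This is not masscan's actual cipher (which follows Rogaway's construction); it's a generic keyed Feistel network over the next power of two, plus "cycle-walking" to stay within the range:

#include <stdint.h>

/* Permute [0..range) one-to-one: encrypt over the smallest power of
 * two covering the range, and re-encrypt until the result falls back
 * inside the range ("cycle-walking"). */
uint64_t shuffle(uint64_t index, uint64_t range, uint64_t key)
{
    unsigned bits = 1;
    while ((1ULL << bits) < range)
        bits++;
    unsigned half = (bits + 1) / 2;
    uint64_t mask = (1ULL << half) - 1;

    do {
        uint64_t left = index >> half;
        uint64_t right = index & mask;
        unsigned round;
        for (round = 0; round < 4; round++) {
            /* any keyed mixing function works as the Feistel F */
            uint64_t f = ((right + round) * 0x9E3779B97F4A7C15ULL + key) >> 32;
            uint64_t tmp = left;
            left = right;
            right = (tmp ^ f) & mask;
        }
        index = (left << half) | right;
    } while (index >= range);
    return index;
}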

Thus, the source of masscan's speed is the way it randomizes the IP addresses in a wholly stateless manner. It:
  • doesn't use any state, just enumerates an index from [0..target_count]
  • has a fast function that, given an index, retrieves the indexed IP address from a large list of ranges
  • has a fast function to randomize that index using the Power of Crypto
Given this as the base, there are lots of additional features we can add. For one thing, we are randomizing not only IP addresses to scan, but also ports. I think nmap picks the IP address first, then runs through a list of ports on that address. Masscan combines them together, so when scanning many ports on an address, they won't come as a burst in the middle of the scan, but will be spread evenly throughout the scan. It allows you to do things like:

masscan 0.0.0.0/0 -p0-65535

For this to work, we make the following change to the inner loop:

range = port_count * target_count;
for (index=0; index<range; index++) {
    xXx = shuffle(index);
    ip = pick(targets, xXx % target_count);
    port = pick(ports, xXx / target_count);
    scan(ip, port);
}

By the way, the compiler optimizes both the modulus and division operations into a single IDIV opcode on Intel x86, since that's how that instruction works, returning both results at once. Which is cool.

Another change we can make is sharding, spreading the scan across several CPUs or several servers. Let's say this is server #3 out of 7 servers sharing the load of the scan:

for (index=shard; index<range; index += shard_count) {
    ...
}

Again, notice how we don't keep track of any state here, it's just a minor tweak to the loop, and now *poof* the sharding feature appears out of nowhere. It takes vastly more instructions to parse the configuration parameter (masscan --shard 3/7 ...) than it takes to actually do it.

Let's say that we want to pause and resume the scan. What state information do we need to save? The answer is just the index variable. Well, we also need the list of IP addresses that we are scanning. A limitation of this approach is that we cannot easily pause a scan and change the list of IP addresses.

Conclusion

The upshot here is that we've twisted the nature of the problem. By using a crypto function to algorithmically create a one-to-one mapping for the index variable, we can just linearly enumerate a scan -- but magically in random order. This avoids keeping state. It avoids having to lookup addresses in an exclude list. And we get other features that naturally fall out of the process.



What about IPv6?

You'll notice I'm talking only about IPv4, and masscan supports only IPv4. The maximum sized scan right now is 48 bits (16-bit port number plus 32-bit IPv4 address). Won't larger scans mean using 256-bit integers?

When I get around to adding IPv6, I'll still keep a 64-bit index. The index variable is the number of things you are going to probe, and you can't scan 64-bit space right now. You won't scan the entire IPv6 128-bit address space, but a lot of smaller address spaces that add up to less than 64-bits. So when I get around to adding IPv6, the concept will still work.



Systemd is bad parsing and should feel bad
Sat, 27 Oct 2018 11:08:00 +0000
Systemd has a remotely exploitable bug in its DHCPv6 client. That means anybody on the local network can send you a packet and take control of your computer. The flaw is a typical buffer-overflow. Several news stories have pointed out that this client was rewritten from scratch, as if that were the moral failing, instead of reusing existing code. That's not the problem.

The problem is that it was rewritten from scratch without taking advantage of the lessons of the past. It makes the same mistakes all over again.

In the late 1990s and early 2000s, we learned that parsing input is a problem. The traditional ad hoc approach you were taught in school is wrong. It's wrong from an abstract theoretical point of view. It's wrong from a practical point of view, being error prone and leading to spaghetti code.

The first thing you need to unlearn is byte-swapping. I know this was some sort of epiphany you had when you learned network programming, but byte-swapping is wrong. If you find yourself using a macro to swap bytes, like the be16toh() macro used in this code, then you are doing it wrong.

But, you say, the network byte-order is big-endian, while today's Intel and ARM processors are little-endian. So you have to swap bytes, don't you?

No. As proof of the matter, I point to every language other than C/C++. They don't swap bytes. Their internal integer format is undefined. Indeed, something like JavaScript may be storing numbers as floating point. You can't muck around with the internal format of their integers even if you wanted to.

An example of byte swapping in the code is something like this:
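(Paraphrasing with illustrative names, rather than quoting the systemd source exactly:)

struct DHCP6Option {
    uint16_t code;
    uint16_t len;
    uint8_t data[];
} __attribute__((packed));

const struct DHCP6Option *option = (const struct DHCP6Option *)&buf[offset];
uint16_t len = be16toh(option->len);   /* byte-swap the big-endian field */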


In this code, it's taking a buffer of raw bytes from the DHCPv6 packet and "casting" it as a C internal structure. The packet contains a two-byte big-endian length field, "option->len", which the code must byte-swap in order to use.

Among the errors here is casting an internal structure over external data. From an abstract theory point of view, this is wrong. Internal structures are undefined. Just because you can sort of know the definition in C/C++ doesn't change the fact that they are still undefined.

From a practical point of view, this leads to confusion, as the programmer is never quite clear as to the boundary between external and internal data. You are supposed to rigorously verify external data, because the hacker controls it. You don't keep double-checking and second-guessing internal data, because that would be stupid. When you blur the lines between internal and external data, then your checks get muddled up.

Yes you can, in C/C++, cast an internal structure over external data. But just because you can doesn't mean you should. What you should do instead is parse data the same way as if you were writing code in JavaScript. For example, to grab the DHCP6 option length field, you should write something like:
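(Again paraphrasing rather than quoting:)

len = buf[offset]<<8 | buf[offset+1];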


The thing about this code is that you don't know whether it's JavaScript or C, because it's both, and it does the same thing for both.

Byte "swapping" isn't happening. We aren't extracting an integer from a packet, then changing its internal format. Instead, we are extracting two bytes and combining them. This description may seem like needless pedantry, but it's really really important that you grok this. For example, there is no conditional macro here that does one operation for a little-endian CPU, and another operation for a big-endian CPU -- it does the same thing for both CPUs. Whatever words you want to use to describe the difference, it's still profound and important.

The other thing that you shouldn't do, even though C/C++ allows it, is pointer arithmetic. Again, it's one of those epiphany things C programmers remember from their early days. It's something they just couldn't grasp until one day they did, and then they fell in love with it. Except it's bad. The reason you struggled to grok it is because it's stupid and you shouldn't be using it. No other language has it, because it's bad.

I mean, back in the day, it was a useful performance optimization. Iterating through an array by incrementing a pointer can be faster than incrementing an index. But that's virtually never the case today, and it just leads to needless spaghetti code. It leads to convoluted constructions like the following at the heart of this bug, where you have to do arithmetic both on the pointer and on the length you are checking against. This nonsense leads to confusion and ultimately, buffer overflows.
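(A paraphrase of the style of construct, with illustrative names. Notice that the pointer and the remaining length must be kept in sync by hand, and that nothing verifies 4 + optlen actually fits within buflen -- exactly the kind of oversight that becomes a buffer overflow:)

while (buflen > 0) {
    uint16_t optcode = buf[0]<<8 | buf[1];
    uint16_t optlen = buf[2]<<8 | buf[3];
    process_option(optcode, buf + 4, optlen);
    buf += 4 + optlen;      /* arithmetic on the pointer... */
    buflen -= 4 + optlen;   /* ...and on the length being checked */
}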


In a heckofalot of buffer overflows over the years, there's a construct just like this lurking near the bug. If you are going to do a rewrite of code, then this is a construct you need to avoid. Just say no to pointer arithmetic.

In my code, you see a lot of constructs where it's buf, offset, and length. The buf variable points to the start of the buffer and is never incremented. The length variable is the max length of the buffer and likewise never changes. It's the offset variable that is incremented throughout.

Because of this simplicity, buffer overflow checks become obvious, as it's always "offset + x > length", and easy to verify. In contrast, here is the fix for the DHCPv6 buffer overflow. That it is checking for an external buffer overflow is less obvious:
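(Paraphrasing the shape of the fix, not quoting the exact patch:)

if (offset + 4 + optlen > buflen)
    return -ENOBUFS;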


Now let's look at that error code. That's not what ENOBUFS really means. That's an operating system error code that has specific meaning about kernel buffers. Overloading it for userland code is inappropriate.

That argument is a bit pedantic I grant you, but that's only the start. The bigger issue is that it's identifying the symptom not the problem. The ultimate problem is that the code failed to sanitize the original input, allowing the hacker to drive computation this deep in the system. The error isn't that the buffer is too small to hold the output, the original error is that the input was too big. Imagine if this gets logged and the sysadmin reviewing dmesg asks themselves how they can allocate bigger buffers to the DHCP6 daemon. That is entirely the wrong idea.

Again, we go back to lessons of 20 years that this code ignored, the necessity of sanitizing input.

Now let's look at assert(). This is a feature in C that we use to double-check things in order to catch programming mistakes early. An example is the code below, which is checking for programming mistakes where the caller of the function may have used NULL-pointers:
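(Illustrative rather than exact:)

int dhcp6_option_parse(const uint8_t *buf, size_t buflen, DHCP6Option **ret)
{
    assert(buf);   /* catches a buggy caller passing NULL */
    assert(ret);
    /* ... */
}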


This is pretty normal, but now consider this other use of assert().
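(Again illustrative:)

uint16_t optlen = buf[2]<<8 | buf[3];   /* value from the network packet */
assert(4 + optlen <= buflen);           /* a hacker controls optlen, so a
                                           hacker can abort your process */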


This isn't checking for errors by the programmer; it's validating input. That's not what you are supposed to use asserts for. These are very different things. It's a coding horror that makes you shriek and run away when you see it. In fact, that's my Halloween costume this year: using asserts to validate network input.

This reflects a naive misunderstanding by programmers who don't understand the difference between out-of-band checks that validate the code, and the code's actual job of validating input. Like the buffer overflow check above, EINVAL because a programmer made a mistake is a vastly different error than EINVAL because a hacker tried to inject bad input. These aren't the same things, they aren't even in the same realm.


Conclusion

Rewriting old code is a good thing -- as long as you are fixing the problems of the past and not repeating them. We have 20 years of experience with what goes wrong in network code. We have academic disciplines like langsec that study the problem. We have lots of good examples of network parsing done well. There is really no excuse for code that is of this low quality.

This code has no redeeming features. It must be thrown away and rewritten yet again. This time by an experienced programmer who knows what error codes mean, how to use asserts properly, and most of all, who has experience at network programming.


Masscan as a lesson in TCP/IP
Wed, 24 Oct 2018 00:03:00 +0000
When learning TCP/IP it may be helpful to look at the masscan port scanning program, because it contains its own network stack. This concept, "contains its own network stack", is so unusual that it'll help resolve some confusion you might have about networking. It'll help challenge some (incorrect) assumptions you may have developed about how networks work.
For example, here is a screenshot of running masscan to scan a single target from my laptop computer. My machine has an IP address of 10.255.28.209, but masscan runs with an address of 10.255.28.250. This works fine, with the program contacting the target computer and downloading information -- even though it has the 'wrong' IP address. That's because it isn't using the network stack of the notebook computer, and hence, not using the notebook's IP address. Instead, it has its own network stack and its own IP address.

At this point, it might be useful to describe what masscan is doing here. It's a "port scanner", a tool that connects to many computers and many ports to figure out which ones are open. In some cases, it can probe further: once it connects to a port, it can grab banners and version information.

In the above example, the parameters to masscan used here are:
  • -p80 : probe for port "80", which is the well-known port assigned for web-services using the HTTP protocol
  • --banners : do a "banner check", grabbing simple information from the target depending on the protocol. In this case, it grabs the "title" field from the HTML from the server, and also grabs the HTTP headers. It does different banners for other protocols.
  • --source-ip 10.255.28.250 : this configures the IP address that masscan will use
  • 172.217.197.113 : the target to be scanned. This happens to be a Google server, by the way, though that's not really important.
Now let's change the IP address that masscan is using to something completely different, like 1.2.3.4. The difference from the above screenshot is that we no longer get any data in response. Why is that?

The answer is that the routers don't know how to send back the response. It doesn't go to me, it goes to whoever owns the real address 1.2.3.4. If you visualize the Internet, the subnetworks are on the edges. The routers in between examine the destination address of each packet and route it in the proper direction. You can send packets from 1.2.3.4 from anywhere in the network, but responses will always go back to the proper owner of that address.

Thus, masscan can spoof any address it wants, but if it's an address that isn't on the local subnetwork, then it's never going to see the response -- the response is going to go back to the real owner of the address. By the way, I've made this mistake before. When doing massive scans of the Internet, generating billions of packets, I've accidentally typed the wrong source address. That meant I saw none of the responses -- but the hapless owner of that address was inundated with replies. Oops.

So let's consider what masscan does when you use --source-ip to set its address. It does only three things:
  • Uses that as the source address in the packets it sends.
  • Filters incoming packets to make sure they match that address.
  • Responds to ARP packets for that address.
Remember that on the local network, communication isn't between IP addresses but between Ethernet/WiFi addresses. IP addresses are for remote ends of the network, MAC addresses are how packets travel across the local network. It's like when you send your kid to grab the mail from the mailbox: the kid is Ethernet/WiFi, the address on the envelope is the IP address.

In this case, when masscan transmits packets to the local router, it needs to first use ARP to find the router's MAC address. Likewise, when the router receives a response from the Internet destined for masscan, it must first use ARP to discover the MAC address masscan is using.

As you can see in the picture at the top of this post, the MAC address of the notebook computer's WiFi interface is 14:63:a3:11:2d:d4. Therefore, when masscan sees an ARP request for 10.255.28.250, it must respond back with that MAC address.

These three steps should impress upon you that there's not actually a lot that any operating system does with the IP address assigned to it. We imagine there is a lot of complicated code involved. In truth, there isn't -- there's only a few simple things the operating system does with the address.

Moreover, this should impress upon you that the IP address is a property of the network not of the operating system. It's what the network uses to route packets to you, and the operating system has very little control over which addresses will work and which ones don't. The IP address isn't the name or identity of the operating system. It's like how your postal mailing address isn't you, isn't your identity, it's simply where you live, how people can reach you.

Another thing to notice is the difference between phone numbers and addresses. Your IP address depends upon your location. If you move your laptop computer to a different location, you need a different IP address that's meaningful for that location. In contrast, the phone has the same phone number wherever you travel in the world, even if you travel overseas. There have been decades of work in "mobile IP" to change this, but frankly, the Internet's design is better, though that's beyond the scope of this document.

That you can set any source address in masscan means you can play tricks on people. Spoof the source address of some friend you don't like, and they'll get all the responses. Moreover, angry people who don't like getting scanned may complain to their ISP and get them kicked off for "abuse".

To stop this sort of nonsense, a lot of ISPs do "egress filtering". Normally, a router only examines the destination address of a packet in order to figure out the direction to route it. With egress filtering, it also looks at the source address, and makes sure it can route responses back to it. If not, it'll drop the packet. I tested this by sending such spoofed addresses from 1.2.3.4 to a server of mine on the Internet, and found that I did not receive them. (I used the famous tcpdump program to filter incoming traffic looking for those packets).
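(The filter was something to this effect, with the interface name depending on the server:)

tcpdump -n -i eth0 src host 1.2.3.4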
By the way, masscan also has to ARP the local router in order to find its MAC address before it can start sending packets to it. That's the first thing it does when it starts up, and it's the reason there's a short delay when starting the program. You can bypass this ARP by setting the router's MAC address manually.

First of all, you have to figure out what the local router's MAC address is. There are many ways of doing this, but the easiest is to run the arp command from the command-line, asking the operating system for this information. It, too, must ARP the router's MAC address, and it keeps this information in a table.
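(On macOS that looks something like the following; the addresses shown here are illustrative, with the router's entry giving the MAC address we want:)

arp -a
? (10.1.10.1) at ac:86:74:78:28:b2 on en0 ifscope [ethernet]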
Then, I can run masscan using this MAC address:

masscan --interface en0 --router-mac ac:86:74:78:28:b2 --source-ip ....

In the above examples, while masscan has its own stack, it still requests information about the operating system's configuration, to find things like the local router. Instead of doing this, we can run masscan completely independently from the operating system, specifying everything on the command line.

To do this, we have to configure all the following properties of a packet:
  • the network interface of my MacBook computer that I'm using
  • the destination MAC address of the local router
  • the source hardware address of my MacBook computer
  • the destination IP address of the target I'm scanning
  • the source IP address that the target can respond to
  • the destination port number of the port I am scanning
  • the source port number of the connection
An example is shown below. When I generated these screenshots I was located on a different network, so the local addresses have changed from the examples above. Here is a screenshot of running masscan:
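(The command line in that screenshot was to this effect; the exact addresses are illustrative:)

masscan --interface en0 --router-mac ac:86:74:78:28:b2 --source-mac 14:63:a3:11:2d:d4 --source-ip 10.1.10.100 --source-port 60000 -p80 --banners 172.217.197.113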
And here is a screenshot from Wireshark, a packet sniffer, that captures the packets involved:
As you can see from Wireshark, the very first packet is sent without any preliminaries, based directly on the command-line parameters. There is no other configuration of the computer or network involved.

When the response packet comes back in packet #4, the local router has to figure out the MAC address of where to send it, so it sends an ARP in packet #2, to which masscan responds in packet #3, after which that incoming packet can successfully be forwarded in packet #4.

After this, the TCP connection proceeds as normal, with a three way handshake, an HTTP request, an HTTP response, and so forth, with a couple extra ACK packets (noted in red) that happen because masscan is actually a bit delayed in responding to things.

What I'm trying to show here is again that what happens on the network, the packets that are sent, and how things deal with them, is a straightforward function of the initial starting conditions.

One thing about this example is that I had to set the source MAC address the same as my laptop computer. That's because I'm using WiFi. There's actually a bit of invisible setup here where my laptop must connect to the access-point. The access-point only knows the MAC address of the laptop, so that's the MAC address masscan must use. Had this been Ethernet instead of WiFi, this invisible step wouldn't be necessary, and I would be able to spoof any MAC address. In theory, I could also add a full WiFi stack to masscan so that it could create its own independent association with the WiFi access-point, but that'd be a lot of work.

Lastly, masscan supports a feature where you can specify a range of IP addresses. This is useful for a lot of reasons, such as stress-testing networks. An example:

masscan --source-ip 10.1.10.100-10.1.10.164 ....

For every probe, it'll choose a random IP address from that range. If you really don't like somebody, you can use masscan and flood them with source addresses in the range 0.0.0.0-255.255.255.255. It's one of the many "stupid pet tricks" you can do with masscan that have no purpose, but which come from a straightforward application of the principles of manually configuring things.

Likewise, masscan can be used in DDoS amplification attacks. Like addresses, you can configure payloads. Thus, you can set the --source-ip to that of your victim, a list of destination addresses consisting of amplifiers, and a payload that triggers the amplification. The victim will then be flooded with responses. It's not something the program is specifically designed for, but usage that I can't prevent, as again, it's a straightforward application of the basic principles involved.

Conclusion

Learning about TCP/IP networking leads to confusion about the boundaries between what the operating-system does, and what the network does. Playing with masscan, which has its own network stack, helps clarify this.

You can download masscan source code at:

https://github.com/robertdavidgraham/masscan

Some notes for journalists about cybersecurity
Mon, 22 Oct 2018 20:22:00 +0000
The recent Bloomberg article about Chinese hacking motherboards is a great opportunity to talk about problems with journalism.

Journalism is about telling the truth, not a close approximation of the truth, but the true truth. They don't do a good job at this in cybersecurity.

Take, for example, a recent incident where the Associated Press fired a reporter for photoshopping his shadow out of a photo. The AP took a scorched-earth approach, not simply firing the photographer, but removing all his photographs from their library.

That's because there is a difference between truth and near truth.

Now consider Bloomberg's story, such as a photograph of a tiny chip. Is that a photograph of the actual chip the Chinese inserted into the motherboard? Or is it another chip, representing the size of the real chip? Is it truth or near truth?

Or consider the technical details in Bloomberg's story. They are garbled, as this discussion shows. Something like what Bloomberg describes is certainly plausible; something exactly like what Bloomberg describes is impossible. Again there is the question of truth vs. near truth.

There are other near truths involved. For example, we know that supply chains often replace high-quality expensive components with cheaper, lower-quality knockoffs. It's perfectly plausible that some of the incidents Bloomberg describes are that known issue, which they then hype as hacker chips. This demonstrates how truth and near truth can be quite far apart, telling very different stories.

Another example is a NYTimes story about a terrorist's use of encryption. As I've discussed before, the story has numerous "near truth" errors. The NYTimes story is based upon a transcript of an interrogation of the terrorist. The French newspaper Le Monde published excerpts from that interrogation, with details that differ slightly from the NYTimes article.

One of the justifications journalists use is that the near truth is easier for their readers to understand. First of all, that's no justification for falsehoods. If the words mean something else, then it's false. It doesn't matter if it's simpler. Secondly, I'm not sure near truths actually are easier to understand. It's still techy gobbledygook. In the Bloomberg article, if I as an expert can't figure out what actually happened, then I know that the average reader can't, either, no matter how much you've "simplified" the language.

Stories can solve this by both giving the actual technical terms that experts can understand, then explain them. Yes, it eats up space, but if you care about the truth, it's necessary.

In groundbreaking stories like Bloomberg's, the length is already enough that the average reader won't slog through it. Instead, it becomes a seed for lots of other coverage that explains the story. In such cases, you want to get the techy details, the actual truth, correct, so that we experts can stand behind the story and explain it. Otherwise, going for the simpler near truth means that all us experts simply question the veracity of the story.

The companies mentioned in the Bloomberg story have called it an outright lie. However, the techniques roughly are plausible. I have no trouble believing something like that has happened at some point, that an intelligence organization subverted chips to hack BMC controllers in servers destined for specific customers. I'm sure our own government has done this at least once, as Snowden leaked documents imply. However, that doesn't mean the Bloomberg story is truthful. We know they have smudged details. We know they've hyped details, like the smallness of the chips involved.

This is why I trust the high-tech industry press so much more than the mainstream press. Despite priding itself as the "newspaper of record", on these technical issues the NYTimes is anything but. It's the techy sites like Ars Technica and sometimes Wired that become the "paper of record" on things cyber. I mention this because David Sanger gets all the credit for Stuxnet reporting when he's done a horrible job, while numerous techy sites have done the actual work figuring out what went on.


TCP/IP, Sockets, and SIGPIPE
Sun, 21 Oct 2018 22:10:00 +0000
There is a spectre haunting the Internet -- the spectre of SIGPIPE errors. It's a bug in the original design of Unix networking from 1981 that is perpetuated by college textbooks, which teach students to ignore it. As a consequence, sometimes software unexpectedly crashes. This is particularly acute on industrial and medical networks, where security professionals can't run port/security scans for fear of crashing critical devices.

An example of why this bug persists is the well-known college textbook "Unix Network Programming" by Richard Stevens. In section 5.13, he correctly describes the problem.
When a process writes to a socket that has received an RST, the SIGPIPE signal is sent to the process. The default action of this signal is to terminate the process, so the process must catch the signal to avoid being involuntarily terminated.
This description is accurate. The "Sockets" network API was based on the "pipes" interprocess-communication mechanism when TCP/IP was first added to the Unix operating system back in 1981. This made it straightforward and comprehensible to the programmers at the time. This SIGPIPE behavior made sense when piping the output of one program to another program on the command-line, as is typical under Unix: if the receiver of the data crashes, then you want the sender of the data to also stop running. But it's not the behavior you want for networking. Server processes need to continue running even if a client crashes.

But Stevens' description is insufficient. It portrays the problem as optional, one that exists only if the other side of the connection is misbehaving. He never mentions the problem outside this section, and none of his example code handles it. Thus, if you base your code on Stevens', it'll inherit this problem and sometimes crash.

The simplest solution is to configure the program to ignore the signal, such as putting the following line of code in your main() function:

signal(SIGPIPE, SIG_IGN);

If you search popular projects, you'll find this is their solution most of the time, as in openssl.

But there is a problem with this approach, as OpenSSL demonstrates: it's both a command-line program and a library. The command-line program handles this error, but the library doesn't. This means that using the SSL_write() function to send encrypted data may encounter this error. Nowhere in the OpenSSL documentation does it mention that the user of the library needs to handle this.

Ideally, library writers would like to deal with the problem internally. There are platform-specific ways to deal with this. On Linux, an additional parameter MSG_NOSIGNAL can be added to the send() function. On BSD (including macOS), setsockopt(SO_NOSIGPIPE) can be configured for the socket when it's created (after socket() or after accept()). On Windows and some other operating systems, the SIGPIPE isn't even generated, so nothing needs to be done for those platforms.

But it's difficult. Browsing through cross platform projects like curl, which tries this library technique, I see the following bit:

#ifdef __SYMBIAN32__
/* This isn't actually supported under Symbian OS */
#undef SO_NOSIGPIPE
#endif

Later in the code, it will check whether SO_NOSIGPIPE is defined, and if it is, to use it. But that fails with Symbian because while it defines the constant in the source code, it doesn't actually support it, so it then must be undefined.

So as you can see, solving this issue is hard. My recommendation for your code is to use all three techniques: signal(SIGPIPE), setsockopt(SO_NOSIGPIPE), and send(MSG_NOSIGNAL), surrounded by the appropriate #ifdefs. It's an annoying set of things you have to do, but it's a non-optional thing you need to handle correctly, that must survive later programmers who may not understand this issue.
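Here's a minimal sketch of what that combination looks like (assuming a POSIX-style environment; error checking omitted):

#include <signal.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Technique 1: process-wide, at the top of main() */
void ignore_sigpipe(void)
{
#ifdef SIGPIPE
    signal(SIGPIPE, SIG_IGN);
#endif
}

/* Technique 2: per-socket, right after socket() or accept() */
void socket_no_sigpipe(int fd)
{
#ifdef SO_NOSIGPIPE
    int on = 1;
    setsockopt(fd, SOL_SOCKET, SO_NOSIGPIPE, &on, sizeof(on));
#else
    (void)fd;
#endif
}

/* Technique 3: per-call, on platforms with MSG_NOSIGNAL */
ssize_t send_no_sigpipe(int fd, const void *buf, size_t len)
{
#ifdef MSG_NOSIGNAL
    return send(fd, buf, len, MSG_NOSIGNAL);
#else
    return send(fd, buf, len, 0);
#endif
}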


Now let's talk abstract theory, because it's important for understanding why Stevens' description of SIGPIPE is wrong. The #1 most important theoretical concept in network programming is this:
Hackers control input.
What that means is that if input can go wrong, then it will -- because eventually a hacker will discover your trusting of input and create the necessary input to cause something bad to happen, such as crashing your program, or taking remote control of it.

The way Stevens presents this SIGPIPE problem is as if it's a bug on the other side of the connection. A correctly written program on the other side won't generate this problem, so as long as you have only well-written peers to deal with, you'll never see it. In other words, Stevens trusts that input isn't created by hackers.

And that's indeed what happens in industrial control networks (factories, power plants, hospitals, etc.). These are tightly controlled networks where the software on the other side of the connection comes from the same manufacturer. Nothing else is allowed on the network, so bugs like this never happen.

Except that networks are never truly isolated like this. Once a hacker breaks into the network, they'll cause havoc.

Worse yet, other people may have an interest in the network. Security professionals, who want to stop hackers, will run port/vuln scanners on the network. These generate unexpected input, causing these devices to crash.

Thus we see how this #1 principle gets corrupted, from Stevens on down. Stevens' textbook teaches it's the peer's problem, a bug in the software on the other side of the connection. This then leads to industrial networks being based on this principle, as the programmers were taught in the university. This leads to persistent, intractable vulnerabilities to hackers in these networks. Not only are they vulnerable now, they can't be fixed, because we can't scan for vulnerabilities in order to fix them.


In this day and age of "continuous integration", programmers are interested not only in solving this in their code, but solving this in their unit/regression test suites. In the modern perspective, until you can create a test that exercises this bug, it's not truly fixed.

I'm not sure how to write code that adequately does this. It's not straightforward generating RSTs from the Sockets API, especially at the exact point you need them. There are also timing issues, where you may need to do something a million times just to get the timing right.

For example, I have a sample program that calls send() as fast as it can until it hits the limit on how much this side can buffer, and then closes the socket, causing a reset to be sent. For my simple "echo" server trying to echo back everything it receives, this will cause a SIGPIPE condition.
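A sketch of one way to do this (illustrative, not the exact sample program; the SO_LINGER trick at the end guarantees a RST rather than a graceful FIN):

#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/socket.h>

/* Fill the kernel's send buffer, then close abruptly. Assumes
 * 'fd' is an already-connected TCP socket. */
void slam(int fd)
{
    char junk[4096];
    struct linger lng = { 1, 0 };   /* l_onoff=1, l_linger=0 */

    memset(junk, 'x', sizeof(junk));

    /* non-blocking, so send() returns once the buffer is full */
    fcntl(fd, F_SETFL, O_NONBLOCK);
    while (send(fd, junk, sizeof(junk), 0) > 0)
        ;

    /* with zero linger time, close() sends a RST */
    setsockopt(fd, SOL_SOCKET, SO_LINGER, &lng, sizeof(lng));
    close(fd);
}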

However, when testing a webserver, this may not work. A typical web server sends a short amount of data, so the send() has returned before you can get a RST packet sent. The web server software you are testing needs to be sending a large enough response that it'll keep sending until it hits this condition. You may need to run the client program trying to generate this error a ton of times until just the right conditions are met.


Conclusion

I don't know of any network programming textbook I like. They all tend to perpetuate outdated and incomplete information. That SIGPIPE is ignored so completely is a major cause of problems on the Internet.

To summarize: your code must deal with this. The most appropriate solution is signal(SIGPIPE) at the top of your program. If that doesn't work for you, then you may be able to use pthread_sigprocmask() for just the particular threads doing network traffic. Otherwise, you need the more platform specific methods for BSD and Linux that deal with this at the socket or function call level.

It persists because it's a bug in the original Sockets definition from 1981, it's not properly described by textbooks, it escapes testing, and there is a persistent belief that if a program receives bad input, it's the sender's responsibility to fix it, rather than the receiver's responsibility.



Election interference from Uber and Lyft
Fri, 19 Oct 2018 23:13:00 +0000
Almost nothing can escape the taint of election interference. A good example is the announcements by Uber and Lyft that they'll provide free rides to the polls on election day. This well-meaning gesture nonetheless raises the question of how it might influence the election.

"Free rides" to the polls is a common thing. Taxi companies have long offered such services for people in general. Political groups have long offered such services for their constituencies in particular. Political groups target retirement communities to get them to the polls, black churches have long had their "Souls to the Polls" program across the 37 states that allow early voting on Sundays.

But with Uber and Lyft getting into this we now have concerns about "big data", "algorithms", and "hacking".

As the various Facebook controversies have taught us, these companies have a lot of data on us that can reliably predict how we are going to vote. If their leaders wanted to, these companies could use this information in order to get those on one side of an issue to the polls. On hotly contested elections, it wouldn't take much to swing the result to one side.

Even if they don't do this consciously, their various algorithms (often based on machine learning and AI) may do so accidentally. As is frequently demonstrated, unconscious biases can lead to real world consequences, like facial recognition systems being unable to read Asian faces.

Lastly, it makes these companies prime targets for Russian hackers, who may take all this into account when trying to muck with elections. Or indeed, to simply claim that they did in order to call the results into question. Though to be fair, Russian hackers have so many other targets of opportunity. Messing with the traffic lights of a few cities would be enough to swing a presidential election: specifically, targeting areas with certain voters, creating traffic jams that make it difficult for them to get to the polls.

Even if it's not "hackers" as such, many will want to game the system. For example, politically motivated drivers may choose to loiter in neighborhoods strongly on one side or the other, helping the right sorts of people vote at the expense of not helping the wrong people. Likewise, drivers might skew the numbers by deliberately hailing rides out of opposing neighborhoods and taking them out of town, or to the right sorts of neighborhoods.

I'm trying to figure out which Party this benefits the most. Let's take a look at rider demographics to start with, such as this post. It appears that income levels and gender are roughly evenly distributed.

Ridership is skewed urban, with riders being 46% urban, 48% suburban, and 6% rural. In contrast, the US population is 31% urban, 55% suburban, and 15% rural. Given the increasing polarization between rural and urban voters, this strongly skews results in favor of Democrats.

Likewise, the above numbers show that Uber ridership is strongly skewed to the younger generation, with 55% of riders 34 and younger. This again strongly skews "free rides" by Uber and Lyft toward the Democrats. Though to be fair, the over-65 crowd has long had an advantage, as the parties have fallen over themselves to bus people from retirement communities to the polls (and older people have free time on weekdays to vote).

Even if you are on the side that appears to benefit, this should still concern you. Our allegiance should first be to a robust and fair voting system, and to our Party second. I mention this because the increased polarization of our politics seems to favor those (from both sides) who don't mind watching the entire system burn as long as their Party wins.

Right now, such programs are probably going to have a small impact. But the future is trending toward fewer individually owned cars and more dependence on services like Uber and Lyft. In a few years, big-data, machine learning algorithms, and hackers may be able to strongly influence elections.


text Notes on the UK IoT cybersec "Code of Practice"
Tue, 16 Oct 2018 21:06:00 +0000
The British government has released a voluntary "Code of Practice" for securing IoT devices. I thought I'd write some notes on it.

First, the good parts

Before I criticize the individual points, I want to praise it for having a clue. So many of these sorts of things are written by the clueless, those who want to be involved in telling people what to do, but who don't really understand the problem.

The first part of the clue is restricting the scope. Consumer IoT is so vastly different from things like cars, medical devices, industrial control systems, or mobile phones that they should never really be talked about in the same guide.

The next part of the clue is understanding the players. It's not just the device that's a problem, but also the cloud and mobile app part that relates to the device. Though they do go too far and include the "retailer", which is a bit nonsensical.

Lastly, while I'm critical of almost all the points on the list and how they are described, it's probably a complete list. There's not much missing, and at the same time, it includes little that isn't necessary. In contrast, a lot of other IoT security guides lack important things, or take the "kitchen sink" approach and try to include everything conceivable.

1) No default passwords

Since the Mirai botnet of 2016 famously exploited default passwords, this has been at the top of everyone's list. It's the most prominent feature of the recent California IoT law. It's the major feature of federal proposals.

But this is only a superficial understanding of what really happened. The issue wasn't default passwords so much as Internet-exposed Telnet.

IoT devices are generally based on Linux, which maintains operating-system passwords in the /etc/passwd file. However, devices almost never use those. Instead, the web-based management interface maintains its own password database. The underlying Linux system is vestigial, like an appendix, and not really used.

But these devices exposed Telnet, providing a path to this otherwise unused functionality. I bought several of the Mirai-vulnerable devices, and none of them used /etc/passwd for anything other than Telnet.

Another way default passwords get exposed in IoT devices is through debugging interfaces. Manufacturers configure the system one way for easy development, and then ship a separate "release" version. Sometimes they make a mistake and ship the development backdoors as well. Programmers often insert secret backdoor accounts into products for development purposes without realizing how easy it is for hackers to discover those passwords.

The point is that this focus on backdoor passwords is misunderstanding the problem. Device makers can easily believe they are compliant with this directive while still having backdoor passwords.

As for the web management interface, saying "no default passwords" is useless. Users have to be able to set up the device the first time, so there has to be some means of connecting to it initially without a password. Device makers don't know how to do this without default passwords. Instead of mindless guidance on what not to do, somebody needs to write a document that explains how devices can do this both securely and easily enough for users.

Humorously, the footnotes in this section do reference external documents that might explain this, but they are the wrong documents, appropriate for things like website password policies, but inappropriate for IoT web interfaces. This again demonstrates how they have only a superficial understanding of the problem.

2) Implement a vulnerability disclosure policy

This is a clueful item, and it should be the #1 item on every list.

They do add garbage on top of this, such as demanding that companies respond in a "timely manner", but overall this isn't a bad section.

3) Keep software updated

This is another superficial understanding of the problem.

Software patching works for desktop and mobile phones because they have interfaces the user interacts with, ones that can both notify the user of a patch as well as the functionality to apply it. IoT devices are usually stuck in a closet somewhere without such interfaces.

Software patching works for normal computers because they sell for hundreds of dollars and thus have sufficient memory and storage to reliably do updates. IoT devices sell at cut-throat margins and have barely enough storage to run. This either precludes updates altogether, or at least means updates aren't reliable: upon every update, a small percentage of customer devices will be "bricked", rendered unusable. Adding $1 of flash memory to a $30 device is not a reasonable solution to the problem.

Software patching works for software because of its enormous margins and longevity. A software product is basically all profit. The same doesn't apply to hardware, where devices are sold with slim margins. Device makers have a hard time charging more because there are always no-name makers of almost identical devices in Shenzhen willing to undercut them. (Indeed, looking at Mirai, it appears the majority of infected devices were not major brands but no-name knock-offs.)

The document says that device makers need to publish how long the device will be supported. This ignores the economics. Device makers cannot know how long they will support a device. As long as they are selling new ones, they have the incentive and the profits to keep supplying updates. After that, they don't. There's really no way for them to predict the long-term market success of their devices.

Guarantees cost money. If they guarantee security fixes for 10 years, then that's a liability they have to account for on their balance sheet. It's a huge risk: if the product fails to sell lots of units, then they are on the hook for a large cost without the necessary income to match it.

Lastly, the entire thing is a canard. Users rarely update firmware for devices. Blaming vendors for not providing security patches/updates means nothing without blaming users for not applying them.

4) Securely store credentials and security-sensitive data

Like many guides, this section makes the superficial statement "Hard-coded credentials in device software are not acceptable". The reason this is silly is that public keys are a "credential", and you indeed want "hard-coded" public keys. Hard-coded public-key credentials are how you do other security functions, like encryption and signature verification.

This section tells device makers to use the trusted-enclave features like those found on phones, but this is rather silly. For one thing, that's a feature of only high-end CPUs, not the low-end CPUs found in such devices. For another thing, IoT devices don't really contain anything that needs that level of protection.

Storing passwords in clear-text on the device is almost certainly adequate security, and this section can be ignored.

5) Communicate securely

In other words, use SSL everywhere, such as on the web-based management interface.

But this is only a superficial understanding of how SSL works. You (generally) can't use SSL for devices because there's no secure certificate on the device. It forces users to bypass nasty warnings in the browser, which hurts the entire web ecosystem. Some IoT devices do indeed try to use SSL this way, and it's bad, very bad.

On the other hand, IoT devices can and should use SSL when connecting outbound to the cloud.

6) Minimise exposed attack surfaces

This is certainly a good suggestion, but it's a platitude rather than an action item. IoT devices already minimize as much as they can in order to reduce memory/storage requirements. Where this is actionable requires subtler understanding. A lot of exposed attack surface comes from accidents.

A lot of other exposed attack surface comes about because device makers know no better way. Actually helpful, meaningful advice would consist of telling them what to do in order to solve problems, rather than telling them what not to do.

The reason Mirai devices exposed Telnet was for things like remote factory reset. Mirai mostly infected security cameras, which don't have factory-reset buttons, either because they are mounted high up out of reach, or because, when they are in reach, the owner doesn't want the public pressing the button. Thus, doing a factory reset means doing it remotely. That appears to be the major reason for Telnet and "hardcoded passwords": to allow remote factory reset. Instead of telling device makers not to expose Telnet, you need a guide explaining how to do remote factory resets securely.

This guide discusses "ports", but the reality is that the attack surface in the web-based management interface on port 80 is usually bigger than all the other ports put together. Focusing on "ports" reflects a superficial understanding of the problem.

7) Ensure software integrity

The guide says "Software on IoT devices should be verified using secure boot mechanisms". No, it shouldn't be. In the name of security, device makers should do the opposite.

First of all, getting "secure boot" done right is extraordinarily difficult. Apple does it best with the iPhone, and still they get it wrong. For another thing, it's expensive. Like trusted enclaves in processors, most of the cheap low-end processors used in IoT don't support it.

But the biggest issue is that you don't want it. "Secure boot" means the only operating system the device can boot comes from the vendor, who will eventually stop supporting the product, making it impossible to fix any security problem. Without secure boot, customers are still able to patch bugs without the manufacturer's help.

Instead of secure boot, device makers should do the opposite and make it easy for customers to build their own software. They are required to do so under the GNU General Public License anyway. That doesn't mean open-sourcing everything; they can still provide their private code as binaries. But they should allow users to fix any bug in the open-source parts and repackage a new firmware update.

8) Ensure that personal data is protected

I suppose given the GDPR this section is required, but GDPR is a pox on the Internet.

9) Make systems resilient to outages

Given the recent story of Yale locks locking people out of their houses due to a system outage, this seems like an obviously good idea.

But it should be noted that this is hard. Obviously such a lock should be resilient when the network connection is down or the vendor's servers have crashed. But what happens when the lock can contact the servers, but some other component within the organization has crashed, so that the servers give unexpected responses -- neither completely down, nor completely up and running?

We saw this in the Mirai attacks against Dyn. It left a lot of servers up and running, but took down some other component those servers relied upon, leaving things in an intermediate state that was neither fully down nor fully functional.

It's easy to stand on a soapbox and proclaim devices need to be resilient, but this is unhelpful. What would instead be helpful is a catalog of failures that IoT will typically experience.

10) Monitor system telemetry data

Security telemetry is a desirable feature in general. When a hack happens, you want to review logfiles to see how it happened. This item reflects various efforts to come up with such useful information.

But again we see something so devoid of technical details as to be useless. Worse, it's going to be exploited by others, such as McAfee wanting you to have anti-virus on TV sets, which is an extraordinarily bad idea.

11) Make it easy for consumers to delete personal data

This is kinda silly in that it's simply a matter of doing a "factory reset". Having methods to delete personal details other than factory reset is bad.

The useful bit of advice is that factory resets don't always "wipe" information; they just "forget" it in a way that can be recovered. Thus, we get printers containing old documents and voting machines containing old votes.

On the other hand, this is a guide for "consumer IoT", so just the normal factory reset is probably sufficient, even if private details can be gleaned.

12) Make installation and maintenance of devices easy

Of course things should be easy, everyone agrees on this. The problem is they don't know how. Companies like Microsoft and Apple spend billions on this problem and still haven't cracked it.

My home network WiFi password uses quotes as punctuation to improve security. The Amazon Echo app uses Bluetooth to pair with the device and set which password to use for WiFi. This is well done from a security point of view.

However, their app uses an input field that changes straight quotes to curly quotes, making it impossible to type in the password. I instead had to go to a browser, type the password into the URL field, copy it, then go back to the Alexa app and paste it into the field. Only then could I get things to work.

Amazon is better than most at making devices easy and secure with the Echo, and even they get things spectacularly wrong.

13) Validate input data

Most security vulnerabilities are due to improper validation of input data. However, "validate input data" is stupid advice. It's like how most phishing attacks come from strangers, yet telling people not to open emails from strangers is stupid advice. In both cases, it's a superficial answer that doesn't really understand how the problem came about.

Let's take PHP and session cookies, for example. A lot of programmers think the session identifier in PHP is some internal feature of PHP. They therefore trust it, because it isn't "input". They don't perceive that it's not internal to PHP but external, part of HTTP, and totally hackable by hackers.
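
To make this concrete, the session identifier arrives in the Cookie header of the HTTP request, which the client -- and hence an attacker -- fully controls. A hand-crafted request might look like this (the path and value here are hypothetical):

GET /account.php HTTP/1.1
Host: example.com
Cookie: PHPSESSID=anything-the-attacker-wants

Nothing stops the sender from putting arbitrary bytes in that value; it's input like any other.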

Or take the famous Jeep hack, where researchers were able to remotely take control of the car and do mischievous things like turn it off on the highway. The designers didn't understand how the private connection to the phone network was in fact "input" coming from the Internet. And then there was data from the car's internal network, which wasn't seen as "input" from an external source either.

Then there is the question of what "validation" means. A lot of programmers try to solve SQL injection by "blacklisting" known bad characters. Hackers are adept at bypassing this, using other bad characters, especially Unicode. Whitelisting known good characters is a better solution. But even that is still problematic. The proper solution to SQL injection isn't "input validation" at all, but "parameterized queries" that don't care about the input.
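
As a minimal sketch of the parameterized approach -- using SQLite's C API purely for illustration, with a made-up table and column -- the '?' placeholder passes the input as data, never as SQL text, so quote characters can't change the query's meaning:

#include <stdio.h>
#include <sqlite3.h>

int lookup_user(sqlite3 *db, const char *username)
{
    sqlite3_stmt *stmt;

    /* The query text is fixed; the input never gets spliced into it. */
    if (sqlite3_prepare_v2(db, "SELECT id FROM users WHERE name = ?;",
                           -1, &stmt, NULL) != SQLITE_OK)
        return -1;

    /* Bind the untrusted input to the placeholder as pure data. */
    sqlite3_bind_text(stmt, 1, username, -1, SQLITE_TRANSIENT);

    while (sqlite3_step(stmt) == SQLITE_ROW)
        printf("id = %d\n", sqlite3_column_int(stmt, 0));

    sqlite3_finalize(stmt);
    return 0;
}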

Conclusion

Like virtually every other guide, this one is based upon platitudes and only a superficial understanding of the problem. It's got more clue than most, but is still far from something that could actually be useful. The concept here is virtue signaling, declaring what would be virtuous and moral for an IoT device, rather than something that could be useful to device makers in practice.




text How to irregular cyber warfare
Sun, 14 Oct 2018 08:53:00 +0000
Somebody (@thegrugq) pointed me to this article on "Lessons on Irregular Cyber Warfare", citing the masters like Sun Tzu, von Clausewitz, Mao, Che, and the usual characters. It tries to answer:
...as an insurgent, which is in a weaker power position vis-a-vis a stronger nation state; how does cyber warfare plays an integral part in the irregular cyber conflicts in the twenty-first century between nation-states and violent non-state actors or insurgencies
I thought I'd write a rebuttal.

None of these people provide any value. If you want to figure out cyber insurgency, then you want to focus on the technical "cyber" aspects, not "insurgency". I regularly read military articles about cyber written by people who, like the author of the above article, demonstrate little experience in cyber.

The chief technical lesson for the cyber insurgent is the Birthday Paradox. Let's say, hypothetically, you go to a party with 23 people total. What's the chance that any two people at the party have the same birthday? The answer is 50.7%. With a party of 75 people, the chance rises to 99.9% that two will have the same birthday.

The paradox is that your intuitive way of calculating the odds is wrong. You are thinking the odds are like those of somebody having the same birthday as you, which is indeed roughly 23 out of 365. But we aren't talking about you vs. the remainder of the party; we are talking about any possible combination of two people. This dramatically changes how we do the math.
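
A quick program makes the difference concrete (a minimal sketch, assuming 365 equally likely birthdays):

#include <stdio.h>

int main(void)
{
    double nobody_matches = 1.0;      /* no pair among all 23 shares a birthday */
    double nobody_matches_you = 1.0;  /* none of the other 22 shares yours */
    int i;

    for (i = 0; i < 23; i++)
        nobody_matches *= (365.0 - i) / 365.0;
    for (i = 0; i < 22; i++)
        nobody_matches_you *= 364.0 / 365.0;

    printf("any two share a birthday: %.1f%%\n", (1.0 - nobody_matches) * 100.0);
    printf("somebody shares yours:    %.1f%%\n", (1.0 - nobody_matches_you) * 100.0);
    return 0;
}

It prints 50.7% for any pair sharing a birthday, but only 5.9% for somebody sharing yours.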

In cryptography, this is known as the "Birthday Attack". One crypto task is to uniquely fingerprint documents. Historically, the most popular way of doing this was with an algorithm known as "MD5", which produces 128-bit fingerprints. Given a document with an MD5 fingerprint, it remains infeasible to create a second document with that same fingerprint. However, with MD5, it's practical to create two documents with the same fingerprint. In other words, we can't modify one document to match a given fingerprint, but we can keep modifying two documents until their fingerprints match. Like the party: finding somebody with your birthday is hard, finding any two people with the same birthday is easier.

The same principle works with insurgencies. Accomplishing one specific goal is hard, but accomplishing any goal is easy. Trying to do a narrowly defined task to disrupt the enemy is hard, but it's easy to support a group of motivated hackers and let them do any sort of disruption they can come up with.

The above article suggests a means of using cyber to disrupt a carrier attack group. This is an example of something hard, a narrowly defined attack that is unlikely to actually work in the real world.

Conversely, consider the attacks attributed to North Korea, like those against Sony or the Wannacry virus. These aren't the careful planning of a small state actor trying to accomplish specific goals. These are the actions of an actor that supports hacker groups, and lets them loose without a lot of oversight and direction. Wannacry in particular is an example of an undirected cyber attack. We know from our experience with network worms that its effects were impossible to predict. Somebody just stuck the newly discovered NSA EternalBlue payload into an existing virus framework and let it run to see what happens. As we worm experts know, nobody could have predicted the results of doing so, not even its creators.

Another example is the DNC election hacks. The reason we can attribute them to Russia is because it wasn't their narrow goal. Instead, by looking at things like their URL shortener, we can see that they flailed around broadly all over cyberspace. The DNC was just one of their few successes, among a large number of failures. We then watched their incompetent bungling of that opportunity, such as inadvertently leaving their identity behind in Word metadata.

In contrast to this broad, opportunistic hacking from Russia, China, North Korea, and Iran, we have the narrow, focused hacking from the U.S. and its allies Britain and Israel. Stuxnet is really the only example we have of a narrow, focused attack being successful. The U.S. can succeed at such an improbable attack because of its enormous investment in the best cyber warriors in the world. But still, we struggle against our cyber adversaries because they are willing to do undirected, opportunistic hacking while we insist on doing narrow, well-defined hacking. Despite our skill, we can't overcome the compelling odds of the Birthday Attack.

What's interesting about the cyber guerillas we face is their comparative lack of skill. The DNC hacks relied primarily on things like phishing, which unsophisticated teenagers can do. They were nothing like the sophisticated code found in Stuxnet. Rather than a small number of talented cyberwarriors, the attackers are more accurately the infinite monkeys, banging away on keyboards until they come up with the works of Shakespeare.

I don't know about the real policy makers and what they decide in secret, but in public, our politicians struggle to comprehend this paradox. They insist on seeing things like the DNC hack or Wannacry as the careful plans of our adversaries. This hinders our response to cyber insurgencies.

I'm a hacker and not a student of history, but I suspect those famous real-world insurgencies relied upon much the same odds, and that their success is the same illusion as hacker successes. Sure, Che Guevara participated in the successful Cuban revolution, but he was a failure in other revolutions in Africa and South America. Mao Zedong wasn't the leader of China's communist revolution so much as one of many leaders. He's just the one who ended up with all the marbles at the end.

It's been fashionable lately to quote Sun Tzu or von Clausewitz on cyberwar, but it's just pretentious nonsense. Cyber needs to be understood on its own terms, not as an extension of traditional warfare or revolution. We need to focus on the realities of asymmetric cyber attacks, like those of the nation-states mentioned above, the actions of Anonymous, or the successes of cybercriminals. The reason they are successful is the Birthday Paradox: they aren't trying to achieve specific, narrowly defined goals, but are opportunistically exploiting any achievement that comes their way. This informs our own offensive efforts, which should be less centrally directed. It informs our defenses, which should anticipate attacks based not on their desired effect, but on what our vulnerabilities make possible.


Bloomberg has a story about how Chinese intelligence inserted secret chips into servers bound for America. There are a couple of issues with the story I wanted to address.


The story is based on anonymous sources, and not even good anonymous sources. An example is this attribution:
a person briefed on evidence gathered during the probe says
That means somebody not even involved, just somebody who heard a rumor. Nor does it mean the person had sufficient expertise to understand what they were being briefed about.

The technical detail missing from the story is that the supply chain is already messed up with fake chips, quite apart from malicious chips. Reputable vendors spend a lot of time ensuring quality, reliability, tolerances, the ability to withstand harsh environments, and so on. Even the simplest of chips can command a price premium when well made.

What happens is that other companies make clones that are cheaper and lower quality. They are just good enough to pass testing, but fail in the real world. They may not even be completely fake chips. They may be bad chips the original manufacturer discarded, or chips the night shift at the factory secretly ran through on the equipment -- but with less quality control.

The supply-chain description in the Bloomberg story is accurate, except that it fails to discuss how these cheap, bad chips frequently replace the more expensive chips, with contract manufacturers or managers skimming off the profits. Replacement chips are real; whether they are for malicious hacking or just theft is the sticking point.

For example, consider this listing for a USB-to-serial converter using the well-known FTDI chip. The word "genuine" is in the title because fake FTDI chips are common within the supply chain. As you can see from the $11 price, the amount of money you can make with fake chips is low -- these contract manufacturers hope to make it up in volume.


The story implies that Apple is lying in its denials of malicious hacking, and deliberately avoids this other supply chain issue. It's perfectly reasonable for Apple to have rejected Supermicro servers because of bad chips that have nothing to do with hacking.

If there's hacking going on, it may not even be Chinese intelligence -- the manufacturing process is so lax that any intelligence agency could be responsible. Just because most manufacturing of server motherboards happens in China doesn't point the finger at Chinese intelligence as the ones responsible.

Finally, I want to point out the sensationalism of the story. It spends much effort focusing on the invisible nature of small chips, as evidence that somebody is trying to hide something. That the chips are so small means nothing: except for the major chips, all the chips on a motherboard are small. It's hard to have large chips, except for the big things like the CPU and DRAM. Serial ROMs containing firmware are never going to be big, because they just don't hold that much information.

A fake serial ROM is the focus here not so much because that's the chip they found by accident, but because that's the chip they'd look for. These chips contain the firmware for other hardware devices on the motherboard. Thus, instead of designing complex hardware to do malicious things, a hacker simply has to make simple changes to that firmware and replace it.

Thus, if investigators are worried about hacking, they'll look at those chips first. When they find fake ones, because some manager tried to skim $0.25 per server that was manufactured, then they'll find evidence confirming their theory.

But if that were the case, investigators could simply pull the malicious software off the chip, reverse engineer it, and confirm its maliciousness. The Bloomberg story doesn't verify this happened. It's like a story of UFOs that relies upon the weight of many unconfirmed reports rather than citing a single confirmed one.

This story could be true, of course. And even if it's not true in this one case, there are probably other cases. The manufacturing process is so lax that it's probable somewhere some intelligence organization has done this. However, the quality of the reporting is so low -- quoting anonymous sources that appear not to have sufficient expertise, focusing on sensationalistic aspects, and not following up on the background -- that I have to question this story.




Update: Apple has a very direct refutation of Bloomberg's allegations.
https://www.apple.com/newsroom/2018/10/what-businessweek-got-wrong-about-apple/

Amazon likewise directly refutes the story:
https://aws.amazon.com/blogs/security/setting-the-record-straight-on-bloomberg-businessweeks-erroneous-article/



text Mini pwning with GL-iNet AR150
Fri, 28 Sep 2018 22:52:00 +0000
Seven years ago, before the $35 Raspberry Pi, hackers used commercial WiFi routers for their projects. They'd replace the stock firmware with Linux. The $22 TP-Link WR703N was extremely popular for these projects, being half the price and half the size of the Raspberry Pi.


Unfortunately, these devices had extraordinarily limited memory (16-megabytes) and even more limited storage (4-megabytes). That's megabytes -- the typical SD card in an RPi is a thousand times larger.

I'm interested in that device for the simple reason that it has a big-endian CPU.

All these IoT-style devices these days run ARM and MIPS processors, with a smattering of others like x86, PowerPC, ARC, and AVR32. ARM and MIPS CPUs can run in either mode, big-endian or little-endian. Linux can be compiled for either mode. Little-endian is by far the most popular mode, because of Intel's popularity. Code developed on little-endian computers sometimes has subtle bugs when recompiled for big-endian, so it's best just to maintain the same byte-order as Intel. On the other hand, popular file-formats and crypto-algorithms use big-endian, so there's some efficiency to be gained with going with that choice.

I'd like to have a big-endian computer around to test my code with. In theory, it should all work fine, but as I said, subtle bugs sometimes appear.

The problem is that the base Linux kernel has slowly grown so big I can no longer get things to fit on the WR703N, not even to the point where I can add extra storage via the USB drive. I've tried to hack a firmware but succeeded only in bricking the device.

An alternative is the GL-AR150 from GL-iNet, a company that sells commercial WiFi products like the other vendors, but that caters to hackers and hobbyists. Recognizing the popularity of the TP-Link device, they essentially made a clone with more stuff: 16-megabytes of storage and 64-megabytes of RAM. They intend for people to rip off the case and access the circuit board directly: they've included pins for directly connecting a console serial port, connectors for additional WiFi antennas, and pads for soldering wires to GPIO pins for hardware projects. It's a thing of beauty.

So this post is about the steps I took to get things working for myself.

The first step is to connect to the device. One way to do this is connect the notebook computer to their WiFi, then access their web-based console. Another way is to connect to their "LAN" port via Ethernet. I chose the Ethernet route.

The problem with their Ethernet port is that you have to manually set your IP address. The device's address is 192.168.8.1. I handled this by going into the Linux virtual machine on my computer, putting the virtual network adapter into "bridge" mode (as I always do anyway), and setting an alternate IP address:

# ifconfig eth0:1 192.168.8.2 netmask 255.255.255.0
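
(On Linux distros where ifconfig is deprecated, the iproute2 equivalent would be something like: ip addr add 192.168.8.2/24 dev eth0.)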

The firmware I want to install is from the OpenWRT project which maintains Linux firmware replacements for over a hundred different devices. The device actually already uses their own variation of OpenWRT, but still, rather than futz with theirs I want to go with a vanilla installation.

https://wiki.openwrt.org/toh/gl-inet/gl-ar150

I download this using the browser in my Linux VM, then browse to 192.168.8.1, navigate to their firmware update page, and upload this file. It's not complex -- they actually intend their customers to do this sort of thing. Don't worry about voiding the warranty: for a ~$20 device, there is no warranty.

The device boots back up, this time with the default address 192.168.1.1, so again I add another virtual interface to my Linux VM with "ifconfig eth0:2 192.168.1.2" in order to communicate with it.

I now need to change this 192.168.1.x setting to match my home network. There are many ways to do this. I could just reconfigure the LAN port to a hard-coded address on my network. Or, I could connect the WAN port, which is already configured to get a DHCP address. Or, I could reconfigure the WiFi component as a "client" instead of an "access-point", and it'll similarly get a DHCP address. I decide upon WiFi, mostly because my 24-port switch is already full.

The problem is OpenWRT's default firewall settings: they somehow interfere with accessing the device. I can't see why from reading the rules, but I'm obviously missing something, so I just go in and nuke them. I click on the "WAN" segment in the firewall management page and click "remove". I don't care about security; I'm not putting this on the open Internet or letting guests access it.

To connect to WiFi, I remove the current settings as an "access-point", then "scan" my local network, select my access-point, enter the WPA2 passphrase, and connect. It seems to work perfectly.

While I'm here, I also go into the system settings and change the name of the device to "MipsDev", and also set the timezone to New York.

I then disconnect the Ethernet and continue at this point via their WiFi connection. At some point, I'm going to just connect power and stick it in the back of a closet somewhere.

DHCP assigns this 10.20.30.46 (I don't mind telling you -- I'm going to renumber my home network soon). So from my desktop computer I do:

C:\> ssh root@10.20.30.46

...because I'm a Windows user and Windows supports ssh now g*dd***it.


OpenWRT had a brief schism a few years ago with the breakaway "LEDE" project. They mended their differences and came back together again in the latest version, but this older version still goes by the "LEDE" name.

At this point, I need to expand the storage from the 16-megabytes on the device. I put in a 32-gigabyte USB flash drive for $5 -- expanding storage by 2000 times.

The way OpenWRT deals with this is called an "overlay", which uses the same technology as Docker containers to essentially containerize the entire operating system. The existing operating system is mounted read-only. As you make changes, such as installing packages or reconfiguring the system, anything written is written into the overlay portion. If you do a factory reset (by holding down the button at boot), it simply discards the overlay portion.

What we are going to do is simply change the overlay from the current 16-meg on-board flash to our USB flash drive. This means copying the existing overlay part to our drive, then re-configuring the system to point to our USB drive instead of their overlay.

This process is described on OpenWRT's web page here:

https://wiki.openwrt.org/doc/howto/extroot

It works well -- but only for systems with more than 4-megs of flash. This is what defeated me before: there wasn't enough space to add the necessary packages. But with 16-megs on this device there is plenty of space.

The first step is to update the package manager, just like on other Linuxes.

# opkg update

When I plug in the USB drive, dmesg tells me it found a USB "device", but nothing more. This tells me I have the basic USB drivers installed, but not the flash-drive parts.

[ 5.388748] usb 1-1: new high-speed USB device number 2 using ehci-platform

Following the instructions in the above link, I then install those components:

# opkg install block-mount kmod-fs-ext4 kmod-usb-storage-extras

Simply installing these packages will cause it to recognize the USB drive in dmesg:

[   10.748961] scsi 0:0:0:0: Direct-Access     Samsung  Flash Drive FIT  1100 PQ: 0 ANSI: 6
[   10.759375] sd 0:0:0:0: [sda] 62668800 512-byte logical blocks: (32.1 GB/29.9 GiB)
[   10.766689] sd 0:0:0:0: [sda] Write Protect is off
[   10.770284] sd 0:0:0:0: [sda] Mode Sense: 43 00 00 00
[   10.771175] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   10.788139] sda: sda1
[   10.794189] sd 0:0:0:0: [sda] Attached SCSI removable disk

At this point, I need to format the drive with ext4. The correct way of doing this is to connect it to my Linux VM and format it there: in storage-limited environments, OpenWRT doesn't have space for the formatting utilities. But with 16-megs that I'm going to overlay soon anyway, I don't care, so I install those utilities on the device.

# opkg install e2fsprogs

Then I do the normal Linux thing to format the drive:

# mkfs.ext4 /dev/sda1

This blows away whatever was already on the drive.

Now we need to copy over the contents of the existing /overlay. We do that with the following:

# mount /dev/sda1 /mnt
# tar -C /overlay -cvf - . | tar -C /mnt -xf -
# umount /mnt

We use tar to copy because, as a backup program, it maintains file permissions and timestamps, making it better for backup-and-restore. We don't actually create an archive file; instead we use tar in streaming mode. The '-' in one invocation streams the archive to stdout instead of writing a file; the '-' in the other streams it in from stdin. Thus, we never hold a complete copy of the archive, either in memory or on disk. Files are untarred as soon as they are tarred up.

At this point we do a blind incantation I really don't understand. I just did it and it works. The link above has some more text on this, and some things you should check afterwards.

# block detect > /etc/config/fstab; \
sed -i s/option$'\t'enabled$'\t'\'0\'/option$'\t'enabled$'\t'\'1\'/ /etc/config/fstab; \
sed -i s#/mnt/sda1#/overlay# /etc/config/fstab; \
cat /etc/config/fstab;
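
As best I can tell, what this does is: "block detect" writes out a candidate fstab configuration for each partition it finds, the first sed flips each entry's "enabled" option from '0' to '1', and the second sed changes the mount target from /mnt/sda1 to /overlay, so that on the next boot the USB drive gets mounted as the overlay.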

At this point, I reboot and log back in. We need to update the package manager again. That's because when we ran it the first time, it skipped packages that couldn't fit in our tiny partition. We run it again now, with a huge overlay, to get a list of all the packages.

# opkg update

For example, gcc is something like 50 megabytes, so it wouldn't have fit initially, but now it does. It's the first thing I grab, along with git:

# opkg install gcc make git

Now I add a user. I have to do this manually, because there's no "adduser" utility I can find that does this for me. This involves the following steps (a sketch of the commands follows the list):

  • adding line to /etc/passwd
  • adding line to /etc/group
  • using the passwd command to change the password for the account
  • creating a directory for the user
  • chown the user's directory
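
A sketch of what those commands look like (the username, IDs, and home directory here are just examples):

# echo 'rob:x:1000:1000:rob:/home/rob:/bin/ash' >> /etc/passwd
# echo 'rob:x:1000:' >> /etc/group
# passwd rob
# mkdir -p /home/rob
# chown rob:rob /home/rob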

My default shell is /bin/ash (BusyBox) instead of /bin/bash. I haven't added bash yet, and I figure for a testing system, maybe I shouldn't.

I'm missing a git helper for https, so I use the git protocol instead:

$ git clone git://github.com/robertdavidgraham/masscan
$ cd masscan

At this point, I'd normally run "make -j" to quickly build the project, starting a separate process for each file. This compiles a lot faster, but this device only has 64-megs of RAM, so it'd run out of memory quickly: each compiler process needs around 20-megabytes. So I content myself with two jobs:

$ make -j 2

That's enough so that one process can be stuck waiting to read files while the other is CPU-bound compiling. This is a slow CPU, so that's a big limitation.

The final linking step fails. That's because this platform uses a different C library than other Linux versions: musl, instead of the glibc you find on the big distros, or the uClibc on smaller distros like those you'd find on the Raspberry Pi. This is excellent -- I've found the first bug I need to fix.

In any case, I need to verify this is indeed "big-endian" mode, so I wrote a little program to test it:

#include <stdio.h>

int main(void)
{
    int x = *(int*)"\1\2\3\4";
    printf("0x%08x\n", x);
    return 0;
}

It indeed prints the big-endian result:

0x01020304

The numbers would be reversed if this were little-endian like x86.

Anyway, I thought I'd document the steps for those who want to play with these devices. The same steps would apply to other OpenWRT devices. GL-iNet has some other great options to work with, but of course, at some point, it's just easier to get Raspberry Pi knockoffs instead.



text California's bad IoT law
Mon, 10 Sep 2018 21:15:00 +0000
California has passed an IoT security bill, now awaiting the governor’s signature or veto. It’s a typically bad bill, based on a superficial understanding of cybersecurity/hacking, that will do little to improve security while doing a lot to impose costs and harm innovation.


It’s based on the misconception of adding security features. It’s like dieting, where people insist you should eat more kale, which does little to address the problem that you are pigging out on potato chips. The key to dieting is not eating more but eating less. The same is true of cybersecurity, where the point is not to add "security features" but to remove "insecure features". For IoT devices, that means removing listening ports and cross-site/injection issues in web management. Adding features is typical "magic pill" or "silver bullet" thinking that we spend much of our time in infosec fighting against.

We don’t want arbitrary features like firewall and anti-virus added to these products. It’ll just increase the attack surface making things worse. The one possible exception to this is "patchability": some IoT devices can’t be patched, and that is a problem. But even here, it’s complicated. Even if IoT devices are patchable in theory there is no guarantee vendors will supply such patches, or worse, that users will apply them. Users overwhelmingly forget about devices once they are installed. These devices aren’t like phones/laptops which notify users about patching.

You might think a good solution to this is automated patching, but only if you ignore history. Many rate "NotPetya" as the worst, most costly, cyberattack ever. That was launched by subverting an automated patch. Most IoT devices exist behind firewalls, and are thus very difficult to hack. Automated patching gets beyond firewalls; it makes it much more likely mass infections will result from hackers targeting the vendor. The Mirai worm infected fewer than 200,000 devices. A hack of a tiny IoT vendor can gain control of more devices than that in one fell swoop.

The bill does target one insecure feature that should be removed: hardcoded passwords. But it gets the language wrong. A device doesn’t have a single password, but many things that may or may not be called passwords. A typical IoT device has one system for creating accounts on the web management interface, a wholly separate authentication system for services like Telnet (based on /etc/passwd), and yet another wholly separate system for things like debugging interfaces. Just because a device does the prescribed thing of using a unique or user-generated password in the user interface doesn’t mean it doesn’t also have a bug in Telnet.

That was the problem with the devices infected by Mirai. Describing them as having hardcoded passwords reflects only a superficial understanding of the problem. The real problem was that there were different authentication systems in the web interface and in other services like Telnet. Most of the devices vulnerable to Mirai did the right thing on the web interface (meeting the language of this law), requiring the user to create new passwords before operating. They just did the wrong thing elsewhere.

People aren't really paying attention to what happened with Mirai. They look at the 20 billion new IoT devices that are going to be connected to the Internet by 2020 and believe Mirai is just the tip of the iceberg. But it isn’t. The IPv4 Internet has only 4 billion addresses, which are pretty much already used up. This means those 20 billion won’t be exposed to the public Internet like Mirai devices, but hidden behind firewalls that translate addresses. Thus, rather than Mirai presaging the future, it represents the last gasp of the past that is unlikely to come again.

This law is backwards looking rather than forward looking. Looking forward, by far the most important thing that will protect IoT in the future is "isolation" mode on the WiFi access-point, which prevents devices from talking to each other (or infecting each other). This prevents "cross site" attacks in the home. It prevents infected laptops/desktops (which are much more under threat than IoT) from spreading to IoT. But lawmakers don’t think in terms of what will lead to the most protection; they think in terms of who can be blamed. Blaming IoT devices for the moral weakness of not doing "reasonable" things is satisfying, regardless of whether it’s effective.

The law makes the vague requirement that devices have "reasonable" and "appropriate" security features. It’s impossible for any company to know what these words mean, and thus impossible to know whether they are compliant with the law. Like other laws that use these terms, it’ll have to be worked out in the courts. But security is not like other things. Rather than something static that can be worked out once, it’s always changing. This is especially true since the adversary isn’t something static like wear and tear on car parts, but dynamic: as defenders improve security, attackers change tactics, so what’s "reasonable" is constantly changing. Security struggles with hindsight bias, so what’s "reasonable" and "appropriate" seems more obvious after bad things occur than before. Finally, you are asking the lay public to judge reasonableness, so a jury can easily be convinced that "anti-virus" would be a reasonable addition to IoT devices, despite experts believing it would be unreasonable and bad.

The intent is for the law to make some small static improvement, like making sure IoT products are patchable, after a brief period of litigation. The reality is that the issue is going to constantly be before the courts as attackers change tactics, causing enormous costs. It’s going to saddle IoT devices with encryption and anti-virus features that the public believe are reasonable but that make security worse.

Lastly, Mirai infected only 200k devices, primarily outside the United States. This law fails to address that threat because it applies only to California devices, not the devices purchased in Vietnam and Ukraine that, once they become infected, would flood California targets. If somehow the law influenced a general improvement of the industry, you’d still be introducing unnecessary costs to 20 billion devices in an attempt to clean up 0.001% of those devices.

In summary, this law is based upon an obviously superficial understanding of the problem. It in no way addresses the real threats, but at the same time, introduces vast costs to consumers and innovation. Because of the changing technology with IPv4 vs. IPv6 and WiFi vs. 5G, such laws are unneeded: IoT of the future is inherently going to be much more secure than the Mirai-style security of the past.




Update: This tweet demonstrates the points I make above. It's about how Tesla used an obviously unreasonable 40-bit key in its keyfobs.

It's obviously unreasonable and they should've known about the weakness of 40-bit keys, but here's the thing: every flaw looks this way in hindsight. There never has been a complex product ever created that didn't have similarly obvious flaws.

On the other hand, what Tesla does have, better than any other car maker, are the proper programs whereby they can be notified of such flaws in order to fix them in a timely manner. Better yet, they offer bug bounties. This isn’t a "security feature" in the product, yet it is absolutely the #1 most important thing a company can have, more so than any security feature. What we are seeing with the IoT marketplace in general is that companies lack such notification/disclosure programs: a company can be compliant with the California law while still lacking such a program.

Finally, Tesla cars are "Internet connected devices" according to the law, so Tesla can be sued under it for this flaw, even though the flaw represents no threat the law was intended to handle.

Again, the law wholly misses the point. A law demanding that IoT companies have disclosure programs would be far more effective at improving security than this law, while not imposing its punitive costs.

text Debunking Trump's claim of Google's SOTU bias
Thu, 30 Aug 2018 01:51:00 +0000
Today, Trump posted a video claiming Google promoted all of Obama's "State of the Union" (SotU) speeches but none of his own. In this post, I debunk that claim. The short answer is this: it's not Google's fault, but Trump's, for not having a sophisticated social-media team.


The evidence still exists at the Internet Archive (aka the "Wayback Machine"), which archives copies of websites. That is probably how the Trump video was created. We can indeed see that Google promoted Obama's SotU speeches, such as this example for his January 12, 2016 speech:


And indeed, if we check for Trump's January 30, 2018 speech, there's no such promotion on Google's homepage:

But wait a minute, Google claims they did promote it, and there's even a screenshot on Reddit proving Google is telling the truth. Doesn't this disprove Trump?

No, it actually doesn't, at least not yet. It's comparing two different things. In the Obama example, Google promoted the upcoming event hours ahead of time. In the Trump example, they didn't do that; only once the event went live did they mention it.

I failed to notice this in my examples above because the Wayback Machine uses GMT timestamps. At 9pm EST when Trump gave his speech, it was 2am the next day in GMT. So picking the Wayback page from January 31st we do indeed see the promotion of the live event.


Thus, Trump still seems to have a point: Google promoted Obama's speeches better, hours ahead of time, while Trump's was promoted only after it went live.

But hold on a moment, there's another layer to this whole thing. Let's look at those YouTube URLs. For the Obama speech, we have this URL:


For the Trump speech, we have this URL:


I show you the complete URLs to show you the difference. The first video is from the White House itself, whereas the second isn't (it's from the NBC livestream).

So here's the thing, and I can't stress this enough: Google can't promote a link that doesn't exist. They can't say "Click Here" if there is no "here" there. Somebody has to create the link ahead of time. And that "somebody" isn't YouTube: they don't have cameras to create videos; they simply publish videos created by others.

So what happened here is simply that Obama had a savvy media team that knew how to create YouTube live events and make sure they got promoted, while Trump doesn't have such a team. Trump relied upon the media (which he hates so much) to show the video live, making no effort himself to do so. We can see this for ourselves: while the above link clearly shows the Obama White House having created his live video, the current White House channel has no such video for Trump.

So clearly the fault is Trump's, not Google's.

But wait, there's more to the saga. After Trump's speech, Google promoted the Democrat response:


Casually looking back through the Obama years, I don't see any equivalent Republican response. Is this evidence of bias?

Maybe. Or again, maybe it's just that the Democrats are more media savvy than the Republicans. Indeed, what came after Obama's speech on YouTube in some years was a question-and-answer session with Obama himself, which of course is vastly more desirable for YouTube (personal interaction!) and is going to push any competing item into obscurity.

If Trump wants Google's attention next January, it's quite clear what he has to do. First, set up a live event the day before so that Google can link to it. Second, set up a post-speech interactive question event that will, of course, smother the heck out of any Democrat response -- and probably crash YouTube in the process.

Buzzfeed quotes Google PR saying:
On January 30 2018, we highlighted the livestream of President Trump’s State of the Union on the google.com homepage. We have historically not promoted the first address to Congress by a new President, which is technically not a State of the Union address. As a result, we didn’t include a promotion on google.com for this address in either 2009 or 2017.
This is also bunk. It ignores the difference between promoting upcoming events and live events. I can't see that they promoted any of Bush's speeches (like in 2008) or even Obama's first SotU in 2010, though they did promote a question/answer session with Obama after the 2010 speech. Thus, the claimed 2017 "trend" has only a single data point.

My explanation is better: Obama had a media-savvy team that reached out to Google, whereas Trump didn't. But you see the problem for a PR flack: while they know there's no corporate policy of bias against Trump, at the same time, they don't necessarily have an explanation either. They can point to data, such as the live promotion page, but they can't necessarily explain why. An explanation like mine is harder for them to reach.




text Provisioning a headless Raspberry Pi
Mon, 27 Aug 2018 01:39:00 +0000
The typical way of installing a fresh Raspberry Pi is to attach power, keyboard, mouse, and an HDMI monitor. This is a pain, especially for the diminutive RPi Zero. This blogpost describes a number of options for doing a headless setup instead: Ethernet, Ethernet gadget, WiFi, and serial connection. These examples use a Macbook; maybe I'll get around to a blogpost describing this from Windows.

Burning micro SD card

We are going to edit the SD card before booting, so for completeness, I thought I'd describe the process of burning an SD card.

We are going to download the latest "raspbian" operating system. I download the "lite" version because I'm not using the desktop features. It comes as a compressed .zip file which we need to extract into an .img file. Just double-click on the .zip on Windows or Mac.

The next step is to burn the image to an SD card. On Windows I use Win32DiskImager. On Mac I use the following command-line steps:

$ sudo -s
# mount
# diskutil unmount /dev/disk2s1
# dd bs=1m if=~/Downloads/2018-06-27-raspbian-stretch-lite.img of=/dev/disk2 conv=sync

First, I need a root prompt. I then use the mount command to find out where the micro SD card is mounted in the file system. It's usually /dev/disk2s1, but could be disk3 or disk4 depending upon other things that may already be mounted on my Mac, such as USB drives or dmg files. It's important to know the correct drive because the dd utility is unforgiving of mistakes and can wipe out your entire drive. For gosh's sake, don't use disk1!!!! Remember dd stands for danger-danger (well, many claim it stands for disk-dump, but seriously, it's dangerous).

The next step is to unmount the drive. Instead of the Unix umount utility use the diskutil unmount macOS tool.

Now we use good ol' dd to copy the image over. The above example uses my recently downloaded Raspbian image, which is two months old; when you do this, it'll be a newer version with a different file name, so look in your ~/Downloads folder for the correct name.

This takes a while to write to the SD card. You can type [ctrl-T] to see progress if you want.

When we are done writing, don't eject the card. We are going to edit the contents as described below before we stick it into our Raspberry Pi. After running dd, it's going to become automatically mounted on your Mac, on mine it comes up as /Volumes/boot. When I say "root directory of the SD card" in the instructions below, I mean that directory.

Troubleshooting: If you get a "Resource busy" error when running dd, it means you didn't unmount the drive. Go back and run diskutil unmount /dev/disk2s1 (or the equivalent for whatever drive mount tells you the SD card is using).

You can use the "raw" disk instead of the normal disk, such as /dev/rdisk2. I don't know what the tradeoffs are.

Ethernet

The RPi B comes with Ethernet built-in. You simply need to hook the Ethernet cable up to your network to automatically get an IP address. Or, you can directly connect the Ethernet to your laptop -- but that'll require some additional steps.

For an RPi Zero, you can attach a USB Ethernet adapter via an OTG converter to accomplish the same goal. However, in the next section, we'll describe using an OTG gadget instead, which is better.

We want to use Ethernet to ssh into the device, but there's a problem: the ssh service is not enabled by default in Raspbian. To enable it, just create a file ssh (or ssh.txt) in the root directory of the SD card. On my Macbook, it looks like:

$ touch /Volumes/boot/ssh

Eject the SD card, stick it into your Raspberry Pi, and boot it. After the device has booted, you'll need to discover its IP address. On the local network from your Macbook, with the "Bonjour" service, you can just use the hostname "raspberrypi.local" (you can install Bonjour on Windows with iTunes, or the avahi service on Linux). Or, you can sniff the network with tcpdump. Or, you can scan for port 22 on the network with nmap or masscan. Or, you can look on your router's DHCP status page to see what was assigned.
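
For example, to scan a typical home network for the SSH port (adjust the subnet to match your own network):

$ nmap -p 22 192.168.1.0/24

(or the equivalent with masscan, which needs root: masscan -p22 192.168.1.0/24).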

When there is a direct Ethernet-to-Ethernet connection to your laptop, the RPi won't get an IP address because there is no DHCP service running on your laptop. In that case, the RPi will have an address in the range 169.254.x.x, or a link-local IPv6 address. You can discover which one via sniffing, or again, via Bonjour using raspberrypi.local.
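
If you want to try the IPv6 route, one trick is to ping the all-nodes link-local multicast address on the interface and see which neighbors answer. A minimal sketch on macOS (en5 is my Thunderbolt Ethernet interface; yours may differ):

$ ping6 -c 2 ff02::1%en5

The RPi's link-local address should show up among the replies; you can then ssh to that address (keeping the %en5 scope suffix).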

Or, you can turn on "connection sharing" in Windows or macOS. This sets up your laptop to NAT the Ethernet out through your laptop's other network connection (such as WiFi). This also provides DHCP to the device. On my MacBook, it assigns the RPi an address like 192.168.2.2 or 192.168.2.3.

System preferences -> Sharing
Allowing Ethernet devices to share WiFi connection to Internet
In the above example, the RPi is attached via my Thunderbolt Ethernet cable. I could also have used a USB Ethernet, or RNDIS Ethernet (described below).

The default login for Raspbian is username pi and password raspberry. To ssh to the device, use a command line like:

$ ssh pi@raspberrypi.local

or

$ ssh pi@192.168.2.2

Some troubleshooting tips. If you get an error "connection refused", that means the remote SSH service isn't running. It has to generate a new, random SSH key the first time it runs, so startup can take a while. I just waited another minute and tried again, and everything worked. If you get an error "connection closed", then it borked generating a key on the first startup: the service is running and allowing the connection, then closing it because it has no key to use. There's no hope at this point other than reflashing the SD card and starting over from scratch, or logging in some other way and fixing the SSH installation manually. I had this problem happen once, I don't know why, and ended up just starting over from scratch.

RPi Zero OTG Ether Gadget

The Raspberry Pi Zero (not the other models) supports OTG (On-The-Go) USB. That means it can be either end of a USB cable, a host or a device. Among the devices it can emulate is an Ethernet adapter, thus allowing a USB cable to act as a virtual Ethernet connection. This is useful because the same USB cable can also power the RPi Zero. Just be sure to plug the cable into the port labeled "USB" instead of "PWR IN".

I had to run through these instructions twice. I haven't debugged why; I suspect that things failed the first time around while setting up the RNDIS drivers and Internet sharing. Once I got those configured correctly to work automatically, I reflashed the SD card and started again from scratch, and things worked slick.

As described above for Ethernet, after flashing the Raspbian image to the SD card, do "touch ssh" in its root directory to tell it to enable the SSH service on bootup.

Also in that root directory you'll find a file config.txt. Edit that file and add the line "dtoverlay=dwc2" at the bottom.
dwc2 is a driver for the OTG port that auto-detects whether the port should be in host mode (where you attach devices to the RPi, like flash drives) or device mode (such as when emulating Ethernet, serial ports, and so forth).
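
On a Mac, you can append the line from the command line instead of opening an editor (assuming the card is mounted at /Volumes/boot as above):

$ echo 'dtoverlay=dwc2' >> /Volumes/boot/config.txt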

Also in the root directory you'll find cmdline.txt. Edit that file. It has only one very long line of text (that'll wrap in your terminal). Edit that line. Move the cursor to just after rootwait and add the text "modules-load=dwc2,g_ether".


These are the Linux command-line boot parameters. This is telling Linux to load the dwc2 driver, and configure that driver for emulating an Ethernet adapter.
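
Just as an illustration (the PARTUUID value is unique to each image, so yours will differ), the edited line will look something like:

dwc_otg.lpm_enable=0 console=serial0,115200 console=tty1 root=PARTUUID=xxxxxxxx-02 rootfstype=ext4 fsck.repair=yes rootwait modules-load=dwc2,g_ether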

Now cleanly eject the SD card, stick it in the RPi zero, and connect the USB cable. Remember to plug into the USB port on the RPi Zero not the PWR IN port.

On macOS, go to the Network System Preferences. Wait a couple minutes for the RPi Zero to boot and you should see an "RNDIS" Ethernet device appear. I've given mine a manual IP address, though I don't think it matters, because I'm going to use "Internet sharing" to share the connection anyway.

RPi0 should now appear as RNDIS device
By the way, "RNDIS" is the name Microsoft gave this virtual Ethernet adapter, based on the NDIS name for Ethernet drivers Microsoft first created in the 1980s. It's the name we use on macOS, Linux, BSD, Android, etc.

I struggled getting the proper IP address on this thing and ended up using Internet sharing, as described above. The only change was to share the RNDIS Ethernet instead of the Thunderbolt Ethernet.
Use NAT/DHCP to allow RPi0 to share my laptop's WiFi

As described above, now do "ssh pi@raspberrypi.local" or "ssh pi@192.168.2.6" (IP address as appropriate), with password "raspberry".

As I mentioned above, I had to do this twice before it worked; I suspect that configuring macOS for the first time screwed things up.

The Ethernet interface will come up with the name usb0.

WiFi

For the devices supporting WiFi, instead of using Ethernet we can use WiFi.

To start with, we again create the ssh file to tell it to start the service:

$ touch /Volumes/boot/ssh

Now we need to create a file in the SD root directory called "wpa_supplicant.conf" with contents that look like the following:

ctrl_interface=DIR=/var/run/wpa_supplicant GROUP=netdev
update_config=1
country=US

network={
    ssid="YOURSSID"
    psk="YOURPASSWORD"
    scan_ssid=1
}

You need to change the SSID and password to match your WiFi network, as I do in the screenshot below (this is actually the local bar's network).
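
By the way, if you'd rather not leave the WiFi password sitting in the file as plaintext, the wpa_passphrase tool (part of the wpa_supplicant package on Linux; it doesn't ship with macOS) generates a network block with a hashed psk you can paste in instead:

$ wpa_passphrase YOURSSID YOURPASSWORD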

Safely eject the card, insert into RPi, and power it up. In the following screenshot, I'm using a battery to power the RPi, to show there is no other connection (Ethernet, USB, etc.).

I then log in from my laptop that is on the same WiFi. Again, you can use either raspberrypi.local (on a laptop using Apple's Bonjour service), or use the raw IP address, as in the following example:
This one time it didn't work: I had everything configured right, but for some reason it didn't find the WiFi network. Restarting the device fixed the problem. I'm not sure why this happened.

Enabling serial cable

The old-school way is to connect over a serial cable.

The first step is to go to Adafruit and buy a serial cable for $10, this device for $7, or one for $6 from Amazon, and install the drivers as documented here. The cable I got requires the "SiLabs CP210X" drivers.

The next step is to edit config.txt on the SD card and add the line at the end "enable_uart=1".
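
Again, this can be done from the command line (assuming the card is mounted at /Volumes/boot):

$ echo 'enable_uart=1' >> /Volumes/boot/config.txt
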
Now we are ready to cleanly eject the SD card and stick it in the Raspberry Pi.

First, let's hook the serial cable to the Raspberry Pi. NOTE: don't plug in the USB end into the computer yet!!! The guide at Adafruit shows which colored wires to connect to which GPIO pins.
From Adafruit
Basically, the order is (red) [blank] [black] [white] [green] from the outer edge. It's the same configuration for Pi Zeroes, but yours may come without pins. You'll either have to solder on some jumper wires or use alligator clips.

You have two options for powering the board. You can either connect the red wire to the first pin (as I do in the picture below) or you can connect power as normal, such as to a second USB port on your laptop. I chose to power my Raspberry Pi 3 Model B+ from the serial cable. I got occasional messages complaining about "undervoltage", but everything worked without corrupting the SD card (SD card corruption is often what happens with power problems).


Once you've got the serial cable attached to the Pi, plug it into the USB port on the laptop. The Pi should start booting up.

On Windows you can use Putty, and on Linux you can use /dev/ttyUSB0, but on the Macbook we are going to use an outgoing serial device. The first thing is to find the device by listing what's available. On my Macbook, I run the following and get "/dev/cu.SLAB_USBtoUART" as the one to use, plus some other possibilities (from Bluetooth and my iPhone) that I'm not interested in:
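
$ ls /dev/cu.*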

/dev/cu.Bluetooth-Incoming-Port
/dev/cu.SLAB_USBtoUART
/dev/cu.iPhone-WirelessiAPv2

The command to run to connect to the Pi is:

$ sudo screen /dev/cu.SLAB_USBtoUART 115200

You'll have to hit the return key a couple times for it to know you've connected, at which point it'll give you a command prompt.

(I've renamed the system from 'raspberrypi' in the screenshot to 'pippen').

Note that with some jumper wires you can simply connect the UART from one Raspberry Pi to another.

Conclusion

So here you have the various ways:

  • Raspberry Pi 3 Model B/B+ - Ethernet
  • Raspberry Pi 3 Model B/B+ - WiFi
  • Raspberry Pi 3 Model B/B+ - Serial
  • Raspberry Pi Zero/Zero W - USB Ethernet dongle
  • Raspberry Pi Zero/Zero W - OTG Ethernet gadget
  • Raspberry Pi Zero/Zero W - Serial
  • Raspberry Pi Zero W - WiFi
It seems we should also be able to get Bluetooth serial working, but there's no support for that yet in the Raspbian build. A serial OTG gadget (that works like the Ethernet gadget) should also work in theory, but apparently it needs an extra configuration step after bootup, so it can't be configured completely headless.











text DeGrasse Tyson: Make Truth Great Again
Mon, 20 Aug 2018 20:03:00 +0000
Neil deGrasse Tyson tweets the following:
When people make comparisons with Orwell's "Ministry of Truth", he obtusely persists:
Given that Orwellian dystopias were the theme of this summer's DEF CON hacker conference, let's explore what's wrong with this idea.

Truth vs. "Truth"

I work in a corrupted industry, variously known as the "infosec" community or "cybersecurity" industry. It's a great example of how truth is corrupted into "Truth".

At a recent government policy meeting, I pointed out how vendors often downplay the risk of bugs (vulnerabilities that can be exploited by hackers). When vendors are notified of these bugs and release a patch to fix them, they often give a risk rating. These ratings are often too low, in order to protect the corporate reputation. The representative from Oracle claimed that they didn't do that, and that indeed, they'll often overestimate the risk. Other vendors chimed in, also claiming they rated the risk higher than it really was.

In a neutral world, deliberately overestimating the risk would be the same falsehood as deliberately underestimating it. But we live in a non-neutral world, where only one side is a lie, the middle is truth, and the other side is "Truth". Lying in the name of the "Truth" is somehow acceptable.

Moreover, Oracle is famous for having downplayed the risk of significant bugs in the past, and is well-known in the industry as the least trustworthy vendor as far as the security of their products is concerned. Much of their policy efforts in Washington D.C. are focused on preventing their dirty laundry from being exposed. They aren't simply another vendor promoting "Truth", but one deliberately exploiting "Truth" to corrupt ends.

That we should exaggerate the risks of cybersecurity, deliberately lie to people for their own good, is the uncontroversial consensus of our infosec/cybersec community. Most do it, few think this is wrong. Security is a moral imperative that justifies "Truth".

The National Academy of Sciences

So are we getting the truth or "Truth" from organizations like the National Academy of Sciences?

The question here isn't global warming. That mankind's carbon emissions warms the climate is truth. We have a good understanding of how greenhouse gases work, as well as many measures of the climate showing that warming is occurring. The Arctic is steadily losing ice each summer.

Instead, the question is "Global Warming", the claims made by politicians on the subject. Do politicians on the left fairly represent the truth, or are they peddling "Truth"?

Which side is the National Academy of Sciences on? Are they committed to the truth, or (like the infosec/cybersec community) are they pursuing "Truth"? Is global warming a moral imperative that justifies playing loose with the facts?

Googling "national academy of sciences climate change" quickly leads to this document: "Climate Change: Evidence and Causes". Let's skip past the basics and go directly to "hurricanes". It's a favorite topic among politicians, where every hurricane season they blame the latest damage on climate change. Is such blame warranted?

The answer is "no". There is not sufficient evidence to conclude hurricanes have gotten worse. There is good reason to believe they might get worse, after all, warmer oceans mean more energy, but as far as we can tell, it hasn't happened yet. Moreover, when it does happen, the best theories point to hurricanes becoming only slightly worse. It's certainly worth adding to future estimates of the costs of climate change, but it's not going to be catastrophic.

The above scientific document, though, punts on this answer, as shown in the below screenshot:


The document is clearly a political one. Its content is intended to refute any scientific claims made by Republicans, but not to offend Democrats. It is on the side of "Truth", not truth. If Obama blames hurricane damage on the oil companies, the National Academy of Sciences is going to politely dance around the issue.

Whenever I point out in conversation that the science is against somebody's claim about hurricanes, people ask me to cite my sources. This is exactly the source I would cite, but it's difficult. Its non-answer on hurricanes should be sufficient, after all, science. But since they prevaricate without being explicit on the issue, few accept this source.

Why this matters

Last year in the state of Washington, the Republicans put a carbon tax bill on the ballot in order to combat climate change. The Democrats shot it down.

The reason (for both actions) is that the tax was revenue neutral, meaning the added revenue from the carbon tax was offset by reduction in other taxes (namely, the sales tax). This matches the Republican ideology: they have no particular dispute with climate change as such, they just oppose expansion of government. If you can address climate change without increasing taxes or regulation, they have no principled reason to oppose it. Thus, revenue neutral carbon taxes are something Republicans will easily agree with. Even if they don't believe in global warming, they have no real opposition to replacing one tax with another.

Conversely, the Democrats don't care about solving climate change. Instead, their goal is to expand government, increasing taxes and regulation. They will reject any proposal to address climate change that doesn't match their ideological goals.

This description of what happened is extreme, of course. Things are invariably more nuanced than this. But there's still a kernel of truth here. This idea that one side is being ideological (denying climate change) and the other side scientific is false. Both sides are equally ideological/scientific, just in different directions.

It's therefore not just Republican ideology here that is the sticking point, but also Democrat. As long as Democrats believe they don't have to compromise, because the "Truth" is on their side, they won't. Instead of agreeing on revenue neutral carbon taxes, they'll insist on that extra revenue subsidizing photovoltaic panels (or some such that increases total government taxes/spending). The National Academy of Sciences defending "Truth" is not helping the situation.

Conclusion

I believe in global-warming/climate-change, that mankind's carbon emissions are increasing temperatures and that we must do something about this. I drive an electric car, but more importantly, use carbon offsets in order to be completely carbon neutral. I want a large carbon tax, albeit one that is revenue neutral. This blogpost shouldn't be interpreted in any way as "denying climate change".

Instead, the point is about "Truth". I see the facile corruption of "Truth" in my own industry. It's incredibly Orwellian. I'm disappointed how those like Neil deGrasse Tyson haven't learned the lessons of history and 1984 about "Truth".



text That XKCD on voting machine software is wrong
Thu, 09 Aug 2018 00:09:00 +0000
The latest XKCD comic on voting machine software is wrong, profoundly so. It's the sort of thing that appeals to our prejudices, but mistakes the details.


Accidents vs. attack

The biggest flaw is that the comic confuses accidents vs. intentional attack. Airplanes and elevators are designed to avoid accidental failures. If that's the measure, then voting machine software is fine and perfectly trustworthy. Such machines are no more likely to accidentally record a wrong vote than the paper voting systems they replaced -- indeed, less likely. The reason we have electronic voting machines in the first place was the "hanging chad" problem in the Bush v. Gore election of the year 2000. After that election, a wave of new, software-based voting machines replaced the older, inaccurate paper machines.

The question is whether software voting machines can be attacked. Well, if that's the measure, then airplanes aren't safe at all. Security against human attack consists of the entire infrastructure outside the plane, from the TSA forcing us to take off our shoes, to trade restrictions preventing the proliferation of Stinger missiles.

Conflating the two, accidents vs. attack, works here because it makes the reader feel superior. We get to mock and feel superior to those stupid software engineers for not living up to what's essentially a fictional standard of reliability.

To repeat: software-based machines are better than the mechanical machines they replaced, which is why there are so many of them in the United States. The issue isn't normal accuracy, but their robustness against a different standard, attack -- a standard which airplanes and elevators suck at.

The problems are as much hardware as software

Last year at the DEF CON hacking conference they had an "Election Hacking Village" where they hacked a number of electronic voting machines. Most of those "hacks" were against the hardware, such as soldering on a JTAG device or accessing USB ports. Other problems included voting machines sold on eBay whose data hadn't been wiped, allowing voter records to be recovered.

What we want to see is hardware designed more like an iPhone, where the FBI can't decrypt a phone even when they really really want to. This requires special chips, such as secure enclaves, signed boot loaders, and so on. Only once we get the hardware right can we complain about the software being deficient.

To be fair, software problems were also found at DEF CON, like an exploit over WiFi. Though, for a lot of the problems it's questionable whether the fault lies in the software design or the hardware design, as they're fixable in either one. The situation is better described as the entire design being flawed, from the "requirements", to the high-level system "architecture", and lastly to the actual "software" code.

It's lack of accountability/fail-safes

We imagine the threat is that votes can be changed in the voting machine, but it's more profound than that. The problem is that votes can be changed invisibly. The first change experts want to see is adding a paper trail, rather than fixing bugs.

Consider "recounts". With many of today's electronic voting machines, this is meaningless: there is nothing to recount. The machine produces a number, and we have nothing independent to test that number against. You can press a button and do an instant recount, but it won't give you any answer other than the original one.

A paper trail changes this. After the software voting machine records the votes, it prints them to paper for the voter to check. This retains the better user-interface design compared to the horrible punch-hole machines of yore, and retains the quick and cheap vote tabulation, so we know the results of the election quickly. But, if there's an irregularity, there exists an independent record that we can go back and verify, with some labor.

It's like fail-safes in industrial systems, where we are less concerned about whether the normal systems have an error, and much more concerned about whether the fail-safe works. It's like how, famously, Otis is not the inventor of the elevator, but the inventor of the elevator brake that safely stops the car from plummeting to your death if the cable snaps.

What's lacking in election machines therefore is not good or bad software engineering, but the failure of anybody to create fail-safes in the design, fail-safes that will work regardless of how the software works.

It's not just voting machines

It's actually really hard for the Russians to hack voting machines, as the machines have to be attacked on a one-by-one basis. A mass hack that affects them all is hard.

It's much easier to target the back-end systems that tabulate the votes, which are more often normal computers connected to the Internet.

In addition, there are other ways that hackers can target elections. For example, the political divide in America between rural (Republican) and urban (Democrat) voters is well known. An attack against traffic lights, causing traffic jams, is enough to swing the vote slightly in the direction of rural voters. That makes a difference in places like last night's by-election in Ohio where a House candidate won by a mere 1,700 votes.

Voting machines are important, but there's way too much focus on them as if they were the only target to worry about.

Conclusion

The humor of this comic rests on smug superiority. But it's wrong. It's applying a standard (preventing accidents) against a completely different problem (stopping attackers) -- software voting machines are actually better against accidents than the paper machines they replace. It's ignoring the problems, which are often more system and hardware design than software. It ignores the solution, which isn't to fix software bugs, but to provide an independent, auditable paper trail.



So I took a survey of WiFi at Caesar's Palace and thought I'd write up some results.


When we go to DEF CON in Vegas, hundreds of us bring our WiFi tools to look at the world. Actually, no special hardware is necessary, as modern laptops/phones have WiFi built-in, while the operating system (Windows, macOS, Linux) enables "monitor mode". Software is widely available and free. We still love our specialized WiFi dongles and directional antennas, but they aren’t really needed anymore.
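
For example, here's a minimal passive-capture sketch using the tcpdump that ships with macOS (assuming your WiFi interface is en0; the -I flag turns on monitor mode, and the filter grabs probe-request broadcasts):

$ sudo tcpdump -I -i en0 -e -c 100 'type mgt subtype probe-req'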

It’s also legal, as long as you are just grabbing header information and broadcasts. Which is about all that’s useful anymore, as encryption has become the norm -- we can pretty much only see what we are allowed to see. The days of grabbing somebody’s session-cookie and hijacking their web email are long gone (though that was a fun period). There are still a few targets around if you want to WiFi hack, but most are gone.

So naturally I wanted to do a survey of what Caesar’s Palace has for WiFi during the DEF CON hacker conference located there.

Here is a list of access-points (on channel 1 only) sorted by popularity, the number of stations using them. These have mind-blowingly high numbers, in the ~3000 range for "CAESARS". I think something is wrong with the data.



I click on the first one to drill down, and I find a source of the problem. I’m seeing only "Data Out" packets from these devices, not "Data In".


These are almost entirely ARP packets from devices, associated with other access-points, not actually associated with this access-point. The hotel has bridged (via Ethernet) all the access-points together. We can see this in the raw ARP packets, such as the one shown below:

WiFi packets have three MAC addresses, the source and destination (as expected) and also the address of the access-point involved. The access point is the actual transmitter, but it's bridging the packet from some other location on the local Ethernet network.

Apparently, CAESARS dumps all the guests into the address range 10.10.x.x, all going out through the router 10.10.0.1. We can see this from the ARP traffic, as everyone seems to be ARPing that router.

I'm probably seeing all the devices on the CAESARS WiFi. In other words, if I sit next to another access-point, such as one on a different channel, I'm likely to see the same list. Each broadcast appears to be transmitted by all access-points, carried via the backend bridged Ethernet network.

The reason Caesars does it this way is so that you can roam, so that you can call somebody on FaceTime and walk to the end of the Forum shops and back without dropping the phone call. At least in theory; I haven’t tested whether things actually work out this way. It’s the way massive complexes like Caesars ought to work, but which many fail at doing well. Like most "scale" problems, it sounds easy and straightforward until you encounter all the gotchas along the way.

Apple’s market share among these devices is huge, with roughly 2/3rds of all devices being Apple. Samsung has 10% of Apple’s share. Here's a list of vendor IDs (the first 3 bytes of the MAC address) by popularity, that I'm seeing on that one access-point:

  • 2327 Apple
  • 257 Samsung
  • 166 Intel
  • 132 Murata
  • 55 Huawei
  • 29 LG
  • 27 HTC-phone
  • 23 Motorola
  • 21 Foxconn
  • 20 Microsoft
  • 17 Amazon
  • 16 Lite-On
  • 13 OnePlus
  • 12 Rivet Networks (Killer)
  • 11 (random)
  • 10 Sony Mobile
  • 10 Microsoft
  • 8 AsusTek
  • 7 Xiaomi
  • 7 Nintendo

Apparently, 17 people can't bear to part with their Amazon Echo/Alexa devices during their trip to Vegas and brought the devices with them. Or maybe those are Kindle devices.

Remember that these are found by tracking the vendor ID from the hardware MAC addresses built into every phone/laptop/device. Historically, we could also track these MAC addresses via "probe" WiFi broadcasts from devices looking for access-points. As I’ve blogged before, modern iPhones and Androids randomize these addresses so we can no longer track the phones when they are just wandering around unconnected. Only once they connect do they use their real MAC addresses.

In the above sample, I’ve found ~1300 probers, ~90% of whose MAC addresses are randomized. As you can see, because of the way Caesars sets up their network, I can track MAC addresses better via ARP broadcasts than via WiFi probe broadcasts.

While mobile devices are the biggest source of MAC addresses, they also identify the fixed infrastructure. For example, some of the suites in Caesars have devices with a "Crestron" MAC address. Somebody is releasing an exploit at BlackHat for Crestron devices. There’s a patch available, but chances are good that hackers will start hacking these devices before Caesars gets around to patching them.

WPA3 promises to get rid of open WiFi hotspots like CAESARS by doing "opportunistic encryption" by default. This is better, preventing me from even seeing the contents of ARP packets passively. However, as I understand the standard, it'll still allow me to collect the MAC addresses passively, as in this example.

Conclusion

This post doesn't contain any big hack. It's fascinating how big WiFi has become, as everyone seems to be walking around with a WiFi device, and most are connecting to the hotel WiFi. Yet, with ubiquitous SSL and WPA encryption, there's much less opportunity for mischief, even though there's a lot more to see.

The biggest takeaway is that even from a single point on the big network of the CAESARS compound, I can get a record of some identifier that can, in ideal circumstances, be traced back to you. In theory, I could sit in an airport, or drive around your neighborhood, and match up those records with these records. However, because of address randomization in probes, I can only do this if you've actually connected to the networks.

Finally, for me, the most interesting bit is to appreciate how the huge CAESARS network actually works, dropping everyone onto the same WiFi connection.


I thought I'd document the solution to this problem I had.

The API libpcap is the standard cross-platform way of sniffing packets off the network. It works on Windows (winpcap), macOS, and all the Unixes. It's better than simply opening a "raw socket" on Unix platforms because it takes advantage of higher performance capabilities of the system, including specialized sniffing hardware.


Traditionally, you'd open an adapter with pcap_open_live(), whose function parameters set options like snap length, promiscuous mode, and timeouts.

However, in newer versions of the API, what you should do instead is call pcap_create(), then set the options individually with calls to functions like pcap_set_timeout(), then once you are ready to start capturing, call pcap_activate().

I mention this in relation to "TPACKET" and pcap_set_immediate_mode().

Over the years, Linux has been adding a "ring buffer" mode to packet capture. This is a trick where a packet buffer is memory mapped between user-space and kernel-space. It allows a packet-sniffer to pull packets out of the driver without the overhead of extra copies or system calls that cause a user-kernel space transition. This has gone through several generations.

One of the latest generations causes the pcap_next() function to wait forever for a packet. This happens a lot on virtual machines where there is no background traffic on the network.

This looks like a bug, but maybe it isn't. It's unclear what the "timeout" parameter actually means. I've been hunting down the documentation, and curiously, it's not really described anywhere. For such an ancient, popular API, libpcap is almost entirely undocumented as to what it precisely does. I've tried reading some of the code, but I'm not sure I've come to any understanding.

In any case, the way to resolve this is to call the function pcap_set_immediate_mode(). This causes libpcap to back off and use an older version of TPACKET, so that things work as expected: even on silent networks the pcap_next() function will time out and return.

I mention this because I fixed this bug in my code. When running inside a VM, my program would never exit. I changed from pcap_open_live() to the pcap_create()/pcap_activate() method instead, adding the setting of "immediate mode", and now things work. Performance seems roughly the same as far as I can tell.
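
Here's a minimal sketch of that pcap_create()/pcap_activate() sequence with immediate mode set; the adapter name "eth0" and the option values are just placeholders, not the exact settings from my program:

#include <pcap.h>
#include <stdio.h>

int main(void)
{
    char errbuf[PCAP_ERRBUF_SIZE];
    pcap_t *p;

    /* create a handle, but don't start capturing yet */
    p = pcap_create("eth0", errbuf);
    if (p == NULL) {
        fprintf(stderr, "pcap_create: %s\n", errbuf);
        return 1;
    }

    /* the options that pcap_open_live() used to take as parameters */
    pcap_set_snaplen(p, 65535);
    pcap_set_promisc(p, 1);
    pcap_set_timeout(p, 1000);          /* read timeout, in milliseconds */

    /* the fix: without this, pcap_next() can block forever on silent networks */
    pcap_set_immediate_mode(p, 1);

    /* now start capturing; negative return codes are errors */
    if (pcap_activate(p) < 0) {
        fprintf(stderr, "pcap_activate: %s\n", pcap_geterr(p));
        pcap_close(p);
        return 1;
    }

    /* ... capture loop using pcap_next()/pcap_next_ex() goes here ... */

    pcap_close(p);
    return 0;
}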

I'm still not certain what's going on here, and there are even newer proposed zero-copy/ring-buffer modes being added to the Linux kernel, so this can change in the future. But in any case, I thought I'd document this in a blogpost in order to help out others who might be encountering the same problem.







text Your IoT security concerns are stupid
Thu, 12 Jul 2018 23:41:00 +0000
Lots of government people are focused on IoT security, such as this bill or this recent effort. They are usually wrong. It's a typical cybersecurity policy effort which knows the answer without paying attention to the question. Government efforts focus on vulns and patching, ignoring more important issues.


Patching has little to do with IoT security. For one thing, consumers will not patch vulns, because unlike your phone/laptop computer which is all "in your face", IoT devices, once installed, are quickly forgotten. For another thing, the average lifespan of a device on your network is at least twice the duration of support from the vendor making patches available.

Naive solutions to the manual patching problem, like forcing autoupdates from vendors, increase rather than decrease the danger. Manual patches that don't get applied cause a small, but manageable constant hacking problem. Automatic patching causes rarer, but more catastrophic events when hackers hack the vendor and push out a bad patch. People are afraid of Mirai, a comparatively minor event that led to a quick cleansing of vulnerable devices from the Internet. They should be more afraid of notPetya, the most catastrophic event yet on the Internet that was launched by subverting an automated patch of accounting software.

Vulns aren't even the problem. Mirai didn't happen because of accidental bugs, but because of conscious design decisions. Security cameras have the unique requirements of being exposed to the Internet and needing a remote factory reset, leading to the worm. While notPetya did exploit a Microsoft vuln, its primary vector of spreading (after the subverted update) was via misconfigured Windows networking, not that vuln. In other words, while Mirai and notPetya are the most important events people cite supporting their vuln/patching policy, neither was really about vuln/patching.

Such technical analysis of events like Mirai and notPetya is ignored. Policymakers are only cherrypicking the superficial conclusions supporting their goals. They assiduously ignore in-depth analysis of such things because it inevitably fails to support their positions, or directly contradicts them.

IoT security is going to be solved regardless of what government does. All this policy talk is premised on things being static unless government takes action. This is wrong. Government is still waffling on its response to Mirai, but the market quickly adapted. Those off-brand, poorly engineered security cameras you buy for $19 from Amazon.com, shipped directly from Shenzhen, now look very different, having less Internet exposure than the ones used in Mirai. Major Internet sites like Twitter now use multiple DNS providers so that a DDoS attack on one won't take down their services.

In addition, technology is fundamentally changing. Mirai attacked IPv4 addresses outside the firewall. The 100-billion IoT devices going on the network in the next decade will not work this way, cannot work this way, because there are only 4-billion IPv4 addresses. Instead, they'll be behind NATs or accessed via IPv6, both of which prevent Mirai-style worms from functioning. Your fridge and toaster won't connect via your home WiFi anyway, but via a 5G chip unrelated to your home.

Lastly, focusing on the vendor is a tired government cliche. Chronic internet security problems that go unsolved year after year, decade after decade, come from users failing, not vendors. Vendors quickly adapt, users don't. The most important solutions to today's IoT insecurities are to firewall and microsegment networks, something wholly within control of users, even home users. Yet government policy makers won't consider the most important solutions, because their goal is less cybersecurity itself and more how cybersecurity can further their political interests.

The best government policy for IoT is to do nothing, or at least to focus on more relevant solutions than patching vulns. The ideas proposed above will add costs to devices while providing insignificant benefits to security. Yes, we will have IoT security issues in the future, but they will be new and interesting ones, requiring different solutions than the ones proposed.


text Lessons from nPetya one year later
Wed, 27 Jun 2018 19:49:00 +0000
This is the one year anniversary of NotPetya. It was probably the most expensive single hacker attack in history (so far), with FedEx estimating it cost them $300 million. Shipping giant Maersk and drug giant Merck suffered losses on a similar scale. Many are discussing lessons we should learn from this, but they are the wrong lessons.


An example is this quote in a recent article:
"One year on from NotPetya, it seems lessons still haven't been learned. A lack of regular patching of outdated systems because of the issues of downtime and disruption to organisations was the path through which both NotPetya and WannaCry spread, and this fundamental problem remains."
This is an attractive claim. It describes the problem in terms of people being "weak" and the solution being to be "strong". If only organizations were strong enough, willing to deal with downtime and disruption, then problems like this wouldn't happen.

But this is wrong, at least in the case of NotPetya.

NotPetya's spread was initiated through the Ukrainian company MeDoc, which provided tax accounting software. It had an auto-update process for keeping its software up-to-date. This was subverted in order to deliver the initial NotPetya infection. Patching had nothing to do with this. Other common security controls, like firewalls, were also bypassed.

Auto-updates and cloud-management of software and IoT devices are becoming the norm. This creates a danger of such "supply chain" attacks, where the supplier of the product gets compromised, spreading an infection to all their customers. The lesson organizations need to learn about this is how such infections can be contained. One way is to firewall such products away from the core network. Another solution is port-isolation/microsegmentation, which limits the spread after an initial infection.

Once NotPetya got into an organization, it spread laterally. The chief way it did this was through Mimikatz/PsExec, reusing Windows credentials. It stole whatever login information it could get from the infected machine and used it to try to log on to other Windows machines. If it got lucky getting domain administrator credentials, it then spread to the entire Windows domain. This was the primary method of spreading, not the unpatched ETERNALBLUE vulnerability. This is why it was so devastating to companies like Maersk: it wasn't a matter of a few unpatched systems getting infected, it was a matter of losing entire domains, including the backup systems.

Such spreading through Windows credentials continues to plague organizations. A good example is the recent ransomware infection of the City of Atlanta that spread much the same way. The limits of the worm were the limits of domain trust relationships. For example, it didn't infect the city airport because that Windows domain is separate from the city's domains.

This is the most pressing lesson organizations need to learn, the one they are ignoring. They need to do more to prevent desktops from infecting each other, such as through port-isolation/microsegmentation. They need to control the spread of administrative credentials within the organization. A lot of organizations put the same local admin account on every workstation which makes the spread of NotPetya style worms trivial. They need to reevaluate trust relationships between domains, so that the admin of one can't infect the others.

These solutions are difficult, which is why news articles don't mention them. You don't have to know anything about security to proclaim "the problem is lack of patches". It's moral authority, chastising the weak, rather than a prescription of what to do. Solving supply chain hacks and Windows credential sharing, though, is hard. I don't know any universal solution to this -- I'd have to thoroughly analyze your network and business in order to make any useful recommendation. Such complexity means it's not going to appear in news stories -- they'll stick with the simple soundbites instead.

By the way, this doesn't mean ETERNALBLUE was inconsequential in NotPetya's spread. Imagine an organization that is otherwise perfectly patched, except for that one out-dated test system that was unpatched -- which just so happened to have an admin logged in. It hops from the accounting desktop (with the autoupdate) to the test system via ETERNALBLUE, then from the test system to the domain controller via the admin credentials, and then to the rest of the domain. What this story demonstrates is not the importance of keeping 100% up-to-date on patches, because that's impossible: there will always be a system lurking somewhere unpatched. Instead, the lesson is the importance of not leaving admin credentials lying around.


So the lesson you need to learn from NotPetya is not to keep systems patched, but instead to deal with hostile autoupdates coming from deep within your network, and most importantly, to stop the spread of malware through trust relationships and loose admin credentials lying around.



text SMB version detection in masscan
Sun, 24 Jun 2018 23:39:00 +0000
My Internet-scale port scanner, masscan, supports "banner checking", grabbing basic information from a service after it connects to a port. It's less comprehensive than nmap's version and scripting checks, but it's better than just recording which ports are open.

I recently extended this banner checking to include SMB. It's a complicated protocol, so it requires a lot more work than just grabbing text banners like you see on FTP. Implementing this, I've found that nmap and smbclient often fail to get version information. They seem focused on getting the information from a standard location in SMBv1 packets, which gives a text string indicating the version. There's another place you can get it, from the NTLMSSP pluggable authentication chunks, which give version numbers in the form of major version, minor version, and build number. Sometimes the SMBv1 information is missing, either because newer Windows versions disable SMBv1 by default (supporting only SMBv2) or because they've disabled null/anonymous sessions. They still give NTLMSSP version info, though.


For example, running masscan in my local bar, I get the following result:

Banner on port 445/tcp on 10.1.10.200: [smb] SMBv1 time=2018-06-24 22:18:13 TZ=+240 domain=SHIPBARBO version=6.1.7601 ntlm-ver=15 domain=SHIPBARBO name=SHIPBARBO domain-dns=SHIPBARBO name-dns=SHIPBARBO os=Windows Embedded Standard 7601 Service Pack 1 ver=Windows Embedded Standard 6.1

The top version string comes from NTLMSSP, with 6.1.7601, which means Windows 6.1 (Win7) build number 7601. The bottom version string comes from the SMBv1 packets, which consists of strings.

The nmap and smbclient programs will get the SMBv1 part, but not the NTLMSSP part.

This seems to be a problem with Rapid7's "National Exposure Index" which tracks SMB exposure (amongst other things). It's missing about 300,000 machines that report NT_STATUS_ACCESS_DENIED from smbclient rather than the numeric version info from NTLMSSP authentication.

The smbclient information does have the information internally. For example, you could run the following command to put the debug level at '10' to grab it:

$ smbclient -U "" -N -L 10.1.10.95 -d10

You'll get something like the following output:


It appears to get the Windows 6.1 numbers, though for some reason it's missing the build number.

To run masscan to grab this, run:

# masscan --banners -p445 10.1.10.95 --hello smbv1

In the above example, I also used the "--hello smbv1" parameter, to grab both the SMBv1 and NTLMSSP version info. Otherwise, it'll default to SMBv2 if available, and only return:

Discovered open port 445/tcp on 10.1.10.95
Banner on port 445/tcp on 10.1.10.95: [smb] SMBv2 guid=6db701a0-a419-4be9-9084-6052b19a2e56 time=2018-06-24 22:37:42 domain=SHIPSERVER version=6.1.7601 ntlm-ver=15 domain=SHIPSERVER name=SHIPSERVER domain-dns=SHIPSERVER name-dns=SHIPSERVER

Note if you do a port 445 scan of the entire Internet, you'll get about 3,000,000 responses. You probably don't want to script running 3 million instances of masscan, but instead run it once for all those addresses. To do this, run:

# masscan --banners -p445 -iL ips.txt --rate 100000

This will load the IP addresses from the file "ips.txt". The format is one address, CIDR address, or range per line, with lines starting with # ignored as comments. It'll take about 10 seconds to read in a file containing 3 million addresses, so don't be worried if it seems to hang for a bit. It doesn't matter if you sort the file or not: masscan sorts the file itself internally, then randomizes the order when transmitting packets.

By default, masscan transmits at a rate of 100 packets per second, in order to avoid accidentally melting networks. You'll probably want a faster rate, such as 100,000 packets per second. Masscan will give a status line estimating completion time, but in this case, it'll be wildly inaccurate. The estimate is based upon getting no response from servers, which is the norm when doing massive scans. But in this case, all the servers will respond, which will cause masscan to send at least an ACK packet followed by at least one data packet. This will usually continue with two more data packets and some FINs and FIN-ACKs. All these extra packets fit within the "rate" specified, which means the effective rate at establishing new connections will be a lot lower than the estimate.

If you want to just scan the entire Internet for SMB on port 445, the command would be:

# masscan --banners -p445 0.0.0.0/0 --rate 100000

I love scanning the /0 subnet.

The version information can be in different locations in the output line, depending on the target. To extract it, you can use grep:

grep -Eo "version=[0-9\.]*" scan.txt

Or, to grab only the numbers portion:

grep -Eo "version=[0-9\.]*" scan.txt | cut -d= -f2
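If you want a quick histogram of which versions are out there, you can tally the numbers with sort and uniq. A sketch, assuming your results are in scan.txt as above:

grep -Eo "version=[0-9\.]*" scan.txt | cut -d= -f2 | sort | uniq -c | sort -rn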

To interpret the version numbers, this seems to be a good resource. I'm repeating the details here in case the link rots:

Operating System    Version Details           Version Number
Windows 10          Windows 10 (1803)         10.0.17134
                    Windows 10 (1709)         10.0.16299
                    Windows 10 (1703)         10.0.15063
                    Windows 10 (1607)         10.0.14393
                    Windows 10 (1511)         10.0.10586
                    Windows 10                10.0.10240
Windows 8           Windows 8.1 (Update 1)    6.3.9600
                    Windows 8.1               6.3.9200
                    Windows 8                 6.2.9200
Windows 7           Windows 7 SP1             6.1.7601
                    Windows 7                 6.1.7600
Windows Vista       Windows Vista SP2         6.0.6002
                    Windows Vista SP1         6.0.6001
                    Windows Vista             6.0.6000

The various Windows server versions overlap these as well.

You can get the latest version of masscan from GitHub. It doesn't have any dependencies to build it other than a compiler (gcc or clang). It does need libpcap installed to run. It also needs root privileges to run, like any other libpcap application, unless you setuid it. Lastly, since masscan has its own TCP/IP stack, you need to either use --source-ip [ip] to use a different IP address on your local subnet, or use --source-port [port] to use a source port you've otherwise firewalled to prevent the local stack from using it. Otherwise, the local stack will generate RST packets, preventing a connection from being established to grab the banner.

$ sudo apt-get install build-essential git
$ git clone https://github.com/robertdavidgraham/masscan
$ cd masscan
$ make
$ sudo iptables -A INPUT -p tcp --dport 60000 -j DROP
$ sudo bin/masscan --source-port 60000 -p445 --banners ....

By default, masscan waits 10 seconds for any responses to come back after a scan is complete. I add the parameter "--wait 40" to extend that to 40 seconds. Connections longer than 30 seconds are killed anyway due to timeout, so it's not really worth it to wait much longer than 30 seconds.

There's a lot of junk out there on port 445. Among the interesting stuff is that there are a lot of honeypots out there looking for scanners and worms. When you do a scan on this port, you'll get a lot of scans coming back at you for a couple days from such honeypots. One of the healthy things about using a spoofed source IP address is that you'll avoid the noise caused by these scans. Since I always spoof the source address in my scans (--source-ip [ip]), I'll also set --wait forever as a parameter, to keep masscan running even after it's transmitted all its packets. This keeps it responding to ARP requests from the local router, so that I can also run tcpdump to capture all the noise that happens after a scan, for a couple days. Otherwise, if no stack with that IP address exists, the router will drop the packets instead of forwarding them, so you can't capture them with tcpdump.

So the full command line might be:

# masscan --banners -p445 0.0.0.0/0 --rate 100000 --source-port 60000 --wait 40 > scan.txt

Conclusion

I've added SMB version checking natively to masscan. While simple in theory, this actually gets a bit complex, as described above. SMB is a nasty protocol, so a custom implementation like the one in masscan will get different results, for various reasons, than you might get with the Samba tool smbclient or nmap.



text Notes on "The President is Missing"
Sun, 17 Jun 2018 05:45:00 +0000
Former president Bill Clinton has contributed to a cyberthriller "The President is Missing", the plot of which is that the president stops a cybervirus from destroying the country. This is scary, because people in Washington D.C. are going to read this book, believe the hacking portrayed has some basis in reality, and base policy on it. This "news analysis" piece in the New York Times is a good example, coming up with policy recommendations based on fictional cliches rather than a reality of what hackers do.


The cybervirus in the book is some all-powerful thing, able to infect everything everywhere without being detected. This is fantasy, no more real than magic and faeries. Sure, magical faeries are a popular basis for fiction, but in this case, it's lazy fantasy, a cliche. In fiction, viruses are rarely portrayed as anything other than all-powerful.

But in the real world, viruses have important limitations. If you knew anything about computer viruses, rather than being impressed by what they can do, you'd be disappointed by what they can't.

Go look at your home router. See the blinky lights. The light flashes every time a packet of data goes across the network. Packets can't be sent without a light blinking. Likewise, viruses cannot spread themselves over a network, or communicate with each other, without somebody noticing -- especially a virus that's supposedly infected a billion devices as in the book.

The same is true of data on the disk. All the data is accounted for. It's rather easy for professionals to see when data (consisting of the virus) has been added. The difficulty of anti-virus software is not in detecting when something new has been added to a system, but automatically determining whether it's benign or malicious. When viruses are able to evade anti-virus detection, it's because they've been classified as non-hostile, not because they are invisible.

Such evasion only works when hackers have a focused target. As soon as a virus spreads too far, anti-virus companies will get a sample, classify it as malicious, and spread the "signatures" out to the world. That's what happened with Stuxnet, a focused attack on Iran's nuclear enrichment program that eventually spread too far and got detected. It's implausible that anything could spread to a billion systems without anti-virus companies getting a sample and correctly classifying it.

In the book, the president creates a team of the 30 brightest cybersecurity minds the country has, from government, the private sector, and even convicted hackers on parole from jail -- each more brilliant than the last. This is yet another lazy cliche about genius hackers.

The cliche comes from the fact that it's rather easy to impress muggles with magic tricks. As soon as somebody shows an ability to do something you don't know how to do, they become a cyber genius in your mind. The reality is that cybersecurity/hacking is no different than any other profession, no more dominated by "genius" than bridge engineering or heart surgery. It's a skill that takes both years of study as well as years of experience.

So whenever the president, ignorant of computers, puts together a team of 30 cyber geniuses, they aren't going to be people of competence. They are going to be people good at promoting themselves, taking credit for other people's work, or political engineering. They won't be technical experts, they'll be people like Rudi Giuliani or Richard Clarke, who have been tapped by presidents as cyber experts despite knowing less than nothing about computers.

A funny example of this is Marcus Hutchins. He's a virus researcher of typical skill and experience, but was catapulted to fame by finding the "kill switch" in the famous Wannacry virus. In truth, he got lucky, being merely the first to find a kill switch that would've soon been found by another researcher (it was pretty obvious). But the press set him up as one of the top 5 experts in the world. That's silly, because there is no such thing, just as there's no "top 5 neurosurgeons" or "top 5 bridge engineers". Hutchins is certainly skilled enough to merit a solid 6 figure salary, but such "top cyber geniuses" don't exist.

I mention Hutchins because months after the famed Wannacry incident, he was arrested in conjunction with an unrelated Russian banking virus. Assuming everything in his indictment is true, it still makes him only a minor figure with a few youthful indiscretions. It's likely this confusion between "fame" and "cyber genius" catapulted him into being a major person of interest in their investigations.

The book discusses the recent major cyberattacks in the news, like Mirai, Wannacry, and nPetya, but they are distorted misunderstandings of what happened. For example, it explains DDoS:
A DDoS attack is a distributed denial-of-service attack. A flood attack, essentially, on the network of servers that convert the addresses we type into our browsers into IP numbers that the internet routers use.
This is only partially right, but mainly wrong. DDoS is any sort of flood from multiple sources distributed around the Internet, against any target. It was only the Mirai attack, the most recent famous DDoS, that attacked the name servers that convert addresses to numbers.

The same sort of misconceptions are rife in Washington. Mirai, Wannacry, and nPetya spawned a slew of policy recommendations that get the technical details wrong. Politicians reading this Clinton thriller will just get more wrong.


In terms of fiction, the lazy cliches and superficial understanding of cybersecurity will be hard for people of intelligence to stomach. However, the danger I want to point out is that people in Washington D.C., the politicians who make policy, will read this book. Their understanding of how cyber works will come from such books. And it will be wrong.




text The First Lady's bad cyber advice
Thu, 31 May 2018 21:06:00 +0000
First Lady Melania Trump announced a guide to help children go online safely. It has problems.

Melania's guide is full of outdated, impractical, inappropriate, and redundant information. But that's allowed, because it relies upon moral authority: to be moral is to be secure, to be moral is to do what the government tells you. It matters less whether the advice is technically accurate, and more that you are supposed to do what authority tells you.

That's a problem, not just with her guide, but most cybersecurity advice in general. Our community gives out advice without putting much thought into it, because it doesn't need thought. You should do what we tell you, because being secure is your moral duty.

This post picks apart Melania's document. The purpose isn't to fine-tune her guide and make it better. Instead, the purpose is to demonstrate the idea of resting on moral authority instead of technical authority.


Strong Passwords



"Strong passwords" is the quintessential cybersecurity cliché that insecurity is due to some "weakness" (laziness, ignorance, greed, etc.) and the remedy is to be "strong".

The first flaw is that this advice is outdated. Ten years ago, important websites would frequently get hacked and have poor password protection (like MD5 hashing). Back then, strength mattered, to stop hackers from brute force guessing the hacked passwords. These days, important websites get hacked less often and protect the passwords better (like salted bcrypt). Moreover, the advice is now often redundant: websites, at least the important ones, enforce a certain level of password complexity, so that even without advice, you'll be forced to do the right thing most of the time.

This advice is outdated for a second reason: hackers have gotten a lot better at cracking passwords. Ten years ago, they focused on brute force, trying all possible combinations. Partly because passwords are now protected better, dramatically reducing the effectiveness of the brute force approach, hackers have had to focus on other techniques, such as the mutated dictionary and Markov chain attacks. Consequently, even though "Password123!" seems to meet the above criteria of a strong password, it'll fall quickly to a mutated dictionary attack. The simple recommendation of "strong passwords" is no longer sufficient.


The last part of the above advice is to avoid password reuse. This is good advice. However, this becomes impractical advice, especially when the user is trying to create "strong" complex passwords as described above. There's no way users/children can remember that many passwords. So they aren't going to follow that advice.

To make the advice work, you need to help users with this problem. To begin with, you need to tell them to write down all their passwords. This is something many people avoid, because they've been told to be "strong" and writing down passwords seems "weak". Indeed it is, if you write them down in an office environment and stick them on a note on the monitor or underneath the keyboard. But they are safe and strong if it's on paper stored in your home safe, or even in a home office drawer. I write my passwords on the margins in a book on my bookshelf -- even if you know that, it'll take you a long time to figure out which book when invading my home.

The other option to help avoid password reuse is to use a password manager. I don't recommend them to my own parents because that'd be just one more thing I'd have to help them with, but they are fairly easy to use. It means you need only one password for the password manager, which then manages random/complex passwords for all your web accounts.
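For a rough sense of what a password manager generates per site, here's a minimal sketch using Python's standard secrets module (the length and alphabet are arbitrary choices):

    import secrets
    import string

    # Generate the kind of random, per-site password a manager stores
    # for you -- nothing here needs to be memorable.
    ALPHABET = string.ascii_letters + string.digits + string.punctuation

    def random_password(length: int = 20) -> str:
        return "".join(secrets.choice(ALPHABET) for _ in range(length))

    print(random_password())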

So what we have here is outdated and redundant advice that overshadows good advice that is nonetheless incomplete and impractical. The advice is based on the moral authority of telling users to be "strong" rather than the practical advice that would help them.

No personal info unless website is secure

The guide teaches kids to recognize the difference between a secure/trustworthy and insecure website. This is laughably wrong.


HTTPS means the connection to the website is secure, not that the website is secure. These are different things. It means hackers are unlikely to be able to eavesdrop on the traffic as it's transmitted to the website. However, the website itself may be insecure (easily hacked), or worse, it may be a fraudulent website created by hackers to appear similar to a legitimate website.

This misconception about what HTTPS actually secures is perpetuated by guides like this one. It's also the source of the criticism aimed at LetsEncrypt, an initiative to give away free website certificates so that everyone can have HTTPS. Hackers now routinely use LetsEncrypt certificates on the fraudulent websites that host their viruses. Since people have been taught forever that HTTPS means a website is "secure", they trust these hacker websites.

But LetsEncrypt is a good thing, all connections should be secure. What's bad is not LetsEncrypt itself, but guides like this from the government that have for years been teaching people the wrong thing, that HTTPS means a website is secure.
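You can verify that a certificate only authenticates the connection with a small sketch using Python's standard ssl module; a perfectly valid certificate here tells you nothing about whether the site behind it is honest:

    import socket
    import ssl

    def peer_cert(hostname: str, port: int = 443):
        # A valid certificate proves you're talking to this hostname
        # over an encrypted channel -- not that the site is trustworthy.
        ctx = ssl.create_default_context()
        with socket.create_connection((hostname, port)) as sock:
            with ctx.wrap_socket(sock, server_hostname=hostname) as tls:
                return tls.getpeercert()

    cert = peer_cert("example.com")
    print(cert["subject"], cert["issuer"])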

Backups

Of course, no guide would be complete without telling people to back up their stuff.


This is especially important given the growing ransomware threat. Ransomware is a type of virus/malware that encrypts your files and then demands money for the key to decrypt them. Half the time, the files end up destroyed anyway.

But this again is moral authority, telling people what to do instead of educating them how to do it. Most will ignore the advice because they don't know how to effectively back up their stuff.

For most users, it's easy to go to the store and buy a 256-gigabyte USB drive for $40 (as of May 2018), then use the "Time Machine" feature in macOS, or on Windows the "File History" or "Backup and Restore" features. These can be configured to do the backup automatically on a regular basis so that you don't have to worry about it.
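As a crude illustration of the idea these tools automate, here's a minimal sketch that copies a folder to a dated directory on an external drive; the source and destination paths are placeholders you'd change:

    import datetime
    import pathlib
    import shutil

    # Copy a folder to an external drive under today's date -- a crude,
    # manual version of what Time Machine / File History automate.
    src = pathlib.Path.home() / "Documents"           # what to back up
    drive = pathlib.Path("/Volumes/BackupDrive")      # placeholder path
    dest = drive / datetime.date.today().isoformat() / "Documents"

    shutil.copytree(src, dest)
    print("backed up to", dest)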

But such "local" backups are still problematic. If the drive is left plugged into the machine, ransomeware can attack the backup. If there's a fire, any backup in your home will be destroyed along with the computer.

I recommend cloud backup instead. There are many good providers, like Dropbox, Backblaze, Microsoft, Apple's iCloud, and so on. These are especially critical for phones: if your iPhone is destroyed or stolen, you can simply walk into an Apple store and buy a new one, with everything restored as it was from iCloud.

But all of this is missing the key problem: your photos. You carry a camera with you all the time now and take a lot of high-resolution photos. These quickly exceed the capacity of most free backup tiers. You can configure these services, such as your phone's iCloud backup, to exclude photos, but that means you are prone to losing your photos/memories. For example, Dropbox's free tier is great, but if I want to preserve photos on it, I have to pay for the more expensive service.

One of the key messages kids should learn about photos is that they will likely lose most, if not all, of the photos they've taken within 5 years. The exceptions will be the few photos they've posted to social media, which sorta serves as a cloud backup for them. If they want to preserve the rest of those memories, kids need to get serious about finding backup solutions. I'm not sure of the best solution, but I buy big USB flash drives and send them to my niece, asking her to copy all her photos onto them, so that at least I can put them in a safe.

One surprisingly good solution is Microsoft Office 365. For $99 a year, you get a copy of their Office software (which I use), and it also comes with a full 1 terabyte of cloud storage, which is likely big enough for your photos. Apple charges around the same amount for 1 terabyte of iCloud, though that doesn't come with a free license for Microsoft Office :-).

WiFi encryption

Your home WiFi should be encrypted, of course.


I have to point out the language, though. Turning on WPA2 WiFi encryption does not "secure your network". Instead, it just secures the radio signals from being eavesdropped. Your network may have other vulnerabilities, where encryption won't help, such as when your router has remote administration turned on with a default or backdoor password enabled.

I'm being a bit pedantic here, but it's not my argument. It's the FTC's argument when they sued vendors like D-Link for making exactly the same sort of recommendation. The FTC claimed it was deceptive business practice because recommending users do things like this still didn't mean the device was "secure". Since the FTC is partly responsible for writing Melania's document, I find this a bit ironic.

In any event, WPA2 Personal has problems where it can be hacked, such as if WPS is enabled, or via evil-twin access points broadcasting stronger (or more directional) signals. It's thus insufficient security. To be fully secure against WiFi eavesdropping you need WPA2 Enterprise, which isn't something most users can set up.

Also, WPA2 is largely redundant advice. If you wardrive your local neighborhood, you'll find that almost everyone has WPA enabled already anyway. Guides like this probably don't need to advise what everyone's already doing, especially when the advice is still incomplete.

Change your router password


Yes, leaving the default password on your router is a problem, as shown by recent Mirai-style attacks, such as the very recent ones where Russia infected 500,000 routers in its cyberwar against Ukraine. But those were only a problem because the routers also had remote administration enabled. It's remote administration you need to make sure is disabled on your router, regardless of whether you change the default password (as there are other vulnerabilities besides passwords). If remote administration is disabled, it's very rare that anyone will attack your router using the default password.

Thus, the guide ignores the important thing (remote administration) and instead focuses on the less important thing (changing the default password).
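If you want a rough check that remote administration really is off, one approach is to see whether common admin ports answer on your router's public address. A minimal sketch follows; the address is a placeholder, the port list is just my guess at common ones, and you'd want to test from outside your own network (e.g. a phone hotspot), since NAT can confuse the result:

    import socket

    ROUTER_WAN_IP = "203.0.113.1"  # placeholder: replace with your public IP

    # Common web-admin and ISP-management ports (7547 is TR-069,
    # abused by Mirai-style attacks). "closed/filtered" everywhere is
    # the result you want to see.
    for port in (80, 443, 8080, 8443, 7547):
        with socket.socket() as s:
            s.settimeout(2)
            try:
                reachable = s.connect_ex((ROUTER_WAN_IP, port)) == 0
            except OSError:
                reachable = False
            print(port, "OPEN" if reachable else "closed/filtered")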

In addition, this advice again makes the impractical recommendation of choosing a complex (strong) password. Users who do this usually forget it by the time they next need it. The practical advice is to have users write down the password they choose and put it either someplace they won't forget (like with the rest of their passwords) or on a sticky note under the router.

Update router firmware

Like any device on the network, your router should be kept up-to-date with the latest patches. But you aren't going to do that, because it's not practical. While your laptop/desktop and phone nag you about updates, your router won't. Whereas phones/computers update once a month, your router vendor will update the firmware perhaps once a year -- and after a few years, stop releasing updates altogether.

Routers are just one of many IoT devices we are going to have to come to terms with keeping patched. I don't know the right answer. I check my parents' stuff every Thanksgiving, so maybe that's a good strategy: patch your stuff at the end of every year. Maybe some cultural norms will develop, but simply telling people to be strong about their IoT firmware patches isn't practical in the near term.

Don't click on stuff

This is probably the most common cybersecurity advice given by infosec professionals. It is wrong.


Emails/messages are designed for you to click on things. You regularly get emails/messages from legitimate sources that demand you click on things. It's so common from legitimate sources that there's no practical way for users to distinguish between them and bad sources. As that Google Docs bug showed, even experts can't always tell the difference.

I mean, it's true that phishing attacks coming through emails/messages try to trick you into clicking on things, and you should be suspicious. However, it doesn't follow that "don't click on stuff" is a practical strategy. It's like diet advice recommending you stop eating food altogether.

Sex predators, oh my!

Of course, it's kids going online, so naturally you're going to have warnings about sexual predators:


But online predators are rare. The predator threat to children is overwhelmingly from relatives and acquaintances, a much smaller threat from strangers, and a vanishingly tiny threat from online predators. Recommendations like this stem from our fears of the unknown technology rather than a rational measurement of the threat.

Sexting, oh my!

So here is one piece of advice that I can agree with: don't sext:


But the reason this is bad is not because it's immoral or wrong, but because adults have gone crazy and made it illegal for children to take nude photographs of themselves. As this article points out, your child is more likely to get in trouble and get placed on the sex offender registry (for life) than to get molested by a person on that registry.

Thus, we need to warn kids not about some immoral activity, but about the adults who've freaked out over it. Yes, sending pictures to your friends/love interest will also often get you in trouble, as those images frequently get passed around school, but such temporary embarrassments pass. Getting put on a sex offender registry harms you for life.

Texting while driving

Finally, I want to point out this error:


The evidence is to the contrary: it's not actually that dangerous -- it's just assumed to be dangerous. Texting rarely distracts drivers from what's happening on the road. It instead replaces other inattention, such as daydreaming, fiddling with the radio, or checking yourself in the mirror. Risk compensation also happens: when people text while driving, they slow down and leave more space between themselves and the car in front.

Studies have shown this. For example, one study measured accident rates at 6:59pm vs. 7:01pm and found no difference. That's the moment "free evening texting" plans came into effect, so if texting caused accidents we should've seen a bump. The researchers even tried to isolate the effect, such as by looking at people whose phones were switching cell towers (proving they were in motion).

Yes, texting while driving is illegal, but that's because people are fed up with the jerk in front of them not noticing the light is green. It's not illegal because it's particularly dangerous, in the sense of having a measurable impact on accident rates.

Conclusion

The point of this post is not to refine the advice and make it better. Instead, I attempt to demonstrate how such advice rests on moral authority, because it's the government telling you so. It's because cybersecurity and safety are higher moral duties. Much of it is outdated, impractical, inappropriate, and redundant.

We need to move away from this sort of advice. Instead of moral authority, we need technical authority. We need to focus on the threats people actually face and, instead of commanding them what to do, help them be secure -- not shame them for their insecurity. It's like Strunk and White's "The Elements of Style": the authors don't take the moral-authority approach and simply dictate how to write; they try to help people write well.


text The devil wears Pravda
Wed, 23 May 2018 22:37:00 +0000
Classic Bond villain, Elon Musk, has a new plan to create a website dedicated to measuring the credibility and adherence to "core truth" of journalists. He is, without any sense of irony, going to call this "Pravda". This is not simply wrong but evil.


Musk has a point. Journalists do suck, and many suck consistently. I see this in my own industry, cybersecurity, and I frequently criticize them for their suckage.

But what he's doing here is not correcting them when they make mistakes (or what Musk sees as mistakes), but questioning their legitimacy. This legitimacy isn't measured by whether they follow established journalism ethics, but whether their "core truths" agree with Musk's "core truths".

An example of the problem is how the press fixates on Tesla car crashes involving its "autopilot" feature. Pretty much every autopilot crash makes national headlines, while the press ignores the roughly 40,000 deaths from ordinary car crashes in the United States each year. Musk spies on Tesla drivers (hello, classic Bond villain, everyone), so he can see the dip in autopilot usage every time such a news story breaks. He's got good reason to be concerned about this.

He argues that autopilot is safer than humans driving, and he's got the statistics and government studies to back this up. Therefore, the press's fixation on Tesla crashes is illegitimate "fake news", titillating the audience with distorted truth.

But here's the thing: that's still only Musk's version of the truth. Yes, on a mile-for-mile basis, autopilot is safer, but there's nuance here. Autopilot is used primarily on freeways, which already have a low per-mile accident rate. People engage autopilot only when conditions are already incredibly safe and they're unlikely to have an accident anyway. Musk is therefore being intentionally deceptive, comparing apples to oranges. Autopilot may still be safer; it's just that the numbers Musk uses don't demonstrate it.
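To make the apples-to-oranges problem concrete, here's a toy calculation using entirely made-up numbers (not real accident statistics), showing how a feature used only on the safest roads looks far safer than it is:

    # All numbers below are invented for illustration only.
    # Hypothetical human accident rates, per million miles:
    freeway_rate = 0.5
    city_rate = 2.0

    # Humans split their miles between road types; autopilot miles
    # are almost entirely freeway miles.
    human_overall = 0.5 * freeway_rate + 0.5 * city_rate   # 1.25
    autopilot_freeway = 0.45    # a bit better than humans on freeways

    # Naive comparison makes autopilot look ~3x safer...
    print(human_overall / autopilot_freeway)                # ~2.8
    # ...but the honest, freeway-vs-freeway comparison:
    print(freeway_rate / autopilot_freeway)                 # ~1.1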

And then there's the question of calling it "autopilot" to begin with, because it isn't one. The public is overrating the capabilities of the feature. It's little different from the "lane keeping" and "adaptive cruise control" you can now find in other cars. In many ways the technology is behind -- my Tesla doesn't beep at me when a pedestrian walks behind the car while I'm backing up, but virtually every new car on the market does.

Yes, the press unduly covers Tesla autopilot crashes, but Musk has only himself to blame, having exaggerated his car's capabilities by calling the feature "autopilot".

What's "core truth" is thus rather difficult to obtain. What the press satisfies itself with instead is smaller truths, what they can document. The facts are in such cases that the accident happened, and they try to get Tesla or Musk to comment on it.

What you can criticize a journalist for is therefore not "core truth" but whether they did the journalism correctly. When stories criticize "autopilot" but don't do due diligence in getting Tesla's side of the story, that's a violation of journalistic practice. When I criticize journalists for their poor handling of stories in my industry, I try to focus on which journalistic principles they got wrong. For example, NYTimes reporters write a lot of stories quoting anonymous government sources in clear violation of journalistic principles.

If "credibility" is the concern, then it's the classic Bond villain here that's the problem: Musk himself. His track record on business statements is abysmal. For example, when he announced the Model 3 he claimed production targets that every Wall Street analyst claimed were absurd. He didn't make those targets, he didn't come close. Model 3 production is still lagging behind Musk's twice adjusted targets.

https://www.bloomberg.com/graphics/2018-tesla-tracker/

So who has a credibility gap here, the press, or Musk himself?

Not only is Musk's credibility problem ironic, so is the name he chose: "Pravda", the Russian word for truth and the name of the Soviet Communist Party's official newspaper. This is so absurd it has to be a joke, yet Musk claims to be serious about all this.

Yes, the press has a lot of problems, and if Musk were some journalism professor concerned about journalists meeting the objective standards of their industry (e.g. abusing anonymous sources), then this would be a fine thing. But it's not. It's Musk who is upset the press's version of "core truth" does not agree with his version -- a version that he's proven time and time again differs from "real truth".

Just in case Musk is serious, I've already registered "www.antipravda.com" to start measuring the credibility of statements by billionaire playboy CEOs. Let's see who blinks first.



I stole the title, with permission, from this tweet: