
Intrusion Detection FAQ: What is Scanrand?

Introduction
Created by Dan "Effugas" Kaminsky of Doxpara Research, Paketto Keiretsu is a suite of five tools that each perform a specialized task for obtaining or inserting data on a network.
  • Scanrand    Fast network scanner.
  • Paratrace   'Parasitic' traceroute program.
  • Minewt   Software router.
  • Lc   Layer 2 data insertion tool.
  • Phentropy   3-D point plotter.
This paper concentrates mainly on scanrand, since it is the tool in Paketto Keiretsu that focuses on network information gathering.

Obtaining and Installing Paketto Keiretsu
The suite of tools can be obtained from:
http://www.doxpara.com/paketto/paketto-1.10.tar.gz

Paketto Keiretsu relies on the libnet library for sending packets, and libpcap for receiving packets. These libraries deal directly with the datalink layer, bypassing the kernel network stack. In addition to these two libraries, scanrand relies on libtomcrypt for encryption algorithms. All of these libraries are included in and statically linked to the Paketto distribution, but are referenced here for completeness. The number in parentheses below is the version that Paketto Keiretsu 1.0 - 1.10 uses.

Libtomcrypt: http://libtomcrypt.iahu.ca/ (0.66)
Libnet: http://www.packetfactory.net/libnet/ (1.0.2a)
Libpcap: http://www.tcpdump.org/release/ (0.7.1)

Once you have obtained and extracted the paketto-1.10.tar.gz file, the typical
$ ./configure
$ make
# make install
will compile and install the suite.

Scanrand

Overview
Scanrand is a fast network scanner that can efficiently scan anything from a single host to a very large network. However, several network mapping utilities make the same claim. So why is scanrand any different? Scanrand can do what is called stateless TCP scanning, which sets it apart from the other network scanners.

TCP - a stateful transport protocol
The TCP/IP stack of each operating system is responsible for keeping track of many variables for each 'connection' to or from the local machine. One of these variables is the state that a particular TCP connection is in.

A typical connection starts off with a client sending a TCP SYN packet to a server or remote machine. When the SYN is sent from the client, the operating system sets the state for this connection to SYN_SENT. When in the SYN_SENT state, the client operating system is expecting a SYN/ACK in return. Note that both the client operating system and the program are waiting on a reply. If the server does not respond within a particular time, the client will retransmit its SYN, since it assumes the packet did not make it to the server. If the server is listening on the port the client sent to, the server receives the SYN and replies with a SYN/ACK. The server then enters the SYN_RCVD state. If the client were a scanner, it would more than likely have what it wanted: a response from the server indicating that something was listening on the port it sent its SYN to.

Now that it has the information it wants, it doesn't care to talk to the server anymore. However, since TCP is a reliable protocol, if the server does not get a response from the scanner saying it received its SYN/ACK, the server will retransmit. To keep this from happening, the scanner sends a RST packet which tells the TCP connection to terminate regardless of state. You can imagine the quantity of SYN/ACK retransmits a scanner would get if it scanned thousands of machines and did not RST the connections. Another problem the scanner has to worry about is authentication. How does it know the SYN/ACK it received came from the machine it sent the SYN to? Sure the IP address is from the same machine, but what's to say it's not spoofed from a person that notices the scanning and wants to corrupt the gathered data?

TCP relies on sequence numbers to ensure the machine it is talking to is indeed the machine it originally contacted. The initial sequence numbers are 32-bit randomly generated integers that are exchanged in the initial TCP connection by both machines involved in the connection. Refer to figure 1-1.



A basic network scanner sends a SYN packet to the remote server. It then either stores the connection variables or waits for a reply. When it gets the reply, it makes sure the acknowledgment number is 1 plus the initial sequence number it sent in the SYN. If this is the case, it is a valid response (assuming it came from the IP address it sent the SYN to). The assumption is that an attacker who wanted to insert data into your TCP stream could not synchronize sequence numbers with you because of the randomness. Doing so is actually possible, but often very difficult, depending on how random the algorithm for selecting initial sequence numbers is and on the type of firewall the host is behind.

The point is that a typical scanner either has to store the connection and move on to the next host, or wait for the response, then move on to the next host. Both of these can be costly, either in resources or time. Scanrand implements neither of these methods and takes a more... mathematical approach.
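
To make the cost of the "store the connection" option concrete, here is a minimal Python sketch of the bookkeeping such a scanner needs. It is an illustration only: the class and method names are hypothetical, no packets are actually sent, and it does not reproduce the internals of any real scanner.

```python
import secrets


class StatefulScanTable:
    """Per-probe state a traditional SYN scanner has to keep (illustrative)."""

    def __init__(self):
        # (dst_ip, dst_port, src_port) -> initial sequence number we sent
        self.pending = {}

    def record_probe(self, dst_ip, dst_port, src_port):
        """Remember the random 32-bit ISN chosen for an outgoing SYN."""
        isn = secrets.randbits(32)
        self.pending[(dst_ip, dst_port, src_port)] = isn
        return isn

    def validate_reply(self, src_ip, src_port, dst_port, ack):
        """Accept a SYN/ACK only if it acknowledges a probe we stored."""
        # The reply comes from the host we probed, so its source
        # address and port are our original destination.
        key = (src_ip, src_port, dst_port)
        isn = self.pending.get(key)
        if isn is None:
            return False                      # we never probed this endpoint
        if ack != (isn + 1) % 2**32:          # sequence numbers wrap at 2^32
            return False
        del self.pending[key]                 # probe answered, free the entry
        return True
```

Every outstanding probe occupies an entry in `pending` until it is answered or timed out, which is exactly the resource cost described above.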

Scanning scanrand style
As mentioned earlier, scanrand takes a slightly different approach than the typical network scanner. It implements more of a 'fire and forget' ideology for scanning, mixed with a little math.

On startup, scanrand breaks itself up into two processes. One process is responsible for doing nothing but sending out SYN packets using libnet. The other process is responsible for receiving the responses the remote computers send back, using libpcap. One important thing to note here is that these processes work independently. There is no consulting with the other process on, "Did you send this packet out?" or, "Is this a valid packet?" Scanrand stores no list of IP addresses it is expecting a response from, and the sending process does not wait for a response at all. It fires off a SYN and then moves on to the next computer, leaving the receiving process to sort out the flood of responses to come.

So how does the receiving process do this? It doesn't know the initial sequence number of any of the packets that were sent. It only knows it got a response packet (a SYN/ACK or an ICMP error message). It could be from anyone. It could be a SYN/ACK from your email client checking your email every five minutes. It could be Windows sending the names of the DVDs you just played to their gigantic database without your knowledge.

The key is what the sending process of scanrand uses as initial sequence numbers, which are not exactly randomly generated. Scanrand takes the source address and port along with the destination address and port, concatenates them with a secret key, and runs the result through the SHA-1 one-way hash algorithm. The hash, truncated to 32 bits, becomes the initial sequence number for the outgoing packet. Scanrand calls these 'Inverse SYN cookies.' The reason we use a one-way hash function is that the likelihood of collision is low. This means that given different input data to the hash function, the probability of the hash function generating the same result is very, very low.


Table 1-1: One-way hash algorithm [4]

In table 1-1, input1 is the source, source port, destination, and destination port along with a key. Therefore, if we get a response, we subtract 1 from the acknowledgment number in the reply packet (since this should be one greater than the initial sequence number we sent) to recover the sequence number of the original probe. We then hash the source, source port, destination, and destination port (which we also pull from the reply packet) along with our key. If the truncated hash matches the recovered sequence number, then this is a valid response to our scan.

Using this method, scanrand doesn't have to wait for hosts to respond before moving on to the next host. It can fire off as many SYN packets as the machine or bandwidth can handle. It also doesn't need to keep track of any sequence numbers of any kind. It just fires and forgets.
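
The idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not scanrand's actual code: the function names, the byte layout of the hashed fields, and the way the key is mixed in are all hypothetical, and only the overall scheme (SHA-1 over the addresses, ports, and a key, truncated to 32 bits) comes from the description above.

```python
import hashlib
import socket
import struct


def inverse_syn_cookie(key: bytes, src_ip: str, src_port: int,
                       dst_ip: str, dst_port: int) -> int:
    """Derive a 32-bit ISN from the connection 4-tuple and a secret key."""
    # Assumed layout: key, then each address and port in network byte order.
    material = (key
                + socket.inet_aton(src_ip) + struct.pack("!H", src_port)
                + socket.inet_aton(dst_ip) + struct.pack("!H", dst_port))
    digest = hashlib.sha1(material).digest()
    return struct.unpack("!I", digest[:4])[0]   # truncate SHA-1 to 32 bits


def is_valid_reply(key: bytes, reply_src_ip: str, reply_src_port: int,
                   reply_dst_ip: str, reply_dst_port: int, ack: int) -> bool:
    """Receiver side: check a reply using only the packet and the key."""
    # The reply travels in the opposite direction, so swap the endpoints
    # back to recompute the cookie the sender would have used.
    expected_isn = inverse_syn_cookie(key, reply_dst_ip, reply_dst_port,
                                      reply_src_ip, reply_src_port)
    return (ack - 1) % 2**32 == expected_isn
```

The sender calls `inverse_syn_cookie` for each probe and forgets it immediately; the receiver needs nothing but the shared key and the fields of the reply itself.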

Traditional scanning across hosts is generally serialized, whereas the scanrand method scans multiple hosts at a time. The traditional scanner will connect to host A, wait for the response, and then move to host B. The problem with this method is that the scanner has to wait for the host to respond. Depending on how long the scanner is set to wait, scanning large networks can be time-consuming. Often there is no machine listening, or the machine cannot be reached, and the scanner waits out the full timeout for a response. This leads to long scanning times on large networks. Scanrand's method maximizes the time spent sending probes and minimizes the time spent waiting.

Traditional scanning with nmap
The Network Mapper (nmap) is a feature-rich scanning tool that has become very valuable and well known in the security industry. It is available from http://www.insecure.org/nmap and can be run under Windows or Unix-based operating systems.

I have been a big fan of nmap for a while. Many a bad guy I have scanned with this feature-rich tool. I figured nmap could parallelize scanning across hosts, keep track of connections, and see when a valid response was received. That is not exactly the way scanrand does it, but it would give a similar effect. Poring over the nmap man page, I found some features, listed in table 1-2, that could possibly give scanrand a run for its money. We will see.


Table 1-2: Interesting nmap options.

The closest option I see for the effect we are looking for (parallel scanning across hosts) is the -M option. The manual page says this is only used in TCP connect() scans, which is not exactly what we want to use, but if we want to do parallel scanning, we'll have to settle.

We'll put these options to good use to see what nmap can do to represent the traditional scanners in scanning large networks when put up against the new kid on the block, scanrand.



The Showdown... nmap
We'll time a scan of a roughly 16,000-host network with both tools to see how each performs on large networks. Though I don't control a network with thousands upon thousands of hosts, I do have an internal network made up of a mere four computers. Figure 1-2 outlines the network on which the scanning will occur. I will be scanning from 192.168.64.2, as it is the machine with the most power.

The fact that thousands of hosts don't exist on my network does not keep me from scanning thousands of 'hosts'. With a little iptables magic, we can simulate thousands of websites running on the 192.168.0.0/18 network if we wish. We accomplish this by adding a rule to the firewall similar to the one below for each website we wish to simulate, where the destination address in the rule is the IP we want to act as a webserver.



This rule will NAT all traffic from the scanning machine to the webserver in the DMZ [3]. This gives us the effect that any 192.168.0.0/18 addresses could be running a webserver. I've created a little perl script to generate rules for the firewall to allow the scanner to connect to port 80 on 192.168.74.2 through this method.



This particular run of our Perl script generated 72 possible webservers on the 192.168.0.0/18 network. Now that I know how many webservers are on our site, I can accurately judge the data loss in scanning. Traffic to any IP address other than the ones generated by our Perl script will be silently dropped.
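
For readers who want to build a similar test bed, a rule generator along these lines can be sketched in Python. Everything specific here is an assumption rather than a reproduction of the original script: the function name, the random choice of addresses, and the exact iptables DNAT rule layout are illustrative, and only the addresses (scanner 192.168.64.2, webserver 192.168.74.2, target block 192.168.0.0/18) and the count of 72 come from the text.

```python
import ipaddress
import random


def generate_dnat_rules(network="192.168.0.0/18", scanner="192.168.64.2",
                        webserver="192.168.74.2", count=72, seed=None):
    """Return (addresses, rules) for `count` simulated webservers."""
    rng = random.Random(seed)
    net = ipaddress.ip_network(network)
    # Pick distinct offsets within the block, then turn them into addresses.
    offsets = rng.sample(range(net.num_addresses), count)
    addresses = [str(net.network_address + off) for off in sorted(offsets)]
    # Assumed rule form: redirect port 80 traffic from the scanner to the
    # one real webserver, so each chosen address appears to host a site.
    rules = [
        f"iptables -t nat -A PREROUTING -s {scanner} -d {addr} "
        f"-p tcp --dport 80 -j DNAT --to-destination {webserver}:80"
        for addr in addresses
    ]
    return addresses, rules


if __name__ == "__main__":
    _, rules = generate_dnat_rules(seed=1)
    print("\n".join(rules))
```

Keeping the list of generated addresses is what makes the later accuracy comparison possible: the scanner's findings can be checked against a known set of 72 targets.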

Now that the firewall and network are in place, let the games begin! The venerable nmap will be the first to bombard my poor network.



Clearly, nmap is much faster when there is a host to respond to its probe, as seen in the output above for 192.168.1.138. The tcpdump output below gives us further insight into nmap's scanning method.



As shown, nmap sends up to six SYN packets to the target before it gives up. These packets are sent at intervals that depend on the -T value you use, or on the --max_rtt_timeout option if you set it manually. This can be very costly (~1.7 secs/machine with -T 5) when scanning large networks where big chunks of IP addresses don't have a machine listening.

Let's do a little tweaking and now see if we can speed this up just a little. Since -P0 makes nmap scan without first checking whether the host is up, we need to remove it, as we only want to scan the hosts that appear to be up. -P0 is very helpful if you know a host is up but it is blocking nmap's attempts to determine that.



That's more like it. 16,384 hosts in 194 seconds is not bad at all, and works out to about 84.5 hosts/second on a fast network. The speed comes at the cost of accuracy, however: on the local network, nmap with -T 5 missed 2 of the possible servers. Backing down on -T a little found all of the servers on our local network. The time would, of course, increase as we decreased the -T value, while the accuracy increased. I would not recommend trying -T 5 on anything but a local network. It will be very unreliable over Internet connections, as it does not wait very long (0.3 seconds) for hosts to respond. This is a weakness of traditional scanning, where hosts are scanned in sequence, one by one. Next, we'll see how scanrand fares in the scanning competition.

The Showdown... scanrand
Before we begin scanning with scanrand, let's take a look at the output we would expect from scanrand since there is not much documentation on it. I scanned a fairly small network (with permission of course) of a friend's site to get some output with scanrand as shown below.



The first column is the status of the machine probed. The common statuses can be found in Table 1-3.


Table 1-3: Scanrand statuses

The second column is the machine we received a response from, whether it be an ICMP error or a SYN/ACK.

The third column, in brackets, is the estimated hop count to the target machine, based on the time-to-live (TTL) of the IP packet we received as a response. The estimation algorithm, which is essentially listed below [1], takes advantage of the fact that most operating systems use a multiple of 32 as their initial TTL value. There are some other calculations performed in specific situations (particularly with resets).
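
A rough Python sketch of this kind of estimate follows. It is not scanrand's own code: the function name is hypothetical, and the rule of rounding the observed TTL up to the next multiple of 32 (with 255 standing in for 256) is an assumption that simply follows from the multiple-of-32 observation above. The special handling of resets is not modeled.

```python
def estimate_hops(observed_ttl: int) -> int:
    """Guess how many routers a packet crossed from its remaining TTL."""
    if not 1 <= observed_ttl <= 255:
        raise ValueError("TTL must be between 1 and 255")
    # Assume the sender started at the next multiple of 32 at or above the
    # observed value; 256 does not fit in the 8-bit field, so cap at 255.
    initial_ttl = min(-(-observed_ttl // 32) * 32, 255)
    return initial_ttl - observed_ttl


# A reply arriving with TTL 57 most likely started at 64: about 7 hops away.
print(estimate_hops(57))
```

The guess breaks down when a path is longer than 32 hops or when a host uses an unusual initial TTL, which is why the figure is only an estimate.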



The fourth column is the number of seconds elapsed since the beginning of the scan until the host responded. Note, this is not the time elapsed from when the SYN packet was sent out by the scanning host.

The fifth column is only shown when we get a valid ICMP message as a response. RFC 792 [2], which defines the ICMP protocol, states that ICMP unreachable messages, as well as time-exceeded messages (scanrand looks for either), must include the IP header plus the first 8 bytes of data of the original datagram. Scanrand displays this data in column 5.
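
Because the first 8 bytes of a TCP header hold the source port, the destination port, and the sequence number, the quoted datagram in an ICMP error carries enough to tie the error back to a specific probe. The Python sketch below shows how those fields could be pulled out; the function name is hypothetical, and whether scanrand parses the quoted data in this way is an assumption.

```python
import socket
import struct


def parse_quoted_tcp(icmp_payload: bytes):
    """Extract the original probe's endpoints and sequence number.

    `icmp_payload` is everything after the 8-byte ICMP header: the quoted
    IP header followed by at least the first 8 bytes of the TCP header.
    """
    if len(icmp_payload) < 28:                 # 20-byte IP header + 8 bytes
        raise ValueError("quoted datagram is truncated")
    ihl = (icmp_payload[0] & 0x0F) * 4         # IP header length in bytes
    if ihl < 20 or len(icmp_payload) < ihl + 8:
        raise ValueError("quoted datagram is malformed")
    src_ip = socket.inet_ntoa(icmp_payload[12:16])
    dst_ip = socket.inet_ntoa(icmp_payload[16:20])
    # First 8 bytes of TCP: source port, destination port, sequence number.
    src_port, dst_port, seq = struct.unpack("!HHI", icmp_payload[ihl:ihl + 8])
    return src_ip, src_port, dst_ip, dst_port, seq
```

A receiver that recomputes the expected initial sequence number from the quoted addresses and ports can compare it with the quoted `seq` and discard errors that do not match, which is one plausible way to reject spoofed ICMP messages.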

Now that we can read the output of scanrand, let's put it to work.



The -c option tells scanrand to verify that ICMP responses are not spoofed. The -b0 option tells scanrand to send packets as fast as possible (the default). We could limit the rate of packet output by specifying something like -b1M if we were on a 1 meg link. The -t 3000 option is important. This is how long scanrand will wait for a response (any response) before exiting. I set this high since I knew I was not going to get many responses and I wanted my scan to finish. Scanrand doesn't seem to take standard CIDR network masks like nmap does, which is a little annoying. Networks are specified using ranges and commas. Our scan above is going to cover the 192.168.0.0/18 network. These are the most useful options of scanrand, though more options, as well as some experimental ones, can be found by running scanrand with no options.

Scanrand scanned blazingly fast. A few seconds after issuing the command, my mouse stopped working, and my computer became unresponsive. Another few seconds and it came back alive. A 'quick' top showed scanrand eating all available CPU time as it blasted as many packets as it could towards its several thousand destinations. Even if my 'weak' computer was not enough for the greedy scanrand, it managed to shoot off an impressive number of packets. Too bad my network could not handle them. The first time, with -b0 enabled, scanrand reported only 30 of the 72 webservers. A review of the tcpdump file on the scanning box showed only that many response packets coming back to the scanner, so the rest were being dropped on the network.

A little tweaking around and I found the needed -b value.



After only 32.241 seconds using -b85M, scanrand had probed all 16,384 addresses and found all 72 webservers, and it did so consistently (although the time sometimes reached as high as 68 seconds, which is more than likely a firewall issue). That averages to roughly 508 hosts/sec. Pretty impressive.
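
The figures quoted for both tools can be checked with a few lines of arithmetic. The short Python sketch below uses only numbers reported in this paper (16,384 addresses, 194 seconds for nmap, 32.241 seconds for scanrand, and roughly 1.7 seconds per silent host for nmap with -P0); the variable and function names are illustrative.

```python
HOSTS = 2 ** (32 - 18)            # a /18 holds 16,384 addresses


def hosts_per_second(hosts: int, seconds: float) -> float:
    """Average scan rate over a whole run."""
    return hosts / seconds


nmap_rate = hosts_per_second(HOSTS, 194)          # nmap, -T 5, without -P0
scanrand_rate = hosts_per_second(HOSTS, 32.241)   # scanrand with -b85M
# If every address were silent and nmap spent ~1.7 s on each one:
worst_case_hours = HOSTS * 1.7 / 3600

print(f"nmap:     {nmap_rate:.1f} hosts/s")
print(f"scanrand: {scanrand_rate:.1f} hosts/s")
print(f"speedup:  {scanrand_rate / nmap_rate:.1f}x")
print(f"nmap -P0 worst case: {worst_case_hours:.1f} hours")
```

On these numbers scanrand is roughly six times faster than the tuned nmap run, and the untuned -P0 scan of a mostly empty /18 would take on the order of seven to eight hours.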

And the winner is...
For scanning an internal network for rogue servers, such as an unauthorized webserver, scanrand has the upper hand in accuracy and speed. On the other hand, scanrand lacks the maturity and features that make nmap such a great tool.

For bypassing stateless router ACLs to look for servers, and for the various other tricks that are easy with nmap, scanrand just can't compare, though it's not supposed to. I can't say for sure what Dan has in mind for scanrand as far as features, but I doubt it was ever intended to be a replacement for nmap.

Scanrand and nmap together make an excellent combination: scanrand can be used to find servers quickly and accurately, and nmap can then be used to enumerate them further. From my personal experience, nmap seems to do better at scanning individual hosts over multiple ports (not to mention the OS fingerprinting and a plethora of other options).

Detecting Scanrand
Scanrand makes no attempt at covering up what it is. As shown below, both version 1.0 and 1.10 use an IP id of 255 and a TTL of 255 on outgoing SYN packets.



This is simple enough to detect. If we see many probes from one host with a TTL set to 255 and an id set to 255, we know they are using scanrand. We also see the same source port being used across multiple hosts, which is not normal. The source port changes on each invocation of scanrand. A simple snort signature to detect scanrand being used would be:



Of course, this signature could very easily trigger on valid traffic, as an IP id of 255 with a TTL of 255 in a SYN packet is valid. This combination is more likely to come up in an ACK packet, though the chance is there for a SYN. Below we see an SSH session in which the IP id rolls over from 65535 to 0 and eventually increases to 255. The signature will never trigger between these Linux boxes, as their TTL values are initially 64. It is more likely to trigger when a Solaris 2.x machine is on the network, as it is reported to use an initial TTL of 255. Other operating systems, such as some versions of Cisco IOS, are also known to use 255 as a TTL.
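
One way to cut down on such false positives is to require all of the indicators at once: a bare SYN, a TTL of 255, an IP id of 255, and one source port reused across many destinations. The Python sketch below illustrates that heuristic over already-decoded packet records. The record keys, the function name, and the threshold of 10 targets are all hypothetical, and the TTL test assumes the sensor sees the packets before any router has decremented the TTL.

```python
from collections import defaultdict


def find_scanrand_like_sources(packets, min_targets=10):
    """Return source IPs whose SYNs match the scanrand fingerprint.

    `packets` is an iterable of dicts with the hypothetical keys
    src, dst, sport, ttl, ip_id, and flags ("S" for a bare SYN).
    """
    targets = defaultdict(set)   # (src, sport) -> distinct destinations
    for p in packets:
        if p["flags"] == "S" and p["ttl"] == 255 and p["ip_id"] == 255:
            targets[(p["src"], p["sport"])].add(p["dst"])
    # Flag a source only if one source port hit many different hosts.
    return sorted({src for (src, _), dsts in targets.items()
                   if len(dsts) >= min_targets})
```

A single stray SYN with an id of 255 from a Solaris or Cisco device would not reach the threshold, while a scan that reuses one source port across a subnet would.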



Another interesting side note about scanrand is that it does not verify ICMP error messages from scanned hosts by default. The -c option must be explicitly set to make scanrand verify that ICMP error packets are indeed valid replies. Using hping2 and a home-brewed patch (see Appendix A), we can generate a flood of bad data.

From a host that notices a scanrand scan, we issue our hping command:



This command sends a barrage of ICMP destination unreachable messages to the scanning host, flooding it with bogus information. The only problem is that valid responses are still sent by the listening hosts as well. This technique could be useful depending on who's behind the scanning host.

On the scanrand user's console, we get:



which was all generated from our hping command. It would be trivial to create a script for hping to send some ICMP unreachable error message for many of our hosts. Moral of the story: use -c.

Conclusion
While the concept of scanrand and the tool itself are very useful and innovative, I can see this technology being abused. Besides the obvious, this type of scanning could greatly increase the ability of a worm to find hosts on the Internet to infect. This could mean faster propagation of worms. Of course, there is always the good and the bad of any technology, and network administrators might use it to find exposed ports on their local networks in a quick and accurate fashion.

Scanrand is a tool that any network administrator should take a look at for auditing internal machines or other various tasks involving network service discovery.

References
[1] Kaminsky, Dan. "Paketto Keiretsu source code" www.doxpara.com. Dec 24, 2002. URL: http://www.doxpara.com/paketto/paketto-1.10.tar.gz

[2] Postel, J. "RFC 792 - INTERNET CONTROL MESSAGE PROTOCOL" www.rfc-editor.org. Sept, 1981. URL: ftp://ftp.rfc-editor.org/in-notes/rfc792.txt

[3] Russell, Rusty. "Linux 2.4 NAT HOWTO" www.netfilter.org. Jan 14, 2002. URL: http://www.netfilter.org/documentation/HOWTO//NAT-HOWTO.html

[4] Schneier, Bruce. Applied Cryptography John Wiley and Sons. 1996. 2nd ed.

[5] Stevens, W. Richard. UNIX Network Programming. Prentice-Hall. 1998. vol 1. 2nd ed.

[6] Stevens, W. Richard. TCP/IP Illustrated, Volume 1. Reading: Addison Wesley Longman, Inc, 1994.

Michael Wisener