Gathering information about a remote host is often the first step in launching an attack. To break into a system by exploiting a vulnerability, an attacker needs to find as much information as possible. Port scanning, OS fingerprinting and banner grabbing are only some of the techniques that can be used. This paper briefly summarises the most common intelligence gathering techniques in use today and describes some of the tools that employ them. Finally, a tool (amap) is presented which can be used to probe remote systems in an attempt to recognise an application listening on a non-standard port.
Gathering information about a remote system is often considered the first step an "intelligent hacker" takes in launching an attack against, or gaining privileged access to, a target machine. Intelligence gathered at this stage can reveal vulnerabilities or misconfigurations that the potential intruder can exploit. The more a hacker knows about a particular system (e.g. the OS, the hardware architecture and the services that are running), the greater his or her chances of launching a successful attack. By knowing the operating system and system type, a hacker can do a little research and come up with a list of known vulnerabilities.
Ofir Arkin describes in  a series of steps that an "intelligent hacker" would take in this intelligence gathering attempt:
- Footprinting: this phase consists in gathering as much information as possible on the target from authorised sources of information (IP address ranges, DNS servers, mail servers);
- Scanning: this phase consists in determining which hosts in the targeted network are alive and reachable (through ping sweeps), which services they offer (through port scanning) and which operating systems they run (OS fingerprinting);
- Enumeration: this phase consists in extracting valid accounts or exported resources, system banners, routing tables, SNMP information, etc.
Arkin also classifies the scan types according to the protocol used, as follows:
PING SWEEPS: consist in querying multiple hosts using ICMP ECHO REQUEST packets. This is an old approach to mapping and the scan is fairly slow. Automated tools for this scan include fping and gping on Unix and Pinger on Windows.
BROADCAST ICMP: consists in sending echo requests to the network and/or broadcast address. Some operating systems (Unix machines in general) will send back an ECHO REPLY to the attacker's source IP; others will ignore these packets.
NON-ECHO ICMP: consists in sending ICMP messages other than ECHO REQUEST. This is useful when ECHO REQUESTs (pings) are filtered. Messages used for this purpose are ICMP type 13 (timestamp request) and type 17 (address mask request). Automated tools for this type of scan include icmpush and icmpquery.
TCP SWEEPS: consist in sending a TCP ACK or SYN. Receiving a RST response indicates that a host is present. However, the information provided by this type of scan is not completely reliable if the target is behind a firewall that can reply with a RST packet on behalf of the targeted host. Tools that can be used for this type of scan include nmap and hping.
UDP SWEEPS: consist in sending a UDP packet. This method relies on the ICMP Port Unreachable message sent in reply to a UDP packet addressed to a closed UDP port. This type of scan can also be done using nmap and hping.
All of the above are used to determine which hosts on a targeted network are alive.
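The decision logic shared by these sweeps can be summarised in a small lookup table. The following Python sketch is illustrative only: the probe and reply labels are hypothetical names chosen for this example, and the code interprets replies rather than sending any packets.

```python
# Replies that, for each sweep type described above, suggest a live host.
# The string labels are hypothetical names used only in this sketch.
ALIVE_REPLIES = {
    "icmp_echo": {"echo_reply"},
    "icmp_timestamp": {"timestamp_reply"},        # request type 13, reply type 14
    "icmp_address_mask": {"address_mask_reply"},  # request type 17, reply type 18
    "tcp_syn": {"syn_ack", "rst"},
    "tcp_ack": {"rst"},
    "udp": {"icmp_port_unreachable"},
}


def host_alive(probe, reply):
    """Return True if `reply` to `probe` suggests that a host is reachable."""
    if reply is None:
        # Silence proves nothing: the host may be down or the probe filtered.
        return False
    return reply in ALIVE_REPLIES.get(probe, set())
```

As noted for TCP sweeps, a positive answer may still come from a firewall replying on behalf of the target, so the result is a hint rather than a proof.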
Port scanning, on the other hand, is used to determine which services are running on a host.
Port scanning techniques include:
TCP connect() scan:
A SYN is sent to an "interesting" port;
If a SYN/ACK is received, a service is listening and the TCP handshake is concluded by sending an ACK.
TCP half-open scan:
A SYN is sent to an "interesting" port;
If a SYN/ACK is received, a service is listening and a RST packet is sent to close the connection.
Stealth scan:
This technique is meant to pass through filtering rules without being logged by system logging mechanisms. It consists in forging non-standard combinations of TCP flags and relies on the fact that some filtering devices do not log a TCP connection if the three-way handshake is not completed.
SYN/ACK scan:
Packets are sent with the SYN and ACK flags set. If a port is closed, TCP replies with a RST because there is no SYN corresponding to the received SYN/ACK; otherwise the packet is discarded silently.
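Of the techniques above, the TCP connect() scan is the only one that needs no raw packet forging, so it is easy to sketch with the standard socket API. The function name connect_scan and its parameters are assumptions made for this example; it is a minimal sketch, not the implementation used by any of the tools mentioned.

```python
import socket


def connect_scan(host, ports, timeout=1.0):
    """Return {port: bool}; True means the full TCP handshake succeeded."""
    results = {}
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex() returns 0 on success and an errno value otherwise,
            # so refused or timed-out connections are reported as closed.
            results[port] = sock.connect_ex((host, port)) == 0
    return results
```

Because the handshake is completed, this kind of scan is the one most likely to be recorded by the target's logging mechanisms.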
The techniques that are employed for port scanning are also successfully employed for identification of the remote operating system (OS fingerprinting). Basically, OS fingerprinting is a process for determining the operating system a remote host is running, based on characteristics of the data returned from that host. This can be as simple as connecting to the host and reading a service banner or as complex as statistical analysis of TCP initial sequence numbers and flags. OS fingerprinting is based on the fact that there are slight differences between vendors' implementations of the TCP/IP stack. In some cases, these differences can reveal information as detailed as the version number of the operating system and the processor architecture.
Tools are available today that can tell with a high degree of precision which operating system is on the other side by examining subtle details in the way TCP/IP was implemented in that particular system. According to the approach they follow, they can be divided into active and passive fingerprinting tools. The first approach consists in sending particular combinations of TCP flags or options (or ICMP messages), observing the responses obtained and comparing them to a database of known "fingerprints", while the second approach consists in monitoring (sniffing) incoming traffic and observing certain characteristics of the received packets.
Active port scanning and OS identification techniques are extensively described in , while  describes the basis of passive fingerprinting. More recently another approach has been described to remote fingerprinting based on the Round Trip Time (RTT) between a SYN and the SYN/ACK sent by the server. This approach is described in  which also presents a tool (ring) that has been implemented as a proof of concept for this approach.
An alternative method to TCP/IP stack fingerprinting is identification by means of client applications. These methods rely on the behaviour of certain daemons in error conditions or on the "greeting" information that some applications send as part of the application level handshaking process. Quite a number of network clients send revealing information about their host system, either directly or indirectly. Email clients, for example, often include a lot of information on their systems in the headers;  provides interesting information about the behaviour of the pine mail client in this respect. Web browsers also send this kind of information.
The different approaches to OS fingerprinting are summarised in the diagram in the following page (also described in ), some of the tools that employ the various techniques are also indicated.
Active TCP/IP Stack fingerprinting
Several publicly available tools exist that use active fingerprinting techniques. Of these tools nmap  seems to be the most popular choice. Version 3.0 of nmap was released last August. Nmap uses several techniques to determine the host operating system at the network level, some of them primitive in their approach and others more complex, requiring a good understanding of the TCP/IP protocol. They include testing the response of the remote system to undefined combinations of TCP flags, TCP Initial Sequence Number (ISN) sampling, determining the default setting of the DF bit, TCP initial window size, ToS setting, fragmentation handling, and the types and order of TCP options.
Nmap fingerprints a system in three steps: port scanning, which yields a list of open and closed TCP and UDP ports; sending of "ad-hoc forged" packets; and analysis of the responses received, which are compared against a database of known OS behaviour (fingerprints).
In version 3, nmap has introduced the following additional features:
- protocol scan, which determines which protocols (TCP, IGMP, GRE, UDP, ICMP, etc.) are supported by a given host;
- "idlescan" which performs a scan via a "zombie" machine;
- ICMP timestamp and netmask requests;
- detection of host uptime;
- option to specify payload length;
- IP Identification Number and TCP Initial Sequence Number predictability report;
- "random IP" scanning mode capable of skipping unallocated netblocks.
The system is modular: new tests can be implemented and added as additional modules.
Other tools that deploy similar techniques are hping  and iQ .
Passive host fingerprinting is the practice of determining a remote operating system by measuring the peculiarities of observed traffic without actively sending probes to the host.
Five parameters are particularly useful in this technique:
- The value of the "Time to Live" field (TTL) in the IP header
- The Initial Window Size in the TCP header
- The value of the "Don't Fragment" bit (DF) in the IP header
- The value of the "Type of Service" (TOS) field in the IP header
- The types of TCP options used (if any)
Passive fingerprinting was first described in . Tools based on this technique include p0f  and siphon .
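The lookup that a passive tool performs on these parameters can be sketched as follows. This is a minimal illustration, not the logic of p0f or siphon: the signature table contains placeholder values and made-up OS names, only three of the five parameters are used, and the rounding of the observed TTL up to a common initial value (32, 64, 128 or 255) is an assumption of this sketch.

```python
# Hypothetical signature table: keys are (initial TTL, window size, DF bit).
# The values are illustrative placeholders, not real fingerprints.
SIGNATURES = {
    (64, 5840, True): "example-os-a",
    (128, 16384, True): "example-os-b",
}


def initial_ttl(observed_ttl):
    """Round an observed TTL up to the nearest commonly used initial value."""
    for candidate in (32, 64, 128, 255):
        if observed_ttl <= candidate:
            return candidate
    raise ValueError("TTL must be in the range 0-255")


def guess_os(observed_ttl, window_size, df_bit):
    """Look up a sniffed packet's characteristics in the signature table."""
    key = (initial_ttl(observed_ttl), window_size, df_bit)
    return SIGNATURES.get(key, "unknown")
```

A packet sniffed with TTL 57, window size 5840 and the DF bit set would match the first placeholder entry, since 57 rounds up to an initial TTL of 64.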
Passive fingerprinting has some limitations. If used to analyse incoming traffic, it will not help in gathering useful information about malicious users, since applications that build their own packets (such as nmap, hping, xprobe, etc.) will not use the same signatures as the operating system. In addition, it is relatively simple for a remote host to modify the default values for the TTL, Window Size, DF or TOS settings; indeed, this is considered one of the countermeasures system administrators could and should take against passive fingerprinting.
Using RTT for TCP/IP Stack fingerprinting
A new approach to remote OS fingerprinting at the TCP/IP stack level is described in . The technique relies on the fact that the timeouts and regeneration cycles between a SYN sent by the client and the successive SYN/ACKs sent by the server to complete the TCP handshake are loosely specified in the RFC, which means that almost every OS uses its own method and set of values. Ring is a tool that has been implemented to prove that the Round Trip Time can be effectively used to recognise the remote OS.
A typical ring identification session has the following steps:
- ring sends a SYN packet to an open port of the target
- the target enters the state "SYN_RCVD" and sends back a SYN-ACK
- ring ignores the SYN-ACK
- the target remains in the SYN_RCVD state while reinjecting SYN-ACK segments from time to time; ring measures the times between these segments.
Ring is extensively described in Tod Beardsley's GIAC practical.
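The measurement step can be illustrated with a short Python sketch that turns the arrival times of the retransmitted SYN-ACK segments into intervals and compares them with stored timing signatures. The signature values, OS names and tolerance below are hypothetical and chosen only for the example; they are not taken from ring.

```python
# Hypothetical retransmission signatures (seconds between SYN-ACK segments).
EXAMPLE_SIGNATURES = {
    "example-os-a": [3.0, 6.0, 12.0],
    "example-os-b": [3.0, 3.0, 3.0],
}


def retransmission_intervals(timestamps):
    """Return the delays between consecutive SYN-ACK arrival times."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


def match_os(timestamps, signatures, tolerance=0.5):
    """Return the first signature whose intervals all lie within `tolerance`."""
    observed = retransmission_intervals(timestamps)
    for name, expected in signatures.items():
        if len(observed) == len(expected) and all(
            abs(o - e) <= tolerance for o, e in zip(observed, expected)
        ):
            return name
    return None
```

For instance, SYN-ACKs observed at 0.0, 3.1, 9.0 and 21.2 seconds give intervals of roughly 3, 6 and 12 seconds and would match the first placeholder signature.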
One of the oldest techniques used to identify a remote operating system is "banner grabbing", which consists in opening a connection to a remote application daemon and determining the operating system by examining the responses received from applications like telnet or ftp.
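A minimal banner grabber needs only a TCP connection and a single read. The sketch below uses Python's standard socket module; the function name grab_banner and its default values are assumptions made for this example and do not correspond to any of the tools discussed here.

```python
import socket


def grab_banner(host, port, timeout=5.0, max_bytes=1024):
    """Connect to host:port and return whatever the service sends first."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        try:
            return sock.recv(max_bytes)
        except socket.timeout:
            # The service waits for the client to speak first: no banner.
            return b""
```

Services such as ftp, mail or ssh daemons usually announce themselves immediately, whereas services that wait for a client request return an empty result with this approach.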
Tools that use this technique range from scanners like Hackbot  and ScanSSH to ad-hoc scripts aimed at particular application services  . Hackbot is a banner grabber that can scan for ftp, mail and ssh banners and the DNS version, and can perform whois lookups and various types of web scanning, including Nimda and "path revealing NT problems" . ScanSSH is a scanner that probes SSH servers and classifies them according to their advertised version number.
Fingerprinting at the application level is also extensively described in .
Various techniques have also been described to defeat fingerprinting. Among them, the simplest and most immediate is the modification of the default values of a TCP/IP stack implementation, such as the TTL, Window Size or TCP options.
Another interesting approach can be found in  which describes the design and implementation of a TCP/IP stack "fingerprint scrubber". A "fingerprint scrubber" is a tool aimed at restricting a remote user's ability to determine the operating system of another host on the network. It is a piece of software that is transparently interposed between the Internet and the network under protection (a typical position would be on the firewall) and performs a set of kernel modifications to avoid recognition of the operating system based on the characteristics of its IP and TCP implementations. It works at both the network and transport layers by converting ambiguous traffic from a heterogeneous group of hosts into sanitized packets that do not reveal clues about the hosts' operating systems. For example, for all the packets generated by hosts in the protected network, it normalizes the IP header flags, forces all ICMP error messages to contain data payloads of only 8 bytes, keeps track of open TCP connections by following the three-way handshake, blocks all TCP packets that do not belong to a valid three-way handshake sequence, and reorders the TCP options within the TCP header. According to  the fingerprint scrubber was tested against nmap, which was completely unable to determine the operating system with the scrubber interposed.
Probing application level services: amap 0.95
In the previous sections various approaches to remote information gathering were described that allow identification of the remote operating system or of the version of a particular application running on a remote host. A further step in gathering information about a remote host is provided by amap . Amap is a scanning tool that probes a service running on a given port of a remote server to identify the specific application that is listening on that port. Its purpose is to identify services that are not running on their standard ports. The tool was released in March 2002 under the GNU General Public License and can be downloaded from http://www.thehackerschoice.com/download.php?t=r&d=amap-0.95.tar.gz. It is also available as a package in the Debian Linux distribution. Its authors describe it as "a next-generation scanning tool, it identifies applications and services even if they are not listening on the default port by creating a bogus-communication. amap has a growing database of know applications also including non-ASCII based applications and even enterprise services.".
The purpose of the following sections is to explain how amap works and to present the results of its use in a test environment.
Amap probes the target by sending a number of "trigger" packets at the rate of about one per millisecond. By default it sends 16 such packets, and this value can be modified with the "-T" option; however, I counted 11 such packets in my tests, probably because only 11 different triggers are defined in the signature files for TCP-based application protocols. These "trigger" packets are typically the initiating packet of an application protocol handshake (see the SSL example in the following section). Amap has a list of "triggers" which includes binary as well as text handshake messages.
Triggers are defined in the file: appdefs.trig. The triggers currently defined are shown in the following table:
The hex string in the table (indicated by a 0x before the first octet) is sent as the payload of the "trigger" packet in the first message sent after the completion of the TCP handshake or in the UDP datagram (depending on whether the service uses TCP or UDP as transport). This list can be expanded very easily, provided one knows the handshake message of the application that one wants to trigger.
Amap defines a format for describing the trigger:
Where ":" is the separator and:
PROTO_ID: is the name of the application level protocol (service) for which a handshake trigger is provided (e.g. SSL, Telnet, etc.). This value is looked up when the "-p" command line option is used.
"t" or "u" indicates whether TCP or UDP must be used as transport
"0|1" is a flag to mark "dangerous" protocols. These are applications that might crash if unexpected or long data is received. When the "-H" command line option is specified, triggers with a value of 1 in this field will not be sent.
<optional trigger data> can be a hex string or an ASCII string, depending on the application. A hex string is identified by a leading "0x". All strings are terminated with a newline character ("\n"). No trigger string is defined for application protocols that send a banner string upon successful completion of the TCP handshake (e.g. mail servers, ftp servers, ssh daemons, etc.). These are simply recognised with the same mechanism used by any banner grabbing tool. After the trigger has been sent, amap looks up the response in a list contained in the file appdefs.resp and prints out any match it finds.
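To make the field layout concrete, the following Python sketch parses one trigger definition. It assumes that the four fields appear in the order in which they are described above, separated by ":", and that hex octets may be separated by spaces; both points are assumptions of this sketch, as are the names Trigger and parse_trigger. Escape sequences inside ASCII triggers are not interpreted.

```python
from typing import NamedTuple


class Trigger(NamedTuple):
    proto_id: str
    transport: str   # "t" for TCP, "u" for UDP
    harmful: bool    # True when the "dangerous" flag is 1
    payload: bytes   # empty for banner-only protocols


def parse_trigger(line):
    """Parse a line assumed to look like PROTO_ID:t|u:0|1:<optional data>."""
    parts = line.rstrip("\n").split(":", 3)
    if len(parts) == 3:
        parts.append("")  # no trigger data given
    proto_id, transport, flag, data = parts
    if transport not in ("t", "u"):
        raise ValueError("transport must be 't' or 'u'")
    if data.startswith("0x"):
        payload = bytes.fromhex(data[2:])  # spaces between octets are allowed
    else:
        payload = data.encode("ascii")
    return Trigger(proto_id, transport, flag == "1", payload)
```

With such a structure, honouring the "-H" option amounts to skipping every parsed trigger whose harmful field is True.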
The possible responses are contained in this file with the following format:
Where ":" is the separator and:
PROTO_ID: is the name of the application level protocol (service) containing the string in its response.
<response string> can be an ASCII string or a binary string, as in the triggers, and can be prefixed with either a "^", meaning that the specified string must be found at the beginning of the response, or a "/", meaning that the specified string must be found somewhere in the received string.
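The two matching modes can be sketched in a few lines of Python. The definitions in the usage note are hypothetical examples rather than entries copied from appdefs.resp, and since the behaviour for a pattern without a prefix is not described above, this sketch simply rejects it.

```python
def matches(pattern, response):
    """Apply a '^' (prefix) or '/' (substring) pattern to a response."""
    if pattern.startswith(b"^"):
        return response.startswith(pattern[1:])
    if pattern.startswith(b"/"):
        return pattern[1:] in response
    raise ValueError("pattern must start with '^' or '/'")


def identify(response, definitions):
    """Return the PROTO_IDs of all definitions that match the response."""
    return [proto for proto, pattern in definitions if matches(pattern, response)]
```

Given the hypothetical definitions [("ftp", b"^220"), ("ssh", b"/SSH-")], a response starting with "220" is reported as ftp, while a response containing "SSH-" anywhere is reported as ssh.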
As for the "triggers", it is very easy to expand the list of "recognised" services by providing the appropriate description in this file. Amap supports both tcp and udp protocols, ASCII and binary protocols and provides a number of options to tune the probe being sent. It can take an nmap machine-readable output file as its input file and probe the services that are listening on ports found open by nmap.
The options currently available are described below:
-i Reads hosts and ports from the specified file. The format of this file is that produced by nmap with the option "-m"
-sT Scan only TCP ports
-sU Scan only UDP ports
-d Print the hex dump of the received response. The default is to print only the responses that are recognised
-b Print ASCII banners if any are received from the probed service
-o Log results to the specified file
-D Reads trigger and response definitions from the specified file, instead of the defaults appdefs.trig and appdefs.resp
-p Indicates that only the trigger associated with the specified protocol must be used
-T n Open "n" parallel connections. The default is indicated as 16 in the manual pages; however, I counted only 11 in all the tests I made.
-t n Wait "n" seconds for a response. Default is 5.
-H Skip potentially harmful triggers. This will skip triggers that are marked with the 1 flag in the trigger description file (appdefs.trig)
The syntax for running amap is:
amap [-sT|-sU] [options] [target port | -i <file>]
Either -sT or -sU must be specified. "target" is the IP address or fully qualified name of the probed host and "port" is the probed port number. Target and port must not be specified if the "-i" option is used.
Amap was downloaded from http://www.thehackerschoice.com/ and compiled on a machine running RedHat Linux 7.2.
No changes were made to the default configurations.
The test environment included the RedHat 7.2 machine running amap at the address 10.0.0.2 and the "target" host running Debian 3.0 at the address 10.0.0.1, both hosts on the same subnet. A number of services were activated on the Debian host; for most of them the default port was changed to verify that amap could correctly recognise the applications listening on the ports probed.
Tcpdump was activated on the RedHat host to record the traffic exchanged between the two hosts.
Amap was used to probe services listening on TCP ports.
Services were distributed as follows:
389/tcp LDAP (not modified)
80/tcp SSL (HTTPS)
When amap was started, 11 TCP connections were opened for each probe, with the SYN packets sent a few milliseconds apart. Amap forks as many child processes as the number of parallel connections specified with the -T option. Once the TCP handshake is completed, amap sends one trigger packet for each trigger found in the appdefs.trig file for the chosen protocol (TCP in this case). In addition, it sends a trigger packet containing the string "\r\nHELP\r\n".
Upon reception of the response from the server, amap checks the appdefs.resp file for a match with the pre-defined responses. The response from the server can be a banner, an error or a response to the handshake initiated by the amap trigger. As soon as a message is received from the server, the corresponding TCP connection is closed. Obviously, depending on the level of logging of the application listening on the probed port, an error will be recorded in the log file for each "wrong" trigger received. Finding eleven connections opened from the same host, all of which, except possibly one, generate errors at the application protocol level, could be a good indication of a probe from amap.
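This observation suggests a simple log-based heuristic, sketched below in Python. The input format (one (source address, service port) pair per application-level protocol error) and the threshold of 10 errors, i.e. eleven connections minus the one valid trigger, are assumptions of this sketch; a real deployment would also restrict the count to a short time window.

```python
from collections import Counter


def suspected_probers(error_events, threshold=10):
    """Return source addresses with at least `threshold` protocol errors
    against a single service port.

    `error_events` is an iterable of (source_ip, service_port) tuples,
    one per application-level error extracted from the service logs.
    """
    counts = Counter(error_events)
    return sorted({src for (src, _port), n in counts.items() if n >= threshold})
```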
The next two sections describe the results of running amap against an application that responds with an ASCII banner (FTP) and an application that requires the successful completion of a binary handshake.
"Text banner" applications: ftp
The traces provided in this section show an extract of a probe on port 31 (running ftp). Amap was run on 10.0.0.2 with the following options:
For brevity, only some of the connections are shown and the payload is shown only for data transfer packets (PUSH and ACK bits set).
Amap successfully recognised ftp listening on port 31:
Recognition of the ftp service is based on the banner received from the server. In particular, the match of the response received from the server with the string:
On the server side, the following error messages are logged in the syslog file. Error messages are also sent back to the client.
"Binary handshake" application: SSL
The traces provided here show how amap can simulate an SSL connection and recognise an SSL application running on port 80.
The steps involved in the SSL handshake are as follows:
- The client sends the CLIENT_HELLO message containing:
- Client's SSL version number
- Supported ciphering schemes
- The server sends the SERVER_HELLO message containing:
- Handshake type (server hello)
- Server's SSL version
- Cipher settings
- Cipher suite
- Random number
- Compression method
- The server then sends its certificate
- Handshake type (certificate)
- Server certificate
The decoded equivalent of this string is (decoding has been obtained using ethereal):
The response received from the server that allows amap to recognize SSL is (the decoded format has been obtained using ethereal; the content of the certificate is not shown for brevity, but it can be seen in the trace in the following section):
Amap was run on 10.0.0.2 with the following options:
The following page shows the tcpdump log recorded during the probing on port 80. For brevity, the hex dump is shown only for data transfers. All the connections opened by amap are shown, as well as all the triggers sent in one run of amap. Payload in red is the triggers sent by amap (the application protocol probed is indicated beside it). Payload in blue is the response sent by the server. In this case, the probed service replies only to the correct trigger (i.e. the SSL CLIENT_HELLO handshake message).
Amap successfully recognized SSL listening on port 80. Here is the content of the results file:
The "banner" is, in fact, the message sent by the server in response to the handshake initiated by amap.
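How such a binary response can be recognised is illustrated by the Python sketch below, which checks whether the first bytes received look like an SSLv3/TLS handshake record carrying a SERVER_HELLO. It assumes that the server answers with an SSLv3 or TLS record (content type 22, major version 3, handshake type 2); the exact string that amap matches is not reproduced here, and the function name is hypothetical.

```python
import struct


def looks_like_server_hello(data):
    """Return True if `data` starts with an SSLv3/TLS SERVER_HELLO record."""
    if len(data) < 6:
        return False
    # Record header: content type (1 byte), version (2 bytes), length (2 bytes).
    content_type, major, _minor, _length = struct.unpack("!BBBH", data[:5])
    handshake_type = data[5]  # first byte of the handshake message
    return content_type == 22 and major == 3 and handshake_type == 2
```

A text banner such as an ftp greeting fails the first check immediately, because its first byte is an ASCII character rather than the handshake content type.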
The apache-ssl daemon recorded the following error messages in its error log file, showing that the server also received messages that were not recognized as valid "client_hello" messages. There was one such entry for each of the triggers sent by amap.
Detecting amap probes
Amap is not very stealthy if run in its default mode. Eleven parallel connections, each of which (with the possible exception of one) sends an unexpected message at the application protocol level, are sure to be recorded in the application log file, provided that the application maintains a good logging level. In the tests that I made, the probes on the ssh port did not leave any trace at the application level, since no logging was enabled for this application. On the other hand, extensive logging was available for the ftp, http and http-ssl applications. Therefore the most effective means to detect this probe is to maintain and check logs at the application level. After all, if your mail server receives a NETBIOS request, something strange must be happening.
Apart from logging at the application level, it is difficult to detect an amap probe since it uses the OS system calls to the TCP/IP stack and therefore no signature can be found at the level of the TCP, UDP or IP packet. Nevertheless, it is still possible to write a snort rule that is able to detect probes from amap when it is run in its default mode. In fact, we can observe that in all attempts, amap always sends the trigger for the mount service, specifying a machine name that is hard-wired in the binary string it sends for this type of trigger. The machine name is "kpmg-pt" and it can be found in any default probe from this tool.
It is possible to write a rule that looks for this string in the payload for each service that we possibly want to monitor against this type of probes. For instance I wrote the following rules for my test environment:
Obviously, $HOME_NET could be substituted with the IP address of the host on which the specific service is running.
The following alerts were produced by snort, running in NIDS mode with the -dv and -c options:
Obviously these rules fail if one runs amap with the -p option, specifying the triggers that should be used and not using the "mount" trigger. But then again, since amap is a tool that targets the application level, detection is most appropriately done at the application level by careful monitoring of "strange" messages sent to the server.
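The payload check on which such a rule is based can be expressed in a few lines of Python. This is only an illustration of the matching logic, not a replacement for the snort rules: the function name and the (destination port, payload) input format are assumptions of this sketch, and the only value taken from the text is the "kpmg-pt" marker.

```python
AMAP_MOUNT_MARKER = b"kpmg-pt"  # machine name hard-wired in the mount trigger


def flag_amap_probes(packets, monitored_ports):
    """Return the destination ports that received the mount-trigger marker.

    `packets` is an iterable of (dst_port, payload) tuples, for example
    extracted from a capture; only ports in `monitored_ports` are checked.
    """
    return [
        dst_port
        for dst_port, payload in packets
        if dst_port in monitored_ports and AMAP_MOUNT_MARKER in payload
    ]
```

As with the rules themselves, this check is blind to runs in which the mount trigger is not sent.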
Tools like amap are further proof that "security through obscurity" is not the right approach to securing a network: simply running a service on a different port is not sufficient to go unnoticed. However, amap can be very useful for system administrators in finding "hidden" services, in those cases where users run unauthorised services and try to disguise them using a non-standard port. In this role it can be used effectively together with tools like nmap. The list of signatures (triggers and responses) is customisable and can easily be expanded with the addition of signatures of proprietary protocols. As its authors say: "With amap, you will be able to identify that SSL server running on port 3445 and some oracle listener on port 23!".
 Fyodor, Remote OS detection via TCP/IP stack fingerprinting
URL: http://www.insecure.org/nmap/nmap-fingerprinting-article.html (October, 4th)
 List of fingerprints for passive fingerprint monitoring
URL: http://project.honeynet.org/papers/finger/traces.txt (October, 4th)
 Hping man page, URL: http://www.hping.org/manpage.html (October, 4th)
 Arkin, Ofir, Network scanning techniques: Understanding how it is done
URL: http://www.sys-security.com/archive/papers/Network_Scanning_Techniques.pdf (October, 4th)
 Identifying ICMP Hackery Tools Used In The Wild Today
URL: http://www.sys-security.com/archive/securityfocus/icmptools.html (October, 4th)
 Preventing remote OS Detection
URL: http://www.pgci.ca/common/p_fingerprint.htm (October, 4th)
 Beck, Rob, Passive-Aggressive Resistance: OS Fingerprint Evasion, Linux Journal
URL: http://www.linuxjournal.com/article.php?sid=4750 (October, 4th)
 Malan, G. Robert, Jahanian Farnam, Smart, Matthew, Defeating TCP/IP Stack
Fingerprint, Proc. 9th Usenix Symposium, Denver, CO, August 14-17, 2000
 Arkin, Ofir, Yarochkin, Fyodor, Xprobe v2.0: A fuzzy approach to remote active
operating system fingerprinting, August 2002
URL: http://www.sys-security.com/archive/papers/Xprobe2.pdf (October, 4th)
 HACKBOT 2.11 URL: http://ws.obit.nl/hackbot/documentation.txt (October, 4th)
 Provos, Niels, Honeyman, Peter, ScanSSH - Scanning the Internet for SSH servers,
URL: http://www.citi.umich.edu/techreports/reports/citi-tr-01-13.pdf (October, 4th)
 Nazario, Jose Passive Fingerprinting using Network Client Applications, Nov. 2000 URL: http://www.crimelabs.net/docs/passive.html (October, 4th)
 Bursztein, Lupin iQ Overview,
URL: http://www.bursztein.net/secu/iQ/iQ-fr.pdf (October, 4th)
 OS Identification, Unix Insider 12/8/00
URL: http://www.itworld.com/Comp/2124/swol-1208-buildingblocks/ (October, 4th)
 Glaser, Thomas, TCP/IP Stack Fingerprinting Principles, SANS Intrusion Detection FAQ, October 2000
URL: http://www.sans.org/resources/idfaq/tcp_fingerprinting.php (October, 4th)
 Veysset, Franck, Courtay, Olivier, and Heen, Olivier, New tool and technique for remote operating system fingerprinting
URL: http://www.intranode.com/pdf/techno/ring-full-paper.pdf (October, 4th)
 Vision, Max, Passive Host Fingerprinting,
URL: http://www.whitehats.com/library/passive (October, 4th)
 Advanced Remote OS Detection Methods/Concepts using Perl
URL: http://www.whitehats.com/library/passive (October, 4th)
 Examining Remote OS Detection using LPD querying, Feb 2001
URL: http://old.lwn.net/2001/0222/a/sec-lpddetect.php3 (October, 4th)
 Fernando Martins, IDS: passive mapping: an offensive use of IDS,
URL: http://www.shmoo.com/mail/ids/apr00/msg00014.shtml (October, 4th)
 Know Your Enemy:Passive Fingerprinting, Honeynet project
URL: http://project.honeynet.org/papers/finger/ (October, 4th)
 Ethereal, URL: http://www.ethereal.com (October, 4th)
 Arkin, Ofir, ICMP usage in scanning
URL: http://www.sys-security.com/html/projects/icmp.html (October, 4th)
 P0f README, URL: http://www.stearns.org/p0f/README (October, 4th)
 Amap README URL: http://www.thehackerschoice.com/download.php?t=r&d=amap-0.95.tar.gz (October, 4th)