Andrew Hay, one of the authors of the
popular OSSEC Host-Based Intrusion Detection Guide and the upcoming Nagios
3 Enterprise Network Monitoring book, has agreed to be interviewed for
the SANS Security Thought Leader series, and we certainly thank him for
his time.
Andrew, how did you first get into security, what was the driver?
I was one of the many people laid off from Nortel Networks in Ottawa,
Ontario, Canada. I must have applied to at least 500 different jobs in
various locations throughout Canada, the United States, Europe, and
Australia. No one wanted me. The problem with being laid off by Nortel
is that, typically, you’re not the only person. In fact, I was
one of a few thousand people laid off, all looking for the same (any)
job.
I received a call from a company who was contracted, by Nokia, to find
some people to work front line firewall and network support. I jumped
at the opportunity and within a week I was working as a contractor at
Nokia. Since I had little security experience there was a steep
learning curve but Nokia provided exceptional training for both Nokia
IPSO (the routing platform), Nokia IP Series appliances (their
hardware), and Check Point VPN-1/Firewall-1 (the bundled firewall
package).
While working at Nokia, I made a point of learning everything I could
about the products I supported. I also ensured that I obtained the
certifications for the training I received in order to make myself
stand out from the rest of my coworkers. Within 8 months, a record at
the time I might add, I was hired by Nokia to work full-time. Even
though I was hired into the job I made sure not to stop learning. I
felt my routing and switching knowledge was weak, so I paid -- out of
pocket -- for a CCNA prep-course, and subsequent exam. Customers were
calling in having problems with their Cisco to Check Point VPNs,
so I bought books on Cisco PIX and Cisco VPN Concentrators and learned
how to troubleshoot VPN related issues.
By this time, I was hooked on security. At first, I tried to read as
much as I could on security topics to make me better at my job. The
more I read the more I realized that I was genuinely interested in all
facets of security, even those that didn’t relate directly to my
current role. I started teaching a CompTIA Security+ prep-course, based
on my own course content, through a local business to give back to the
community. The funny thing was that most of my students were in the
same boat that I was in before getting hired at Nokia.
I also started doing some consulting on the side for Cisco and Check
Point issues. This helped me learn quite a bit about working with
government organizations and subcontracting through larger, more
established consultancy firms. In 2004, after speaking with two friends
at Nokia, we decided to form a business to help add credibility to our
consulting engagements and help limit the taxes that could be taken
from us. This is how Koteas Corporation was formed. Even though we
didn’t perform a large volume of work due to our full-time jobs,
our customers consistently returned to us when they needed help or advice.
Awesome, and what is your primary focus today?
In February 2006, I accepted a third-level support role at Q1 Labs Inc.
in Fredericton, New Brunswick, Canada. I came in as the "security guy"
and was again faced with another steep learning curve. This time,
however, I had to learn about intrusion detection and network attacks
as it related to QRadar, our Network Security Management (NSM)
solution. In the first year, I went home with a head full of new and
exciting security information every day. So much information, in fact,
that often I would lie awake at night trying to finish processing
everything I learned.
In my second year I was offered the opportunity to lead the team of
software engineers who were responsible for integrating event and
vulnerability information into QRadar. I jumped at the chance. In three
months, I learned more about applications and platforms than I ever
knew before.
Currently, I am in my third year at Q1 Labs and am the Integration
Services Program Manager. My day-to-day duties involve researching new
vulnerability scanner, patch management, and application/platform
event-emitting technologies. I can honestly say that I love getting up
in the morning and arriving at work. I bet not everyone can say that
about his or her job.
I think this is a wonderful time to be
interested in intrusion detection; a lot of folks have turned a blind
eye, even though as Dr. Eric Cole puts it, effective configuration is
ideal, but detection is a must. I guess the biggest focus today is on
botnets, can you share a bit about the techniques one uses to detect
botnets?
When asked by Warren Lee for my take on the December 2007 "Bot Roast"
arrests by the FBI [1] I told him what I told everyone: botnets can
only be effectively detected by using advanced flow and log
correlation. A lot of people might disagree with me on that statement
but I stand behind it.
OK, what is the first reason you feel botnets can only be detected using flow and log correlation?
If you rely purely on flow technology you will have to deploy your flow
collectors at every network intersection so that every flow can be
recorded and subsequently inspected. I'm not saying that this is a bad
thing to do...far from it. In fact I think the world would be a better
place if everyone could monitor every flow from every part of the
network. The reality, however, is that deploying flow collectors to
every corner of your network might not be financially, or
architecturally, feasible. Do your switches support SPAN capabilities
so that you can passively monitor traffic on a specific port or VLAN?
Do you have room in the rack for another system? Do you have budget to
deploy flow collectors to all of your office locations or just in the
main office?
Sure thing, actually this is the
intrusion detection problem as well, the most valuable real estate in a
data center seems to be monitoring locations, so by flow, just to be
sure, you are talking about NetFlow, yes?
Well NetFlow is one well-known "flow" technology, but when I refer to a
"flow" I’m generically talking about a full communication session
between two specific IPs on your network. A full, bi-directional flow
will show you exactly what is happening between the two systems. How
cool is it that you’re able to read the email of the employee
sending your confidential information to a competitor? That’s a
rhetorical question...it’s really cool!
However, if deploying multiple flow collectors isn’t an option,
you could use a technology like NetFlow, sFlow, jFlow, cFlow to report
flows seen by your routers and switches. Unfortunately using these flow
technologies, which only give you the source IP, destination IP,
source port, destination port, and protocol information, just won't cut
it. To really see what is going on you need to be able to inspect the
fully recombined session, including the application layer information,
to accurately see what is happening between the hosts involved.
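To make the distinction concrete, here is a minimal Python sketch of what a header-only flow record carries and how two one-way records can be paired into a single bi-directional session. The FlowRecord type, the byte_count field, and the function names are assumptions made for illustration; they are not taken from NetFlow, sFlow, or any product mentioned here.

```python
from collections import defaultdict
from typing import NamedTuple


class FlowRecord(NamedTuple):
    """A header-only flow record: the five-tuple plus a byte counter.

    There is no payload field, which is exactly the limitation described above.
    """
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str
    byte_count: int  # hypothetical counter, added for the example


def session_key(rec):
    """Build a direction-independent key by sorting the two endpoints."""
    endpoint_a = (rec.src_ip, rec.src_port)
    endpoint_b = (rec.dst_ip, rec.dst_port)
    return (rec.protocol,) + tuple(sorted([endpoint_a, endpoint_b]))


def merge_bidirectional(records):
    """Sum bytes for both directions of each session."""
    totals = defaultdict(int)
    for rec in records:
        totals[session_key(rec)] += rec.byte_count
    return dict(totals)
```

Pairing the two directions tells you who talked to whom and how much, but nothing about what was said; that still requires a collector that sees the application layer.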
OK, in the past few years a number of
vendors have added flow analysis to their products, or so they say. Can
you give us a war story, an example of an attack, or some botnet
activity that can be detected by flow analysis?
One of the best examples, where flow analysis shines, is in the case of
network based information theft. I recall one instance where a State
Government Agency, using network flows, was able to detect the download
of sensitive utility information to a location in Syria. Incorrect
firewall policy implementation allowed the breach to occur but, due to
their ability to analyze the network flows, they were able to see
exactly what was being accessed, and subsequently stolen.
Another example would be the detection of policy violations, in the
form of the transfer of inappropriate materials, across the company
network. In this instance, a member of the IT staff was violating
corporate policy by streaming inappropriate material from the server
room. The security staff was able to reconstruct the inappropriate
stream from the collected flow information (much to their dismay) and
use it as evidence related to the policy violation.
Thanks Andrew, I am sure that is
compelling enough for us to join the camp of believers in flow
analysis. So you say flow analysis and logs, can you bring logs into
the discussion?
Logs are great. I love logs almost as much as Dr. Anton Chuvakin (well
maybe not that much). In fact we talk about logs, logging, and log
analysis on a daily basis, blog about logging on our respective blogs,
and even share a blog on the topic with some other industry giants who
feel the same way (www.loganalysispros.com).
Oh my! Forgive an off topic moment
please. As you know Anton works for LogLogic and they had a promo piece
that was a bumper sticker that says "I love logs" where love is the
heart symbol. Somehow, I ended up with one of them. I live on Kauai,
very nice, but the ocean is very dangerous here, we had three drownings
just this week. So there is one pretty safe place to swim called
Lydgate Park. They have this big ring of rocks to create a safe pool.
Tourists and locals can still swim in the ocean, but no sharks,
currents or other serious issues and there is even a lifeguard tower.
Only problem, it is near the mouth of one of the biggest river valleys,
when it rains hard, logs flow down the river out to the ocean and the
waves push them over the rock wall into the pond. Lydgate literally
fills up. Then volunteers get together and drag them out so we can swim
again. One of the primary volunteers is a guy named Robin. The last
time we did a big haul-a-thon, I gave him the bumper sticker. I can't
say he was totally pleased with the gag gift. Anyway, anytime you say I
love logs, you can now imagine what I am thinking . . . wait till you
drag a few wet sandy ones out of the pond, then we can talk about love.
But enough of that, back to detection. What do you do with the logs?
If you take in logs from your applications and hosts you will be able
to see how the worm or bot is interacting with the infected or
compromised system. Is the malware trying to enumerate remote Windows
shares? Has the malware installed or started a backdoor application to
allow remote access? Has the malware been designed to attack the core
network infrastructure (i.e. routers, switches, etc.) once it has found
a host to stage from?
The problem, however, is that relying only on logs from your
applications, hosts, and devices might produce an unmanageable amount
of logs. Not only will the number of logs be hard to manage by a human
analyst but the variations in the generated logs are almost enough to
drive you crazy!
A firewall accept log from a Cisco PIX firewall via UDP syslog protocol
won't look like a firewall accept log from a Check Point Firewall-1
firewall via the OPSEC LEA protocol. The logs generated from a
successful authentication to a Windows XP Professional workstation
won't look anything like the logs generated from a successful
authentication to a Red Hat Enterprise Linux 4 server.
By the time you figure out what is happening, based on the logs
received, the worm or bot could have spread exponentially throughout
your network.
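The format differences described above are usually tamed by normalizing every incoming line into one common event shape before anyone tries to analyze it. The Python sketch below shows the idea under loud assumptions: both input formats, the device names, and the field names are invented for illustration and are not the real Cisco PIX, Check Point, Windows, or Red Hat log formats.

```python
# Both line formats below are invented for illustration; they are NOT the
# real formats produced by any vendor named in this interview.

def parse_key_value(line):
    """Parse a hypothetical 'action=accept src=10.0.0.5 dst=192.0.2.10' line."""
    fields = dict(pair.split("=", 1) for pair in line.split())
    return {
        "action": fields["action"].lower(),
        "src": fields["src"],
        "dst": fields["dst"],
    }


def parse_pipe_delimited(line):
    """Parse a hypothetical 'ALLOW|10.0.0.5|192.0.2.10' line."""
    verdict, src, dst = line.strip().split("|")
    action = {"ALLOW": "accept", "DROP": "deny"}[verdict]
    return {"action": action, "src": src, "dst": dst}


# Map each (hypothetical) log source to the parser that understands it.
PARSERS = {"device_a": parse_key_value, "device_b": parse_pipe_delimited}


def normalize(source, line):
    """Return one common event shape regardless of which device sent the line."""
    event = PARSERS[source](line)
    event["source"] = source
    return event
```

Once both devices produce the same dictionary keys, a single rule can reason about an "accept" without caring which firewall wrote it.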
Well now that is cheery Andrew! I am
spending a lot of time talking with security thought leaders in the
SIEM space. They are all convinced they are king of the road, the
center of the IT shop in a year or two. They tell me how their devices
will aid detection. What is your real world take on this with the
current state of SIEM?
I have this conversation with someone at least once a week and
I’ll tell you what I tell them: John Heywood said it best, "Many
hands make light work."
Think about it, the more "people" you have witnessing a crime the
better chance of catching the criminal in the act. The "people" in this
case are your intrusion analysts, your security operations staff, and
your SIEM solution. We live in a 24/7/365 world and humans are prone to
error, need sleep, need to eat, and cannot be tied to a screen for
extended periods of time and maintain a current state of alertness.
SIEM solutions are a "helping-hand" for security operations staff.
False positives can be reduced, terabytes of data can be combed through
automatically, and all security alerting can be centralized into one
location. SIEM solutions also provide a unique view of the network and
of the security devices that are in place to protect your network.
The typical follow-up question I get is "Are SIEM solutions perfect and
do they make errors if simply plugged into a network without being
tuned?" Of course they’re not perfect but then again, is your
firewall perfect? How about your NIDS? If they were perfect then
everyone would have one already and the technology would have been
perfected 10 years ago. As for the "do they make errors" part of the
question, I think you’ll agree Stephen, if you put any security
solution on your network and say "<POOF>...now protect me,"
you’re asking for all kinds of trouble.
So you are pushing for flow analysis AND log analysis right, give us some words about that?
If you can't afford, or if it is
architecturally impossible, to deploy flow collectors at every network
intersection then why not plan on deploying your collectors at the
major network connection points (external network, DMZ, internal
network, sensitive server network, etc.). If your network devices are
capable of sending flow information enable it so that some flow
tracking can be performed in remote or non-critical network devices.
Log everywhere and log often. Enable logging on all of your hosts and
devices so that you can see exactly how malware, or regular users for
that matter, are interacting with the resources.
Taking flows and logs into a central location and generating alerts
from the combined data really helps with the hair-pulling frustration
of juggling multiple information repositories (I for one cannot afford
to lose any more hair than I've already lost). The two sources give
each other context. Your logs from your firewall may indicate a certain
type of security threat against a network asset while your flow data
can validate the existence of the asset, whether it’s vulnerable,
and how valuable that asset is to you.
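As a rough illustration of logs and flows giving each other context, the Python sketch below ranks firewall alerts by checking whether flow data has ever seen the targeted host and how much that host matters. The dictionary keys, the asset_value table, and the high_value threshold are all assumptions for the example; here the asset value comes from a separate lookup table rather than being derived from the flows themselves.

```python
def prioritize_alerts(alerts, flows, asset_value, high_value=8):
    """Attach a priority to each alert using flow data as context.

    alerts:      list of dicts with at least a "dst" key (the targeted host)
    flows:       list of dicts with "src" and "dst" keys
    asset_value: hypothetical table mapping host IP to a 0-10 importance score
    """
    # Any host that appears in a flow is known to exist and be talking.
    active_hosts = set()
    for flow in flows:
        active_hosts.add(flow["src"])
        active_hosts.add(flow["dst"])

    prioritized = []
    for alert in alerts:
        target = alert["dst"]
        if target not in active_hosts:
            priority = "low"      # no flow evidence that the target exists
        elif asset_value.get(target, 0) >= high_value:
            priority = "high"     # real host and an important one
        else:
            priority = "medium"   # real host, ordinary importance
        prioritized.append({**alert, "priority": priority})
    return prioritized
```

An alert aimed at an address that never shows up in any flow is probably noise, while the same alert against a busy, high-value server deserves attention first.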
OK, can you manage an example; can you demonstrate this with a flow and a log example?
The best example is the compromise of a host, or hosts, for the purpose
of creating another bot. In Figure 1.2 you see a fairly typical network
configuration. Our attacker (1), who we’ll call Number One,
attempts to compromise a specific Host (4) on the ACME network using a
zero-day web exploit. His purpose is to compromise a new system to
include in his botnet (5).
For Number One to reach his target he must cross over the Internet,
past the ACME core router (2), through the ACME Firewall (3), and into
the DMZ where the DMZ Host (4) resides.
[Figure 1.2 - The Field of Battle]
Let’s say that Number One was able to successfully compromise the
DMZ Host (4) and install his malicious payload. Immediately after
installation, the infected host communicates with its "master" system
and joins Number One’s botnet (5).
Now, what would the various points have seen during this transaction?
ACME Router (2) - Would have
seen the traffic from Number One (1) destined for the DMZ host (4) and
any responses from the DMZ Host (4) or ACME firewall (3). If we were
exporting flow information from this router (e.g. NetFlow) we would be
able to reconstruct the full session and compare it to the logs
received from the other points. Note - we would not see any payload in
the flow records because NetFlow doesn’t contain a full payload
as it operates on layer 3.
ACME Firewall (3) - Would have
logged the conversation between Number One (1) and the DMZ Host (4).
Any accepts, denies, blocks would have been logged.
DMZ Host (4) - Several things:
  Host Firewall (if present) - would have logged the conversation
  between itself and Number One (1) including any accepts, denies, or
  blocks.
  Host IDS (if present) - may have logged the installation of new
  or replacement of existing files. May also have logged the start of a
  new service or open port.
  Web Application Logs - would have logged Number One’s (1)
  interaction with the webserver on the DMZ Host (4). May have logged the
  malicious interaction that resulted in the compromise of the DMZ Host
  (4).
  Host Logs - may have logged the start of a new service, the
  installation of a new application, etc.
  Anti-Malware/Virus - may have detected/stopped the installation of
  the malicious application.
Based on the above example, the only point that could tell us exactly
what attack was being attempted would be the end host (which obviously
isn’t ideal). Since the flow information from the ACME Router (2)
doesn’t show us the payload of the communications between Number
One (1) and the DMZ Host (4) all we can confirm is that a communication
took place. This isn’t really a bad thing because at least we can
confirm the source/destination of the attack and the source/destination
of the bot/botnet.
However, if we were able to place an additional Flow Collector (6) that
was able to look at layers 1-7, see Figure 1.3, we could actually
reconstruct the attack before it hits the DMZ Host (4) and perhaps
instruct the ACME Firewall (3) or ACME Router (2) to block the
communications from Number One (1).
[Figure 1.3 - Calling in Reinforcements]
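The sequence in this walkthrough, an inbound connection accepted by the firewall followed shortly by the DMZ Host itself reaching out to an external system, can be written as a simple correlation rule. The Python sketch below is one hedged way to express it. The record layout, the integer timestamps in seconds, the dmz_hosts set, and the five-minute window are assumptions for illustration, and treating every address outside dmz_hosts as "outside" is a deliberate simplification.

```python
def find_possible_bot_checkins(firewall_accepts, flows, dmz_hosts,
                               window_seconds=300):
    """Flag DMZ hosts that start an outbound flow soon after an inbound accept.

    firewall_accepts: list of dicts with "src", "dst", and "time" keys
    flows:            list of dicts with "src", "dst", and "time" keys
    dmz_hosts:        set of IP addresses considered to be in the DMZ
    """
    suspects = []
    for accept in firewall_accepts:
        host = accept["dst"]
        if host not in dmz_hosts:
            continue  # only inbound connections to DMZ hosts are of interest
        for flow in flows:
            started_by_host = flow["src"] == host
            to_outside = flow["dst"] not in dmz_hosts  # simplification
            delay = flow["time"] - accept["time"]
            if started_by_host and to_outside and 0 <= delay <= window_seconds:
                # (compromised host, likely attacker, likely botnet contact)
                suspects.append((host, accept["src"], flow["dst"]))
    return suspects
```

A rule like this produces leads rather than verdicts; plenty of legitimate servers make outbound connections, so the results still need an analyst or further tuning.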
So, assuming you are a true believer
in flow and log analysis, I know you work for a security company, what
do they do? Do they implement your passion in their product?
At Q1 Labs we have developed a network security management (NSM)
solution called QRadar that provides a fantastic set of network
security management services including: log management, threat
management, and compliance management. We follow a strategic principle
of "Complete Network and Security Knowledge Delivered Simply for Any
Customer."
"QRadar provides numerous advantages
over other security management solutions because it includes an
intuitive centralized command and control console, network, security,
application, & identity awareness, advanced threat and security
incident detection, and scalable distributed log collection and archive
capabilities."
A large part of my job is to investigate how to incorporate the
information collected by other products into QRadar for correlation
with flows and external log sources. This information, be it patch
information from a patch management solution, discovered
vulnerabilities from a vulnerability scanner, or authentication logs
from a critical database server, can be used to generate meaningful
alerts for incident handlers to investigate. The developers take this
information and build the integration points between the products so it
feels like a little part of me is in each integration mechanism we
create.
Nice, tell us a bit about the OSSEC
book project, what ever convinced you to write a book that is a lot of
hard work? Oh yeah, and where is my signed copy?
It’s in the mail, I swear! Seriously though, I wanted to take a
minute to thank you for agreeing to write the foreword for our little
book.
The OSSEC
book came to be due to a serious lack of documentation on how to
install, configure, and operate the OSSEC HIDS. The creator and primary
developer, Daniel Cid, also works at Q1 Labs. For many developers,
documentation is typically an afterthought (or non-thought in some
cases). I liked the product but wanted to know more. Unfortunately most
of the advanced information was locked away in Daniel’s head. A
light went off in my head and I thought, "Hey, we should write a book."
My buddy Harlan Carvey, author of Windows Forensic Analysis,
introduced me to his publisher at Elsevier and I pitched the idea. They
loved the idea and we - me, Daniel Cid, and Rory Bray (another
coworker) - started writing the book.
I can honestly say it was an interesting experience but also a
personally rewarding one. I liked it so much that I offered to
contribute to the Nagios 3 Enterprise Network Monitoring book and am in talks for a couple of other future titles.
Where do the like-minded believers in
logs get together? Do you have a conference, a yearly meeting, a blog?
Please let me know, I would like to invite my friend Robin! *grin*
Honestly, there aren’t a lot of places. The Log Analysis mailing list
[2] has traditionally been the place to discuss log-related
information. There is also the SANS WhatWorks in Log Management Summit
that I have heard quite a few good things about. *wink, wink*
Recently, the MIS Training Institute’s Log Management Summit was
co-located with InfoSec World 2008 and, as Anton told me while
attending, everyone was there but me...I don’t know why I talk to
that guy. *smile*
So what are you doing with SANS, are you doing any teaching and writing, I do not recall seeing your name fly by?
Actually, I’ve been piloting the SANS Security Essentials Review
course (SEC401R) for those who need some additional review prior to
sitting for the SEC401 exam. I’ve run two sessions so far and they’ve been well received.
I’d like to be more involved in the courseware development at
SANS and have tossed some ideas around about creating a couple of new
SANS courses but those are still a "work in progress." I’d also
like to present at some of the bigger SANS conferences and am
constantly working to prepare myself for the day that I’m "called
up to the Majors."
Great, you clearly are a guy with a
lot to share, what outlets are there to learn more about what Andrew
Hay thinks? Do you have a blog? What is your major focus, surely not
just repeating that you love logs, Anton has that covered I think :)
Actually, I do have a blog. I tend to post my broad security-related
thoughts at my personal blog (www.andrewhay.ca) and even produce a Suggested Blog Reading
post where I mention interesting posts or news that I found from around
the blogosphere on a weekly basis. However, for more log-centric
posts, I tend to post them over at the Log Analysis Professionals blog.[3]
I’m also a regular contributor to the Security Catalyst Community[4]
where people get together to talk security and help people with
security related questions. I encourage everyone to check it out.
I recently jumped on the Twitter bandwagon as well. If you want to follow my tweets you can subscribe to @andrewsmhay with your favorite Twitter client.
Any other book projects? Upcoming movies? Broadway appearances?
Well, I don’t think I’ve got the moves or voice for
Broadway but I do have a few projects that I’m working on.
As Peter Giannoulis mentioned in his recent interview, I’m
involved in The Academy with Peter, Adam Winnington, and Jason Ingram.
Our goal is to create a new way for everyone to learn about security
products and tools because, let’s face it, some people learn better
visually.
I’ve signed on to write the Nokia Firewall, VPN, and IPSO Configuration Guide
(ISBN 9781597492867) with Peter Giannoulis and my lovely wife Keli Hay.
I’ve also been tossing around the idea of writing a fictional
novel about an elite group of hackers who work for the government to
electronically soften their targets prior to further efforts. Think Tom
Clancy’s "Net Force" meets James Bond after being trained by the
guys on the TV show "The Unit".
I’m also working on some more log-oriented training materials
that focus on log analysis and log management. Anton and I have also
come up with a plan to present some fun/crazy/interesting log-related
presentations for conferences. I can’t let the cat out of the bag
just yet, but the first one is about making logs and logging "sexy"
again.
There are always more books to be written on various subjects so
I’ll probably sit down to brainstorm some ideas for books that
just aren’t out there yet. I’m open to ideas by the way.