Please review the “Killing with Keyboards” file, then answer the following questions in the context of the best practice concepts covered in Chapter 11 and the security professional proficiencies covered in Chapter 13.
Identify what is at risk here,
5 possible threats and
5 vulnerabilities in this scenario.
Analyze measures that could be taken to reduce the risks.
CHAPTER 11 BEST PRACTICES
DEFINED SECURITY POLICY
One of the best presents a manager could give an analyst, besides a workstation with dual
21-inch LCD monitors, is a well-defined security policy for the sites being monitored.1
“Well-defined” means the policy describes the sorts of traffic allowed and/or disallowed
across the organizational boundary. For example, a fairly draconian security policy may
authorize these outbound protocols and destinations:
• Web surfing using HTTP and HTTPS to arbitrary Web servers
• File transfer using FTP to arbitrary FTP servers
• Name resolution using DNS to the site’s DNS servers
• Mail transfer using SMTP and POP3 to the site’s mail servers
• VPN traffic (perhaps using IPSec or SSL) to the site’s VPN concentrators
To meet the organization’s business goals, the security policy would allow these
inbound protocols to these destinations:
• Web surfing using HTTP and HTTPS to the site’s Web servers
• Name resolution to the site’s DNS servers
• Mail transfer using SMTP to the site’s mail servers
Notice that for each item, both the protocol and the system(s) authorized to use that
protocol are specified. These communications should be handled in a stateful manner,
meaning the response to an inbound VPN connection is allowed.
In the context of this security policy, anything other than the specified protocols is
immediately suspect. In fact, if the policy has been rigorously enforced, the appearance
of any other protocol constitutes an incident. In Chapter 1, I quoted Kevin Mandia and
Chris Prosise to define an incident as any “unlawful, unauthorized, or unacceptable
action that involves a computer system or a computer network.”2 At the very least, the
appearance of a peer-to-peer protocol like Gnutella would be an “unauthorized” event.
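As a rough illustration of how such a policy makes unexpected traffic stand out, the example policy above can be written down as an allowlist and checked mechanically. This is a minimal sketch, not a real firewall configuration: the `Flow` type, the `POLICY` table, and the destination labels such as "site-dns" are hypothetical names invented for the example.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    direction: str    # "outbound" or "inbound"
    protocol: str     # application protocol label, e.g. "HTTP"
    destination: str  # destination class, e.g. "internet" or "site-dns"


# Allowed (direction, protocol) pairs mapped to the destination class each may reach.
# The entries mirror the example policy in the text; the labels are made up.
POLICY = {
    ("outbound", "HTTP"): "any",
    ("outbound", "HTTPS"): "any",
    ("outbound", "FTP"): "any",
    ("outbound", "DNS"): "site-dns",
    ("outbound", "SMTP"): "site-mail",
    ("outbound", "POP3"): "site-mail",
    ("outbound", "VPN"): "site-vpn",
    ("inbound", "HTTP"): "site-web",
    ("inbound", "HTTPS"): "site-web",
    ("inbound", "DNS"): "site-dns",
    ("inbound", "SMTP"): "site-mail",
}


def is_authorized(flow: Flow) -> bool:
    """Return True only if the policy names this protocol AND this destination."""
    allowed = POLICY.get((flow.direction, flow.protocol))
    if allowed is None:
        return False  # protocol not listed at all: immediately suspect
    return allowed == "any" or allowed == flow.destination
```

Under this sketch, `is_authorized(Flow("outbound", "Gnutella", "internet"))` is false because the protocol is not listed, and an outbound DNS query to an arbitrary Internet server is also false because only the site's own DNS servers are authorized destinations.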
Without a defined security policy, analysts must constantly wonder whether observed
protocols are authorized. Analysts have to resolve questions by contacting site administrators.
Once a responsible party validates the use of the protocol, analysts can move on
to the next event. Analysts working without well-defined security policies often define
their own “site profiles” by listing the protocols noted as being acceptable in the past.
Creating and maintaining these lists wastes time better spent detecting intrusions.
NSM does not include protection as a traditional aspect. NSM is not an active component
of an access control strategy, and the theory does not encompass intrusion prevention
or intrusion protection systems (IPSs). An IPS is an access control device, like a
firewall. An IDS or NSM sensor is an audit or traffic inspection system. The fact that an
access control device makes decisions at OSI model layer 7 (application content) rather
than layer 3 (IP address) or 4 (port) does not justify changing its name from “firewall” to
“IPS.” Any device that impedes or otherwise blocks traffic is an access control device,
regardless of how it makes its decision. The term “IPS” was invented by marketing staff
tired of hearing customers ask, “If you can detect it, why can’t you stop it?” The marketers
replaced the detection “D” in IDS with the more proactive protection “P” and gave birth
to the IPS market.
There’s nothing wrong with devices making access control decisions using layer 7 data.
It’s a natural and necessary evolution as more protocols are tunneled within existing protocols.
Simple Object Access Protocol (SOAP) over HTTP using port 80 TCP is one
example. If application designers restricted themselves to running separate protocols on
separate ports, network-based access control decisions could largely be made using information
from layers 3 and 4. Unfortunately, no amount of engineering is going to put the
multiprotocol genie back into its bottle.
While NSM is not itself a prevention strategy, prevention does help NSM be more
effective. Three protective steps are especially useful: access control (which implements
policy), traffic scrubbing, and proxies.
When access control enforces a well-defined security policy, heaven shines on the NSM
analyst. Earlier we looked at the benefits of a security policy that says what should and
should not be seen on an organization’s network. When access control devices enforce
that policy, unauthorized protocols are prevented from entering or leaving an organization’s
network. This strategy allows analysts to focus on the allowed protocols. Instead of
having to watch and interpret hundreds of protocols, analysts can carefully examine a
If analysts identify a protocol not authorized by the security policy, they know the
access control device has failed. This may be the result of malicious action, but it is more
often caused by misconfigurations. I am personally familiar with several intrusions specifically
caused by accidental removal of access control rules. During the period when
“shields were dropped,” intruders compromised exposed victims.
When NSM works in conjunction with well-defined security policies and appropriately
enforced access control, it offers the purest form of network auditing. Deviations
from policy are easier to identify and resolve. The traffic load on the sensor is decreased if
its field of view is restricted by access control devices. An organization’s bandwidth is
devoted to the protocols that contribute to productivity, not to sharing the latest pirated
movie over a peer-to-peer connection. Intruders have many fewer attack vectors, and
NSM analysts are intently watching those limited channels.
I mentioned packet or traffic scrubbing in Chapter 1 as a form of normalization, or the
process of removing ambiguities in a traffic stream. Chapter 3 briefly expanded on this
idea by mentioning dropping packets with invalid TCP flag combinations. Traffic scrubbing
is related to access control, in that scrubbing can sometimes deny traffic that doesn’t
meet accepted norms. Where scrubbing is implemented, traffic will be somewhat easier
Certain “schools” of intrusion detection spend most of their time analyzing odd
packet traces because they don’t collect much beyond packet headers.3 If unusual packets,
such as IP fragments, are not allowed to traverse the organization’s Internet gateway, they
cannot harm the site. The only justification for analyzing odd traffic is pure research. In
budget-challenged organizations, time is better spent dealing with application content as
shown in transcripts of full content data collected by using NSM techniques.
Traffic scrubbing is another way to make network traffic more deterministic. On some
networks, arbitrary protocols from arbitrary IP addresses are allowed to pass in and out
of the site’s Internet gateway. This sort of freedom helps the intruder and frustrates the
analyst. It is much more difficult to identify malicious traffic when analysts have no idea
what “normal” traffic looks like. Any steps that reduce the traffic variety will improve
NSM detection rates.
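The idea of dropping packets with invalid TCP flag combinations can be sketched in a few lines. The specific combinations rejected here (no flags at all, SYN with FIN, SYN with RST) are an illustrative assumption, chosen because they are commonly treated as illegitimate; the text does not list which combinations a scrubber should drop, and the function names are hypothetical.

```python
# Bit values of the six classic TCP flags as they appear in the TCP header.
FIN, SYN, RST, PSH, ACK, URG = 0x01, 0x02, 0x04, 0x08, 0x10, 0x20


def has_invalid_tcp_flags(flags: int) -> bool:
    """Flag combinations that no well-behaved TCP stack should produce."""
    flags &= 0x3F  # keep only the six classic flag bits
    if flags == 0:
        return True  # "null" packet: no flags set
    if flags & SYN and flags & FIN:
        return True  # opening and closing a connection at once
    if flags & SYN and flags & RST:
        return True  # opening and resetting a connection at once
    return False


def scrub(flag_values):
    """Return only the flag values that would be allowed through."""
    return [f for f in flag_values if not has_invalid_tcp_flags(f)]
```

A real scrubber works on whole packets inline at the gateway; this sketch only shows the decision rule applied to the flags field.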
Proxies are applications that insert themselves between clients and servers for reasons of
security, monitoring, or performance. A client that wishes to speak to a server first connects
to the proxy. If the client’s protocol meets the proxy’s expectations, the proxy connects
on behalf of the client to the server. Figure 11.2 depicts this exchange.
For the case of HTTP traffic, a proxy like Nylon or Squid that implements the SOCKS
protocol can be used.4 From the prevention point of view, the key element of a proxy is its
protocol awareness. The proxy should be able to differentiate between legitimate and illegitimate
use of the port associated with a protocol. For example, an HTTP proxy should
be able to recognize and pass legitimate HTTP over port 80 TCP but block and log unauthorized
protocols running over port 80 TCP. This scenario appears in Figure 11.3.
Some applications tunnel their protocols within other protocols. For example, tools
like HTTPTunnel can encapsulate arbitrary protocols within well-formatted HTTP
requests.5 If the proxy is not smart enough to recognize that the supposed HTTP traffic
doesn’t behave like legitimate HTTP traffic, the proxy will pass it (see Figure 11.4).
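A minimal sketch of protocol awareness follows, assuming the proxy only inspects the first bytes a client sends on port 80 TCP. The regular expression and the function names are hypothetical, and the check is deliberately naive.

```python
import re

# A simplified HTTP/1.x request line: method, target, version, CRLF.
REQUEST_LINE = re.compile(
    rb"^(GET|HEAD|POST|PUT|DELETE|OPTIONS|TRACE|CONNECT) \S+ HTTP/1\.[01]\r\n"
)


def looks_like_http_request(first_bytes: bytes) -> bool:
    """True if the client's opening bytes resemble an HTTP/1.x request."""
    return REQUEST_LINE.match(first_bytes) is not None


def proxy_decision(first_bytes: bytes) -> str:
    """Forward plausible HTTP; block and log anything else seen on port 80."""
    return "forward" if looks_like_http_request(first_bytes) else "block-and-log"
```

An SSH banner sent to port 80 would be blocked by this check, but a tunneling tool that wraps its payload in well-formatted HTTP requests would pass it. That is the limitation described above: recognizing such tunnels requires looking at how the traffic behaves, not only at whether it is syntactically valid.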
A proxy can be used as an application-based form of access control. If the application
doesn’t speak the protocols expected by the proxy, the proxy won’t forward the traffic.
Many organizations proxy outbound HTTP traffic for purposes of monitoring unauthorized
Web surfing. NSM is more concerned with limiting an intruder’s opportunities for
communicating with the outside world. Projects like DCPhoneHome and Gray-World
are dedicated to finding ways to circumvent outbound access control methods like proxies
and firewall egress control rules.6
Beyond proxies lie application-layer firewalls. These products make decisions based on
the packet or stream application content. Firewall vendors are busy adding these features
to their products. Even Cisco routers, using their Network-Based Application Recognition
ARE ALL OF THESE “MIDDLEBOXES” A GOOD IDEA?
So many systems have been placed between clients and servers that they have their
own name—middleboxes. A middlebox is any device other than an access switch
or router between a client and a server. Because the Internet was designed with an
end-to-end infrastructure in mind, these intervening devices often impair the
functionality of protocols. A few examples of middleboxes include the following:
• Network and port address translation devices
• Load balancing appliances
So many middlebox devices exist that an informational RFC was written to
describe them (see http://www.faqs.org/rfcs/rfc3234.html). Security architects
must balance the need to protect systems against the possibility their interventions
will break desired features.
A locked-down network is a boring network. Organizations with
well-developed policies, access control, traffic scrubbing, and proxies don’t announce
discoveries of the latest back door on hundreds of their servers. They tend not to get
infected by the latest Trojans or contribute thousands of participants to the bigger bot
nets. They may also suffer the perverse effect of lower budgets because their security
strategies work too effectively, blinding management to the many disasters they avoided.
Keep this in mind if your analysts complain that their work is not challenging.
Detection is the process of collecting, identifying, validating, and escalating suspicious
events. It has traditionally been the heart of the reasoning behind deploying IDSs. Too
many resources have been devoted to the identification problem and fewer to issues of
validation and escalation. This section is a vendor-neutral examination of detecting
intrusions using NSM principles.
As mentioned, detection requires four phases.
1. Collection: The process begins with all traffic. Once the sensor performs collection, it
outputs observed traffic to the analyst. With respect to full content collection, the data
is a subset of all the traffic the sensor sees. Regarding other sorts of NSM data (session,
statistical, alert), the data represents certain aspects of the traffic seen by the sensor.
2. Identification: The analyst performs identification on the observed traffic, judging it
to be normal, suspicious, or malicious. This process sends events to the next stage.
3. Validation: The analyst categorizes the events into one of several incident categories.
Validation produces indications and warnings.
4. Escalation: The analyst forwards incidents to decision makers. Incidents contain
actionable intelligence that something malicious has been detected.
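The four phases can be pictured as a pipeline in which each stage narrows what the previous stage produced. The sketch below is only a way of visualizing that flow; the `Event` type and the callback parameters (`sensor_sees`, `judge`, `categorize`) are hypothetical stand-ins for the sensor configuration and the analyst's judgment.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class Event:
    summary: str
    judgment: str                   # "normal", "suspicious", or "malicious"
    category: Optional[str] = None  # filled in during validation


def collect(all_traffic: Iterable[str], sensor_sees: Callable[[str], bool]) -> List[str]:
    """Collection: all traffic is reduced to the observed traffic."""
    return [t for t in all_traffic if sensor_sees(t)]


def identify(observed: Iterable[str], judge: Callable[[str], str]) -> List[Event]:
    """Identification: judge observed traffic; pass on whatever is not normal."""
    events = [Event(summary=t, judgment=judge(t)) for t in observed]
    return [e for e in events if e.judgment != "normal"]


def validate(events: Iterable[Event],
             categorize: Callable[[Event], Optional[str]]) -> List[Event]:
    """Validation: assign an incident category; uncategorized events drop out."""
    validated = []
    for event in events:
        event.category = categorize(event)
        if event.category is not None:
            validated.append(event)
    return validated


def escalate(incidents: Iterable[Event]) -> List[str]:
    """Escalation: produce the reports forwarded to decision makers."""
    return [f"{e.category}: {e.summary}" for e in incidents]
```

Chaining the calls as `escalate(validate(identify(collect(...), ...), ...))` mirrors the order of the list above.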
Collection involves accessing traffic for purposes of inspection and storage. Chapter 2
discussed these issues extensively. Managers are reminded to procure the most capable
hardware their budgets allow. Thankfully the preferred operating systems for NSM operations,
such as the BSDs and Linux, run on a variety of older equipment. In this respect
they outperform Windows-based alternatives, although it’s worth remembering that
Windows NT 4 can run on a system with 32MB of RAM.9 Nevertheless, few sensors collect
everything that passes by, nor should they. Because few sensors see and record all
traffic, the subset they do inspect is called observed traffic.
Not discussed in Chapter 2 was the issue of testing an organization’s collection strategy.
It’s extremely important to ensure that your collection device sees the traffic it
should. IDS community stars like Ron Gula and Marcus Ranum have stressed this reality
for the past decade. Common collection problems include the following:
• Misconfiguration or misapplication of filters or rules to eliminate undesirable events
• Deployment on links exceeding the sensor’s capacity
• Combining equipment without understanding the underlying technology
Any one of these problems results in missed events. For example, an engineer could
write a filter that ignores potentially damaging traffic in the hopes of reducing the
amount of undesirable traffic processed by the sensor. Consider the following scenario.
Cable modem users see lots of ARP traffic, as shown here.
Deployment of underpowered hardware on high-bandwidth links is a common problem.
Several organizations test IDSs under various network load and attack scenario conditions.
• Neohapsis provides the Open Security Evaluation Criteria (OSEC) at http://
• ICSA Labs, a division of TruSecure, offers criteria for testing IDSs at http://
• The NSS Group provides free and paid-only reviews at http://www.nss.co.uk/.
• Talisker’s site, while not reviewing products per se, categorizes them at http://
The IATF is organized by the National Security Agency (NSA) to foster discussion among developers and users of digital security products. The federal government is heavily represented. I
attended in a role as a security vendor with Foundstone. The October meeting
focused on Protection Profiles (PPs) for IDSs.12 According to the Common Criteria,
a PP is “an implementation-independent statement of security requirements
that is shown to address threats that exist in a specified environment.”13 According
to the National Institute of Standards and Technology (NIST) Computer Security
Resource Center (http://csrc.nist.gov/) Web site, the Common Criteria for IT
Security Evaluation is “a Common Language to Express Common Needs.”14
Unfortunately, many people at the IATF noted that the IDS PP doesn’t require a
product to be able to detect intrusions. Products evaluated against the PPs are
listed at http://niap.nist.gov/cc-scheme/ValidatedProducts.html.
This process seems driven by the National Information Assurance Partnership
(NIAP, at http://niap.nist.gov/), a joint NIST-NSA group “designed to meet the
security testing, evaluation, and assessment needs of both information technology
(IT) producers and consumers.”15 The people who validate products appear to be
part of the NIAP Common Criteria Evaluation and Validation Scheme (CCEVS)
Validation Body, a group jointly managed by NIST and NSA.16
I haven’t figured out how all of this works. For example, I don’t know how the
Evaluation Assurance Levels like “EAL4” fit in.17 I do know that companies trying to
get a product through this process can spend “half a million dollars” and 15+ months,
according to speakers at the IATF Forum. Is this better security? I don’t know yet.
Beyond issues with filters and high traffic loads, it’s important to deploy equipment
properly. I see too many posts to mailing lists describing tap outputs connected to hubs.
With a sensor connected to the hub, analysts think they’re collecting traffic. Unfortunately,
all they are collecting is proof that collisions in hubs attached to taps do not result
in retransmission of traffic. (We discussed this in Chapter 3.)
I highly recommend integrating NSM collection testing with independent audits, vulnerability
scanning, and penetration testing. If your NSM operation doesn’t light up like
a Christmas tree when an auditor or assessor is working, something’s not working properly.
Using the NSM data to validate an assessment is also a way to ensure that the assessors
are doing worthwhile work.
Once while doing commercial monitoring I watched an “auditor” assess our client. He
charged them thousands of dollars for a “penetration test.” Our client complained that we
didn’t report on the auditor’s activities. Because we collected every single packet entering
and leaving the small bank’s network, we reviewed our data for signs of penetration testing.
All we found was a single Nmap scan from the auditor’s home IP address. Based on
our findings, our client agreed not to hire that consultant for additional work.
Once all traffic is distilled into observed traffic, it’s time to make sense of it. Identification
is the process of recognizing packets as being unusual. Observed traffic is transformed into
events. Events and the traffic they represent fall into three categories: normal, suspicious, and malicious.
Normal traffic is anything that is expected to belong on an organization’s network.
HTTP, FTP, SMTP, POP3, DNS, and IPsec or SSL would be normal traffic for many
enterprises. Suspicious traffic appears odd at first glance but causes no damage to corporate
assets. While a new peer-to-peer protocol may be unwelcome, its presence does not
directly threaten to compromise the local Web or DNS server. An example of this sort of
traffic appears below and in a case study in Chapter 14. Malicious traffic is anything that
could negatively impact an organization’s security posture. Attacks of all sorts fit into the
malicious category and are considered incidents.
To fully appreciate the three classes of traffic, let’s take a look at a simple mini case
study. While writing this chapter I received the following alert in my Sguil console. (Sguil
is an open source interface to NSM data described in Chapter 10.)
The two elements of the signature that do the real work are shown in bold. The M
means Snort watches to see if the More fragments bit is set in the IP header of the packet.
The 25 means Snort checks to see if the “Data” or packet payload is fewer than 25 bytes.18
Fragments are an issue for IDSs because some products do not properly reassemble them.
There’s nothing inherently evil about fragmentation; it is IP’s way of accommodating
protocols that send large packets over links with smaller MTUs.
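The two checks described above (More fragments bit set, payload fewer than 25 bytes) can be approximated directly against a raw IPv4 header. This is a sketch of the logic only, not of how Snort implements it: the function name is hypothetical, and "payload" is taken here to mean the bytes following the IP header, which is a simplifying assumption.

```python
import struct

MF_FLAG = 0x2000  # More Fragments bit in the 16-bit flags/fragment-offset field


def is_tiny_fragment(ip_packet: bytes, max_payload: int = 25) -> bool:
    """True if the IPv4 packet has MF set and carries fewer than max_payload bytes."""
    if len(ip_packet) < 20:
        return False  # too short to hold a minimal IPv4 header
    # First 8 header bytes: version/IHL, TOS, total length, ID, flags + fragment offset.
    version_ihl, _tos, total_length, _ident, flags_frag = struct.unpack(
        "!BBHHH", ip_packet[:8]
    )
    if version_ihl >> 4 != 4:
        return False  # not IPv4
    header_len = (version_ihl & 0x0F) * 4  # IHL is counted in 32-bit words
    payload_len = total_length - header_len
    more_fragments = bool(flags_frag & MF_FLAG)
    return more_fragments and payload_len < max_payload
```

A normal, unfragmented ping fails the first condition, and an ordinary first fragment of a large datagram fails the second, so only unusually small fragments match.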
Let’s use ICMP as an example of a protocol that can send normal or fragmented traffic.
First take a look at normal ICMP traffic, such as might be issued with the ping command.
The –c switch says send a single ping.19
Analysts using NSM tools and tactics have the data they need to validate
events. Validation in NSM terms means assigning an event into one of several categories.
NSM practitioners generally recognize seven incident categories developed by the Air
Force in the mid-1990s. The Sguil project adopted these categories and defines them as follows.
• Category I: Unauthorized Root/Admin Access
A Category I event occurs when an unauthorized party gains root or administrator control
of a target. Unauthorized parties are human adversaries, both unstructured and
structured threats. On UNIX-like systems, the root account is the “super-user,” generally
capable of taking any action desired by the unauthorized party. (Note that so-called
Trusted operating systems, like Sun Microsystems’ Trusted Solaris, divide the powers of
the root account among various operators. Compromise of any one of these accounts
on a Trusted operating system constitutes a Category I incident.) On Windows systems,
the administrator has nearly complete control of the computer, although some powers
remain with the SYSTEM account used internally by the operating system itself. (Compromise
of the SYSTEM account is considered a Category I event as well.) Category I incidents
are potentially the most damaging type of event.
• Category II: Unauthorized User Access
A Category II event occurs when an unauthorized party gains control of any nonroot
or nonadministrator account on a client computer. User accounts include those held by
people as well as applications. For example, services may be configured to run or interact
with various nonroot or nonadministrator accounts, such as apache for the Apache
Web server or IUSR_machinename for Microsoft’s IIS Web server. Category II incidents
are treated as though they will quickly escalate to Category I events. Skilled attackers
will elevate their privileges once they acquire user status on the victim machine.
• Category III: Attempted Unauthorized Access
A Category III event occurs when an unauthorized party attempts to gain root/administrator
or user-level access on a client computer. The exploitation attempt fails for one
of several reasons. First, the target may be properly patched to reject the attack. Second,
the attacker may find a vulnerable machine but may not be sufficiently skilled to execute
the attack. Third, the target may be vulnerable to the attack, but its configuration prevents
compromise. (For example, an IIS Web server may be vulnerable to an exploit
employed by a worm, but the default locations of critical files have been altered.)
• Category IV: Successful Denial-of-Service Attack
A Category IV event occurs when an adversary takes damaging action against the
resources or processes of a target machine or network. Denial-of-service attacks may
consume CPU cycles, bandwidth, hard drive space, users’ time, and many other
• Category V: Poor Security Practice or Policy Violation
A Category V event occurs when the NSM operation detects a condition that exposes
the client to unnecessary risk of exploitation. For example, should an analyst discover
that a client domain name system server allows zone transfers to all Internet users, he
or she will report the incident as a Category V event. (Zone transfers provide complete
information on the host names and IP addresses of client machines.) Violation of a client’s
security policy also constitutes a Category V incident. Should a client forbid the
use of peer-to-peer file-sharing applications, detections of Napster or Gnutella traffic
will be reported as Category V events.
• Category VI: Reconnaissance/Probes/Scans
A Category VI event occurs when an adversary attempts to learn about a target system
or network, with the presumed intent to later compromise that system or network.
Reconnaissance events include port scans, enumeration of NetBIOS shares on Windows
systems, inquiries concerning the version of applications on servers, unauthorized
zone transfers, and similar activity. Category VI activity also includes limited
attempts to guess user names and passwords. Sustained, intense guessing of user names
and passwords would be considered Category III events if unsuccessful.
• Category VII: Virus Infection
A Category VII event occurs when a client system becomes infected by a virus or worm.
Be aware of the difference between a virus and a worm. Viruses depend on one or both
of the following conditions: (1) human interaction is required to propagate the virus,
and (2) the virus must attach itself to a host file, such as an e-mail message, Word document,
or Web page. Worms, on the other hand, are capable of propagating themselves
without human interaction or host files. The discriminator for classifying a Category VII
event is the lack of human interaction with the target. Compromise via automated code
is a Category VII event, while compromise by a human threat is a Category I or II event.
If the nature of the compromise cannot be identified, use a Category I or II designation.
These categories are indicators of malicious activity, although classifying an event as a
Category I or II incident generally requires a high degree of confidence in the event data.
Typically the process of identification, validation, and escalation of high-impact events is
done in an integrated fashion. Analysts watching well-protected sites encounter few Category
I or II events, so these events often stand out like a sore thumb against the sea of
everyday Category III and VI events.
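The seven categories, together with the rule for choosing between Category VII and Categories I or II, can be captured in a small lookup. This is an illustrative sketch: the enum member names and the `classify_compromise` function are hypothetical and are not part of Sguil.

```python
from enum import Enum
from typing import Optional


class IncidentCategory(Enum):
    CAT_I = "Unauthorized Root/Admin Access"
    CAT_II = "Unauthorized User Access"
    CAT_III = "Attempted Unauthorized Access"
    CAT_IV = "Successful Denial-of-Service Attack"
    CAT_V = "Poor Security Practice or Policy Violation"
    CAT_VI = "Reconnaissance/Probes/Scans"
    CAT_VII = "Virus Infection"


def classify_compromise(automated: Optional[bool], root_access: bool) -> IncidentCategory:
    """Pick a category for a confirmed compromise.

    automated: True for self-propagating code, False for a human intruder,
    None when the nature of the compromise cannot be identified.
    """
    if automated is True:
        return IncidentCategory.CAT_VII
    # Human threat, or unknown: fall back to Category I or II as the text advises.
    return IncidentCategory.CAT_I if root_access else IncidentCategory.CAT_II
```

For example, `classify_compromise(None, False)` yields Category II, reflecting the guidance to use a Category I or II designation when the nature of the compromise is unknown.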
Formal definitions of indications and warnings tend to break down when the model
involves recognition of actual compromise. The definitions here are based on military
indications and warning (I&W) concepts. The military’s I&W model is based on identifying
activity and deploying countermeasures prior to the enemy’s launch of a physical,
violent attack. If this physical attack, involving aircraft firing missiles or terrorists exploding
bombs, is compared to an intrusion, there’s no need to talk in terms of indications or
warnings. Once shells start flying, there’s no doubt as to the enemy’s intentions.
For NSM, it’s a fuzzier concept. If an analyst discovers an intrusion, one stage of the
game is over. Talk of indications and warnings seems “overcome by events.” The victim is
compromised; what more is there to do or say? However, it’s crucial to recognize there’s
no “blinking red light” in NSM. Even when analysts possess concrete evidence of compromise,
it may not be what they think.
Thus far each step has been a thought exercise for the analyst. The sensor transforms
all traffic into a subset of observed traffic. Analysts access that traffic or are provided
alerts based on it. They perform identification by judging traffic as normal, suspicious, or
malicious. At the point where they are ready to physically classify an event, they must
have a mechanism for validating the information presented by their NSM console.
Sguil (see Chapter 10) provides an open source example of validating an event. First, the
analyst reviews alerts and observed traffic information on her console (see Figure 11.13).
All of the alerts in this Sguil console are unvalidated. The “ST” column at the far left of
each of the top three panes reads “RT,” which means “real time.” The highlighted alert
shows an “MS-SQL Worm propagation attempt.” This is the result of the SQL Slammer
SHORT-TERM INCIDENT CONTAINMENT
Short-term incident containment (STIC) is the step taken immediately upon confirmation
that an intrusion has occurred. When a system is compromised, incident response
teams react in one or more of the following ways.
1. Shut down the switch port to which the target attaches to the network.
2. Remove the physical cable connecting the target to the network.
3. Install a new access control rule in a filtering router or firewall to deny traffic to and
from the target.
Any one of these steps is an appropriate short-term response to discovery of an intrusion.
I have dealt with only a handful of cases where an intruder was allowed completely uninterrupted
access to a victim as soon as its owner recognized it was compromised. Most
sites want to interrupt the intruder’s access to the victim. Note that I do not list “shut
down the server” as an acceptable STIC action. Yanking the power cable or shutting down
the system destroys valuable volatile forensic evidence.
Initiating STIC gives the incident response team time and breathing room to formulate
a medium-term response. This may involve “fish-bowling” the system to watch for
additional intruder activity or patching/rebuilding the victim and returning it to duty. In
both cases, emergency NSM plays a role.
EMERGENCY NETWORK SECURITY MONITORING
While STIC is in force and once it has been lifted, the NSM operation should watch for
additional signs of the intruder and implement enhanced monitoring. In cases where
round-the-clock, wide-open full content data collection is not deployed, some sort of
limited full content data collection against the victim and/or the source of the intrusion
should be started. As we saw in earlier chapters, the only common denominator in an
intrusion is the victim IP. Attackers can perform any phase of the compromise from a
variety of source IPs. Once a victim is recognized as being compromised, it’s incredibly
helpful to begin full content data collection on the victim IP address. Having the proper
equipment in place prior to a compromise, even if it’s only ready to start collecting when
instructed, assists the incident response process enormously.
Emergency NSM is not necessary if a site already relies on a robust NSM operation. If
the organization collects all of the full content, session, alert, and statistical data it needs,
collection of emergency data is irrelevant. In many cases, especially those involving high-bandwidth
sites, ad hoc monitoring is the only option. Once a victim is identified, ad hoc
sensors should be deployed to capture whatever they can.
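Because the victim IP is the one constant, an ad hoc sensor can be prepared in advance so that collection starts with a single command. The sketch below only builds a Tcpdump argument list keyed on the victim address and does not run it. The interface name and output path in the usage note are hypothetical; `-n`, `-i`, `-s`, and `-w` are standard Tcpdump options, and `host` is a standard capture filter primitive.

```python
import ipaddress
from typing import List


def build_capture_command(victim_ip: str, interface: str, out_file: str) -> List[str]:
    """Return a tcpdump argument list for full content capture of one host."""
    ipaddress.ip_address(victim_ip)  # raises ValueError if the address is malformed
    return [
        "tcpdump",
        "-n",             # do not resolve names; avoids generating extra DNS traffic
        "-i", interface,  # capture interface
        "-s", "0",        # snapshot length 0: capture whole packets
        "-w", out_file,   # write raw packets to a file for later review
        "host", victim_ip,
    ]
```

For instance, `build_capture_command("192.0.2.10", "em0", "/nsm/victim.pcap")` produces a command that records every packet to or from 192.0.2.10, regardless of which source addresses the intruder uses.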
It’s amazing how many organizations muddle through incident response scenarios
without understanding an intrusion. It’s like a general directing forces in battle without
knowing if they are taking the next hill, being captured by the enemy, or deserting for
Canada. Emergency NSM is one of the best ways to scope the extent of the incident, identify
countermeasures, and validate the effectiveness of remediation. How does a site really
know if it has successfully shut out an intruder? With NSM, the answer is simple: no evidence
of suspicious activity appears after implementation of countermeasures. Without
this validation mechanism, the effectiveness of remediation is often indeterminate.
EMERGENCY NSM IN ACTION
I have had the good fortune to perform several incident response activities at several
huge corporations. One of the sites suffered systematic, long-term compromise
during a three-year period. Several colleagues and I were asked to figure out what
was happening and to try to cut off the intruder’s access to the victim company.
We performed host-based live response on systems the corporation suspected
of being compromised. The results weren’t as helpful as we had hoped, as live
response techniques largely rely on the integrity of the host’s kernel. If the victim’s
kernel were modified by a loadable kernel module root kit, we wouldn’t be able to
trust the output of commands run to gather host-based evidence.
I volunteered to start emergency NSM. The client provided six Proliant servers,
and I installed FreeBSD 4.5 RELEASE on each system. I placed each of the
new sensors in critical choke points on the client network where I suspected the
intruder might have access. I started collecting full content data with Tcpdump
and statistical data with Trafd.27 (Back then I was not yet aware of Argus as a session
data collection tool.)
Shortly after I started monitoring, I captured numerous outbound X protocol
sessions to hosts around the globe. The intruder had compromised numerous
UNIX systems and installed entries in their crontab files. These entries instructed
the victims to “phone home” at regular intervals, during which the intruder would
issue commands. In one of the X sessions, I watched the intruder for 53 minutes.
He moved from system to system using valid credentials and built-in remote
access services like Telnet and rlogin. He unknowingly led me to many of the systems
he had compromised.
Using this information, we began an “intruder-led” incident response. All of the
systems the intruder contacted were rebuilt and patched, and a site-wide password
change was performed. When the intruder returned, he couldn’t access those systems,
but he found a few others he hadn’t touched in round one. Following the
end of his second observed X session, we remediated the new list of compromised
systems. Once the intruder had no luck reaching any system on the client network,
we considered it more or less “secure.” I continued performing emergency NSM
for several months to validate the success of the incident response plan, eventually
replacing full content data collection with Argus.
The most useful emergency NSM data is session-based. Argus can be quickly deployed
on a FreeBSD-based system and placed on a live network without concern for signatures,
manning, or other operational NSM issues. Argus data is very compact, and its content-neutral
approach can be used to validate an intruder’s presence if his or her IP address or
back door TCP or UDP port is known. Beyond this point lies full-blown incident
response, which I leave for other books beyond the scope of this one.
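The content-neutral search described above amounts to filtering session records on a known intruder address or back door port. The sketch below uses a made-up `Session` record for illustration; it is not Argus's record format, and the function name is hypothetical.

```python
from typing import Iterable, List, NamedTuple, Optional


class Session(NamedTuple):
    start: str     # timestamp of the first packet, as text
    src_ip: str
    dst_ip: str
    dst_port: int
    proto: str     # "tcp" or "udp"


def find_intruder_sessions(
    sessions: Iterable[Session],
    intruder_ip: Optional[str] = None,
    backdoor_port: Optional[int] = None,
) -> List[Session]:
    """Return sessions involving the intruder's IP or the known back door port."""
    hits = []
    for s in sessions:
        ip_match = intruder_ip is not None and intruder_ip in (s.src_ip, s.dst_ip)
        port_match = backdoor_port is not None and s.dst_port == backdoor_port
        if ip_match or port_match:
            hits.append(s)
    return hits
```

Run against the session data collected after countermeasures are in place, an empty result is the kind of evidence that remediation worked; any hit means the intruder still has a path in or out.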
BACK TO ASSESSMENT
We end our journey through the security process by returning to assessment. We’re back
at this stage to discuss a final NSM best practice that is frequently overlooked: analyst
feedback. Front-line analysts have the best seat in the house when it comes to understanding
the effectiveness of an NSM operation. Their opinions matter!
Too often analyst opinions take a back seat to developer requirements. I’ve seen many
NSM operations struggle to overcome developer-led initiatives. While developers are frequently
the most technically savvy members of any NSM operation, they are not in the
best position to judge the needs of the analysts they support. Analysts should have a way
to communicate their opinions on the effectiveness of their tool sets to developers.
The most important channel for communication involves IDS signature refinement.
Many shops task engineers with developing and deploying signatures. Analysts are left to
deal with the consequences by validating events. The signature might be terrible, alerting
on a wide variety of benign traffic. Managers should ensure that analysts have an easy way
to let engineers know if their signatures operate properly. A simple way to accomplish this
goal is to offer a special “incident” category for signature feedback. By validating events
with this unique value, engineers can quickly determine analysts’ satisfaction with rules.
Engineers should remember that rules that cause too many useless alerts actually harm
detection efforts. Analysts would be better served by more accurate alerts that represent
truly significant events.
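One way engineers might consume that feedback is to tally how often each signature was validated into the special feedback category. This is a hypothetical sketch: the category label, the event tuples, and the threshold are assumptions for illustration, not features of any particular console.

```python
from collections import Counter
from typing import Iterable, List, Tuple

FEEDBACK_CATEGORY = "signature-feedback"  # hypothetical label for the special category


def noisy_signatures(
    validated_events: Iterable[Tuple[str, str]], threshold: int
) -> List[Tuple[str, int]]:
    """Given (signature, category) pairs, list signatures analysts flagged most often."""
    counts = Counter(
        signature
        for signature, category in validated_events
        if category == FEEDBACK_CATEGORY
    )
    # most_common() sorts by descending count, so the worst offenders come first.
    return [(sig, n) for sig, n in counts.most_common() if n >= threshold]
```

Signatures that appear at the top of this list are candidates for tuning or removal, since each count represents an alert an analyst judged to be useless.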
Bejtlich, R. (2004). The Tao of Network Security Monitoring: Beyond Intrusion Detection (1st ed.). Addison-Wesley Professional.