homepage
Open menu Go one level top
  • Train and Certify
    • Get Started in Cyber
    • Courses & Certifications
    • Training Roadmap
    • Search For Training
    • Online Training
    • OnDemand
    • Live Training
    • Summits
    • Cyber Ranges
    • College Degrees & Certificates
    • NICE Framework
    • DoDD 8140
    • Specials
  • Manage Your Team
    • Overview
    • Security Awareness Training
    • Voucher Program
    • Private Training
    • Workforce Development
    • Skill Assessments
    • Hiring Opportunities
  • Resources
    • Overview
    • Reading Room
    • Webcasts
    • Newsletters
    • Blog
    • Tip of The Day
    • Posters
    • Top 25 Programming Errors
    • The Critical Security Controls
    • Security Policy Project
    • Critical Vulnerability Recaps
    • Affiliate Directory
  • Focus Areas
    • Blue Team Operations
    • Cloud Security
    • Digital Forensics & Incident Response
    • Industrial Control Systems
    • Leadership
    • Offensive Operations
  • Get Involved
    • Overview
    • SANS Community
    • CyberTalent
    • Work Study
    • Instructor Development
    • Sponsorship Opportunities
    • COINS
  • About
    • About SANS
    • Why SANS?
    • Instructors
    • Cybersecurity Innovation Awards
    • Contact
    • Frequently Asked Questions
    • Customer Reviews
    • Press Room
  • Log In
  • Join
  • Contact Us
  • SANS Sites
    • GIAC Security Certifications
    • Internet Storm Center
    • SANS Technology Institute
    • Security Awareness Training
  • Search
  1. Home >
  2. Blog >
  3. Extracting Known Bad Hash Set From NSRL
Tony Rodrigues

Extracting Known Bad Hash Set From NSRL

February 22, 2010

Hash filtering is a time-saving technique for a computer forensics examiner when working on a huge disk image. In a nutshell, this technique can filter out all those files in your image that belong to the operating system or well-known software packages. This will let the examiner focus on unknown files, reducing the scope of the investigation. After all, there's no point in spending time checking files we already know.

This filtering operation is based on hashes. Usually, we calculate the hash for every file in the image and check it against a list of hashes previously calculated over known good files. We call this list the known good hash set. All files with hashes matching the list are filtered out.

On the other hand, we would like to know if there are malicious files in our computer forensics case image. Again, the technique works by calculating the hash for every file in the image, looking for matches in a list containing pre-calculated hashes for known malicious files, viruses, cracker's tools, or anything you judge to be a malicious file. We call this list the known bad hash set and we want to be alerted when matches occur.

It's not an easy task to keep such hashsets, and they need to be huge in order to be effective. Thankfully, others are collecting files and calculating hashes for us. The National Institute for Standards and Technology maintains the National Software Reference Library or NSRL, which is one of the best hashset libraries available, it's public and free!

Unfortunately, life is not a bed of roses.

Practically all tools that use hash sets for filtering have a way to say "this is my known good hash set, ignore everything found here" and "this is my known bad hash set, ring all bells when something matches here". The SleuthKit tool SORTER does that using -x (for known good) and -a (for known bad). However, the NSRL hash set contains both good and bad files. If we use it as known good, there's a risk of ignoring malicious files in the image. If we use it as known bad, we will have thousands of false positives. What to do?

Doug White, from NIST, has given me good advice.

The NSRL file that correlates hashes and file names is NSRLFile.txt while NSRLProd.txt softs the files by classification. The known bad files belong to products classified as "Hacker Tool". So, we can separate them. You can use MS LogParse, AWK or any programming language. I prefer Perl and here is the code:

#!/usr/bin/perl -w
# Extracts known good and known bad hashsets from NSRL
# uso: nsrlext.pl -n <nsrl files comma separated> -p <nsrl prod files comma separated> -g <known good txt> -b <known bad txt> [-h]
#
# -n :nsrl files comma separated. Ex: -n c:\nsrl\RDA_225_A\NSRLFile.txt,c:\nsrl\RDA_225_B\NSRLFile.txt
# -p :nsrl prod files comma separated. Ex: -p c:\nsrl\RDA_225_A\NSRLProd.txt,c:\nsrl\RDA_225_B\NSRLProd.txt
# -g :known good txt filename. Ex: -g good.txt
# -b :known bad txt filename. Ex: -b bad.txt
# -h :help
#
#
use Getopt::Std;

my $ver="0.1";

#opcoes
%args = ( );
getopts("hn:p:g:b:", \%args);

#help
if ($args{h}) {
    &cabecalho;
    print <<DETALHE ;
    uso: nsrlext.pl -n nsrl_files_comma_separated -p nsrl_prod_files_comma_separated [-g known_good_txt] [-b known_bad_txt] [-h]

    -n :nsrl files comma separated. Ex: -n c:\nsrl\RDA_225_A\NSRLFile.txt,c:\nsrl\RDA_225_B\NSRLFile.txt
    -p :nsrl prod files comma separated. Ex: -p c:\nsrl\RDA_225_A\NSRLProd.txt,c:\nsrl\RDA_225_B\NSRLProd.txt
    -g :known good txt filename. Ex: -g good.txt
    -b :known bad txt filename. Ex: -b bad.txt
    -h :help

DETALHE
    exit;
}

die "Enter the NSRL hashset file list (comma delimited)\n" unless ($args{n});
die "Enter the NSRL product file list (comma delimited)\n" unless ($args{p});

die "Enter known good and/or known bad output filenames\n" unless (($args{g}) || ($args{b}));

my %hack;

&cabecalho;

#Prod files
my @prod = split(/,/, $args{p});

foreach $item (@prod) {
    open(PRODUCT, "< $item");

    while (<PRODUCT>) {
        chomp;
        my @line = split(/,/, $_);

        #create a hash of hacker tool codes
        $hack{$line[0]} = $item if ($line[6] =~ /Hacker Tool/);
    }

    close(PRODUCT);
}

#hashset files
my @hset = split(/,/, $args{n});

open(BAD, "> $args{b}") if ($args{b});

open(GOOD, "> $args{g}") if ($args{g});

my $i=0;

foreach $item (@hset) {
    open(NSRL, "< $item");

    while (<NSRL>) {

        #stdout feedback
        print ">" if (($i % 10000) == 0);

        my @line = split(/,/, $_);

        if ($hack{$line[5]}) {
            #is a hacker tool
            print BAD $_ if ($args{b});
        }
        else {
            print GOOD $_ if ($args{g});
        }

        $i++;
        }

        close(NSRL);
    }

    print "\nDone !\n";

    close(BAD)  if ($args{b});
    close(GOOD) if ($args{g});

### Sub rotinas  ####

sub cabecalho {
    print <<CABEC;

    nsrlext.pl v$ver
    Extracts known good and known bad hashsets from NSRL
    Tony Rodrigues
    dartagnham at gmail dot com
--------------------------------------------------------------------------

CABEC

}

#-----EOF-------

Usage: 

nsrlext.pl -n c:\nsrl\RDA_225_A\NSRLFile.txt,c:\nsrl\RDA_225_B\NSRLFile.txt -p c:\nsrl\RDA_225_A\NSRLProd.txt,c:\nsrl\RDA_225_B\NSRLProd.txt -b NSRLBad.txt -g NSRLGood.txt

This script runs in both Windows and Linux, it just requires Perl.

After this, we can use both hash sets in Autopsy, TSK Sorter or even with "md5deep/sha1deep".

Tony Rodrigues has over 20 years of IT experience and 7 years in Information Security management. He currently holds CISSP, CFCP and Security+ certifications and has been in charge of several corporate digital investigations in Brazil. He loves CAINE Live CD and writes about Computer Forensics/Incident Response for forcomp.blogspot.com.

Share:
TwitterLinkedInFacebook
Copy url Url was copied to clipboard
Subscribe to SANS Newsletters
Join the SANS Community to receive the latest curated cybersecurity news, vulnerabilities, and mitigations, training opportunities, plus our webcast schedule.
United States
Canada
United Kingdom
Spain
Belgium
Denmark
Norway
Netherlands
Australia
India
Japan
Singapore
Afghanistan
Aland Islands
Albania
Algeria
American Samoa
Andorra
Angola
Anguilla
Antarctica
Antigua and Barbuda
Argentina
Armenia
Aruba
Austria
Azerbaijan
Bahamas
Bahrain
Bangladesh
Barbados
Belarus
Belize
Benin
Bermuda
Bhutan
Bolivia
Bonaire, Sint Eustatius, and Saba
Bosnia And Herzegovina
Botswana
Bouvet Island
Brazil
British Indian Ocean Territory
Brunei Darussalam
Bulgaria
Burkina Faso
Burundi
Cambodia
Cameroon
Cape Verde
Cayman Islands
Central African Republic
Chad
Chile
China
Christmas Island
Cocos (Keeling) Islands
Colombia
Comoros
Cook Islands
Costa Rica
Croatia (Local Name: Hrvatska)
Curacao
Cyprus
Czech Republic
Democratic Republic of the Congo
Djibouti
Dominica
Dominican Republic
East Timor
East Timor
Ecuador
Egypt
El Salvador
Equatorial Guinea
Eritrea
Estonia
Ethiopia
Falkland Islands (Malvinas)
Faroe Islands
Fiji
Finland
France
French Guiana
French Polynesia
French Southern Territories
Gabon
Gambia
Georgia
Germany
Ghana
Gibraltar
Greece
Greenland
Grenada
Guadeloupe
Guam
Guatemala
Guernsey
Guinea
Guinea-Bissau
Guyana
Haiti
Heard And McDonald Islands
Honduras
Hong Kong
Hungary
Iceland
Indonesia
Iraq
Ireland
Isle of Man
Israel
Italy
Jamaica
Jersey
Jordan
Kazakhstan
Kenya
Kingdom of Saudi Arabia
Kiribati
Korea, Republic Of
Kosovo
Kuwait
Kyrgyzstan
Lao People's Democratic Republic
Latvia
Lebanon
Lesotho
Liberia
Liechtenstein
Lithuania
Luxembourg
Macau
Macedonia
Madagascar
Malawi
Malaysia
Maldives
Mali
Malta
Marshall Islands
Martinique
Mauritania
Mauritius
Mayotte
Mexico
Micronesia, Federated States Of
Moldova, Republic Of
Monaco
Mongolia
Montenegro
Montserrat
Morocco
Mozambique
Myanmar
Namibia
Nauru
Nepal
Netherlands Antilles
New Caledonia
New Zealand
Nicaragua
Niger
Nigeria
Niue
Norfolk Island
Northern Mariana Islands
Oman
Pakistan
Palau
Palestine
Panama
Papua New Guinea
Paraguay
Peru
Philippines
Pitcairn
Poland
Portugal
Puerto Rico
Qatar
Reunion
Romania
Russian Federation
Rwanda
Saint Bartholemy
Saint Kitts And Nevis
Saint Lucia
Saint Martin
Saint Vincent And The Grenadines
Samoa
San Marino
Sao Tome And Principe
Senegal
Serbia
Seychelles
Sierra Leone
Sint Maarten
Slovakia (Slovak Republic)
Slovenia
Solomon Islands
South Africa
South Georgia and the South Sandwich Islands
South Sudan
Sri Lanka
St. Helena
St. Pierre And Miquelon
Suriname
Svalbard And Jan Mayen Islands
Swaziland
Sweden
Switzerland
Taiwan
Tajikistan
Tanzania
Thailand
Togo
Tokelau
Tonga
Trinidad And Tobago
Tunisia
Turkey
Turkmenistan
Turks And Caicos Islands
Tuvalu
Uganda
Ukraine
United Arab Emirates
United States Minor Outlying Islands
Uruguay
Uzbekistan
Vanuatu
Vatican City
Venezuela
Vietnam
Virgin Islands (British)
Virgin Islands (U.S.)
Wallis And Futuna Islands
Western Sahara
Yemen
Yugoslavia
Zambia
Zimbabwe

Tags:
  • Digital Forensics and Incident Response

Related Content

Blog
SUMMIT_Free_SANS_2021_Summits_Teaser.jpg
Digital Forensics and Incident Response, Cyber Defense Essentials, Industrial Control Systems Security, Purple Team, Blue Team Operations, Penetration Testing and Ethical Hacking, Cloud Security, Security Management, Legal, and Audit
November 30, 2020
Good News: SANS Virtual Summits Will Be FREE for the Community in 2021
They’re virtual. They’re global. They’re free.
Emily Blades
read more
Blog
En.png
Digital Forensics and Incident Response
November 24, 2020
SANS DFIR Presenta Nuevos Webcasts en Español
SANS DFIR presenta sus nuevos episodios en Español! En este blog podrás ver todos los episodios con concluciones y con recursos para aprender DFIR
SANS DFIR
read more
Blog
shutterstock_1473864617.jpg
Digital Forensics and Incident Response
October 14, 2020
Defense Spotlight: Finding Hidden Windows Services
Attackers can make a Window services disappear from view. Fortunately these services can still be found, through unconventional discovery techniques.
370x370_Joshua-Wright.jpg
Joshua Wright
read more
  • Register to Learn
  • Courses
  • Certifications
  • Degree Programs
  • Cyber Ranges
  • Job Tools
  • Security Policy Project
  • Posters
  • The Critical Security Controls
  • Focus Areas
  • Blue Team Operations
  • Cloud Security
  • Cybersecurity Leadership
  • Digital Forensics
  • Industrial Control Systems
  • Offensive Operations
Subscribe to SANS Newsletters
Join the SANS Community to receive the latest curated cybersecurity news, vulnerabilities, and mitigations, training opportunities, plus our webcast schedule.
United States
Canada
United Kingdom
Spain
Belgium
Denmark
Norway
Netherlands
Australia
India
Japan
Singapore
Afghanistan
Aland Islands
Albania
Algeria
American Samoa
Andorra
Angola
Anguilla
Antarctica
Antigua and Barbuda
Argentina
Armenia
Aruba
Austria
Azerbaijan
Bahamas
Bahrain
Bangladesh
Barbados
Belarus
Belize
Benin
Bermuda
Bhutan
Bolivia
Bonaire, Sint Eustatius, and Saba
Bosnia And Herzegovina
Botswana
Bouvet Island
Brazil
British Indian Ocean Territory
Brunei Darussalam
Bulgaria
Burkina Faso
Burundi
Cambodia
Cameroon
Cape Verde
Cayman Islands
Central African Republic
Chad
Chile
China
Christmas Island
Cocos (Keeling) Islands
Colombia
Comoros
Cook Islands
Costa Rica
Croatia (Local Name: Hrvatska)
Curacao
Cyprus
Czech Republic
Democratic Republic of the Congo
Djibouti
Dominica
Dominican Republic
East Timor
East Timor
Ecuador
Egypt
El Salvador
Equatorial Guinea
Eritrea
Estonia
Ethiopia
Falkland Islands (Malvinas)
Faroe Islands
Fiji
Finland
France
French Guiana
French Polynesia
French Southern Territories
Gabon
Gambia
Georgia
Germany
Ghana
Gibraltar
Greece
Greenland
Grenada
Guadeloupe
Guam
Guatemala
Guernsey
Guinea
Guinea-Bissau
Guyana
Haiti
Heard And McDonald Islands
Honduras
Hong Kong
Hungary
Iceland
Indonesia
Iraq
Ireland
Isle of Man
Israel
Italy
Jamaica
Jersey
Jordan
Kazakhstan
Kenya
Kingdom of Saudi Arabia
Kiribati
Korea, Republic Of
Kosovo
Kuwait
Kyrgyzstan
Lao People's Democratic Republic
Latvia
Lebanon
Lesotho
Liberia
Liechtenstein
Lithuania
Luxembourg
Macau
Macedonia
Madagascar
Malawi
Malaysia
Maldives
Mali
Malta
Marshall Islands
Martinique
Mauritania
Mauritius
Mayotte
Mexico
Micronesia, Federated States Of
Moldova, Republic Of
Monaco
Mongolia
Montenegro
Montserrat
Morocco
Mozambique
Myanmar
Namibia
Nauru
Nepal
Netherlands Antilles
New Caledonia
New Zealand
Nicaragua
Niger
Nigeria
Niue
Norfolk Island
Northern Mariana Islands
Oman
Pakistan
Palau
Palestine
Panama
Papua New Guinea
Paraguay
Peru
Philippines
Pitcairn
Poland
Portugal
Puerto Rico
Qatar
Reunion
Romania
Russian Federation
Rwanda
Saint Bartholemy
Saint Kitts And Nevis
Saint Lucia
Saint Martin
Saint Vincent And The Grenadines
Samoa
San Marino
Sao Tome And Principe
Senegal
Serbia
Seychelles
Sierra Leone
Sint Maarten
Slovakia (Slovak Republic)
Slovenia
Solomon Islands
South Africa
South Georgia and the South Sandwich Islands
South Sudan
Sri Lanka
St. Helena
St. Pierre And Miquelon
Suriname
Svalbard And Jan Mayen Islands
Swaziland
Sweden
Switzerland
Taiwan
Tajikistan
Tanzania
Thailand
Togo
Tokelau
Tonga
Trinidad And Tobago
Tunisia
Turkey
Turkmenistan
Turks And Caicos Islands
Tuvalu
Uganda
Ukraine
United Arab Emirates
United States Minor Outlying Islands
Uruguay
Uzbekistan
Vanuatu
Vatican City
Venezuela
Vietnam
Virgin Islands (British)
Virgin Islands (U.S.)
Wallis And Futuna Islands
Western Sahara
Yemen
Yugoslavia
Zambia
Zimbabwe
  • © 2021 SANS™ Institute
  • Privacy Policy
  • Contact
  • Twitter
  • Facebook
  • Youtube
  • LinkedIn