homepage
Open menu
Go one level top
  • Train and Certify
    Train and Certify

    Immediately apply the skills and techniques learned in SANS courses, ranges, and summits

    • Overview
    • Courses
      • Overview
      • Full Course List
      • By Focus Areas
        • Cloud Security
        • Cyber Defense
        • Cybersecurity and IT Essentials
        • DFIR
        • Industrial Control Systems
        • Offensive Operations
        • Management, Legal, and Audit
      • By Skill Levels
        • New to Cyber
        • Essentials
        • Advanced
        • Expert
      • Training Formats
        • OnDemand
        • In-Person
        • Live Online
      • Course Demos
    • Training Roadmaps
      • Skills Roadmap
      • Focus Area Job Roles
        • Cyber Defence Job Roles
        • Offensive Operations Job Roles
        • DFIR Job Roles
        • Cloud Job Roles
        • ICS Job Roles
        • Leadership Job Roles
      • NICE Framework
        • Security Provisionals
        • Operate and Maintain
        • Oversee and Govern
        • Protect and Defend
        • Analyze
        • Collect and Operate
        • Investigate
        • Industrial Control Systems
    • GIAC Certifications
    • Training Events & Summits
      • Events Overview
      • Event Locations
        • Asia
        • Australia & New Zealand
        • Latin America
        • Mainland Europe
        • Middle East & Africa
        • Scandinavia
        • United Kingdom & Ireland
        • United States & Canada
      • Summits
    • OnDemand
    • Get Started in Cyber
      • Overview
      • Degree and Certificate Programs
      • Scholarships
    • Cyber Ranges
  • Manage Your Team
    Manage Your Team

    Build a world-class cyber team with our workforce development programs

    • Overview
    • Why Work with SANS
    • Group Purchasing
    • Build Your Team
      • Team Development
      • Assessments
      • Private Training
      • Hire Cyber Professionals
      • By Industry
        • Health Care
        • Industrial Control Systems Security
        • Military
    • Leadership Training
  • Security Awareness
    Security Awareness

    Increase your staff’s cyber awareness, help them change their behaviors, and reduce your organizational risk

    • Overview
    • Products & Services
      • Security Awareness Training
        • EndUser Training
        • Phishing Platform
      • Specialized
        • Developer Training
        • ICS Engineer Training
        • NERC CIP Training
        • IT Administrator
      • Risk Assessments
        • Knowledge Assessment
        • Culture Assessment
        • Behavioral Risk Assessment
    • OUCH! Newsletter
    • Career Development
      • Overview
      • Training & Courses
      • Professional Credential
    • Blog
    • Partners
    • Reports & Case Studies
  • Resources
    Resources

    Enhance your skills with access to thousands of free resources, 150+ instructor-developed tools, and the latest cybersecurity news and analysis

    • Overview
    • Webcasts
    • Free Cybersecurity Events
      • Free Events Overview
      • Summits
      • Solutions Forums
      • Community Nights
    • Content
      • Newsletters
        • NewsBites
        • @RISK
        • OUCH! Newsletter
      • Blog
      • Podcasts
      • Summit Presentations
      • Posters & Cheat Sheets
    • Research
      • White Papers
      • Security Policies
    • Tools
    • Focus Areas
      • Cyber Defense
      • Cloud Security
      • Digital Forensics & Incident Response
      • Industrial Control Systems
      • Cyber Security Leadership
      • Offensive Operations
  • Get Involved
    Get Involved

    Help keep the cyber community one step ahead of threats. Join the SANS community or begin your journey of becoming a SANS Certified Instructor today.

    • Overview
    • Join the Community
    • Work Study
    • Teach for SANS
    • CISO Network
    • Partnerships
    • Sponsorship Opportunities
  • About
    About

    Learn more about how SANS empowers and educates current and future cybersecurity practitioners with knowledge and skills

    • SANS
      • Overview
      • Our Founder
      • Awards
    • Instructors
      • Our Instructors
      • Full Instructor List
    • Mission
      • Our Mission
      • Diversity
      • Scholarships
    • Contact
      • Contact Customer Service
      • Contact Sales
      • Press & Media Enquiries
    • Frequent Asked Questions
    • Customer Reviews
    • Press
    • Careers
  • Contact Sales
  • SANS Sites
    • GIAC Security Certifications
    • Internet Storm Center
    • SANS Technology Institute
    • Security Awareness Training
  • Search
  • Log In
  • Join
    • Account Dashboard
    • Log Out
  1. Home >
  2. Blog >
  3. Timeline analysis with Apache Spark and Python
Juan Leaniz

Timeline analysis with Apache Spark and Python

September 16, 2015

This blog post introduces a technique for timeline analysis that mixes a bit of data science and domain-specific knowledge (file-systems, DFIR).

Analyzing CSV formatted timelines by loading them with Excel or any other spreadsheet application can be inefficient, even impossible at times. It all depends on the size of the timelines and how many different timelines or systems we are analyzing.

Looking at timelines that are gigabytes in size or trying to correlate data between 10 different system's timelines does not scale well with traditional tools.

One way to approach this problem is to leverage some of the open source data analysis tools that are available today. Apache Spark is a fast and general engine for big data processing. PySpark is its Python API, which in combination with Matplotlib, Pandas and NumPY, will allow you to drill down and analyze large amounts of data using SQL-syntax statements. This can come in handy for things like filtering, combining timelines and visualizing some useful statistics.

These tools can be easily installed on your SANS DFIR Workstation, although if you plan on analyzing a few TBs of data i would recommend setting up a Spark cluster separately.

The reader is assumed to have a basic understanding of Python programming.

Quick intro to Spark

This section is a quick introduction to PySpark and basic Spark concepts. For further details please check the Spark documentation, it is well written and fairly up to date. This section of the blog post does not intend to be a Spark tutorial, I'll barely scratch the surface of what's possible with Spark.

From apache.spark.org: Apache Spark is a fast and general-purpose cluster computing system. It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution graphs. It also supports a rich set of higher-level tools including Spark SQL for SQL and structured data processing, MLlib for machine learning, GraphX for graph processing, and Spark Streaming.

Spark's documentation site covers the basics and can get you up and running with minimal effort. Best way to get started is to spin up an Amazon EMR cluster, although that's not free. You can always spin up a few VMs in your basement lab :)

In the context of this DFIR exercise, we'll leverage Spark SQL and Spark's DataFrame API. The DataFrame API is similar Python's Pandas. Essentially, it allows you to access any kind of data in a structured, columnar format. This is easy to do when you are handling CSV files.

Folks over at Databricks have been kind enough to publish a Spark package that can convert CSV files (with headers) into Spark DataFrames with one simple Python line of code. Isn't that nice? Although, if you are brave enough, you can do this yourself with Spark and some RDD transformations.

To summarize, you'll need the following apps/libraries on your DFIR workstation:

  • Access to a Spark (1.3+) cluster
  • Spark-CSV package from Databricks
  • Python 2.6+
  • Pandas
  • NumPy

It should be noted that the Spark-CSV package needs to be built from source (Scala) and for that there are a few other requirements that are out of the scope of this blog post, as is setting up a Spark cluster.

Analysis (IPython notebook)

Now, let's get on with the fun stuff!

The following is an example of how these tools can be used to analyze a Windows 7 system's timeline generated with log2timeline.

You can find the IPython Notebook over on one of my github repos

Share:
TwitterLinkedInFacebook
Copy url Url was copied to clipboard
Subscribe to SANS Newsletters
Receive curated news, vulnerabilities, & security awareness tips
United States
Canada
United Kingdom
Spain
Belgium
Denmark
Norway
Netherlands
Australia
India
Japan
Singapore
Afghanistan
Aland Islands
Albania
Algeria
American Samoa
Andorra
Angola
Anguilla
Antarctica
Antigua and Barbuda
Argentina
Armenia
Aruba
Austria
Azerbaijan
Bahamas
Bahrain
Bangladesh
Barbados
Belarus
Belize
Benin
Bermuda
Bhutan
Bolivia
Bonaire, Sint Eustatius, and Saba
Bosnia And Herzegovina
Botswana
Bouvet Island
Brazil
British Indian Ocean Territory
Brunei Darussalam
Bulgaria
Burkina Faso
Burundi
Cambodia
Cameroon
Cape Verde
Cayman Islands
Central African Republic
Chad
Chile
China
Christmas Island
Cocos (Keeling) Islands
Colombia
Comoros
Cook Islands
Costa Rica
Croatia (Local Name: Hrvatska)
Curacao
Cyprus
Czech Republic
Democratic Republic of the Congo
Djibouti
Dominica
Dominican Republic
East Timor
East Timor
Ecuador
Egypt
El Salvador
Equatorial Guinea
Eritrea
Estonia
Ethiopia
Falkland Islands (Malvinas)
Faroe Islands
Fiji
Finland
France
French Guiana
French Polynesia
French Southern Territories
Gabon
Gambia
Georgia
Germany
Ghana
Gibraltar
Greece
Greenland
Grenada
Guadeloupe
Guam
Guatemala
Guernsey
Guinea
Guinea-Bissau
Guyana
Haiti
Heard And McDonald Islands
Honduras
Hong Kong
Hungary
Iceland
Indonesia
Iraq
Ireland
Isle of Man
Israel
Italy
Jamaica
Jersey
Jordan
Kazakhstan
Kenya
Kiribati
Korea, Republic Of
Kosovo
Kuwait
Kyrgyzstan
Lao People's Democratic Republic
Latvia
Lebanon
Lesotho
Liberia
Liechtenstein
Lithuania
Luxembourg
Macau
Macedonia
Madagascar
Malawi
Malaysia
Maldives
Mali
Malta
Marshall Islands
Martinique
Mauritania
Mauritius
Mayotte
Mexico
Micronesia, Federated States Of
Moldova, Republic Of
Monaco
Mongolia
Montenegro
Montserrat
Morocco
Mozambique
Myanmar
Namibia
Nauru
Nepal
Netherlands Antilles
New Caledonia
New Zealand
Nicaragua
Niger
Nigeria
Niue
Norfolk Island
Northern Mariana Islands
Oman
Pakistan
Palau
Palestine
Panama
Papua New Guinea
Paraguay
Peru
Philippines
Pitcairn
Poland
Portugal
Puerto Rico
Qatar
Reunion
Romania
Russian Federation
Rwanda
Saint Bartholemy
Saint Kitts And Nevis
Saint Lucia
Saint Martin
Saint Vincent And The Grenadines
Samoa
San Marino
Sao Tome And Principe
Saudi Arabia
Senegal
Serbia
Seychelles
Sierra Leone
Sint Maarten
Slovakia
Slovenia
Solomon Islands
South Africa
South Georgia and the South Sandwich Islands
South Sudan
Sri Lanka
St. Helena
St. Pierre And Miquelon
Suriname
Svalbard And Jan Mayen Islands
Swaziland
Sweden
Switzerland
Taiwan
Tajikistan
Tanzania
Thailand
Togo
Tokelau
Tonga
Trinidad And Tobago
Tunisia
Turkey
Turkmenistan
Turks And Caicos Islands
Tuvalu
Uganda
Ukraine
United Arab Emirates
United States Minor Outlying Islands
Uruguay
Uzbekistan
Vanuatu
Vatican City
Venezuela
Vietnam
Virgin Islands (British)
Virgin Islands (U.S.)
Wallis And Futuna Islands
Western Sahara
Yemen
Yugoslavia
Zambia
Zimbabwe

By providing this information, you agree to the processing of your personal data by SANS as described in our Privacy Policy.

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Tags:
  • Digital Forensics and Incident Response

Related Content

Blog
Top_10_Summit_Talks_2022.png
Cybersecurity Insights, Digital Forensics and Incident Response, Cyber Defense, Cloud Security, Open-Source Intelligence (OSINT), Security Management, Legal, and Audit, Security Awareness
December 5, 2022
Top 10 SANS Summits Talks of 2022
This year, SANS hosted 13 Summits with 246 talks. Here were the top-rated talks of the year.
370x370-person-placeholder.png
Alison Kim
read more
Blog
FOR577.png
Digital Forensics and Incident Response
September 22, 2022
NEW SANS DFIR COURSE IN DEVELOPMENT | FOR577: LINUX Incident Response & Analysis
FOR577: Linux Incident Response & Analysis course teaches how Linux systems work and how to respond and investigate attacks effectively.
Viv_Ross_370x370.png
Viviana Ross
read more
Blog
Untitled_design-43.png
Digital Forensics and Incident Response, Cybersecurity and IT Essentials, Industrial Control Systems Security, Purple Team, Open-Source Intelligence (OSINT), Penetration Testing and Red Teaming, Cyber Defense, Cloud Security, Security Management, Legal, and Audit
December 8, 2021
Good News: SANS Virtual Summits Will Remain FREE for the Community in 2022
They’re virtual. They’re global. They’re free.
370x370-person-placeholder.png
Emily Blades
read more
  • Register to Learn
  • Courses
  • Certifications
  • Degree Programs
  • Cyber Ranges
  • Job Tools
  • Security Policy Project
  • Posters & Cheat Sheets
  • White Papers
  • Focus Areas
  • Cyber Defense
  • Cloud Security
  • Cybersecurity Leadership
  • Digital Forensics
  • Industrial Control Systems
  • Offensive Operations
Subscribe to SANS Newsletters
Receive curated news, vulnerabilities, & security awareness tips
United States
Canada
United Kingdom
Spain
Belgium
Denmark
Norway
Netherlands
Australia
India
Japan
Singapore
Afghanistan
Aland Islands
Albania
Algeria
American Samoa
Andorra
Angola
Anguilla
Antarctica
Antigua and Barbuda
Argentina
Armenia
Aruba
Austria
Azerbaijan
Bahamas
Bahrain
Bangladesh
Barbados
Belarus
Belize
Benin
Bermuda
Bhutan
Bolivia
Bonaire, Sint Eustatius, and Saba
Bosnia And Herzegovina
Botswana
Bouvet Island
Brazil
British Indian Ocean Territory
Brunei Darussalam
Bulgaria
Burkina Faso
Burundi
Cambodia
Cameroon
Cape Verde
Cayman Islands
Central African Republic
Chad
Chile
China
Christmas Island
Cocos (Keeling) Islands
Colombia
Comoros
Cook Islands
Costa Rica
Croatia (Local Name: Hrvatska)
Curacao
Cyprus
Czech Republic
Democratic Republic of the Congo
Djibouti
Dominica
Dominican Republic
East Timor
East Timor
Ecuador
Egypt
El Salvador
Equatorial Guinea
Eritrea
Estonia
Ethiopia
Falkland Islands (Malvinas)
Faroe Islands
Fiji
Finland
France
French Guiana
French Polynesia
French Southern Territories
Gabon
Gambia
Georgia
Germany
Ghana
Gibraltar
Greece
Greenland
Grenada
Guadeloupe
Guam
Guatemala
Guernsey
Guinea
Guinea-Bissau
Guyana
Haiti
Heard And McDonald Islands
Honduras
Hong Kong
Hungary
Iceland
Indonesia
Iraq
Ireland
Isle of Man
Israel
Italy
Jamaica
Jersey
Jordan
Kazakhstan
Kenya
Kiribati
Korea, Republic Of
Kosovo
Kuwait
Kyrgyzstan
Lao People's Democratic Republic
Latvia
Lebanon
Lesotho
Liberia
Liechtenstein
Lithuania
Luxembourg
Macau
Macedonia
Madagascar
Malawi
Malaysia
Maldives
Mali
Malta
Marshall Islands
Martinique
Mauritania
Mauritius
Mayotte
Mexico
Micronesia, Federated States Of
Moldova, Republic Of
Monaco
Mongolia
Montenegro
Montserrat
Morocco
Mozambique
Myanmar
Namibia
Nauru
Nepal
Netherlands Antilles
New Caledonia
New Zealand
Nicaragua
Niger
Nigeria
Niue
Norfolk Island
Northern Mariana Islands
Oman
Pakistan
Palau
Palestine
Panama
Papua New Guinea
Paraguay
Peru
Philippines
Pitcairn
Poland
Portugal
Puerto Rico
Qatar
Reunion
Romania
Russian Federation
Rwanda
Saint Bartholemy
Saint Kitts And Nevis
Saint Lucia
Saint Martin
Saint Vincent And The Grenadines
Samoa
San Marino
Sao Tome And Principe
Saudi Arabia
Senegal
Serbia
Seychelles
Sierra Leone
Sint Maarten
Slovakia
Slovenia
Solomon Islands
South Africa
South Georgia and the South Sandwich Islands
South Sudan
Sri Lanka
St. Helena
St. Pierre And Miquelon
Suriname
Svalbard And Jan Mayen Islands
Swaziland
Sweden
Switzerland
Taiwan
Tajikistan
Tanzania
Thailand
Togo
Tokelau
Tonga
Trinidad And Tobago
Tunisia
Turkey
Turkmenistan
Turks And Caicos Islands
Tuvalu
Uganda
Ukraine
United Arab Emirates
United States Minor Outlying Islands
Uruguay
Uzbekistan
Vanuatu
Vatican City
Venezuela
Vietnam
Virgin Islands (British)
Virgin Islands (U.S.)
Wallis And Futuna Islands
Western Sahara
Yemen
Yugoslavia
Zambia
Zimbabwe

By providing this information, you agree to the processing of your personal data by SANS as described in our Privacy Policy.

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
  • © 2023 SANS™ Institute
  • Privacy Policy
  • Contact
  • Careers
  • Twitter
  • Facebook
  • Youtube
  • LinkedIn