Understanding EXT4 (Part 6): Directories

June 7, 2017

Hal Pomeranz, Deer Run Associates

Many years ago, I started this series of blog posts documenting the internals of the EXT4 file system. One item I never got around to was documenting how directories were structured in EXT. Some recent research has caused me to dive back into this topic, and given me an excuse to add additional detail to this EXT4 series.

If you go back and read earlier posts in this series, you will note that the EXT inode does not store file names. Directories are the only place in traditional Unix file systems where file name information is kept. In EXT, and the classic Unix file systems it evolved from, directories are simply special files that associate file names with inode numbers.

Furthermore, in the simplest case, EXT directories are just sequential lists of file entries. The entries aren't even sorted. For the most part, directory entries in EXT are simply added to the directory file in the order files are created in the directory.

Let's create a small directory as an example:

$ cd /tmp
$ mkdir testing
$ cd testing
/tmp/testing
$ touch this is a simple directory
$ ls
a  directory  is  simple  this

When I list the directory with "ls", the file names are displayed in alphabetical order. But in my trusty hex editor, you can see the directory entries in their actual on-disk order:

[Image: hex editor view of the example directory block, with the fields of several entries highlighted]

I have also used highlighting to show the fields in several directory entries. Every directory must start with the "." and ".." entries. These links point to the directory itself (".") and the parent directory ("..").

Each directory entry contains five fields:

Byte 0-3: Inode number
     4-5: Total entry length
     6  : File name length
     7  : File type
     8- : File name

Directory entries are variable length, and the second field tracks the total length of the entry. Entries must be aligned on a four-byte boundary. So in the case of the "." entry, the total entry length is 12 bytes, even though the entry could fit into 9 bytes. Field three is the length of the file name, which is one byte in this case. The extra three bytes in the directory entry just contain nulls.
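To make the layout concrete, here is a minimal Python sketch that unpacks a single directory entry from raw bytes. The helper name parse_dirent and the sample inode number are placeholders of mine, and the sketch assumes the little-endian byte order EXT4 uses on disk.

```python
import struct

def parse_dirent(buf, offset=0):
    """Unpack one directory entry starting at `offset` (hypothetical helper)."""
    # Bytes 0-3: inode, 4-5: total entry length, 6: name length, 7: file type
    inode, rec_len, name_len, file_type = struct.unpack_from("<IHBB", buf, offset)
    name = bytes(buf[offset + 8 : offset + 8 + name_len])
    # Smallest length that can hold this entry, rounded up to a four-byte boundary
    min_len = (8 + name_len + 3) & ~3
    return {"inode": inode, "rec_len": rec_len, "name_len": name_len,
            "file_type": file_type, "name": name, "min_len": min_len}

# A "." entry: made-up inode 12345, entry length 12, name length 1,
# file type 2, then the name followed by three null bytes of padding.
dot = struct.pack("<IHBB", 12345, 12, 1, 2) + b".\x00\x00\x00"
entry = parse_dirent(dot)
print(entry["name"], entry["rec_len"], entry["min_len"])
```

For the "." entry, the recorded length and the computed minimum length are both 12, which matches the 9 bytes of real data rounded up to the next four-byte boundary.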

In classic Unix file systems, the file name length field was two bytes. But since Linux doesn't allow file names longer than 255 characters, the EXT developers decided to use only one byte for the file name length and to use the second byte to hold the file type. While the file type is also stored in the inode, having this information in the directory entry is more efficient for operations like "ls -F" where the file type is displayed along with the file name.

File type is a numeric field defined as follows:

0: Unknown
1: Regular file
2: Directory
3: Character special device
4: Block special device
5: FIFO (named pipe)
6: Socket
7: Symlink

In the case of the "." and ".." links, the file type is "2", which is "directory". This is the only case where Unix file systems allow hard links to directories.

Finally, note the entry length field of the final file entry in the directory. The entry size is 0x0FB4, or 4020 bytes. This consumes all remaining bytes to the end of the block. The last directory entry is always aligned to the end of the block. Directory entries may not cross block boundaries.
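The 4020 figure can be checked with a little arithmetic. This sketch assumes a 4096-byte block and that the entries sit in creation order, as in the example directory. The helper name min_entry_len is just mine.

```python
BLOCK_SIZE = 4096  # assumed block size, matching the example

def min_entry_len(name):
    # 8 bytes of fixed fields plus the name, rounded up to a multiple of 4
    return (8 + len(name) + 3) // 4 * 4

# Entries in on-disk order; every entry except the last takes its minimum length
names = [".", "..", "this", "is", "a", "simple", "directory"]
used = sum(min_entry_len(n) for n in names[:-1])

# The final entry is stretched to reach the end of the block
last_len = BLOCK_SIZE - used
print(used, last_len, hex(last_len))  # prints: 76 4020 0xfb4
```

The first six entries use 76 bytes between them, so the entry for "directory" gets the remaining 4020 bytes even though it only needs 20.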

Deleting Files

Now let's observe what happens when I delete the file "simple" from our example directory:

[Image: hex editor view of the example directory block after deleting "simple"]

This is the standard behavior when files are deleted in classic Unix file systems. The entry before the deleted file simply grows to consume the "deleted" entry. But the entry for the deleted file is otherwise unchanged and can be carved for.
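Here is a rough sketch of how that carving might look in Python. It walks the live entries in a block and checks the slack inside each one for leftover entries. The function names and the plausibility checks are my own assumptions rather than an established tool, and a real carver would need stricter validation.

```python
import struct

def min_len(name_len):
    # 8 bytes of fixed fields plus the name, rounded up to four bytes
    return (8 + name_len + 3) & ~3

def carve_slack(block):
    """Yield (offset, inode, name) for plausible entries hidden in entry slack."""
    pos = 0
    while pos + 8 <= len(block):
        inode, rec_len, name_len, ftype = struct.unpack_from("<IHBB", block, pos)
        if rec_len < 8 or pos + rec_len > len(block):
            break  # unexpected length, stop walking this block
        cand = pos + min_len(name_len)  # slack starts after the live entry's data
        end = pos + rec_len
        while cand + 8 <= end:
            c_inode, c_len, c_nlen, c_type = struct.unpack_from("<IHBB", block, cand)
            plausible = (c_inode != 0 and c_nlen > 0 and c_type <= 7
                         and c_len >= min_len(c_nlen)
                         and cand + min_len(c_nlen) <= end)
            if plausible:
                yield cand, c_inode, block[cand + 8 : cand + 8 + c_nlen]
                cand += min_len(c_nlen)
            else:
                cand += 4  # entries are four-byte aligned
        pos += rec_len

def ent(inode, rec_len, name, ftype=1):
    # Build one entry for the demo; inode numbers here are made up
    raw = struct.pack("<IHBB", inode, rec_len, len(name), ftype) + name
    return raw.ljust(min_len(len(name)), b"\x00")

# Tiny 64-byte stand-in for a block: "a" has grown from 12 to 28 bytes
# to swallow the deleted "simple" entry, whose bytes are left in place.
block = (ent(101, 28, b"a") + ent(102, 16, b"simple")
         + ent(103, 36, b"directory")).ljust(64, b"\x00")
found = list(carve_slack(block))
print(found)
```

In this toy block the only hit is the old "simple" entry at offset 12, still carrying its original inode number.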

However, this unused "slack" space from deleted directory entries can be reused when new files are added to the directory. For example, here's what happens when I add a file named "new" to our example directory:

[Image: hex editor view of the example directory block after adding the file "new"]

Large Directories

One of the consistent criticisms of classic Unix file systems has been that as directories get large, performance suffers. If directories are nothing more than unsorted lists of files, searching for the entry you want means sequentially parsing a large number of directory entries. As the directory grows, the average search time increases linearly.

More modern file systems solve this issue by organizing directory entries in some sort of searchable data structure. For example, NTFS uses B-trees. Starting in EXT3, developers created a hashed tree system, dubbed "htree", for organizing directory entries. This is now standard in EXT4 and can be seen when the directory grows larger than a single block.

Here's the first block of my /usr/share/doc directory:

[Image: hex editor view of the first block of the /usr/share/doc directory]

Things start getting really interesting after the first 24 bytes used by the "." and ".." entries. The rest of the block is used by the dx_root structure that defines the root of the htree. Technically, the "." and ".." entries are part of dx_root as well and the data structure consumes the entire first block, but the interesting fields start 24 bytes into the block:

Byte 0-23 : "." and ".." entries
     24-27: Reserved (zero)
     28   : Hashing algorithm used
     29   : Length of this info header, bytes 24-31 (normally 8)
     30   : Depth of tree
     31   : Flags (unused, normally 0)
     32-33: Max dx_entry records possible
     34-35: Actual number of dx_entry records to follow
     36-39: Relative block number for "zero hash"
     rest : dx_entry records

After four null bytes at offset 24-27, a single byte specifies the hash algorithm used by the htree. The byte codes are documented here, but 0x01 seems to be the standard, which is a hashing algorithm based on MD4.

Next comes a length byte. The Linux kernel source calls this field info_length, and it holds the size of the eight-byte header at offsets 24-31 rather than the size of a dx_entry record. As it happens, the dx_entry records used to index the various blocks in the htree are also always 8 bytes long. They will be described further below.

Bytes 32-33 document the maximum number of dx_entry records that can be stuffed into this block after the initial fields of the dx_root structure. The dx_entry records start 40 bytes into the block, so you might assume that this max value is the block size of 4096 bytes, minus the 40 bytes of dx_root fields, divided by the 8 byte size of dx_entry records. That would get you a max value of 507. The actual value is 0x01FC or 508 because the "zero hash" entry in bytes 36-39 counts as an extra dx_entry record.
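The same calculation as a short Python sketch, assuming the 4096-byte block size of this example. The constant names are mine.

```python
BLOCK_SIZE = 4096            # block size in this example
DX_ENTRY_SIZE = 8            # four-byte hash plus four-byte block number
FIELDS_BEFORE_ENTRIES = 40   # dx_root bytes before the first full dx_entry

# The naive guess: whatever fits after the first 40 bytes
naive = (BLOCK_SIZE - FIELDS_BEFORE_ENTRIES) // DX_ENTRY_SIZE

# The "zero hash" slot in bytes 36-39 counts as one more record
limit = naive + 1
print(naive, limit, hex(limit))  # prints: 507 508 0x1fc

# Same result if you treat bytes 32-39 as the first eight-byte record slot
assert limit == (BLOCK_SIZE - 32) // DX_ENTRY_SIZE
```

Another way to look at it is that the record slots effectively begin at byte 32, with the two counters sitting where the first record's hash value would otherwise go.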

Given that a single dx_root block can index over 500 htree blocks, and that those blocks can contain hundreds of file name entries, it is rare for an htree to ever need more than a single level. So in practice, the "depth of tree" byte at offset 30 is always 0x00, indicating a flat tree. The specification does allow for a nested tree, but I've never seen one in a real world application.

Bytes 34-35 are the actual number of dx_entry records to follow, again counting the "zero hash" record in bytes 36-39 as one of the dx_entry records. Each dx_entry record is a four byte hash value followed by a four byte relative block offset from the beginning of the directory file. For clarity, let's take a look at the first several dx_entry records from this example in tabular form:
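Putting the field offsets together, here is a minimal Python sketch that decodes the interesting part of a dx_root block. The function name and dictionary keys are my own labels, and the sample block is synthetic: it holds only the first few records from the example, with zeros standing in for the "." and ".." entries.

```python
import struct

def parse_dx_root(block):
    """Decode the dx_root fields at bytes 24-39 plus the dx_entry table."""
    # Bytes 24-27 reserved, 28 hash algorithm, 29 length byte, 30 depth, 31 flags
    reserved, hash_alg, length_byte, depth, flags = struct.unpack_from("<IBBBB", block, 24)
    # Bytes 32-33 max records, 34-35 actual records, 36-39 block for "zero hash"
    limit, count, zero_block = struct.unpack_from("<HHI", block, 32)
    # The "zero hash" record is included in count, so count - 1 full records follow
    entries = [(0, zero_block)]
    for i in range(count - 1):
        entries.append(struct.unpack_from("<II", block, 40 + 8 * i))
    return {"hash_alg": hash_alg, "length_byte": length_byte, "depth": depth,
            "flags": flags, "limit": limit, "count": count, "entries": entries}

# Synthetic block: 24 placeholder bytes, then the header and four records
rows = [(0x0F3FFEA2, 16), (0x1D8171F2, 8), (0x2C3E5760, 15), (0x39989908, 4)]
block = bytes(24) + struct.pack("<IBBBB", 0, 1, 8, 0, 0)
block += struct.pack("<HHI", 508, len(rows) + 1, 1)
block += b"".join(struct.pack("<II", h, b) for h, b in rows)
block = block.ljust(4096, b"\x00")

root = parse_dx_root(block)
print(root["hash_alg"], root["limit"], root["count"])
for h, b in root["entries"]:
    print(f"0x{h:08X} -> block {b}")
```

The first tuple in the entries list represents the "zero hash" record, so I give it a hash of 0 to keep the list uniform.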

Hash value          Block offset
"zero hash"             1
0x0F3FFEA2             16
0x1D8171F2              8
0x2C3E5760             15
0x39989908              4
...

The dx_entry records are a lookup table sorted by hash value. The initial "zero hash" entry means that all files whose names hash to values less than 0x0F3FFEA2 can be found in block number 1 of the directory file. File names with a hash value greater than or equal to 0x0F3FFEA2 but less than 0x1D8171F2 can be found in block 16 of the file, and so on. The dx_entry records are sorted by hash value so that the EXT file system code can do binary search to find the appropriate block offset more quickly.
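The lookup itself can be sketched in a few lines of Python using the standard bisect module. This is only an illustration of the idea, not the kernel's code. It uses just the rows shown in the table, treats the "zero hash" row as hash 0, and takes the file name hash as a given number rather than computing it.

```python
from bisect import bisect_right

# (hash, block) rows from the table; the real table continues past these
DX_ENTRIES = [(0x00000000, 1), (0x0F3FFEA2, 16), (0x1D8171F2, 8),
              (0x2C3E5760, 15), (0x39989908, 4)]

def find_block(entries, name_hash):
    """Return the block for the last record whose hash is <= name_hash."""
    hashes = [h for h, _ in entries]
    idx = bisect_right(hashes, name_hash) - 1
    return entries[idx][1]

print(find_block(DX_ENTRIES, 0x0A000000))  # below 0x0F3FFEA2, so block 1
print(find_block(DX_ENTRIES, 0x0F3FFEA2))  # equal to a boundary, so block 16
print(find_block(DX_ENTRIES, 0x1D8171F1))  # just under the next boundary, block 16
print(find_block(DX_ENTRIES, 0x1D8171F2))  # next boundary, so block 8
```

Because the table is sorted, each lookup takes a handful of comparisons instead of a scan through every record.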

After the last dx_entry record, the rest of the block is slack space. Typically, this slack space contains directory entries from when the directory was small enough to fit into a single block. For example, right after the end of the last dx_entry record, you can see most of an original entry for a subdirectory (file type 0x02) named "alsa-utils". Then there's an entry for another subdirectory named "anacron" at inode 0x000C14D7 (791767). Typically, you will also find live entries for these directories in whatever htree block they got hashed into. But it's possible that these directories were later deleted, and that these entries in the directory slack may be the only record of their existence.

The rest of the directory file is "leaf blocks" of the htree. These blocks are full of normal directory entries and are simply read sequentially. The htree format was specifically designed for backwards compatibility, so that older code could still do a normal sequential search through the directory entries.

If you're thinking about carving for deleted directory files, be aware that directory files larger than one block are usually fragmented. The EXT block allocation algorithm prefers to put files into the same block group with their parent directory. By the time a directory grows larger than a single block, the nearby blocks have all been consumed.

Hal Pomeranz is an independent Digital Forensic Analyst and Expert Witness. He thinks that any day spent looking at a hex editor is a good day.
