As a pen tester, it's not enough to exploit targets and get shells. That's great (and it's a big part of what we do), but the real value to the customer is to demonstrate what the effective risk is from the successful exploitation of a vulnerability. In order to answer this question, I find myself interrogating data on compromised systems, trying to make sense of what's available and what the disclosure of the data will mean to my customer.
Often, I find myself looking at a bunch of data in a SQL database. This might be a native SQL database (usually MSSQL, SQLite3, MySQL, Oracle, etc.), but sometimes it's a database I've created by importing CSV files, JSON data, or other data formats. As a former DBA (a long, long time ago) I feel really comfortable at a SQL> prompt. In this article I'll offer some quick tips on getting useful data out of a database.
I'm going to use a Pokémon Pokedex SQLite3 database as my data source for examples. This database is the labor-of-love project from Eevee. Special thanks to Eevee for making this complex database available.
Data Structures
Identify the structure of tables, indexes, and other objects using the .schema command:
sqlite> .schema CREATE TABLE item_pockets ( id INTEGER NOT NULL, identifier VARCHAR(79) NOT NULL, PRIMARY KEY (id) ); CREATE TABLE pokeathlon_stats ( id INTEGER NOT NULL, identifier VARCHAR(79) NOT NULL, PRIMARY KEY (id) ); ...
This database has several tables. If you want to look at the structure of a single table, specify a table:
sqlite> .schema CREATE TABLE item_pockets ( id INTEGER NOT NULL, identifier VARCHAR(79) NOT NULL, PRIMARY KEY (id) ); CREATE TABLE pokeathlon_stats ( id INTEGER NOT NULL, identifier VARCHAR(79) NOT NULL, PRIMARY KEY (id) ); ...
Note that the .schema command is very SQLite-specific. Other databses use SHOW TABLES and DESC tablename instead. Most of the other commands shown in this article are generic enough to work across any database.
Retrieving Some or All
We retrieve data from the database using the SQL SELECT statement (I've also issued the SQLite3 .headers command to turn on column labels:
sqlite> select * from pokemon; id|identifier|species_id|height|weight|base_experience|order|is_default 1|bulbasaur|1|7|69|64|1|1 2|ivysaur|2|10|130|142|2|1 3|venusaur|3|20|1000|236|3|1 4|charmander|4|6|85|62|5|1 5|charmeleon|5|11|190|142|6|1 6|charizard|6|17|905|240|7|1 7|squirtle|7|5|90|63|10|1 8|wartortle|8|10|225|142|11|1 9|blastoise|9|16|855|239|12|1 10|caterpie|10|3|29|39|14|1 ...
By specifying the *
, we note that we want to retrieve all of the columns. If you only want a few specific columns, specify the columns you want by name in the order you want them to appear:
sqlite> select id, identifier, weight, height from pokemon; id|identifier|weight|height 1|bulbasaur|69|7 2|ivysaur|130|10 3|venusaur|1000|20 4|charmander|85|6 5|charmeleon|190|11 6|charizard|905|17 7|squirtle|90|5 8|wartortle|225|10 9|blastoise|855|16 10|caterpie|29|3 ...
The pokemon table has a lot of records in it. If you only want the first 5 records, add LIMIT 5 to the end of your query:
sqlite> select id, identifier, weight, height, "order" from pokemon limit 5; id|identifier|weight|height|order 1|bulbasaur|69|7|1 2|ivysaur|130|10|2 3|venusaur|1000|20|3 4|charmander|85|6|5 5|charmeleon|190|11|6
Unique Data
Individual rows of data will often have duplicate values present. We can get a unique list of the values present by using the DISTINCT modifier in a query. For example, consider the structure of the Pokedex contest_type_names table:
sqlite> .schema contest_type_names CREATE TABLE contest_type_names ( contest_type_id INTEGER NOT NULL, local_language_id INTEGER NOT NULL, name VARCHAR(79), flavor TEXT, color TEXT, PRIMARY KEY (contest_type_id, local_language_id), FOREIGN KEY(contest_type_id) REFERENCES contest_types (id), FOREIGN KEY(local_language_id) REFERENCES languages (id) ); CREATE INDEX ix_contest_type_names_name ON contest_type_names (name); sqlite> select * from contest_type_names; contest_type_id|local_language_id|name|flavor|color 1|5|Sang-froid|Épicé|Rouge 1|9|Cool|Spicy|Red 2|5|Beauté|Sec|Bleu 2|9|Beauty|Dry|Blue 3|5|Gr'ce|Sucré|Rose 3|9|Cute|Sweet|Pink 4|5|Intelligence|Amère|Vert 4|9|Smart|Bitter|Green 5|5|Robustesse|Acide|Jaune 5|9|Tough|Sour|Yellow 1|10||Ostrá| 5|10|Síla|| 2|10|Krása|Suchá|
The return data set is fairly small, but we can focus it further. What if we are only interested in the unique color assignments given to Pokémon contest types?
sqlite> select distinct(color) from contest_type_names; Rouge Red Bleu Blue Rose Pink Vert Green Jaune Yellow
Conditional Expressions
Often you'll want to filter the returned data with a conditional expression, denoted with a WHERE clause. Using a WHERE clause allows you to specify the nature of the data you want returned, matching one or more columns to values you specify. For example, what if we only want to see information about Pikachu in the pokemon table?
sqlite> .schema pokemon CREATE TABLE pokemon ( id INTEGER NOT NULL, identifier VARCHAR(79) NOT NULL, species_id INTEGER, height INTEGER NOT NULL, weight INTEGER NOT NULL, base_experience INTEGER NOT NULL, "order" INTEGER NOT NULL, is_default BOOLEAN NOT NULL, PRIMARY KEY (id), FOREIGN KEY(species_id) REFERENCES pokemon_species (id), CHECK (is_default IN (0, 1)) ); CREATE INDEX ix_pokemon_order ON pokemon ("order"); CREATE INDEX ix_pokemon_is_default ON pokemon (is_default); sqlite> select * from pokemon where identifier = "pikachu"; id|identifier|species_id|height|weight|base_experience|order|is_default 25|pikachu|25|4|60|112|32|1
In addition to using matching expressions, SQLite3 also supports common comparison operators, shown here:
Operator | Meaning |
---|---|
= | Equal to |
!= | Not equal to |
< | Less than |
> | Greater than |
<= | Less than or equal to |
>= | Greater than or equal to |
Let's use this to find out if there are any Pokémon smaller than Pikachu (who appears to be 4 decimeters from the previous query):
sqlite> select identifier from pokemon where height < 4; caterpie weedle pidgey rattata spearow paras ...
We can combine multiple WHERE expressions together too. For example, what Pokémon are taller than Pikachu but weigh less? (Note that here I've pressed Enter to start a new line, prompting SQLite3 to produce a continuation prompt of ...>
.)
sqlite> select identifier, height, weight from pokemon ...> where height < 4 and weight > 190; klink|3|210 durant|3|330
Wildcards
SQL allows you to specify wildcards in your WHERE clauses, using the keyword LIKE and _ to match any single character, or LIKE and %
to match a group of characters. Using the pokemon_species_name table, let's identify all the genus values that match Dr at the beginning of the line:
sqlite> .schema pokemon_species_names CREATE TABLE pokemon_species_names ( pokemon_species_id INTEGER NOT NULL, local_language_id INTEGER NOT NULL, name VARCHAR(79), genus TEXT, PRIMARY KEY (pokemon_species_id, local_language_id), FOREIGN KEY(pokemon_species_id) REFERENCES pokemon_species (id), FOREIGN KEY(local_language_id) REFERENCES languages (id) ); CREATE INDEX ix_pokemon_species_names_name ON pokemon_species_names (name); sqlite> select name, genus from pokemon_species_names where genus like 'Dr%'; name|genus Nidoqueen|Drill Nidoking|Drill Rhydon|Drill Hypotrempe|Dragon Seeper|Drache Horsea|Dragón Horsea|Drago Horsea|Dragon ... Simipour|Drenaje Munna|Dream Eater Musharna|Drowsing Muplodocus|Dragon Viscogon|Drache Goodra|Dragón ...
Notice that some of the genus types are Dragon and Dragón. If we want to match either of those two, we can use the _
modifier to match the o and ó characters:
sqlite> select name, genus from pokemon_species_names where genus like 'Drag_n'; name|genus Hypotrempe|Dragon Horsea|Dragón Horsea|Dragon Hypocéan|Dragon Seadra|Dragón Seadra|Dragon Minidraco|Dragon Dratini|Dragón Dratini|Dragon Draco|Dragon ...
You can also search for the middle portion of a matching string by adding %
to the beginning and the end of the expression. For example, we can search for any genus with the word eat:
sqlite> select name, genus from pokemon_species_names ...> where genus like '%eat%'; name|genus Castform|Weather Munchlax|Big Eater Munna|Dream Eater Heatmor|Anteater
Although it doesn't show in this output, %eat% will match values starting or ending with eat as well (e.g. there doesn't have to be preceding or following characters to be a match).
Ordering Data
Sometimes you want to change the order of the data being returned. No problem, simply enter ORDER BY followed by the column you want to use. You can add multiple comma-separated columns in the ORDER BY clause as well. By default, values are listed in ascending order, but this can be modified by adding the keyword DESCENDING to the end of the ORDER BY expression.
For example, the abilities table in the Pokémon database discloses the identifier information, sorted by default using the id field. If you want to change the sort order to the generation_id column, add an ORDER BY clause:
sqlite> .schema abilities CREATE TABLE abilities ( id INTEGER NOT NULL, identifier VARCHAR(79) NOT NULL, generation_id INTEGER NOT NULL, is_main_series BOOLEAN NOT NULL, PRIMARY KEY (id), FOREIGN KEY(generation_id) REFERENCES generations (id), CHECK (is_main_series IN (0, 1)) ); CREATE INDEX ix_abilities_is_main_series ON abilities (is_main_series); sqlite> select * from abilities order by generation_id; id|identifier|generation_id|is_main_series 1|stench|3|1 2|drizzle|3|1 3|speed-boost|3|1 4|battle-armor|3|1 5|sturdy|3|1 6|damp|3|1 7|limber|3|1 ... 162|victory-star|5|1 163|turboblaze|5|1 164|teravolt|5|1 10001|mountaineer|5|0 10002|wave-rider|5|0 10003|skater|5|0 10004|thrust|5|0 10005|perception|5|0 ... 189|primordial-sea|6|1 190|desolate-land|6|1 191|delta-stream|6|1
Cross-table Queries
When people start out designing databases, people try and stuff everything into a single table, even if it's not efficient or performance-wise to do so. Later, data that changes less frequently, or should be isolated from related records gets placed into a different table, sharing an identifier that allows us to query both tables to produce one set of results.
For example, consider the pokemon_species and pokemon_species_names tables shown below in SQL and visualized forms (courtesy of WWW SQL Designer.
sqlite> .schema pokemon_species CREATE TABLE pokemon_species ( id INTEGER NOT NULL, identifier VARCHAR(79) NOT NULL, generation_id INTEGER, evolves_from_species_id INTEGER, evolution_chain_id INTEGER, color_id INTEGER NOT NULL, shape_id INTEGER NOT NULL, habitat_id INTEGER, gender_rate INTEGER NOT NULL, capture_rate INTEGER NOT NULL, base_happiness INTEGER NOT NULL, is_baby BOOLEAN NOT NULL, hatch_counter INTEGER NOT NULL, has_gender_differences BOOLEAN NOT NULL, growth_rate_id INTEGER NOT NULL, forms_switchable BOOLEAN NOT NULL, "order" INTEGER NOT NULL, conquest_order INTEGER, PRIMARY KEY (id), FOREIGN KEY(generation_id) REFERENCES generations (id), FOREIGN KEY(evolves_from_species_id) REFERENCES pokemon_species (id), FOREIGN KEY(evolution_chain_id) REFERENCES evolution_chains (id), FOREIGN KEY(color_id) REFERENCES pokemon_colors (id), FOREIGN KEY(shape_id) REFERENCES pokemon_shapes (id), FOREIGN KEY(habitat_id) REFERENCES pokemon_habitats (id), CHECK (is_baby IN (0, 1)), CHECK (has_gender_differences IN (0, 1)), FOREIGN KEY(growth_rate_id) REFERENCES growth_rates (id), CHECK (forms_switchable IN (0, 1)) ); CREATE INDEX ix_pokemon_species_order ON pokemon_species ("order"); CREATE INDEX ix_pokemon_species_conquest_order ON pokemon_species (conquest_order); sqlite> .schema pokemon_species_names CREATE TABLE pokemon_species_names ( pokemon_species_id INTEGER NOT NULL, local_language_id INTEGER NOT NULL, name VARCHAR(79), genus TEXT, PRIMARY KEY (pokemon_species_id, local_language_id), FOREIGN KEY(pokemon_species_id) REFERENCES pokemon_species (id), FOREIGN KEY(local_language_id) REFERENCES languages (id) );

In these two tables, the data for name and genus is in a different table than the primary Pokémon species information. The pokemon_species_names data probably changes less often then the data in pokemon_species, and is a pretty reasonable design. However, how do we formulate a query across the two tables that returns the Pokémon identifier and genus in the same query?
The answer is in the SQL join capability. Take a look:
sqlite> select identifier, genus from pokemon_species, pokemon_species_names where ...> pokemon_species.id = pokemon_species_names.pokemon_species_id and ...> local_language_id = 9; bulbasaur|Seed ivysaur|Seed venusaur|Seed charmander|Lizard charmeleon|Flame charizard|Flame squirtle|Tiny Turtle wartortle|Turtle blastoise|Shellfish caterpie|Worm ...
Here I've selected columns from two tables. The WHERE clause tells SQL to associate records from pokemon_species with records from pokemon_species_id when the id and species_id columns match. I've also limited the output where local_language_id is 9, which is English.
In this example, 'identifier' and 'genus' are both unique names in the two different tables. If the column names match across two tables, you can use dot-notation to specify the full table name followed by the column name (e.g. pokeon_species.identifier, pokemon_species_id.genus)
Grouping Data
One powerful SQL statement we haven't yet covered is the GROUP BY operator. Using GROUP BY, we can order the return results into groups. For example, adding GROUP BY genus to the previous query will order the results by the genus column:
sqlite> select identifier, genus from pokemon_species, pokemon_species_names ...> where pokemon_species.id = pokemon_species_names.pokemon_species_id and ...> local_language_id = 9 GROUP BY genus; landorus|Abundance seedot|Acorn arceus|Alpha chinchou|Angler trapinch|Ant Pit heatmor|Anteater dedenne|Antenna marill|Aqua Mouse azumarill|Aqua Rabbit hariyama|Arm Thrust
You may be thinking "this doesn't really seem that useful? couldn't we get the same result with ORDER BY instead?" You'd be correct, except that GROUP BY is often used with aggregate functions, one of the most powerful tools in your SQL arsenal.
Aggregate Functions
Aggregate functions are a kind of virtual column, allowing you to calculate simple operations on data in tables. Use an aggregate function to calculate a value using one of the following functions:
Function | Meaning |
---|---|
COUNT | Count the total records |
MAX | Identify the largest value |
MIN | Identify the smallest value |
SUM | Add the values together |
AVG | Calculate an average value |
An aggregate function that I use all the time is COUNT. Want to know how many Pokémon fall into the genus of mouse?
sqlite> select count(genus) from pokemon_species_names ...> where genus = "Mouse" and local_language_id = 9; 6
OK, it was 6. That seems a little anti-climactic.
In this example, 'count(genus)' is used to count the result set that matches the SELECT statement with the return clause. This is often used to simply count the total number of records in a table:
sqlite> .headers on sqlite> select count(*) as "Total" from pokemon_species_names; Total 6094
Here I also took the opportunity to demonstrate how we name an aggregate function column. Using the syntax AS Total after the aggregate function allows me to reference the aggregate column by name.
Aggregate functions become very useful when you combine them with the GROUP BY operator. Previously we looked for the number of Pokémon in the mouse genus, but what if we want to get a count of how many Pokémon belong to each of the genera? (NB: I had to Google "plural of genus" for that.)
sqlite> select count(name) count, genus from pokemon_species_names ...> where local_language_id = 9 group by genus order by count desc; count|genus 8|Dragon 6|Mouse 6|Mushroom 5|Balloon 5|Flame 5|Fox 4|Bagworm 4|Bat 4|Cocoon 4|Drill 4|Fairy 4|Poison Pin 4|Seed 4|Tadpole 3|Big Jaw 3|Bivalve 3|Cottonweed 3|EleFish 3|Electric 3|Flower
The GROUP BY operator here groups each of the return results into a set by the specified GENUS column. By naming the aggregate function column count, I was able to reference it later in the ORDER BY clause as well.
Understanding GROUP BY, and aggregate functions is what gives you powerful tools for analyzing SQL data. Use it to identify the number of credit cards disclosed in a compromise, or credit card breach counts grouped by state that have mandatory breach reporting requirements, for example.
Whew! That was a lot of SQL for one article. Master these techniques though, and you'll be well on your way to getting useful and meaningful data out of SQL databases in your pen-test post-compromise phase. For more interesting SQL lessons specifically using the Pokémon Pokedex database, check out Dandelion Mané's PokemonSQLTutorial project too!
-Josh Wright
@joswr1ght