Recovering Data
This post is a quick rundown of a lightning talk that I gave at my local information security meetup.
Introduction
Tonight we are going to recover deleted files from a USB thumb drive. I’m a fan of Kali Linux, so all the tools demonstrated are ones that come pre-installed with it and can be used after booting from a live CD or USB thumb drive.
Image the drive
First of all I like to image the drive so that I’m not working on the original source.
This can be helpful if you can’t keep the source device.
When imaging the drive we can use GNU dd, however there are a couple of forks of dd that have added features for computer forensics. These are dc3dd and dcfldd; tonight I will use the latter.
$ dcfldd --help
Usage: dcfldd [OPTION]...
Copy a file, converting and formatting according to the options.
bs=BYTES force ibs=BYTES and obs=BYTES
cbs=BYTES convert BYTES bytes at a time
conv=KEYWORDS convert the file as per the comma separated keyword list
count=BLOCKS copy only BLOCKS input blocks
ibs=BYTES read BYTES bytes at a time
if=FILE read from FILE instead of stdin
obs=BYTES write BYTES bytes at a time
of=FILE write to FILE instead of stdout
NOTE: of=FILE may be used several times to write
output to multiple files simultaneously
of:=COMMAND exec and write output to process COMMAND
seek=BLOCKS skip BLOCKS obs-sized blocks at start of output
skip=BLOCKS skip BLOCKS ibs-sized blocks at start of input
pattern=HEX use the specified binary pattern as input
textpattern=TEXT use repeating TEXT as input
errlog=FILE send error messages to FILE as well as stderr
hashwindow=BYTES perform a hash on every BYTES amount of data
hash=NAME either md5, sha1, sha256, sha384 or sha512
default algorithm is md5. To select multiple
algorithms to run simultaneously enter the names
in a comma separated list
hashlog=FILE send MD5 hash output to FILE instead of stderr
if you are using multiple hash algorithms you
can send each to a separate file using the
convention ALGORITHMlog=FILE, for example
md5log=FILE1, sha1log=FILE2, etc.
hashlog:=COMMAND exec and write hashlog to process COMMAND
ALGORITHMlog:=COMMAND also works in the same fashion
hashconv=[before|after] perform the hashing before or after the conversions
hashformat=FORMAT display each hashwindow according to FORMAT
the hash format mini-language is described below
totalhashformat=FORMAT display the total hash value according to FORMAT
status=[on|off] display a continual status message on stderr
default state is "on"
statusinterval=N update the status message every N blocks
default value is 256
sizeprobe=[if|of] determine the size of the input or output file
for use with status messages. (this option
gives you a percentage indicator)
WARNING: do not use this option against a
tape device.
split=BYTES write every BYTES amount of data to a new file
This operation applies to any of=FILE that follows
splitformat=TEXT the file extension format for split operation.
you may use any number of 'a' or 'n' in any combo
the default format is "nnn"
NOTE: The split and splitformat options take effect
only for output files specified AFTER these
options appear in the command line. Likewise,
you may specify these several times for
for different output files within the same
command line. you may use as many digits in
any combination you would like.
(e.g. "anaannnaana" would be valid, but
quite insane)
vf=FILE verify that FILE matches the specified input
verifylog=FILE send verify results to FILE instead of stderr
verifylog:=COMMAND exec and write verify results to process COMMAND
--help display this help and exit
--version output version information and exit
The structure of of FORMAT may contain any valid text and special variables.
The built-in variables are used the following format: #variable_name#
To pass FORMAT strings to the program from a command line, it may be
necessary to surround your FORMAT strings with "quotes."
The built-in variables are listed below:
window_start The beginning byte offset of the hashwindow
window_end The ending byte offset of the hashwindow
block_start The beginning block (by input blocksize) of the window
block_end The ending block (by input blocksize) of the hash window
hash The hash value
algorithm The name of the hash algorithm
For example, the default FORMAT for hashformat and totalhashformat are:
hashformat="#window_start# - #window_end#: #hash#"
totalhashformat="Total (#algorithm#): #hash#"
The FORMAT structure accepts the following escape codes:
\n Newline
\t Tab
\r Carriage return
\\ Insert the '\' character
## Insert the '#' character as text, not a variable
BLOCKS and BYTES may be followed by the following multiplicative suffixes:
xM M, c 1, w 2, b 512, kD 1000, k 1024, MD 1,000,000, M 1,048,576,
GD 1,000,000,000, G 1,073,741,824, and so on for T, P, E, Z, Y.
Each KEYWORD may be:
ascii from EBCDIC to ASCII
ebcdic from ASCII to EBCDIC
ibm from ASCII to alternated EBCDIC
block pad newline-terminated records with spaces to cbs-size
unblock replace trailing spaces in cbs-size records with newline
lcase change upper case to lower case
notrunc do not truncate the output file
ucase change lower case to upper case
swab swap every pair of input bytes
noerror continue after read errors
sync pad every input block with NULs to ibs-size; when used
with block or unblock, pad with spaces rather than NULs
Report bugs to <nicholasharbour@yahoo.com>.
As I’m running low on disk space, I am going to save the drive image to an external drive. This may also come in handy if you are booting from removable media.
For now I’m only going to use the if and of arguments.
dcfldd if=<source device> of=<target image>
NOTE: it is important to get the source and target the right way around, as swapping them can destroy data!
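One way to reduce the risk of swapping the two arguments is to check them before imaging. The following is a minimal sketch, not part of dcfldd: check_image_args is a hypothetical helper name, and the rules it applies (the source must exist, the target must not already exist and must not be a block device) are assumptions about what a sensible guard might look like.

```shell
# Hypothetical guard (not a dcfldd feature): refuse to continue when the
# source/target arguments look swapped or the target would be overwritten.
check_image_args() {
    src=$1
    dst=$2
    if [ ! -e "$src" ]; then
        echo "source $src does not exist" >&2
        return 1
    fi
    if [ -b "$dst" ]; then
        echo "target $dst is a block device; refusing to overwrite it" >&2
        return 1
    fi
    if [ -e "$dst" ]; then
        echo "target $dst already exists; refusing to overwrite it" >&2
        return 1
    fi
    return 0
}

# Example use (paths are placeholders):
#   check_image_args /dev/sdX /path/to/image.img && dcfldd if=/dev/sdX of=/path/to/image.img
```

Because the imaging command only runs when the check succeeds, a swapped pair of arguments fails early instead of writing over the evidence.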
A handy way to see which drives you have connected is to use the lsblk command.
lsblk -io KNAME,TYPE,SIZE,MODEL
Now let’s image the drive.
$ dcfldd if=/dev/sdb of=/media/root/ELEMENTS/isig.img
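Before working on the image it can be worth confirming that it matches the source. The help text above shows that dcfldd can hash while copying (hash= and hashlog=) and verify against a file (vf=). The sketch below does the same check with plain GNU dd and sha256sum so the idea is visible step by step. image_and_verify is a hypothetical helper name, and the block size of 1M is an arbitrary assumption.

```shell
# Hypothetical helper: copy a source to an image with GNU dd, then compare
# SHA-256 digests of the source and the copy.
image_and_verify() {
    src=$1
    img=$2
    # status=none (GNU dd) suppresses the transfer statistics.
    dd if="$src" of="$img" bs=1M status=none || return 1
    src_sum=$(sha256sum < "$src" | cut -d' ' -f1)
    img_sum=$(sha256sum < "$img" | cut -d' ' -f1)
    if [ "$src_sum" = "$img_sum" ]; then
        echo "OK $img_sum"
    else
        echo "MISMATCH: $src_sum != $img_sum" >&2
        return 1
    fi
}

# Example use (paths are placeholders):
#   image_and_verify /dev/sdX /path/to/image.img
```

Recording the digest alongside the image also lets you show later that the image has not changed since it was taken.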
Recovering the files
Now that we have an image file, it is time to carve the files out of it. Kali has both foremost and scalpel installed. While the authors of foremost recommend using scalpel, I have had the most success with foremost out of the box.
$ foremost -h
foremost version 1.5.7 by Jesse Kornblum, Kris Kendall, and Nick Mikus.
$ foremost [-v|-V|-h|-T|-Q|-q|-a|-w-d] [-t <type>] [-s <blocks>] [-k <size>]
[-b <size>] [-c <file>] [-o <dir>] [-i <file]
-V - display copyright information and exit
-t - specify file type. (-t jpeg,pdf ...)
-d - turn on indirect block detection (for UNIX file-systems)
-i - specify input file (default is stdin)
-a - Write all headers, perform no error detection (corrupted files)
-w - Only write the audit file, do not write any detected files to the disk
-o - set output directory (defaults to output)
-c - set configuration file to use (defaults to foremost.conf)
-q - enables quick mode. Search are performed on 512 byte boundaries.
-Q - enables quiet mode. Suppress output messages.
-v - verbose mode. Logs all messages to screen
The options that we are going to use:
-t all: carve out as many file types as possible
-v: verbose output
-i isig.img: the drive image to process
-o foremost: the directory to place the found files in
$ foremost -t all -v -i isig.img -o foremost
After some time the files will be carved out into the foremost directory. Let’s have a look at what is in there.
$ tree foremost/
foremost/
├── audit.txt
├── docx
│ └── 00106328.docx
├── exe
│ ├── 00107944.exe
│ └── 00129456.exe
├── gif
│ ├── 00111088.gif
...
└── png
├── 00118384.png
├── 00127448.png
├── 00128384.png
├── 00129920.png
├── 00131136.png
└── 00143742.png
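When the output directory is large, a per-type count can be easier to read than the full tree. This is a small sketch that assumes the layout shown above, with one subdirectory per file type; count_carved is a hypothetical helper name.

```shell
# Hypothetical helper: print "<type> <count>" for each per-type
# subdirectory of a foremost output directory.
count_carved() {
    outdir=$1
    for dir in "$outdir"/*/; do
        # Skip the unexpanded pattern when there are no subdirectories.
        [ -d "$dir" ] || continue
        type=$(basename "$dir")
        n=$(find "$dir" -type f | wc -l)
        printf '%s %d\n' "$type" "$n"
    done
}

# Example use:
#   count_carved foremost
```

Top-level files such as audit.txt are ignored because only subdirectories are visited.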
Now we can compare that to the original USB thumb drive.
$ tree /media/root/HOT
HOT
├── 00018664.exe
├── 2011 AHOT TK_v3.pdf
├── Chapter Handbook
│ ├── A.Preface.pdf
│ ├── B. Charter.pdf
│ ├── C.Officer Positions.pdf
│ ├── D.Benefits.pdf
│ ├── E.Activities.pdf
│ ├── F.Chapter Business.pdf
│ ├── G.Annual Meeting.pdf
│ ├── H.Marketing Media.pdf
│ ├── I.Safe Riding Tips.pdf
│ ├── J. State Rallies.pdf
│ ├── K.Reference Docs.pdf
│ ├── L.Index.pdf
│ ├── Opening Pages.pdf
│ └── Table of contents.pdf
├── forensicsT1C1.jpg
├── forensicsT1C2.img
├── key.txt
└── shadow
We see that 00106328.docx has been recovered, even though no .docx file appears in the listing of the drive. Both present and deleted files are carved out, and each is given a unique numeric name. Sometimes you get duplicates and fragments; it depends on how the files were written to disk, because the file boundaries are worked out without the aid of the File Allocation Table.
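Exact duplicates among the carved files can be spotted by hashing them and grouping identical digests. The sketch below is one possible approach, not something foremost does for you. find_duplicates is a hypothetical helper name, and the -w and --all-repeated options assume GNU uniq, which is what Kali ships. Fragments will not show up this way, since a partial file hashes differently from the complete one.

```shell
# Hypothetical helper: list files under a directory whose SHA-256 digests
# match another file's. Each output line is "<digest>  <path>".
find_duplicates() {
    find "$1" -type f -exec sha256sum {} + \
        | sort \
        | uniq -w64 --all-repeated=separate   # compare only the 64-char digest
}

# Example use:
#   find_duplicates foremost
```

Groups of identical files are separated by a blank line, so each group can be reviewed and trimmed down to a single copy.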