Left4Code

dd


1. Background Information

dd is part of the GNU core utilities (coreutils). It can be used to create or copy data, block by block, between devices and files.

For the purposes of this course, I will show how to copy data from a partition or entire drive to an image file, and how to write random data to an image file.

1.1 Warnings

Most programs you use on a daily basis can be compared to a car. A car has seatbelts, a (hopefully) strong exterior, and, if it is a bit newer, sensors for road obstructions. Programs of this kind include Firefox, your GUI calculator software, and that fun video game.

dd is not like this. It is more like a motorcycle than a car: a small program that prioritizes efficiency and has little to no safety measures. Much like a motorcycle, you should learn to use dd in a safe environment where potential damage is minimized. You should also bring a helmet (understanding of commands, a virtual environment, patience).

1.2 Links Used on This Page
▶[https://en.wikipedia.org/wiki/Hard_disk_drive?lang=en] 

◉───╡ Wikipedia Link for How a Hard Drive Works.

▶[https://www.gnu.org/software/coreutils/manual/html_node/dd-invocation.html]

◉───╡ dd GNU coreutils manual.

▶[https://maizure.org/projects/decoded-gnu-coreutils/index.html]

◉───╡ A project which explains the GNU coreutils functionality from a programmer's perspective (written by MaiZure, my hero.)

▶[https://youtube.com/channel/UCnjRWRyHTLZo5l7cfq04Uwg]

◉───╡ MaiZure's Youtube Channel.

▶[https://dfir.ru/2018/07/21/a-live-forensic-distribution-executing-malicious-code-from-a-suspect-drive/]

◉───╡ Maxim Suhanov's PoC for malicious code from a drive being executed in a live forensic distribution.

▶[https://dfir.ru/2018/07/25/a-live-forensic-distribution-writing-to-a-suspect-drive/]

◉───╡ Maxim Suhanov's Additional implications for the PoC.

1.3 How a Hard Disk Drive (HDD) Works

An HDD consists of the following major components:

1. Platter: A rigid disk (typically aluminum or glass) with a magnetic coating, responsible for storing binary data in a non-volatile way.

2. Spindle: The shaft that holds and rotates a single platter or a stack of platters.

3. Actuator: Magnetic motor responsible for moving the actuator arm and read/write head for precise data operations.

4. Actuator Arm: Holds the read/write head.

5. Read/write head: An electromagnet-based head responsible for reading and writing ones and zeros on the platter.

The hard disk will begin by spinning up to around 5,400 or 7,200 RPM.

Physically, each platter is divided into sectors, which the disk firmware can read.

Once an operating system is installed on the disk and booted, disk I/O is managed mainly by the kernel and by the file system in use, for example NTFS, ext4, or btrfs.

When a program running on the operating system needs to use the disk, the kernel handles the request through system calls, which are the layer of abstraction between user programs and the kernel.

Normally, a program will call a wrapper function in a standard library (glibc, musl), and this function will then invoke the system call.

In the case of dd (I'm not 100% on this, I'm really bad at reading C code I haven't written), the program uses the internal functions iread and iread_fullblock to read data from the input file into a buffer, and then writes the buffer to the output file. The iread function seems to call read(). By default, dd reads and writes in 512-byte blocks.

After doing some searching, I found MaiZure's graphical guide for the GNU coreutils. This project has to be one of the most underrated things ever. I cannot thank this person enough for the work they have done here. Just amazing. Seriously, amazing.

It covers all relevant GNU coreutils in a way that is super easy to understand, providing graphical representations of the control flow of the program. MaiZure also outlined all of the functions and literally broke everything down.

MaiZure also has a Youtube channel. Support and learn!

I guess that serves as my personal thank you to MaiZure since I'm silly and can't figure out how to contact you directly.


2. dd Usage

The standard dd usage for most forensic purposes is the following:

dd if=<input> of=<output> 	    

This command copies all bytes from the specified input file to the specified output file. By default, dd copies 512 bytes from the input file to the output file at a time.
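As a minimal sketch of this basic form, the commands below copy a small scratch file instead of a real device. The directory created by mktemp and the file names src.bin and dst.bin are placeholders chosen for illustration; they are not part of the original course material.

```shell
# Work in a throwaway directory so that no real device is touched.
workdir=$(mktemp -d)

# Create a 2048-byte source file filled with zero bytes.
head -c 2048 /dev/zero > "$workdir/src.bin"

# Copy it with only if= and of=, so dd uses its default 512-byte blocks.
# dd prints its transfer statistics on stderr, so capture them with 2>&1.
stats=$(dd if="$workdir/src.bin" of="$workdir/dst.bin" 2>&1)
echo "$stats"

# cmp exits with status 0 only when both files are byte-for-byte identical.
cmp "$workdir/src.bin" "$workdir/dst.bin" && echo "copy is identical"
```

With GNU dd, the statistics should report 4+0 records in and 4+0 records out, because 2048 bytes divide into four full 512-byte blocks.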

The general format for dd follows the old IBM Job Control Language parameters, which is why this may look different from the standard flag-style "-" or "--" system.

2.1 Additional Parameters

dd parameters are able to manipulate the placement, input, and output of data.

The parameters you are most likely to use in a forensic investigation are:

'if' (input file) - the file that dd will copy data from. Default is stdin.

'of' (output file) - the file that dd will copy data to. Default is stdout.

'bs' (block size) - the number of bytes read from 'if' and written to 'of' at a time. Default is 512 bytes.

'count' - the number of blocks of size 'bs' to copy before stopping.

the 'noerror' conversion (conv=noerror) - dd will continue to operate despite a read error.

the 'excl' conversion (conv=excl) - if the file specified in 'of' already exists, dd will fail instead of overwriting it.

the 'progress' status level (status=progress) - shows the progress of longer copy operations.
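The parameters above can be combined in one invocation. The following is a rough sketch on scratch files, assuming GNU dd; the path names are placeholders. It copies two 1024-byte blocks and then shows conv=excl refusing to overwrite the existing output.

```shell
# Scratch directory; nothing outside it is written.
workdir=$(mktemp -d)

# bs=1024 count=2 transfers two blocks of 1024 bytes, so 2048 bytes in total.
# status=none silences the statistics; status=progress would show a live counter.
dd if=/dev/zero of="$workdir/out.bin" bs=1024 count=2 status=none

# Print the resulting size in bytes.
wc -c < "$workdir/out.bin"

# conv=excl makes dd fail when the output file already exists.
if dd if=/dev/zero of="$workdir/out.bin" bs=1024 count=2 conv=excl status=none 2>/dev/null; then
    excl_result="overwritten"
else
    excl_result="refused"
fi
echo "second run: $excl_result"
```

The second run should be refused, since out.bin was already created by the first one.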

3. Read-only Mounting of a Drive

In a forensic situation, you want to make sure you mount the drive in read-only mode. First mark the block device as read-only:

sudo blockdev --setro /dev/<partition>

Then mount the device like this:

sudo mount -o ro,noload /dev/<partition> <mountpoint>

The 'noload' option applies to ext3 and ext4 and stops the journal from being replayed at mount time. The device can then only be read from and, for the most part, should not be modified.
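The two steps can be wrapped in a small helper that refuses to mount unless the kernel reports the device as read-only. This is only a sketch: the function name mount_readonly and its arguments are made up for illustration, the noload option assumes an ext3 or ext4 file system, and the commands need root privileges. blockdev --getro prints 1 when the read-only flag is set.

```shell
# Hypothetical helper: mount_readonly <device> <mountpoint>
mount_readonly() {
    dev=$1
    mnt=$2

    # Ask the kernel to mark the block device read-only.
    blockdev --setro "$dev" || return 1

    # --getro prints 1 if the read-only flag is set, 0 otherwise.
    if [ "$(blockdev --getro "$dev")" != "1" ]; then
        echo "refusing to mount: $dev is not read-only" >&2
        return 1
    fi

    # ro mounts read-only; noload (ext3/ext4) skips journal replay.
    mount -o ro,noload "$dev" "$mnt"
}

# Example call (placeholders, run as root):
#   mount_readonly /dev/<partition> /mnt/evidence
```

As the warnings in this section explain, the read-only flag reduces the risk of writes but does not replace a proper write blocker.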

3.1 Warnings for Mounting a Drive as Read-only

NOTE: It is quite hard to guarantee that a device is 100% read-only during a forensic investigation. The readme for a Linux kernel patch for software write blocking, which modifies the block device driver to check for read-only conditions more thoroughly, explains that the issue is complex. Userland processes, the block device driver, or the operating system itself may still modify or clear data.

Using a forensically focused Linux distribution that supports software write blocking is wiser than manually patching the kernel of a different distribution. Tsurugi Linux supports such a feature natively and will block write access and open drives in read-only mode by default.

Out of curiosity, I wondered how Tsurugi Linux's handling of this issue differs from the kernel patch's, so I contacted the Tsurugi Linux developers for some insight. The project founder of Tsurugi Linux, 'sug4r', informed me of the following:

1: Tsurugi Linux uses its own write-blocker system, and originally used the kernel patch mentioned above (hosted on GitHub).

2: Tsurugi Linux will build a dirty filesystem for ext4 and check the hash before and after mounting the image.

This heavily minimizes the chance of anything being written to the drive.

Huge thanks to sug4r for replying to my message. I probably wouldn't have found the answer to my question without it.

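If you would like to read the initial articles by Maxim Suhanov, which describe executing malicious code from a target hard drive in a forensic live Linux distribution, they are here:

The initial article: https://dfir.ru/2018/07/21/a-live-forensic-distribution-executing-malicious-code-from-a-suspect-drive/

Additional article explaining further implications: https://dfir.ru/2018/07/25/a-live-forensic-distribution-writing-to-a-suspect-drive/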

Maxim has a lot of very interesting blog posts at dfir.ru. If anything from there interests you, I highly recommend you read it.

4. Example of dd Usage with Parameters

Let's start by making a test input file. As the reader, you can make this however you want. Personally, typing some funny stuff into a text file seems like the best way. But for the purposes of using dd, you can generate a file of any chosen size using /dev/random.

The command below generates a 4 KiB file using data from /dev/random as input and places the output file in the current working directory. /dev/random is a special virtual device on Linux that produces random data:

Creates a 4 KiB File Named "testfile.txt".

dd if=/dev/random of=testfile.txt bs=1024 count=4   

Upon examining the file, you will notice that its data is not made up entirely of human-readable ASCII. If you would like a file made up of random ASCII data instead, you can use the following command:

Creates a 4 KiB File Named "testfile.txt" Filled Only with ASCII Readable Text.

tr -dc '[:alnum:]' </dev/random | head -c 4096 > testfile.txt

This command sequence takes input from /dev/random and uses tr to keep only alphanumeric characters, as defined by "[:alnum:]". The filtered output is then sent to head, which writes the first 4096 bytes of it into testfile.txt. The quotes around "[:alnum:]" stop the shell from treating it as a file name pattern.
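One way to check the result is to delete every alphanumeric character and count what is left. The sketch below makes two assumptions that are not in the original command: it reads from /dev/urandom rather than /dev/random so that it does not block on older kernels, and it sets LC_ALL=C so that tr treats the input as raw bytes. The scratch directory is a placeholder.

```shell
workdir=$(mktemp -d)

# Keep only alphanumeric bytes from the random stream and stop after 4096 of them.
LC_ALL=C tr -dc '[:alnum:]' < /dev/urandom | head -c 4096 > "$workdir/testfile.txt"

# Size of the generated file in bytes.
size=$(wc -c < "$workdir/testfile.txt")

# Delete all alphanumeric bytes; anything left would be a non-alphanumeric byte.
leftover=$(LC_ALL=C tr -d '[:alnum:]' < "$workdir/testfile.txt" | wc -c)

echo "size=$size non-alphanumeric=$leftover"
```

The file should be 4096 bytes long with no non-alphanumeric bytes remaining.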

The 'ucase' and 'lcase' conversions (conv=ucase and conv=lcase) change the case of letters in the data as it is copied, from lower-case to upper-case and vice versa.

Converts the Content from <testfile_created> to Uppercase.

dd if=<testfile_created> of=<testfile_created>_new.txt conv=ucase	    
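A small worked example, using placeholder file names in a scratch directory, shows the effect of conv=ucase on a known input:

```shell
workdir=$(mktemp -d)

# A small lower-case input file.
printf 'hello from dd\n' > "$workdir/lower.txt"

# conv=ucase converts lower-case letters to upper-case during the copy.
# The transfer statistics on stderr are discarded to keep the output clean.
dd if="$workdir/lower.txt" of="$workdir/upper.txt" conv=ucase 2>/dev/null

cat "$workdir/upper.txt"   # prints: HELLO FROM DD
```

The input file is left untouched; only the copy is converted.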

For forensics, dd can be used to create image files from entire disks or disk partitions on the system. These image files can then be processed by other forensic tools like Autopsy or Foremost.

Copies the Contents of /dev/sda Into an Image File Using the noerror Conversion.

sudo dd if=/dev/sda of=sda_image.dd bs=4096 conv=noerror status=progress   

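This example makes a complete copy of the entire /dev/sda drive and saves it to the image file "sda_image.dd". With noerror alone, a block that cannot be read is left out of the image; adding sync (conv=noerror,sync) pads such blocks with zero bytes so that the data after them keeps its original offset.

Additionally, individual partitions can be copied like so: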

Copies Content from /dev/sda1 Partition to Image File.

sudo dd if=/dev/sda1 of=sda1_image.dd bs=4096 status=progress	    

5. Using Hashing with dd

When dd is the only tool available for data acquisition, hashing is generally done as follows:

1: Take a Hash of the Drive You Intend to Copy

If you do not understand what I mean when I use "sha*sum" in the command, please refer to the beginning of the 'sha*sum' page for context.

sha*sum /dev/sdX

2: Use dd to Copy the Contents of the Drive to an Image File

dd if=/dev/sdX of=<outputfile.dd>

3: Calculate the Hash of the Resulting Output File

sha*sum <outputfile.dd>

4: Compare the Hash of the /dev/sdX Drive to the Hash of the <outputfile.dd> File

The two hashes must match. If they differ, the image is not an exact copy of the drive.
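The four steps can be rehearsed safely on a regular file before touching real evidence. In this sketch a 1 MiB scratch file stands in for /dev/sdX, sha256sum stands in for the generic sha*sum, and all path names are placeholders; /dev/urandom is used only to fill the stand-in with data.

```shell
workdir=$(mktemp -d)

# Stand-in for the drive: 256 blocks of 4096 bytes of random data (1 MiB).
dd if=/dev/urandom of="$workdir/fake_drive.bin" bs=4096 count=256 2>/dev/null

# Step 1: hash the source. cut keeps only the hash and drops the file name.
source_hash=$(sha256sum "$workdir/fake_drive.bin" | cut -d ' ' -f 1)

# Step 2: copy the source to an image file.
dd if="$workdir/fake_drive.bin" of="$workdir/image.dd" bs=4096 2>/dev/null

# Step 3: hash the image.
image_hash=$(sha256sum "$workdir/image.dd" | cut -d ' ' -f 1)

# Step 4: compare the two hashes.
if [ "$source_hash" = "$image_hash" ]; then
    verdict="match"
else
    verdict="MISMATCH"
fi
echo "$verdict"
```

On a real device the same commands would need root privileges, and the source must not change between step 1 and step 3, which is one more reason to keep it read-only.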

6. Conclusion

This concludes the general usage of dd for a forensic application. Additional tools that build on the dd approach with extended features, such as ddrescue, dcfldd, and dc3dd, will be covered subsequently.