Year of Python (YOP) – Week Nineteen


Hello Reader!

This week’s code is a set of updates to some previous YOP scripts I’ve written…

First, there was my YOP – Week Seventeen script, which was an index.dat HASH table parser.  I wanted to start tackling the Record Hash part of the Hash Table entries.  This part of the entry is four bytes in size, and is parsed out at the bit level, where we are basically looking at bits being turned “on” or “off.”  I started by looking through the specs that Joachim Metz has documented in his libmsiecf project.

Now according to Joachim’s research, the record hash in the hash table is broken down as follows:

  • Record Hash Flags – 5 bits
  • One bit that is unused
  • Record Hash Value – 26 bits

How this data is translated out still isn’t clear (at least to me at this point), so all I was trying to accomplish with this snippet of code was to get the data down to the binary level.  And here’s what we have:

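import struct

from bitstring import BitArray

def hash_data_parse(parse_data):
    # Convert a hex string such as "0x0800001a" into its string of bits.
    bit_values = BitArray(parse_data)
    binary_values = bit_values.bin
    return binary_values

def hash_table_records(parse_records):
    # Each hash table entry holds two little-endian 32-bit values.
    ie_hash_data = struct.unpack("<I", parse_records[0:4])
    ie_hash_record_pointer = struct.unpack("<I", parse_records[4:8])
    # Zero-pad to 8 hex digits so we always get 32 bits; hex() drops leading zeros.
    ie_hash_data_hex = "0x{:08x}".format(ie_hash_data[0])
    ie_hash_data_parse = hash_data_parse(ie_hash_data_hex)
    print("Hash Data: {}\t\tHash Record Pointer: {}".format(ie_hash_data_hex, ie_hash_record_pointer[0]))
    # Index 5 is the unused bit, so it is skipped.
    print("Record Hash Flags: {}\tRecord Hash Value: {}".format(ie_hash_data_parse[:5], ie_hash_data_parse[6:]))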

The first thing we need to do is import the bitstring module so we can parse the hexadecimal data.  Next, I created a new function that takes the hexadecimal data as input and returns the binary value.  Finally, in the hash_table_records function, I call the new function and print out the data it returns.  You’ll note I’m not printing out the unused bit.
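The same split can also be done with plain integer shifts and masks, without bitstring.  This is only a minimal sketch: it assumes the flags sit in the five most significant bits, matching the slicing used above rather than anything I have confirmed in the spec, and split_record_hash is a hypothetical helper name.

```python
import struct

def split_record_hash(raw_bytes):
    """Split a 4-byte record hash into (flags, unused_bit, hash_value).

    Assumption: the 5 flag bits are the most significant bits, followed by
    the unused bit, then the 26-bit hash value.
    """
    value = struct.unpack("<I", raw_bytes[0:4])[0]
    flags = value >> 27                # top 5 bits
    unused_bit = (value >> 26) & 0x1   # next single bit
    hash_value = value & 0x03FFFFFF    # low 26 bits
    return flags, unused_bit, hash_value

# Made-up sample value, packed little-endian the way it would sit in the file.
sample = struct.pack("<I", 0x0800001A)
flags, unused_bit, hash_value = split_record_hash(sample)
print("Flags: {:05b}  Unused: {}  Value: {:026b}".format(flags, unused_bit, hash_value))
```

Working on the integer directly means the result is always treated as a full 32 bits, so leading zeros can’t go missing along the way.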

The second script I updated was from YOP – Week 13.  This was the script I designed to hash the files in two directories, with the end goal of determining that the files all match.  I’m looking to use this code after copying a large number of files to another location, so I can make sure everything moved over correctly.  However, the one piece of code I had not written at the time was the code to actually compare the two md5 values to make sure they matched.

# Baseline: how many items were hashed in the source directory.
audit_log.write("There are {} items in the source directory.\n".format(len(source_file_list)))
audit_log.write("Source files...\n")
for key, value in source_file_list.items():
    audit_log.write("{}\t\t\t\t\tMD5: {}\n".format(key, value))

# Same count for the destination directory.
audit_log.write("There are {} items in the destination directory.\n".format(len(dest_file_list)))
audit_log.write("\nDestination files...\n")
for key, value in dest_file_list.items():
    audit_log.write("{}\t\t\t\t\tMD5: {}\n".format(key, value))

# A (key, MD5) pair is in both sets only when the key and the hash are identical.
for (key, value) in set(source_file_list.items()) & set(dest_file_list.items()):
    audit_log.write("{}: {} is present in both source and destination.\n".format(key, value))

The first update I made was to the source file data I’m writing to the “audit file” you specify with the script.  Utilizing the dictionary I created to track the source files, I get the length of the dictionary, which gives me the number of items that we’re working with in the source location.  Here I’m looking at what my “baseline” is, how many items I should end up with in the destination.

The second update is repeating that same process in the destination directory.  Now I’m checking to see how many items I have where the files were copied to.
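Both counts come from dictionaries that map each file to its MD5.  As a rough sketch of how such a dictionary might be built (this is not the code from the Week 13 script): hash_directory is a hypothetical name, and I’m assuming the keys are paths relative to the directory root, so the same file gets the same key in both locations.

```python
import hashlib
import os

def hash_directory(root):
    """Return {relative_path: md5_hex_digest} for every file under root."""
    hashes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            md5 = hashlib.md5()
            # Read in chunks so large files are not loaded into memory at once.
            with open(full_path, "rb") as handle:
                for chunk in iter(lambda: handle.read(65536), b""):
                    md5.update(chunk)
            # Key on the path relative to root so source and destination line up.
            hashes[os.path.relpath(full_path, root)] = md5.hexdigest()
    return hashes
```

With dictionaries built this way, len() of each one gives the item count written to the audit file.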

Finally, I’m comparing both dictionaries and writing the files that match to my output audit file.  A file only counts as a match when both its key and its MD5 are the same in the source and the destination.  This is just another audit point to make sure I end up with the same files that I started with.
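The comparison only records the files that match, so anything that failed to copy has to be inferred from the counts.  One way to list the problem files directly is sketched below.  It assumes both dictionaries use the same keys for the same file (for example, relative paths), and compare_hashes is a hypothetical helper, not part of the script above.

```python
def compare_hashes(source_file_list, dest_file_list):
    """Sort file names into matched, mismatched, missing, and extra."""
    matched, mismatched, missing = [], [], []
    for name, md5 in sorted(source_file_list.items()):
        if name not in dest_file_list:
            missing.append(name)        # never arrived at the destination
        elif dest_file_list[name] == md5:
            matched.append(name)        # same name, same hash
        else:
            mismatched.append(name)     # same name, different hash
    # Files that exist only in the destination.
    extra = sorted(set(dest_file_list) - set(source_file_list))
    return matched, mismatched, missing, extra

# Made-up hashes, purely for illustration.
source = {"a.txt": "111", "b.txt": "222", "c.txt": "333"}
dest = {"a.txt": "111", "b.txt": "999", "d.txt": "444"}
matched, mismatched, missing, extra = compare_hashes(source, dest)
print("Matched: {}".format(matched))
print("Mismatched: {}".format(mismatched))
print("Missing from destination: {}".format(missing))
print("Only in destination: {}".format(extra))
```

Each of these lists could be written to the audit file the same way the matches are.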

Until next week!

