Hello Reader!

This week I worked on updates to the Prefetch file script I began last week.  I've started parsing out the file information data from the prefetch file.  It's not complete, but there's enough to post for this week.

First, let's talk about the changes.  The prefetch_format function is first:

def prefetch_format(format_type):
    if format_type == "0x11":
        return "Windows XP"
    elif format_type == "0x17":
        return "Windows 7"
    elif format_type == "0x1a":
        return "Windows 8"
    return

Last week I was just printing out which OS the format belonged to.  This time around we're going to use the return value from the function to figure out what type of file we're dealing with, then parse it out accordingly.

The second change was when we read in the prefetch file to process it:

with open(args.prefetch_file, 'rb') as prefetch:
    prefetch_file = prefetch.read()
    prefetch_header = prefetch_file[:84]
    windows_os = prefetch_header_parse(prefetch_header)
    if windows_os == "Windows XP":
        file_info = prefetch_file[84:152]
        winxp_file_info(file_info)
    elif windows_os == "Windows 7":
        file_info = prefetch_file[84:240]
        win7_file_info(file_info)
    elif windows_os == "Windows 8":
        file_info = prefetch_file[84:308]
        win8_file_info(file_info)

So here's where we take the OS string returned from the header parsing (via the prefetch_format function) and figure out which OS-specific function to pass the file info to.  This leads us to two new functions:

def winxp_file_info(file_info):
    metrics_offset = struct.unpack("<L", file_info[0:4])
    no_metrics = struct.unpack("<L", file_info[4:8])
    trace_chains = struct.unpack("<L", file_info[8:12])
    no_trace_chains = struct.unpack("<L", file_info[12:16])
    filename_str_offset = struct.unpack("<L", file_info[16:20])
    filename_str_size = struct.unpack("<L", file_info[20:24])
    volume_info = struct.unpack("<L", file_info[24:28])
    no_volumes = struct.unpack("<L", file_info[28:32])
    volume_info_size = struct.unpack("<L", file_info[32:36])
    last_run_time = struct.unpack("<Q", file_info[36:44])
    run_count = struct.unpack("<L", file_info[60:64])
    print "Metrics Array Offset: {}".format(hex(metrics_offset[0]))
    print "No. of Metrics: {}".format(no_metrics[0])
    print "Trace Chains Offset: {}".format(trace_chains[0])
    print "No. of Trace Chains: {}".format(no_trace_chains[0])
    print "Filename String Offset: {}".format(filename_str_offset[0])
    print "Filename String Size: {}".format(filename_str_size[0])
    print "Volume Info: {}".format(volume_info[0])
    print "No. of Volumes: {}".format(no_volumes[0])
    print "Volume Info Size: {}".format(volume_info_size[0])
    print "Last Run Time: {}".format(last_run_time[0])
    print "Run Count: {}".format(run_count[0])
    return

def win7_file_info(file_info):
    metrics_offset = struct.unpack("<L", file_info[0:4])
    no_metrics = struct.unpack("<L", file_info[4:8])
    trace_chains = struct.unpack("<L", file_info[8:12])
    no_trace_chains = struct.unpack("<L", file_info[12:16])
    filename_str_offset = struct.unpack("<L", file_info[16:20])
    filename_str_size = struct.unpack("<L", file_info[20:24])
    volume_info = struct.unpack("<L", file_info[24:28])
    no_volumes = struct.unpack("<L", file_info[28:32])
    volume_info_size = struct.unpack("<L", file_info[32:36])
    last_run_time = struct.unpack("<Q", file_info[44:52])
    run_count = struct.unpack("<L", file_info[68:72])
    print "Metrics Array Offset: {}".format(hex(metrics_offset[0]))
    print "No. of Metrics: {}".format(no_metrics[0])
    print "Trace Chains Offset: {}".format(trace_chains[0])
    print "No. of Trace Chains: {}".format(no_trace_chains[0])
    print "Filename String Offset: {}".format(filename_str_offset[0])
    print "Filename String Size: {}".format(filename_str_size[0])
    print "Volume Info: {}".format(volume_info[0])
    print "No. of Volumes: {}".format(no_volumes[0])
    print "Volume Info Size: {}".format(volume_info_size[0])
    print "Last Run Time: {}".format(last_run_time[0])
    print "Run Count: {}".format(run_count[0])
    return

So basically what I'm doing in these two functions is parsing out the file info data from either a Windows XP or a Windows 7 prefetch file.  I'm still working on the Windows 8 function, since that version has a bit more data (there's a rough sketch below of where that's headed).
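
Based on my read of Joachim Metz's format documentation, the Windows 8 version will probably look something like the following.  Treat the offsets for the run times and run count as assumptions on my part until I've tested against real files:

def win8_file_info(file_info):
    # the first 36 bytes appear to match the XP/Win7 layouts
    metrics_offset = struct.unpack("<L", file_info[0:4])
    filename_str_offset = struct.unpack("<L", file_info[16:20])
    # assumed: Windows 8 keeps the last EIGHT run times, starting
    # 44 bytes in (8 x 8-byte FILETIME values)
    last_run_times = struct.unpack("<8Q", file_info[44:108])
    # assumed: run count 124 bytes in
    run_count = struct.unpack("<L", file_info[124:128])
    print "Metrics Array Offset: {}".format(hex(metrics_offset[0]))
    print "Filename String Offset: {}".format(filename_str_offset[0])
    for run_time in last_run_times:
        print "Last Run Time: {}".format(run_time)
    print "Run Count: {}".format(run_count[0])
    return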

I also stumbled across the Python Construct module, which is what plaso uses to parse out this same information with log2timeline.  I've been playing around with rewriting some of this parsing with that module as well.  I may have that show up in future YOP entries.

Until next week!

https://github.com/CdtDelta/YOP


Hello Reader!

This week I'm going back to parsing Windows artifacts.  The first one I've decided to tackle is Prefetch files.  For those of you who are not familiar with Prefetch files, you can check out this link for more information.

So far all this script does is parse the header information of the Prefetch file, which is only 84 bytes long.  I've started working on parsing the rest of the data, but it's not ready yet.  The format of the script is simple:

yop-week20.py -f <prefetch file>

And that’s it.  The script reads in the first 84 bytes of the file, and then passes it to a function to parse out the individual information.
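
To make that concrete, here's a minimal sketch of the top-level flow (the argparse wiring is my assumption of how the -f flag is set up; see the repo for the real version):

import argparse

parser = argparse.ArgumentParser(description="Parse a Prefetch file header")
parser.add_argument("-f", dest="prefetch_file", required=True,
                    help="path to the Prefetch file to parse")
args = parser.parse_args()

with open(args.prefetch_file, 'rb') as prefetch:
    # the header is a fixed 84 bytes at the start of the file
    prefetch_header = prefetch.read(84)
    # prefetch_header_parse is the header-parsing function
    # defined elsewhere in the script
    prefetch_header_parse(prefetch_header)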

Now there are some parts of the code I still need to tweak.  First is the ability to output to a CSV file.  I've been playing around with the Python CSV module, so look for the ability to output to that format in later scripts.  The second part I need to fix is the prefetch_header_file_name variable.  Since the contents of this can vary in size, I need to figure out how to determine the end of the file name (I'm sure someone will contact me online within 5 minutes of posting this :)  ).  One possible fix is sketched below.
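
For the filename, one likely fix (untested here, and assuming the name is stored as null-terminated UTF-16, which is my reading of the format docs) is to decode the bytes and split on the first null:

# prefetch_header_file_name holds the raw bytes from the header;
# the executable name ends at the first UTF-16 null terminator
file_name = prefetch_header_file_name.decode('utf-16-le').split(u'\x00')[0]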

The Prefetch file has one unique aspect: depending on the version of Windows, the data after the header portion can vary.  I'm choosing to handle it by using a different function for each version of Windows I need to work with.  This function will take care of that:

def prefetch_format(format_type):
    if format_type == "0x11":
        print "Windows XP/2003"
    elif format_type == "0x17":
        print "Windows Vista/7"
    elif format_type == "0x1a":
        print "Windows 8.1"
    return

Right now this is just printing out the version of Windows.  However, the version I'm working on now will parse out the file information based on which version of Windows we're dealing with.  That should show up in a later YOP.

Until next week!

https://github.com/CdtDelta/YOP


Hello Reader!

This week's piece of code is a set of updates to some previous YOP scripts I've written…

First, there was my YOP – Week Seventeen script, which was an index.dat HASH table parser.  I wanted to start tackling the Record Hash part of the Hash Table entries.  This part of the entry is four bytes in size, and is parsed out at the bit level, where we are basically looking at bits being turned “on” or “off.”  I started by looking through the specs that Joachim Metz has documented in his libmsiecf project.

Now according to Joachim’s research, the record hash in the hash table is broken down as follows:

  • Record Hash Flags – 5 bits
  • One bit that is unused
  • Record Hash Value – 26 bits

How this data is translated out still isn’t clear (at least to me at this point), so all I was trying to accomplish with this snippet of code was to get the data down to the binary level.  And here’s what we have:

from bitstring import BitArray

def hash_data_parse(parse_data):
    # build the full 32-bit value so the flag/value slices line up
    # even when the leading bits are zero
    bit_values = BitArray(uint=parse_data, length=32)
    binary_values = bit_values.bin
    return binary_values

def hash_table_records(parse_records):
    ie_hash_data = struct.unpack("<I", parse_records[0:4])
    ie_hash_record_pointer = struct.unpack("<I", parse_records[4:8])
    ie_hash_data_parse = hash_data_parse(ie_hash_data[0])
    print "Hash Data: {}\t\tHash Record Pointer: {}".format(hex(ie_hash_data[0]), ie_hash_record_pointer[0])
    print "Record Hash Flags: {}\tRecord Hash Value: {}".format(ie_hash_data_parse[:5], ie_hash_data_parse[6:])
    return

The first thing we need to do is import the bitstring module so we can work with the data at the bit level.  Next I created a new function that takes the unpacked hash data and returns its 32-bit binary representation (I'm passing the raw unpacked value and padding to a full 32 bits so leading zero bits don't get dropped).  Finally, in the hash_table_records function, I added a call to the new function and printed out the data it returned.  You'll note I'm not printing out the unused bit.
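
As a quick sanity check, here's what the round trip looks like with a made-up sample value:

bits = hash_data_parse(0x00200003)   # made-up value for illustration
print bits[:5]    # Record Hash Flags -> 00000
print bits[6:]    # Record Hash Value -> 00001000000000000000000011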

The second script I updated was from YOP – Week 13.  This was the script I designed to hash files in two directories, with the end goal of determining that the files all match.  I'm looking to use this code after copying a large number of files to another location, so I can make sure everything moved over correctly.  However, the one piece of code I had not written at the time was the code to actually compare the two MD5 values to make sure they matched.

audit_log.write("There are {} items in the source directory.\n".format(len(source_file_list)))
audit_log.write("Source files....\n")
for key, value in source_file_list.items():
    audit_log.write("{}\t\t\t\t\tMD5: {}\n".format(key, value))


audit_log.write("There are {} items in the destination directory.\n".format(len(dest_file_list)))
audit_log.write("\nDestination files...\n")
for key, value in dest_file_list.items():
    audit_log.write("{}\t\t\t\t\tMD5: {}\n".format(key, value))

for (key, value) in set(source_file_list.items()) & set(dest_file_list.items()):
    audit_log.write("{}: {} is present in both source and destination.\n".format(key, value))

The first update I made was to the source file data I'm writing to the "audit file" you specify with the script.  Utilizing the dictionary I created to track the source files, I get the length of the dictionary, which gives me the number of items we're working with in the source location.  This establishes my "baseline": how many items I should end up with in the destination.

The second update repeats that same process in the destination directory.  Now I'm checking how many items I have in the location the files were copied to.

Finally, I’m comparing both dictionaries, and writing the files that match to my output audit file.  This is just another audit point to make sure I end up with the same number of files that I started with.
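
The next audit point I want to add, which isn't written yet, would flag anything that doesn't match.  A sketch using the same set logic as above:

# files in the source that have no matching name/MD5 pair in the
# destination (missing, or copied with a different hash)
for (key, value) in set(source_file_list.items()) - set(dest_file_list.items()):
    audit_log.write("MISMATCH: {}\t\t\t\t\tMD5: {}\n".format(key, value))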

Until next week!

https://github.com/CdtDelta/YOP


Hello Reader!

Today’s post will be short.  The college semester finished this week, so most of my time has been spent grading final exams as well as finishing grading some lab assignments.  However, it wasn’t enough to stop the YOP!

My script this week was one I created to help with determining the final grades for my students.  Like most courses, there are different weights for different parts of the course (exams, homework assignments, etc.).  In prior years I found a website online that was designed to help teachers/students determine their grades, so I just used that.  To be honest, I still used the site this year, mainly because it was getting late into the evening and I wanted to get the grades posted.

But once I had the grades tallied (on a spreadsheet I use to track grades) I decided to write some python code and use it to verify that my calculations were correct.  It also gave me something to post for this week.

The first part of this script is simple: ask for the grades.

midterm_total = raw_input("Enter the Midterm grade: ")
final_total = raw_input("Enter the Final Grade: ")
lab_total = raw_input("Enter the Lab Total Grade: ")
homework_total = raw_input("Enter the Homework Total Grade: ")
participation_total = raw_input("Enter the Participation Grade: ")

(I’m replacing the Participation Grade with something else next year)

Simple, right?  Next we just calculate the different weights for the grades we've just entered.  For reference, the weighting is as follows:

  • Midterm – 20%
  • Final – 20%
  • Labs – 30%
  • Homework – 20%
  • Participation – 10%

And here’s the code that figures the weight part out:

midterm_calc = float(midterm_total) * 0.2
final_calc = float(final_total) * 0.2
lab_calc = float(lab_total) * 0.3
homework_calc = float(homework_total) * 0.2
participation_calc = float(participation_total) * 0.1

Now two out of the last three years I've offered up an extra credit assignment for students to work on.  It's optional and only worth a total of 10 points, so I need to factor that in as well:

extra_credit = raw_input("Enter extra credit score (Enter if none): ")

But I include the option that if there isn't an extra credit score, we can move on and calculate the final grade without it:

if extra_credit:
    grade = (midterm_calc + final_calc + lab_calc + homework_calc + participation_calc) + int(extra_credit)
else:
    grade = midterm_calc + final_calc + lab_calc + homework_calc + participation_calc

And then we wrap up by printing what the grade is…
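
That last line is just something along these lines:

print "The final grade is: {}".format(grade)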

Until next week!

https://github.com/CdtDelta/YOP


Hello Reader!

So this week's piece of code is a continuation of the work I started on the index.dat file in YOP Weeks 4 and 5.  My hope is to eventually have one script that will parse out the entire file, but for now I'm doing it piecemeal.  In the previous posts we tackled the header portion of the index.dat file; this time around we're going to look at the HASH table.

The HASH table's job in the index.dat file is similar to the allocation files you'll see on file systems ($Bitmap on NTFS, the FAT in a FAT file system).  Its job is to record where the valid record entries are within the index.dat file.  There can be more than one HASH table within an index.dat file, and they are normally 4096 bytes in size.  As always, Joachim Metz has a detailed writeup on the index.dat file specification here.  Note: if that link doesn't work, try this one and click on the Documentation link.

Now so far all this script will do is parse out the HASH table header, and then parse out all the records in the HASH table itself.  The records themselves are eight bytes long in total, broken up into two sections of four bytes each.  The first four bytes are the Data piece; the second four bytes are a record pointer.

Now the Data portion can vary depending on what type of record it's pointing to.  The simple version is that it will tell you whether the record is unused, marked for deletion, or pointing to a valid/active record.  The Record Pointer portion holds the offset within the file where the data portion is located.

So let’s take a look at the code…the first thing I’m doing is defining two functions.  The first function is for the HASH Table “Header”:

def hash_header(parse_header):
    ie_hash_header = parse_header[0:4]
    ie_hash_length = struct.unpack("<I", parse_header[4:8])
    ie_hash_next_table = struct.unpack("<I", parse_header[8:12])
    ie_hash_table_no = struct.unpack("<I", parse_header[12:16])
    print "{}\nHash Table Length: {}\nNext Hash Table Offset: {}\nHash Table No: {}\n".format(ie_hash_header, (ie_hash_length[0] * 128), ie_hash_next_table[0], ie_hash_table_no[0])
    return ie_hash_header, (ie_hash_length[0] * 128), ie_hash_next_table[0], ie_hash_table_no[0]

This part parses out the signature "HASH", the length of the hash table (the stored value multiplied by 128), the file offset to the NEXT HASH table entry, and finally the HASH table number, which starts at zero.

The second function will parse out the eight bytes of each hash table record:

def hash_table_records(parse_records):
    ie_hash_data = struct.unpack("<I", parse_records[0:4])
    ie_hash_record_pointer = struct.unpack("<I", parse_records[4:8])
    print "Hash Data: {}\t\tHash Record Pointer: {}".format(hex(ie_hash_data[0]), ie_hash_record_pointer[0])
    return

This function will be the main workhorse portion of the script.

The last main part of the code we will talk about is the section that opens the index.dat file and then starts to parse the HASH Table portion:

with open(index_dat, "rb") as ie_file:
    ie_hash_parser = ie_file.read()
    ie_hash_table_start = 20480
    ie_hash_head = ie_hash_parser[ie_hash_table_start:ie_hash_table_start + 16]
    ie_hash_header = hash_header(ie_hash_head)
    ie_hash_record_start = 20496
    ie_hash_record_end = 20504
    # the table ends a fixed number of bytes past its start, so work
    # that boundary out once before looping through the records
    ie_hash_table_end = ie_hash_table_start + (int(ie_hash_header[1]) - 12)
    ie_hash_record = ie_hash_parser[ie_hash_record_start:ie_hash_record_end]
    while ie_hash_record_start < ie_hash_table_end:
        ie_hash_record_table = hash_table_records(ie_hash_record)
        ie_hash_record_start = ie_hash_record_end
        ie_hash_record_end += 8
        ie_hash_record = ie_hash_parser[ie_hash_record_start:ie_hash_record_end]

Again at this point the code is only decoding the first HASH Table.  Future versions will parse through all the HASH tables within the index.dat file, as well as further decode the Data portion of the HASH table record.  Right now we’re just displaying the “raw” output.

Now the key here is the ie_hash_record_start and ie_hash_record_end values.  Since each record is eight bytes long, we have to cycle through each set of eight bytes within the table.  So once we have parsed out eight bytes, the ie_hash_record_end value becomes the ie_hash_record_start value, and a new ie_hash_record_end value is "created" by adding 8 to the old value.  We keep this loop going until the ie_hash_record_start value reaches the end of the HASH table itself, which comes from the following formula:

Hash Table Start Byte + (Length of Hash Table – 12)

Why do we subtract 12?  Because we have to account for the HASH Table Header portion which is 12 bytes in length.
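
To put numbers on it: for the typical 4096-byte table starting at offset 20480, the records begin at byte 20496 and the loop runs until the record start offset reaches 20480 + (4096 – 12) = 24564.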

Until next time!

https://github.com/CdtDelta/YOP


Hello Reader!

This week's script is brought to you by Microsoft's Patch Tuesday, and the MS15-034 fix that came out this month…

So we started looking at what systems we needed to patch for this vulnerability, and I started to think about some of the programs on the servers I use that don't use a standard port 80 web server connection.  Some of the software I use on my systems (both forensic and from the server manufacturer) has some type of web front end, but on odd-numbered ports.  So I wanted a quick way to figure out which systems might be vulnerable, including ones I didn't know had any type of web service running.

I was reading this SANS Diary post about MS15-034, which included a method of just checking for the vulnerability, and thought it would make a great YOP post for this week.

Now there is a prerequisite for this script: in order for it to run, you'll need to install the Requests module.  I may rewrite this script at a later point to use urllib, but I prefer using Requests for any web-based stuff.

Basically you can feed this script a list of servers/IP addresses, and then specify an output file to write your results to.

The first thing I did in this script is define the request I wanted to send, along with a range of ports to check (in this case all of them):

headers = {'host':'ms15034', 'range': 'bytes=0-18446744073709551615'}
tcp_ports = range(1,65536)

And the following code does most of the heavy lifting:

with open(args.server_list, "r") as servers_to_check:
    for ms15_034_server in servers_to_check.readlines():
        for ports in tcp_ports:
            try:
                url = 'http://' + ms15_034_server.rstrip() + ':' + str(ports)
                print "Checking: {}...".format(url)
                ms15_034_check = http_check.get(url, headers = headers)
                ms15_output[str(ports)] = ms15_034_check.status_code
                http_check.close()
            except:
                continue
        for key, value in sorted(ms15_output.iteritems()):
            ms15_output_file.writelines("Server: {}\tPort: {}\tResult: {}".format(ms15_034_server, key, value))

Here I start by cycling through the ports, informing the user which URL and port I'm checking (I'll explain why in a bit).  When it finds an open port, it sends the headers I defined to see if it gets a response.  Then it takes that response and puts it into a dictionary, keyed by port, with the HTTP response code (if there is one).  When it's finished cycling through all the ports, it writes the dictionary data to a file and then moves on to the next server in the list.

Now there is a small, inconsistent bug in the code that I'm still trying to nail down.  In some cases the script will hang on a port, and the only way to get it to continue is a Ctrl-C (which makes it move on to the next port).  I'm not quite sure why this is happening yet; sometimes it just hangs on an open port longer than others.  So sometimes I just have to wait for it to move on, and other times I need to help it along.
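
(As the edit at the bottom of this post notes, passing a timeout to the GET call turned out to be the fix.  Something like this, with the five seconds being an arbitrary choice on my part:)

# give up on a port after 5 seconds instead of hanging indefinitely
ms15_034_check = http_check.get(url, headers = headers, timeout = 5)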

Until next week!

https://github.com/CdtDelta/YOP

(EDIT: Thanks to Willi Ballenthin for suggesting the timeout parameter in the ms15_034_check line.  That appears to have fixed the hanging issue and I’ve updated the code on the site.)


Hello Readers!

This week I'm taking a trip back to the past a bit (for me anyway).  Originally I was planning on publishing some code that works with IMAP and webmail accounts; however, I ran out of time getting the code into a working state.  My plan is still to publish it during this Year of Python.  I'm looking for it to do a specific task with the Spam folder, but I don't want to say much more than that now.

However, if you want to see an excellent write up on using Python and IMAP/email, check out this post from the System Forensics Blog.

Now this particular piece of code is one I used a long time ago for a case I was working on.  At the time I was going through some Apache log files, and I needed to verify two types of log entries.  One was whether a hashed password was being included in the URL string; the second was to decode %xx escape characters in other URL entries (e.g. %2f).  At the time I was hoping to automate the process a bit better for my workflow, but the reality of time stepped in, so I turned it into a menu that would just prompt me for what I wanted to do.

Which is all in this piece of code:

# loop until the user types "quit"
quit_prog = True
while quit_prog:
    choice = raw_input("Choose md5 or url (type quit to exit): ")
    if choice.lower() == "md5":
        enter_md5 = raw_input("Enter the text to convert to md5: ")
        print create_md5(enter_md5)
    elif choice.lower() == "url":
        enter_url = raw_input("Enter the url text to convert: ")
        print decode_url(enter_url)
    elif choice.lower() == "quit":
        quit_prog = False

The code is rather simple.  I just set it up to prompt me for whether I wanted an MD5 or a URL, then copy/paste what I wanted converted.  At that point the script runs the corresponding function and either generates the MD5 hash of the text I put in, or decodes the special characters in the URL.
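
For reference, the two functions are thin wrappers around the standard library.  They look roughly like this (a sketch; check the repo for the actual versions):

import hashlib
import urllib

def create_md5(input_text):
    # generate the MD5 hex digest of the pasted text
    return hashlib.md5(input_text).hexdigest()

def decode_url(input_url):
    # translate %xx escapes back into their characters (e.g. %2f -> /)
    return urllib.unquote(input_url)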

And that's it!  Again, this was one of those programs I designed to satisfy a need at the time, but it worked.  Moving forward I would probably look at just feeding in the Apache logs and then outputting the decoded data.  That will be version 2.0!

Until next time!

https://github.com/CdtDelta/YOP



