Year of Python (YOP) – Week Forty Eight


Hello Reader!

Well, we’re coming into the home stretch.  It’s week Forty Eight, with only four weeks left on the calendar.  To be honest, I’m shocked I made it this far (and there were some close calls).  But the year isn’t over, so on to this week’s bit of code!

I’m continuing to look at parsing out Norton NPE Log files.  As I stated last week, the logs are in XML format, but not in a “normal” structure that makes it easy to parse with the XML modules.  So I’ve been using BeautifulSoup to tackle this particular issue.

This week I focused on the Infections Detected and Suspicious Items sections of the log file.  For now I’m only looking at the summaries for these sections.  That way I can go into the log file and target anything that doesn’t have a zero value.

The good news is both sections are parsed out the same way, so once I got one section working, it was just a matter of setting up the second section.

In each section, I start by looking for the section name in the document.  Once I have that, I walk through each of the “children” in the section and create a dictionary with the values.  The trick is that each child is a line that looks like this:

<DRIVERS Count="0"/>

If I tell the script to just pull the drivers tag, it pulls the entire line.  But I want to separate out “drivers”, “count”, and “0” when I display the output.  In the dictionary I’m creating, “drivers” (using this example) becomes the key, and the tag’s attributes become the value.  Those attributes are a dictionary in their own right, with “count” as the key and “0” as the value, so I end up with a dictionary inside a dictionary.  This means that when I walk through the outer dictionary to display the output, I have to index into each value, as in value[key], to get at the number stored in the “inner” dictionary.  This was the part of the script I had the hardest time figuring out.
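
Here is a minimal sketch of that dictionary-inside-a-dictionary idea, not the actual script.  Only the DRIVERS line comes from the log described above; the section name, the other child tags, and the helper name summarize_section are made up for illustration, and real NPE logs may well be laid out differently.  It assumes BeautifulSoup (bs4) is installed and uses Python's built-in "html.parser", which lowercases tag and attribute names.

```python
from bs4 import BeautifulSoup

# Hypothetical log fragment: only <DRIVERS Count="0"/> is taken from the post.
# The section name and the FILES / PROCESSES children are assumptions.
SAMPLE_LOG = """
<InfectionsDetected>
  <DRIVERS Count="0"/>
  <FILES Count="2"/>
  <PROCESSES Count="0"/>
</InfectionsDetected>
"""


def summarize_section(soup, section_name):
    """Return {child tag name: {attribute: value}} for one section."""
    section = soup.find(section_name)
    if section is None:
        return {}
    # find_all(True, recursive=False) yields only the direct child tags,
    # skipping the whitespace strings that sit between them.
    return {child.name: dict(child.attrs)
            for child in section.find_all(True, recursive=False)}


# html.parser lowercases names, so the section is looked up in lowercase.
soup = BeautifulSoup(SAMPLE_LOG, "html.parser")
summary = summarize_section(soup, "infectionsdetected")

# Walk the outer dictionary; each value is itself a dictionary,
# so the number lives at attributes[attr_name].
for tag_name, attributes in summary.items():
    for attr_name in attributes:
        print(f"{tag_name} {attr_name}: {attributes[attr_name]}")

# Keep only the entries whose count is not zero (values are strings).
nonzero = {name: attrs for name, attrs in summary.items()
           if attrs.get("count") != "0"}
```

Because both sections are parsed the same way, the same helper could be called a second time with the other section's name.  The nonzero dictionary at the end is one way to pick out the entries worth a closer look.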

Until next week!


