Off-Topic: Analyzing Outlook Mailbox Size

Where I work, we use Exchange as our email server and Outlook as the primary client (at least I do). We also have an email quota that I keep bumping into, since I tend to attract many emails with large attachments, like image-happy PowerPoint files or binary code modules to patch things. I am also an extreme user of email folders. My main Outlook account contains some 650 folders, and my offline archive of all my old emails is approaching 1300 folders, with several hundred thousand emails for a total of almost 20 GB. So, pretty extreme.

My problem is: what do I do when the email system tells me (and it is serious, I can attest) that I am close to hitting my quota and that soon email will neither be received nor sent? I want to find the folders that are very large and therefore candidates for archiving. The answer eluded me for a long time, until I stumbled upon a 2010 YouTube video: http://www.youtube.com/watch?v=3skJOd4GIak, from “tech-informer.com” (which now looks pretty dead). With some modifications, this solved my problem.

To do this, you need to have SnagIt installed. I do, since SnagIt is an essential tool for my work. Since things have changed slightly since the video was posted, I will go through it quickly here with some updated screenshots and notes. The versions of software used:

  • SnagIt version 11
  • Outlook 2010
  • Excel 2010

We start in the mostly useless folder size dialog box in Outlook:

As you can see, the name column is too narrow to see the full name of each folder. The box itself cannot be resized to see more data, only the columns can be resized. You cannot sort by either size or total size. A pretty bad example of UI design.

To get the data out of this box so that it can be manipulated, we use SnagIt's text capture feature together with scrolling capture. In SnagIt 11, you need to select Text capture along with “all in one”. When you start the capture, click the scroll arrow to capture the entire window contents as text (!).

The resulting capture is not the same as in the instruction video, as it is now tab-delimited rather than space-delimited. The capture looks like this:

I saved this as a .txt file (the default is .rtf) and opened it in Excel. In the Excel import dialog, I selected tab characters as the delimiter, and I got a neat table. However, the size columns contain text ending in “KB”, which does not sort properly or allow any analysis. To fix this, you have to search for “empty space” “K” “B” and replace it with nothing. The problem is that the empty space is not a normal space but some other character. To solve this, I simply edited a cell and copied the piece of the text that I wanted to replace (note that I select the little space before KB):

Next, do a global replace all:

After this, Excel magically realizes that all the number columns are indeed numbers, and you can start manipulating the data and sorting by size. The only remaining problem is that the folder names are unreadable, at least when using deeply nested folders with long names like I do. To solve this, I took another text capture of the mailbox size dialog, with the names column expanded.
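For anyone who would rather script the cleanup than do it in Excel, here is a minimal Python sketch of the same steps: read the tab-delimited capture, strip the “KB” suffix, and sort by size. The three-column layout (name, size, total size), the sample folder names, and the guess that the odd space is a non-breaking space are all assumptions for illustration. The parser simply drops every non-digit character, so it does not matter which kind of space the capture really contains.

```python
import csv
import io
import re

# Hypothetical sample of a capture: name, size, total size, separated by tabs.
# "\u00a0" (non-breaking space) stands in for the odd space before "KB".
SAMPLE = (
    "Inbox\t1024\u00a0KB\t20480\u00a0KB\n"
    "Inbox\\Lists\t0\u00a0KB\t0\u00a0KB\n"
    "Sent Items\t5120\u00a0KB\t5120\u00a0KB\n"
)


def parse_kb(cell):
    """Turn a cell such as '1024 KB' into an integer number of KB.

    Every non-digit character is dropped, which also removes thousands
    separators. Assumes sizes are whole numbers of KB.
    """
    digits = re.sub(r"\D", "", cell)
    return int(digits) if digits else 0


def load_folders(text):
    """Parse the capture into (name, size_kb, total_kb) tuples."""
    rows = []
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for record in reader:
        if len(record) < 3:
            continue  # skip blank or malformed lines
        rows.append((record[0].strip(), parse_kb(record[1]), parse_kb(record[2])))
    return rows


folders = load_folders(SAMPLE)
# Largest folders first, using the per-folder size column.
by_size = sorted(folders, key=lambda row: row[1], reverse=True)
for name, size_kb, total_kb in by_size:
    print(f"{size_kb:>8} KB  {name}")
```

To use it on a real capture, replace SAMPLE with the contents of the saved .txt file.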

Since the resulting capture is just a single long column of names, and it has the same order as the previous capture, I can just copy the text from within the SnagIt editor and paste it into Excel. Thus, finally, I have an analyzable data set.
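The same pairing can be sketched in Python. It relies on the one property the paste into Excel relies on: both captures list the folders in the same order. The function name, the sample rows, and the sample folder names are hypothetical. The only safety check is that the two captures have the same number of entries.

```python
# Hypothetical rows from the first capture: (truncated name, size KB, total KB).
SIZE_ROWS = [
    ("Inbox\\Proj", 1024, 20480),
    ("Inbox\\Proj", 0, 0),
]

# Hypothetical text of the second capture: one full folder name per line.
FULL_NAMES_TEXT = "Inbox\\Projects\\Alpha\nInbox\\Projects\\Beta\n"


def attach_full_names(rows, full_names_text):
    """Replace truncated names with full names, matching purely by position."""
    full_names = [line.strip() for line in full_names_text.splitlines() if line.strip()]
    if len(full_names) != len(rows):
        raise ValueError(
            f"captures differ in length: {len(rows)} size rows, "
            f"{len(full_names)} names"
        )
    return [(full, size, total) for full, (_, size, total) in zip(full_names, rows)]


merged = attach_full_names(SIZE_ROWS, FULL_NAMES_TEXT)
for name, size_kb, total_kb in merged:
    print(f"{size_kb:>8} KB  {name}")
```

If the counts differ, one of the captures probably missed part of the list, and it is safer to redo it than to guess at the alignment.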

However, acting on it was more difficult than I thought. There were some pretty big folders, but the majority of the space is spent in some 100 folders of some 2 to 3 MB each. I also noted that of my 650 folders, some 400 contained no data. It would be nice to be able to hide them in the Outlook view. But moving them out of the way to the archive file is a ton of manual work, so I will let them be.
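As a rough illustration of this kind of summary, the Python sketch below counts the empty folders and works out how many of the largest folders are needed to cover a given share of the total space. The 80% threshold, the function name, and the sample data are assumptions picked for the example, not figures from my mailbox.

```python
# Hypothetical data set: (folder name, size in KB).
ROWS = [
    ("Archive", 0),
    ("Lists\\Forum", 2500),
    ("Old", 0),
    ("Support", 3000),
    ("Inbox", 10000),
]


def summarize(rows, share=0.8):
    """Return total size, number of empty folders, and how many of the
    largest folders are needed to reach `share` of the total size."""
    sizes = sorted((size for _, size in rows), reverse=True)
    total = sum(sizes)
    empty = sum(1 for size in sizes if size == 0)
    running = 0
    needed = 0
    for size in sizes:
        if total == 0 or running >= share * total:
            break
        running += size
        needed += 1
    return {"total_kb": total, "empty_folders": empty, "folders_for_share": needed}


summary = summarize(ROWS)
print(summary)
```

When the space is spread thinly over many mid-sized folders, as it was for me, the folders_for_share number comes out large, and that is exactly why archiving a handful of folders does not help much.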

2 thoughts on “Off-Topic: Analyzing Outlook Mailbox Size”

  1. I used to work diligently to file my email and such like this but with the latest versions of Windows and Outlook, search works so well that I just leave everything in my inbox and create an archive of that every 2-3 months. I tag stuff for follow-up instead of leaving it unfiled if I need to look at it again. Takes a bit to get used to but it saved me a lot of time and hassle especially since I always had questions on where certain emails should be filed anyway. Just my $.02

  2. That is certainly an approach to avoid the need for deep data analysis. It does not suit my way of working. Indeed, a large portion of my use of folders is not as a place to manually sort email into, but rather as a way to separate out various incoming flows automatically. Support queues, internal and external forums, various internal email lists, …, all are sorted on the way in. In that way, I avoid drowning my inbox in things that are not necessarily directed at me. The problem with this is that if some email list suddenly lights up with 10s of MB of big emails, I have to go look at just where it happened.
