Some time ago I gave an introduction to converting Microsoft MSG files [1] to a readable RFC 2822 [2] format on Linux. Sometimes you will get an even kinkier format to work with: the Outlook Data File (PST) [3]. PST is a proprietary format used by Microsoft Outlook, and is roughly the equivalent of mbox on Linux.
Edit August 29th: Also have a look at the more up-to-date libpff [4].
Even though PST files are a bit harder to read than single EML files, there is hope if you only have a Linux client: libpst, and more specifically readpst. For libpst you need three libraries:
- libgsf (I/O library that can read and write common file types and handle structured formats that provide file-system-in-a-file semantics)
- boost (portable C++ source libraries)
- libpst
On OS X you can install them with Homebrew:
brew install libgsf
brew install boost
brew install libpst
Now, if you have a PST archive, such as the example file [5], you can convert it with:
mkdir export
readpst -M -b -e -o export "Personal Folders.pst"
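The two commands above can be wrapped in a small helper that fails early when the tool or the archive is missing. This is only a sketch: the function name convert_pst and the READPST override variable are made up for illustration, and the flag descriptions in the comments are my reading of the readpst man page, so check them against your installed version.

```shell
#!/bin/sh
# Hypothetical helper: convert one PST archive into per-message files.
# READPST is an assumed override for the binary name (handy for testing).
convert_pst() {
    pst_file=$1
    out_dir=${2:-export}
    readpst_bin=${READPST:-readpst}

    # Fail early if the tool or the input file is missing.
    command -v "$readpst_bin" >/dev/null 2>&1 || {
        echo "readpst not found; install libpst first" >&2
        return 127
    }
    [ -f "$pst_file" ] || {
        echo "no such PST file: $pst_file" >&2
        return 1
    }

    # Create the output directory first, as in the commands above.
    mkdir -p "$out_dir" || return 1

    # Same flags as above:
    #   -M  write each message to its own file
    #   -b  do not save the RTF version of the message body
    #   -e  like -M, but add file extensions such as .eml
    #   -o  output directory
    "$readpst_bin" -M -b -e -o "$out_dir" "$pst_file"
}
```

With the example archive this would be called as: convert_pst "Personal Folders.pst" export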
This should produce output like this:
Opening PST file and indexes...
Processing Folder "Deleted Items"
Processing Folder "Inbox"
Processing Folder "latest"
[...]
Processing Folder "Reports"
"Reports" - 11 items done, 1 items skipped.
Processing Folder "Quotes"
"Quotes" - 1 items done, 1 items skipped.
Processing Folder "Printer"
"Printer" - 1 items done, 1 items skipped.
Processing Folder "Passwords"
"Passwords" - 6 items done, 1 items skipped.
[...]
Processing Folder "Kum Team"
"Kum Team" - 37 items done, 0 items skipped.
"9NT1425(India 11.0)" - 228 items done, 1 items skipped.
Processing Folder "Jimmi"
"Jimmi" - 31 items done, 0 items skipped.
"Inbox" - 27 items done, 11 items skipped.
Processing Folder "Outbox"
Processing Folder "Sent Items"
"Sent Items" - 0 items done, 1 items skipped.
Processing Folder "Calendar"
"Calendar" - 0 items done, 6 items skipped.
Processing Folder "Contacts"
"Contacts" - 0 items done, 1 items skipped.
[...]
Processing Folder "Drafts"
Processing Folder "RSS Feeds"
Processing Folder "Junk E-mail"
Processing Folder "quarantine"
"My Personal Folder" - 13 items done, 0 items skipped.
This creates a directory structure like the following (ls -l 'export/My Personal Folder'):
drwxr-xr-x 2 - staff 68 Aug 28 21:34 Calendar
drwxr-xr-x 2 - staff 68 Aug 28 21:34 Contacts
drwxr-xr-x 29 - staff 986 Aug 28 21:34 Inbox
drwxr-xr-x 2 - staff 68 Aug 28 21:34 Journal
drwxr-xr-x 2 - staff 68 Aug 28 21:34 Sent Items
drwxr-xr-x 2 - staff 68 Aug 28 21:34 Tasks
If you sample Inbox/Mails/, you will find:
1.eml 10.eml 11.eml 12.eml 13.eml 14.eml 15.eml 16.eml 17.eml 2.eml 3.eml 4.eml 5.eml 6.eml 7.eml 8.eml 9.eml
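To check the whole export rather than a single folder, you can count the extracted .eml files per top-level folder. The helper names count_eml and summarize_export below are made up for this sketch, and it assumes file names do not contain newlines.

```shell
# Hypothetical helper: count the .eml files anywhere below a directory.
# The arithmetic expansion strips the padding some wc versions add.
count_eml() {
    echo "$(( $(find "$1" -type f -name '*.eml' | wc -l) ))"
}

# Print "<count> <folder>" for every immediate subfolder of the export root.
summarize_export() {
    for dir in "$1"/*/; do
        [ -d "$dir" ] || continue
        printf '%s %s\n' "$(count_eml "$dir")" "$(basename "$dir")"
    done
}
```

For the export above this would be run as: summarize_export 'export/My Personal Folder'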
You can now continue with our previous post [6]. I also encourage you to have a look at the documentation of the Outlook PST format [7].
[1] Converting Microsoft MSG files: /2013-10-08-msg-eml.html
[2] RFC 2822: http://tools.ietf.org/html/rfc2822
[3] The Outlook Data File (PST): http://office.microsoft.com/en-001/outlook-help/introduction-to-outlook-data-files-pst-and-ost-HA010354876.aspx
[4] libpff: /converting-pst-archives-in-os-xlinux-with-libpff
[5] Example PST file: http://sourceforge.net/projects/pstfileup/files/Personal%20Folders.pst/download
[6] Reading MSG and EML Files on OSX/Linux Command Line: :4443/forensics/reading-msg-files-in-linux-command-line/
[7] The outlook.pst format: http://www.five-ten-sg.com/libpst/rn01re05.html