thoughts/data/converting-pst.md
Tommy Skaug 805a34f937
initial migration
2024-08-05 20:24:56 +02:00


Some time ago I gave an introduction to converting Microsoft
MSG files [1] to a readable RFC 2822 [2] format on Linux. In
practice you will sometimes get an even trickier format to work
with: the Outlook Data File (PST) [3]. PST is a
proprietary format used by Microsoft Outlook, and is roughly the
equivalent of mbox on Linux: a single file holding many messages.

**Edit August 29th**: Also have a look at the more
up-to-date post on libpff [4].

Even though PST files are a bit harder to read than single
EML files, there is hope if you only have a Linux client:
libpst, and more specifically its ``readpst`` tool. To get it
running you need three packages:

* ``libgsf`` (I/O library that can read and write common file
  types and handle structured formats that provide
  file-system-in-a-file semantics)
* ``boost`` (portable C++ source libraries)
* ``libpst``

On OS X you can install them with Homebrew:
```
brew install libgsf
brew install boost
brew install libpst
```
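
Before converting anything, it can be worth confirming that the ``readpst`` binary actually ended up on your ``PATH``. A minimal Python sketch; the helper name ``find_readpst`` is made up for this example and is not part of libpst:

```python
import shutil


def find_readpst(name="readpst"):
    """Return the full path of the named binary, or None if it is not on PATH."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_readpst()
    if path is None:
        print("readpst not found; check that libpst installed correctly")
    else:
        print(f"readpst found at {path}")
```
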
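
Now if you have a PST archive, like [5] for instance, you can
convert it with ``readpst``. Here ``-M`` writes each message to its
own file, ``-e`` does the same but adds file extensions, ``-b`` skips
the RTF version of the body, and ``-o`` sets the output directory:

    mkdir export
    readpst -M -b -e -o export "Personal Folders.pst"
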
This should give output like this:

    Opening PST file and indexes...
    Processing Folder "Deleted Items"
    Processing Folder "Inbox"
    Processing Folder "latest"
    [...]
    Processing Folder "Reports"
    "Reports" - 11 items done, 1 items skipped.
    Processing Folder "Quotes"
    "Quotes" - 1 items done, 1 items skipped.
    Processing Folder "Printer"
    "Printer" - 1 items done, 1 items skipped.
    Processing Folder "Passwords"
    "Passwords" - 6 items done, 1 items skipped.
    [...]
    Processing Folder "Kum Team"
    "Kum Team" - 37 items done, 0 items skipped.
    "9NT1425(India 11.0)" - 228 items done, 1 items skipped.
    Processing Folder "Jimmi"
    "Jimmi" - 31 items done, 0 items skipped.
    "Inbox" - 27 items done, 11 items skipped.
    Processing Folder "Outbox"
    Processing Folder "Sent Items"
    "Sent Items" - 0 items done, 1 items skipped.
    Processing Folder "Calendar"
    "Calendar" - 0 items done, 6 items skipped.
    Processing Folder "Contacts"
    "Contacts" - 0 items done, 1 items skipped.
    [...]
    Processing Folder "Drafts"
    Processing Folder "RSS Feeds"
    Processing Folder "Junk E-mail"
    Processing Folder "quarantine"
    "My Personal Folder" - 13 items done, 0 items skipped.

The conversion creates a directory structure, which
``ls -l 'export/My Personal Folder'`` shows as:

    drwxr-xr-x 2 - staff 68 Aug 28 21:34 Calendar
    drwxr-xr-x 2 - staff 68 Aug 28 21:34 Contacts
    drwxr-xr-x 29 - staff 986 Aug 28 21:34 Inbox
    drwxr-xr-x 2 - staff 68 Aug 28 21:34 Journal
    drwxr-xr-x 2 - staff 68 Aug 28 21:34 Sent Items
    drwxr-xr-x 2 - staff 68 Aug 28 21:34 Tasks

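
If you want to script the conversion and keep an eye on how many items were skipped per folder, the ``readpst`` call and its summary lines can be wrapped in a few lines of Python. This is only a sketch: the function names are invented for this post, the regular expression is fitted to the summary lines shown above, and capturing both stdout and stderr is an assumption, since I have not checked which stream ``readpst`` writes to.

```python
import re
import subprocess

# Matches summary lines such as: "Reports" - 11 items done, 1 items skipped.
SUMMARY_RE = re.compile(
    r'^\s*"(?P<folder>.+)" - (?P<done>\d+) items done, '
    r'(?P<skipped>\d+) items skipped\.\s*$'
)


def readpst_command(pst_path, out_dir):
    """Build the same readpst invocation used in this post."""
    return ["readpst", "-M", "-b", "-e", "-o", out_dir, pst_path]


def parse_summary(output):
    """Map folder name -> (done, skipped) for every summary line in the output."""
    counts = {}
    for line in output.splitlines():
        match = SUMMARY_RE.match(line)
        if match:
            counts[match["folder"]] = (int(match["done"]), int(match["skipped"]))
    return counts


def convert(pst_path, out_dir):
    """Run readpst (out_dir must already exist) and return per-folder counts."""
    result = subprocess.run(
        readpst_command(pst_path, out_dir),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # assumption: merge both streams to be safe
        text=True,
        check=True,
    )
    return parse_summary(result.stdout)
```

Calling ``convert("Personal Folders.pst", "export")`` would then return a dictionary keyed by folder name. Note that two folders with the same name in different parts of the tree would overwrite each other in this simple version.
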
If you list ``Inbox/Mails/``, you will find the individual messages:

    1.eml 10.eml 11.eml 12.eml 13.eml 14.eml 15.eml 16.eml 17.eml 2.eml 3.eml 4.eml 5.eml 6.eml 7.eml 8.eml 9.eml

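
For a quick overview of what ended up in such a folder, Python's standard ``email`` package can read the headers of each exported file. The sketch below is just one way to do it; ``list_messages`` is a name made up for this example, and it assumes the files are named ``1.eml``, ``2.eml`` and so on, as in the listing above, so that they can be sorted numerically rather than alphabetically.

```python
from email import policy
from email.parser import BytesParser
from pathlib import Path


def list_messages(mail_dir):
    """Yield (filename, sender, subject) for each .eml file, in numeric order."""

    def sort_key(path):
        # Numbered files (1.eml, 2.eml, ...) first, by number; others by name.
        if path.stem.isdecimal():
            return (0, int(path.stem), "")
        return (1, 0, path.stem)

    for path in sorted(Path(mail_dir).glob("*.eml"), key=sort_key):
        with path.open("rb") as handle:
            # headersonly=True avoids parsing bodies and attachments
            message = BytesParser(policy=policy.default).parse(
                handle, headersonly=True
            )
        yield path.name, message["From"], message["Subject"]
```

Looping over ``list_messages("export/My Personal Folder/Inbox/Mails")`` and printing each tuple gives a one-line-per-message index. A missing header comes back as ``None``.
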

You can now continue with our previous post [6]. I also
encourage you to have a look at the documentation of the
Outlook PST format [7].

[1] Converting Microsoft MSG files: /2013-10-08-msg-eml.html
[2] RFC 2822: http://tools.ietf.org/html/rfc2822
[3] The Outlook Data File (PST): http://office.microsoft.com/en-001/outlook-help/introduction-to-outlook-data-files-pst-and-ost-HA010354876.aspx
[4] libpff: /converting-pst-archives-in-os-xlinux-with-libpff
[5] Example PST file: http://sourceforge.net/projects/pstfileup/files/Personal%20Folders.pst/download
[6] Reading MSG and EML Files on OSX/Linux Command Line: :4443/forensics/reading-msg-files-in-linux-command-line/
[7] The outlook.pst format: http://www.five-ten-sg.com/libpst/rn01re05.html