DEV Community

Pavel Polívka

Parsing Outlook emails in Java

Recently I needed to parse Outlook emails to extract some values so that automated tests could pass multifactor authentication. I was hoping for a simple implementation in JavaScript but could not find a reliable solution there, so I searched for a good library in Java. I was not even surprised that there were several solutions for parsing Outlook msg files. Java truly has a library for everything.

I chose the Auxilii msgparser library, as it seemed like the easiest solution to use.

I added it via Maven.

<dependency>
    <groupId>com.auxilii.msgparser</groupId>
    <artifactId>msgparser</artifactId>
    <version>1.1.15</version>
</dependency>

Usage is then straightforward.

Message parsedMessage = new MsgParser().parseMsg(msgFile.getInputStream());
String body = parsedMessage.getBodyText();
List<Attachment> attachments = parsedMessage.getAttachments();

Please be aware that Outlook on macOS does not use the msg format for its emails. Emails exported on a Mac are eml files. Those are exported as plain text, so they can be parsed via regex just by reading the file.

The whole code supporting both formats would look like this.
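As a rough illustration of the regex approach for eml files, here is a minimal sketch. The class name `CodeExtractor`, the method `extractCode`, and the assumed wording of the email ("verification code" followed by six digits) are all hypothetical and need adjusting to the real message. It also assumes the body is stored as readable text. An eml body that is base64 or quoted-printable encoded would have to be decoded first.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CodeExtractor {
    // Hypothetical pattern: the words "verification code", up to 20 non-digit
    // characters (e.g. " is: "), then exactly six digits captured as group 1.
    private static final Pattern CODE =
            Pattern.compile("verification code[^0-9]{0,20}(\\d{6})\\b", Pattern.CASE_INSENSITIVE);

    // Returns the first matching code in the raw eml text, or empty if none is found.
    static Optional<String> extractCode(String emlText) {
        Matcher m = CODE.matcher(emlText);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```

Returning an `Optional` lets the calling test fail with a clear message when the email does not contain a code, instead of continuing with an empty string.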

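import com.auxilii.msgparser.Message;
import com.auxilii.msgparser.MsgParser;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

String body = "";
// Match the extension case-insensitively and include the dot.
String name = file.getName().toLowerCase();
if (name.endsWith(".msg")) {
    // Binary msg file: let msgparser extract the body.
    Message parsedMessage = new MsgParser().parseMsg(file);
    body = parsedMessage.getBodyText();
} else if (name.endsWith(".eml")) {
    // Plain-text eml file: read it as-is.
    body = new String(Files.readAllBytes(file.toPath()), StandardCharsets.UTF_8);
}
// here parse your body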
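The file extension is only a hint. A file named msg that is not really in the msg format makes the parser fail, as in the exception quoted in the comments below. One way to guard against that is to check the first eight bytes of the file. This is a sketch, not part of msgparser: the class `MsgSniffer` and its methods are hypothetical names, and the signature bytes are taken from the "expected 0xE11AB1A1E011CFD0" value in that exception, read as a little-endian number. A match only means the file is an OLE2 container, which other legacy Office formats also use.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class MsgSniffer {
    // OLE2 compound file signature, in the order the bytes appear on disk.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // True if the given bytes start with the OLE2 signature.
    static boolean hasOle2Signature(byte[] header) {
        return header.length >= OLE2_MAGIC.length
                && Arrays.equals(Arrays.copyOf(header, OLE2_MAGIC.length), OLE2_MAGIC);
    }

    // Reads the first eight bytes of the file and checks them (readNBytes needs Java 11+).
    static boolean looksLikeMsg(Path path) throws IOException {
        try (InputStream in = Files.newInputStream(path)) {
            return hasOle2Signature(in.readNBytes(OLE2_MAGIC.length));
        }
    }
}
```

With such a check, the branch could use `MsgSniffer.looksLikeMsg(file.toPath())` instead of the extension and fall back to reading the file as text otherwise. For what it is worth, the value read in that exception, 0x615F3430305F2D2D, corresponds to the ASCII bytes `--_004_a`, which suggests the file in question was text rather than a binary msg file.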

If this is interesting to you, you can follow me on Twitter.


Top comments (1)

Vinod98

I am getting the error below, and I am using the same code snippet you mentioned above. Any idea why?

org.apache.poi.poifs.filesystem.NotOLE2FileException: Invalid header signature; read 0x615F3430305F2D2D, expected 0xE11AB1A1E011CFD0 - Your file appears not to be a valid OLE2 document
Invalid header signature; read 0x615F3430305F2D2D, expected 0xE11AB1A1E011CFD0 - Your file appears not to be a valid OLE2 document
inside load messages
at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:151)
at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:117)
at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:285)
at com.auxilii.msgparser.MsgParser.parseMsg(MsgParser.java:159)
at com.auxilii.msgparser.MsgParser.parseMsg(MsgParser.java:138)
at com.aspose.email.examples.email.Email_Parse.loadMessages(Email_Parse.java:36)
at com.aspose.email.examples.email.Email_Parse.getMessages(Email_Parse.java:114)
at com.aspose.email.examples.email.Email_Parse.main(Email_Parse.java:25)
