Parse an email header
Level: Advanced (score: 4)
Write a regular expression to extract 4 pieces of information from an email header:
- From email
- To email
- Subject
- Date sent (without timezone info)
Use re.match or re.search with named capturing parentheses, and return the groupdict of the match object.
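As a reminder of how named groups feed groupdict, here is a minimal sketch on a made-up two-line string (not an email header). The input text and the group names `name` and `language` are assumptions chosen for illustration only:

```python
import re

# Hypothetical two-line input, much smaller than a real email header.
text = "Name: Ada Lovelace\nLanguage: Python"

# (?P<name>...) gives a capturing group a name. re.DOTALL lets ".*?"
# cross the newline that separates the two fields.
pattern = re.compile(
    r"Name: (?P<name>[^\n]+).*?Language: (?P<language>[^\n]+)",
    re.DOTALL,
)

match = pattern.search(text)
# groupdict() maps each group name to the text it captured.
result = match.groupdict() if match else {}
print(result)  # {'name': 'Ada Lovelace', 'language': 'Python'}
```

re.search scans the whole string for the first match, whereas re.match only succeeds if the pattern matches at the very start, which matters when the fields you want are preceded by other header lines.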
Here is an example of how it would work (email header found here; the tests use a different, made-up one):
>>> header = """Return-Path: <bounces+5555-7602-redacted-info>
... ...
... Received: by 10.8.49.86 with SMTP id mf9.22328.51C1E5CDF
... Wed, 19 Jun 2013 17:09:33 +0000 (UTC)
... Received: from NzI3MDQ (174.37.77.208-static.reverse.softlayer.com [174.37.77.208])
... by mi22.sendgrid.net (SG) with HTTP id 13f5d69ac61.41fe.2cc1d0b
... for ; Wed, 19 Jun 2013 12:09:33 -0500 (CST)
... Content-Type: multipart/alternative;
... boundary="===============8730907547464832727=="
... MIME-Version: 1.0
... From: redacted-address
... To: redacted-address
... Subject: A Test From SendGrid
... Message-ID: <1371661773.974270694268263@mf9.sendgrid.net>
... Date: Wed, 19 Jun 2013 17:09:33 +0000 (UTC)
... X-SG-EID: P3IPuU2e1Ijn5xEegYUQ...
... X-SendGrid-Contentd-ID: {"test_id":"1371661776"}"""
>>>
>>> from email_header import get_email_details
>>> get_email_details(header)
{'from': 'redacted-address', 'to': 'redacted-address', 'subject': 'A Test From SendGrid', 'date': 'Wed, 19 Jun 2013 17:09:33'}
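If you want a starting point, the sketch below shows one possible shape for get_email_details. It is not the reference solution. It assumes the From, To, Subject and Date fields appear in that order, each at the start of its own line, and that the date ends with a numeric offset such as +0000. The `sample` header and its addresses are made up:

```python
import re

# One possible pattern (an assumption, not the official answer).
# re.MULTILINE makes "^" match at every line start; re.DOTALL lets
# ".*?" skip over any unrelated header lines between the fields.
HEADER_RE = re.compile(
    r"^From:\s(?P<from>[^\n]+)\n"
    r".*?^To:\s(?P<to>[^\n]+)\n"
    r".*?^Subject:\s(?P<subject>[^\n]+)\n"
    # Lazy capture stops right before the numeric timezone offset.
    r".*?^Date:\s(?P<date>[^\n]+?)\s[+-]\d{4}",
    re.MULTILINE | re.DOTALL,
)


def get_email_details(header):
    """Return the four captured fields as a dict, or None if no match."""
    match = HEADER_RE.search(header)
    return match.groupdict() if match else None


# Made-up header used only to exercise the sketch.
sample = (
    "MIME-Version: 1.0\n"
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Hello there\n"
    "Message-ID: <123@example.com>\n"
    "Date: Wed, 19 Jun 2013 17:09:33 +0000 (UTC)\n"
)
details = get_email_details(sample)
print(details)
```

Returning None on a failed match is a design choice of this sketch; check the tests to see what they expect for headers that do not match.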
Enjoy and keep calm and code in Python!