For many years I have been involved with the Azure Sydney User Group, and even though I’m no longer the organiser I still gather updates for Azure in the prior month and prepare a PowerPoint presentation that contains them. My go-to place for the information is the Azure Updates website and I’ve typically just been browsing the site and manually updating an existing PowerPoint template.
Recently I reflected on the amount of time it takes me to click through the pages and pull out the headlines, and decided this probably wasn’t a great use of my time. Even though it typically takes under 10 minutes, it’s still repetitive, which clearly means it’s the perfect candidate to automate!
I’ve used the Azure Updates website for other integrations in the past, so I know it has an RSS feed, which is perfect for automation.
Choosing an implementation approach
There are a few ways I could have built an automation, and my original intention was to use either Azure Logic Apps or Power Automate Flow and integrate into PowerPoint online in Microsoft 365. Unfortunately, it turns out there is no native PowerPoint online connector which means this approach became a no-go!
In the absence of this integration capability, I decided to turn to a code-based solution because I know there are many ways to generate Office documents through SDKs that implement the Office Open XML File Formats standard (ECMA-376).
One of the reasons I also looked at Logic Apps or Power Automate was their serverless pay-on-execution model. Keeping this in mind I turned to my trusty friend, Azure Functions. At this stage the no-brainer for me as a long-term C# developer would have been to implement a .NET-based Function, but as I’ve said a few times before, I’m wanting to push my skills to cover other languages, so I thought I’d have a go with Python.
Azure Functions + Python = ❤
It turns out there is an excellent PowerPoint library called python-pptx that has everything I needed, and I found a great blog and sample from Matthew Wimberly that had what I needed to read and parse an RSS feed. Now that I had these two elements, I needed a little bit of Functions magic to tie them together and provide a simple HTTP API I could use to generate my presentation.
The resulting Azure Function (shown below) does all I need in less than 200 lines of code.
import logging
import os
from datetime import datetime, timedelta, timezone

import azure.functions as func
import requests  # pulling data
from azure.storage.blob import BlobClient, BlobSasPermissions, generate_blob_sas
from bs4 import BeautifulSoup  # xml parsing
from pptx import Presentation
from pptx.util import Pt

# RSS scraping function
# Based mostly on: https://github.com/mattdood/web_scraping_example/blob/master/scraping.py
def get_updates_rss(startDate, endDate):
    article_list = []
    try:
        # request the feed and parse the response as XML using BS4
        r = requests.get(os.environ["UpdatesURL"])
        soup = BeautifulSoup(r.content, features='xml')
        # select only the "items" I want from the data
        updates = soup.find_all('item')
        # for each "item" I want, parse it into a list
        for a in updates:
            # Get publication date (kept as text for the slide notes, parsed for filtering)
            published = a.find('pubDate').text
            pubDate = datetime.strptime(published, "%a, %d %b %Y %H:%M:%S Z")
            # only include items falling within our requested date range
            if startDate <= pubDate <= endDate:
                title = a.find('title').text
                link = a.find('link').text
                # basic parse to flag announcement types
                if "preview" in title.lower():
                    announcement_type = "preview"
                else:
                    announcement_type = "GA"
                # create an "article" object with the data
                # from each "item"
                article = {
                    'title': title,
                    'link': link,
                    'published': published,
                    'antype': announcement_type
                }
                # append my "article_list" with each "article" object
                article_list.append(article)
        # after the loop, return the collected articles
        return article_list
    except Exception:
        logging.exception("Couldn't scrape the Azure Updates RSS feed")
        # return an empty list so the caller's len() check doesn't fail on None
        return []

###
# Generate a section of the final PowerPoint
###
def generate_presentation_section(presentation, layout, articles, item_type):
    # Nothing to add for this section, so don't create an empty slide
    if not articles:
        return
    # Add first slide and slide notes
    slide = presentation.slides.add_slide(layout)
    slide_notes = slide.notes_slide
    shapes = slide.shapes
    slide_item_count = 0
    total_item_count = 0
    article_count = len(articles)
    slide_count = 1
    for article in articles:
        # Each new slide requires first elements be added differently to the rest.
        if slide_item_count == 0:
            # Insert title for slide
            title_shape = shapes.title
            body_shape = shapes.placeholders[1]
            title_shape.text = item_type + " (" + str(slide_count) + ")"
            # Insert first bullet item
            tf = body_shape.text_frame
            tf.text = article["title"]
            tf.paragraphs[0].font.size = Pt(24)
            # Insert first slide note
            sltf = slide_notes.notes_text_frame
            sltf.text = "- " + article["link"] + " (" + article["published"] + ")"
        else:
            # Insert bullet point
            p = tf.add_paragraph()
            p.font.size = Pt(24)
            p.text = article["title"]
            # Insert slide note
            dotpoint = sltf.add_paragraph()
            dotpoint.text = "- " + article["link"] + " (" + article["published"] + ")"
        slide_item_count += 1
        total_item_count += 1
        # If we hit 5 items on a slide, create a new slide and reset item count
        # If there aren't any items left, don't create a new empty slide
        if slide_item_count == 5 and total_item_count < article_count:
            slide = presentation.slides.add_slide(layout)
            slide_notes = slide.notes_slide
            shapes = slide.shapes
            slide_item_count = 0
            slide_count += 1

###
# Upload generated file to Azure Storage and generate a SAS URL for it
###
def upload_file_to_storage(presentation_file):
    account = os.environ["PowerPointStorageAccount"]
    container = os.environ["PowerPointContainer"]
    blob_client = BlobClient.from_connection_string(
        conn_str=os.environ["PowerPointAccountConnection"],
        container_name=container,
        blob_name=presentation_file
    )
    with open(presentation_file, "rb") as data:
        blob_client.upload_blob(data)
    # Generate a SAS-protected URL for the item which will allow the caller to download the file for 1 hour.
    startTime = datetime.now(tz=timezone.utc)
    endTime = startTime + timedelta(hours=1)
    sas_token = generate_blob_sas(
        account_name=account,
        container_name=container,
        blob_name=presentation_file,
        account_key=os.environ["PowerPointStorageKey"],
        permission=BlobSasPermissions(read=True),
        start=startTime,
        expiry=endTime
    )
    return "https://" + account + ".blob.core.windows.net/" + container + "/" + presentation_file + "?" + sas_token

#####
# Azure Function main entry point
#####
def main(req: func.HttpRequest) -> func.HttpResponse:
    blob_sas_url = ""
    message = ""
    http_status = 200
    try:
        # start date is required
        startParam = req.params.get('start')
        if not startParam:
            message = "Bad request: 'start' query parameter is required in format YYYY-MM-DD."
            http_status = 400
        else:
            # end date is optional, so if not provided use today
            endParam = req.params.get('end')
            if not endParam:
                endParam = datetime.now().strftime("%Y-%m-%d")
            # add 1 day to end date so we include all of the day
            ending = datetime.strptime(endParam, "%Y-%m-%d")
            ending = ending + timedelta(days=1)
            starting = datetime.strptime(startParam, "%Y-%m-%d")
            updatelist = get_updates_rss(startDate=starting, endDate=ending)
            if len(updatelist) > 0:
                prs = Presentation()
                # Initialise default slide layout (bullets)
                bullet_slide_layout = prs.slide_layouts[1]
                preview_items = [item for item in updatelist if item["antype"] == "preview"]
                ga_items = [item for item in updatelist if item["antype"] == "GA"]
                generate_presentation_section(prs, bullet_slide_layout, preview_items, "Preview")
                generate_presentation_section(prs, bullet_slide_layout, ga_items, "GA")
                filename = os.environ["LocalTempFilePath"] + "AzureUpdate-" + datetime.now().strftime("%Y-%m-%d-%H-%M-%S") + ".pptx"
                prs.save(filename)
                blob_sas_url = upload_file_to_storage(filename)
                message = "File created and uploaded to storage. You can <a href='" + blob_sas_url + "'>download it</a> for the next 1 hour."
            else:
                message = "There are no updates for the specified period, so no PowerPoint has been generated."
    except TypeError:
        logging.exception("Type error")
        message = "Check the format of your request and ensure you provide the 'start' query parameter in the format YYYY-MM-DD"
        http_status = 400
    except ValueError:
        # most likely strptime rejecting a date that isn't in YYYY-MM-DD format
        logging.exception("Value error")
        message = "Bad request: 'start' and 'end' must be dates in the format YYYY-MM-DD."
        http_status = 400
    return func.HttpResponse(
        mimetype="text/html",
        body=message,
        status_code=http_status
    )

To run it, you invoke the Function via a web browser with a URL similar to:
https://your-func-app.azurewebsites.net/api/GeneratePresentation?code=YOUR-FUNC-KEY&start=2021-06-20&end=2021-06-30
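If you would rather build that URL in code than type it by hand, here is a minimal sketch using only the Python standard library. The `build_request_url` helper is a hypothetical name of mine and isn't part of the solution; the host name and key are the same placeholders as in the URL above.

```python
from datetime import date
from typing import Optional
from urllib.parse import urlencode


def build_request_url(base_url: str, func_key: str, start: date, end: Optional[date] = None) -> str:
    """Build the query string the Function expects (dates as YYYY-MM-DD)."""
    params = {"code": func_key, "start": start.isoformat()}
    # 'end' is optional: the Function falls back to today when it is omitted.
    if end is not None:
        params["end"] = end.isoformat()
    # urlencode percent-encodes any characters in the key that aren't URL-safe.
    return base_url + "?" + urlencode(params)


# Placeholder values only: swap in your own Function App name and key.
url = build_request_url(
    "https://your-func-app.azurewebsites.net/api/GeneratePresentation",
    "YOUR-FUNC-KEY",
    date(2021, 6, 20),
    date(2021, 6, 30),
)
print(url)
```

Using `urlencode` rather than string concatenation matters mostly for the key, since real Function keys can contain characters such as `=` or `/` that need escaping in a query string.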
If the supplied date range is valid and there are updates that fall within it, you receive a simple web page with a link to a downloadable PowerPoint file. The file is held in a private Azure Storage account and is served for a limited period using a SAS-protected URL. The full documentation on how to debug, deploy and execute the Azure Function can be found on the GitHub repository for the solution. Also, here’s a sample of what you can generate.
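To make "falls within the date range" a bit more concrete, here is a small standalone sketch of the same check the Function performs, using only the standard library. The `in_requested_window` name is hypothetical, and the sample `pubDate` strings are made up for illustration rather than taken from the real feed.

```python
from datetime import datetime, timedelta

# pubDate format the Function parses from the RSS feed
RSS_DATE_FORMAT = "%a, %d %b %Y %H:%M:%S Z"


def in_requested_window(pub_date_text: str, start_param: str, end_param: str) -> bool:
    """Return True when an item's pubDate falls inside the requested range."""
    start = datetime.strptime(start_param, "%Y-%m-%d")
    # One day is added to the end date so the whole of that day is included.
    end = datetime.strptime(end_param, "%Y-%m-%d") + timedelta(days=1)
    published = datetime.strptime(pub_date_text, RSS_DATE_FORMAT)
    return start <= published <= end


# An item published late on the end date is still included...
print(in_requested_window("Wed, 30 Jun 2021 17:00:12 Z", "2021-06-20", "2021-06-30"))
# ...while one published just before the start date is not.
print(in_requested_window("Sat, 19 Jun 2021 23:59:59 Z", "2021-06-20", "2021-06-30"))
```

A date that isn't in `YYYY-MM-DD` format makes `strptime` raise a `ValueError`, which is what "valid" means for the `start` and `end` parameters.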
I have deployed the solution onto a Consumption plan in Azure which means I’m not paying for idle compute, and the PowerPoint takes up so little space that my Storage Account costs will be tiny, especially given this API endpoint can’t be invoked by just anyone. Finally, to save myself even more money, I have a Timer Function that once a week deletes any PowerPoint files sitting in the Storage Account, which won’t be many (if any) for most of the time.
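I haven't included the clean-up Function in this post, so the following is only a rough sketch of how such a Timer Function could look, not the code that is actually deployed. The `delete_presentations` helper and the `.pptx` filter are my assumptions for illustration; the two app settings are the same ones the HTTP Function uses, and the weekly schedule would live in the Function's `function.json` (for example an NCRONTAB expression such as `0 0 0 * * 0` for midnight every Sunday).

```python
import os


def delete_presentations(container_client, suffix=".pptx"):
    """Delete every blob whose name ends with `suffix` and return the deleted names."""
    # Collect the names first so we aren't deleting while paging through the listing.
    names = [blob.name for blob in container_client.list_blobs() if blob.name.endswith(suffix)]
    for name in names:
        container_client.delete_blob(name)
    return names


def main(mytimer) -> None:
    # Hypothetical timer entry point. The SDK is imported here so the helper
    # above can be exercised without the Azure packages installed.
    from azure.storage.blob import ContainerClient

    container = ContainerClient.from_connection_string(
        conn_str=os.environ["PowerPointAccountConnection"],
        container_name=os.environ["PowerPointContainer"],
    )
    delete_presentations(container)
```

Keeping the delete logic in a helper that only needs `list_blobs()` and `delete_blob()` means it can be tried out against a simple stand-in object before pointing it at a real container.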
I’m pretty happy with the solution as it stands, but in future I might look to use my existing PowerPoint template as the base for the resulting presentation which means there would be even less manual work for me to do. Right now I still need to copy / paste from one PowerPoint to the other, but this is so trivial that I’m not bothered about automating it away … just yet 😉.
Hopefully you find some inspiration in the solution here!
Happy Days! 😎
P.S. The GitHub repository with the solution is here: https://github.com/sjwaight/AzureUpdatesPresentationGen