Darian Vance

Posted on Jan 23 • Originally published at wp.me

Solved: Automate Weekly Report Generation from Jira to PDF

#devops #programming #tutorial #cloud

🚀 Executive Summary

TL;DR: This guide provides a step-by-step tutorial for automating weekly Jira report generation into PDF format using Python, eliminating manual, error-prone processes. It leverages the Jira API for data retrieval and ReportLab for PDF creation, scheduled via cron jobs for an efficient, open-source solution.

🎯 Key Takeaways

Utilize Python libraries such as jira for API interaction, reportlab for PDF generation, and python-decouple for secure credential management via a config.env file.
Craft precise Jira Query Language (JQL) to fetch specific issue data (e.g., updated or created in the last week) from a designated project, ensuring relevant information for the report.
Schedule the Python script using a cron job on Linux/macOS, often with a wrapper shell script, to ensure the correct virtual environment and Python interpreter are used for reliable weekly execution.

Automate Weekly Report Generation from Jira to PDF

In the fast-paced world of software development and IT operations, timely and accurate reporting is paramount. Yet, many organizations still grapple with the tedious, manual process of compiling weekly progress reports from Jira. This often involves exporting data to spreadsheets, copy-pasting into documents, and formatting everything into a presentable PDF – a task that is not only time-consuming and error-prone but also a drain on valuable engineering resources. The alternative, leveraging expensive SaaS solutions, might offer automation but often comes with a hefty price tag and limited customization options.

At TechResolve, we believe in empowering our SysAdmins, Developers, and DevOps Engineers with efficient, scalable solutions. This tutorial will guide you through building your own automated system to fetch critical data from Jira and generate professional PDF reports on a weekly schedule. By the end of this guide, you will have a robust, customizable, and open-source solution that saves countless hours, ensures data consistency, and allows your team to focus on innovation rather than repetitive administrative tasks.

Prerequisites

Before we dive into the automation, ensure you have the following:

Python 3.8+: The core language for our script.
Jira Cloud or Server Instance: With API access enabled. You’ll need sufficient permissions to read project and issue data.
Jira API Token (for Cloud) or Username/Password (for Server): For authentication. For Jira Cloud, an API token is highly recommended over a password.
pip Package Manager: For installing Python libraries.
Basic Understanding of Python: Familiarity with scripting and data structures.
Basic Understanding of Bash/Shell Scripting: For setting up cron jobs.
Operating System: Linux/macOS for cron scheduling, or Windows for Task Scheduler (this guide will focus on cron).

Step-by-Step Guide: Automate Weekly Jira Reports

1. Set Up Your Environment and Obtain Jira API Credentials

First, let’s prepare our development environment and secure the necessary credentials. It’s good practice to use a virtual environment to manage project dependencies.

Create a Virtual Environment:

python3 -m venv jira_report_env
source jira_report_env/bin/activate

Install Required Python Libraries: We’ll use jira for API interaction, reportlab for PDF generation, and python-decouple for loading credentials from a file.

pip install jira reportlab python-decouple

Obtain Jira API Token: For Jira Cloud, navigate to https://id.atlassian.com/manage-profile/security/api-tokens and create a new API token. Copy this token immediately as it will not be shown again. For Jira Server, you might use your username and password or an OAuth token if configured.
Secure Your Credentials: Never hardcode credentials in your script. We’ll use a config.env file and python-decouple to load them securely. Create a file named config.env in your project directory:

JIRA_SERVER_URL="https://your-company.atlassian.net"
JIRA_USERNAME="your-email@example.com"
JIRA_API_TOKEN="YOUR_JIRA_API_TOKEN"
# Example: "TRP" for the TechResolve Project
JIRA_PROJECT_KEY="YOUR_PROJECT_KEY"

Remember to replace placeholder values with your actual Jira details. Also, make sure to add config.env to your .gitignore if you’re using version control.
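A forgotten placeholder in config.env tends to surface later as a confusing 401 from Jira. The following is a minimal sketch of a pre-flight check, not part of the original script: the helper name missing_settings and the PLACEHOLDERS set are hypothetical, and the setting names and placeholder values are taken from the sample config.env above.

```python
# Hypothetical pre-flight check; names mirror the sample config.env above.
REQUIRED_SETTINGS = (
    "JIRA_SERVER_URL",
    "JIRA_USERNAME",
    "JIRA_API_TOKEN",
    "JIRA_PROJECT_KEY",
)

# Placeholder values from the sample file; treated as "not configured yet".
PLACEHOLDERS = {
    "https://your-company.atlassian.net",
    "your-email@example.com",
    "YOUR_JIRA_API_TOKEN",
    "YOUR_PROJECT_KEY",
}


def missing_settings(values):
    """Return required setting names that are absent, blank, or still placeholders."""
    missing = []
    for name in REQUIRED_SETTINGS:
        value = str(values.get(name, "")).strip()
        if not value or value in PLACEHOLDERS:
            missing.append(name)
    return missing


# Example with a partially filled-in configuration (illustrative values only).
sample = {
    "JIRA_SERVER_URL": "https://acme.atlassian.net",
    "JIRA_USERNAME": "dev@acme.example",
    "JIRA_API_TOKEN": "YOUR_JIRA_API_TOKEN",
}
print("Still to configure:", missing_settings(sample))
```

In the real script you could feed it the values loaded by python-decouple, for example missing_settings({name: config(name, default="") for name in REQUIRED_SETTINGS}), and exit early if the returned list is not empty.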

2. Fetch Data from Jira Using its API

Now, let’s write a Python script to connect to Jira and retrieve the data we need for our report. We’ll use Jira Query Language (JQL) to filter issues relevant to the past week for a specific project.

Create a file named generate_report.py:

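import datetime
from decouple import Config, RepositoryEnv
from jira import JIRA

# python-decouple only auto-discovers ".env" and "settings.ini", so point it
# at config.env explicitly (the path is relative to the working directory).
config = Config(RepositoryEnv('config.env'))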

def get_jira_issues():
    jira_server_url = config('JIRA_SERVER_URL')
    jira_username = config('JIRA_USERNAME')
    jira_api_token = config('JIRA_API_TOKEN')
    project_key = config('JIRA_PROJECT_KEY')

    options = {
        'server': jira_server_url
    }
    jira = JIRA(options, basic_auth=(jira_username, jira_api_token))

    today = datetime.date.today()
    # "Weekly" here means a rolling window covering the last 7 days.
    # Adjust this if your reports should cover a calendar week (Monday to Sunday).
    one_week_ago = today - datetime.timedelta(days=7)

    # JQL to fetch issues updated or created in the last week for a specific project
    # You might adjust this JQL based on what "weekly report" means for you
    # e.g., 'status changed during ("-1w", "0w")' for status changes
    jql_query = (
        f'project = "{project_key}" AND '
        f'(updated >= "{one_week_ago.strftime("%Y-%m-%d")}" OR '
        f'created >= "{one_week_ago.strftime("%Y-%m-%d")}") '
        f'ORDER BY updated DESC'
    )

    print(f"Executing JQL: {jql_query}")

    issues = jira.search_issues(jql_query, maxResults=100) # Adjust maxResults as needed

    # Extract relevant data
    report_data = []
    for issue in issues:
        report_data.append({
            'key': issue.key,
            'summary': issue.fields.summary,
            'status': issue.fields.status.name,
            'assignee': issue.fields.assignee.displayName if issue.fields.assignee else 'Unassigned',
            'reporter': issue.fields.reporter.displayName if issue.fields.reporter else 'N/A',
            'issue_type': issue.fields.issuetype.name,
            'url': f"{jira_server_url}/browse/{issue.key}"
        })
    return report_data

if __name__ == '__main__':
    data = get_jira_issues()
    if data:
        print(f"Fetched {len(data)} issues for the report.")
    else:
        print("No issues found for the specified criteria.")

Logic Explanation: The script reads Jira credentials from config.env. It then calculates a date range for the “last week” (configurable based on your reporting period definition). A JQL query is constructed to fetch issues from your specified project that were created or updated within this period. Finally, it iterates through the fetched issues, extracting key fields like summary, status, and assignee, and stores them in a list of dictionaries for further processing.
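If “weekly” should mean the previous calendar week rather than a rolling seven days, the date range can be computed as sketched below. This is one possible variation, not the script’s current behavior; the function names previous_week_range and build_weekly_jql are hypothetical, and the query filters on updated only.

```python
import datetime


def previous_week_range(today):
    """Return (monday, sunday) of the calendar week before the one containing `today`."""
    # Python's weekday() returns 0 for Monday, so this steps back to this week's Monday.
    this_monday = today - datetime.timedelta(days=today.weekday())
    last_monday = this_monday - datetime.timedelta(days=7)
    last_sunday = this_monday - datetime.timedelta(days=1)
    return last_monday, last_sunday


def build_weekly_jql(project_key, today):
    """Build a JQL string for issues updated during the previous calendar week."""
    start, end = previous_week_range(today)
    # A bare date in JQL means midnight, so use an exclusive upper bound
    # (the day after Sunday) to include everything that happened on Sunday.
    upper = end + datetime.timedelta(days=1)
    return (
        f'project = "{project_key}" AND '
        f'updated >= "{start:%Y-%m-%d}" AND updated < "{upper:%Y-%m-%d}" '
        f'ORDER BY updated DESC'
    )


# "TRP" is the example project key used earlier in this guide.
print(build_weekly_jql("TRP", datetime.date.today()))
```

One caveat with this approach: updated holds only the most recent change, so an issue touched again after the window closes will no longer match. For a report that runs on Monday morning, that is usually a small gap.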

3. Generate the PDF Report

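Now, let’s extend our script to take the fetched Jira data and format it into a professional PDF document using the ReportLab library. Add the following imports and function to your generate_report.py file (or create a new file and import get_jira_issues), and replace the earlier if __name__ == '__main__': block with the one at the end of this listing so the Jira query runs only once.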

from reportlab.lib.pagesizes import letter
from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer, Table, TableStyle
from reportlab.lib.styles import getSampleStyleSheet
from reportlab.lib.units import inch
from reportlab.lib import colors

# ... (Previous code for get_jira_issues and imports) ...

def create_pdf_report(data, filename="weekly_jira_report.pdf"):
    doc = SimpleDocTemplate(filename, pagesize=letter)
    styles = getSampleStyleSheet()
    story = []

    # Title
    report_title = f"Weekly Jira Report - {datetime.date.today().strftime('%Y-%m-%d')}"
    story.append(Paragraph(report_title, styles['h1']))
    story.append(Spacer(1, 0.2 * inch))

    # Summary
    story.append(Paragraph(f"Total issues updated/created in the last week: {len(data)}", styles['Normal']))
    story.append(Spacer(1, 0.2 * inch))

    if not data:
        story.append(Paragraph("No relevant issues found for this reporting period.", styles['Normal']))
    else:
        # Table Header
        table_data = [['Key', 'Type', 'Summary', 'Status', 'Assignee']]

        # Populate table data
        for item in data:
            table_data.append([
                Paragraph(item['key'], styles['Normal']),
                Paragraph(item['issue_type'], styles['Normal']),
                Paragraph(item['summary'], styles['Normal']),
                Paragraph(item['status'], styles['Normal']),
                Paragraph(item['assignee'], styles['Normal'])
            ])

        # Create Table
        col_widths = [0.8 * inch, 0.8 * inch, 3.0 * inch, 0.8 * inch, 1.2 * inch] # Adjust widths as needed
        table = Table(table_data, colWidths=col_widths)

        # Apply Table Style
        table.setStyle(TableStyle([
            # Header row: grey background with light, bold text
            ('BACKGROUND', (0, 0), (-1, 0), colors.grey),
            ('TEXTCOLOR', (0, 0), (-1, 0), colors.whitesmoke),
            ('FONTNAME', (0, 0), (-1, 0), 'Helvetica-Bold'),
            # Body rows start at row 1 so the header keeps its own background
            ('BACKGROUND', (0, 1), (-1, -1), colors.beige),
            ('ALIGN', (0, 0), (-1, -1), 'LEFT'),
            ('GRID', (0, 0), (-1, -1), 1, colors.black),
            ('LEFTPADDING', (0, 0), (-1, -1), 6),
            ('RIGHTPADDING', (0, 0), (-1, -1), 6),
            ('TOPPADDING', (0, 0), (-1, -1), 6),
            ('BOTTOMPADDING', (0, 0), (-1, -1), 6),
            # Applied last so the header's extra padding is not overridden
            ('BOTTOMPADDING', (0, 0), (-1, 0), 12),
        ]))

        story.append(table)

    doc.build(story)
    print(f"Report generated successfully: {filename}")

if __name__ == '__main__':
    report_data = get_jira_issues()
    report_filename = f"weekly_jira_report_{datetime.date.today().strftime('%Y%m%d')}.pdf"
    create_pdf_report(report_data, report_filename)

Logic Explanation: The create_pdf_report function takes the list of issue dictionaries. It initializes a SimpleDocTemplate for the PDF. It adds a title and a summary paragraph. For the issues, it constructs a Table with headers and populates it with the extracted issue data. TableStyle is applied to make the report visually appealing. Finally, doc.build(story) compiles all the elements into the PDF file.
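One detail worth guarding against: ReportLab’s Paragraph interprets its text as XML-like inline markup, so an issue summary containing characters such as < or & can break the build or render incorrectly. A minimal sketch of a sanitizing helper follows; the name safe_cell_text and the 200-character limit are assumptions for illustration, not part of the original script.

```python
from xml.sax.saxutils import escape


def safe_cell_text(text, max_chars=200):
    """Escape markup characters and trim long text before wrapping it in a Paragraph."""
    text = "" if text is None else str(text)
    # Truncate before escaping so an entity like "&amp;" is never cut in half.
    if len(text) > max_chars:
        text = text[: max_chars - 3] + "..."
    return escape(text)


print(safe_cell_text("Fix <br> handling & retry logic"))
```

In create_pdf_report, each cell could then be built as Paragraph(safe_cell_text(item['summary']), styles['Normal']), and likewise for the other fields.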

4. Schedule the Automation with Cron

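To make this report generation truly automated, we’ll schedule it to run weekly using a cron job. Marking generate_report.py as executable is optional: it only matters if you add a shebang line (for example #!/usr/bin/env python3) and run the file directly, because the wrapper script in this step invokes it through the Python interpreter.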

Make the Script Executable:

chmod +x generate_report.py

Create a Wrapper Script (Optional but Recommended): Sometimes cron environments differ from your interactive shell. A small wrapper script ensures the correct Python interpreter and virtual environment are used. Create run_report.sh:

#!/bin/bash

# Navigate to the script's directory
# IMPORTANT: Change this to your project path
cd /path/to/your/jira_report_project || exit 1

# Activate the virtual environment
source jira_report_env/bin/activate

# Execute the Python script
python generate_report.py

# Deactivate the virtual environment
deactivate

Make this wrapper script executable as well:

chmod +x run_report.sh

Schedule with Cron: Open the crontab of the user that should run the job. If you are logged in as that user, run crontab -e directly; otherwise use su -c "crontab -e" your_user, replacing your_user with the target account.

crontab -e

Add the following line to run the script every Monday at 9:00 AM. Replace /path/to/your/jira_report_project/run_report.sh with the actual path to your script.

0 9 * * 1 /path/to/your/jira_report_project/run_report.sh >> /tmp/logs/jira_report_cron.log 2>&1

Cron Syntax Explanation:

0 9 * * 1 means:

0: Minute 0
9: Hour 9 (9 AM)
*: Every day of the month
*: Every month
1: Monday (0 or 7 is Sunday, 1 is Monday)

The >> /tmp/logs/jira_report_cron.log 2>&1 part redirects both standard output and standard error to a log file, which is crucial for debugging cron jobs. Ensure the /tmp/logs/ directory exists or change it to an appropriate log path.
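To sanity-check a schedule before waiting a week for it to fire, you can compute the next expected run in Python. The sketch below mirrors only this specific expression (0 9 * * 1); it is not a general cron parser, and the function name next_monday_9am is hypothetical. Note that Python numbers Monday as 0, whereas cron numbers it as 1.

```python
import datetime


def next_monday_9am(now):
    """Return the first Monday 09:00 strictly after `now` (mirrors `0 9 * * 1`)."""
    # Python's weekday(): Monday is 0, so this is the number of days until Monday.
    days_ahead = (0 - now.weekday()) % 7
    candidate = (now + datetime.timedelta(days=days_ahead)).replace(
        hour=9, minute=0, second=0, microsecond=0
    )
    # If today is Monday and 09:00 has already passed, move to next week.
    if candidate <= now:
        candidate += datetime.timedelta(days=7)
    return candidate


print("Next scheduled run:", next_monday_9am(datetime.datetime.now()))
```

Cron evaluates the schedule in the system’s local time zone, so compare the result against the clock on the machine that runs the job.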

Common Pitfalls

Jira API Rate Limits: Jira Cloud APIs have rate limits. If you’re querying a very large number of issues or running the script too frequently, you might hit these limits, resulting in HTTP 429 errors. To mitigate this, narrow your JQL so it returns only the issues and fields you need, paginate when the result set is larger than a single page (the script in Step 2 requests at most 100 issues), and retry failed requests with exponential backoff.
Authentication and Permissions: Ensure your Jira API token (or username/password) is correct and has the necessary permissions to read data from the specified project. Common errors include 401 Unauthorized or 403 Forbidden. Double-check your config.env values and the permissions of the Jira user associated with the API token.
Cron Environment Differences: Cron jobs run in a minimal environment, which might not have the same PATH or environment variables as your interactive shell. Using a wrapper script (as shown in Step 4) that explicitly activates the virtual environment is a robust way to handle this. Always redirect cron output to a log file to capture any errors.
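The exponential backoff mentioned under rate limits can be sketched with the standard library alone. This is a generic retry wrapper under stated assumptions, not something the jira library provides under this name: call_with_backoff, its parameters, and the simulated failure are all hypothetical.

```python
import random
import time


def call_with_backoff(func, max_attempts=5, base_delay=1.0,
                      sleep=time.sleep, retry_on=(Exception,)):
    """Call func(), retrying with exponentially growing delays when it raises."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error to the caller.
            # Delay doubles each attempt (base, 2*base, 4*base, ...) plus up to
            # 10% random jitter so parallel jobs do not retry in lockstep.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay * 0.1))


# Demo with a function that fails twice, standing in for an HTTP 429 response.
attempts = {"count": 0}


def flaky_request():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("simulated rate limit")
    return "ok"


print(call_with_backoff(flaky_request, base_delay=0.01))
```

In the report script the call would look something like call_with_backoff(lambda: jira.search_issues(jql_query, maxResults=100)). It is worth narrowing retry_on to the errors you actually want to retry, since retrying a 401 or 403 only delays the failure.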

Conclusion

You’ve now built a powerful automation that transforms a manual, weekly chore into an efficient, hands-off process. By leveraging Python, Jira’s robust API, and the ReportLab library, you’ve gained control over your reporting workflow, ensuring consistency and saving valuable engineering time. This solution is not just about generating PDFs; it’s about empowering your team to focus on more impactful work.

As next steps, consider enhancing this solution:

Email Integration: Automatically email the generated PDF report to stakeholders using Python’s smtplib.
Advanced Report Layouts: Explore more advanced features of ReportLab or other PDF libraries (e.g., fpdf2, WeasyPrint) to create more complex and visually rich reports, including charts and graphs.
Error Handling and Monitoring: Implement more comprehensive error logging and integrate with monitoring tools to be alerted if the report generation fails.
Cloud Integration: Deploy your script as a serverless function (e.g., AWS Lambda, Google Cloud Functions) triggered by a scheduled event for a fully managed, scalable solution.
More Granular JQL: Refine your JQL queries to generate specific reports for different teams, sprints, or issue types.
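As a starting point for the email integration idea, here is a minimal sketch using only the standard library. The function names, the default subject, the use of STARTTLS on port 587, and all addresses are assumptions; adapt them to whatever your mail server requires.

```python
import smtplib
from email.message import EmailMessage


def build_report_email(pdf_bytes, filename, sender, recipients,
                       subject="Weekly Jira Report"):
    """Build an email message with the PDF report attached."""
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg.set_content("The weekly Jira report is attached.")
    msg.add_attachment(pdf_bytes, maintype="application", subtype="pdf",
                       filename=filename)
    return msg


def send_report(msg, host, port=587, username=None, password=None):
    """Send the message over SMTP with STARTTLS (assumed server setup)."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.starttls()
        if username:
            smtp.login(username, password)
        smtp.send_message(msg)


# Building the message needs no network access; the bytes here are a stand-in
# for the contents of the generated PDF file.
example = build_report_email(b"%PDF-1.4 example", "weekly_jira_report.pdf",
                             "reports@example.com", ["team@example.com"])
print(example["Subject"], "->", example["To"])
```

In practice you would read the generated file with open(report_filename, "rb") and keep the SMTP credentials in config.env alongside the Jira ones.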