Seth Keddy

Posted on May 15

Step-by-Step Guide to Set Up a Cron Job to Run a Report

#sql #python #analytics #automation

HouseKeeping

Setting Up a Free AWS EC2 Instance as a Baseline Development Environment

Before automating report generation with cron jobs, it's essential to have a reliable development environment where you can build, test, and deploy your scripts. Amazon Web Services (AWS) offers a free-tier EC2 instance, which is an excellent baseline platform for development work without upfront costs.

Steps to Set Up a Free Tier EC2 Instance:

Create an AWS Account:

If you don’t already have one, sign up at aws.amazon.com. The free tier provides 750 hours per month of a t2.micro or t3.micro instance for 12 months.
Launch an EC2 Instance:

-   Navigate to the EC2 dashboard in the AWS Management Console.

-   Choose “Launch Instance” and select an Amazon Machine Image (AMI), such as the latest Amazon Linux 2 or Ubuntu Server.

-   Select the **t2.micro** or **t3.micro** instance type to remain within the free tier.

-   Configure storage and network settings as needed.

-   Set up a security group allowing SSH (port 22) access from your IP address.

Access the Instance:

Use an SSH client from your local machine to connect to the instance using the provided key pair.
Set Up Your Development Environment:

Install necessary packages, programming languages (Python, Node.js, etc.), and tools to support your report generation scripts.

This EC2 instance will serve as your dedicated sandbox for testing and running automated tasks like scheduled reports, ensuring that your production systems remain stable.

Proceeding to Automate Report Execution with Cron Jobs

With the baseline EC2 instance configured and ready, you can now automate your reporting tasks by scheduling scripts with cron jobs. Here’s how to set that up...

Ready Set GO!

1. Prepare Your Report Script

Make sure your report script runs correctly when executed manually.
For example, a script called generate_report.sh located in /home/user/scripts/.

Here's a generate_report.sh that:

Connects to a SQL database,
Runs a query to find missing data,
Formats the results as HTML,
And sets up a simple Flask server to serve the report page.

This will give you a solid base for your report runner automation later.

from flask import Flask, render_template_string
import pyodbc  # or use your preferred DB driver

app = Flask(__name__)

# Replace these with your actual connection info
CONNECTION_STRING = 'DRIVER={SQL Server};SERVER=your_server;DATABASE=your_db;UID=user;PWD=password'

# Sample query to find missing data (adjust as needed)
QUERY = '''
SELECT o.OrderID, o.CustomerID
FROM Orders o
WHERE NOT EXISTS (
    SELECT 1 FROM Customers c WHERE c.CustomerID = o.CustomerID
)
'''

def run_query():
    conn = pyodbc.connect(CONNECTION_STRING)
    cursor = conn.cursor()
    cursor.execute(QUERY)
    columns = [column[0] for column in cursor.description]
    rows = cursor.fetchall()
    cursor.close()
    conn.close()
    return columns, rows

def generate_html_report(columns, rows):
    # Simple HTML table format
    html = '<h1>Missing Data Report</h1>'
    html += '<table border="1" cellpadding="5" cellspacing="0">'
    html += '<thead><tr>'
    for col in columns:
        html += f'<th>{col}</th>'
    html += '</tr></thead><tbody>'
    for row in rows:
        html += '<tr>'
        for cell in row:
            html += f'<td>{cell}</td>'
        html += '</tr>'
    html += '</tbody></table>'
    return html

@app.route('/')
def report():
    columns, rows = run_query()
    html_report = generate_html_report(columns, rows)
    return render_template_string(html_report)

if __name__ == '__main__':
    app.run(debug=True)

How to use:

Install dependencies:

pip install flask pyodbc
Update CONNECTION_STRING with your DB info.
Modify the QUERY to fit your missing data check.
Run the script:
python your_script_name.py
Open http://127.0.0.1:5000 in your browser to view the report.

    #!/bin/bash  # generate_report.sh  # Command to generate report python3 /home/user/scripts/report_generator.py 
# Optional: send report via email or move to a shared location

Ensure the script has execution permissions:

chmod +x /home/user/scripts/generate_report.sh

2. Open the Crontab Editor

Open the terminal and run:

crontab -e

This opens the crontab file for the current user.

3. Define the Cron Schedule and Command

Cron schedule syntax (five fields):

* * * * * command-to-run

Fields:
- Minute (0-59)
- Hour (0-23)
- Day of month (1-31)
- Month (1-12)
- Day of week (0-7, Sunday=0 or 7)
Example: Run report every day at 2:30 AM:

30 2 * * * /home/user/scripts/generate_report.sh >> /home/user/logs/report.log 2>&1

Explanation:
- 30 2 * * * — at 02:30 every day
- >> /home/user/logs/report.log 2>&1 — append standard output and errors to a log file for troubleshooting

4. Save and Exit the Crontab Editor

In vi editor: press ESC, then type :wq and press Enter.
Cron will automatically install the new job.

5. Verify Cron Job

List cron jobs with:

crontab -l

Check if your job is listed correctly.

6. Monitor and Troubleshoot

Check the log file /home/user/logs/report.log after the scheduled time.
If the script fails, make sure:
- Paths are absolute (no relative paths)
- Environment variables are set (cron runs with limited environment)
- Permissions are correct
- Any dependencies are accessible

Summary

Write your report script to run standalone.
Set execution permission on the script.
Schedule with crontab -e.
Redirect output to logs for monitoring.
Verify your cron job with crontab -l.
Monitor logs and fix issues as needed.

DEV Community