DEV Community

Cover image for The Lazy Engineer’s Guide to Automating Timesheets: Part 1
Ernesto Herrera Salinas
Ernesto Herrera Salinas

Posted on • Edited on

The Lazy Engineer’s Guide to Automating Timesheets: Part 1

Ah, timesheets. The Lex Luthor of every software engineer’s existence. If you’re like me, you’d rather debug a race condition at 3 a.m. than spend even five minutes documenting what you did all day. But alas, as freelancers or full-time employees, we’re often stuck with this tedious chore.

This year, I hit my breaking point. After a whirlwind of projects—some canceled, some pivoted, and others shelved indefinitely—I found myself staring down the barrel of a year-end timesheet deadline. The thought of manually reconstructing my entire year’s work made me want to scream. So, I did what any self-respecting engineer would do: I decided to automate the problem away.

This is the story of how I turned a day of dread into a coding adventure. Buckle up—it’s going to be a lazy, efficient ride.


The Problem: Timesheets Are the Worst

Let’s set the scene:

  • The Task: Document every hour spent on every task for the entire year.

  • The Obstacle: My memory is about as reliable as a flaky integration test.

  • The Deadline: One day. Yes, one day.

I knew I couldn’t do this manually. So, I hatched a plan to pull data from all the tools I use daily—JIRA, Git, Slack, and Outlook—and stitch it together into a coherent timesheet.


The Tools of the Trade

Here’s what I used:

  1. JIRA: For tracking tasks and tickets.

  2. Git: For commit history (because every good engineer ties commits to tickets, right?).

  3. Slack: For team communication (because meetings and messages count too).

  4. Outlook: For calendar events (because apparently, meetings are work too).


Step 1: Pulling JIRA Tickets

My first stop was JIRA. I needed to pull all the tickets assigned to me within a specific date range. Thankfully, JIRA has a robust API, and with a little Python magic, I was able to automate this process.

The Script

Here’s the Python script I wrote to fetch JIRA tickets:

import os
from jira import JIRA
import pandas as pd
from datetime import datetime
import logging
import sys
from typing import List, Dict, Any
import argparse

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)

class JiraTicketPuller:
    def __init__(self, server: str, email: str, api_token: str):
        """
        Initialize JIRA client with authentication credentials.

        Args:
            server: JIRA server URL
            email: User's email for authentication
            api_token: JIRA API token
        """
        try:
            self.jira = JIRA(server=server, basic_auth=(email, api_token))
            logger.info("Successfully connected to JIRA")
        except Exception as e:
            logger.error(f"Failed to connect to JIRA: {str(e)}")
            sys.exit(1)

    def get_user_tickets(
        self, 
        username: str, 
        start_date: str, 
        end_date: str
    ) -> List[Dict[str, Any]]:
        """
        Retrieve all tickets assigned to a user within a date range.

        Args:
            username: JIRA username to search for
            start_date: Start date in YYYY-MM-DD format
            end_date: End date in YYYY-MM-DD format

        Returns:
            List of dictionaries containing ticket information
        """
        try:
            # Construct JQL query
            jql_query = (
                f'assignee = {username} AND '
                f'created >= {start_date} AND '
                f'created <= {end_date} '
                'ORDER BY created DESC'
            )

            # Fields to retrieve
            fields = [
                'summary',
                'status',
                'priority',
                'issuetype',
                'created',
                'updated',
                'project',
                'components',
                'description'
            ]

            # Get issues using JQL
            issues = self.jira.search_issues(
                jql_query,
                maxResults=1000,  # Adjust based on your needs
                fields=fields
            )

            tickets_data = []
            for issue in issues:
                ticket = {
                    'Key': issue.key,
                    'Summary': issue.fields.summary,
                    'Status': str(issue.fields.status),
                    'Priority': str(issue.fields.priority),
                    'Issue Type': str(issue.fields.issuetype),
                    'Created': issue.fields.created[:10],  # Get date only
                    'Updated': issue.fields.updated[:10],  # Get date only
                    'Project': str(issue.fields.project),
                    'Components': ', '.join([str(c) for c in issue.fields.components]),
                    'Description': issue.fields.description or ''
                }
                tickets_data.append(ticket)

            logger.info(f"Retrieved {len(tickets_data)} tickets for user {username}")
            return tickets_data

        except Exception as e:
            logger.error(f"Error retrieving tickets: {str(e)}")
            return []

    def export_to_csv(self, tickets: List[Dict[str, Any]], output_file: str):
        """
        Export tickets data to CSV file.

        Args:
            tickets: List of ticket dictionaries
            output_file: Path to output CSV file
        """
        try:
            if not tickets:
                logger.warning("No tickets to export")
                return

            df = pd.DataFrame(tickets)
            df.to_csv(output_file, index=False, encoding='utf-8')
            logger.info(f"Successfully exported tickets to {output_file}")

        except Exception as e:
            logger.error(f"Error exporting to CSV: {str(e)}")

def validate_date(date_str: str) -> bool:
    """Validate date string format (YYYY-MM-DD)."""
    try:
        datetime.strptime(date_str, '%Y-%m-%d')
        return True
    except ValueError:
        return False

def main():
    # Set up argument parser
    parser = argparse.ArgumentParser(
        description='Pull JIRA tickets for a user within a date range'
    )
    parser.add_argument('--username', required=True, help='JIRA username')
    parser.add_argument('--start-date', required=True, help='Start date (YYYY-MM-DD)')
    parser.add_argument('--end-date', required=True, help='End date (YYYY-MM-DD)')
    parser.add_argument(
        '--output', 
        default='jira_tickets.csv', 
        help='Output CSV file path'
    )

    args = parser.parse_args()

    # Validate dates
    if not all(validate_date(date) for date in [args.start_date, args.end_date]):
        logger.error("Invalid date format. Please use YYYY-MM-DD")
        sys.exit(1)

    # Configuration
    JIRA_SERVER = os.getenv("JIRA_SERVER")
    JIRA_EMAIL = os.getenv("JIRA_EMAIL")
    JIRA_API_TOKEN = os.getenv("JIRA_API_TOKEN")

    # Initialize ticket puller
    puller = JiraTicketPuller(JIRA_SERVER, JIRA_EMAIL, JIRA_API_TOKEN)

    # Get and export tickets
    tickets = puller.get_user_tickets(
        args.username,
        args.start_date,
        args.end_date
    )
    puller.export_to_csv(tickets, args.output)

if __name__ == "__main__":
    # Ensure JIRA API token is set
    if not os.getenv("JIRA_API_TOKEN"):
        logging.error("JIRA_API_TOKEN environment variable not set")
        return
    main()
Enter fullscreen mode Exit fullscreen mode

How It Works

  1. Authentication: The script uses your JIRA email and API token to authenticate.

  2. JQL Query: It constructs a JQL query to fetch tickets assigned to you within a date range.

  3. Data Export: The results are exported to a CSV file for easy analysis.


Step 2: Fetching Git Commits

Next, I turned to Git. Since our team follows a practice of including JIRA ticket IDs in commit messages, I wrote a script to extract commit data and map it to tickets.

The Script

import subprocess
import csv
import re

def get_git_commits(since_date=None, author=None):
    # Get commit data with full commit message
    cmd = ['git', 'log', '--pretty=format:%h|%ad|%s', '--date=iso']

    if since_date:
        cmd.extend(['--since', since_date])
    if author:
        cmd.extend(['--author', author])

    result = subprocess.run(cmd, capture_output=True, text=True)
    commits = result.stdout.strip().split('\n')

    # Process commits to extract JIRA IDs
    processed_commits = []
    jira_pattern = re.compile(r'JIRA:\s*([A-Z]+-\d+)')

    for commit in commits:
        if commit:  # Skip empty lines
            hash_id, date, message = commit.split('|', 2)
            jira_match = jira_pattern.search(message)
            jira_id = jira_match.group(1) if jira_match else ''

            processed_commits.append({
                'date': date,
                'commit': hash_id,
                'jira_id': jira_id
            })

    # Write to CSV
    csv_filename = 'git_commits_jira.csv'
    fieldnames = ['date', 'commit', 'jira_id']

    with open(csv_filename, 'w', newline='') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(processed_commits)

    return processed_commits
Enter fullscreen mode Exit fullscreen mode

How It Works

  1. Git Log: The script uses git log to fetch commit history.

  2. JIRA ID Extraction: It uses regex to extract JIRA ticket IDs from commit messages.

  3. CSV Export: The results are saved to a CSV file for later use.


Step 3: Tackling Slack Messages

Slack was trickier. Messages are context-heavy, and mapping them to specific tasks isn’t straightforward. I briefly considered using AI to parse the data but decided against it due to cost and complexity. Instead, I created a generic ticket to capture communication time and wrote a script to fetch Slack messages.

The Script

import os
from datetime import datetime
from slack_sdk import WebClient
from slack_sdk.errors import SlackApiError
import pandas as pd

def get_all_conversations(client):
    """
    Get all conversations (channels, private channels, and DMs) the bot has access to.

    Args:
        client: Slack WebClient instance

    Returns:
        list: List of conversation objects
    """
    conversations = []

    try:
        # Get all conversations (channels, private channels, and DMs)
        result = client.conversations_list(
            types="public_channel,private_channel,im,mpim",
            limit=1000
        )
        conversations.extend(result["channels"])

        # Handle pagination
        while result.get("response_metadata", {}).get("next_cursor"):
            cursor = result["response_metadata"]["next_cursor"]
            result = client.conversations_list(
                types="public_channel,private_channel,im,mpim",
                cursor=cursor,
                limit=1000
            )
            conversations.extend(result["channels"])

    except SlackApiError as e:
        print(f"Error fetching conversations: {e.response['error']}")

    return conversations

def get_user_info(client, user_id):
    """
    Get user information from their ID.

    Args:
        client: Slack WebClient instance
        user_id: User's Slack ID

    Returns:
        dict: User information including real name and email
    """
    try:
        result = client.users_info(user=user_id)
        user = result["user"]
        return {
            "real_name": user.get("real_name", "Unknown"),
            "email": user.get("profile", {}).get("email", "Unknown"),
            "display_name": user.get("profile", {}).get("display_name", "Unknown")
        }
    except SlackApiError:
        return {"real_name": "Unknown", "email": "Unknown", "display_name": "Unknown"}

def get_messages_from_conversation(client, channel_id, channel_name, start_date, end_date):
    """
    Retrieve messages from a specific conversation within a date range.

    Args:
        client: Slack WebClient instance
        channel_id (str): The ID of the Slack channel
        channel_name (str): The name of the Slack channel
        start_date (str): Start date in format 'YYYY-MM-DD'
        end_date (str): End date in format 'YYYY-MM-DD'

    Returns:
        list: List of messages within the date range
    """
    messages = []
    start_timestamp = int(datetime.strptime(start_date, '%Y-%m-%d').timestamp())
    end_timestamp = int(datetime.strptime(end_date, '%Y-%m-%d').timestamp())

    try:
        # Initialize the cursor for pagination
        cursor = None
        while True:
            # Get messages using conversations.history
            result = client.conversations_history(
                channel=channel_id,
                oldest=start_timestamp,
                latest=end_timestamp,
                limit=100,
                cursor=cursor
            )

            # Process messages
            for msg in result['messages']:
                user_id = msg.get('user')
                user_info = get_user_info(client, user_id) if user_id else {
                    "real_name": "Unknown",
                    "email": "Unknown",
                    "display_name": "Unknown"
                }

                message_data = {
                    'channel_name': channel_name,
                    'channel_id': channel_id,
                    'text': msg.get('text', ''),
                    'user_id': user_id,
                    'user_name': user_info['real_name'],
                    'user_email': user_info['email'],
                    'timestamp': datetime.fromtimestamp(float(msg['ts'])).strftime('%Y-%m-%d %H:%M:%S'),
                    'thread_ts': msg.get('thread_ts', None)
                }
                messages.append(message_data)

            # Check if there are more messages to fetch
            cursor = result.get('response_metadata', {}).get('next_cursor')
            if not cursor:
                break

    except SlackApiError as e:
        print(f"Error fetching messages from {channel_name}: {e.response['error']}")

    return messages

def main():
    # Set your date range
    start_date = "2024-01-01"
    end_date = "2024-01-31"

    # Initialize the Slack client
    slack_token = os.environ.get('SLACK_BOT_TOKEN')
    if not slack_token:
        raise ValueError("Please set the SLACK_BOT_TOKEN environment variable")

    client = WebClient(token=slack_token)

    # Get all conversations
    print("Fetching all conversations...")
    conversations = get_all_conversations(client)

    # Fetch messages from all conversations
    all_messages = []
    total_conversations = len(conversations)

    print(f"Found {total_conversations} conversations. Fetching messages...")

    for idx, conv in enumerate(conversations, 1):
        channel_id = conv["id"]
        channel_name = conv.get("name", "DM")  # DMs don't have names

        print(f"Processing {idx}/{total_conversations}: {channel_name}")

        messages = get_messages_from_conversation(
            client,
            channel_id,
            channel_name,
            start_date,
            end_date
        )
        all_messages.extend(messages)

    # Convert to DataFrame for easier handling
    df = pd.DataFrame(all_messages)

    # Save to CSV
    output_file = f"slack_messages_{start_date}_to_{end_date}.csv"
    df.to_csv(output_file, index=False, encoding='utf-8')

    print(f"\nTotal messages retrieved: {len(all_messages)}")
    print(f"Messages saved to: {output_file}")

    # Print summary
    print("\nMessages per channel:")
    channel_summary = df.groupby('channel_name').size().sort_values(ascending=False)
    print(channel_summary)

if __name__ == "__main__":
    main()
Enter fullscreen mode Exit fullscreen mode

How It Works

  1. Conversation List: The script fetches all channels and DMs accessible to the bot.

  2. Message Retrieval: It retrieves messages within a specified date range.

  3. CSV Export: The messages are saved to a CSV file for further analysis.


Step 4: Capturing Outlook Meetings

Finally, I needed to account for meetings. Using the exchangelib Python library, I wrote a script to pull calendar events and export them to a CSV.

The Script

from exchangelib import Credentials, Account
import pandas as pd
from datetime import datetime, timedelta

def get_outlook_meetings():
    # Replace with your email and password
    email = os.environ.get("JIRA_EMAIL")
    password = os.environ.get('OUTLOOK_PASSWORD')

    # Connect to Exchange
    credentials = Credentials(email, password)
    account = Account(email, credentials=credentials, autodiscover=True)

    # Get calendar items for the last 30 days (adjust as needed)
    start_date = datetime.now() - timedelta(days=30)
    end_date = datetime.now() + timedelta(days=30)

    # Get calendar items
    calendar_items = account.calendar.view(
        start=start_date,
        end=end_date
    )

    # Prepare data structure
    meetings = []

    # Process each appointment
    for item in calendar_items:
        # Calculate duration in minutes
        duration = int((item.end - item.start).total_seconds() / 60)

        meetings.append({
            'date': item.start.strftime('%Y-%m-%d'),
            'time': item.start.strftime('%H:%M'),
            'duration': duration,
            'title': item.subject
        })

    # Create DataFrame and save to CSV
    df = pd.DataFrame(meetings)
    df.to_csv('outlook_meetings.csv', index=False)

    return df

# Run the function
if __name__ == "__main__":
    meetings_df = get_outlook_meetings()
    print(f"Exported {len(meetings_df)} meetings to outlook_meetings.csv")
Enter fullscreen mode Exit fullscreen mode

How It Works

  1. Authentication: The script uses your Outlook email and password to authenticate.

  2. Calendar Query: It fetches calendar events within a specified date range.

  3. CSV Export: The events are saved to a CSV file.


What’s Next?

At this point, I had four CSV files:

  1. JIRA Tickets: All the tasks I worked on.

  2. Git Commits: All the code I wrote.

  3. Slack Messages: All the communication I participated in.

  4. Outlook Meetings: All the meetings I attended.

In Part 2, I’ll show you how I stitched these datasets together to create a comprehensive timesheet. Spoiler alert: It involves more Python, some data wrangling, and a little bit of magic.

Stay tuned, and remember: Laziness is the mother of invention.


What’s your least favorite chore as a software engineer? Have you automated it yet? Share your stories in the comments!

Image of AssemblyAI tool

Challenge Submission: SpeechCraft - AI-Powered Speech Analysis for Better Communication

SpeechCraft is an advanced real-time speech analytics platform that transforms spoken words into actionable insights. Using cutting-edge AI technology from AssemblyAI, it provides instant transcription while analyzing multiple dimensions of speech performance.

Read full post

Top comments (0)

AWS Security LIVE!

Tune in for AWS Security LIVE!

Join AWS Security LIVE! for expert insights and actionable tips to protect your organization and keep security teams prepared.

Learn More

👋 Kindness is contagious

Discover a treasure trove of wisdom within this insightful piece, highly respected in the nurturing DEV Community enviroment. Developers, whether novice or expert, are encouraged to participate and add to our shared knowledge basin.

A simple "thank you" can illuminate someone's day. Express your appreciation in the comments section!

On DEV, sharing ideas smoothens our journey and strengthens our community ties. Learn something useful? Offering a quick thanks to the author is deeply appreciated.

Okay