Cláudio Filipe Lima Rapôso
From Infrastructure to Product: Pipelines into a Profitable SaaS

Open Journal Systems (OJS) is the engine behind thousands of scientific publications worldwide. However, educational institutions, from high schools running literary magazines to universities managing peer-reviewed journals, share a massive pain point: they want to publish research, but they lack the internal IT resources to manage complex PHP monoliths, backups, and server security.

Blending my experience in the classroom with software architecture, I realized this is a massive opportunity. We can move beyond just "hosting" OJS and transform it into a fully automated, Zero-Touch B2B SaaS platform targeted globally at the education sector.

This article details the design, infrastructure, and product strategy to build a highly available, multi-tenant OJS SaaS on Amazon Web Services (AWS) using Terraform, Go, and GitHub Actions.


🏗️ 1. The Reference Architecture on AWS

To sell this globally, we cannot afford manual interventions or local server disks. We must adopt a distributed, stateless, and 100% managed topology.

Core Components:

  • Compute: Amazon ECS with AWS Fargate. We run the OJS Docker image in a serverless manner. We pay only for the CPU/RAM consumed, which is crucial for maximizing profit margins.
  • Database: Amazon RDS for MySQL. Fully managed, automated backups, and isolated data tiers.
  • File Persistence: OJS requires a persistent directory (files_dir) for articles and PDFs. We mount Amazon EFS (Elastic File System) directly to the ECS Task via the NFS protocol.

⚙️ 2. The "Zero-Touch" Provisioning Engine

To turn this infrastructure into a product, the entire customer lifecycle must be automated. Here is how the provisioning engine works:

*(Diagram: the Zero-Touch provisioning engine)*

  1. The Storefront: A university administrator selects a plan on the landing page and checks out.
  2. The Orchestrator (Golang API): For high concurrency and low footprint, a lightweight backend API built in Go listens for the payment gateway webhook (e.g., Stripe).
  3. The Trigger: The Go API automatically calls the GitHub Actions API, passing the tenant_name, region, and tier as JSON payloads.
  4. The Magic: GitHub Actions runs the Terraform scripts. Within minutes, the isolated AWS infrastructure is provisioned, Route53 updates the DNS (journal.university.edu), and the client receives an automated email with their admin credentials.
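Step 3 above, the Go API calling GitHub's workflow_dispatch endpoint, can be sketched as follows. This is a minimal illustration, not the production handler: the repository name, workflow filename, and token are placeholder values, and in a real handler the request would only be built after verifying the Stripe webhook signature.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// dispatchInputs mirrors the workflow_dispatch inputs declared in the
// GitHub Actions workflow (tenant_name, environment, aws_region).
type dispatchInputs struct {
	TenantName  string `json:"tenant_name"`
	Environment string `json:"environment"`
	AWSRegion   string `json:"aws_region"`
}

// newDispatchRequest builds the POST request for GitHub's
// "create a workflow dispatch event" endpoint. The repo slug,
// workflow file name, and token are caller-supplied examples.
func newDispatchRequest(repo, workflow, token string, in dispatchInputs) (*http.Request, error) {
	body, err := json.Marshal(map[string]any{
		"ref":    "main", // branch holding the Terraform code
		"inputs": in,
	})
	if err != nil {
		return nil, err
	}
	url := fmt.Sprintf("https://api.github.com/repos/%s/actions/workflows/%s/dispatches", repo, workflow)
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Accept", "application/vnd.github+json")
	return req, nil
}

func main() {
	req, _ := newDispatchRequest("acme/ojs-saas", "provision.yml", "ghp_example",
		dispatchInputs{TenantName: "uni-berlin", Environment: "premium", AWSRegion: "eu-central-1"})
	fmt.Println(req.URL.String())
	// The real handler would then send it:
	// resp, err := http.DefaultClient.Do(req)
}
```

GitHub responds with HTTP 204 on success; the workflow run itself is asynchronous, so the API should record the tenant as "provisioning" and confirm completion separately (e.g. polling the runs API or a callback step in the workflow).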

🌍 Level 1: The System Context Diagram (Business View)

Who it's for: Executives, Investors, University Directors, and non-technical Stakeholders.
What it shows: The "Big Picture." It places our OJS SaaS Platform at the center of the universe and focuses exclusively on answering two questions: Who uses the system? and What other systems do we talk to?

  • The Actors: It makes it clear that we have different user profiles (the Admin who pays, the Author who publishes, the Reader who consumes).
  • The External Boundaries: It shows that we didn't build a payment processor from scratch (we use Stripe), we didn't reinvent infrastructure pipelines (we use GitHub Actions), and we integrate with academic industry standards (Crossref).
  • The Value: Here, technology doesn't matter. There is no AWS, Go, or PHP in this diagram. The focus is purely on the business value flow.

*(Diagram: Level 1 — System Context)*

📦 Level 2: The Container Diagram (Solution/Cloud View)

Who it's for: Cloud Architects, DevOps Engineers, Tech Leads, and Security Teams.
What it shows: How the central system (the blue box from Level 1) is divided internally.
(Architectural Note: In the C4 Model, a "Container" means an independently deployable/runnable unit, not necessarily a Docker container, although in our case, they coincide).

  • Separation of Concerns: The division between the Storefront Web (React Front-end), the Provisioning API (our Go engine), and the Tenant Infrastructure (the client's OJS) is crystal clear.
  • Technology Choices: This is where the cloud "magic" appears. We show that OJS runs on ECS Fargate (Compute), uses Amazon RDS (Relational Database), and Amazon EFS (Network persistence for PDFs).
  • Communication Flow: It shows that the Admin doesn't access the database directly; the communication flows from the Browser to the Load Balancer, to the container, and only then to the database via port 3306.

*(Diagram: Level 2 — Container view)*

🧩 Level 3: The Component Diagram (Engineering/Code View)

Who it's for: Software Developers (Backend).
What it shows: We take just one of the containers from Level 2 (in this case, our Go Provisioning API) and open the hood to see how the code is structured internally.

  • Domain Isolation: This is the perfect level to demonstrate Domain-Driven Design (DDD) principles. The Tenant Manager component represents the heart of our domain (where the business rules live).
  • Dependency Inversion: The diagram shows components like the GitHub API Client and the Metadata Repository. As architects, we want our domain (Tenant Manager) to not depend on specific infrastructure implementations, but rather on interfaces (ports and adapters). The visual flow shows how the business rule delegates the heavy I/O lifting to external clients.
  • The Webhook Lifecycle: It explains the exact path of the data: The request hits the Router, passes through validation in the Payment Handler, updates the database via the Metadata Repo, and finally triggers the infrastructure via the GitHub Client.

*(Diagram: Level 3 — Component view)*
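The dependency inversion described above can be sketched in Go. The interface and type names here (`PipelineTrigger`, `TenantRepository`, `TenantManager`) are illustrative, not the actual codebase: the point is that the domain component depends only on ports, with the GitHub client and database driver plugged in as adapters.

```go
package main

import "fmt"

// PipelineTrigger and TenantRepository are the "ports": the Tenant
// Manager depends on these interfaces, never on the concrete GitHub
// API client or database driver (the adapters).
type PipelineTrigger interface {
	Provision(tenant, region, tier string) error
}

type TenantRepository interface {
	Save(tenant, status string) error
}

// TenantManager holds the business rule: record the tenant, then
// delegate the heavy I/O (running Terraform) to the pipeline port.
type TenantManager struct {
	Repo    TenantRepository
	Trigger PipelineTrigger
}

func (m *TenantManager) Onboard(tenant, region, tier string) error {
	if err := m.Repo.Save(tenant, "provisioning"); err != nil {
		return err
	}
	return m.Trigger.Provision(tenant, region, tier)
}

// --- in-memory adapters, standing in for MySQL and the GitHub API ---

type fakeRepo struct{ rows map[string]string }

func (r *fakeRepo) Save(t, s string) error { r.rows[t] = s; return nil }

type fakeTrigger struct{ called []string }

func (g *fakeTrigger) Provision(t, r, tier string) error {
	g.called = append(g.called, t)
	return nil
}

func main() {
	repo := &fakeRepo{rows: map[string]string{}}
	trg := &fakeTrigger{}
	mgr := &TenantManager{Repo: repo, Trigger: trg}
	if err := mgr.Onboard("uni-sp", "sa-east-1", "premium"); err != nil {
		fmt.Println("onboard failed:", err)
		return
	}
	fmt.Println(repo.rows["uni-sp"], len(trg.called)) // prints "provisioning 1"
}
```

Because the domain only sees interfaces, the GitHub Actions adapter can be swapped (for GitLab CI, or a local Terraform runner in tests) without touching the business rules.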

Here is the GitHub Actions workflow that the Go API triggers:

```yaml
name: 'Zero-Touch SaaS Provisioning'
on:
  workflow_dispatch:
    inputs:
      tenant_name: { description: 'Tenant Name', required: true }
      environment: { description: 'Tier (basic/premium)', required: true }
      aws_region: { description: 'Target Region', required: true }

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ github.event.inputs.aws_region }}

      - name: Terraform Init & Apply
        run: |
          terraform init -backend-config="key=${{ github.event.inputs.tenant_name }}.tfstate"
          terraform apply -var-file="envs/${{ github.event.inputs.environment }}.tfvars.json" -auto-approve
```

🌍 3. Global Scaling and Data Compliance

Selling to the global education market means navigating strict data privacy laws. Because our infrastructure is codified with Terraform, deploying a compliant environment is just a matter of dynamically injecting the aws_region variable:

  • GDPR (Europe): If a German university signs up, the automation deploys to eu-central-1 (Frankfurt). The data never leaves Europe.
  • LGPD (Brazil): For Brazilian institutions, the automation targets sa-east-1 (São Paulo).
  • FERPA/HIPAA (USA): Medical universities publishing clinical trials can be hosted in isolated, dedicated environments in the US.
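The region selection described above is a small, pure piece of logic in the Go orchestrator. A minimal sketch, with the region choices taken from the list above but the country codes and default fallback assumed for illustration:

```go
package main

import "fmt"

// complianceRegion maps a customer's country code to the AWS region
// that satisfies its data-residency rules. The EU list and the
// us-east-1 fallback are illustrative assumptions.
func complianceRegion(countryCode string) string {
	switch countryCode {
	case "DE", "FR", "IT", "ES", "NL": // GDPR: data stays in the EU
		return "eu-central-1"
	case "BR": // LGPD: data stays in Brazil
		return "sa-east-1"
	case "US": // FERPA/HIPAA: dedicated US environments
		return "us-east-1"
	default:
		return "us-east-1" // assumed default for unlisted countries
	}
}

func main() {
	fmt.Println(complianceRegion("DE")) // eu-central-1
	fmt.Println(complianceRegion("BR")) // sa-east-1
}
```

The result is passed as the `aws_region` input to the provisioning workflow, so compliance becomes a one-line lookup rather than a manual decision.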

💎 4. Tiered Product Strategy (FinOps)

We can structure the SaaS offerings to capture both ends of the academic spectrum, leveraging AWS FinOps strategies:

Tier 1: High School & Small College Plan (High Volume)

  • Use Case: High school science fairs or small departmental newsletters.
  • Architecture: Shared ECS resources and a single shared RDS instance with logical separation (different schemas per school).
  • FinOps: We use Fargate Spot to utilize spare AWS compute capacity for up to a 70% discount. We also use EventBridge to scale the environments down to zero during weekends and school holidays.

Tier 2: University & Research Institute Plan (High Margin)

  • Use Case: Official university peer-reviewed journals indexing globally with Crossref and DOIs.
  • Architecture: Dedicated ECS Tasks, isolated RDS instances powered by Graviton (ARM) processors for better price/performance, and custom domain mapping.
  • Price Point: Premium recurring revenue with strict SLA guarantees and daily automated snapshots.

🐳 5. Dockerizing OJS

The Docker image is close to a standard containerized PHP setup; the key detail is installing the PHP extensions OJS requires:

```dockerfile
FROM php:8.2-apache

# System dependencies and PHP extensions required for OJS on AWS
RUN apt-get update && apt-get install -y \
    libpng-dev libjpeg-dev libxml2-dev libzip-dev zlib1g-dev libicu-dev g++ \
    && docker-php-ext-configure gd --with-jpeg \
    && docker-php-ext-install gd mysqli opcache zip intl gettext

# Upload optimizations (journals routinely receive large PDF submissions)
RUN { \
    echo 'opcache.memory_consumption=128'; \
    echo 'upload_max_filesize=64M'; \
    echo 'post_max_size=64M'; \
    } > /usr/local/etc/php/conf.d/ojs-optimization.ini

COPY . /var/www/html/
RUN chown -R www-data:www-data /var/www/html/
```

🛠️ 6. Infrastructure as Code (Terraform)

Governance across the AWS environment is managed via Terraform. The state (.tfstate) is securely stored in Amazon S3 with State Locking handled by DynamoDB.

Below is the critical snippet that connects ECS (Fargate) to EFS to persist the PDFs:

```hcl
# 1. Amazon RDS MySQL
resource "aws_db_instance" "ojs_db" {
  identifier             = "ojs-db-${var.environment}"
  allocated_storage      = 20
  engine                 = "mysql"
  engine_version         = "8.0"
  instance_class         = var.db_instance_class
  username               = var.db_user
  password               = var.db_password
  vpc_security_group_ids = [aws_security_group.rds_sg.id]
  db_subnet_group_name   = aws_db_subnet_group.main.name
  skip_final_snapshot    = true
}

# 2. Amazon EFS for the files directory
resource "aws_efs_file_system" "ojs_files" {
  creation_token = "ojs-efs-${var.environment}"
  encrypted      = true
}

# 3. ECS Task Definition (Fargate) with EFS Volume
resource "aws_ecs_task_definition" "ojs_task" {
  family                   = "ojs-app-${var.environment}"
  requires_compatibilities = ["FARGATE"]
  network_mode             = "awsvpc"
  cpu                      = var.fargate_cpu
  memory                   = var.fargate_memory
  # Fargate tasks need an execution role to pull the image from ECR
  # and ship logs (role resource defined elsewhere in the module)
  execution_role_arn = aws_iam_role.ecs_execution.arn

  container_definitions = jsonencode([{
    name  = "ojs-container"
    image = var.ecr_image_url
    portMappings = [{ containerPort = 80 }]
    mountPoints = [{
      sourceVolume  = "ojs-efs-volume"
      containerPath = "/var/www/ojsdata"
    }]
    environment = [
      { name = "MYSQL_HOST", value = aws_db_instance.ojs_db.address }
    ]
  }])

  volume {
    name = "ojs-efs-volume"
    efs_volume_configuration {
      file_system_id     = aws_efs_file_system.ojs_files.id
      transit_encryption = "ENABLED"
    }
  }
}
```

Variable Injection (FinOps)

We define instance sizes via JSON (prod.tfvars.json) to control costs per environment:

```json
{
  "environment": "prod",
  "region": "us-east-1",
  "fargate_cpu": "1024",
  "fargate_memory": "2048",
  "db_instance_class": "db.t4g.medium",
  "db_user": "ojsadmin"
}
```

🚀 7. CI/CD with GitHub Actions

The automated deployment pipeline checks out the code, configures AWS credentials, and updates the infrastructure via Terraform.

```yaml
name: 'Deploy OJS AWS'
on:
  workflow_dispatch:
    inputs:
      tenant_name: { description: 'Tenant Name', required: true }
      environment: { description: 'Environment (dev/hom/prod)', required: true }

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1

      - name: Terraform Init (S3 Backend)
        run: |
          terraform init \
            -backend-config="bucket=ojs-terraform-state-central" \
            -backend-config="key=${{ github.event.inputs.tenant_name }}.${{ github.event.inputs.environment }}.tfstate" \
            -backend-config="dynamodb_table=terraform-lock"

      - name: Terraform Apply
        env:
          TF_VAR_db_password: ${{ secrets.DB_PASSWORD }}
        run: |
          terraform apply \
            -var-file="envs/${{ github.event.inputs.environment }}.tfvars.json" \
            -auto-approve
```

💰 8. FinOps: Cost Optimization on AWS

AWS provides fantastic mechanisms to slash costs in non-production environments:

  1. Fargate Spot: In Dev/Hom environments, we configure the ECS Capacity Provider to use Fargate Spot, utilizing spare AWS compute capacity for up to a 70% discount.
  2. Graviton (ARM): We use RDS instances powered by Graviton processors (db.t4g), which deliver better performance at a lower price point than the x86 architecture.
  3. Scheduling Automation: We leverage Amazon EventBridge to trigger an AWS Lambda function that stops the RDS instance and scales the ECS Desired Count to 0 outside of regular business hours.

🎯 Conclusion

By bridging the gap between software architecture and the needs of the classroom, we can transform Open Journal Systems from a complex IT burden into a seamless, globally scalable SaaS product. The combination of AWS ECS Fargate for horizontal scalability, Go for rapid API orchestration, and Terraform for compliant IaC delivers a true, zero-touch EdTech platform.

Have you ever thought about productizing open-source software into a niche B2B SaaS? Share your thoughts and architectural approaches in the comments! 👇
