In mid-sized and large organizations, the focus is on continuous innovation backed by stability. The former is required to stay ahead of the curve (being better than one's competitors), while the latter is the springboard for delivery. A strong platform is therefore a clear need for every technology company.
This brings us to a concept called Platform Engineering.
So what is it, actually?
One definition puts it this way:
Platform engineering is the process through which enterprises adopt (new technology and platforms), leverage (existing technologies and platforms), and transform (shift the dial on delivering value by transforming the way things are done) cloud platforms. It is at the core of designing, building, and operating your cloud infrastructure to deliver the next generation IT ecosystem.
While the above definition is accurate, it does not really help in understanding the why, what and how of building reliable platforms, especially for teams looking to either augment or revamp their infrastructure from old ways to new.
Therefore, I felt I could contribute by creating a guide from my know-how so far. For easier understanding, each category below is structured along the following parameters: why, some points to consider, known tools and alternatives.
To build a smooth platform, most e-commerce companies need tooling/tech/processes/workflows in the following categories.
Domains
Why?: Branding of one's product. Everything on the internet begins with a domain, e.g. google.com, facebook.com, dev.to
Some points to consider?:
- Domain pricing / renewals (with ICANN fees)
- Whois Privacy
- The right name and Top Level Domain (TLD), or a country-code TLD if it's region-specific. (Impacts SEO)
Known tools: Bigrock.in
Alternatives: Namecheap, Route53, Cloudflare, Google Registry
Cloud Platform
Why?: It is the gateway to the system; the right infrastructure acts as a strong backbone.
Some points to consider?:
- Basic features such as instances/VMs, isolated networks (not everyone will need Kubernetes or Mesos)
- Sufficient Capacity for one's enterprise at scale (The last thing one wants in the middle of a frantic user rush is no hardware available)
- Multiple accounts with RBAC to distinguish between teams/environments.
- Compatibility with Infrastructure-as-Code tools
- Effective Pricing.
Known tools: Amazon Web Services (AWS)
Alternatives: Google Cloud Platform, Microsoft Azure Cloud, DigitalOcean, Linode
Infrastructure as Code
Why?: Codifying infrastructure provisioning and configuration keeps it identical across environments.
Some points to consider?:
- Idempotency: applying the same configuration repeatedly should change or delete infrastructure only once, with later runs making no further changes.
- Ability to store state locally & remotely.
- Should be vendor-neutral & serve as a crisp log for audit
Known tools: Terraform
Alternatives: Pulumi, AWS CloudFormation, Google Deployment Manager, Azure Resource Manager
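To make the idempotency point a little more concrete, here is a minimal sketch in Python. It is not how Terraform or any other tool above is implemented; `FakeCloud` and `apply` are hypothetical names invented for this illustration, and the "cloud" is just an in-memory dictionary. The idea shown is that the tool compares the desired state with what already exists and only acts on the difference.

```python
from dataclasses import dataclass, field


@dataclass
class FakeCloud:
    """In-memory stand-in for a cloud provider API (hypothetical)."""
    resources: dict = field(default_factory=dict)
    api_calls: int = 0  # counts mutating calls only

    def create(self, name, spec):
        self.resources[name] = dict(spec)
        self.api_calls += 1

    def update(self, name, spec):
        self.resources[name] = dict(spec)
        self.api_calls += 1

    def delete(self, name):
        del self.resources[name]
        self.api_calls += 1


def apply(cloud, desired):
    """Converge the cloud toward `desired` and return the changes made."""
    changes = []
    for name, spec in desired.items():
        if name not in cloud.resources:
            cloud.create(name, spec)
            changes.append(("create", name))
        elif cloud.resources[name] != spec:
            cloud.update(name, spec)
            changes.append(("update", name))
    # Anything that exists but is no longer declared gets removed.
    for name in list(cloud.resources):
        if name not in desired:
            cloud.delete(name)
            changes.append(("delete", name))
    return changes
```

Calling `apply` twice with the same `desired` dictionary makes changes on the first run and returns an empty list on the second, which is the behaviour to look for when evaluating an Infrastructure-as-Code tool.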
Version Control
Why?: Managing codebases with multiple teams across varied projects in different business verticals.
Some points to consider?:
- Decentralized (avoids a single point of failure)
- Clear history log (for updates/rollback)
- Branching/separating strategies.
- Ease of setup
Known tools: Git
Alternatives: Mercurial, TFS, SVN
Packaging
Why?: Software should be the same across environments. This helps in the CI/CD step and in reproducing issues so they can be fixed quicker.
Some points to consider?:
- Application storage size (No point in having GB-sized containers)
- Ability to generate artifacts (if any)
Known tools: Docker
Alternatives: containerd, LXC, APT, Yum
CI/CD
Why?: To compile/build and deploy software easily. Manual steps can be time-consuming and error-prone, and can become a bottleneck.
Some points to consider?:
- Ability to build/deploy multiple programming languages, frameworks
- Support for unit/smoke tests, canary deployments.
- Preferably independent of the infrastructure (helps avoid vendor lock-in)
- Self-hosted vs SaaS (factors here are cost, upgrades and maintenance)
Known tools: Jenkins
Alternatives: GitHub Actions, GitLab CI, CircleCI, Travis CI, Argo CD
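As a rough illustration of what a canary deployment needs from the tooling, the sketch below decides which users see the new release. It is only one possible approach (hash-based bucketing), not something the tools listed above are known to do internally; the function name `in_canary` and its parameters are made up for this example.

```python
import hashlib


def in_canary(user_id: str, percent: int) -> bool:
    """Return True if this user should be routed to the canary release.

    The user id is hashed into a stable bucket from 0 to 99, so the same
    user always gets the same answer for a given percentage.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent
```

Because the bucket is stable, raising `percent` from 10 to 50 keeps every user who was already on the canary there and only adds new ones, which makes a gradual rollout easier to reason about.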
Web-Services
Why?: Helps your userbase access all the consumer endpoints of the system, e.g. discover, checkout, payments
Some points to consider?:
- Scale management (An incorrectly set up web-service can quickly have a cascading effect in case of heavy traffic)
- Reverse proxying to one/multiple back-ends and fail-overs (for High Availability)
- Support for HTTP/gRPC protocols.
- Advanced features such as rate-limits, throttling, IP bans.
Known tools: HAProxy
Alternatives: NGINX, Apache, Ambassador, Envoy, Istio
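Rate-limits and throttling are usually configured in the proxy rather than written by hand, but a small sketch helps show what such a feature does. The token bucket below is one common technique and is an assumption on my part; it is not taken from HAProxy or any tool listed above. The class name `TokenBucket` and its parameters are hypothetical.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so the limiter can be tested
        self.tokens = float(capacity)  # start with a full bucket
        self.last = clock()

    def allow(self, cost=1.0):
        """Return True if the request may proceed, False if throttled."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill for the time that has passed, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

In practice one bucket would be kept per client, for example in a dictionary keyed by IP address, so that a single noisy client is throttled without affecting everyone else.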
Databases
Why?: To maintain the state of applications and store all data such as inventory, payments and other types of relational information. Helpful for data analytics as well.
Some points to consider?:
- ACID Compliance
- Libraries/connectors for multiple programming languages.
- Storage parameters such as in-memory or persistent to disk.
- Ease of backup and recovery, e.g. during DR drills.
- Key-value stores for relatively smaller datasets.
Known tools: MySQL or MariaDB
Alternatives: PostgreSQL, MSSQL, MongoDB, Kafka, Aerospike, Redis, Memcached
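To show why ACID compliance matters for something like checkout, here is a minimal sketch using Python's built-in `sqlite3` module, chosen only because it needs no setup; it is not one of the databases recommended above. The table layout and the helper names `make_db` and `place_order` are invented for this example. A payment is recorded and stock is decremented in one transaction, so if stock would go negative, the payment is rolled back too.

```python
import sqlite3


def make_db():
    """Create an in-memory database with one item in stock."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE inventory ("
        "item TEXT PRIMARY KEY, "
        "stock INTEGER NOT NULL CHECK (stock >= 0))"
    )
    conn.execute("CREATE TABLE payments (item TEXT NOT NULL, amount REAL NOT NULL)")
    conn.execute("INSERT INTO inventory VALUES ('book', 1)")
    conn.commit()
    return conn


def place_order(conn, item, amount):
    """Record a payment and decrement stock as a single atomic unit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "INSERT INTO payments (item, amount) VALUES (?, ?)",
                (item, amount),
            )
            # Violates the CHECK constraint when no stock is left.
            conn.execute(
                "UPDATE inventory SET stock = stock - 1 WHERE item = ?",
                (item,),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With one book in stock, the first `place_order(conn, "book", 9.99)` succeeds and the second fails without leaving a stray row in `payments`.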
Monitoring
Why?: A small system that is monitored will be more reliable than a big one that is not.
Some points to consider?:
- Agents/libraries across multiple programming languages.
- Ability to view application/infrastructure metrics.
- Retention as per compliance.
Known tools: NewRelic
Alternatives: Datadog, AppSignal, Instana, ElasticAPM
TL;DR: While this is not a definitive list of everything required to build a system, there's a good chance this blog post gives you an entry point to begin with.
What are some pointers you would consider to build a reliable, consistent and easy-to-use platform?