How do you handle traffic scaling for a monolithic application? Do you just copy the application onto different machines and let a load balancer distribute the traffic?
A couple of ways.
Vertically scale (add more computing power to a server) and then tinker with your app server's scaling properties. For example, with Puma (a popular Rails app server) you can increase its workers and threads, which in turn use more system resources.
Horizontally scale (add more servers) and then load balance them like you mentioned.
You could also do a combination of both.
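For the Puma case, the knobs live in `config/puma.rb`. Here's a minimal sketch, assuming the conventional `WEB_CONCURRENCY` and `RAILS_MAX_THREADS` environment variables; the default numbers are placeholders I picked, not recommendations, so tune them to the cores and memory you actually have.

```ruby
# config/puma.rb (sketch; the default values below are assumptions)

# Number of worker processes. A common starting point is one per CPU core.
workers Integer(ENV.fetch("WEB_CONCURRENCY", 2))

# Minimum and maximum threads per worker.
max_threads = Integer(ENV.fetch("RAILS_MAX_THREADS", 5))
threads max_threads, max_threads

# Load the app before forking workers so they can share memory.
preload_app!
```

More workers and threads means more memory and more database connections, so the database connection pool usually has to grow along with them.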
For horizontal scaling, I think there may be several technical issues.
For most applications, the performance bottleneck would be the database.
The transactional issues could be handled with pessimistic or optimistic locking, I think.
With microservices, I often have a question about sessions: how can I get session info across different microservice components?
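To make the optimistic option concrete, here is a rough sketch in plain Ruby. `AccountStore`, `StaleRecordError` and `withdraw` are hypothetical names, and the in-memory hash just stands in for a table row with a version column. The idea: read the row with its version, and only write if the version is still the one you read; otherwise retry from a fresh read.

```ruby
class StaleRecordError < StandardError; end

# In-memory stand-in for one table row that carries a version column.
class AccountStore
  def initialize(balance)
    @row = { balance: balance, version: 0 }
    @mutex = Mutex.new
  end

  # Returns a snapshot copy of the row, like a SELECT would.
  def read
    @mutex.synchronize { @row.dup }
  end

  # Mimics: UPDATE ... SET balance = ?, version = version + 1 WHERE version = ?
  def write(new_balance, expected_version)
    @mutex.synchronize do
      unless @row[:version] == expected_version
        raise StaleRecordError, "row changed since it was read"
      end
      @row = { balance: new_balance, version: expected_version + 1 }
    end
  end
end

# Read, compute, write. If another writer got in first, re-read and try again.
def withdraw(store, amount, max_retries: 3)
  attempts = 0
  begin
    row = store.read
    store.write(row[:balance] - amount, row[:version])
  rescue StaleRecordError
    attempts += 1
    retry if attempts <= max_retries
    raise
  end
end
```

In Rails you wouldn't write this by hand: adding a `lock_version` column to a table turns on Active Record's optimistic locking, and a conflicting save raises `ActiveRecord::StaleObjectError`. Pessimistic locking goes the other way and holds a row lock in the database for the duration of the transaction.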
You can use a load balancer that supports sticky sessions; that way, users are always routed to the same server.
Your cache would typically live outside of your application instance. For example, with Rails you could use Redis as your cache backend, and Redis would be running on its own server that's unrelated to your load balancer.
That's typically what people mean when they say they run "stateless" servers. For horizontally scaling most web applications, you want to keep them as dumb as possible. They should be disposable. I would strive for that even if you plan to do a small single server deployment.
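Here's a small plain-Ruby sketch of what "stateless" buys you for sessions. `SharedSessionStore` and `AppServer` are hypothetical classes; the store is an in-memory stand-in for something external like Redis, and keeping sessions there (rather than only cached data) is my assumption. Because the session lives outside the app instance, any server behind the load balancer can handle the next request.

```ruby
require "securerandom"

# Stand-in for an external store such as Redis, shared by every app server.
class SharedSessionStore
  def initialize
    @data = {}
  end

  def write(session_id, session)
    @data[session_id] = session
  end

  # Unknown session ids read back as an empty session.
  def read(session_id)
    @data.fetch(session_id, {})
  end
end

# An app server that keeps no session state of its own.
class AppServer
  def initialize(name, store)
    @name = name
    @store = store
  end

  # Creates a session in the shared store and returns its id (the cookie value).
  def login(user_id)
    session_id = SecureRandom.hex(16)
    @store.write(session_id, { user_id: user_id })
    session_id
  end

  # Looks the user up through the shared store, not local memory.
  def current_user(session_id)
    @store.read(session_id)[:user_id]
  end
end

store = SharedSessionStore.new
web1 = AppServer.new("web-1", store)
web2 = AppServer.new("web-2", store)

session_id = web1.login(42)          # first request lands on web-1
user_id = web2.current_user(session_id) # next request lands on web-2
```

You can throw away `web1` at any point and nothing is lost, which is the "disposable" part.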
I've dealt with some pretty big web apps before. A single SQL database can go a really really really really long ways as long as you avoid silly mistakes like N+1 queries and understand how to profile slow queries on demand. You can also cache expensive queries to avoid hitting your DB entirely.
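If N+1 is a new term, this plain-Ruby sketch shows the shape of it. `FakeDB` is a hypothetical in-memory "database" that just counts the queries it receives, and the post and author data are made up for illustration.

```ruby
# Hypothetical in-memory "database" that counts how many queries it receives.
class FakeDB
  attr_reader :query_count

  def initialize
    @posts = [
      { id: 1, author_id: 10 },
      { id: 2, author_id: 11 },
      { id: 3, author_id: 10 }
    ]
    @authors = { 10 => "Ann", 11 => "Bob" }
    @query_count = 0
  end

  def all_posts
    @query_count += 1
    @posts
  end

  # One query per author lookup.
  def author(id)
    @query_count += 1
    @authors[id]
  end

  # One query for a whole set of authors (think WHERE id IN (...)).
  def authors(ids)
    @query_count += 1
    @authors.slice(*ids)
  end
end

# N+1: one query for the posts, then one more query per post.
def author_names_n_plus_one(db)
  db.all_posts.map { |post| db.author(post[:author_id]) }
end

# Batched: one query for the posts, one for all of their authors.
def author_names_batched(db)
  posts = db.all_posts
  authors = db.authors(posts.map { |post| post[:author_id] }.uniq)
  posts.map { |post| authors[post[:author_id]] }
end
```

With 3 posts the first version issues 4 queries and the second issues 2, and the gap grows with every extra row. In Rails the usual fix is eager loading with `includes` on the query.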