<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ahmet Öz</title>
    <description>The latest articles on DEV Community by Ahmet Öz (@ahmetoz).</description>
    <link>https://dev.to/ahmetoz</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F377575%2F8a11952f-c502-49ac-9bfc-ff2df64abee6.png</url>
      <title>DEV Community: Ahmet Öz</title>
      <link>https://dev.to/ahmetoz</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/ahmetoz"/>
    <language>en</language>
    <item>
      <title>Importance of limiting the concurrency.</title>
      <dc:creator>Ahmet Öz</dc:creator>
      <pubDate>Sun, 28 Feb 2021 15:24:43 +0000</pubDate>
      <link>https://dev.to/ahmetoz/importance-of-limiting-the-concurrency-3nih</link>
      <guid>https://dev.to/ahmetoz/importance-of-limiting-the-concurrency-3nih</guid>
      <description>&lt;p&gt;Concurrent programming techniques are the most used approach to structure the applications since the multicore processors are invented, and it's the most effective way to speed your applications when the number of tasks is much greater than the number of CPUs especially when the coordination of these smaller tasks requires network communication. For instance, using concurrent programming a CPU will do other things after triggering an HTTP request instead of blocking the CPU until the response is fulfilled. In this article, I want to explain the importance of putting limits on concurrent requests.&lt;/p&gt;

&lt;p&gt;The behavior of concurrent requests in a server can be analyzed with a well-known result from queueing theory, Little’s Law: &lt;code&gt;L = λW&lt;/code&gt;. We can adapt this law to a typical client-server architecture in which requests arrive at a server from the client, are processed for some time in the server, and are then returned to the client. The terms of the formula are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;L - the level of concurrency&lt;/strong&gt;; the number of requests the server is currently processing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;λ - throughput&lt;/strong&gt;; the number of requests the server completes per second. Only completed requests count, because requests that pile up in the server do not contribute to throughput.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;W - average latency&lt;/strong&gt;; the average time (delay) spent handling a request.&lt;/li&gt;
&lt;/ul&gt;
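
&lt;p&gt;A quick worked example of the law (the numbers are hypothetical): if the server holds 100 requests in flight and the average latency per request is 50 ms, the sustained throughput is fixed at λ = L / W:&lt;/p&gt;

```java
public class LittlesLaw {
    public static void main(String[] args) {
        // L = λW, so λ = L / W.
        double concurrency = 100.0;  // L: requests currently in flight (hypothetical)
        double latencySec  = 0.05;   // W: average latency of 50 ms per request
        double throughput  = concurrency / latencySec; // λ: completed requests per second
        System.out.println(throughput); // prints 2000.0
    }
}
```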

&lt;p&gt;As you can imagine, a change in one of these quantities influences the other two, and thus the entire system. So let's play around with the formula to see the effects. In terms of performance, we care about increasing throughput, that is, the number of things a server can do per second. If we want to increase &lt;strong&gt;λ - throughput&lt;/strong&gt; at the same concurrency limit, we could try to decrease latency, but &lt;em&gt;most of the time&lt;/em&gt; that is beyond our control: apart from typical algorithmic optimization, processing a request often means accessing a database or another service over the network, so we usually cannot control how long the response takes.&lt;/p&gt;

&lt;p&gt;The other way to increase throughput is to increase &lt;strong&gt;L - the level of concurrency&lt;/strong&gt;, given an average latency that is usually more or less constant per system. But how high can concurrency go? Increasing it means increasing the load on the actual server, and if the server cannot handle that many requests at the same time and no specific mechanism is in place (by default) to handle the overload, such as rejecting or queueing requests, connections may drop because requests time out. Worse, the machine running the server may crash or be throttled after running out of memory or hitting CPU limits.&lt;/p&gt;

&lt;p&gt;In conclusion, it’s important to note that concurrency limits need to be enforced at the server level, and if you don't control the clients, the best-performing concurrency limit of the server/API should be documented. If you do control the client, it should send requests under a sensible concurrency limit so it does not overload the servers.&lt;/p&gt;
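
&lt;p&gt;As a minimal sketch of enforcing such a limit on the client side (the class name and the limit of 5 are made up for illustration), a counting semaphore caps the number of in-flight requests:&lt;/p&gt;

```java
import java.util.concurrent.Semaphore;

public class LimitedClient {
    // Hypothetical limit: allow at most 5 requests in flight at once.
    private static final Semaphore PERMITS = new Semaphore(5);

    static void send(int id) throws InterruptedException {
        PERMITS.acquire();      // blocks when 5 requests are already in flight
        try {
            Thread.sleep(50);   // stand-in for the real network call
        } finally {
            PERMITS.release();  // free a slot for the next waiting request
        }
    }

    public static void main(String[] args) throws Exception {
        // 20 tasks compete for 5 permits, so concurrency never exceeds 5.
        Thread[] workers = new Thread[20];
        for (int i = 0; i != workers.length; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                try {
                    send(id);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) t.join();
        System.out.println("done");
    }
}
```

&lt;p&gt;The same idea applies server-side: acquire a permit when a request arrives and reject or queue the request when none is available.&lt;/p&gt;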

</description>
      <category>architecture</category>
      <category>systems</category>
      <category>performance</category>
    </item>
    <item>
      <title>My code review checklist.</title>
      <dc:creator>Ahmet Öz</dc:creator>
      <pubDate>Sat, 26 Dec 2020 13:34:30 +0000</pubDate>
      <link>https://dev.to/ahmetoz/my-code-review-checklist-1o56</link>
      <guid>https://dev.to/ahmetoz/my-code-review-checklist-1o56</guid>
      <description>&lt;p&gt;Code reviews are a critical and legitimate activity that needs time and focus.&lt;/p&gt;

&lt;p&gt;I was thinking recently about the reviews that I was doing and created a checklist for me to adjust before submitting the comments to the pull requests. &lt;/p&gt;

&lt;h2&gt;Review:&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Check out the code and run the new tests manually; try to verify that the code handles edge cases.&lt;/li&gt;
&lt;li&gt;Ensure the code covers all requirements.&lt;/li&gt;
&lt;li&gt;Check the code coverage results and ensure the important parts of the code are tested.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Comments:&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;When requesting changes, explain them clearly. Offer suggestions or ask questions instead of giving a direct todo, or even worse, just pasting sample code.&lt;/li&gt;
&lt;li&gt;Don't focus too much on coding style or on things like making the code a one-liner.&lt;/li&gt;
&lt;li&gt;Avoid bare LGTM comments if possible; if everything is right, write a thank-you for the work.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Reply to comments:&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Always be kind to the developer but strict with the code.&lt;/li&gt;
&lt;li&gt;Avoid confusing statements.&lt;/li&gt;
&lt;li&gt;Do not criticize or debate architectural decisions in a long comment thread; instead, request a pair review and hold the discussion verbally, then write the decisions down as a comment to make them visible to the team.&lt;/li&gt;
&lt;/ul&gt;

</description>
    </item>
    <item>
      <title>k8s makes deployment easy</title>
      <dc:creator>Ahmet Öz</dc:creator>
      <pubDate>Fri, 01 May 2020 17:05:32 +0000</pubDate>
      <link>https://dev.to/ahmetoz/k8s-makes-deployment-easy-363d</link>
      <guid>https://dev.to/ahmetoz/k8s-makes-deployment-easy-363d</guid>
      <description>&lt;p&gt;A couple of months ago I was trying to understand and describe the value of Kubernetes, starting to read about it and come off some notes as below, maybe it could be helpful for other developers to understand the value of it. For me, as described in the title I described it to my self as &lt;strong&gt;k8s makes the deployment easy&lt;/strong&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Kubernetes does the things that the very best system administrator would do: automation, failover, centralized logging, monitoring. It takes what we've learned in the DevOps community and makes it the default, out of the box." &lt;br&gt;
@kelseyhightower&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Docker makes development easy&lt;/strong&gt;: it lets us build containers locally, push them to or pull them from a container registry, and run container images locally on our machines. &lt;strong&gt;Kubernetes makes deployment easy&lt;/strong&gt;: it lets us orchestrate clusters in a pragmatic way. In fact, after the adoption of tools like Docker and k8s, there is no longer a big distinction between software engineers and operations engineers. It’s all just software now, and we’re all engineers. These tools improve the quality of the process, as expressed by Amazon CTO Werner Vogels: &lt;strong&gt;"You build it, you run it"&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;A k8s cluster has a brain called the control plane. It handles all the cluster's tasks for us: scheduling containers, managing services, serving API requests, and so on. The control plane runs on master nodes, while the other workloads of the cluster run on worker nodes. For a typical production use case, k8s needs at least 3 master nodes (virtual machines) to avoid a single point of failure. Most professionals also suggest not managing clusters manually; the reason is simple: cluster management is not an easy task, it requires a significant investment of time, effort, and expertise, and it may not be cost-effective. They suggest using a managed service instead. The "run less software" philosophy is good advice to consider before you decide to run Kubernetes on-premise. For learning/testing purposes we can use &lt;code&gt;Docker Desktop&lt;/code&gt;, which lets us run a small (single-node) Kubernetes cluster on our laptops.&lt;/p&gt;

&lt;p&gt;The Kubernetes control plane components work together as a state machine. K8s runs a process called the &lt;code&gt;reconciliation loop&lt;/code&gt; that tries to reconcile the actual state with the desired state. The desired state is stored in an internal database (etcd) as a specification/manifest for each k8s resource, such as a Deployment, and controller components continually compare the desired state of each resource with the actual state and make the adjustments needed to keep them in sync. The k8s scheduler on a master node watches for unscheduled pods (desired state) and assigns a node to each pod; after scheduling, the &lt;code&gt;kubelet&lt;/code&gt; process running on that node picks the pod up and takes care of actually starting its containers.&lt;/p&gt;
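
&lt;p&gt;As an illustrative desired-state manifest (the names and image here are hypothetical), a minimal Deployment asks the reconciliation loop to keep three replicas of a container running:&lt;/p&gt;

```yaml
# Hypothetical manifest: Kubernetes stores this desired state in etcd,
# and the deployment controller keeps three matching pods running.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo            # hypothetical name
spec:
  replicas: 3           # desired state: three pods at all times
  selector:
    matchLabels:
      app: demo
  template:
    metadata:
      labels:
        app: demo
    spec:
      containers:
        - name: demo
          image: nginx:1.25   # any container image
```

&lt;p&gt;If a pod dies, the actual state (two pods) no longer matches the desired state (three), and the controller starts a replacement.&lt;/p&gt;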

</description>
      <category>kubernetes</category>
    </item>
    <item>
      <title>Thread safety issues in Java</title>
      <dc:creator>Ahmet Öz</dc:creator>
      <pubDate>Fri, 01 May 2020 16:42:43 +0000</pubDate>
      <link>https://dev.to/ahmetoz/thread-safety-issues-in-java-4dkn</link>
      <guid>https://dev.to/ahmetoz/thread-safety-issues-in-java-4dkn</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Note: The post content and ideas aggregated from various resources and mainly Brian Goetz's excellent book &lt;strong&gt;Java Concurrency in Practice&lt;/strong&gt;. If you enjoy these notes below, I strongly suggest purchasing the book!&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Writing concurrent programs is, most of the time, primarily about managing shared mutable data/state between threads.&lt;br&gt;
Because threads in the same process share the same memory address space (the heap), managing shared data/state is essential to get predictable results from a running application; this is largely what makes a multithreaded application harder to build than a single-threaded one. The penalties for failing to synchronize shared mutable data are liveness and safety failures, so as developers we need to write code that synchronizes access to shared data and that documents the synchronization policy for both clients and maintainers of the code/library. &lt;/p&gt;

&lt;h2&gt;Atomicity&lt;/h2&gt;

&lt;p&gt;Operations on shared state in a concurrent application should be &lt;strong&gt;atomic&lt;/strong&gt;; if we don't use synchronization mechanisms, concurrent access might cause &lt;strong&gt;race conditions&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Race Conditions&lt;/strong&gt;: a semantic error condition, mostly related to the timing of instructions. Thread scheduling algorithms can swap between threads at runtime (context switching), or threads may be made to wait at the kernel level, so the instructions of a multithreaded application are not guaranteed to execute in a sequential manner. Multiple threads can race to access and change the same data at the same time, so correctness depends on &lt;strong&gt;timing luck&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;Most of the time atomicity problems are not easy to predict, and they mostly happen on compound actions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;read-modify-write 

&lt;ul&gt;
&lt;li&gt;for example incrementing a counter; the actions are: read the value, increment it, and write it back to memory.&lt;/li&gt;
&lt;li&gt;to avoid such cases we could use the &lt;strong&gt;AtomicInteger&lt;/strong&gt; class in the &lt;code&gt;java.util.concurrent.atomic&lt;/code&gt; package.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;check-then-act 

&lt;ul&gt;
&lt;li&gt;for example lazy initialization of an expensive object like creating a database connection object.&lt;/li&gt;
&lt;li&gt;actions: first check whether the object has already been assigned (a null check), and only then create it.&lt;/li&gt;
&lt;li&gt;to avoid such cases we could use

&lt;ul&gt;
&lt;li&gt;a double-checked locking mechanism&lt;/li&gt;
&lt;li&gt;an enum, since enums are lazily initialized by default and all synchronization is handled by the JVM.&lt;/li&gt;
&lt;/ul&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
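
&lt;p&gt;The two compound actions above can be sketched in Java as follows (the class and field names are made up for illustration): &lt;code&gt;incrementAndGet&lt;/code&gt; makes read-modify-write a single atomic operation, and the enum makes the check-then-act initialization safe without explicit locking:&lt;/p&gt;

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CompoundActions {
    // read-modify-write made atomic: incrementAndGet is one atomic operation,
    // so no increments are lost even under concurrent access.
    static final AtomicInteger counter = new AtomicInteger(0);

    // check-then-act made safe: the JVM guarantees the enum instance is
    // created lazily and at most once, with all synchronization handled for us.
    enum ConnectionHolder {
        INSTANCE;
        final String connection = "connected"; // stand-in for an expensive object

        static String get() { return INSTANCE.connection; }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = new Thread[4];
        for (int i = 0; i != threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j != 1000; j++) counter.incrementAndGet();
            });
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        System.out.println(counter.get());          // always 4000, never a lost update
        System.out.println(ConnectionHolder.get()); // prints "connected"
    }
}
```

&lt;p&gt;With a plain &lt;code&gt;int&lt;/code&gt; and &lt;code&gt;counter++&lt;/code&gt;, the final value would often be lower than 4000 because increments by racing threads overwrite each other.&lt;/p&gt;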

&lt;h2&gt;Visibility&lt;/h2&gt;

&lt;p&gt;Beyond atomicity, we need to make sure that a change to shared state is visible to other threads, because the threads are using the same memory. In order to ensure the visibility of memory writes across threads, we must use synchronization.&lt;/p&gt;

&lt;p&gt;Solutions to adjust visibility between threads:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A weaker form of memory visibility using the Java &lt;code&gt;volatile&lt;/code&gt; keyword. When an update occurs to a volatile variable, the JVM ensures the change is propagated to main memory, so other threads can see the new value without needing explicit synchronization. It's weak because it does not guarantee atomicity. Most use cases of volatile are not practical; it's mainly useful for storing a state flag of an object in a boolean variable like initialized, stopped, etc.&lt;/li&gt;
&lt;li&gt;A better solution is to use the same common locking mechanism for both reading and writing threads, because synchronization has no effect unless both read and write operations are synchronized.&lt;/li&gt;
&lt;/ul&gt;
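
&lt;p&gt;A minimal sketch of both solutions (the class and names are made up for illustration): a &lt;code&gt;volatile&lt;/code&gt; boolean used as a stop flag, and a counter guarded by one common lock used by both readers and writers:&lt;/p&gt;

```java
public class VisibilityDemo {
    // volatile guarantees that a write by one thread becomes visible to others,
    // but it does NOT make compound actions like count++ atomic.
    static volatile boolean stopped = false;

    static int count = 0;
    static final Object lock = new Object();

    // readers and writers must synchronize on the SAME lock,
    // otherwise the synchronization has no visibility effect.
    static void increment() { synchronized (lock) { count++; } }
    static int read()       { synchronized (lock) { return count; } }

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (!stopped) increment(); // exits once the write to stopped is visible
        });
        worker.start();
        Thread.sleep(50);
        stopped = true;                   // visible to the worker without explicit locking
        worker.join();
        System.out.println(read() >= 0);  // prints "true"
    }
}
```

&lt;p&gt;Without &lt;code&gt;volatile&lt;/code&gt; (or a lock around the flag), the worker thread might never observe the write to &lt;code&gt;stopped&lt;/code&gt; and loop forever.&lt;/p&gt;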

&lt;h3&gt;Thread safety issues could be solved&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;By using immutable data structures. &lt;/li&gt;
&lt;li&gt;By not sharing any states between threads. &lt;em&gt;Stateless&lt;/em&gt; objects or using pure functions without side effects.&lt;/li&gt;
&lt;li&gt;By using Java's locking features: intrinsic locks via the &lt;code&gt;synchronized&lt;/code&gt; keyword, explicit locks, and the &lt;code&gt;volatile&lt;/code&gt; keyword.&lt;/li&gt;
&lt;li&gt;By delegating thread safety to well-implemented higher level concurrency utilities in &lt;code&gt;java.util.concurrent&lt;/code&gt; package.&lt;/li&gt;
&lt;li&gt;By not sharing the object reference.&lt;/li&gt;
&lt;li&gt;By returning defensive copies of a collection instead of returning the mutable collection itself.&lt;/li&gt;
&lt;li&gt;By using thread confinement, where an object can only be accessed by the current thread, e.g. &lt;code&gt;ThreadLocal&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;By using encapsulation and locks for securing not thread-safe objects.&lt;/li&gt;
&lt;/ul&gt;
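
&lt;p&gt;For instance, defensive copying can be sketched like this (a hypothetical class; arrays keep the example short): copies are taken both on the way in and on the way out, so the mutable internal state is never shared with callers:&lt;/p&gt;

```java
public class DefensiveCopy {
    private final String[] data;

    public DefensiveCopy(String[] initial) {
        this.data = initial.clone(); // copy in: the caller's later mutations don't leak in
    }

    public String[] getData() {
        return data.clone();         // copy out: callers get a snapshot, not the internal array
    }

    public static void main(String[] args) {
        String[] src = {"a", "b"};
        DefensiveCopy dc = new DefensiveCopy(src);
        src[0] = "x";                // does not affect dc's internal state
        String[] out = dc.getData();
        out[1] = "y";                // does not affect dc's internal state either
        System.out.println(dc.getData()[0] + dc.getData()[1]); // prints "ab"
    }
}
```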

</description>
      <category>java</category>
    </item>
  </channel>
</rss>
