5 Performance metrics every system architect should know

First, we need to understand what a system architect does:

The goal of system architects is to design and oversee the development of IT infrastructure that supports business goals.

There are 5 important metrics useful for measuring service performance

Figure 1. A simple request, process, response sequence between two services.

1. Latency (first-byte)

The first important metric is first-byte latency. This is the time it takes to process the smallest possible input (typically a single byte), measured from the start of the request to the end of the response. It is the figure usually quoted in hardware or system specifications (e.g. disk latency, memory access latency).

2. Latency (end-to-end)

End-to-end latency, like first-byte latency, is the time from the start to the finish of a transaction, but it is measured for the actual input rather than the smallest possible one. The two differ because processing time depends on the size of the input. For example, multiplying two 1000x1000 matrices takes longer than multiplying two 2x2 matrices.
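
To make the size dependence concrete, here is a minimal sketch that counts the scalar multiplications performed by a textbook matrix multiplication. The names `naive_matmul` and `identity` are illustrative helpers introduced for this example, and the operation count is only a stand-in for processing time, not a measurement of it.

```python
def naive_matmul(a, b):
    """Multiply two square matrices (lists of lists) the textbook way.

    Returns (product, multiplications) so that the amount of work can be
    compared across input sizes. Illustrative helper, not a library call.
    """
    n = len(a)
    product = [[0] * n for _ in range(n)]
    multiplications = 0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                product[i][j] += a[i][k] * b[k][j]
                multiplications += 1
    return product, multiplications


def identity(n):
    """Build an n x n identity matrix to use as simple test input."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    for n in (2, 20):
        _, ops = naive_matmul(identity(n), identity(n))
        print(f"{n}x{n}: {ops} scalar multiplications")
```

The textbook algorithm performs n * n * n multiplications, so the 2x2 case needs 8 of them while a 1000x1000 case would need one billion. The request overhead is the same in both cases; only the size-dependent processing differs.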

3. Throughput

Throughput is the number of tasks that a service completes within a given period of time. For example, services are often measured in requests per second (rps).
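
As a rough sketch, throughput can be computed from the completion timestamps of requests within an observation window. The function name `throughput_rps` and the sample timestamps below are made up for illustration.

```python
def throughput_rps(completion_times, window_start, window_end):
    """Completed requests per second within [window_start, window_end).

    completion_times: timestamps (in seconds) at which requests finished.
    """
    if window_end <= window_start:
        raise ValueError("window_end must be after window_start")
    completed = sum(1 for t in completion_times if window_start <= t < window_end)
    return completed / (window_end - window_start)


if __name__ == "__main__":
    # Hypothetical completion times, in seconds since the start of the test.
    finished_at = [0.1, 0.4, 0.9, 1.2, 1.7, 2.5]
    # Five requests finish inside the 2-second window, so this prints 2.5.
    print(throughput_rps(finished_at, 0.0, 2.0))
```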

4. Bandwidth

Bandwidth is the maximum rated capacity of a service, in other words the upper limit on its throughput. A typical example is network bandwidth, which an Internet Service Provider (ISP) advertises based on the specifications of the hardware used.

5. Concurrency

Finally, concurrency is the number of requests that a service can process at the same time. Note that, unlike throughput, this is measured without a time duration as the denominator (e.g. 100 concurrent requests vs. 1000 requests per second).

100 concurrent requests / 500 ms per request = a bandwidth of 200 requests per second
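
The arithmetic behind this figure can be sketched as follows. The function name `max_requests_per_second` is an assumption made for this example, and the calculation assumes every request takes the same end-to-end latency and that all concurrency slots stay busy.

```python
def max_requests_per_second(concurrency, latency_seconds):
    """Upper bound on requests per second for a service.

    Each of the `concurrency` slots completes one request every
    `latency_seconds`, so the slots together complete
    concurrency / latency_seconds requests per second.
    """
    if latency_seconds <= 0:
        raise ValueError("latency_seconds must be positive")
    return concurrency / latency_seconds


if __name__ == "__main__":
    # 100 concurrent requests, 500 ms each: prints 200.0
    print(max_requests_per_second(100, 0.5))
```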

System architects need to identify the bottleneck in order to find opportunities for improvement

Different strategies can be used to increase the performance of a system. For example, decreasing the end-to-end latency of requests by minimizing the processing time can increase the bandwidth of the service. Similarly, horizontally scaling the service by adding more worker nodes can increase its concurrency, and thus also its bandwidth.
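
The two strategies can be compared with a small model. This is a sketch under simplifying assumptions: the `Service` class and its fields are hypothetical, requests are independent, and capacity scales linearly with the number of workers (which real systems rarely achieve exactly).

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Service:
    workers: int            # number of worker nodes
    slots_per_worker: int   # requests one worker can handle at once
    latency_seconds: float  # end-to-end latency of a single request

    @property
    def concurrency(self):
        return self.workers * self.slots_per_worker

    @property
    def bandwidth_rps(self):
        return self.concurrency / self.latency_seconds


baseline = Service(workers=4, slots_per_worker=25, latency_seconds=0.5)
faster = replace(baseline, latency_seconds=0.25)  # halve the processing time
wider = replace(baseline, workers=8)              # double the worker nodes

for name, service in (("baseline", baseline), ("faster", faster), ("wider", wider)):
    print(f"{name}: {service.concurrency} concurrent, {service.bandwidth_rps} rps")
```

In this model, halving the latency and doubling the workers both raise the bandwidth from 200 to 400 requests per second, but only the second option changes the concurrency.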

To identify such opportunities, system architects can think like chemists

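In a chemical reaction, where material inputs are converted to outputs, the expected amount of output is determined by the “limiting reagent”: the input that runs out first. By analogy, the performance of a system is limited by its bottleneck, no matter how much spare capacity the other components have.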
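
A minimal sketch of this idea, assuming a request passes through every stage of the system once and in series. The stage names and capacities are invented for illustration, as is the function name `find_bottleneck`.

```python
def find_bottleneck(stage_capacities_rps):
    """Return (stage_name, capacity) for the stage with the lowest capacity.

    stage_capacities_rps maps each stage to the maximum number of requests
    per second it can handle. With stages in series, the slowest stage caps
    the whole system, like a limiting reagent caps a reaction's yield.
    """
    if not stage_capacities_rps:
        raise ValueError("at least one stage is required")
    name = min(stage_capacities_rps, key=stage_capacities_rps.get)
    return name, stage_capacities_rps[name]


if __name__ == "__main__":
    # Hypothetical capacities in requests per second.
    pipeline = {"load_balancer": 5000, "web_server": 200, "database": 800}
    stage, capacity = find_bottleneck(pipeline)
    print(f"bottleneck: {stage} at {capacity} rps")
```

In this example, adding database capacity would not help, because the web server stage caps the system at 200 requests per second.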


In this article, I have described 5 important performance metrics that system architects should be aware of. These metrics are first-byte latency, end-to-end latency, throughput, bandwidth, and concurrency. To identify ways to improve a service and meet desired performance levels, the system architect must identify the bottleneck in the system. Knowing the bottleneck will point to solutions that will help, and rule out solutions that will not.



Justin San Juan


Award-Winning Software Engineer | Business and Web Consultant