AWS Latency
When designing a VPC, you need to be aware of the latency you incur when you place components in different Regions or Availability Zones, as well as the data transfer costs.
Latency is about how fast your system responds to a single request. Throughput measures how many requests an app can handle simultaneously; higher throughput lets more users work at once without issue. An app can respond quickly to individual requests yet still struggle under a large number of users. Optimizing for lower latency can hurt throughput and vice versa. For example, caching frequently accessed data reduces latency, but it limits the memory available for other tasks, which can decrease throughput.
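
A minimal sketch of that trade-off in Python: an in-process LRU cache serves hot keys fast, but its maxsize is memory you can no longer spend elsewhere. The slow_lookup function and its timing are hypothetical stand-ins for a real backend call.

```python
import time
from functools import lru_cache

def slow_lookup(key: str) -> str:
    """Hypothetical backend call; stands in for a DB or API hit."""
    time.sleep(0.05)  # simulate ~50 ms of network + query time
    return f"value-for-{key}"

# Trade-off knob: a larger maxsize lowers latency for more keys,
# but pins more memory that other tasks could have used.
@lru_cache(maxsize=1024)
def cached_lookup(key: str) -> str:
    return slow_lookup(key)

start = time.perf_counter()
cached_lookup("user:42")
print(f"cold (miss): {(time.perf_counter() - start) * 1e3:.2f} ms")

start = time.perf_counter()
cached_lookup("user:42")
print(f"warm (hit):  {(time.perf_counter() - start) * 1e3:.2f} ms")
```

The first call pays the full ~50 ms; the second returns in microseconds from the cache.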
- Latency increases as distance increases (PG < AZ < Cross Region).
- Throughput remains stable across all scenarios (~1.04K msg/s).
- Network cost applies only to cross-AZ and cross-region traffic, not to same-AZ or same-PG communication.
But if we add the VMs to the same placement group and the rack goes down, that hits our availability, so it's a trade-off.
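
As a sketch, here is how the two ends of that trade-off look with boto3 (the group names, region, and AMI are placeholders): a cluster placement group packs instances onto the same rack for low latency, while a spread group does the opposite for availability.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # region is a placeholder

# Low latency, shared blast radius: instances land on the same rack,
# so a rack failure can take all of them down at once.
ec2.create_placement_group(GroupName="low-latency-pg", Strategy="cluster")

# Higher inter-instance latency, but each instance lands on distinct
# hardware, so a single rack failure hits at most one of them.
ec2.create_placement_group(GroupName="high-availability-pg", Strategy="spread")

# Launch into a group by naming it in the Placement parameter.
ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder AMI
    InstanceType="c5.large",
    MinCount=2,
    MaxCount=2,
    Placement={"GroupName": "low-latency-pg"},
)
```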
This sets up an interesting battle of load-testing tools: Gatling vs. JMeter vs. k6 vs. Locust. For time's sake I may skip JMeter, since in my experience it is not fast enough to compete with the others.
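
Since Locust is the Python option in that line-up, a minimal sketch of a test plan looks like this (the host and endpoint are placeholders):

```python
from locust import HttpUser, task, between

class ApiUser(HttpUser):
    # Hypothetical target; can also be set with the --host CLI flag.
    host = "http://localhost:8080"
    wait_time = between(0.5, 2)  # seconds each simulated user pauses

    @task
    def get_status(self):
        # Locust records latency percentiles (p50/p90/...) per endpoint.
        self.client.get("/status")
```

Run it with, for example, `locust -f loadtest.py --users 100 --spawn-rate 10` and watch how latency percentiles move as concurrency grows.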

| Scenario | Scope | Latency (p90) | Throughput | Network Usage (Receive) | Network Cost / Month |
| --- | --- | --- | --- | --- | --- |
| Same Placement Group (Same PG) | Intra-Node (same PG) | ~53.5 µs | 1.04K msg/s | ~901 kb | $0 |
| Same Availability Zone (Same AZ) | Within same AZ | ~131 µs | 1.04K msg/s | ~901 kb | $0 |
| Cross Availability Zone (Cross AZ) | Between AZs in same region | ~289 µs | 1.04K msg/s | ~902 kb | $2.72 |
| Cross Region | Between AWS regions | ~6.45 ms | 1.04K msg/s | ~902 kb | $2.72 |
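
Numbers like the latency column come from repeated round-trips. A rough way to reproduce them is a TCP echo loop that records per-message round-trip times and takes the p90; the host, port, and message count below are placeholders.

```python
import socket
import statistics
import time

HOST, PORT = "10.0.1.23", 7777   # placeholder peer running an echo server
N = 1000
payload = b"x" * 64

samples = []
with socket.create_connection((HOST, PORT)) as sock:
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)  # avoid Nagle delay
    for _ in range(N):
        start = time.perf_counter()
        sock.sendall(payload)
        sock.recv(len(payload))
        samples.append(time.perf_counter() - start)

p90 = statistics.quantiles(samples, n=10)[-1]  # 90th percentile
print(f"p90 round-trip: {p90 * 1e6:.1f} µs over {N} messages")
```
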
Reduce latency

- Optimize the network layer
- Database indexing
- Caching
- Load balancing
- Content Delivery Network (CDN)
- Async processing
- Data compression (quick sketch after this list)
- Place cloud resources close to your users
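
On the compression point, a minimal sketch: gzipping a JSON payload before it goes on the wire cuts transfer time roughly in proportion to the size reduction. The payload below is made up for illustration.

```python
import gzip
import json

# Made-up payload; repetitive JSON compresses especially well.
payload = json.dumps([{"id": i, "status": "active", "region": "us-east-1"}
                      for i in range(1000)]).encode()

compressed = gzip.compress(payload)
ratio = len(compressed) / len(payload)
print(f"raw: {len(payload)} B, gzipped: {len(compressed)} B ({ratio:.0%} of original)")

# The receiver decompresses transparently when Content-Encoding: gzip is set.
assert gzip.decompress(compressed) == payload
```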


Common bottlenecks that cause latency

There are many factors that can contribute to API latency:
- Network latency: Network connectivity issues or high network traffic can significantly increase API latency. If data must travel a long distance between the client and server or if there are issues with the network infrastructure, it can result in delays.
- Server overload: When the API server becomes suddenly inundated with requests, it may struggle to process them all in a timely manner, leading to increased latency. This problem can be compounded by inadequate server resources, such as CPU, memory, or disk I/O.
- Inefficient code: Certain coding patterns—such as algorithms with high time complexity, unoptimized SQL queries, and synchronous operations—can significantly slow down response times.
- Third-party dependencies: API integration enables developers to incorporate third-party functionality—such as payment processing, geolocation services, and messaging—into their applications. However, if these external dependencies experience downtime, it can lead to increased latency in the APIs and applications that rely on them.
- Geographic location: The physical location of the API client and server can contribute to an API’s latency because data transfer is not instantaneous. It is instead limited by the speed of light and the characteristics of the network infrastructure.
- Throttling or rate limiting: APIs often implement rate limiting or throttling to control the number of requests that a single client can make. When clients exceed these limits, they may experience delays or receive errors.
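
For the throttling case, clients usually recover gracefully with exponential backoff, honoring the server's Retry-After header when present. A minimal sketch with the requests library (the URL and retry limit are placeholders):

```python
import time
import requests

def get_with_backoff(url: str, max_retries: int = 5) -> requests.Response:
    """Retry on HTTP 429, waiting Retry-After if given, else backing off exponentially."""
    for attempt in range(max_retries):
        resp = requests.get(url, timeout=10)
        if resp.status_code != 429:
            return resp
        # Prefer the server's hint; otherwise wait 1s, 2s, 4s, ...
        wait = float(resp.headers.get("Retry-After", 2 ** attempt))
        time.sleep(wait)
    raise RuntimeError(f"still rate-limited after {max_retries} attempts")

# Placeholder endpoint:
# resp = get_with_backoff("https://api.example.com/v1/items")
```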
