What is Latency?
Latency is the time that elapses between the initiation of a request for data and the start of the actual data transfer. The delay may amount to only milliseconds, or even nanoseconds, but it remains a key measure of a network's efficiency.
In a network, latency is an expression of how much time it takes for a packet of data to get from one designated point to another. In some usages, latency is measured by sending a packet that is returned to the sender; the round-trip time is then taken as the latency.
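As a rough illustration of round-trip measurement, the sketch below times TCP connection handshakes to a remote host, since the handshake requires one full round trip before the connection is established. This is a minimal sketch, not a definitive tool: the host `example.com`, the port, and the sample count are illustrative assumptions, and a dedicated utility such as ping would normally be used instead.

```python
import socket
import time

def estimate_rtt(host: str, port: int = 80, samples: int = 5) -> float:
    """Estimate round-trip latency by timing TCP connection handshakes.

    connect() returns only after the three-way handshake completes,
    so each timing captures roughly one network round trip.
    """
    times = []
    for _ in range(samples):
        start = time.perf_counter()
        with socket.create_connection((host, port), timeout=5):
            pass  # connection established; handshake round trip complete
        times.append(time.perf_counter() - start)
    return min(times)  # the minimum sample is closest to the true network delay

if __name__ == "__main__":
    rtt = estimate_rtt("example.com")  # hypothetical target host
    print(f"Estimated round-trip latency: {rtt * 1000:.1f} ms")
```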
The implicit ideal behind latency measurements is that data should be transmitted instantly from one point to another, that is, with no delay at all.
Latency is often used to mean any delay or waiting that increases real or perceived response time beyond the desired response time. Specific contributors to computer latency include mismatches in data speed between the microprocessor and input/output devices, and inadequate data buffers. Within a computer, latency can be reduced or "hidden" by techniques such as prefetching (anticipating the need for data before it is requested) and multithreading, which uses parallelism across multiple execution threads so that useful work continues while slower operations complete.
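The following Python sketch illustrates the latency-hiding idea: while one block of data is being processed, the fetch for the next block is already in flight, so the fetch delay overlaps with useful work instead of adding to the total time. The delays, block count, and helper names (`slow_fetch`, `process`) are invented for illustration only.

```python
import concurrent.futures
import time

def slow_fetch(block_id: int) -> str:
    """Simulate an I/O request with noticeable latency."""
    time.sleep(0.1)  # stand-in for disk or network delay
    return f"data for block {block_id}"

def process(data: str) -> None:
    """Simulate CPU work on the fetched data."""
    time.sleep(0.1)

def run_with_prefetch(num_blocks: int) -> None:
    """Hide fetch latency by requesting the next block while processing the current one."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(slow_fetch, 0)              # prefetch the first block
        for i in range(num_blocks):
            data = future.result()                       # wait only if the prefetch is not finished
            if i + 1 < num_blocks:
                future = pool.submit(slow_fetch, i + 1)  # start fetching the next block early
            process(data)                                # processing overlaps with the in-flight fetch

if __name__ == "__main__":
    start = time.perf_counter()
    run_with_prefetch(5)
    # Roughly 0.6 s with prefetching versus about 1.0 s if each fetch and
    # each processing step ran strictly one after the other.
    print(f"Elapsed with prefetching: {time.perf_counter() - start:.2f} s")
```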