Queues
Queues
There are many situations in industry and everyday life in which customers require service from limited service facilities. Much work has been done on techniques for predicting the amount of congestion in such systems. Applications include the following: telephone calls requiring service at an exchange; aircraft requiring an opportunity to take off or land; cars requiring a changed traffic light, or an opportunity to turn from a minor road into a major road; products at one stage of an industrial process requiring entrance to the next stage of processing; machines requiring attention from an operator; people requiring transport by bus, train, or airplane; and patients requiring attention by a doctor or admission to a hospital.
The objective of most investigations of queueing is to modify the system to make it more efficient. For example, the object of a study of queueing at traffic lights might be to formulate rules for adjusting the timing of the lights to the rates of flow of traffic. Such matters fall within the general subject of operations research. There seems to be as yet little work on queueing of a fundamental sociological nature.
Methods of investigation. There are, essentially, four methods of investigating queueing problems:
(1) direct observation of practical situations,
(2) planned experiments under artificial conditions, (3) simulation, and (4) mathematical analysis. Published work is largely on mathematical analysis, but this does not properly reflect the relative importance of the methods. The last sections of this article deal with simulation and mathematical analysis.
Some elementary results. Some important theoretical ideas and results can be illustrated with a simple example. Consider customers arriving at a shop having one server who serves only one customer at a time. Unserved customers queue up waiting their turn for service. Suppose that the instants at which customers arrive are distributed in a stable statistical pattern in which the mean interval between the arrival of successive customers is a minutes. Let the time taken to serve a customer have a stable frequency distribution with mean s min.
In a very long time, T min., about T/a customers will arrive, whereas, even if the server works continuously, the number of customers served in that time is about T/s. Hence, if T/a > T/s, that is, if s/a > 1, customers arrive more rapidly than they can be served and the queue will grow until the system changes; for example, a second server may be obtained or customers may be deterred from joining the queue.
The first step in many queueing investigations is to see whether the average service capability is enough to meet average demands. Suppose that this is so, that is, s/a 1. It is reasonable to expect a state of statistical equilibrium in which the number of customers queueing fluctuates with a stable frequency distribution. Also, in a long time T, the server will need to work for a total time of about sT/a, in order to serve T/a customers. Hence, there will be no customers in the system for a proportion 1 — (s/a) of the time. The dimensionless quantity s/a, called the traffic intensity, measures the average demand on the system relative to the average service capability. The qualitative behavior of the system is determined in the first place by whether the traffic intensity exceeds, or is less than, one. (Nearly always, the case of exact balance is unstable, leading eventually to very long queues.)
The next general point is that when s/a 1 the amount of congestion depends on the random fluctuations present. Thus, suppose that a = l,s = 0.9. If arrivals are perfectly regular every minute and if service always takes 0.9 min., an exact cycle of working is set up and no congestion arises. But if there is appreciable random variation in the arrivals or in the service times, there will, from time to time, be several customers at the service point simultaneously, and congestion will result. It can be shown that if arrivals are completely random (defined later) and the service time is constant, then the mean time spent queueing before the start of service is 4.05 min. and the server is idle 10 per cent of the time. Further, if both arrivals and service times are random and if the standard deviation of service time is equal to the mean, the mean time spent queueing rises to 8.1 min. With regular arrivals and a particular form of distribution of service time with standard deviation equal to the mean, the mean queueing time is 3.76 min. These represent substantial congestion.
The important general points are (1) if there is appreciable random variation, substantial congestion may result if the system is loaded near to its limiting average capacity, and (2) the amount of congestion depends in an essential way on the statistical variability in the system.
The amount of congestion can be reduced in various ways, for example, by having more servers, by reducing the variability in the system, or by reducing the traffic intensity. Suppose, however, that the traffic intensity is reduced to 0.8, by reduction of the service time to a constant, s = 0.8 min. Then the mean queueing time drops from 4.05 min. to 1.6 min. The proportion of time that the server is idle rises from 0.1 to 0.2. An analysis of the costs associated with congestion and with the service mechanism is usually required to find the best way to run the system.
General description of queueing systems. The main features that determine a queueing system are the input or arrival system, the service mechanism, and the queue discipline. Many possibilities arise, only a few being described here.
The main features of the input are the average rate of arrival (including its dependence, if any, on time) and the local statistical character of the arrival pattern. The two simplest types of arrival used in mathematical work are regular arrivals and random arrivals. The latter, often called a Poisson process, is a very special form defined as follows. In any small time interval (t,t + Δt) there is a chance Δt/a of a customer arriving, quite independently of arrivals in other time periods. The average interval between arrivals is a; the distribution of x, the interval between successive arrivals, is negative exponential, having density function α1 exp(— x/a),x> 0 [SeeDistributions, Statistical],
The service mechanism is described by the service capacity, the service availability, and the properties of service time. The service capacity is the maximum number of customers that can be served at a time, one in the single-server queue’discussed above. In some applications—for example, many transport problems—service is available only at certain times, and this must be specified; or service may not be available until a group of customers of specified size has collected (batch service). Finally, one must specify the statistical properties of the service time, usually by giving a probability distribution. Perhaps the most common type has a unimodal positively skewed density function with a standard deviation of 20 to 50 per cent of the mean. In mathematical work, a particularly important density is the negative exponential, with equation s-1e-x/s where s is the mean service time.
The queue discipline specifies what happens to customers who cannot be served immediately. Sometimes there may be losses, for example, customers may have to be rejected because of lack of waiting room. Then the probability of loss should be calculated. If there are no losses, customers queue awaiting their turn for service. One then needs to specify how customers are selected for service from the queue. Even in a single-server system there are many possibilities: for example, first-come, first-served, that is, customers queue in order of arrival; random selection from a queue of customers; non-pre-emptive priority, in which the customers are divided into two or more priority classes so that when a customer is to be selected for service the one with the highest priority is chosen; and pre-emptive priority, in which a high-priority customer is served immediately upon arrival.
This classification applies to a single point of congestion. Often there will be several such interlinked points.
Arrival pattern, service mechanism, and queue discipline determine the system. The resulting amount of congestion can be described by the distributions of three quantities: the number of customers in the system at an arbitrary time, the time a customer spends queueing before being served, and the server’s busy and idle periods. Which quantity or combination of quantities should be considered depends on the costs associated with congestion.
Investigating the behavior of queues. Several of the usual methods used in investigating queues are described below.
Simulation. The structure of the system is specified statistically, for example, by giving numerically the frequency distribution of service time. The behavior of the system is then simulated, that is, reconstructed empirically using pencil and paper (in simple cases), an electrical or hydraulic analogue machine, or, most commonly, digital computation on an electronic computer. The behavior over a long period is found and relevant properties, such as queueing times, measured. The procedure is normally repeated for a range of values of the important parameters. This approach is empirical observation of an idealized statistical model. It is sometimes called the Monte Carlo method (Florida …1956). [SeeComputation].
Simulation is not necessary for the one-server queueing model for which numerical results were given above, because simple mathematical results can be found. If used, however, simulation would proceed roughly as follows. From tables of random numbers, or from a computer program for generating pseudo-random numbers, values are formed representing the intervals between the arrivals of customers and the service times of customers. The arrival instants and the instants at which service is started and completed can be built up and anything of interest, such as the frequency distribution of queueing time, measured [SeeSimulation].
An important advantage of this method is its extreme flexibility. Systems far too complex for mathematical analysis can be investigated. A disadvantage is that many simulations under a wide range of conditions may be necessary to understand fully the system’s behavior. Simulation is, however, the most generally applicable technique for investigating specific practical problems.
Mathematical analysis. The theory of stochastic processes deals with systems that change in time probabilistically. To get working results about queues from the theory, it is usually necessary to consider very simplified models of the real problem. The usefulness of the theory is partly that skillful simplification may leave the essential features preserved and, perhaps more importantly, that intensive study of simplified models can lead to valuable qualitative understanding. A good example of qualitative conclusions arising from a simplified mathematical model comes from an investigation by Tanner (1961) of a model of overtaking on a two-lane road. Tanner’s calculations show that under certain circumstances there is a density of slow traffic, well below the capacity of the road, above which overtaking is effectively impossible. In many circumstances, high acceleration is more important than high speed, and a cutting of safety margins makes little difference to the average speed of a fast car over its whole journey. Mathematical investigation of simplified models combined with simulation of more complex models is a powerful method of investigation.
Essentially three types of mathematical investigation can be made: (1) to find rigorously the conditions for the existence of a unique equilibrium statistical distribution, (2) to find the equilibrium distribution assuming that it exists, and (3) to examine the transient behavior when, for example, the parameters change in time. The answer to (1) is usually qualitatively obvious, although the rigorous proofs often call for delicate arguments. Answers to (2) form the bulk of the literature and are relevant when long-run behavior is investigated. When behavior over a short period or under changing conditions is required, the transient behavior needs consideration, but not many usable results are yet available.
Application of Markov processes. The general idea behind the mathematical theory is to relate the probability distributions of the states of the system at different times. Consider the transition probabilities determining the probability distribution of the system at time t + Δt, given the state at time t. If these transition probabilities depend only on the state at t and not also on what happens before t, the process is a Markov process [SeeMarkov Chains]. It is in principle then easy to find simultaneous differential equations describing the transient behavior of the system and ordinary linear equations for the equilibrium behavior.
The Markov property holds in simple form when arrivals are random and the distribution of service time is negative exponential, with density s-1e-x/s Simplicity is gained by having random arrivals because the probability of an arrival in the small time interval (t, t + Δt) is Δt/a, independently of all other occurrences. The analogous property of the exponential distribution of service time is that if service of a customer is in progress at time t, the probability that service is completed in (t,t + Δt) is Δt/s, where s is the mean service time, independently of the length of time for which the particular service operation has been going on.
For example, consider the single-server queue with random arrivals and exponentially distributed service time. Let pn(t) be the probability at time t that there are n customers in the system, including the one, if any, being served. Consider p”(t + Δt), where Δt > 0 is very small. The ways in which the queue may be empty at t + Δt are that (1) the queue is empty at t and no customer arrives in (t, t + Δt), (2) there is one customer present at t whose service is completed in (t, t + Δt), and (3) two or more events occur in (t, t + Δt). The probability of (3) is negligible. Thus p0(t, t + At) is the sum of the probabilities of (1) and (2). That is,
Similarly, if n ≥ 1, p”(t + Δt) is the sum of the probabilities that (a) there are n customers at t and there is no arrival or completion of service in (t, t + Δt), (fc) there are n — 1 customers at t and one customer arrives in (t, t + Δt), and (c) there are n + 1 customers at t and service is completed in (t,t +At). Therefore
This set of simultaneous linear differential equations can be solved to give the transient behavior of the system. The answer is rather complicated. If it is assumed that an equilibrium distribution exists, the equations must have a solution, {p”}, that is independent of time. Thus, directly from (l)and(2),
Also, since the {p“} form a probability distribution, l=p0+p1+p2+...; the solution is
where r = s/a is the traffic intensity. The mean of the distribution is r/(l - r). When r = 0.9, the case examined previously, the mean number in the system is 9; this is the average over a long time of the number of customers present. Because arrivals are random, an arriving customer also finds on the average 9 customers ahead, each taking on the average 0.9 min. to be served. Thus the mean time spent queueing, for the queue discipline first-come, first-served, is 9 x 0.9 min. = 8.1 min. The general formula is that in a single-server queue, with random arrivals, exponential service time, and the queue discipline first-come, first-served,
where r is the traffic intensity.
The method used above to get linear equations can be used for more complex systems, provided that arrivals are independent and random and that service time distributions are exponential. However, consider even the simple single-server queueing system with random arrivals but with a non-exponential distribution of service time; here the argument fails, because the probability that service of a customer is completed in (t, t + At) is not constant but depends on how long service has been in progress. There are various ways round this difficulty, of which, one—the method of the imbedded Markov process—depends on considering the system only at instants at which service of a customer is completed. The generalization of (3) is the Pollaczek-Khinchin formula, , where c is the ratio of the standard deviation of service time to the mean.
Future developments. The number of new queueing situations is unlimited; almost each new application shows a fresh combination of input, service mechanism, and queue discipline. Simulation is usually the best method for tackling the more complex situations, and there is scope for further work on increasing the efficiency of simulation techniques. Two fields where further mathematical work is likely are those of networks of queues and of the automatic control of queueing systems.
History. The first major theoretical investigation was made in 1909 by A. K. Erlang, Copenhagen Telephone Co. Between the two world wars, important mathematical work on queueing was done by T. C. Fry (United States), A. la. Khinchin (Soviet Union), and Felix Pollaczek (France). After 1945, interest in the subject grew rapidly, both because of the much increased attention paid to operations research, and because the appropriate mathematical tool, the theory of stochastic processes, was being widely studied in other contexts. There is now an extensive literature on the mathematical problems connected with queueing. Except for the special fields of telephone engineering and traffic studies, relatively little has been written about applications.
D. R. Cox
[See alsoModels, mathematical].
BIBLIOGRAPHY
Introductory theoretical books in English are Khinchin 1955, Morse 1958, Cox & Smith 1961, and Riordan 1962. Saaty 1961 gives a more extensive review of the subject and has a long bibliography. Takacs 1962 and Benes 1963 are more specialized and mathematical. Syski 1960 deals particularly with applications to telephone engineering. Empirical and theoretical work on road traffic problems is best approached through the publications of the Road Research Laboratory (Great Britain) and the Highway Research Board (United States). White 1962 contains a critical discussion of work on the somewhat related problem of the sizes of casual groups of people.
Benes;, VÁclav E. 1963 General Stochastic Processes in the Theory of Queues. Reading, Mass.: Addison-Wes-ley.
Cox, David R.; and Smith, W. L. 1961 Queues. London: Methuen; New York: Wiley.
Florida, University of, Gainesville, Statistical Laboratory 1956 Symposium on Monte Carlo Methods, Held at the University of Florida …March 16 and 17, 1954. Edited by Herbert A. Meyer. New York: Wiley.
Khinchin, Alexandr Ia. (1955) 1960 Mathematical Methods in the Theory of Queueing. London: Griffin; New York: Hafner. → First published in Russian.
Morse, Philip M. 1958 Queues, Inventories and Maintenance: The Analysis of Operational Systems With Variable Demand and Supply. New York: Wiley.
Riordan, John 1962 Stochastic Service Systems. New York: Wiley.
Saaty, Thomas L. 1961 Elements of Queueing Theory, With Applications. New York: McGraw-Hill.
Syski, Ryszard 1960 Introduction to Congestion Theory in Telephone Systems. Edinburgh: Oliver & Boyd.
TakÁcs, Lajos 1962 Introduction to the Theory of Queues. New York: Oxford Univ. Press.
Tanner, J. C. 1961 Delays on a Two-lane Road. Journal of the Royal Statistical Society Series B 23:38-63.
White, H. 1962 Chance Models of Systems of Casual Groups. Sociometry 25:153-172.