Real-Time Computing
For Human Computer Interfacing

November 4, 2002

Copyright 1997-2002, Perry R. Cook, Princeton University


* Permission to make digital or hard copies of part or all
of this work for personal or classroom use is granted with
or without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this
notice and full citation on the first page. To copy otherwise,
to republish, to post on services, or to redistribute to lists,
requires specific permission and/or a fee.




I. What's Real-Time?

There is considerable disparity in the definition of a "Real-Time" computer system, depending on the particular industry and field of use, the history of computer usage in that industry, and many other factors.

The general agreement is that there is some notion of time-critical processing. That is, some of the tasks the computer performs must be done in a timely fashion. "Timely" can be defined somewhat loosely, as in the case of a request for an account balance at a bank automatic teller machine, or quite precisely, as in the case of the control systems of a modern jet airplane. For simplicity, we'll separate real-time systems into three groups:

1) Business Data Processing

This first group is in many ways simply a speed improvement on classical information batch-processing used by banks, insurance companies, etc., where queries are posted to a central computer or network, and responses come back containing the desired information. The real-time aspect is that a human can post the query using a terminal, the waiting time is short enough that the human can wait at the terminal until the response comes back, and the response is displayed on a terminal or other device near the human operator.

2) Communications Switching, and Process Control

This second group contains two types of systems, but the similarity is that some process is taking place which usually involves flow (data, water in a pipe, parts on a conveyor belt, etc), and a computer is used to control parameters of that flow. The parameters could be connections or addresses as in the case of telephone communications, rates of flow as in the case of a commercial gas pump, or both as in the case of the complex routing and flow-control systems of a modern oil refinery or beer brewery.

3) Closed Loop Control Systems

This third system type involves online monitoring of processes (usually physical) and external inputs, and closing a feedback path with a computer which controls the processes to minimize some error criterion. This can be closely related to the Process Control system of the second type listed above.

Of course, real-world systems are often combinations of these basic types.

A telephone service provider would use systems of both the first and second types, for example. Long distance phone calls would first be connected to a local switch, then information would be posted as queries to the Business Data Processing system (to check billing information), then further switching systems would take over to connect the call to the final destination. Every month the business data processing system would generate bills for each customer, but probably not in real time. At any time, however, a customer might call in and check how many hours remain on a prepaid long-distance calling plan.

A jet airplane would likely include systems of both the second and third types, with closed loop control being used to ensure accuracy and stability in steering, etc., and process control being used to ensure steady and economical fuel flow to the engines.

The Encyclopedia of Computer Science (Van Nostrand Reinhold, 1993) gives a simple table of processing modes and average response times:

    Computer Processing Modes and Times:

    Mode                               Typical Response Time
    ----                               ---------------------
    Card-Oriented Batch                100-10000 s
    Keyboard-Oriented Batch            1-100 s
    Interactive Computing              1-10 s
    Online Inquiry and Transactions    1-10 s
    Message Switching                  0.1-10 s
    Data Acquisition and Control       0.01-10 s

Which of these are real-time? Clearly, by modern standards, response times of 5 seconds or less would be required of any business data processing system carrying the label of Real-Time. But by the Nyquist sampling criterion, a control system with a 5-second response time could at best control continuous processes which change no faster than once every 10 seconds. To ensure stability and control of rapidly varying processes, many modern feedback control systems operate at thousands of samples per second. The lesson we can take from this disparity in "real times" is that different applications may have very different definitions.
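
To make that arithmetic explicit, treat one response every 5 seconds as a sampling rate:

    Sampling rate:     fs   = 1 sample / 5 s  = 0.2 Hz
    Nyquist limit:     fmax = fs / 2          = 0.1 Hz
    Fastest process:   T    = 1 / fmax        = 10 s per change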

Desktop computer companies, having recently become more interested in the delivery of graphics and sound, have yet other definitions of real-time. Windows 95 and NT documentation basically describes the successful delivery of media in real time by saying "graphics and audio appear smooth, no objectionable jerkiness in visual appearance, not too many clicks or interruptions in the audio presentation, pre-recorded game sound effects should play with no noticeable delay, etc." (Here I'm paraphrasing many different sources and references for programmers.)

Silicon Graphics specifies real-time in that multiple types of media, specifically graphics animation, digital sound, and MIDI commands, can be synchronized in output accurate to a single audio sample (1/48000 second). At first this might seem like an extremely stringent criterion, but there is at least one missing ingredient in the description, and that is "latency." Again there are disagreements on this definition as well, but I'll use latency to mean the delay between the time of an external control input and the measurable response at the display device(s). Display is used here to mean audio, video, or any other real-time output device. If media on a desktop computer is being controlled by inputs from a human or another computer, it is necessary to think about latency as well as the ability to synchronize.

Other computer companies that have paid significant attention to media display, such as Apple and NeXT, have slightly different definitions and specifications for real-time, and quite different means for addressing the problems of smoothness (not losing data), synchronization, and latency. The undeniable fact, however, is that ensuring smoothness and synchronization, especially in a multi-tasking environment, requires a tradeoff of longer latency times. We will delve a little more deeply into this later in these notes.




II. Control Systems vs. Media Delivery

Control systems and some process control systems have the most stringent time requirements. In order to ensure stability and robust behavior, these systems must maintain an accurate periodic sampling of the system variables. The media delivery task described above can be viewed as a process control system, with the objective being to provide samples or frames of audio, video, MIDI control data, etc. to the output of the system in a predictable way. The basic media delivery task differs from most process control systems, however, in that once the parameters are selected (audio sampling rate, video frame rate, video resolution, etc.) the system usually doesn't control the rates of flow. The system does, however, potentially need to arbitrate resources between the media delivery tasks and other tasks which may need to be accomplished.

Some basic working requirements of Media Delivery could be summarized: Make it look/sound smooth. No clicks or pops in audio. No jerks in video. If things do get behind, degrade gracefully.

A basic working requirement for a Control System might be: Close a feedback loop with sensors and motors. Guarantee Stability.

Modern HCI is really both of these: we want to collect user input, process it in a timely fashion, and display something smoothly and convincingly (with small or controllable delay) in response. The human and computer make up a complete closed-loop feedback system. Many modern HCI systems are powered feedback control systems, and the human is capable of adding energy/gain to the system, so stability can be a consideration.




III. Control Systems

Generally, a control system is one that acts to bring the state of some system to a desired state. Probably the most common closed loop control system is the heating/temperature control system in buildings. In this type of system, a human sets a desired temperature on a thermostat control. The system determines if the temperature in the room is close enough to that desired temperature. If not, the system acts to change the temperature by turning on the heater and fan. When the actual temperature is close enough to the desired temperature, the heater and fan shut off. Note that in this system the human sets the desired temperature, but is not "in the loop" all the time.
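
As a sketch of that logic (plain C, with an invented set-point, a trivially simulated room, and a small dead band added so the heater doesn't cycle rapidly near the set-point):

    #include <stdio.h>

    /* Bang-bang (on/off) thermostat sketch. The set-point, the dead
     * band, and the "room physics" are all invented for illustration;
     * a real system would read a temperature sensor and drive a relay. */
    #define SETPOINT   20.0   /* desired temperature, degrees C */
    #define DEADBAND    0.5   /* hysteresis, to avoid rapid cycling */

    int main(void) {
        double room = 15.0;   /* simulated room temperature */
        int heater_on = 0;

        for (int t = 0; t < 60; t++) {            /* one sample per minute */
            double error = SETPOINT - room;       /* measure */
            if (error > DEADBAND)       heater_on = 1;   /* too cold: heat */
            else if (error < -DEADBAND) heater_on = 0;   /* warm enough: off */

            room += heater_on ? 0.4 : -0.1;       /* crude room physics */
            printf("t=%2d  room=%5.2f  heater=%s\n",
                   t, room, heater_on ? "ON" : "off");
        }
        return 0;
    }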

There are open loop control systems, in which the state of the system is affected by the input of a desired quantity, but the system does not adjust itself dynamically to keep the state close to the desired state. An example of this might be a blender, where the speed control is set and the blades run at a given rate, but if the material being blended is thicker a given input setting results in a slower blade speed. The blender does nothing to measure blade speed and keep it constant.

Closed loop control systems which are of particular interest to us in HCI are those where the human is inside the feedback loop. A common example of such a system is driving a car. In this system, the human uses the steering wheel to express a desired state to the system. The system responds, and the visual system of the human measures the system state and takes corrective actions until the desired direction is achieved.

The modern world abounds with examples of closed loop control systems where humans are in the control loop, but there are ancient examples as well. Broom balancing, swimming, walking, and playing musical instruments can all be viewed as closed loop control systems. Most, but not all, musical instruments use closed loop control. Brass instruments, bowed string instruments, and the Theremin are all instruments which require constant readjustment on the part of the human player to keep the instrument sound within desired limits of pitch, amplitude, and sound quality.

A more general and formalized view of a closed loop control system usually contains four parts: the Actuator, the Plant, the Sensor, and the Controller or Compensator. The actuator is what the controller uses to change the state of the plant. The plant is the actual physical system being controlled ("plant" comes from steel mills, power plants, etc. I like "system" better, but the whole feedback control loop is a system too). The sensor determines the actual state of the plant. The controller determines the actions required to get the plant to the desired state.

Looking at a few feedback control systems with humans in the loop, let's identify the four basic parts:

    Broom Balancing
    • Actuator: Hand
    • Plant (System): Physics of the broomstick
    • Sensor: Visual/Tactile/Kinesthetic
    • Controller: Human

    Video Game
    • Actuator: Hand (via Mouse/Joystick)
    • Plant (System): Computer running game software
    • Sensor: Visual (auditory) System(s)
    • Controller: Human

    Theremin
    • Actuator: Hands
    • Plant (System): Electronics of the Theremin
    • Sensor: Auditory
    • Controller: Human

There are a number of things to be concerned about with feedback control systems. The main one is stability. As a mental exercise, travel around the loop of a feedback control system and measure the "gain" at each stage. If the net gain around the loop is greater than 1.0, then there is a possibility that the compensator will overdo the job of adjustment. This can result in the error getting larger rather than smaller, and the whole system can blow up rather quickly. Putting the human in the loop complicates this further, because the human compensator is always learning about the system, and adapting its gain.

There is always some delay: in measuring, in making changes to a system input, in the system itself (mass, etc.), and in computing the necessary changes. The net delay around a loop in a feedback control system can also affect stability. Picture trying to balance a broom when the only feedback you get is delayed by 1 second, or 10 seconds. Even for the expert broom balancer, there exists a delay at which the broom will fall.
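
Both concerns, gain and delay, can be seen in a few lines of simulation. The sketch below (plain C; the first-order plant and the constants are made up purely to exhibit the behavior) settles as written, but raise GAIN above 1.0, or DELAY to 4 samples, and the error grows instead of shrinking:

    #include <stdio.h>

    /* Proportional control of a trivial made-up plant, with an
     * adjustable delay in the feedback path. With these values the
     * loop settles; raise GAIN above 1.0, or DELAY to 4, and the
     * error grows instead of shrinking. */
    #define GAIN   0.4   /* net loop gain              */
    #define DELAY  1     /* feedback delay, in samples */
    #define STEPS  25

    int main(void) {
        double state = 0.0, target = 1.0;      /* plant state, desired state */
        double history[DELAY + 1] = {0};       /* delayed sensor readings    */

        for (int n = 0; n < STEPS; n++) {
            double measured = history[DELAY];  /* sensor: a stale measurement */
            double error = target - measured;  /* controller/compensator      */
            state += GAIN * error;             /* actuator moves the plant    */

            for (int i = DELAY; i > 0; i--)    /* age the measurements        */
                history[i] = history[i - 1];
            history[0] = state;

            printf("n=%2d  state=%8.4f  error=%8.4f\n", n, state, error);
        }
        return 0;
    }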

Resolution, accuracy, noise, and sampling also affect the stability and performance of feedback control systems. Picture a 1-bit control system, in which the state is either less or more than the desired, and the actuator can only be positively on or negatively on (or either on or off, as in the case of the heating control system). Such systems actually exist, but are totally inadequate for many tasks (a 1-bit steering mechanism would be hard to use in an automobile).




IV. Synchronous vs. Asynchronous Models
Polling vs. Interrupts

Synchronous real-time systems are based on time-ordered external events. The processor is assumed to respond instantaneously to these external events. In practice, this condition is said to be met if the response time is much less than the time between external events.

Asynchronous real-time systems assume that external events occur at times which are elements of the real numbers (with possibility for extremely dense event clusters), and the system is responsible for responding within some specified time bound.

To accomplish either of these systems, we can poll, use interrupts, or a combination. These two forms of event input, however, each lend themselves more naturally to one or the other type of system. In a polled system, a computer loop looks periodically for external events, and if anything has changed since the last time, actions are taken. This lends itself quite naturally to synchronous real-time, because we are forcing the inputs to occur on specific time boundaries. Unless the loop is carefully written, however, we're not sure that the periodic checking is truly regular in time. Also, events which are shorter than the time between polls can be missed entirely without some type of hardware buffering. In interrupt-level processing, any external event causes the processor to yield to an Interrupt Service Routine (ISR), which is either a short program that is ensured to finish quickly, before any other tasks must be performed, or an even shorter program that simply places the event into a queue and returns execution to a higher-level program.
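
In outline, a polled system might look like the sketch below (plain C; read_input() is a stand-in for reading a real device register, so the example is self-contained):

    #include <stdio.h>

    /* A minimal polled-input loop. read_input() stands in for reading
     * a real device register, so the sketch is self-contained. Note the
     * two classic polling hazards: the poll period is only as regular
     * as the loop body is fast, and any input pulse shorter than one
     * poll period can be missed entirely. */
    static int read_input(int tick) {
        return (tick / 5) % 2;           /* fake input: toggles every 5 ticks */
    }

    int main(void) {
        int last = 0;
        for (int tick = 0; tick < 30; tick++) {
            int now = read_input(tick);  /* poll */
            if (now != last) {           /* changed since the last poll? */
                printf("tick %2d: input changed to %d\n", tick, now);
                last = now;              /* take whatever action is needed */
            }
            /* ... do other work, then sleep until the next poll period */
        }
        return 0;
    }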




V. Data and Processing Modules

The basic model of events that must be serviced leads to the notion of a Queue. A Queue can be as simple as a FIFO (first-in first-out) buffer, where input requests are placed in the buffer and serviced in order of their occurrence. This only makes sense if all inputs are of the same type or priority, and all tasks that are to be executed are also of the same type or priority.

A more common type of event queue also includes a notion of priority. In this type of queue the events are placed in along with an explicit device number or address, a priority number, and an execution deadline (a time beyond which, if the event hasn't been serviced, the system is considered to have failed). There can be many processes, all of which can place events into the queue, service them, and take them out. Or there can be many processes capable of "posting" requests to the queue, and only one master process which looks at the queue, determines which task should be performed next, allocates resources, and removes events once they have been serviced. Schemes abound for handling queues, but it is imperative that whatever architecture is used, debugging and verification should be part of the design. The more elaborate and complex a system for handling events in real time, perhaps using multiple processes which can write to and read from a common event queue and data pool, the more likely the system is to encounter fatal and hard-to-find errors like deadlocks, lost messages, recursive conditions that never halt, etc.
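
One way such a queue might be fleshed out is sketched below (plain C; the field names and the linear-scan, highest-priority-first service policy are illustrative choices, not a prescription):

    #include <stdio.h>

    /* A toy priority event queue: each entry carries a device number,
     * a priority, and an execution deadline. service_next() picks the
     * highest-priority pending event and flags any missed deadline. */
    #define QSIZE 16

    struct event {
        int device;      /* which device posted this       */
        int priority;    /* larger number = more urgent    */
        int deadline;    /* latest acceptable service time */
        int pending;     /* is this slot in use?           */
    };

    static struct event queue[QSIZE];

    int post_event(int device, int priority, int deadline) {
        for (int i = 0; i < QSIZE; i++)
            if (!queue[i].pending) {
                queue[i] = (struct event){device, priority, deadline, 1};
                return 0;
            }
        return -1;                    /* queue full: itself a failure mode */
    }

    int service_next(int now) {
        int best = -1;
        for (int i = 0; i < QSIZE; i++)
            if (queue[i].pending &&
                (best < 0 || queue[i].priority > queue[best].priority))
                best = i;
        if (best < 0) return -1;      /* nothing pending */
        if (now > queue[best].deadline)
            printf("MISSED deadline: device %d\n", queue[best].device);
        else
            printf("servicing device %d (priority %d)\n",
                   queue[best].device, queue[best].priority);
        queue[best].pending = 0;
        return best;
    }

    int main(void) {
        post_event(1, 2, 10);         /* device 1: low priority, lax deadline */
        post_event(2, 5, 3);          /* device 2: urgent                     */
        service_next(1);              /* device 2 is serviced first           */
        service_next(2);
        return 0;
    }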

If we are using interrupts to handle critical inputs, there are many options for interrupt architecture and handling. Interrupts can be single-priority or multi-level, in hardware or software. Interrupts can be nestable, where during one ISR another interrupt can cause a branch to another ISR; upon returning from the latter ISR, the first ISR is picked up where it left off. These options and more can be implemented in hardware, software, or a mixture. Some dedicated processors designed specifically for real-time computing are differentiated from other processors specifically by the hardware they include to deal with interrupts.

If we are to use multiple processes in a real-time system, they must communicate at one or more levels. The queue is one mechanism of communication between logical processes, but often processes need to relay information regarding state and data. Shared memory is one mechanism, especially in a single processor system, but can make a system hard to debug because it may be difficult to determine which process changed memory in an undesirable way. Message passing is a more modern object-oriented way of passing information between processes. For multi-processor systems, shared memory rapidly becomes expensive, and a hardware data streaming method called Direct Memory Access is often used.
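
As a minimal illustration of the message-passing style (assuming a POSIX system; the two-field message format is invented), two processes can relay state through a pipe rather than shared memory:

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/wait.h>

    /* Two processes relaying state by message passing through a pipe,
     * rather than by shared memory: the writer cannot corrupt the
     * reader's data structures, and every transfer is an explicit,
     * inspectable event. */
    struct msg {
        int sender;
        int value;
    };

    int main(void) {
        int fd[2];
        if (pipe(fd) != 0) return 1;

        if (fork() == 0) {                    /* child: the producer */
            struct msg m = { .sender = 1, .value = 42 };
            write(fd[1], &m, sizeof m);
            _exit(0);
        }

        struct msg m;                         /* parent: the consumer */
        read(fd[0], &m, sizeof m);
        printf("got value %d from process %d\n", m.value, m.sender);
        wait(NULL);
        return 0;
    }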

Also for multi-processor systems, we must determine what topology we will use to connect the different system components. This brings us to the question of networks, where the most common connection schemes are:

    • Star: one master and many slave processors; the master has a separate port for each slave.
    • Ring: each processor gets data from the neighbor on the left and gives data to the neighbor on the right, with the ends wrapped to close the ring.
    • Common-Bus: all processors connect to a common bus; only one can write to it at any given time, but all can read.
    • Multi-Bus: all processors hook to all other processors via dedicated busses; not very practical for large numbers of processors, because there are N*(N-1) pathways required for N processors.




VI. Operating Systems for Real-Time

The past few sections have just scratched the surface of the level of complexity that can be encountered when designing and programming real-time systems. Online, while the system is running, attention must be paid to scheduling and queueing, interrupt handling, load monitoring, etc., plus the computation required to accomplish the tasks themselves. During development, a system must aid in debugging, optimization, etc. All of this points to the need for an operating system. There are, of course, a large number of real-time operating systems that have been created over the years. These are as varied as the applications they were designed to serve, or perhaps as varied as the theorems they were designed to test and verify. A short list of host-based real-time operating systems includes VxWorks, OS-9, VRTX, LynxOS, Chimera, and RT Mach. A short list of DSP operating systems (see the section on DSPs later in these notes) includes MWave, VCOS, and SPOX.




VII. More on Multiple Processes

A common software model for handling multiple processes involves the use of a main loop and low-level interrupts. The main loop typically polls the less-critical inputs, services the queue by looking for tasks that need to be accomplished and passing control to processes which do the required work, takes events off the queue once they are completed, and otherwise waits around a lot. One or more Interrupt Service Routines respond to critical inputs and output requests. Variations on this model abound, with one very common system using one clocked ISR as the master queue-servicing routine. It is possible to construct a system which uses only interrupts (once the system is configured by a startup routine), or only operates under control of a main loop.
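
A skeleton of this model might look like the following (plain C; the "interrupt" is simulated by a direct call so the sketch runs anywhere, and the task names are placeholders):

    #include <stdio.h>

    /* Skeleton of the main-loop-plus-ISR model. The "interrupt" is
     * simulated by a direct call from main; on real hardware it would
     * be vectored by the processor. The task names are placeholders. */
    volatile int event_pending = 0;   /* set by the ISR, cleared by the loop */
    volatile int event_data = 0;

    void isr(int data) {              /* keep ISRs short: record and return  */
        event_data = data;
        event_pending = 1;
    }

    static void poll_noncritical_inputs(void) { /* e.g., front-panel switches */ }
    static void do_background_work(void)       { /* display, bookkeeping, ... */ }

    int main(void) {
        for (int tick = 0; tick < 10; tick++) {
            if (tick == 3) isr(99);          /* simulate an interrupt firing  */

            poll_noncritical_inputs();       /* less-critical, polled inputs  */
            if (event_pending) {             /* service the queue             */
                event_pending = 0;
                printf("servicing critical event: %d\n", event_data);
            }
            do_background_work();            /* otherwise, wait around a lot  */
        }
        return 0;
    }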


VIII. More on Multiple Processors

The use of multiple processors in real-time systems is motivated by many factors. The main ones include:

1) The overall task is segmented, in good engineering practice, into smaller sub-problems which can be more easily dealt with conceptually, and the processor types can be matched more appropriately to the functions being executed on them.

2) Response times can be improved, because a local processor can collect and interpret data before determining whether to disturb any other processors in the system.

3) Cost can be saved, because by matching the processors to the tasks at hand, a minimum cost-per-function can be achieved.

The use of multiple processors brings potential difficulties as well, however, including:

1) Multiple processors can increase complexity and indeterminacy. Debugging systems with multiple asynchronous processors, running different algorithms, possibly sharing memory or at least communicating information to each other, is difficult. If the multiple processors are not all of the same type, multiple sets of development tools which are not integrated with each other may need to be used simultaneously.

2) Depending on the selection of connection topology and hardware capabilities of the individual processors, response times can be degraded rather than improved. Passing data, synchronization, arbitration for busses and memory, etc. can all degrade the performance of a multi-processor system, compared to a single processor system.




IX. DSP and DSPs

A Digital Signal Processing chip (DSP) is a microprocessor designed specifically to implement Digital Signal Processing (DSP) algorithms. This circular definition makes more sense when we inspect the types of algorithms and tasks which come from the branch of applied engineering mathematics called Digital Signal Processing. Algorithms such as digital filtering, the Fast Fourier Transform and other frequency transforms, and matrix and vector mathematical operations all fall into the realm of DSP algorithms. A processor specifically designed to perform these types of operations typically has a single-cycle multiply-accumulator (without a very deep pipeline, unlike many modern CPUs which "average" one instruction per cycle). DSP chips also have optimized parallel data paths to deliver data to the multiplier, since many algorithms require long running sums of the products of pairs of numbers (a vector inner product, or an FIR digital filter operation). To deliver data and store results efficiently, DSP chips typically have dedicated address generation hardware. These address registers essentially perform "pointer arithmetic," but also have hardware to do calculations on a modulo basis (automatically wrapping from N-1 to 0 on an increment, or from 0 to N-1 on a decrement), the automatic bit-reversing required for FFT calculation, and incrementing and decrementing by arbitrary amounts. Finally, a DSP chip typically has dedicated support for I/O, specifically for A/D and D/A chips, and some type of interface to a host or controller processor.
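
To see what that hardware is accelerating, consider the inner loop of an FIR filter written in portable C (the 4-tap moving-average coefficients are arbitrary). Each pass through the loop is one multiply-accumulate plus a modulo index update, which a DSP chip performs in a single cycle:

    #include <stdio.h>

    /* An FIR filter inner loop in portable C. The '% TAPS' index wrap
     * is what a DSP's modulo address registers do for free, and the
     * acc += coeff * sample step is the single-cycle multiply-
     * accumulate. The coefficients are an arbitrary 4-tap average. */
    #define TAPS 4

    static const double coeff[TAPS] = {0.25, 0.25, 0.25, 0.25};
    static double delay_line[TAPS];          /* circular buffer of past inputs */
    static int newest = 0;

    double fir(double input) {
        double acc = 0.0;
        delay_line[newest] = input;
        for (int k = 0; k < TAPS; k++)       /* a vector inner product */
            acc += coeff[k] * delay_line[(newest + TAPS - k) % TAPS];
        newest = (newest + 1) % TAPS;        /* modulo pointer update */
        return acc;
    }

    int main(void) {
        double x[] = {1, 0, 0, 0, 1, 1, 1, 1};   /* an impulse, then a step */
        for (int n = 0; n < 8; n++)
            printf("y[%d] = %.3f\n", n, fir(x[n]));
        return 0;
    }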

Applications of DSP algorithms include Data Compression/Decompression, Telephony (modems, FAX), Audio Processing, Audio Synthesis, Speech Synthesis and Recognition, 3D Audio, and 2D and 3D Graphics. The prime reason for using a DSP chip is that it is the most suitable piece of hardware for DSP algorithms. This can free up a less suitable host processor to do tasks for which it is more efficient. This, of course, translates to better system performance and decreased system cost. A cheap DSP and a cheap microcontroller can often perform better at a given cost than a high-powered host processor trying to do both tasks, neither of which it is best suited for. As discussed below, in a typical system a microcontroller might read some switches and do some small amount of display, while the DSP performs audio synthesis or signal processing.

One reason for not using a DSP chip is that such chips are notoriously hard to develop code for. The optimized parallel architecture makes it difficult for a compiler to automatically generate optimal code for such chips. Further, a multi-processor system involving different types of processors (microcontroller and DSP, for example) requires two development systems, and programmers with expertise in either or both systems. Another vote against DSP chips is that system complexity goes up in multi-processor systems, and the tasks of synchronization, control, and data movement often become more taxing on both processors than if the whole system were just running on a single, less optimal processor.

In recent times, the benefits of DSP architecture have found their way into new classes of processors, notably the "Media Processor" and the "Media Extended" host processors. Media Processors combine video, audio, and other media processing capability into a Very Long Instruction Word (VLIW) architecture. In these processors, large amounts of memory are shared between all data types, but the ALU and data paths are splittable into smaller sub-words. So a stream of 8-bit video data and two channels of 16-bit audio data might all flow independently into subsections of the ALU for operations. The same ALU can be reconfigured on the next block of processing to handle two 24-bit streams of numbers.

Media extensions, such as Intel's MMX and Sun's VIS, do somewhat the same thing to a modified host CPU. The existing floating point registers and floating point arithmetic unit can be split into smaller integer registers, and parallel operations on different integer data can be performed in one cycle. Both the Media Processors and the Media Extended host processors attempt to use reconfigurable hardware to achieve, in effect, better than single-cycle operation times.
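
The sub-word idea can even be imitated in portable C, which makes the mechanism clear. This is a sketch of the principle only, not any vendor's actual instruction set: four 8-bit values packed into one 32-bit word are added lane by lane, masked so that carries cannot cross lane boundaries:

    #include <stdio.h>
    #include <stdint.h>

    /* "SIMD within a register": add four unsigned bytes packed into one
     * 32-bit word, lane by lane, masking so that carries cannot spill
     * between lanes. Media-extended hardware does this in one
     * instruction; the masking below is the software imitation. */
    uint32_t add_4x8(uint32_t a, uint32_t b) {
        const uint32_t HI = 0x80808080u;      /* the top bit of each lane */
        /* add the low 7 bits of every lane, then patch in the top bits */
        return ((a & ~HI) + (b & ~HI)) ^ ((a ^ b) & HI);
    }

    int main(void) {
        uint32_t a = 0x01020304u;             /* lanes:  1,  2,  3,  4 */
        uint32_t b = 0x10203040u;             /* lanes: 16, 32, 48, 64 */
        printf("%08x\n", (unsigned)add_4x8(a, b));  /* prints 11223344 */
        return 0;
    }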

Trends for the future of DSP depend on the application. DSP chips will continue to improve somewhat in speed, but will grow cheaper and cheaper. These chips will find themselves in more and more dedicated embedded applications in devices and systems. The modern automobile now contains a number of DSP chips, some doing steering and suspension control, some doing audio in the car radio. DSP chips have also begun to provide support for high-speed multiprocessor applications, which allows for scalable designs. Media Processor future trends are still to be determined, because some say that media processors are themselves the future trend for desktop computing. Others say that Media Processors will only have use in dedicated devices such as set-top cable/audio boxes, and that the Multi-Media Extended host processors will be the home of DSP in the desktop computer.




X. Microcontrollers

As with all definitions related to hardware and performance throughout the history of computing, the definition of a microcontroller has also changed somewhat. Common features are relatively constant, however. In the past, microcontrollers typically exhibited:

1) Small Word Size and Integer Math: This keeps size, cost, and power consumption down.

2) Low Level Language Interface: Typically Assembler, maybe C

3) Hardware Interrupt Support

4) Peripheral Devices Required for Inputs and Outputs

5) Low Cost

Examples of microcontrollers from the 1970's - 1980's include:


H8, 68xx, 65xx, Zx
8-bit data, 8-bit instructions, 2K+ address space
Clock Rate: 1-2 MHz
Language and Interface: Assembler via TTY or Hex Keypad
Typical System Cost: $100.00

Those readers with some consumer microcomputer system experience might recognize that the 6800, 6502, Z80, and others from these families were actually the host processors resident in desktop computers like the early Apple I and II, the Commodore VIC-20 and 64, the Atari 400 and 800, various Tandy and Radio Shack computers, etc. One relatively common thread in microcontroller history is that as a processor ends its life cycle as a main processor, it may just be beginning its life as a microcontroller. A mature processor that can be manufactured cheaply, and which has a long history of reliable software and tools, often makes an excellent choice for a microcontroller.

Modern microcontrollers include many updated versions of historically popular microprocessors, and also some new processors designed specifically for use as microcontrollers. Many basic goals and features still persist, with some new additions:


1) Small Word Size (relatively), Integer Math (or not)
2) Higher Level Language Support: Assembler, C, Forth, BASIC, with tools
3) Peripheral Inputs Integrated
4) Low Cost

Examples of currently available microcontrollers:


Updated versions of 68xx, 65xx, Zx, + 680xx family
PIC Chip ($4-10.00)
The BASIC Stamp = PIC + More
16-bit data, high-level instructions, 2K memory
Clock Rate: 4-20 MHz
Language and Interface: BASIC via PC Serial Port
System Cost: $10-35.00




References:

Some Web References:


Educational Research Groups and Projects:


http://www.eecs.umich.edu/RTCL/ U. Michigan Real-Time Computing Lab
http://www.cs.cmu.edu/Groups/real-time/cmu.html Carnegie Mellon Real-Time Groups

Commercial:
From Artesyn: http://www.artesyn.com/cp/html/body_choosingos.html Choosing an OS
http://www.artesyn.com/cp/html/body_choosingproc.html Choosing a Processor
http://www.realtime-os.com/rtresour.html Resource List Compiled by E. Douglas Jensen

Some Non-Web References Made from Dead Trees:


"Encyclopedia of Computer Science"
A. Ralston and E. Reilly, eds.
New York: Van Nostrand Reinhold, 1993


"Real-Time Programming : Neglected Topics"
Caxton C. Foster.
Reading, Mass. : Addison-Wesley Pub. Co., c1981.


"Real-Time Software for Control : Program Examples in C"
David M. Auslander, Cheng H. Tham.
Englewood Cliffs, N.J. : Prentice Hall, c1990.


"Real-time systems, Specification, Verification, and Analysis,"
Mathai Joseph, ed.
Englewood Cliffs, N.J. : Prentice Hall, 1996


"Introduction to Real-time Software Design,"
S.T. Allworth and R.N. Zobel
New York, Springer Verlag, 1989