New reader tasks will be held back (in the Readers Task Queue on the left side of Figures 2 and 4), as soon as a
writer task has asked for permission to write. Thus, new reader tasks might be delayed for a long time. In an extreme
case, reader tasks might suffer starvation. This extreme case will be exceedingly rare if, as is typical when using
Multiple Reader – Writer Locks, reading is quite frequent but writing is quite infrequent.
3. “Fair” Lock.
This is a Multiple Reader – Writer Lock that will not allow either a reader task or a writer task to starve. It strives to be
‘fair’ to both readers and writers, but in doing so it might force some reader tasks or writer tasks to wait for longer
times than the absolute minimum possible waiting time.
One way to achieve ‘fairness’ is for the Lock to service tasks on a “First Come, First Served” basis within the
constraints of a Multiple Reader – Writer Lock. In other words, new reader tasks will be held back if there’s a writer
task already waiting. And new writer tasks will be held back if there’s a reader task already waiting.
Of these 3 variants of Multiple Reader – Writer locks, the “Writers Preference” variant is probably the most common
implementation in operating systems. It works well when writer tasks lock a Lock infrequently, and only for short
periods of time.
Developers of multi-tasking software with an RTOS are familiar with 3 kinds of traditional semaphores: counting
semaphores, binary semaphores and mutex semaphores. As embedded software enters the era of multi-core
processing, these 3 are being joined in a number of multi-core oriented operating systems by a fourth kind of
semaphore: the multiple reader – writer lock. We have seen how these new semaphore locks ensure mutual
exclusion when data writer tasks are involved, while allowing multiple reader tasks to access data at the same time.
These new semaphores often offer better performance than traditional kinds of semaphores when multiple
tasks desire truly parallel read-access to shareable data. Such performance acceleration is of benefit in many
producer-consumer applications, data base interactions, and multi-core software designs.
1. COUNTING SEMAPHORES:
“Don't believe everything you hear about RTOSs” by Michael Barr.
URL: http://www.eetimes.com/design/automotive-design/4007693/Don-t-believe-everything-you-hear-about-RTOSes .
2. UNBOUNDED PRIORITY INVERSIONS:
“Mutexes Battle Priority Inversions” by David Kalinsky.
URL: http://kalinskyassociates.com/Wpaper2.html .
Advanced Technical Paper:
"The Fourth Semaphore:
Multiple Reader - Writer Locks"
Users of real-time operating systems (RTOS’s) are familiar with the various kinds of semaphores they offer.
Depending on which RTOS you choose, these may include counting semaphores, binary semaphores and mutex
semaphores. As the embedded software world transitions into the era of multi-core processing, these 3 “traditional”
kinds of semaphores are being joined in a number of multi-core oriented operating systems by a fourth kind of
semaphore: the multiple reader – writer lock. In this article, we will see how these new semaphores work, and which
multi-core application problems are best solved using this special kind of semaphore.
A software design problem that is encountered more and more often when using multi-core processors, is the so-
called “Multiple Readers – Writers” problem. This is a situation in which a large data area must be shared among
multiple tasks. Clearly, when one task is writing into the data area, no other task should be writing into it at the same
time. And no other task should be reading from it at that time. This discipline could also be enforced by a “traditional”
semaphore. But in the “Multiple Readers – Writers” problem, we would like to allow many tasks to read from the data
area at the same time – at those times when no task is writing into it. Traditional semaphores do not offer this
flexibility, as they instead limit access to the data area to only one single reader task when there is no task writing.
Thus, traditional semaphores are a less-than-optimal solution for the multiple readers – writer problem.
An application example is the management of an airline reservation seat map data base. Many different air travel
customers may be interested in certain seats on a certain flight, and may ask to see the seat map of nearby seats at
the same time. For instance, a customer in San Francisco, and another in Berlin, and another in Tel-Aviv may all at
once inquire about the seats near seat 18D in Figure 1 below.
|This material first appeared on the internet, in edited form, in EE Times - Embedded Design, October 26, 2010.
This material has also been published in German, in Elektronik Praxis magazine (Germany), June 8, 2010.
© Copyright 2016, D. Kalinsky Associates, All Rights Reserved.
This page Updated March 25, 2016
on this Flight
When all of those customers ask about the seat map, we would like to provide them with this information as quickly
as possible – all at the same time, rather than one-after-another. But if one of those customers actually decides to
reserve a seat or two, we want to allow only that one customer to select that seat or two. We want to prevent different
customers from attempting to reserve the same seat(s) at the same time. And we want to prevent reading of the seat
map while the seat reservations are in the midst of being updated. A “Multiple Reader – Writer” lock can do this with
optimal efficiency, by allowing multiple simultaneous readers when there is no active writer, and no simultaneous
access at all during active writing.
This “Multiple Reader – Writer” lock is our fourth kind of semaphore. It is becoming available in a number of multi-
core oriented embedded operating systems. In particular, it can be found in Symmetric Multi-Processing (“SMP”)
operating systems with origins either in the world of traditional single-CPU real-time operating systems (RTOS) or
origins in the world of SMP Linux. Using these operating systems, tasks (sometimes called ‘threads’ or ‘processes’)
may dynamically move from core to core. These tasks may at the same time be interested in reading from shared
data tables or data bases. Or a number of tasks may have a Producer-Consumer relationship for some shared
information, with multiple consumers often interested in receiving information at the same time.
On a multi-core chip, different tasks may be running on different cores in true parallelism (beyond the pseudo-
parallelism of multitasking within a single CPU). So the ability to provide them with the data they need in true parallel
fashion can speed up performance -- when compared with the sequentialization of data reading that would be
provided by a “traditional” semaphore.
REVIEW OF 3 KINDS OF “TRADITIONAL” SEMAPHORES
Single-CPU oriented RTOSs typically offer the following 3 kinds of “traditional” semaphores: counting semaphores,
binary semaphores and mutex semaphores. Each of them is targeted for a different purpose in our multitasking
The most sophisticated of the three is the counting semaphore. It can be thought of like a plastic cup containing a
number of identical ‘tokens’. Tasks can put individual tokens into the cup, or take individual tokens out. Counting
semaphores have often been used for regulating the access of tasks to multiple equivalent serially-shareable
resources. For instance, 17 tasks may wish to share 4 identical printers. In this case, a counting semaphore can be
created and initialized with 4 tokens, as illustrated below. Tasks are then programmed to take a token before printing,
and return the token after printing is done.
But such tasks run into a problem: After one or more tokens has been taken, the task will not be able to determine
which printer(s) are still available. Thus, counting semaphores are only a very partial solution to the sharing of
multiple equivalent resources. [See reference 1.]
A better way to use counting semaphores is for signaling from one task to another. For example, if 2 tasks of
different priorities need to execute the same total number of times over the long run: A counting semaphore can be
created with an initial count of zero (no ‘tokens’ in it). Every time the higher priority task runs, it creates a token and
puts it into the semaphore, thus incrementing the semaphore’s count. The lower priority task of the pair waits at the
semaphore for tokens to appear, and runs once for each new token, thus consuming the token and decrementing the
semaphore’s count. If the high priority task runs with moderate bursts, the lower priority task will eventually ‘catch up’
to the same total number of executions.
Classic binary semaphores can be thought of simply as counting semaphores that are limited to a maximum count
of one (1 single token). They can also be used for signaling from task to task, in situations where signals (counts,
tokens) will not accumulate or need not be counted. Classic binary semaphores should be avoided for purposes of
mutual exclusion. For example, they should not be used to regulate sharing of a single printer, or a shared data
structure among multiple tasks. The reason is a famous (or should I say ‘infamous’) bug called an “unbounded
priority inversion” that is intrinsic in classic binary semaphores. [See reference 2.]
Mutex semaphores can be thought of as binary semaphores that have had the “unbounded priority inversion” bug
repaired. Hence, a mutex is the mechanism of choice for purposes of mutual exclusion. It’s the way to regulate
sharing of a printer, or a shared data structure among multiple tasks. Most mutexes solve the “unbounded priority
inversion” bug by manipulating the priority of a task as it holds the mutex, using techniques such as ‘priority
inheritance’ protocol or ‘priority ceiling’ protocol.
THE FOURTH SEMAPHORE
All three kinds of “traditional” semaphores treat all tasks the same way, when they are used to regulate access to a
single shared resource. They do not distinguish between tasks that wish to read from a shared resource, as against
tasks that wish to write to the shared resource. Only the fourth kind of semaphore, the “Multiple Reader – Writer”
Lock, differentiates between reader tasks and writer tasks. And by doing so, it allows simultaneous parallel access
to the shared resource by multiple reader tasks only.
Multiple Reader – Writer Locks are inherently complex. Inside of the operating systems that support them, each lock
has associated with it a pair of task-waiting queues, as shown in Figure 2 below.
|FIGURE 1: Airline reservation seat map example
of “Multiple Readers – Writers” problem.
|FIGURE 2: “Multiple Reader – Writer” Lock and associated Task Queues.
Readers Task Queue
Writers Task Queue
One task queue, shown to the right in Figure 2, is a queue in which writer tasks can be held by the operating system,
to wait until reader tasks have finished reading. A second task queue, shown to the left in Figure 2, is a queue in
which reader tasks can be held by the operating system, to wait until a writer task (or perhaps several) have finished
writing. The management and control of these tasks queues is complex, often involving additional internal
semaphores within the operating system logic itself. This complexity can sometimes impact performance, so these
Locks should be used with care.
Multiple Reader – Writer Locks typically provide higher performance than other semaphores or locks when there are
lots of concurrent reader tasks, but writing occurs infrequently. This is quite common in multi-core software designs.
Multiple Reader – Writer Locks typically do not provide a significant performance benefit when there is a high rate of
writing by writer tasks. This is because writer tasks must run one-at-a-time, and only when no reader tasks are
actively reading. In this situation, “traditional” semaphores may well provide higher performance. In a single-CPU
environment, an alternative to consider may be a mutex. In a multi-core environment, am alternative to consider might
be a “spinlock”.
APPLICATION PROGRAMMER’S INTERFACE
There is no standardization of the application programmer’s interface (API) to Multiple Reader – Writer Locks as
implemented in the various operating systems that support them. So I’d like to show a “generic” API that illustrates
the typical services offered, but is not identical to the API of any actual operating system:
• RWLockCreate() Create a New Multiple Reader – Writer Lock.
• RWLockDelete() Destroy an existing Multiple Reader – Writer Lock.
• RWLockRead() Lock the Lock for Reading. [Shareable.]
• RWLockEndRead() Un-lock, after the end of Reading.
• RWLockWrite() Lock the Lock for Writing. [Non-Shareable.]
• RWLockEndWrite() Un-lock, after the end of Writing.
Once such a lock has been created to protect a shareable data resource, tasks that wish to read from it must first call
RWLockRead() . Later, those tasks must call RWLockEndRead() when they are finished reading. More than 1 task
will be permitted to read at the same time. Tasks that wish to write must first call RWLockWrite() . Later, those tasks
must call RWLockEndWrite() when they are finished writing. Only 1 task at a time will be permitted to write, and this
will only be when no task has permission to read.
VARIATIONS OF MULTIPLE READER – WRITER LOCKS
Not all Multiple Reader – Writer Locks behave in precisely the same way. Different operating systems that offer
them, implement them with differing internal logic that can express different application software design preferences.
We will take a quick look at 3 possible differing approaches:
1. “Readers Preference” Lock.
This is a Multiple Reader – Writer Lock that tries to maximize potential concurrency by giving clear preference to reader
tasks. Whenever some tasks are already reading, any new readers will be allowed to join them. A new reader task
will only be held back (in the Readers Task Queue on the left side of Figures 2 and 3), if a writer task is already
holding permission to write.
|FIGURE 3: “Multiple Reader – Writer” Lock with Readers Preference.
Writer tasks will be held back (in the Writers Task Queue on the right side of Figures 2 and 3), so long as any reader
task is still reading. Thus, writer tasks might be delayed for a long time. And in an extreme case, writer tasks might
suffer starvation in the Writers Task Queue.
2. “Writers Preference” Lock.
This is a Multiple Reader – Writer Lock that strives to allow writer tasks to update the protected data as soon as
possible. But it still allows reader tasks, including possibly multiple parallel readers, to finish the reading they’ve
already started, before a writer task is given access to the data.
|FIGURE 4: “Multiple Reader – Writer” Lock with Writers Preference.