What you get when you use locks to protect data and you have a cycle in your dependencies.
Prevent locks from leading to deadlock:
or eliminate locks entirely:
There are four necessary and sufficient conditions for DeadLock
. All of these conditions need to hold within a ConcurrentSystem
to occur. These conditions are:
- Blocking shared resources. A shared resource which can be accessed concurrently is protected by some mechanism (typically a lock in practice) that can block a process or thread waiting to acquire the resource while another process or thread is using the resource. (This can be avoided by using LockFreeSynchronization)
- No pre-emption. A process or thread cannot force another process or thread to drop a shared resource it is holding.
- Acquiring while holding. A process or thread may attempt to acquire additional shared resources while it is holding others. (at least two resources)
- Circular wait. A cycle (schedule) exists in the ConcurrentSystem such that the above three conditions all hold for a set of processes or threads accessing a shared resource. (at least two processes or threads)
- While it's generally considered a dumb thing to do...it is certainly possible to block on yourself, especially if non-recursive mutexes are used.
As these conditions are necessary and sufficient, it is possible to break any DeadLock
by breaking any one of the above four conditions.
Typically the first condition is a non-starter because in order to break it, a serializable schedule needs to be enforced over a shared resource. Practically this usually isn't tractable due to the inherent non-determinism within most schedulers.
Pre-emption leads to issues of LiveLock
(i.e. can cause the system to violate liveness properties and starve) if not used prudently (see DiningPhilosophers
). A type of pre-emption commonly used is rollback in transaction based systems. E.g. some RDBMSes will often attempt to break a DeadLock
by rolling back the offending transactions and re-attempting them (aside: liveness is then 'tackled' by typically aborting the processes with error after a certain number of repeated rollbacks i.e. the RDBMS spots the LiveLock
but cannot prevent it, at least ensuring the system does not InfiniteLoop
The third condition ensures that a ConcurrentSystem
with only one shared resource (with only one atomic access mechanism) will never DeadLock
. This is akin to the GiantLock?
approach that is often applied to make an API ThreadSafe
. When more than one shared resource exists within the system, it usually isn't practical to attempt to break this condition if the processes or threads necessarily require access to more than one of those shared resources to perform their work (e.g. the "chopsticks" in the DiningPhilosophers
The final condition emphasises the behavioural aspect of the ConcurrentSystem
and is the condition most commonly tackled. Ensuring that the acquisition of locks always occurs in the same order by all processes or threads sharing a resource is one commonly used way of breaking this potential for a circular wait in the system and so ensures DeadLock
will not occur.
It should be noted that the first condition is a broad criterion that encompasses most approaches to concurrency (i.e. interference avoidance). The second and third conditions above are more policy statements on the application of ConcurrencyControl?
within the system. The fourth condition is a behavioral statement concerning system dynamics.
It is for these reasons that generally the fourth condition is most easily broken (especially if you're fixing a DeadLock
bug during maintenance), the second and third conditions are harder to break, particular if there is a reliance on third-party software within the system which enforces its own concurrency policy. The first condition is arguable the hardest to break as it effectively states that you need to write your own scheduler to avoid destructive interference within the system. This directly runs against the principle of OnceAndOnlyOnce
and generality within SoftwareEngineering
as a whole, but can have significant performance gains (ThreadsWithoutLocks?
). The complexity and rigour required is generally prohibitive in practice (i.e. there are easier ways of breaking DeadLock
Not to be confused with StaleLock?
Discussion about deadlock in biological systems moved to BiologicalDeadlock
See also: DatabaseDeadlockAvoidance