Concurrency is hard, and Java is no exception. In this post (and possibly future posts) I will record traps and pitfalls in Java that I have experienced or heard of.
Nested write in ConcurrentHashMap.compute could deadlock
ConcurrentHashMap uses bucket-level locks in its write operations (e.g. put, compute) to protect bucket nodes. If the key of a nested write falls into the same bucket that ConcurrentHashMap.compute is currently serving, the operation deadlocks. The javadoc of ConcurrentHashMap.compute and its siblings warns about this:
Some attempted update operations on this map by other threads may be blocked while computation is in progress, so the computation should be short and simple, and must not attempt to update any other mappings of this Map.
I encountered this once in production code, where the nested update was hidden behind a ServiceLoader call.
Others have encountered this too:
- JDK-8062841: ConcurrentHashMap.computeIfAbsent stuck in an endless loop
- Deadlock due to ConcurrentHashMap.compute in PrometheusMeterRegistry
- Avoid Recursion in ConcurrentHashMap.computeIfAbsent()
CompletableFuture.complete will run non-async computations if it completes the future
Actions supplied for dependent completions of non-async methods may be performed by the thread that completes the current CompletableFuture, or by any other caller of a completion method.
I think this is not a good design: it makes CompletableFuture.complete vulnerable to whatever callbacks were registered via the thenXxx, whenComplete, and handle methods. That said, I did see production code exploit this subtlety to build a strong happens-before relation between a whenComplete callback and the code after complete.
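The behavior is easy to observe; here is a small sketch of my own. The callback is registered before completion, so complete runs it inline on the completing thread before returning.

```java
import java.util.concurrent.CompletableFuture;

public class CompleteRunsDependents {
    public static void main(String[] args) {
        CompletableFuture<String> future = new CompletableFuture<>();

        // Register a non-async dependent while the future is still incomplete.
        future.thenAccept(value ->
                System.out.println("thenAccept runs on: " + Thread.currentThread().getName()));

        // complete() executes the dependent registered above on this very thread.
        future.complete("done");
        System.out.println("complete returned on: " + Thread.currentThread().getName());
    }
}
```

Both lines print the main thread's name: the dependent action piggybacks on the caller of complete.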
Others have run into this as well.
There is CompletableFuture.completeAsync, but no CompletableFuture.completeExceptionallyAsync.
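If you need the exceptional counterpart, a hedged workaround (my own sketch, not a JDK API) is to hop onto an executor yourself before completing:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

final class CompletableFutures {
    // Completes the future exceptionally from the given executor, so already
    // registered non-async callbacks run on the executor's thread rather than
    // on the thread calling this method.
    static <T> void completeExceptionallyAsync(
            CompletableFuture<T> future, Throwable error, Executor executor) {
        executor.execute(() -> future.completeExceptionally(error));
    }
}
```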
CompletableFuture.get may swallow InterruptedException if the awaited future completes immediately after Thread.interrupt
This is what I found while investigating FLINK-19489 and reported as JDK-8254350. It is only fixed in Java 16 and later.
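Below is a best-effort reproduction sketch of my own (not the one attached to the bug report). Since this is a race, it may take many iterations to show up, and only on JDK 15 or earlier.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;

public class SwallowedInterrupt {
    public static void main(String[] args) throws Exception {
        for (int i = 0; i < 100_000; i++) {
            CompletableFuture<Void> future = new CompletableFuture<>();
            CountDownLatch started = new CountDownLatch(1);

            Thread waiter = new Thread(() -> {
                started.countDown();
                try {
                    future.get();
                    // get() returned normally, but the interrupt issued before
                    // complete() must not be lost: the flag should still be set.
                    if (!Thread.currentThread().isInterrupted()) {
                        System.out.println("interrupt swallowed");
                    }
                } catch (Exception expected) {
                    // InterruptedException is the other legitimate outcome.
                }
            });
            waiter.start();
            started.await();

            // Interrupt first, then complete right away. On affected JDKs,
            // get() occasionally returns the value and drops the interrupt.
            waiter.interrupt();
            future.complete(null);
            waiter.join();
        }
    }
}
```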
ConcurrentHashMap.size or ConcurrentHashMap.isEmpty does not sync with a concurrent ConcurrentHashMap.remove
Bear in mind that the results of aggregate status methods including size, isEmpty, and containsValue are typically useful only when a map is not undergoing concurrent updates in other threads. Otherwise the results of these methods reflect transient states that may be adequate for monitoring or estimation purposes, but not for program control.
Normally we don’t rely on size or isEmpty to detect a concurrent removal. The subtlety is that even when a concurrent remove has already succeeded, a size or isEmpty called after your own remove may still observe the state from before that remove. Let’s imagine the following sequence.
- A ConcurrentHashMap contains key a.
- thread-1 and thread-2 remove key a concurrently.
- A size or isEmpty call after remove in thread-1 may not observe the remove.
I became aware of this while investigating FLINK-19448, where I linked a repl for evaluation.
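Here is a rough sketch of the race (my own illustration, not the linked repl). The window between unlinking the node and updating the size counters is narrow, so it may take many iterations to observe the anomaly.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

public class IsEmptyAfterRemove {
    public static void main(String[] args) throws Exception {
        for (int i = 0; i < 1_000_000; i++) {
            ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
            map.put("a", "value");
            CountDownLatch start = new CountDownLatch(1);

            // thread-2 races with the main thread (thread-1) on remove("a").
            Thread thread2 = new Thread(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                map.remove("a");
            });
            thread2.start();

            start.countDown();
            boolean removedByMe = map.remove("a") != null;
            boolean empty = map.isEmpty();
            thread2.join();

            // If thread-2 won the race (our remove returned null), the map has
            // no entries left, yet isEmpty may still report false because the
            // size counters lag behind the node removal.
            if (!removedByMe && !empty) {
                System.out.println("stale isEmpty observed at iteration " + i);
            }
        }
    }
}
```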
Thread.getState could report Thread.State.BLOCKED due to class loading
Returns the state of this thread. This method is designed for use in monitoring of the system state, not for synchronization control.
I found this while investigating FLINK-19864. Here is the repl.
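To make the mechanism concrete, here is a small sketch of my own (not the linked repl): two threads load the same class name through a deliberately slow class loader, and the second one shows up as BLOCKED on the class loading lock even though it never takes an explicit lock itself.

```java
public class BlockedOnClassLoading {
    public static void main(String[] args) throws Exception {
        // A class loader that holds the class loading lock for a while before
        // failing to find the class.
        ClassLoader slowLoader = new ClassLoader() {
            @Override
            protected Class<?> findClass(String name) throws ClassNotFoundException {
                try {
                    Thread.sleep(5_000);
                } catch (InterruptedException ignored) {
                }
                throw new ClassNotFoundException(name);
            }
        };

        Runnable load = () -> {
            try {
                slowLoader.loadClass("does.not.Exist");
            } catch (ClassNotFoundException ignored) {
            }
        };

        Thread loader1 = new Thread(load, "loader-1");
        Thread loader2 = new Thread(load, "loader-2");
        loader1.start();
        Thread.sleep(500); // let loader-1 grab the class loading lock first
        loader2.start();
        Thread.sleep(500);

        // loader-2 waits for the class loading lock held by loader-1, so it
        // reports BLOCKED even though the application code takes no lock.
        System.out.println("loader-2 state: " + loader2.getState());

        loader1.join();
        loader2.join();
    }
}
```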
Conclusion
Shit happens. Murphy wins. There is no silver bullet; we need caution and enough eyeballs.