Concurrency is hard. Java is no exception. In this post (and possibly future posts), I will record Java concurrency traps and pitfalls that I have experienced or heard about.
Nested writes in ConcurrentHashMap.compute can deadlock
ConcurrentHashMap takes a per-bucket lock in write operations (e.g. put, compute) to protect the bucket's nodes. If a write nested inside the remapping function touches a key that falls into the same bucket ConcurrentHashMap.compute is currently serving, the operation deadlocks (or, in the computeIfAbsent case on Java 8, spins in an endless loop). The javadoc of ConcurrentHashMap.compute and its siblings warns about this:
Some attempted update operations on this map by other threads may be blocked while computation is in progress, so the computation should be short and simple, and must not attempt to update any other mappings of this Map.
I encountered this once in production code, where the nested update was hidden behind a ServiceLoader call.
Others have run into this as well:
- JDK-8062841: ConcurrentHashMap.computeIfAbsent stuck in an endless loop
- Deadlock due to ConcurrentHashMap.compute in PrometheusMeterRegistry
- Avoid Recursion in ConcurrentHashMap.computeIfAbsent()
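The pitfall can be sketched in a few lines. The keys 1 and 17 below are chosen so that they land in the same bin of a default-sized (16-bin) table; that is an assumption about the map's internals made for illustration, not part of the public API. On Java 8 the nested write hangs forever; newer JDKs detect some recursive updates and throw `IllegalStateException("Recursive update")` instead. Either way, the nested write never completes normally:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;

public class NestedComputeDemo {
    public static void main(String[] args) throws InterruptedException {
        ConcurrentHashMap<Integer, Integer> map = new ConcurrentHashMap<>();
        AtomicReference<String> outcome = new AtomicReference<>("hung");
        Thread worker = new Thread(() -> {
            try {
                // The inner write targets key 17, which collides with key 1
                // in a default-sized table, i.e. the bin the outer call holds.
                map.computeIfAbsent(1, k -> map.computeIfAbsent(17, k2 -> 42));
                outcome.set("completed");
            } catch (IllegalStateException e) {
                // Newer JDKs detect the recursive update and fail fast.
                outcome.set("rejected: " + e.getMessage());
            }
        });
        worker.setDaemon(true); // let the JVM exit even if the worker hangs
        worker.start();
        worker.join(3000);
        System.out.println(outcome.get());
    }
}
```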
CompletableFuture.complete runs dependent non-async computations when it completes the future
Actions supplied for dependent completions of non-async methods may be performed by the thread that completes the current CompletableFuture, or by any other caller of a completion method.
I think this is not a good design: it makes the cost of CompletableFuture.complete depend on whatever callbacks were registered via the then*, whenComplete, and handle families. I did see production code exploit this subtlety to build a strong happens-before relation between a whenComplete callback and the code after complete.
There are others encountering this.
There is CompletableFuture.completeAsync, but not CompletableFuture.completeExceptionallyAsync.
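A minimal sketch of the behavior: a non-async dependent action registered with thenAccept before completion is executed directly by the thread that later calls complete, before complete returns.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;

public class CompleteRunsDependentsDemo {
    public static void main(String[] args) {
        CompletableFuture<Integer> future = new CompletableFuture<>();
        AtomicReference<String> callbackThread = new AtomicReference<>();
        // Register a non-async dependent action before the future completes.
        future.thenAccept(v -> callbackThread.set(Thread.currentThread().getName()));
        // complete() executes the pending dependent action right here,
        // on the calling thread, before it returns.
        future.complete(42);
        System.out.println(callbackThread.get()); // prints "main"
    }
}
```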
CompletableFuture.get may swallow InterruptedException if the waited-on future completes immediately after Thread.interrupt
I found this while investigating FLINK-19489 and reported it as JDK-8254350. It is fixed only in Java 16 and later.
ConcurrentHashMap.size and ConcurrentHashMap.isEmpty do not synchronize with a concurrent ConcurrentHashMap.remove
Bear in mind that the results of aggregate status methods including size, isEmpty, and containsValue are typically useful only when a map is not undergoing concurrent updates in other threads. Otherwise the results of these methods reflect transient states that may be adequate for monitoring or estimation purposes, but not for program control.
Normally, we don’t rely on size or isEmpty to detect a concurrent remove. The subtlety is that size or isEmpty called after a remove may still observe the state from before that remove, if a concurrent remove won the race. Let’s imagine the following sequence:
- A ConcurrentHashMap contains key a. thread-1 and thread-2 remove key a concurrently. A size or isEmpty call after the remove in thread-1 may not observe the remove.
I became aware of this while investigating FLINK-19448, where I linked a repl for evaluation.
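The interleaving above can be sketched as follows. The anomaly itself is timing-dependent and will not show up on every run, so the comment only marks where a stale view may be observed; the final check runs after both threads have finished, when the answer is deterministic:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

public class SizeAfterRemoveDemo {
    public static void main(String[] args) throws InterruptedException {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        map.put("a", 1);
        CountDownLatch start = new CountDownLatch(1);
        Runnable remover = () -> {
            try {
                start.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            map.remove("a");
            // Pitfall: even though some remove("a") has already happened,
            // isEmpty() here may still report a stale, non-empty view while
            // the other thread's removal is in flight.
            boolean observedEmpty = map.isEmpty();
        };
        Thread t1 = new Thread(remover);
        Thread t2 = new Thread(remover);
        t1.start();
        t2.start();
        start.countDown(); // let both removers race
        t1.join();
        t2.join();
        System.out.println(map.isEmpty()); // prints "true" once quiescent
    }
}
```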
Thread.getState may report Thread.State.BLOCKED due to class loading
Returns the state of this thread. This method is designed for use in monitoring of the system state, not for synchronization control.
I found this while investigating FLINK-19864. Here is the repl.
Conclusion
Shit happens. Murphy wins. There is no silver bullet; we need caution and enough eyeballs.