In our new Java Monthly edition, we’d like to introduce you to Jean-Philippe Bempel. He was kind enough to share his experience by answering 12 more Java-related questions.
A developer passionate about performance, runtimes (JVM, CLR), and Mechanical Sympathy, Jean-Philippe has more than 8 years of experience in low-latency trading systems. He now works at Datadog on a production debugger and its JVM implementation. He is also a committer on the OpenJDK project JDK Mission Control, which, along with his deep technical JVM talks, led him to become a Java Champion.
Dreamix: To start off, can you briefly describe what a safepoint in the JVM means?
Jean-Philippe Bempel: A safepoint is a point in the execution where we can safely stop threads to inspect their stacks and reach all objects, because the JVM knows precisely where they are.
Dreamix: How does the VM know whether the current point in the execution is a safepoint, so that the application threads can be suspended?
Jean-Philippe Bempel: The JVM knows because either we are interpreting bytecode, and between every bytecode instruction we are at a safepoint, or we are executing JITed code, and the JIT has emitted metadata (OopMaps) along with poll instructions that check whether the thread needs to stop at the current safepoint.
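To see this mechanism from the outside, safepoint pauses can be observed with unified logging; a minimal sketch, assuming a JDK 9+ JVM and a placeholder app.jar:

```shell
# Log every safepoint operation and its pause time (JDK 9+ unified logging).
# "app.jar" stands in for your own application.
java -Xlog:safepoint -jar app.jar
```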
Dreamix: Sometimes bringing the JVM to a safepoint state can be a high cost operation. For that reason, can a tradeoff or optimization be achieved?
Jean-Philippe Bempel: The TTSP (Time-To-SafePoint) is usually short, because safepoint polls are emitted at the end of a method or inside a long loop. However, there are some cases where we may wait a long time before a thread reaches the next safepoint: performing a System.arraycopy on a large buffer, or a large for loop over an int counter (a counted loop) with no method call but with expensive operations. External factors can also stall threads, like oversubscription of executing threads compared to the number of cores available, or kernel operations like swapping or THP (Transparent HugePages) compaction. There is no silver bullet for this, and each case needs to be carefully analyzed to understand the factors causing a long TTSP.
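The counted-loop case can be sketched in a few lines; the class and method names here are my own, and whether the C2 JIT actually omits the poll depends on JVM version and flags:

```java
public class CountedLoopDemo {

    // An int-indexed loop with no method calls is a "counted loop":
    // the C2 JIT may omit safepoint polls inside its body, so a thread
    // running it can only be stopped once the loop exits, which is one
    // way time-to-safepoint (TTSP) gets long.
    static long countedLoop(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i; // no call, no poll until the loop terminates
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(countedLoop(1_000_000));
    }
}
```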
Dreamix: How would you explain safepoint bias to a dummy?
Jean-Philippe Bempel: Profilers collect stacktrace samples at regular intervals. Collecting those samples requires stopping the threads to walk safely through their stacks. The easiest way to stop a thread is to use a VM operation at a safepoint. Those safepoints are located either in a long loop, at the end of a method, or at a native code transition. So samples will always show stacks at those safepoints, statistically inflating the importance of those points in the profile, while the expensive code may actually lie between them.
Dreamix: Would it be better if future JVM profilers were not safepoint-biased? Why?
Jean-Philippe Bempel: Modern JVM profilers try to collect stacktraces without using a VM operation, thus avoiding stopping all the threads at once. But debug information is still, by default, emitted only at safepoints, which does not improve the profile bias much, as I explain in detail in my latest blog entry.
Dreamix: What is the benefit of Java Mission Control over other profiling tools (VisualVM, JProfiler, Netbeans profiler)?
Jean-Philippe Bempel: JDK Mission Control allows you to open and process JFR files that are created by the JVM. The content of a JFR file is much more than just a collection of stacktrace samples; it also contains events emitted by the JVM, by code inside the JDK, or by your own custom events. So you can get information about locks, threads, allocations, GC, socket & file I/O, JIT, etc. JMC also includes an MBean console. Regarding execution profiling, JFR collects stacktraces without using a VM operation.
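Custom events like the ones mentioned above can be defined with the jdk.jfr API; a minimal sketch, where the event and class names are invented for illustration (requires a JDK with JFR, i.e. OpenJDK 11+):

```java
import jdk.jfr.Event;
import jdk.jfr.Label;
import jdk.jfr.Name;

public class JfrEventDemo {

    // A custom application event; the name and fields are illustrative.
    @Name("demo.OrderProcessed")
    @Label("Order Processed")
    static class OrderProcessedEvent extends Event {
        @Label("Order Id")
        long orderId;
    }

    public static void main(String[] args) {
        OrderProcessedEvent event = new OrderProcessedEvent();
        event.orderId = 42L;
        event.begin();
        // ... the work being measured would go here ...
        event.commit(); // recorded only when a JFR recording is active
    }
}
```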
Dreamix: Can you give a brief overview about what kind of issues JVM applications might have when running on Mesos containers?
Jean-Philippe Bempel: In a Linux container, the JVM can be running in a much more constrained environment than usual, and if it is not correctly configured we can run into issues: an OOM kill of the JVM process if the container memory is too small compared to what the JVM requires, or excessive CPU throttling or thread oversubscription if CPU quotas are too tight.
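A sketch of flags that address these constraints (JDK 10+ flags; app.jar and the values are illustrative, not recommendations):

```shell
# Size the heap as a fraction of the container's memory limit instead of
# the host's RAM, so the process stays under the cgroup limit, and tell
# the JVM how many CPUs to assume, matching the container's CPU quota.
java -XX:MaxRAMPercentage=75.0 \
     -XX:ActiveProcessorCount=2 \
     -jar app.jar
```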
Dreamix: Consider the case where objects are no longer used but are still referenced, so they cannot be cleaned up by the GC (a memory leak). Is there a way they can somehow be found and cleaned automatically?
Jean-Philippe Bempel: Finding the root cause of a memory leak is not an easy task. The old-fashioned way is to take heap dumps at different points in time and compare them to find the classes whose instance counts keep increasing. The other way is to use JFR and the OldObjectSample event, which Marcus Hirt describes in his blog. It will give you candidate objects to examine, with reference chains to GC roots, and help you understand why those instances are retained.
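The JFR-based approach can be driven with jcmd; a sketch, where &lt;pid&gt; is the target JVM's process id and the recording name and filename are made up:

```shell
# Start a recording that tracks OldObjectSample events with reference
# chains back to GC roots (more expensive, but needed for leak hunting).
jcmd <pid> JFR.start name=leak-hunt path-to-gc-roots=true

# ... let the suspected leak build up, then dump and open the file in JMC.
jcmd <pid> JFR.dump name=leak-hunt filename=leak-hunt.jfr
```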
Dreamix: At the moment Datadog is a leading monitoring platform on the market. Can you shed some light on the challenges there and the benefits that the platform brings?
Jean-Philippe Bempel: Besides the usual metrics, logs & distributed tracing, Datadog also provides Continuous Profiling. For the JVM we are using JFR, and JMC for backend processing. With applications deployed in the cloud, it’s more complex to also deploy profiling tools. Having JFR baked into the JVM is very helpful and requires fewer operations to gather all the information needed. JFR is designed to be always on, with a controlled overhead.
Dreamix: Processing a big amount of data with a minimum delay and high speed is important for every system. You have experience in low latency trading systems. Can you give examples of what kind of optimizations have been made there in order to respond quickly to the market changes?
Jean-Philippe Bempel: In the trading systems I dealt with, one of the first issues is GC. We were using ParallelGC for those systems. I know this is counter-intuitive because it is a throughput-oriented collector, but the thing is, ParallelGC is very predictable and easy to reason about. And while we need low latency, predictability is also key in trading. The heap was sized large enough to avoid a Full GC during trading hours, and then we kept allocations in the critical path careful enough to have a minor GC of 20-30 ms only every 5 minutes or so. We also used Azul’s C4 collector very early (2012), with the Prime JVM (formerly Zing JVM), with very good results!
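In the spirit of what is described above, a sketch of such a setup (the heap sizes, log path, and jar name are made up, not the actual production values):

```shell
# Explicit ParallelGC with a fixed, generously sized heap so that no
# Full GC occurs during trading hours; GC logging to verify the pauses.
java -XX:+UseParallelGC \
     -Xms8g -Xmx8g \
     -Xlog:gc*:file=gc.log \
     -jar trading-engine.jar
```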
Then, to reach 100 µs to route orders to the market, one of the most effective optimizations was to tune the machine (BIOS/OS) to avoid latency, for example from power management (200 µs to wake a core that was asleep, while you want to route an order in 100 µs…), and to use thread affinity to avoid L3 cache thrashing by noisy neighbors, etc.
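On Linux, the affinity and power-management points can be sketched like this (the core ids, jar name, and latency threshold are illustrative; cpupower availability depends on the distribution):

```shell
# Pin the JVM's threads to dedicated cores to avoid cache thrashing
# from noisy neighbors on the same socket.
taskset -c 2,3 java -jar trading-engine.jar

# Disable CPU idle states deeper than 10 µs so cores don't take
# hundreds of microseconds to wake up.
cpupower idle-set -D 10
```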
Dreamix: How do you update yourself about the latest trends in Java?
Jean-Philippe Bempel: My main source of information is Twitter! With the right (and carefully crafted) set of people to follow, you can cover a lot of ground on the latest news in the JVM world. The other source is participating in conferences and unconferences (JCrete, JAlba, JChateau, …), where you can meet people and have long conversations about many subjects!
Dreamix: Can you recommend a favorite book about programming? What about a favorite book in general?
Jean-Philippe Bempel: I am not reading many books these days compared to my early days as a programmer back in the ’90s, but the last book I found really good is Java Concurrency in Practice. It’s an old one now, but the fundamentals are still solid today. As for books in general, I’m a big fan of Isaac Asimov.
Is there anything else you would like to ask Jean-Philippe Bempel? What is your opinion on the questions asked? Who would you like to see featured next? Let’s give back to the Java community together!