Using exceptions for flow control is slow, even for DateTimeFormatter

Last week, I reviewed a seemingly simple patch to an Apache Crunch pipeline. The functional details don’t matter, but the pipeline processes loads of hits from HTTP proxies. Each hit has a timestamp, and the patch aims to enrich every hit with some metadata indexed by user and business date (i.e., we query a repository with a composite key, part of which is the date formatted as a yyyyMMdd string). The functional requirements are clear, the code is simple, and the repository is more than fast enough not to affect the performance profile of the pipeline...

October 4, 2016 · 8 min · 1555 words · Clément MATHIEU

Chasing down Guava cache slowness

This post describes how I rediscovered a three-year-old performance issue in Guava caches.

August 29, 2015 · 5 min · 1053 words · Clément MATHIEU

Presentation, Sample 'em all

This post shares the slide deck I used to introduce flame graphs and Java profilers to other teams when I joined Mediamétrie and reduced the execution time of several data pipelines by 100x or more.

October 26, 2014 · 2 min · 233 words · Clément MATHIEU

OpenJDK JEP 180: HashMap vs collisions

This post explains how OpenJDK JEP 180 reworks the HashMap implementation to mitigate complexity attacks.

May 1, 2014 · 16 min · 3300 words · Clément MATHIEU