protoc-gen-docbook – Convert protobuf source into DocBook and PDF

Documentation has always been Protobuf's weakest area. Proto source files are expected to be used like an IDL. This works for simple interfaces, but falls apart as the interface increases in complexity with multiple layers of source files.

With the latest Protobuf update, 2.5.0, the protobuf compiler finally preserves the comments within proto source files in its Descriptor definition. This opens the door to documenting proto files.


I spent a week or so writing a protobuf compiler plugin that will convert a proto source file into DocBook and PDF. The plugin is called protoc-gen-docbook, following the protobuf compiler’s plugin naming convention.

The results are very satisfying, and the plugin has become a useful tool in my development life. I have open sourced the project on Google Code under the New BSD License.

Here are the shortcut links to the project homepage, which contains many more details on the project.


Quick Start:

Sample Output:

Implementation Details:

As usual, feedback is always welcome.

SimpleDateFormat is slow

Optimization is often full of surprises. Whether it is low-level C or high-level JavaScript, I always learn something from the profiler. And more often than not, it is the little things that matter.

Java's SimpleDateFormat is a powerful time formatter that can handle various date formats in different locales. This power comes at a price. It is slow – very slow.

I learned this last year when working on an Android application. The Android profiler pinpointed SimpleDateFormat as the cause of the poor refresh rate. The solution then was simply to reduce the formatting frequency. By avoiding frequent calls to SimpleDateFormat, performance improved and that was the end of the story.
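Reducing the formatting frequency can be as simple as caching the last formatted string and reusing it while the displayed value would not change. The sketch below only illustrates that idea; it is not the code from the Android application, and the class name CachedClock, the HH:mm:ss pattern and the one-second granularity are all assumptions.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper: calls SimpleDateFormat at most once per distinct second.
final class CachedClock {
    private final SimpleDateFormat sdf = new SimpleDateFormat("HH:mm:ss", Locale.US);
    private long cachedSecond = Long.MIN_VALUE;
    private String cachedText = "";
    private int formatCalls = 0; // number of real SimpleDateFormat invocations

    CachedClock() {
        sdf.setTimeZone(TimeZone.getTimeZone("GMT"));
    }

    // Not thread-safe: meant to be called from a single thread, such as a UI thread.
    String format(long epochMillis) {
        long second = Math.floorDiv(epochMillis, 1000L);
        if (second != cachedSecond) {
            // Only pay for the formatter when the visible text would change
            cachedText = sdf.format(new Date(epochMillis));
            cachedSecond = second;
            formatCalls++;
        }
        return cachedText;
    }

    int getFormatCalls() {
        return formatCalls;
    }
}
```

A view that redraws many times per second would call format on every refresh, but the formatter itself would only run when the second changes.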

A year later, I am working on a NetBeans Platform based project and facing the same situation again. This time I need to parse several million timestamps from a log file and convert them to epoch milliseconds. Unlike last year, avoiding the formatter is not an option.

Roll Your Own

The problem is CS-101 simple. I need to parse a timestamp in the format yyyyMMddHHmmss into epoch time in milliseconds. The timestamp is always in GMT.

Here’s the sample code with SimpleDateFormat. This method is called getTimestamp1.

private static final SimpleDateFormat cachedSdf
   = new SimpleDateFormat("yyyyMMddHHmmss");
static {
   // Timestamps are always in GMT
   cachedSdf.setTimeZone(TimeZone.getTimeZone("GMT"));
}

// Parse date and time into epoch millisecond
public static long getTimestamp1(String date, String time) {

   if (date.isEmpty() == false && time.isEmpty() == false) {
      try {
         return cachedSdf.parse(date + time).getTime();
      } catch (ParseException e) {
         // Fall through and return 0 for unparsable input
      }
   }
   return 0;
}

Alternatively, here’s the do-it-myself version. I call it getTimestamp2.

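private static final Calendar CachedCalendar = new GregorianCalendar();
static {
   // Timestamps are always in GMT; zero the milliseconds once up front
   CachedCalendar.setTimeZone(TimeZone.getTimeZone("GMT"));
   CachedCalendar.set(Calendar.MILLISECOND, 0);
}

// Parse date and time into epoch millisecond; returns 0 on invalid input
public static long getTimestamp2(String date, String time) {

   try {
      int y = Integer.parseInt(date.substring(0, 4));
      // Calendar months are 0-based (January is 0)
      int m = Integer.parseInt(date.substring(4, 6)) - 1;
      int d = Integer.parseInt(date.substring(6, 8));
      int h = Integer.parseInt(time.substring(0, 2));
      int mm = Integer.parseInt(time.substring(2, 4));
      int s = Integer.parseInt(time.substring(4, 6));

      CachedCalendar.set(y, m, d, h, mm, s);

      // The calendar is lenient, so an out-of-range date rolls over
      // into a neighbouring month or year; reject it if that happened
      if (CachedCalendar.get(Calendar.YEAR) != y) {
         return 0;
      }
      if (CachedCalendar.get(Calendar.MONTH) != m) {
         return 0;
      }
      if (CachedCalendar.get(Calendar.DATE) != d) {
         return 0;
      }

      if (h < 0 || h > 23) {
         return 0;
      }
      if (mm < 0 || mm > 59) {
         return 0;
      }
      if (s < 0 || s > 59) {
         return 0;
      }
      return CachedCalendar.getTime().getTime();
   } catch (Exception e) {
      return 0;
   }
}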


Here are the results using the NetBeans profiler.

The CPU time decreased from 1.13s to 0.288s, which is roughly a 75% reduction.

CPU time: getTimestamp1 vs. getTimestamp2

The total object allocations decreased by ~3 KB per call.

Final Thoughts

Although simple, the performance improvement here was more significant than any other fancy optimization I’ve done on the application.

getTimestamp2 can be 10-15% faster if you replace Integer.parseInt with another solution.
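One possible replacement for Integer.parseInt, sketched here as an assumption rather than the approach that was measured, is to read the fixed-width digit fields straight out of the string, which also avoids the substring allocations. The names FastDigits and parseDigits are hypothetical.

```java
// Hypothetical helper: parses fixed-width, unsigned decimal fields in place.
final class FastDigits {
    private FastDigits() {}

    // Parses the characters s[start, end) as a non-negative decimal integer.
    // Returns -1 if the range is out of bounds, empty, or contains a non-digit.
    // Intended for short fields (up to 9 digits); longer input can overflow int.
    static int parseDigits(String s, int start, int end) {
        if (s == null || start < 0 || end > s.length() || start >= end) {
            return -1;
        }
        int value = 0;
        for (int i = start; i < end; i++) {
            int digit = s.charAt(i) - '0';
            if (digit < 0 || digit > 9) {
                return -1;
            }
            value = value * 10 + digit;
        }
        return value;
    }
}
```

With this helper, a line such as int y = Integer.parseInt(date.substring(0, 4)) would become int y = FastDigits.parseDigits(date, 0, 4), with -1 treated as a parse failure. Whether it is actually faster in a given application is something to confirm with the profiler.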

SimpleDateFormat is slow. If you need the performance, it may be worthwhile to roll your own solution.

As always, benchmarking Java is hard. These performance numbers should only be used in relative terms.
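For a quick relative comparison outside the profiler, a warm-up pass followed by a timed loop is a common minimal approach. The harness below is a rough sketch under that assumption, not the method used to produce the numbers above; ParseTiming, Parser, sdfParser and measureNanos are all illustrative names.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical timing harness; all names here are illustrative.
final class ParseTiming {
    private ParseTiming() {}

    // Written to after each timed loop so the parse results are actually used.
    private static volatile long sink;

    // Anything that turns a yyyyMMddHHmmss string into epoch milliseconds.
    interface Parser {
        long parse(String stamp);
    }

    // A SimpleDateFormat-backed parser to use as the baseline.
    static Parser sdfParser() {
        final SimpleDateFormat sdf = new SimpleDateFormat("yyyyMMddHHmmss", Locale.US);
        sdf.setTimeZone(TimeZone.getTimeZone("GMT"));
        return new Parser() {
            @Override
            public long parse(String stamp) {
                try {
                    return sdf.parse(stamp).getTime();
                } catch (ParseException e) {
                    return 0;
                }
            }
        };
    }

    // Runs the parser 'iterations' times and returns the elapsed nanoseconds.
    private static long timeLoop(Parser p, String stamp, int iterations) {
        long sum = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sum += p.parse(stamp);
        }
        long elapsed = System.nanoTime() - start;
        sink = sum;
        return elapsed;
    }

    // Warms up first so the JIT has a chance to compile the hot path, then measures.
    static long measureNanos(Parser p, String stamp, int warmup, int iterations) {
        timeLoop(p, stamp, warmup);
        return timeLoop(p, stamp, iterations);
    }
}
```

A hand-rolled parser can be wrapped in the same Parser interface and passed to measureNanos with identical arguments. The absolute figures will vary from run to run and machine to machine, so only the ratio between the two is worth reading.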

Source & Tools

The source code can be found here.

JDK 7u15 64-bit, Windows 7 64-bit, NetBeans 7.3.