Feeds:
Posts
Comments

Posts Tagged ‘Performance’

Optimization is often full of surprises. Whether it is low level C or high level Javascript, I always learn something from the profiler. And more often than not, it is the little things that matter.

Java SimpleDateFormat is an powerful time formatter that can handle various date formats in different locales. This power comes at a price. It is slow – very slow.

I learned this last year when working with an Android application. The Android profiler pinpointed SimpleDateFormat to the cause of the poor refresh rate. The solution then was to simply adjusting the formatting frequency. By avoiding frequent calls to SimpleDateFormat, performance improved and that was the end of the story.

A year later, now I am working on a NetBeans platform based project and I am facing the same situation again. This time I need to parse several millions timestamp from a log file and convert them to epoch milliseconds. Unlike last year, avoiding the formatter is not an option.

Roll Your Own

The problem is CS-101 simple. I need to parse a timestamp in the format – yyyyMMddHHmmss into epoch time in millisecond. The timestamp is always in GMT.

Here’s the sample code with SimpleDateFormat. This method is called getTimestamp1.

private static final SimpleDateFormat cachedSdf
   = new SimpleDateFormat("yyyyMMddHHmmss");
static {
   cachedSdf.setTimeZone(TimeZone.getTimeZone("GMT"));
}
// Parse date and time into epoch millisecond
public static long getTimestamp1(String date, String time) {

   if (date.isEmpty() == false && time.isEmpty() == false) {
      try {
         return cachedSdf.parse(date + time).getTime();
      } catch (ParseException e) {
      }
   }
   return 0;
}

Alternatively, here’s the do-it-myself version. I call it getTimestamp2.

private static final Calendar CachedCalendar = new GregorianCalendar();
static {
   CachedCalendar.setTimeZone(TimeZone.getTimeZone("GMT"));
   CachedCalendar.clear();
}
public static long getTimestamp2(String date, String time) {

   try {
      int y = Integer.parseInt(date.substring(0, 4));
      int m = Integer.parseInt(date.substring(4, 6));
      --m;
      int d = Integer.parseInt(date.substring(6, 8));
      int h = Integer.parseInt(time.substring(0, 2));
      int mm = Integer.parseInt(time.substring(2, 4));
      int s = Integer.parseInt(time.substring(4, 6));

      CachedCalendar.set(y, m, d, h, mm, s);

      if (CachedCalendar.get(Calendar.YEAR) != y) {
         return 0;
      }
      if (CachedCalendar.get(Calendar.MONTH) != m) {
         return 0;
      }
      if (CachedCalendar.get(Calendar.DATE) != d) {
         return 0;
      }

      if (h < 0 || m > 23) {
         return 0;
      }
      if (mm < 0 || mm > 59) {
         return 0;
      }
      if (s < 0 || s > 59) {
         return 0;
      }
      return CachedCalendar.getTime().getTime();
   } catch (Exception e) {
      return 0;
   }
}

Benchmark

Here are the results using the NetBeans profiler.

The CPU time decreased from 1.13s to 0.288s, which is roughly a ~75% reduction.

CPU time: on getTimestamp1 vs. getTimestamp2

CPU time: getTimestamp1 vs. getTimestamp2

The total object allocations decreased by ~3kBytes (per call).
getTimestampMemory

Final Thoughts

Although simple, the performance improvement here was more significant than any other fancy optimization I’ve done on the application.

getTimestamp2 can be 10-15% faster if you replace Integer.parseInt with another solution.

SimpleDateTime is slow. If you need the performance, it may be worthwhile to roll your own solution.

As always, benchmarking Java is hard. These performance numbers should only be used in relative terms.

Source & Tools

The source code can be found here.

JDK7u15 64 bit, Win7 64bit, NetBeans 7.3.

Read Full Post »

A Programmer’s Hunch

The product I work on has been migrated from VC71 to VC90. Ever since the upgrade, I feel that the software is taking longer to start up, has become less responsive. I have been working on the software for several years, so I have certain performance expectations . My programmer’s hunch tells me that something just isn’t right.

I did some searches, and found out that Checked Iterator (Secure SCL) for STL has been turned on since VC80. It is enabled by default for Debug and Release build. There are numerous performance complains for VC80 STL implementation. Our product relies extensively on STL, so that could certainly be a contributing factor to the sluggishness.

Time to Test

To see the current state of the system, I wanted to see the performance between VC71 and VC90 with Checked Iterator. I also wanted the difference without Checked Iterator. Lastly, I threw in STLport into the pot, just because I found a blog that says it is the fastest.

Four-Way Comparison

In the test, I chose four commonly used containers in our software – vector, string, map and deque. For each container type, it will be run against two types of test – Iteration and Size. For the iteration test, the container will be benchmarked with a fixed size across a large number of iterations. For the size test, the size of the container grows while the number of iteration remains the same.

Comparison – Vector

The test for vector involves three operations – inseration, iterator traversal, and copy.

Vector Size Test (Iteration = 100000)

VC90 with Checked Iterator runs much slower.

Vector Iteration Test (Num Elements = 10)

Without Checked Iterator, much of the lost performance are regained.

From VC71 to VC90 with SCL, there are 70% – 100% decrease in performance. By turning off Checked Iterator, the performance of VC90 is roughly equivalent to VC71. STLport outperforms all versions of Visual Studio.

Comparison – String

The test for string involves three operations – string copy, substring search, and concatenation.

string_size_small

VC90 performed poorly compare to VC71, regardless of Checked Iterators.

string_iter_small

STLport smoked its competitions in the short string test. (Note: 140 is the maximum character in a Twitter post)

Performance of string in VC90 degrades rapidly as the string grows. It appears that the Checked Iterator feature does not impact the performance of string.[Update: Secure SCL and HID was not turned off in string.  See article.] Again, STLport outperforms all version of Visual Studios. This is likely because of the optimization from Short String Optimization and Template Expression for string concatenation.

Comparison – Map

The test for map involves insertion, search, and deletion.

map_size_small

Minor improvement in VC9 compare to VC71.

map_iter_small

VC90 without Checked Iterator came out slightly ahead.

Surprisingly, the performance came out roughly the same for all, with VC71 to be the slowest.

Comparison – Deque

The test for Deque comes with a twist. The deque is implemented is as a priority queue through make_heap(), push_heap() and pop_heap(). Random items are inserted and removed from the queue upon each iteration.

deque_size_small

As the deque grows, VC90 with Checked Iterator runs at snail pace.

deque_iter_small

VC71 and STLport came out fastest.

The performance for VC90 with Checked Iterator is quite disappointing compare to others.

So.. Now What?

VC90 with Checked Iterator is indeed very slow. Although I see the benefit of iterator validation during debug phase, I fail to understand why it is enabled in release build. I am not convinced by the argument of correctness over performance. Once the iterators are proven correct, Checked Iterator is simply a burden. When the software is in customers’ hand, all these validations are pointless.

On a side note, the string and vector performance of STLport is very impressive. It is more 2x faster than Visual Studio. It’s simply amazing.

Source

The source and the results can be downloaded here.

Tools: Visual Studio 2003, Visual Studio 2008, STLport 5.2.1 (with Visual Studio 2008)

Machine Specification: Core Duo T2300 1.66 GHz with 2GB of RAM. Window XP SP3.

Read Full Post »

Follow

Get every new post delivered to your Inbox.