STL String Crashes When HID = 0

Awhile ago, we upgraded our compiler to VC90 (Visual Studio 2008), we found out that Has Iterator Debugging (HID) and Secure SCL in VC9.0 were severely degrading our product’s performance. So we took the time to rebuild everything by disabling those features.

This is done by defining preprocessor macros _SECURE_SCL = 0 and _HAS_ITERATOR_DEBUGGING = 0.

Soon after, we experienced some strange crashes in Debug build that makes no sense to us.

Crashes at std::_Container_base_secure::_Orphan_all()

Here’s one found by my co-worker. [Note: The code was simplified for demonstration purposes]

#define _HAS_ITERATOR_DEBUGGING 0 //turn off Has Iterator Debugging
#include <string>
#include <algorithm>
using namespace std;
int main()
{
	string abc = "abc";

	// Method 1: Crashes upon exit with an access violation
	string dot_abc = "." + abc;

	// Method 2: Works
	//string dot_abc = string(".") + abc;

	string buffer = ".abc";

	// Works without the search call here
	search(buffer.begin(), buffer.end(), dot_abc.begin(), dot_abc.end());

	return 0;
}

If you choose Method 1, you will get an access violation upon the destruction of the string class.

msvcp90d.dll!std::_Container_base_secure::_Orphan_all()  Line 223 + 0x5 bytes    C++
msvcp90d.dll!std::_Container_base_secure::~_Container_base_secure()  Line 115    C++
msvcp90d.dll!std::_String_base::~_String_base()  + 0x11 bytes    C++
msvcp90d.dll!std::_String_val<unsigned short, std::allocator<unsigned short> >::~_String_val<unsigned short,std::allocator<unsigned short> >()  + 0x11 bytes    C++
msvcp90d.dll!std::basic_string<char, std::char_traits<char>,std::allocator<char> >::~basic_string<char,std::char_traits<char>,std::allocator<char> >()  Line 917 + 0xf bytes    C++

However, if you choose Method 2, it will exit gracefully. And both method works under Release build.

The first alarming thing from the call stack is the fact that we are calling into msvcp90d.dll. Strings, unlike other STL containers, is separately compiled into another DLL since VC80.

Remember, to turn off HID and Secure SCL, it is required that all binaries linked by a translation unit to have the same HID and Secure SCL settings. After some online search, it is clear that msvcp90d.dll is built with HID = 1.

Yikes! Since we can’t build msvcp90d.dll, there isn’t much we can do. But Microsoft isn’t stupid, they clearly have worked around some of the problems because Method 2 does work.

Stepping In std::string

In C++, the devil is in the details. Method 1 and Method 2 appears to be functionally equvialent, they are vastly different underneath.

// Method 1
string dot_abc = "." + abc;

At a glance, Method 1 should invoke the operator+ with const char * as the left argument, and std::string as the right argument. After stepping into the function call, it calls into an operator+ in that constructs a basic_string object.

//string L27 operator +(const char *, std::string)
template<class _Elem,
	class _Traits,
	class _Alloc> inline
	basic_string<_Elem, _Traits, _Alloc> __CLRCALL_OR_CDECL operator+(
		const _Elem *_Left,
		const basic_string<_Elem, _Traits, _Alloc>& _Right)
	{	// return NTCS + string
	return (basic_string<_Elem, _Traits, _Alloc>(_Left) += _Right);
	}

It calls a copy constructor that takes in _Left (which is “.”) in this case, and performs operator+= with _Right (which is std::string abc).

// xstring L661 cctor(const char*)
__CLR_OR_THIS_CALL basic_string(const _Elem *_Ptr)
	: _Mybase()
	{	// construct from [_Ptr, <null>)
	_Tidy();
	assign(_Ptr);
	}

In method 2, the operation is different. First, a copy constructor is invoked to create a temp string.

// xstring L798 cctor(const char *)
__CLR_OR_THIS_CALL basic_string(const _Elem *_Ptr, _NO_DEBUG_PLACEHOLDER)
	: _Mybase()
	{	// construct from [_Ptr, <null>)
	this->_Myfirstiter = _IGNORE_MYITERLIST;
	_Tidy();
	assign(_Ptr);
	}

Then it will invoke operator+ with std::string as the left and right argument.

// string L17 operator +(std::string const &, std::string const &)
template<class _Elem,
	class _Traits,
	class _Alloc> inline
	basic_string<_Elem, _Traits, _Alloc> __CLRCALL_OR_CDECL operator+(
		const basic_string<_Elem, _Traits, _Alloc>& _Left,
		const basic_string<_Elem, _Traits, _Alloc>& _Right)
	{	// return string + string
	return (basic_string<_Elem, _Traits, _Alloc>(_Left) += _Right);
	}

Notice anything strange?

For the operation where “.” is copied into a std::string, the copy constructor invoked by Method 1 and Method 2 are different! In method 2, it has a different signature, and there is an extra line in the copy constructor – this->_Myfirstiter = _IGNORE_MYITERLIST.

This is probably one of Visual Studio’s work around to allow programs compiled with HID=0 to safely invoke the std::string library in msvcp90d.dll. Unfortunately, there are loop holes in their patch that fails in Method 1.

Conclusion

If you want to turn off HID and Secure SCL for performance reason, be careful with the string library. There are quite a few bugs in VC9.0 that crashes on perfectly legal C++ code. The example above is one of several scenarios that we’ve found. We have also seen similar crashes on certain usage of stringstream.

On a side note, a co-worker of mine filed this bug in Microsoft Connect. They closed the bug shortly, and told him that it has been fixed in VC10 Beta 2. Basically, they are suggesting that we should upgrade our compiler to a beta version, and pay them more money when it officially comes out. Great customer support, M$.

STL Performance Comparison: VC71, VC90, and STLport

A Programmer’s Hunch

The product I work on has been migrated from VC71 to VC90. Ever since the upgrade, I feel that the software is taking longer to start up, has become less responsive. I have been working on the software for several years, so I have certain performance expectations . My programmer’s hunch tells me that something just isn’t right.

I did some searches, and found out that Checked Iterator (Secure SCL) for STL has been turned on since VC80. It is enabled by default for Debug and Release build. There are numerous performance complains for VC80 STL implementation. Our product relies extensively on STL, so that could certainly be a contributing factor to the sluggishness.

Time to Test

To see the current state of the system, I wanted to see the performance between VC71 and VC90 with Checked Iterator. I also wanted the difference without Checked Iterator. Lastly, I threw in STLport into the pot, just because I found a blog that says it is the fastest.

Four-Way Comparison

In the test, I chose four commonly used containers in our software – vector, string, map and deque. For each container type, it will be run against two types of test – Iteration and Size. For the iteration test, the container will be benchmarked with a fixed size across a large number of iterations. For the size test, the size of the container grows while the number of iteration remains the same.

Comparison – Vector

The test for vector involves three operations – inseration, iterator traversal, and copy.

Vector Size Test (Iteration = 100000)
VC90 with Checked Iterator runs much slower.

Vector Iteration Test (Num Elements = 10)
Without Checked Iterator, much of the lost performance are regained.

From VC71 to VC90 with SCL, there are 70% – 100% decrease in performance. By turning off Checked Iterator, the performance of VC90 is roughly equivalent to VC71. STLport outperforms all versions of Visual Studio.

Comparison – String

The test for string involves three operations – string copy, substring search, and concatenation.

string_size_small
VC90 performed poorly compare to VC71, regardless of Checked Iterators.
string_iter_small
STLport smoked its competitions in the short string test. (Note: 140 is the maximum character in a Twitter post)

Performance of string in VC90 degrades rapidly as the string grows. It appears that the Checked Iterator feature does not impact the performance of string.[Update: Secure SCL and HID was not turned off in string.  See article.] Again, STLport outperforms all version of Visual Studios. This is likely because of the optimization from Short String Optimization and Template Expression for string concatenation.

Comparison – Map

The test for map involves insertion, search, and deletion.

map_size_small
Minor improvement in VC9 compare to VC71.
map_iter_small
VC90 without Checked Iterator came out slightly ahead.

Surprisingly, the performance came out roughly the same for all, with VC71 to be the slowest.

Comparison – Deque

The test for Deque comes with a twist. The deque is implemented is as a priority queue through make_heap(), push_heap() and pop_heap(). Random items are inserted and removed from the queue upon each iteration.

deque_size_small
As the deque grows, VC90 with Checked Iterator runs at snail pace.

deque_iter_small
VC71 and STLport came out fastest.

The performance for VC90 with Checked Iterator is quite disappointing compare to others.

So.. Now What?

VC90 with Checked Iterator is indeed very slow. Although I see the benefit of iterator validation during debug phase, I fail to understand why it is enabled in release build. I am not convinced by the argument of correctness over performance. Once the iterators are proven correct, Checked Iterator is simply a burden. When the software is in customers’ hand, all these validations are pointless.

On a side note, the string and vector performance of STLport is very impressive. It is more 2x faster than Visual Studio. It’s simply amazing.

Source

The source and the results can be downloaded here.

Tools: Visual Studio 2003, Visual Studio 2008, STLport 5.2.1 (with Visual Studio 2008)

Machine Specification: Core Duo T2300 1.66 GHz with 2GB of RAM. Window XP SP3.