jweiler

Nov 27 2013
 

Hard bugs come in a number of forms. Every experienced programmer has a few war stories in their pocket. Twenty years on, my bloodiest war stories have been replaced a few times as new, even bloodier stories were written. This is only the most recent to make the cut.

It’s not my fault, I swear!

The C++ code in question came about over a number of years by the hands of many programmers. It wasn’t my fault, but I’m not completely clean of it either. It was in a content-loading system that everyone knew was terrible, but no one had taken the initiative to fix. I’m saying this so that when you read this and say, “But if only you had written your code better, you never would have been in this mess in the first place,” I can reply that you’re right, but we don’t all live in a clean, perfect world, and you should get down in the muck with the rest of us.

Symptoms

This code ran fine on all iOS devices (the iPhone 5 was the latest at the time) and almost all Android devices. The only two devices that had problems were the recently released Galaxy S4 and the HTC One. Something about those devices was leading us down a dark path where the code would occasionally hang sometime during the initial content load sequence. Oh, and it only happened in optimized release builds – you didn’t think this would be THAT easy, did you?

Initial Investigations

Diving right in, there’s no debugger available, so it’s logging output only. The first observation was that this hang wasn’t actually a hang at all. The main thread was still happily rocketing around its little track servicing the data-loading jobs. Strangely, once the bug was happening, there weren’t any more jobs to service, but the main thread was convinced that there was more work to do.

As it happens, the last job to be serviced always seemed to be data decompression. This particular code had a single worker thread dedicated to decompression. Fixed-size buffers were read from storage and passed to the decompressor. The resulting decompressed data was then passed back to the main thread in another buffer along with the number of bytes it contained.

The most annoying part of this bug quickly became apparent. As I added logging statements, the bug would sometimes vanish entirely. Adding output inside a certain window of the thread-handling code would cause the bug to vanish.

The inter-thread communication looked something like this:

Main thread

ReadBuffer( mInputBuffer, bufferSize );
mOutputBuffer = GetAvailableBuffer();
mTotalDecompressed = 0;
mDecompressionDone = false;
signalCompressionThread();
// Go do other stuff

Decompression thread

IterateForever
{
    waitForSignal();
    mDecompressedBytes = decompressData( mInputBuffer, mOutputBuffer );
    mDecompressionDone = true;
}

Later, back on the Main thread

// Do lots of job processing
if ( mDecompressionDone )
{
    // The decompression job is done, so process the results
    mTotalDecompressed += mDecompressedBytes;
    if ( mTotalDecompressed == mExpectedDecompressedBytes )
        FetchNextDecompressionJob();
}

The primary gate is the mDecompressionDone flag, and we care about the mDecompressedBytes to move things along. Super-janky, right? But it should work, right? The bug seemed to imply that the sum of all mDecompressedBytes didn’t equal the expected value. Unfortunately, logging the values shows they’re always spot on – even when the hang happens. Weird. A little too weird.

I remembered seeing Bruce Dawson give a talk about the perils of lock-free programming. The primary take-away was that the actual order of reads and writes can be changed by the CPU. So for example, even if you think the logic looks good, your data may not be completely consistent between cores. If that was happening here, the window to see the problem would be exceptionally small, and my usual trick of using a sleep to widen the window would be ineffective. I had no shortage of ways to make the problem go away, but I still needed a way to prove what the problem actually was.

Enter the Network Side-Car

Logging statements are actually pretty expensive relative to the timescales we’re talking about here, so that was right out. I needed a way to log information without invoking the normal logging machinery. Making matters worse, this was also only happening on a mobile phone. My solution was to create a special low-overhead logging object that sat on a network port and waited for a connection.
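The side-car object itself isn’t shown here, so the following is only a rough sketch of what the recording half of such a logger might look like. It assumes a fixed-size in-memory ring that the hot path writes into, which something slower (a thread servicing the network connection, say) copies out later. TraceEntry, TraceBuffer, Record, and Snapshot are all hypothetical names, not the original code.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical trace entry: a small tag plus the value being observed.
struct TraceEntry
{
    std::uint32_t tag;
    std::uint32_t value;
};

// Assumed capacity; old entries are overwritten once the ring wraps.
constexpr std::size_t kTraceCapacity = 1024;

// Fixed-size ring that is cheap to write into from a hot path.
// Record() does no allocation, no string formatting, and no I/O.
class TraceBuffer
{
public:
    void Record( std::uint32_t tag, std::uint32_t value )
    {
        // fetch_add hands every writer its own slot index.
        std::size_t slot = mNext.fetch_add( 1, std::memory_order_relaxed ) % kTraceCapacity;
        mEntries[slot] = TraceEntry{ tag, value };
    }

    // Copies out the most recent entries, oldest first. Meant for the slow
    // path (e.g. when a client connects). This sketch does not guard against
    // writers that are still recording while the copy is made.
    std::vector<TraceEntry> Snapshot() const
    {
        std::size_t total = mNext.load( std::memory_order_acquire );
        std::size_t count = total < kTraceCapacity ? total : kTraceCapacity;
        std::vector<TraceEntry> out;
        out.reserve( count );
        for ( std::size_t i = total - count; i < total; ++i )
            out.push_back( mEntries[i % kTraceCapacity] );
        return out;
    }

private:
    std::array<TraceEntry, kTraceCapacity> mEntries{};
    std::atomic<std::size_t>               mNext{ 0 };
};
```

The point of the design is that the cost at the call site is one atomic increment and one small store, which is far less likely to close the timing window than a formatted log call.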

Bingo!

The formerly-correct mDecompressedBytes was returning zero just before the hang. To prove my case, I added a little more code to detect the zero and then wait until the value became non-zero. And success! This was suddenly a clear-cut case where the memory writes to mDecompressedBytes and then mDecompressionDone were in flight at the same time from the decompression thread, and were then being observed from the main thread in the reverse order.

The fix is really quite simple once you know what’s going on. Memory write barriers are used to ensure that all pending memory writes before the barrier complete before any of the writes after it. Inserting a memory barrier between the two writes does the trick here. Slightly less exotically, you can use simple atomic writes on mDecompressedBytes, which generally imply a memory barrier.
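As a rough illustration of the idea (not the code that actually shipped), here is the same handshake written with C++11 std::atomic. The release store on the done flag ensures the earlier write to the byte count is visible to any thread whose acquire load sees the flag as true. DecompressionResult, Publish, and TryConsume are names invented for this sketch.

```cpp
#include <atomic>
#include <cstddef>

// Sketch of the worker-to-main handoff with explicit memory ordering.
struct DecompressionResult
{
    std::size_t       decompressedBytes = 0;  // plain data, guarded by the flag
    std::atomic<bool> done{ false };          // the gate the main thread polls

    // Called on the decompression thread once the output buffer is filled.
    void Publish( std::size_t bytes )
    {
        decompressedBytes = bytes;
        // Release: the write above becomes visible no later than 'done'.
        done.store( true, std::memory_order_release );
    }

    // Called on the main thread. Returns true and fills 'bytes' if a result
    // was ready, then clears the flag for the next job. This assumes the
    // worker is not signalled again until after this call returns.
    bool TryConsume( std::size_t & bytes )
    {
        // Acquire: pairs with the release store in Publish().
        if ( !done.load( std::memory_order_acquire ) )
            return false;
        bytes = decompressedBytes;
        done.store( false, std::memory_order_relaxed );
        return true;
    }
};
```

With this pairing, the main thread can no longer observe the flag as set while still reading a stale byte count.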

So why didn’t this problem show up earlier? After all, the problematic code had been around for years and actually shipped in multiple products. It’s hard to say with complete certainty, but I can explain some of it. X86 CPUs generally don’t have the write-pass-write problem, so PC and Mac ports were likely immune. The iOS and Android devices we were using were all ARM CPUs, but until the Galaxy S4 and the HTC One, all the devices were only dual-core. Further, I’ve seen some anecdotal evidence that while Android phones were often multicore, some of them restrict all threads from a single application to a single core which would prevent this issue. Perhaps the quad-core processors lift that restriction.

So why didn’t this show up for other Android devices or the plethora of iOS devices? It turns out the nature of the data becomes important. The compressed data we’ve been dealing with was image data, and the two problematic phones were the first to have super-high 400+ dpi pixel densities. We were using higher-resolution content on those phones, and the fixed-size decompression input buffer was no longer enough to hold an entire compressed image. Because the decompression required multiple jobs to complete, the bookkeeping logic was suddenly more susceptible to the memory-write problem that was always there.

Sep 17 2012
 

Game engines often have some form of metadata system that can be used for a myriad of tasks. My little home brew engine, for example, uses metadata to facilitate serialization, allow object allocation by name, content-updating, etc. It’s all quite common, but creating such a system is actually pretty complex when you start to get into the nitty gritty implementation details. Your metadata design choices very quickly start to inform many other areas of your engine design.

Various engines create and store their metadata in various ways. Unreal Engine 3, for example, uses UnrealScript to describe game logic as well as provide the source for the engine metadata. Their UC compiler creates C++ headers which are compiled into the game binary while the metadata is shunted over to the .u script packages. I’ve never liked this scheme for a couple of reasons. Perhaps chief among them is that it requires programmers to use one language to write their type definitions and then a different language for their code. In other words, they have to author their C++ header files by proxy. There are a few other problems with that system, but it’s not helpful to dwell on it.

A Little Context to Start

My hobby engine uses a system where any class or struct can have metadata, but that’s not strictly required. Additionally, each metadata-enabled type is not required to have a virtual function table. Let’s consider a simple class sitting in a header file called ChildType.h:

ChildType.h

class childType : public parentType
{
    DECLARE_TYPE( childType, parentType );

    void RedactedFunction( float secretSauce );

    float unsavedValue; /// +NotSaved
    double * ptrToDbl;
    ValueType_t someValues[3];
};

By virtue of using the DECLARE_TYPE macro, it is clear this type is intended to be metadata-enabled, but it’s not obvious where that data comes from. We need to express all the critical information about the class in such a way that we can serialize it or generically inspect it. The hows and whys of my design choices aren’t important, but my solution is to define a new cpp file that looks like this:

ChildTypeClass.cpp

MemberMetadata const childType::TypeMemberInfo[] =
{
    { "unsavedValue", eDT_Float, eDT_None, eDF_NotSaved, offsetof(childType, unsavedValue), 1, NULL },
    { "ptrToDbl", eDT_Pointer, eDT_Float64, eDF_None, offsetof(childType, ptrToDbl), 1, NULL },
    { "someValues", eDT_StaticArray, eDT_Struct, eDF_None, offsetof(childType, someValues), 3, ValueStructClass }
};
IMPLEMENT_TYPE( childType, parentType );

There’s a lot going on here, and there is a lot of unimportant plumbing hidden behind those DECLARE_TYPE and IMPLEMENT_TYPE macros. For this discussion, the DECLARE_TYPE macro adds a class-static array of MemberMetadata structures. Each member of that array specifies a name, primary type, secondary type, flags (like +NotSaved), offsetof the member inside the structure, static array length (usually 1), and target metadata type (or NULL).
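To make that field list concrete, here is a minimal sketch of what such a structure could look like and how generic code might use it. The enum values, the TypeMetadata stand-in, the Sample struct, and the FindMember and MemberAddress helpers are all assumptions made for illustration; the engine’s real definitions are hidden behind the macros.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical enums; the engine's real values are not shown in this post.
enum DataType  { eDT_None, eDT_Float, eDT_Float64, eDT_Pointer, eDT_StaticArray, eDT_Struct };
enum DataFlags { eDF_None = 0, eDF_NotSaved = 1 };

struct TypeMetadata;  // stand-in for whatever IMPLEMENT_TYPE defines

// One row per data member, in the field order described above.
struct MemberMetadata
{
    char const *         name;
    DataType             primaryType;
    DataType             secondaryType;
    DataFlags            flags;
    std::size_t          offset;       // offsetof() within the owning type
    std::size_t          arrayLength;  // 1 unless it is a static array
    TypeMetadata const * targetType;   // metadata of a struct/pointee, or null
};

// A plain struct to describe, so that offsetof is well defined.
struct Sample
{
    float  unsavedValue;
    double weight;
};

static MemberMetadata const kSampleMembers[] =
{
    { "unsavedValue", eDT_Float,   eDT_None, eDF_NotSaved, offsetof( Sample, unsavedValue ), 1, nullptr },
    { "weight",       eDT_Float64, eDT_None, eDF_None,     offsetof( Sample, weight ),       1, nullptr },
};

// Generic lookup: find a member's row by name, or null if there is none.
inline MemberMetadata const * FindMember( MemberMetadata const * members,
                                          std::size_t count, char const * name )
{
    for ( std::size_t i = 0; i < count; ++i )
        if ( std::strcmp( members[i].name, name ) == 0 )
            return &members[i];
    return nullptr;
}

// Generic access: reach a member through its stored offset alone.
inline void const * MemberAddress( void const * object, MemberMetadata const & member )
{
    return static_cast<char const *>( object ) + member.offset;
}
```

A serializer built on a table like this would walk the rows, skip anything flagged eDF_NotSaved, and read each value through MemberAddress according to its primary type.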

That’s a lot of data to type in correctly and maintain over the life of an engine. I knew very early on that while manual typing was ok to bootstrap the project, automation would have to enter the picture eventually.

Automation Enters the Picture

As I said, my design choices aren’t the focus of this article – perhaps another time. This article is about how I’m going about creating my metadata.

I very briefly considered writing a C++ parser… BAHAHAHAHA! Oh, man…that’s rich! But seriously, compiler grammars are one of those weird things that amuse me, and I actually considered writing just enough of a “loose” C++ parser to pull out the information I wanted. When I stopped to think about the magnitude of language features and weird cases in C++, however, reality came crashing in and it’s easy to see why I abandoned the idea.

Enter Clang

Clang is a C language family front-end for LLVM. Clang is free (BSD), and Clang is awesome! There are a few layers to the onion, but at a high level, Clang is a C++ to LLVM compiler. When used in tandem with LLVM, it’s a complete optimizing compiler ecosystem. More importantly for our purposes, however, libClang provides a relatively simple C-based API into the abstract syntax tree (AST) created by the language parser. That’s HUGE for creating metadata. Seriously, it’s almost all of the heavy lifting in something like this. Making matters even easier, there are Python bindings if that’s your thing. The Python bindings are probably a little easier to work with due to the fast iteration times and cleaner string handling.

A Little Glossary Action

The libClang API uses a handful of simple concepts to model your code in AST form.

Translation Unit – In practical terms, it’s one run of the compiler that creates one AST. We can think of it as one compiled file + any files it includes. In terms of libClang, it’s the base-level container and the jumping-off point for the data-gathering work we’ll need to do.

Cursor – A cursor represents one syntactic construct in the AST. They can represent an entire namespace or a simple variable name. They also maintain parent/child relationships with other cursors. For our purposes, the cursors are the nodes in the AST we need to traverse. They also contain references to the source file position where they were found.

Type – This one is easy. The cursors we’re looking for will often reference a type. These are literally the language types. Keep in mind that Clang models the entire type system, so typedefs are different from the types they alias. We’ll get into that.

The Plan

Once the parsing is taken care of, the solution is pretty straightforward.

  1. Set up the Environment
  2. For each header, make a Translation Unit.
  3. Traverse the AST for interesting type definition cursors.
  4. For each type definition, look for member data cursors and other information.
  5. Once we have all the information we need about a type, dump text to a file.

Setup – Compiler Environment

Obviously, we need one or more header files to work on, but there’s more – more than I naively anticipated anyway. Even if we just hardcode a list of headers, we won’t be able to compile them by themselves. We need to replicate enough of your normal project environment to get the same parsing result that your normal compiler would create. We need to use the same command line preprocessor defines (-D’s) as well as the same additional include folders (-I’s).
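However the settings get collected, they eventually have to become an argument list. A minimal sketch of that last step, assuming the include folders and defines have already been gathered into string lists, might look like this. BuildClangArgs and AsCStrings are hypothetical helper names.

```cpp
#include <string>
#include <vector>

// Turn project settings into Clang-style command-line arguments.
// -I and -D mirror the switches the regular compiler would be given.
inline std::vector<std::string> BuildClangArgs( std::vector<std::string> const & includeDirs,
                                                std::vector<std::string> const & defines )
{
    std::vector<std::string> args;
    for ( std::string const & dir : includeDirs )
        args.push_back( "-I" + dir );
    for ( std::string const & def : defines )
        args.push_back( "-D" + def );
    return args;
}

// The libClang C API takes 'char const * const *', so expose the strings as
// raw pointers. The pointers stay valid only while 'args' is alive and
// unmodified.
inline std::vector<char const *> AsCStrings( std::vector<std::string> const & args )
{
    std::vector<char const *> raw;
    raw.reserve( args.size() );
    for ( std::string const & arg : args )
        raw.push_back( arg.c_str() );
    return raw;
}
```

Keeping the owning strings and the raw pointer view separate avoids dangling pointers when the argument list is built up dynamically.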

I’m going to leave that as an exercise for the reader, but my solution involved parsing my Visual Studio project files. It wasn’t a huge deal, but this is where project structure will play a decently sized role, and a well-structured project will be easier to configure.

Setup – Header Environment

Another thing I hadn’t considered is what sort of environment a header lives in. There’s no way to know what files need to be included prior to the inclusion of a given header. There are really only two things you can assume when dealing with this problem – Assume any pre-compiled headers are included before your header and assume that nothing else needs to be included because your header can stand on its own. That second point jibes with how I generally structure my headers anyway, but it is graduated to a hard requirement in this case. Header cascades can kill your compile time performance, but external header order dependencies are worse. Obviously, it’s a good idea to mitigate header cascades with forward declarations where possible.

Time to Make the Doughnuts

Once we’ve gathered all our include paths and preprocessor symbols, invoking clang is pretty easy. We can’t just pass the header to clang_parseTranslationUnit and call it done. I suppose we can, but “.h” is an ambiguous extension. Clang won’t know how to act without some additional arguments to indicate the language to use. I also needed to include my PCH file anyway, so I ended up creating an ephemeral .cpp file to kill two birds with one stone. Conveniently, Clang has support for in-memory or “unsaved” files. Blatting a few #include strings into a buffer is all it takes. Here is the basic setup for building a translation unit for a single header called “MyEngineHeader.h”. Obviously, your environment arguments will be a bit different.

Build a Translation Unit

char const * args[] =
{
    "-Wmicrosoft",
    "-Wunknown-pragmas",
    "-I\\MyEngine\\Src",
    "-I\\MyEngine\\Src\\Core",
    "-D_DEBUG=1"
};

CXUnsavedFile dummyFile;
dummyFile.Filename = "dummy.cpp";
dummyFile.Contents = "#include \"MyEnginePCH.h\"\n#include \"MyEngineHeader.h\"";
dummyFile.Length = strlen( dummyFile.Contents );

CXIndex CIdx = clang_createIndex( 1, 1 );
CXTranslationUnit tu = clang_parseTranslationUnit( CIdx, "dummy.cpp",
                                                   args, ARRAY_SIZE(args),
                                                   &dummyFile, 1,
                                                   CXTranslationUnit_None );

Build Errors? WTF?!

Compiling your code with Clang will probably output some unexpected errors. Remember that Clang is a C++ compiler with nuances like any other, and Clang’s nuances won’t necessarily match your other C++ compilers’ nuances. This is a good thing – seriously! Anyone who’s done cross-platform work will tell you that with every additional platform, long-standing bugs show themselves. Embracing a multi-compiler situation will force you to keep cleaner, more standards-compliant code.

Unfortunately, resolving the major problems is not optional here. We need to create a valid AST for traversal that isn’t missing any attributes that describe our data. In order to get what we want, the parser actually has to finish what it’s doing. Use clang_getDiagnostic, clang_getDiagnosticSpelling, etc. to get human-readable error messages.

Diagnostics Dump

unsigned int numDiagnostics = clang_getNumDiagnostics( tu );
for ( unsigned int iDiagIdx = 0; iDiagIdx < numDiagnostics; ++iDiagIdx )
{
    CXDiagnostic diagnostic = clang_getDiagnostic( tu, iDiagIdx );
    CXString diagCategory = clang_getDiagnosticCategoryText( diagnostic );
    CXString diagText = clang_getDiagnosticSpelling( diagnostic );
    CXDiagnosticSeverity severity = clang_getDiagnosticSeverity( diagnostic );

    printf( "Diagnostic[%u] - %s(%d)- %s\n",
            iDiagIdx,
            clang_getCString( diagCategory ),
            severity,
            clang_getCString( diagText ) );

    // Strings and diagnostics returned by libClang must be released by the caller.
    clang_disposeString( diagText );
    clang_disposeString( diagCategory );
    clang_disposeDiagnostic( diagnostic );
}

Time to Start Digging!

The compile step should have provided you with a valid translation unit. We need to keep that around, but we’re not going to do much with it once we’ve checked for errors. Once we get the top-level cursor with clang_getTranslationUnitCursor(), we’ll put the translation unit in a safe place and use the cursor as the top-level object from then on.

We want to find relevant types, but we have to be smart about it. The C-language Clang interface uses a clunky callback API called clang_visitChildren. (Note: Python provides a simpler non-recursive get_children interface that returns an iterator.) Clang will call your callback for each child cursor it encounters. Your callback, in turn, returns a value indicating whether the iteration should recurse to deeper children, continue to this child’s siblings, or quit entirely.
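If the three return values seem abstract, a toy model of the contract may help. The sketch below is not libClang: Node, VisitResult, and VisitChildren are invented stand-ins that only mimic how a break/continue/recurse visitor steers a tree walk.

```cpp
#include <string>
#include <vector>

// Toy stand-ins for a cursor tree and the visit results. These names are
// invented for this sketch and are not part of the libClang API.
enum VisitResult { Visit_Break, Visit_Continue, Visit_Recurse };

struct Node
{
    std::string       name;
    std::vector<Node> children;
};

// Calls 'visitor' on each child of 'parent'. Returns false if the visitor
// asked to stop, so that a stop request unwinds through every level.
template <typename Visitor>
bool VisitChildren( Node const & parent, Visitor & visitor )
{
    for ( Node const & child : parent.children )
    {
        VisitResult result = visitor( child, parent );
        if ( result == Visit_Break )
            return false;  // quit entirely
        if ( result == Visit_Recurse && !VisitChildren( child, visitor ) )
            return false;  // a deeper level asked to quit
        // Visit_Continue: move on to the next sibling
    }
    return true;
}
```

A visitor that returns Visit_Recurse only for namespaces, for example, will see what is inside each namespace but never the children of anything else.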

We’re only interested in type declarations at this stage, but C++ allows new types to appear in several places. Fortunately, we can pare down the file pretty quickly.

Item             Cursor Kind            Recurse?   Remember?
Typedef          CXCursor_TypedefDecl   Yes        No
Class Decl       CXCursor_ClassDecl     Yes        Yes
Struct Decl      CXCursor_StructDecl    Yes        Yes
Namespace Decl   CXCursor_Namespace     Yes        No
Enumerations     CXCursor_EnumDecl      No         Yes?
Anything Else    ???                    No         No

It should be fairly obvious for an experienced programmer what to look for – the table above is the rule-set I’ve been using. There are a few other cases that aren’t covered – function-private types, and unions. It’d be easy enough to deal with these cases too, but I haven’t had a need to serialize a union just yet, and function-private types have limited utility for serialization.

Traverse For Types

enum CXChildVisitResult GatherTypesCB( CXCursor cursor, CXCursor parent, CXClientData client_data )
{
    MyTraversalContext * typeTrav = reinterpret_cast<MyTraversalContext *>( client_data );
    CXCursorKind kind = clang_getCursorKind( cursor );
    CXChildVisitResult result = CXChildVisit_Continue;

    switch( kind )
    {
    case CXCursor_EnumDecl:
        // Remember the enumeration, but don't descend into it (see the table above).
        typeTrav->AddEnumCursor( cursor );
        break;

    case CXCursor_StructDecl:
    case CXCursor_ClassDecl:
        typeTrav->AddNewTypeCursor( cursor );
        result = CXChildVisit_Recurse;
        break;

    case CXCursor_TypedefDecl:
    case CXCursor_Namespace:
        result = CXChildVisit_Recurse;
        break;

    default:
        break;
    }

    return result;
}

MyTraversalContext typeTrav;
clang_visitChildren( clang_getTranslationUnitCursor( tu ), GatherTypesCB, &typeTrav );

Enumerations are in the mix even though they’re a bit of a special case. For metadata purposes, you might get away with just treating them as integers. You can, however, be more robust in the face of changing types if you store them as symbolic strings until you do the final bake of your data.

Panning For Gold

Now that we have a bunch of types, we might want to filter them. Remember that we might have encountered a massive header cascade in the compilation step. Logically, we’re only interested in types that were declared in the header we’re directly processing. We’ll get the other ones when we process their headers in turn. Fortunately, we can iterate the list of interesting type cursors we just created in the last step, and ask each one what file location it came from. Types from other headers can be safely culled.

Data Gathering – Internal

After culling types from other headers, we should have a much smaller list of interesting types, so we can use their cursors as starting points to learn more about them. This is where we start gathering all that data I mentioned earlier. We’ll iterate for this data in much the same way we got the type cursors in the first place. I’m sure you can do them both in one sweep, but I find the problem domain a little easier to think about in two phases. This time, instead of starting at the top of the translation unit, we can start the iteration at the type cursor. We want to iterate the type declaration cursor completely and look for a few different things.

Item               Cursor Kind                 Recurse?     Remember?
Base Type Ref      CXCursor_CXXBaseSpecifier   No           Yes
Member Var         CXCursor_FieldDecl          Not yet(*)   Yes
Static Class Var   CXCursor_VarDecl            No           No
Methods            CXCursor_CXXMethod          No           ???
* See examples below

Base types and member variables should be fairly obvious as to why we want them, but static class variables might seem odd for this list. I use them for another level of filtering. I know that any class that supports metadata uses that DECLARE_TYPE macro from earlier. Of course, macros are all resolved by the preprocessor, so the C-language parser never sees that symbol, but buried within is a single static class variable with a known name and type that I can find. If it’s not there, then this class is incapable of supporting metadata, and I can just skip it entirely. Looking at the problem the other way, the only thing I need to do in order to enable metadata for a given class is add the DECLARE macro. The rest takes care of itself.

As an aside, I haven’t bothered adding any method-invoke plumbing to my engine, but it’s fairly easy to see how one might suss that out of the data. Lots of engines do that sort of thing, so I won’t be surprised if I end up digging into it eventually.

Data Gathering – External

Once we have all the data we’ll need from inside the class, we only need a few external tidbits before we can generate the metadata file. We need to know the fully-qualified namespace of the target type as well as those of all the base types. This, too, is pretty straightforward though, as you can simply ask any cursor what its lexical parent happens to be. By iterating until you hit the translation unit, you can capture all the containing scopes of a given type. There is an important caveat that bit me only after a good while. Consider this code:

Ambiguity Operator

namespace FooNS
{
    class Foo
    {
        int dataFoo;
    };
}

using namespace FooNS;
typedef Foo FooAlias;

class Bar : public FooAlias
{
    int dataBar;
};

When we try to find the full scope of the base class of Bar, we won’t be aware that Foo actually lives inside of FooNS, and the using directive is what allows this to work. I am not a fan of using and often refer to it as the ambiguity operator. I suppose it happens often enough in production code, however, that we should deal with it correctly.

The way to deal with this situation is to walk the lexical parent chain as I already mentioned, but at every step along the way, we need to see if the parent scope is a class-type, struct-type, or typedef. If it is any of those, then we need to get the type record from the parent cursor, then ask for the canonical type record in order to eliminate the typedef indirections, and finally get the cursor from the type record using the clang_getTypeDeclaration function. This might seem needlessly complex, but consider gathering data from Bar in the above code example. Walking the lexical parents of Bar works as expected because it doesn’t quietly live inside of any scopes that aren’t obvious. Doing the same for the base class (FooAlias) is a different story. In that case, the base class is actually a typedef of Foo which is quietly defined inside of the FooNS namespace.

Data Gathering – Off-World

We’ve already dug into the class as well as ascended the various scopes in which a class might live. With all that accounted for, what else could there be to gather? Way back in the first section where I said my data members can have flags such as “+NotSaved”, I never really said how that information was found. Unfortunately, C++ doesn’t really provide a code-annotation scheme that integrates with the mechanics of the parser. There’s a little room for a #pragma or an __attribute__ interface, but I was unable to make those systems work how I wanted. Additionally, I don’t really like how cumbersome they would have to be in order to give per-member-data attribute granularity. Instead I simply opted to use code comments. That way, the regular code would be completely unaware of them, and I would have free rein to implement any metadata features I wanted. Obviously, this is where we step outside of the AST that has served us so well up to this point, but not really very far outside. We already have a cursor for each data member, and we can use clang_getCursorExtent to get the exact positions in the source files where this cursor occurs. From there, it becomes fairly trivial to do a localized scan for comments using any syntax you happen to want.
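As a sketch of what that localized scan might look like, assume the source line containing the member has already been read into a string. The function below pulls “+Flag” tokens out of a “///” comment on that line. Only the “/// +NotSaved” syntax comes from the example earlier; the ParseMemberFlags name and the tokenizing rules are assumptions.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Extract '+Flag' tokens from a '///' comment on one line of source text.
// Returns an empty list if the line has no such comment.
inline std::vector<std::string> ParseMemberFlags( std::string const & line )
{
    std::vector<std::string> flags;
    std::string::size_type pos = line.find( "///" );
    if ( pos == std::string::npos )
        return flags;  // no annotation comment on this line

    pos += 3;
    while ( ( pos = line.find( '+', pos ) ) != std::string::npos )
    {
        std::string::size_type end = ++pos;  // step past the '+'
        // A flag name is a run of letters, digits, and underscores.
        while ( end < line.size() &&
                ( std::isalnum( static_cast<unsigned char>( line[end] ) ) || line[end] == '_' ) )
            ++end;
        if ( end > pos )
            flags.push_back( line.substr( pos, end - pos ) );
        pos = end;
    }
    return flags;
}
```

Mapping the resulting names onto flag enums (for example “NotSaved” to eDF_NotSaved) can then be a simple table lookup at write-out time.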

Writing it out

Ok, finally we should have all the data we need to write out our metadata. As I said earlier, I write everything out to a .cpp file for inclusion into the engine project. That’s a nice, human-readable method, but it has an implicit requirement that any metadata generation run might require a small additional build. In Visual Studio, that can also mean you have to reload the project if you’ve added any new classes. On the bright side, all your metadata is there at engine startup without load-order or chicken and egg problems.
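For a sense of how small the write-out step can be, here is a minimal sketch that emits a table in the same shape as the hand-written ChildTypeClass.cpp from earlier. GatheredMember and EmitMetadata are hypothetical names, and the sketch assumes the gathering passes have already reduced everything to symbolic strings.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record of what the gathering passes collected for one member.
// Types and flags are kept as symbolic strings until they are written out.
struct GatheredMember
{
    std::string name;
    std::string primaryType;    // e.g. "eDT_Float"
    std::string secondaryType;  // e.g. "eDT_None"
    std::string flags;          // e.g. "eDF_NotSaved"
    unsigned    arrayLength;    // 1 unless it is a static array
    std::string targetType;     // e.g. "NULL" or "ValueStructClass"
};

// Writes the member table followed by the IMPLEMENT_TYPE line.
inline void EmitMetadata( std::ostream & out,
                          std::string const & type,
                          std::string const & parent,
                          std::vector<GatheredMember> const & members )
{
    out << "MemberMetadata const " << type << "::TypeMemberInfo[] =\n{\n";
    for ( GatheredMember const & m : members )
    {
        out << "    { \"" << m.name << "\", " << m.primaryType << ", " << m.secondaryType
            << ", " << m.flags << ", offsetof(" << type << ", " << m.name << "), "
            << m.arrayLength << ", " << m.targetType << " },\n";
    }
    out << "};\nIMPLEMENT_TYPE( " << type << ", " << parent << " );\n";
}
```

Writing to a std::ostream means the same function can target a std::ofstream for the real .cpp file or a std::ostringstream when checking the output.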

There are other ways, however. For example, you could dump all this information out to a binary archive that is slurped into the engine on startup. It could also be demand-loaded and unloaded if the overhead starts to be an issue.

Details Details…

Now that you’re all sleeping soundly after traversing the AST, I thought I’d give a few examples of how the AST is structured for common cases.

Let’s briefly consider some normal member data:

class Normal
{
  int data1;
  float data2;
};

The cursor hierarchy looks like this:

Cursor Text        Cursor Kind          Type Kind
Translation Unit
  Normal           CXCursor_ClassDecl   CXType_Record
    data1          CXCursor_FieldDecl   CXType_Int
    data2          CXCursor_FieldDecl   CXType_Float

It’s actually pretty intuitive once you’re comfortable with some basic compiler concepts. Types are separated from semantic constructs, and POD types are directly represented in the Clang API.

Now consider only slightly more complexity:

class StillPrettyNormal
{
  int * dataPtr1;
  struct DataType * dataPtr2;
};

Cursor Text             Cursor Kind           Type Kind        Pointee Type
Translation Unit
  StillPrettyNormal     CXCursor_ClassDecl    CXType_Record
    dataPtr1            CXCursor_FieldDecl    CXType_Pointer   CXType_Int
    DataType            CXCursor_StructDecl   CXType_Record
    dataPtr2            CXCursor_FieldDecl    CXType_Pointer   CXType_Record(*)
* Causes ref to DataTypeClass

There are two odd parts here. First, we’ve lost the notion of ‘integer’ for dataPtr1 – we’ve only been told that it’s some sort of pointer. This isn’t really a problem though, because Clang provides the clang_getPointeeType function. You can call this on any pointer type to get the next type in the chain. Pointers to pointers to pointers can be resolved this way through multiple calls if need be.

Second, we have an unexpected struct declaration in the middle of our class declaration. Well, it’s not completely unexpected, actually. The inclusion of ‘struct’ in the field declaration of dataPtr2 is also a forward declaration for the type, and Clang represents this. Fortunately, it’s a sibling of the actual field declarations and we can safely ignore it.

The final part of this example is the addition of the type reference to the externally-defined DataType. Forward declarations in C++ allow types to be mentioned and not fully defined until they are used so for now, I have to assume that DataTypeClass actually exists somewhere else. The possibility exists, however, that DataType is not a metadata-enabled class, and the reference to DataTypeClass will break at link-time. Obviously, this needs to be tightened down in my engine, and ideally, I’d like to avoid some sort of comment-flag mark-up. In order to solve this problem the correct way, I’ll probably have to shuffle the tool to look at all headers multiple times and create a dependency/attribute graph.

Ok, one more example, but I’ll warn you ahead of time that this goes a little past the edge of where I wanted to go with my metadata-creation tool.

class WackyTown
{
     MyContainer< MyData* > cacheMisser;
     MyContainer< MyData*, PoolAllocator<32768> > sendHelp;
};

Cursor Text           Cursor Kind               Type Kind
Translation Unit
  WackyTown           CXCursor_ClassDecl        CXType_Record
    cacheMisser       CXCursor_FieldDecl        CXType_Unexposed(*)
      MyContainer     CXCursor_TemplateRef      CXType_Invalid
      MyData          CXCursor_TypeRef          CXType_Record
    sendHelp          CXCursor_FieldDecl        CXType_Unexposed(*)
      MyContainer     CXCursor_TemplateRef      CXType_Invalid
      MyData          CXCursor_TypeRef          CXType_Record
      PoolAllocator   CXCursor_TemplateRef      CXType_Invalid
      (no-name)       CXCursor_IntegerLiteral   CXType_Int
* See the Canonical Type!

A quick confession: I started out using Clang’s C-interface, and I’ve attempted to write this article using C-language references. However, I actually did most of my experimentation using the Python bindings. They appear a little incomplete in comparison, so it’s possible that I simply missed the correct course on this one.

There is a lot of missing data in there! First, the types for cacheMisser and sendHelp are both listed as “Unexposed”. Second, the template-ref types are invalid with no obvious way to get to the template definition. Third, nowhere is it exposed to the AST that the MyData references are actually pointers. Fourth, there is no way to know what integer value was passed to PoolAllocator in either instance. Fifth, the implicit hierarchy of the template type arguments has been flattened. What a mess!

Experimentally, I found that I could get the canonical type of the “Unexposed” type for cacheMisser and then get the declaration of the canonical type. That gave me a new CXCursor_ClassDecl cursor which made a certain amount of sense if you think of templates as meta-types and template usages as real types. The new cursor looked like this:

Cursor Text                                      Cursor Kind          Type Kind
MyContainer< MyData *, PoolAllocator< 1000 > >   CXCursor_ClassDecl   CXType_Record

On the surface, it seems very promising, but there were no children of this cursor. So while it looks like all of our information is represented at the top level, there was no way to dig into it. Fortunately, I’m not very template-heavy in my engine project, but this is something I’m still trying to work through for completeness.

A Note on Performance

Running my little utility script on a single header takes about 5 seconds from start to finish. Much of that time is because I’m using a fairly heavy pre-compiled header in my regular engine builds, so building an otherwise trivial file actually touches several dozen core-systems files in addition to any deep rabbit holes caused by C-Runtime inclusions. I’ve mitigated this for the most part by using Clang’s actual pre-compiled header functionality instead of just including my PCH as a normal header. As expected, this turns out to be a huge win when I’m running this across all my engine files. Much like ordinary PCH’s, I pay the 5-second penalty once and then each source file is virtually instantaneous.

That’s All, Folks!

I think that’ll just about do it for my engine metadata generator. While I hope someone out there can benefit from this little bit of yak-shaving for my on-going engine project, I’m sure this wasn’t a wholly original idea. I’d love to hear what others are doing in this vein using Clang or some other tools. Until next time – hopefully not in another year.

Jun 08 2011
 

Download the source code for this post (19k)

Alright then…Part 4…I think it’s time we tie the first three parts together. We want to use the type of add-in that was introduced in part 3 to fill in the knowledge gaps of the add-ins we played around with in parts 1 & 2. Furthermore, for simplicity, we want to literally combine the two types of add-ins into one DLL.

Filling in the Gaps

In part 2, I spent some time talking about all the things we don’t know when we get a raw pointer from a target process through the AutoExp.Dat file. We don’t know:

  1. The target platform’s endianness
  2. The size of an int or pointer on the target platform
  3. How to access global symbols, because there’s no general way to know where they reside in memory
  4. The data layout for a type in the target process

When our OnEnterBreakMode handler is triggered, we want to quickly gather all the extra information we might need in anticipation of our autoexp.dat handlers being triggered. We don’t know exactly what pointers will be sent to those handlers, but we can answer our nagging questions that are common to all of them.

Endianness

Unfortunately, endianness isn’t something we can just ask the debugger about, but it is almost always something we can infer. The scope of our work here covers Xbox360 debugging as well as Win32/Win64 debugging. The Windows targets will use the AutoExp.Dat file in <VisualStudioRoot>\Common7\Packages\Debugger while the Xbox360 will use the one in <XDKROOT>\bin\win32. You can use this difference to link our MyType structure to two different handler functions in our add-in, one for each byte order.

That’s probably the simplest way, but there are others. If you dig through the EnvDTE::Debugger object, you’ll see that you can get a Process2 object for the current target process and then ask it about the transport being used. That can also tell you if you’re attached to an Xbox.

Sizes of Types

At the end of part 3 of this series, I mentioned that the Debugger object has a GetExpression() function. This thing is pure gold. You can feed it strings, and it returns the evaluations of those strings just like you had typed them into the watch window. It’s a programmatic way to evaluate expressions within the context of the target process. So what if our add-in asked it to evaluate this?

CComBSTR const bsSizeOfVoidPtr(L"(int)sizeof(void*),d");
CComPtr<EnvDTE::Expression> pExpr=NULL;
hr = pDebugger->GetExpression(bsSizeOfVoidPtr,  // our string as a BSTR
                              VARIANT_FALSE,    // Don't use autoexp.dat
                              -1,               // no timeout
                              &pExpr);          // result

It would definitively tell us how large our target platform thinks a pointer is, and allow us to query nested pointer types safely.

Since this is the sort of question that will only ever have exactly one answer for the lifetime of a given process, we only want to ask it once and then cache the result for later use. Performance isn’t a huge concern inside the debugger, but we can’t ignore it either.
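
One way to arrange that ask-once behavior is sketched below. This is an illustration, not code from the sample project: it assumes the result text has already been pulled out of the returned Expression object, and EvaluateFn, TargetInfoCache, PointerSize, and Reset are hypothetical names invented for the sketch.

```cpp
#include <cassert>
#include <cwchar>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in for "ask the debugger to evaluate this string".
// In a real add-in this would wrap the GetExpression call shown above and
// return the resulting value text; here it is just a callback.
using EvaluateFn = std::function<std::wstring(const std::wstring&)>;

class TargetInfoCache
{
public:
    explicit TargetInfoCache(EvaluateFn evaluate) : m_evaluate(std::move(evaluate))
    {
        assert(m_evaluate && "an evaluator callback is required");
    }

    // Returns the target's pointer size in bytes, or 0 if it could not be
    // determined. The expression is not evaluated again after a success.
    int PointerSize()
    {
        if (m_pointerSize == 0)
        {
            const std::wstring text = m_evaluate(L"(int)sizeof(void*),d");
            wchar_t* end = nullptr;
            const long value = std::wcstol(text.c_str(), &end, 10);
            // Accept only fully parsed text and a plausible size
            // (4 or 8 is an assumption that fits Win32, Win64, and Xbox360).
            if (end != text.c_str() && *end == L'\0' && (value == 4 || value == 8))
                m_pointerSize = static_cast<int>(value);
        }
        return m_pointerSize;
    }

    // Call when a new target process is attached so a stale answer is dropped.
    void Reset() { m_pointerSize = 0; }

private:
    EvaluateFn m_evaluate;
    int m_pointerSize = 0;
};
```

A failed evaluation leaves the cache empty, so the next call simply tries again instead of remembering a bad answer.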

Global Variables

Globals are fairly similar to data sizes in the way we evaluate them. Let’s say our string table example from part 2 was instead held in a global variable. If we knew the name of that global, then we could feed it into GetExpression and get the current base pointer of the table.

Bear in mind that a table pointer might change if the table is reallocated for any reason, but the pointer to the base pointer will reside in a single place throughout the lifetime of the target process. Depending on the nature and expected lifetime of your data, you should cache it appropriately. Unfortunately, even if a value changes only once in a blue moon, you still need to check it every time.

Data Layouts

Let’s say you know there is a string data member on your object called StrId that is perfect for displaying in the watch window. You won’t ever change its name, but you’re less sure about it being shifted around inside of your object.

If you look elsewhere on this blog, you’ll find an interview question that I ask a lot. I posted it before I posted this entry because the contents are important to what we’re doing here. You can use the answer to the struct-offset question along with the GetExpression function to give you the answer you’re after for the target process.

CComBSTR const bsOffsetToStrId(L"(int)&(((MyType*)0)->StrId),d");

If we can figure out the critical part of our data layouts at debug-time, we no longer have to mirror our data in our AutoExp.Dat add-in. We don’t have to worry about things getting out of synch because we can ask the right questions against the actual target every time.
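
If you need offsets for more than one member, the expression string can be generated instead of hand-written. The helper below is a small sketch of that idea; MakeOffsetExpression is a hypothetical name, and it only builds the text that would then be handed to GetExpression.

```cpp
#include <cassert>
#include <string>

// Builds a watch-window expression that evaluates to the byte offset of
// `member` inside `type` on the target, cast to int and forced to decimal.
// Both names are pasted in verbatim, so they must be spelled the way the
// target's expression evaluator expects them.
std::wstring MakeOffsetExpression(const std::wstring& type, const std::wstring& member)
{
    assert(!type.empty() && !member.empty());
    return L"(int)&(((" + type + L"*)0)->" + member + L"),d";
}
```

Calling MakeOffsetExpression(L"MyType", L"StrId") reproduces the string used above.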

A final word about these strings we’re passing into GetExpression. As I said, they’re evaluated as though you typed them into the watch window. That means they’re subject to the platform’s specific expression evaluator nuances as well as the Hex-display button. Because these things are somewhat unpredictable and difficult to account for, I’ve taken to completely specifying what I want to get back. That’s why I’ve explicitly cast my expressions to integers and suffixed my expressions with “,d”  to force the display to decimal. It’s all about predictability.

Tying It All Together

As I’ve already mentioned, we have two types of add-ins across the first three installments. The first add-in is a Visual Studio extension that can extract lots of information from the target process. The second can’t extract much information, but has the ability to augment the watch window’s output. It’s no secret that we want the first add-in type to cache some data that’s useful to the second.

It turns out that sharing data is a pretty simple thing to do. Anywhere that you can stash data in Windows can be used to share this information between the two DLLs. The registry, named pipes, temporary files, swap-backed files, shared memory, etc. are all viable ways to make the connection, but they’re not the way I’ve used in the past. When I originally wrote the FNameAddin, I used a simple assumption and a shared secret.

I assumed that the COM add-in was always loaded at debugging time. When the AutoExp.Dat add-in is loaded, it calls LoadLibrary() on the COM DLL. I only had to know what it was called – not where it was installed – because I could assume it was already loaded. Then I could call GetProcAddress on my special GetTheDataCache() function to do the data transfer. Everything worked fine, but there were a few caveats that needed attention.

Because I had two DLLs, I had to keep their versions in synch or at least make sure they could handle the edge cases where the versions were out of synch with one another. I also had to make sure I was installing the AutoExp.Dat add-in to an appropriate folder so it would be found by the debugger.

A Key Insight, and An Unintended Benefit!

It’s not hard, but it can be a lot easier – we’re making two DLLs here, why can’t they simply be combined into the same DLL? That way, they’d share the same memory space which makes these coupling shenanigans completely unnecessary. In fact, by combining the DLLs, we don’t even have to worry about putting the AutoExp.Dat DLL in the correct place.

Remember all that messiness in part one where we used an absolute path to our add-in DLL? That requirement just went away. Because our DLL will always be loaded by virtue of being a registered Visual Studio add-in, the AutoExp.Dat mechanism won’t ever have a problem finding it. We can put it anywhere we want as long as it’s registered with Visual Studio to be loaded on start-up. Not too shabby!

The Sample

The sample code is really a re-work of the code from part 2, but it uses all of the Visual Studio extension framework we discussed in part 3. It accomplishes the same fundamental task as part 2, but it doesn’t rely on any shared secrets with respect to types or any fragile linkages. Instead, it derives all of its data at runtime using the methods discussed above. It should also be reasonably portable between Win32, Win64, and Xbox targets.

There’s a lot more code there than in the other samples, but take your time to follow what it’s doing. The sample code also includes a trivial target application. Once you compile the add-in project and use Visual Studio as the debugging target as we discussed in the other parts of this series, load the target-app project in the second Visual Studio instance. It should give you all the breakpoints you need to exercise the add-in.

Be sure to add the correct lines to the right instance of AutoExp.Dat:

VS 2008: %PROGRAMFILES%\Microsoft Visual Studio 9.0\Common7\Packages\Debugger\autoexp.dat
MyType=$ADDIN(MT_Addin4.dll,HandleMyType_WinXX);

VS 2010: %PROGRAMFILES%\Microsoft Visual Studio 10.0\Common7\Packages\Debugger\autoexp.dat
MyType=$ADDIN(MT_Addin4.dll,HandleMyType_WinXX);

Xbox360: %PROGRAMFILES%\Microsoft Xbox 360 SDK\bin\win32\autoexp.dat
MyType=$ADDIN(MT_Addin4.dll,HandleMyType_Xbox360);

Conclusion

That should just about wrap it up for this series on Visual Studio add-ins. I’ve covered enough of the techniques used in the FNameAddin that you should be able to adapt them to your projects. I hope you’ve found it useful. Please let me know if you have any questions or problems with the sample code.

 

May 21 2011
 

For years, I’ve tortured DigiPen grads and intern candidates alike with this whiteboard interview question. Frankly, I’m sick of it and I need to switch things up. I’d expect senior engineers to get it fairly quickly, and on-the-metal types should just know this by rote. I wanted to see if the junior engineers could think through the problem logically, be circumspect in their reasoning, and if they were comfortable with pointers and basic compiler concepts. So here goes…

Given a simple C-structure:

struct foo_t
{
    int a;
    float b;
    char c[3];
    void * zPtr;
};

I would like you to write some code that outputs the number of bytes from the top of the structure to the beginning of the zPtr structure member. Your answer should be portable, fast, and resilient to code changes.

Aaaand Go! When you think you have it, the answer is over here.

 

May 21 2011
 

Download the source code for this post (10k)

In the previous installments of this series, we worked completely within the confines of the AutoExp.dat framework which was pretty limiting. That’s not really surprising given that that mechanism dates back at least 15 years as of this writing. That was back before the dot-com bubble happened, I think most people still had non-vestigial tails, and dinosaurs still roamed the earth in some rural areas. It’s been a while. In this installment, we’re going to take things up a notch by using mechanisms that are only a mere 10 years old – that’s right, a time after most of you were born. Crazy!

We want to deal with several of the big limitations and unknowns we ran into in the previous two parts. Unfortunately, that will have to wait until part 4. First we need to update the way we hook into the debugger. Microsoft has been developing and expanding the Development Tools Environment framework or ‘DTE’ which is at the heart of Visual Studio extensibility since VS2003. It contains quite a lot of interfaces that allow you to inspect and extend several areas of Visual Studio. The scope of what is available is actually pretty impressive. This is one of the interfaces that make products like Visual Assist X possible. But trust me, there’s a lot more to it!

Within DTE, you can do simple things like add new menus and buttons, or you can write more complex tasks like traversing the solution hierarchy. You can also write code that automates certain coding tasks and even add an entirely new language. Clearly, we’re not going to use most of this stuff. For our part, we really only care about hooking some events in the debugger and then doing some simple expression evaluation.

So what’s the plan here, anyway? First, we want to get a new add-in project up and running that will host our next steps. Thankfully, the new-project wizard has a decent template to accomplish most of that. Then we want to get access to the debugger to tell us when the user is debugging something.

A Brave New Project

If you select File->New->Project and then look through the project templates under “Other Project Types” and then “Extensibility”, you’ll find “Visual Studio Add-in” as an option. There are several options in the ensuing wizard, but most of them are just nice-to-have features. For this project, I selected a C++/ATL project and disabled as many of the add-on features as possible.

Right out of the gate, we get a many-filed project with lots of inscrutable plumbing. You can download the new project here (10k). Let’s do a quick run-down of what’s in there:

stdafx.h/.cpp – As usual, this is the special header that drives the precompiled headers feature in the compiler. If you look closely, however, you’ll see several #import statements. Depending on your version of Visual Studio, they might contain some messy GUIDs or they might contain references to files like “dte80a.olb”. In either case, these are the large sets of interfaces that we’re going to be using to integrate with Visual Studio.

Addin.rgs – This odd little file is actually a registry fragment that will be used to register your add-in with Windows. Don’t panic, but we’re actually making a registered component here. The registry is the mechanism that Visual Studio uses to find your add-in. The top section registers your component with Windows while the bottom section registers your add-in with Visual Studio. Don’t worry, this stuff will happen automatically.

Addin.cpp – It might seem like this is where you’ll put the important code in your add-in, but it actually just contains some basic plumbing code that allows your add-in to be a DLL and to register itself as a control.

Addin.idl – IDL stands for interface definition language. It’s used to generate data marshaling code for your control’s type library. Again, don’t worry about this file. We won’t be editing it.

Connect.h – This is where the CConnect class lives. It is the main class for your add-in. It will implement the Visual Studio interfaces that we want to use. Notice that CConnect already derives from something called IDTExtensibility2. This is the main interface that makes a class into a Visual Studio Add-in.

Connect.cpp – THIS is where all your awesome code can go, but notice what’s in there already. It has functions called OnConnection and OnStartupComplete. These functions are part of Visual Studio’s IDTExtensibility2 interface that was mentioned above. These functions will be called at various times as your add-in is loaded. Also note in OnConnection, there is some code to fill in a class variable called m_pDTE. This is the access point to most of the things we care about in Visual Studio.

When you build, two non-standard things will happen. First, a few more files will be generated. This is the IDL file doing its thing and creating its glue code. This code is generated every time you compile, so don’t bother changing it. You don’t even have to add it to the project, so just ignore it unless you’re curious what’s in there. Second, your new control will be quietly registered with Windows. This is important when you start debugging your code. It doesn’t matter if Visual Studio is pointing to the Debug build profile. It will debug whatever profile was built most recently. Keep that in mind as a possible gotcha when things seem to be going all wrong.

Just like last time, if you run your add-in, you will start a new copy of Visual Studio – it’s like launching an aircraft carrier to test out the galley appliances. Go ahead and look under “Tools->Add-in Manager…”. You should see MT_Addin3 on the list of add-ins and it should be activated. If you put some breakpoints in Connect.cpp before you ran, you should have gotten a break or two in OnConnection, OnStartupComplete, or one of the other IDTExtensibility2 handlers.

Hooking a Debugger

(Easier than debugging a hooker! – hey-o!)

Everything we’ve touched on so far is just to get an add-in working inside of the Visual Studio framework. It doesn’t yet give us the notification we really want. We want to get the chance to act just before the watch windows are updated. In other words, we want to be told whenever the target program is halted, and the user can inspect variables. That requires that we hook the debugger events interface.

We only need to hang a few new bits of code on our CConnect class in order for it to handle debugger events. For now, we only care about the event called OnEnterBreakMode. This is the event that is fired every time the debugger stops the execution of the target process. The obvious case is when the debugger encounters a breakpoint, but this function will also be called every time the user steps into, out of, or over a line of code.

You can follow along in the code sample for these changes; I’ve denoted each one with the comment:

//*** Hooking debugger events ***/

The first thing we need to do is make our CConnect class derive from the debugger events interface. This is as simple as adding the following line of code to the class inheritance in Connect.h:

IDispEventImpl<0, CConnect, &EnvDTE::DIID__dispDebuggerEvents, &EnvDTE::LIBID_EnvDTE, 8, 0>

That’s a bit of a mouthful, so in the sample code, I use a simple typedef to make it easier.

The next thing we need to do is tell CConnect what events it should expect and then what to do with them. For this, we need to add a sink-map:

BEGIN_SINK_MAP(CConnect)
 SINK_ENTRY_EX(0, EnvDTE::DIID__dispDebuggerEvents, 3, OnEnterBreakMode)
END_SINK_MAP()

This is essentially saying that when the EnvDTE::DIID__dispDebuggerEvents interface sends event #3 our way, we should shunt it to a function called OnEnterBreakMode. But wait, that seems a little voodoo – just a magic number three? Honestly, COM is hardly my strong suit, and I’m still trying to track down the origin of this value. It’s likely tied up in the definition of the IDispatch interface that’s part of the debugger events object.

The other part of this event-sink mechanism is attached and detached at run-time from the Add-in’s OnConnection and OnDisconnection handlers. We use the DTE interface once again to get access to the events interface and then ask for the debugger events interface specifically. Once we have the debugger events interface, we can simply register that we want to be notified about those events.

Finally, we declare the CConnect::OnEnterBreakMode function in Connect.h and make a stub for it in Connect.cpp.

A Quick Test Drive

Set a breakpoint in the OnEnterBreakMode function and run the project. When the other copy of Visual Studio comes up, load up another project in it, set some breakpoints, and then run THAT project. When one of those breakpoints hits, you should get popped all the way back to the first Visual Studio with the Add-in project. Neat, huh? Now we can really do whatever we want within Visual Studio, so take a look at what’s available. Next time we’ll be looking at GetExpression specifically.

Ok, that’s it for now. My apologies that this posting was mostly just a building block for what’s to come. I promise we’ll tie some stuff together next time out.

 

Apr 18 2011
 

Download the source code for this post (5k)

Last time, we made an addin that, admittedly, wasn’t very useful. This time we’ll fill it out a little and discuss some of the caveats involved in these sorts of things.

MyType

Now that we want to query memory, we’ll need to fill out the details of MyType. Let’s go with this:

struct MyType
{
    int Integer1;
    int Integer2;
    char * BigString;
    char * StringList[10];
    int StringIndex;
};

Get the latest project here (5k). Ultimately, we’ll look at three different ways to visualize this structure, but for now, we’ll focus only on the first two integers. The rough set of steps we need are:

  1. Get the pointer value from GetRealAddress()
  2. Query Integer1
  3. Check for read errors
  4. Query Integer2
  5. Check for read errors
  6. Compose the output string and return

If you look at the HandleMyType_TwoIntegers() function, you can see the steps called out. Notice that we check for errors a lot – this is very important! Treat your addin code as though you’re literally adding code to Visual Studio because that is in essence what you’re doing. If your addin crashes or scribbles on memory, it can destabilize Visual Studio.

On the bright side, we’re finally querying memory, and it’s pretty straightforward. Unfortunately, the code is hiding a lot of bad assumptions. We’ll get into that a little later after we cover the other two examples.

Example: Big String

The second example is a little more complex. We want to read a single string from a character pointer. This means we’re going to dig through a pointer indirection as well as deal with a string of indeterminate size.

C-strings are a common data type that might be read from a debugger addin, but dealing with them can be problematic. First, we have to get the string-pointer out of the structure and then we have to read the character data. Unfortunately, we don’t know how much data to query; it could be a single character or many kilobytes long. My solution is to query 1k at a time, concatenate the data, and then check the new data for the string terminator.

There is an additional caveat that I’ve encountered, however. On some platforms, if you read off the end of a valid memory page and into an invalid one, the read will simply be truncated. Other platforms will simply return a read-failure. As a result, I always read up to 1k at a time and clip the reads to page boundaries. There’s also the possibility that the string isn’t terminated at all. We don’t want to create a dangerous situation where we keep reading memory until we happen upon a terminating null by accident. So regardless of the data, limit the read to a reasonable length. Certainly limit it to the size of the results buffer.

Example: String Table

The truth is that the second example was too complicated. We could have accomplished it in the autoexp.dat file without any addin dll at all. I included it in order to introduce the string-reading code with minimal complications. The third sample on the other hand can’t be accomplished without one. We have an array of strings and an index. The output should be the string that corresponds to the index. This was meant to mimic standard identifier schemes that are used in some game engines.

The steps we have to take are:

  1. Get the pointer value from GetRealAddress()
  2. Query the StringIndex
  3. Check for read errors
  4. Bounds check the index value
  5. Calculate the offset into the string table at the index
  6. Query the pointer value at that offset
  7. Check for read errors
  8. Handle the NULL pointer case
  9. Query the string bytes at the pointer value
  10. Check for read errors
  11. Compose the output string and return
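
The steps above can be sketched as follows. Treat it as an outline under stated assumptions rather than the sample's implementation: ReadFn is a hypothetical stand-in for the debug helper's memory query, MyTypeLayout and FormatStringTableEntry are invented names, member offsets are passed in instead of being taken from a local copy of the structure, and the host and target are assumed to share a little-endian byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <functional>

// Hypothetical stand-in for the debug helper's memory query: copies `want`
// bytes from target address `addr` into `dst` and returns false on failure.
using ReadFn = std::function<bool(std::uint64_t addr, void* dst, std::size_t want)>;

// Where the interesting members live in the target's MyType. These are
// inputs because the add-in cannot safely assume them.
struct MyTypeLayout
{
    std::size_t pointerSize;        // 4 or 8 bytes
    std::size_t stringListOffset;   // offset of StringList[0]
    std::size_t stringIndexOffset;  // offset of StringIndex
    std::int32_t stringCount;       // number of entries in StringList
};

// `object` is the address obtained in step 1.
void FormatStringTableEntry(const ReadFn& read, std::uint64_t object,
                            const MyTypeLayout& layout,
                            char* result, std::size_t maxResult)
{
    assert(read && result != nullptr && maxResult > 0);
    assert(layout.pointerSize == 4 || layout.pointerSize == 8);

    // Steps 2-3: query StringIndex and check for read errors.
    std::int32_t index = 0;
    if (!read(object + layout.stringIndexOffset, &index, sizeof(index)))
    {
        std::snprintf(result, maxResult, "<can't read StringIndex>");
        return;
    }

    // Step 4: bounds check the index value.
    if (index < 0 || index >= layout.stringCount)
    {
        std::snprintf(result, maxResult, "<index %d out of range>", static_cast<int>(index));
        return;
    }

    // Steps 5-7: locate the table slot and query the pointer stored there.
    const std::uint64_t slot = object + layout.stringListOffset
                             + static_cast<std::uint64_t>(index) * layout.pointerSize;
    std::uint64_t stringPtr = 0;
    if (!read(slot, &stringPtr, layout.pointerSize))
    {
        std::snprintf(result, maxResult, "<can't read StringList[%d]>", static_cast<int>(index));
        return;
    }

    // Step 8: handle the NULL pointer case.
    if (stringPtr == 0)
    {
        std::snprintf(result, maxResult, "<null>");
        return;
    }

    // Steps 9-11: copy characters until the terminator or the buffer is full.
    // One byte per read keeps the sketch short; it would be slow in practice.
    std::size_t length = 0;
    while (length + 1 < maxResult)
    {
        char c = '\0';
        if (!read(stringPtr + length, &c, 1))
        {
            std::snprintf(result, maxResult, "<can't read string data>");
            return;
        }
        if (c == '\0')
            break;
        result[length++] = c;
    }
    result[length] = '\0';
}
```

Every failure path still leaves a readable message in the result buffer, which is friendlier in the watch window than an empty string.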

Assumptions and Caveats

The addins mechanism is fairly useful, but it gives you very little solid information on the nature of the target process or the hardware it’s running on. Earlier, we dealt with a potential problem reading strings, but there are several more places where we have incomplete information.

Fundamental Data Sizes

How big is an integer? It may be 32-bits on your development PC, but it could be 64-bits on the target process. More importantly though, how big is a pointer? The same PC might use 32-bits for some processes and 64-bits for others. Having data with potentially different sizes on the target process can also shift other data around inside of structures. While we’re on the topic of structures, we have to be aware of alignment. The example code has a copy of the structure in the addin, but that’s a cop-out. There are no guarantees that the local version and the target version are the same.

All of this points to a need to be circumspect in how you structure your queries. Fortunately, there is a way to be 100% sure without making assumptions which we’ll cover in an upcoming post.

Endianness

Visual Studio is a Windows PC application, so the debugger is running natively as little-endian. However, you’re only ever reading raw data from target process memory. That means that all data you read is in the target’s endian-format. If you’re drilling through layers of pointers to pointers, you’ll have to read data, swap endian, read-data, swap-endian, etc etc.
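
A small decoding helper keeps that read-and-swap chain manageable. The sketch below is an illustration, not code from the sample project; DecodeTargetUnsigned is a hypothetical name, and it takes the integer size as a parameter so the same routine can serve 32-bit and 64-bit values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assembles an unsigned integer from `size` raw bytes read out of the target.
// Building the value one byte at a time gives the same answer no matter what
// the host's own byte order is.
std::uint64_t DecodeTargetUnsigned(const unsigned char* bytes, std::size_t size,
                                   bool targetIsBigEndian)
{
    assert(bytes != nullptr && size >= 1 && size <= 8);
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < size; ++i)
    {
        // Walk from the most significant byte to the least significant one.
        const std::size_t index = targetIsBigEndian ? i : (size - 1 - i);
        value = (value << 8) | bytes[index];
    }
    return value;
}
```

Each pointer in a chain gets decoded this way before it is used as the address for the next read.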

Remember that endianness can apply to wide strings as well. Before we can convert a wide string to a narrow one, we have to flip each character to be local-endian.
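
As a rough sketch of that flip-then-convert step, assuming 16-bit wide characters on the target: SwapEndian16 and NarrowTargetWideString are hypothetical names, and replacing anything outside 7-bit ASCII with '?' is a simplification made only to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Exchanges the two bytes of a 16-bit value.
std::uint16_t SwapEndian16(std::uint16_t v)
{
    return static_cast<std::uint16_t>((v >> 8) | (v << 8));
}

// Converts up to `count` 16-bit characters read from the target into a
// narrow string, stopping at the terminator. `needsSwap` is true when the
// target's byte order differs from the host's.
std::string NarrowTargetWideString(const std::uint16_t* chars, std::size_t count,
                                   bool needsSwap)
{
    assert(chars != nullptr || count == 0);
    std::string out;
    out.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
    {
        const std::uint16_t c = needsSwap ? SwapEndian16(chars[i]) : chars[i];
        if (c == 0)
            break;
        out.push_back(c < 0x80 ? static_cast<char>(c) : '?');
    }
    return out;
}
```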

Next time

We’ve gone just about as far as we can go with bog-standard addin dlls in the autoexp.dat file. Next time, we’ll look at creating a more versatile kind of debugger addin that gives us more certainty in dealing with the caveats, but with a marked increase in complexity.

Apr 18 2011
 

I had this kicking around on my hard drive, so I thought I’d throw it out there. It’s not exactly a big secret, but it certainly amused me when I wrote it. Use CWARNING and CERROR to add compile-time items to the Error List window in Visual Studio.

This is Visual Studio specific, but can be adapted to work with other compilers and environments. The use of __pragma appears to be Microsoft-specific, but the C99 and C++11 standards introduce the _Pragma operator.

#ifndef _COMPILEWARNINGS_H_
#define _COMPILEWARNINGS_H_

#define __NUMTOSTRINGHELPER2(x)	#x
#define __NUMTOSTRINGHELPER(x)	__NUMTOSTRINGHELPER2(x)
#define __FILE_AND_LINE__       __FILE__"(" __NUMTOSTRINGHELPER(__LINE__) ")"

#define __ADDWARNING__(msg)     __FILE_AND_LINE__" : warning : " msg
#define __ADDERROR__(msg)       __FILE_AND_LINE__" : error : " msg
// edit 2014-24-02 - CINFO is not correct...still looking for the answer here
#define __ADDINFO__(msg)        __FILE_AND_LINE__" : info : " msg

#define CWARNING(msg)           __pragma(message(__ADDWARNING__(msg)))
#define CERROR(msg)             __pragma(message(__ADDERROR__(msg)))
#define CINFO(msg)              __pragma(message(__ADDINFO__(msg)))

#endif // _COMPILEWARNINGS_H_

CWARNING("This is just a warning");
CERROR("If you get this error, call for help and hide under your desk!!");

Apr 15 2011
 

Download the source code for this post (3k)

Greetings from Seattle! This is the start of a multi-part series about what I’ve learned about Visual Studio debugger addins while writing the FNameAddin.

First, however, we need to take a minute to discuss some basic stuff that most engineers I’ve met are aware of, but few bother to dig into. The autoexp.dat file is a peculiar beast in that you are encouraged to add to it, yet it is squirrelled away deep in the Visual Studio installation folder. It can be customized for your project, but has to be shared among all projects. All in all, it seems poorly thought out. We’ll blast through the really basic stuff first, so anyone who is seeing this for the first time should read up on the supplementary links.

AutoExp.dat

Deep in the Visual Studio install directory, at <VisualStudioInstall>/Common7/Packages/Debugger, there lives the autoexp.dat file. It’s just a text file, so go ahead and open it up and take a look around if you’ve never done that before. This is where you can give the Visual Studio debugger some hints as to how you want your data to be displayed in the watch window. There are three sections: [AutoExpand], [Visualizer], and [hresult]. We’ll only deal with the first one for the time being.

The AutoExpand section is fairly simple and quite well documented at the top of the file, so I won’t bother with it here. Suffice it to say, you can crack lots of simple data types by just adding them to the mix and possibly using one of the type suffixes. Go ahead and play around with this stuff. It’s very safe in that it won’t destabilize Visual Studio if you get it wrong, and the debugger reloads the file every time you start a new debugging session, so feel free to edit and play around – just make a copy of the original file before you start. It’s always nice to revert in a pinch.

I’m far more interested in covering the $ADDIN-dlls functionality that is briefly mentioned. The file recommends looking at Microsoft’s EEAddin sample in order to get started, but you should use some caution. It turns out that EEAddin has been broken for as long as I can remember and will crash if you use it as-is – how unfortunate! So let’s take a different route. Let’s make our own sample: MT_Addin.dll. The process looks like this.

  1. Make a new Win32 DLL project called MT_Addin.
    • I ripped out the PCH plumbing to reduce the file count, but you’ll probably want to leave it in.
  2. Write your handler function using the correct function signature. (see below)
  3. Add a .def file so our handler function is exported with a nice undecorated name.
  4. Add an entry in the autoexp.dat file that references our Type, DLL and entry point
    • MyType=$ADDIN(<MT_AddinProjectDir>\Debug\MT_Addin.dll,HandleMyType)

You can download my sample project here (3k)

Ok, now let’s see what we have. Starting with the cpp file, we have our handler function with the following signature:


ADDIN_API  HRESULT  WINAPI HandleMyType( DWORD /*dwAddress*/
                                        , DEBUGHELPER * pHelper
                                        , int nBase
                                        , BOOL /*IsUnicode*/
                                        , char * pResult
                                        , size_t maxResult
                                        , DWORD /*reserved*/ );

There are a few things to note here. First, we declare the entry point as WINAPI which is just a #define for the __stdcall calling convention. This is what’s missing from the EEAddin sample that causes it to crash. Next, notice that we don’t use the dwAddress or IsUnicode arguments. It might seem a little strange given that the whole point of this exercise is to look up addresses in the target process, but these are really just legacy arguments.

The most important thing here is the DEBUGHELPER. Strangely, you have to define the type for yourself even though the format isn’t under your control. (You can find it in MT_Addin.h) Fortunately, it’s not that complex and contains very little nuance.

dwVersion – Structure version – <0x20000 for VS6, >=0x20000 for VS7 and beyond
GetDebugeeMemory() – Useful for VS6, otherwise deprecated. Doesn’t support 64-bit pointers.
GetRealAddress() – Gets the 64-bit value of the variable to be inspected. (Replaces dwAddress)
GetDebugeeMemoryEx() – Query raw memory from the target process.
GetProcessorType() – Its use seems clear enough, but it always returns 0 for me.

 

Our example doesn’t do any memory queries at all, but it does create a simple string to tell you it’s working. Finally, it returns S_OK for success – your addin should always return S_OK even when something goes wrong. When data can’t be interpreted correctly, you should return success and set the output string to “Error processing the data!” because returning an error will only display “???” in the watch window. Think of the return code as meaning “the addin succeeded/failed” instead of “processing this data type succeeded/failed”.
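
That return-code convention can be captured in one small helper. This is a sketch only: FinishHandler and kHandlerSucceeded are hypothetical names, and the success value is written out as 0 (the numeric value of S_OK) so the example compiles without the Windows headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Numeric value of S_OK, spelled out so the sketch has no Windows dependency.
const long kHandlerSucceeded = 0;

// Writes the watch-window text and reports success either way, so the
// debugger shows our message instead of "???" when the data is bad.
long FinishHandler(bool dataOk, const std::string& text,
                   char* pResult, std::size_t maxResult)
{
    assert(pResult != nullptr && maxResult > 0);
    const char* shown = dataOk ? text.c_str() : "Error processing the data!";
    // snprintf always NUL-terminates and truncates if the buffer is small.
    std::snprintf(pResult, maxResult, "%s", shown);
    return kHandlerSucceeded;
}
```

A handler would compute its display text, note whether anything went wrong along the way, and then end with a single call to this helper.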

Installing, Testing, and Debugging

In order for Visual Studio to find your DLL, we have two options – specify an absolute path in the autoexp.dat file or copy our DLL into the “<VisualStudioInstall>\Common7\IDE” folder. I recommend the former method because it prevents pollution of the Visual Studio folder, and it facilitates debugging the addin.

In order to test the addin, we need a second project that contains a type called “MyType”. At this point, it really doesn’t matter what MyType contains since our addin doesn’t actually query process memory. We just need to match the type name in order to invoke our handler – so go ahead and try it. If you put a MyType variable in the watch window, it should display, “Handled Data Type!!” Don’t worry, we’ll eventually make some memory queries.

Debugging these sorts of addins is fairly straightforward. Edit the addin project’s Debugging properties so that the Command item points to the DevStudio IDE binary – it lives at “<VisualStudioInstall>\Common7\IDE\devenv.exe”. When you run, your DLL won’t be loaded, so your breakpoints are invalid initially. Don’t worry about it though. The debugger will load your dll just in time and your breakpoints will go active. In fact, it will load and unload your DLL any time it’s required.

Next Time

Ok, that’s it for now. I think we have the basics pretty well covered. There’s a bit more to do with the standard addins, but hopefully things get a bit more interesting when we get to the Visual Studio extensibility framework.