How To Inject a Managed .NET Assembly (DLL) Into Another Process

Introduction

Today we’re going to kill two birds—

  • How to inject a .NET assembly (DLL) into a native process?
  • How to inject a .NET assembly (DLL) into a managed process?

—with one stone: by using the CLR Hosting API.

But first, let’s talk about monsters.

To many .NET developers, the .NET runtime is like the slobbering, snaggle-toothed monster sitting in your barcalounger, that everybody in the household politely ignores because hey. It’s a frikkin’ monster; don’t make eye contact with it.

The monster occasionally grunts at you. It smells like a wet dog. And it clutters up half your living room with its bulk. But you tolerate it, because this is a useful monster. It vamooses those sticky kitchen leftovers; does the laundry; feeds the dog; vacuums the carpet; and makes sure the doors are locked at night. It takes care of you, even if your home doesn’t quite feel like your home anymore.

Now, we’re accustomed to thinking of Windows applications as being either:

  • Native OR
  • Managed

Either our application has a big, smelly, snaggle-toothed monster sitting quietly on the barcalounger, or it doesn’t.

Right?

Well, what if I said that the whole managed-vs.-native dichotomy is an illusion that’s been pulled around your eyes to blind you to the truth?

Morpheus: The Matrix is everywhere. It is all around us. Even now, in this very room. You can see it when you look out your window or when you turn on your television. You can feel it when you go to work… when you go to church… when you pay your taxes. It is the world that has been pulled over your eyes to blind you from the truth.

Neo: What truth?

Morpheus: That you are a slave, Neo. Like everyone else you were born into bondage. Into a prison that you cannot taste or see or touch. A prison for your mind.

What truth? you ask.

That you are locked in a programmatic worldview designed to turn a human being into one of these:

Okay, scratch that.

Cut.

After having a friend review this article prior to posting, I’m told I’m way out on a limb here.

James bud—you’re crazy. The .NET framework is nothing like the Matrix. Stop using the Matrix in your posts. For starters, .NET uses Unicode internally, whereas the Matrix uses [censored].

Ah. Right. The old what-programming-language-do-they-use-in-the-Matrix? debate.

I guess the point I was trying to make is this: there’s no fundamental difference between a managed process and a native one on Windows. A managed process is simply a native process in which some special code which we call the “.NET runtime” happens to be running, and in which we have access to a special library known as the “.NET framework”.

Now, this “.NET runtime” code is capable of doing some special things, to be sure. For example, it knows how to generate binary executable code on the fly in a process known as just-in-time compilation. It knows how to rig things properly such that “managed-ness” happens correctly, that our managed code runs by the rules we expect it to run by, and so forth.

But applications are not “managed” or “native”. They’re always native. Sometimes they eruct an infrastructure known as the managed runtime, and we can then start calling them “managed” but they never lose that core nativity. In fact, it’s impossible to execute managed code without executing native code! The entire phrase is a misnomer!

There is no such thing as managed executable code. Only managed code which is later converted to executable code.

Like everything else in programming, the managed runtime is an illusion, albeit a useful one.

Loading the .NET Runtime Yourself

So if a managed process is simply a native process in which some special code is running, there must be a way to load the .NET infrastructure into a native, non-.NET process, right?

Right.

Surely this is a complex and messy process, requiring special knowledge of Windows and .NET framework internals?

Maybe, but all the complexity has been placed behind one of these:

[image: a big START button]

I say this because starting the .NET runtime is (pretty much) a one-liner, by design:

HRESULT hr = pointerToTheDotNetRuntimeInterface->Start();

The only trick is getting the target process (the process into which you’d like to inject your managed assembly) to execute this code.

Let’s explore.

Step 1: Create the Managed Assembly

So you have some managed code you’d like to run inside the target process. Package this code inside (for example) a typical .NET class library. Here’s a simple C# class containing one method:

using System;
using System.Windows.Forms;

namespace MyNamespace
{
    public class MyClass
    {
        // This method will be called by native code inside the target process…
        public static int MyMethod(String pwzArgument)
        {
            MessageBox.Show("Hello World");
            return 0;
        }
    }
}

This method should take a String and return an int (we’ll see why below). This is your managed code entry point—the function that the native code is going to call.

Step 2: Create the Bootstrap DLL

Here’s the thing. You don’t really “inject” a managed assembly into another process. Instead, you inject a native DLL, and that DLL executes some code which invokes the .NET runtime, and the .NET runtime causes your managed assembly to be loaded. 

This makes sense, as the .NET runtime understands what needs to happen in order to load and start executing code in a managed assembly.

So you’ll need to create a (simple) C++ DLL containing code similar to the following:

#include <windows.h>
#include "MSCorEE.h"

#pragma comment(lib, "mscoree.lib")

void StartTheDotNetRuntime()
{
    // Bind to the CLR runtime...
    ICLRRuntimeHost *pClrHost = NULL;
    HRESULT hr = CorBindToRuntimeEx(
        NULL, L"wks", 0, CLSID_CLRRuntimeHost,
        IID_ICLRRuntimeHost, (PVOID*)&pClrHost);

    // Push the big START button shown above
    hr = pClrHost->Start();

    // Okay, the CLR is up and running in this (previously native) process.
    // Now call a method on our managed C# class library.
    DWORD dwRet = 0;
    hr = pClrHost->ExecuteInDefaultAppDomain(
        L"c:\\PathToYourManagedAssembly\\MyManagedAssembly.dll",
        L"MyNamespace.MyClass", L"MyMethod", L"MyParameter", &dwRet);

    // Optionally stop the CLR runtime (we could also leave it running)
    hr = pClrHost->Stop();

    // Don't forget to clean up.
    pClrHost->Release();
}

This code makes a few simple calls to the CLR Hosting API in order to bind to and start the .NET runtime inside the target process.

  1. Call CorBindToRuntimeEx in order to retrieve a pointer to the ICLRRuntimeHost interface.
  2. Call ICLRRuntimeHost::Start in order to launch the .NET runtime, or attach to the .NET runtime if it’s already running.
  3. Call ICLRRuntimeHost::ExecuteInDefaultAppDomain to load your managed assembly and invoke the specified method—in this case, “MyClass.MyMethod”, which we implemented above.

The ExecuteInDefaultAppDomain method loads the specified assembly and executes the specified method on the specified class inside that assembly. The method must take a single parameter of type string, and it must return an int. We defined exactly such a method above, in our C# class.

ExecuteInDefaultAppDomain will work for the majority of applications. But if the target process is itself a .NET application, and if it features multiple application domains, you can use other methods on the ICLRRuntimeHost interface to execute a particular method in a particular domain, to enumerate application domains, and so forth.

The single toughest thing about getting the above code running is dealing with the fact that the CLR Hosting API is exposed, like many core Windows services, via COM, and working with raw COM interfaces isn’t everybody’s idea of fun.
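
One practical wrinkle: the bootstrap DLL has to call StartTheDotNetRuntime once it’s been loaded into the target process. Here’s a minimal sketch of one way to do that (the thread-proc name is arbitrary and error handling is omitted). Spinning up a worker thread matters, because doing heavyweight work like starting the CLR directly inside DllMain risks deadlocking on the loader lock:

#include <windows.h>

void StartTheDotNetRuntime();   // the function shown above

// Worker thread: runs outside the loader lock, so it's safe to start the CLR here.
static DWORD WINAPI BootstrapThreadProc(LPVOID)
{
    StartTheDotNetRuntime();
    return 0;
}

BOOL APIENTRY DllMain(HMODULE hModule, DWORD dwReason, LPVOID)
{
    if (dwReason == DLL_PROCESS_ATTACH)
    {
        DisableThreadLibraryCalls(hModule);

        // Defer the CLR startup to a separate thread.
        CreateThread(NULL, 0, BootstrapThreadProc, NULL, 0, NULL);
    }
    return TRUE;
}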

Step 3: Inject the Bootstrap DLL into the Target Process

The last step is to inject the bootstrap DLL into the target process. Any DLL injection method will suffice, and as this topic is covered thoroughly elsewhere on the Internet and here on Coding the Wheel, I won’t rehash it. Just get your bootstrap DLL into the target process by any means necessary.
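
If you need a refresher, here’s a bare-bones sketch of the classic CreateRemoteThread-plus-LoadLibrary approach. The function name is mine, the process ID and DLL path are placeholders, and most error handling is omitted:

#include <windows.h>
#include <wchar.h>

// Inject the DLL at pszDllPath into the process identified by dwProcessId.
bool InjectDll(DWORD dwProcessId, const wchar_t* pszDllPath)
{
    HANDLE hProcess = OpenProcess(PROCESS_ALL_ACCESS, FALSE, dwProcessId);
    if (!hProcess)
        return false;

    // Copy the DLL path into the target process's address space.
    SIZE_T cbPath = (wcslen(pszDllPath) + 1) * sizeof(wchar_t);
    LPVOID pRemotePath = VirtualAllocEx(hProcess, NULL, cbPath, MEM_COMMIT, PAGE_READWRITE);
    WriteProcessMemory(hProcess, pRemotePath, pszDllPath, cbPath, NULL);

    // kernel32.dll is mapped at the same base address in processes of the same
    // bitness, so the local address of LoadLibraryW is valid in the target too.
    LPTHREAD_START_ROUTINE pfnLoadLibrary = (LPTHREAD_START_ROUTINE)
        GetProcAddress(GetModuleHandleW(L"kernel32.dll"), "LoadLibraryW");

    // The remote thread calls LoadLibraryW(pRemotePath), which pulls our
    // bootstrap DLL into the target process and runs its DllMain.
    HANDLE hThread = CreateRemoteThread(hProcess, NULL, 0, pfnLoadLibrary, pRemotePath, 0, NULL);
    if (hThread)
    {
        WaitForSingleObject(hThread, INFINITE);
        CloseHandle(hThread);
    }

    VirtualFreeEx(hProcess, pRemotePath, 0, MEM_RELEASE);
    CloseHandle(hProcess);
    return hThread != NULL;
}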

Conclusion

That’s it.

Of course, we’ve only scratched the surface of the CLR Hosting API and its powers. For a more in-depth description, consult Bart De Smet’s CLR Hosting Series. If that’s more detail than you need, Damian Mehers has an interesting post describing how he used .NET code injection to tweak Windows Media Center. Last but not least, if you’re planning on hosting the CLR in production code, you might want to go all-out and pick up a copy of Customizing the Microsoft .NET Framework Common Language Runtime from Microsoft Press, though this shouldn’t be necessary for everyday use.

And for those of you who are following the poker botting series, this technique allows you to have the best of both worlds: you can create AI or controller logic using a managed language like C#, while retaining most of the benefits of having code run inside the poker client process.

Good luck.

Learning To Drive a Stick Shift

I’ve always thought that programmers should know how to drive a stick shift.

Pop the clutch, wrestle it into 2nd, finesse the gas, let the acceleration plaster you into those plush, faux-leather seats.

This is driving.

Now, I don’t have anything against automatic transmissions. In fact, I love automatic transmissions:

  • C#
  • Java
  • VB.NET
  • PHP
  • Ruby
  • Python

But a programmer should know, or at least be familiar with, the low-level stuff.

  • C/C++
  • Assembler

That means: pointers. Threads. DLLs. Import tables. Memory allocation. Nuts and bolts. Stuff that’s often abstracted away for us quite nicely by the .NET framework or the JVM. In a recent Stackoverflow.com podcast, Joel Spolsky makes the point:

But, but, but, but, but you see the thing about C is it’s not COBOL, which is the language that the old programmers used, it’s the language that is closer to the machine.  And so, things like stacks, they’re still going on there, and, and malloc and memory allocation in C, yeah it’s a lot of manual work that’s done for you automatically in modern languages, but it’s done for you automatically-it’s still getting done.  And so studying C, for example, is like learning how to drive a stick shift car.  It’s learning how your car works, and what the connections are, and how the gears work, and basically the main parts of the drive shaft, and parts of your car.  And you don’t have to know about them to drive an automatic, it’s true, but to really be a person who designs cars, or is an expert at cars, or who’s an auto mechanic, or who’s just kind of a wizard at cars, you have to know that anatomical stuff.  So you’re not really-I don’t think C is an obsolete language, because it’s just an expression of assembler.  It’s just an easier to use expression of assembler language.  And it reflects what actually goes on at the chip

Joel, you stole my metaphor. Or rather, my simile.

The typical attitude around the development water cooler is that “you only need C++ for performance, and performance is a non-issue in most applications.” But this kind of either/or thinking misses the forest for the trees. As Visual C++ team member Stephan T. Lavavej put it in The Future of C++:

Aside from the elevator controllers and supercomputers, does performance still matter for ordinary desktops and servers?  Oh yes.  Processors have finally hit a brick wall, as our Herb Sutter explained in 2005 at http://gotw.ca/publications/concurrency-ddj.htm .  The hardware people, who do magical things with silicon, have encountered engineering limitations that have prevented consumer processors from steadily rising in frequency as they have since the beginning of time.  Although our processors aren’t getting any slower, they’re also not getting massively faster anymore (at least, barring some incredible breakthrough).  And anyways, there isn’t plenty of room at the bottom anymore.  Our circuits are incredibly close to the atomic level, and atoms aren’t getting any smaller.  The engineering limit to frequency has simply arrived before the physical limit to circuitry.  Caches will continue to get larger for the foreseeable future, which is nice, but having a cache that’s twice as large isn’t as nice as running everything at twice the frequency.

But let’s ignore the mythology that performance isn’t an issue. Let’s just admit the following truths, and hold them to be self-evident:

  • That C/C++ is the most performant language in the world, with the exception of assembler itself.
  • That C++ brings this performance to bear using a high-level syntax which is remarkably similar to the syntax of Java and C#—because Java and C# borrowed heavily from C++.
  • That for some desktop applications, performance isn’t an issue.

But these truths have little or nothing to do with why every programmer should know C and C++.

They have little or nothing to do with why I say: if you want to be a top-notch programmer, you can no more afford to ignore the C and C++ languages than a civil engineer can afford to ignore the difference between a plumb line and a snap line, a right angle and an oblique one.

By saying that, I’m not implying that I consider myself to be a top-notch programmer. And I’m not recommending that you or your organization start or continue developing in C++ as a way of conducting business. That’s a considerably more complex question, and one nobody can really answer but you.

I’m not starting or contributing to a language war here, of any kind. Not even a minor skirmish.

No, I recommend learning C++ for the most selfless and utilitarian of reasons: learning C/C++ will make you a better programmer in your language of choice, whatever it may be.

Ruby. Java. C#. VB.NET. And yes, even Haskell. Whatever you program in: C++ will make you better at it.

Take a typical high-level programmer and immerse him for just a few months in a pot of boiling oil, er, a native C++ project:

  • Processes and threads
  • Stacks and heaps
  • Pointers and references
  • Memory addresses
  • Why hexadecimal is your friend
  • How strings are represented in memory

He’ll emerge leaner, tougher, and with renewed confidence in the language of the day, whether it’s C#, C++, or Ruby. He’ll know, for example, exactly what it means to push a variable onto the stack. And he’ll know why, when he tries to allocate and then start accessing an array with 2 million elements, the hard drive churns even though the machine has 4GB of memory.

And when the whole application crashes, and spits out some god-forsaken maintenance error code along with a 4-byte or 8-byte memory address: he’ll have some clue what it means, or how to go about finding out what it means.
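
To make that concrete, here’s a trivial, purely illustrative C++ snippet of the kind of thing such a programmer stops finding mysterious: a variable living on the stack, a variable living on the heap, and the raw addresses behind both.

#include <cstdio>
#include <cstdlib>

int main()
{
    int onTheStack = 42;                         // lives in this function's stack frame
    int* onTheHeap = (int*)malloc(sizeof(int));  // lives in dynamically allocated memory
    *onTheHeap = 42;

    // Both are just addresses somewhere in the process's virtual address space.
    printf("stack variable at %p, heap variable at %p\n",
           (void*)&onTheStack, (void*)onTheHeap);

    free(onTheHeap);
    return 0;
}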

C++ informs and bolsters your software development skills across the board by acquainting you with the nuts and bolts that underlie every language in the world, almost without exception.

So the next time you have some spare time, pick up a good introductory or advanced C++ book, dust the cobwebs off of your C++ compiler, and point the nose of that lean, mean, muscle machine out your garage door, through the quaint byways of suburbia, and up into the mountains.

You’ll thank yourself for it in the interview room.

System.Object, CObject, and the Seductive Lure of Deep Inheritance

If you’ve been programming in .NET for more than about 2 minutes you know that all .NET objects implicitly derive from System.Object.

But did you know that prior to .NET, in the world of native C++, the same convention existed (and still exists) in MFC? I mean the infamous CObject, the joy and bane of many an MFC programmer’s existence. Every class in the MFC library (with very few exceptions) ultimately derives from CObject, which provides four basic services:

  • Serialization
  • Diagnostics
  • Run-Time Class Information (RTCI, not to be confused with RTTI)
  • Collections

Now, back in MFC’s heyday, developers were encouraged to derive their domain classes from CObject, so as to leverage the benefits of the CObject boilerplate. So let’s say you created a class Vehicle to use in your application. The idea was that, if you were building an MFC application, you’d go ahead and derive your Vehicle class from CObject, and be able to do things like serialization, if you wanted. For a while (a short while) this was even regarded as somewhat of a best practice, especially for developers who toed the Microsoft company line. The message was: build your application using the full power of the MFC library. The hidden subtext: abandon platform-independent solutions like the C++ standard library and do everything using MFC.
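
For the sake of illustration, here’s roughly what that looked like. The class and its fields are hypothetical, but the DECLARE_SERIAL/IMPLEMENT_SERIAL plumbing and the Serialize override are the standard CObject recipe:

#include <afx.h>

// A hypothetical MFC-era domain class that opts into CObject's services.
class CVehicle : public CObject
{
    DECLARE_SERIAL(CVehicle)   // wires up the run-time class info serialization needs
public:
    CVehicle() : m_nWheels(4) { }

    virtual void Serialize(CArchive& ar)
    {
        CObject::Serialize(ar);
        if (ar.IsStoring())
            ar << m_nWheels << m_strMake;   // writing to the archive
        else
            ar >> m_nWheels >> m_strMake;   // reading it back
    }

private:
    int     m_nWheels;
    CString m_strMake;
};

IMPLEMENT_SERIAL(CVehicle, CObject, 1)   // 1 = schema version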

The problem, of course, was that CObject was a mess, and nobody really used it unless they had to. Now, I don’t want to hear any geeks telling me that no, CObject was good, it was genius, it was brilliant. CObject was, is, and always will be crap, no offense to the team responsible for writing it, most of whom were brilliant. 

But CObject failed - miserably - in its role as a universal base class. Because the services it provided just weren’t that compelling:

  • How often do you write a class that requires explicit serialization? And if you did, would you trust MFC’s serialization, or use something else?
  • How often do you really need cooked-in CObject diagnostics, when it’s so easy to roll your own?
  • How often do you write a class that requires the (dubious) powers of explicit run-time class information? (or RTCI, not to be confused with RTTI, run-time type information) If you’re writing a lot of C++ code that has to explicitly check the type of an object at run-time, odds are you need to think about refactoring your code.

For all these reasons and more, CObject never really caught on as a “universal base class for developer-created classes”. In practice, we worked with CObject because we were working with other classes, such as derived window or control classes, that themselves derived from CObject. To derive from virtually any MFC class is to ultimately derive from CObject, way up at the top of the inheritance hierarchy.

Now, around the time that MFC was gaining in popularity, developers were already realizing that deep inheritance hierarchies are evil. Herb Sutter explains in his excellent work, Exceptional C++:

Incidentally, programmers in the habit of making this mistake (using public inheritance for implementation) usually end up creating deep inheritance hierarchies. This greatly increases the maintenance burden by adding unnecessary complexity, forcing users to learn the interfaces of many classes even when all they want to do is use a specific derived class. It can also have an impact on memory use and program performance by adding unnecessary vtables and indirection to classes that do not really need them. If you find yourself frequently creating deep inheritance hierarchies, you should review your design style to see if you’ve picked up this bad habit. Deep hierarchies are rarely needed and almost never good. And if you don’t believe that but think that “OO just isn’t OO without lots of inheritance,” then a good counter-example to consider is the [C++] standard library itself.

The MFC library is a classic example of the problems associated with deep inheritance hierarchies. In order to use a derived class such as a CListView, you have to know how each layer of the inheritance hierarchy works:

  • CListView
  • CCtrlView
  • CView
  • CWnd
  • CCmdTarget
  • CObject

You end up with a sprawling in-memory layout of a particular object, with responsibilities divided (often unevenly) among the different layers of the hierarchy. That sounds a little abstract, so let me tell you how it works in practice. In practice, you’re sitting there working with your CListView, trying to do something like cause it to refresh, or handling print preview, or some aspect of message routing, and you can’t remember where (at what level of the hierarchy) the particular service you’re looking for lives. So it’s another trip to MSDN, or fiddling around with Intellisense to figure out, aha, that particular feature lives in CWnd.

In other words, deep hierarchies are a big, confusing mess, and slapping a universal “I Am Object” base class on them doesn’t fix the problem.

So why, if that’s the case - if deep inheritance hierarchies are evil, and universal base classes along with them - why do I believe System.Object to be a brilliantly intuitive and effective universal base class? What did the .NET framework do right that MFC got wrong?

For the answer to that, you’ll have to wait, albeit with less than bated breath, for Part Two.

Choosing Between C++ and C#

During a recent Stackoverflow.com podcast, Jeff Atwood and Joel Spolsky discussed the question:

When is it correct to develop software (Windows software, in particular) in native C and/or C++?

Joel’s response was basically “almost never”. And while I wouldn’t state it quite that strongly, the point he was trying to make is that, for most applications, there exists some language X which is a better (meaning: more appropriate) choice than C++ and, especially, C.

Without getting embroiled in language wars, I can agree with this statement. C# is cleaner and more intuitive than legacy C/C++. The overwhelming consensus among Microsoft developers is that C# is the more productive language. And for all but the most intensive of applications, developer productivity trumps all other considerations.

However, as a long-time C++ and C# programmer myself, I have to say that even though C++ is often the wrong choice, there are still many, many situations in which it’s the only choice:

  • Cutting-edge 3D games.
  • Graphical and audio workstation software.
  • Large-scale productivity applications like Adobe Photoshop.
  • Legacy codebases.
  • Real-time systems of all sizes and descriptions.
  • Anything involving extreme numerical computation.
  • Large-scale data storage and retrieval.
  • Device drivers.
  • And so forth.

No other language offers the performance characteristics of C++ together with support for such a rich set of abstractions. Managed programs can get very fast, thanks to JIT compilation’s ability to optimize for the native processor, but native C++ still outperforms managed C# across the board. And for any application where you need explicit control over large amounts of memory - for example, if you’re building an RDBMS - C#’s garbage-collected approach is a deal-breaker.

Of course, most developers aren’t sitting around coding 3D video games or database engines. They’re creating enterprise software, probably web-based. Or they’re creating relatively lightweight desktop applications using WinForms. For these scenarios, C#, VB.NET, and/or C++/CLI (also known as “managed C++”) are the choices that will give an organization the most bang for its buck, in the long run.

In the non-Microsoft world, the story’s a little different. If you want to program in C# for Linux, Unix, Solaris, et al., you’ll probably end up using Mono. And the thing about Mono is that it’s open-source. Now, whatever your opinion on open-source development, one thing’s certain: the future of Mono on *nix platforms is more uncertain than the future of .NET on Windows platforms. Don’t get me wrong: Mono is supported by a thriving, vibrant developer community, and in all likelihood it’ll be around…oh…approximately forever.

But it doesn’t have the weight of the Microsoft juggernaut behind it.

The thing about this particular question - I mean the question of C# vs. C++ - is that there just aren’t that many borderline cases. Usually, some characteristic of the project will jump out at you, and it’ll be obvious which language is appropriate. If you’re doing hardcore systems or application development, C++ is probably for you. For everything else, C# is probably the way to go. And in practice you’ll find that it’s something like 80% C# and 20% C++, give or take 10%.

Now, as hardware continues to improve, expect this percentage to change. If you don’t need the performance, there’s very little reason to use C++ these days at all. Managed C# applications running on today’s hardware are visibly faster than native C++ applications running on ten-year-old hardware, and we can expect that trend to continue.

But as of right now: C++ is the only choice for hardcore application development. It’s just a question of deciding what hardcore really means, in the face of ever-more-powerful hardware.