Extracting Hidden Text with WinDbg

Sunday, January 11, 2009

In How I Built a Working Online Poker Bot, Part 7: Extracting Text from 3rd-Party Applications, we talked about how to write code to snoop on the text painted to an application's user interface.

We noted that most applications, when they want to draw text, do so by calling one of a handful of standard operating system APIs. And we realized that we could "detour" these APIs in memory, such that whenever an application tried to call these APIs, it would end up calling our code instead. Our code would then have a chance to analyze the text being painted.

Several months later, one of the companies mentioned in the above-linked post—PokerStars—updated their software.

Specifically, they started painting text using a proprietary solution that doesn't rely on standard operating system APIs. Our technique stopped working, and tools which depended on it stopped working. In PokerStars Slams Hammer on Third Party Software Industry, we can see an email PokerStars sent to tool developers.

I regret to be the bearer of bad tidings, but this was the purpose and intent of the change -- to deny malicious third party developers access to hooking 'DrawText' to extract data from the dealer chat window.  We have completely avoided the use of 'DrawText' and 'ExtTextOut' in favor of a proprietary internal solution, and as such the contents of the chat cannot be extracted any longer.

While we recognize that there are many third party odds calculators and other real time tools that are absolutely not malicious and in fact are permitted on PokerStars, malicious tools such as bots and dataminers also retrieve their data in the same manner, and it is against those developers that such measures are designed.

While unfortunate, some permitted apps are likely to be hobbled by the changes intended to prevent cheating, and there's little that we can offer you in way of a solution or a way around this.  We can't offer you an API to the data (or malicious developers would hack it), and we can't go back to using DrawText.  We believe it more important to keep the malicious programs out than it is to allow permitted programs to continue to function.

I wish I had a solution for you, but the truth is that additional changes will be forthcoming to make it even more difficult to gather "real time data" from the client as well.  If a cheating program manages to find a way to code around these changes, we'll plug whatever hole they exploit as well (which may well break any solution you come up with for your applications as well).

I wish I had better news for you, but that is PokerStars' position on the gathering of real time data from the client. All I can offer is apologies that your application had to be caught up as "collateral damage" in the war on cheating applications.

Just to be clear, the issue here is that online poker tool developers have been programmatically extracting text from the PokerStars game chat window.

This window gives crucial information about the betting actions in each game. The text contained in this window is not private in any way—it's intended for human consumption—but PokerStars would prefer that third-party software tools not have access to it. In their view, that capability makes it too easy for malicious software to hook into the PokerStars client. So they've taken steps to prevent it.

But is the PokerStars game chat text truly gone, such that it can never be extracted? Or is it merely hiding?

In order to answer that question, let's download a powerful but free piece of software called WinDbg, the core member of the Debugging Tools for Windows package. Here it is, in all its glory:

Most programmers who were weaned on a diet of Visual Studio development rarely, if ever, use any debugger other than the built-in Visual Studio debugger. That's usually okay because the Visual Studio debugger is integrated, eloquent, easy to use, and quite powerful. But there are many debugging facilities which aren't exposed through the Visual Studio debugger UI.

Once you've installed WinDbg, you should really take the time to configure your symbol files properly. But this isn't strictly necessary for today's work.

Let's choose the "Open Executable" option from the File menu:

Browse to the PokerStars executable:

A WinDbg command window will open. We'll see messages for each of the DLLs loaded by the target application, then a "break instruction" followed by the context (the state of the CPU registers) of the primary thread.

By default, when we launch an application through WinDbg, it opens the application in a state of suspended animation. That is, it invokes the application and then immediately halts it. When the target application is in this state, you can't interact with it in any way (except, of course, through the debugger).

So let's type g into the WinDbg command prompt (or use the toolbar button) in order to wake PokerStars up and give it a chance to run. The PokerStars main window should appear, and we'll see some more messages stream into the WinDbg command window.

Then we'll open one or more PokerStars tables, and let them run for a couple minutes, in order to accumulate some text in the game chat window.

After that, it's time to "break into the target" as we say. Hit the WinDbg Break button on the toolbar. The PokerStars application will freeze.

Now, I should mention that every game-related message in the PokerStars chat window is prefaced with the following phrase:

Dealer:

For the purposes of this experiment, we'll use that as a quick and dirty way of locating these game chat messages in memory. In other words, we want to scan the entire PokerStars process—all of it—for the above phrase.

In order to do that, we'll execute the following command:

s -u 0x00000000 L?0xffffffff "Dealer: "

Loosely translated, this says: search, starting at memory location 0x00000000 and covering the next 0xFFFFFFFF bytes (the L? prefix marks the value as a length rather than an end address, and lifts WinDbg's default limit on range size), for all instances of the Unicode text phrase "Dealer: " in the PokerStars process. (We can tell WinDbg to search for ASCII text by replacing the "-u" with "-a".)
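To make the "-u" versus "-a" distinction concrete, here's a minimal Python sketch of the same kind of search, run against an ordinary byte buffer instead of a live process. The find_text function and the fake_memory buffer are made up for illustration; this isn't how WinDbg is implemented, just the idea behind it. A Unicode (UTF-16LE) search looks for two bytes per character, so it won't match a one-byte-per-character ASCII copy of the same phrase, and vice versa.

```python
def find_text(memory, phrase, wide=True):
    """Return every offset in `memory` where `phrase` occurs.

    wide=True mimics a "-u" style search (UTF-16LE, two bytes per character);
    wide=False mimics an "-a" style search (one byte per character).
    """
    needle = phrase.encode("utf-16-le" if wide else "ascii")
    offsets = []
    pos = memory.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = memory.find(needle, pos + 1)
    return offsets


# Hypothetical stand-in for a region of process memory: two wide strings,
# some filler bytes, and one narrow (ASCII) string.
fake_memory = (
    b"\x00" * 16
    + "Dealer: Hand #1".encode("utf-16-le") + b"\x00\x00"
    + b"\xcc" * 6
    + "Dealer: Hand #2".encode("utf-16-le") + b"\x00\x00"
    + b"Dealer: ascii copy\x00"
)

unicode_hits = find_text(fake_memory, "Dealer: ")            # "-u" style
ascii_hits = find_text(fake_memory, "Dealer: ", wide=False)  # "-a" style
print([hex(o) for o in unicode_hits], [hex(o) for o in ascii_hits])
```

The two searches return disjoint sets of offsets, which is why it's worth trying both encodings when we don't know how an application stores its text.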

Type the s command above into the WinDbg command window and voila!

As you can see, there are numerous instances of this text in PokerStars process memory—one for each line of text in the game chat window, in fact, plus a few occasional duplicates. For those of you who are new to "hex editor UI," let's look at it with some helpful annotations: 

Each line corresponds to a specific memory address whose bytes, when interpreted as Unicode text, spell out our catchphrase. Clearly, the PokerStars chat text still exists, the same as it ever did. But let's take a closer look. Open a Memory window from the menu or by hitting Alt+5:

This will let us examine the actual bits and bytes of process memory.

At the top of the window, in the field labeled "Virtual," we'll plug in the address of one of the "Dealer: " text strings shown above. (The strings inside the red box. Don't use the actual values shown above; use the values returned by your copy of WinDbg).

We are now staring at the bits and bytes of a single line of text from the PokerStars game chat window. So when PokerStars says that game text "can no longer be extracted" we should take that statement, as always, with a grain of salt.
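For anyone who wants to see what the Memory window is doing with those bytes, here's a rough Python sketch. It assumes the text is stored as NUL-terminated UTF-16LE, which is consistent with the "-u" search succeeding but is otherwise an assumption; the helper names read_wide_string and hex_row are invented for this example, and the sample bytes are made up rather than captured from PokerStars.

```python
def read_wide_string(memory, offset, max_chars=256):
    """Decode a NUL-terminated UTF-16LE string starting at `offset`."""
    units = []
    for i in range(max_chars):
        unit = memory[offset + 2 * i: offset + 2 * i + 2]
        # Stop at the end of the buffer or at the two-byte terminator.
        if len(unit) < 2 or unit == b"\x00\x00":
            break
        units.append(unit)
    return b"".join(units).decode("utf-16-le", errors="replace")


def hex_row(memory, offset, width=16):
    """Format one row of a hex dump: address, hex bytes, printable characters."""
    chunk = memory[offset:offset + width]
    hex_part = " ".join(f"{b:02x}" for b in chunk)
    text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
    return f"{offset:08x}  {hex_part}  {text_part}"


# Made-up sample: one chat line followed by unrelated bytes.
sample = "Dealer: bob folds".encode("utf-16-le") + b"\x00\x00" + b"junk"

print(hex_row(sample, 0))
print(read_wide_string(sample, 0))
```

The alternating letter-and-dot pattern in the right-hand column of the hex row is the visual signature of UTF-16 text made of ASCII characters: every other byte is zero.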

Extra Credit: Watchpoints

Most programmers are familiar with the concept of a breakpoint. When we're debugging an application, and execution reaches a breakpoint, the debugger breaks into the application and gives us a chance to start single-stepping through it, examining variables and so forth.

Most programmers eventually learn that we can set conditional breakpoints. Instead of halting the application when a particular breakpoint is reached, conditional breakpoints halt the application when a particular condition is true. For example, "break if the value of foo is 93".

But relatively few programmers are aware of data breakpoints, also known as watchpoints. These tell the debugger to halt the application not when a particular piece of code is executed, but when a certain memory address is accessed. In any way whatsoever.

For example, if we happen to know that a certain piece of text is sitting in process memory at a certain address, and we want the debugger to tell us whenever that piece of text is passed into a function, copied, or deallocated, we can create a watchpoint.
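If the idea of a watchpoint feels abstract, the following toy model in Python may help. It is only a simulation: a real debugger relies on the CPU's debug registers rather than wrapping every read and write, and the WatchedMemory class here is invented purely for illustration. What it does show is the essential behavior: the callback fires whenever any access overlaps the watched range, no matter which code performed the access.

```python
class WatchedMemory:
    """Toy memory that reports any access overlapping a watched byte range."""

    def __init__(self, size):
        self._data = bytearray(size)
        self._watch = None  # (start, length, callback) or None

    def watch(self, start, length, callback):
        self._watch = (start, length, callback)

    def _check(self, kind, addr, length):
        if self._watch is None:
            return
        start, watch_len, callback = self._watch
        # Two half-open ranges overlap if each one starts before the other ends.
        if addr < start + watch_len and start < addr + length:
            callback(kind, addr, length)

    def read(self, addr, length):
        self._check("read", addr, length)
        return bytes(self._data[addr:addr + length])

    def write(self, addr, payload):
        self._check("write", addr, len(payload))
        self._data[addr:addr + len(payload)] = payload


hits = []
mem = WatchedMemory(64)
mem.watch(16, 4, lambda kind, addr, length: hits.append((kind, addr)))

mem.write(0, b"abcd")   # outside the watched range: no hit
mem.write(14, b"wxyz")  # touches bytes 16 and 17: hit
mem.read(18, 2)         # inside the watched range: hit
mem.read(20, 4)         # starts just past the watched range: no hit
print(hits)
```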

Within WinDbg, the ba command ("break on access") is used to set up a watchpoint. For example:

ba r4 0x04EFF210

This command reads: break the application whenever the 4 bytes of memory starting at address 0x04EFF210 are accessed. Sounds good, except that breakpoints are usually set in the context of a specific thread, and PokerStars, like most applications nowadays, has multiple threads:

So what we want to do is somehow set the data breakpoint across all threads. Lo and behold, WinDbg allows us to do that by using a special prefix:

~* ba r4 0x04EFF210

The ~* instructs WinDbg to set the breakpoint across all (*) threads. We could also use it to set the breakpoint for specific threads.

Let's go ahead and type the above command into the WinDbg command line, replacing the address 0x04EFF210 with one of the addresses returned when you executed the s command above.
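One practical wrinkle: the ba command expects the address to be aligned to the size being watched (a multiple of 4 for r4, a multiple of 2 for r2), and the addresses returned by a string search aren't guaranteed to be. Below is a small Python sketch of a helper that rounds an address down to the right boundary and builds the command text. The ba_command function is a hypothetical convenience, not part of WinDbg, and it only covers the read/write (r) and write-only (w) access types.

```python
def ba_command(address, size=4, access="r"):
    """Build a WinDbg `ba` command string with a size-aligned address.

    `access` is "r" (break on read or write) or "w" (break on write only).
    """
    if size not in (1, 2, 4, 8):  # 8 is only meaningful on 64-bit targets
        raise ValueError("size must be 1, 2, 4, or 8")
    if access not in ("r", "w"):
        raise ValueError("this sketch only handles 'r' and 'w'")
    # Round down to a multiple of `size`; the watched range still
    # contains the original address.
    aligned = address & ~(size - 1)
    return f"ba {access}{size} 0x{aligned:08X}"


print(ba_command(0x04EFF210))          # already aligned
print(ba_command(0x04EFF212))          # rounded down to a 4-byte boundary
print(ba_command(0x04EFF213, 2, "w"))  # rounded down to a 2-byte boundary
```

Because the address is rounded down, the watched range can begin up to size minus one bytes before the text itself, so an occasional hit caused by a neighboring value is possible.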

Once we've done that, type g to resume the application.

If we've timed things correctly, the breakpoint should be hit almost immediately. We'll continue typing g to resume the application each time the breakpoint is hit and we'll end up with output similar to the following:

At this point, we've learned two things:

  • Where text exists inside the PokerStars (or other application) process.
  • The state of the process, including all threads, stack frames, and registers, each time this text is accessed.

We know what the text is, and with some careful research, we can figure out what functions are accessing this text. In order to do that, we can examine the call stack each time the debugger halts the application: "Breakpoint 0 hit."

We can then work backwards, and start to determine which functions are responsible for creating, reading, and copying PokerStars game chat messages, as well as the structure of those messages in process memory. Without ever knowing the name of any of these functions or structures.

Once those functions are identified, we can start to interpose custom code, redirecting execution flow from the original versions of those functions to code written by us...giving us a chance, once again, to read PokerStars game text in real time.
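As a loose analogy for that redirection, here's a short Python sketch. It is not how native detouring works (that involves patching machine code in the target process), and the ChatWindow class and append_line method are hypothetical stand-ins, not anything found in PokerStars. The control flow is the part worth noticing: our observer sees the text first, then the original function runs as if nothing had happened.

```python
import functools


def interpose(owner, name, observer):
    """Replace owner.name with a wrapper that calls `observer`, then the original."""
    original = getattr(owner, name)

    @functools.wraps(original)
    def hooked(*args, **kwargs):
        observer(*args, **kwargs)          # our code gets a look at the arguments
        return original(*args, **kwargs)   # then the original behavior proceeds

    setattr(owner, name, hooked)
    return original  # keep this so the hook can be removed later


class ChatWindow:
    """Hypothetical stand-in for whatever object owns the chat text."""

    def __init__(self):
        self.lines = []

    def append_line(self, text):
        self.lines.append(text)
        return len(self.lines)


captured = []
original = interpose(ChatWindow, "append_line",
                     lambda self, text: captured.append(text))

window = ChatWindow()
window.append_line("Dealer: bob folds")
print(captured, window.lines)

# Undo the hook by putting the original function back.
setattr(ChatWindow, "append_line", original)
```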

That's just one way of doing it. There are others.

And of course, PokerStars is just an example. The above technique will work with any Windows application, although, as always, some tweaking is required. Stay tuned and we'll discuss some of the details and, after a three-month hiatus, take a look at some more source code.

Happy disassembling.

Tags: WinDbg, debugging, PokerStars, online poker, poker

58 comment(s)

ping - bang - boom

Hello. My question is whether it's possible to detect those text strings as they're being created, rather than after they're created. This being more useful IMHO than looking at them after the fact. Is there a way to set up a breakpoint to occur when a text string containing "Dealer" is passed into any function----no matter where the text exists in memory?

GREAT STUFF.

Consistently, you and pokerai are the only 2 sites on the Internet where you can find this level of technical info.

I don't and won't use WinDbg, but I enjoy the low-level techie stuff. Keep it coming.

After concealing the game # you give it away in a later image...

Come on guys...every time he posts, the GREAT POST!!! FANTASTIC!! people come out of the woodwork, even when the quality is sub-par. Some of his posts are ground breaking, others are less so. Like any blogger, he's hit or miss. Not every blog post has to be FANTASTIC!! GROUNDBREAKING!! Sometimes a blog post is quick and dirty and thats okay. Stop kissing his arse.

Game # 23771182740

I think the referenced game number was observed on a public machine, like all his screenshots, without logging in, or logging in on another account. Based on an email I have from him.

Nice workaround, but I can see the future, and it looks like...

encrypted data in memory

Does the above technique work on a 64-bit machine?

You have 0x00000000 thru 0xFFFFFFFF as a range which to me says, 32-bit processes only.

What if you're running Win64?

"but the truth is that additional changes will be forthcoming to make it even more difficult to gather "real time data" from the client as well. If a cheating program manages to find a way to code around these changes, we'll plug whatever hole they exploit as well"

which side has the advantage?

you really need to have a print view...

A lot of this is a bit far fetched for me but I find your stories very interesting to read even if a lot of it is in another language. Not sure how I feel about my favourite site's security being loopholed :S

Lol I think you'll be receiving some more e-mails from PokerStars soon!

"So when PokerStars says that game text "can no longer be extracted" we should take that statement, as always, with a grain of salt."

I've noticed a trend in targetting PokerStars in this blog. Any particular reason? I'm not sure how much you play poker but are you a winning player?

if anybody has/is coding a dataminer for ongame please contact me nemesisattax@hushmail.com

Ha. This is now turning into a reversing blog (or maybe a Poker Stars war blog?). Seriously, do some Google work and you can find all of these techniques and more related to general purpose reversing. Any competent coder should be able to read up on these techniques and apply it to this or any other problem. The weak point or any of PS's security is that as long as the application is running in the memory on your computer, you can access it. Yes they can make it harder, but the information is still there.

But wait, what if they encrypt their memory contents somehow? Doesn't matter. Remember the old analog-hole? Same concept applies. If human eyes can read it, a screen scraper can read it too. The pixels that make up the fonts in the chat box are pretty easy to scrape and convert to ascii. Once that is done, you have the game state regardless of any internal machinations that PS goes through.

Pff.

so is this not an option to prevent snooping in memory addresses? http://msdn.microsoft.com/en-us/library/ms229741.aspx

is there ANY way to make data in a client/server app secure enough that it would be too burdensome to try and get?

with the money at stake, i would expect PS/FTP to employ someone like Schneier to help them with these problems. any chance of that happening?

thanks for another great post. really enjoying the look inside a poker app.

--bcd

Weird thing is that pokertracker still works on pokerstars...

Poker tracker works off of hand histories, not real-time game states.

It can still detect where players are sitting to place the HUD over them since that isn't defined by the chat....

not sure about pokertracker, but HM seats the players on HUD only after the hand history has been sent.

thanks for the update. I had actually pretty much entirely finished my bot, which was only going to be used for theory on a thesis, and they made this change. I've been spending time trying to figure out how to get the text again. this will definitely help but I'm still new to all this reverse engineering/injection etc. so I'm sure this will set me back another month or two :

I'm surprised they leave any plain text in the app at all. They could easily use some strategies to encode the text until it's actually being drawn. If they used some sort of data-driven encoding they could be updating the encoding mechanism on the fly continuously. You would eventually be left with OCRing the chat box (not beyond the realms of possibility). If they were concerned about real-time cheater bots/tools they could still make a plain text chat stream available, just delayed by a minute or two to allow people to keep a record of their own chat history.

[b]Please, please, please go into detail when you cover:[/b]

[quote]work backwards, and start to determine which functions are responsible for the PokerStars game chat messages, as well as the structure of those messages in process memory ... we can start to interposition custom code in order to redirect execution flow from the original[/quote]

This is as far as I've got on my own so I think I will learn something if you explain it. Thanks.

By the way, I think you should stop linking to urls in comments to stop the minor abuse before it gets worse.

OpenHoldem has always used "screen scraping" and "OCR" (interpreting pixel bits based on location within an arbitrary bitmap cell) to obtain any necessary onscreen information. Chat text is never really needed except possibly as a validation of other stuff already determined by the position of certain items. Things such as cards, the button, bet fields, etc., depending on where they're located on the table, can determine the current game state. Chat text is normally irrelevant. But, of course, James was just using that as a demonstration.

My point is that "screen scraping" precludes any actions taken by the software to protect their internal data.

I definitely agree with Bill...

@Bill:

screen scraping works UNTIL they send a software update which reconfigures the screen, or worse, they add a feature which dynamically alters the elements' locations on the screen. In fact, I'm surprised they haven't implemented the latter yet because its been talked about for years--any bot would need very robust scraping methods to get around that security feature. I wonder if doing that would muck up the "allowed" software that use heads-up to place things on the screen and thats why we havent seen it.

I'm still waiting on a post about an efficient way to simulate human input. My bot is on permanent hiatus until i find a way to type or click without using SendInput. The only way I can be confident I won't be detected is by using a system similar to how Glider for WOW works; creating custom input drivers, and other various methods to prevent it from being detected. Unfortunately this option is a little too complex for my level of skill.

I fucking LOVE YOU JAMES!!!!!!!!!

"is there ANY way to make data in a client/server app secure enough that it would be too burdensome to try and get?"

No, crackers use these methods for years!

Surely I will switch to Screen Scraping + OCR. See this:

http://www.dzone.com/links/usingtheoffice2007ocrcomponentin_c.html

@Sean - Creating custom input drivers is OK, but there are better ways.

"Consistently, you and pokerai are the only 2 sites on the Internet where you can find this level of technical info."

Not rocket science, you just haven't been looking at systems level programming, nor intermediate debugging. Learn what different text and drawing APIs are used for, etc. Go to MSDN.com and look them up. Some interesting ones for example: http://msdn.microsoft.com/en-us/library/ms534233(VS.85).aspx "Windows GDI Font and Text Functions"

The same stuff applies to anything else, game hacking ("hacking" a loose term), Poker client reading, system monitoring tools, etc.

Go on MSDN look up the APIs, play with a debugger, use your head, and you learn it. As a matter of fact a more visual, common and popilar, and probably a lot easier to use debugger is: http://www.ollydbg.de Google "ollydbg" for lots of tutorials.

Good and popular system level tools (and by far not the only place): http://technet.microsoft.com/en-us/sysinternals/default.aspx

The site is orientated to FPS game hacks, but lots of tutorials, code, etc., that applies to Poker clients too: http://forum.gamedeception.net/forumdisplay.php?f=33

Another good source: http://www.codeproject.com

And a good book to get, to know about systems level stuff: "Windows® Internals: Including Windows Server 2008 and Windows Vista, Fifth Edition (PRO-Developer) (Paperback)" The 5th edition not out til a week or two, else you can get the 4th.

http://www.amazon.com/Windows%C2%AE-Internals-Including-Windows-PRO-Developer/dp/0735625301/ref=pdbbssr_1?ie=UTF8&s=books&qid=1232407999&sr=8-1

*popular

i'm waiting for the code! it will be very interesting! great work!!!!!!!

@Indiana - could you perhaps elaborate on some better ways? I know of plenty of ways it can be accomplished, but none that are satisfactory. I'm not entirely sure to what lengths I should go in order to not be detected. On a side note, I found that the most simple (but probably naive) way to extract data from full tilt is to simply highlight text in the dealer chat, copy it to the clipboard, and parse through it. Could anybody tell me the disadvantages to this method?

the disadvantage is you block the input while you're reading the chatbox, because you have to make special mouse movements, and after every copy you have to wait 5 seconds to make the next one.

Thanks for your articles (don't think the word 'tutorial' applies to what you offer).

I USED to be a coder / reverser and whenever i read something by you, i get this nostalgic feeling and i feel like coding and cracking again.

Thanks for the flashbacks!!

hey great stuff.

following this article i had some trouble with the watchpoints. the memory locations i chose never got caught. i tried many times but it never broke execution, even when i could see the exact text in the chat window. i feel like i am not understanding something.

I have had some experience disassembling java code, but not C programs, are there any good sites u can recommend?

Did anyone try to resume where James left off? I did and am hitting my head against the wall. If anyone knows where I went off-path your input is appreciated. Here's what I did...

Examining the call stack every time PokerStars accesses the dealer chat memory shows functions like [url=http://twopackages.com/call_stack.txt] these[/url] (more or less). Of those functions the only one that looks promising is RtlOemStringToUnicodeString.

I then found the RtlOemStringToUnicodeString definition/parameters/return values from its [url=http://msdn.microsoft.com/en-us/library/ms796937.aspx] MSDN site[/url]. I need this information to create my hook function.

I have Detours installed and successfully ran the WITHDLL tool to attach my DLLs to Pokerstars and other applications - so the next step was to hook PokerStars every time it called the RtlOemStringToUnicodeString function and hopefully extract dealer text.

The VS 2008 compiler absolutely hated my [url=http://twopackages.com/dllmain.cpp.txt] 1st attempt[/url] (as well as every attempt after). It doesn't recognize NTSTATUS, UNICODESTRING, OEMSTRING or RtlOemStringToUnicodeString. So re-checking the [url=http://msdn.microsoft.com/en-us/library/ms796937.aspx] MSDN site[/url] I notice a requirement for using the RtlOemStringToUnicodeString function is including the Ntifs.h header file. I tried googling/downloading Ntifs.h and all the header files it (and subsequent header files) reference but that quickly spiraled out of control. I wasn't sure if I was downloading the right versions, they became harder to find, etc. So I decided to look into Ntifs.h more.

Ntifs.h is part of the Windows Driver Kit which I eventually got my hands on. However, installing the [url=http://www.microsoft.com/whdc/devtools/wdk/default.mspx] WinDDK[/url] when I already had the SDK that came with VS 2008 installed was a mess. You really should only have one SDK/DDK installed on your system at a time.

And that's where I'm left. Any help is greatly appreciated. Thanks!

yeah I was looking more at the function DispatchMessageW, but im not sure either is correct.

as PS's stated in their message that they have created a custom function to print stuff to the screen, so any function calls in the call stack wil be coming from the PS.exe not system DLL's, and also wont have a definition in the msdn. please correct me if im wrong.

if I could get the breakpoints to work id be able to investigate further, but cannot. i am using the logexts & logviewer from windbg to capture everything, but its like a needle in a haystack. not even sure if im on the right track there.

I'm a software engineer that lives in Reno Nevada. Anyone interested in meeting and working together? Email me at jmark821@yahoo.com I'll also be in Las Vegas May 9-11

Hey found this cool program for debugging & disassembly

IDA

http://www.hex-rays.com/idapro/

you can download a free version

jmark,

would like to, but im in Australia :(

madness, thank you

Very clever method! I would never have happened. Know the functioning of this part of the system.

Hello, this is for the author of the article. I have been writing a program for the past few weeks that uses ReadProcessMemory() to determine the players cards. However, these are pointers and I haven't had any luck finding them.

Can you offer any tips? You can contact me at my email if you ever get a chance, thank you.

> If human eyes can read it, a screen scraper can read it too. The pixels that make up the fonts in the chat box are pretty EASY to scrape and convert to ascii.................

> My point is that "screen scraping" precludes any actions taken by the software to protect their internal data.

If it is really so EASY to read the PokerStars chat text with a screen scraper:

1) why isn't there any article on this site about this matter?
2) why can't I find any tutorial on the web about this matter?

Man you are fuckin gangsta. If figured out all this shit alone, you are THE man.

PS: Try going to a night with a copy of this post, display it, and even hardcore cheerleading/prom-queen/anti-geek hoes will perform felaccio on you... Promplty.

Stay tuned and we'll discuss some of the details and, after a three month hiatus, take a look at some more source code

one year and a half is passed, but I don't see any code.....

Sooner or later they will find a way to get a complete reading of the longer-term. Developers who are looking beyond the barriers of these massive programs are always one step ahead of the companies themselves.

Greetings.

how much code do you want to build your own OCR? This should be enough .Also check out http://code.google.com/p/tesseract-ocr/

Coding the Wheel has appeared on the New York Times' Freakonomics blog, Jeff Atwood's Coding Horror, and the front pages of Reddit, Slashdot, and Digg.
