OCR = 0nline P0ken 0pticaL Chanacter Recogrition
Friday, May 29, 2009   

Online poker, meet optical character recognition. Optical character recognition, meet online poker. I'm sure the two of you will get along just fine.

Optical character recognition or OCR is one of those cool technologies which occupies a boring space. Document scanning is nifty and all, very useful from an office productivity standpoint, but it's not sexy. But take that same OCR component and embed it as a cog in the gearworks of an intricate real-time online poker botting rig...

That's positively porn star.

Well, whatever. The point is, repurposing a plain vanilla office productivity app for the purposes of online poker degeneracy makes us happy. So let's see what happens when we feed a piece of online poker text...

...to a typical OCR engine such as the one included with recent versions of Microsoft Office: MODI, short for Microsoft Office Document Imaging.

Only let's do it in code, because that's the way we do things 'round here! MODI can be automated over COM and of all the bajillion ways of invoking a COM object from a given language, .NET's COM Interop is one of the cleanest from a usage standpoint. So we'll add a reference (speaking in Visual Studio terms here) to the MODI type library:

And we'll hack out a few lines of C# code to a) create a new MODI document from an image and b) perform the OCR.

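// Requires a COM reference to the Microsoft Office Document Imaging type library.
MODI.Document md = new MODI.Document();
md.Create(@"c:\somefolder\pokerstars_chat_text.tif");   // load the cropped TIFF
md.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true);    // English; auto-orient and straighten
MODI.Image image = (MODI.Image)md.Images[0];            // first (and only) page
MessageBox.Show(image.Layout.Text);                     // the recognized text
md.Close(false);                                        // close without saving changes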

For this barebones proof of concept, you'll have to take a screenshot of the poker window, crop it down to the piece of text you want to OCR, store the result as a TIFF, and pass the name of that file into the Create method. I used this image:

Later, assuming you choose to use OCR in your poker bot or other real-time strategy assist, you'd want to automate this process:

  1. Programmatically snapshot the poker table window.
  2. Programmatically convert the resulting image to an in-memory TIFF.
  3. Programmatically invoke the OCR.
  4. Programmatically parse the returned text.
  5. Go to Step 1.

A little messy, but these steps can be performed across multiple tables on commodity hardware without taxing the machine as only a small amount of text is being OCRed. If you currently do any sort of interval-based screen scraping or pixel testing, these steps can likely be factored into your existing input loop. And no, this method does not require that the poker table be visible at all times: it's possible to take a full-sized snapshot of a window which is currently minimized.
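Step 4 in the list above (parsing the returned text) is the easiest piece to sketch in isolation. What follows is a rough illustration, not a finished parser: the DealerEvent and ChatParser names are made up for this example, and the regular expression assumes every line looks like "Dealer: <player> <verb> [$x [to $y]]" with no spaces in player names, which won't hold for every site.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical container for one parsed chat line (not part of MODI).
public class DealerEvent
{
    public string Player;
    public string Action;    // lower-cased verb, exactly as the OCR returned it
    public decimal? Amount;  // last dollar amount on the line, if any
}

public static class ChatParser
{
    // Assumed shape: "Dealer: <player> <verb> [$x [to $y]]".
    // Player names are assumed to contain no spaces.
    static readonly Regex LinePattern = new Regex(
        @"^Dealer:\s+(?<player>\S+)\s+(?<action>[A-Za-z]+)" +
        @"(?:\s+\$(?<a>\d+(?:\.\d+)?))?(?:\s+to\s+\$(?<b>\d+(?:\.\d+)?))?\s*$");

    public static List<DealerEvent> Parse(string ocrText)
    {
        var events = new List<DealerEvent>();
        string[] lines = ocrText.Split(new[] { '\r', '\n' },
                                       StringSplitOptions.RemoveEmptyEntries);
        foreach (string line in lines)
        {
            Match m = LinePattern.Match(line.Trim());
            if (!m.Success) continue;   // not a line we understand; skip it

            // Prefer the "to $y" amount when present, else the first amount.
            Group amount = m.Groups["b"].Success ? m.Groups["b"] : m.Groups["a"];
            var e = new DealerEvent();
            e.Player = m.Groups["player"].Value;
            e.Action = m.Groups["action"].Value.ToLowerInvariant();
            e.Amount = amount.Success
                ? decimal.Parse(amount.Value, CultureInfo.InvariantCulture)
                : (decimal?)null;
            events.Add(e);
        }
        return events;
    }
}
```

Calling ChatParser.Parse on the text returned by image.Layout.Text gives you one DealerEvent per recognizable line. Note that the parser makes no attempt to validate the verb, so an OCR misread comes through as an unknown action and it's up to the caller to reject or correct it.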

But I digress.

After running the above code, MODI will return the following text.

Dealer: downgoesdown Folds
Dealer: xactr21 bids
Dealer: Warstar raises $8 to $12
Dealer: iceoholic bIds
Dealer: rdegs2l boids
Dealer: TheYeti calls $10 

Comparing that to the original text, we can see that three errors have been introduced.

Dealer: downgoesdown Folds
Dealer: xactr21 folds
Dealer: Warstar raises $8 to $12
Dealer: iceoholic folds
Dealer: rdegs2l folds
Dealer: TheYeti calls $10

MODI has mistaken the word "folds" on lines 2, 4, and 5 for "bids," "bIds," and "boids," respectively. But before you start complaining about how "Microsoft software is crap," consider that you've written a whopping total of six lines of code.

Imagine what you could do with:

  • A full-fledged commercial or robust open source OCR solution...
  • With suitable training and customization...
  • Optimized for the text-display characteristics of online poker.

Ultimately, if you're serious about incorporating OCR into your poker bot or other automata, you won't use MODI at all. It's a nifty application, but for error-free OCR you'll want to take advantage of training, customization, and your complete knowledge of how text is displayed in whatever online poker application you're targeting.

In order to do that, leverage one of the many available open-source and/or commercial OCR packages. I've used tesseract-ocr, tessnet2, OCRopus, and GOCR at different times (not necessarily in an online poker botting capacity) and we haven't even touched the commercial OCR solutions, some of which are rock-solid.

I'm not recommending that you actually use OCR. That depends on your specific situation.

But I am saying that OCR is feasible and it does offer certain advantages over mechanical techniques:

  • OCR doesn't break with every online poker client update.
  • OCR is non-invasive, requiring no injection of DLLs.
  • OCR is platform-agnostic; all it needs is a source image to work from.
  • OCR is effectively immune to countermeasures (when it comes to online poker).

The online poker user interface is really a sort of implicit CAPTCHA unto itself: easy for humans to solve, not so easy for computers. Over time this CAPTCHA has gotten progressively more sophisticated as poker sites have started employing passive countermeasures aimed at making it more difficult for external tools to interrogate the poker client application for game-related text data.

There are ways around these countermeasures, but maybe you don't want to get involved in a war of escalation. Maybe you shun that sort of bickering. Maybe you grok that one thing you can always count on is the visual interface. Or maybe, like me, you enjoy corrupting innocent office productivity applications just to be perverse.

If that's the case, give OCR a chance. You might not regret it.

UPDATE:

Those interested in the OCR approach should take a look at this fascinating 9-page discussion of real-world OCR in a poker botting context, including the pros/cons of OCR training vs. pixel testing vs. NN-based recognition at PokerAI.org. (You DO read PokerAI.org, don't you?)


Posted by James Devlin   34 comment(s)

I've used tesseract before although not for anything poker related-- but I have wondered whether typical OCR would be up to the online poker challenge. The small x-size of on-screen fonts taxes a lot of OCR apps which expect to be going off a fairly high DPI image, at least, that's what I've gleaned from the theory I've read.

Still interesting to see it actually done, even just in proof-of-proof-of-concept form. Thanks for the food for thought.

Anonymous on 5/29/2009 12:57 PM (284 days ago)

There appear to be 4 errors. It has incorrectly read rdegs21 as rdegs2l (replacing the 1 with a lowercase L). But then that was not your point...

Anonymous on 5/29/2009 1:58 PM (284 days ago)

"Marblecake also the game."

Lol. Awesome.

notmoot on 5/29/2009 2:13 PM (284 days ago)

@Anonymous 1: That's an excellent point. In fact I tested the above sample image with tesseract and got back either gibberish or empty text. There's a note on this somewhere in the tesseract docs. However, tesseract can be trained and, if push comes to shove, you can tweak the code directly.

@Anonymous 2: Nice catch. I missed that.

James Devlin on 5/29/2009 2:38 PM (284 days ago)

One advantage of OCR is that you can run poker in a virtual machine, with literally zero detectability (other than that it may know it's in a VM).

Kevin H on 5/29/2009 2:45 PM (284 days ago)

And even if the poker client knows it's running in a VM, it probably won't matter. VMs will become (probably already are) too widespread for the sites to declare that Thou Shalt Not Run Our Software in a VM. Windows 7 will let you peel off XP instances out of the box. A lot of people on the Mac/Linux side wouldn't be able to play at all were it not for VMs. But who knows...

James Devlin on 5/29/2009 3:13 PM (284 days ago)

www.codinghorror.com/blog/archives/001258.html

James Devlin on 5/29/2009 3:14 PM (284 days ago)

One wonders whether the online poker sites should just offer a formal API with per-developer license keys. Legit tools could go directly through the API. Poker bots would be left out in the cold but at least there'd be a formal mechanism for the 'real-time strategy assists' as you call them, that are not blocked by EULA.

Just an idea...

Ed K. on 5/29/2009 9:20 PM (283 days ago)

I have Microsoft 2003 and 2007 but for some reason, Microsoft Document Imaging isn't installed. Any ideas? I'd rather not go through getting tessnet up and running. Looks pretty hairy.

bot4life on 5/30/2009 12:26 AM (283 days ago)

Some discussion on the OCR topic:
www.pokerai.org/pf3/viewtopic.php?f=79&t=1682


Indiana on 5/30/2009 7:34 AM (283 days ago)

Another bad mistake is the name xfactr21, which is read as xactr21, without the f.

Anonymous on 5/30/2009 7:49 AM (283 days ago)

I used to use a DIY OCR via pixel reading, but reading fonts is a major pain in the ass for a lot of sites due to fuzzy fonts. Someone once suggested that DirectX could be manipulated to disable all the fancy stuff but I never did find out how. Perhaps someone out there knows...

And I'm glad that Linux (and Mac) was mentioned because not everyone uses Windows. Downside is that the list of poker clients is very limited, although web browser clients are becoming more common now. I have managed to get the Entraction (Java Applet) clients to run as a native app on my Ubuntu Linux box with a simple launch script. Main reason for doing that is to enable the saving of Hand History files (which an Applet can't do due to security restrictions). The script and source code can be downloaded from my website: http://bespokebots.com/linux.php.

birchy on 5/30/2009 7:28 PM (282 days ago)

@Indiana: Thanks for the link to that excellent discussion. I've updated the post to include it.

@birchy: So true. DirectX can be manipulated via hooking if nothing else, but most of the sites are using plain vanilla GDI. On the Windows side of things anyhow. Either that or they're using pixel/bitmap fonts and they have some custom logic to blit those to the screen (still, via the GDI). Both ways present problems. If they use formal text-output APIs, the font's appearance can and will vary slightly. But those APIs can be hooked. On the other hand if they draw text by blitting an image of text, then you can't necessarily hook a public API and extract the text. But in that case the fonts are static/unchanging, making a pixel-test DIY OCR more feasible.

James Devlin on 5/31/2009 2:25 AM (282 days ago)

But James I think your point was correct, and it seems indicated in the pokerai discussion as well, that if you train a robust OCR to a specific font, you can get a 100% success rate. The question is whether that's less work than doing a pixel-test method.

Anonymous on 5/31/2009 7:00 AM (282 days ago)

I'm going to have to call bullshit on this one. Nobody is actually going to use some godforsaken "pixel-test" method to read text. If they can't get the text directly from whatever window displays it, then it's easy: they can't run a bot.

Period.

Remember, the people who build poker bots aren't qualified software developers. They're Internet script-kiddies trying to get rich quick.

Guido on 6/1/2009 4:16 AM (281 days ago)

???

Anonymous on 6/1/2009 5:48 AM (281 days ago)

I only see one pile of bullshit here:
[quote="Guido on 6/1/2009 4:16:12 AM"]I'm going to have to call bullshit on this one. Nobody is actually going to use some godforsaken "pixel-test" method to read text. If they can't get the text directly from whatever window displays it, then it's easy: they can't run a bot.

Period.

Remember, the people who build poker bots aren't qualified software developers. They're Internet script-kiddies trying to get rich quick.[/quote]

Although screen scraping is a last resort tactic, it DOES work. Most of the commercial odds calculators and live advisors use screen scraping methods. Even well known names like Winholdem, OpenHoldem and Calculatem use screen scraping. Most so-called "script kiddies" are able to create working bots. If they are as unprofessional as you claim, then a flood of highly predictable bots can only be good for the rest of us...

birchy on 6/1/2009 11:36 AM (281 days ago)

@Guido: Have to agree with @birchy here.

Online poker botters, far from being "script kiddies", are some of the most tenacious and creative developers you'll ever meet. If corporate code was developed with the same level of just sheer outright...obsession...I think a lot more companies would achieve profitability.

And OCR (including DIY pixel-testing) is more common than you think among the poker botting crowd. The "script kiddy" metaphor breaks down when it comes to online poker, because of the rewards involved.

James Devlin on 6/1/2009 12:43 PM (281 days ago)

Has anybody published source code for a simple pixel-testing method? I don't want to OCR, I just want to look at the individual pixels. I've seen this mentioned various places but there's usually not source code to go with it.

Any ideas?

Anonymous on 6/1/2009 2:59 PM (281 days ago)

You mentioned getting a snapshot of a minimised window. Got a link on how that's done? Currently I'm using SwitchToThisWindow and waiting for it to come to the front, which takes an unacceptable amount of time. Thanks.

stalewee on 6/1/2009 11:12 PM (280 days ago)

There's an article on capturing a minimized window here:
www.codeproject.com/.../...ingMinimizedWindow.aspx
I don't know how good/fast it is though, haven't needed to try it.

JB on 6/2/2009 3:58 AM (280 days ago)

Anonymous: If you just want a simple pixel-testing method, you could just use a 32-bit CRC checksum of the pixel, window square, etc., that you wish to test for. You would have to map each card to its own precomputed CRC checksum value. This is not the most robust way of doing it because a CRC will not tolerate a single bit change in the pixels that you check, i.e. it is an exact testing method. If only one color is changed (compared to the image you computed the checksum from in the first place) in the pixel window you are testing for, you will most likely end up with a completely different checksum value.

Check google for the CRC checksum algorithm, there should be source samples out there...
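For reference, here's roughly what the CRC approach described above could look like. This is only a sketch: it uses the standard table-driven CRC-32 (reflected polynomial 0xEDB88320), the PixelCrc class and Identify helper are names invented for this example, and how you get the raw pixel bytes for a screen region (Bitmap.LockBits, for instance) is left out entirely.

```csharp
using System;
using System.Collections.Generic;

public static class PixelCrc
{
    // Lookup table for the standard reflected CRC-32 polynomial.
    static readonly uint[] Table = BuildTable();

    static uint[] BuildTable()
    {
        var table = new uint[256];
        for (uint i = 0; i < 256; i++)
        {
            uint c = i;
            for (int k = 0; k < 8; k++)
                c = (c & 1) != 0 ? 0xEDB88320u ^ (c >> 1) : c >> 1;
            table[i] = c;
        }
        return table;
    }

    // CRC-32 of a byte buffer, e.g. the raw pixels of a small screen region.
    public static uint Compute(byte[] data)
    {
        uint crc = 0xFFFFFFFFu;
        foreach (byte b in data)
            crc = Table[(crc ^ b) & 0xFF] ^ (crc >> 8);
        return crc ^ 0xFFFFFFFFu;
    }

    // Hypothetical helper: looks the region's checksum up in a table of
    // precomputed checksums ("As", "2h", ...). Returns null when unknown.
    public static string Identify(byte[] regionPixels,
                                  IDictionary<uint, string> knownCards)
    {
        string label;
        return knownCards.TryGetValue(Compute(regionPixels), out label)
            ? label
            : null;
    }
}
```

You'd build the knownCards dictionary once, by running Compute over a reference capture of each card. As noted above, this is an exact match: a single changed pixel produces a different checksum and Identify returns null.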

Frode Nesbakken on 6/2/2009 5:24 AM (280 days ago)

stalewee, there are many other ways which you can do that.

Have a look at experts-exchange.com

Poker Bot on 6/2/2009 6:22 PM (280 days ago)

Frode - errr. or you could just get a DC, and use GetPixel.

I test a single pixel to get card rank (2 color tests, one each for black and red), and one for card suit (again one color test for black and red), on AP - But that's probably just the way the anti-aliasing worked out in my favor.
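A minimal sketch of the "get a DC and use GetPixel" idea, for anyone who wants to try it from C#. GetDC, GetPixel and ReleaseDC are the real Win32 calls; the PixelProbe class, the LooksRed test and its thresholds are invented for illustration and would need tuning per client. It is Windows-only, and it assumes the pixel you're after is actually on screen and not covered by another window.

```csharp
using System;
using System.Runtime.InteropServices;

public static class PixelProbe
{
    [DllImport("user32.dll")]
    static extern IntPtr GetDC(IntPtr hWnd);

    [DllImport("user32.dll")]
    static extern int ReleaseDC(IntPtr hWnd, IntPtr hDC);

    [DllImport("gdi32.dll")]
    static extern uint GetPixel(IntPtr hdc, int x, int y);

    const uint ClrInvalid = 0xFFFFFFFF;   // GetPixel's failure value

    // Splits a COLORREF (laid out as 0x00BBGGRR) into its components.
    public static void Decode(uint colorRef, out byte r, out byte g, out byte b)
    {
        r = (byte)(colorRef & 0xFF);
        g = (byte)((colorRef >> 8) & 0xFF);
        b = (byte)((colorRef >> 16) & 0xFF);
    }

    // Hypothetical "is this a red card?" test; the thresholds are guesses.
    public static bool LooksRed(uint colorRef)
    {
        byte r, g, b;
        Decode(colorRef, out r, out g, out b);
        return r > 150 && g < 100 && b < 100;
    }

    // Reads one pixel at client-area coordinates (x, y) of the given window.
    public static bool TryGetPixel(IntPtr hWnd, int x, int y, out uint colorRef)
    {
        colorRef = ClrInvalid;
        IntPtr hdc = GetDC(hWnd);
        if (hdc == IntPtr.Zero) return false;
        try
        {
            colorRef = GetPixel(hdc, x, y);
            return colorRef != ClrInvalid;
        }
        finally
        {
            ReleaseDC(hWnd, hdc);   // always give the DC back
        }
    }
}
```

One GetPixel call per test is fine for a handful of pixels. If you find yourself sampling hundreds, it's probably cheaper to grab the region once as a bitmap and read from that.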

Slightly off topic, since this is an 0ptical chanacter recogrition post. Fortunately for me AP does run drawtext to put text to screen, and the chatbox is pretty much a standard RTF text box. AP isn't gungho-psycho about demolishing third party software development - which is why I won't touch PS. If they change to something more complicated/tricky, me and my little poker calculator are probably shit-outta-luck T_T

aw on 6/2/2009 6:47 PM (279 days ago)

@stalewee:

There are various ways to do this (snapshot a minimized window). You can Google for most of these. The most interesting is probably Feng Yuan's method of BeginPaint hooking but it's a little out of date. Stay tuned for some code/examples re: how to do this cleanly.

@aw:

I agree and I think Poker Stars has actually gone a little overboard. They've made various updates which have alienated builders of "legitimate" tools even while failing to prohibit poker botters.

Of course, poker botting is neither illegal nor unethical nor cheating, and even if it were, prohibition still wouldn't really work.

James Devlin on 6/3/2009 1:24 AM (279 days ago)

Fascinating stuff James.

@ Guido. I think your response is a fairly poorly developed one. Creating bots for anything is highly technical and that goes for poker bots too. It doesn't just involve a few spotty kids trying to earn some money, there's some really great computer science involved here.

Poker Forum on 6/3/2009 6:09 AM (279 days ago)

Ultimately you could stick a camera in front of the screen and interpret the video image, generate mouse movements and key presses electronically. I remember seeing a project where a guy had a tetris playing bot using these techniques. Let them try and detect that..........

FT_Anon_But_Not_For_Long on 6/3/2009 7:48 AM (279 days ago)

You really don't think it's unethical to use bots or artificial intelligence to make calculated decisions against other poker and casino depositors?

When the local casinos in Vegas allow me to sit my robot down against the humans at live games then I will agree with you.

Ukash Casino Deposit on 6/4/2009 8:17 PM (277 days ago)

@Ukash: No more unethical than, say, trying to harvest affiliate revenue by commenting on a poker-botting blog with a link to your affiliate site. Poker bots don't have any information beyond what a normal player would have. If poker bots were hacking the servers or colluding, that would be one thing, but a bot that has to go off the same information given to the human player? That's not cheating. It might be against Terms of Service, but it's far from cheating.

James Devlin on 6/5/2009 2:23 AM (277 days ago)

Great articles, James!
I have a question for you. Did you implement board card recognition on Poker Stars? I made a simple recognizer that captures and recognizes card images from the window, and it works like shit.

w_fg on 6/7/2009 1:30 PM (275 days ago)

aw - I meant calculating the CRC of all the pixels in a fixed-size window area where the cards are displayed (only the upper left corner that displays As, 2h and so on). I don't think I was making myself very clear on that point. I do agree, however, that this is a very naive approach, but if the client doesn't change the card bitmaps, it should work.

Frode Nesbakken on 6/19/2009 12:16 PM (263 days ago)

Hey, what is the name of the OCR program used in the first picture?
And is it detectable?

tom on 7/1/2009 12:39 AM (251 days ago)

Nice article. I'll have to look into it. I currently capture the whole screen and hash screen regions against stored images to get the current table state.

Anonymous on 10/4/2009 1:32 AM (156 days ago)

In my opinion there is a problem with using OCR: on each new scan, how do you know whether a chat line was already there or not?

True line: Dealer: Johnny34 folds

First OCR: Dealer: Jobnny34 folds
Second OCR: Dealer: Johnni34 folds

If my code compares the two rows, they are different... so I register a fold that never happened in real play....
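One possible way to soften this problem is to compare lines approximately rather than exactly. The sketch below uses plain Levenshtein edit distance and treats two lines as the same line when they differ by no more than a few characters. The ChatDeduper class and the maxEdits threshold are assumptions made for this example, and the approach has obvious holes: two players with nearly identical names will collide, and a line that legitimately repeats (the same player folding again next hand) will be swallowed unless you also track where each line sits in the chat box.

```csharp
using System;
using System.Collections.Generic;

public static class ChatDeduper
{
    // Classic Levenshtein distance, two-row dynamic programming version.
    public static int EditDistance(string a, string b)
    {
        var prev = new int[b.Length + 1];
        var curr = new int[b.Length + 1];
        for (int j = 0; j <= b.Length; j++) prev[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            curr[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                curr[j] = Math.Min(Math.Min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp;   // reuse the rows
        }
        return prev[b.Length];
    }

    // Returns the lines of newScan that have no near match in previousScan.
    // maxEdits is a tuning knob; 2 is only a starting guess.
    public static List<string> NewLines(IList<string> previousScan,
                                        IList<string> newScan,
                                        int maxEdits)
    {
        var fresh = new List<string>();
        foreach (string line in newScan)
        {
            bool seen = false;
            foreach (string old in previousScan)
            {
                if (EditDistance(line, old) <= maxEdits) { seen = true; break; }
            }
            if (!seen) fresh.Add(line);
        }
        return fresh;
    }
}
```

With the two scans above, "Dealer: Jobnny34 folds" and "Dealer: Johnni34 folds" are two edits apart, so with maxEdits set to 2 the second scan would not register a new fold.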

ezio on 11/10/2009 8:04 AM (119 days ago)
