Notes from a Leap Motion Presentation, part 2: Playing Around
This is part two of a write-up based on a recent presentation I gave showing off the Leap Motion Controller. You can read part one here.
In this part I’ll go through a JRuby program that acts as a WebSocket server to control a simple browser-based game.
Please don’t take anything here as any sort of “best” or even “well-considered” practice. When preparing code for presentations or demos there are a number of requirements (or, if not required, useful for the purposes of the demo) that lead to programs with behavior or features you are unlikely to find in a “real” app. Add to this my propensity to try stuff out just to see how it plays, and the results are hopefully interesting but also quite likely quirky.
This is a very good thing. What I’ve found with the Leap, as with the Xbox Kinect, is that your biggest challenges are (generally) not with getting data from the controller but in deciding what to do with the data and how best to present it.
The presence of a built-in WebSocket server means that, at least in principle, you can work with Leap data in nearly any programming language you like, even one that is not officially supported. The Leap SDK comes with tools for using the Leap with a few languages, such as C#, Python, and Java, but not Ruby.
I took a stab at loading the Leap DLL using Ruby’s dl library but with no success. I expect that before long someone else will do better than I did, but in the meantime I can write a Ruby program that grabs the Leap data using WebSockets.
The downside, of course, is that you get just the data, not the nice built-in classes and API. You can add those yourself, but even then you have the overhead of having to convert a JSON string into objects suitable for manipulation for every single frame.
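To make that per-frame cost concrete, here is a minimal sketch of turning one frame's JSON string into simple Ruby objects. The payload is a made-up, heavily trimmed stand-in for what the Leap WebSocket server sends, and the Pointable struct and parse_frame helper are hypothetical names of my own, not part of any Leap API.

```ruby
require 'json'

# Hypothetical value object; the real SDK classes are much richer.
Pointable = Struct.new(:id, :x, :y, :z)

# Assumed, simplified frame layout: a "pointables" array whose entries
# carry an "id" and a three-element "tipPosition" in millimeters.
def parse_frame(json_text)
  data = JSON.parse(json_text)
  (data['pointables'] || []).map do |p|
    x, y, z = p['tipPosition']
    Pointable.new(p['id'], x, y, z)
  end
end

sample = '{"id": 1, "pointables": [{"id": 7, "tipPosition": [10.5, 200.0, -30.25]}]}'
fingers = parse_frame(sample)
puts fingers.first.y
```

Every frame pays for the JSON.parse call plus the object construction, which is the overhead in question.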
In fact, these concerns were one reason why my HTML5 game example does not consume data from the Leap directly, but from a JRuby proxy program instead. Please note that I decided this off the top of my head. What I’ve yet to do is write something that runs entirely in the browser (something I will do and write about in the near future). Then I’ll have a more realistic idea of how well that works.
It wasn’t just speed that moved me to this; the use of a proxy app allows the code to drive many things at once. As you’ll see, my simple game code not only provides a Web browser with JSON data but also dispatches OSC (Open Sound Control). (Note to self: See if a Web browser can send OSC.)
Not to get too off-topic, but this ability to drive multiple clients is intriguing, and there are at least two ways to do it. One is to have a program that grabs the Leap data (via the SDK, for example) and acts on it by dispatching messages to its own client programs. That’s the scenario I’ll show in a moment.
The other way is to just have multiple independent clients all talk to the Leap server at the same time. I don’t know what, if any, limitations there are on this, but I’ve had at least two or three Leap programs running at the same time (though I confess this was mostly unintentional).
The advantage to the first approach is that it should make it easier to coordinate the multiple sub-clients. With the second approach, though, you can build things that are not directly coupled. There’s a middle ground, of course, where you might create standalone programs that are composed of pieces that could, if you wanted, all be used in the same application. I’ve started playing around with some ideas for doing that.
The JRuby Game Server Proxy Thing
There are a few parts to this program. It started out as One Big File but I decided to break it apart to see what might lend itself to reuse down the line. The first chunk I extracted was the code for the WebSocket server. I put it in the file web-socket-server.rb. It uses the em-websocket gem, which, as the README says, is an “EventMachine based, async, Ruby WebSocket server.” I linked to the GitHub repo but you can install it using the standard
It provides a handy feature, that of channels. When I started using the EM WebSockets gem I found that once it invokes EventMachine.run it blocks all other program activity. I did not see a way to reference the WebSocket instance in order to send arbitrary messages to whatever clients were connected. Turns out, that’s what channels do. You define them before you start your socket server, and then you can push things onto a channel to have them sent out.
Ordinarily, though, the server is still blocking. To get around that I wrapped the call to EventMachine.run in a thread.
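The shape of that workaround can be sketched with only the standard library. Here a Queue stands in for the em-websocket channel and a plain loop stands in for the blocking EventMachine.run call. Both stand-ins, and the start_server name, are assumptions for illustration; the real program uses the gem's own event loop and channel.

```ruby
require 'thread'

# Stand-in for the channel: the main program pushes, the "server" drains.
channel = Queue.new
sent = [] # collects what a connected client would have received

# Stand-in for a blocking event loop. Wrapping it in a Thread lets the
# caller carry on instead of being stuck inside the loop.
def start_server(channel, sent)
  Thread.new do
    loop do
      msg = channel.pop # blocks until something is pushed
      break if msg == :shutdown
      sent << msg
    end
  end
end

server = start_server(channel, sent)

# The main thread is still free to do work and push messages.
channel << '{"x": 10}'
channel << '{"x": 12}'
channel << :shutdown
server.join
puts sent.length
```

The important part is that the code pushing messages never needs a reference to the server itself, only to the channel.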
WebSockets are relatively simple. Well, relatively simple to use. Typically you just need to handle a few conditions (e.g. connect, disconnect, message received) and decide what to do when a message comes in. You can learn more, and play around with them, on www.websocket.org.
The EventMachine WebSocket code is very similar to what you would see in the browser. The code I’m using is a variation on the multicast example in the em-websocket git repo.
Meanwhile, back in jleap-ng.rb …
In part 1 I described how I altered the core SDK classes to provide a more Rubyish API. In that case I wanted to be able to use subscript notation on the list of hands.
I needed to do that again for my Listener code. I tried using size on a list of Finger objects and got this:
# Exception in thread "Thread-176" org.jruby.exceptions.RaiseException: (NoMethodError) undefined method `size' for #<Java::ComLeapmotionLeap::FingerList:0x3c4fea72>
As with the Leap Frame class, I fixed up the Hand class in jleap-ng.rb so that the list of fingers was a proper Ruby array.
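The general idea, sketched in plain Ruby so it runs without the Leap SDK: given a list-like object that only offers a count and an indexed getter, build a real Array from it. FakeFingerList is a hypothetical stand-in for the Java class, and to_ruby_array is a name invented for this sketch; the actual change in jleap-ng.rb may look quite different.

```ruby
# Hypothetical stand-in for a Java list class that has no size method,
# only count and get(index).
class FakeFingerList
  def initialize(items)
    @items = items
  end

  def count
    @items.length
  end

  def get(index)
    @items[index]
  end
end

# Copy any count/get style list into a plain Ruby Array.
def to_ruby_array(list)
  (0...list.count).map { |i| list.get(i) }
end

fingers = to_ruby_array(FakeFingerList.new(%w[thumb index]))
puts fingers.size
puts fingers[1]
```

Once the fingers are a real Array, size, subscripting, each, and the rest of Enumerable come for free.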
The Listener stuff
As I was writing this program I noticed that my onFrame method was essentially a series of condition checks: “If we have [some hand/finger/palm condition] then do [some corresponding action]”. It got me thinking that, so far, all of my little experimental programs were the same except for those conditions and actions. And even with those conditions/actions there was a fair amount of, if not actual code duplication, code similarity. I started to wonder how plausible it would be to have some sort of DSL-ish way to describe these. For example, could you use something like YAML and some predefined keywords to describe “Exactly one hand and at least two fingers”?
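Purely as a thought experiment, here is one way that YAML idea might look. The keywords (exactly, at_least), the rule layout, and the matches? helper are all invented for this sketch; it is not what the program described here actually does.

```ruby
require 'yaml'

# Hypothetical vocabulary: each key names something to count, and each
# value pairs a comparison keyword with a number.
RULE = YAML.safe_load(<<~RULES)
  hands:
    exactly: 1
  fingers:
    at_least: 2
RULES

COMPARISONS = {
  'exactly'  => ->(actual, wanted) { actual == wanted },
  'at_least' => ->(actual, wanted) { actual >= wanted }
}.freeze

# counts is a plain Hash such as { 'hands' => 1, 'fingers' => 3 }.
def matches?(rule, counts)
  rule.all? do |what, tests|
    tests.all? do |keyword, wanted|
      COMPARISONS.fetch(keyword).call(counts.fetch(what), wanted)
    end
  end
end

puts matches?(RULE, { 'hands' => 1, 'fingers' => 3 })
puts matches?(RULE, { 'hands' => 2, 'fingers' => 3 })
```

This covers only the condition half. Describing the corresponding action in YAML is where such schemes tend to get awkward.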
I’m often skeptical of such things because it can turn out to be a solution in search of a problem. Cleverness for its own sake. However, it’s always other people’s code that is too clever, so I decided to try something. [Note: I am ignoring the fact that “other people” includes my past self, and future self will be mocking current self before too long.]
(By the way, as I write this I’m noticing a sad inconsistency in file-naming style. Forgive me. I blame the language-hopping I’ve been doing.)
There’s a lot going on, which I’m not happy with and have not yet sorted out. I’m still playing. The program loads up the Leap libs, the WebSocket lib, and something called game-demo-handlers. It is in this file that I defined those conditions and actions.
I wanted to send OSC messages to an audio program (Renoise), so I pull in an OSC library (osc-ruby) and set up the OSC client; an OSC client is the thing that sends the OSC message to an OSC server (here, Renoise).
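If you are curious what an OSC client actually puts on the wire, here is a rough sketch that builds an integer-only OSC message by hand using just the standard library. This swaps in manual encoding in place of osc-ruby, which is what the program really uses. The address, argument values, host, and port below are assumptions for illustration; check them against your own OSC server's settings.

```ruby
require 'socket'

# Pad a string with NULs to the next multiple of four bytes, always adding
# at least one terminator, as the OSC 1.0 format requires.
def osc_string(text)
  text.b + ("\0" * (4 - text.bytesize % 4)).b
end

# Encode an OSC message whose arguments are all 32-bit integers:
# padded address, padded type-tag string, then big-endian integers.
def encode_osc(address, *ints)
  osc_string(address) + osc_string(',' + 'i' * ints.size) + ints.pack('N*')
end

# Fire-and-forget delivery over UDP. Not called here, so the sketch runs
# without a server listening.
def send_osc(packet, host, port)
  socket = UDPSocket.new
  socket.send(packet, 0, host, port)
ensure
  socket.close if socket
end

# Assumed address and arguments (instrument, track, note, velocity).
packet = encode_osc('/renoise/trigger/note_on', 1, 1, 60, 100)
puts packet.bytesize
# send_osc(packet, '127.0.0.1', 8000) # assumed host and port
```

A library such as osc-ruby handles the other argument types (floats, strings, blobs) and bundles, which is why I did not write this by hand in the actual program.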
By the way, if you’re unfamiliar with OSC I’m writing a short introductory book on it which you can read here.
The OSC message triggers a sound corresponding to a game event. There’s no good reason to trigger this over and over on every Leap frame that contains the matching hand/finger conditions, so the program forces a short delay between messages.
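One way to force that kind of gap is a small throttle object that remembers when it last let something through. This is a sketch under my own assumptions: the Throttle class, its attempt method, and the 2.0 second gap are illustrative, not lifted from the program.

```ruby
# Lets an action through only if enough time has passed since the last one.
class Throttle
  def initialize(min_gap_seconds, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @min_gap = min_gap_seconds
    @clock = clock
    @last = nil
  end

  # Runs the block and returns true when allowed; otherwise returns false.
  def attempt
    now = @clock.call
    return false if @last && (now - @last) < @min_gap
    @last = now
    yield
    true
  end
end

# A fake clock makes the behavior easy to see without sleeping.
fake_time = 0.0
throttle = Throttle.new(2.0, clock: -> { fake_time })
fired = []

[0.0, 0.5, 1.9, 2.0, 2.1, 4.5].each do |t|
  fake_time = t
  throttle.attempt { fired << t }
end

p fired
```

With the Leap delivering many frames per second, nearly every call to attempt simply returns false, which is the point.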
All this gets set up in GamerListener#onInit, which ends by calling load_handlers, a method defined in game-demo-handlers.rb. We’ll get to that in just a bit.
onFrame is invoked for each frame of Leap data; it simply iterates over an array of handlers and returns. Each handler is an instance of a special class, ConditionHandler. They’re basically Procs.
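In outline, that dispatch loop might look like the following. This is a reconstruction under assumptions, not the program's actual source: FakeFrame and SketchListener are hypothetical stand-ins, and this ConditionHandler is only my guess at the thinnest possible wrapper.

```ruby
# A thin wrapper around a block: it only stores the block and calls it.
class ConditionHandler
  def initialize(&block)
    @block = block
  end

  def call(frame)
    @block.call(frame)
  end
end

# Hypothetical frame stand-in; a real Leap Frame has far more on it.
FakeFrame = Struct.new(:finger_count)

class SketchListener
  attr_reader :log

  def initialize
    @log = []
    @handlers = [
      ConditionHandler.new { |f| @log << :one_finger if f.finger_count == 1 },
      ConditionHandler.new { |f| @log << :two_fingers if f.finger_count == 2 }
    ]
  end

  # Called once per frame: offer the frame to every handler, then return.
  def on_frame(frame)
    @handlers.each { |handler| handler.call(frame) }
    nil
  end
end

listener = SketchListener.new
[1, 2, 5].each { |n| listener.on_frame(FakeFrame.new(n)) }
p listener.log
```

Each handler carries its own condition check, so the per-frame method never needs to change when conditions are added or removed.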
The remainder of GamerListener has helper methods for sending OSC as well as for using the WebSocket server to push messages to a client (presumably a browser).
I’ve wrapped the sending of the OSC in a thread so that the program can continue to handle additional frames. I need to send both “note on” and “note off” messages and the two-second delay would get in the way.
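A stripped-down sketch of that arrangement: the worker thread sends the first message, waits, then sends the second, while the caller returns right away. The play_note helper and the lambda standing in for the OSC client are hypothetical, and the hold time is shortened so the sketch finishes quickly.

```ruby
require 'thread'

# Hypothetical helper: sender is anything that responds to call, standing
# in for the real OSC client so this runs without a synth attached.
def play_note(sender, note, hold_seconds)
  Thread.new do
    sender.call(:note_on, note)
    sleep hold_seconds # only this worker thread waits
    sender.call(:note_off, note)
  end
end

events = Queue.new
sender = ->(kind, note) { events << [kind, note] }

worker = play_note(sender, 60, 0.05) # returns immediately
events << [:frame_handled, nil]      # the caller keeps going meanwhile
worker.join

log = []
log << events.pop until events.empty?
p log
```

Without the thread, the sleep would stall frame handling for the whole length of the note.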
Unlike the example program in part 1, I skipped creating a special class just to kick off the program. No special reason other than to show it.
Before I get to the really clever part I have to point out the unclever part. First, I’m unconvinced that every class and module needs its own file. For a small enough program or library putting everything in one file can make life easier. This is also one of the problems with demo programs: they are often small enough that trying to do what might otherwise be The Right Thing (e.g. splitting things up into distinct classes, modules, files, etc.) makes things more, not less, complex and harder to manage.
My breaking apart of my original code was done more to play with an idea that, ideally, would make (more) sense for larger apps. But, perhaps for the good, it introduced some interesting questions.
You’ll note that the Listener subclass has a few helper methods for sending OSC and WebSocket messages. You might also have noticed that nothing in this class uses those methods. It’s the code in the GameHandlerConditions module that calls them.
I was about to move those helper methods into that module when I wondered if all this partitioning was actually helping. I had started this with the idea of trying, in some way, to decouple the basic Listener behavior from the specifics of gesture detection and resulting actions. The ultimate goal would be to have a set of GameHandlerConditions modules that could be mixed in where useful. In that scenario, though, bundling in all these helper methods would be clunky. I’m at the point where I want to add yet another module to hold the helper methods, but it makes me think that, at least for this example, it ends up as overkill.
So I left it as is. It’s a work in progress and I prefer to play around with the code in different forms to see what, in fact, works best for me. I expect that in the end I will break out the helper methods anyway because I expect to have many Leap programs that will send OSC, WebSocket, MIDI, and other messages. I’m just reluctant to make a lot of changes on the premise of “some day …”
With all those caveats let’s look at the condition handler code.
I have to say, there’s nothing quite like writing publicly about your code to make you squint at it and scratch your head. It may not be terribly good for one’s ego, but hopefully it works out well for the code.
ConditionHandler class, as it happens, is just a veneer around a bunch of procs. I’m not going to explain the relationship among blocks, procs, and lambdas. For our purpose just know that if you assign a block to a variable that variable is holding a Proc object and can be invoked using
I think when I was starting this I expected ConditionHandler to be more complicated, then decided to just use procs to encapsulate stuff. So, if you’re wondering why that class … yeah, well, work in progress.
GameHandlerConditions starts off with a bunch of helper methods and such in order to accommodate the game conditions. The Leap will return X, Y, and Z coordinate values in millimeters over a pretty good range. What I wanted to do was to scale and map that range down to the coordinates of the game screen, and put some bounds on the usable values. The various normalize methods are there to convert the raw Leap values into usable game-screen locations.
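The scale-and-clamp idea looks roughly like this. The input ranges and screen dimensions are assumed values picked for illustration, and normalize and normalize_x here are my own sketch rather than the methods in the actual module.

```ruby
# Map value from the range in_min..in_max onto out_min..out_max, clamping
# anything outside the input range to the nearest edge first.
def normalize(value, in_min, in_max, out_min, out_max)
  clamped = [[value, in_min].max, in_max].min
  ratio = (clamped - in_min).to_f / (in_max - in_min)
  (out_min + ratio * (out_max - out_min)).round
end

# Assumed ranges, for illustration only: usable Leap x spans -150..150 mm
# and the game canvas is 800 pixels wide.
def normalize_x(mm)
  normalize(mm, -150, 150, 0, 800)
end

puts normalize_x(0)
puts normalize_x(-400)
puts normalize_x(75)
```

Because the Leap's Y axis points up while screen Y usually points down, a vertical version can simply swap the output bounds, for example normalize(mm, 50, 350, 600, 0).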
load_handlers is the essential method. Its job is to populate the @handlers array with a bunch of ConditionHandler instances. As it happens these could simply be straight-up procs. But, you know, some day I may find that wrapper class useful.
The first condition is something of a tracer bullet. I’ve had times where I think everything is correctly plugged in and running yet get no output from the Leap. I find it handy to get feedback to let me know that something is happening, even if it’s not exactly what I’m looking for.
f in each block used to create a new ConditionHandler is the Frame object obtained from the Leap Controller instance. The code in the block is whatever you might have had inside an
I mentioned before that the message-sending methods are only used by these handlers. Just as with the various value-normalizer methods, these other helper methods might belong with this module. Or both of those should be split out and mixed in as needed. It’s likely that, some day, I’ll write other programs that will target a screen size that requires location normalization. When that day arrives I will play with other code arrangements.
The remaining condition handlers are part of the game play. If there is exactly one finger then the location is used to move the main character in the game. (I’ll be describing that game in detail in a future article.) If there are exactly two fingers, then the size of the gap between the finger tips is used to make the game character grow. (The point of the game is to chase a critter then grow big enough to scare it away.)
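Reduced to its arithmetic, the game logic for those two cases might be sketched as follows. Finger tips are plain [x, y, z] arrays here, and the tip_gap and game_action names, along with the :move and :grow labels, are made up for this sketch.

```ruby
# Straight-line distance between two [x, y, z] positions in millimeters.
def tip_gap(a, b)
  Math.sqrt(a.zip(b).sum { |p, q| (p - q)**2 })
end

# Decide what a frame's finger tips mean for the game. Returns a small
# Hash that could be serialized and pushed to the browser, or nil when
# the finger count matches neither case.
def game_action(tips)
  case tips.size
  when 1 then { action: :move, x: tips[0][0], y: tips[0][1] }
  when 2 then { action: :grow, gap: tip_gap(tips[0], tips[1]) }
  end
end

p game_action([[10.0, 120.0, 5.0]])
p game_action([[0.0, 100.0, 0.0], [30.0, 140.0, 0.0]])
```

In the real program the raw positions would go through the normalize methods before being sent on to the game.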
Over time I’ll sort out a better arrangement. Hopefully what I’ve shown so far will help you with your code, arranged as you think best.
Wrapping up this part
If you run this code, even without the HTML game code, you should see the command-line output of the various condition matches and message dispatching. If you happen to have Renoise, create a song that has some instrument assigned to the first track and make sure you have the Renoise OSC server running; you should hear that instrument triggered when you do the two-finger “grow” gesture. Likewise for the WebSocket messages, if you feel like setting up a Web page to listen for messages and do something with them.