OpenCL, Binary searches, and Matlab

I’ve been programming a lot lately.  A whole lot.  So much in fact that I’ve started accidentally ending some sentences with semicolons.  Previously, I was doing work all of my programming in Matlab for Toshiba.  That got pretty boring pretty quickly.  Now, I do the matlab programming, but I’m also doing a lot of programming for UCLA in C/OpenCL.  That’s WAY more fun.  

The programming language spectrum is kind of like pocket tools and pocket knives.  At one end of the spectrum you’ve got leatherman-style tools with a little bit of everything that one might need.  At the other end you’ve got perfectly made beautiful fixed-blade pocket knives.  For a little bit of context, Matlab is at the leatherman end of the spectrum and languages like C, Fortran, and even assembly are are the knife end of the spectrum.  

Matlab is a really handy thing to have around.  I love it for testing routines that I’m not sure will produce the results that I’m looking for. It’s a like a sandbox.  One of the things that makes it great for this purpose is that the language is interpreted, not compiled.  To you lay-folk out there, this means that the system will translate and execute the machine code as it stumbles across each line whereas compiled program first translates everything into machine code and then executes the program.  The fact that Matlab is interpreted allows for extremely interactive debugging: I can stop a program mid-execution, check on variable values, manipulate them, etc.  I also don’t have to worry about programming overhead, includes, defines, and all the basic, core code that is required to even get a C programming up and running.  Matlab handles all of this for the user.

C and C++ are the other two languages that I’ve worked in in any sort of appreciable way  (I’m just going to say C from now on, but it means both).  I like them. A lot.  With C you get a lot closer to the actual workings of the computer, without necessarily having to understand the architecture of the computer.  C is also compiled which, what you lose in convenience is more than made up for in speed.  

Now that I’m back in the programming mix of things and working in both languages, Matlab is getting frustrating.  My work for Toshiba is developing a full-sized matlab toolbox complete with graphical user interfaces.  This is where matlab starts to get tedious.  All of that overhead that’s taken care of for the user starts to add up.  Matlab, while the code resembles the structure of C, is also written in Java, which I must confess that I do not like and I find to be pretty slow.  All of this adds up to just plain old slow code, when much of what I’m doing should be pretty quick.  That being sad, Matlab has an optimized function for EVERYTHING it seems like.  Need to pad an array? padarray().  Need to reshape an array? reshape().  Need to know the size of an array? size().  Initialize and fill an array with all zeros? zeros().  These are things in C that, while not difficult, require the user to do the programming.  Matlab is also really optimized for linear algebra.  That being said, so are GPUs…

Now we get to OpenCL.  OpenCL is a little rough around the edges it seems like.  It’s a relatively new standard (2009 for version 1.0) for GPU programming.  For those who don’t know, GPU stands for Graphics Processing Unit and is basically a card that lives in your computer that has a bunch of tiny little processors optimized for certain tasks, namely the tasks involved in rendering graphics on screen.  If you search for books on how to program C, you’ll be overwhelmed with choices; if you search for books on OpenCL you’ll be overwhelmed… with about three choices.

I feel pretty strongly that I could get 90% of people up and running with basic programming in Matlab/C/related languages in about about an hour.  The structures are just simpler and once you understand how things work, it’s up to the programmer to develop their vocabulary of how to do things.  It took me two weeks of pure reading, no coding, to get to the point that I could begin to flirt with programming in OpenCL.  As I read in one place though, a lot of that complexity is “boilerplate” and once you’ve got it, it’s nearly copy and paste to get it into your new programs.  But that first go ‘round, it’s a doozy.

OpenCL, when wielded properly, is like a katana.  It’s built for one thing, it’s built perfectly (ok sorta), and it can do so much more than any pocketknife or multitool.  I’ve gotten to the point where the Matlab programming is just chipping away at different things to produce results and I’d much rather be practicing with my katana (I’m still not very good, so you’d better stand back).  Most of the interesting computing concepts in Matlab are masked by the fact that Matlab probably already has them integrated and optimized somewhere in it’s massive toolkit.  

Take, for instance, my modified “quick-sort” algorithm that instead finding where an input value would go in a sorted list:

function [ count ] = quick_table( n,list )
% Assume there aren't multiple places in the list the number could go

if numel(list)>1
    if list(count)>n
    elseif list(count)<n
    elseif list(count)==n
    if n>list
    if n<list
    if n==list

(Note that the indexing is built for Matlab since I was prototyping therein.)

Now a computer scientist looking at this is saying “well that’s just a binary search algorithm.”  True, but I didn’t know that when I wrote it.  The point is, it was super fun coming up with it!  Matlab’s “ismember” function already does this, but I didn’t know that.  I didn’t even know that this was already a CS/algorithms thing.  Truthfully, while I figured someone had to have thought of it before, I thought I might have really be onto something!  I guess I was, just a little late to the game.  If I HAD known what it was though, and if I HAD known how to search for it, odds are I would have just discovered “ismember” and just used that and never stumbled across writing my own, from scratch, and the satisfaction that came along with it.  

These are really the type of computing issues that are fascinating to me and I’m realizing that if I’m working in Matlab, I’m rarely exposed to them.  It’s like learning a bunch of vocabulary without any real knowledge of what the words mean, just what they refer to.

So long story short: OpenCL and C are where it’s at for me these days.  If you’ve got any interesting computer science problems, send them my way.  Just please, no GUI programming in Matlab…