Monday, May 30, 2011

Neat Coding Trick # 3

Here's the final useful coding trick I learned last week. While Matlab is unparalleled for rapid prototyping, there are any number of reasons why you might want to deploy your code in another language such as C/C++. For example, you might be targeting an embedded device that requires C/C++. Or you might just want to port and compile certain functions you've written in Maltab. This would be the case if you've written a function that is slow to execute in Matlab but that will run very quickly once compiled. Just about anything that is loop-intensive will fall into this category.

For a while now I've known about "mex" functions. Mex functions are C/C++ routines which you compile at the Matlab command line; mex functions must include a special function called mexFunction which specifies how data is passed back and forth between Matlab and the C routine. Once the code is compiled, you call the function from Matlab just as if it were a Matlab function and hopefully you realize huge savings in execution time. This technique works pretty well but the downsides are (a) You have to write the code yourself from scratch and (b) you have to write the mexFunction interface which is tricky at best.

Happily, I discovered "emlc"which generates C code from Matlab code. Here's how it works.

  1. Write a function in Matlab
  2. Use emlc to convert that code to C from the command line.
Wasn't that easy? There are a few important options that are worth being aware of. Suppose your function is "myFunc.m". You could go to the command line and type:
  • emlc -T MEX myFunc
In this case,  your myFunc.m would be converted into myFunc.mexmaci64 (or some other extension depending on your operating system). Your code is automatically ported over and compiled. So you can do all your prototyping in Matlab and then use this simple command to get a high speed executable.

If you are targeting an embedded system and you just need the raw C code, you can instead try using:
  • emlc -T RTW
There are a host of options; I recommend reading the emlc documentation to be aware of all the options.

Here are a couple of useful tricks I've learned by trial and error. Most importantly, whereas Matlab will dynamically allocate memory for you on the fly, C/C++ is not set up to do the same and must be explicitly instructed on how to handle memory. For example, suppose you've written a function that takes as input an array "y" which will have two elements in it. Whereas Matlab will give you the benefit of the doubt that "y" will in fact have two elements at run time, C will not. To avoid a compile-time error, you must include the following statement just after your Matlab function heading:
  • assert(all(size(y)==[2,1]));
This command tells the compiler to throw a run-time error if the "y" input to the function is not exactly of dimensions two rows and one column.

Along the same lines, you must also pre-allocate all output variables. If your output variable is "dy" and you expect it to have two elements, you can't just say:
  • dy(1) =  ... ;
  • dy(2) =  ... ;
Instead, you have to say:
  • dy = zeros(2,1); % preallocate
  • dy(1) =  ... ;
  • dy(2) =  ... ;
I also learned that its not good enough to preallocate with an indefinite number. For example, this will cause a compile-time error:
  • function dy = myFunc(y,tmax)
  • int nPts = tmax/0.001;
  • dy = zeros(nPts,1);
The problem is that the compiler has no idea how many points nPts will be. The workaround is to preallocate up to some maximum worst-case array length and then to truncate it as necessary:
  • function dy = myFunc(y,tmax)
  • int nPts = tmax/0.001;
  • int ptsMax = 1e6;
  • dy = zeros(ptsMax,1);
  • dy(1) = ... ;
  • dy(2) = ... ;
  • dy = dy(1:nPts);
Finally, you need to include the following comment next to your function heading: %#eml. This facilitates the creation of the compilation report:
  • function dy = myFunc(y,tmax) %#eml
Just for kicks, I coded up a single-cell Hodgkin-Huxley simulation using Forward Euler. The point was to create something slow and lumbering so that I could really see the speed difference. You can download the mFile by clicking here.

I used the following command at the Matlab prompt:
  • clear; tic; [v,t] = hodgkin_huxley(10,1,100); toc; plot(t,v);
Over five trials, the average elapsed time is 62ms.

Then I converted the m file to a mex file using the steps outlined above and reran it using exactly the same command line command:
  • emlc -T MEX hodgkin_huxley
  • clear; tic; [v,t] = hodgkin_huxley(10,1,100); toc; plot(t,v);
Now the average elapsed time is 14ms. Thats an execution-time reduction of over 75%, and all it cost me was typing in a single line a the command prompt.

No comments:

Post a Comment