Threading Building Blocks test drive

After reading the TBB book, I spent half a day playing with the library. My test is simple: take two 1000-element arrays, do some math on one array, store the result in the other, swap the arrays and do it again. Run this a few million times. I tried this program on my 2-core machine, and the result is just what we expected:

Loops        Single thread    TBB on 2 cores
1,000,000    36.235s          18.418s
3,000,000    106.257s         54.638s

In this case, the small code change is well worth it. If this library had been available when I was working on my last ray-tracing engine, my life would have been much easier. I hope a similar library will someday be available for the GPU.
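The test described above can be sketched roughly as follows. This is a minimal single-threaded baseline, not the original benchmark: the post does not say what math was done, so `kernel` is a hypothetical stand-in, and the names `Buffer`, `update_range` and `run_passes` are made up for illustration. The comment inside the loop shows where a `tbb::parallel_for` call over a `tbb::blocked_range` would replace the serial call; that part is not compiled here.

```cpp
#include <array>
#include <cstddef>
#include <utility>

constexpr std::size_t N = 1000;            // array length from the post
using Buffer = std::array<double, N>;      // hypothetical element type

// Hypothetical stand-in for "some math"; the post does not specify it.
inline double kernel(double x) { return 0.5 * x + 1.0; }

// Process one index range. Each element is independent of the others,
// which is what makes the loop safe to split across threads.
void update_range(const Buffer& src, Buffer& dst,
                  std::size_t begin, std::size_t end) {
    for (std::size_t i = begin; i != end; ++i) {
        dst[i] = kernel(src[i]);
    }
}

// Single-threaded baseline: compute into the other array, swap, repeat.
Buffer run_passes(Buffer a, std::size_t passes) {
    Buffer b{};
    Buffer* src = &a;
    Buffer* dst = &b;
    for (std::size_t p = 0; p < passes; ++p) {
        // With TBB (headers <tbb/parallel_for.h> and <tbb/blocked_range.h>),
        // this one call is the "small change":
        //   tbb::parallel_for(
        //       tbb::blocked_range<std::size_t>(0, N),
        //       [&](const tbb::blocked_range<std::size_t>& r) {
        //           update_range(*src, *dst, r.begin(), r.end());
        //       });
        update_range(*src, *dst, 0, N);
        std::swap(src, dst);               // the freshly written array becomes the input
    }
    return *src;                           // after the final swap, src holds the latest result
}
```

Keeping the per-range work in its own function means the serial and parallel versions share the same inner loop, so the timing difference comes from the scheduling alone.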

I found another interesting thing during my test drive:

unsigned a, b = b0;
unsigned n = <a number you like>;
for(int i = 0; i < LOOP; ++i) {
    a = b * b + n;
    b = a * a + n;
}
No matter what b0 is, a and b settle into a pair of fixed numbers (s1, s2) that depend only on the value of n. If n is odd, then a != b; otherwise, a = b = one of s1 and s2. I can’t prove it mathematically, but I’ve verified it for b0 in [0, 1024) and n in (0, 1024].
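A small sketch for repeating this experiment, assuming `unsigned` is 32 bits wide so the arithmetic wraps modulo 2^32. The helper names `step`, `settle` and `is_settled`, and the default of 64 loop iterations, are illustrative choices and not part of the original test.

```cpp
#include <cstdint>
#include <utility>

// One step x -> x*x + n, reduced modulo 2^32 (assumes 32-bit unsigned wrap-around).
inline std::uint32_t step(std::uint32_t x, std::uint32_t n) {
    return static_cast<std::uint32_t>(static_cast<std::uint64_t>(x) * x + n);
}

// Run the loop from the post and return the final (a, b).
std::pair<std::uint32_t, std::uint32_t>
settle(std::uint32_t b0, std::uint32_t n, int loops = 64) {
    std::uint32_t a = 0;
    std::uint32_t b = b0;
    for (int i = 0; i < loops; ++i) {
        a = step(b, n);
        b = step(a, n);
    }
    return {a, b};
}

// True when one more loop iteration would leave (a, b) unchanged.
bool is_settled(std::pair<std::uint32_t, std::uint32_t> p, std::uint32_t n) {
    return step(p.second, n) == p.first && step(p.first, n) == p.second;
}
```

Calling `settle(b0, n)` for several values of b0 and comparing the returned pairs is a quick way to check the observation. One thing to watch for when comparing: a = b*b + n always has the opposite parity from b when n is odd, so in that case the same two numbers can show up in swapped order depending on whether b0 is even or odd.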

April 28, 2009 at 12:01 am 1 comment

Porting C++ code to Linux kernel

Notes and sample code on porting existing C++ kernel code to Linux

Continue Reading April 5, 2009 at 1:01 am 3 comments

Object in Javascript, a C++ programmer’s point of view

Explains JavaScript OOP, which can be confusing for C++ programmers.

Continue Reading March 20, 2009 at 4:59 am 1 comment

Lock-free programming

The paper "Lock-free interprocess communication" introduces four algorithms that can make interprocess communication more efficient on SMP machines. In fact, A1, A2 and A3 are just lightweight lock implementations that reduce the overhead of system calls. Because they wait in a busy loop, they also avoid most of the process switches that would otherwise happen when a process fails to get the lock. A4 is a variant of a lock-free implementation that requires readers to keep a copy and check its consistency. Since writers work on the main copy directly, this avoids the need for a GC. I believe that until lock-free technology matures, this is the right way for us to benefit from it. For example, we could use A4 to optimize JACKD’s client list. The test results in the paper show that on SMP, lock-intensive processes are more likely to run into contention. As a result, those processes can perform even worse on SMP than on single-processor machines.
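To make the two ideas concrete, here is a rough sketch of a busy-wait lock (the flavor of A1 to A3) and of a single-writer record that readers copy and then validate with a sequence counter (one common pattern that fits the description of A4). It is not taken from the paper: the class and function names, the two-integer payload and the single-writer assumption are all illustrative, and it uses threads within one process rather than shared memory between processes.

```cpp
#include <atomic>
#include <cstdint>

// Busy-wait lock: no system call, a waiter simply spins until the flag clears.
class SpinLock {
    std::atomic<bool> locked_{false};
public:
    bool try_lock() { return !locked_.exchange(true, std::memory_order_acquire); }
    void lock()     { while (!try_lock()) { /* spin */ } }
    void unlock()   { locked_.store(false, std::memory_order_release); }
};

// Record updated in place by ONE writer (assumption). Readers never block the
// writer; they copy the fields and retry if the counter shows a write overlapped.
struct Record {
    std::atomic<std::uint32_t> seq{0};   // odd while a write is in progress
    std::atomic<int> x{0};               // hypothetical payload
    std::atomic<int> y{0};
};

// Writer works on the main copy directly.
// Default (sequentially consistent) atomics keep the ordering argument simple.
void write_record(Record& r, int x, int y) {
    const std::uint32_t s = r.seq.load();
    r.seq.store(s + 1);                  // mark "write in progress"
    r.x.store(x);
    r.y.store(y);
    r.seq.store(s + 2);                  // publish: counter is even again
}

// Reader takes a private copy and checks that it is consistent.
void read_record(const Record& r, int& x, int& y) {
    for (;;) {
        const std::uint32_t before = r.seq.load();
        if (before & 1u) continue;       // writer is busy, try again
        x = r.x.load();
        y = r.y.load();
        if (r.seq.load() == before) return;  // nothing changed while copying
    }
}
```

Because the writer never allocates a new copy, there is nothing for readers to hold on to and nothing to reclaim later, which is the point made above about not needing a GC. The cost is that a reader may have to retry when it races with a write.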

June 8, 2008 at 9:50 pm Leave a comment
