Jettisoned Facebook

Facebook has been a completely unnecessary part of my life for the past few months.  Today, it is no longer.

Why?  I got an ad which read, “Leaving Google? …” from First Round Capital.   OK – that’s neat; Josh is a clever guy!  But, that is not what I want.  I don’t mind seeing the ad, but I don’t know what information about me Facebook shares with it’s advertisers.  Fortunately, I’ve been a lot less revealing with my information than some people are.  But clearly I already said too much.  Thank god they didn’t have any credit cards!

Bye!

Semaphores vs Events Showdown

I enjoy asking an interview question about threading and queues.  Candidates usually attempt to solve the problem using one of two Windows synchronization primitives – either the Event or the Semaphore.  Sadly, I see a lot more attempts using Events than Semaphores.  Sometimes, I’ll ask, “which is faster on Windows, a semaphore or an event?”  The answer is “it depends”, but most people don’t say that.  Anyway, it is my opinion that Events are awful from a programming model as well as from a performance standpoint.  Let’s find out.

To find out, I had to create a little benchmark.  Of course, any benchmark is contrived, and this one is no different.  Lets create a set of N producers and N consumers putting and getting from a queue.  Then let’s implement basically the same queue using each of 3 types: a semaphore based queue, an event based queue (manual reset events) and an event based queue (auto reset events).

I ran each of these 3 algorithms with 8 concurrent consumers and producers through 20,000 iterations (I did 5 samples, threw out the hi/lo, etc etc):

  Latency Max Queue Depth
Semaphore 284.6ms ~10K
Event (auto-reset) 456.2ms ~150K
Event (manual reset) 297.2ms ~10K

So the conclusion is that for this test, events and semaphores are pretty close.  The semaphore does provide a slight (<10%) performance improvement over the event.  

But it also shows that there is clearly a right and wrong way to use the event; auto-reset being bad, and manual-reset being good.  I suspect this is basically due to the benchmark itself; the benchmark is greedy – the producers all add items to the queue as quickly as possible.  Since usually there is something on the queue, the manual reset event doesn’t need to clear the event nearly as often.  I weakly attempted to add a delay to mix things up, but even a modest delay (1ms) basically makes all of these algorithms behave identically unless you crank the thread count through the roof.

Next, I decided to increase the thread count of producers and consumers to 32 instead of 8.  Now we get this:

  Latency Max Queue Depth
Semaphore 949.6ms ~20K
Event (auto-reset) 1615.6ms ~600K
Event (manual reset) 621.8ms ~165K

Now all you Event fans will tell me to eat crow, clearly the Event is faster, right?  Not quite!  In this case, the manual-reset event code is cheating.  Each time you add a single element to the queue, many consumers awaken because they all see the event as set; since its a greedy benchmark, this is like predicting the future – that an element will be ready by the time the consumer gets through the lock.  It’s coincidental to the benchmark that it works.  But it is interesting to see.  Also note the max queue length now shows a different problem.  The semaphore is reasonably “fair” compared to the Event based implementations.

I still dislike events, however.  And if you don’t yet agree, take a look at the code.  The Semaphore code is incredibly simple compared to the Event based code.   (Is there a simplification I could have made?)  There is no doubt in my mind that most experienced programmers will botch the Event based algorithm.  I’m reasonably confident these solutions are correct, I wouldn’t be completely surprised if some smart reader posts that my implementation is flawed too.

There is one place where Events are useful; if you have a need for synchronization which changes only once (e.g. from set to unset), events are pretty cool.  In the driver for this test program, I used an event to start the test after warming up all the threads.  Events are great for that.  Just don’t use them in loops ever!

Lastly, I’m sure I left out something important. If you’ve got another idea about this, please post a comment!  But don’t bark about my lack of error checking or huge queue lengths.  I did that on purpose.

 

Here is the source code to the Semaphore-based Queue:


#pragma once
#include <assert.h>

class QueueUsingSemaphore {
public:
  QueueUsingSemaphore(int max_items) :
      head_(0),
      tail_(0),
      count_items_(0),
      max_items_(0),
      queue_max_(max_items) {
    items_ = new int[queue_max_];
    InitializeCriticalSection(&lock_);
    semaphore_ = CreateSemaphore(NULL, 0, queue_max_, NULL);
  }

  ~QueueUsingSemaphore() {
    delete items_;
    assert(head_ == tail_);
    printf("Max was %dn", max_items_);
  }

  int Get() {
    int rv;
    WaitForSingleObject(semaphore_, INFINITE);
    EnterCriticalSection(&lock_);
    assert(head_ != tail_);
    rv = items_[head_];
    head_ = (++head_) % queue_max_;
    LeaveCriticalSection(&lock_);
    return rv;
  }
    
  void Put(int val) {
    EnterCriticalSection(&lock_);
    items_[tail_] = val;
    tail_ = (++tail_) % queue_max_;
    assert(head_ != tail_);   // Queue is overflowing.

    int count = tail_ - head_; 
    if (count<0) count += queue_max_;
    if (count > max_items_) { max_items_ = count; } 

    LeaveCriticalSection(&lock_);
    ReleaseSemaphore(semaphore_, 1, NULL);
  }

private:
  HANDLE semaphore_;
  CRITICAL_SECTION lock_;
  int *items_;
  int head_;
  int tail_;
  int count_items_;
  int max_items_;
  int queue_max_;
};

Here is the Event based code (manual reset):


#pragma once
#include <assert.h>

class QueueUsingManualEvent {
public:
  QueueUsingManualEvent(const int max_items) :
      head_(0),
      tail_(0),
      count_items_(0),
      max_items_(0),
      queue_max_(max_items) {
    items_ = new int[queue_max_];
    InitializeCriticalSection(&lock_);
    event_ = CreateEvent(NULL, TRUE, FALSE, NULL);  // Manual-reset event.
  }

  ~QueueUsingManualEvent() {
    delete items_;
    assert(head_ == tail_);
    printf("Max was %dn", max_items_);
  }

  int Get() {
    bool got_one = false;
    int rv;
    while (!got_one) {
      WaitForSingleObject(event_, INFINITE);
      // Note: it is possible to get here and have the queue be empty.
      EnterCriticalSection(&lock_);
      if (head_ == tail_) {
        LeaveCriticalSection(&lock_);
        continue;
      }

      got_one = true;
      rv = items_[head_];
      head_ = (++head_) % queue_max_;
      if (head_ == tail_)
        ResetEvent(event_);  // The queue is now empty.
      LeaveCriticalSection(&lock_);
    }
    return rv;
  }
    
  void Put(int val) {
    EnterCriticalSection(&lock_);
    items_[tail_] = val;
    tail_ = (++tail_) % queue_max_;
    assert(head_ != tail_);   // Queue is overflowing.

    int count = tail_ - head_; 

    if (count<0) count += queue_max_;
    if (count > max_items_) { max_items_ = count; } 

    // It's tempting to move this outside the lock.  However,
    // unlike the semaphore, this cannot be moved outside the 
    // lock because the threads are intentionally racing 
    // on the event notification.
    SetEvent(event_);
    LeaveCriticalSection(&lock_);
  }

private:
  HANDLE event_;
  CRITICAL_SECTION lock_;
  int *items_;
  int head_;
  int tail_;
  int count_items_;
  int max_items_;
  int queue_max_;
};

Here is the source to the driver code:


#pragma once
#include "queue_semaphore.h"
#include "queue_event.h"
#include "queue_event_manual.h"

class Driver {
 public:
  Driver() :
      queue_(kMaxItems) {
    start_event_ = CreateEvent(NULL, TRUE, FALSE, NULL);
  }

  void StartThreads() {
    for (int i=0; i<kMaxProducers; i++)
      threads_[i] = CreateThread(NULL, 0, ProducerMain, this, 0, 0);
    for (int i=0; kMaxConsumersi++)
      threads_[kMaxProducers + i] = CreateThread(NULL, 0, ConsumerMain, this, 0, 0);

    Sleep(500);  // Let the  threads get ready.
    start_ticks_ = GetTickCount();
    SetEvent(start_event_);
  }

  void WaitForFinish() {
    for (int i=0; i<kMaxProducers + kMaxConsumers; i++)
      WaitForSingleObject(threads_[i], INFINITE);
    stop_ticks_ = GetTickCount();
  }

  int TimeTaken() {
    return stop_ticks_ - start_ticks_;
  }

  void Producer() {
    WaitForSingleObject(start_event_, INFINITE);
    for(int i=0; i<kMaxIterations; i++)
      queue_.Put(i);
  }

  void Consumer() {
    WaitForSingleObject(start_event_, INFINITE);
    for (int i=0; i<kMaxIterations; i++)
      int x = queue_.Get();
  }

private:
  static DWORD WINAPI ProducerMain(void *self) {
    Driver* driver = static_cast(self);
    driver->Producer();
    return 0;
  }

  static DWORD WINAPI ConsumerMain(void *self) {
    Driver* driver = static_cast(self);
    driver->Consumer();
    return 0;
  }

  static const int kMaxProducers = 32;
  static const int kMaxConsumers = 32;
  static const int kMaxIterations = 20000;
  static const int kMaxItems = (kMaxProducers + kMaxConsumers)*kMaxIterations;

  HANDLE start_event_;
  //QueueUsingSemaphore queue_;
  //QueueUsingEvent queue_;
  QueueUsingManualEvent queue_;
  HANDLE threads_[kMaxProducers + kMaxConsumers];
  int start_ticks_;
  int stop_ticks_;

};

Washington Based Think Tank

tank A colleague of mine asked me recently,

When you read about a ‘Washington-based Think Tank’, does that mean a lobby?

I hadn’t really thought about this term before, although I knew what they were.  Of course the answer is absolutely yes.  What a nice phrase – “think tank” – must be a bunch of really smart people right?

Well if you need to know more, Google searches tell us everything; it’s just a bunch of special interest groups.

Use Him and Lose Him

Doesn’t it seem awfully coincidental that while Bonds allegedly perjured back in 2003, they waited until 2007 to indict him?  Baseball and the Giants used Bonds to the very end; they got all the press, all the fans, all the hype for baseball by way of his road to the record.  Once that was done, they unleash the Feds on him.

Engineering Graduates and Bad Legislation

We’re not getting enough American college engineering graduates to fill the jobs we’ve got.  We’re in a global economy, and companies like Google and Microsoft are not only doing business for our fellow Americans, but also for the entire globe.  Each of these companies (like most other large US businesses) is generating more revenue outside the US than inside the US.  To continue to build, we need more college-educated Americans ready to dive into tech.  At the same time, in China and India, “they have more honors kids than we have kids“.  Is it any wonder that we need to import so many workers?  Don’t get me wrong; I’m not against international workers; I think it’s fine.  But, America needs to grow its educated workforce if it wants to compete.

At the same time, we’ve got looney politicians like George Miller, introducing legislation to cut funding to schools that don’t police illegal music trading by its students.  WHAT?  Since when do colleges need to do the RIAA’s dirty work in order to get federal funding?  This politician needs to take a broader perspective – he’s just wrong. 

It’s so sad when our politicians are prioritizing the music business ahead of our nation’s future.  We need to make lobbyists criminals.

Buying Appliances Online

This year, I replaced all appliances – refrigerator, microwave, stove, oven, dishwasher, washer, dryer.  It was a lot of work to figure out which ones to buy.

At first, I really did want to buy online.  AJ Madison is a great appliances store, and their website is awesome.  It is definitely one of the best ways to learn about what products are available and compare them.  Combine that with free shipping (they almost always offer free shipping for orders over $1500 and no sales tax in CA, and it seems like a great deal, right?

Well, it is a pretty good deal, but despite all that, they still generally are more expensive.  At the end of the day, they are mailing their appliances across the country.  They hide this by declaring “Free Shipping”, but you know it’s not really free!

I ended up buying most all the appliances at Western Appliance.  I don’t really like having to go to the store.  Appliance stores always make pricing difficult.  Instead of just showing the price, its the base price minus the in-store rebate minus the manufacturer rebate, minus the weekend special, etc.  I guess they think most of their customers can’t do basic math?  It seems to backfire, because it makes the prices seem inflated.  If they didn’t inflate the price, I never would have thought the e-store price was cheap.  Western Appliance’s salespeople are pretty reasonable and not high pressure, and they are knowledgeable too.  In the end, I know I paid less, even after Western Appliance’s $50 delivery fee and CA sales tax. 

There is one more important reason to shop locally.  The local stores know the PG&E/Water company rebates for energy efficient and water-saving appliances.  AJMadison, or any other e-tailer, just can’t possibly keep track of local area rebates for everyone.  These rebates are non trivial too; I’ve had about $400 in utility rebates this year.  If I hadn’t bought locally, I probably wouldn’t have known.

Canada’s Plot to Increase Global Warming

News is now coming to light that Canada has been investing heavily in anything and everything which increases global warming.  As we all know, much of the land in the innocent country to our north is unusable, icy tundra.  As global warming continues, Canada hopes to double its usable land area, bringing it a windfall of increased natural resources, more desirable weather, and untold riches.

One researcher stated that due to the weather changes, “Vancouver could become the Los Angeles”.

I’ve always known that Canadians have beady little eyes.  I guess we now know why.

My Favorite Geek Resources

If you write a lot of software and have your own stuff that you host and tinker with, here are some recommendations:

1) ServerBeach!  ServerBeach provides dedicated server hosting.  Need a place to hack on perl, python, apache, and php?  These guys have great bandwidth, great support, and zero downtime.  I pay $99/mo.  Checkout my uptime:

[mbelshe@s1]$ uptime
 11:02:04 up 422 days,  3:32,  2 users,  load average: 1.41, 1.07, 0.82

2) Code Project.  This is just a great site for collecting articles about random coding topics, with concrete examples.  I read the RSS feed daily.  Sure, there are people that post weak articles, but reading this daily gets your mind thinking about topics you wouldn’t have otherwise considered.

Daunting Installers

wlinstI read the news about the latest Windows Live Suite from Microsoft today.  I hadn’t tried the “Family Safety OneCare” product, so I thought I’d give it a shot.

Then I clicked on the installer. (Click to enlarge)

Then I said, “ugh.”

I really just wanted to try OneCare.  But the installer offers up 9 different checkboxes of “stuff” for me to download.  This wouldn’t be a problem except the 8 of them are checked – requiring me to uncheck them.  And, in the world’s most ironic part, the Windows Live OneCare product is the *only* one of the bunch which isn’t checked.

Am I too sensitive to unchecking boxes?

When They Take Away the Full RSS Feed…

This has happened a few times to me.  I find a blog I enjoy, and I read it for some time.  Maybe a year or more.  Then, one day, the author of the blog stops posting full-article feeds.  What do you do?

In my case, I have always stopped reading.  There’s just too much content out there to go following every link.  In order to follow a link, I need to be searching for something, and also be under the belief that the link will help me.  When browsing through the day’s news, I am not searching for anything specific.  So naturally, I won’t click the link.

This happened to me with some zdnet blogs a while ago (actually, I think they never published a full feed?), and this month it happened with Kawasaki’s blog.  I sent a nice email to Guy – but we’ll see if he responds; I’m sure he gets a lot of junk.

What do you do?  I bet few people follow the clicks.