Content

Not quite a Yegge long.

There’s a third “hard thing” about concurrency

Monday 27 April 2009 - Filed under Code

So I’m reading Bartosz Milewski’s latest on extending the D type system to eliminate common causes of errors in multithreaded code, and I come across this fragment:

There are two major challenges in multithreaded programming:

  1. Avoiding races
  2. Preventing deadlocks

While his analysis is correct, from a program-correctness point of view, and he does offer reasonably cogent thoughts on how to solve at least the first of those issues, there’s something still missing that a lot of programmers I meet don’t quite understand. Usually it’s inexperienced programmers who have figured out their threading API of choice and are something like 90% there with getting their juicy nondeterministic crashes ironed out.

Here we go, the point so many seem to miss:

3. It’s all well and good to have a “correct” concurrent program, but unless you actually need all those cycles, GTFO my CPUs.

Or,

3. The objective is not to max out all the CPUs available to you.

There’s plenty of less-widely-discussed literature on the topic of Getting The Hell Off The Processor When You Don’t Need It, but to keep things short:

  1. Every thread should have something to wait on. If you’re waiting for IO, wait on it for god’s sake. If you’re waiting for a signal, wait on it. On POSIX systems, you’ve got select(), poll(), epoll(), and no doubt another half-dozen options. On Win32, you’ve got select() if you’re doing sockets, or WaitForMultipleObjects[Ex] for everything else. And…
  2. If a thread has nothing worthwhile to do, it SHOULD be waiting on something. Nothing to do – get off my runnable queue. I didn’t really intend that, but it works.
  3. Mucking with win32 thread priorities, especially with some half-baked last-millennium notion of “make worker threads low-priority” is almost always a waste of time and lead to tuning headaches in maintenance. The low-priority myth comes from an assumption that your worker threads always have stuff to do – that is, unless you decrease their share of your CPU, they’re going to compromise the responsiveness of your UI, or something. Since points (1) and (2) apply to workers and UI threads in those kinds of programs, that’s pretty bogus. In most cases.
  4. If you call Sleep() under Win32, you’re almost certainly doing it wrong. The scheduler effects are NOT what you probably think they are.

2009-04-27  »  admin

Talkback x 7

  1. popurls.com // popular today
    28 April 2009 @ 1:39 am

    popurls.com // popular today…

    story has entered the popular today section on popurls.com…

  2. lacos
    28 April 2009 @ 1:59 am

    You didn’t explicitly mention the type of resource your worker threads should be able to wait on: synchronization primitives.

    1. Mutexes – this is actually a counter-example; if your threads wait on mutexes, you have lock contention, bad critical-section/noncritical-section ratio etc. Still I think in most cases mutexes are better than spinlocks around atomic variables. (If the latter are defined at all.)

    2. Condition variables (of which the semaphore is a special case), or in other words, monitors. This is your ultra-flexible synchronization primitive because it can block the calling thread (put it to sleep), it supports wakeups, and you get to formulate your predicate to advance the way you like. If you use locked polling in conjunction with sleep(), you should most probably use a condvar instead. However, you have to learn to use them correctly. For example:

    - formulate your predicate in disjunctive normal form
    - if you’re not using pthread_cond_wait() (trying to enter the monitor) inside a loop, you’re doing it wrong,
    - use signal (eg. notify()) instead of broadcast (eg. notifyAll()) only if you have mathematically proven that no more than one thread may wait for the wakeup. If you fail to do this, you’ll starve your worker threads, and it will be your fault.

    … maybe you meant condvars when you wrote “waiting for a signal” — signal is a loaded term, it could mean a UNIX signal.

  3. Ben
    28 April 2009 @ 2:52 am

    You are absolutely right, but I would extend it to all programs. Any program, multi-threaded or not, should GTFO if it does not have anything worthwhile to do. No program of any kind (except a system idle process of course) should max out the CPU if it is not doing anything useful.

  4. ChrisP
    28 April 2009 @ 4:12 am

    Great post, agree with everything except one comment – the win32 thread priorities issue.

    If we take a typical Windows app, with a UI thread and one or more worker thread(s) doing something for us, we want the UI thread to be responsive.

    You seem to be quite focussed on non-cpu instensive worker threads, doing things like waiting on file or network IO.

    As soon as it IS a CPU intensive task where you might actually need worker threads, e.g. media encoding, you do want to drop the thread priority to keep the UI responsive.

    I cannot see a situation where it is harmful to indicate to the OS that we want the UI to be higher priority than a workload going on in the background.

  5. Bob Foster
    28 April 2009 @ 5:38 am

    I was with you up to priorities. You don’t consider cases where you start a thread that you are certain will run immediately and expensively, and you don’t want it to make the UI thread unresponsive. Like the parsing threads that run in IDEs when you make a change to a source file. I know there are people who just don’t believe in thread priorities – as an article of faith – but I hope you will agree they make sense in some cases.

  6. Adam Jaskiewicz
    28 April 2009 @ 7:22 am

    So, basically, “don’t busy wait”? Yeah, that got drilled into me during OS junior year. The objective IS to max out the CPUs… with useful work. If you’re busy waiting, you’re using CPU cycles to check to see if other stuff is done, rather than using CPU cycles to do the other stuff.

  7. SJS
    28 April 2009 @ 10:18 am

    You could just say “Don’t Spin”. And that’s not really hard, even given that so many people can’t seem to manage it.

Share your thoughts

Re: There’s a third “hard thing” about concurrency







Tags you can use (optional):
<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>