MESSAGE
DATE | 2015-03-25
FROM | prmarino1@gmail.com
SUBJECT | Re: [NYLXS - HANGOUT] Threads and Speed
The last time I read about POSIX threads on Linux, I noticed a few things:
1) User threads are always bound to a single CPU core.
2) Only kernel threads can utilize multiple CPUs.
3) Kernel threads come in two flavors, CPU-bound and multi-CPU aware, and once created a thread cannot be changed from one to the other.
4) Many languages, such as Java, abstract away the process of creating threads so that the runtime can determine under the hood which type to use.
5) The maximum number of CPUs a single process can utilize is usually a fixed number determined in the code.
6) The defaults for POSIX kernel threads differ on each OS. One glaring example is Solaris vs. Linux. Solaris assumes that programmers know what they are doing when creating threads and will set reasonable caps on the number of CPUs they use, so by default kernel threads on Solaris are multi-CPU aware. In contrast, Linux assumes that many programmers may be amateurs and hobbyists and therefore may not realize they need to cap the number of CPUs utilized, so by default its threads are not multi-CPU aware. This is most noticeable when threading in Perl 5: on Solaris, Perl 5 can utilize multiple CPUs, assuming you aren't using shared memory, but on Linux it is strictly bound to one CPU.
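Whether a process is actually confined to one core can be checked rather than recalled from memory. The following is a minimal sketch, assuming Python 3 on Linux; allowed_cpus is a hypothetical helper name, and os.sched_getaffinity is not available on every platform, so the sketch falls back to the CPU count elsewhere. It only reports the affinity mask of the calling process and does not settle the Perl 5 question above.

```python
import os

def allowed_cpus():
    """Return the sorted list of CPU numbers this process may run on."""
    # os.sched_getaffinity exists on Linux but not on every platform.
    if hasattr(os, "sched_getaffinity"):
        return sorted(os.sched_getaffinity(0))  # 0 means "the caller"
    # Assumption for other platforms: no restriction, so report all CPUs.
    return list(range(os.cpu_count() or 1))

cpus = allowed_cpus()
print(f"this process may run on {len(cpus)} CPU(s): {cpus}")
```

If the list contains more than one CPU, the scheduler is free to place the process's kernel-level threads on different cores.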
Original Message
From: Ruben Safir
Sent: Wednesday, March 25, 2015 13:00
To: learn-at-nylxs.com
Reply To: hangout-at-nylxs.com
Subject: [NYLXS - HANGOUT] Threads and Speed
I'm covering threads in the Operating Systems class and the text is
inadequate. But years ago I recall having some terse conversations about the
threading model and how overblown I thought it was, and how it killed
programs, slowing them down badly.
A big response back was the discussion of context-switch overhead. But there
is a lot of overhead with threads as well, especially kernel-level
threads. Today we have multiple cores, so in order to get concurrency
and parallelism in instruction execution, threading helps spread work
across multiple nodes and CPUs (although I don't know why that cannot
inherently be done at the process level). Regardless, the inherent
acceleration from threads is still not a proven fact as far as I
can tell.
So I have this question in the Homework:
4.2 What are two differences between user-level threads and kernel-level
threads? Under what circumstances is one type better than the other?
Well that is a real head scratcher. It is an open ended lazy question
that is given without much thought on the part of the author.
Under what platform? Under what conditions?
But I did do some research and uncovered two decent POVs on this topic,
opposed to each other, FWIW:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~`
A) For clarity, I usually say "OS-level threads" or "native threads"
instead of "Kernel-level threads" (which I confused with "kernel threads"
in my original answer below.) OS-level threads are created and managed
by the OS. Most languages have support for them. (C, recent Java, etc)
They are extremely hard to use because you are 100% responsible for
preventing problems. In some languages, even the native data structures
(such as Hashes or Dictionaries) will break without extra locking code.
The opposite of an OS thread is a green thread that is managed by your
language. These threads are given various names depending on the language
(coroutines in C, goroutines in Go, fibers in Ruby, etc). These threads
only exist inside your language and not in your OS. Because the language
chooses context switches (e.g. at the end of a statement), it prevents
TONS of subtle race conditions (such as seeing a partially-copied
structure, or needing to lock most data structures). The programmer sees
"blocking" calls (e.g. data = file.read()), but the language translates
them into async calls to the OS. The language then allows other green
threads to run while waiting for the result.
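As a rough illustration (not part of the quoted answer), Python's asyncio shows the same idea, with the difference that the switch points are written out explicitly as await rather than hidden by the language. The names fetch and main and the delays are made up for the sketch; asyncio.sleep stands in for a slow read.

```python
import asyncio

async def fetch(name, delay, log):
    """Pretend to do a slow read; other tasks run while this one waits."""
    log.append(f"{name} start")
    await asyncio.sleep(delay)  # looks like a blocking call, but yields
    log.append(f"{name} done")

async def main():
    log = []
    # Three "green threads" multiplexed onto a single OS thread.
    await asyncio.gather(
        fetch("a", 0.06, log),
        fetch("b", 0.02, log),
        fetch("c", 0.04, log),
    )
    return log

log = asyncio.run(main())
print(log)
# ['a start', 'b start', 'c start', 'b done', 'c done', 'a done']
```

All three tasks start before any of them finishes, even though only one OS thread is involved, because each await hands control back to the event loop.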
Green threads are much simpler for the programmer, but their performance
varies: If you have a LOT of threads, green threads can be better for
both CPU and RAM. On the other hand, most green thread languages can't
take advantage of multiple cores. (You can't even buy a single-core
computer or phone anymore!) And a bad library can halt the entire
language by doing a blocking OS call.
The best of both worlds is to have one OS thread per CPU, and many green
threads that are magically moved around onto OS threads. Languages like
Go and Erlang can do this.
"system calls and other uses not available to user-level threads"
This is only half true. Yes, you can easily cause problems if you call the
OS yourself (e.g. do something that's blocking). But the language usually
has replacements, so you don't even notice. These replacements do call
the kernel, just slightly differently than you think.
Kernel threads vs User Threads
Edit: This is my original answer, but it is about User space threads vs
Kernel-only threads, which (in hindsight) probably wasn't the question.
User threads and Kernel threads are exactly the same. (You can see this by
looking in /proc/, where the kernel threads show up too.)
A User thread is one that executes user-space code. But it can call
into kernel space at any time. It's still considered a "User" thread,
even though it's executing kernel code at elevated security levels.
A Kernel thread is one that only runs kernel code and isn't associated
with a user-space process. These are like "UNIX daemons", except they are
kernel-only daemons. So you could say that the kernel is a multi-threaded
program. For example, there is a kernel thread for swap. This forces
all swap issues to get "serialized" into a single stream.
If a user thread needs something, it will call into the kernel, which
marks that thread as sleeping. Later, the swap thread finds the data,
so it marks the user thread as runnable. Later still, the "user thread"
returns from the kernel back to userland as if nothing happened.
In fact, all threads start off in kernel space, because the clone()
operation happens in kernel space. (And there's lots of kernel accounting
to do before you can 'return' to a new process in user space.)
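The claim that user threads are ordinary tasks known to the kernel can be checked from /proc. This is a sketch under assumptions, not part of the quoted answer: it needs Python 3.8+ for threading.get_native_id and a Linux-style /proc/self/task directory, and native_ids_of_workers is a hypothetical helper name.

```python
import os
import threading

def native_ids_of_workers(count):
    """Start `count` threads; return their kernel thread IDs and the
    task IDs listed under /proc/self/task (None if /proc is absent)."""
    ids = []
    lock = threading.Lock()
    started = threading.Barrier(count + 1)  # workers plus the caller
    finish = threading.Event()

    def body():
        with lock:
            ids.append(threading.get_native_id())
        started.wait()   # report in, then stay alive until released
        finish.wait()

    workers = [threading.Thread(target=body) for _ in range(count)]
    for w in workers:
        w.start()
    started.wait()       # every worker has recorded its ID by now

    task_dir = "/proc/self/task"
    visible = None
    if os.path.isdir(task_dir):
        visible = {int(entry) for entry in os.listdir(task_dir)}

    finish.set()
    for w in workers:
        w.join()
    return ids, visible

ids, visible = native_ids_of_workers(3)
print("kernel thread IDs:", sorted(ids))
print("tasks seen in /proc:", None if visible is None else sorted(visible))
```

On Linux, each ID reported by the threads should also appear as a directory under /proc/self/task, alongside the ID of the main thread.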
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Now, FWIW, the reference to green threads MIGHT be a reference to the SunOS
Green Threads, but maybe not, since this response seems up to date on language
implementations and SunOS has been dead for a while. The Linux kernel breaks
all processes and threads into TASKS, each of which is a kernel data structure.
Now here is the second post. This one is more classical and is likely what the
text is looking for:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Kernel-Level Threads
To make concurrency cheaper, the execution aspect of a process is separated
out into threads. As such, the OS now manages threads and processes. All
thread operations are implemented in the kernel and the OS schedules all
threads in the system. OS-managed threads are called kernel-level threads
or lightweight processes:
• NT: Threads
• Solaris: Lightweight processes (LWP)
In this method, the kernel knows about and manages the threads. No runtime
system is needed in this case. Instead of a thread table in each process,
the kernel has a thread table that keeps track of all threads in the
system. In addition, the kernel also maintains the traditional process
table to keep track of processes. The operating system kernel provides
system calls to create and manage threads.
Advantages:
Because the kernel has full knowledge of all threads, the scheduler may decide
to give more time to a process with a large number of threads than to a
process with a small number of threads. Kernel-level threads are
especially good for applications that frequently block.
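The point about blocking can be sketched with Python's threading module, whose threads are kernel-level threads on Linux. This is an illustration only: time.sleep stands in for a blocking read() or recv(), and the function names and timings are made up for the example.

```python
import threading
import time

def blocking_call(seconds):
    # Stand-in for a blocking system call such as read() or recv().
    time.sleep(seconds)

def run_in_threads(count, seconds):
    """Run `count` blocking calls, one per thread; return elapsed time."""
    threads = [
        threading.Thread(target=blocking_call, args=(seconds,))
        for _ in range(count)
    ]
    begin = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - begin

elapsed = run_in_threads(5, 0.2)
print(f"5 blocking calls of 0.2 s took {elapsed:.2f} s in total")
```

Because the kernel can schedule another thread whenever one blocks, the five waits overlap and the total stays close to the length of a single call instead of their sum.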
Disadvantages:
Kernel-level threads are slow and inefficient. For instance,
thread operations are hundreds of times slower than those of
user-level threads. Because the kernel must manage and schedule threads
as well as processes, it requires a full thread control block (TCB)
for each thread to maintain information about it. As a result,
there is significant overhead and increased kernel complexity.
User-Level Threads
Kernel-level threads make concurrency much cheaper than processes
because there is much less state to allocate and initialize. However, for
fine-grained concurrency, kernel-level threads still suffer from too
much overhead. Thread operations still require system calls. Ideally, we
want thread operations to be as fast as a procedure call. Kernel-level
threads have to be general to support the needs of all programmers,
languages, runtimes, etc. For such fine-grained concurrency we need still
"cheaper" threads.
To make threads cheap and fast, they need to be implemented at user
level. User-level threads are managed entirely by the run-time system
(a user-level library). The kernel knows nothing about user-level threads
and manages their processes as if they were single-threaded. User-level
threads are small and fast: each thread is represented by a PC, registers,
a stack, and a small thread control block. Creating a new thread, switching
between threads, and synchronizing threads are done via procedure
calls, i.e. with no kernel involvement. User-level threads are a hundred
times faster than kernel-level threads.
Advantages:
The most obvious advantage of this technique is that a user-level
threads package can be implemented on an operating system that
does not support threads. User-level threads do not require
modification to the operating system.
Simple Representation: Each thread is represented simply by a PC,
registers, stack and a small control block, all stored in the user
process address space.
Simple Management: Creating a thread, switching between threads and
synchronization between threads can all be done without intervention
of the kernel.
Fast and Efficient: Thread switching is not much more expensive than
a procedure call.
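To make "switching is just a procedure call" concrete, here is a toy user-level scheduler built on Python generators. It is a sketch, not how any particular threads package is implemented: worker and run_round_robin are hypothetical names, and each generator object plays the role of the small per-thread control block (saved position and local state).

```python
from collections import deque

def worker(name, steps, log):
    """A user-level thread: does one unit of work, then yields."""
    for i in range(steps):
        log.append(f"{name}{i}")
        yield  # voluntarily hand control back to the scheduler

def run_round_robin(threads):
    """Run the threads in turn until all have finished. Switching is a
    plain next() call; the kernel is never involved."""
    ready = deque(threads)
    while ready:
        thread = ready.popleft()
        try:
            next(thread)          # resume where the thread left off
        except StopIteration:
            continue              # thread finished; drop it
        ready.append(thread)      # still runnable; back of the queue

log = []
run_round_robin([worker("A", 2, log), worker("B", 3, log)])
print(log)
# ['A0', 'B0', 'A1', 'B1', 'B2']
```

The whole "thread package" lives in user space, which is why it would work even on an OS with no thread support, and also why a thread that never yields would starve the others.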
Disadvantages:
User-level threads are not a perfect solution; as with everything else,
they are a trade-off. Since user-level threads are invisible to the
OS, they are not well integrated with it. As a result, the OS can
make poor decisions, like scheduling a process with idle threads,
blocking a process whose thread initiated an I/O even though the
process has other threads that can run, and unscheduling a process with
a thread holding a lock. Solving this requires communication
between the kernel and the user-level thread manager. There is a lack of
coordination between threads and the operating system kernel. Therefore,
the process as a whole gets one time slice irrespective of whether it
has one thread or 1000 threads within. It is up to each thread to
relinquish control to other threads. User-level threads require
non-blocking system calls, i.e., a multithreaded kernel. Otherwise, the
entire process will be blocked in the kernel, even if there are runnable
threads left in the process. For example, if one thread causes a
page fault, the process blocks.
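The "one blocking call stalls everything" problem can be reproduced with asyncio tasks standing in for user-level threads. This is a sketch with made-up names (polite, rude, run_pair): time.sleep plays the role of a blocking system call, while asyncio.sleep plays the role of a non-blocking replacement supplied by the runtime.

```python
import asyncio
import time

async def polite(delay):
    # Non-blocking wait: control returns to the user-level scheduler.
    await asyncio.sleep(delay)

async def rude(delay):
    # Blocking call: the single OS thread is stuck, so no other task
    # can run until it returns.
    time.sleep(delay)

async def run_pair(worker, delay):
    """Run two copies of `worker` concurrently; return elapsed time."""
    begin = time.perf_counter()
    await asyncio.gather(worker(delay), worker(delay))
    return time.perf_counter() - begin

cooperative = asyncio.run(run_pair(polite, 0.2))
stalled = asyncio.run(run_pair(rude, 0.2))
print(f"non-blocking waits: {cooperative:.2f} s")
print(f"blocking waits:     {stalled:.2f} s")
```

With the non-blocking wait the two tasks overlap; with the blocking call they run back to back, because the kernel sees only one thread and puts all of it to sleep.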
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
So putting the two responses together, you can get a broad look at the threading
topography in the modern system. And together they are more understandable and
comprehensive than the assigned textbooks on the subject.
Ruben