/* Make a thread the running thread. The thread must previously have
   been sleeping, and not holding the CPU semaphore. This will set the
   thread state to VgTs_Runnable, and the thread will attempt to take
   the CPU semaphore. By the time it returns, tid will be the running
   thread. */
extern void VG_(set_running) ( ThreadId tid );

/* Set a thread into a sleeping state. Before the call, the thread
   must be runnable, and holding the CPU semaphore. When this call
   returns, the thread will be set to the specified sleeping state,
   and will not be holding the CPU semaphore. Note that another
   thread could be running by the time this call returns, so the
   caller must be careful not to touch any shared state. It is also
   the caller's responsibility to actually block until the thread is
   ready to run again. */
extern void VG_(set_sleeping) ( ThreadId tid, ThreadStatus state );

The master semaphore is run_sema in vg_scheduler.c.

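To illustrate the contract, a minimal sketch of how a blocking-syscall
wrapper might use the pair (a hypothetical call site; VgTs_WaitSys and
do_blocking_syscall are assumptions, not names taken from this text):

   /* Hypothetical wrapper; we hold the CPU semaphore on entry. */
   static Int wrap_blocking_syscall ( ThreadId tid )
   {
      Int res;
      VG_(set_sleeping)(tid, VgTs_WaitSys); /* gives up the semaphore  */
      res = do_blocking_syscall();          /* others may run now      */
      VG_(set_running)(tid);                /* takes the semaphore back */
      return res;
   }
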
(what happens at a fork?)

VG_(scheduler_init) registers sched_fork_cleanup as a child atfork
handler. sched_fork_cleanup, among other things, reinitializes the
semaphore with a new pipe so that the child process has its own.

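The semaphore being pipe-backed is why a fresh pipe is needed after a
fork. A minimal sketch of such a semaphore (invented names; this is
not the real run_sema code):

   #include <unistd.h>

   static int sema_fds[2];  /* [0] = read end, [1] = write end */

   static void sema_init(void)
   {
      pipe(sema_fds);
      write(sema_fds[1], "T", 1);  /* one token: one running thread */
   }

   /* "down" blocks until the token byte can be read... */
   static void sema_down(void) { char c; read(sema_fds[0], &c, 1); }

   /* ...and "up" puts it back. */
   static void sema_up(void)   { write(sema_fds[1], "T", 1); }

After a fork, the child re-runs sema_init so that it stops sharing
tokens with its parent.
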
--------------------------------------------------------------------

Re: New World signal handling
From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Julian Seward <jseward@acm.org>
Date: Mon Mar 14 09:03:51 2005

Well, the big-picture things to be clear about are:

   1. signal handlers are process-wide global state
   2. signal masks are per-thread (there's no notion of a process-wide
      signal mask)
   3. a signal can be targeted to either
      1. the whole process (any eligible thread is picked for
         delivery), or
      2. a specific thread

1 is why it is always a bug to temporarily reset a signal handler (say,
for SIGSEGV), because if any other thread happens to be sent one in that
window it will cause havoc (I think there's still one instance of this
in the symtab stuff).

2 is the meat of your questions; more below.

3 is responsible for some of the nitty detail in the signal stuff, so
it's worth bearing in mind to understand it all. (Note that even if a
signal is targeting the whole process, it's only ever delivered to one
particular thread; there's no such thing as a broadcast signal.)

While a thread is running core code or generated code, it has almost
all of its signals blocked (all but the fault signals: SEGV, BUS, ILL,
etc).

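Setting up that mask is straightforward; a sketch (the set of fault
signals left open is taken from the list above, plus SIGFPE, which I
am assuming belongs in the "etc"):

   #include <signal.h>

   /* Block everything except the signals a faulting instruction can
      raise, which must never be blocked (see below). */
   static void set_running_thread_mask(void)
   {
      sigset_t mask;
      sigfillset(&mask);
      sigdelset(&mask, SIGSEGV);
      sigdelset(&mask, SIGBUS);
      sigdelset(&mask, SIGILL);
      sigdelset(&mask, SIGFPE);
      sigprocmask(SIG_SETMASK, &mask, NULL);
   }
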
Every N basic blocks, each thread calls VG_(poll_signals) to see what
signals are pending for it. poll_signals grabs the next pending signal
which the client signal mask doesn't block, and sets it up for delivery;
it uses the sigtimedwait() syscall to fetch blocked pending signals
rather than have them delivered to a signal handler. This means that
we avoid the complexity of having signals delivered asynchronously via
the signal handlers; we can just poll for them synchronously when
they're easy to deal with.

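In outline (a standalone sketch, not the real poll_signals; the mask
handling here is simplified):

   #include <signal.h>
   #include <time.h>

   /* Fetch one pending signal the client hasn't blocked, without
      blocking ourselves: a zero timeout makes sigtimedwait() return
      immediately with -1/EAGAIN if nothing is pending. */
   int poll_one_signal(const sigset_t *client_unblocked, siginfo_t *info)
   {
      struct timespec zero = { 0, 0 };
      return sigtimedwait(client_unblocked, info, &zero);
   }
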
Fault signals, being caused by a specific instruction, are the exception
because they can't be held off; if they're blocked when an instruction
raises one, the kernel will just summarily kill the process. Therefore,
they need to be always unblocked, and the signal handler is called when
an instruction raises one of these exceptions. (It's also necessary to
call poll_signals after any syscall which may raise a signal, since
signal-raising syscalls are considered to be synchronous with respect to
their signal; i.e., calling kill(getpid(), SIGUSR1) will call the handler
for SIGUSR1 before kill is seen to complete.)

The one time when the thread's real signal mask actually matches the
client's requested signal mask is while running a blocking syscall. We
have to set things up to accept signals during a syscall so that we get
the right signal-interrupts-syscall semantics. The tricky part about
this is that there's no general atomic
set-signal-mask-and-block-in-syscall mechanism, so we need to fake it
with the stuff in VGA_(_client_syscall)/VGA_(interrupted_syscall).
These two basically form an explicit state machine, where the state
variable is the instruction pointer, which allows it to determine what
point the syscall got to when the async signal happens. By keeping the
window where signals are actually unblocked very narrow, the number of
possible states is pretty small.

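The shape of the window, as a C sketch (the real thing is hand-placed
code so the fixup can inspect the instruction pointer; the comments
mark the states it distinguishes):

   #include <signal.h>
   #include <unistd.h>
   #include <sys/syscall.h>

   /* Sketch: run one syscall with the client's mask in force, over
      as narrow a window as possible. */
   long syscall_with_client_mask(long sysno, long a1, long a2, long a3,
                                 const sigset_t *client_mask)
   {
      sigset_t saved;
      long res;
      sigprocmask(SIG_SETMASK, client_mask, &saved);
      /* <- signal lands here: syscall not yet started, restart it */
      res = syscall(sysno, a1, a2, a3);
      /* <- signal lands here: syscall complete, keep its result   */
      sigprocmask(SIG_SETMASK, &saved, NULL);
      /* <- mask restored: back to the normal polled regime        */
      return res;
   }
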
This is all quite nice because the kernel does almost all the work of
determining which thread should get a signal, what the correct action
for an interrupted syscall is, etc. Particularly nice is that we don't
need to worry about all the queuing semantics, and the per-signal
special cases (which is, roughly: signals 1-32 are not queued, except
when they are, and signals 33-64 are queued, except when they aren't).

BUT, there's another complexity: because the Unix signal mechanism has
been overloaded to deal with two separate kinds of events (asynchronous
signals raised by kill(), and synchronous faults raised by an
instruction), we can't block a signal for one form and not the other.
That is, because we have to leave SIGSEGV unblocked for faulting
instructions, it also leaves us open to getting an async SIGSEGV sent
with kill(pid, SIGSEGV).

To handle this case, there's a small per-thread signal queue (I'm
using tid 0's queue for "signals sent to the whole process" - a hack,
I'll admit). If an async SIGSEGV (etc) signal appears, then it is
pushed onto the appropriate queue.
VG_(poll_signals) also checks these queues for pending signals to decide
what signal to deliver next. These queues are only manipulated with
*all* signals blocked, so there's no risk of two concurrent async signal
handlers modifying the queues at once. Also, because the likelihood of
actually being sent an async SIGSEGV is pretty low, the queues are only
allocated on demand.

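A sketch of such a lazily-allocated queue (invented names and sizes;
not the real structures):

   #include <signal.h>
   #include <stdlib.h>

   #define MAX_THREADS  64   /* assumption for the sketch */
   #define QUEUE_MAX     8

   typedef struct {
      int       count;
      siginfo_t pending[QUEUE_MAX];
   } SigQueue;

   static SigQueue *sigqueue[MAX_THREADS];  /* all NULL until needed */

   /* Caller must have *all* signals blocked. */
   static void push_queued_signal(int tid, const siginfo_t *si)
   {
      if (sigqueue[tid] == NULL)            /* allocate on demand */
         sigqueue[tid] = calloc(1, sizeof(SigQueue));
      if (sigqueue[tid]->count < QUEUE_MAX)
         sigqueue[tid]->pending[sigqueue[tid]->count++] = *si;
   }
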
There are two mechanisms to prevent disaster if multiple threads get
signals concurrently. One is that a signal handler is set up to block a
set of signals while the signal is being delivered. Valgrind's handlers
block all signals, so there's no risk of a new signal being delivered to
the same thread until the old handler has finished.

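Setting that up is just a matter of filling sa_mask; a minimal sketch:

   #include <signal.h>

   /* Install a handler that runs with every signal blocked, so no
      second signal can preempt it in the same thread. */
   static void install_handler(int signo,
                               void (*fn)(int, siginfo_t *, void *))
   {
      struct sigaction sa;
      sa.sa_sigaction = fn;
      sa.sa_flags     = SA_SIGINFO;
      sigfillset(&sa.sa_mask);   /* block everything during delivery */
      sigaction(signo, &sa, NULL);
   }
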
The other is that if the thread which receives the signal is not running
(i.e., doesn't hold the run_sema, which implies it must be waiting for a
syscall to complete), then the signal handler will grab the run_sema
before making any global state changes. Since the only time we can get
an async signal asynchronously is during a blocking syscall, this should
be all the time. (And since synchronous signals are always the result of
running an instruction, we should already be holding run_sema.)

Valgrind will occasionally generate signals for itself. These are always
synchronous faults, resulting either from fetching an instruction or
from something an instruction did. The two mechanisms are the
synth_fault_* functions, which are used to signal a problem while
fetching an instruction, and getting generated code to call a helper
which contains a fault-raising instruction (used to deal with
illegal/unimplemented instructions and for instructions whose only job
is to raise exceptions).

That all explains how signals come in, but the second part is how they
get delivered.

The main function for this is VG_(deliver_signal). There are three cases:

   1. the process is ignoring the signal (SIG_IGN)
   2. the process is using the default handler (SIG_DFL)
   3. the process has a handler for the signal

In general, VG_(deliver_signal) shouldn't be called for ignored signals;
if it has been called, it assumes the ignore is being overridden (if an
instruction gets a SEGV etc, SIG_IGN is ignored and treated as SIG_DFL).

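Putting the cases together, a sketch of the dispatch (the helper names
are hypothetical stand-ins, not the real routines; the SIG_IGN branch
models the override just described):

   #include <signal.h>

   /* Hypothetical stand-ins. */
   void (*client_handler_for(int signo))(int);
   void default_action(int tid, const siginfo_t *info);
   void push_signal_frame(int tid, const siginfo_t *info);

   void deliver_signal_sketch(int tid, const siginfo_t *info)
   {
      void (*handler)(int) = client_handler_for(info->si_signo);
      if (handler == SIG_IGN)
         handler = SIG_DFL;            /* ignore is being overridden */
      if (handler == SIG_DFL)
         default_action(tid, info);    /* Terminate or Ignore        */
      else
         push_signal_frame(tid, info); /* run the client's handler   */
   }
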
VG_(deliver_signal) handles both the default-handler case and the
client-specified-handler case.

The default handler case is relatively easy: the signal's default action
is either Terminate, or Ignore. We can ignore Ignore.

Terminate always kills the entire process; there's no such thing as a
thread-specific signal death. Terminate comes in two forms: with
coredump, or without. vg_default_action() will write a core file, and
then tell all the threads to start terminating; it then longjmps back
to the current thread's scheduler loop. The scheduler loop will
terminate immediately, and the master_tid thread will wait for all the
others to exit before shutting down the process (this is the same
mechanism as exit_group).

Delivering a signal to a client-side handler modifies the thread state so
that there's a signal frame on the stack, and the instruction pointer is
pointing to the handler. The fiddly bit is that there are two
completely different signal frame formats: old and RT. While in theory
the exact shape of these frames on the stack is abstracted, there are
real programs which know exactly where various parts of the structures
are on the stack (most notably, g++'s exception throwing code), which is
why there have to be two separate pieces of code, one for each frame
format. Another tricky case is dealing with the client stack running
out/overflowing while setting up the signal frame.

Signal return is also interesting. There are two syscalls, sigreturn
and rt_sigreturn, which a signal handler will use to resume execution.
The client will call the right one for the frame it was passed, so the
core doesn't need to track that state. The tricky part is moving the
frame's register state back into the thread's state, particularly all
the FPU state reformatting gunk. Also, *sigreturn checks for new
pending signals after the old frame has been cleaned up, since there's a
requirement that all deliverable pending signals are delivered before
the mainline code makes progress. This means that a program could
live-lock on signals, but that's what would happen running natively...

Another thing to watch for: programs which unwind the stack (like gdb,
or exception throwers) recognize the existence of a signal frame by
looking at the code the return address points to: if it is one of the
two specific signal return sequences, it knows it's a signal frame.
That's why the signal handler return address must point to a very
specific set of instructions.

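For reference, on i386 Linux those two sequences are the classic
sigreturn trampolines, roughly as below (this is my recollection of
the kernel's sequences and the i386 syscall numbers, so treat the
exact instructions as an assumption):

   /* old-style frame: pop the signal number, then sigreturn */
   __asm__("popl %eax\n\t"
           "movl $119, %eax\n\t"   /* __NR_sigreturn    */
           "int  $0x80");

   /* RT frame: rt_sigreturn, no pop */
   __asm__("movl $173, %eax\n\t"   /* __NR_rt_sigreturn */
           "int  $0x80");
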
What else. Ah, the two internal signals.

SIGVGKILL is pretty straightforward: it's just used to dislodge a thread
from being blocked in a syscall, so that we can get the thread to
terminate in a timely fashion.

SIGVGCHLD is used by a thread to tell the master_tid that it has
exited. However, the only time the master_tid cares about this is when
it has already exited itself, and is waiting for everyone else to exit.
If the master_tid hasn't exited, then this signal is ignored. It isn't
enough to simply block it, because that would cause a pile of queued
SIGVGCHLDs to build up, eventually clogging the kernel's signal delivery
mechanism. If it's unblocked and ignored, it doesn't interrupt syscalls
and it doesn't accumulate.

I hope that helps clarify things. And explains why there's so much stuff
in there: it's tracking a very complex and arcane underlying set of
machinery.

J

--------------------------------------------------------------------

>I've been seeing references to 'master thread' around the place.
>What distinguishes the master thread from the rest? Where does
>the requirement to have a master thread come from?
>

It used to be tid 1, but I had to generalize it.

The master_tid isn't very special; its main job is at process shutdown.
It waits for all the other threads to exit, and then produces all the
final reports. Until it exits, it's just a normal thread, with no other
responsibilities.

The alternative to having a master thread would be to make whichever
thread exits last be responsible for emitting all the output. That
would work, but it would make the results a bit asynchronous (that is,
if the main thread exits and the others hang around for a while, anyone
waiting on the process would see it as having exited, but no results
would have been produced).

VG_(master_tid) is a variable to handle the case where a threaded program
forks. In the first process, the master_tid will be 1. If that program
creates a few threads, and then, say, thread 3 forks, the child process
will have a single thread in it. In the child, master_tid will be 3.
It was easier to make the master thread a variable than to try to work
out how to rename thread 3 to 1 after a fork.

J

--------------------------------------------------------------------

Re: Fwd: Documentation of kernel's signal routing ?
From: David Woodhouse <...>
To: Julian Seward <jseward@acm.org>

> Regarding sys_clone created threads. I have a vague idea that
> there is a notion of 'thread group'. I further understand that if
> one thread in a group calls sys_exit_group then all threads in that
> group exit. Whereas if a thread calls sys_exit then just that
> thread exits.
>
> I'm pretty hazy on this:

Hmm, so am I :)

> * Is the above correct?

Yes, I believe so.

> * How is thread-group membership defined/changed?

By specifying CLONE_THREAD in the flags to clone(), you remain part of
the same thread group as the parent. In a single-threaded process, the
thread group id (tgid) is the same as the pid.

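A minimal sketch of creating a task in the caller's thread group (the
flag set is roughly what NPTL passes; CLONE_THREAD requires
CLONE_SIGHAND, which in turn requires CLONE_VM):

   #define _GNU_SOURCE
   #include <sched.h>
   #include <stdlib.h>

   static int thread_main(void *arg) { return 0; }

   static int spawn_in_group(void)
   {
      const int STACK_SIZE = 64 * 1024;
      char *stack = malloc(STACK_SIZE);
      /* The stack grows down on x86, so pass its high end. */
      return clone(thread_main, stack + STACK_SIZE,
                   CLONE_VM | CLONE_FS | CLONE_FILES |
                   CLONE_SIGHAND | CLONE_THREAD,
                   NULL);
   }
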
Linux just has tasks, which sometimes happen to share VM -- and now with
NPTL we also share other stuff like signals, etc. The 'pid' in Linux is
what POSIX would call the 'thread id', and the 'tgid' in Linux is
equivalent to the POSIX 'pid'.

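That mapping is visible from user space; a small demo (gettid() needs
glibc 2.30+, otherwise use syscall(SYS_gettid)):

   #define _GNU_SOURCE
   #include <stdio.h>
   #include <unistd.h>

   int main(void)
   {
      /* Same number in a single-threaded process; in each extra NPTL
         thread, gettid() differs while getpid() stays the same. */
      printf("POSIX pid (kernel tgid): %d\n", (int)getpid());
      printf("POSIX tid (kernel pid):  %d\n", (int)gettid());
      return 0;
   }
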
> * Do you know offhand how LinuxThreads and NPTL use thread groups?

I believe that LT doesn't use the kernel's concept of thread groups at
all. LT predates the kernel's support for proper POSIX-like sharing of
anything much but memory, so it uses only the CLONE_VM (and possibly
CLONE_FILES) flags. I don't _think_ it uses CLONE_SIGHAND -- it does
most of its work by propagating signals manually between threads.

NPTL uses thread groups as generated by the CLONE_THREAD flag, which is
what invokes the POSIX-related thread semantics.

> Is it the case that each LinuxThreads thread is in its own
> group whereas all NPTL threads [in a process] are in a single
> group?

Yes, that's my understanding.

--
dwmw2