bianbu-linux-6.6

mirror of https://gitee.com/bianbu-linux/linux-6.6 synced 2025-07-24 01:54:03 -04:00

Author	SHA1	Message	Date
Ingo Molnar	b6d9842258	Merge branch 'sched/cleanups'; commit 'v2.6.29' into sched/core	2009-03-25 10:26:51 +01:00
Steven Rostedt	a2a16d6a31	function-graph: add option to calculate graph time or not graph time is the time that a function is executing another function. Thus if function A calls B, if graph-time is set, then the time for A includes B. This is the default behavior. But if graph-time is off, then the time spent executing B is subtracted from A. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-03-24 23:41:11 -04:00
Steven Rostedt	cafb168a1c	tracing: make the function profiler per cpu Impact: speed enhancement By making the function profiler record in per cpu data we not only get better readings, avoid races, we also do not have to take any locks. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-03-24 23:41:10 -04:00
Steven Rostedt	0706f1c48c	tracing: adding function timings to function profiler If the function graph trace is enabled, the function profiler will use it to take the timing of the functions. cat /debug/tracing/trace_stat/functions Function Hit Time -------- --- ---- mwait_idle 127 183028.4 us schedule 26 151997.7 us __schedule 31 151975.1 us sys_wait4 2 74080.53 us do_wait 2 74077.80 us sys_newlstat 138 39929.16 us do_path_lookup 179 39845.79 us vfs_lstat_fd 138 39761.97 us user_path_at 153 39469.58 us path_walk 179 39435.76 us __link_path_walk 189 39143.73 us [...] Note the times are skewed due to the function graph tracer not taking into account schedules. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-03-24 23:41:09 -04:00
Steven Rostedt	493762fc53	tracing: move function profiler data out of function struct Impact: reduce size of memory in function profiler The function profiler originally introduces its counters into the function records itself. There is 20 thousand different functions on a normal system, and that is adding 20 thousand counters for profiling event when not needed. A normal run of the profiler yields only a couple of thousand functions executed, depending on what is being profiled. This means we have around 18 thousand useless counters. This patch rectifies this by moving the data out of the function records used by dynamic ftrace. Data is preallocated to hold the functions when the profiling begins. Checks are made during profiling to see if more recorcds should be allocated, and they are allocated if it is safe to do so. This also removes the dependency from using dynamic ftrace, and also removes the overhead by having it enabled. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-03-24 23:41:06 -04:00
Steven Rostedt	bac429f037	tracing: add function profiler Impact: new profiling feature This patch adds a function profiler. In debugfs/tracing/ two new files are created. function_profile_enabled - to enable or disable profiling trace_stat/functions - the profiled functions. For example: echo 1 > /debugfs/tracing/function_profile_enabled ./hackbench 50 echo 0 > /debugfs/tracing/function_profile_enabled yields: cat /debugfs/tracing/trace_stat/functions Function Hit -------- --- _spin_lock 10106442 _spin_unlock 10097492 kfree 6013704 _spin_unlock_irqrestore 4423941 _spin_lock_irqsave 4406825 __phys_addr 4181686 __slab_free 4038222 dput 4030130 path_put 4023387 unroll_tree_refs 4019532 [...] The most hit functions are listed first. Functions that are not hit are not listed. This feature depends on and uses dynamic function tracing. When the function profiling is disabled, no overhead occurs. But it still takes up around 300KB to hold the data, thus it is not recomended to keep it enabled for systems low on memory. When a '1' is echoed into the function_profile_enabled file, the counters for is function is reset back to zero. Thus you can see what functions are hit most by different programs. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-03-24 23:40:00 -04:00
Steven Rostedt	425480081e	tracing: add handler to trace_stat Currently, if a trace_stat user wants a handle to some private data, the trace_stat infrastructure does not supply a way to do that. This patch passes the trace_stat structure to the start function of the trace_stat code. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-03-24 23:22:58 -04:00
Jason Baron	e9d376f0fa	dynamic debug: combine dprintk and dynamic printk This patch combines Greg Bank's dprintk() work with the existing dynamic printk patchset, we are now calling it 'dynamic debug'. The new feature of this patchset is a richer /debugfs control file interface, (an example output from my system is at the bottom), which allows fined grained control over the the debug output. The output can be controlled by function, file, module, format string, and line number. for example, enabled all debug messages in module 'nf_conntrack': echo -n 'module nf_conntrack +p' > /mnt/debugfs/dynamic_debug/control to disable them: echo -n 'module nf_conntrack -p' > /mnt/debugfs/dynamic_debug/control A further explanation can be found in the documentation patch. Signed-off-by: Greg Banks <gnb@sgi.com> Signed-off-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-03-24 16:38:26 -07:00
Luis Henriques	67aa0f767a	sched: remove unused fields from struct rq Impact: cleanup, new schedstat ABI Since they are used on in statistics and are always set to zero, the following fields from struct rq have been removed: yld_exp_empty, yld_act_empty and yld_both_empty. Both Sched Debug and SCHEDSTAT_VERSION versions has also been incremented since ABIs have been changed. The schedtop tool has been updated to properly handle new version of schedstat: http://rt.wiki.kernel.org/index.php/Schedtop_utility Signed-off-by: Luis Henriques <henrix@sapo.pt> Acked-by: Gregory Haskins <ghaskins@novell.com> Acked-by: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20090324221002.GA10061@hades.domain.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-24 23:16:51 +01:00
Lai Jiangshan	ee000b7f9f	tracing: use union for multi-usages field Impact: cleanup struct dyn_ftrace::ip has different usages in his lifecycle, we use union for it. And also for struct dyn_ftrace::flags. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <49C871BE.3080405@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-24 16:43:12 +01:00
Lai Jiangshan	cc59c9e8d0	ftrace: show virtual PID Impact: fix PID output under namespaces When current namespace is not the global namespace, pid read from set_ftrace_pid is no correct. # ~/newpid_namespace_run bash # echo $$ 1 # echo 1 > set_ftrace_pid # cat set_ftrace_pid 3756 Since we write virtual PID to set_ftrace_pid, we need get virtual PID when we read it. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <49C84D65.9050606@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-24 16:42:49 +01:00
Steven Rostedt	be6f164a02	function-graph: add option for include sleep times Impact: give user a choice to show times spent while sleeping The user may want to see the time a function spent sleeping. This patch adds the trace option "sleep-time" to allow that. The "sleep-time" option is default on. echo sleep-time > /debug/tracing/trace_options produces: ------------------------------------------ 2) avahi-d-3428 => <idle>-0 ------------------------------------------ 2) \| finish_task_switch() { 2) 0.621 us \| _spin_unlock_irq(); 2) 2.202 us \| } 2) ! 1002.197 us \| } 2) ! 1003.521 us \| } where as, echo nosleep-time > /debug/tracing/trace_options produces: 0) <idle>-0 => yum-upd-3416 ------------------------------------------ 0) \| finish_task_switch() { 0) 0.643 us \| _spin_unlock_irq(); 0) 2.342 us \| } 0) + 41.302 us \| } 0) + 42.453 us \| } Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-03-24 11:06:24 -04:00
Steven Rostedt	8aef2d2856	function-graph: ignore times across schedule Impact: more accurate timings The current method of function graph tracing does not take into account the time spent when a task is not running. This shows functions that call schedule have increased costs: 3) + 18.664 us \| } ------------------------------------------ 3) <idle>-0 => kblockd-123 ------------------------------------------ 3) \| finish_task_switch() { 3) 1.441 us \| _spin_unlock_irq(); 3) 3.966 us \| } 3) ! 2959.433 us \| } 3) ! 2961.465 us \| } This patch uses the tracepoint in the scheduling context switch to account for time that has elapsed while a task is scheduled out. Now we see: ------------------------------------------ 3) <idle>-0 => edac-po-1067 ------------------------------------------ 3) \| finish_task_switch() { 3) 0.685 us \| _spin_unlock_irq(); 3) 2.331 us \| } 3) + 41.439 us \| } 3) + 42.663 us \| } Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-03-24 09:33:30 -04:00
Steven Rostedt	05ce5818ad	function-graph: prevent more than one tracer registering Impact: prevent crash due to multiple function graph tracers The function graph tracer can currently only handle a single tracer being registered. If another tracer registers with the function graph tracer it can crash the system. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-03-24 09:32:52 -04:00
Steven Rostedt	5d1a03dc54	function-graph: moved the timestamp from arch to generic code This patch move the timestamp from happening in the arch specific code into the general code. This allows for better control by the tracer to time manipulation. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-03-24 09:31:34 -04:00
Steven Rostedt	098335215a	tracing: fix memory leak in trace_stat If the function profiler does not have any items recorded and one were to cat the function stat file, the kernel would take a BUG with a NULL pointer dereference. Looking further into this, I found that returning NULL from stat_start did not stop the stat logic, and would later call stat_next. This breaks from the way seq_file works, so I looked into fixing the stat code. This is where I noticed that the last next_entry is never freed. It is allocated, and if the stat_next returns NULL, the code breaks out of the loop, unlocks the mutex and exits. We never link the next_entry nor do we free it. Thus it is a real memory leak. This patch rearranges the code a bit to not only fix the memory leak, but also to act more like seq_file where nothing is printed if there is nothing to print. That is, stat_start returns NULL. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-03-24 09:07:35 -04:00
Li Zefan	093419971e	blktrace: print human-readable act_mask Impact: new feature, allow symbolic values in /debug/tracing/act_mask Print stringified act_mask instead of hex value: # cat act_mask read,write,barrier,sync,queue,requeue,issue,complete,fs,pc,ahead,meta, discard,drv_data # echo "meta,write" > act_mask # cat act_mask write,meta Also: - make act_mask accept "ahead", "meta", "discard" and "drv_data" - use strsep() instead of strchr() to parse user input - return -EINVAL if a token is not found in the mask map - fix a bug that 'value' is unsigned, so it can < 0 - propagate error value of blk_trace_mask2str() to userspace, but not always return -ENXIO. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Jens Axboe <jens.axboe@oracle.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <49C8AB42.1000802@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-24 13:09:00 +01:00
Li Zefan	e0dc81bec0	blktrace: fix t_error() Impact: fix error flag output t_error() should return t->error but not t->sector. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Jens Axboe <jens.axboe@oracle.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <49C8945F.5020802@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-24 13:09:00 +01:00
Li Zefan	65796348e0	blktrace: fix wrong calculation of RWBS Impact: fix the output of IO type category characters Trace categories are the upper 16 bits, not the lower 16 bits. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Jens Axboe <jens.axboe@oracle.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <49C89432.8010805@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-24 13:08:59 +01:00
Li Zefan	e4955c9986	blktrace: mark ddir_act[] const Impact: cleanup ddir_act and what2act always stay immutable. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Jens Axboe <jens.axboe@oracle.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <49C89415.5080503@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-24 13:08:59 +01:00
Thomas Gleixner	f48fe81e5b	genirq: threaded irq handlers review fixups Delta patch to address the review comments. - Implement warning when IRQ_WAKE_THREAD is requested and no thread handler installed - coding style fixes Pointed-out-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-03-24 12:15:23 +01:00
Arjan van de Ven	935bd5b971	genirq: add support for threaded interrupts to devres Some devices use devres_request_irq() for to install their interrupt handler. Add support for threaded interrupts to devres as well. [tglx - simplified and adapted to latest threadirq version] Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-03-24 12:15:23 +01:00
Thomas Gleixner	3aa551c9b4	genirq: add threaded interrupt handler support Add support for threaded interrupt handlers: A device driver can request that its main interrupt handler runs in a thread. To achive this the device driver requests the interrupt with request_threaded_irq() and provides additionally to the handler a thread function. The handler function is called in hard interrupt context and needs to check whether the interrupt originated from the device. If the interrupt originated from the device then the handler can either return IRQ_HANDLED or IRQ_WAKE_THREAD. IRQ_HANDLED is returned when no further action is required. IRQ_WAKE_THREAD causes the genirq code to invoke the threaded (main) handler. When IRQ_WAKE_THREAD is returned handler must have disabled the interrupt on the device level. This is mandatory for shared interrupt handlers, but we need to do it as well for obscure x86 hardware where disabling an interrupt on the IO_APIC level redirects the interrupt to the legacy PIC interrupt lines. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@elte.hu>	2009-03-24 12:15:23 +01:00
Tom Zanussi	9f58a159d0	tracing/filters: disallow integer values for string filters and vice versa Impact: fix filter use boundary condition / crash Make sure filters for string fields don't use integer values and vice versa. Getting it wrong can crash the system or produce bogus results. Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: =?ISO-8859-1?Q?Fr=E9d=E9ric?= Weisbecker <fweisbec@gmail.com> LKML-Reference: <1237878882.8339.61.camel@charm-linux> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-24 08:26:52 +01:00
Tom Zanussi	4bda2d517b	tracing/filters: use trace_seq_printf() to print filters Impact: cleanup Instead of just using the trace_seq buffer to print the filters, use trace_seq_printf() as it was intended to be used. Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Cc: =?ISO-8859-1?Q?Fr=E9d=E9ric?= Weisbecker <fweisbec@gmail.com> LKML-Reference: <1237878871.8339.59.camel@charm-linux> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-24 08:26:52 +01:00
Tom Zanussi	09f1f245c7	tracing/filters: free pred when clearing filters Impact: fix (small) per trace filter modification memory leak Free the current pred when clearing the filters via the filter files. Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: =?ISO-8859-1?Q?Fr=E9d=E9ric?= Weisbecker <fweisbec@gmail.com> LKML-Reference: <1237878851.8339.58.camel@charm-linux> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-24 08:26:51 +01:00
Tom Zanussi	1fc2d5c119	tracing/filters: use list_for_each_entry Impact: cleanup No need to use the safe version here, so use list_for_each_entry instead of list_for_each_entry_safe in find_event_field(). Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: =?ISO-8859-1?Q?Fr=E9d=E9ric?= Weisbecker <fweisbec@gmail.com> LKML-Reference: <1237878841.8339.57.camel@charm-linux> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-24 08:26:51 +01:00
Benjamin Herrenschmidt	9e41d9597e	Merge commit 'origin/master' into next	2009-03-24 13:38:30 +11:00
James Morris	703a3cd728	Merge branch 'master' into next	2009-03-24 10:52:46 +11:00
Frederic Weisbecker	1618536961	tracing/function-graph-tracer: fix functions call traces imbalance Impact: fix traces output Sometimes one can observe an imbalance in the traces between function calls and function return traces: func1() { } } The curly brace inside func1() is the return of another function nested inside func1. The return trace have been inserted in the buffer but not the entry. We are storing a return address on the function traces stack while we haven't inserted its entry on the buffer, hence the imbalance on the traces. This is because the tracers doesn't check all failures that can happen on buffer insertion. This patch reports the tracing recursion failures and the ring buffer failures. In such cases, we now restore the original return address for the function, giving up its return trace. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <1237843021-11695-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-23 23:25:32 +01:00
Anton Vorontsov	45b9560895	tracing: Fix TRACING_SUPPORT dependency for PPC32 commit `40ada30f96` ("tracing: clean up menu"), despite the "clean up" in its purpose, introduced a behavioural change for Kconfig symbols: we no longer able to select tracing support on PPC32 (because IRQFLAGS_SUPPORT isn't yet implemented). The IRQFLAGS_SUPPORT is not mandatory for most tracers, tracing core has a special case for platforms w/o irqflags (which, by the way, has become useless as of the commit above). Though according to Ingo Molnar, there was periodic build failures on weird, unmaintained architectures that had no irqflags-tracing support and hence didn't know the raw_irqs_save/restore primitives. Thus we'd better not enable irqflags-less tracing for all architectures. This patch restores the old behaviour for PPC32, and thus brings the tracing back. Other architectures can either add themselves to the exception list or (better) implement TRACE_IRQFLAGS_SUPPORT. Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Acked-b: Steven Rostedt <rostedt@goodmis.org> Cc: linuxppc-dev@ozlabs.org LKML-Reference: <20090323220724.GA9851@oksana.dev.rtsoft.ru> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-23 23:23:03 +01:00
Thomas Gleixner	80c5520811	Merge branch 'cpus4096' into irq/threaded Conflicts: arch/parisc/kernel/irq.c kernel/irq/handle.c Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-03-23 21:20:20 +01:00
Oleg Nesterov	37bebc70d7	posix timers: fix RLIMIT_CPU && fork() See http://bugzilla.kernel.org/show_bug.cgi?id=12911 copy_signal() copies signal->rlim, but RLIMIT_CPU is "lost". Because posix_cpu_timers_init_group() sets cputime_expires.prof_exp = 0 and thus fastpath_timer_check() returns false unless we have other cpu timers. This is the minimal fix for 2.6.29 (tested) and 2.6.28. The patch is not optimal, we need further cleanups here. With this patch update_rlimit_cpu() is not really needed, but I don't think it should be removed. The proper fix (I think) is: - set_process_cpu_timer() should just start the cputimer->running logic (it does), no need to change cputime_expires.xxx_exp - posix_cpu_timers_init_group() should set ->running when needed - fastpath_timer_check() can check ->running instead of task_cputime_zero(signal->cputime_expires) Reported-by: Peter Lojkin <ia6432@inbox.ru> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Roland McGrath <roland@redhat.com> Cc: <stable@kernel.org> [for 2.6.29.x] LKML-Reference: <20090323193411.GA17514@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-23 20:43:35 +01:00
Miklos Szeredi	53da1d9456	fix ptrace slowness This patch fixes bug #12208: Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=12208 Subject : uml is very slow on 2.6.28 host This turned out to be not a scheduler regression, but an already existing problem in ptrace being triggered by subtle scheduler changes. The problem is this: - task A is ptracing task B - task B stops on a trace event - task A is woken up and preempts task B - task A calls ptrace on task B, which does ptrace_check_attach() - this calls wait_task_inactive(), which sees that task B is still on the runq - task A goes to sleep for a jiffy - ... Since UML does lots of the above sequences, those jiffies quickly add up to make it slow as hell. This patch solves this by not rescheduling in read_unlock() after ptrace_stop() has woken up the tracer. Thanks to Oleg Nesterov and Ingo Molnar for the feedback. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-03-23 09:22:31 -07:00
Ingo Molnar	efd247fa34	Merge branches 'sched/debug' and 'linus' into sched/core	2009-03-23 16:53:20 +01:00
Frederic Weisbecker	3e1f60b80c	tracing/ftrace: check if debugfs is registered before creating files Impact: fix a crash with ftrace={nop,boot} parameter If the nop or initcall tracers are launched as boot tracers, they will attempt to create their option directory and files. But these tracers are registered very early and then assigned as "boot tracers" very early if asked to. Since they do this before debugfs has been registered (core initcall), a crash is triggered. Another early tracers could also come later. So we fix it by checking if debugfs is initialized before creating the root tracing directory. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <1237759847-21025-3-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-23 16:25:47 +01:00
Ingo Molnar	b3e3b302cf	Merge branches 'irq/sparseirq' and 'linus' into irq/core	2009-03-23 10:07:49 +01:00
Tom Zanussi	c4cff064be	tracing/filters: clean up filter_add_subsystem_pred() Impact: cleanup, memory leak fix This patch cleans up filter_add_subsystem_pred(): - searches for the field before creating a copy of the pred - fixes memory leak in the case a predicate isn't applied - if -ENOMEM, makes sure there's no longer a reference to the pred so the caller can free the half-finished filter - changes the confusing i == MAX_FILTER_PRED - 1 comparison previously remarked upon This affects only per-subsystem event filtering. Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: =?ISO-8859-1?Q?Fr=E9d=E9ric?= Weisbecker <fweisbec@gmail.com> LKML-Reference: <1237796808.7527.40.camel@charm-linux> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-23 09:30:37 +01:00
Tom Zanussi	ee6cdabc82	tracing/filters: fix bug in copy_pred() Impact: fix potential crash on subsystem filter expression freeing When making a copy of the predicate, pred->field_name needs to be duplicated in the copy as well, otherwise bad things can happen due to later multiple frees of the same string. This affects only per-subsystem event filtering. Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: =?ISO-8859-1?Q?Fr=E9d=E9ric?= Weisbecker <fweisbec@gmail.com> LKML-Reference: <1237796802.7527.39.camel@charm-linux> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-23 09:30:36 +01:00
Tom Zanussi	75c8b41752	tracing/filters: use list_for_each_entry_safe Impact: cleanup Use list_for_each_entry_safe instead of list_for_each_entry in find_event_field(). Reported-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <1237796788.7527.35.camel@charm-linux> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-23 09:28:07 +01:00
Frederic Weisbecker	b118415bfa	tracing/events: don't discard an event after commit When we want to filter an event, the filter test is done after the event is commited to the ring-buffer to be discarded later if needed. But a reader could be reading this event while we are trying to discard it. Other kind of racy events can even happen because the event is commited and can be read and/or consumed. What we want is to discard the event before committing it. Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <1237763919-21505-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-23 09:22:15 +01:00
Frederic Weisbecker	7e6ea92df3	tracing/ftrace: make nop-tracer use polling wait for events on pipe Impact: display events when they arrive Now that the events don't use wake_up() anymore, we need the nop tracer to poll waiting for events on the pipe. Especially because nop is useful to look at orphan traces types (traces types that don't rely on specific tracers) because it doesn't produce traces itself. And unlike other tracers that trigger specific traces periodically, nop triggers no traces by itself that can wake him. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <1237759847-21025-5-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-23 09:22:15 +01:00
Frederic Weisbecker	07edf71213	tracing/events: don't use wake up for events Impact: fix hard-lockup with sched switch events Some ftrace events, such as sched wakeup, can be traced while the runqueue lock is hold. Since they are using trace_current_buffer_unlock_commit(), they call wake_up() which can try to grab the runqueue lock too, resulting in a deadlock. Now for all event, we call a new helper: trace_nowake_buffer_unlock_commit() which do pretty the same than trace_current_buffer_unlock_commit() except than it doesn't call trace_wake_up(). Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <1237759847-21025-4-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-23 09:22:14 +01:00
Frederic Weisbecker	9bd7d099ab	tracing/events: make the filter files writable We need the filter files to be writable, the current filter file permissions are only set readable. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <1237759847-21025-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-23 09:22:14 +01:00
Ingo Molnar	fe9f57f250	tracing: add run-time field descriptions for event filtering, kfree fix Impact: fix potential kfree of random data in (rare) failure path Zero-initialize the field structure. Reported-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <1237710639.7703.46.camel@charm-linux> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-22 18:43:25 +01:00
Tom Zanussi	cfb180f3e7	tracing: add per-subsystem filtering This patch adds per-subsystem filtering to the event tracing subsystem. It adds a 'filter' debugfs file to each subsystem directory. This file can be written to to set filters; reading from it will display the current set of filters set for that subsystem. Basically what it does is propagate the filter down to each event contained in the subsystem. If a particular event doesn't have a field with the name specified in the filter, it simply doesn't get set for that event. You can verify whether or not the filter was set for a particular event by looking at the filter file for that event. As with per-event filters, compound expressions are supported, echoing '0' to the subsystem's filter file clears all filters in the subsystem, etc. Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1237710677.7703.49.camel@charm-linux> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-22 18:38:47 +01:00
Tom Zanussi	7ce7e42499	tracing: add per-event filtering This patch adds per-event filtering to the event tracing subsystem. It adds a 'filter' debugfs file to each event directory. This file can be written to to set filters; reading from it will display the current set of filters set for that event. Basically, any field listed in the 'format' file for an event can be filtered on (including strings, but not yet other array types) using either matching ('==') or non-matching ('!=') 'predicates'. A 'predicate' can be either a single expression: # echo pid != 0 > filter # cat filter pid != 0 or a compound expression of up to 8 sub-expressions combined using '&&' or '\|\|': # echo comm == Xorg > filter # echo "&& sig != 29" > filter # cat filter comm == Xorg && sig != 29 Only events having field values matching an expression will be available in the trace output; non-matching events are discarded. Note that a compound expression is built up by echoing each sub-expression separately - it's not the most efficient way to do things, but it keeps the parser simple and assumes that compound expressions will be relatively uncommon. In any case, a subsequent patch introducing a way to set filters for entire subsystems should mitigate any need to do this for lots of events. Setting a filter without an '&&' or '\|\|' clears the previous filter completely and sets the filter to the new expression: # cat filter comm == Xorg && sig != 29 # echo comm != Xorg # cat filter comm != Xorg To clear a filter, echo 0 to the filter file: # echo 0 > filter # cat filter none The limit of 8 predicates for a compound expression is arbitrary - for efficiency, it's implemented as an array of pointers to predicates, and 8 seemed more than enough for any filter... Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1237710665.7703.48.camel@charm-linux> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-22 18:38:46 +01:00
Tom Zanussi	2d622719f1	tracing: add ring_buffer_event_discard() to ring buffer This patch overloads RINGBUF_TYPE_PADDING to provide a way to discard events from the ring buffer, for the event-filtering mechanism introduced in a subsequent patch. I did the initial version but thanks to Steven Rostedt for adding the parts that actually made it work. ;-) Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-22 18:38:25 +01:00
Dmitri Vorobiev	b8b9426533	tracing: fix four sparse warnings Impact: cleanup. This patch fixes the following sparse warnings: kernel/trace/trace.c:385:9: warning: symbol 'trace_seq_to_buffer' was not declared. Should it be static? kernel/trace/trace_clock.c:29:13: warning: symbol 'trace_clock_local' was not declared. Should it be static? kernel/trace/trace_clock.c:54:13: warning: symbol 'trace_clock' was not declared. Should it be static? kernel/trace/trace_clock.c:74:13: warning: symbol 'trace_clock_global' was not declared. Should it be static? Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@movial.com> LKML-Reference: <1237741871-5827-4-git-send-email-dmitri.vorobiev@movial.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-22 18:16:54 +01:00
Dmitri Vorobiev	f80d2d7725	tracing, Text Edit Lock: Fix one sparse warning in kernel/extable.c Impact: cleanup. The global mutex text_mutex if declared in linux/memory.h, so this file needs to be included into kernel/extable.c, where the same mutex is defined. This fixes the following sparse warning: kernel/extable.c:32:1: warning: symbol 'text_mutex' was not declared. Should it be static? Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@movial.com> LKML-Reference: <1237741871-5827-3-git-send-email-dmitri.vorobiev@movial.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-22 18:16:20 +01:00

... 34 35 36 37 38 ...

8411 commits