If a binary operation is detected while parsing an expression string,
the operand strings are deduced by splitting the experssion string at
the position of the detected binary operator. Both operand strings are
sub-strings (can be empty string) of the expression string but will
never be NULL.
Currently a NULL check is used for missing operands, fix this by
checking for empty strings instead.
Link: https://lkml.kernel.org/r/20211112191324.1302505-1-kaleshsingh@google.com
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Fixes: 9710b2f341 ("tracing: Fix operator precedence for hist triggers expression")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Do not copy the fixed-size char array field of the events over
the field size. The histogram treats char array as a string and
there are 2 types of char array in the event, fixed-size and
dynamic string. The dynamic string (__data_loc) field must be
null terminated, but the fixed-size char array field may not
be null terminated (not a string, but just a data).
In that case, histogram can copy the data after the field.
This uses the original field size for fixed-size char array
field to restrict the histogram not to access over the original
field size.
Link: https://lkml.kernel.org/r/163673292822.195747.3696966210526410250.stgit@devnote2
Fixes: 02205a6752 (tracing: Add support for 'field variables')
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
The resetting of the entire ring buffer use to simply go through and reset
each individual CPU buffer that had its own protection and synchronization.
But this was very slow, due to performing a synchronization for each CPU.
The code was reshuffled to do one disabling of all CPU buffers, followed
by a single RCU synchronization, and then the resetting of each of the CPU
buffers. But unfortunately, the mutex that prevented multiple occurrences
of resetting the buffer was not moved to the upper function, and there is
nothing to protect from it.
Take the ring buffer mutex around the global reset.
Cc: stable@vger.kernel.org
Fixes: b23d7a5f4a ("ring-buffer: speed up buffer resets by avoiding synchronize_rcu for each CPU")
Reported-by: "Tzvetomir Stoyanov (VMware)" <tz.stoyanov@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
There's compilation fail reported kernel test robot for W=1 build:
>> samples/ftrace/ftrace-direct-multi.c:8:6: warning: no previous
prototype for function 'my_direct_func' [-Wmissing-prototypes]
void my_direct_func(unsigned long ip)
The inlined assembly is used outside function, so we can't make
my_direct_func static and pass it as asm input argument.
However my_tramp is already extern so I think there's no problem
keeping my_direct_func extern as well and just add its prototype.
Link: https://lkml.kernel.org/r/20211101153907.377668-1-jolsa@kernel.org
Reported-by: kernel test robot <lkp@intel.com>
Fixes: 5fae941b9a ("ftrace/samples: Add multi direct interface test module")
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
If the divisor is a constant use specific division functions to
avoid extra branches when the trigger is hit.
If the divisor constant but not a power of 2, the division can be
replaced with a multiplication and shift in the following case:
Let X = dividend and Y = divisor.
Choose Z = some power of 2. If Y <= Z, then:
X / Y = (X * (Z / Y)) / Z
(Z / Y) is a constant (mult) which is calculated at parse time, so:
X / Y = (X * mult) / Z
The division by Z can be replaced by a shift since Z is a power of 2:
X / Y = (X * mult) >> shift
As long, as X < Z the results will not be off by more than 1.
Link: https://lkml.kernel.org/r/20211029232410.3494196-1-kaleshsingh@google.com
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
The do while loop continues while ret is zero, but ret is never
initialized. The check for ret in the loop at the while should always be
initialized, but if an empty string were to be passed in, q would be NULL
and p would be '\0', and it would break out of the loop without ever
setting ret.
Set ret to zero, and then xbc_verify_tree() would be called and catch that
it is an empty tree and report the proper error.
Link: https://lkml.kernel.org/r/20211027105753.6ab9da5f@gandalf.local.home
Fixes: bdac5c2b24 ("bootconfig: Allocate xbc_data inside xbc_init()")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
The division is a slow operation. If the divisor is a power of 2, use a
shift instead.
Results were obtained using Android's version of perf (simpleperf[1]) as
described below:
1. hist_field_div() is modified to call 2 test functions:
test_hist_field_div_[not]_optimized(); passing them the
same args. Use noinline and volatile to ensure these are
not optimized out by the compiler.
2. Create a hist event trigger that uses division:
events/kmem/rss_stat$ echo 'hist:keys=common_pid:x=size/<divisor>'
>> trigger
events/kmem/rss_stat$ echo 'hist:keys=common_pid:vals=$x'
>> trigger
3. Run Android's lmkd_test[2] to generate rss_stat events, and
record CPU samples with Android's simpleperf:
simpleperf record -a --exclude-perf --post-unwind=yes -m 16384 -g
-f 2000 -o perf.data
== Results ==
Divisor is a power of 2 (divisor == 32):
test_hist_field_div_not_optimized | 8,717,091 cpu-cycles
test_hist_field_div_optimized | 1,643,137 cpu-cycles
If the divisor is a power of 2, the optimized version is ~5.3x faster.
Divisor is not a power of 2 (divisor == 33):
test_hist_field_div_not_optimized | 4,444,324 cpu-cycles
test_hist_field_div_optimized | 5,497,958 cpu-cycles
If the divisor is not a power of 2, as expected, the optimized version is
slightly slower (~24% slower).
[1] https://android.googlesource.com/platform/system/extras/+/master/simpleperf/doc/README.md
[2] https://cs.android.com/android/platform/superproject/+/master:system/memory/lmkd/tests/lmkd_test.cpp
Link: https://lkml.kernel.org/r/20211025200852.3002369-7-kaleshsingh@google.com
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
If both operands of a hist trigger expression are constants, convert the
expression to a constant. This optimization avoids having to perform the
same calculation multiple times and also saves on memory since the
merged constants are represented by a single struct hist_field instead
or multiple.
Link: https://lkml.kernel.org/r/20211025200852.3002369-6-kaleshsingh@google.com
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
The '-' in .sym-offset can confuse the hist trigger arithmetic
expression parsing. Simplify the handling of this by replacing the
'sym-offset' with 'symXoffset'. This allows us to correctly evaluate
expressions where the user may have inadvertently added a .sym-offset
modifier to one of the operands in an expression, instead of bailing
out. In this case the .sym-offset has no effect on the evaluation of the
expression. The only valid use of the .sym-offset is as a hist key
modifier.
Link: https://lkml.kernel.org/r/20211025200852.3002369-5-kaleshsingh@google.com
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
The current histogram expression evaluation logic evaluates the
expression from right to left. This can lead to incorrect results
if the operations are not associative (as is the case for subtraction
and, the now added, division operators).
e.g. 16-8-4-2 should be 2 not 10 --> 16-8-4-2 = ((16-8)-4)-2
64/8/4/2 should be 1 not 16 --> 64/8/4/2 = ((64/8)/4)/2
Division and multiplication are currently limited to single operation
expression due to operator precedence support not yet implemented.
Rework the expression parsing to support the correct evaluation of
expressions containing operators of different precedences; and fix
the associativity error by evaluating expressions with operators of
the same precedence from left to right.
Examples:
(1) echo 'hist:keys=common_pid:a=8,b=4,c=2,d=1,w=$a-$b-$c-$d' \
>> event/trigger
(2) echo 'hist:keys=common_pid:x=$a/$b/3/2' >> event/trigger
(3) echo 'hist:keys=common_pid:y=$a+10/$c*1024' >> event/trigger
(4) echo 'hist:keys=common_pid:z=$a/$b+$c*$d' >> event/trigger
Link: https://lkml.kernel.org/r/20211025200852.3002369-4-kaleshsingh@google.com
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Adds basic support for division and multiplication operations for
hist trigger variable expressions.
For simplicity this patch only supports, division and multiplication
for a single operation expression (e.g. x=$a/$b), as currently
expressions are always evaluated right to left. This can lead to some
incorrect results:
e.g. echo 'hist:keys=common_pid:x=8-4-2' >> event/trigger
8-4-2 should evaluate to 2 i.e. (8-4)-2
but currently x evaluate to 6 i.e. 8-(4-2)
Multiplication and division in sub-expressions will work correctly, once
correct operator precedence support is added (See next patch in this
series).
For the undefined case of division by 0, the histogram expression
evaluates to (u64)(-1). Since this cannot be detected when the
expression is created, it is the responsibility of the user to be
aware and account for this possibility.
Examples:
echo 'hist:keys=common_pid:a=8,b=4,x=$a/$b' \
>> event/trigger
echo 'hist:keys=common_pid:y=5*$b' \
>> event/trigger
Link: https://lkml.kernel.org/r/20211025200852.3002369-3-kaleshsingh@google.com
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Currently hist trigger expressions don't support the use of numeric
literals:
e.g. echo 'hist:keys=common_pid:x=$y-1234'
--> is not valid expression syntax
Having the ability to use numeric constants in hist triggers supports
a wider range of expressions for creating variables.
Add support for creating trace event histogram variables from numeric
literals.
e.g. echo 'hist:keys=common_pid:x=1234,y=size-1024' >> event/trigger
A negative numeric constant is created, using unary minus operator
(parentheses are required).
e.g. echo 'hist:keys=common_pid:z=-(2)' >> event/trigger
Constants can be used with division/multiplication (added in the
next patch in this series) to implement granularity filters for frequent
trace events. For instance we can limit emitting the rss_stat
trace event to when there is a 512KB cross over in the rss size:
# Create a synthetic event to monitor instead of the high frequency
# rss_stat event
echo 'rss_stat_throttled unsigned int mm_id; unsigned int curr;
int member; long size' >> tracing/synthetic_events
# Create a hist trigger that emits the synthetic rss_stat_throttled
# event only when the rss size crosses a 512KB boundary.
echo 'hist:keys=keys=mm_id,member:bucket=size/0x80000:onchange($bucket)
.rss_stat_throttled(mm_id,curr,member,size)'
>> events/kmem/rss_stat/trigger
A use case for using constants with addition/subtraction is not yet
known, but for completeness the use of constants are supported for all
operators.
Link: https://lkml.kernel.org/r/20211025200852.3002369-2-kaleshsingh@google.com
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Stop tracing while reading the trace file by default, to prevent
the test results while checking it and to avoid taking a long time
to check the result.
If there is any testcase which wants to test the tracing while reading
the trace file, please override this setting inside the test case.
This also recovers the pause-on-trace when clean it up.
Link: https://lkml.kernel.org/r/163529053143.690749.15365238954175942026.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Use the actual return value instead of always -1 if register_kretprobe()
failed.
E.g. without this patch:
# insmod samples/kprobes/kretprobe_example.ko func=no_such_func
insmod: ERROR: could not insert module samples/kprobes/kretprobe_example.ko: Operation not permitted
With this patch:
# insmod samples/kprobes/kretprobe_example.ko func=no_such_func
insmod: ERROR: could not insert module samples/kprobes/kretprobe_example.ko: Unknown symbol in module
Link: https://lkml.kernel.org/r/1635213091-24387-2-git-send-email-yangtiezhu@loongson.cn
Fixes: 804defea1c ("Kprobes: move kprobe examples to samples/")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Add a test case for stacktrace from kretprobe handler and
nested kretprobe handlers.
This test checks both of stack trace inside kretprobe handler
and stack trace from pt_regs. Those stack trace must include
actual function return address instead of kretprobe trampoline.
The nested kretprobe stacktrace test checks whether the unwinder
can correctly unwind the call frame on the stack which has been
modified by the kretprobe.
Since the stacktrace on kretprobe is correctly fixed only on x86,
this introduces a meta kconfig ARCH_CORRECT_STACKTRACE_ON_KRETPROBE
which tells user that the stacktrace on kretprobe is correct or not.
The test results will be shown like below;
TAP version 14
1..1
# Subtest: kprobes_test
1..6
ok 1 - test_kprobe
ok 2 - test_kprobes
ok 3 - test_kretprobe
ok 4 - test_kretprobes
ok 5 - test_stacktrace_on_kretprobe
ok 6 - test_stacktrace_on_nested_kretprobe
# kprobes_test: pass:6 fail:0 skip:0 total:6
# Totals: pass:6 fail:0 skip:0 total:6
ok 1 - kprobes_test
Link: https://lkml.kernel.org/r/163516211244.604541.18350507860972214415.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Using the linker script to fix an issue where some archs call the
function tracer with just the ip (instruction pointer) and pip (parent
instruction pointer) where as more up to date archs also pass in the
associated ftrace_ops and the ftrace_regs pointer, the generic code
will be called either with two parameters or four. To avoid any C
undefined behavior of calling two parameters to four or four to two
parameter function, two functions are created, where a preprocessor
macro uses the one that matches the architecture. As the function
pointers for them may be different, a typecast is used. But this
triggers issues with newer compilers that will fail due to -Werror.
A linker trick is now used to map the generic function to the function
that is used (note the generic function is only used to set the default
function callback). The linker trick defines ftrace_ops_list_func (the
generic function) to arch_ftrace_ops_list_func (the arch defined one).
Link: https://lore.kernel.org/all/20200617165616.52241bde@oasis.local.home/
But this fails sh arch because their linker script is included in their
compressed image that does not define arch_ftrace_ops_list_func at all
sh4-linux-ld:arch/sh/boot/compressed/../../kernel/vmlinux.lds:32: undefined symbol `arch_ftrace_ops_list_func' referenced in expression
Included a stub by that name in the misc.c to allow the code to
compile and link, even though it's not used.
This is similar to what was done for ftrace_stub:
b83b43ffc6 ("fgraph: Fix function type mismatches of
ftrace_graph_return using ftrace_stub")
Link: https://lkml.kernel.org/r/20211021221627.5d7270de@rorschach.local.home
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: linux-sh@vger.kernel.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
The sparse tool complains as follows:
kernel/trace/trace_hwlat.c:82:27: warning: symbol 'hwlat_single_cpu_data' was not declared. Should it be static?
kernel/trace/trace_hwlat.c:83:1: warning: symbol '__pcpu_scope_hwlat_per_cpu_data' was not declared. Should it be static?
This symbol is not used outside of trace_hwlat.c, so this commit
marks it static.
Link: https://lkml.kernel.org/r/20211021035225.1050685-1-bobo.shaobowang@huawei.com
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>