I'm naively passing some variable test metadata to some py_test targets so that the metadata gets injected into test result artifacts that are later uploaded to the cloud. I'm doing this using either --test_env or --test_arg at the bazel test invocation.
Would this variable data negatively affect how test results are cached, such that running the same test back to back would effectively disturb the bazel cache?
Command Line Inputs
Command-line inputs can indeed disturb cache hits. Consider the following set of executions.
BUILD file
py_test(
    name = "test_inputs",
    srcs = ["test_inputs.py"],
    deps = [
        ":conftest",
        "@pytest",
    ],
)

py_library(
    name = "conftest",
    srcs = ["conftest.py"],
    deps = [
        "@pytest",
    ],
)
Test module
import sys

import pytest


def test_pass():
    assert True


def test_arg_in(request):
    assert request.config.getoption("--metadata")


if __name__ == "__main__":
    args = sys.argv[1:]
    ret_code = pytest.main([__file__, "--log-level=ERROR"] + args)
    sys.exit(ret_code)
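Side note: --metadata is not a built-in pytest option, so for request.config.getoption("--metadata") to succeed, the :conftest dependency listed in the BUILD file has to register it. That conftest.py isn't shown here; a minimal sketch of what it would need is below (the option name comes from the test above, the default and help text are assumptions).
conftest.py (sketch)
def pytest_addoption(parser):
    # Register the custom --metadata option so the test can read it via
    # request.config.getoption("--metadata").
    parser.addoption(
        "--metadata",
        action="store",
        default=None,
        help="variable test metadata forwarded via bazel --test_arg",
    )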
First execution
$ bazel test //bazel_check:test_inputs --test_arg --metadata=abc
INFO: Analyzed target //bazel_check:test_inputs (0 packages loaded, 0 targets configured).
INFO: Found 1 test target...
INFO: 2 processes: 1 internal (50.00%), 1 local (50.00%).
INFO: Cache hit rate for remote actions: -- (0 / 0)
INFO: Total action wall time 0.40s
INFO: Critical path 0.57s (setup 0.00s, action wall time 0.00s)
INFO: Elapsed time 0.72s (preparation 0.12s, execution 0.60s)
INFO: Build completed successfully, 2 total actions
//bazel_check:test_inputs PASSED in 0.4s
Executed 1 out of 1 test: 1 test passes.
INFO: Build completed successfully, 2 total actions
Second execution: same argument value, cache hit!
$ bazel test //bazel_check:test_inputs --test_arg --metadata=abc
INFO: Analyzed target //bazel_check:test_inputs (0 packages loaded, 0 targets configured).
INFO: Found 1 test target...
INFO: 1 process: 1 internal (100.00%).
INFO: Cache hit rate for remote actions: -- (0 / 0)
INFO: Total action wall time 0.00s
INFO: Critical path 0.47s (setup 0.00s, action wall time 0.00s)
INFO: Elapsed time 0.61s (preparation 0.12s, execution 0.49s)
INFO: Build completed successfully, 1 total action
//bazel_check:test_inputs (cached) PASSED in 0.4s
Executed 0 out of 1 test: 1 test passes.
INFO: Build completed successfully, 1 total action
Third execution: new argument value, no cache hit
$ bazel test //bazel_check:test_inputs --test_arg --metadata=kk
INFO: Analyzed target //bazel_check:test_inputs (0 packages loaded, 93 targets configured).
INFO: Found 1 test target...
INFO: 2 processes: 1 internal (50.00%), 1 local (50.00%).
INFO: Cache hit rate for remote actions: -- (0 / 0)
INFO: Total action wall time 0.30s
INFO: Critical path 0.54s (setup 0.00s, action wall time 0.00s)
INFO: Elapsed time 0.71s (preparation 0.14s, execution 0.57s)
INFO: Build completed successfully, 2 total actions
//bazel_check:test_inputs PASSED in 0.3s
Executed 1 out of 1 test: 1 test passes.
INFO: Build completed successfully, 2 total actions
Fourth execution: reused the same argument as the first two runs
Interestingly enough, there is no cache hit even though this result was cached earlier; the earlier entry did not persist. This is most likely because Bazel's local action cache keeps only the most recent result for each action, so changing the argument in the third run evicted the earlier entry (a remote cache or --disk_cache, which stores results by action digest, would not have this limitation).
$ bazel test //bazel_check:test_inputs --test_arg --metadata=abc
INFO: Analyzed target //bazel_check:test_inputs (0 packages loaded, 0 targets configured).
INFO: Found 1 test target...
INFO: 2 processes: 1 internal (50.00%), 1 local (50.00%).
INFO: Cache hit rate for remote actions: -- (0 / 0)
INFO: Total action wall time 0.34s
INFO: Critical path 0.50s (setup 0.00s, action wall time 0.00s)
INFO: Elapsed time 0.71s (preparation 0.17s, execution 0.55s)
INFO: Build completed successfully, 2 total actions
//bazel_check:test_inputs PASSED in 0.3s
Executed 1 out of 1 test: 1 test passes.
INFO: Build completed successfully, 2 total actions
Environment Inputs
The exact same behavior applies to --test_env inputs.
import os
import sys

import pytest


def test_pass():
    assert True


def test_env_in():
    assert os.environ.get("META_ENV")


if __name__ == "__main__":
    args = sys.argv[1:]
    ret_code = pytest.main([__file__, "--log-level=ERROR"] + args)
    sys.exit(ret_code)
First execution
$ bazel test //bazel_check:test_inputs --test_env META_ENV=33
INFO: Build option --test_env has changed, discarding analysis cache.
INFO: Analyzed target //bazel_check:test_inputs (0 packages loaded, 7285 targets configured).
INFO: Found 1 test target...
INFO: 2 processes: 1 internal (50.00%), 1 local (50.00%).
INFO: Cache hit rate for remote actions: -- (0 / 0)
INFO: Total action wall time 0.29s
INFO: Critical path 0.66s (setup 0.00s, action wall time 0.00s)
INFO: Elapsed time 1.26s (preparation 0.42s, execution 0.84s)
INFO: Build completed successfully, 2 total actions
//bazel_check:test_inputs PASSED in 0.3s
Executed 1 out of 1 test: 1 test passes.
INFO: Build completed successfully, 2 total actions
Second execution: same env value, cache hit!
$ bazel test //bazel_check:test_inputs --test_env META_ENV=33
INFO: Analyzed target //bazel_check:test_inputs (0 packages loaded, 0 targets configured).
INFO: Found 1 test target...
INFO: 1 process: 1 internal (100.00%).
INFO: Cache hit rate for remote actions: -- (0 / 0)
INFO: Total action wall time 0.00s
INFO: Critical path 0.49s (setup 0.00s, action wall time 0.00s)
INFO: Elapsed time 0.67s (preparation 0.15s, execution 0.52s)
INFO: Build completed successfully, 1 total action
//bazel_check:test_inputs (cached) PASSED in 0.3s
Executed 0 out of 1 test: 1 test passes.
INFO: Build completed successfully, 1 total action
Third execution: new env value, no cache hit
$ bazel test //bazel_check:test_inputs --test_env META_ENV=44
INFO: Build option --test_env has changed, discarding analysis cache.
INFO: Analyzed target //bazel_check:test_inputs (0 packages loaded, 7285 targets configured).
INFO: Found 1 test target...
INFO: 2 processes: 1 internal (50.00%), 1 local (50.00%).
INFO: Cache hit rate for remote actions: -- (0 / 0)
INFO: Total action wall time 0.29s
INFO: Critical path 0.62s (setup 0.00s, action wall time 0.00s)
INFO: Elapsed time 1.22s (preparation 0.39s, execution 0.83s)
INFO: Build completed successfully, 2 total actions
//bazel_check:test_inputs PASSED in 0.3s
Executed 1 out of 1 test: 1 test passes.
INFO: Build completed successfully, 2 total actions
Fourth execution: reused same env value as first two runs
$ bazel test //bazel_check:test_inputs --test_env META_ENV=33
INFO: Build option --test_env has changed, discarding analysis cache.
INFO: Analyzed target //bazel_check:test_inputs (0 packages loaded, 7285 targets configured).
INFO: Found 1 test target...
INFO: 2 processes: 1 internal (50.00%), 1 local (50.00%).
INFO: Cache hit rate for remote actions: -- (0 / 0)
INFO: Total action wall time 0.28s
INFO: Critical path 0.66s (setup 0.00s, action wall time 0.00s)
INFO: Elapsed time 1.25s (preparation 0.40s, execution 0.85s)
INFO: Build completed successfully, 2 total actions
//bazel_check:test_inputs PASSED in 0.3s
Executed 1 out of 1 test: 1 test passes.
INFO: Build completed successfully, 2 total actions
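So, to answer the question: yes, any change to the --test_arg/--test_env value invalidates the cached test result, and flip-flopping between values re-runs the test. If the metadata still has to end up in the uploaded result artifacts, one illustrative sketch (the file name and JSON layout are assumptions, not part of the question) is to have the test itself write it to Bazel's undeclared-outputs directory:
# Inside the test (or a fixture): persist the injected metadata as a test artifact.
import json
import os

def write_metadata_artifact():
    # TEST_UNDECLARED_OUTPUTS_DIR is set by Bazel for every test run; files
    # written there are collected into bazel-testlogs/.../test.outputs/outputs.zip.
    out_dir = os.environ.get("TEST_UNDECLARED_OUTPUTS_DIR", ".")
    with open(os.path.join(out_dir, "test_metadata.json"), "w") as fh:
        json.dump({"meta_env": os.environ.get("META_ENV")}, fh)
Keep in mind that, as shown above, any change to META_ENV will still invalidate the cached test result.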
I want to get the CPU and GPU utilisation of my CUDA program over time and plot them together.
What's the best way?
Here is my script:
### [1] Running my cuda program in background
./my_cuda_program &
PID_MY_CUDA_PROGRAM=$!
### [2] Getting CPU & GPU utilization in background
sar 1 | sed --unbuffered -e 's/^/SYSSTAT:/' &
PID_SYSSTAT=$!
nvidia-smi --format=csv --query-gpu=timestamp,utilization.gpu -l 1 \
| sed --unbuffered -e 's/^/NVIDIA_SMI:/' &
PID_NVIDIA_SMI=$!
### [3] waiting for the [1] process to finish,
### and then kill [2] processes
wait ${PID_MY_CUDA_PROGRAM}
kill ${PID_SYSSTAT}
kill ${PID_NVIDIA_SMI}
exit
The output looks like this:
SYSSTAT:Linux 4.15.0-176-generic (ubuntu00) 05/06/22 _x86_64_ (4 CPU)
NVIDIA_SMI:timestamp, utilization.gpu [%]
NVIDIA_SMI:2022/05/06 23:57:00.245, 7 %
SYSSTAT:
SYSSTAT:23:57:00 CPU %user %nice %system %iowait %steal %idle
SYSSTAT:23:57:01 all 8.73 0.00 5.74 7.48 0.00 78.05
NVIDIA_SMI:2022/05/06 23:57:01.246, 1 %
SYSSTAT:23:57:02 all 23.31 0.00 6.02 0.00 0.00 70.68
NVIDIA_SMI:2022/05/06 23:57:02.246, 16 %
SYSSTAT:23:57:03 all 25.56 0.00 3.76 0.00 0.00 70.68
NVIDIA_SMI:2022/05/06 23:57:03.246, 15 %
SYSSTAT:23:57:04 all 22.69 0.00 6.48 0.00 0.00 70.82
NVIDIA_SMI:2022/05/06 23:57:04.246, 21 %
SYSSTAT:23:57:05 all 25.81 0.00 3.26 0.00 0.00 70.93
It's a bit annoying to parse the log above.
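One workable approach is to keep the script as-is and post-process the prefixed log with a small Python script using matplotlib. The sketch below assumes the combined output was saved to util.log and that the sar and nvidia-smi lines keep exactly the formats shown above; the file names are assumptions.
Parsing/plotting script (sketch)
# plot_util.py -- parse the SYSSTAT:/NVIDIA_SMI: prefixed log and plot utilisation.
# Both sar and nvidia-smi sample once per second, so the sample index is used
# as "seconds since start" on the x axis.
import re
import matplotlib.pyplot as plt

cpu_util, gpu_util = [], []

with open("util.log") as fh:
    for line in fh:
        if line.startswith("SYSSTAT:"):
            # e.g. "SYSSTAT:23:57:01  all  8.73  0.00  5.74  7.48  0.00  78.05"
            parts = line[len("SYSSTAT:"):].split()
            if len(parts) >= 8 and parts[1] == "all":
                cpu_util.append(100.0 - float(parts[-1]))  # CPU busy = 100 - %idle
        elif line.startswith("NVIDIA_SMI:"):
            # e.g. "NVIDIA_SMI:2022/05/06 23:57:01.246, 16 %"
            m = re.match(r"NVIDIA_SMI:\S+ \S+, (\d+) %", line)
            if m:
                gpu_util.append(float(m.group(1)))

plt.plot(cpu_util, label="CPU [%]")
plt.plot(gpu_util, label="GPU [%]")
plt.xlabel("seconds since start")
plt.ylabel("utilisation [%]")
plt.legend()
plt.savefig("utilisation.png")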
I am trying to use perf for performance analysis.
When I use perf stat, it provides the execution time:
Performance counter stats for './quicksort_ver1 input.txt 10000':
7.00 msec task-clock:u # 0.918 CPUs utilized
2,679,253 cycles:u # 0.383 GHz (9.58%)
18,034,446 instructions:u # 6.73 insn per cycle (23.56%)
5,764,095 branches:u # 822.955 M/sec (37.62%)
5,030,025 dTLB-loads # 718.150 M/sec (51.69%)
2,948,787 dTLB-stores # 421.006 M/sec (65.75%)
5,525,534 L1-dcache-loads # 788.895 M/sec (48.31%)
2,653,434 L1-dcache-stores # 378.838 M/sec (34.25%)
4,900 L1-dcache-load-misses # 0.09% of all L1-dcache hits (20.16%)
66 LLC-load-misses # 0.00% of all LL-cache hits (6.09%)
<not counted> LLC-store-misses (0.00%)
<not counted> LLC-loads (0.00%)
<not counted> LLC-stores (0.00%)
0.007631774 seconds time elapsed
0.006655000 seconds user
0.000950000 seconds sys
However, when I use perf record, I observe that 45 samples and 14999985 events are collected for task-clock:
Samples: 45 of event 'task-clock:u', Event count (approx.): 14999985
Children Self Command Shared Object Symbol
+ 91.11% 0.00% quicksort_ver1 quicksort_ver1 [.] _start
+ 91.11% 0.00% quicksort_ver1 libc-2.17.so [.] __libc_start_main
+ 91.11% 0.00% quicksort_ver1 quicksort_ver1 [.] main
Is there any way to convert task-clock events to seconds or milliseconds?
I got the answer with a little bit of experimentation: the basic unit of the task-clock event is a nanosecond.
Stats collected with perf stat:
$ sudo perf stat -e task-clock:u ./bubble_sort input.txt 50000
Performance counter stats for './bubble_sort input.txt 50000':
11,617.33 msec task-clock:u # 1.000 CPUs utilized
11.617480215 seconds time elapsed
11.615856000 seconds user
0.002000000 seconds sys
Stats collected with perf record:
$ sudo perf report
Samples: 35K of event 'task-clock:u', Event count (approx.): 11715321618
Overhead Command Shared Object Symbol
73.75% bubble_sort bubble_sort [.] bubbleSort
26.15% bubble_sort bubble_sort [.] swap
0.07% bubble_sort libc-2.17.so [.] _IO_vfscanf
Observe that in both cases the number of samples has changed, but the event count is approximately the same.
perf stat reports an elapsed time of 11.617480215 seconds, and perf report reports a total of 11715321618 task-clock events.
11715321618 nanoseconds = 11.715321618 seconds, which is approximately equal to the 11.615856000 seconds of user time.
So, apparently, the basic unit of the task-clock event is a nanosecond.
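Given that, converting the event count that perf report prints into seconds or milliseconds is just a division. A tiny sketch using the numbers above (the 1 event = 1 ns relation is the experimental conclusion here, not something documented by perf):
# Convert a perf task-clock event count into time units,
# assuming one task-clock event corresponds to one nanosecond.
event_count = 11715321618            # "Event count (approx.)" from perf report
print(event_count / 1e9, "seconds")       # ~11.715 s
print(event_count / 1e6, "milliseconds")  # ~11715 ms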
I ran into a 'Segmentation fault' error when using travis-ci for my project, IPython-Dashboard.
There is no error message and it works fine locally, which I find a little confusing. Can anyone give me an idea of how to fix this? Thanks.
Here is the travis build log in the cloud:
travis-log
$ nosetests --with-coverage --cover-package=dashboard
../home/travis/build.sh: line 45: 3187 Segmentation fault (core dumped)
nosetests --with-coverage --cover-package=dashboard
The command "nosetests --with-coverage --cover-package=dashboard" exited with 139.
Here is the build log locally [osx]:
taotao@mac007:~/Desktop/github/IPython-Dashboard$ sudo nosetests --with-coverage --cover-package=dashboard
.../Users/chenshan/Desktop/github/IPython-Dashboard/dashboard/tests/testCreateData.py:78: Warning: Can't create database 'IPD_data'; database exists
conn.cursor().execute('CREATE DATABASE IF NOT EXISTS {};'.format(config.sql_db))
/Library/Python/2.7/site-packages/pandas/io/sql.py:599: FutureWarning: The 'mysql' flavor with DBAPI connection is deprecated and will be removed in future versions. MySQL will be further supported with SQLAlchemy engines.
warnings.warn(_MYSQL_WARNING, FutureWarning)
...
Name Stmts Miss Cover Missing
---------------------------------------------------------------------
dashboard.py 13 0 100%
dashboard/client.py 1 0 100%
dashboard/client/sender.py 11 3 73% 26-27, 33
dashboard/conf.py 0 0 100%
dashboard/conf/config.py 29 0 100%
dashboard/server.py 0 0 100%
dashboard/server/resources.py 0 0 100%
dashboard/server/resources/dash.py 35 10 71% 36, 55-56, 67-69, 86-89
dashboard/server/resources/home.py 40 12 70% 25, 28-30, 83-91
dashboard/server/resources/sql.py 27 11 59% 30, 52-75
dashboard/server/resources/status.py 8 1 88% 19
dashboard/server/resources/storage.py 13 5 62% 26-28, 43-47
dashboard/server/utils.py 79 18 77% 20-24, 78-80, 82-83, 86, 96, 99-100, 126-127, 140-142
dashboard/server/views.py 21 1 95% 16
---------------------------------------------------------------------
TOTAL 277 61 78%
----------------------------------------------------------------------
Ran 6 tests in 4.600s
OK
taotao@mac007:~/Desktop/github/IPython-Dashboard$
I'm trying to create my cascade classifier with this command:
haartraining -data haarcascade -vec samples.vec -bg negatives.dat -nstages 20 -nsplits 2 -minhitrate 0.999 -maxfalsealarm 0.5 -npos 1000 -nneg 600 -w 20 -h 20 -nonsym -mem 2048 -mode ALL
I have 1500 samples created from one single image with this command:
createsamples -img foto.png -num 1500 -bg negatives.dat -vec samples.vec -maxxangle 0.6 -maxyangle 0 -maxzangle 0.3 -maxidev 100 -bgcolor 0 -bgthresh 0 -w 20 -h 20
This is the output at stage 3:
Tree Classifier
Stage
+---+
| 0|
+---+
Number of features used : 125199
Parent node: NULL
*** 1 cluster ***
POS: 1000 1000 1.000000
NEG: 600 1
**BACKGROUND PROCESSING TIME: 0.02**
Precalculation time: 41.39
+----+----+-+---------+---------+---------+---------+
| N |%SMP|F| ST.THR | HR | FA | EXP. ERR|
+----+----+-+---------+---------+---------+---------+
| 1|100%|-|-0.989933| 1.000000| 0.988333| 0.003125|
+----+----+-+---------+---------+---------+---------+
| 2|100%|-| 0.006064| 1.000000| 0.000000| 0.000000|
+----+----+-+---------+---------+---------+---------+
Stage training time: 40.66
Number of used features: 4
Parent node: NULL
Chosen number of splits: 0
Total number of splits: 0
Tree Classifier
Stage
+---+
| 0|
+---+
0
Parent node: 0
*** 1 cluster ***
POS: 1000 1000 1.000000
NEG: 600 0.0169943
**BACKGROUND PROCESSING TIME: 0.23**
Precalculation time: 37.19
+----+----+-+---------+---------+---------+---------+
| N |%SMP|F| ST.THR | HR | FA | EXP. ERR|
+----+----+-+---------+---------+---------+---------+
| 1|100%|-|-0.981031| 1.000000| 1.000000| 0.007500|
+----+----+-+---------+---------+---------+---------+
| 2|100%|-| 0.005864| 1.000000| 0.010000| 0.003750|
+----+----+-+---------+---------+---------+---------+
Stage training time: 36.25
Number of used features: 4
Parent node: 0
Chosen number of splits: 0
Total number of splits: 0
Tree Classifier
Stage
+---+---+
| 0| 1|
+---+---+
0---1
Parent node: 1
*** 1 cluster ***
POS: 1000 1000 1.000000
NEG: 600 0.000522
**BACKGROUND PROCESSING TIME: 7.54**
Precalculation time: 40.80
+----+----+-+---------+---------+---------+---------+
| N |%SMP|F| ST.THR | HR | FA | EXP. ERR|
+----+----+-+---------+---------+---------+---------+
| 1|100%|-|-0.895043| 1.000000| 1.000000| 0.051875|
+----+----+-+---------+---------+---------+---------+
| 2|100%|-|-1.818561| 1.000000| 0.978333| 0.026250|
+----+----+-+---------+---------+---------+---------+
| 3|100%|-|-2.601195| 1.000000| 0.676667| 0.010000|
+----+----+-+---------+---------+---------+---------+
| 4|100%|-|-1.673473| 1.000000| 0.033333| 0.003125|
+----+----+-+---------+---------+---------+---------+
Stage training time: 80.58
Number of used features: 8
Parent node: 1
Chosen number of splits: 0
Total number of splits: 0
Tree Classifier
Stage
+---+---+---+
| 0| 1| 2|
+---+---+---+
0---1---2
Parent node: 2
*** 1 cluster ***
POS: 1000 1000 1.000000
NEG: 600 4.19496e-005
**BACKGROUND PROCESSING TIME: 93.92**
Precalculation time: 40.82
+----+----+-+---------+---------+---------+---------+
| N |%SMP|F| ST.THR | HR | FA | EXP. ERR|
+----+----+-+---------+---------+---------+---------+
| 1|100%|-|-0.955309| 1.000000| 1.000000| 0.059375|
+----+----+-+---------+---------+---------+---------+
| 2|100%|-|-1.676803| 1.000000| 0.931667| 0.065000|
+----+----+-+---------+---------+---------+---------+
| 3|100%|-|-1.313002| 1.000000| 0.233333| 0.010625|
+----+----+-+---------+---------+---------+---------+
Stage training time: 63.21
Number of used features: 6
Parent node: 2
Chosen number of splits: 0
Total number of splits: 0
Tree Classifier
Stage
+---+---+---+---+
| 0| 1| 2| 3|
+---+---+---+---+
0---1---2---3
Parent node: 3
*** 1 cluster ***
POS: 1000 1000 1.000000
NEG: 600 1.23118e-005
**BACKGROUND PROCESSING TIME: 327.57**
Precalculation time: 41.54
+----+----+-+---------+---------+---------+---------+
| N |%SMP|F| ST.THR | HR | FA | EXP. ERR|
+----+----+-+---------+---------+---------+---------+
| 1|100%|-|-0.939509| 1.000000| 1.000000| 0.054375|
+----+----+-+---------+---------+---------+---------+
| 2|100%|-|-1.812912| 1.000000| 0.821667| 0.047500|
+----+----+-+---------+---------+---------+---------+
| 3|100%|-|-0.907906| 1.000000| 0.128333| 0.016875|
+----+----+-+---------+---------+---------+---------+
Stage training time: 61.52
Number of used features: 6
Parent node: 3
Chosen number of splits: 0
Total number of splits: 0
Tree Classifier
Stage
+---+---+---+---+---+
| 0| 1| 2| 3| 4|
+---+---+---+---+---+
0---1---2---3---4
Parent node: 4
*** 1 cluster ***
POS: 1000 1000 1.000000
0%
My question is:
Is it normal that the background processing time grows so quickly? At this rate it will take weeks to reach stage 20. Is there something wrong?
It could even take longer; there is a reason OpenCV ships with pre-trained cascade files. The growth itself is expected: each new stage is trained only on negatives that pass every previous stage, and your log shows that acceptance ratio collapsing (the second number on the NEG line drops from 1 to 0.0169943 to 0.000522 to 4.19496e-005 to 1.23118e-005), so the background sampler must scan ever more windows to collect the same 600 negatives. That scanning is exactly what BACKGROUND PROCESSING TIME measures.
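A rough back-of-the-envelope sketch of the blow-up, assuming the second number on the NEG line is the acceptance ratio (accepted negatives / background windows scanned) as printed by haartraining:
# Approximate background windows scanned per stage to collect 600 negatives,
# assuming NEG acceptance ratio = accepted / scanned.
neg_needed = 600
ratios = [1.0, 0.0169943, 0.000522, 4.19496e-05, 1.23118e-05]  # from the log above
for stage, ratio in enumerate(ratios):
    print(f"stage {stage}: ~{neg_needed / ratio:,.0f} windows scanned")
# Goes from 600 windows at stage 0 to roughly 49 million at stage 4, which is
# why BACKGROUND PROCESSING TIME climbs from 0.02 s to 327 s and keeps growing.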