This repository is part of NUMCL.
Usage: ./report.sh [test]
The optional test argument specifies the test to rerun and display. It is one of:
- 0template
- 1access
- 1allocation
- 2linarg
- 3arith
- 4concat
- 5math
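
As a rough illustration of the usage above, the following Python sketch checks a test name against the list and builds the matching command line. The helper `report_command` is hypothetical and not part of this repository. It assumes that `report.sh` is in the current directory and that the argument may be omitted, as the square brackets in the usage line suggest.

```python
from typing import List, Optional

# Test names copied from the list above.
TESTS = ("0template", "1access", "1allocation", "2linarg",
         "3arith", "4concat", "5math")


def report_command(test: Optional[str] = None) -> List[str]:
    """Build the argument vector for ./report.sh (hypothetical helper).

    Raises ValueError when the test name is not in the documented list.
    """
    if test is None:
        # No argument: assumed to be allowed because the usage shows [test].
        return ["./report.sh"]
    if test not in TESTS:
        raise ValueError(
            f"unknown test {test!r}; expected one of: {', '.join(TESTS)}"
        )
    return ["./report.sh", test]
```

From the repository root, the result could be handed to `subprocess.run(report_command("3arith"), check=True)` to rerun a single test.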

Dependencies:
- numpy: pip3 install numpy benchmarker (or: conda env create -f environment.yml)
- numcl: ros install numcl/constantfold numcl/gtype numcl/specialized-function numcl/numcl trivial-benchmark
- general: awk, make, and other general Unix tools
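
Before running the report, it may help to confirm that the tools named above are present. The sketch below is only an assumption about how such a check could look. `check_dependencies` is a hypothetical helper, it assumes the Python packages import under the same names as they are installed (`numpy`, `benchmarker`), and it does not verify the numcl libraries installed through `ros install`.

```python
import importlib.util
import shutil
from typing import Dict

# Names taken from the dependency notes above.
PYTHON_MODULES = ("numpy", "benchmarker")
COMMANDS = ("ros", "awk", "make")


def check_dependencies() -> Dict[str, bool]:
    """Report which Python modules and command-line tools can be found."""
    status: Dict[str, bool] = {}
    for module in PYTHON_MODULES:
        # find_spec returns None when a top-level module is not installed.
        status[f"python:{module}"] = importlib.util.find_spec(module) is not None
    for command in COMMANDS:
        # shutil.which returns None when the command is not on PATH.
        status[f"command:{command}"] = shutil.which(command) is not None
    return status


if __name__ == "__main__":
    for name, found in check_dependencies().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```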

Author: Masataro Asai
Licensed under LGPL v3.
Copyright (c) 2019 IBM Corporation
title | numcl@d296e3 | numcl@0980fc | numcl@d296e3/numcl@0980fc | numpy@ | numcl@0980fc/numpy@ |
---|---|---|---|---|---|
0template/dummy/0 | 0.0040 | 0.0030 | 1.333 | 0.8939 | 0.003356 |
1access/read-range/0 | 0.0200 | 0.0180 | 1.111 | 0.0004 | 45 |
1access/read-range/1 | 0.0230 | 0.0140 | 1.643 | 0.0004 | 35 |
1access/read-range/2 | 0.2990 | 0.1320 | 2.265 | 0.0004 | 330 |
1access/read-range/3 | 0.2960 | 0.1770 | 1.672 | 0.0004 | 442.5 |
1access/read-range/4 | 0.2900 | 0.1910 | 1.518 | 0.0004 | 477.5 |
1access/read/0 | 0.0020 | 0.0020 | 1 | 0.0003 | 6.667 |
1access/read/1 | 0.0010 | 0.0020 | 0.5 | 0.0004 | 5 |
1access/read/2 | 0.0630 | 0.0430 | 1.465 | 0.0010 | 43 |
1access/read/3 | 0.0670 | 0.0470 | 1.426 | 0.0011 | 42.73 |
1access/read/4 | 0.0620 | 0.0490 | 1.265 | 0.0005 | 98 |
1access/write-batch/0 | 0.0170 | 0.0160 | 1.062 | 0.0009 | 17.78 |
1access/write-batch/1 | 0.0520 | 0.0510 | 1.02 | 0.0088 | 5.795 |
1access/write-range/0 | 0.0200 | 0.0220 | 0.9091 | 0.0008 | 27.5 |
1access/write-range/1 | 0.0200 | 0.0200 | 1 | 0.0008 | 25 |
1access/write-range/2 | 0.0380 | 0.0390 | 0.9744 | 0.0009 | 43.33 |
1access/write-range/3 | 0.0570 | 0.0470 | 1.213 | 0.0009 | 52.22 |
1access/write-range/4 | 0.0560 | 0.0480 | 1.167 | 0.0009 | 53.33 |
1access/write/0 | 0.0210 | 0.0210 | 1 | 0.0003 | 70 |
1access/write/1 | 0.0190 | 0.0190 | 1 | 0.0008 | 23.75 |
1access/write/2 | 0.0240 | 0.0250 | 0.96 | 0.0007 | 35.71 |
1access/write/3 | 0.0270 | 0.0280 | 0.9643 | 0.0007 | 40 |
1access/write/4 | 0.0380 | 0.0290 | 1.31 | 0.0007 | 41.43 |
1allocation/arange/float32 | 0.0000 | 0.0000 | N/A | 0.0001 | 0 |
1allocation/arange/float64 | 0.0010 | 0.0000 | N/A | 0.0002 | 0 |
1allocation/arange/int16 | 0.0000 | 0.0000 | N/A | 0.0001 | 0 |
1allocation/arange/int32 | 0.0000 | 0.0000 | N/A | 0.0001 | 0 |
1allocation/arange/int64 | 0.0000 | 0.0000 | N/A | 0.0001 | 0 |
1allocation/arange/int8 | 0.0000 | 0.0000 | N/A | 0.0001 | 0 |
1allocation/copy/bool | 0.0040 | 0.0020 | 2 | 0.0572 | 0.03497 |
1allocation/copy/float32 | 0.1390 | 0.1240 | 1.121 | 0.0554 | 2.238 |
1allocation/copy/float64 | 0.3010 | 0.2840 | 1.06 | 0.0516 | 5.504 |
1allocation/copy/int16 | 0.0550 | 0.0550 | 1 | 0.0527 | 1.044 |
1allocation/copy/int32 | 0.1420 | 0.1250 | 1.136 | 0.0526 | 2.376 |
1allocation/copy/int64 | 0.3520 | 0.2880 | 1.222 | 0.0541 | 5.323 |
1allocation/copy/int8 | 0.0280 | 0.0280 | 1 | 0.0517 | 0.5416 |
1allocation/empty/bool | 0.0020 | 0.0000 | N/A | 0.0001 | 0 |
1allocation/empty/float32 | 0.0520 | 0.0480 | 1.083 | 0.0001 | 480 |
1allocation/empty/float64 | 0.1040 | 0.0890 | 1.169 | 0.0002 | 445 |
1allocation/empty/int16 | 0.0260 | 0.0220 | 1.182 | 0.0001 | 220 |
1allocation/empty/int32 | 0.0520 | 0.0450 | 1.156 | 0.0001 | 450 |
1allocation/empty/int64 | 0.1070 | 0.0880 | 1.216 | 0.0002 | 440 |
1allocation/empty/int8 | 0.0130 | 0.0120 | 1.083 | 0.0001 | 120 |
1allocation/ones/bool | 0.0090 | 0.0090 | 1 | 0.0045 | 2 |
1allocation/ones/float32 | 0.1120 | 0.1020 | 1.098 | 0.0261 | 3.908 |
1allocation/ones/float64 | 0.2210 | 0.2050 | 1.078 | 0.0831 | 2.467 |
1allocation/ones/int16 | 0.1020 | 0.1070 | 0.9533 | 0.0120 | 8.917 |
1allocation/ones/int32 | 0.1930 | 0.2150 | 0.8977 | 0.0453 | 4.746 |
1allocation/ones/int64 | 0.2890 | 0.2560 | 1.129 | 0.0864 | 2.963 |
1allocation/ones/int8 | 0.0680 | 0.0710 | 0.9577 | 0.0033 | 21.52 |
1allocation/zeros/bool | 0.0020 | 0.0030 | 0.6667 | 0.0038 | 0.7895 |
1allocation/zeros/float32 | 0.1000 | 0.1060 | 0.9434 | 0.0214 | 4.953 |
1allocation/zeros/float64 | 0.2140 | 0.1930 | 1.109 | 0.0684 | 2.822 |
1allocation/zeros/int16 | 0.0530 | 0.0490 | 1.082 | 0.0078 | 6.282 |
1allocation/zeros/int32 | 0.1050 | 0.0980 | 1.071 | 0.0245 | 4 |
1allocation/zeros/int64 | 0.2210 | 0.2000 | 1.105 | 0.0656 | 3.049 |
1allocation/zeros/int8 | 0.0200 | 0.0220 | 0.9091 | 0.0035 | 6.286 |
2linarg/diag/builtin | 0.0060 | 0.0080 | 0.75 | 0.0008 | 10 |
2linarg/diag/einsum | 0.0070 | 0.0050 | 1.4 | 0.0004 | 12.5 |
2linarg/eye/builtin | 0.0000 | 0.0000 | N/A | 0.0013 | 0 |
2linarg/eye/naive | 0.1760 | 0.1690 | 1.041 | 0.0039 | 43.33 |
2linarg/gemm-large/builtin | 133.7070 | 97.6810 | 1.369 | 2.3147 | 42.2 |
2linarg/gemm-large/einsum | 143.9220 | 94.3460 | 1.525 | 30.3643 | 3.107 |
2linarg/gemm/builtin | 0.1480 | 0.0760 | 1.947 | 0.0120 | 6.333 |
2linarg/gemm/einsum | 0.1410 | 0.0870 | 1.621 | 0.0411 | 2.117 |
2linarg/inner/builtin | 0.0060 | 0.0060 | 1 | 0.0004 | 15 |
2linarg/inner/einsum | 0.0050 | 0.0080 | 0.625 | 0.0009 | 8.889 |
2linarg/outer/builtin | 0.0080 | 0.0090 | 0.8889 | 0.0041 | 2.195 |
2linarg/outer/einsum | 0.0070 | 0.0100 | 0.7 | 0.0031 | 3.226 |
2linarg/tri/builtin | 0.0170 | 0.0180 | 0.9444 | 0.0069 | 2.609 |
2linarg/tril/builtin | 0.0230 | 0.0270 | 0.8519 | 0.0082 | 3.293 |
2linarg/triu/builtin | 0.0250 | 0.0270 | 0.9259 | 0.0081 | 3.333 |
2linarg/vander/builtin | 0.0620 | 0.0630 | 0.9841 | 0.0097 | 6.495 |
2linarg/vdot/builtin | 0.0060 | 0.0070 | 0.8571 | 0.0004 | 17.5 |
2linarg/vdot/einsum | 0.0060 | 0.0080 | 0.75 | 0.0013 | 6.154 |
3arith/add_cd_cd | N/A | N/A | N/A | 0.0015 | N/A |
3arith/add_cs_cs | N/A | N/A | N/A | 0.0017 | N/A |
3arith/add_d_d | 0.0290 | 0.0180 | 1.611 | 0.0007 | 25.71 |
3arith/add_d_i32 | 0.0290 | 0.0200 | 1.45 | 0.0014 | 14.29 |
3arith/add_i16_i16 | 0.0260 | 0.0190 | 1.368 | 0.0003 | 63.33 |
3arith/add_i1_i1 | 0.0390 | 0.0280 | 1.393 | 0.0002 | 140 |
3arith/add_i32_i32 | 0.0300 | 0.0200 | 1.5 | 0.0003 | 66.67 |
3arith/add_i64_i64 | 0.0330 | 0.0190 | 1.737 | 0.0006 | 31.67 |
3arith/add_i8_i8 | 0.0320 | 0.0170 | 1.882 | 0.0002 | 85 |
3arith/add_s_d | 0.0300 | 0.0190 | 1.579 | 0.0014 | 13.57 |
3arith/add_s_i32 | 0.0260 | 0.0170 | 1.529 | 0.0019 | 8.947 |
3arith/add_s_s | 0.0270 | 0.0180 | 1.5 | 0.0003 | 60 |
3arith/fma_cd_cd_cd | N/A | N/A | N/A | 0.0043 | N/A |
3arith/fma_cs_cs_cs | N/A | N/A | N/A | 0.0040 | N/A |
3arith/fma_d_d_d | 0.0640 | 0.0350 | 1.829 | 0.0012 | 29.17 |
3arith/fma_d_i32_d | 0.0660 | 0.0360 | 1.833 | 0.0021 | 17.14 |
3arith/fma_i16_i16_i16 | 0.0760 | 0.0390 | 1.949 | 0.0004 | 97.5 |
3arith/fma_i1_i1_i1 | 0.0780 | 0.0500 | 1.56 | 0.0003 | 166.7 |
3arith/fma_i32_i32_i32 | 0.0900 | 0.0460 | 1.957 | 0.0006 | 76.67 |
3arith/fma_i64_i64_i64 | 0.0740 | 0.0380 | 1.947 | 0.0013 | 29.23 |
3arith/fma_i8_i8_i8 | 0.0720 | 0.0390 | 1.846 | 0.0003 | 130 |
3arith/fma_s_i32_d | N/A | N/A | N/A | 0.0027 | N/A |
3arith/fma_s_i32_s | 0.0600 | 0.0370 | 1.622 | 0.0034 | 10.88 |
3arith/fma_s_s_s | 0.0620 | 0.0330 | 1.879 | 0.0006 | 55 |
3arith/mul_cd_cd | N/A | N/A | N/A | 0.0022 | N/A |
3arith/mul_cs_cs | N/A | N/A | N/A | 0.0022 | N/A |
3arith/mul_d_d | 0.0560 | 0.0200 | 2.8 | 0.0005 | 40 |
3arith/mul_d_i32 | 0.0310 | 0.0160 | 1.938 | 0.0014 | 11.43 |
3arith/mul_i16_i16 | 0.0280 | 0.0220 | 1.273 | 0.0002 | 110 |
3arith/mul_i1_i1 | 0.0390 | 0.0210 | 1.857 | 0.0001 | 210 |
3arith/mul_i32_i32 | 0.0330 | 0.0160 | 2.062 | 0.0003 | 53.33 |
3arith/mul_i64_i64 | 0.0330 | 0.0160 | 2.062 | 0.0007 | 22.86 |
3arith/mul_i8_i8 | 0.0250 | 0.0190 | 1.316 | 0.0002 | 95 |
3arith/mul_s_d | 0.0560 | 0.0130 | 4.308 | 0.0014 | 9.286 |
3arith/mul_s_i32 | 0.0290 | 0.0150 | 1.933 | 0.0020 | 7.5 |
3arith/mul_s_s | 0.0320 | 0.0140 | 2.286 | 0.0003 | 46.67 |
4concat/concatenate/0 | 0.0040 | 0.0050 | 0.8 | 0.0003 | 16.67 |
4concat/concatenate/1 | 0.0050 | 0.0090 | 0.5556 | 0.0003 | 30 |
4concat/concatenate/2 | 0.0270 | 0.0360 | 0.75 | 0.0004 | 90 |
4concat/stack/0 | 0.0030 | 0.0050 | 0.6 | 0.0009 | 5.556 |
4concat/stack/1 | 0.0050 | 0.0070 | 0.7143 | 0.0009 | 7.778 |
4concat/stack/2 | 0.0300 | 0.0400 | 0.75 | 0.0010 | 40 |
5math/acos/0 | 1.5480 | 1.5670 | 0.9879 | 0.5140 | 3.049 |
5math/asin/0 | 1.8250 | 1.8550 | 0.9838 | 0.4661 | 3.98 |
5math/atan/0 | 1.1710 | 1.2180 | 0.9614 | 0.7377 | 1.651 |
5math/cos/0 | 0.8620 | 1.1550 | 0.7463 | 0.4401 | 2.624 |
5math/cosh/0 | 0.7380 | 0.7710 | 0.9572 | 0.5496 | 1.403 |
5math/exp/0 | 1.2320 | 0.9950 | 1.238 | 0.8445 | 1.178 |
5math/log/0 | 1.1480 | 2.4500 | 0.4686 | 0.7065 | 3.468 |
5math/sin/0 | 1.1890 | 1.1290 | 1.053 | 0.6603 | 1.71 |
5math/sinh/0 | 1.0450 | 0.9690 | 1.078 | 0.7570 | 1.28 |
5math/tan/0 | 1.0480 | 1.2420 | 0.8438 | 0.6351 | 1.956 |
5math/tanh/0 | 0.6400 | 0.5330 | 1.201 | 0.4663 | 1.143 |
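
The ratio columns in these tables appear to be the quotient of two measurement columns, printed with four significant digits, with `N/A` when a measurement is missing or the denominator is zero. The sketch below reproduces that convention. It is inferred from the table values, not taken from the report script, and `ratio` is a hypothetical helper.

```python
def ratio(numerator: str, denominator: str) -> str:
    """Divide two table cells and format the result like the ratio columns.

    Cells are the strings shown in the table, e.g. "0.0040" or "N/A".
    """
    if numerator == "N/A" or denominator == "N/A":
        return "N/A"
    num, den = float(numerator), float(denominator)
    if den == 0.0:
        # The tables show N/A whenever the denominator is 0.0000.
        return "N/A"
    # Four significant digits, trailing zeros dropped (assumed format).
    return format(num / den, ".4g")


# Cells from the 0template/dummy/0 row of the table above.
print(ratio("0.0040", "0.0030"))  # numcl@d296e3 / numcl@0980fc
print(ratio("0.0030", "0.8939"))  # numcl@0980fc / numpy
```

Assuming the measurement columns are elapsed times, a value above 1 in a numcl/numpy column means numcl took longer than numpy on that benchmark.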

title | numcl@0980fc (old master) | numcl@202820 (new master) | numcl@0980fc/numcl@202820 (old/new) | numpy | numcl@202820/numpy |
---|---|---|---|---|---|
0template/dummy/0 | 0.0040 | 0.0030 | 1.333 | 1.3115 | 0.002287 |
1access/read/0 | 0.0020 | 0.0020 | 1 | 0.0003 | 6.667 |
1access/read/1 | 0.0020 | 0.0020 | 1 | 0.0003 | 6.667 |
1access/read/2 | 0.0520 | 0.0520 | 1 | 0.0004 | 130 |
1access/read/3 | 0.0460 | 0.0530 | 0.8679 | 0.0004 | 132.5 |
1access/read/4 | 0.0460 | 0.0500 | 0.92 | 0.0004 | 125 |
1access/read-range/0 | 0.0170 | 0.0180 | 0.9444 | 0.0004 | 45 |
1access/read-range/1 | 0.0180 | 0.0180 | 1 | 0.0004 | 45 |
1access/read-range/2 | 0.1390 | 0.1630 | 0.8528 | 0.0004 | 407.5 |
1access/read-range/3 | 0.1750 | 0.1700 | 1.029 | 0.0005 | 340 |
1access/read-range/4 | 0.1620 | 0.1570 | 1.032 | 0.0004 | 392.5 |
1access/write/0 | 0.0210 | 0.0210 | 1 | 0.0002 | 105 |
1access/write/1 | 0.0200 | 0.0170 | 1.176 | 0.0007 | 24.29 |
1access/write/2 | 0.0250 | 0.0230 | 1.087 | 0.0007 | 32.86 |
1access/write/3 | 0.0270 | 0.0290 | 0.931 | 0.0007 | 41.43 |
1access/write/4 | 0.0310 | 0.0300 | 1.033 | 0.0008 | 37.5 |
1access/write-batch/0 | 0.0130 | 0.0100 | 1.3 | 0.0008 | 12.5 |
1access/write-batch/1 | 0.0460 | 0.0410 | 1.122 | 0.0080 | 5.125 |
1access/write-range/0 | 0.0250 | 0.0200 | 1.25 | 0.0007 | 28.57 |
1access/write-range/1 | 0.0260 | 0.0200 | 1.3 | 0.0008 | 25 |
1access/write-range/2 | 0.0490 | 0.0400 | 1.225 | 0.0009 | 44.44 |
1access/write-range/3 | 0.0490 | 0.0450 | 1.089 | 0.0009 | 50 |
1access/write-range/4 | 0.0520 | 0.0540 | 0.963 | 0.0009 | 60 |
1allocation/arange/float32 | 0.0020 | 0.0040 | 0.5 | 0.0002 | 20 |
1allocation/arange/float64 | 0.0040 | 0.0030 | 1.333 | 0.0002 | 15 |
1allocation/arange/int16 | 0.0010 | 0.0000 | N/A | 0.0001 | 0 |
1allocation/arange/int32 | 0.0030 | 0.0020 | 1.5 | 0.0001 | 20 |
1allocation/arange/int64 | 0.0050 | 0.0050 | 1 | 0.0001 | 50 |
1allocation/arange/int8 | 0.0000 | 0.0000 | N/A | 0.0001 | 0 |
1allocation/copy/bool | 0.0030 | 0.0030 | 1 | 0.0706 | 0.04249 |
1allocation/copy/float32 | 0.0940 | 0.1000 | 0.94 | 0.0575 | 1.739 |
1allocation/copy/float64 | 0.2660 | 0.2710 | 0.9815 | 0.0606 | 4.472 |
1allocation/copy/int16 | 0.0380 | 0.0450 | 0.8444 | 0.0800 | 0.5625 |
1allocation/copy/int32 | 0.1010 | 0.1020 | 0.9902 | 0.0711 | 1.435 |
1allocation/copy/int64 | 0.2670 | 0.2720 | 0.9816 | 0.0607 | 4.481 |
1allocation/copy/int8 | 0.0180 | 0.0180 | 1 | 0.0746 | 0.2413 |
1allocation/empty/bool | 0.0010 | 0.0010 | 1 | 0.0001 | 10 |
1allocation/empty/float32 | 0.0390 | 0.0370 | 1.054 | 0.0001 | 370 |
1allocation/empty/float64 | 0.0790 | 0.0780 | 1.013 | 0.0002 | 390 |
1allocation/empty/int16 | 0.0190 | 0.0190 | 1 | 0.0001 | 190 |
1allocation/empty/int32 | 0.0370 | 0.0370 | 1 | 0.0001 | 370 |
1allocation/empty/int64 | 0.0780 | 0.0770 | 1.013 | 0.0002 | 385 |
1allocation/empty/int8 | 0.0100 | 0.0090 | 1.111 | 0.0001 | 90 |
1allocation/ones/bool | 0.0030 | 0.0020 | 1.5 | 0.0035 | 0.5714 |
1allocation/ones/float32 | 0.0720 | 0.0670 | 1.075 | 0.1356 | 0.4941 |
1allocation/ones/float64 | 0.1920 | 0.1960 | 0.9796 | 0.1214 | 1.614 |
1allocation/ones/int16 | 0.0580 | 0.0600 | 0.9667 | 0.0084 | 7.143 |
1allocation/ones/int32 | 0.1240 | 0.1230 | 1.008 | 0.0277 | 4.44 |
1allocation/ones/int64 | 0.2440 | 0.2320 | 1.052 | 0.1238 | 1.874 |
1allocation/ones/int8 | 0.0440 | 0.0410 | 1.073 | 0.0027 | 15.19 |
1allocation/zeros/bool | 0.0020 | 0.0020 | 1 | 0.0043 | 0.4651 |
1allocation/zeros/float32 | 0.0690 | 0.1000 | 0.69 | 0.0294 | 3.401 |
1allocation/zeros/float64 | 0.1970 | 0.1920 | 1.026 | 0.0941 | 2.04 |
1allocation/zeros/int16 | 0.0300 | 0.0300 | 1 | 0.0106 | 2.83 |
1allocation/zeros/int32 | 0.0730 | 0.0720 | 1.014 | 0.0315 | 2.286 |
1allocation/zeros/int64 | 0.2090 | 0.2150 | 0.9721 | 0.0810 | 2.654 |
1allocation/zeros/int8 | 0.0160 | 0.0190 | 0.8421 | 0.0038 | 5 |
2linarg/diag/builtin | 0.0050 | 0.0050 | 1 | 0.0004 | 12.5 |
2linarg/diag/einsum | 0.0050 | 0.0050 | 1 | 0.0002 | 25 |
2linarg/eye/builtin | 0.0000 | 0.0000 | N/A | 0.0007 | 0 |
2linarg/eye/naive | 0.1790 | 0.1900 | 0.9421 | 0.0019 | 100 |
2linarg/gemm/builtin | 0.0720 | 0.1310 | 0.5496 | 0.0124 | 10.56 |
2linarg/gemm/einsum | 0.0750 | 0.1340 | 0.5597 | 0.0457 | 2.932 |
2linarg/gemm-large/builtin | 63.9760 | 124.4860 | 0.5139 | 2.0091 | 61.96 |
2linarg/gemm-large/einsum | 64.4450 | 122.9670 | 0.5241 | 28.4868 | 4.317 |
2linarg/inner/builtin | 0.0040 | 0.0040 | 1 | 0.0004 | 10 |
2linarg/inner/einsum | 0.0040 | 0.0040 | 1 | 0.0012 | 3.333 |
2linarg/outer/builtin | 0.0070 | 0.0070 | 1 | 0.0045 | 1.556 |
2linarg/outer/einsum | 0.0090 | 0.0050 | 1.8 | 0.0030 | 1.667 |
2linarg/tri/builtin | 0.0130 | 0.0130 | 1 | 0.0077 | 1.688 |
2linarg/tril/builtin | 0.0210 | 0.0210 | 1 | 0.0083 | 2.53 |
2linarg/triu/builtin | 0.0220 | 0.0210 | 1.048 | 0.0081 | 2.593 |
2linarg/vander/builtin | 0.0510 | 0.0490 | 1.041 | 0.0091 | 5.385 |
2linarg/vdot/builtin | 0.0070 | 0.0050 | 1.4 | 0.0004 | 12.5 |
2linarg/vdot/einsum | 0.0070 | 0.0040 | 1.75 | 0.0013 | 3.077 |
3arith/add_cd_cd | N/A | N/A | N/A | 0.0016 | N/A |
3arith/add_cs_cs | N/A | N/A | N/A | 0.0012 | N/A |
3arith/add_d_d | 0.0120 | 0.0110 | 1.091 | 0.0004 | 27.5 |
3arith/add_d_i32 | 0.0130 | 0.0150 | 0.8667 | 0.0011 | 13.64 |
3arith/add_i16_i16 | 0.0120 | 0.0130 | 0.9231 | 0.0002 | 65 |
3arith/add_i1_i1 | 0.0170 | 0.0190 | 0.8947 | 0.0001 | 190 |
3arith/add_i32_i32 | 0.0150 | 0.0140 | 1.071 | 0.0003 | 46.67 |
3arith/add_i64_i64 | 0.0110 | 0.0170 | 0.6471 | 0.0006 | 28.33 |
3arith/add_i8_i8 | 0.0110 | 0.0140 | 0.7857 | 0.0001 | 140 |
3arith/add_s_d | 0.0130 | 0.0120 | 1.083 | 0.0012 | 10 |
3arith/add_s_i32 | 0.0130 | 0.0110 | 1.182 | 0.0017 | 6.471 |
3arith/add_s_s | 0.0110 | 0.0130 | 0.8462 | 0.0002 | 65 |
3arith/fma_cd_cd_cd | N/A | N/A | N/A | 0.0085 | N/A |
3arith/fma_cs_cs_cs | N/A | N/A | N/A | 0.0024 | N/A |
3arith/fma_d_d_d | 0.0230 | 0.0230 | 1 | 0.0009 | 25.56 |
3arith/fma_d_i32_d | 0.0230 | 0.0250 | 0.92 | 0.0014 | 17.86 |
3arith/fma_i16_i16_i16 | 0.0280 | 0.0280 | 1 | 0.0003 | 93.33 |
3arith/fma_i1_i1_i1 | 0.0380 | 0.0400 | 0.95 | 0.0002 | 200 |
3arith/fma_i32_i32_i32 | 0.0270 | 0.0290 | 0.931 | 0.0005 | 58 |
3arith/fma_i64_i64_i64 | 0.0290 | 0.0310 | 0.9355 | 0.0010 | 31 |
3arith/fma_i8_i8_i8 | 0.0230 | 0.0270 | 0.8519 | 0.0002 | 135 |
3arith/fma_s_i32_d | N/A | N/A | N/A | 0.0019 | N/A |
3arith/fma_s_i32_s | 0.0220 | 0.0240 | 0.9167 | 0.0024 | 10 |
3arith/fma_s_s_s | 0.0290 | 0.0240 | 1.208 | 0.0004 | 60 |
3arith/mul_cd_cd | N/A | N/A | N/A | 0.0018 | N/A |
3arith/mul_cs_cs | N/A | N/A | N/A | 0.0017 | N/A |
3arith/mul_d_d | 0.0130 | 0.0120 | 1.083 | 0.0005 | 24 |
3arith/mul_d_i32 | 0.0120 | 0.0130 | 0.9231 | 0.0010 | 13 |
3arith/mul_i16_i16 | 0.0120 | 0.0120 | 1 | 0.0002 | 60 |
3arith/mul_i1_i1 | 0.0170 | 0.0190 | 0.8947 | 0.0001 | 190 |
3arith/mul_i32_i32 | 0.0110 | 0.0150 | 0.7333 | 0.0003 | 50 |
3arith/mul_i64_i64 | 0.0110 | 0.0130 | 0.8462 | 0.0007 | 18.57 |
3arith/mul_i8_i8 | 0.0130 | 0.0110 | 1.182 | 0.0002 | 55 |
3arith/mul_s_d | 0.0120 | 0.0120 | 1 | 0.0011 | 10.91 |
3arith/mul_s_i32 | 0.0130 | 0.0120 | 1.083 | 0.0015 | 8 |
3arith/mul_s_s | 0.0120 | 0.0130 | 0.9231 | 0.0003 | 43.33 |
4concat/concatenate/0 | 0.0020 | 0.0020 | 1 | 0.0004 | 5 |
4concat/concatenate/1 | 0.0050 | 0.0060 | 0.8333 | 0.0003 | 20 |
4concat/concatenate/2 | 0.0290 | 0.0250 | 1.16 | 0.0004 | 62.5 |
4concat/stack/0 | 0.0040 | 0.0040 | 1 | 0.0010 | 4 |
4concat/stack/1 | 0.0060 | 0.0060 | 1 | 0.0010 | 6 |
4concat/stack/2 | 0.0300 | 0.0320 | 0.9375 | 0.0011 | 29.09 |
5math/acos/0 | 1.0610 | 1.1960 | 0.8871 | 0.4452 | 2.686 |
5math/asin/0 | 1.0240 | 1.1660 | 0.8782 | 0.4621 | 2.523 |
5math/atan/0 | 0.7790 | 0.8570 | 0.909 | 0.6743 | 1.271 |
5math/cos/0 | 0.7390 | 0.6740 | 1.096 | 0.4278 | 1.576 |
5math/cosh/0 | 0.5730 | 0.7410 | 0.7733 | 0.5246 | 1.413 |
5math/exp/0 | 0.7900 | 0.8990 | 0.8788 | 0.7668 | 1.172 |
5math/log/0 | 1.5530 | 1.8140 | 0.8561 | 0.6683 | 2.714 |
5math/sin/0 | 0.7880 | 0.8120 | 0.9704 | 0.8215 | 0.9884 |
5math/sinh/0 | 0.7760 | 1.1640 | 0.6667 | 0.7041 | 1.653 |
5math/tan/0 | 0.9550 | 0.8880 | 1.075 | 0.5731 | 1.549 |
5math/tanh/0 | 0.4500 | 0.6040 | 0.745 | 0.5101 | 1.184 |

title | numcl@9602407 | numpy 1.14.2 | numcl@9602407/numpy |
---|---|---|---|
0template/dummy/0 | 0.0030 | 0.8337 | 0.003598 |
1access/read-range/0 | 0.0190 | 0.0004 | 47.5 |
1access/read-range/1 | 0.0190 | 0.0004 | 47.5 |
1access/read-range/2 | 0.1840 | 0.0004 | 460 |
1access/read-range/3 | 0.2250 | 0.0004 | 562.5 |
1access/read-range/4 | 0.2230 | 0.0005 | 446 |
1access/read/0 | 0.0020 | 0.0002 | 10 |
1access/read/1 | 0.0030 | 0.0003 | 10 |
1access/read/2 | 0.0610 | 0.0003 | 203.3 |
1access/read/3 | 0.0520 | 0.0004 | 130 |
1access/read/4 | 0.0550 | 0.0003 | 183.3 |
1access/write-batch/0 | 0.0140 | 0.0010 | 14 |
1access/write-batch/1 | 0.0580 | 0.0089 | 6.517 |
1access/write-range/0 | 0.0220 | 0.0008 | 27.5 |
1access/write-range/1 | 0.0200 | 0.0012 | 16.67 |
1access/write-range/2 | 0.0390 | 0.0011 | 35.45 |
1access/write-range/3 | 0.0420 | 0.0012 | 35 |
1access/write-range/4 | 0.0460 | 0.0012 | 38.33 |
1access/write/0 | 0.0180 | 0.0002 | 90 |
1access/write/1 | 0.0180 | 0.0008 | 22.5 |
1access/write/2 | 0.0250 | 0.0008 | 31.25 |
1access/write/3 | 0.0280 | 0.0007 | 40 |
1access/write/4 | 0.0310 | 0.0007 | 44.29 |
1allocation/arange/float32 | 0.0040 | 0.0001 | 40 |
1allocation/arange/float64 | 0.0040 | 0.0001 | 40 |
1allocation/arange/int16 | 0.0020 | 0.0001 | 20 |
1allocation/arange/int32 | 0.0020 | 0.0001 | 20 |
1allocation/arange/int64 | 0.0040 | 0.0001 | 40 |
1allocation/arange/int8 | 0.0000 | 0.0001 | 0 |
1allocation/copy/bool | 0.0040 | 0.0628 | 0.06369 |
1allocation/copy/float32 | 0.1260 | 0.0583 | 2.161 |
1allocation/copy/float64 | 0.3200 | 0.0646 | 4.954 |
1allocation/copy/int16 | 0.0510 | 0.0572 | 0.8916 |
1allocation/copy/int32 | 0.1260 | 0.0636 | 1.981 |
1allocation/copy/int64 | 0.3450 | 0.0551 | 6.261 |
1allocation/copy/int8 | 0.0260 | 0.0582 | 0.4467 |
1allocation/empty/bool | 0.0030 | 0.0001 | 30 |
1allocation/empty/float32 | 0.0700 | 0.0001 | 700 |
1allocation/empty/float64 | 0.1420 | 0.0001 | 1420 |
1allocation/empty/int16 | 0.0400 | 0.0001 | 400 |
1allocation/empty/int32 | 0.0680 | 0.0001 | 680 |
1allocation/empty/int64 | 0.1430 | 0.0001 | 1430 |
1allocation/empty/int8 | 0.0190 | 0.0001 | 190 |
1allocation/ones/bool | 0.0030 | 0.0030 | 1 |
1allocation/ones/float32 | 0.1210 | 0.0276 | 4.384 |
1allocation/ones/float64 | 0.2720 | 0.1137 | 2.392 |
1allocation/ones/int16 | 0.0460 | 0.0060 | 7.667 |
1allocation/ones/int32 | 0.1450 | 0.0390 | 3.718 |
1allocation/ones/int64 | 0.2800 | 0.1328 | 2.108 |
1allocation/ones/int8 | 0.0230 | 0.0024 | 9.583 |
1allocation/zeros/bool | 0.0030 | 0.0025 | 1.2 |
1allocation/zeros/float32 | 0.1170 | 0.0280 | 4.179 |
1allocation/zeros/float64 | 0.2790 | 0.1277 | 2.185 |
1allocation/zeros/int16 | 0.0520 | 0.0049 | 10.61 |
1allocation/zeros/int32 | 0.1220 | 0.0238 | 5.126 |
1allocation/zeros/int64 | 0.2980 | 0.1259 | 2.367 |
1allocation/zeros/int8 | 0.0260 | 0.0022 | 11.82 |
2linarg/diag/builtin | 0.0050 | 0.0008 | 6.25 |
2linarg/diag/einsum | 0.0050 | 0.0003 | 16.67 |
2linarg/eye/builtin | 0.0000 | 0.0010 | 0 |
2linarg/eye/naive | 0.1740 | 0.0035 | 49.71 |
2linarg/gemm-large/builtin | 44.3280 | 2.3508 | 18.86 |
2linarg/gemm-large/einsum | 46.3890 | 23.8243 | 1.947 |
2linarg/gemm/builtin | 0.0490 | 0.0259 | 1.892 |
2linarg/gemm/einsum | 0.0480 | 0.0620 | 0.7742 |
2linarg/inner/builtin | 0.0050 | 0.0001 | 50 |
2linarg/inner/einsum | 0.0050 | 0.0003 | 16.67 |
2linarg/outer/builtin | 0.0060 | 0.0022 | 2.727 |
2linarg/outer/einsum | 0.0080 | 0.0013 | 6.154 |
2linarg/tri/builtin | 0.0140 | 0.0036 | 3.889 |
2linarg/tril/builtin | 0.0200 | 0.0036 | 5.556 |
2linarg/triu/builtin | 0.0190 | 0.0035 | 5.429 |
2linarg/vander/builtin | 0.0540 | 0.0068 | 7.941 |
2linarg/vdot/builtin | 0.0070 | 0.0001 | 70 |
2linarg/vdot/einsum | 0.0070 | 0.0004 | 17.5 |
3arith/add_cd_cd | N/A | 0.0013 | N/A |
3arith/add_cs_cs | N/A | 0.0010 | N/A |
3arith/add_d_d | 0.0140 | 0.0003 | 46.67 |
3arith/add_d_i32 | 0.0140 | 0.0011 | 12.73 |
3arith/add_i16_i16 | 0.0130 | 0.0002 | 65 |
3arith/add_i1_i1 | 0.0140 | 0.0001 | 140 |
3arith/add_i32_i32 | 0.0160 | 0.0003 | 53.33 |
3arith/add_i64_i64 | 0.0150 | 0.0005 | 30 |
3arith/add_i8_i8 | 0.0120 | 0.0002 | 60 |
3arith/add_s_d | 0.0140 | 0.0009 | 15.56 |
3arith/add_s_i32 | 0.0130 | 0.0016 | 8.125 |
3arith/add_s_s | 0.0120 | 0.0002 | 60 |
3arith/fma_cd_cd_cd | N/A | 0.0028 | N/A |
3arith/fma_cs_cs_cs | N/A | 0.0024 | N/A |
3arith/fma_d_d_d | 0.0250 | 0.0008 | 31.25 |
3arith/fma_d_i32_d | 0.0250 | 0.0015 | 16.67 |
3arith/fma_i16_i16_i16 | 0.0260 | 0.0004 | 65 |
3arith/fma_i1_i1_i1 | 0.0270 | 0.0003 | 90 |
3arith/fma_i32_i32_i32 | 0.0260 | 0.0007 | 37.14 |
3arith/fma_i64_i64_i64 | 0.0260 | 0.0014 | 18.57 |
3arith/fma_i8_i8_i8 | 0.0250 | 0.0003 | 83.33 |
3arith/fma_s_i32_d | N/A | 0.0018 | N/A |
3arith/fma_s_i32_s | 0.0240 | 0.0022 | 10.91 |
3arith/fma_s_s_s | 0.0230 | 0.0005 | 46 |
3arith/mul_cd_cd | N/A | 0.0018 | N/A |
3arith/mul_cs_cs | N/A | 0.0016 | N/A |
3arith/mul_d_d | 0.0140 | 0.0005 | 28 |
3arith/mul_d_i32 | 0.0130 | 0.0009 | 14.44 |
3arith/mul_i16_i16 | 0.0130 | 0.0002 | 65 |
3arith/mul_i1_i1 | 0.0130 | 0.0002 | 65 |
3arith/mul_i32_i32 | 0.0150 | 0.0004 | 37.5 |
3arith/mul_i64_i64 | 0.0140 | 0.0009 | 15.56 |
3arith/mul_i8_i8 | 0.0120 | 0.0002 | 60 |
3arith/mul_s_d | 0.0140 | 0.0009 | 15.56 |
3arith/mul_s_i32 | 0.0130 | 0.0013 | 10 |
3arith/mul_s_s | 0.0120 | 0.0003 | 40 |
4concat/concatenate/0 | 0.0030 | 0.0003 | 10 |
4concat/concatenate/1 | 0.0050 | 0.0003 | 16.67 |
4concat/concatenate/2 | 0.0290 | 0.0004 | 72.5 |
4concat/stack/0 | 0.0030 | 0.0008 | 3.75 |
4concat/stack/1 | 0.0060 | 0.0007 | 8.571 |
4concat/stack/2 | 0.0350 | 0.0009 | 38.89 |
5math/acos/0 | 1.0010 | 0.5164 | 1.938 |
5math/asin/0 | 0.9530 | 0.4243 | 2.246 |
5math/atan/0 | 0.9750 | 0.8358 | 1.167 |
5math/cos/0 | 1.2620 | 0.7917 | 1.594 |
5math/cosh/0 | 0.4600 | 0.4140 | 1.111 |
5math/exp/0 | 0.9710 | 0.8494 | 1.143 |
5math/log/0 | 1.4590 | 0.4535 | 3.217 |
5math/sin/0 | 1.1580 | 0.8832 | 1.311 |
5math/sinh/0 | 0.5890 | 0.4834 | 1.218 |
5math/tan/0 | 1.2220 | 0.7911 | 1.545 |
5math/tanh/0 | 0.3860 | 0.3471 | 1.112 |
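
To find where the gap between the two libraries is largest, rows of the last table can be parsed and ranked by the numcl/numpy quotient. The following is a minimal sketch under the assumption that each row has the layout `title | numcl | numpy | ratio |`. The helpers `parse_row` and `largest_gaps` are hypothetical, and the sample rows are copied from the table above.

```python
from typing import List, Optional, Tuple

Row = Tuple[str, Optional[float], Optional[float]]


def parse_row(line: str) -> Row:
    """Split one pipe-delimited row into (title, numcl, numpy)."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]

    def to_float(cell: str) -> Optional[float]:
        return None if cell == "N/A" else float(cell)

    return cells[0], to_float(cells[1]), to_float(cells[2])


def largest_gaps(lines: List[str], top: int = 3) -> List[Tuple[str, float]]:
    """Return the `top` rows with the highest numcl/numpy quotient."""
    ranked = []
    for line in lines:
        title, numcl, numpy_ = parse_row(line)
        # Skip rows with missing measurements or a zero denominator.
        if numcl is None or numpy_ is None or numpy_ == 0.0:
            continue
        ranked.append((title, numcl / numpy_))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked[:top]


SAMPLE = [
    "1allocation/empty/float64 | 0.1420 | 0.0001 | 1420 |",
    "2linarg/gemm-large/builtin | 44.3280 | 2.3508 | 18.86 |",
    "2linarg/gemm/einsum | 0.0480 | 0.0620 | 0.7742 |",
    "3arith/add_cd_cd | N/A | 0.0013 | N/A |",
]

for title, quotient in largest_gaps(SAMPLE):
    print(f"{title}: {quotient:.4g}")
```

The quotient is recomputed from the rounded cells, so it can differ slightly from what a full-precision calculation would give.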