mirror of
https://github.com/vortexgpgpu/vortex.git
synced 2025-04-23 21:39:10 -04:00
Merge remote-tracking branch 'origin/master' into graphics
commit
c23c0fbe4c
63 changed files with 228648 additions and 12 deletions
|
@ -1,7 +1,7 @@
|
|||
# Flubber FPGA Startup and Configuration Guide
|
||||
|
||||
OPAE environment setup
|
||||
------------------
|
||||
OPAE Environment Setup
|
||||
----------------------
|
||||
|
||||
$ source /opt/inteldevstack/init_env_user.sh
|
||||
$ export OPAE_HOME=/opt/opae/1.1.2
|
||||
|
@ -16,7 +16,7 @@ OPAE environment setup
|
|||
OPAE Build Configuration
|
||||
------------------------
|
||||
|
||||
Within the /hw/syn/opae directory, there are source text files for each core-option for the fpga build (the 32 and 64 core options are not currently implemented) which have the following parameters that can be configured:
|
||||
Within the `/hw/syn/opae` directory, there are source text files for each core-option for the fpga build (the 32 and 64 core options are not currently implemented) which have the following parameters that can be configured:
|
||||
- NUM_CORES: the number of cores per cluster
|
||||
- NUM_CLUSTERS: the number of clusters allotted to the processor
|
||||
- L3_ENABLE: enable the use of the L3 cache
|
||||
|
@ -24,7 +24,7 @@ Within the /hw/syn/opae directory, there are source text files for each core-opt
|
|||
|
||||
To enable L3 cache and profile counters for a build, simply uncomment the definition within the respective source file.
|
||||
|
||||
OPAE build
|
||||
OPAE Build
|
||||
------------------
|
||||
|
||||
The Flubber FPGA has the following configuration options:
|
||||
|
@ -35,35 +35,33 @@ The Flubber FPGA has to following configuration options:
|
|||
- 16 cores fpga (fpga-16c)
|
||||
|
||||
$ cd hw/syn/opae
|
||||
$ make fpga-`# of cores`c
|
||||
$ make fpga- *# of cores* c
|
||||
|
||||
Example: `make fpga-4c`
|
||||
|
||||
A new folder *build_fpga_`# of cores`c* will be created and the build will start and take ~30-45 min to complete.
|
||||
A new folder (ex: `build_fpga_4c`) will be created and the build will start and take ~30-45 min to complete.
|
||||
|
||||
OPAE Build Progress
|
||||
-------------------
|
||||
|
||||
You can check the last 10 lines of the build log for possible errors while the build is running.
|
||||
|
||||
$ tail -n 10 ./build_fpga_`# of cores`c/build.log
|
||||
|
||||
Example: `tail -n 10 ./build_fpga_4c/build.log`
|
||||
$ tail -n 10 ./build_fpga_4c/build.log
|
||||
|
||||
Check if the build is still running by looking for quartus_sh, quartus_syn, or quartus_fit programs.
|
||||
|
||||
$ ps -u `username`
|
||||
$ ps -u *username*
|
||||
|
||||
|
||||
If the build fails and you need to restart it, clean up the build folder using the following command:
|
||||
|
||||
$ make clean-fpga-`# of cores`c
|
||||
$ make clean-fpga- *# of cores* c
|
||||
|
||||
Example: `make clean-fpga-4c`
|
||||
|
||||
The file `vortex_afu.gbs` should exist when the build is done:
|
||||
|
||||
$ ls -lsa ./build_fpga_`# of cores`c/vortex_afu.gbs
|
||||
$ ls -lsa ./build_fpga_ *# of cores* c/vortex_afu.gbs
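The naming pattern used by the build steps above can be sketched in Python. This is only an illustrative helper, not part of the Vortex repository: the function names `fpga_build_names` and `build_finished` are hypothetical, while the target names (`fpga-4c`, `clean-fpga-4c`), the folder name (`build_fpga_4c`), and the files `build.log` and `vortex_afu.gbs` come from the guide itself.

```python
from pathlib import Path


def fpga_build_names(num_cores: int) -> dict:
    """Return the make targets and output paths for a given core count.

    Hypothetical helper: only the naming pattern is taken from the guide.
    """
    suffix = f"{num_cores}c"
    build_dir = Path(f"build_fpga_{suffix}")
    return {
        "build_target": f"fpga-{suffix}",        # used as: make fpga-4c
        "clean_target": f"clean-fpga-{suffix}",  # used as: make clean-fpga-4c
        "build_dir": build_dir,
        "build_log": build_dir / "build.log",
        "bitstream": build_dir / "vortex_afu.gbs",
    }


def build_finished(num_cores: int, root: Path = Path(".")) -> bool:
    """True once the bitstream exists under `root` (assumed to be hw/syn/opae)."""
    return (root / fpga_build_names(num_cores)["bitstream"]).is_file()


names = fpga_build_names(4)
print(names["build_target"])            # fpga-4c
print(names["bitstream"].as_posix())    # build_fpga_4c/vortex_afu.gbs
```

Calling `build_finished(4)` from the `hw/syn/opae` directory is then equivalent to the `ls` check above: it only reports whether the file is present, not whether the build succeeded.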
|
||||
|
||||
|
||||
Signing the bitstream and Programming the FPGA
|
35
doc/Simulation.md
Normal file
|
@ -0,0 +1,35 @@
|
|||
# Vortex Simulation Methods
|
||||
|
||||
### RTL Simulation
|
||||
|
||||
[Verilator](https://www.veripool.org/projects/verilator/wiki) is a Verilog/SystemVerilog design simulator that converts the Verilog HDL to single- or multi-threaded C++/SystemC code to perform the design simulation. An installation guide for Verilator is located [here.](https://www.veripool.org/projects/verilator/wiki/Installing)
|
||||
|
||||
### Cycle-Approximate Simulation
|
||||
|
||||
SimX is a C++ cycle-level in-house simulator developed for Vortex. The relevant files are located in the `simX` folder.
|
||||
|
||||
### FPGA Simulation
|
||||
|
||||
The current target FPGA for simulation is the Arria10 Intel Accelerator Card v1.0. The guide to build the fpga with specific configurations is located [here.](https://github.com/vortexgpgpu/vortex-dev/blob/master/doc/Flubber_FPGA_Startup_Guide.md)
|
||||
|
||||
### How to Test
|
||||
|
||||
Running tests under specific drivers (rtlsim, simx, fpga) is done using the script named `blackbox.sh` located in the `ci` folder. Running the command `./ci/blackbox.sh --help` from the Vortex root directory displays the following command line arguments for `blackbox.sh`:
|
||||
|
||||
- *Clusters* - used to specify the number of clusters (collection of processing elements) within a configuration.
|
||||
- *Cores* - used to specify the number of cores (processing element containing multiple warps) within a configuration.
|
||||
- *Warps* - used to specify the number of warps (collection of concurrent hardware threads) within a configuration.
|
||||
- *Threads* - used to specify the number of threads (smallest unit of computation) within a configuration.
|
||||
- *L2cache* - used to enable the shared l2cache among the Vortex cores.
|
||||
- *L3cache* - used to enable the shared l3cache among the Vortex clusters.
|
||||
- *Driver* - used to specify which driver to run the Vortex simulation (either rtlsim, vlsim, fpga, or simx).
|
||||
- *Debug* - used to enable debug mode for the Vortex simulation.
|
||||
- *Scope* -
|
||||
- *Perf* - is used to enable the detailed performance counters within the Vortex simulation.
|
||||
- *App* - is used to specify which test/benchmark to run in the Vortex simulation. The main choices are vecadd, sgemm, basic, demo, and dogfood. Other tests/benchmarks are located in the `/benchmarks/opencl` folder, though not all of them work with the current version of Vortex.
|
||||
- *Args* -
|
||||
|
||||
Example use of command line arguments: Run the sgemm benchmark using the vlsim driver with a Vortex configuration of 1 cluster, 4 cores, 4 warps, and 4 threads.
|
||||
|
||||
$ ./ci/blackbox.sh --clusters=1 --cores=4 --warps=4 --threads=4 --driver=vlsim --app=sgemm
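When several configurations need to be run, the command line above can be assembled programmatically. The following is a minimal sketch, assuming only the flags shown in the example (`--clusters`, `--cores`, `--warps`, `--threads`, `--driver`, `--app`) and the driver names listed above; the helper `blackbox_command` is hypothetical and is not part of the `ci` folder.

```python
import shlex

# Driver names as listed in the argument description above.
KNOWN_DRIVERS = {"rtlsim", "vlsim", "fpga", "simx"}


def blackbox_command(*, clusters, cores, warps, threads, driver, app):
    """Build the argument list for ./ci/blackbox.sh (hypothetical helper)."""
    if driver not in KNOWN_DRIVERS:
        raise ValueError(f"unknown driver: {driver!r}")
    return [
        "./ci/blackbox.sh",
        f"--clusters={clusters}",
        f"--cores={cores}",
        f"--warps={warps}",
        f"--threads={threads}",
        f"--driver={driver}",
        f"--app={app}",
    ]


cmd = blackbox_command(
    clusters=1, cores=4, warps=4, threads=4, driver="vlsim", app="sgemm"
)
# Print the command; to execute it, pass `cmd` to subprocess.run(cmd, check=True)
# from the Vortex root directory.
print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps each argument separate, so it can be handed to `subprocess.run` without going through a shell.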
|
||||
|
17
evaluation/perf_2021_03_07/16c/afu_default.fit.summary
Normal file
|
@ -0,0 +1,17 @@
|
|||
Fitter Status : Successful - Sat Mar 6 08:45:37 2021
|
||||
Quartus Prime Version : 19.2.0 Build 57 06/24/2019 Patches 0.01rc SJ Pro Edition
|
||||
Revision Name : afu_default
|
||||
Top-level Entity Name : dcp_top
|
||||
Family : Arria 10
|
||||
Device : 10AX115N2F40E2LG
|
||||
Timing Models : Final
|
||||
Logic utilization (in ALMs) : 359,139 / 427,200 ( 84 % )
|
||||
Total registers : 546782
|
||||
Total pins : 310 / 826 ( 38 % )
|
||||
Total virtual pins : 0
|
||||
Total block memory bits : 12,692,200 / 55,562,240 ( 23 % )
|
||||
Total RAM Blocks : 2,285 / 2,713 ( 84 % )
|
||||
Total DSP Blocks : 448 / 1,518 ( 30 % )
|
||||
Total HSSI RX channels : 12 / 48 ( 25 % )
|
||||
Total HSSI TX channels : 12 / 48 ( 25 % )
|
||||
Total PLLs : 25 / 112 ( 22 % )
|
6945
evaluation/perf_2021_03_07/16c/afu_default.sta.summary
Normal file
File diff suppressed because it is too large
4
evaluation/perf_2021_03_07/16c/afu_default.syn.summary
Normal file
|
@ -0,0 +1,4 @@
|
|||
Synthesis Status : Successful - Sat Mar 6 05:12:07 2021
|
||||
Revision Name : afu_default
|
||||
Top-level Entity Name : dcp_top
|
||||
Family : Arria 10
|
47882
evaluation/perf_2021_03_07/16c/build.log
Normal file
File diff suppressed because it is too large
29
evaluation/perf_2021_03_07/16c/guassian.result
Normal file
|
@ -0,0 +1,29 @@
|
|||
CONFIGS=-DNUM_CLUSTERS=1 -DNUM_CORES=2 -DNUM_WARPS=4 -DNUM_THREADS=4 -DL2_ENABLE=0 -DL3_ENABLE=0 -DPERF_ENABLE
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
rm -rf libvortex.so *.o .depend
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/guassian'
|
||||
LD_LIBRARY_PATH=/opt/pocl/runtime/lib:/nethome/lcooper43/vortex-dev-old/driver/opae:/opt/opae/1.1.2/lib:/opt/inteldevstack/a10_gx_pac_ias_1_2_1_pv/opencl/opencl_bsp/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/host/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/linux64/lib: ./guassian
|
||||
enter demo main
|
||||
[VXDRV] DEVCAPS: version=0, num_cores=16, num_warps=4, num_threads=4
|
||||
OK
|
||||
The result of matrix m is:
|
||||
0.00 0.00 0.00 0.00
|
||||
0.50 0.00 0.00 0.00
|
||||
0.67 0.26 0.00 0.00
|
||||
-0.00 0.15 -0.28 0.00
|
||||
|
||||
The result of matrix a is:
|
||||
-0.60 -0.50 0.70 0.30
|
||||
0.00 -0.65 -0.05 0.55
|
||||
0.00 0.00 -0.75 -1.14
|
||||
0.00 0.00 0.00 0.50
|
||||
|
||||
The result of array b is:
|
||||
-0.85 -0.25 0.87 -0.25
|
||||
|
||||
The final solution is:
|
||||
0.70 0.00 -0.40 -0.50
|
||||
|
||||
Passed!
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/guassian'
|
19
evaluation/perf_2021_03_07/16c/nearn.result
Normal file
|
@ -0,0 +1,19 @@
|
|||
CONFIGS=-DNUM_CLUSTERS=1 -DNUM_CORES=2 -DNUM_WARPS=4 -DNUM_THREADS=4 -DL2_ENABLE=0 -DL3_ENABLE=0 -DPERF_ENABLE
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
rm -rf libvortex.so *.o .depend
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/nearn'
|
||||
LD_LIBRARY_PATH=/opt/pocl/runtime/lib:/nethome/lcooper43/vortex-dev-old/driver/opae:/opt/opae/1.1.2/lib:/opt/inteldevstack/a10_gx_pac_ias_1_2_1_pv/opencl/opencl_bsp/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/host/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/linux64/lib: ./nearn
|
||||
loading db: cane4_0.db
|
||||
loading db: cane4_1.db
|
||||
loading db: cane4_2.db
|
||||
Number of records: 1500
|
||||
Finding the 5 closest neighbors.
|
||||
[VXDRV] DEVCAPS: version=0, num_cores=16, num_warps=4, num_threads=4
|
||||
1974 12 22 18 24 JOYCE 30.6 89.9 80 593 --> Distance=0.608276
|
||||
1965 5 13 0 17 TONY 27.8 89.0 122 260 --> Distance=2.416610
|
||||
1991 3 18 12 19 DEBBY 28.5 87.8 107 850 --> Distance=2.662703
|
||||
1957 4 17 6 12 ALBERTO 32.5 87.8 54 510 --> Distance=3.330163
|
||||
1964 8 5 6 9 FLORENCE 31.5 86.3 18 242 --> Distance=3.992490
|
||||
Passed!
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/nearn'
|
19
evaluation/perf_2021_03_07/16c/saxpy.result
Normal file
|
@ -0,0 +1,19 @@
|
|||
CONFIGS=-DNUM_CLUSTERS=1 -DNUM_CORES=2 -DNUM_WARPS=4 -DNUM_THREADS=4 -DL2_ENABLE=0 -DL3_ENABLE=0 -DPERF_ENABLE
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
rm -rf libvortex.so *.o .depend
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/saxpy'
|
||||
LD_LIBRARY_PATH=/opt/pocl/runtime/lib:/nethome/lcooper43/vortex-dev-old/driver/opae:/opt/opae/1.1.2/lib:/opt/inteldevstack/a10_gx_pac_ias_1_2_1_pv/opencl/opencl_bsp/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/host/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/linux64/lib: ./saxpy
|
||||
enter demo main
|
||||
[VXDRV] DEVCAPS: version=0, num_cores=16, num_warps=4, num_threads=4
|
||||
Attempting to create program from binary...
|
||||
Read program from binary.
|
||||
attempting to create input buffer
|
||||
attempting to create output buffer
|
||||
attempting to create kernel
|
||||
setting up kernel args
|
||||
attempting to enqueue write buffer
|
||||
attempting to enqueue kernel
|
||||
Elapsed time: 4 ms
|
||||
Download destination buffer
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/saxpy'
|
19
evaluation/perf_2021_03_07/16c/sfilter.result
Normal file
|
@ -0,0 +1,19 @@
|
|||
CONFIGS=-DNUM_CLUSTERS=1 -DNUM_CORES=2 -DNUM_WARPS=4 -DNUM_THREADS=4 -DL2_ENABLE=0 -DL3_ENABLE=0 -DPERF_ENABLE
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
rm -rf libvortex.so *.o .depend
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/sfilter'
|
||||
LD_LIBRARY_PATH=/opt/pocl/runtime/lib:/nethome/lcooper43/vortex-dev-old/driver/opae:/opt/opae/1.1.2/lib:/opt/inteldevstack/a10_gx_pac_ias_1_2_1_pv/opencl/opencl_bsp/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/host/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/linux64/lib: ./sfilter
|
||||
enter demo main
|
||||
[VXDRV] DEVCAPS: version=0, num_cores=16, num_warps=4, num_threads=4
|
||||
Attempting to create program from binary...
|
||||
Read program from binary.
|
||||
attempting to create input buffer
|
||||
attempting to create output buffer
|
||||
attempting to create kernel
|
||||
setting up kernel args
|
||||
attempting to enqueue write buffer
|
||||
attempting to enqueue kernel
|
||||
Elapsed time: 4 ms
|
||||
Download destination buffer
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/sfilter'
|
458
evaluation/perf_2021_03_07/16c/sgemm.result
Normal file
|
@ -0,0 +1,458 @@
|
|||
CONFIGS=-DNUM_CLUSTERS=1 -DNUM_CORES=2 -DNUM_WARPS=4 -DNUM_THREADS=4 -DL2_ENABLE=0 -DL3_ENABLE=0 -DPERF_ENABLE
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
rm -rf libvortex.so *.o .depend
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/sgemm'
|
||||
LD_LIBRARY_PATH=/opt/pocl/runtime/lib:/nethome/lcooper43/vortex-dev-old/driver/opae:/opt/opae/1.1.2/lib:/opt/inteldevstack/a10_gx_pac_ias_1_2_1_pv/opencl/opencl_bsp/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/host/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/linux64/lib: ./sgemm -n32
|
||||
[VXDRV] DEVCAPS: version=0, num_cores=16, num_warps=4, num_threads=4
|
||||
Create context
|
||||
Create program from kernel source
|
||||
Upload source buffers
|
||||
Execute the kernel
|
||||
Elapsed time: 4 ms
|
||||
Download destination buffer
|
||||
Verify result
|
||||
PASSED!
|
||||
PERF: core0: instrs=23498, cycles=16249, IPC=1.446120
|
||||
PERF: core0: ibuffer stalls=2272
|
||||
PERF: core0: scoreboard stalls=4197
|
||||
PERF: core0: alu unit stalls=737
|
||||
PERF: core0: lsu unit stalls=355
|
||||
PERF: core0: csr unit stalls=0
|
||||
PERF: core0: fpu unit stalls=3
|
||||
PERF: core0: gpu unit stalls=0
|
||||
PERF: core0: icache reads=6155
|
||||
PERF: core0: icache read misses=73 (hit ratio=98%)
|
||||
PERF: core0: icache pipeline stalls=2466
|
||||
PERF: core0: icache reponse stalls=2272
|
||||
PERF: core0: dcache reads=2862
|
||||
PERF: core0: dcache writes=101
|
||||
PERF: core0: dcache read misses=634 (hit ratio=77%)
|
||||
PERF: core0: dcache write misses=97 (hit ratio=3%)
|
||||
PERF: core0: dcache bank stalls=2189 (utilization=57%)
|
||||
PERF: core0: dcache mshr stalls=2617
|
||||
PERF: core0: dcache pipeline stalls=4967
|
||||
PERF: core0: dcache reponse stalls=16
|
||||
PERF: core0: smem reads=538
|
||||
PERF: core0: smem writes=447
|
||||
PERF: core0: smem bank stalls=0 (utilization=100%)
|
||||
PERF: core0: dram requests=226 (reads=125, writes=101)
|
||||
PERF: core0: dram stalls=1211 (utilization=15%)
|
||||
PERF: core0: dram average latency=31 cycles
|
||||
PERF: core1: instrs=23498, cycles=16180, IPC=1.452287
|
||||
PERF: core1: ibuffer stalls=2244
|
||||
PERF: core1: scoreboard stalls=4144
|
||||
PERF: core1: alu unit stalls=735
|
||||
PERF: core1: lsu unit stalls=399
|
||||
PERF: core1: csr unit stalls=0
|
||||
PERF: core1: fpu unit stalls=1
|
||||
PERF: core1: gpu unit stalls=0
|
||||
PERF: core1: icache reads=6155
|
||||
PERF: core1: icache read misses=73 (hit ratio=98%)
|
||||
PERF: core1: icache pipeline stalls=2462
|
||||
PERF: core1: icache reponse stalls=2244
|
||||
PERF: core1: dcache reads=2862
|
||||
PERF: core1: dcache writes=101
|
||||
PERF: core1: dcache read misses=635 (hit ratio=77%)
|
||||
PERF: core1: dcache write misses=97 (hit ratio=3%)
|
||||
PERF: core1: dcache bank stalls=2190 (utilization=57%)
|
||||
PERF: core1: dcache mshr stalls=2515
|
||||
PERF: core1: dcache pipeline stalls=4793
|
||||
PERF: core1: dcache reponse stalls=16
|
||||
PERF: core1: smem reads=538
|
||||
PERF: core1: smem writes=447
|
||||
PERF: core1: smem bank stalls=0 (utilization=100%)
|
||||
PERF: core1: dram requests=227 (reads=126, writes=101)
|
||||
PERF: core1: dram stalls=1257 (utilization=15%)
|
||||
PERF: core1: dram average latency=30 cycles
|
||||
PERF: core2: instrs=23498, cycles=16179, IPC=1.452376
|
||||
PERF: core2: ibuffer stalls=2224
|
||||
PERF: core2: scoreboard stalls=4120
|
||||
PERF: core2: alu unit stalls=730
|
||||
PERF: core2: lsu unit stalls=423
|
||||
PERF: core2: csr unit stalls=0
|
||||
PERF: core2: fpu unit stalls=2
|
||||
PERF: core2: gpu unit stalls=0
|
||||
PERF: core2: icache reads=6155
|
||||
PERF: core2: icache read misses=73 (hit ratio=98%)
|
||||
PERF: core2: icache pipeline stalls=2455
|
||||
PERF: core2: icache reponse stalls=2224
|
||||
PERF: core2: dcache reads=2862
|
||||
PERF: core2: dcache writes=101
|
||||
PERF: core2: dcache read misses=634 (hit ratio=77%)
|
||||
PERF: core2: dcache write misses=97 (hit ratio=3%)
|
||||
PERF: core2: dcache bank stalls=2187 (utilization=57%)
|
||||
PERF: core2: dcache mshr stalls=2417
|
||||
PERF: core2: dcache pipeline stalls=4427
|
||||
PERF: core2: dcache reponse stalls=16
|
||||
PERF: core2: smem reads=538
|
||||
PERF: core2: smem writes=447
|
||||
PERF: core2: smem bank stalls=0 (utilization=100%)
|
||||
PERF: core2: dram requests=226 (reads=125, writes=101)
|
||||
PERF: core2: dram stalls=1123 (utilization=16%)
|
||||
PERF: core2: dram average latency=31 cycles
|
||||
PERF: core3: instrs=23498, cycles=16102, IPC=1.459322
|
||||
PERF: core3: ibuffer stalls=2190
|
||||
PERF: core3: scoreboard stalls=4072
|
||||
PERF: core3: alu unit stalls=741
|
||||
PERF: core3: lsu unit stalls=410
|
||||
PERF: core3: csr unit stalls=0
|
||||
PERF: core3: fpu unit stalls=1
|
||||
PERF: core3: gpu unit stalls=0
|
||||
PERF: core3: icache reads=6155
|
||||
PERF: core3: icache read misses=73 (hit ratio=98%)
|
||||
PERF: core3: icache pipeline stalls=2380
|
||||
PERF: core3: icache reponse stalls=2190
|
||||
PERF: core3: dcache reads=2862
|
||||
PERF: core3: dcache writes=101
|
||||
PERF: core3: dcache read misses=634 (hit ratio=77%)
|
||||
PERF: core3: dcache write misses=97 (hit ratio=3%)
|
||||
PERF: core3: dcache bank stalls=2192 (utilization=57%)
|
||||
PERF: core3: dcache mshr stalls=2345
|
||||
PERF: core3: dcache pipeline stalls=3768
|
||||
PERF: core3: dcache reponse stalls=16
|
||||
PERF: core3: smem reads=538
|
||||
PERF: core3: smem writes=447
|
||||
PERF: core3: smem bank stalls=0 (utilization=100%)
|
||||
PERF: core3: dram requests=226 (reads=125, writes=101)
|
||||
PERF: core3: dram stalls=699 (utilization=24%)
|
||||
PERF: core3: dram average latency=30 cycles
|
||||
PERF: core4: instrs=23498, cycles=16254, IPC=1.445675
|
||||
PERF: core4: ibuffer stalls=2311
|
||||
PERF: core4: scoreboard stalls=4269
|
||||
PERF: core4: alu unit stalls=733
|
||||
PERF: core4: lsu unit stalls=377
|
||||
PERF: core4: csr unit stalls=0
|
||||
PERF: core4: fpu unit stalls=0
|
||||
PERF: core4: gpu unit stalls=0
|
||||
PERF: core4: icache reads=6155
|
||||
PERF: core4: icache read misses=73 (hit ratio=98%)
|
||||
PERF: core4: icache pipeline stalls=2532
|
||||
PERF: core4: icache reponse stalls=2311
|
||||
PERF: core4: dcache reads=2862
|
||||
PERF: core4: dcache writes=101
|
||||
PERF: core4: dcache read misses=653 (hit ratio=77%)
|
||||
PERF: core4: dcache write misses=97 (hit ratio=3%)
|
||||
PERF: core4: dcache bank stalls=2189 (utilization=57%)
|
||||
PERF: core4: dcache mshr stalls=2519
|
||||
PERF: core4: dcache pipeline stalls=4555
|
||||
PERF: core4: dcache reponse stalls=16
|
||||
PERF: core4: smem reads=538
|
||||
PERF: core4: smem writes=447
|
||||
PERF: core4: smem bank stalls=0 (utilization=100%)
|
||||
PERF: core4: dram requests=233 (reads=132, writes=101)
|
||||
PERF: core4: dram stalls=1018 (utilization=18%)
|
||||
PERF: core4: dram average latency=30 cycles
|
||||
PERF: core5: instrs=23498, cycles=16177, IPC=1.452556
|
||||
PERF: core5: ibuffer stalls=2232
|
||||
PERF: core5: scoreboard stalls=4137
|
||||
PERF: core5: alu unit stalls=730
|
||||
PERF: core5: lsu unit stalls=411
|
||||
PERF: core5: csr unit stalls=0
|
||||
PERF: core5: fpu unit stalls=1
|
||||
PERF: core5: gpu unit stalls=0
|
||||
PERF: core5: icache reads=6155
|
||||
PERF: core5: icache read misses=73 (hit ratio=98%)
|
||||
PERF: core5: icache pipeline stalls=2454
|
||||
PERF: core5: icache reponse stalls=2232
|
||||
PERF: core5: dcache reads=2862
|
||||
PERF: core5: dcache writes=101
|
||||
PERF: core5: dcache read misses=634 (hit ratio=77%)
|
||||
PERF: core5: dcache write misses=97 (hit ratio=3%)
|
||||
PERF: core5: dcache bank stalls=2184 (utilization=57%)
|
||||
PERF: core5: dcache mshr stalls=2446
|
||||
PERF: core5: dcache pipeline stalls=4560
|
||||
PERF: core5: dcache reponse stalls=16
|
||||
PERF: core5: smem reads=538
|
||||
PERF: core5: smem writes=447
|
||||
PERF: core5: smem bank stalls=0 (utilization=100%)
|
||||
PERF: core5: dram requests=226 (reads=125, writes=101)
|
||||
PERF: core5: dram stalls=1086 (utilization=17%)
|
||||
PERF: core5: dram average latency=30 cycles
|
||||
PERF: core6: instrs=23498, cycles=16164, IPC=1.453724
|
||||
PERF: core6: ibuffer stalls=2228
|
||||
PERF: core6: scoreboard stalls=4108
|
||||
PERF: core6: alu unit stalls=727
|
||||
PERF: core6: lsu unit stalls=419
|
||||
PERF: core6: csr unit stalls=0
|
||||
PERF: core6: fpu unit stalls=3
|
||||
PERF: core6: gpu unit stalls=0
|
||||
PERF: core6: icache reads=6155
|
||||
PERF: core6: icache read misses=73 (hit ratio=98%)
|
||||
PERF: core6: icache pipeline stalls=2434
|
||||
PERF: core6: icache reponse stalls=2228
|
||||
PERF: core6: dcache reads=2862
|
||||
PERF: core6: dcache writes=101
|
||||
PERF: core6: dcache read misses=634 (hit ratio=77%)
|
||||
PERF: core6: dcache write misses=97 (hit ratio=3%)
|
||||
PERF: core6: dcache bank stalls=2190 (utilization=57%)
|
||||
PERF: core6: dcache mshr stalls=2451
|
||||
PERF: core6: dcache pipeline stalls=4321
|
||||
PERF: core6: dcache reponse stalls=16
|
||||
PERF: core6: smem reads=538
|
||||
PERF: core6: smem writes=447
|
||||
PERF: core6: smem bank stalls=0 (utilization=100%)
|
||||
PERF: core6: dram requests=226 (reads=125, writes=101)
|
||||
PERF: core6: dram stalls=930 (utilization=19%)
|
||||
PERF: core6: dram average latency=31 cycles
|
||||
PERF: core7: instrs=23498, cycles=16105, IPC=1.459050
|
||||
PERF: core7: ibuffer stalls=2189
|
||||
PERF: core7: scoreboard stalls=4068
|
||||
PERF: core7: alu unit stalls=746
|
||||
PERF: core7: lsu unit stalls=411
|
||||
PERF: core7: csr unit stalls=0
|
||||
PERF: core7: fpu unit stalls=0
|
||||
PERF: core7: gpu unit stalls=0
|
||||
PERF: core7: icache reads=6155
|
||||
PERF: core7: icache read misses=73 (hit ratio=98%)
|
||||
PERF: core7: icache pipeline stalls=2369
|
||||
PERF: core7: icache reponse stalls=2189
|
||||
PERF: core7: dcache reads=2862
|
||||
PERF: core7: dcache writes=101
|
||||
PERF: core7: dcache read misses=634 (hit ratio=77%)
|
||||
PERF: core7: dcache write misses=97 (hit ratio=3%)
|
||||
PERF: core7: dcache bank stalls=2189 (utilization=57%)
|
||||
PERF: core7: dcache mshr stalls=2357
|
||||
PERF: core7: dcache pipeline stalls=3798
|
||||
PERF: core7: dcache reponse stalls=16
|
||||
PERF: core7: smem reads=538
|
||||
PERF: core7: smem writes=447
|
||||
PERF: core7: smem bank stalls=0 (utilization=100%)
|
||||
PERF: core7: dram requests=226 (reads=125, writes=101)
|
||||
PERF: core7: dram stalls=763 (utilization=22%)
|
||||
PERF: core7: dram average latency=30 cycles
|
||||
PERF: core8: instrs=23498, cycles=16256, IPC=1.445497
|
||||
PERF: core8: ibuffer stalls=2249
|
||||
PERF: core8: scoreboard stalls=4153
|
||||
PERF: core8: alu unit stalls=740
|
||||
PERF: core8: lsu unit stalls=382
|
||||
PERF: core8: csr unit stalls=0
|
||||
PERF: core8: fpu unit stalls=4
|
||||
PERF: core8: gpu unit stalls=0
|
||||
PERF: core8: icache reads=6155
|
||||
PERF: core8: icache read misses=73 (hit ratio=98%)
|
||||
PERF: core8: icache pipeline stalls=2457
|
||||
PERF: core8: icache reponse stalls=2249
|
||||
PERF: core8: dcache reads=2862
|
||||
PERF: core8: dcache writes=101
|
||||
PERF: core8: dcache read misses=634 (hit ratio=77%)
|
||||
PERF: core8: dcache write misses=97 (hit ratio=3%)
|
||||
PERF: core8: dcache bank stalls=2193 (utilization=57%)
|
||||
PERF: core8: dcache mshr stalls=2563
|
||||
PERF: core8: dcache pipeline stalls=5209
|
||||
PERF: core8: dcache reponse stalls=15
|
||||
PERF: core8: smem reads=538
|
||||
PERF: core8: smem writes=447
|
||||
PERF: core8: smem bank stalls=0 (utilization=100%)
|
||||
PERF: core8: dram requests=226 (reads=125, writes=101)
|
||||
PERF: core8: dram stalls=1474 (utilization=13%)
|
||||
PERF: core8: dram average latency=31 cycles
|
||||
PERF: core9: instrs=23498, cycles=16264, IPC=1.444786
|
||||
PERF: core9: ibuffer stalls=2245
|
||||
PERF: core9: scoreboard stalls=4151
|
||||
PERF: core9: alu unit stalls=742
|
||||
PERF: core9: lsu unit stalls=385
|
||||
PERF: core9: csr unit stalls=0
|
||||
PERF: core9: fpu unit stalls=2
|
||||
PERF: core9: gpu unit stalls=0
|
||||
PERF: core9: icache reads=6155
|
||||
PERF: core9: icache read misses=73 (hit ratio=98%)
|
||||
PERF: core9: icache pipeline stalls=2471
|
||||
PERF: core9: icache reponse stalls=2245
|
||||
PERF: core9: dcache reads=2862
|
||||
PERF: core9: dcache writes=101
|
||||
PERF: core9: dcache read misses=634 (hit ratio=77%)
|
||||
PERF: core9: dcache write misses=97 (hit ratio=3%)
|
||||
PERF: core9: dcache bank stalls=2200 (utilization=57%)
|
||||
PERF: core9: dcache mshr stalls=2548
|
||||
PERF: core9: dcache pipeline stalls=5160
|
||||
PERF: core9: dcache reponse stalls=16
|
||||
PERF: core9: smem reads=538
|
||||
PERF: core9: smem writes=447
|
||||
PERF: core9: smem bank stalls=0 (utilization=100%)
|
||||
PERF: core9: dram requests=226 (reads=125, writes=101)
|
||||
PERF: core9: dram stalls=1449 (utilization=13%)
|
||||
PERF: core9: dram average latency=31 cycles
|
||||
PERF: core10: instrs=23498, cycles=16253, IPC=1.445764
|
||||
PERF: core10: ibuffer stalls=2228
|
||||
PERF: core10: scoreboard stalls=4119
|
||||
PERF: core10: alu unit stalls=724
|
||||
PERF: core10: lsu unit stalls=420
|
||||
PERF: core10: csr unit stalls=0
|
||||
PERF: core10: fpu unit stalls=4
|
||||
PERF: core10: gpu unit stalls=0
|
||||
PERF: core10: icache reads=6155
|
||||
PERF: core10: icache read misses=73 (hit ratio=98%)
|
||||
PERF: core10: icache pipeline stalls=2457
|
||||
PERF: core10: icache reponse stalls=2228
|
||||
PERF: core10: dcache reads=2862
|
||||
PERF: core10: dcache writes=101
|
||||
PERF: core10: dcache read misses=634 (hit ratio=77%)
|
||||
PERF: core10: dcache write misses=97 (hit ratio=3%)
|
||||
PERF: core10: dcache bank stalls=2182 (utilization=57%)
|
||||
PERF: core10: dcache mshr stalls=2427
|
||||
PERF: core10: dcache pipeline stalls=4855
|
||||
PERF: core10: dcache reponse stalls=16
|
||||
PERF: core10: smem reads=538
|
||||
PERF: core10: smem writes=447
|
||||
PERF: core10: smem bank stalls=0 (utilization=100%)
|
||||
PERF: core10: dram requests=226 (reads=125, writes=101)
|
||||
PERF: core10: dram stalls=1326 (utilization=14%)
|
||||
PERF: core10: dram average latency=31 cycles
|
||||
PERF: core11: instrs=23498, cycles=16175, IPC=1.452736
|
||||
PERF: core11: ibuffer stalls=2225
|
||||
PERF: core11: scoreboard stalls=4114
|
||||
PERF: core11: alu unit stalls=734
|
||||
PERF: core11: lsu unit stalls=425
|
||||
PERF: core11: csr unit stalls=0
|
||||
PERF: core11: fpu unit stalls=0
|
||||
PERF: core11: gpu unit stalls=0
|
||||
PERF: core11: icache reads=6155
|
||||
PERF: core11: icache read misses=73 (hit ratio=98%)
|
||||
PERF: core11: icache pipeline stalls=2448
|
||||
PERF: core11: icache reponse stalls=2225
|
||||
PERF: core11: dcache reads=2862
|
||||
PERF: core11: dcache writes=101
|
||||
PERF: core11: dcache read misses=634 (hit ratio=77%)
|
||||
PERF: core11: dcache write misses=97 (hit ratio=3%)
|
||||
PERF: core11: dcache bank stalls=2195 (utilization=57%)
|
||||
PERF: core11: dcache mshr stalls=2455
|
||||
PERF: core11: dcache pipeline stalls=4007
|
||||
PERF: core11: dcache reponse stalls=15
|
||||
PERF: core11: smem reads=538
|
||||
PERF: core11: smem writes=447
|
||||
PERF: core11: smem bank stalls=0 (utilization=100%)
|
||||
PERF: core11: dram requests=226 (reads=125, writes=101)
|
||||
PERF: core11: dram stalls=967 (utilization=18%)
|
||||
PERF: core11: dram average latency=31 cycles
|
||||
PERF: core12: instrs=23498, cycles=16248, IPC=1.446209
|
||||
PERF: core12: ibuffer stalls=2243
|
||||
PERF: core12: scoreboard stalls=4147
|
||||
PERF: core12: alu unit stalls=745
|
||||
PERF: core12: lsu unit stalls=391
|
||||
PERF: core12: csr unit stalls=0
|
||||
PERF: core12: fpu unit stalls=2
|
||||
PERF: core12: gpu unit stalls=0
|
||||
PERF: core12: icache reads=6155
|
||||
PERF: core12: icache read misses=73 (hit ratio=98%)
|
||||
PERF: core12: icache pipeline stalls=2456
|
||||
PERF: core12: icache reponse stalls=2243
|
||||
PERF: core12: dcache reads=2862
|
||||
PERF: core12: dcache writes=101
|
||||
PERF: core12: dcache read misses=634 (hit ratio=77%)
|
||||
PERF: core12: dcache write misses=97 (hit ratio=3%)
|
||||
PERF: core12: dcache bank stalls=2198 (utilization=57%)
|
||||
PERF: core12: dcache mshr stalls=2515
|
||||
PERF: core12: dcache pipeline stalls=4956
|
||||
PERF: core12: dcache reponse stalls=16
|
||||
PERF: core12: smem reads=538
|
||||
PERF: core12: smem writes=447
|
||||
PERF: core12: smem bank stalls=0 (utilization=100%)
|
||||
PERF: core12: dram requests=226 (reads=125, writes=101)
|
||||
PERF: core12: dram stalls=1387 (utilization=14%)
|
||||
PERF: core12: dram average latency=31 cycles
|
||||
PERF: core13: instrs=23498, cycles=16176, IPC=1.452646
|
||||
PERF: core13: ibuffer stalls=2224
|
||||
PERF: core13: scoreboard stalls=4117
|
||||
PERF: core13: alu unit stalls=732
|
||||
PERF: core13: lsu unit stalls=431
|
||||
PERF: core13: csr unit stalls=0
|
||||
PERF: core13: fpu unit stalls=3
|
||||
PERF: core13: gpu unit stalls=0
|
||||
PERF: core13: icache reads=6155
|
||||
PERF: core13: icache read misses=73 (hit ratio=98%)
|
||||
PERF: core13: icache pipeline stalls=2446
|
||||
PERF: core13: icache reponse stalls=2224
|
||||
PERF: core13: dcache reads=2862
|
||||
PERF: core13: dcache writes=101
|
||||
PERF: core13: dcache read misses=634 (hit ratio=77%)
|
||||
PERF: core13: dcache write misses=97 (hit ratio=3%)
|
||||
PERF: core13: dcache bank stalls=2193 (utilization=57%)
|
||||
PERF: core13: dcache mshr stalls=2425
|
||||
PERF: core13: dcache pipeline stalls=4623
|
||||
PERF: core13: dcache reponse stalls=15
|
||||
PERF: core13: smem reads=538
|
||||
PERF: core13: smem writes=447
|
||||
PERF: core13: smem bank stalls=0 (utilization=100%)
|
||||
PERF: core13: dram requests=226 (reads=125, writes=101)
|
||||
PERF: core13: dram stalls=1260 (utilization=15%)
|
||||
PERF: core13: dram average latency=31 cycles
|
||||
PERF: core14: instrs=23498, cycles=16165, IPC=1.453634
|
||||
PERF: core14: ibuffer stalls=2233
|
||||
PERF: core14: scoreboard stalls=4091
|
||||
PERF: core14: alu unit stalls=742
|
||||
PERF: core14: lsu unit stalls=428
|
||||
PERF: core14: csr unit stalls=0
|
||||
PERF: core14: fpu unit stalls=2
|
||||
PERF: core14: gpu unit stalls=0
|
||||
PERF: core14: icache reads=6155
|
||||
PERF: core14: icache read misses=73 (hit ratio=98%)
|
||||
PERF: core14: icache pipeline stalls=2452
|
||||
PERF: core14: icache reponse stalls=2233
|
||||
PERF: core14: dcache reads=2862
|
||||
PERF: core14: dcache writes=101
|
||||
PERF: core14: dcache read misses=634 (hit ratio=77%)
|
||||
PERF: core14: dcache write misses=97 (hit ratio=3%)
|
||||
PERF: core14: dcache bank stalls=2193 (utilization=57%)
|
||||
PERF: core14: dcache mshr stalls=2426
|
||||
PERF: core14: dcache pipeline stalls=3984
|
||||
PERF: core14: dcache reponse stalls=15
|
||||
PERF: core14: smem reads=538
|
||||
PERF: core14: smem writes=447
|
||||
PERF: core14: smem bank stalls=0 (utilization=100%)
|
||||
PERF: core14: dram requests=226 (reads=125, writes=101)
|
||||
PERF: core14: dram stalls=952 (utilization=19%)
|
||||
PERF: core14: dram average latency=30 cycles
|
||||
PERF: core15: instrs=23500, cycles=16251, IPC=1.446065
|
||||
PERF: core15: ibuffer stalls=2268
|
||||
PERF: core15: scoreboard stalls=4241
|
||||
PERF: core15: alu unit stalls=745
|
||||
PERF: core15: lsu unit stalls=374
|
||||
PERF: core15: csr unit stalls=0
|
||||
PERF: core15: fpu unit stalls=1
|
||||
PERF: core15: gpu unit stalls=0
|
||||
PERF: core15: icache reads=6157
|
||||
PERF: core15: icache read misses=73 (hit ratio=98%)
|
||||
PERF: core15: icache pipeline stalls=2455
|
||||
PERF: core15: icache reponse stalls=2268
|
||||
PERF: core15: dcache reads=2862
|
||||
PERF: core15: dcache writes=101
|
||||
PERF: core15: dcache read misses=634 (hit ratio=77%)
|
||||
PERF: core15: dcache write misses=97 (hit ratio=3%)
|
||||
PERF: core15: dcache bank stalls=2195 (utilization=57%)
|
||||
PERF: core15: dcache mshr stalls=2567
|
||||
PERF: core15: dcache pipeline stalls=5084
|
||||
PERF: core15: dcache reponse stalls=16
|
||||
PERF: core15: smem reads=538
|
||||
PERF: core15: smem writes=447
|
||||
PERF: core15: smem bank stalls=0 (utilization=100%)
|
||||
PERF: core15: dram requests=226 (reads=125, writes=101)
|
||||
PERF: core15: dram stalls=1220 (utilization=15%)
|
||||
PERF: core15: dram average latency=31 cycles
|
||||
PERF: instrs=375970, cycles=16264, IPC=23.116699
|
||||
PERF: ibuffer stalls=35805
|
||||
PERF: scoreboard stalls=66248
|
||||
PERF: alu unit stalls=11783
|
||||
PERF: lsu unit stalls=6441
|
||||
PERF: csr unit stalls=0
|
||||
PERF: fpu unit stalls=29
|
||||
PERF: gpu unit stalls=0
|
||||
PERF: icache reads=98482
|
||||
PERF: icache read misses=1168 (hit ratio=98%)
|
||||
PERF: icache pipeline stalls=39194
|
||||
PERF: icache reponse stalls=35805
|
||||
PERF: dcache reads=45792
|
||||
PERF: dcache writes=1616
|
||||
PERF: dcache read misses=10164 (hit ratio=77%)
|
||||
PERF: dcache write misses=1552 (hit ratio=3%)
|
||||
PERF: dcache bank stalls=35059 (utilization=57%)
|
||||
PERF: dcache mshr stalls=39593
|
||||
PERF: dcache pipeline stalls=73067
|
||||
PERF: dcache reponse stalls=252
|
||||
PERF: smem reads=8608
|
||||
PERF: smem writes=7152
|
||||
PERF: smem bank stalls=0 (utilization=100%)
|
||||
PERF: dram requests=3624 (reads=2008, writes=1616)
|
||||
PERF: dram stalls=18122 (utilization=16%)
|
||||
PERF: dram average latency=31 cycles
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/sgemm'
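The percentages in the PERF lines above can be cross-checked from the raw counters. The sketch below is inferred only from the numbers in this log, not from the driver source: the hit ratios look like integer-truncated `(accesses - misses) / accesses`, and the utilization figures look like `requests / (requests + stalls)`, also truncated. The function names `hit_ratio_percent` and `utilization_percent` are hypothetical.

```python
def hit_ratio_percent(accesses, misses):
    """Truncated hit ratio in percent (assumed formula, inferred from the log)."""
    if accesses == 0:
        return 0
    return (accesses - misses) * 100 // accesses


def utilization_percent(requests, stalls):
    """Truncated utilization in percent (assumed formula, inferred from the log)."""
    total = requests + stalls
    if total == 0:
        return 100
    return requests * 100 // total


# Aggregate sgemm counters copied from the log above.
dcache_reads, dcache_read_misses = 45792, 10164
dcache_writes, dcache_write_misses = 1616, 1552
dcache_bank_stalls = 35059
dram_requests, dram_stalls = 3624, 18122

print("dcache read hit ratio :", hit_ratio_percent(dcache_reads, dcache_read_misses))
print("dcache write hit ratio:", hit_ratio_percent(dcache_writes, dcache_write_misses))
print("dcache bank util      :", utilization_percent(dcache_reads + dcache_writes, dcache_bank_stalls))
print("dram util             :", utilization_percent(dram_requests, dram_stalls))
```

Truncation rather than rounding matters here: the aggregate read hit ratio is about 77.8%, which the log reports as 77%.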
|
3
evaluation/perf_2021_03_07/16c/user_clock_freq.txt
Normal file
|
@ -0,0 +1,3 @@
|
|||
# Generated by Platform Interface Manager user_clock_config.tcl
|
||||
afu-image/clock-frequency-low:83.5
|
||||
afu-image/clock-frequency-high:167
|
459
evaluation/perf_2021_03_07/16c/vecadd.result
Normal file
|
@ -0,0 +1,459 @@
|
|||
CONFIGS=-DNUM_CLUSTERS=1 -DNUM_CORES=2 -DNUM_WARPS=4 -DNUM_THREADS=4 -DL2_ENABLE=0 -DL3_ENABLE=0 -DPERF_ENABLE
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
rm -rf libvortex.so *.o .depend
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/vecadd'
|
||||
LD_LIBRARY_PATH=/opt/pocl/runtime/lib:/nethome/lcooper43/vortex-dev-old/driver/opae:/opt/opae/1.1.2/lib:/opt/inteldevstack/a10_gx_pac_ias_1_2_1_pv/opencl/opencl_bsp/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/host/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/linux64/lib: ./vecadd -n64
|
||||
[VXDRV] DEVCAPS: version=0, num_cores=16, num_warps=4, num_threads=4
|
||||
Create context
|
||||
Allocate device buffers
|
||||
Create program from kernel source
|
||||
Upload source buffers
|
||||
Execute the kernel
|
||||
Elapsed time: 4 ms
|
||||
Download destination buffer
|
||||
Verify result
|
||||
PASSED!
|
||||
PERF: core0: instrs=2019, cycles=5194, IPC=0.388718
|
||||
PERF: core0: ibuffer stalls=89
|
||||
PERF: core0: scoreboard stalls=493
|
||||
PERF: core0: alu unit stalls=68
|
||||
PERF: core0: lsu unit stalls=50
|
||||
PERF: core0: csr unit stalls=0
|
||||
PERF: core0: fpu unit stalls=0
|
||||
PERF: core0: gpu unit stalls=0
|
||||
PERF: core0: icache reads=804
|
||||
PERF: core0: icache read misses=65 (hit ratio=91%)
|
||||
PERF: core0: icache pipeline stalls=444
|
||||
PERF: core0: icache reponse stalls=89
|
||||
PERF: core0: dcache reads=114
|
||||
PERF: core0: dcache writes=65
|
||||
PERF: core0: dcache read misses=28 (hit ratio=75%)
|
||||
PERF: core0: dcache write misses=60 (hit ratio=7%)
|
||||
PERF: core0: dcache bank stalls=72 (utilization=71%)
|
||||
PERF: core0: dcache mshr stalls=58
|
||||
PERF: core0: dcache pipeline stalls=596
|
||||
PERF: core0: dcache reponse stalls=1
|
||||
PERF: core0: smem reads=70
|
||||
PERF: core0: smem writes=63
|
||||
PERF: core0: smem bank stalls=0 (utilization=100%)
|
||||
PERF: core0: dram requests=109 (reads=44, writes=65)
|
||||
PERF: core0: dram stalls=780 (utilization=12%)
|
||||
PERF: core0: dram average latency=31 cycles
|
||||
PERF: core1: instrs=2019, cycles=5191, IPC=0.388942
|
||||
PERF: core1: ibuffer stalls=89
|
||||
PERF: core1: scoreboard stalls=494
|
||||
PERF: core1: alu unit stalls=68
|
||||
PERF: core1: lsu unit stalls=48
|
||||
PERF: core1: csr unit stalls=0
|
||||
PERF: core1: fpu unit stalls=0
|
||||
PERF: core1: gpu unit stalls=0
|
||||
PERF: core1: icache reads=804
|
||||
PERF: core1: icache read misses=65 (hit ratio=91%)
|
||||
PERF: core1: icache pipeline stalls=455
|
||||
PERF: core1: icache reponse stalls=89
|
||||
PERF: core1: dcache reads=114
|
||||
PERF: core1: dcache writes=65
|
||||
PERF: core1: dcache read misses=28 (hit ratio=75%)
|
||||
PERF: core1: dcache write misses=60 (hit ratio=7%)
|
||||
PERF: core1: dcache bank stalls=72 (utilization=71%)
|
||||
PERF: core1: dcache mshr stalls=58
|
||||
PERF: core1: dcache pipeline stalls=596
|
||||
PERF: core1: dcache reponse stalls=1
|
||||
PERF: core1: smem reads=70
|
||||
PERF: core1: smem writes=63
|
||||
PERF: core1: smem bank stalls=0 (utilization=100%)
|
||||
PERF: core1: dram requests=109 (reads=44, writes=65)
|
||||
PERF: core1: dram stalls=774 (utilization=12%)
|
||||
PERF: core1: dram average latency=31 cycles
|
||||
PERF: core2: instrs=2019, cycles=5110, IPC=0.395108
|
||||
PERF: core2: ibuffer stalls=89
|
||||
PERF: core2: scoreboard stalls=485
|
||||
PERF: core2: alu unit stalls=68
|
||||
PERF: core2: lsu unit stalls=53
|
||||
PERF: core2: csr unit stalls=0
|
||||
PERF: core2: fpu unit stalls=0
|
||||
PERF: core2: gpu unit stalls=0
|
||||
PERF: core2: icache reads=804
|
||||
PERF: core2: icache read misses=65 (hit ratio=91%)
|
||||
PERF: core2: icache pipeline stalls=401
|
||||
PERF: core2: icache reponse stalls=89
|
||||
PERF: core2: dcache reads=114
|
||||
PERF: core2: dcache writes=65
|
||||
PERF: core2: dcache read misses=28 (hit ratio=75%)
|
||||
PERF: core2: dcache write misses=60 (hit ratio=7%)
|
||||
PERF: core2: dcache bank stalls=72 (utilization=71%)
|
||||
PERF: core2: dcache mshr stalls=60
|
||||
PERF: core2: dcache pipeline stalls=541
|
||||
PERF: core2: dcache reponse stalls=1
|
||||
PERF: core2: smem reads=70
|
||||
PERF: core2: smem writes=63
|
||||
PERF: core2: smem bank stalls=0 (utilization=100%)
|
||||
PERF: core2: dram requests=109 (reads=44, writes=65)
|
||||
PERF: core2: dram stalls=731 (utilization=12%)
|
||||
PERF: core2: dram average latency=30 cycles
|
||||
PERF: core3: instrs=2019, cycles=5101, IPC=0.395805
|
||||
PERF: core3: ibuffer stalls=89
|
||||
PERF: core3: scoreboard stalls=486
|
||||
PERF: core3: alu unit stalls=68
|
||||
PERF: core3: lsu unit stalls=52
|
||||
PERF: core3: csr unit stalls=0
|
||||
PERF: core3: fpu unit stalls=0
|
||||
PERF: core3: gpu unit stalls=0
|
||||
PERF: core3: icache reads=804
|
||||
PERF: core3: icache read misses=65 (hit ratio=91%)
|
||||
PERF: core3: icache pipeline stalls=401
|
||||
PERF: core3: icache reponse stalls=89
|
||||
PERF: core3: dcache reads=114
|
||||
PERF: core3: dcache writes=65
|
||||
PERF: core3: dcache read misses=28 (hit ratio=75%)
|
||||
PERF: core3: dcache write misses=60 (hit ratio=7%)
|
||||
PERF: core3: dcache bank stalls=72 (utilization=71%)
|
||||
PERF: core3: dcache mshr stalls=58
|
||||
PERF: core3: dcache pipeline stalls=532
|
||||
PERF: core3: dcache reponse stalls=1
|
||||
PERF: core3: smem reads=70
|
||||
PERF: core3: smem writes=63
|
||||
PERF: core3: smem bank stalls=0 (utilization=100%)
|
||||
PERF: core3: dram requests=109 (reads=44, writes=65)
|
||||
PERF: core3: dram stalls=731 (utilization=12%)
|
||||
PERF: core3: dram average latency=29 cycles
|
||||
PERF: core4: instrs=495, cycles=3605, IPC=0.137309
|
||||
PERF: core4: ibuffer stalls=0
|
||||
PERF: core4: scoreboard stalls=267
|
||||
PERF: core4: alu unit stalls=0
|
||||
PERF: core4: lsu unit stalls=0
|
||||
PERF: core4: csr unit stalls=0
|
||||
PERF: core4: fpu unit stalls=0
|
||||
PERF: core4: gpu unit stalls=0
|
||||
PERF: core4: icache reads=348
|
||||
PERF: core4: icache read misses=31 (hit ratio=91%)
|
||||
PERF: core4: icache pipeline stalls=63
|
||||
PERF: core4: icache reponse stalls=0
|
||||
PERF: core4: dcache reads=18
|
||||
PERF: core4: dcache writes=48
|
||||
PERF: core4: dcache read misses=8 (hit ratio=55%)
|
||||
PERF: core4: dcache write misses=44 (hit ratio=8%)
|
||||
PERF: core4: dcache bank stalls=0 (utilization=100%)
|
||||
PERF: core4: dcache mshr stalls=0
|
||||
PERF: core4: dcache pipeline stalls=525
|
||||
PERF: core4: dcache reponse stalls=0
|
||||
PERF: core4: smem reads=23
|
||||
PERF: core4: smem writes=25
|
||||
PERF: core4: smem bank stalls=0 (utilization=100%)
|
||||
PERF: core4: dram requests=79 (reads=31, writes=48)
|
||||
PERF: core4: dram stalls=765 (utilization=9%)
|
||||
PERF: core4: dram average latency=31 cycles
|
||||
PERF: core5: instrs=495, cycles=3603, IPC=0.137386
|
||||
PERF: core5: ibuffer stalls=0
|
||||
PERF: core5: scoreboard stalls=269
|
||||
PERF: core5: alu unit stalls=0
|
||||
PERF: core5: lsu unit stalls=0
|
||||
PERF: core5: csr unit stalls=0
|
||||
PERF: core5: fpu unit stalls=0
|
||||
PERF: core5: gpu unit stalls=0
|
||||
PERF: core5: icache reads=348
|
||||
PERF: core5: icache read misses=31 (hit ratio=91%)
|
||||
PERF: core5: icache pipeline stalls=63
|
||||
PERF: core5: icache reponse stalls=0
|
||||
PERF: core5: dcache reads=18
|
||||
PERF: core5: dcache writes=48
|
||||
PERF: core5: dcache read misses=8 (hit ratio=55%)
|
||||
PERF: core5: dcache write misses=44 (hit ratio=8%)
|
||||
PERF: core5: dcache bank stalls=0 (utilization=100%)
|
||||
PERF: core5: dcache mshr stalls=0
|
||||
PERF: core5: dcache pipeline stalls=514
|
||||
PERF: core5: dcache reponse stalls=0
|
||||
PERF: core5: smem reads=23
|
||||
PERF: core5: smem writes=25
|
||||
PERF: core5: smem bank stalls=0 (utilization=100%)
|
||||
PERF: core5: dram requests=79 (reads=31, writes=48)
|
||||
PERF: core5: dram stalls=758 (utilization=9%)
|
||||
PERF: core5: dram average latency=31 cycles
|
||||
PERF: core6: instrs=495, cycles=3587, IPC=0.137998
|
||||
PERF: core6: ibuffer stalls=0
|
||||
PERF: core6: scoreboard stalls=260
|
||||
PERF: core6: alu unit stalls=0
|
||||
PERF: core6: lsu unit stalls=0
|
||||
PERF: core6: csr unit stalls=0
|
||||
PERF: core6: fpu unit stalls=0
|
||||
PERF: core6: gpu unit stalls=0
|
||||
PERF: core6: icache reads=348
|
||||
PERF: core6: icache read misses=31 (hit ratio=91%)
|
||||
PERF: core6: icache pipeline stalls=63
|
||||
PERF: core6: icache reponse stalls=0
|
||||
PERF: core6: dcache reads=18
|
||||
PERF: core6: dcache writes=48
|
||||
PERF: core6: dcache read misses=8 (hit ratio=55%)
|
||||
PERF: core6: dcache write misses=44 (hit ratio=8%)
|
||||
PERF: core6: dcache bank stalls=0 (utilization=100%)
|
||||
PERF: core6: dcache mshr stalls=0
|
||||
PERF: core6: dcache pipeline stalls=472
|
||||
PERF: core6: dcache reponse stalls=0
|
||||
PERF: core6: smem reads=23
|
||||
PERF: core6: smem writes=25
|
||||
PERF: core6: smem bank stalls=0 (utilization=100%)
|
||||
PERF: core6: dram requests=79 (reads=31, writes=48)
|
||||
PERF: core6: dram stalls=727 (utilization=9%)
|
||||
PERF: core6: dram average latency=31 cycles
|
||||
PERF: core7: instrs=495, cycles=3573, IPC=0.138539
|
||||
PERF: core7: ibuffer stalls=0
|
||||
PERF: core7: scoreboard stalls=260
|
||||
PERF: core7: alu unit stalls=0
|
||||
PERF: core7: lsu unit stalls=0
|
||||
PERF: core7: csr unit stalls=0
|
||||
PERF: core7: fpu unit stalls=0
|
||||
PERF: core7: gpu unit stalls=0
|
||||
PERF: core7: icache reads=348
|
||||
PERF: core7: icache read misses=31 (hit ratio=91%)
|
||||
PERF: core7: icache pipeline stalls=63
|
||||
PERF: core7: icache reponse stalls=0
|
||||
PERF: core7: dcache reads=18
|
||||
PERF: core7: dcache writes=48
|
||||
PERF: core7: dcache read misses=8 (hit ratio=55%)
|
||||
PERF: core7: dcache write misses=44 (hit ratio=8%)
|
||||
PERF: core7: dcache bank stalls=0 (utilization=100%)
|
||||
PERF: core7: dcache mshr stalls=0
|
||||
PERF: core7: dcache pipeline stalls=474
|
||||
PERF: core7: dcache reponse stalls=0
|
||||
PERF: core7: smem reads=23
|
||||
PERF: core7: smem writes=25
|
||||
PERF: core7: smem bank stalls=0 (utilization=100%)
|
||||
PERF: core7: dram requests=79 (reads=31, writes=48)
|
||||
PERF: core7: dram stalls=728 (utilization=9%)
|
||||
PERF: core7: dram average latency=31 cycles
|
||||
PERF: core8: instrs=495, cycles=3604, IPC=0.137347
|
||||
PERF: core8: ibuffer stalls=0
|
||||
PERF: core8: scoreboard stalls=268
|
||||
PERF: core8: alu unit stalls=0
|
||||
PERF: core8: lsu unit stalls=0
|
||||
PERF: core8: csr unit stalls=0
|
||||
PERF: core8: fpu unit stalls=0
|
||||
PERF: core8: gpu unit stalls=0
|
||||
PERF: core8: icache reads=348
|
||||
PERF: core8: icache read misses=31 (hit ratio=91%)
|
||||
PERF: core8: icache pipeline stalls=63
|
||||
PERF: core8: icache reponse stalls=0
|
||||
PERF: core8: dcache reads=18
|
||||
PERF: core8: dcache writes=48
|
||||
PERF: core8: dcache read misses=8 (hit ratio=55%)
|
||||
PERF: core8: dcache write misses=44 (hit ratio=8%)
|
||||
PERF: core8: dcache bank stalls=0 (utilization=100%)
|
||||
PERF: core8: dcache mshr stalls=0
|
||||
PERF: core8: dcache pipeline stalls=525
|
||||
PERF: core8: dcache reponse stalls=0
|
||||
PERF: core8: smem reads=23
|
||||
PERF: core8: smem writes=25
|
||||
PERF: core8: smem bank stalls=0 (utilization=100%)
|
||||
PERF: core8: dram requests=79 (reads=31, writes=48)
|
||||
PERF: core8: dram stalls=764 (utilization=9%)
|
||||
PERF: core8: dram average latency=31 cycles
|
||||
PERF: core9: instrs=495, cycles=3600, IPC=0.137500
|
||||
PERF: core9: ibuffer stalls=0
|
||||
PERF: core9: scoreboard stalls=268
|
||||
PERF: core9: alu unit stalls=0
|
||||
PERF: core9: lsu unit stalls=0
|
||||
PERF: core9: csr unit stalls=0
|
||||
PERF: core9: fpu unit stalls=0
|
||||
PERF: core9: gpu unit stalls=0
|
||||
PERF: core9: icache reads=348
|
||||
PERF: core9: icache read misses=31 (hit ratio=91%)
|
||||
PERF: core9: icache pipeline stalls=63
|
||||
PERF: core9: icache reponse stalls=0
|
||||
PERF: core9: dcache reads=18
|
||||
PERF: core9: dcache writes=48
|
||||
PERF: core9: dcache read misses=8 (hit ratio=55%)
|
||||
PERF: core9: dcache write misses=44 (hit ratio=8%)
|
||||
PERF: core9: dcache bank stalls=0 (utilization=100%)
|
||||
PERF: core9: dcache mshr stalls=0
|
||||
PERF: core9: dcache pipeline stalls=514
|
||||
PERF: core9: dcache reponse stalls=0
|
||||
PERF: core9: smem reads=23
|
||||
PERF: core9: smem writes=25
|
||||
PERF: core9: smem bank stalls=0 (utilization=100%)
|
||||
PERF: core9: dram requests=79 (reads=31, writes=48)
|
||||
PERF: core9: dram stalls=756 (utilization=9%)
|
||||
PERF: core9: dram average latency=31 cycles
|
||||
PERF: core10: instrs=495, cycles=3585, IPC=0.138075
|
||||
PERF: core10: ibuffer stalls=0
|
||||
PERF: core10: scoreboard stalls=261
|
||||
PERF: core10: alu unit stalls=0
|
||||
PERF: core10: lsu unit stalls=0
|
||||
PERF: core10: csr unit stalls=0
|
||||
PERF: core10: fpu unit stalls=0
|
||||
PERF: core10: gpu unit stalls=0
|
||||
PERF: core10: icache reads=348
|
||||
PERF: core10: icache read misses=31 (hit ratio=91%)
|
||||
PERF: core10: icache pipeline stalls=63
|
||||
PERF: core10: icache reponse stalls=0
|
||||
PERF: core10: dcache reads=18
|
||||
PERF: core10: dcache writes=48
|
||||
PERF: core10: dcache read misses=8 (hit ratio=55%)
|
||||
PERF: core10: dcache write misses=44 (hit ratio=8%)
|
||||
PERF: core10: dcache bank stalls=0 (utilization=100%)
|
||||
PERF: core10: dcache mshr stalls=0
|
||||
PERF: core10: dcache pipeline stalls=472
|
||||
PERF: core10: dcache reponse stalls=0
|
||||
PERF: core10: smem reads=23
|
||||
PERF: core10: smem writes=25
|
||||
PERF: core10: smem bank stalls=0 (utilization=100%)
|
||||
PERF: core10: dram requests=79 (reads=31, writes=48)
|
||||
PERF: core10: dram stalls=728 (utilization=9%)
|
||||
PERF: core10: dram average latency=31 cycles
|
||||
PERF: core11: instrs=495, cycles=3572, IPC=0.138578
|
||||
PERF: core11: ibuffer stalls=0
|
||||
PERF: core11: scoreboard stalls=259
|
||||
PERF: core11: alu unit stalls=0
|
||||
PERF: core11: lsu unit stalls=0
|
||||
PERF: core11: csr unit stalls=0
|
||||
PERF: core11: fpu unit stalls=0
|
||||
PERF: core11: gpu unit stalls=0
|
||||
PERF: core11: icache reads=348
|
||||
PERF: core11: icache read misses=31 (hit ratio=91%)
|
||||
PERF: core11: icache pipeline stalls=63
|
||||
PERF: core11: icache reponse stalls=0
|
||||
PERF: core11: dcache reads=18
|
||||
PERF: core11: dcache writes=48
|
||||
PERF: core11: dcache read misses=8 (hit ratio=55%)
|
||||
PERF: core11: dcache write misses=44 (hit ratio=8%)
|
||||
PERF: core11: dcache bank stalls=0 (utilization=100%)
|
||||
PERF: core11: dcache mshr stalls=0
|
||||
PERF: core11: dcache pipeline stalls=474
|
||||
PERF: core11: dcache reponse stalls=0
|
||||
PERF: core11: smem reads=23
|
||||
PERF: core11: smem writes=25
|
||||
PERF: core11: smem bank stalls=0 (utilization=100%)
|
||||
PERF: core11: dram requests=79 (reads=31, writes=48)
|
||||
PERF: core11: dram stalls=728 (utilization=9%)
|
||||
PERF: core11: dram average latency=31 cycles
|
||||
PERF: core12: instrs=495, cycles=3599, IPC=0.137538
|
||||
PERF: core12: ibuffer stalls=0
|
||||
PERF: core12: scoreboard stalls=261
|
||||
PERF: core12: alu unit stalls=0
|
||||
PERF: core12: lsu unit stalls=0
|
||||
PERF: core12: csr unit stalls=0
|
||||
PERF: core12: fpu unit stalls=0
|
||||
PERF: core12: gpu unit stalls=0
|
||||
PERF: core12: icache reads=348
|
||||
PERF: core12: icache read misses=31 (hit ratio=91%)
|
||||
PERF: core12: icache pipeline stalls=63
|
||||
PERF: core12: icache reponse stalls=0
|
||||
PERF: core12: dcache reads=18
|
||||
PERF: core12: dcache writes=48
|
||||
PERF: core12: dcache read misses=8 (hit ratio=55%)
|
||||
PERF: core12: dcache write misses=44 (hit ratio=8%)
|
||||
PERF: core12: dcache bank stalls=0 (utilization=100%)
|
||||
PERF: core12: dcache mshr stalls=0
|
||||
PERF: core12: dcache pipeline stalls=533
|
||||
PERF: core12: dcache reponse stalls=0
|
||||
PERF: core12: smem reads=23
|
||||
PERF: core12: smem writes=25
|
||||
PERF: core12: smem bank stalls=0 (utilization=100%)
|
||||
PERF: core12: dram requests=79 (reads=31, writes=48)
|
||||
PERF: core12: dram stalls=762 (utilization=9%)
|
||||
PERF: core12: dram average latency=31 cycles
|
||||
PERF: core13: instrs=495, cycles=3589, IPC=0.137921
|
||||
PERF: core13: ibuffer stalls=0
|
||||
PERF: core13: scoreboard stalls=257
|
||||
PERF: core13: alu unit stalls=0
|
||||
PERF: core13: lsu unit stalls=0
|
||||
PERF: core13: csr unit stalls=0
|
||||
PERF: core13: fpu unit stalls=0
|
||||
PERF: core13: gpu unit stalls=0
|
||||
PERF: core13: icache reads=348
|
||||
PERF: core13: icache read misses=31 (hit ratio=91%)
|
||||
PERF: core13: icache pipeline stalls=63
|
||||
PERF: core13: icache reponse stalls=0
|
||||
PERF: core13: dcache reads=18
|
||||
PERF: core13: dcache writes=48
|
||||
PERF: core13: dcache read misses=8 (hit ratio=55%)
|
||||
PERF: core13: dcache write misses=44 (hit ratio=8%)
|
||||
PERF: core13: dcache bank stalls=0 (utilization=100%)
|
||||
PERF: core13: dcache mshr stalls=0
|
||||
PERF: core13: dcache pipeline stalls=478
|
||||
PERF: core13: dcache reponse stalls=0
|
||||
PERF: core13: smem reads=23
|
||||
PERF: core13: smem writes=25
|
||||
PERF: core13: smem bank stalls=0 (utilization=100%)
|
||||
PERF: core13: dram requests=79 (reads=31, writes=48)
|
||||
PERF: core13: dram stalls=736 (utilization=9%)
|
||||
PERF: core13: dram average latency=31 cycles
|
||||
PERF: core14: instrs=495, cycles=3584, IPC=0.138114
|
||||
PERF: core14: ibuffer stalls=0
|
||||
PERF: core14: scoreboard stalls=255
|
||||
PERF: core14: alu unit stalls=0
|
||||
PERF: core14: lsu unit stalls=0
|
||||
PERF: core14: csr unit stalls=0
|
||||
PERF: core14: fpu unit stalls=0
|
||||
PERF: core14: gpu unit stalls=0
|
||||
PERF: core14: icache reads=348
|
||||
PERF: core14: icache read misses=31 (hit ratio=91%)
|
||||
PERF: core14: icache pipeline stalls=63
|
||||
PERF: core14: icache reponse stalls=0
|
||||
PERF: core14: dcache reads=18
|
||||
PERF: core14: dcache writes=48
|
||||
PERF: core14: dcache read misses=8 (hit ratio=55%)
|
||||
PERF: core14: dcache write misses=44 (hit ratio=8%)
|
||||
PERF: core14: dcache bank stalls=0 (utilization=100%)
|
||||
PERF: core14: dcache mshr stalls=0
|
||||
PERF: core14: dcache pipeline stalls=480
|
||||
PERF: core14: dcache reponse stalls=0
|
||||
PERF: core14: smem reads=23
|
||||
PERF: core14: smem writes=25
|
||||
PERF: core14: smem bank stalls=0 (utilization=100%)
|
||||
PERF: core14: dram requests=79 (reads=31, writes=48)
|
||||
PERF: core14: dram stalls=734 (utilization=9%)
|
||||
PERF: core14: dram average latency=31 cycles
|
||||
PERF: core15: instrs=495, cycles=3570, IPC=0.138655
|
||||
PERF: core15: ibuffer stalls=0
|
||||
PERF: core15: scoreboard stalls=241
|
||||
PERF: core15: alu unit stalls=0
|
||||
PERF: core15: lsu unit stalls=0
|
||||
PERF: core15: csr unit stalls=0
|
||||
PERF: core15: fpu unit stalls=0
|
||||
PERF: core15: gpu unit stalls=0
|
||||
PERF: core15: icache reads=348
|
||||
PERF: core15: icache read misses=31 (hit ratio=91%)
|
||||
PERF: core15: icache pipeline stalls=62
|
||||
PERF: core15: icache reponse stalls=0
|
||||
PERF: core15: dcache reads=18
|
||||
PERF: core15: dcache writes=48
|
||||
PERF: core15: dcache read misses=8 (hit ratio=55%)
|
||||
PERF: core15: dcache write misses=44 (hit ratio=8%)
|
||||
PERF: core15: dcache bank stalls=0 (utilization=100%)
|
||||
PERF: core15: dcache mshr stalls=0
|
||||
PERF: core15: dcache pipeline stalls=419
|
||||
PERF: core15: dcache reponse stalls=0
|
||||
PERF: core15: smem reads=23
|
||||
PERF: core15: smem writes=25
|
||||
PERF: core15: smem bank stalls=0 (utilization=100%)
|
||||
PERF: core15: dram requests=79 (reads=31, writes=48)
|
||||
PERF: core15: dram stalls=667 (utilization=10%)
|
||||
PERF: core15: dram average latency=31 cycles
|
||||
PERF: instrs=14016, cycles=5194, IPC=2.698498
|
||||
PERF: ibuffer stalls=356
|
||||
PERF: scoreboard stalls=5084
|
||||
PERF: alu unit stalls=272
|
||||
PERF: lsu unit stalls=203
|
||||
PERF: csr unit stalls=0
|
||||
PERF: fpu unit stalls=0
|
||||
PERF: gpu unit stalls=0
|
||||
PERF: icache reads=7392
|
||||
PERF: icache read misses=632 (hit ratio=91%)
|
||||
PERF: icache pipeline stalls=2456
|
||||
PERF: icache reponse stalls=356
|
||||
PERF: dcache reads=672
|
||||
PERF: dcache writes=836
|
||||
PERF: dcache read misses=208 (hit ratio=69%)
|
||||
PERF: dcache write misses=768 (hit ratio=8%)
|
||||
PERF: dcache bank stalls=288 (utilization=83%)
|
||||
PERF: dcache mshr stalls=234
|
||||
PERF: dcache pipeline stalls=8145
|
||||
PERF: dcache reponse stalls=4
|
||||
PERF: smem reads=556
|
||||
PERF: smem writes=552
|
||||
PERF: smem bank stalls=0 (utilization=100%)
|
||||
PERF: dram requests=1384 (reads=548, writes=836)
|
||||
PERF: dram stalls=11869 (utilization=10%)
|
||||
PERF: dram average latency=31 cycles
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/vecadd'
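The aggregate line of this vecadd run (`instrs=14016, cycles=5194, IPC=2.698498`) is consistent with summing `instrs` over all cores and taking the maximum per-core `cycles`: 4 × 2019 + 12 × 495 = 14016, and core0 reports the largest cycle count, 5194. The sketch below recomputes the aggregate from the per-core lines under that assumption. The rule is inferred from this log, not taken from the driver source, and `aggregate_ipc` is a hypothetical helper name.

```python
import re

# Matches per-core summary lines such as:
#   PERF: core0: instrs=2019, cycles=5194, IPC=0.388718
PERF_CORE = re.compile(r"PERF: core(\d+): instrs=(\d+), cycles=(\d+), IPC=[\d.]+")


def aggregate_ipc(log_text):
    """Return (total instrs, max cycles, IPC) over all per-core lines.

    Assumption: the device-wide figure uses summed instructions and the
    slowest core's cycle count, as the totals in the log suggest.
    """
    instrs_total = 0
    cycles_max = 0
    for match in PERF_CORE.finditer(log_text):
        instrs_total += int(match.group(2))
        cycles_max = max(cycles_max, int(match.group(3)))
    ipc = instrs_total / cycles_max if cycles_max else 0.0
    return instrs_total, cycles_max, ipc


# Two per-core lines copied from the log above, as a small sample.
sample = """\
PERF: core0: instrs=2019, cycles=5194, IPC=0.388718
PERF: core4: instrs=495, cycles=3605, IPC=0.137309
"""
instrs, cycles, ipc = aggregate_ipc(sample)
print(f"instrs={instrs}, cycles={cycles}, IPC={ipc:.6f}")
```

Running the same function over a full `.result` file should reproduce the aggregate `PERF: instrs=..., cycles=..., IPC=...` line if the assumption holds.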
|
17
evaluation/perf_2021_03_07/1c/afu_default.fit.summary
Normal file
|
@ -0,0 +1,17 @@
|
|||
Fitter Status : Successful - Sat Mar 6 19:19:28 2021
|
||||
Quartus Prime Version : 19.2.0 Build 57 06/24/2019 Patches 0.01rc SJ Pro Edition
|
||||
Revision Name : afu_default
|
||||
Top-level Entity Name : dcp_top
|
||||
Family : Arria 10
|
||||
Device : 10AX115N2F40E2LG
|
||||
Timing Models : Final
|
||||
Logic utilization (in ALMs) : 55,747 / 427,200 ( 13 % )
|
||||
Total registers : 79974
|
||||
Total pins : 310 / 826 ( 38 % )
|
||||
Total virtual pins : 0
|
||||
Total block memory bits : 2,272,720 / 55,562,240 ( 4 % )
|
||||
Total RAM Blocks : 320 / 2,713 ( 12 % )
|
||||
Total DSP Blocks : 28 / 1,518 ( 2 % )
|
||||
Total HSSI RX channels : 12 / 48 ( 25 % )
|
||||
Total HSSI TX channels : 12 / 48 ( 25 % )
|
||||
Total PLLs : 25 / 112 ( 22 % )
|
6945
evaluation/perf_2021_03_07/1c/afu_default.sta.summary
Normal file
File diff suppressed because it is too large
4
evaluation/perf_2021_03_07/1c/afu_default.syn.summary
Normal file
|
@ -0,0 +1,4 @@
|
|||
Synthesis Status : Successful - Sat Mar 6 18:56:26 2021
|
||||
Revision Name : afu_default
|
||||
Top-level Entity Name : dcp_top
|
||||
Family : Arria 10
|
33127
evaluation/perf_2021_03_07/1c/build.log
Normal file
File diff suppressed because it is too large
29
evaluation/perf_2021_03_07/1c/guassian.result
Normal file
|
@ -0,0 +1,29 @@
|
|||
CONFIGS=-DNUM_CLUSTERS=1 -DNUM_CORES=2 -DNUM_WARPS=4 -DNUM_THREADS=4 -DL2_ENABLE=0 -DL3_ENABLE=0 -DPERF_ENABLE
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
rm -rf libvortex.so *.o .depend
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/guassian'
|
||||
LD_LIBRARY_PATH=/opt/pocl/runtime/lib:/nethome/lcooper43/vortex-dev-old/driver/opae:/opt/opae/1.1.2/lib:/opt/inteldevstack/a10_gx_pac_ias_1_2_1_pv/opencl/opencl_bsp/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/host/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/linux64/lib: ./guassian
|
||||
enter demo main
|
||||
[VXDRV] DEVCAPS: version=0, num_cores=1, num_warps=4, num_threads=4
|
||||
OK
|
||||
The result of matrix m is:
|
||||
0.00 0.00 0.00 0.00
|
||||
0.50 0.00 0.00 0.00
|
||||
0.67 0.26 0.00 0.00
|
||||
-0.00 0.15 -0.28 0.00
|
||||
|
||||
The result of matrix a is:
|
||||
-0.60 -0.50 0.70 0.30
|
||||
0.00 -0.65 -0.05 0.55
|
||||
0.00 0.00 -0.75 -1.14
|
||||
0.00 0.00 0.00 0.50
|
||||
|
||||
The result of array b is:
|
||||
-0.85 -0.25 0.87 -0.25
|
||||
|
||||
The final solution is:
|
||||
0.70 0.00 -0.40 -0.50
|
||||
|
||||
Passed!
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/guassian'
|
19
evaluation/perf_2021_03_07/1c/nearn.result
Normal file
|
@ -0,0 +1,19 @@
|
|||
CONFIGS=-DNUM_CLUSTERS=1 -DNUM_CORES=2 -DNUM_WARPS=4 -DNUM_THREADS=4 -DL2_ENABLE=0 -DL3_ENABLE=0 -DPERF_ENABLE
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
rm -rf libvortex.so *.o .depend
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/nearn'
|
||||
LD_LIBRARY_PATH=/opt/pocl/runtime/lib:/nethome/lcooper43/vortex-dev-old/driver/opae:/opt/opae/1.1.2/lib:/opt/inteldevstack/a10_gx_pac_ias_1_2_1_pv/opencl/opencl_bsp/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/host/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/linux64/lib: ./nearn
|
||||
loading db: cane4_0.db
|
||||
loading db: cane4_1.db
|
||||
loading db: cane4_2.db
|
||||
Number of records: 1500
|
||||
Finding the 5 closest neighbors.
|
||||
[VXDRV] DEVCAPS: version=0, num_cores=1, num_warps=4, num_threads=4
|
||||
1974 12 22 18 24 JOYCE 30.6 89.9 80 593 --> Distance=0.608276
|
||||
1965 5 13 0 17 TONY 27.8 89.0 122 260 --> Distance=2.416610
|
||||
1991 3 18 12 19 DEBBY 28.5 87.8 107 850 --> Distance=2.662703
|
||||
1957 4 17 6 12 ALBERTO 32.5 87.8 54 510 --> Distance=3.330163
|
||||
1964 8 5 6 9 FLORENCE 31.5 86.3 18 242 --> Distance=3.992490
|
||||
Passed!
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/nearn'
|
19
evaluation/perf_2021_03_07/1c/saxpy.result
Normal file
|
@ -0,0 +1,19 @@
|
|||
CONFIGS=-DNUM_CLUSTERS=1 -DNUM_CORES=2 -DNUM_WARPS=4 -DNUM_THREADS=4 -DL2_ENABLE=0 -DL3_ENABLE=0 -DPERF_ENABLE
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
rm -rf libvortex.so *.o .depend
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/saxpy'
|
||||
LD_LIBRARY_PATH=/opt/pocl/runtime/lib:/nethome/lcooper43/vortex-dev-old/driver/opae:/opt/opae/1.1.2/lib:/opt/inteldevstack/a10_gx_pac_ias_1_2_1_pv/opencl/opencl_bsp/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/host/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/linux64/lib: ./saxpy
|
||||
enter demo main
|
||||
[VXDRV] DEVCAPS: version=0, num_cores=1, num_warps=4, num_threads=4
|
||||
Attempting to create program from binary...
|
||||
Read program from binary.
|
||||
attempting to create input buffer
|
||||
attempting to create output buffer
|
||||
attempting to create kernel
|
||||
setting up kernel args
|
||||
attempting to enqueue write buffer
|
||||
attempting to enqueue kernel
|
||||
Elapsed time: 4 ms
|
||||
Download destination buffer
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/saxpy'
|
19
evaluation/perf_2021_03_07/1c/sfilter.result
Normal file
|
@ -0,0 +1,19 @@
|
|||
CONFIGS=-DNUM_CLUSTERS=1 -DNUM_CORES=2 -DNUM_WARPS=4 -DNUM_THREADS=4 -DL2_ENABLE=0 -DL3_ENABLE=0 -DPERF_ENABLE
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
rm -rf libvortex.so *.o .depend
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/sfilter'
|
||||
LD_LIBRARY_PATH=/opt/pocl/runtime/lib:/nethome/lcooper43/vortex-dev-old/driver/opae:/opt/opae/1.1.2/lib:/opt/inteldevstack/a10_gx_pac_ias_1_2_1_pv/opencl/opencl_bsp/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/host/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/linux64/lib: ./sfilter
|
||||
enter demo main
|
||||
[VXDRV] DEVCAPS: version=0, num_cores=1, num_warps=4, num_threads=4
|
||||
Attempting to create program from binary...
|
||||
Read program from binary.
|
||||
attempting to create input buffer
|
||||
attempting to create output buffer
|
||||
attempting to create kernel
|
||||
setting up kernel args
|
||||
attempting to enqueue write buffer
|
||||
attempting to enqueue kernel
|
||||
Elapsed time: 4 ms
|
||||
Download destination buffer
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/sfilter'
|
42
evaluation/perf_2021_03_07/1c/sgemm.result
Normal file
|
@ -0,0 +1,42 @@
|
|||
CONFIGS=-DNUM_CLUSTERS=1 -DNUM_CORES=2 -DNUM_WARPS=4 -DNUM_THREADS=4 -DL2_ENABLE=0 -DL3_ENABLE=0 -DPERF_ENABLE
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
rm -rf libvortex.so *.o .depend
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/sgemm'
|
||||
LD_LIBRARY_PATH=/opt/pocl/runtime/lib:/nethome/lcooper43/vortex-dev-old/driver/opae:/opt/opae/1.1.2/lib:/opt/inteldevstack/a10_gx_pac_ias_1_2_1_pv/opencl/opencl_bsp/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/host/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/linux64/lib: ./sgemm -n32
|
||||
[VXDRV] DEVCAPS: version=0, num_cores=1, num_warps=4, num_threads=4
|
||||
Create context
|
||||
Create program from kernel source
|
||||
Upload source buffers
|
||||
Execute the kernel
|
||||
Elapsed time: 4 ms
|
||||
Download destination buffer
|
||||
Verify result
|
||||
PASSED!
|
||||
PERF: instrs=360460, cycles=175991, IPC=2.048173
|
||||
PERF: ibuffer stalls=20439
|
||||
PERF: scoreboard stalls=50656
|
||||
PERF: alu unit stalls=7129
|
||||
PERF: lsu unit stalls=16771
|
||||
PERF: csr unit stalls=0
|
||||
PERF: fpu unit stalls=0
|
||||
PERF: gpu unit stalls=0
|
||||
PERF: icache reads=90397
|
||||
PERF: icache read misses=73 (hit ratio=99%)
|
||||
PERF: icache pipeline stalls=12325
|
||||
PERF: icache reponse stalls=20439
|
||||
PERF: dcache reads=45342
|
||||
PERF: dcache writes=1061
|
||||
PERF: dcache read misses=1252 (hit ratio=97%)
|
||||
PERF: dcache write misses=1057 (hit ratio=0%)
|
||||
PERF: dcache bank stalls=50688 (utilization=47%)
|
||||
PERF: dcache mshr stalls=2005
|
||||
PERF: dcache pipeline stalls=2034
|
||||
PERF: dcache reponse stalls=192
|
||||
PERF: smem reads=7978
|
||||
PERF: smem writes=6207
|
||||
PERF: smem bank stalls=0 (utilization=100%)
|
||||
PERF: dram requests=1423 (reads=362, writes=1061)
|
||||
PERF: dram stalls=0 (utilization=100%)
|
||||
PERF: dram average latency=26 cycles
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/sgemm'
|
3
evaluation/perf_2021_03_07/1c/user_clock_freq.txt
Normal file
@ -0,0 +1,3 @
# Generated by Platform Interface Manager user_clock_config.tcl
afu-image/clock-frequency-low:88.5
afu-image/clock-frequency-high:177
43
evaluation/perf_2021_03_07/1c/vecadd.result
Normal file
|
@ -0,0 +1,43 @@
|
|||
CONFIGS=-DNUM_CLUSTERS=1 -DNUM_CORES=2 -DNUM_WARPS=4 -DNUM_THREADS=4 -DL2_ENABLE=0 -DL3_ENABLE=0 -DPERF_ENABLE
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
rm -rf libvortex.so *.o .depend
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/vecadd'
|
||||
LD_LIBRARY_PATH=/opt/pocl/runtime/lib:/nethome/lcooper43/vortex-dev-old/driver/opae:/opt/opae/1.1.2/lib:/opt/inteldevstack/a10_gx_pac_ias_1_2_1_pv/opencl/opencl_bsp/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/host/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/linux64/lib: ./vecadd -n64
|
||||
[VXDRV] DEVCAPS: version=0, num_cores=1, num_warps=4, num_threads=4
|
||||
Create context
|
||||
Allocate device buffers
|
||||
Create program from kernel source
|
||||
Upload source buffers
|
||||
Execute the kernel
|
||||
Elapsed time: 4 ms
|
||||
Download destination buffer
|
||||
Verify result
|
||||
PASSED!
|
||||
PERF: instrs=4908, cycles=6173, IPC=0.795075
|
||||
PERF: ibuffer stalls=247
|
||||
PERF: scoreboard stalls=629
|
||||
PERF: alu unit stalls=130
|
||||
PERF: lsu unit stalls=204
|
||||
PERF: csr unit stalls=0
|
||||
PERF: fpu unit stalls=0
|
||||
PERF: gpu unit stalls=0
|
||||
PERF: icache reads=1528
|
||||
PERF: icache read misses=65 (hit ratio=95%)
|
||||
PERF: icache pipeline stalls=546
|
||||
PERF: icache reponse stalls=247
|
||||
PERF: dcache reads=371
|
||||
PERF: dcache writes=113
|
||||
PERF: dcache read misses=105 (hit ratio=71%)
|
||||
PERF: dcache write misses=108 (hit ratio=4%)
|
||||
PERF: dcache bank stalls=184 (utilization=72%)
|
||||
PERF: dcache mshr stalls=125
|
||||
PERF: dcache pipeline stalls=259
|
||||
PERF: dcache reponse stalls=15
|
||||
PERF: smem reads=154
|
||||
PERF: smem writes=63
|
||||
PERF: smem bank stalls=0 (utilization=100%)
|
||||
PERF: dram requests=175 (reads=62, writes=113)
|
||||
PERF: dram stalls=0 (utilization=100%)
|
||||
PERF: dram average latency=26 cycles
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/vecadd'
|
17
evaluation/perf_2021_03_07/2c/afu_default.fit.summary
Normal file
|
@ -0,0 +1,17 @@
|
|||
Fitter Status : Successful - Sat Mar 6 01:44:47 2021
|
||||
Quartus Prime Version : 19.2.0 Build 57 06/24/2019 Patches 0.01rc SJ Pro Edition
|
||||
Revision Name : afu_default
|
||||
Top-level Entity Name : dcp_top
|
||||
Family : Arria 10
|
||||
Device : 10AX115N2F40E2LG
|
||||
Timing Models : Final
|
||||
Logic utilization (in ALMs) : 74,001 / 427,200 ( 17 % )
|
||||
Total registers : 109164
|
||||
Total pins : 310 / 826 ( 38 % )
|
||||
Total virtual pins : 0
|
||||
Total block memory bits : 2,967,352 / 55,562,240 ( 5 % )
|
||||
Total RAM Blocks : 451 / 2,713 ( 17 % )
|
||||
Total DSP Blocks : 56 / 1,518 ( 4 % )
|
||||
Total HSSI RX channels : 12 / 48 ( 25 % )
|
||||
Total HSSI TX channels : 12 / 48 ( 25 % )
|
||||
Total PLLs : 25 / 112 ( 22 % )
|
6945
evaluation/perf_2021_03_07/2c/afu_default.sta.summary
Normal file
File diff suppressed because it is too large
4
evaluation/perf_2021_03_07/2c/afu_default.syn.summary
Normal file
@ -0,0 +1,4 @
Synthesis Status : Successful - Sat Mar 6 01:12:13 2021
Revision Name : afu_default
Top-level Entity Name : dcp_top
Family : Arria 10
34115
evaluation/perf_2021_03_07/2c/build.log
Normal file
File diff suppressed because it is too large
29
evaluation/perf_2021_03_07/2c/guassian.result
Normal file
|
@ -0,0 +1,29 @@
|
|||
CONFIGS=-DNUM_CLUSTERS=1 -DNUM_CORES=2 -DNUM_WARPS=4 -DNUM_THREADS=4 -DL2_ENABLE=0 -DL3_ENABLE=0 -DPERF_ENABLE
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
rm -rf libvortex.so *.o .depend
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/guassian'
|
||||
LD_LIBRARY_PATH=/opt/pocl/runtime/lib:/nethome/lcooper43/vortex-dev-old/driver/opae:/opt/opae/1.1.2/lib:/opt/inteldevstack/a10_gx_pac_ias_1_2_1_pv/opencl/opencl_bsp/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/host/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/linux64/lib: ./guassian
|
||||
enter demo main
|
||||
[VXDRV] DEVCAPS: version=0, num_cores=2, num_warps=4, num_threads=4
|
||||
OK
|
||||
The result of matrix m is:
|
||||
0.00 0.00 0.00 0.00
|
||||
0.50 0.00 0.00 0.00
|
||||
0.67 0.26 0.00 0.00
|
||||
-0.00 0.15 -0.28 0.00
|
||||
|
||||
The result of matrix a is:
|
||||
-0.60 -0.50 0.70 0.30
|
||||
0.00 -0.65 -0.05 0.55
|
||||
0.00 0.00 -0.75 -1.14
|
||||
0.00 0.00 0.00 0.50
|
||||
|
||||
The result of array b is:
|
||||
-0.85 -0.25 0.87 -0.25
|
||||
|
||||
The final solution is:
|
||||
0.70 0.00 -0.40 -0.50
|
||||
|
||||
Passed!
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/guassian'
|
19
evaluation/perf_2021_03_07/2c/nearn.result
Normal file
|
@ -0,0 +1,19 @@
|
|||
CONFIGS=-DNUM_CLUSTERS=1 -DNUM_CORES=2 -DNUM_WARPS=4 -DNUM_THREADS=4 -DL2_ENABLE=0 -DL3_ENABLE=0 -DPERF_ENABLE
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
rm -rf libvortex.so *.o .depend
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/nearn'
|
||||
LD_LIBRARY_PATH=/opt/pocl/runtime/lib:/nethome/lcooper43/vortex-dev-old/driver/opae:/opt/opae/1.1.2/lib:/opt/inteldevstack/a10_gx_pac_ias_1_2_1_pv/opencl/opencl_bsp/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/host/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/linux64/lib: ./nearn
|
||||
loading db: cane4_0.db
|
||||
loading db: cane4_1.db
|
||||
loading db: cane4_2.db
|
||||
Number of records: 1500
|
||||
Finding the 5 closest neighbors.
|
||||
[VXDRV] DEVCAPS: version=0, num_cores=2, num_warps=4, num_threads=4
|
||||
1974 12 22 18 24 JOYCE 30.6 89.9 80 593 --> Distance=0.608276
|
||||
1965 5 13 0 17 TONY 27.8 89.0 122 260 --> Distance=2.416610
|
||||
1991 3 18 12 19 DEBBY 28.5 87.8 107 850 --> Distance=2.662703
|
||||
1957 4 17 6 12 ALBERTO 32.5 87.8 54 510 --> Distance=3.330163
|
||||
1964 8 5 6 9 FLORENCE 31.5 86.3 18 242 --> Distance=3.992490
|
||||
Passed!
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/nearn'
|
19
evaluation/perf_2021_03_07/2c/saxpy.result
Normal file
|
@ -0,0 +1,19 @@
|
|||
CONFIGS=-DNUM_CLUSTERS=1 -DNUM_CORES=2 -DNUM_WARPS=4 -DNUM_THREADS=4 -DL2_ENABLE=0 -DL3_ENABLE=0 -DPERF_ENABLE
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
rm -rf libvortex.so *.o .depend
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/saxpy'
|
||||
LD_LIBRARY_PATH=/opt/pocl/runtime/lib:/nethome/lcooper43/vortex-dev-old/driver/opae:/opt/opae/1.1.2/lib:/opt/inteldevstack/a10_gx_pac_ias_1_2_1_pv/opencl/opencl_bsp/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/host/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/linux64/lib: ./saxpy
|
||||
enter demo main
|
||||
[VXDRV] DEVCAPS: version=0, num_cores=2, num_warps=4, num_threads=4
|
||||
Attempting to create program from binary...
|
||||
Read program from binary.
|
||||
attempting to create input buffer
|
||||
attempting to create output buffer
|
||||
attempting to create kernel
|
||||
setting up kernel args
|
||||
attempting to enqueue write buffer
|
||||
attempting to enqueue kernel
|
||||
Elapsed time: 4 ms
|
||||
Download destination buffer
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/saxpy'
|
19
evaluation/perf_2021_03_07/2c/sfilter.result
Normal file
|
@ -0,0 +1,19 @@
|
|||
CONFIGS=-DNUM_CLUSTERS=1 -DNUM_CORES=2 -DNUM_WARPS=4 -DNUM_THREADS=4 -DL2_ENABLE=0 -DL3_ENABLE=0 -DPERF_ENABLE
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
rm -rf libvortex.so *.o .depend
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/sfilter'
|
||||
LD_LIBRARY_PATH=/opt/pocl/runtime/lib:/nethome/lcooper43/vortex-dev-old/driver/opae:/opt/opae/1.1.2/lib:/opt/inteldevstack/a10_gx_pac_ias_1_2_1_pv/opencl/opencl_bsp/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/host/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/linux64/lib: ./sfilter
|
||||
enter demo main
|
||||
[VXDRV] DEVCAPS: version=0, num_cores=2, num_warps=4, num_threads=4
|
||||
Attempting to create program from binary...
|
||||
Read program from binary.
|
||||
attempting to create input buffer
|
||||
attempting to create output buffer
|
||||
attempting to create kernel
|
||||
setting up kernel args
|
||||
attempting to enqueue write buffer
|
||||
attempting to enqueue kernel
|
||||
Elapsed time: 4 ms
|
||||
Download destination buffer
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/sfilter'
|
94
evaluation/perf_2021_03_07/2c/sgemm.result
Normal file
|
@ -0,0 +1,94 @@
|
|||
CONFIGS=-DNUM_CLUSTERS=1 -DNUM_CORES=2 -DNUM_WARPS=4 -DNUM_THREADS=4 -DL2_ENABLE=0 -DL3_ENABLE=0 -DPERF_ENABLE
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
rm -rf libvortex.so *.o .depend
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/sgemm'
|
||||
LD_LIBRARY_PATH=/opt/pocl/runtime/lib:/nethome/lcooper43/vortex-dev-old/driver/opae:/opt/opae/1.1.2/lib:/opt/inteldevstack/a10_gx_pac_ias_1_2_1_pv/opencl/opencl_bsp/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/host/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/linux64/lib: ./sgemm -n32
|
||||
[VXDRV] DEVCAPS: version=0, num_cores=2, num_warps=4, num_threads=4
|
||||
Create context
|
||||
Create program from kernel source
|
||||
Upload source buffers
|
||||
Execute the kernel
|
||||
Elapsed time: 4 ms
|
||||
Download destination buffer
|
||||
Verify result
|
||||
PASSED!
|
||||
PERF: core0: instrs=180750, cycles=84306, IPC=2.143975
|
||||
PERF: core0: ibuffer stalls=0
|
||||
PERF: core0: scoreboard stalls=0
|
||||
PERF: core0: alu unit stalls=0
|
||||
PERF: core0: lsu unit stalls=0
|
||||
PERF: core0: csr unit stalls=0
|
||||
PERF: core0: fpu unit stalls=0
|
||||
PERF: core0: gpu unit stalls=0
|
||||
PERF: core0: icache reads=0
|
||||
PERF: core0: icache read misses=0 (hit ratio=-2147483648%)
|
||||
PERF: core0: icache pipeline stalls=0
|
||||
PERF: core0: icache reponse stalls=0
|
||||
PERF: core0: dcache reads=0
|
||||
PERF: core0: dcache writes=0
|
||||
PERF: core0: dcache read misses=0 (hit ratio=-2147483648%)
|
||||
PERF: core0: dcache write misses=0 (hit ratio=-2147483648%)
|
||||
PERF: core0: dcache bank stalls=0 (utilization=-2147483648%)
|
||||
PERF: core0: dcache mshr stalls=0
|
||||
PERF: core0: dcache pipeline stalls=0
|
||||
PERF: core0: dcache reponse stalls=0
|
||||
PERF: core0: smem reads=0
|
||||
PERF: core0: smem writes=0
|
||||
PERF: core0: smem bank stalls=0 (utilization=-2147483648%)
|
||||
PERF: core0: dram requests=0 (reads=0, writes=0)
|
||||
PERF: core0: dram stalls=0 (utilization=-2147483648%)
|
||||
PERF: core0: dram average latency=-2147483648 cycles
|
||||
PERF: core1: instrs=180752, cycles=84131, IPC=2.148459
|
||||
PERF: core1: ibuffer stalls=0
|
||||
PERF: core1: scoreboard stalls=0
|
||||
PERF: core1: alu unit stalls=0
|
||||
PERF: core1: lsu unit stalls=0
|
||||
PERF: core1: csr unit stalls=0
|
||||
PERF: core1: fpu unit stalls=0
|
||||
PERF: core1: gpu unit stalls=0
|
||||
PERF: core1: icache reads=0
|
||||
PERF: core1: icache read misses=0 (hit ratio=-2147483648%)
|
||||
PERF: core1: icache pipeline stalls=0
|
||||
PERF: core1: icache reponse stalls=0
|
||||
PERF: core1: dcache reads=0
|
||||
PERF: core1: dcache writes=0
|
||||
PERF: core1: dcache read misses=0 (hit ratio=-2147483648%)
|
||||
PERF: core1: dcache write misses=0 (hit ratio=-2147483648%)
|
||||
PERF: core1: dcache bank stalls=0 (utilization=-2147483648%)
|
||||
PERF: core1: dcache mshr stalls=0
|
||||
PERF: core1: dcache pipeline stalls=0
|
||||
PERF: core1: dcache reponse stalls=0
|
||||
PERF: core1: smem reads=0
|
||||
PERF: core1: smem writes=0
|
||||
PERF: core1: smem bank stalls=0 (utilization=-2147483648%)
|
||||
PERF: core1: dram requests=0 (reads=0, writes=0)
|
||||
PERF: core1: dram stalls=0 (utilization=-2147483648%)
|
||||
PERF: core1: dram average latency=-2147483648 cycles
|
||||
PERF: instrs=361502, cycles=84306, IPC=4.287975
|
||||
PERF: ibuffer stalls=0
|
||||
PERF: scoreboard stalls=0
|
||||
PERF: alu unit stalls=0
|
||||
PERF: lsu unit stalls=0
|
||||
PERF: csr unit stalls=0
|
||||
PERF: fpu unit stalls=0
|
||||
PERF: gpu unit stalls=0
|
||||
PERF: icache reads=0
|
||||
PERF: icache read misses=0 (hit ratio=-2147483648%)
|
||||
PERF: icache pipeline stalls=0
|
||||
PERF: icache reponse stalls=0
|
||||
PERF: dcache reads=0
|
||||
PERF: dcache writes=0
|
||||
PERF: dcache read misses=0 (hit ratio=-2147483648%)
|
||||
PERF: dcache write misses=0 (hit ratio=-2147483648%)
|
||||
PERF: dcache bank stalls=0 (utilization=-2147483648%)
|
||||
PERF: dcache mshr stalls=0
|
||||
PERF: dcache pipeline stalls=0
|
||||
PERF: dcache reponse stalls=0
|
||||
PERF: smem reads=0
|
||||
PERF: smem writes=0
|
||||
PERF: smem bank stalls=0 (utilization=-2147483648%)
|
||||
PERF: dram requests=0 (reads=0, writes=0)
|
||||
PERF: dram stalls=0 (utilization=-2147483648%)
|
||||
PERF: dram average latency=-2147483648 cycles
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/sgemm'
|
3
evaluation/perf_2021_03_07/2c/user_clock_freq.txt
Normal file
@ -0,0 +1,3 @
# Generated by Platform Interface Manager user_clock_config.tcl
afu-image/clock-frequency-low:92.0
afu-image/clock-frequency-high:184
95
evaluation/perf_2021_03_07/2c/vecadd.result
Normal file
|
@ -0,0 +1,95 @@
|
|||
CONFIGS=-DNUM_CLUSTERS=1 -DNUM_CORES=2 -DNUM_WARPS=4 -DNUM_THREADS=4 -DL2_ENABLE=0 -DL3_ENABLE=0 -DPERF_ENABLE
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
rm -rf libvortex.so *.o .depend
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/vecadd'
|
||||
LD_LIBRARY_PATH=/opt/pocl/runtime/lib:/nethome/lcooper43/vortex-dev-old/driver/opae:/opt/opae/1.1.2/lib:/opt/inteldevstack/a10_gx_pac_ias_1_2_1_pv/opencl/opencl_bsp/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/host/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/linux64/lib: ./vecadd -n64
|
||||
[VXDRV] DEVCAPS: version=0, num_cores=2, num_warps=4, num_threads=4
|
||||
Create context
|
||||
Allocate device buffers
|
||||
Create program from kernel source
|
||||
Upload source buffers
|
||||
Execute the kernel
|
||||
Elapsed time: 4 ms
|
||||
Download destination buffer
|
||||
Verify result
|
||||
PASSED!
|
||||
PERF: core0: instrs=2981, cycles=5416, IPC=0.550406
|
||||
PERF: core0: ibuffer stalls=0
|
||||
PERF: core0: scoreboard stalls=0
|
||||
PERF: core0: alu unit stalls=0
|
||||
PERF: core0: lsu unit stalls=0
|
||||
PERF: core0: csr unit stalls=0
|
||||
PERF: core0: fpu unit stalls=0
|
||||
PERF: core0: gpu unit stalls=0
|
||||
PERF: core0: icache reads=0
|
||||
PERF: core0: icache read misses=0 (hit ratio=-2147483648%)
|
||||
PERF: core0: icache pipeline stalls=0
|
||||
PERF: core0: icache reponse stalls=0
|
||||
PERF: core0: dcache reads=0
|
||||
PERF: core0: dcache writes=0
|
||||
PERF: core0: dcache read misses=0 (hit ratio=-2147483648%)
|
||||
PERF: core0: dcache write misses=0 (hit ratio=-2147483648%)
|
||||
PERF: core0: dcache bank stalls=0 (utilization=-2147483648%)
|
||||
PERF: core0: dcache mshr stalls=0
|
||||
PERF: core0: dcache pipeline stalls=0
|
||||
PERF: core0: dcache reponse stalls=0
|
||||
PERF: core0: smem reads=0
|
||||
PERF: core0: smem writes=0
|
||||
PERF: core0: smem bank stalls=0 (utilization=-2147483648%)
|
||||
PERF: core0: dram requests=0 (reads=0, writes=0)
|
||||
PERF: core0: dram stalls=0 (utilization=-2147483648%)
|
||||
PERF: core0: dram average latency=-2147483648 cycles
|
||||
PERF: core1: instrs=2983, cycles=5353, IPC=0.557258
|
||||
PERF: core1: ibuffer stalls=0
|
||||
PERF: core1: scoreboard stalls=0
|
||||
PERF: core1: alu unit stalls=0
|
||||
PERF: core1: lsu unit stalls=0
|
||||
PERF: core1: csr unit stalls=0
|
||||
PERF: core1: fpu unit stalls=0
|
||||
PERF: core1: gpu unit stalls=0
|
||||
PERF: core1: icache reads=0
|
||||
PERF: core1: icache read misses=0 (hit ratio=-2147483648%)
|
||||
PERF: core1: icache pipeline stalls=0
|
||||
PERF: core1: icache reponse stalls=0
|
||||
PERF: core1: dcache reads=0
|
||||
PERF: core1: dcache writes=0
|
||||
PERF: core1: dcache read misses=0 (hit ratio=-2147483648%)
|
||||
PERF: core1: dcache write misses=0 (hit ratio=-2147483648%)
|
||||
PERF: core1: dcache bank stalls=0 (utilization=-2147483648%)
|
||||
PERF: core1: dcache mshr stalls=0
|
||||
PERF: core1: dcache pipeline stalls=0
|
||||
PERF: core1: dcache reponse stalls=0
|
||||
PERF: core1: smem reads=0
|
||||
PERF: core1: smem writes=0
|
||||
PERF: core1: smem bank stalls=0 (utilization=-2147483648%)
|
||||
PERF: core1: dram requests=0 (reads=0, writes=0)
|
||||
PERF: core1: dram stalls=0 (utilization=-2147483648%)
|
||||
PERF: core1: dram average latency=-2147483648 cycles
|
||||
PERF: instrs=5964, cycles=5416, IPC=1.101182
|
||||
PERF: ibuffer stalls=0
|
||||
PERF: scoreboard stalls=0
|
||||
PERF: alu unit stalls=0
|
||||
PERF: lsu unit stalls=0
|
||||
PERF: csr unit stalls=0
|
||||
PERF: fpu unit stalls=0
|
||||
PERF: gpu unit stalls=0
|
||||
PERF: icache reads=0
|
||||
PERF: icache read misses=0 (hit ratio=-2147483648%)
|
||||
PERF: icache pipeline stalls=0
|
||||
PERF: icache reponse stalls=0
|
||||
PERF: dcache reads=0
|
||||
PERF: dcache writes=0
|
||||
PERF: dcache read misses=0 (hit ratio=-2147483648%)
|
||||
PERF: dcache write misses=0 (hit ratio=-2147483648%)
|
||||
PERF: dcache bank stalls=0 (utilization=-2147483648%)
|
||||
PERF: dcache mshr stalls=0
|
||||
PERF: dcache pipeline stalls=0
|
||||
PERF: dcache reponse stalls=0
|
||||
PERF: smem reads=0
|
||||
PERF: smem writes=0
|
||||
PERF: smem bank stalls=0 (utilization=-2147483648%)
|
||||
PERF: dram requests=0 (reads=0, writes=0)
|
||||
PERF: dram stalls=0 (utilization=-2147483648%)
|
||||
PERF: dram average latency=-2147483648 cycles
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/vecadd'
|
17
evaluation/perf_2021_03_07/4c/afu_default.fit.summary
Normal file
|
@ -0,0 +1,17 @@
|
|||
Fitter Status : Successful - Sat Mar 6 02:49:17 2021
|
||||
Quartus Prime Version : 19.2.0 Build 57 06/24/2019 Patches 0.01rc SJ Pro Edition
|
||||
Revision Name : afu_default
|
||||
Top-level Entity Name : dcp_top
|
||||
Family : Arria 10
|
||||
Device : 10AX115N2F40E2LG
|
||||
Timing Models : Final
|
||||
Logic utilization (in ALMs) : 117,451 / 427,200 ( 27 % )
|
||||
Total registers : 173797
|
||||
Total pins : 310 / 826 ( 38 % )
|
||||
Total virtual pins : 0
|
||||
Total block memory bits : 4,356,616 / 55,562,240 ( 8 % )
|
||||
Total RAM Blocks : 713 / 2,713 ( 26 % )
|
||||
Total DSP Blocks : 112 / 1,518 ( 7 % )
|
||||
Total HSSI RX channels : 12 / 48 ( 25 % )
|
||||
Total HSSI TX channels : 12 / 48 ( 25 % )
|
||||
Total PLLs : 25 / 112 ( 22 % )
|
6945
evaluation/perf_2021_03_07/4c/afu_default.sta.summary
Normal file
File diff suppressed because it is too large
4
evaluation/perf_2021_03_07/4c/afu_default.syn.summary
Normal file
@ -0,0 +1,4 @
Synthesis Status : Successful - Sat Mar 6 01:57:55 2021
Revision Name : afu_default
Top-level Entity Name : dcp_top
Family : Arria 10
36027
evaluation/perf_2021_03_07/4c/build.log
Normal file
File diff suppressed because it is too large
29
evaluation/perf_2021_03_07/4c/guassian.result
Normal file
|
@ -0,0 +1,29 @@
|
|||
CONFIGS=-DNUM_CLUSTERS=1 -DNUM_CORES=2 -DNUM_WARPS=4 -DNUM_THREADS=4 -DL2_ENABLE=0 -DL3_ENABLE=0 -DPERF_ENABLE
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
rm -rf libvortex.so *.o .depend
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/guassian'
|
||||
LD_LIBRARY_PATH=/opt/pocl/runtime/lib:/nethome/lcooper43/vortex-dev-old/driver/opae:/opt/opae/1.1.2/lib:/opt/inteldevstack/a10_gx_pac_ias_1_2_1_pv/opencl/opencl_bsp/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/host/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/linux64/lib: ./guassian
|
||||
enter demo main
|
||||
[VXDRV] DEVCAPS: version=0, num_cores=4, num_warps=4, num_threads=4
|
||||
OK
|
||||
The result of matrix m is:
|
||||
0.00 0.00 0.00 0.00
|
||||
0.50 0.00 0.00 0.00
|
||||
0.67 0.26 0.00 0.00
|
||||
-0.00 0.15 -0.28 0.00
|
||||
|
||||
The result of matrix a is:
|
||||
-0.60 -0.50 0.70 0.30
|
||||
0.00 -0.65 -0.05 0.55
|
||||
0.00 0.00 -0.75 -1.14
|
||||
0.00 0.00 0.00 0.50
|
||||
|
||||
The result of array b is:
|
||||
-0.85 -0.25 0.87 -0.25
|
||||
|
||||
The final solution is:
|
||||
0.70 0.00 -0.40 -0.50
|
||||
|
||||
Passed!
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/guassian'
|
19
evaluation/perf_2021_03_07/4c/nearn.result
Normal file
|
@ -0,0 +1,19 @@
|
|||
CONFIGS=-DNUM_CLUSTERS=1 -DNUM_CORES=2 -DNUM_WARPS=4 -DNUM_THREADS=4 -DL2_ENABLE=0 -DL3_ENABLE=0 -DPERF_ENABLE
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
rm -rf libvortex.so *.o .depend
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/nearn'
|
||||
LD_LIBRARY_PATH=/opt/pocl/runtime/lib:/nethome/lcooper43/vortex-dev-old/driver/opae:/opt/opae/1.1.2/lib:/opt/inteldevstack/a10_gx_pac_ias_1_2_1_pv/opencl/opencl_bsp/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/host/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/linux64/lib: ./nearn
|
||||
loading db: cane4_0.db
|
||||
loading db: cane4_1.db
|
||||
loading db: cane4_2.db
|
||||
Number of records: 1500
|
||||
Finding the 5 closest neighbors.
|
||||
[VXDRV] DEVCAPS: version=0, num_cores=4, num_warps=4, num_threads=4
|
||||
1974 12 22 18 24 JOYCE 30.6 89.9 80 593 --> Distance=0.608276
|
||||
1965 5 13 0 17 TONY 27.8 89.0 122 260 --> Distance=2.416610
|
||||
1991 3 18 12 19 DEBBY 28.5 87.8 107 850 --> Distance=2.662703
|
||||
1957 4 17 6 12 ALBERTO 32.5 87.8 54 510 --> Distance=3.330163
|
||||
1964 8 5 6 9 FLORENCE 31.5 86.3 18 242 --> Distance=3.992490
|
||||
Passed!
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/nearn'
|
19
evaluation/perf_2021_03_07/4c/saxpy.result
Normal file
|
@ -0,0 +1,19 @@
|
|||
CONFIGS=-DNUM_CLUSTERS=1 -DNUM_CORES=2 -DNUM_WARPS=4 -DNUM_THREADS=4 -DL2_ENABLE=0 -DL3_ENABLE=0 -DPERF_ENABLE
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
rm -rf libvortex.so *.o .depend
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/saxpy'
|
||||
LD_LIBRARY_PATH=/opt/pocl/runtime/lib:/nethome/lcooper43/vortex-dev-old/driver/opae:/opt/opae/1.1.2/lib:/opt/inteldevstack/a10_gx_pac_ias_1_2_1_pv/opencl/opencl_bsp/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/host/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/linux64/lib: ./saxpy
|
||||
enter demo main
|
||||
[VXDRV] DEVCAPS: version=0, num_cores=4, num_warps=4, num_threads=4
|
||||
Attempting to create program from binary...
|
||||
Read program from binary.
|
||||
attempting to create input buffer
|
||||
attempting to create output buffer
|
||||
attempting to create kernel
|
||||
setting up kernel args
|
||||
attempting to enqueue write buffer
|
||||
attempting to enqueue kernel
|
||||
Elapsed time: 4 ms
|
||||
Download destination buffer
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/saxpy'
|
19
evaluation/perf_2021_03_07/4c/sfilter.result
Normal file
|
@ -0,0 +1,19 @@
|
|||
CONFIGS=-DNUM_CLUSTERS=1 -DNUM_CORES=2 -DNUM_WARPS=4 -DNUM_THREADS=4 -DL2_ENABLE=0 -DL3_ENABLE=0 -DPERF_ENABLE
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
rm -rf libvortex.so *.o .depend
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
|
||||
make: Entering directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/sfilter'
|
||||
LD_LIBRARY_PATH=/opt/pocl/runtime/lib:/nethome/lcooper43/vortex-dev-old/driver/opae:/opt/opae/1.1.2/lib:/opt/inteldevstack/a10_gx_pac_ias_1_2_1_pv/opencl/opencl_bsp/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/host/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/linux64/lib: ./sfilter
|
||||
enter demo main
|
||||
[VXDRV] DEVCAPS: version=0, num_cores=4, num_warps=4, num_threads=4
|
||||
Attempting to create program from binary...
|
||||
Read program from binary.
|
||||
attempting to create input buffer
|
||||
attempting to create output buffer
|
||||
attempting to create kernel
|
||||
setting up kernel args
|
||||
attempting to enqueue write buffer
|
||||
attempting to enqueue kernel
|
||||
Elapsed time: 4 ms
|
||||
Download destination buffer
|
||||
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/sfilter'
|
146
evaluation/perf_2021_03_07/4c/sgemm.result
Normal file
@@ -0,0 +1,146 @@
CONFIGS=-DNUM_CLUSTERS=1 -DNUM_CORES=2 -DNUM_WARPS=4 -DNUM_THREADS=4 -DL2_ENABLE=0 -DL3_ENABLE=0 -DPERF_ENABLE
make: Entering directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
rm -rf libvortex.so *.o .depend
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
make: Entering directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/sgemm'
LD_LIBRARY_PATH=/opt/pocl/runtime/lib:/nethome/lcooper43/vortex-dev-old/driver/opae:/opt/opae/1.1.2/lib:/opt/inteldevstack/a10_gx_pac_ias_1_2_1_pv/opencl/opencl_bsp/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/host/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/linux64/lib: ./sgemm -n32
[VXDRV] DEVCAPS: version=0, num_cores=4, num_warps=4, num_threads=4
Create context
Create program from kernel source
Upload source buffers
Execute the kernel
Elapsed time: 3 ms
Download destination buffer
Verify result
PASSED!
PERF: core0: instrs=90890, cycles=51133, IPC=1.777521
PERF: core0: ibuffer stalls=10132
PERF: core0: scoreboard stalls=15251
PERF: core0: alu unit stalls=2423
PERF: core0: lsu unit stalls=3859
PERF: core0: csr unit stalls=0
PERF: core0: fpu unit stalls=0
PERF: core0: gpu unit stalls=0
PERF: core0: icache reads=23003
PERF: core0: icache read misses=73 (hit ratio=99%)
PERF: core0: icache pipeline stalls=7639
PERF: core0: icache reponse stalls=10132
PERF: core0: dcache reads=17502
PERF: core0: dcache writes=293
PERF: core0: dcache read misses=1041 (hit ratio=94%)
PERF: core0: dcache write misses=289 (hit ratio=1%)
PERF: core0: dcache bank stalls=8464 (utilization=67%)
PERF: core0: dcache mshr stalls=4228
PERF: core0: dcache pipeline stalls=9676
PERF: core0: dcache reponse stalls=76
PERF: core0: smem reads=2026
PERF: core0: smem writes=1599
PERF: core0: smem bank stalls=0 (utilization=100%)
PERF: core0: dram requests=479 (reads=186, writes=293)
PERF: core0: dram stalls=789 (utilization=37%)
PERF: core0: dram average latency=32 cycles
PERF: core1: instrs=90890, cycles=51143, IPC=1.777174
PERF: core1: ibuffer stalls=10158
PERF: core1: scoreboard stalls=15244
PERF: core1: alu unit stalls=2440
PERF: core1: lsu unit stalls=3894
PERF: core1: csr unit stalls=0
PERF: core1: fpu unit stalls=0
PERF: core1: gpu unit stalls=0
PERF: core1: icache reads=23003
PERF: core1: icache read misses=73 (hit ratio=99%)
PERF: core1: icache pipeline stalls=7685
PERF: core1: icache reponse stalls=10158
PERF: core1: dcache reads=17502
PERF: core1: dcache writes=293
PERF: core1: dcache read misses=1101 (hit ratio=93%)
PERF: core1: dcache write misses=289 (hit ratio=1%)
PERF: core1: dcache bank stalls=8464 (utilization=67%)
PERF: core1: dcache mshr stalls=4330
PERF: core1: dcache pipeline stalls=9347
PERF: core1: dcache reponse stalls=67
PERF: core1: smem reads=2026
PERF: core1: smem writes=1599
PERF: core1: smem bank stalls=0 (utilization=100%)
PERF: core1: dram requests=509 (reads=216, writes=293)
PERF: core1: dram stalls=715 (utilization=41%)
PERF: core1: dram average latency=32 cycles
PERF: core2: instrs=90890, cycles=51135, IPC=1.777452
PERF: core2: ibuffer stalls=10120
PERF: core2: scoreboard stalls=15237
PERF: core2: alu unit stalls=2406
PERF: core2: lsu unit stalls=3881
PERF: core2: csr unit stalls=0
PERF: core2: fpu unit stalls=0
PERF: core2: gpu unit stalls=0
PERF: core2: icache reads=23003
PERF: core2: icache read misses=73 (hit ratio=99%)
PERF: core2: icache pipeline stalls=7651
PERF: core2: icache reponse stalls=10120
PERF: core2: dcache reads=17502
PERF: core2: dcache writes=293
PERF: core2: dcache read misses=1040 (hit ratio=94%)
PERF: core2: dcache write misses=289 (hit ratio=1%)
PERF: core2: dcache bank stalls=8464 (utilization=67%)
PERF: core2: dcache mshr stalls=4234
PERF: core2: dcache pipeline stalls=9580
PERF: core2: dcache reponse stalls=75
PERF: core2: smem reads=2026
PERF: core2: smem writes=1599
PERF: core2: smem bank stalls=0 (utilization=100%)
PERF: core2: dram requests=478 (reads=185, writes=293)
PERF: core2: dram stalls=776 (utilization=38%)
PERF: core2: dram average latency=32 cycles
PERF: core3: instrs=90892, cycles=51134, IPC=1.777526
PERF: core3: ibuffer stalls=10116
PERF: core3: scoreboard stalls=15282
PERF: core3: alu unit stalls=2380
PERF: core3: lsu unit stalls=3862
PERF: core3: csr unit stalls=0
PERF: core3: fpu unit stalls=0
PERF: core3: gpu unit stalls=0
PERF: core3: icache reads=23005
PERF: core3: icache read misses=73 (hit ratio=99%)
PERF: core3: icache pipeline stalls=7688
PERF: core3: icache reponse stalls=10116
PERF: core3: dcache reads=17502
PERF: core3: dcache writes=293
PERF: core3: dcache read misses=1040 (hit ratio=94%)
PERF: core3: dcache write misses=289 (hit ratio=1%)
PERF: core3: dcache bank stalls=8464 (utilization=67%)
PERF: core3: dcache mshr stalls=4421
PERF: core3: dcache pipeline stalls=9647
PERF: core3: dcache reponse stalls=76
PERF: core3: smem reads=2026
PERF: core3: smem writes=1599
PERF: core3: smem bank stalls=0 (utilization=100%)
PERF: core3: dram requests=478 (reads=185, writes=293)
PERF: core3: dram stalls=684 (utilization=41%)
PERF: core3: dram average latency=32 cycles
PERF: instrs=363562, cycles=51143, IPC=7.108734
PERF: ibuffer stalls=40526
PERF: scoreboard stalls=61014
PERF: alu unit stalls=9649
PERF: lsu unit stalls=15496
PERF: csr unit stalls=0
PERF: fpu unit stalls=0
PERF: gpu unit stalls=0
PERF: icache reads=92014
PERF: icache read misses=292 (hit ratio=99%)
PERF: icache pipeline stalls=30663
PERF: icache reponse stalls=40526
PERF: dcache reads=70008
PERF: dcache writes=1172
PERF: dcache read misses=4222 (hit ratio=93%)
PERF: dcache write misses=1156 (hit ratio=1%)
PERF: dcache bank stalls=33856 (utilization=67%)
PERF: dcache mshr stalls=17213
PERF: dcache pipeline stalls=38250
PERF: dcache reponse stalls=294
PERF: smem reads=8104
PERF: smem writes=6396
PERF: smem bank stalls=0 (utilization=100%)
PERF: dram requests=1944 (reads=772, writes=1172)
PERF: dram stalls=2964 (utilization=39%)
PERF: dram average latency=32 cycles
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/sgemm'
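The aggregate `PERF:` line in this 4-core sgemm log appears to be the sum of the per-core instruction counts divided by the largest per-core cycle count. That rule is inferred from the numbers in the log, not taken from the driver source, so the Python sketch below is only a cross-check; the name `aggregate_ipc` is hypothetical.

```python
import re

# Per-core summary lines copied from the 4-core sgemm log.
LOG = """\
PERF: core0: instrs=90890, cycles=51133, IPC=1.777521
PERF: core1: instrs=90890, cycles=51143, IPC=1.777174
PERF: core2: instrs=90890, cycles=51135, IPC=1.777452
PERF: core3: instrs=90892, cycles=51134, IPC=1.777526
"""

CORE_LINE = re.compile(r"PERF: core(\d+): instrs=(\d+), cycles=(\d+)")

def aggregate_ipc(log_text):
    """Return (total_instrs, max_cycles, ipc) over all per-core lines.

    Assumption: cores run concurrently, so the elapsed cycle count is the
    maximum over cores, while retired instructions add up.
    """
    cores = [(int(i), int(c)) for _, i, c in CORE_LINE.findall(log_text)]
    total_instrs = sum(i for i, _ in cores)
    max_cycles = max(c for _, c in cores)
    return total_instrs, max_cycles, total_instrs / max_cycles

instrs, cycles, ipc = aggregate_ipc(LOG)
print(f"instrs={instrs}, cycles={cycles}, IPC={ipc:.6f}")
```

Under that assumption the sketch reproduces the aggregate values reported in the log (`instrs=363562, cycles=51143, IPC=7.108734`).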
3
evaluation/perf_2021_03_07/4c/user_clock_freq.txt
Normal file
@@ -0,0 +1,3 @@
# Generated by Platform Interface Manager user_clock_config.tcl
afu-image/clock-frequency-low:93.0
afu-image/clock-frequency-high:186
147
evaluation/perf_2021_03_07/4c/vecadd.result
Normal file
@@ -0,0 +1,147 @@
CONFIGS=-DNUM_CLUSTERS=1 -DNUM_CORES=2 -DNUM_WARPS=4 -DNUM_THREADS=4 -DL2_ENABLE=0 -DL3_ENABLE=0 -DPERF_ENABLE
make: Entering directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
rm -rf libvortex.so *.o .depend
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
make: Entering directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/vecadd'
LD_LIBRARY_PATH=/opt/pocl/runtime/lib:/nethome/lcooper43/vortex-dev-old/driver/opae:/opt/opae/1.1.2/lib:/opt/inteldevstack/a10_gx_pac_ias_1_2_1_pv/opencl/opencl_bsp/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/host/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/linux64/lib: ./vecadd -n64
[VXDRV] DEVCAPS: version=0, num_cores=4, num_warps=4, num_threads=4
Create context
Allocate device buffers
Create program from kernel source
Upload source buffers
Execute the kernel
Elapsed time: 4 ms
Download destination buffer
Verify result
PASSED!
PERF: core0: instrs=2019, cycles=5042, IPC=0.400436
PERF: core0: ibuffer stalls=86
PERF: core0: scoreboard stalls=451
PERF: core0: alu unit stalls=68
PERF: core0: lsu unit stalls=53
PERF: core0: csr unit stalls=0
PERF: core0: fpu unit stalls=0
PERF: core0: gpu unit stalls=0
PERF: core0: icache reads=804
PERF: core0: icache read misses=65 (hit ratio=91%)
PERF: core0: icache pipeline stalls=469
PERF: core0: icache reponse stalls=86
PERF: core0: dcache reads=114
PERF: core0: dcache writes=65
PERF: core0: dcache read misses=28 (hit ratio=75%)
PERF: core0: dcache write misses=60 (hit ratio=7%)
PERF: core0: dcache bank stalls=72 (utilization=71%)
PERF: core0: dcache mshr stalls=56
PERF: core0: dcache pipeline stalls=88
PERF: core0: dcache reponse stalls=1
PERF: core0: smem reads=70
PERF: core0: smem writes=63
PERF: core0: smem bank stalls=0 (utilization=100%)
PERF: core0: dram requests=109 (reads=44, writes=65)
PERF: core0: dram stalls=53 (utilization=67%)
PERF: core0: dram average latency=31 cycles
PERF: core1: instrs=2019, cycles=5041, IPC=0.400516
PERF: core1: ibuffer stalls=86
PERF: core1: scoreboard stalls=451
PERF: core1: alu unit stalls=68
PERF: core1: lsu unit stalls=53
PERF: core1: csr unit stalls=0
PERF: core1: fpu unit stalls=0
PERF: core1: gpu unit stalls=0
PERF: core1: icache reads=804
PERF: core1: icache read misses=65 (hit ratio=91%)
PERF: core1: icache pipeline stalls=470
PERF: core1: icache reponse stalls=86
PERF: core1: dcache reads=114
PERF: core1: dcache writes=65
PERF: core1: dcache read misses=28 (hit ratio=75%)
PERF: core1: dcache write misses=60 (hit ratio=7%)
PERF: core1: dcache bank stalls=72 (utilization=71%)
PERF: core1: dcache mshr stalls=56
PERF: core1: dcache pipeline stalls=88
PERF: core1: dcache reponse stalls=1
PERF: core1: smem reads=70
PERF: core1: smem writes=63
PERF: core1: smem bank stalls=0 (utilization=100%)
PERF: core1: dram requests=109 (reads=44, writes=65)
PERF: core1: dram stalls=52 (utilization=67%)
PERF: core1: dram average latency=31 cycles
PERF: core2: instrs=2019, cycles=5040, IPC=0.400595
PERF: core2: ibuffer stalls=86
PERF: core2: scoreboard stalls=451
PERF: core2: alu unit stalls=68
PERF: core2: lsu unit stalls=53
PERF: core2: csr unit stalls=0
PERF: core2: fpu unit stalls=0
PERF: core2: gpu unit stalls=0
PERF: core2: icache reads=804
PERF: core2: icache read misses=65 (hit ratio=91%)
PERF: core2: icache pipeline stalls=470
PERF: core2: icache reponse stalls=86
PERF: core2: dcache reads=114
PERF: core2: dcache writes=65
PERF: core2: dcache read misses=28 (hit ratio=75%)
PERF: core2: dcache write misses=60 (hit ratio=7%)
PERF: core2: dcache bank stalls=72 (utilization=71%)
PERF: core2: dcache mshr stalls=56
PERF: core2: dcache pipeline stalls=88
PERF: core2: dcache reponse stalls=1
PERF: core2: smem reads=70
PERF: core2: smem writes=63
PERF: core2: smem bank stalls=0 (utilization=100%)
PERF: core2: dram requests=109 (reads=44, writes=65)
PERF: core2: dram stalls=51 (utilization=68%)
PERF: core2: dram average latency=31 cycles
PERF: core3: instrs=2021, cycles=5043, IPC=0.400754
PERF: core3: ibuffer stalls=102
PERF: core3: scoreboard stalls=496
PERF: core3: alu unit stalls=73
PERF: core3: lsu unit stalls=53
PERF: core3: csr unit stalls=0
PERF: core3: fpu unit stalls=0
PERF: core3: gpu unit stalls=0
PERF: core3: icache reads=806
PERF: core3: icache read misses=65 (hit ratio=91%)
PERF: core3: icache pipeline stalls=439
PERF: core3: icache reponse stalls=102
PERF: core3: dcache reads=114
PERF: core3: dcache writes=65
PERF: core3: dcache read misses=28 (hit ratio=75%)
PERF: core3: dcache write misses=60 (hit ratio=7%)
PERF: core3: dcache bank stalls=72 (utilization=71%)
PERF: core3: dcache mshr stalls=56
PERF: core3: dcache pipeline stalls=88
PERF: core3: dcache reponse stalls=1
PERF: core3: smem reads=70
PERF: core3: smem writes=63
PERF: core3: smem bank stalls=0 (utilization=100%)
PERF: core3: dram requests=109 (reads=44, writes=65)
PERF: core3: dram stalls=50 (utilization=68%)
PERF: core3: dram average latency=30 cycles
PERF: instrs=8078, cycles=5043, IPC=1.601824
PERF: ibuffer stalls=360
PERF: scoreboard stalls=1849
PERF: alu unit stalls=277
PERF: lsu unit stalls=212
PERF: csr unit stalls=0
PERF: fpu unit stalls=0
PERF: gpu unit stalls=0
PERF: icache reads=3218
PERF: icache read misses=260 (hit ratio=91%)
PERF: icache pipeline stalls=1848
PERF: icache reponse stalls=360
PERF: dcache reads=456
PERF: dcache writes=260
PERF: dcache read misses=112 (hit ratio=75%)
PERF: dcache write misses=240 (hit ratio=7%)
PERF: dcache bank stalls=288 (utilization=71%)
PERF: dcache mshr stalls=224
PERF: dcache pipeline stalls=352
PERF: dcache reponse stalls=4
PERF: smem reads=280
PERF: smem writes=252
PERF: smem bank stalls=0 (utilization=100%)
PERF: dram requests=436 (reads=176, writes=260)
PERF: dram stalls=206 (utilization=67%)
PERF: dram average latency=30 cycles
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/vecadd'
17
evaluation/perf_2021_03_07/8c/afu_default.fit.summary
Normal file
@@ -0,0 +1,17 @@
Fitter Status : Successful - Sat Mar 6 04:32:43 2021
Quartus Prime Version : 19.2.0 Build 57 06/24/2019 Patches 0.01rc SJ Pro Edition
Revision Name : afu_default
Top-level Entity Name : dcp_top
Family : Arria 10
Device : 10AX115N2F40E2LG
Timing Models : Final
Logic utilization (in ALMs) : 190,373 / 427,200 ( 45 % )
Total registers : 288074
Total pins : 310 / 826 ( 38 % )
Total virtual pins : 0
Total block memory bits : 7,135,144 / 55,562,240 ( 13 % )
Total RAM Blocks : 1,237 / 2,713 ( 46 % )
Total DSP Blocks : 224 / 1,518 ( 15 % )
Total HSSI RX channels : 12 / 48 ( 25 % )
Total HSSI TX channels : 12 / 48 ( 25 % )
Total PLLs : 25 / 112 ( 22 % )
6945
evaluation/perf_2021_03_07/8c/afu_default.sta.summary
Normal file
File diff suppressed because it is too large
4
evaluation/perf_2021_03_07/8c/afu_default.syn.summary
Normal file
@@ -0,0 +1,4 @@
Synthesis Status : Successful - Sat Mar 6 03:10:30 2021
Revision Name : afu_default
Top-level Entity Name : dcp_top
Family : Arria 10
39983
evaluation/perf_2021_03_07/8c/build.log
Normal file
File diff suppressed because it is too large
29
evaluation/perf_2021_03_07/8c/guassian.result
Normal file
@@ -0,0 +1,29 @@
CONFIGS=-DNUM_CLUSTERS=1 -DNUM_CORES=2 -DNUM_WARPS=4 -DNUM_THREADS=4 -DL2_ENABLE=0 -DL3_ENABLE=0 -DPERF_ENABLE
make: Entering directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
rm -rf libvortex.so *.o .depend
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
make: Entering directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/guassian'
LD_LIBRARY_PATH=/opt/pocl/runtime/lib:/nethome/lcooper43/vortex-dev-old/driver/opae:/opt/opae/1.1.2/lib:/opt/inteldevstack/a10_gx_pac_ias_1_2_1_pv/opencl/opencl_bsp/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/host/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/linux64/lib: ./guassian
enter demo main
[VXDRV] DEVCAPS: version=0, num_cores=8, num_warps=4, num_threads=4
OK
The result of matrix m is:
0.00 0.00 0.00 0.00
0.50 0.00 0.00 0.00
0.67 0.26 0.00 0.00
-0.00 0.15 -0.28 0.00

The result of matrix a is:
-0.60 -0.50 0.70 0.30
0.00 -0.65 -0.05 0.55
0.00 0.00 -0.75 -1.14
0.00 0.00 0.00 0.50

The result of array b is:
-0.85 -0.25 0.87 -0.25

The final solution is:
0.70 0.00 -0.40 -0.50

Passed!
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/guassian'
19
evaluation/perf_2021_03_07/8c/nearn.result
Normal file
@@ -0,0 +1,19 @@
CONFIGS=-DNUM_CLUSTERS=1 -DNUM_CORES=2 -DNUM_WARPS=4 -DNUM_THREADS=4 -DL2_ENABLE=0 -DL3_ENABLE=0 -DPERF_ENABLE
make: Entering directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
rm -rf libvortex.so *.o .depend
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
make: Entering directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/nearn'
LD_LIBRARY_PATH=/opt/pocl/runtime/lib:/nethome/lcooper43/vortex-dev-old/driver/opae:/opt/opae/1.1.2/lib:/opt/inteldevstack/a10_gx_pac_ias_1_2_1_pv/opencl/opencl_bsp/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/host/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/linux64/lib: ./nearn
loading db: cane4_0.db
loading db: cane4_1.db
loading db: cane4_2.db
Number of records: 1500
Finding the 5 closest neighbors.
[VXDRV] DEVCAPS: version=0, num_cores=8, num_warps=4, num_threads=4
1974 12 22 18 24 JOYCE 30.6 89.9 80 593 --> Distance=0.608276
1965 5 13 0 17 TONY 27.8 89.0 122 260 --> Distance=2.416610
1991 3 18 12 19 DEBBY 28.5 87.8 107 850 --> Distance=2.662703
1957 4 17 6 12 ALBERTO 32.5 87.8 54 510 --> Distance=3.330163
1964 8 5 6 9 FLORENCE 31.5 86.3 18 242 --> Distance=3.992490
Passed!
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/nearn'
19
evaluation/perf_2021_03_07/8c/saxpy.result
Normal file
@@ -0,0 +1,19 @@
CONFIGS=-DNUM_CLUSTERS=1 -DNUM_CORES=2 -DNUM_WARPS=4 -DNUM_THREADS=4 -DL2_ENABLE=0 -DL3_ENABLE=0 -DPERF_ENABLE
make: Entering directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
rm -rf libvortex.so *.o .depend
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
make: Entering directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/saxpy'
LD_LIBRARY_PATH=/opt/pocl/runtime/lib:/nethome/lcooper43/vortex-dev-old/driver/opae:/opt/opae/1.1.2/lib:/opt/inteldevstack/a10_gx_pac_ias_1_2_1_pv/opencl/opencl_bsp/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/host/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/linux64/lib: ./saxpy
enter demo main
[VXDRV] DEVCAPS: version=0, num_cores=8, num_warps=4, num_threads=4
Attempting to create program from binary...
Read program from binary.
attempting to create input buffer
attempting to create output buffer
attempting to create kernel
setting up kernel args
attempting to enqueue write buffer
attempting to enqueue kernel
Elapsed time: 4 ms
Download destination buffer
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/saxpy'
19
evaluation/perf_2021_03_07/8c/sfilter.result
Normal file
@@ -0,0 +1,19 @@
CONFIGS=-DNUM_CLUSTERS=1 -DNUM_CORES=2 -DNUM_WARPS=4 -DNUM_THREADS=4 -DL2_ENABLE=0 -DL3_ENABLE=0 -DPERF_ENABLE
make: Entering directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
rm -rf libvortex.so *.o .depend
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
make: Entering directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/sfilter'
LD_LIBRARY_PATH=/opt/pocl/runtime/lib:/nethome/lcooper43/vortex-dev-old/driver/opae:/opt/opae/1.1.2/lib:/opt/inteldevstack/a10_gx_pac_ias_1_2_1_pv/opencl/opencl_bsp/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/host/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/linux64/lib: ./sfilter
enter demo main
[VXDRV] DEVCAPS: version=0, num_cores=8, num_warps=4, num_threads=4
Attempting to create program from binary...
Read program from binary.
attempting to create input buffer
attempting to create output buffer
attempting to create kernel
setting up kernel args
attempting to enqueue write buffer
attempting to enqueue kernel
Elapsed time: 4 ms
Download destination buffer
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/sfilter'
250
evaluation/perf_2021_03_07/8c/sgemm.result
Normal file
@@ -0,0 +1,250 @@
CONFIGS=-DNUM_CLUSTERS=1 -DNUM_CORES=2 -DNUM_WARPS=4 -DNUM_THREADS=4 -DL2_ENABLE=0 -DL3_ENABLE=0 -DPERF_ENABLE
make: Entering directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
rm -rf libvortex.so *.o .depend
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
make: Entering directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/sgemm'
LD_LIBRARY_PATH=/opt/pocl/runtime/lib:/nethome/lcooper43/vortex-dev-old/driver/opae:/opt/opae/1.1.2/lib:/opt/inteldevstack/a10_gx_pac_ias_1_2_1_pv/opencl/opencl_bsp/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/host/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/linux64/lib: ./sgemm -n32
[VXDRV] DEVCAPS: version=0, num_cores=8, num_warps=4, num_threads=4
Create context
Create program from kernel source
Upload source buffers
Execute the kernel
Elapsed time: 4 ms
Download destination buffer
Verify result
PASSED!
PERF: core0: instrs=45962, cycles=25060, IPC=1.834078
PERF: core0: ibuffer stalls=0
PERF: core0: scoreboard stalls=0
PERF: core0: alu unit stalls=0
PERF: core0: lsu unit stalls=0
PERF: core0: csr unit stalls=0
PERF: core0: fpu unit stalls=0
PERF: core0: gpu unit stalls=0
PERF: core0: icache reads=0
PERF: core0: icache read misses=0 (hit ratio=-2147483648%)
PERF: core0: icache pipeline stalls=0
PERF: core0: icache reponse stalls=0
PERF: core0: dcache reads=0
PERF: core0: dcache writes=0
PERF: core0: dcache read misses=0 (hit ratio=-2147483648%)
PERF: core0: dcache write misses=0 (hit ratio=-2147483648%)
PERF: core0: dcache bank stalls=0 (utilization=-2147483648%)
PERF: core0: dcache mshr stalls=0
PERF: core0: dcache pipeline stalls=0
PERF: core0: dcache reponse stalls=0
PERF: core0: smem reads=0
PERF: core0: smem writes=0
PERF: core0: smem bank stalls=0 (utilization=-2147483648%)
PERF: core0: dram requests=0 (reads=0, writes=0)
PERF: core0: dram stalls=0 (utilization=-2147483648%)
PERF: core0: dram average latency=-2147483648 cycles
PERF: core1: instrs=45962, cycles=25057, IPC=1.834298
PERF: core1: ibuffer stalls=0
PERF: core1: scoreboard stalls=0
PERF: core1: alu unit stalls=0
PERF: core1: lsu unit stalls=0
PERF: core1: csr unit stalls=0
PERF: core1: fpu unit stalls=0
PERF: core1: gpu unit stalls=0
PERF: core1: icache reads=0
PERF: core1: icache read misses=0 (hit ratio=-2147483648%)
PERF: core1: icache pipeline stalls=0
PERF: core1: icache reponse stalls=0
PERF: core1: dcache reads=0
PERF: core1: dcache writes=0
PERF: core1: dcache read misses=0 (hit ratio=-2147483648%)
PERF: core1: dcache write misses=0 (hit ratio=-2147483648%)
PERF: core1: dcache bank stalls=0 (utilization=-2147483648%)
PERF: core1: dcache mshr stalls=0
PERF: core1: dcache pipeline stalls=0
PERF: core1: dcache reponse stalls=0
PERF: core1: smem reads=0
PERF: core1: smem writes=0
PERF: core1: smem bank stalls=0 (utilization=-2147483648%)
PERF: core1: dram requests=0 (reads=0, writes=0)
PERF: core1: dram stalls=0 (utilization=-2147483648%)
PERF: core1: dram average latency=-2147483648 cycles
PERF: core2: instrs=45962, cycles=25062, IPC=1.833932
PERF: core2: ibuffer stalls=0
PERF: core2: scoreboard stalls=0
PERF: core2: alu unit stalls=0
PERF: core2: lsu unit stalls=0
PERF: core2: csr unit stalls=0
PERF: core2: fpu unit stalls=0
PERF: core2: gpu unit stalls=0
PERF: core2: icache reads=0
PERF: core2: icache read misses=0 (hit ratio=-2147483648%)
PERF: core2: icache pipeline stalls=0
PERF: core2: icache reponse stalls=0
PERF: core2: dcache reads=0
PERF: core2: dcache writes=0
PERF: core2: dcache read misses=0 (hit ratio=-2147483648%)
PERF: core2: dcache write misses=0 (hit ratio=-2147483648%)
PERF: core2: dcache bank stalls=0 (utilization=-2147483648%)
PERF: core2: dcache mshr stalls=0
PERF: core2: dcache pipeline stalls=0
PERF: core2: dcache reponse stalls=0
PERF: core2: smem reads=0
PERF: core2: smem writes=0
PERF: core2: smem bank stalls=0 (utilization=-2147483648%)
PERF: core2: dram requests=0 (reads=0, writes=0)
PERF: core2: dram stalls=0 (utilization=-2147483648%)
PERF: core2: dram average latency=-2147483648 cycles
PERF: core3: instrs=45962, cycles=25054, IPC=1.834517
PERF: core3: ibuffer stalls=0
PERF: core3: scoreboard stalls=0
PERF: core3: alu unit stalls=0
PERF: core3: lsu unit stalls=0
PERF: core3: csr unit stalls=0
PERF: core3: fpu unit stalls=0
PERF: core3: gpu unit stalls=0
PERF: core3: icache reads=0
PERF: core3: icache read misses=0 (hit ratio=-2147483648%)
PERF: core3: icache pipeline stalls=0
PERF: core3: icache reponse stalls=0
PERF: core3: dcache reads=0
PERF: core3: dcache writes=0
PERF: core3: dcache read misses=0 (hit ratio=-2147483648%)
PERF: core3: dcache write misses=0 (hit ratio=-2147483648%)
PERF: core3: dcache bank stalls=0 (utilization=-2147483648%)
PERF: core3: dcache mshr stalls=0
PERF: core3: dcache pipeline stalls=0
PERF: core3: dcache reponse stalls=0
PERF: core3: smem reads=0
PERF: core3: smem writes=0
PERF: core3: smem bank stalls=0 (utilization=-2147483648%)
PERF: core3: dram requests=0 (reads=0, writes=0)
PERF: core3: dram stalls=0 (utilization=-2147483648%)
PERF: core3: dram average latency=-2147483648 cycles
PERF: core4: instrs=45962, cycles=25056, IPC=1.834371
PERF: core4: ibuffer stalls=0
PERF: core4: scoreboard stalls=0
PERF: core4: alu unit stalls=0
PERF: core4: lsu unit stalls=0
PERF: core4: csr unit stalls=0
PERF: core4: fpu unit stalls=0
PERF: core4: gpu unit stalls=0
PERF: core4: icache reads=0
PERF: core4: icache read misses=0 (hit ratio=-2147483648%)
PERF: core4: icache pipeline stalls=0
PERF: core4: icache reponse stalls=0
PERF: core4: dcache reads=0
PERF: core4: dcache writes=0
PERF: core4: dcache read misses=0 (hit ratio=-2147483648%)
PERF: core4: dcache write misses=0 (hit ratio=-2147483648%)
PERF: core4: dcache bank stalls=0 (utilization=-2147483648%)
PERF: core4: dcache mshr stalls=0
PERF: core4: dcache pipeline stalls=0
PERF: core4: dcache reponse stalls=0
PERF: core4: smem reads=0
PERF: core4: smem writes=0
PERF: core4: smem bank stalls=0 (utilization=-2147483648%)
PERF: core4: dram requests=0 (reads=0, writes=0)
PERF: core4: dram stalls=0 (utilization=-2147483648%)
PERF: core4: dram average latency=-2147483648 cycles
PERF: core5: instrs=45962, cycles=25066, IPC=1.833639
PERF: core5: ibuffer stalls=0
PERF: core5: scoreboard stalls=0
PERF: core5: alu unit stalls=0
PERF: core5: lsu unit stalls=0
PERF: core5: csr unit stalls=0
PERF: core5: fpu unit stalls=0
PERF: core5: gpu unit stalls=0
PERF: core5: icache reads=0
PERF: core5: icache read misses=0 (hit ratio=-2147483648%)
PERF: core5: icache pipeline stalls=0
PERF: core5: icache reponse stalls=0
PERF: core5: dcache reads=0
PERF: core5: dcache writes=0
PERF: core5: dcache read misses=0 (hit ratio=-2147483648%)
PERF: core5: dcache write misses=0 (hit ratio=-2147483648%)
PERF: core5: dcache bank stalls=0 (utilization=-2147483648%)
PERF: core5: dcache mshr stalls=0
PERF: core5: dcache pipeline stalls=0
PERF: core5: dcache reponse stalls=0
PERF: core5: smem reads=0
PERF: core5: smem writes=0
PERF: core5: smem bank stalls=0 (utilization=-2147483648%)
PERF: core5: dram requests=0 (reads=0, writes=0)
PERF: core5: dram stalls=0 (utilization=-2147483648%)
PERF: core5: dram average latency=-2147483648 cycles
PERF: core6: instrs=45962, cycles=25058, IPC=1.834225
PERF: core6: ibuffer stalls=0
PERF: core6: scoreboard stalls=0
PERF: core6: alu unit stalls=0
PERF: core6: lsu unit stalls=0
PERF: core6: csr unit stalls=0
PERF: core6: fpu unit stalls=0
PERF: core6: gpu unit stalls=0
PERF: core6: icache reads=0
PERF: core6: icache read misses=0 (hit ratio=-2147483648%)
PERF: core6: icache pipeline stalls=0
PERF: core6: icache reponse stalls=0
PERF: core6: dcache reads=0
PERF: core6: dcache writes=0
PERF: core6: dcache read misses=0 (hit ratio=-2147483648%)
PERF: core6: dcache write misses=0 (hit ratio=-2147483648%)
PERF: core6: dcache bank stalls=0 (utilization=-2147483648%)
PERF: core6: dcache mshr stalls=0
PERF: core6: dcache pipeline stalls=0
PERF: core6: dcache reponse stalls=0
PERF: core6: smem reads=0
PERF: core6: smem writes=0
PERF: core6: smem bank stalls=0 (utilization=-2147483648%)
PERF: core6: dram requests=0 (reads=0, writes=0)
PERF: core6: dram stalls=0 (utilization=-2147483648%)
PERF: core6: dram average latency=-2147483648 cycles
PERF: core7: instrs=45964, cycles=25061, IPC=1.834085
PERF: core7: ibuffer stalls=0
PERF: core7: scoreboard stalls=0
PERF: core7: alu unit stalls=0
PERF: core7: lsu unit stalls=0
PERF: core7: csr unit stalls=0
PERF: core7: fpu unit stalls=0
PERF: core7: gpu unit stalls=0
PERF: core7: icache reads=0
PERF: core7: icache read misses=0 (hit ratio=-2147483648%)
PERF: core7: icache pipeline stalls=0
PERF: core7: icache reponse stalls=0
PERF: core7: dcache reads=0
PERF: core7: dcache writes=0
PERF: core7: dcache read misses=0 (hit ratio=-2147483648%)
PERF: core7: dcache write misses=0 (hit ratio=-2147483648%)
PERF: core7: dcache bank stalls=0 (utilization=-2147483648%)
PERF: core7: dcache mshr stalls=0
PERF: core7: dcache pipeline stalls=0
PERF: core7: dcache reponse stalls=0
PERF: core7: smem reads=0
PERF: core7: smem writes=0
PERF: core7: smem bank stalls=0 (utilization=-2147483648%)
PERF: core7: dram requests=0 (reads=0, writes=0)
PERF: core7: dram stalls=0 (utilization=-2147483648%)
PERF: core7: dram average latency=-2147483648 cycles
PERF: instrs=367698, cycles=25066, IPC=14.669193
PERF: ibuffer stalls=0
PERF: scoreboard stalls=0
PERF: alu unit stalls=0
PERF: lsu unit stalls=0
PERF: csr unit stalls=0
PERF: fpu unit stalls=0
PERF: gpu unit stalls=0
PERF: icache reads=0
PERF: icache read misses=0 (hit ratio=-2147483648%)
PERF: icache pipeline stalls=0
PERF: icache reponse stalls=0
PERF: dcache reads=0
PERF: dcache writes=0
PERF: dcache read misses=0 (hit ratio=-2147483648%)
PERF: dcache write misses=0 (hit ratio=-2147483648%)
PERF: dcache bank stalls=0 (utilization=-2147483648%)
PERF: dcache mshr stalls=0
PERF: dcache pipeline stalls=0
PERF: dcache reponse stalls=0
PERF: smem reads=0
PERF: smem writes=0
PERF: smem bank stalls=0 (utilization=-2147483648%)
PERF: dram requests=0 (reads=0, writes=0)
PERF: dram stalls=0 (utilization=-2147483648%)
PERF: dram average latency=-2147483648 cycles
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/sgemm'
3
evaluation/perf_2021_03_07/8c/user_clock_freq.txt
Normal file
@@ -0,0 +1,3 @@
# Generated by Platform Interface Manager user_clock_config.tcl
afu-image/clock-frequency-low:90.0
afu-image/clock-frequency-high:180
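The two values in this file can be read back and combined with a cycle count from the PERF output to estimate device-side time. The following is only a sketch: it assumes the frequencies are given in MHz and leaves the choice of clock to the caller, since the file does not say which clock the cores run on. `clock_mhz` and `cycles_to_us` are hypothetical helper names, not part of the repository.

```bash
#!/bin/bash
# Hypothetical helpers; the key names are taken from user_clock_freq.txt.

# Print the value stored under afu-image/clock-frequency-<which> (low or high).
clock_mhz() {
    local file=$1 which=$2
    awk -F: -v key="afu-image/clock-frequency-${which}" '$1 == key { print $2 }' "$file"
}

# Convert a cycle count to microseconds, ASSUMING the frequency is in MHz.
cycles_to_us() {
    awk -v cycles="$1" -v mhz="$2" 'BEGIN { printf "%.3f\n", cycles / mhz }'
}
```

A call such as `cycles_to_us 4958 "$(clock_mhz user_clock_freq.txt high)"` would give a rough kernel time under those assumptions; it does not account for host-side overhead such as the elapsed time the benchmarks print.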
251
evaluation/perf_2021_03_07/8c/vecadd.result
Normal file
@@ -0,0 +1,251 @@
CONFIGS=-DNUM_CLUSTERS=1 -DNUM_CORES=2 -DNUM_WARPS=4 -DNUM_THREADS=4 -DL2_ENABLE=0 -DL3_ENABLE=0 -DPERF_ENABLE
make: Entering directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
rm -rf libvortex.so *.o .depend
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/driver/opae'
make: Entering directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/vecadd'
LD_LIBRARY_PATH=/opt/pocl/runtime/lib:/nethome/lcooper43/vortex-dev-old/driver/opae:/opt/opae/1.1.2/lib:/opt/inteldevstack/a10_gx_pac_ias_1_2_1_pv/opencl/opencl_bsp/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/host/linux64/lib:/opt/intelFPGA_pro/quartus_19.2.0b57/hld/linux64/lib: ./vecadd -n64
[VXDRV] DEVCAPS: version=0, num_cores=8, num_warps=4, num_threads=4
Create context
Allocate device buffers
Create program from kernel source
Upload source buffers
Execute the kernel
Elapsed time: 3 ms
Download destination buffer
Verify result
PASSED!
PERF: core0: instrs=2019, cycles=4958, IPC=0.407221
PERF: core0: ibuffer stalls=0
PERF: core0: scoreboard stalls=0
PERF: core0: alu unit stalls=0
PERF: core0: lsu unit stalls=0
PERF: core0: csr unit stalls=0
PERF: core0: fpu unit stalls=0
PERF: core0: gpu unit stalls=0
PERF: core0: icache reads=0
PERF: core0: icache read misses=0 (hit ratio=-2147483648%)
PERF: core0: icache pipeline stalls=0
PERF: core0: icache reponse stalls=0
PERF: core0: dcache reads=0
PERF: core0: dcache writes=0
PERF: core0: dcache read misses=0 (hit ratio=-2147483648%)
PERF: core0: dcache write misses=0 (hit ratio=-2147483648%)
PERF: core0: dcache bank stalls=0 (utilization=-2147483648%)
PERF: core0: dcache mshr stalls=0
PERF: core0: dcache pipeline stalls=0
PERF: core0: dcache reponse stalls=0
PERF: core0: smem reads=0
PERF: core0: smem writes=0
PERF: core0: smem bank stalls=0 (utilization=-2147483648%)
PERF: core0: dram requests=0 (reads=0, writes=0)
PERF: core0: dram stalls=0 (utilization=-2147483648%)
PERF: core0: dram average latency=-2147483648 cycles
PERF: core1: instrs=2019, cycles=4957, IPC=0.407303
PERF: core1: ibuffer stalls=0
PERF: core1: scoreboard stalls=0
PERF: core1: alu unit stalls=0
PERF: core1: lsu unit stalls=0
PERF: core1: csr unit stalls=0
PERF: core1: fpu unit stalls=0
PERF: core1: gpu unit stalls=0
PERF: core1: icache reads=0
PERF: core1: icache read misses=0 (hit ratio=-2147483648%)
PERF: core1: icache pipeline stalls=0
PERF: core1: icache reponse stalls=0
PERF: core1: dcache reads=0
PERF: core1: dcache writes=0
PERF: core1: dcache read misses=0 (hit ratio=-2147483648%)
PERF: core1: dcache write misses=0 (hit ratio=-2147483648%)
PERF: core1: dcache bank stalls=0 (utilization=-2147483648%)
PERF: core1: dcache mshr stalls=0
PERF: core1: dcache pipeline stalls=0
PERF: core1: dcache reponse stalls=0
PERF: core1: smem reads=0
PERF: core1: smem writes=0
PERF: core1: smem bank stalls=0 (utilization=-2147483648%)
PERF: core1: dram requests=0 (reads=0, writes=0)
PERF: core1: dram stalls=0 (utilization=-2147483648%)
PERF: core1: dram average latency=-2147483648 cycles
PERF: core2: instrs=2019, cycles=4955, IPC=0.407467
PERF: core2: ibuffer stalls=0
PERF: core2: scoreboard stalls=0
PERF: core2: alu unit stalls=0
PERF: core2: lsu unit stalls=0
PERF: core2: csr unit stalls=0
PERF: core2: fpu unit stalls=0
PERF: core2: gpu unit stalls=0
PERF: core2: icache reads=0
PERF: core2: icache read misses=0 (hit ratio=-2147483648%)
PERF: core2: icache pipeline stalls=0
PERF: core2: icache reponse stalls=0
PERF: core2: dcache reads=0
PERF: core2: dcache writes=0
PERF: core2: dcache read misses=0 (hit ratio=-2147483648%)
PERF: core2: dcache write misses=0 (hit ratio=-2147483648%)
PERF: core2: dcache bank stalls=0 (utilization=-2147483648%)
PERF: core2: dcache mshr stalls=0
PERF: core2: dcache pipeline stalls=0
PERF: core2: dcache reponse stalls=0
PERF: core2: smem reads=0
PERF: core2: smem writes=0
PERF: core2: smem bank stalls=0 (utilization=-2147483648%)
PERF: core2: dram requests=0 (reads=0, writes=0)
PERF: core2: dram stalls=0 (utilization=-2147483648%)
PERF: core2: dram average latency=-2147483648 cycles
PERF: core3: instrs=2019, cycles=4953, IPC=0.407632
PERF: core3: ibuffer stalls=0
PERF: core3: scoreboard stalls=0
PERF: core3: alu unit stalls=0
PERF: core3: lsu unit stalls=0
PERF: core3: csr unit stalls=0
PERF: core3: fpu unit stalls=0
PERF: core3: gpu unit stalls=0
PERF: core3: icache reads=0
PERF: core3: icache read misses=0 (hit ratio=-2147483648%)
PERF: core3: icache pipeline stalls=0
PERF: core3: icache reponse stalls=0
PERF: core3: dcache reads=0
PERF: core3: dcache writes=0
PERF: core3: dcache read misses=0 (hit ratio=-2147483648%)
PERF: core3: dcache write misses=0 (hit ratio=-2147483648%)
PERF: core3: dcache bank stalls=0 (utilization=-2147483648%)
PERF: core3: dcache mshr stalls=0
PERF: core3: dcache pipeline stalls=0
PERF: core3: dcache reponse stalls=0
PERF: core3: smem reads=0
PERF: core3: smem writes=0
PERF: core3: smem bank stalls=0 (utilization=-2147483648%)
PERF: core3: dram requests=0 (reads=0, writes=0)
PERF: core3: dram stalls=0 (utilization=-2147483648%)
PERF: core3: dram average latency=-2147483648 cycles
PERF: core4: instrs=495, cycles=3388, IPC=0.146104
PERF: core4: ibuffer stalls=0
PERF: core4: scoreboard stalls=0
PERF: core4: alu unit stalls=0
PERF: core4: lsu unit stalls=0
PERF: core4: csr unit stalls=0
PERF: core4: fpu unit stalls=0
PERF: core4: gpu unit stalls=0
PERF: core4: icache reads=0
PERF: core4: icache read misses=0 (hit ratio=-2147483648%)
PERF: core4: icache pipeline stalls=0
PERF: core4: icache reponse stalls=0
PERF: core4: dcache reads=0
PERF: core4: dcache writes=0
PERF: core4: dcache read misses=0 (hit ratio=-2147483648%)
PERF: core4: dcache write misses=0 (hit ratio=-2147483648%)
PERF: core4: dcache bank stalls=0 (utilization=-2147483648%)
PERF: core4: dcache mshr stalls=0
PERF: core4: dcache pipeline stalls=0
PERF: core4: dcache reponse stalls=0
PERF: core4: smem reads=0
PERF: core4: smem writes=0
PERF: core4: smem bank stalls=0 (utilization=-2147483648%)
PERF: core4: dram requests=0 (reads=0, writes=0)
PERF: core4: dram stalls=0 (utilization=-2147483648%)
PERF: core4: dram average latency=-2147483648 cycles
PERF: core5: instrs=495, cycles=3387, IPC=0.146147
PERF: core5: ibuffer stalls=0
PERF: core5: scoreboard stalls=0
PERF: core5: alu unit stalls=0
PERF: core5: lsu unit stalls=0
PERF: core5: csr unit stalls=0
PERF: core5: fpu unit stalls=0
PERF: core5: gpu unit stalls=0
PERF: core5: icache reads=0
PERF: core5: icache read misses=0 (hit ratio=-2147483648%)
PERF: core5: icache pipeline stalls=0
PERF: core5: icache reponse stalls=0
PERF: core5: dcache reads=0
PERF: core5: dcache writes=0
PERF: core5: dcache read misses=0 (hit ratio=-2147483648%)
PERF: core5: dcache write misses=0 (hit ratio=-2147483648%)
PERF: core5: dcache bank stalls=0 (utilization=-2147483648%)
PERF: core5: dcache mshr stalls=0
PERF: core5: dcache pipeline stalls=0
PERF: core5: dcache reponse stalls=0
PERF: core5: smem reads=0
PERF: core5: smem writes=0
PERF: core5: smem bank stalls=0 (utilization=-2147483648%)
PERF: core5: dram requests=0 (reads=0, writes=0)
PERF: core5: dram stalls=0 (utilization=-2147483648%)
PERF: core5: dram average latency=-2147483648 cycles
PERF: core6: instrs=495, cycles=3386, IPC=0.146190
PERF: core6: ibuffer stalls=0
PERF: core6: scoreboard stalls=0
PERF: core6: alu unit stalls=0
PERF: core6: lsu unit stalls=0
PERF: core6: csr unit stalls=0
PERF: core6: fpu unit stalls=0
PERF: core6: gpu unit stalls=0
PERF: core6: icache reads=0
PERF: core6: icache read misses=0 (hit ratio=-2147483648%)
PERF: core6: icache pipeline stalls=0
PERF: core6: icache reponse stalls=0
PERF: core6: dcache reads=0
PERF: core6: dcache writes=0
PERF: core6: dcache read misses=0 (hit ratio=-2147483648%)
PERF: core6: dcache write misses=0 (hit ratio=-2147483648%)
PERF: core6: dcache bank stalls=0 (utilization=-2147483648%)
PERF: core6: dcache mshr stalls=0
PERF: core6: dcache pipeline stalls=0
PERF: core6: dcache reponse stalls=0
PERF: core6: smem reads=0
PERF: core6: smem writes=0
PERF: core6: smem bank stalls=0 (utilization=-2147483648%)
PERF: core6: dram requests=0 (reads=0, writes=0)
PERF: core6: dram stalls=0 (utilization=-2147483648%)
PERF: core6: dram average latency=-2147483648 cycles
PERF: core7: instrs=495, cycles=3384, IPC=0.146277
PERF: core7: ibuffer stalls=0
PERF: core7: scoreboard stalls=0
PERF: core7: alu unit stalls=0
PERF: core7: lsu unit stalls=0
PERF: core7: csr unit stalls=0
PERF: core7: fpu unit stalls=0
PERF: core7: gpu unit stalls=0
PERF: core7: icache reads=0
PERF: core7: icache read misses=0 (hit ratio=-2147483648%)
PERF: core7: icache pipeline stalls=0
PERF: core7: icache reponse stalls=0
PERF: core7: dcache reads=0
PERF: core7: dcache writes=0
PERF: core7: dcache read misses=0 (hit ratio=-2147483648%)
PERF: core7: dcache write misses=0 (hit ratio=-2147483648%)
PERF: core7: dcache bank stalls=0 (utilization=-2147483648%)
PERF: core7: dcache mshr stalls=0
PERF: core7: dcache pipeline stalls=0
PERF: core7: dcache reponse stalls=0
PERF: core7: smem reads=0
PERF: core7: smem writes=0
PERF: core7: smem bank stalls=0 (utilization=-2147483648%)
PERF: core7: dram requests=0 (reads=0, writes=0)
PERF: core7: dram stalls=0 (utilization=-2147483648%)
PERF: core7: dram average latency=-2147483648 cycles
PERF: instrs=10056, cycles=4958, IPC=2.028237
PERF: ibuffer stalls=0
PERF: scoreboard stalls=0
PERF: alu unit stalls=0
PERF: lsu unit stalls=0
PERF: csr unit stalls=0
PERF: fpu unit stalls=0
PERF: gpu unit stalls=0
PERF: icache reads=0
PERF: icache read misses=0 (hit ratio=-2147483648%)
PERF: icache pipeline stalls=0
PERF: icache reponse stalls=0
PERF: dcache reads=0
PERF: dcache writes=0
PERF: dcache read misses=0 (hit ratio=-2147483648%)
PERF: dcache write misses=0 (hit ratio=-2147483648%)
PERF: dcache bank stalls=0 (utilization=-2147483648%)
PERF: dcache mshr stalls=0
PERF: dcache pipeline stalls=0
PERF: dcache reponse stalls=0
PERF: smem reads=0
PERF: smem writes=0
PERF: smem bank stalls=0 (utilization=-2147483648%)
PERF: dram requests=0 (reads=0, writes=0)
PERF: dram stalls=0 (utilization=-2147483648%)
PERF: dram average latency=-2147483648 cycles
make: Leaving directory '/nethome/lcooper43/vortex-dev-old/benchmarks/opencl/vecadd'
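In this log only the `instrs`, `cycles`, and `IPC` fields carry non-zero values, so a short filter is enough to pull them out of a `.result` file. The sketch below recomputes per-core IPC and an aggregate from the per-core lines. The aggregate formula (total instructions over the largest per-core cycle count) is inferred from the numbers in this log and is not documented behaviour; `summarize_ipc` is a hypothetical helper name.

```bash
#!/bin/bash
# Hypothetical helper: summarize per-core IPC from a blackbox .result file.
# Expects lines shaped like "PERF: core3: instrs=2019, cycles=4953, IPC=0.407632".
summarize_ipc() {
    awk '
        /^PERF: core[0-9]+: instrs=/ {
            gsub(/[,:]/, "")                      # leaves "PERF core3 instrs=2019 cycles=4953 ..."
            split($3, a, "="); split($4, b, "=")
            instrs = a[2] + 0; cycles = b[2] + 0
            total += instrs
            if (cycles > max) max = cycles
            # Guard the division so an idle core reports n/a instead of garbage.
            ipc = (cycles > 0) ? sprintf("%.6f", instrs / cycles) : "n/a"
            printf "%s instrs=%d cycles=%d ipc=%s\n", $2, instrs, cycles, ipc
        }
        END {
            # ASSUMPTION: aggregate = sum of instructions / longest-running core.
            if (max > 0) printf "total instrs=%d cycles=%d ipc=%.6f\n", total, max, total / max
        }
    ' "$1"
}
```

It could be run as `summarize_ipc evaluation/perf_2021_03_07/8c/vecadd.result`. The same zero check used for IPC is the kind of guard that would keep ratio fields from printing values such as `-2147483648%` when a counter reads zero.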
64
evaluation/scripts/README.txt
Normal file
@@ -0,0 +1,64 @@
-build.sh-

Description: Makes the build in the opae directory with the specified core
             count and optional performance profiling. If a build already
             exists, a make clean command is run before the build. Script waits
             until the inteldev script or quartus program is finished running.

Usage: ./build.sh -c [1|2|4|8|16] [-p [y|n]]

Options:
    -c
        Core count (1, 2, 4, 8, or 16).

    -p
        Performance profiling enable (y or n). Changes the source file in the
        opae directory to include/exclude "+define+PERF_ENABLE".

_______________________________________________________________________________


-build_all_perf.sh-

Description: Runs build.sh with performance profiling enabled for all valid
             core configurations.

_______________________________________________________________________________


-program_fpga.sh-

Description: Signs and programs the fpga for a specified core count. Prompts
             for PACSign are all automatically answered 'yes'.

Usage: ./program_fpga.sh -c [1|2|4|8|16]

Options:
    -c
        Core count (1, 2, 4, 8, or 16).

_______________________________________________________________________________


-gather_perf_results.sh-

Description: Creates directory named perf_YYYY_MM_DD and core subfolders in
             evaluation. Copies relevant build output files to specified core
             directory. Runs and redirects outputs of sgemm, vecadd, saxpy,
             sfilter, nearn, and gaussian benchmarks to specified core
             directory. Build should already be made before running this.

Usage: ./gather_perf_results.sh -c [1|2|4|8|16]

Options:
    -c
        Core count (1, 2, 4, 8, or 16).

_______________________________________________________________________________


-gather_all_perf_results.sh-

Description: Programs fpga and runs gather_perf_results.sh for all valid core
             configurations. All builds should already be made before running
             this.
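The descriptions above imply an order for a single core count: build, then program, then gather. The sketch below only prints those three commands as a dry run, so the sequence can be checked before committing to a long build. `perf_workflow` is a hypothetical helper and is not one of the scripts in this directory.

```bash
#!/bin/bash
# Hypothetical dry-run helper: echo the per-core-count workflow, run nothing.
perf_workflow() {
    local cores=$1
    # Same core-count check the scripts themselves use.
    if [[ ! "$cores" =~ ^(1|2|4|8|16)$ ]]; then
        echo 'core count must be 1, 2, 4, 8, or 16' >&2
        return 1
    fi
    echo "./build.sh -c ${cores} -p y"
    echo "./program_fpga.sh -c ${cores}"
    echo "./gather_perf_results.sh -c ${cores}"
}
```

For example, `perf_workflow 8` lists the three steps for the 8 core configuration. The scripts use paths relative to `evaluation/scripts`, so the printed commands are meant to be run from that directory.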
50
evaluation/scripts/build.sh
Executable file
@@ -0,0 +1,50 @@
#!/bin/bash

while getopts c:p: flag
do
    case "${flag}" in
        c) cores=${OPTARG};; #1, 2, 4, 8, 16
        p) perf=${OPTARG};; #perf counters enable (y/n)
    esac
done

if [[ ! "$cores" =~ ^(1|2|4|8|16)$ ]]; then
    echo 'Invalid parameter for argument -c (1, 2, 4, 8, or 16 expected)'
    exit 1
fi

cd ../../hw/syn/opae || exit 1

sources_file="./sources_${cores}c.txt"

if [ "${perf:0:1}" = "n" ]; then
    if grep -Fxq '+define+PERF_ENABLE' "${sources_file}"; then
        sed -i 's/^+define+PERF_ENABLE$/#+define+PERF_ENABLE/' "${sources_file}" #comment out the active define
    elif ! grep -Fxq '#+define+PERF_ENABLE' "${sources_file}"; then
        sed -i '1s/^/#+define+PERF_ENABLE\n/' "${sources_file}"
    fi
elif [ "${perf:0:1}" = "y" ]; then
    if grep -Fxq '#+define+PERF_ENABLE' "${sources_file}"; then
        sed -i 's/^#+define+PERF_ENABLE$/+define+PERF_ENABLE/' "${sources_file}" #uncomment the disabled define
    elif ! grep -Fxq '+define+PERF_ENABLE' "${sources_file}"; then
        sed -i '1s/^/+define+PERF_ENABLE\n/' "${sources_file}"
    fi
else
    echo 'Invalid parameter for argument -p (y/n expected)'
    exit 1
fi

if [ -d "./build_fpga_${cores}c" ]; then
    make "clean-fpga-${cores}c"
fi
make "fpga-${cores}c"

sleep 30

pids=($(pgrep -f "${OPAE_PLATFORM_ROOT}|quartus"))
for pid in "${pids[@]}"; do
    while kill -0 "${pid}" 2> /dev/null; do
        sleep 30
    done
done
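The part of build.sh that enables or disables the perf counters can be tried in isolation on a scratch file. The sketch below is one way to write that toggle, assuming GNU sed (for `-i` and `\n` in the replacement) and assuming that a leading `#` is what marks a disabled line in the sources file. `set_perf_define` is a hypothetical helper name.

```bash
#!/bin/bash
# Hypothetical helper: enable (y) or disable (n) "+define+PERF_ENABLE" in a file.
# ASSUMPTIONS: GNU sed, and "#" at the start of a line disables that line.
set_perf_define() {
    local file=$1 enable=$2
    if [ "$enable" = "y" ]; then
        # Uncomment a disabled define if one is present.
        sed -i 's/^#+define+PERF_ENABLE$/+define+PERF_ENABLE/' "$file"
        # Otherwise prepend an active define as the first line.
        grep -Fxq '+define+PERF_ENABLE' "$file" || sed -i '1s/^/+define+PERF_ENABLE\n/' "$file"
    else
        # Comment out an active define; other lines are left untouched.
        sed -i 's/^+define+PERF_ENABLE$/#+define+PERF_ENABLE/' "$file"
    fi
}
```

Anchoring the patterns with `^` and `$` keeps a repeated call from stacking extra `#` characters onto a line that is already commented out.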
7
evaluation/scripts/build_all_perf.sh
Executable file
@@ -0,0 +1,7 @@
#!/bin/bash

for ((i=1; i <= 16; i=i*2)); do
    echo "Building ${i} core build..."
    ./build.sh -c ${i} -p y
    echo "Done ${i} core build."
done
35
evaluation/scripts/gather_all_perf_results.sh
Executable file
@@ -0,0 +1,35 @@
#!/bin/bash

cd ../../hw/syn/opae/ || exit 1

date=$(date +%Y_%m_%d)
results_dir="../../../evaluation/perf_${date}"
mkdir -p "${results_dir}"

for ((i=1; i <= 16; i=i*2)); do
    mkdir -p "${results_dir}/${i}c"
done

for ((i=1; i <= 16; i=i*2)); do
    cp "./build_fpga_${i}c/build.log" "${results_dir}/${i}c/build.log"
    cp "./build_fpga_${i}c/build/output_files/afu_default.syn.summary" "${results_dir}/${i}c/afu_default.syn.summary"
    cp "./build_fpga_${i}c/build/output_files/afu_default.fit.summary" "${results_dir}/${i}c/afu_default.fit.summary"
    cp "./build_fpga_${i}c/build/output_files/afu_default.sta.summary" "${results_dir}/${i}c/afu_default.sta.summary"
    cp "./build_fpga_${i}c/build/output_files/user_clock_freq.txt" "${results_dir}/${i}c/user_clock_freq.txt"
done

cd ../../../evaluation/scripts || exit 1
results_dir="../perf_${date}"

for ((i=1; i <= 16; i=i*2)); do
    echo "Programming fpga for ${i} core build..."
    ./program_fpga.sh -c ${i}
    echo "Running tests for ${i} core build..."
    ../../ci/blackbox.sh --driver=fpga --app=sgemm --perf > "${results_dir}/${i}c/sgemm.result"
    ../../ci/blackbox.sh --driver=fpga --app=vecadd --perf > "${results_dir}/${i}c/vecadd.result"
    ../../ci/blackbox.sh --driver=fpga --app=saxpy --perf > "${results_dir}/${i}c/saxpy.result"
    ../../ci/blackbox.sh --driver=fpga --app=sfilter --perf > "${results_dir}/${i}c/sfilter.result"
    ../../ci/blackbox.sh --driver=fpga --app=nearn --perf > "${results_dir}/${i}c/nearn.result"
    ../../ci/blackbox.sh --driver=fpga --app=guassian --perf > "${results_dir}/${i}c/guassian.result"
    echo "Done ${i} core build."
done
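The six benchmark invocations differ only in the application name, so they could be driven from a list. The following is a possible refactoring sketch, not the script's actual structure: `run_benchmarks` is a hypothetical helper, and the runner is passed in as an argument so the loop can be exercised without an FPGA attached.

```bash
#!/bin/bash
# Hypothetical helper: run each named app through a runner command and
# write one <app>.result file per app into out_dir.
run_benchmarks() {
    local out_dir=$1; shift
    local runner=$1; shift
    local app
    for app in "$@"; do
        # Same flags the script passes to ci/blackbox.sh.
        "$runner" --driver=fpga --app="$app" --perf > "${out_dir}/${app}.result"
    done
}
```

Inside the loop over core counts this might be called as `run_benchmarks "${results_dir}/${i}c" ../../ci/blackbox.sh sgemm vecadd saxpy sfilter nearn guassian`, keeping the application names exactly as the script spells them.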
34
evaluation/scripts/gather_perf_results.sh
Executable file
@@ -0,0 +1,34 @@
#!/bin/bash

cd ../../hw/syn/opae/ || exit 1

while getopts c: flag
do
    case "${flag}" in
        c) i=${OPTARG};; #cores: 1, 2, 4, 8, 16
    esac
done

if [[ ! "$i" =~ ^(1|2|4|8|16)$ ]]; then
    echo 'Invalid parameter for argument -c (1, 2, 4, 8, or 16 expected)'
    exit 1
fi

date=$(date +%Y_%m_%d)
results_dir="../../../evaluation/perf_${date}"
mkdir -p "${results_dir}"

mkdir -p "${results_dir}/${i}c"

cp "./build_fpga_${i}c/build.log" "${results_dir}/${i}c/build.log"
cp "./build_fpga_${i}c/build/output_files/afu_default.syn.summary" "${results_dir}/${i}c/afu_default.syn.summary"
cp "./build_fpga_${i}c/build/output_files/afu_default.fit.summary" "${results_dir}/${i}c/afu_default.fit.summary"
cp "./build_fpga_${i}c/build/output_files/afu_default.sta.summary" "${results_dir}/${i}c/afu_default.sta.summary"
cp "./build_fpga_${i}c/build/output_files/user_clock_freq.txt" "${results_dir}/${i}c/user_clock_freq.txt"

../../../ci/blackbox.sh --driver=fpga --app=sgemm --perf > "${results_dir}/${i}c/sgemm.result"
../../../ci/blackbox.sh --driver=fpga --app=vecadd --perf > "${results_dir}/${i}c/vecadd.result"
../../../ci/blackbox.sh --driver=fpga --app=saxpy --perf > "${results_dir}/${i}c/saxpy.result"
../../../ci/blackbox.sh --driver=fpga --app=sfilter --perf > "${results_dir}/${i}c/sfilter.result"
../../../ci/blackbox.sh --driver=fpga --app=nearn --perf > "${results_dir}/${i}c/nearn.result"
../../../ci/blackbox.sh --driver=fpga --app=guassian --perf > "${results_dir}/${i}c/guassian.result"
19
evaluation/scripts/program_fpga.sh
Executable file
@@ -0,0 +1,19 @@
#!/bin/bash

while getopts c: flag
do
    case "${flag}" in
        c) i=${OPTARG};; #cores: 1, 2, 4, 8, 16
    esac
done

if [[ ! "$i" =~ ^(1|2|4|8|16)$ ]]; then
    echo 'Invalid parameter for argument -c (1, 2, 4, 8, or 16 expected)'
    exit 1
fi

cd "../../hw/syn/opae/build_fpga_${i}c" || exit 1

printf "y\ny\ny\n" | PACSign PR -t UPDATE -H openssl_manager -i vortex_afu.gbs -o vortex_afu_unsigned_ssl.gbs > /dev/null

fpgasupdate vortex_afu_unsigned_ssl.gbs
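Because the PACSign output is discarded, a missing tool would only show up as a later failure. A small pre-flight check is one way to catch that early. This is a sketch only: `require_tools` is a hypothetical helper, and it checks nothing beyond whether each command is on `PATH`.

```bash
#!/bin/bash
# Hypothetical helper: report every named command that is not on PATH.
# Returns 0 when all are found, 1 otherwise.
require_tools() {
    local tool missing=0
    for tool in "$@"; do
        if ! command -v "$tool" > /dev/null 2>&1; then
            echo "missing tool: ${tool}" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

A line such as `require_tools PACSign fpgasupdate || exit 1` placed before the signing step would stop the script with a clear message when the OPAE environment has not been sourced.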