Upload INT8-quantized model with calibration
- yolo_nas_pose_l_int8.onnx.best.engine +2 -2
- yolo_nas_pose_l_int8.onnx.best.engine.err +7 -7
- yolo_nas_pose_l_int8.onnx.best.engine.log +0 -0
- yolo_nas_pose_l_int8.onnx.int8.engine +2 -2
- yolo_nas_pose_l_int8.onnx.int8.engine.err +7 -7
- yolo_nas_pose_l_int8.onnx.int8.engine.log +0 -0
- yolo_nas_pose_m_int8.onnx.best.engine +2 -2
- yolo_nas_pose_m_int8.onnx.best.engine.err +7 -7
- yolo_nas_pose_m_int8.onnx.best.engine.log +0 -0
- yolo_nas_pose_m_int8.onnx.int8.engine +2 -2
- yolo_nas_pose_m_int8.onnx.int8.engine.err +7 -7
- yolo_nas_pose_m_int8.onnx.int8.engine.log +0 -0
- yolo_nas_pose_n_int8.onnx.best.engine +2 -2
- yolo_nas_pose_n_int8.onnx.best.engine.err +7 -7
- yolo_nas_pose_n_int8.onnx.best.engine.log +323 -321
- yolo_nas_pose_n_int8.onnx.int8.engine +2 -2
- yolo_nas_pose_n_int8.onnx.int8.engine.err +7 -7
- yolo_nas_pose_n_int8.onnx.int8.engine.log +323 -321
- yolo_nas_pose_s_int8.onnx.best.engine +2 -2
- yolo_nas_pose_s_int8.onnx.best.engine.err +7 -7
- yolo_nas_pose_s_int8.onnx.best.engine.log +322 -321
- yolo_nas_pose_s_int8.onnx.int8.engine +2 -2
- yolo_nas_pose_s_int8.onnx.int8.engine.err +7 -7
- yolo_nas_pose_s_int8.onnx.int8.engine.log +322 -320
yolo_nas_pose_l_int8.onnx.best.engine
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:23bee9f331f0da470fb3b6d1e4244127e7648c848e5cf9265fbeea5a15c2204d
+size 57380810
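The `.engine` entries in this commit are Git LFS pointer files: they record a SHA-256 digest (`oid`) and a byte count (`size`) rather than the engine itself. A minimal sketch of checking a downloaded engine against such a pointer follows; the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of any Git LFS tooling.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the "key value" lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file has the size and SHA-256 digest the pointer records."""
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        # Hash in chunks so large engine files are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer text as it appears in the hunk for yolo_nas_pose_l_int8.onnx.best.engine.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:23bee9f331f0da470fb3b6d1e4244127e7648c848e5cf9265fbeea5a15c2204d
size 57380810
"""

pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer(Path("yolo_nas_pose_l_int8.onnx.best.engine"), pointer)` on a local copy would confirm that the download is complete and unmodified.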
yolo_nas_pose_l_int8.onnx.best.engine.err
CHANGED
@@ -1,7 +1,7 @@
-[
-[
-[
-[
-[
-[
-[
+[01/04/2024-16:29:12] [W] [TRT] onnx2trt_utils.cpp:375: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
+[01/04/2024-16:29:12] [W] [TRT] onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
+[01/04/2024-16:29:16] [W] [TRT] Calibrator won't be used in explicit precision mode. Use quantization aware training to generate network with Quantize/Dequantize nodes.
+[01/04/2024-17:18:11] [W] * Throughput may be bound by Enqueue Time rather than GPU Compute and the GPU may be under-utilized.
+[01/04/2024-17:18:11] [W] If not already in use, --useCudaGraph (utilize CUDA graphs where possible) may increase the throughput.
+[01/04/2024-17:18:11] [W] * GPU compute time is unstable, with coefficient of variance = 2.14899%.
+[01/04/2024-17:18:11] [W] If not already in use, locking GPU clock frequency or adding --useSpinWait may improve the stability.
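Each `.err` file carries timestamped warnings, so the wall-clock span of a run and its reported timing stability can be pulled out mechanically. The sketch below assumes the `[date-time] [severity] message` layout seen in these files and a month/day/year date; the function names are hypothetical. The span it reports covers everything between the first and last warning, which includes both the engine build and the benchmark.

```python
import re
from datetime import datetime

# Matches "[MM/DD/YYYY-HH:MM:SS] [severity] message" as seen in the .err files.
LINE_RE = re.compile(r"^\[(\d{2}/\d{2}/\d{4}-\d{2}:\d{2}:\d{2})\] \[([A-Z])\] (.*)$")
CV_RE = re.compile(r"coefficient of variance = ([0-9.]+)%")


def parse_line(line):
    """Return (timestamp, severity, message), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    stamp, severity, message = match.groups()
    # Assumption: the date is written month/day/year.
    return datetime.strptime(stamp, "%m/%d/%Y-%H:%M:%S"), severity, message


def summarize(lines):
    """Return the time between first and last message and the reported CV, if any."""
    parsed = [p for p in (parse_line(line) for line in lines) if p is not None]
    times = [stamp for stamp, _, _ in parsed]
    cv_percent = None
    for _, _, message in parsed:
        found = CV_RE.search(message)
        if found:
            cv_percent = float(found.group(1))
    return {"elapsed": max(times) - min(times), "cv_percent": cv_percent}


# Two lines copied from yolo_nas_pose_l_int8.onnx.best.engine.err.
ERR_LINES = [
    "[01/04/2024-16:29:12] [W] [TRT] onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped",
    "[01/04/2024-17:18:11] [W] * GPU compute time is unstable, with coefficient of variance = 2.14899%.",
]

summary = summarize(ERR_LINES)
print(summary["elapsed"], summary["cv_percent"])
```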
yolo_nas_pose_l_int8.onnx.best.engine.log
CHANGED
The diff for this file is too large to render.
yolo_nas_pose_l_int8.onnx.int8.engine
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:f0e0b6cf94d6d65e68c969cd873afeffe71477632d733e3a4f8a9d1d9242fa72
+size 57416931
yolo_nas_pose_l_int8.onnx.int8.engine.err
CHANGED
@@ -1,7 +1,7 @@
-[
-[
-[
-[
-[
-[
-[
+[01/04/2024-17:18:28] [W] [TRT] onnx2trt_utils.cpp:375: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
+[01/04/2024-17:18:28] [W] [TRT] onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
+[01/04/2024-17:18:32] [W] [TRT] Calibrator won't be used in explicit precision mode. Use quantization aware training to generate network with Quantize/Dequantize nodes.
+[01/04/2024-17:33:32] [W] * Throughput may be bound by Enqueue Time rather than GPU Compute and the GPU may be under-utilized.
+[01/04/2024-17:33:32] [W] If not already in use, --useCudaGraph (utilize CUDA graphs where possible) may increase the throughput.
+[01/04/2024-17:33:32] [W] * GPU compute time is unstable, with coefficient of variance = 9.56431%.
+[01/04/2024-17:33:32] [W] If not already in use, locking GPU clock frequency or adding --useSpinWait may improve the stability.
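The two runs of the `l` model report a coefficient of variance of 2.14899% (`.best.engine`) and 9.56431% (`.int8.engine`). As a rough illustration of what that figure measures, the sketch below computes standard deviation divided by mean over a list of latencies. Whether trtexec uses the population or the sample standard deviation is not stated in these logs, so the population form here is an assumption, and the latency values are invented for the example.

```python
import statistics


def coefficient_of_variation_percent(samples):
    """Population standard deviation as a percentage of the mean."""
    mean = statistics.fmean(samples)
    return 100.0 * statistics.pstdev(samples) / mean


# Hypothetical per-inference GPU compute times in milliseconds (not from the logs).
steady = [10.0, 10.1, 9.9, 10.0]
jittery = [9.0, 11.0, 8.5, 11.5]

print(round(coefficient_of_variation_percent(steady), 3))
print(round(coefficient_of_variation_percent(jittery), 3))
```

A larger value means individual inferences deviate more from the average, which is what the warning's suggestions (locking the GPU clock frequency, `--useSpinWait`) aim to reduce.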
yolo_nas_pose_l_int8.onnx.int8.engine.log
CHANGED
The diff for this file is too large to render.
yolo_nas_pose_m_int8.onnx.best.engine
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:1d812fb42c4b1a75b40065100ad8e4532a011b7d5dd8954655ba54262cceca67
+size 41711541
yolo_nas_pose_m_int8.onnx.best.engine.err
CHANGED
@@ -1,7 +1,7 @@
-[
-[
-[
-[
-[
-[
-[
+[01/04/2024-15:38:52] [W] [TRT] onnx2trt_utils.cpp:375: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
+[01/04/2024-15:38:52] [W] [TRT] onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
+[01/04/2024-15:38:56] [W] [TRT] Calibrator won't be used in explicit precision mode. Use quantization aware training to generate network with Quantize/Dequantize nodes.
+[01/04/2024-16:17:05] [W] * Throughput may be bound by Enqueue Time rather than GPU Compute and the GPU may be under-utilized.
+[01/04/2024-16:17:05] [W] If not already in use, --useCudaGraph (utilize CUDA graphs where possible) may increase the throughput.
+[01/04/2024-16:17:05] [W] * GPU compute time is unstable, with coefficient of variance = 3.17268%.
+[01/04/2024-16:17:05] [W] If not already in use, locking GPU clock frequency or adding --useSpinWait may improve the stability.
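The first two warnings in every `.err` file say that INT64 weights were cast down to INT32 and that at least one value was clamped. A minimal sketch of what clamping to the INT32 range means is shown below; it illustrates the arithmetic only and is not the parser's own code. The example values, including the large sentinel `2**63 - 1`, are assumptions chosen for illustration.

```python
INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def clamp_to_int32(values):
    """Saturate each value to the INT32 range; also report whether anything changed."""
    clamped = [min(max(value, INT32_MIN), INT32_MAX) for value in values]
    return clamped, clamped != list(values)


# Hypothetical INT64 constants: small values survive, the sentinel is saturated.
weights = [0, 7, 2**63 - 1, -1]
clamped, was_clamped = clamp_to_int32(weights)
print(clamped, was_clamped)
```

Clamping is harmless when the out-of-range value only acts as an "up to the end" marker, but it changes results if the full 64-bit value is actually needed.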
yolo_nas_pose_m_int8.onnx.best.engine.log
CHANGED
The diff for this file is too large to render.
yolo_nas_pose_m_int8.onnx.int8.engine
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:bee7381541459b8b03631b7ff27dd33fc54d85b6ef4fdc450591c2062130b3f9
+size 41761906
yolo_nas_pose_m_int8.onnx.int8.engine.err
CHANGED
@@ -1,7 +1,7 @@
-[
-[
-[
-[
-[
-[
-[
+[01/04/2024-16:17:13] [W] [TRT] onnx2trt_utils.cpp:375: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
+[01/04/2024-16:17:13] [W] [TRT] onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
+[01/04/2024-16:17:16] [W] [TRT] Calibrator won't be used in explicit precision mode. Use quantization aware training to generate network with Quantize/Dequantize nodes.
+[01/04/2024-16:29:05] [W] * Throughput may be bound by Enqueue Time rather than GPU Compute and the GPU may be under-utilized.
+[01/04/2024-16:29:05] [W] If not already in use, --useCudaGraph (utilize CUDA graphs where possible) may increase the throughput.
+[01/04/2024-16:29:05] [W] * GPU compute time is unstable, with coefficient of variance = 3.06633%.
+[01/04/2024-16:29:05] [W] If not already in use, locking GPU clock frequency or adding --useSpinWait may improve the stability.
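The "Calibrator won't be used in explicit precision mode" warning appears because the ONNX graphs already contain QuantizeLinear nodes (the layer names in the `.log` files include `_input_quantizer/QuantizeLinear` and `_weight_quantizer/QuantizeLinear`), so the scales come from the graph rather than from a calibration pass. The sketch below shows the INT8 arithmetic that the ONNX QuantizeLinear and DequantizeLinear operators define, for a single scalar. The scale of 0.5 is a made-up value, and the zero point is assumed to be 0 (symmetric quantization).

```python
def quantize_linear(x, scale, zero_point=0):
    """ONNX QuantizeLinear for int8: round half to even, then saturate to [-128, 127]."""
    # Python's built-in round() already rounds halves to the nearest even integer.
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))


def dequantize_linear(q, scale, zero_point=0):
    """ONNX DequantizeLinear: map the integer back to a real value."""
    return (q - zero_point) * scale


SCALE = 0.5  # hypothetical per-tensor scale

for value in (3.2, 1.25, 100.0, -100.0):
    q = quantize_linear(value, SCALE)
    print(value, q, dequantize_linear(q, SCALE))
```

The last two inputs fall outside the representable range and saturate, which is why the scale stored with each quantizer matters for accuracy.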
yolo_nas_pose_m_int8.onnx.int8.engine.log
CHANGED
The diff for this file is too large to render.
yolo_nas_pose_n_int8.onnx.best.engine
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:034c3a1e3c8f16b6ba305b1b567429a8e42f2ae6c80cb63d074ca8b794844cf3
+size 10229486
yolo_nas_pose_n_int8.onnx.best.engine.err
CHANGED
@@ -1,7 +1,7 @@
-[
-[
-[
-[
-[
-[
-[
+[01/04/2024-14:17:33] [W] [TRT] onnx2trt_utils.cpp:375: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
+[01/04/2024-14:17:33] [W] [TRT] onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
+[01/04/2024-14:17:36] [W] [TRT] Calibrator won't be used in explicit precision mode. Use quantization aware training to generate network with Quantize/Dequantize nodes.
+[01/04/2024-14:46:17] [W] * Throughput may be bound by Enqueue Time rather than GPU Compute and the GPU may be under-utilized.
+[01/04/2024-14:46:17] [W] If not already in use, --useCudaGraph (utilize CUDA graphs where possible) may increase the throughput.
+[01/04/2024-14:46:17] [W] * GPU compute time is unstable, with coefficient of variance = 2.71486%.
+[01/04/2024-14:46:17] [W] If not already in use, locking GPU clock frequency or adding --useSpinWait may improve the stability.
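The `.log` files record the trtexec invocation that produced each engine, for example `--onnx=yolo_nas_pose_n_int8.onnx --best --avgRuns=100 --duration=15 --saveEngine=yolo_nas_pose_n_int8.onnx.best.engine`. The sketch below regenerates that command line for every model size and engine variant in the file list. The `--best` form is taken from the log; using `--int8` for the `.int8.engine` files is an assumption based on the file names. The script only prints the commands and does not run them.

```python
import shlex

TRTEXEC = "/usr/src/tensorrt/bin/trtexec"  # path as shown in the log
SIZES = ["n", "s", "m", "l"]
# "--best" appears in the log; "--int8" is assumed from the ".int8.engine" suffix.
PRECISION_FLAGS = {"best": "--best", "int8": "--int8"}


def build_command(size, precision):
    """Return the trtexec argument list for one model size and precision variant."""
    onnx = f"yolo_nas_pose_{size}_int8.onnx"
    return [
        TRTEXEC,
        f"--onnx={onnx}",
        PRECISION_FLAGS[precision],
        "--avgRuns=100",
        "--duration=15",
        f"--saveEngine={onnx}.{precision}.engine",
    ]


for size in SIZES:
    for precision in PRECISION_FLAGS:
        print(shlex.join(build_command(size, precision)))
```

Passing the list from `build_command` to `subprocess.run` with stdout and stderr redirected to `.log` and `.err` files would reproduce the file layout of this commit, assuming that is how the originals were captured.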
yolo_nas_pose_n_int8.onnx.best.engine.log
CHANGED
@@ -1,323 +1,325 @@
 &&&& RUNNING TensorRT.trtexec [TensorRT v8502] # /usr/src/tensorrt/bin/trtexec --onnx=yolo_nas_pose_n_int8.onnx --best --avgRuns=100 --duration=15 --saveEngine=yolo_nas_pose_n_int8.onnx.best.engine
-[ (old lines 2-322: 321 removed log lines, each truncated to "[" in the rendered diff)
 &&&& PASSED TensorRT.trtexec [TensorRT v8502] # /usr/src/tensorrt/bin/trtexec --onnx=yolo_nas_pose_n_int8.onnx --best --avgRuns=100 --duration=15 --saveEngine=yolo_nas_pose_n_int8.onnx.best.engine
 &&&& RUNNING TensorRT.trtexec [TensorRT v8502] # /usr/src/tensorrt/bin/trtexec --onnx=yolo_nas_pose_n_int8.onnx --best --avgRuns=100 --duration=15 --saveEngine=yolo_nas_pose_n_int8.onnx.best.engine
+[01/04/2024-14:17:23] [I] === Model Options ===
+[01/04/2024-14:17:23] [I] Format: ONNX
+[01/04/2024-14:17:23] [I] Model: yolo_nas_pose_n_int8.onnx
+[01/04/2024-14:17:23] [I] Output:
+[01/04/2024-14:17:23] [I] === Build Options ===
+[01/04/2024-14:17:23] [I] Max batch: explicit batch
+[01/04/2024-14:17:23] [I] Memory Pools: workspace: default, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default
+[01/04/2024-14:17:23] [I] minTiming: 1
+[01/04/2024-14:17:23] [I] avgTiming: 8
+[01/04/2024-14:17:23] [I] Precision: FP32+FP16+INT8
+[01/04/2024-14:17:23] [I] LayerPrecisions:
+[01/04/2024-14:17:23] [I] Calibration: Dynamic
+[01/04/2024-14:17:23] [I] Refit: Disabled
+[01/04/2024-14:17:23] [I] Sparsity: Disabled
+[01/04/2024-14:17:23] [I] Safe mode: Disabled
+[01/04/2024-14:17:23] [I] DirectIO mode: Disabled
+[01/04/2024-14:17:23] [I] Restricted mode: Disabled
+[01/04/2024-14:17:23] [I] Build only: Disabled
+[01/04/2024-14:17:23] [I] Save engine: yolo_nas_pose_n_int8.onnx.best.engine
+[01/04/2024-14:17:23] [I] Load engine:
+[01/04/2024-14:17:23] [I] Profiling verbosity: 0
+[01/04/2024-14:17:23] [I] Tactic sources: Using default tactic sources
+[01/04/2024-14:17:23] [I] timingCacheMode: local
+[01/04/2024-14:17:23] [I] timingCacheFile:
+[01/04/2024-14:17:23] [I] Heuristic: Disabled
+[01/04/2024-14:17:23] [I] Preview Features: Use default preview flags.
+[01/04/2024-14:17:23] [I] Input(s)s format: fp32:CHW
+[01/04/2024-14:17:23] [I] Output(s)s format: fp32:CHW
+[01/04/2024-14:17:23] [I] Input build shapes: model
+[01/04/2024-14:17:23] [I] Input calibration shapes: model
+[01/04/2024-14:17:23] [I] === System Options ===
+[01/04/2024-14:17:23] [I] Device: 0
+[01/04/2024-14:17:23] [I] DLACore:
+[01/04/2024-14:17:23] [I] Plugins:
+[01/04/2024-14:17:23] [I] === Inference Options ===
+[01/04/2024-14:17:23] [I] Batch: Explicit
+[01/04/2024-14:17:23] [I] Input inference shapes: model
+[01/04/2024-14:17:23] [I] Iterations: 10
+[01/04/2024-14:17:23] [I] Duration: 15s (+ 200ms warm up)
+[01/04/2024-14:17:23] [I] Sleep time: 0ms
+[01/04/2024-14:17:23] [I] Idle time: 0ms
+[01/04/2024-14:17:23] [I] Streams: 1
+[01/04/2024-14:17:23] [I] ExposeDMA: Disabled
+[01/04/2024-14:17:23] [I] Data transfers: Enabled
+[01/04/2024-14:17:23] [I] Spin-wait: Disabled
+[01/04/2024-14:17:23] [I] Multithreading: Disabled
+[01/04/2024-14:17:23] [I] CUDA Graph: Disabled
+[01/04/2024-14:17:23] [I] Separate profiling: Disabled
+[01/04/2024-14:17:23] [I] Time Deserialize: Disabled
+[01/04/2024-14:17:23] [I] Time Refit: Disabled
+[01/04/2024-14:17:23] [I] NVTX verbosity: 0
+[01/04/2024-14:17:23] [I] Persistent Cache Ratio: 0
+[01/04/2024-14:17:23] [I] Inputs:
+[01/04/2024-14:17:23] [I] === Reporting Options ===
+[01/04/2024-14:17:23] [I] Verbose: Disabled
+[01/04/2024-14:17:23] [I] Averages: 100 inferences
+[01/04/2024-14:17:23] [I] Percentiles: 90,95,99
+[01/04/2024-14:17:23] [I] Dump refittable layers:Disabled
+[01/04/2024-14:17:23] [I] Dump output: Disabled
+[01/04/2024-14:17:23] [I] Profile: Disabled
+[01/04/2024-14:17:23] [I] Export timing to JSON file:
+[01/04/2024-14:17:23] [I] Export output to JSON file:
+[01/04/2024-14:17:23] [I] Export profile to JSON file:
+[01/04/2024-14:17:23] [I]
+[01/04/2024-14:17:24] [I] === Device Information ===
+[01/04/2024-14:17:24] [I] Selected Device: Orin
+[01/04/2024-14:17:24] [I] Compute Capability: 8.7
+[01/04/2024-14:17:24] [I] SMs: 8
+[01/04/2024-14:17:24] [I] Compute Clock Rate: 0.624 GHz
+[01/04/2024-14:17:24] [I] Device Global Memory: 7471 MiB
+[01/04/2024-14:17:24] [I] Shared Memory per SM: 164 KiB
+[01/04/2024-14:17:24] [I] Memory Bus Width: 128 bits (ECC disabled)
+[01/04/2024-14:17:24] [I] Memory Clock Rate: 0.624 GHz
+[01/04/2024-14:17:24] [I]
+[01/04/2024-14:17:24] [I] TensorRT version: 8.5.2
+[01/04/2024-14:17:29] [I] [TRT] [MemUsageChange] Init CUDA: CPU +220, GPU +0, now: CPU 249, GPU 2718 (MiB)
+[01/04/2024-14:17:33] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +302, GPU +435, now: CPU 574, GPU 3215 (MiB)
+[01/04/2024-14:17:33] [I] Start parsing network model
+[01/04/2024-14:17:33] [I] [TRT] ----------------------------------------------------------------
+[01/04/2024-14:17:33] [I] [TRT] Input filename: yolo_nas_pose_n_int8.onnx
+[01/04/2024-14:17:33] [I] [TRT] ONNX IR version: 0.0.8
+[01/04/2024-14:17:33] [I] [TRT] Opset version: 17
+[01/04/2024-14:17:33] [I] [TRT] Producer name: pytorch
+[01/04/2024-14:17:33] [I] [TRT] Producer version: 2.1.2
+[01/04/2024-14:17:33] [I] [TRT] Domain:
+[01/04/2024-14:17:33] [I] [TRT] Model version: 0
+[01/04/2024-14:17:33] [I] [TRT] Doc string:
+[01/04/2024-14:17:33] [I] [TRT] ----------------------------------------------------------------
+[01/04/2024-14:17:36] [I] Finish parsing network model
+[01/04/2024-14:17:40] [I] [TRT] ---------- Layers Running on DLA ----------
+[01/04/2024-14:17:40] [I] [TRT] ---------- Layers Running on GPU ----------
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] TRAIN_STATION: [trainStation1]
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] MYELIN: {ForeignNode[/pre_process/pre_process.0/Cast.../pre_process/pre_process.2/Mul]}
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONSTANT: (Unnamed Layer* 1204) [Constant]
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONSTANT: (Unnamed Layer* 1205) [Constant]
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONSTANT: (Unnamed Layer* 1206) [Constant]
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] COPY: /model/backbone/stem/conv/rbr_reparam/_input_quantizer/QuantizeLinear
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stem.conv.rbr_reparam.weight + /model/backbone/stem/conv/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stem/conv/rbr_reparam/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage1.downsample.rbr_reparam.weight + /model/backbone/stage1/downsample/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage1/downsample/rbr_reparam/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage1.blocks.conv2.conv.weight + /model/backbone/stage1/blocks/conv2/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage1/blocks/conv2/conv/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage1.blocks.conv1.conv.weight + /model/backbone/stage1/blocks/conv1/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage1/blocks/conv1/conv/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage1.blocks.bottlenecks.0.cv1.rbr_reparam.weight + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage1.blocks.bottlenecks.0.cv2.rbr_reparam.weight + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage1.blocks.bottlenecks.0.alpha + (Unnamed Layer* 485) [Shuffle] + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/Mul, /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/Add)
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage1.blocks.bottlenecks.1.cv1.rbr_reparam.weight + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage1.blocks.bottlenecks.1.cv2.rbr_reparam.weight + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage1.blocks.bottlenecks.1.alpha + (Unnamed Layer* 501) [Shuffle] + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/Mul, /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/Add)
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage1.blocks.conv3.conv.weight + /model/backbone/stage1/blocks/conv3/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage1/blocks/conv3/conv/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.reduce_skip2.conv.weight + /model/neck/neck2/reduce_skip2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck2/reduce_skip2/conv/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.downsample.rbr_reparam.weight + /model/backbone/stage2/downsample/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/downsample/rbr_reparam/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.downsample.conv.weight + /model/neck/neck2/downsample/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck2/downsample/conv/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.conv2.conv.weight + /model/backbone/stage2/blocks/conv2/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/conv2/conv/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.conv1.conv.weight + /model/backbone/stage2/blocks/conv1/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/conv1/conv/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.bottlenecks.0.cv1.rbr_reparam.weight + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.bottlenecks.0.cv2.rbr_reparam.weight + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage2.blocks.bottlenecks.0.alpha + (Unnamed Layer* 548) [Shuffle] + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/Mul, /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/Add)
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.bottlenecks.1.cv1.rbr_reparam.weight + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.bottlenecks.1.cv2.rbr_reparam.weight + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage2.blocks.bottlenecks.1.alpha + (Unnamed Layer* 564) [Shuffle] + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/Mul, /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/Add)
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.bottlenecks.2.cv1.rbr_reparam.weight + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/cv1/rbr_reparam/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.bottlenecks.2.cv2.rbr_reparam.weight + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/cv2/rbr_reparam/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage2.blocks.bottlenecks.2.alpha + (Unnamed Layer* 580) [Shuffle] + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/Mul, /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/Add)
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage2/blocks/Concat_/model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/Add_output_0_clone_0 copy
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.conv3.conv.weight + /model/backbone/stage2/blocks/conv3/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/conv3/conv/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.reduce_skip2.conv.weight + /model/neck/neck1/reduce_skip2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck1/reduce_skip2/conv/Conv || model.neck.neck2.reduce_skip1.conv.weight + /model/neck/neck2/reduce_skip1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck2/reduce_skip1/conv/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.downsample.rbr_reparam.weight + /model/backbone/stage3/downsample/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/downsample/rbr_reparam/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.downsample.conv.weight + /model/neck/neck1/downsample/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck1/downsample/conv/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.conv2.conv.weight + /model/backbone/stage3/blocks/conv2/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/conv2/conv/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.conv1.conv.weight + /model/backbone/stage3/blocks/conv1/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/conv1/conv/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.0.cv1.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.0.cv2.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage3.blocks.bottlenecks.0.alpha + (Unnamed Layer* 630) [Shuffle] + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/Mul, /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/Add)
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.1.cv1.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.1.cv2.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage3.blocks.bottlenecks.1.alpha + (Unnamed Layer* 646) [Shuffle] + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/Mul, /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/Add)
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.2.cv1.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/cv1/rbr_reparam/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.2.cv2.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/cv2/rbr_reparam/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage3.blocks.bottlenecks.2.alpha + (Unnamed Layer* 662) [Shuffle] + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/Mul, /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/Add)
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.3.cv1.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/cv1/rbr_reparam/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.3.cv2.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/cv2/rbr_reparam/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage3.blocks.bottlenecks.3.alpha + (Unnamed Layer* 678) [Shuffle] + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/Mul, /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/Add)
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.conv3.conv.weight + /model/backbone/stage3/blocks/conv3/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/conv3/conv/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.reduce_skip1.conv.weight + /model/neck/neck1/reduce_skip1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck1/reduce_skip1/conv/Conv
+[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage4.downsample.rbr_reparam.weight + /model/backbone/stage4/downsample/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage4/downsample/rbr_reparam/Conv
|
155 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage4.blocks.conv2.conv.weight + /model/backbone/stage4/blocks/conv2/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage4/blocks/conv2/conv/Conv
|
156 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage4.blocks.conv1.conv.weight + /model/backbone/stage4/blocks/conv1/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage4/blocks/conv1/conv/Conv
|
157 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
158 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage4.blocks.bottlenecks.0.cv1.rbr_reparam.weight + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/Conv
|
159 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage4.blocks.bottlenecks.0.cv2.rbr_reparam.weight + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/Conv
|
160 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage4.blocks.bottlenecks.0.alpha + (Unnamed Layer* 719) [Shuffle] + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/Mul, /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/Add)
|
161 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
162 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage4.blocks.bottlenecks.1.cv1.rbr_reparam.weight + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/Conv
|
163 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage4.blocks.bottlenecks.1.cv2.rbr_reparam.weight + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/Conv
|
164 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage4.blocks.bottlenecks.1.alpha + (Unnamed Layer* 735) [Shuffle] + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/Mul, /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/Add)
|
165 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage4.blocks.conv3.conv.weight + /model/backbone/stage4/blocks/conv3/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage4/blocks/conv3/conv/Conv
|
166 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.context_module.cv1.conv.weight + /model/backbone/context_module/cv1/conv/_weight_quantizer/QuantizeLinear + /model/backbone/context_module/cv1/conv/Conv
|
167 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] POOLING: /model/backbone/context_module/m.2/MaxPool
|
168 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] POOLING: /model/backbone/context_module/m.1/MaxPool
|
169 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] POOLING: /model/backbone/context_module/m.0/MaxPool
|
170 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] COPY: /model/backbone/context_module/m.2/MaxPool_output_0 copy
|
171 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.context_module.cv2.conv.weight + /model/backbone/context_module/cv2/conv/_weight_quantizer/QuantizeLinear + /model/backbone/context_module/cv2/conv/Conv
|
172 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.conv.conv.weight + /model/neck/neck1/conv/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck1/conv/conv/Conv
|
173 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] COPY: /model/neck/neck1/upsample/_input_quantizer/QuantizeLinear
|
174 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] DECONVOLUTION: model.neck.neck1.upsample.weight + /model/neck/neck1/upsample/_weight_quantizer/QuantizeLinear + /model/neck/neck1/upsample/ConvTranspose
|
175 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.reduce_after_concat.conv.weight + /model/neck/neck1/reduce_after_concat/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck1/reduce_after_concat/conv/Conv
|
176 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.blocks.conv2.conv.weight + /model/neck/neck1/blocks/conv2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck1/blocks/conv2/conv/Conv
|
177 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.blocks.conv1.conv.weight + /model/neck/neck1/blocks/conv1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck1/blocks/conv1/conv/Conv
|
178 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] COPY: /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
179 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.blocks.bottlenecks.0.cv1.rbr_reparam.weight + /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/Conv
|
180 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.blocks.bottlenecks.0.cv2.rbr_reparam.weight + /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/Conv
|
181 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck1.blocks.bottlenecks.0.alpha + (Unnamed Layer* 800) [Shuffle] + /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/Mul, /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/Add)
|
182 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] COPY: /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
183 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.blocks.bottlenecks.1.cv1.rbr_reparam.weight + /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/Conv
|
184 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.blocks.bottlenecks.1.cv2.rbr_reparam.weight + /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/Conv
|
185 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck1.blocks.bottlenecks.1.alpha + (Unnamed Layer* 816) [Shuffle] + /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/Mul, /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/Add)
|
186 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] COPY: /model/neck/neck1/blocks/Concat_/model/neck/neck1/blocks/bottlenecks/bottlenecks.1/Add_output_0_clone_0 copy
|
187 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.blocks.conv3.conv.weight + /model/neck/neck1/blocks/conv3/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck1/blocks/conv3/conv/Conv
|
188 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.conv.conv.weight + /model/neck/neck2/conv/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck2/conv/conv/Conv
|
189 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] COPY: /model/neck/neck2/upsample/_input_quantizer/QuantizeLinear
|
190 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] DECONVOLUTION: model.neck.neck2.upsample.weight + /model/neck/neck2/upsample/_weight_quantizer/QuantizeLinear + /model/neck/neck2/upsample/ConvTranspose
|
191 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] COPY: /model/neck/neck2/Concat_/model/neck/neck2/reduce_skip1/act/Relu_output_0_clone_1 copy
|
192 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.reduce_after_concat.conv.weight + /model/neck/neck2/reduce_after_concat/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck2/reduce_after_concat/conv/Conv
|
193 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.blocks.conv2.conv.weight + /model/neck/neck2/blocks/conv2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck2/blocks/conv2/conv/Conv
|
194 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.blocks.conv1.conv.weight + /model/neck/neck2/blocks/conv1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck2/blocks/conv1/conv/Conv
|
195 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] COPY: /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
196 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.blocks.bottlenecks.0.cv1.rbr_reparam.weight + /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/Conv
|
197 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.blocks.bottlenecks.0.cv2.rbr_reparam.weight + /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/Conv
|
198 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck2.blocks.bottlenecks.0.alpha + (Unnamed Layer* 865) [Shuffle] + /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/Mul, /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/Add)
|
199 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] COPY: /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
200 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.blocks.bottlenecks.1.cv1.rbr_reparam.weight + /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/Conv
|
201 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.blocks.bottlenecks.1.cv2.rbr_reparam.weight + /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/Conv
|
202 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck2.blocks.bottlenecks.1.alpha + (Unnamed Layer* 881) [Shuffle] + /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/Mul, /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/Add)
|
203 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.blocks.conv3.conv.weight + /model/neck/neck2/blocks/conv3/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck2/blocks/conv3/conv/Conv
|
204 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head1.bbox_stem.seq.conv.weight + /model/heads/head1/bbox_stem/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head1/bbox_stem/seq/conv/Conv || model.heads.head1.pose_stem.seq.conv.weight + /model/heads/head1/pose_stem/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head1/pose_stem/seq/conv/Conv
|
205 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck3.conv.conv.weight + /model/neck/neck3/conv/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck3/conv/conv/Conv
|
206 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head1.reg_convs.0.seq.conv.weight + /model/heads/head1/reg_convs/reg_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head1/reg_convs/reg_convs.0/seq/conv/Conv || model.heads.head1.cls_convs.0.seq.conv.weight + /model/heads/head1/cls_convs/cls_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head1/cls_convs/cls_convs.0/seq/conv/Conv
|
207 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head1.pose_convs.0.seq.conv.weight + /model/heads/head1/pose_convs/pose_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head1/pose_convs/pose_convs.0/seq/conv/Conv
|
208 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] COPY: /model/neck/neck3/blocks/conv1/conv/_input_quantizer/QuantizeLinear_clone_1
|
209 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head1.cls_pred.weight + /model/heads/head1/cls_pred/_weight_quantizer/QuantizeLinear + /model/heads/head1/cls_pred/Conv
|
210 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head1.reg_pred.weight + /model/heads/head1/reg_pred/_weight_quantizer/QuantizeLinear + /model/heads/head1/reg_pred/Conv
|
211 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head1.pose_convs.1.seq.conv.weight + /model/heads/head1/pose_convs/pose_convs.1/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head1/pose_convs/pose_convs.1/seq/conv/Conv
|
212 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck3.blocks.conv2.conv.weight + /model/neck/neck3/blocks/conv2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck3/blocks/conv2/conv/Conv
|
213 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck3.blocks.conv1.conv.weight + /model/neck/neck3/blocks/conv1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck3/blocks/conv1/conv/Conv
|
214 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] SHUFFLE: /model/heads/Reshape + /model/heads/Transpose
|
215 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head1.pose_pred.weight + /model/heads/head1/pose_pred/_weight_quantizer/QuantizeLinear + /model/heads/head1/pose_pred/Conv
|
216 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] COPY: /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/cv1/conv/_input_quantizer/QuantizeLinear
|
217 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] SOFTMAX: /model/heads/Softmax
|
218 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck3.blocks.bottlenecks.0.cv1.conv.weight + /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/cv1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/cv1/conv/Conv
|
219 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/Conv
|
220 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck3.blocks.bottlenecks.0.cv2.conv.weight + /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/cv2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/cv2/conv/Conv
|
221 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck3.blocks.bottlenecks.0.alpha + (Unnamed Layer* 947) [Shuffle] + /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/Mul, /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/Add)
|
222 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] COPY: /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/cv1/conv/_input_quantizer/QuantizeLinear
|
223 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck3.blocks.bottlenecks.1.cv1.conv.weight + /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/cv1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/cv1/conv/Conv
|
224 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck3.blocks.bottlenecks.1.cv2.conv.weight + /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/cv2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/cv2/conv/Conv
|
225 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck3.blocks.bottlenecks.1.alpha + (Unnamed Layer* 988) [Shuffle] + /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/Mul, /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/Add)
|
226 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] COPY: /model/neck/neck3/blocks/Concat_/model/neck/neck3/blocks/bottlenecks/bottlenecks.1/Add_output_0_clone_0 copy
|
227 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck3.blocks.conv3.conv.weight + /model/neck/neck3/blocks/conv3/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck3/blocks/conv3/conv/Conv
|
228 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head2.bbox_stem.seq.conv.weight + /model/heads/head2/bbox_stem/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head2/bbox_stem/seq/conv/Conv || model.heads.head2.pose_stem.seq.conv.weight + /model/heads/head2/pose_stem/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head2/pose_stem/seq/conv/Conv
|
229 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck4.conv.conv.weight + /model/neck/neck4/conv/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck4/conv/conv/Conv
|
230 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head2.reg_convs.0.seq.conv.weight + /model/heads/head2/reg_convs/reg_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head2/reg_convs/reg_convs.0/seq/conv/Conv || model.heads.head2.cls_convs.0.seq.conv.weight + /model/heads/head2/cls_convs/cls_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head2/cls_convs/cls_convs.0/seq/conv/Conv
|
231 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head2.pose_convs.0.seq.conv.weight + /model/heads/head2/pose_convs/pose_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head2/pose_convs/pose_convs.0/seq/conv/Conv
|
232 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] COPY: /model/neck/neck4/blocks/conv1/conv/_input_quantizer/QuantizeLinear_clone_1
|
233 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head2.cls_pred.weight + /model/heads/head2/cls_pred/_weight_quantizer/QuantizeLinear + /model/heads/head2/cls_pred/Conv
|
234 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head2.reg_pred.weight + /model/heads/head2/reg_pred/_weight_quantizer/QuantizeLinear + /model/heads/head2/reg_pred/Conv
|
235 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head2.pose_convs.1.seq.conv.weight + /model/heads/head2/pose_convs/pose_convs.1/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head2/pose_convs/pose_convs.1/seq/conv/Conv
|
236 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck4.blocks.conv2.conv.weight + /model/neck/neck4/blocks/conv2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck4/blocks/conv2/conv/Conv
|
237 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck4.blocks.conv1.conv.weight + /model/neck/neck4/blocks/conv1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck4/blocks/conv1/conv/Conv
|
238 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] SHUFFLE: /model/heads/Reshape_4 + /model/heads/Transpose_3
|
239 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head2.pose_pred.weight + /model/heads/head2/pose_pred/_weight_quantizer/QuantizeLinear + /model/heads/head2/pose_pred/Conv
|
240 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] COPY: /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/cv1/conv/_input_quantizer/QuantizeLinear
|
241 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] SOFTMAX: /model/heads/Softmax_1
|
242 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck4.blocks.bottlenecks.0.cv1.conv.weight + /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/cv1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/cv1/conv/Conv
|
243 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/Conv_1
|
244 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck4.blocks.bottlenecks.0.cv2.conv.weight + /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/cv2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/cv2/conv/Conv
|
245 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck4.blocks.bottlenecks.0.alpha + (Unnamed Layer* 1054) [Shuffle] + /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/Mul, /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/Add)
|
246 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] COPY: /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/cv1/conv/_input_quantizer/QuantizeLinear
|
247 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck4.blocks.bottlenecks.1.cv1.conv.weight + /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/cv1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/cv1/conv/Conv
|
248 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck4.blocks.bottlenecks.1.cv2.conv.weight + /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/cv2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/cv2/conv/Conv
|
249 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck4.blocks.bottlenecks.1.alpha + (Unnamed Layer* 1095) [Shuffle] + /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/Mul, /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/Add)
|
250 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] COPY: /model/neck/neck4/blocks/Concat_/model/neck/neck4/blocks/bottlenecks/bottlenecks.1/Add_output_0_clone_0 copy
|
251 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck4.blocks.conv3.conv.weight + /model/neck/neck4/blocks/conv3/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck4/blocks/conv3/conv/Conv
|
252 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head3.bbox_stem.seq.conv.weight + /model/heads/head3/bbox_stem/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head3/bbox_stem/seq/conv/Conv || model.heads.head3.pose_stem.seq.conv.weight + /model/heads/head3/pose_stem/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head3/pose_stem/seq/conv/Conv
|
253 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head3.reg_convs.0.seq.conv.weight + /model/heads/head3/reg_convs/reg_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head3/reg_convs/reg_convs.0/seq/conv/Conv || model.heads.head3.cls_convs.0.seq.conv.weight + /model/heads/head3/cls_convs/cls_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head3/cls_convs/cls_convs.0/seq/conv/Conv
|
254 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head3.pose_convs.0.seq.conv.weight + /model/heads/head3/pose_convs/pose_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head3/pose_convs/pose_convs.0/seq/conv/Conv
|
255 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head3.cls_pred.weight + /model/heads/head3/cls_pred/_weight_quantizer/QuantizeLinear + /model/heads/head3/cls_pred/Conv
|
256 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head3.reg_pred.weight + /model/heads/head3/reg_pred/_weight_quantizer/QuantizeLinear + /model/heads/head3/reg_pred/Conv
|
257 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head3.pose_convs.1.seq.conv.weight + /model/heads/head3/pose_convs/pose_convs.1/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head3/pose_convs/pose_convs.1/seq/conv/Conv
|
258 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] SHUFFLE: /model/heads/Reshape_8 + /model/heads/Transpose_6
|
259 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head3.pose_convs.2.seq.conv.weight + /model/heads/head3/pose_convs/pose_convs.2/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head3/pose_convs/pose_convs.2/seq/conv/Conv
|
260 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] SOFTMAX: /model/heads/Softmax_2
|
261 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head3.pose_pred.weight + /model/heads/head3/pose_pred/_weight_quantizer/QuantizeLinear + /model/heads/head3/pose_pred/Conv
|
262 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/Conv_2
|
263 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] MYELIN: {ForeignNode[/model/heads/head1/Slice_1.../post_process/Reshape_2]}
|
264 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] NMS: batched_nms_238
|
265 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] DEVICE_TO_SHAPE_HOST: (Unnamed Layer* 1208) [NMS]_1_output[DevicetoShapeHostCopy]
|
266 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] TRAIN_STATION: [trainStation2]
|
267 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] MYELIN: {ForeignNode[/model/heads/head1/Slice...graph2_/Concat_5]}
|
268 |
+
[01/04/2024-14:17:40] [I] [TRT] [GpuLayer] TRAIN_STATION: [trainStation3]
|
269 |
+
[01/04/2024-14:17:55] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +534, GPU +815, now: CPU 1168, GPU 4076 (MiB)
|
270 |
+
[01/04/2024-14:17:58] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +82, GPU +138, now: CPU 1250, GPU 4214 (MiB)
|
271 |
+
[01/04/2024-14:17:58] [I] [TRT] Local timing cache in use. Profiling results in this builder pass will not be stored.
|
272 |
+
[01/04/2024-14:45:51] [I] [TRT] Total Activation Memory: 7900558848
|
273 |
+
[01/04/2024-14:45:51] [I] [TRT] Detected 1 inputs and 1 output network tensors.
|
274 |
+
[01/04/2024-14:45:58] [I] [TRT] Total Host Persistent Memory: 300704
|
275 |
+
[01/04/2024-14:45:58] [I] [TRT] Total Device Persistent Memory: 116736
|
276 |
+
[01/04/2024-14:45:58] [I] [TRT] Total Scratch Memory: 134217728
|
277 |
+
[01/04/2024-14:45:58] [I] [TRT] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 33 MiB, GPU 154 MiB
|
278 |
+
[01/04/2024-14:45:58] [I] [TRT] [BlockAssignment] Started assigning block shifts. This will take 176 steps to complete.
|
279 |
+
[01/04/2024-14:45:58] [I] [TRT] [BlockAssignment] Algorithm ShiftNTopDown took 60.5874ms to assign 13 blocks to 176 nodes requiring 140788224 bytes.
|
280 |
+
[01/04/2024-14:45:58] [I] [TRT] Total Activation Memory: 140788224
|
281 |
+
[01/04/2024-14:46:01] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +0, now: CPU 1593, GPU 5386 (MiB)
|
282 |
+
[01/04/2024-14:46:01] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in building engine: CPU +6, GPU +8, now: CPU 6, GPU 8 (MiB)
|
283 |
+
[01/04/2024-14:46:01] [I] Engine built in 1717.31 sec.
|
284 |
+
[01/04/2024-14:46:01] [I] [TRT] Loaded engine size: 9 MiB
|
285 |
+
[01/04/2024-14:46:02] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +0, now: CPU 1244, GPU 5378 (MiB)
|
286 |
+
[01/04/2024-14:46:02] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +7, now: CPU 0, GPU 7 (MiB)
|
287 |
+
[01/04/2024-14:46:02] [I] Engine deserialized in 0.18586 sec.
|
288 |
+
[01/04/2024-14:46:02] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU -1, now: CPU 1245, GPU 5378 (MiB)
|
289 |
+
[01/04/2024-14:46:02] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +135, now: CPU 0, GPU 142 (MiB)
|
290 |
+
[01/04/2024-14:46:02] [I] Setting persistentCacheLimit to 0 bytes.
|
291 |
+
[01/04/2024-14:46:02] [I] Using random values for input onnx::Cast_0
|
292 |
+
[01/04/2024-14:46:02] [I] Created input binding for onnx::Cast_0 with dimensions 1x3x640x640
|
293 |
+
[01/04/2024-14:46:02] [I] Using random values for output graph2_flat_predictions
|
294 |
+
[01/04/2024-14:46:02] [I] Created output binding for graph2_flat_predictions with dimensions -1x57
|
295 |
+
[01/04/2024-14:46:02] [I] Starting inference
|
296 |
+
[01/04/2024-14:46:17] [I] Warmup completed 3 queries over 200 ms
|
297 |
+
[01/04/2024-14:46:17] [I] Timing trace has 1232 queries over 15.0315 s
|
298 |
+
[01/04/2024-14:46:17] [I]
|
299 |
+
[01/04/2024-14:46:17] [I] === Trace details ===
|
300 |
+
[01/04/2024-14:46:17] [I] Trace averages of 100 runs:
|
301 |
+
[01/04/2024-14:46:17] [I] Average on 100 runs - GPU latency: 12.29 ms - Host latency: 12.4041 ms (enqueue 12.3732 ms)
|
302 |
+
[01/04/2024-14:46:17] [I] Average on 100 runs - GPU latency: 12.0468 ms - Host latency: 12.1594 ms (enqueue 12.1269 ms)
|
303 |
+
[01/04/2024-14:46:17] [I] Average on 100 runs - GPU latency: 12.0958 ms - Host latency: 12.2081 ms (enqueue 12.1741 ms)
|
304 |
+
[01/04/2024-14:46:17] [I] Average on 100 runs - GPU latency: 12.0328 ms - Host latency: 12.1444 ms (enqueue 12.1128 ms)
|
305 |
+
[01/04/2024-14:46:17] [I] Average on 100 runs - GPU latency: 12.0594 ms - Host latency: 12.1718 ms (enqueue 12.1396 ms)
|
306 |
+
[01/04/2024-14:46:17] [I] Average on 100 runs - GPU latency: 12.065 ms - Host latency: 12.1767 ms (enqueue 12.145 ms)
|
307 |
+
[01/04/2024-14:46:17] [I] Average on 100 runs - GPU latency: 12.0382 ms - Host latency: 12.1502 ms (enqueue 12.1181 ms)
|
308 |
+
[01/04/2024-14:46:17] [I] Average on 100 runs - GPU latency: 12.0485 ms - Host latency: 12.1607 ms (enqueue 12.1285 ms)
|
309 |
+
[01/04/2024-14:46:17] [I] Average on 100 runs - GPU latency: 12.0218 ms - Host latency: 12.1333 ms (enqueue 12.1027 ms)
|
310 |
+
[01/04/2024-14:46:17] [I] Average on 100 runs - GPU latency: 11.9903 ms - Host latency: 12.1026 ms (enqueue 12.0704 ms)
|
311 |
+
[01/04/2024-14:46:17] [I] Average on 100 runs - GPU latency: 11.9893 ms - Host latency: 12.1013 ms (enqueue 12.0701 ms)
|
312 |
+
[01/04/2024-14:46:17] [I] Average on 100 runs - GPU latency: 12.0489 ms - Host latency: 12.1609 ms (enqueue 12.1343 ms)
|
313 |
+
[01/04/2024-14:46:17] [I]
|
314 |
+
[01/04/2024-14:46:17] [I] === Performance summary ===
|
315 |
+
[01/04/2024-14:46:17] [I] Throughput: 81.9611 qps
|
316 |
+
[01/04/2024-14:46:17] [I] Latency: min = 11.582 ms, max = 16.703 ms, mean = 12.1721 ms, median = 12.1212 ms, percentile(90%) = 12.4434 ms, percentile(95%) = 12.5928 ms, percentile(99%) = 13.4438 ms
|
317 |
+
[01/04/2024-14:46:17] [I] Enqueue Time: min = 11.5547 ms, max = 16.675 ms, mean = 12.1407 ms, median = 12.0908 ms, percentile(90%) = 12.4111 ms, percentile(95%) = 12.5623 ms, percentile(99%) = 13.3818 ms
|
318 |
+
[01/04/2024-14:46:17] [I] H2D Latency: min = 0.0830078 ms, max = 0.146088 ms, mean = 0.1 ms, median = 0.0996094 ms, percentile(90%) = 0.10083 ms, percentile(95%) = 0.101562 ms, percentile(99%) = 0.115234 ms
|
319 |
+
[01/04/2024-14:46:17] [I] GPU Compute Time: min = 11.4717 ms, max = 16.5476 ms, mean = 12.0599 ms, median = 12.0098 ms, percentile(90%) = 12.332 ms, percentile(95%) = 12.4795 ms, percentile(99%) = 13.3123 ms
|
320 |
+
[01/04/2024-14:46:17] [I] D2H Latency: min = 0.00415039 ms, max = 0.0344238 ms, mean = 0.0122015 ms, median = 0.0117188 ms, percentile(90%) = 0.0146484 ms, percentile(95%) = 0.0159912 ms, percentile(99%) = 0.0258789 ms
|
321 |
+
[01/04/2024-14:46:17] [I] Total Host Walltime: 15.0315 s
|
322 |
+
[01/04/2024-14:46:17] [I] Total GPU Compute Time: 14.8578 s
|
323 |
+
[01/04/2024-14:46:17] [I] Explanations of the performance metrics are printed in the verbose logs.
|
324 |
+
[01/04/2024-14:46:17] [I]
|
325 |
&&&& PASSED TensorRT.trtexec [TensorRT v8502] # /usr/src/tensorrt/bin/trtexec --onnx=yolo_nas_pose_n_int8.onnx --best --avgRuns=100 --duration=15 --saveEngine=yolo_nas_pose_n_int8.onnx.best.engine
|
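The performance summary above follows a regular "name = value ms" layout, so the figures can be pulled out for comparison between the --best and --int8 runs. The following is a minimal sketch, not part of the commit: the helper name parse_summary_line and the regular expression are assumptions made for illustration, and the sample line is copied from the log above.

```python
import re

# Matches pairs such as "mean = 12.1721 ms" or "percentile(99%) = 13.4438 ms".
# The pattern is an assumption based on the summary lines shown in this log.
_PAIR = re.compile(r"(\w+(?:\(\d+%\))?) = ([\d.]+) ms")


def parse_summary_line(line):
    """Return {statistic name: value in ms} for one trtexec summary line."""
    return {name: float(value) for name, value in _PAIR.findall(line)}


# Sample line copied verbatim from the yolo_nas_pose_n --best log.
latency = parse_summary_line(
    "[01/04/2024-14:46:17] [I] Latency: min = 11.582 ms, max = 16.703 ms, "
    "mean = 12.1721 ms, median = 12.1212 ms, percentile(90%) = 12.4434 ms, "
    "percentile(95%) = 12.5928 ms, percentile(99%) = 13.4438 ms"
)
print(latency["mean"], latency["percentile(99%)"])
```

The same helper should work on the "Enqueue Time", "H2D Latency", "GPU Compute Time" and "D2H Latency" lines, since they share the layout; lines without "ms" values (for example "Throughput") yield an empty dict.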
yolo_nas_pose_n_int8.onnx.int8.engine
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
-
size
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:b19fa98d82cf9039bb036f74f7f3f994127f88c46b7bfab06353cf82e5f8dc09
|
3 |
+
size 10115566
|
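The .engine entries in this diff are Git LFS pointer files, not the engines themselves: three "key value" lines giving the spec version, the object hash and the size in bytes. A minimal sketch of reading such a pointer follows; the function name parse_lfs_pointer is hypothetical, and the sample text is the new pointer shown above.

```python
def parse_lfs_pointer(text):
    """Split a Git LFS pointer file into its version, hash and size fields."""
    # Each non-empty line is "key value", separated by a single space.
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real object, in bytes
    }


# Pointer content copied from the diff above.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:b19fa98d82cf9039bb036f74f7f3f994127f88c46b7bfab06353cf82e5f8dc09\n"
    "size 10115566\n"
)
print(pointer["algo"], pointer["size"])
```

Comparing the parsed size with the size of a downloaded engine file is a quick way to confirm that the real object, rather than the pointer, was fetched.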
yolo_nas_pose_n_int8.onnx.int8.engine.err
CHANGED
@@ -1,7 +1,7 @@
|
|
|
|
1 |
+
[01/04/2024-14:46:24] [W] [TRT] onnx2trt_utils.cpp:375: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
|
2 |
+
[01/04/2024-14:46:24] [W] [TRT] onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
|
3 |
+
[01/04/2024-14:46:27] [W] [TRT] Calibrator won't be used in explicit precision mode. Use quantization aware training to generate network with Quantize/Dequantize nodes.
|
4 |
+
[01/04/2024-14:55:32] [W] * Throughput may be bound by Enqueue Time rather than GPU Compute and the GPU may be under-utilized.
|
5 |
+
[01/04/2024-14:55:32] [W] If not already in use, --useCudaGraph (utilize CUDA graphs where possible) may increase the throughput.
|
6 |
+
[01/04/2024-14:55:32] [W] * GPU compute time is unstable, with coefficient of variance = 4.3012%.
|
7 |
+
[01/04/2024-14:55:32] [W] If not already in use, locking GPU clock frequency or adding --useSpinWait may improve the stability.
|
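The stability warning above reports a "coefficient of variance" of 4.3012% for GPU compute time. That figure is a standard deviation expressed as a percentage of the mean. The sketch below shows the calculation on made-up samples; the log does not say which estimator trtexec uses, so the population standard deviation here is an assumption, as is the function name.

```python
import statistics


def coefficient_of_variation(samples_ms):
    """Return the standard deviation as a percentage of the mean."""
    mean = statistics.fmean(samples_ms)
    # Assumption: population standard deviation; trtexec may differ.
    return 100.0 * statistics.pstdev(samples_ms) / mean


# Illustrative values only, not taken from the log.
samples = [10.0, 12.0]
print(round(coefficient_of_variation(samples), 4))
```

A lower percentage means the per-inference GPU times cluster more tightly around the mean, which is what the suggested clock locking or --useSpinWait aims for.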
yolo_nas_pose_n_int8.onnx.int8.engine.log
CHANGED
@@ -1,323 +1,325 @@
|
|
1 |
&&&& RUNNING TensorRT.trtexec [TensorRT v8502] # /usr/src/tensorrt/bin/trtexec --onnx=yolo_nas_pose_n_int8.onnx --int8 --avgRuns=100 --duration=15 --saveEngine=yolo_nas_pose_n_int8.onnx.int8.engine
|
|
|
|
|
323 |
&&&& PASSED TensorRT.trtexec [TensorRT v8502] # /usr/src/tensorrt/bin/trtexec --onnx=yolo_nas_pose_n_int8.onnx --int8 --avgRuns=100 --duration=15 --saveEngine=yolo_nas_pose_n_int8.onnx.int8.engine
|
|
|
1 |
&&&& RUNNING TensorRT.trtexec [TensorRT v8502] # /usr/src/tensorrt/bin/trtexec --onnx=yolo_nas_pose_n_int8.onnx --int8 --avgRuns=100 --duration=15 --saveEngine=yolo_nas_pose_n_int8.onnx.int8.engine
|
2 |
+
[01/04/2024-14:46:20] [I] === Model Options ===
|
3 |
+
[01/04/2024-14:46:20] [I] Format: ONNX
|
4 |
+
[01/04/2024-14:46:20] [I] Model: yolo_nas_pose_n_int8.onnx
|
5 |
+
[01/04/2024-14:46:20] [I] Output:
|
6 |
+
[01/04/2024-14:46:20] [I] === Build Options ===
|
7 |
+
[01/04/2024-14:46:20] [I] Max batch: explicit batch
|
8 |
+
[01/04/2024-14:46:20] [I] Memory Pools: workspace: default, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default
|
9 |
+
[01/04/2024-14:46:20] [I] minTiming: 1
|
10 |
+
[01/04/2024-14:46:20] [I] avgTiming: 8
|
11 |
+
[01/04/2024-14:46:20] [I] Precision: FP32+INT8
|
12 |
+
[01/04/2024-14:46:20] [I] LayerPrecisions:
|
13 |
+
[01/04/2024-14:46:20] [I] Calibration: Dynamic
|
14 |
+
[01/04/2024-14:46:20] [I] Refit: Disabled
|
15 |
+
[01/04/2024-14:46:20] [I] Sparsity: Disabled
|
16 |
+
[01/04/2024-14:46:20] [I] Safe mode: Disabled
|
17 |
+
[01/04/2024-14:46:20] [I] DirectIO mode: Disabled
|
18 |
+
[01/04/2024-14:46:20] [I] Restricted mode: Disabled
|
19 |
+
[01/04/2024-14:46:20] [I] Build only: Disabled
|
20 |
+
[01/04/2024-14:46:20] [I] Save engine: yolo_nas_pose_n_int8.onnx.int8.engine
|
21 |
+
[01/04/2024-14:46:20] [I] Load engine:
|
22 |
+
[01/04/2024-14:46:20] [I] Profiling verbosity: 0
|
23 |
+
[01/04/2024-14:46:20] [I] Tactic sources: Using default tactic sources
|
24 |
+
[01/04/2024-14:46:20] [I] timingCacheMode: local
|
25 |
+
[01/04/2024-14:46:20] [I] timingCacheFile:
|
26 |
+
[01/04/2024-14:46:20] [I] Heuristic: Disabled
|
27 |
+
[01/04/2024-14:46:20] [I] Preview Features: Use default preview flags.
|
28 |
+
[01/04/2024-14:46:20] [I] Input(s)s format: fp32:CHW
|
29 |
+
[01/04/2024-14:46:20] [I] Output(s)s format: fp32:CHW
|
30 |
+
[01/04/2024-14:46:20] [I] Input build shapes: model
|
31 |
+
[01/04/2024-14:46:20] [I] Input calibration shapes: model
|
32 |
+
[01/04/2024-14:46:20] [I] === System Options ===
|
33 |
+
[01/04/2024-14:46:20] [I] Device: 0
|
34 |
+
[01/04/2024-14:46:20] [I] DLACore:
|
35 |
+
[01/04/2024-14:46:20] [I] Plugins:
|
36 |
+
[01/04/2024-14:46:20] [I] === Inference Options ===
|
37 |
+
[01/04/2024-14:46:20] [I] Batch: Explicit
|
38 |
+
[01/04/2024-14:46:20] [I] Input inference shapes: model
|
39 |
+
[01/04/2024-14:46:20] [I] Iterations: 10
|
40 |
+
[01/04/2024-14:46:20] [I] Duration: 15s (+ 200ms warm up)
|
41 |
+
[01/04/2024-14:46:20] [I] Sleep time: 0ms
|
42 |
+
[01/04/2024-14:46:20] [I] Idle time: 0ms
|
43 |
+
[01/04/2024-14:46:20] [I] Streams: 1
|
44 |
+
[01/04/2024-14:46:20] [I] ExposeDMA: Disabled
|
45 |
+
[01/04/2024-14:46:20] [I] Data transfers: Enabled
|
46 |
+
[01/04/2024-14:46:20] [I] Spin-wait: Disabled
|
47 |
+
[01/04/2024-14:46:20] [I] Multithreading: Disabled
|
48 |
+
[01/04/2024-14:46:20] [I] CUDA Graph: Disabled
|
49 |
+
[01/04/2024-14:46:20] [I] Separate profiling: Disabled
|
50 |
+
[01/04/2024-14:46:20] [I] Time Deserialize: Disabled
|
51 |
+
[01/04/2024-14:46:20] [I] Time Refit: Disabled
|
52 |
+
[01/04/2024-14:46:20] [I] NVTX verbosity: 0
|
53 |
+
[01/04/2024-14:46:20] [I] Persistent Cache Ratio: 0
|
54 |
+
[01/04/2024-14:46:20] [I] Inputs:
|
55 |
+
[01/04/2024-14:46:20] [I] === Reporting Options ===
|
56 |
+
[01/04/2024-14:46:20] [I] Verbose: Disabled
|
57 |
+
[01/04/2024-14:46:20] [I] Averages: 100 inferences
|
58 |
+
[01/04/2024-14:46:20] [I] Percentiles: 90,95,99
|
59 |
+
[01/04/2024-14:46:20] [I] Dump refittable layers:Disabled
|
60 |
+
[01/04/2024-14:46:20] [I] Dump output: Disabled
|
61 |
+
[01/04/2024-14:46:20] [I] Profile: Disabled
|
62 |
+
[01/04/2024-14:46:20] [I] Export timing to JSON file:
|
63 |
+
[01/04/2024-14:46:20] [I] Export output to JSON file:
|
64 |
+
[01/04/2024-14:46:20] [I] Export profile to JSON file:
|
65 |
+
[01/04/2024-14:46:20] [I]
|
66 |
+
[01/04/2024-14:46:20] [I] === Device Information ===
|
67 |
+
[01/04/2024-14:46:20] [I] Selected Device: Orin
|
68 |
+
[01/04/2024-14:46:20] [I] Compute Capability: 8.7
|
69 |
+
[01/04/2024-14:46:20] [I] SMs: 8
|
70 |
+
[01/04/2024-14:46:20] [I] Compute Clock Rate: 0.624 GHz
|
71 |
+
[01/04/2024-14:46:20] [I] Device Global Memory: 7471 MiB
|
72 |
+
[01/04/2024-14:46:20] [I] Shared Memory per SM: 164 KiB
|
73 |
+
[01/04/2024-14:46:20] [I] Memory Bus Width: 128 bits (ECC disabled)
|
74 |
+
[01/04/2024-14:46:20] [I] Memory Clock Rate: 0.624 GHz
|
75 |
+
[01/04/2024-14:46:20] [I]
|
76 |
+
[01/04/2024-14:46:20] [I] TensorRT version: 8.5.2
|
77 |
+
[01/04/2024-14:46:20] [I] [TRT] [MemUsageChange] Init CUDA: CPU +220, GPU +0, now: CPU 249, GPU 3636 (MiB)
|
78 |
+
[01/04/2024-14:46:24] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +302, GPU +284, now: CPU 574, GPU 3941 (MiB)
|
79 |
+
[01/04/2024-14:46:24] [I] Start parsing network model
|
80 |
+
[01/04/2024-14:46:24] [I] [TRT] ----------------------------------------------------------------
|
81 |
+
[01/04/2024-14:46:24] [I] [TRT] Input filename: yolo_nas_pose_n_int8.onnx
|
82 |
+
[01/04/2024-14:46:24] [I] [TRT] ONNX IR version: 0.0.8
|
83 |
+
[01/04/2024-14:46:24] [I] [TRT] Opset version: 17
|
84 |
+
[01/04/2024-14:46:24] [I] [TRT] Producer name: pytorch
|
85 |
+
[01/04/2024-14:46:24] [I] [TRT] Producer version: 2.1.2
|
86 |
+
[01/04/2024-14:46:24] [I] [TRT] Domain:
|
87 |
+
[01/04/2024-14:46:24] [I] [TRT] Model version: 0
|
88 |
+
[01/04/2024-14:46:24] [I] [TRT] Doc string:
|
89 |
+
[01/04/2024-14:46:24] [I] [TRT] ----------------------------------------------------------------
|
90 |
+
[01/04/2024-14:46:27] [I] Finish parsing network model
|
91 |
+
[01/04/2024-14:46:27] [I] FP32 and INT8 precisions have been specified - more performance might be enabled by additionally specifying --fp16 or --best
|
92 |
+
[01/04/2024-14:46:31] [I] [TRT] ---------- Layers Running on DLA ----------
|
93 |
+
[01/04/2024-14:46:31] [I] [TRT] ---------- Layers Running on GPU ----------
|
94 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] TRAIN_STATION: [trainStation1]
|
95 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] MYELIN: {ForeignNode[/pre_process/pre_process.0/Cast.../pre_process/pre_process.2/Mul]}
|
96 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONSTANT: (Unnamed Layer* 1204) [Constant]
|
97 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONSTANT: (Unnamed Layer* 1205) [Constant]
|
98 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONSTANT: (Unnamed Layer* 1206) [Constant]
|
99 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] COPY: /model/backbone/stem/conv/rbr_reparam/_input_quantizer/QuantizeLinear
|
100 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stem.conv.rbr_reparam.weight + /model/backbone/stem/conv/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stem/conv/rbr_reparam/Conv
|
101 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage1.downsample.rbr_reparam.weight + /model/backbone/stage1/downsample/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage1/downsample/rbr_reparam/Conv
|
102 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage1.blocks.conv2.conv.weight + /model/backbone/stage1/blocks/conv2/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage1/blocks/conv2/conv/Conv
|
103 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage1.blocks.conv1.conv.weight + /model/backbone/stage1/blocks/conv1/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage1/blocks/conv1/conv/Conv
|
104 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
105 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage1.blocks.bottlenecks.0.cv1.rbr_reparam.weight + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/Conv
|
106 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage1.blocks.bottlenecks.0.cv2.rbr_reparam.weight + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/Conv
|
107 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage1.blocks.bottlenecks.0.alpha + (Unnamed Layer* 485) [Shuffle] + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/Mul, /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/Add)
|
108 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
109 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage1.blocks.bottlenecks.1.cv1.rbr_reparam.weight + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/Conv
|
110 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage1.blocks.bottlenecks.1.cv2.rbr_reparam.weight + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/Conv
|
111 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage1.blocks.bottlenecks.1.alpha + (Unnamed Layer* 501) [Shuffle] + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/Mul, /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/Add)
|
112 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage1.blocks.conv3.conv.weight + /model/backbone/stage1/blocks/conv3/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage1/blocks/conv3/conv/Conv
|
113 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.reduce_skip2.conv.weight + /model/neck/neck2/reduce_skip2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck2/reduce_skip2/conv/Conv
|
114 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.downsample.rbr_reparam.weight + /model/backbone/stage2/downsample/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/downsample/rbr_reparam/Conv
|
115 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.downsample.conv.weight + /model/neck/neck2/downsample/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck2/downsample/conv/Conv
|
116 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.conv2.conv.weight + /model/backbone/stage2/blocks/conv2/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/conv2/conv/Conv
|
117 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.conv1.conv.weight + /model/backbone/stage2/blocks/conv1/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/conv1/conv/Conv
|
118 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
119 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.bottlenecks.0.cv1.rbr_reparam.weight + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/Conv
|
120 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.bottlenecks.0.cv2.rbr_reparam.weight + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/Conv
|
121 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage2.blocks.bottlenecks.0.alpha + (Unnamed Layer* 548) [Shuffle] + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/Mul, /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/Add)
|
122 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
123 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.bottlenecks.1.cv1.rbr_reparam.weight + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/Conv
|
124 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.bottlenecks.1.cv2.rbr_reparam.weight + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/Conv
|
125 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage2.blocks.bottlenecks.1.alpha + (Unnamed Layer* 564) [Shuffle] + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/Mul, /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/Add)
|
126 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
127 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.bottlenecks.2.cv1.rbr_reparam.weight + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/cv1/rbr_reparam/Conv
|
128 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.bottlenecks.2.cv2.rbr_reparam.weight + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/cv2/rbr_reparam/Conv
|
129 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage2.blocks.bottlenecks.2.alpha + (Unnamed Layer* 580) [Shuffle] + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/Mul, /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/Add)
|
130 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage2/blocks/Concat_/model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/Add_output_0_clone_0 copy
|
131 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.conv3.conv.weight + /model/backbone/stage2/blocks/conv3/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/conv3/conv/Conv
|
132 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.reduce_skip2.conv.weight + /model/neck/neck1/reduce_skip2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck1/reduce_skip2/conv/Conv || model.neck.neck2.reduce_skip1.conv.weight + /model/neck/neck2/reduce_skip1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck2/reduce_skip1/conv/Conv
|
133 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.downsample.rbr_reparam.weight + /model/backbone/stage3/downsample/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/downsample/rbr_reparam/Conv
|
134 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.downsample.conv.weight + /model/neck/neck1/downsample/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck1/downsample/conv/Conv
|
135 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.conv2.conv.weight + /model/backbone/stage3/blocks/conv2/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/conv2/conv/Conv
|
136 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.conv1.conv.weight + /model/backbone/stage3/blocks/conv1/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/conv1/conv/Conv
|
137 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
138 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.0.cv1.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/Conv
|
139 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.0.cv2.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/Conv
|
140 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage3.blocks.bottlenecks.0.alpha + (Unnamed Layer* 630) [Shuffle] + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/Mul, /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/Add)
|
141 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
142 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.1.cv1.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/Conv
|
143 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.1.cv2.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/Conv
|
144 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage3.blocks.bottlenecks.1.alpha + (Unnamed Layer* 646) [Shuffle] + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/Mul, /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/Add)
|
145 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
146 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.2.cv1.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/cv1/rbr_reparam/Conv
|
147 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.2.cv2.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/cv2/rbr_reparam/Conv
|
148 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage3.blocks.bottlenecks.2.alpha + (Unnamed Layer* 662) [Shuffle] + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/Mul, /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/Add)
|
149 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
150 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.3.cv1.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/cv1/rbr_reparam/Conv
|
151 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.3.cv2.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/cv2/rbr_reparam/Conv
|
152 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage3.blocks.bottlenecks.3.alpha + (Unnamed Layer* 678) [Shuffle] + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/Mul, /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/Add)
|
153 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.conv3.conv.weight + /model/backbone/stage3/blocks/conv3/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/conv3/conv/Conv
|
154 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.reduce_skip1.conv.weight + /model/neck/neck1/reduce_skip1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck1/reduce_skip1/conv/Conv
|
155 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage4.downsample.rbr_reparam.weight + /model/backbone/stage4/downsample/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage4/downsample/rbr_reparam/Conv
|
156 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage4.blocks.conv2.conv.weight + /model/backbone/stage4/blocks/conv2/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage4/blocks/conv2/conv/Conv
|
157 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage4.blocks.conv1.conv.weight + /model/backbone/stage4/blocks/conv1/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage4/blocks/conv1/conv/Conv
|
158 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
159 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage4.blocks.bottlenecks.0.cv1.rbr_reparam.weight + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/Conv
|
160 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage4.blocks.bottlenecks.0.cv2.rbr_reparam.weight + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/Conv
|
161 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage4.blocks.bottlenecks.0.alpha + (Unnamed Layer* 719) [Shuffle] + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/Mul, /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/Add)
|
162 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
163 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage4.blocks.bottlenecks.1.cv1.rbr_reparam.weight + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/Conv
|
164 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage4.blocks.bottlenecks.1.cv2.rbr_reparam.weight + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/Conv
|
165 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage4.blocks.bottlenecks.1.alpha + (Unnamed Layer* 735) [Shuffle] + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/Mul, /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/Add)
|
166 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage4.blocks.conv3.conv.weight + /model/backbone/stage4/blocks/conv3/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage4/blocks/conv3/conv/Conv
|
167 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.context_module.cv1.conv.weight + /model/backbone/context_module/cv1/conv/_weight_quantizer/QuantizeLinear + /model/backbone/context_module/cv1/conv/Conv
|
168 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] POOLING: /model/backbone/context_module/m.2/MaxPool
|
169 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] POOLING: /model/backbone/context_module/m.1/MaxPool
|
170 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] POOLING: /model/backbone/context_module/m.0/MaxPool
|
171 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] COPY: /model/backbone/context_module/m.2/MaxPool_output_0 copy
|
172 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.context_module.cv2.conv.weight + /model/backbone/context_module/cv2/conv/_weight_quantizer/QuantizeLinear + /model/backbone/context_module/cv2/conv/Conv
|
173 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.conv.conv.weight + /model/neck/neck1/conv/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck1/conv/conv/Conv
|
174 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] COPY: /model/neck/neck1/upsample/_input_quantizer/QuantizeLinear
|
175 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] DECONVOLUTION: model.neck.neck1.upsample.weight + /model/neck/neck1/upsample/_weight_quantizer/QuantizeLinear + /model/neck/neck1/upsample/ConvTranspose
|
176 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.reduce_after_concat.conv.weight + /model/neck/neck1/reduce_after_concat/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck1/reduce_after_concat/conv/Conv
|
177 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.blocks.conv2.conv.weight + /model/neck/neck1/blocks/conv2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck1/blocks/conv2/conv/Conv
|
178 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.blocks.conv1.conv.weight + /model/neck/neck1/blocks/conv1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck1/blocks/conv1/conv/Conv
|
179 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] COPY: /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
180 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.blocks.bottlenecks.0.cv1.rbr_reparam.weight + /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/Conv
|
181 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.blocks.bottlenecks.0.cv2.rbr_reparam.weight + /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/Conv
|
182 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck1.blocks.bottlenecks.0.alpha + (Unnamed Layer* 800) [Shuffle] + /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/Mul, /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/Add)
|
183 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] COPY: /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
184 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.blocks.bottlenecks.1.cv1.rbr_reparam.weight + /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/Conv
|
185 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.blocks.bottlenecks.1.cv2.rbr_reparam.weight + /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/Conv
|
186 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck1.blocks.bottlenecks.1.alpha + (Unnamed Layer* 816) [Shuffle] + /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/Mul, /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/Add)
|
187 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] COPY: /model/neck/neck1/blocks/Concat_/model/neck/neck1/blocks/bottlenecks/bottlenecks.1/Add_output_0_clone_0 copy
|
188 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.blocks.conv3.conv.weight + /model/neck/neck1/blocks/conv3/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck1/blocks/conv3/conv/Conv
|
189 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.conv.conv.weight + /model/neck/neck2/conv/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck2/conv/conv/Conv
|
190 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] COPY: /model/neck/neck2/upsample/_input_quantizer/QuantizeLinear
|
191 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] DECONVOLUTION: model.neck.neck2.upsample.weight + /model/neck/neck2/upsample/_weight_quantizer/QuantizeLinear + /model/neck/neck2/upsample/ConvTranspose
|
192 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] COPY: /model/neck/neck2/Concat_/model/neck/neck2/reduce_skip1/act/Relu_output_0_clone_1 copy
|
193 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.reduce_after_concat.conv.weight + /model/neck/neck2/reduce_after_concat/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck2/reduce_after_concat/conv/Conv
|
194 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.blocks.conv2.conv.weight + /model/neck/neck2/blocks/conv2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck2/blocks/conv2/conv/Conv
|
195 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.blocks.conv1.conv.weight + /model/neck/neck2/blocks/conv1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck2/blocks/conv1/conv/Conv
|
196 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] COPY: /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
197 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.blocks.bottlenecks.0.cv1.rbr_reparam.weight + /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/Conv
|
198 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.blocks.bottlenecks.0.cv2.rbr_reparam.weight + /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/Conv
|
199 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck2.blocks.bottlenecks.0.alpha + (Unnamed Layer* 865) [Shuffle] + /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/Mul, /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/Add)
|
200 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] COPY: /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
201 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.blocks.bottlenecks.1.cv1.rbr_reparam.weight + /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/Conv
|
202 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.blocks.bottlenecks.1.cv2.rbr_reparam.weight + /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/Conv
|
203 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck2.blocks.bottlenecks.1.alpha + (Unnamed Layer* 881) [Shuffle] + /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/Mul, /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/Add)
|
204 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.blocks.conv3.conv.weight + /model/neck/neck2/blocks/conv3/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck2/blocks/conv3/conv/Conv
|
205 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head1.bbox_stem.seq.conv.weight + /model/heads/head1/bbox_stem/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head1/bbox_stem/seq/conv/Conv || model.heads.head1.pose_stem.seq.conv.weight + /model/heads/head1/pose_stem/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head1/pose_stem/seq/conv/Conv
|
206 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck3.conv.conv.weight + /model/neck/neck3/conv/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck3/conv/conv/Conv
|
207 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head1.reg_convs.0.seq.conv.weight + /model/heads/head1/reg_convs/reg_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head1/reg_convs/reg_convs.0/seq/conv/Conv || model.heads.head1.cls_convs.0.seq.conv.weight + /model/heads/head1/cls_convs/cls_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head1/cls_convs/cls_convs.0/seq/conv/Conv
|
208 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head1.pose_convs.0.seq.conv.weight + /model/heads/head1/pose_convs/pose_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head1/pose_convs/pose_convs.0/seq/conv/Conv
|
209 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] COPY: /model/neck/neck3/blocks/conv1/conv/_input_quantizer/QuantizeLinear_clone_1
|
210 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head1.cls_pred.weight + /model/heads/head1/cls_pred/_weight_quantizer/QuantizeLinear + /model/heads/head1/cls_pred/Conv
|
211 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head1.reg_pred.weight + /model/heads/head1/reg_pred/_weight_quantizer/QuantizeLinear + /model/heads/head1/reg_pred/Conv
|
212 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head1.pose_convs.1.seq.conv.weight + /model/heads/head1/pose_convs/pose_convs.1/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head1/pose_convs/pose_convs.1/seq/conv/Conv
|
213 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck3.blocks.conv2.conv.weight + /model/neck/neck3/blocks/conv2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck3/blocks/conv2/conv/Conv
|
214 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck3.blocks.conv1.conv.weight + /model/neck/neck3/blocks/conv1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck3/blocks/conv1/conv/Conv
|
215 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] SHUFFLE: /model/heads/Reshape + /model/heads/Transpose
|
216 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head1.pose_pred.weight + /model/heads/head1/pose_pred/_weight_quantizer/QuantizeLinear + /model/heads/head1/pose_pred/Conv
|
217 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] COPY: /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/cv1/conv/_input_quantizer/QuantizeLinear
|
218 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] SOFTMAX: /model/heads/Softmax
|
219 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck3.blocks.bottlenecks.0.cv1.conv.weight + /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/cv1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/cv1/conv/Conv
|
220 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/Conv
|
221 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck3.blocks.bottlenecks.0.cv2.conv.weight + /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/cv2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/cv2/conv/Conv
|
222 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck3.blocks.bottlenecks.0.alpha + (Unnamed Layer* 947) [Shuffle] + /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/Mul, /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/Add)
|
223 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] COPY: /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/cv1/conv/_input_quantizer/QuantizeLinear
|
224 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck3.blocks.bottlenecks.1.cv1.conv.weight + /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/cv1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/cv1/conv/Conv
|
225 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck3.blocks.bottlenecks.1.cv2.conv.weight + /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/cv2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/cv2/conv/Conv
|
226 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck3.blocks.bottlenecks.1.alpha + (Unnamed Layer* 988) [Shuffle] + /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/Mul, /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/Add)
|
227 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] COPY: /model/neck/neck3/blocks/Concat_/model/neck/neck3/blocks/bottlenecks/bottlenecks.1/Add_output_0_clone_0 copy
|
228 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck3.blocks.conv3.conv.weight + /model/neck/neck3/blocks/conv3/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck3/blocks/conv3/conv/Conv
|
229 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head2.bbox_stem.seq.conv.weight + /model/heads/head2/bbox_stem/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head2/bbox_stem/seq/conv/Conv || model.heads.head2.pose_stem.seq.conv.weight + /model/heads/head2/pose_stem/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head2/pose_stem/seq/conv/Conv
|
230 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck4.conv.conv.weight + /model/neck/neck4/conv/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck4/conv/conv/Conv
|
231 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head2.reg_convs.0.seq.conv.weight + /model/heads/head2/reg_convs/reg_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head2/reg_convs/reg_convs.0/seq/conv/Conv || model.heads.head2.cls_convs.0.seq.conv.weight + /model/heads/head2/cls_convs/cls_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head2/cls_convs/cls_convs.0/seq/conv/Conv
|
232 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head2.pose_convs.0.seq.conv.weight + /model/heads/head2/pose_convs/pose_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head2/pose_convs/pose_convs.0/seq/conv/Conv
|
233 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] COPY: /model/neck/neck4/blocks/conv1/conv/_input_quantizer/QuantizeLinear_clone_1
|
234 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head2.cls_pred.weight + /model/heads/head2/cls_pred/_weight_quantizer/QuantizeLinear + /model/heads/head2/cls_pred/Conv
|
235 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head2.reg_pred.weight + /model/heads/head2/reg_pred/_weight_quantizer/QuantizeLinear + /model/heads/head2/reg_pred/Conv
|
236 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head2.pose_convs.1.seq.conv.weight + /model/heads/head2/pose_convs/pose_convs.1/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head2/pose_convs/pose_convs.1/seq/conv/Conv
|
237 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck4.blocks.conv2.conv.weight + /model/neck/neck4/blocks/conv2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck4/blocks/conv2/conv/Conv
|
238 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck4.blocks.conv1.conv.weight + /model/neck/neck4/blocks/conv1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck4/blocks/conv1/conv/Conv
|
239 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] SHUFFLE: /model/heads/Reshape_4 + /model/heads/Transpose_3
|
240 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head2.pose_pred.weight + /model/heads/head2/pose_pred/_weight_quantizer/QuantizeLinear + /model/heads/head2/pose_pred/Conv
|
241 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] COPY: /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/cv1/conv/_input_quantizer/QuantizeLinear
|
242 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] SOFTMAX: /model/heads/Softmax_1
|
243 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck4.blocks.bottlenecks.0.cv1.conv.weight + /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/cv1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/cv1/conv/Conv
|
244 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/Conv_1
|
245 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck4.blocks.bottlenecks.0.cv2.conv.weight + /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/cv2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/cv2/conv/Conv
|
246 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck4.blocks.bottlenecks.0.alpha + (Unnamed Layer* 1054) [Shuffle] + /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/Mul, /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/Add)
|
247 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] COPY: /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/cv1/conv/_input_quantizer/QuantizeLinear
|
248 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck4.blocks.bottlenecks.1.cv1.conv.weight + /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/cv1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/cv1/conv/Conv
|
249 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck4.blocks.bottlenecks.1.cv2.conv.weight + /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/cv2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/cv2/conv/Conv
|
250 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck4.blocks.bottlenecks.1.alpha + (Unnamed Layer* 1095) [Shuffle] + /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/Mul, /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/Add)
|
251 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] COPY: /model/neck/neck4/blocks/Concat_/model/neck/neck4/blocks/bottlenecks/bottlenecks.1/Add_output_0_clone_0 copy
|
252 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck4.blocks.conv3.conv.weight + /model/neck/neck4/blocks/conv3/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck4/blocks/conv3/conv/Conv
|
253 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head3.bbox_stem.seq.conv.weight + /model/heads/head3/bbox_stem/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head3/bbox_stem/seq/conv/Conv || model.heads.head3.pose_stem.seq.conv.weight + /model/heads/head3/pose_stem/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head3/pose_stem/seq/conv/Conv
|
254 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head3.reg_convs.0.seq.conv.weight + /model/heads/head3/reg_convs/reg_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head3/reg_convs/reg_convs.0/seq/conv/Conv || model.heads.head3.cls_convs.0.seq.conv.weight + /model/heads/head3/cls_convs/cls_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head3/cls_convs/cls_convs.0/seq/conv/Conv
|
255 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head3.pose_convs.0.seq.conv.weight + /model/heads/head3/pose_convs/pose_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head3/pose_convs/pose_convs.0/seq/conv/Conv
|
256 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head3.cls_pred.weight + /model/heads/head3/cls_pred/_weight_quantizer/QuantizeLinear + /model/heads/head3/cls_pred/Conv
|
257 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head3.reg_pred.weight + /model/heads/head3/reg_pred/_weight_quantizer/QuantizeLinear + /model/heads/head3/reg_pred/Conv
|
258 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head3.pose_convs.1.seq.conv.weight + /model/heads/head3/pose_convs/pose_convs.1/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head3/pose_convs/pose_convs.1/seq/conv/Conv
|
259 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] SHUFFLE: /model/heads/Reshape_8 + /model/heads/Transpose_6
|
260 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head3.pose_convs.2.seq.conv.weight + /model/heads/head3/pose_convs/pose_convs.2/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head3/pose_convs/pose_convs.2/seq/conv/Conv
|
261 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] SOFTMAX: /model/heads/Softmax_2
|
262 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head3.pose_pred.weight + /model/heads/head3/pose_pred/_weight_quantizer/QuantizeLinear + /model/heads/head3/pose_pred/Conv
|
263 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/Conv_2
|
264 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] MYELIN: {ForeignNode[/model/heads/head1/Slice_1.../post_process/Reshape_2]}
|
265 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] NMS: batched_nms_238
|
266 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] DEVICE_TO_SHAPE_HOST: (Unnamed Layer* 1208) [NMS]_1_output[DevicetoShapeHostCopy]
|
267 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] TRAIN_STATION: [trainStation2]
|
268 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] MYELIN: {ForeignNode[/model/heads/head1/Slice...graph2_/Concat_5]}
|
269 |
+
[01/04/2024-14:46:31] [I] [TRT] [GpuLayer] TRAIN_STATION: [trainStation3]
|
270 |
+
[01/04/2024-14:46:36] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +534, GPU +436, now: CPU 1168, GPU 4440 (MiB)
|
271 |
+
[01/04/2024-14:46:37] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +82, GPU +43, now: CPU 1250, GPU 4483 (MiB)
|
272 |
+
[01/04/2024-14:46:37] [I] [TRT] Local timing cache in use. Profiling results in this builder pass will not be stored.
|
273 |
+
[01/04/2024-14:55:15] [I] [TRT] Total Activation Memory: 7920254464
|
274 |
+
[01/04/2024-14:55:15] [I] [TRT] Detected 1 inputs and 1 output network tensors.
|
275 |
+
[01/04/2024-14:55:15] [I] [TRT] Total Host Persistent Memory: 300608
|
276 |
+
[01/04/2024-14:55:15] [I] [TRT] Total Device Persistent Memory: 119296
|
277 |
+
[01/04/2024-14:55:15] [I] [TRT] Total Scratch Memory: 134217728
|
278 |
+
[01/04/2024-14:55:15] [I] [TRT] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 33 MiB, GPU 132 MiB
|
279 |
+
[01/04/2024-14:55:15] [I] [TRT] [BlockAssignment] Started assigning block shifts. This will take 185 steps to complete.
|
280 |
+
[01/04/2024-14:55:15] [I] [TRT] [BlockAssignment] Algorithm ShiftNTopDown took 40.3888ms to assign 13 blocks to 185 nodes requiring 144141824 bytes.
|
281 |
+
[01/04/2024-14:55:15] [I] [TRT] Total Activation Memory: 144141824
|
282 |
+
[01/04/2024-14:55:16] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +0, now: CPU 1590, GPU 5342 (MiB)
|
283 |
+
[01/04/2024-14:55:16] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in building engine: CPU +6, GPU +8, now: CPU 6, GPU 8 (MiB)
|
284 |
+
[01/04/2024-14:55:16] [I] Engine built in 536.559 sec.
|
285 |
+
[01/04/2024-14:55:17] [I] [TRT] Loaded engine size: 9 MiB
|
286 |
+
[01/04/2024-14:55:17] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +0, now: CPU 1240, GPU 5345 (MiB)
|
287 |
+
[01/04/2024-14:55:17] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +7, now: CPU 0, GPU 7 (MiB)
|
288 |
+
[01/04/2024-14:55:17] [I] Engine deserialized in 0.120887 sec.
|
289 |
+
[01/04/2024-14:55:17] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +0, now: CPU 1241, GPU 5345 (MiB)
|
290 |
+
[01/04/2024-14:55:17] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +138, now: CPU 0, GPU 145 (MiB)
|
291 |
+
[01/04/2024-14:55:17] [I] Setting persistentCacheLimit to 0 bytes.
|
292 |
+
[01/04/2024-14:55:17] [I] Using random values for input onnx::Cast_0
|
293 |
+
[01/04/2024-14:55:17] [I] Created input binding for onnx::Cast_0 with dimensions 1x3x640x640
|
294 |
+
[01/04/2024-14:55:17] [I] Using random values for output graph2_flat_predictions
|
295 |
+
[01/04/2024-14:55:17] [I] Created output binding for graph2_flat_predictions with dimensions -1x57
|
296 |
+
[01/04/2024-14:55:17] [I] Starting inference
|
297 |
+
[01/04/2024-14:55:32] [I] Warmup completed 11 queries over 200 ms
|
298 |
+
[01/04/2024-14:55:32] [I] Timing trace has 1108 queries over 15.0289 s
|
299 |
+
[01/04/2024-14:55:32] [I]
|
300 |
+
[01/04/2024-14:55:32] [I] === Trace details ===
|
301 |
+
[01/04/2024-14:55:32] [I] Trace averages of 100 runs:
|
302 |
+
[01/04/2024-14:55:32] [I] Average on 100 runs - GPU latency: 13.2196 ms - Host latency: 13.3319 ms (enqueue 13.2998 ms)
|
303 |
+
[01/04/2024-14:55:32] [I] Average on 100 runs - GPU latency: 13.3938 ms - Host latency: 13.5071 ms (enqueue 13.4684 ms)
|
304 |
+
[01/04/2024-14:55:32] [I] Average on 100 runs - GPU latency: 13.2194 ms - Host latency: 13.3318 ms (enqueue 13.2975 ms)
|
305 |
+
[01/04/2024-14:55:32] [I] Average on 100 runs - GPU latency: 13.2789 ms - Host latency: 13.3906 ms (enqueue 13.3589 ms)
|
306 |
+
[01/04/2024-14:55:32] [I] Average on 100 runs - GPU latency: 13.6773 ms - Host latency: 13.7933 ms (enqueue 13.7451 ms)
|
307 |
+
[01/04/2024-14:55:32] [I] Average on 100 runs - GPU latency: 13.7471 ms - Host latency: 13.8643 ms (enqueue 13.8164 ms)
|
308 |
+
[01/04/2024-14:55:32] [I] Average on 100 runs - GPU latency: 13.5782 ms - Host latency: 13.6935 ms (enqueue 13.6539 ms)
|
309 |
+
[01/04/2024-14:55:32] [I] Average on 100 runs - GPU latency: 13.3669 ms - Host latency: 13.4813 ms (enqueue 13.4435 ms)
|
310 |
+
[01/04/2024-14:55:32] [I] Average on 100 runs - GPU latency: 13.2753 ms - Host latency: 13.3888 ms (enqueue 13.3504 ms)
|
311 |
+
[01/04/2024-14:55:32] [I] Average on 100 runs - GPU latency: 13.3138 ms - Host latency: 13.4258 ms (enqueue 13.3842 ms)
|
312 |
+
[01/04/2024-14:55:32] [I] Average on 100 runs - GPU latency: 13.511 ms - Host latency: 13.6262 ms (enqueue 13.5945 ms)
|
313 |
+
[01/04/2024-14:55:32] [I]
|
314 |
+
[01/04/2024-14:55:32] [I] === Performance summary ===
|
315 |
+
[01/04/2024-14:55:32] [I] Throughput: 73.7247 qps
|
316 |
+
[01/04/2024-14:55:32] [I] Latency: min = 12.3433 ms, max = 18.3281 ms, mean = 13.5316 ms, median = 13.4639 ms, percentile(90%) = 14.1807 ms, percentile(95%) = 14.3574 ms, percentile(99%) = 15.7993 ms
|
317 |
+
[01/04/2024-14:55:32] [I] Enqueue Time: min = 12.313 ms, max = 18.2793 ms, mean = 13.4932 ms, median = 13.4253 ms, percentile(90%) = 14.1348 ms, percentile(95%) = 14.3091 ms, percentile(99%) = 15.7402 ms
|
318 |
+
[01/04/2024-14:55:32] [I] H2D Latency: min = 0.0810547 ms, max = 0.114258 ms, mean = 0.0982483 ms, median = 0.0986328 ms, percentile(90%) = 0.0998535 ms, percentile(95%) = 0.100586 ms, percentile(99%) = 0.102539 ms
|
319 |
+
[01/04/2024-14:55:32] [I] GPU Compute Time: min = 12.2305 ms, max = 18.21 ms, mean = 13.4177 ms, median = 13.3489 ms, percentile(90%) = 14.0645 ms, percentile(95%) = 14.2349 ms, percentile(99%) = 15.7002 ms
|
320 |
+
[01/04/2024-14:55:32] [I] D2H Latency: min = 0.00292969 ms, max = 0.0498047 ms, mean = 0.0156769 ms, median = 0.0146484 ms, percentile(90%) = 0.0205078 ms, percentile(95%) = 0.0230713 ms, percentile(99%) = 0.03125 ms
|
321 |
+
[01/04/2024-14:55:32] [I] Total Host Walltime: 15.0289 s
|
322 |
+
[01/04/2024-14:55:32] [I] Total GPU Compute Time: 14.8668 s
|
323 |
+
[01/04/2024-14:55:32] [I] Explanations of the performance metrics are printed in the verbose logs.
|
324 |
+
[01/04/2024-14:55:32] [I]
|
325 |
&&&& PASSED TensorRT.trtexec [TensorRT v8502] # /usr/src/tensorrt/bin/trtexec --onnx=yolo_nas_pose_n_int8.onnx --int8 --avgRuns=100 --duration=15 --saveEngine=yolo_nas_pose_n_int8.onnx.int8.engine
|
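The performance summary in the log above can be cross-checked with a few lines of Python. This is only a sketch: the three log lines are copied from the summary with their timestamps dropped, and the helper name `parse_summary` and its regular expressions are illustrative assumptions, not part of trtexec. It shows that the reported throughput agrees with the query count divided by the host wall time.

```python
import re

# Lines copied from the performance summary above (timestamps dropped).
LOG = """\
[I] Timing trace has 1108 queries over 15.0289 s
[I] Throughput: 73.7247 qps
[I] Latency: min = 12.3433 ms, max = 18.3281 ms, mean = 13.5316 ms, median = 13.4639 ms
"""


def parse_summary(log: str) -> dict:
    """Extract query count, wall time, reported throughput and mean latency.

    Hypothetical helper: the patterns only match the line shapes shown above.
    """
    trace = re.search(r"Timing trace has (\d+) queries over ([\d.]+) s", log)
    qps = re.search(r"Throughput: ([\d.]+) qps", log)
    mean = re.search(r"Latency: .*?mean = ([\d.]+) ms", log)
    return {
        "queries": int(trace.group(1)),
        "walltime_s": float(trace.group(2)),
        "reported_qps": float(qps.group(1)),
        "mean_latency_ms": float(mean.group(1)),
    }


summary = parse_summary(LOG)
# Throughput derived from the timing trace: queries per second of wall time.
derived_qps = summary["queries"] / summary["walltime_s"]
print(f"reported {summary['reported_qps']:.4f} qps, derived {derived_qps:.4f} qps")
```

The same parser could be pointed at each `.log` file in this commit to compare the `--int8` and `--best` builds, assuming their summaries keep this line format.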
yolo_nas_pose_s_int8.onnx.best.engine
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
-
size
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:24cb45f2e6bbc0f6183888eae7fa29fbe79918082902ae47814c721c3f57bb68
|
3 |
+
size 18052506
|
yolo_nas_pose_s_int8.onnx.best.engine.err
CHANGED
@@ -1,7 +1,7 @@
|
|
1-7 |
-
[ (7 removed lines; their content is truncated in this diff view)
|
|
|
1 |
+
[01/04/2024-14:55:38] [W] [TRT] onnx2trt_utils.cpp:375: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
|
2 |
+
[01/04/2024-14:55:38] [W] [TRT] onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
|
3 |
+
[01/04/2024-14:55:41] [W] [TRT] Calibrator won't be used in explicit precision mode. Use quantization aware training to generate network with Quantize/Dequantize nodes.
|
4 |
+
[01/04/2024-15:28:09] [W] * Throughput may be bound by Enqueue Time rather than GPU Compute and the GPU may be under-utilized.
|
5 |
+
[01/04/2024-15:28:10] [W] If not already in use, --useCudaGraph (utilize CUDA graphs where possible) may increase the throughput.
|
6 |
+
[01/04/2024-15:28:10] [W] * GPU compute time is unstable, with coefficient of variance = 4.19166%.
|
7 |
+
[01/04/2024-15:28:10] [W] If not already in use, locking GPU clock frequency or adding --useSpinWait may improve the stability.
|
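The stability warning above reports a "coefficient of variance" for GPU compute time. A minimal sketch of that kind of metric follows, assuming the usual definition (standard deviation divided by the mean, as a percentage); the log does not show how trtexec computes it internally, and the sample values are hypothetical rather than taken from this run.

```python
import statistics


def coefficient_of_variation(samples):
    """Return the population standard deviation as a percentage of the mean."""
    mean = statistics.fmean(samples)
    return statistics.pstdev(samples) / mean * 100.0


# Hypothetical per-query GPU compute times in ms (not taken from the log).
gpu_times_ms = [12.0, 13.0, 14.0]
cv = coefficient_of_variation(gpu_times_ms)
print(f"coefficient of variation = {cv:.4f}%")
```

A lower percentage means more repeatable timings, which is what the warning's suggestions (locking the GPU clock frequency or adding `--useSpinWait`) are meant to achieve.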
yolo_nas_pose_s_int8.onnx.best.engine.log
CHANGED
@@ -1,323 +1,324 @@
|
|
1 |
&&&& RUNNING TensorRT.trtexec [TensorRT v8502] # /usr/src/tensorrt/bin/trtexec --onnx=yolo_nas_pose_s_int8.onnx --best --avgRuns=100 --duration=15 --saveEngine=yolo_nas_pose_s_int8.onnx.best.engine
|
2-322 |
-
[ (321 removed lines; their content is truncated in this diff view)
|
|
|
323 |
&&&& PASSED TensorRT.trtexec [TensorRT v8502] # /usr/src/tensorrt/bin/trtexec --onnx=yolo_nas_pose_s_int8.onnx --best --avgRuns=100 --duration=15 --saveEngine=yolo_nas_pose_s_int8.onnx.best.engine
|
|
|
1 |
&&&& RUNNING TensorRT.trtexec [TensorRT v8502] # /usr/src/tensorrt/bin/trtexec --onnx=yolo_nas_pose_s_int8.onnx --best --avgRuns=100 --duration=15 --saveEngine=yolo_nas_pose_s_int8.onnx.best.engine
|
2 |
+
[01/04/2024-14:55:33] [I] === Model Options ===
|
3 |
+
[01/04/2024-14:55:33] [I] Format: ONNX
|
4 |
+
[01/04/2024-14:55:33] [I] Model: yolo_nas_pose_s_int8.onnx
|
5 |
+
[01/04/2024-14:55:33] [I] Output:
|
6 |
+
[01/04/2024-14:55:33] [I] === Build Options ===
|
7 |
+
[01/04/2024-14:55:33] [I] Max batch: explicit batch
|
8 |
+
[01/04/2024-14:55:33] [I] Memory Pools: workspace: default, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default
|
9 |
+
[01/04/2024-14:55:33] [I] minTiming: 1
|
10 |
+
[01/04/2024-14:55:33] [I] avgTiming: 8
|
11 |
+
[01/04/2024-14:55:33] [I] Precision: FP32+FP16+INT8
|
12 |
+
[01/04/2024-14:55:33] [I] LayerPrecisions:
|
13 |
+
[01/04/2024-14:55:33] [I] Calibration: Dynamic
|
14 |
+
[01/04/2024-14:55:33] [I] Refit: Disabled
|
15 |
+
[01/04/2024-14:55:33] [I] Sparsity: Disabled
|
16 |
+
[01/04/2024-14:55:33] [I] Safe mode: Disabled
|
17 |
+
[01/04/2024-14:55:33] [I] DirectIO mode: Disabled
|
18 |
+
[01/04/2024-14:55:33] [I] Restricted mode: Disabled
|
19 |
+
[01/04/2024-14:55:33] [I] Build only: Disabled
|
20 |
+
[01/04/2024-14:55:33] [I] Save engine: yolo_nas_pose_s_int8.onnx.best.engine
|
21 |
+
[01/04/2024-14:55:33] [I] Load engine:
|
22 |
+
[01/04/2024-14:55:33] [I] Profiling verbosity: 0
|
23 |
+
[01/04/2024-14:55:33] [I] Tactic sources: Using default tactic sources
|
24 |
+
[01/04/2024-14:55:33] [I] timingCacheMode: local
|
25 |
+
[01/04/2024-14:55:33] [I] timingCacheFile:
|
26 |
+
[01/04/2024-14:55:33] [I] Heuristic: Disabled
|
27 |
+
[01/04/2024-14:55:33] [I] Preview Features: Use default preview flags.
|
28 |
+
[01/04/2024-14:55:33] [I] Input(s)s format: fp32:CHW
|
29 |
+
[01/04/2024-14:55:33] [I] Output(s)s format: fp32:CHW
|
30 |
+
[01/04/2024-14:55:33] [I] Input build shapes: model
|
31 |
+
[01/04/2024-14:55:33] [I] Input calibration shapes: model
|
32 |
+
[01/04/2024-14:55:33] [I] === System Options ===
|
33 |
+
[01/04/2024-14:55:33] [I] Device: 0
|
34 |
+
[01/04/2024-14:55:33] [I] DLACore:
|
35 |
+
[01/04/2024-14:55:33] [I] Plugins:
|
36 |
+
[01/04/2024-14:55:33] [I] === Inference Options ===
|
37 |
+
[01/04/2024-14:55:33] [I] Batch: Explicit
|
38 |
+
[01/04/2024-14:55:33] [I] Input inference shapes: model
|
39 |
+
[01/04/2024-14:55:33] [I] Iterations: 10
|
40 |
+
[01/04/2024-14:55:33] [I] Duration: 15s (+ 200ms warm up)
|
41 |
+
[01/04/2024-14:55:33] [I] Sleep time: 0ms
|
42 |
+
[01/04/2024-14:55:33] [I] Idle time: 0ms
|
43 |
+
[01/04/2024-14:55:33] [I] Streams: 1
|
44 |
+
[01/04/2024-14:55:33] [I] ExposeDMA: Disabled
|
45 |
+
[01/04/2024-14:55:33] [I] Data transfers: Enabled
|
46 |
+
[01/04/2024-14:55:33] [I] Spin-wait: Disabled
|
47 |
+
[01/04/2024-14:55:33] [I] Multithreading: Disabled
|
48 |
+
[01/04/2024-14:55:33] [I] CUDA Graph: Disabled
|
49 |
+
[01/04/2024-14:55:33] [I] Separate profiling: Disabled
|
50 |
+
[01/04/2024-14:55:33] [I] Time Deserialize: Disabled
|
51 |
+
[01/04/2024-14:55:33] [I] Time Refit: Disabled
|
52 |
+
[01/04/2024-14:55:33] [I] NVTX verbosity: 0
|
53 |
+
[01/04/2024-14:55:33] [I] Persistent Cache Ratio: 0
|
54 |
+
[01/04/2024-14:55:33] [I] Inputs:
|
55 |
+
[01/04/2024-14:55:33] [I] === Reporting Options ===
|
56 |
+
[01/04/2024-14:55:33] [I] Verbose: Disabled
|
57 |
+
[01/04/2024-14:55:33] [I] Averages: 100 inferences
|
58 |
+
[01/04/2024-14:55:33] [I] Percentiles: 90,95,99
|
59 |
+
[01/04/2024-14:55:33] [I] Dump refittable layers:Disabled
|
60 |
+
[01/04/2024-14:55:33] [I] Dump output: Disabled
|
61 |
+
[01/04/2024-14:55:33] [I] Profile: Disabled
|
62 |
+
[01/04/2024-14:55:33] [I] Export timing to JSON file:
|
63 |
+
[01/04/2024-14:55:33] [I] Export output to JSON file:
|
64 |
+
[01/04/2024-14:55:33] [I] Export profile to JSON file:
|
65 |
+
[01/04/2024-14:55:33] [I]
|
66 |
+
[01/04/2024-14:55:33] [I] === Device Information ===
|
67 |
+
[01/04/2024-14:55:33] [I] Selected Device: Orin
|
68 |
+
[01/04/2024-14:55:33] [I] Compute Capability: 8.7
|
69 |
+
[01/04/2024-14:55:33] [I] SMs: 8
|
70 |
+
[01/04/2024-14:55:33] [I] Compute Clock Rate: 0.624 GHz
|
71 |
+
[01/04/2024-14:55:33] [I] Device Global Memory: 7471 MiB
|
72 |
+
[01/04/2024-14:55:33] [I] Shared Memory per SM: 164 KiB
|
73 |
+
[01/04/2024-14:55:33] [I] Memory Bus Width: 128 bits (ECC disabled)
|
74 |
+
[01/04/2024-14:55:33] [I] Memory Clock Rate: 0.624 GHz
|
75 |
+
[01/04/2024-14:55:33] [I]
|
76 |
+
[01/04/2024-14:55:33] [I] TensorRT version: 8.5.2
|
77 |
+
[01/04/2024-14:55:34] [I] [TRT] [MemUsageChange] Init CUDA: CPU +220, GPU +0, now: CPU 249, GPU 3779 (MiB)
|
78 |
+
[01/04/2024-14:55:37] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +302, GPU +284, now: CPU 574, GPU 4083 (MiB)
|
79 |
+
[01/04/2024-14:55:37] [I] Start parsing network model
|
80 |
+
[01/04/2024-14:55:38] [I] [TRT] ----------------------------------------------------------------
|
81 |
+
[01/04/2024-14:55:38] [I] [TRT] Input filename: yolo_nas_pose_s_int8.onnx
|
82 |
+
[01/04/2024-14:55:38] [I] [TRT] ONNX IR version: 0.0.8
|
83 |
+
[01/04/2024-14:55:38] [I] [TRT] Opset version: 17
|
84 |
+
[01/04/2024-14:55:38] [I] [TRT] Producer name: pytorch
|
85 |
+
[01/04/2024-14:55:38] [I] [TRT] Producer version: 2.1.2
|
86 |
+
[01/04/2024-14:55:38] [I] [TRT] Domain:
|
87 |
+
[01/04/2024-14:55:38] [I] [TRT] Model version: 0
|
88 |
+
[01/04/2024-14:55:38] [I] [TRT] Doc string:
|
89 |
+
[01/04/2024-14:55:38] [I] [TRT] ----------------------------------------------------------------
|
90 |
+
[01/04/2024-14:55:41] [I] Finish parsing network model
|
91 |
+
[01/04/2024-14:55:45] [I] [TRT] ---------- Layers Running on DLA ----------
|
92 |
+
[01/04/2024-14:55:45] [I] [TRT] ---------- Layers Running on GPU ----------
|
93 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] TRAIN_STATION: [trainStation1]
|
94 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] MYELIN: {ForeignNode[/pre_process/pre_process.0/Cast.../pre_process/pre_process.2/Mul]}
|
95 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONSTANT: (Unnamed Layer* 1229) [Constant]
|
96 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONSTANT: (Unnamed Layer* 1230) [Constant]
|
97 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONSTANT: (Unnamed Layer* 1231) [Constant]
|
98 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] COPY: /model/backbone/stem/conv/rbr_reparam/_input_quantizer/QuantizeLinear
|
99 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stem.conv.rbr_reparam.weight + /model/backbone/stem/conv/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stem/conv/rbr_reparam/Conv
|
100 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage1.downsample.rbr_reparam.weight + /model/backbone/stage1/downsample/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage1/downsample/rbr_reparam/Conv
|
101 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage1.blocks.conv2.conv.weight + /model/backbone/stage1/blocks/conv2/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage1/blocks/conv2/conv/Conv
|
102 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage1.blocks.conv1.conv.weight + /model/backbone/stage1/blocks/conv1/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage1/blocks/conv1/conv/Conv
|
103 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
104 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage1.blocks.bottlenecks.0.cv1.rbr_reparam.weight + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/Conv
|
105 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage1.blocks.bottlenecks.0.cv2.rbr_reparam.weight + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/Conv
|
106 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage1.blocks.bottlenecks.0.alpha + (Unnamed Layer* 494) [Shuffle] + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/Mul, /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/Add)
|
107 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
108 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage1.blocks.bottlenecks.1.cv1.rbr_reparam.weight + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/Conv
|
109 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage1.blocks.bottlenecks.1.cv2.rbr_reparam.weight + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/Conv
|
110 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage1.blocks.bottlenecks.1.alpha + (Unnamed Layer* 510) [Shuffle] + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/Mul, /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/Add)
|
111 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage1.blocks.conv3.conv.weight + /model/backbone/stage1/blocks/conv3/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage1/blocks/conv3/conv/Conv
|
112 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.reduce_skip2.conv.weight + /model/neck/neck2/reduce_skip2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck2/reduce_skip2/conv/Conv
|
113 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.downsample.rbr_reparam.weight + /model/backbone/stage2/downsample/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/downsample/rbr_reparam/Conv
|
114 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.downsample.conv.weight + /model/neck/neck2/downsample/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck2/downsample/conv/Conv
|
115 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.conv2.conv.weight + /model/backbone/stage2/blocks/conv2/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/conv2/conv/Conv
|
116 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.conv1.conv.weight + /model/backbone/stage2/blocks/conv1/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/conv1/conv/Conv
|
117 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
118 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.bottlenecks.0.cv1.rbr_reparam.weight + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/Conv
|
119 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.bottlenecks.0.cv2.rbr_reparam.weight + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/Conv
|
120 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage2.blocks.bottlenecks.0.alpha + (Unnamed Layer* 557) [Shuffle] + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/Mul, /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/Add)
|
121 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
122 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.bottlenecks.1.cv1.rbr_reparam.weight + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/Conv
|
123 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.bottlenecks.1.cv2.rbr_reparam.weight + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/Conv
|
124 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage2.blocks.bottlenecks.1.alpha + (Unnamed Layer* 573) [Shuffle] + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/Mul, /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/Add)
|
125 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
126 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.bottlenecks.2.cv1.rbr_reparam.weight + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/cv1/rbr_reparam/Conv
|
127 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.bottlenecks.2.cv2.rbr_reparam.weight + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/cv2/rbr_reparam/Conv
|
128 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage2.blocks.bottlenecks.2.alpha + (Unnamed Layer* 589) [Shuffle] + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/Mul, /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/Add)
|
129 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.conv3.conv.weight + /model/backbone/stage2/blocks/conv3/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/conv3/conv/Conv
|
130 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.reduce_skip2.conv.weight + /model/neck/neck1/reduce_skip2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck1/reduce_skip2/conv/Conv || model.neck.neck2.reduce_skip1.conv.weight + /model/neck/neck2/reduce_skip1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck2/reduce_skip1/conv/Conv
|
131 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.downsample.rbr_reparam.weight + /model/backbone/stage3/downsample/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/downsample/rbr_reparam/Conv
|
132 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.downsample.conv.weight + /model/neck/neck1/downsample/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck1/downsample/conv/Conv
|
133 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.conv2.conv.weight + /model/backbone/stage3/blocks/conv2/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/conv2/conv/Conv
|
134 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.conv1.conv.weight + /model/backbone/stage3/blocks/conv1/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/conv1/conv/Conv
|
135 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
136 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.0.cv1.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/Conv
|
137 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.0.cv2.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/Conv
|
138 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage3.blocks.bottlenecks.0.alpha + (Unnamed Layer* 639) [Shuffle] + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/Mul, /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/Add)
|
139 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
140 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.1.cv1.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/Conv
|
141 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.1.cv2.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/Conv
|
142 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage3.blocks.bottlenecks.1.alpha + (Unnamed Layer* 655) [Shuffle] + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/Mul, /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/Add)
|
143 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
144 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.2.cv1.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/cv1/rbr_reparam/Conv
|
145 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.2.cv2.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/cv2/rbr_reparam/Conv
|
146 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage3.blocks.bottlenecks.2.alpha + (Unnamed Layer* 671) [Shuffle] + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/Mul, /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/Add)
|
147 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
148 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.3.cv1.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/cv1/rbr_reparam/Conv
|
149 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.3.cv2.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/cv2/rbr_reparam/Conv
|
150 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage3.blocks.bottlenecks.3.alpha + (Unnamed Layer* 687) [Shuffle] + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/Mul, /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/Add)
|
151 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage3/blocks/bottlenecks/bottlenecks.4/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
152 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.4.cv1.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.4/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.4/cv1/rbr_reparam/Conv
|
153 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.4.cv2.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.4/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.4/cv2/rbr_reparam/Conv
|
154 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage3.blocks.bottlenecks.4.alpha + (Unnamed Layer* 703) [Shuffle] + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.4/Mul, /model/backbone/stage3/blocks/bottlenecks/bottlenecks.4/Add)
|
155 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.conv3.conv.weight + /model/backbone/stage3/blocks/conv3/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/conv3/conv/Conv
|
156 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.reduce_skip1.conv.weight + /model/neck/neck1/reduce_skip1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck1/reduce_skip1/conv/Conv
|
157 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage4.downsample.rbr_reparam.weight + /model/backbone/stage4/downsample/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage4/downsample/rbr_reparam/Conv
|
158 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage4.blocks.conv2.conv.weight + /model/backbone/stage4/blocks/conv2/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage4/blocks/conv2/conv/Conv
|
159 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage4.blocks.conv1.conv.weight + /model/backbone/stage4/blocks/conv1/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage4/blocks/conv1/conv/Conv
|
160 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
161 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage4.blocks.bottlenecks.0.cv1.rbr_reparam.weight + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/Conv
|
162 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage4.blocks.bottlenecks.0.cv2.rbr_reparam.weight + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/Conv
|
163 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage4.blocks.bottlenecks.0.alpha + (Unnamed Layer* 744) [Shuffle] + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/Mul, /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/Add)
|
164 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
165 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage4.blocks.bottlenecks.1.cv1.rbr_reparam.weight + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/Conv
|
166 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage4.blocks.bottlenecks.1.cv2.rbr_reparam.weight + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/Conv
|
167 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage4.blocks.bottlenecks.1.alpha + (Unnamed Layer* 760) [Shuffle] + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/Mul, /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/Add)
|
168 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage4.blocks.conv3.conv.weight + /model/backbone/stage4/blocks/conv3/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage4/blocks/conv3/conv/Conv
|
169 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.context_module.cv1.conv.weight + /model/backbone/context_module/cv1/conv/_weight_quantizer/QuantizeLinear + /model/backbone/context_module/cv1/conv/Conv
|
170 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] POOLING: /model/backbone/context_module/m.2/MaxPool
|
171 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] POOLING: /model/backbone/context_module/m.1/MaxPool
|
172 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] POOLING: /model/backbone/context_module/m.0/MaxPool
|
173 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] COPY: /model/backbone/context_module/m.2/MaxPool_output_0 copy
|
174 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.context_module.cv2.conv.weight + /model/backbone/context_module/cv2/conv/_weight_quantizer/QuantizeLinear + /model/backbone/context_module/cv2/conv/Conv
|
175 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.conv.conv.weight + /model/neck/neck1/conv/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck1/conv/conv/Conv
|
176 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] COPY: /model/neck/neck1/upsample/_input_quantizer/QuantizeLinear
|
177 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] DECONVOLUTION: model.neck.neck1.upsample.weight + /model/neck/neck1/upsample/_weight_quantizer/QuantizeLinear + /model/neck/neck1/upsample/ConvTranspose
|
178 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.reduce_after_concat.conv.weight + /model/neck/neck1/reduce_after_concat/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck1/reduce_after_concat/conv/Conv
|
179 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.blocks.conv2.conv.weight + /model/neck/neck1/blocks/conv2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck1/blocks/conv2/conv/Conv
|
180 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.blocks.conv1.conv.weight + /model/neck/neck1/blocks/conv1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck1/blocks/conv1/conv/Conv
|
181 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] COPY: /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
182 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.blocks.bottlenecks.0.cv1.rbr_reparam.weight + /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/Conv
|
183 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.blocks.bottlenecks.0.cv2.rbr_reparam.weight + /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/Conv
|
184 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck1.blocks.bottlenecks.0.alpha + (Unnamed Layer* 825) [Shuffle] + /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/Mul, /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/Add)
|
185 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] COPY: /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
186 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.blocks.bottlenecks.1.cv1.rbr_reparam.weight + /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/Conv
|
187 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.blocks.bottlenecks.1.cv2.rbr_reparam.weight + /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/Conv
|
188 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck1.blocks.bottlenecks.1.alpha + (Unnamed Layer* 841) [Shuffle] + /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/Mul, /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/Add)
|
189 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.blocks.conv3.conv.weight + /model/neck/neck1/blocks/conv3/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck1/blocks/conv3/conv/Conv
|
190 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.conv.conv.weight + /model/neck/neck2/conv/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck2/conv/conv/Conv
|
191 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] COPY: /model/neck/neck2/upsample/_input_quantizer/QuantizeLinear
|
192 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] DECONVOLUTION: model.neck.neck2.upsample.weight + /model/neck/neck2/upsample/_weight_quantizer/QuantizeLinear + /model/neck/neck2/upsample/ConvTranspose
|
193 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] COPY: /model/neck/neck2/Concat_/model/neck/neck2/reduce_skip1/act/Relu_output_0_clone_1 copy
|
194 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.reduce_after_concat.conv.weight + /model/neck/neck2/reduce_after_concat/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck2/reduce_after_concat/conv/Conv
|
195 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.blocks.conv2.conv.weight + /model/neck/neck2/blocks/conv2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck2/blocks/conv2/conv/Conv
|
196 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.blocks.conv1.conv.weight + /model/neck/neck2/blocks/conv1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck2/blocks/conv1/conv/Conv
|
197 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] COPY: /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
198 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.blocks.bottlenecks.0.cv1.rbr_reparam.weight + /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/Conv
|
199 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.blocks.bottlenecks.0.cv2.rbr_reparam.weight + /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/Conv
|
200 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck2.blocks.bottlenecks.0.alpha + (Unnamed Layer* 890) [Shuffle] + /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/Mul, /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/Add)
|
201 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] COPY: /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
202 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.blocks.bottlenecks.1.cv1.rbr_reparam.weight + /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/Conv
|
203 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.blocks.bottlenecks.1.cv2.rbr_reparam.weight + /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/Conv
|
204 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck2.blocks.bottlenecks.1.alpha + (Unnamed Layer* 906) [Shuffle] + /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/Mul, /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/Add)
|
205 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] COPY: /model/neck/neck2/blocks/Concat_/model/neck/neck2/blocks/bottlenecks/bottlenecks.1/Add_output_0_clone_0 copy
|
206 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.blocks.conv3.conv.weight + /model/neck/neck2/blocks/conv3/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck2/blocks/conv3/conv/Conv
|
207 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head1.bbox_stem.seq.conv.weight + /model/heads/head1/bbox_stem/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head1/bbox_stem/seq/conv/Conv || model.heads.head1.pose_stem.seq.conv.weight + /model/heads/head1/pose_stem/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head1/pose_stem/seq/conv/Conv
|
208 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck3.conv.conv.weight + /model/neck/neck3/conv/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck3/conv/conv/Conv
|
209 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head1.reg_convs.0.seq.conv.weight + /model/heads/head1/reg_convs/reg_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head1/reg_convs/reg_convs.0/seq/conv/Conv || model.heads.head1.cls_convs.0.seq.conv.weight + /model/heads/head1/cls_convs/cls_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head1/cls_convs/cls_convs.0/seq/conv/Conv
|
210 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head1.pose_convs.0.seq.conv.weight + /model/heads/head1/pose_convs/pose_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head1/pose_convs/pose_convs.0/seq/conv/Conv
|
211 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] COPY: /model/neck/neck3/blocks/conv1/conv/_input_quantizer/QuantizeLinear_clone_1
|
212 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head1.cls_pred.weight + /model/heads/head1/cls_pred/_weight_quantizer/QuantizeLinear + /model/heads/head1/cls_pred/Conv
|
213 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head1.reg_pred.weight + /model/heads/head1/reg_pred/_weight_quantizer/QuantizeLinear + /model/heads/head1/reg_pred/Conv
|
214 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head1.pose_convs.1.seq.conv.weight + /model/heads/head1/pose_convs/pose_convs.1/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head1/pose_convs/pose_convs.1/seq/conv/Conv
|
215 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck3.blocks.conv2.conv.weight + /model/neck/neck3/blocks/conv2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck3/blocks/conv2/conv/Conv
|
216 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck3.blocks.conv1.conv.weight + /model/neck/neck3/blocks/conv1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck3/blocks/conv1/conv/Conv
|
217 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] SHUFFLE: /model/heads/Reshape + /model/heads/Transpose
|
218 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head1.pose_pred.weight + /model/heads/head1/pose_pred/_weight_quantizer/QuantizeLinear + /model/heads/head1/pose_pred/Conv
|
219 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] COPY: /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/cv1/conv/_input_quantizer/QuantizeLinear
|
220 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] SOFTMAX: /model/heads/Softmax
|
221 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck3.blocks.bottlenecks.0.cv1.conv.weight + /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/cv1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/cv1/conv/Conv
|
222 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/Conv
|
223 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck3.blocks.bottlenecks.0.cv2.conv.weight + /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/cv2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/cv2/conv/Conv
|
224 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck3.blocks.bottlenecks.0.alpha + (Unnamed Layer* 972) [Shuffle] + /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/Mul, /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/Add)
|
225 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] COPY: /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/cv1/conv/_input_quantizer/QuantizeLinear
|
226 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck3.blocks.bottlenecks.1.cv1.conv.weight + /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/cv1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/cv1/conv/Conv
|
227 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck3.blocks.bottlenecks.1.cv2.conv.weight + /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/cv2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/cv2/conv/Conv
|
228 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck3.blocks.bottlenecks.1.alpha + (Unnamed Layer* 1013) [Shuffle] + /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/Mul, /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/Add)
|
229 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck3.blocks.conv3.conv.weight + /model/neck/neck3/blocks/conv3/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck3/blocks/conv3/conv/Conv
|
230 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head2.pose_stem.seq.conv.weight + /model/heads/head2/pose_stem/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head2/pose_stem/seq/conv/Conv || model.heads.head2.bbox_stem.seq.conv.weight + /model/heads/head2/bbox_stem/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head2/bbox_stem/seq/conv/Conv
|
231 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck4.conv.conv.weight + /model/neck/neck4/conv/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck4/conv/conv/Conv
|
232 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head2.reg_convs.0.seq.conv.weight + /model/heads/head2/reg_convs/reg_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head2/reg_convs/reg_convs.0/seq/conv/Conv || model.heads.head2.cls_convs.0.seq.conv.weight + /model/heads/head2/cls_convs/cls_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head2/cls_convs/cls_convs.0/seq/conv/Conv
|
233 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head2.pose_convs.0.seq.conv.weight + /model/heads/head2/pose_convs/pose_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head2/pose_convs/pose_convs.0/seq/conv/Conv
|
234 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] COPY: /model/neck/neck4/blocks/conv1/conv/_input_quantizer/QuantizeLinear_clone_1
|
235 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head2.cls_pred.weight + /model/heads/head2/cls_pred/_weight_quantizer/QuantizeLinear + /model/heads/head2/cls_pred/Conv
|
236 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head2.reg_pred.weight + /model/heads/head2/reg_pred/_weight_quantizer/QuantizeLinear + /model/heads/head2/reg_pred/Conv
|
237 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head2.pose_convs.1.seq.conv.weight + /model/heads/head2/pose_convs/pose_convs.1/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head2/pose_convs/pose_convs.1/seq/conv/Conv
|
238 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck4.blocks.conv2.conv.weight + /model/neck/neck4/blocks/conv2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck4/blocks/conv2/conv/Conv
|
239 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck4.blocks.conv1.conv.weight + /model/neck/neck4/blocks/conv1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck4/blocks/conv1/conv/Conv
|
240 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] SHUFFLE: /model/heads/Reshape_4 + /model/heads/Transpose_3
|
241 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head2.pose_pred.weight + /model/heads/head2/pose_pred/_weight_quantizer/QuantizeLinear + /model/heads/head2/pose_pred/Conv
|
242 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] COPY: /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/cv1/conv/_input_quantizer/QuantizeLinear
|
243 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] SOFTMAX: /model/heads/Softmax_1
|
244 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck4.blocks.bottlenecks.0.cv1.conv.weight + /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/cv1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/cv1/conv/Conv
|
245 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/Conv_1
|
246 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck4.blocks.bottlenecks.0.cv2.conv.weight + /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/cv2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/cv2/conv/Conv
|
247 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck4.blocks.bottlenecks.0.alpha + (Unnamed Layer* 1079) [Shuffle] + /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/Mul, /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/Add)
|
248 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] COPY: /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/cv1/conv/_input_quantizer/QuantizeLinear
|
249 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck4.blocks.bottlenecks.1.cv1.conv.weight + /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/cv1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/cv1/conv/Conv
|
250 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck4.blocks.bottlenecks.1.cv2.conv.weight + /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/cv2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/cv2/conv/Conv
|
251 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck4.blocks.bottlenecks.1.alpha + (Unnamed Layer* 1120) [Shuffle] + /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/Mul, /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/Add)
|
252 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck4.blocks.conv3.conv.weight + /model/neck/neck4/blocks/conv3/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck4/blocks/conv3/conv/Conv
|
253 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head3.bbox_stem.seq.conv.weight + /model/heads/head3/bbox_stem/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head3/bbox_stem/seq/conv/Conv || model.heads.head3.pose_stem.seq.conv.weight + /model/heads/head3/pose_stem/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head3/pose_stem/seq/conv/Conv
|
254 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head3.reg_convs.0.seq.conv.weight + /model/heads/head3/reg_convs/reg_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head3/reg_convs/reg_convs.0/seq/conv/Conv || model.heads.head3.cls_convs.0.seq.conv.weight + /model/heads/head3/cls_convs/cls_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head3/cls_convs/cls_convs.0/seq/conv/Conv
|
255 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head3.pose_convs.0.seq.conv.weight + /model/heads/head3/pose_convs/pose_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head3/pose_convs/pose_convs.0/seq/conv/Conv
|
256 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head3.cls_pred.weight + /model/heads/head3/cls_pred/_weight_quantizer/QuantizeLinear + /model/heads/head3/cls_pred/Conv
|
257 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head3.reg_pred.weight + /model/heads/head3/reg_pred/_weight_quantizer/QuantizeLinear + /model/heads/head3/reg_pred/Conv
|
258 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head3.pose_convs.1.seq.conv.weight + /model/heads/head3/pose_convs/pose_convs.1/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head3/pose_convs/pose_convs.1/seq/conv/Conv
|
259 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] SHUFFLE: /model/heads/Reshape_8 + /model/heads/Transpose_6
|
260 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head3.pose_convs.2.seq.conv.weight + /model/heads/head3/pose_convs/pose_convs.2/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head3/pose_convs/pose_convs.2/seq/conv/Conv
|
261 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] SOFTMAX: /model/heads/Softmax_2
|
262 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head3.pose_pred.weight + /model/heads/head3/pose_pred/_weight_quantizer/QuantizeLinear + /model/heads/head3/pose_pred/Conv
|
263 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/Conv_2
|
264 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] MYELIN: {ForeignNode[/model/heads/head1/Slice_1.../post_process/Reshape_2]}
|
265 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] NMS: batched_nms_243
|
266 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] DEVICE_TO_SHAPE_HOST: (Unnamed Layer* 1233) [NMS]_1_output[DevicetoShapeHostCopy]
|
267 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] TRAIN_STATION: [trainStation2]
|
268 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] MYELIN: {ForeignNode[/model/heads/head1/Slice...graph2_/Concat_5]}
|
269 |
+
[01/04/2024-14:55:45] [I] [TRT] [GpuLayer] TRAIN_STATION: [trainStation3]
|
270 |
+
[01/04/2024-14:55:46] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +534, GPU +459, now: CPU 1233, GPU 4661 (MiB)
|
271 |
+
[01/04/2024-14:55:46] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +82, GPU +38, now: CPU 1315, GPU 4699 (MiB)
|
272 |
+
[01/04/2024-14:55:46] [I] [TRT] Local timing cache in use. Profiling results in this builder pass will not be stored.
|
273 |
+
[01/04/2024-15:27:42] [I] [TRT] Total Activation Memory: 7917384192
|
274 |
+
[01/04/2024-15:27:42] [I] [TRT] Detected 1 inputs and 1 output network tensors.
|
275 |
+
[01/04/2024-15:27:50] [I] [TRT] Total Host Persistent Memory: 308448
|
276 |
+
[01/04/2024-15:27:50] [I] [TRT] Total Device Persistent Memory: 653824
|
277 |
+
[01/04/2024-15:27:50] [I] [TRT] Total Scratch Memory: 134217728
|
278 |
+
[01/04/2024-15:27:50] [I] [TRT] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 74 MiB, GPU 154 MiB
|
279 |
+
[01/04/2024-15:27:50] [I] [TRT] [BlockAssignment] Started assigning block shifts. This will take 166 steps to complete.
|
280 |
+
[01/04/2024-15:27:50] [I] [TRT] [BlockAssignment] Algorithm ShiftNTopDown took 59.522ms to assign 13 blocks to 166 nodes requiring 141982720 bytes.
|
281 |
+
[01/04/2024-15:27:50] [I] [TRT] Total Activation Memory: 141982720
|
282 |
+
[01/04/2024-15:27:53] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +0, now: CPU 1667, GPU 5744 (MiB)
|
283 |
+
[01/04/2024-15:27:53] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in building engine: CPU +15, GPU +16, now: CPU 15, GPU 16 (MiB)
|
284 |
+
[01/04/2024-15:27:53] [I] Engine built in 1940.22 sec.
|
285 |
+
[01/04/2024-15:27:54] [I] [TRT] Loaded engine size: 17 MiB
|
286 |
+
[01/04/2024-15:27:54] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +0, now: CPU 1252, GPU 5595 (MiB)
|
287 |
+
[01/04/2024-15:27:54] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +15, now: CPU 0, GPU 15 (MiB)
|
288 |
+
[01/04/2024-15:27:54] [I] Engine deserialized in 0.210553 sec.
|
289 |
+
[01/04/2024-15:27:54] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU -1, now: CPU 1252, GPU 5594 (MiB)
|
290 |
+
[01/04/2024-15:27:54] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +136, now: CPU 0, GPU 151 (MiB)
|
291 |
+
[01/04/2024-15:27:54] [I] Setting persistentCacheLimit to 0 bytes.
|
292 |
+
[01/04/2024-15:27:54] [I] Using random values for input onnx::Cast_0
|
293 |
+
[01/04/2024-15:27:54] [I] Created input binding for onnx::Cast_0 with dimensions 1x3x640x640
|
294 |
+
[01/04/2024-15:27:54] [I] Using random values for output graph2_flat_predictions
|
295 |
+
[01/04/2024-15:27:54] [I] Created output binding for graph2_flat_predictions with dimensions -1x57
|
296 |
+
[01/04/2024-15:27:54] [I] Starting inference
|
297 |
+
[01/04/2024-15:28:09] [I] Warmup completed 12 queries over 200 ms
|
298 |
+
[01/04/2024-15:28:09] [I] Timing trace has 1074 queries over 15.0266 s
|
299 |
+
[01/04/2024-15:28:09] [I]
|
300 |
+
[01/04/2024-15:28:09] [I] === Trace details ===
|
301 |
+
[01/04/2024-15:28:09] [I] Trace averages of 100 runs:
|
302 |
+
[01/04/2024-15:28:09] [I] Average on 100 runs - GPU latency: 13.6253 ms - Host latency: 13.7361 ms (enqueue 13.703 ms)
|
303 |
+
[01/04/2024-15:28:09] [I] Average on 100 runs - GPU latency: 13.9431 ms - Host latency: 14.0566 ms (enqueue 14.0098 ms)
|
304 |
+
[01/04/2024-15:28:09] [I] Average on 100 runs - GPU latency: 13.8369 ms - Host latency: 13.9494 ms (enqueue 13.9083 ms)
|
305 |
+
[01/04/2024-15:28:09] [I] Average on 100 runs - GPU latency: 13.8257 ms - Host latency: 13.9381 ms (enqueue 13.8989 ms)
|
306 |
+
[01/04/2024-15:28:09] [I] Average on 100 runs - GPU latency: 13.6064 ms - Host latency: 13.7172 ms (enqueue 13.6832 ms)
|
307 |
+
[01/04/2024-15:28:09] [I] Average on 100 runs - GPU latency: 14.264 ms - Host latency: 14.3781 ms (enqueue 14.3258 ms)
|
308 |
+
[01/04/2024-15:28:09] [I] Average on 100 runs - GPU latency: 13.6034 ms - Host latency: 13.7146 ms (enqueue 13.682 ms)
|
309 |
+
[01/04/2024-15:28:09] [I] Average on 100 runs - GPU latency: 14.1877 ms - Host latency: 14.3027 ms (enqueue 14.2525 ms)
|
310 |
+
[01/04/2024-15:28:09] [I] Average on 100 runs - GPU latency: 13.7484 ms - Host latency: 13.8601 ms (enqueue 13.8257 ms)
|
311 |
+
[01/04/2024-15:28:09] [I] Average on 100 runs - GPU latency: 13.7575 ms - Host latency: 13.8697 ms (enqueue 13.8349 ms)
|
312 |
+
[01/04/2024-15:28:09] [I]
|
313 |
+
[01/04/2024-15:28:09] [I] === Performance summary ===
|
314 |
+
[01/04/2024-15:28:09] [I] Throughput: 71.4732 qps
|
315 |
+
[01/04/2024-15:28:09] [I] Latency: min = 13.0068 ms, max = 17.7432 ms, mean = 13.9607 ms, median = 13.9542 ms, percentile(90%) = 14.7441 ms, percentile(95%) = 14.9595 ms, percentile(99%) = 15.5879 ms
|
316 |
+
[01/04/2024-15:28:09] [I] Enqueue Time: min = 12.9634 ms, max = 18.0693 ms, mean = 13.9208 ms, median = 13.9097 ms, percentile(90%) = 14.6982 ms, percentile(95%) = 14.8809 ms, percentile(99%) = 15.5361 ms
|
317 |
+
[01/04/2024-15:28:09] [I] H2D Latency: min = 0.0809937 ms, max = 0.114258 ms, mean = 0.0973303 ms, median = 0.0976562 ms, percentile(90%) = 0.0991211 ms, percentile(95%) = 0.0996094 ms, percentile(99%) = 0.101562 ms
|
318 |
+
[01/04/2024-15:28:09] [I] GPU Compute Time: min = 12.8984 ms, max = 17.6377 ms, mean = 13.8482 ms, median = 13.8396 ms, percentile(90%) = 14.6279 ms, percentile(95%) = 14.8496 ms, percentile(99%) = 15.4727 ms
|
319 |
+
[01/04/2024-15:28:09] [I] D2H Latency: min = 0.00390625 ms, max = 0.0466309 ms, mean = 0.0151338 ms, median = 0.0130615 ms, percentile(90%) = 0.0224609 ms, percentile(95%) = 0.0244141 ms, percentile(99%) = 0.03125 ms
|
320 |
+
[01/04/2024-15:28:09] [I] Total Host Walltime: 15.0266 s
|
321 |
+
[01/04/2024-15:28:09] [I] Total GPU Compute Time: 14.873 s
|
322 |
+
[01/04/2024-15:28:10] [I] Explanations of the performance metrics are printed in the verbose logs.
|
323 |
+
[01/04/2024-15:28:10] [I]
|
324 |
&&&& PASSED TensorRT.trtexec [TensorRT v8502] # /usr/src/tensorrt/bin/trtexec --onnx=yolo_nas_pose_s_int8.onnx --best --avgRuns=100 --duration=15 --saveEngine=yolo_nas_pose_s_int8.onnx.best.engine
|
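As a reading aid for the performance summary in the log above, here is a minimal Python sketch that extracts a few of the reported figures and cross-checks the throughput against the query count and wall time. The function name `parse_summary` and the regular expressions are illustrative assumptions, not part of trtexec; the sample lines are copied from the log with the timestamps stripped.

```python
import re

# Lines copied from the log above (timestamps stripped for brevity).
LOG = """\
[I] Timing trace has 1074 queries over 15.0266 s
[I] Throughput: 71.4732 qps
[I] Latency: min = 13.0068 ms, max = 17.7432 ms, mean = 13.9607 ms, median = 13.9542 ms, percentile(90%) = 14.7441 ms, percentile(95%) = 14.9595 ms, percentile(99%) = 15.5879 ms
"""


def parse_summary(text):
    """Extract query count, wall time, reported throughput and mean latency.

    Hypothetical helper: the patterns match the log format shown above and
    may need adjusting for other trtexec versions.
    """
    queries, seconds = re.search(
        r"Timing trace has (\d+) queries over ([\d.]+) s", text
    ).groups()
    throughput = re.search(r"Throughput: ([\d.]+) qps", text).group(1)
    mean_latency = re.search(r"Latency: .*?mean = ([\d.]+) ms", text).group(1)
    return {
        "queries": int(queries),
        "seconds": float(seconds),
        "throughput_qps": float(throughput),
        "mean_latency_ms": float(mean_latency),
    }


summary = parse_summary(LOG)
# Throughput should be close to queries divided by wall time.
recomputed_qps = summary["queries"] / summary["seconds"]
print(summary)
print(f"recomputed throughput: {recomputed_qps:.4f} qps")
```

The recomputed value should agree with the reported throughput to within rounding, which is a quick way to confirm that the right lines were parsed.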
yolo_nas_pose_s_int8.onnx.int8.engine
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
-
size
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:8795eb441b2790005ae4c651d93cda57424a970bde3e5a5bceff132b34cf7c78
|
3 |
+
size 17990796
|
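The pointer above records only the SHA-256 digest and byte size of the engine file. A minimal Python sketch for checking a downloaded file against such a pointer follows; the helper names `parse_lfs_pointer` and `matches_pointer` are assumptions introduced for illustration, and the pointer text is the one shown in this diff.

```python
import hashlib
import os

# Pointer contents as shown in the diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:8795eb441b2790005ae4c651d93cda57424a970bde3e5a5bceff132b34cf7c78
size 17990796
"""


def parse_lfs_pointer(text):
    """Return (hash algorithm, hex digest, size in bytes) from pointer text."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return algo, digest, int(fields["size"])


def matches_pointer(path, pointer_text):
    """Check that the file at `path` has the size and digest the pointer names."""
    algo, digest, size = parse_lfs_pointer(pointer_text)
    if algo != "sha256":
        raise ValueError(f"unsupported hash algorithm: {algo}")
    if os.path.getsize(path) != size:
        return False
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large engine files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == digest


print(parse_lfs_pointer(POINTER))
```

Usage would look like `matches_pointer("yolo_nas_pose_s_int8.onnx.int8.engine", POINTER)`, assuming the engine file has been fetched into the working directory.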
yolo_nas_pose_s_int8.onnx.int8.engine.err
CHANGED
@@ -1,7 +1,7 @@
|
|
1 |
-
[
|
2 |
-
[
|
3 |
-
[
|
4 |
-
[
|
5 |
-
[
|
6 |
-
[
|
7 |
-
[
|
|
|
1 |
+
[01/04/2024-15:28:15] [W] [TRT] onnx2trt_utils.cpp:375: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
|
2 |
+
[01/04/2024-15:28:15] [W] [TRT] onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
|
3 |
+
[01/04/2024-15:28:18] [W] [TRT] Calibrator won't be used in explicit precision mode. Use quantization aware training to generate network with Quantize/Dequantize nodes.
|
4 |
+
[01/04/2024-15:38:46] [W] * Throughput may be bound by Enqueue Time rather than GPU Compute and the GPU may be under-utilized.
|
5 |
+
[01/04/2024-15:38:46] [W] If not already in use, --useCudaGraph (utilize CUDA graphs where possible) may increase the throughput.
|
6 |
+
[01/04/2024-15:38:46] [W] * GPU compute time is unstable, with coefficient of variance = 4.08535%.
|
7 |
+
[01/04/2024-15:38:46] [W] If not already in use, locking GPU clock frequency or adding --useSpinWait may improve the stability.
|
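The stability warning above reports a coefficient of variation for GPU compute time. As a rough illustration of what that figure measures, the sketch below computes standard deviation divided by mean, expressed as a percentage. This is an assumption about the general statistic, not a reproduction of trtexec's exact computation, and the sample timings are made up for the example rather than taken from this run.

```python
import statistics


def coefficient_of_variation(samples_ms):
    """Return the population standard deviation as a percentage of the mean."""
    mean = statistics.fmean(samples_ms)
    if mean == 0:
        raise ValueError("mean is zero; coefficient of variation is undefined")
    return statistics.pstdev(samples_ms) / mean * 100.0


# Hypothetical per-inference GPU compute times in milliseconds.
example_times_ms = [13.6, 13.9, 13.8, 14.3, 13.6, 14.2]
print(f"{coefficient_of_variation(example_times_ms):.2f}%")
```

A lower percentage means more consistent timings; the log itself suggests locking the GPU clock frequency or adding `--useSpinWait` when the reported value is high.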
yolo_nas_pose_s_int8.onnx.int8.engine.log
CHANGED
@@ -1,322 +1,324 @@
|
|
1 |
&&&& RUNNING TensorRT.trtexec [TensorRT v8502] # /usr/src/tensorrt/bin/trtexec --onnx=yolo_nas_pose_s_int8.onnx --int8 --avgRuns=100 --duration=15 --saveEngine=yolo_nas_pose_s_int8.onnx.int8.engine
|
2 |
-
[
|
3 |
-
[
|
4 |
-
[
|
5 |
-
[
|
6 |
-
[
|
7 |
-
[
|
8 |
-
[
|
9 |
-
[
|
10 |
-
[
|
11 |
-
[
|
12 |
-
[
|
13 |
-
[
|
14 |
-
[
|
15 |
-
[
|
16 |
-
[
|
17 |
-
[
|
18 |
-
[
|
19 |
-
[
|
20 |
-
[
|
21 |
-
[
|
22 |
-
[
|
23 |
-
[
|
24 |
-
[
|
25 |
-
[
|
26 |
-
[
|
27 |
-
[
|
28 |
-
[
|
29 |
-
[
|
30 |
-
[
|
31 |
-
[
|
32 |
-
[
|
33 |
-
[
|
34 |
-
[
|
35 |
-
[
|
36 |
-
[
|
37 |
-
[
|
38 |
-
[
|
39 |
-
[
|
40 |
-
[
|
41 |
-
[
|
42 |
-
[
|
43 |
-
[
|
44 |
-
[
|
45 |
-
[
|
46 |
-
[
|
47 |
-
[
|
48 |
-
[
|
49 |
-
[
|
50 |
-
[
|
51 |
-
[
|
52 |
-
[
|
53 |
-
[
|
54 |
-
[
|
55 |
-
[
|
56 |
-
[
|
57 |
-
[
|
58 |
-
[
|
59 |
-
[
|
60 |
-
[
|
61 |
-
[
|
62 |
-
[
|
63 |
-
[
|
64 |
-
[
|
65 |
-
[
|
66 |
-
[
|
67 |
-
[
|
68 |
-
[
|
69 |
-
[
|
70 |
-
[
|
71 |
-
[
|
72 |
-
[
|
73 |
-
[
|
74 |
-
[
|
75 |
-
[
|
76 |
-
[
|
77 |
-
[
|
78 |
-
[
|
79 |
-
[
|
80 |
-
[
|
81 |
-
[
|
82 |
-
[
|
83 |
-
[
|
84 |
-
[
|
85 |
-
[
|
86 |
-
[
|
87 |
-
[
|
88 |
-
[
|
89 |
-
[
|
90 |
-
[
|
91 |
-
[
|
92 |
-
[
|
93 |
-
[
|
94 |
-
[
|
95 |
-
[
|
96 |
-
[
|
97 |
-
[
|
98 |
-
[
|
99 |
-
[
|
100 |
-
[
|
101 |
-
[
|
102 |
-
[
|
103 |
-
[
|
104 |
-
[
|
105 |
-
[
|
106 |
-
[
|
107 |
-
[
|
108 |
-
[
|
109 |
-
[
|
110 |
-
[
|
111 |
-
[
|
112 |
-
[
|
113 |
-
[
|
114 |
-
[
|
115 |
-
[
|
116 |
-
[
|
117 |
-
[
|
118 |
-
[
|
119 |
-
[
|
120 |
-
[
|
121 |
-
[
|
122 |
-
[
|
123 |
-
[
|
124 |
-
[
|
125 |
-
[
|
126 |
-
[
|
127 |
-
[
|
128 |
-
[
|
129 |
-
[
|
130 |
-
[
|
131 |
-
[
|
132 |
-
[
|
133 |
-
[
|
134 |
-
[
|
135 |
-
[
|
136 |
-
[
|
137 |
-
[
|
138 |
-
[
|
139 |
-
[
|
140 |
-
[
|
141 |
-
[
|
142 |
-
[
|
143 |
-
[
|
144 |
-
[
|
145 |
-
[
|
146 |
-
[
|
147 |
-
[
|
148 |
-
[
|
149 |
-
[
|
150 |
-
[
|
151 |
-
[
|
152 |
-
[
|
153 |
-
[
|
154 |
-
[
|
155 |
-
[
|
156 |
-
[
|
157 |
-
[
|
158 |
-
[
|
159 |
-
[
|
160 |
-
[
|
161 |
-
[
|
162 |
-
[
|
163 |
-
[
|
164 |
-
[
|
165 |
-
[
|
166 |
-
[
|
167 |
-
[
|
168 |
-
[
|
169 |
-
[
|
170 |
-
[
|
171 |
-
[
|
172 |
-
[
|
173 |
-
[
|
174 |
-
[
|
175 |
-
[
|
176 |
-
[
|
177 |
-
[
|
178 |
-
[
|
179 |
-
[
|
180 |
-
[
|
181 |
-
[
|
182 |
-
[
|
183 |
-
[
|
184 |
-
[
|
185 |
-
[
|
186 |
-
[
|
187 |
-
[
|
188 |
-
[
|
189 |
-
[
|
190 |
-
[
|
191 |
-
[
|
192 |
-
[
|
193 |
-
[
|
194 |
-
[
|
195 |
-
[
|
196 |
-
[
|
197 |
-
[
|
198 |
-
[
|
199 |
-
[
|
200 |
-
[
|
201 |
-
[
|
202 |
-
[
|
203 |
-
[
|
204 |
-
[
|
205 |
-
[
|
206 |
-
[
|
207 |
-
[
|
208 |
-
[
|
209 |
-
[
|
210 |
-
[
|
211 |
-
[
|
212 |
-
[
|
213 |
-
[
|
214 |
-
[
|
215 |
-
[
|
216 |
-
[
|
217 |
-
[
|
218 |
-
[
|
219 |
-
[
|
220 |
-
[
|
221 |
-
[
|
222 |
-
[
|
223 |
-
[
|
224 |
-
[
|
225 |
-
[
|
226 |
-
[
|
227 |
-
[
|
228 |
-
[
|
229 |
-
[
|
230 |
-
[
|
231 |
-
[
|
232 |
-
[
|
233 |
-
[
|
234 |
-
[
|
235 |
-
[
|
236 |
-
[
|
237 |
-
[
|
238 |
-
[
|
239 |
-
[
|
240 |
-
[
|
241 |
-
[
|
242 |
-
[
|
243 |
-
[
|
244 |
-
[
|
245 |
-
[
|
246 |
-
[
|
247 |
-
[
|
248 |
-
[
|
249 |
-
[
|
250 |
-
[
|
251 |
-
[
|
252 |
-
[
|
253 |
-
[
|
254 |
-
[
|
255 |
-
[
|
256 |
-
[
|
257 |
-
[
|
258 |
-
[
|
259 |
-
[
|
260 |
-
[
|
261 |
-
[
|
262 |
-
[
|
263 |
-
[
|
264 |
-
[
|
265 |
-
[
|
266 |
-
[
|
267 |
-
[
|
268 |
-
[
|
269 |
-
[
|
270 |
-
[
|
271 |
-
[
|
272 |
-
[
|
273 |
-
[
|
274 |
-
[
|
275 |
-
[
|
276 |
-
[
|
277 |
-
[
|
278 |
-
[
|
279 |
-
[
|
280 |
-
[
|
281 |
-
[
|
282 |
-
[
|
283 |
-
[
|
284 |
-
[
|
285 |
-
[
|
286 |
-
[
|
287 |
-
[
|
288 |
-
[
|
289 |
-
[
|
290 |
-
[
|
291 |
-
[
|
292 |
-
[
|
293 |
-
[
|
294 |
-
[
|
295 |
-
[
|
296 |
-
[
|
297 |
-
[
|
298 |
-
[
|
299 |
-
[
|
300 |
-
[
|
301 |
-
[
|
302 |
-
[
|
303 |
-
[
|
304 |
-
[
|
305 |
-
[
|
306 |
-
[
|
307 |
-
[
|
308 |
-
[
|
309 |
-
[
|
310 |
-
[
|
311 |
-
[
|
312 |
-
[
|
313 |
-
[
|
314 |
-
[
|
315 |
-
[
|
316 |
-
[
|
317 |
-
[
|
318 |
-
[
|
319 |
-
[
|
320 |
-
[
|
321 |
-
[
|
|
|
|
|
322 |
&&&& PASSED TensorRT.trtexec [TensorRT v8502] # /usr/src/tensorrt/bin/trtexec --onnx=yolo_nas_pose_s_int8.onnx --int8 --avgRuns=100 --duration=15 --saveEngine=yolo_nas_pose_s_int8.onnx.int8.engine
|
|
|
1 |
&&&& RUNNING TensorRT.trtexec [TensorRT v8502] # /usr/src/tensorrt/bin/trtexec --onnx=yolo_nas_pose_s_int8.onnx --int8 --avgRuns=100 --duration=15 --saveEngine=yolo_nas_pose_s_int8.onnx.int8.engine
|
2 |
+
[01/04/2024-15:28:11] [I] === Model Options ===
|
3 |
+
[01/04/2024-15:28:11] [I] Format: ONNX
|
4 |
+
[01/04/2024-15:28:11] [I] Model: yolo_nas_pose_s_int8.onnx
|
5 |
+
[01/04/2024-15:28:11] [I] Output:
|
6 |
+
[01/04/2024-15:28:11] [I] === Build Options ===
|
7 |
+
[01/04/2024-15:28:11] [I] Max batch: explicit batch
|
8 |
+
[01/04/2024-15:28:11] [I] Memory Pools: workspace: default, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default
|
9 |
+
[01/04/2024-15:28:11] [I] minTiming: 1
|
10 |
+
[01/04/2024-15:28:11] [I] avgTiming: 8
|
11 |
+
[01/04/2024-15:28:11] [I] Precision: FP32+INT8
|
12 |
+
[01/04/2024-15:28:11] [I] LayerPrecisions:
|
13 |
+
[01/04/2024-15:28:11] [I] Calibration: Dynamic
|
14 |
+
[01/04/2024-15:28:11] [I] Refit: Disabled
|
15 |
+
[01/04/2024-15:28:11] [I] Sparsity: Disabled
|
16 |
+
[01/04/2024-15:28:11] [I] Safe mode: Disabled
|
17 |
+
[01/04/2024-15:28:11] [I] DirectIO mode: Disabled
|
18 |
+
[01/04/2024-15:28:11] [I] Restricted mode: Disabled
|
19 |
+
[01/04/2024-15:28:11] [I] Build only: Disabled
|
20 |
+
[01/04/2024-15:28:11] [I] Save engine: yolo_nas_pose_s_int8.onnx.int8.engine
|
21 |
+
[01/04/2024-15:28:11] [I] Load engine:
|
22 |
+
[01/04/2024-15:28:11] [I] Profiling verbosity: 0
|
23 |
+
[01/04/2024-15:28:11] [I] Tactic sources: Using default tactic sources
|
24 |
+
[01/04/2024-15:28:11] [I] timingCacheMode: local
|
25 |
+
[01/04/2024-15:28:11] [I] timingCacheFile:
|
26 |
+
[01/04/2024-15:28:11] [I] Heuristic: Disabled
|
27 |
+
[01/04/2024-15:28:11] [I] Preview Features: Use default preview flags.
|
28 |
+
[01/04/2024-15:28:11] [I] Input(s)s format: fp32:CHW
|
29 |
+
[01/04/2024-15:28:11] [I] Output(s)s format: fp32:CHW
|
30 |
+
[01/04/2024-15:28:11] [I] Input build shapes: model
|
31 |
+
[01/04/2024-15:28:11] [I] Input calibration shapes: model
|
32 |
+
[01/04/2024-15:28:11] [I] === System Options ===
|
33 |
+
[01/04/2024-15:28:11] [I] Device: 0
|
34 |
+
[01/04/2024-15:28:11] [I] DLACore:
|
35 |
+
[01/04/2024-15:28:11] [I] Plugins:
|
36 |
+
[01/04/2024-15:28:11] [I] === Inference Options ===
|
37 |
+
[01/04/2024-15:28:11] [I] Batch: Explicit
|
38 |
+
[01/04/2024-15:28:11] [I] Input inference shapes: model
|
39 |
+
[01/04/2024-15:28:11] [I] Iterations: 10
|
40 |
+
[01/04/2024-15:28:11] [I] Duration: 15s (+ 200ms warm up)
|
41 |
+
[01/04/2024-15:28:11] [I] Sleep time: 0ms
|
42 |
+
[01/04/2024-15:28:11] [I] Idle time: 0ms
|
43 |
+
[01/04/2024-15:28:11] [I] Streams: 1
|
44 |
+
[01/04/2024-15:28:11] [I] ExposeDMA: Disabled
|
45 |
+
[01/04/2024-15:28:11] [I] Data transfers: Enabled
|
46 |
+
[01/04/2024-15:28:11] [I] Spin-wait: Disabled
|
47 |
+
[01/04/2024-15:28:11] [I] Multithreading: Disabled
|
48 |
+
[01/04/2024-15:28:11] [I] CUDA Graph: Disabled
|
49 |
+
[01/04/2024-15:28:11] [I] Separate profiling: Disabled
|
50 |
+
[01/04/2024-15:28:11] [I] Time Deserialize: Disabled
|
51 |
+
[01/04/2024-15:28:11] [I] Time Refit: Disabled
|
52 |
+
[01/04/2024-15:28:11] [I] NVTX verbosity: 0
|
53 |
+
[01/04/2024-15:28:11] [I] Persistent Cache Ratio: 0
|
54 |
+
[01/04/2024-15:28:11] [I] Inputs:
|
55 |
+
[01/04/2024-15:28:11] [I] === Reporting Options ===
|
56 |
+
[01/04/2024-15:28:11] [I] Verbose: Disabled
|
57 |
+
[01/04/2024-15:28:11] [I] Averages: 100 inferences
|
58 |
+
[01/04/2024-15:28:11] [I] Percentiles: 90,95,99
|
59 |
+
[01/04/2024-15:28:11] [I] Dump refittable layers:Disabled
|
60 |
+
[01/04/2024-15:28:11] [I] Dump output: Disabled
|
61 |
+
[01/04/2024-15:28:11] [I] Profile: Disabled
|
62 |
+
[01/04/2024-15:28:11] [I] Export timing to JSON file:
|
63 |
+
[01/04/2024-15:28:11] [I] Export output to JSON file:
|
64 |
+
[01/04/2024-15:28:11] [I] Export profile to JSON file:
|
65 |
+
[01/04/2024-15:28:11] [I]
|
66 |
+
[01/04/2024-15:28:11] [I] === Device Information ===
|
67 |
+
[01/04/2024-15:28:11] [I] Selected Device: Orin
|
68 |
+
[01/04/2024-15:28:11] [I] Compute Capability: 8.7
|
69 |
+
[01/04/2024-15:28:11] [I] SMs: 8
|
70 |
+
[01/04/2024-15:28:11] [I] Compute Clock Rate: 0.624 GHz
|
71 |
+
[01/04/2024-15:28:11] [I] Device Global Memory: 7471 MiB
|
72 |
+
[01/04/2024-15:28:11] [I] Shared Memory per SM: 164 KiB
|
73 |
+
[01/04/2024-15:28:11] [I] Memory Bus Width: 128 bits (ECC disabled)
|
74 |
+
[01/04/2024-15:28:11] [I] Memory Clock Rate: 0.624 GHz
|
75 |
+
[01/04/2024-15:28:11] [I]
|
76 |
+
[01/04/2024-15:28:11] [I] TensorRT version: 8.5.2
|
77 |
+
[01/04/2024-15:28:11] [I] [TRT] [MemUsageChange] Init CUDA: CPU +220, GPU +0, now: CPU 249, GPU 3760 (MiB)
|
78 |
+
[01/04/2024-15:28:15] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +302, GPU +284, now: CPU 574, GPU 4064 (MiB)
|
79 |
+
[01/04/2024-15:28:15] [I] Start parsing network model
|
80 |
+
[01/04/2024-15:28:15] [I] [TRT] ----------------------------------------------------------------
|
81 |
+
[01/04/2024-15:28:15] [I] [TRT] Input filename: yolo_nas_pose_s_int8.onnx
|
82 |
+
[01/04/2024-15:28:15] [I] [TRT] ONNX IR version: 0.0.8
|
83 |
+
[01/04/2024-15:28:15] [I] [TRT] Opset version: 17
|
84 |
+
[01/04/2024-15:28:15] [I] [TRT] Producer name: pytorch
|
85 |
+
[01/04/2024-15:28:15] [I] [TRT] Producer version: 2.1.2
|
86 |
+
[01/04/2024-15:28:15] [I] [TRT] Domain:
|
87 |
+
[01/04/2024-15:28:15] [I] [TRT] Model version: 0
|
88 |
+
[01/04/2024-15:28:15] [I] [TRT] Doc string:
|
89 |
+
[01/04/2024-15:28:15] [I] [TRT] ----------------------------------------------------------------
|
90 |
+
[01/04/2024-15:28:18] [I] Finish parsing network model
|
91 |
+
[01/04/2024-15:28:18] [I] FP32 and INT8 precisions have been specified - more performance might be enabled by additionally specifying --fp16 or --best
|
92 |
+
[01/04/2024-15:28:22] [I] [TRT] ---------- Layers Running on DLA ----------
|
93 |
+
[01/04/2024-15:28:22] [I] [TRT] ---------- Layers Running on GPU ----------
|
94 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] TRAIN_STATION: [trainStation1]
|
95 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] MYELIN: {ForeignNode[/pre_process/pre_process.0/Cast.../pre_process/pre_process.2/Mul]}
|
96 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONSTANT: (Unnamed Layer* 1229) [Constant]
|
97 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONSTANT: (Unnamed Layer* 1230) [Constant]
|
98 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONSTANT: (Unnamed Layer* 1231) [Constant]
|
99 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] COPY: /model/backbone/stem/conv/rbr_reparam/_input_quantizer/QuantizeLinear
|
100 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stem.conv.rbr_reparam.weight + /model/backbone/stem/conv/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stem/conv/rbr_reparam/Conv
|
101 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage1.downsample.rbr_reparam.weight + /model/backbone/stage1/downsample/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage1/downsample/rbr_reparam/Conv
|
102 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage1.blocks.conv2.conv.weight + /model/backbone/stage1/blocks/conv2/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage1/blocks/conv2/conv/Conv
|
103 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage1.blocks.conv1.conv.weight + /model/backbone/stage1/blocks/conv1/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage1/blocks/conv1/conv/Conv
|
104 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
105 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage1.blocks.bottlenecks.0.cv1.rbr_reparam.weight + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/Conv
|
106 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage1.blocks.bottlenecks.0.cv2.rbr_reparam.weight + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/Conv
|
107 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage1.blocks.bottlenecks.0.alpha + (Unnamed Layer* 494) [Shuffle] + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/Mul, /model/backbone/stage1/blocks/bottlenecks/bottlenecks.0/Add)
|
108 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
109 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage1.blocks.bottlenecks.1.cv1.rbr_reparam.weight + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/Conv
|
110 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage1.blocks.bottlenecks.1.cv2.rbr_reparam.weight + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/Conv
|
111 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage1.blocks.bottlenecks.1.alpha + (Unnamed Layer* 510) [Shuffle] + /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/Mul, /model/backbone/stage1/blocks/bottlenecks/bottlenecks.1/Add)
|
112 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage1.blocks.conv3.conv.weight + /model/backbone/stage1/blocks/conv3/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage1/blocks/conv3/conv/Conv
|
113 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.reduce_skip2.conv.weight + /model/neck/neck2/reduce_skip2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck2/reduce_skip2/conv/Conv
|
114 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.downsample.rbr_reparam.weight + /model/backbone/stage2/downsample/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/downsample/rbr_reparam/Conv
|
115 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.downsample.conv.weight + /model/neck/neck2/downsample/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck2/downsample/conv/Conv
|
116 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.conv2.conv.weight + /model/backbone/stage2/blocks/conv2/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/conv2/conv/Conv
|
117 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.conv1.conv.weight + /model/backbone/stage2/blocks/conv1/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/conv1/conv/Conv
|
118 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
119 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.bottlenecks.0.cv1.rbr_reparam.weight + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/Conv
|
120 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.bottlenecks.0.cv2.rbr_reparam.weight + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/Conv
|
121 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage2.blocks.bottlenecks.0.alpha + (Unnamed Layer* 557) [Shuffle] + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/Mul, /model/backbone/stage2/blocks/bottlenecks/bottlenecks.0/Add)
|
122 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
123 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.bottlenecks.1.cv1.rbr_reparam.weight + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/Conv
|
124 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.bottlenecks.1.cv2.rbr_reparam.weight + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/Conv
|
125 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage2.blocks.bottlenecks.1.alpha + (Unnamed Layer* 573) [Shuffle] + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/Mul, /model/backbone/stage2/blocks/bottlenecks/bottlenecks.1/Add)
|
126 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
127 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.bottlenecks.2.cv1.rbr_reparam.weight + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/cv1/rbr_reparam/Conv
|
128 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.bottlenecks.2.cv2.rbr_reparam.weight + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/cv2/rbr_reparam/Conv
|
129 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage2.blocks.bottlenecks.2.alpha + (Unnamed Layer* 589) [Shuffle] + /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/Mul, /model/backbone/stage2/blocks/bottlenecks/bottlenecks.2/Add)
|
130 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage2.blocks.conv3.conv.weight + /model/backbone/stage2/blocks/conv3/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage2/blocks/conv3/conv/Conv
|
131 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.reduce_skip2.conv.weight + /model/neck/neck1/reduce_skip2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck1/reduce_skip2/conv/Conv || model.neck.neck2.reduce_skip1.conv.weight + /model/neck/neck2/reduce_skip1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck2/reduce_skip1/conv/Conv
|
132 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.downsample.rbr_reparam.weight + /model/backbone/stage3/downsample/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/downsample/rbr_reparam/Conv
|
133 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.downsample.conv.weight + /model/neck/neck1/downsample/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck1/downsample/conv/Conv
|
134 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.conv2.conv.weight + /model/backbone/stage3/blocks/conv2/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/conv2/conv/Conv
|
135 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.conv1.conv.weight + /model/backbone/stage3/blocks/conv1/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/conv1/conv/Conv
|
136 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
137 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.0.cv1.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/Conv
|
138 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.0.cv2.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/Conv
|
139 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage3.blocks.bottlenecks.0.alpha + (Unnamed Layer* 639) [Shuffle] + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/Mul, /model/backbone/stage3/blocks/bottlenecks/bottlenecks.0/Add)
|
140 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
141 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.1.cv1.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/Conv
|
142 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.1.cv2.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/Conv
|
143 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage3.blocks.bottlenecks.1.alpha + (Unnamed Layer* 655) [Shuffle] + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/Mul, /model/backbone/stage3/blocks/bottlenecks/bottlenecks.1/Add)
|
144 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
145 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.2.cv1.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/cv1/rbr_reparam/Conv
|
146 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.2.cv2.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/cv2/rbr_reparam/Conv
|
147 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage3.blocks.bottlenecks.2.alpha + (Unnamed Layer* 671) [Shuffle] + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/Mul, /model/backbone/stage3/blocks/bottlenecks/bottlenecks.2/Add)
|
148 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
149 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.3.cv1.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/cv1/rbr_reparam/Conv
|
150 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.3.cv2.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/cv2/rbr_reparam/Conv
|
151 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage3.blocks.bottlenecks.3.alpha + (Unnamed Layer* 687) [Shuffle] + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/Mul, /model/backbone/stage3/blocks/bottlenecks/bottlenecks.3/Add)
|
152 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage3/blocks/bottlenecks/bottlenecks.4/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
153 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.4.cv1.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.4/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.4/cv1/rbr_reparam/Conv
|
154 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.bottlenecks.4.cv2.rbr_reparam.weight + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.4/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.4/cv2/rbr_reparam/Conv
|
155 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage3.blocks.bottlenecks.4.alpha + (Unnamed Layer* 703) [Shuffle] + /model/backbone/stage3/blocks/bottlenecks/bottlenecks.4/Mul, /model/backbone/stage3/blocks/bottlenecks/bottlenecks.4/Add)
|
156 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage3.blocks.conv3.conv.weight + /model/backbone/stage3/blocks/conv3/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage3/blocks/conv3/conv/Conv
|
157 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.reduce_skip1.conv.weight + /model/neck/neck1/reduce_skip1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck1/reduce_skip1/conv/Conv
|
158 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage4.downsample.rbr_reparam.weight + /model/backbone/stage4/downsample/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage4/downsample/rbr_reparam/Conv
|
159 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage4.blocks.conv2.conv.weight + /model/backbone/stage4/blocks/conv2/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage4/blocks/conv2/conv/Conv
|
160 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage4.blocks.conv1.conv.weight + /model/backbone/stage4/blocks/conv1/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage4/blocks/conv1/conv/Conv
|
161 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
162 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage4.blocks.bottlenecks.0.cv1.rbr_reparam.weight + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/Conv
|
163 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage4.blocks.bottlenecks.0.cv2.rbr_reparam.weight + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/Conv
|
164 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage4.blocks.bottlenecks.0.alpha + (Unnamed Layer* 744) [Shuffle] + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/Mul, /model/backbone/stage4/blocks/bottlenecks/bottlenecks.0/Add)
|
165 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] COPY: /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
166 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage4.blocks.bottlenecks.1.cv1.rbr_reparam.weight + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/Conv
|
167 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage4.blocks.bottlenecks.1.cv2.rbr_reparam.weight + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/Conv
|
168 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.backbone.stage4.blocks.bottlenecks.1.alpha + (Unnamed Layer* 760) [Shuffle] + /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/Mul, /model/backbone/stage4/blocks/bottlenecks/bottlenecks.1/Add)
|
169 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.stage4.blocks.conv3.conv.weight + /model/backbone/stage4/blocks/conv3/conv/_weight_quantizer/QuantizeLinear + /model/backbone/stage4/blocks/conv3/conv/Conv
|
170 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.context_module.cv1.conv.weight + /model/backbone/context_module/cv1/conv/_weight_quantizer/QuantizeLinear + /model/backbone/context_module/cv1/conv/Conv
|
171 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] POOLING: /model/backbone/context_module/m.2/MaxPool
|
172 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] POOLING: /model/backbone/context_module/m.1/MaxPool
|
173 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] POOLING: /model/backbone/context_module/m.0/MaxPool
|
174 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] COPY: /model/backbone/context_module/m.2/MaxPool_output_0 copy
|
175 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.backbone.context_module.cv2.conv.weight + /model/backbone/context_module/cv2/conv/_weight_quantizer/QuantizeLinear + /model/backbone/context_module/cv2/conv/Conv
|
176 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.conv.conv.weight + /model/neck/neck1/conv/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck1/conv/conv/Conv
|
177 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] COPY: /model/neck/neck1/upsample/_input_quantizer/QuantizeLinear
|
178 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] DECONVOLUTION: model.neck.neck1.upsample.weight + /model/neck/neck1/upsample/_weight_quantizer/QuantizeLinear + /model/neck/neck1/upsample/ConvTranspose
|
179 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.reduce_after_concat.conv.weight + /model/neck/neck1/reduce_after_concat/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck1/reduce_after_concat/conv/Conv
|
180 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.blocks.conv2.conv.weight + /model/neck/neck1/blocks/conv2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck1/blocks/conv2/conv/Conv
|
181 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.blocks.conv1.conv.weight + /model/neck/neck1/blocks/conv1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck1/blocks/conv1/conv/Conv
|
182 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] COPY: /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
183 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.blocks.bottlenecks.0.cv1.rbr_reparam.weight + /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/Conv
|
184 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.blocks.bottlenecks.0.cv2.rbr_reparam.weight + /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/Conv
|
185 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck1.blocks.bottlenecks.0.alpha + (Unnamed Layer* 825) [Shuffle] + /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/Mul, /model/neck/neck1/blocks/bottlenecks/bottlenecks.0/Add)
|
186 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] COPY: /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
187 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.blocks.bottlenecks.1.cv1.rbr_reparam.weight + /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/Conv
|
188 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.blocks.bottlenecks.1.cv2.rbr_reparam.weight + /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/Conv
|
189 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck1.blocks.bottlenecks.1.alpha + (Unnamed Layer* 841) [Shuffle] + /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/Mul, /model/neck/neck1/blocks/bottlenecks/bottlenecks.1/Add)
|
190 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck1.blocks.conv3.conv.weight + /model/neck/neck1/blocks/conv3/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck1/blocks/conv3/conv/Conv
|
191 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.conv.conv.weight + /model/neck/neck2/conv/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck2/conv/conv/Conv
|
192 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] COPY: /model/neck/neck2/upsample/_input_quantizer/QuantizeLinear
|
193 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] DECONVOLUTION: model.neck.neck2.upsample.weight + /model/neck/neck2/upsample/_weight_quantizer/QuantizeLinear + /model/neck/neck2/upsample/ConvTranspose
|
194 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] COPY: /model/neck/neck2/Concat_/model/neck/neck2/reduce_skip1/act/Relu_output_0_clone_1 copy
|
195 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.reduce_after_concat.conv.weight + /model/neck/neck2/reduce_after_concat/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck2/reduce_after_concat/conv/Conv
|
196 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.blocks.conv2.conv.weight + /model/neck/neck2/blocks/conv2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck2/blocks/conv2/conv/Conv
|
197 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.blocks.conv1.conv.weight + /model/neck/neck2/blocks/conv1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck2/blocks/conv1/conv/Conv
|
198 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] COPY: /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
199 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.blocks.bottlenecks.0.cv1.rbr_reparam.weight + /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/cv1/rbr_reparam/Conv
|
200 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.blocks.bottlenecks.0.cv2.rbr_reparam.weight + /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/cv2/rbr_reparam/Conv
|
201 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck2.blocks.bottlenecks.0.alpha + (Unnamed Layer* 890) [Shuffle] + /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/Mul, /model/neck/neck2/blocks/bottlenecks/bottlenecks.0/Add)
|
202 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] COPY: /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_input_quantizer/QuantizeLinear
|
203 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.blocks.bottlenecks.1.cv1.rbr_reparam.weight + /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/cv1/rbr_reparam/Conv
|
204 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.blocks.bottlenecks.1.cv2.rbr_reparam.weight + /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/_weight_quantizer/QuantizeLinear + /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/cv2/rbr_reparam/Conv
|
205 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck2.blocks.bottlenecks.1.alpha + (Unnamed Layer* 906) [Shuffle] + /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/Mul, /model/neck/neck2/blocks/bottlenecks/bottlenecks.1/Add)
|
206 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] COPY: /model/neck/neck2/blocks/Concat_/model/neck/neck2/blocks/bottlenecks/bottlenecks.1/Add_output_0_clone_0 copy
|
207 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck2.blocks.conv3.conv.weight + /model/neck/neck2/blocks/conv3/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck2/blocks/conv3/conv/Conv
|
208 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head1.bbox_stem.seq.conv.weight + /model/heads/head1/bbox_stem/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head1/bbox_stem/seq/conv/Conv || model.heads.head1.pose_stem.seq.conv.weight + /model/heads/head1/pose_stem/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head1/pose_stem/seq/conv/Conv
|
209 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck3.conv.conv.weight + /model/neck/neck3/conv/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck3/conv/conv/Conv
|
210 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head1.reg_convs.0.seq.conv.weight + /model/heads/head1/reg_convs/reg_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head1/reg_convs/reg_convs.0/seq/conv/Conv || model.heads.head1.cls_convs.0.seq.conv.weight + /model/heads/head1/cls_convs/cls_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head1/cls_convs/cls_convs.0/seq/conv/Conv
|
211 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head1.pose_convs.0.seq.conv.weight + /model/heads/head1/pose_convs/pose_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head1/pose_convs/pose_convs.0/seq/conv/Conv
|
212 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] COPY: /model/neck/neck3/blocks/conv1/conv/_input_quantizer/QuantizeLinear_clone_1
|
213 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head1.cls_pred.weight + /model/heads/head1/cls_pred/_weight_quantizer/QuantizeLinear + /model/heads/head1/cls_pred/Conv
|
214 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head1.reg_pred.weight + /model/heads/head1/reg_pred/_weight_quantizer/QuantizeLinear + /model/heads/head1/reg_pred/Conv
|
215 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head1.pose_convs.1.seq.conv.weight + /model/heads/head1/pose_convs/pose_convs.1/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head1/pose_convs/pose_convs.1/seq/conv/Conv
|
216 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck3.blocks.conv2.conv.weight + /model/neck/neck3/blocks/conv2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck3/blocks/conv2/conv/Conv
|
217 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck3.blocks.conv1.conv.weight + /model/neck/neck3/blocks/conv1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck3/blocks/conv1/conv/Conv
|
218 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] SHUFFLE: /model/heads/Reshape + /model/heads/Transpose
|
219 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head1.pose_pred.weight + /model/heads/head1/pose_pred/_weight_quantizer/QuantizeLinear + /model/heads/head1/pose_pred/Conv
|
220 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] COPY: /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/cv1/conv/_input_quantizer/QuantizeLinear
|
221 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] SOFTMAX: /model/heads/Softmax
|
222 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck3.blocks.bottlenecks.0.cv1.conv.weight + /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/cv1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/cv1/conv/Conv
|
223 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/Conv
|
224 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck3.blocks.bottlenecks.0.cv2.conv.weight + /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/cv2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/cv2/conv/Conv
|
225 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck3.blocks.bottlenecks.0.alpha + (Unnamed Layer* 972) [Shuffle] + /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/Mul, /model/neck/neck3/blocks/bottlenecks/bottlenecks.0/Add)
|
226 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] COPY: /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/cv1/conv/_input_quantizer/QuantizeLinear
|
227 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck3.blocks.bottlenecks.1.cv1.conv.weight + /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/cv1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/cv1/conv/Conv
|
228 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck3.blocks.bottlenecks.1.cv2.conv.weight + /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/cv2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/cv2/conv/Conv
|
229 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck3.blocks.bottlenecks.1.alpha + (Unnamed Layer* 1013) [Shuffle] + /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/Mul, /model/neck/neck3/blocks/bottlenecks/bottlenecks.1/Add)
|
230 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck3.blocks.conv3.conv.weight + /model/neck/neck3/blocks/conv3/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck3/blocks/conv3/conv/Conv
|
231 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head2.pose_stem.seq.conv.weight + /model/heads/head2/pose_stem/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head2/pose_stem/seq/conv/Conv || model.heads.head2.bbox_stem.seq.conv.weight + /model/heads/head2/bbox_stem/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head2/bbox_stem/seq/conv/Conv
|
232 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck4.conv.conv.weight + /model/neck/neck4/conv/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck4/conv/conv/Conv
|
233 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head2.reg_convs.0.seq.conv.weight + /model/heads/head2/reg_convs/reg_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head2/reg_convs/reg_convs.0/seq/conv/Conv || model.heads.head2.cls_convs.0.seq.conv.weight + /model/heads/head2/cls_convs/cls_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head2/cls_convs/cls_convs.0/seq/conv/Conv
|
234 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head2.pose_convs.0.seq.conv.weight + /model/heads/head2/pose_convs/pose_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head2/pose_convs/pose_convs.0/seq/conv/Conv
|
235 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] COPY: /model/neck/neck4/blocks/conv1/conv/_input_quantizer/QuantizeLinear_clone_1
|
236 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head2.cls_pred.weight + /model/heads/head2/cls_pred/_weight_quantizer/QuantizeLinear + /model/heads/head2/cls_pred/Conv
|
237 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head2.reg_pred.weight + /model/heads/head2/reg_pred/_weight_quantizer/QuantizeLinear + /model/heads/head2/reg_pred/Conv
|
238 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head2.pose_convs.1.seq.conv.weight + /model/heads/head2/pose_convs/pose_convs.1/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head2/pose_convs/pose_convs.1/seq/conv/Conv
|
239 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck4.blocks.conv2.conv.weight + /model/neck/neck4/blocks/conv2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck4/blocks/conv2/conv/Conv
|
240 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck4.blocks.conv1.conv.weight + /model/neck/neck4/blocks/conv1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck4/blocks/conv1/conv/Conv
|
241 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] SHUFFLE: /model/heads/Reshape_4 + /model/heads/Transpose_3
|
242 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head2.pose_pred.weight + /model/heads/head2/pose_pred/_weight_quantizer/QuantizeLinear + /model/heads/head2/pose_pred/Conv
|
243 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] COPY: /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/cv1/conv/_input_quantizer/QuantizeLinear
|
244 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] SOFTMAX: /model/heads/Softmax_1
|
245 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck4.blocks.bottlenecks.0.cv1.conv.weight + /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/cv1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/cv1/conv/Conv
|
246 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/Conv_1
|
247 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck4.blocks.bottlenecks.0.cv2.conv.weight + /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/cv2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/cv2/conv/Conv
|
248 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck4.blocks.bottlenecks.0.alpha + (Unnamed Layer* 1079) [Shuffle] + /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/Mul, /model/neck/neck4/blocks/bottlenecks/bottlenecks.0/Add)
|
249 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] COPY: /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/cv1/conv/_input_quantizer/QuantizeLinear
|
250 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck4.blocks.bottlenecks.1.cv1.conv.weight + /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/cv1/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/cv1/conv/Conv
|
251 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck4.blocks.bottlenecks.1.cv2.conv.weight + /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/cv2/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/cv2/conv/Conv
|
252 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] POINTWISE: PWN(model.neck.neck4.blocks.bottlenecks.1.alpha + (Unnamed Layer* 1120) [Shuffle] + /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/Mul, /model/neck/neck4/blocks/bottlenecks/bottlenecks.1/Add)
|
253 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.neck.neck4.blocks.conv3.conv.weight + /model/neck/neck4/blocks/conv3/conv/_weight_quantizer/QuantizeLinear + /model/neck/neck4/blocks/conv3/conv/Conv
|
254 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head3.bbox_stem.seq.conv.weight + /model/heads/head3/bbox_stem/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head3/bbox_stem/seq/conv/Conv || model.heads.head3.pose_stem.seq.conv.weight + /model/heads/head3/pose_stem/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head3/pose_stem/seq/conv/Conv
|
255 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head3.reg_convs.0.seq.conv.weight + /model/heads/head3/reg_convs/reg_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head3/reg_convs/reg_convs.0/seq/conv/Conv || model.heads.head3.cls_convs.0.seq.conv.weight + /model/heads/head3/cls_convs/cls_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head3/cls_convs/cls_convs.0/seq/conv/Conv
|
256 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head3.pose_convs.0.seq.conv.weight + /model/heads/head3/pose_convs/pose_convs.0/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head3/pose_convs/pose_convs.0/seq/conv/Conv
|
257 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head3.cls_pred.weight + /model/heads/head3/cls_pred/_weight_quantizer/QuantizeLinear + /model/heads/head3/cls_pred/Conv
|
258 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head3.reg_pred.weight + /model/heads/head3/reg_pred/_weight_quantizer/QuantizeLinear + /model/heads/head3/reg_pred/Conv
|
259 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head3.pose_convs.1.seq.conv.weight + /model/heads/head3/pose_convs/pose_convs.1/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head3/pose_convs/pose_convs.1/seq/conv/Conv
|
260 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] SHUFFLE: /model/heads/Reshape_8 + /model/heads/Transpose_6
|
261 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head3.pose_convs.2.seq.conv.weight + /model/heads/head3/pose_convs/pose_convs.2/seq/conv/_weight_quantizer/QuantizeLinear + /model/heads/head3/pose_convs/pose_convs.2/seq/conv/Conv
|
262 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] SOFTMAX: /model/heads/Softmax_2
|
263 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: model.heads.head3.pose_pred.weight + /model/heads/head3/pose_pred/_weight_quantizer/QuantizeLinear + /model/heads/head3/pose_pred/Conv
|
264 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] CONVOLUTION: /model/heads/Conv_2
|
265 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] MYELIN: {ForeignNode[/model/heads/head1/Slice_1.../post_process/Reshape_2]}
|
266 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] NMS: batched_nms_243
|
267 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] DEVICE_TO_SHAPE_HOST: (Unnamed Layer* 1233) [NMS]_1_output[DevicetoShapeHostCopy]
|
268 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] TRAIN_STATION: [trainStation2]
|
269 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] MYELIN: {ForeignNode[/model/heads/head1/Slice...graph2_/Concat_5]}
|
270 |
+
[01/04/2024-15:28:22] [I] [TRT] [GpuLayer] TRAIN_STATION: [trainStation3]
|
271 |
+
[01/04/2024-15:28:23] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +534, GPU +499, now: CPU 1233, GPU 4691 (MiB)
|
272 |
+
[01/04/2024-15:28:23] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +82, GPU +74, now: CPU 1315, GPU 4765 (MiB)
|
273 |
+
[01/04/2024-15:28:23] [I] [TRT] Local timing cache in use. Profiling results in this builder pass will not be stored.
|
274 |
+
[01/04/2024-15:38:28] [I] [TRT] Total Activation Memory: 7939057152
|
275 |
+
[01/04/2024-15:38:28] [I] [TRT] Detected 1 inputs and 1 output network tensors.
|
276 |
+
[01/04/2024-15:38:29] [I] [TRT] Total Host Persistent Memory: 309280
|
277 |
+
[01/04/2024-15:38:29] [I] [TRT] Total Device Persistent Memory: 38912
|
278 |
+
[01/04/2024-15:38:29] [I] [TRT] Total Scratch Memory: 134217728
|
279 |
+
[01/04/2024-15:38:29] [I] [TRT] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 74 MiB, GPU 132 MiB
|
280 |
+
[01/04/2024-15:38:29] [I] [TRT] [BlockAssignment] Started assigning block shifts. This will take 171 steps to complete.
|
281 |
+
[01/04/2024-15:38:29] [I] [TRT] [BlockAssignment] Algorithm ShiftNTopDown took 40.1195ms to assign 13 blocks to 171 nodes requiring 144747520 bytes.
|
282 |
+
[01/04/2024-15:38:29] [I] [TRT] Total Activation Memory: 144747520
|
283 |
+
[01/04/2024-15:38:30] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +0, now: CPU 1665, GPU 5901 (MiB)
|
284 |
+
[01/04/2024-15:38:30] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in building engine: CPU +15, GPU +16, now: CPU 15, GPU 16 (MiB)
|
285 |
+
[01/04/2024-15:38:30] [I] Engine built in 619.407 sec.
|
286 |
+
[01/04/2024-15:38:31] [I] [TRT] Loaded engine size: 17 MiB
|
287 |
+
[01/04/2024-15:38:31] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +1, GPU +0, now: CPU 1251, GPU 5877 (MiB)
|
288 |
+
[01/04/2024-15:38:31] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +15, now: CPU 0, GPU 15 (MiB)
|
289 |
+
[01/04/2024-15:38:31] [I] Engine deserialized in 0.128553 sec.
|
290 |
+
[01/04/2024-15:38:31] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +0, now: CPU 1251, GPU 5877 (MiB)
|
291 |
+
[01/04/2024-15:38:31] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +138, now: CPU 0, GPU 153 (MiB)
|
292 |
+
[01/04/2024-15:38:31] [I] Setting persistentCacheLimit to 0 bytes.
|
293 |
+
[01/04/2024-15:38:31] [I] Using random values for input onnx::Cast_0
|
294 |
+
[01/04/2024-15:38:31] [I] Created input binding for onnx::Cast_0 with dimensions 1x3x640x640
|
295 |
+
[01/04/2024-15:38:31] [I] Using random values for output graph2_flat_predictions
|
296 |
+
[01/04/2024-15:38:31] [I] Created output binding for graph2_flat_predictions with dimensions -1x57
|
297 |
+
[01/04/2024-15:38:31] [I] Starting inference
|
298 |
+
[01/04/2024-15:38:46] [I] Warmup completed 10 queries over 200 ms
|
299 |
+
[01/04/2024-15:38:46] [I] Timing trace has 924 queries over 15.0277 s
|
300 |
+
[01/04/2024-15:38:46] [I]
|
301 |
+
[01/04/2024-15:38:46] [I] === Trace details ===
|
302 |
+
[01/04/2024-15:38:46] [I] Trace averages of 100 runs:
|
303 |
+
[01/04/2024-15:38:46] [I] Average on 100 runs - GPU latency: 16.5012 ms - Host latency: 16.6184 ms (enqueue 16.5675 ms)
|
304 |
+
[01/04/2024-15:38:46] [I] Average on 100 runs - GPU latency: 16.52 ms - Host latency: 16.6389 ms (enqueue 16.5841 ms)
|
305 |
+
[01/04/2024-15:38:46] [I] Average on 100 runs - GPU latency: 15.9968 ms - Host latency: 16.11 ms (enqueue 16.0689 ms)
|
306 |
+
[01/04/2024-15:38:46] [I] Average on 100 runs - GPU latency: 15.7859 ms - Host latency: 15.8982 ms (enqueue 15.8629 ms)
|
307 |
+
[01/04/2024-15:38:46] [I] Average on 100 runs - GPU latency: 15.6698 ms - Host latency: 15.7823 ms (enqueue 15.7438 ms)
|
308 |
+
[01/04/2024-15:38:46] [I] Average on 100 runs - GPU latency: 16.1195 ms - Host latency: 16.2344 ms (enqueue 16.191 ms)
|
309 |
+
[01/04/2024-15:38:46] [I] Average on 100 runs - GPU latency: 16.3087 ms - Host latency: 16.4258 ms (enqueue 16.3729 ms)
|
310 |
+
[01/04/2024-15:38:46] [I] Average on 100 runs - GPU latency: 15.5609 ms - Host latency: 15.6727 ms (enqueue 15.6373 ms)
|
311 |
+
[01/04/2024-15:38:46] [I] Average on 100 runs - GPU latency: 16.4593 ms - Host latency: 16.577 ms (enqueue 16.5201 ms)
|
312 |
+
[01/04/2024-15:38:46] [I]
|
313 |
+
[01/04/2024-15:38:46] [I] === Performance summary ===
|
314 |
+
[01/04/2024-15:38:46] [I] Throughput: 61.4865 qps
|
315 |
+
[01/04/2024-15:38:46] [I] Latency: min = 14.6812 ms, max = 18.0088 ms, mean = 16.2285 ms, median = 16.2705 ms, percentile(90%) = 17.0212 ms, percentile(95%) = 17.188 ms, percentile(99%) = 17.4453 ms
|
316 |
+
[01/04/2024-15:38:46] [I] Enqueue Time: min = 14.6509 ms, max = 17.9592 ms, mean = 16.1828 ms, median = 16.2139 ms, percentile(90%) = 16.9602 ms, percentile(95%) = 17.1279 ms, percentile(99%) = 17.3662 ms
|
317 |
+
[01/04/2024-15:38:46] [I] H2D Latency: min = 0.0830078 ms, max = 0.121094 ms, mean = 0.0968298 ms, median = 0.0967102 ms, percentile(90%) = 0.0991211 ms, percentile(95%) = 0.0996094 ms, percentile(99%) = 0.112915 ms
|
318 |
+
[01/04/2024-15:38:46] [I] GPU Compute Time: min = 14.5688 ms, max = 17.8901 ms, mean = 16.1133 ms, median = 16.1515 ms, percentile(90%) = 16.9043 ms, percentile(95%) = 17.0674 ms, percentile(99%) = 17.3271 ms
|
319 |
+
[01/04/2024-15:38:46] [I] D2H Latency: min = 0.00341797 ms, max = 0.0561523 ms, mean = 0.0183549 ms, median = 0.0194397 ms, percentile(90%) = 0.0244141 ms, percentile(95%) = 0.0263672 ms, percentile(99%) = 0.0390625 ms
|
320 |
+
[01/04/2024-15:38:46] [I] Total Host Walltime: 15.0277 s
|
321 |
+
[01/04/2024-15:38:46] [I] Total GPU Compute Time: 14.8887 s
|
322 |
+
[01/04/2024-15:38:46] [I] Explanations of the performance metrics are printed in the verbose logs.
|
323 |
+
[01/04/2024-15:38:46] [I]
|
324 |
&&&& PASSED TensorRT.trtexec [TensorRT v8502] # /usr/src/tensorrt/bin/trtexec --onnx=yolo_nas_pose_s_int8.onnx --int8 --avgRuns=100 --duration=15 --saveEngine=yolo_nas_pose_s_int8.onnx.int8.engine
|
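Note on reading the performance summary above: the reported throughput and mean GPU compute time can be cross-checked against the totals in the same log. The sketch below is only an illustration, not part of trtexec. The helper name `parse_summary`, the regular expressions, and the trimmed `LOG` sample are assumptions made for this example; the numbers themselves are copied from the log lines above.

```python
import re

# A few lines copied from the trtexec log above (timestamps trimmed for brevity).
LOG = """\
[I] Timing trace has 924 queries over 15.0277 s
[I] Throughput: 61.4865 qps
[I] GPU Compute Time: min = 14.5688 ms, max = 17.8901 ms, mean = 16.1133 ms, median = 16.1515 ms
[I] Total Host Walltime: 15.0277 s
[I] Total GPU Compute Time: 14.8887 s
"""

# Hypothetical patterns for this example; they match the wording seen in this log only.
PATTERNS = {
    "queries": r"Timing trace has (\d+) queries",
    "walltime_s": r"Total Host Walltime: ([\d.]+) s",
    "gpu_total_s": r"Total GPU Compute Time: ([\d.]+) s",
    "throughput_qps": r"Throughput: ([\d.]+) qps",
    "gpu_mean_ms": r"GPU Compute Time: .*?mean = ([\d.]+) ms",
}


def parse_summary(text):
    """Pull the headline numbers out of trtexec summary text (illustrative helper)."""
    summary = {}
    for key, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        if match is None:
            raise ValueError(f"could not find {key!r} in log text")
        summary[key] = float(match.group(1))
    return summary


summary = parse_summary(LOG)

# Throughput should equal completed queries divided by host wall time.
derived_qps = summary["queries"] / summary["walltime_s"]

# Mean GPU compute time should equal total GPU compute time divided by queries.
derived_gpu_mean_ms = summary["gpu_total_s"] / summary["queries"] * 1000.0

print(f"reported qps: {summary['throughput_qps']}, derived: {derived_qps:.4f}")
print(f"reported mean GPU ms: {summary['gpu_mean_ms']}, derived: {derived_gpu_mean_ms:.4f}")
```

Both derived values agree with the reported ones to within rounding, which suggests the throughput figure is computed from host wall time rather than from GPU compute time alone.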