2024-02-08 18:00:20,766 INFO    StreamThr :1998 [internal.py:wandb_internal():86] W&B internal server running at pid: 1998, started at: 2024-02-08 18:00:20.765271
2024-02-08 18:00:20,767 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: status
2024-02-08 18:00:20,770 INFO    WriterThread:1998 [datastore.py:open_for_write():85] open: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_180020-3wdew33h/run-3wdew33h.wandb
2024-02-08 18:00:20,771 DEBUG   SenderThread:1998 [sender.py:send():382] send: header
2024-02-08 18:00:20,771 DEBUG   SenderThread:1998 [sender.py:send():382] send: run
2024-02-08 18:00:21,104 INFO    SenderThread:1998 [dir_watcher.py:__init__():211] watching files in: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_180020-3wdew33h/files
2024-02-08 18:00:21,104 INFO    SenderThread:1998 [sender.py:_start_run_threads():1136] run started: 3wdew33h with start time 1707415220.764703
2024-02-08 18:00:21,108 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: check_version
2024-02-08 18:00:21,108 DEBUG   SenderThread:1998 [sender.py:send_request():409] send_request: check_version
2024-02-08 18:00:21,151 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: run_start
2024-02-08 18:00:21,179 DEBUG   HandlerThread:1998 [system_info.py:__init__():32] System info init
2024-02-08 18:00:21,179 DEBUG   HandlerThread:1998 [system_info.py:__init__():47] System info init done
2024-02-08 18:00:21,179 INFO    HandlerThread:1998 [system_monitor.py:start():194] Starting system monitor
2024-02-08 18:00:21,180 INFO    SystemMonitor:1998 [system_monitor.py:_start():158] Starting system asset monitoring threads
2024-02-08 18:00:21,181 INFO    HandlerThread:1998 [system_monitor.py:probe():214] Collecting system info
2024-02-08 18:00:21,181 INFO    SystemMonitor:1998 [interfaces.py:start():190] Started cpu monitoring
2024-02-08 18:00:21,182 INFO    SystemMonitor:1998 [interfaces.py:start():190] Started disk monitoring
2024-02-08 18:00:21,182 INFO    SystemMonitor:1998 [interfaces.py:start():190] Started gpu monitoring
2024-02-08 18:00:21,184 INFO    SystemMonitor:1998 [interfaces.py:start():190] Started memory monitoring
2024-02-08 18:00:21,184 INFO    SystemMonitor:1998 [interfaces.py:start():190] Started network monitoring
2024-02-08 18:00:21,239 DEBUG   HandlerThread:1998 [system_info.py:probe():196] Probing system
2024-02-08 18:00:21,241 DEBUG   HandlerThread:1998 [gitlib.py:_init_repo():56] git repository is invalid
2024-02-08 18:00:21,241 DEBUG   HandlerThread:1998 [system_info.py:probe():244] Probing system done
2024-02-08 18:00:21,241 DEBUG   HandlerThread:1998 [system_monitor.py:probe():223] {'os': 'Linux-4.14.336-253.554.amzn2.x86_64-x86_64-with-glibc2.35', 'python': '3.10.13', 'heartbeatAt': '2024-02-08T18:00:21.239172', 'startedAt': '2024-02-08T18:00:20.760995', 'docker': None, 'cuda': None, 'args': (), 'state': 'running', 'program': '/home/sagemaker-user/output-7b-26k-lora/../lora_finetuning_push_to_hub_save_local.py', 'codePathLocal': None, 'host': 'default', 'username': 'sagemaker-user', 'executable': '/opt/conda/bin/python3', 'cpu_count': 96, 'cpu_count_logical': 192, 'cpu_freq': {'current': 3011.898104166666, 'min': 0.0, 'max': 0.0}, 'cpu_freq_per_core': [{'current': 2708.532, 'min': 0.0, 'max': 0.0}, {'current': 2549.917, 'min': 0.0, 'max': 0.0}, {'current': 2649.342, 'min': 0.0, 'max': 0.0}, {'current': 2617.578, 'min': 0.0, 'max': 0.0}, {'current': 2514.64, 'min': 0.0, 'max': 0.0}, {'current': 3300.018, 'min': 0.0, 'max': 0.0}, {'current': 2514.814, 'min': 0.0, 'max': 0.0}, {'current': 2823.715, 'min': 0.0, 'max': 0.0}, {'current': 2444.923, 'min': 0.0, 'max': 0.0}, {'current': 2450.425, 'min': 0.0, 'max': 0.0}, {'current': 2405.045, 'min': 0.0, 'max': 0.0}, {'current': 3300.611, 'min': 0.0, 'max': 0.0}, {'current': 2521.7, 'min': 0.0, 'max': 0.0}, {'current': 2536.269, 'min': 0.0, 'max': 0.0}, {'current': 2454.76, 'min': 0.0, 'max': 0.0}, {'current': 2452.125, 'min': 0.0, 'max': 0.0}, {'current': 2578.243, 'min': 0.0, 'max': 0.0}, {'current': 2588.493, 'min': 0.0, 'max': 0.0}, {'current': 2597.93, 'min': 0.0, 'max': 0.0}, {'current': 2362.191, 'min': 0.0, 'max': 0.0}, {'current': 2603.048, 'min': 0.0, 'max': 0.0}, {'current': 1726.583, 'min': 0.0, 'max': 0.0}, {'current': 2010.457, 'min': 0.0, 'max': 0.0}, {'current': 1983.584, 'min': 0.0, 'max': 0.0}, {'current': 2585.543, 'min': 0.0, 'max': 0.0}, {'current': 2448.628, 'min': 0.0, 'max': 0.0}, {'current': 2473.293, 'min': 0.0, 'max': 0.0}, {'current': 2326.422, 'min': 0.0, 'max': 0.0}, {'current': 2469.471, 
'min': 0.0, 'max': 0.0}, {'current': 1679.725, 'min': 0.0, 'max': 0.0}, {'current': 1707.891, 'min': 0.0, 'max': 0.0}, {'current': 2608.243, 'min': 0.0, 'max': 0.0}, {'current': 1825.706, 'min': 0.0, 'max': 0.0}, {'current': 1852.851, 'min': 0.0, 'max': 0.0}, {'current': 2741.597, 'min': 0.0, 'max': 0.0}, {'current': 2763.617, 'min': 0.0, 'max': 0.0}, {'current': 2668.973, 'min': 0.0, 'max': 0.0}, {'current': 1848.906, 'min': 0.0, 'max': 0.0}, {'current': 1859.662, 'min': 0.0, 'max': 0.0}, {'current': 1872.243, 'min': 0.0, 'max': 0.0}, {'current': 2750.258, 'min': 0.0, 'max': 0.0}, {'current': 2171.724, 'min': 0.0, 'max': 0.0}, {'current': 2673.366, 'min': 0.0, 'max': 0.0}, {'current': 2686.6, 'min': 0.0, 'max': 0.0}, {'current': 2254.628, 'min': 0.0, 'max': 0.0}, {'current': 1922.174, 'min': 0.0, 'max': 0.0}, {'current': 1919.649, 'min': 0.0, 'max': 0.0}, {'current': 1914.156, 'min': 0.0, 'max': 0.0}, {'current': 3299.749, 'min': 0.0, 'max': 0.0}, {'current': 3299.574, 'min': 0.0, 'max': 0.0}, {'current': 3299.179, 'min': 0.0, 'max': 0.0}, {'current': 3300.775, 'min': 0.0, 'max': 0.0}, {'current': 3299.087, 'min': 0.0, 'max': 0.0}, {'current': 3299.329, 'min': 0.0, 'max': 0.0}, {'current': 2200.612, 'min': 0.0, 'max': 0.0}, {'current': 2744.304, 'min': 0.0, 'max': 0.0}, {'current': 3300.722, 'min': 0.0, 'max': 0.0}, {'current': 3299.792, 'min': 0.0, 'max': 0.0}, {'current': 3299.956, 'min': 0.0, 'max': 0.0}, {'current': 3301.117, 'min': 0.0, 'max': 0.0}, {'current': 2226.489, 'min': 0.0, 'max': 0.0}, {'current': 3070.859, 'min': 0.0, 'max': 0.0}, {'current': 1992.465, 'min': 0.0, 'max': 0.0}, {'current': 3048.294, 'min': 0.0, 'max': 0.0}, {'current': 2694.481, 'min': 0.0, 'max': 0.0}, {'current': 3299.499, 'min': 0.0, 'max': 0.0}, {'current': 2794.422, 'min': 0.0, 'max': 0.0}, {'current': 3292.924, 'min': 0.0, 'max': 0.0}, {'current': 3299.232, 'min': 0.0, 'max': 0.0}, {'current': 3297.378, 'min': 0.0, 'max': 0.0}, {'current': 3300.519, 'min': 0.0, 'max': 0.0}, 
{'current': 3300.987, 'min': 0.0, 'max': 0.0}, {'current': 3298.537, 'min': 0.0, 'max': 0.0}, {'current': 2474.441, 'min': 0.0, 'max': 0.0}, {'current': 2795.47, 'min': 0.0, 'max': 0.0}, {'current': 2412.968, 'min': 0.0, 'max': 0.0}, {'current': 2550.34, 'min': 0.0, 'max': 0.0}, {'current': 3009.464, 'min': 0.0, 'max': 0.0}, {'current': 2578.135, 'min': 0.0, 'max': 0.0}, {'current': 2404.379, 'min': 0.0, 'max': 0.0}, {'current': 3020.541, 'min': 0.0, 'max': 0.0}, {'current': 3058.728, 'min': 0.0, 'max': 0.0}, {'current': 2989.512, 'min': 0.0, 'max': 0.0}, {'current': 3059.601, 'min': 0.0, 'max': 0.0}, {'current': 2714.737, 'min': 0.0, 'max': 0.0}, {'current': 2694.293, 'min': 0.0, 'max': 0.0}, {'current': 3261.941, 'min': 0.0, 'max': 0.0}, {'current': 2586.965, 'min': 0.0, 'max': 0.0}, {'current': 3294.605, 'min': 0.0, 'max': 0.0}, {'current': 3260.796, 'min': 0.0, 'max': 0.0}, {'current': 3295.156, 'min': 0.0, 'max': 0.0}, {'current': 3253.521, 'min': 0.0, 'max': 0.0}, {'current': 2832.733, 'min': 0.0, 'max': 0.0}, {'current': 3264.153, 'min': 0.0, 'max': 0.0}, {'current': 2767.538, 'min': 0.0, 'max': 0.0}, {'current': 3300.784, 'min': 0.0, 'max': 0.0}, {'current': 2342.793, 'min': 0.0, 'max': 0.0}, {'current': 2291.218, 'min': 0.0, 'max': 0.0}, {'current': 2343.318, 'min': 0.0, 'max': 0.0}, {'current': 2377.449, 'min': 0.0, 'max': 0.0}, {'current': 2182.822, 'min': 0.0, 'max': 0.0}, {'current': 3300.271, 'min': 0.0, 'max': 0.0}, {'current': 2180.405, 'min': 0.0, 'max': 0.0}, {'current': 2708.15, 'min': 0.0, 'max': 0.0}, {'current': 2217.841, 'min': 0.0, 'max': 0.0}, {'current': 2223.228, 'min': 0.0, 'max': 0.0}, {'current': 2332.486, 'min': 0.0, 'max': 0.0}, {'current': 2585.795, 'min': 0.0, 'max': 0.0}, {'current': 2332.572, 'min': 0.0, 'max': 0.0}, {'current': 2256.902, 'min': 0.0, 'max': 0.0}, {'current': 2334.544, 'min': 0.0, 'max': 0.0}, {'current': 2350.832, 'min': 0.0, 'max': 0.0}, {'current': 2424.323, 'min': 0.0, 'max': 0.0}, {'current': 2456.004, 'min': 
0.0, 'max': 0.0}, {'current': 2457.45, 'min': 0.0, 'max': 0.0}, {'current': 2579.468, 'min': 0.0, 'max': 0.0}, {'current': 2458.733, 'min': 0.0, 'max': 0.0}, {'current': 1860.983, 'min': 0.0, 'max': 0.0}, {'current': 2198.204, 'min': 0.0, 'max': 0.0}, {'current': 2131.955, 'min': 0.0, 'max': 0.0}, {'current': 2416.616, 'min': 0.0, 'max': 0.0}, {'current': 2498.284, 'min': 0.0, 'max': 0.0}, {'current': 2409.271, 'min': 0.0, 'max': 0.0}, {'current': 2442.917, 'min': 0.0, 'max': 0.0}, {'current': 2387.494, 'min': 0.0, 'max': 0.0}, {'current': 1840.589, 'min': 0.0, 'max': 0.0}, {'current': 1851.316, 'min': 0.0, 'max': 0.0}, {'current': 2572.071, 'min': 0.0, 'max': 0.0}, {'current': 1846.514, 'min': 0.0, 'max': 0.0}, {'current': 1838.129, 'min': 0.0, 'max': 0.0}, {'current': 2458.548, 'min': 0.0, 'max': 0.0}, {'current': 2468.97, 'min': 0.0, 'max': 0.0}, {'current': 2573.521, 'min': 0.0, 'max': 0.0}, {'current': 1854.263, 'min': 0.0, 'max': 0.0}, {'current': 1852.806, 'min': 0.0, 'max': 0.0}, {'current': 1826.095, 'min': 0.0, 'max': 0.0}, {'current': 2445.646, 'min': 0.0, 'max': 0.0}, {'current': 2189.591, 'min': 0.0, 'max': 0.0}, {'current': 2264.349, 'min': 0.0, 'max': 0.0}, {'current': 2377.441, 'min': 0.0, 'max': 0.0}, {'current': 2012.723, 'min': 0.0, 'max': 0.0}, {'current': 1906.662, 'min': 0.0, 'max': 0.0}, {'current': 2124.071, 'min': 0.0, 'max': 0.0}, {'current': 1895.988, 'min': 0.0, 'max': 0.0}, {'current': 3300.029, 'min': 0.0, 'max': 0.0}, {'current': 2525.577, 'min': 0.0, 'max': 0.0}, {'current': 3299.997, 'min': 0.0, 'max': 0.0}, {'current': 2670.847, 'min': 0.0, 'max': 0.0}, {'current': 3299.699, 'min': 0.0, 'max': 0.0}, {'current': 3299.175, 'min': 0.0, 'max': 0.0}, {'current': 2197.86, 'min': 0.0, 'max': 0.0}, {'current': 2528.763, 'min': 0.0, 'max': 0.0}, {'current': 3197.634, 'min': 0.0, 'max': 0.0}, {'current': 3211.437, 'min': 0.0, 'max': 0.0}, {'current': 3251.423, 'min': 0.0, 'max': 0.0}, {'current': 2959.085, 'min': 0.0, 'max': 0.0}, 
{'current': 2065.145, 'min': 0.0, 'max': 0.0}, {'current': 2974.061, 'min': 0.0, 'max': 0.0}, {'current': 2068.626, 'min': 0.0, 'max': 0.0}, {'current': 2986.901, 'min': 0.0, 'max': 0.0}, {'current': 2340.617, 'min': 0.0, 'max': 0.0}, {'current': 3032.214, 'min': 0.0, 'max': 0.0}, {'current': 2431.142, 'min': 0.0, 'max': 0.0}, {'current': 3302.396, 'min': 0.0, 'max': 0.0}, {'current': 2944.793, 'min': 0.0, 'max': 0.0}, {'current': 3299.677, 'min': 0.0, 'max': 0.0}, {'current': 3298.69, 'min': 0.0, 'max': 0.0}, {'current': 2920.67, 'min': 0.0, 'max': 0.0}, {'current': 3090.867, 'min': 0.0, 'max': 0.0}, {'current': 2503.478, 'min': 0.0, 'max': 0.0}, {'current': 2473.681, 'min': 0.0, 'max': 0.0}, {'current': 2485.275, 'min': 0.0, 'max': 0.0}, {'current': 2488.179, 'min': 0.0, 'max': 0.0}, {'current': 2990.001, 'min': 0.0, 'max': 0.0}, {'current': 2452.206, 'min': 0.0, 'max': 0.0}, {'current': 2729.116, 'min': 0.0, 'max': 0.0}, {'current': 3001.225, 'min': 0.0, 'max': 0.0}, {'current': 3299.389, 'min': 0.0, 'max': 0.0}, {'current': 3299.765, 'min': 0.0, 'max': 0.0}, {'current': 3299.125, 'min': 0.0, 'max': 0.0}, {'current': 2002.363, 'min': 0.0, 'max': 0.0}, {'current': 2302.227, 'min': 0.0, 'max': 0.0}, {'current': 3132.468, 'min': 0.0, 'max': 0.0}, {'current': 2770.449, 'min': 0.0, 'max': 0.0}, {'current': 3288.126, 'min': 0.0, 'max': 0.0}, {'current': 3298.022, 'min': 0.0, 'max': 0.0}, {'current': 3298.691, 'min': 0.0, 'max': 0.0}, {'current': 3297.894, 'min': 0.0, 'max': 0.0}, {'current': 2828.863, 'min': 0.0, 'max': 0.0}, {'current': 3299.914, 'min': 0.0, 'max': 0.0}, {'current': 2721.595, 'min': 0.0, 'max': 0.0}, {'current': 3299.427, 'min': 0.0, 'max': 0.0}], 'disk': {'/': {'total': 32.0, 'used': 0.01256561279296875}}, 'gpu': 'NVIDIA A10G', 'gpu_count': 8, 'gpu_devices': [{'name': 'NVIDIA A10G', 'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 
'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 'memory_total': 24146608128}], 'memory': {'total': 747.9597625732422}}
2024-02-08 18:00:21,241 INFO    HandlerThread:1998 [system_monitor.py:probe():224] Finished collecting system info
2024-02-08 18:00:21,241 INFO    HandlerThread:1998 [system_monitor.py:probe():227] Publishing system info
2024-02-08 18:00:21,241 DEBUG   HandlerThread:1998 [system_info.py:_save_pip():52] Saving list of pip packages installed into the current environment
2024-02-08 18:00:21,242 DEBUG   HandlerThread:1998 [system_info.py:_save_pip():68] Saving pip packages done
2024-02-08 18:00:21,242 DEBUG   HandlerThread:1998 [system_info.py:_save_conda():75] Saving list of conda packages installed into the current environment
2024-02-08 18:00:22,105 INFO    Thread-12 :1998 [dir_watcher.py:_on_file_created():271] file/dir created: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_180020-3wdew33h/files/requirements.txt
2024-02-08 18:00:22,105 INFO    Thread-12 :1998 [dir_watcher.py:_on_file_created():271] file/dir created: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_180020-3wdew33h/files/conda-environment.yaml
2024-02-08 18:00:35,540 DEBUG   HandlerThread:1998 [system_info.py:_save_conda():87] Saving conda packages done
2024-02-08 18:00:35,542 INFO    HandlerThread:1998 [system_monitor.py:probe():229] Finished publishing system info
2024-02-08 18:00:35,545 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: status_report
2024-02-08 18:00:35,545 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: keepalive
2024-02-08 18:00:35,545 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: status_report
2024-02-08 18:00:35,545 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: keepalive
2024-02-08 18:00:35,546 DEBUG   SenderThread:1998 [sender.py:send():382] send: files
2024-02-08 18:00:35,546 INFO    SenderThread:1998 [sender.py:_save_file():1392] saving file wandb-metadata.json with policy now
2024-02-08 18:00:35,550 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: stop_status
2024-02-08 18:00:35,550 DEBUG   SenderThread:1998 [sender.py:send_request():409] send_request: stop_status
2024-02-08 18:00:35,557 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: internal_messages
2024-02-08 18:00:35,673 DEBUG   SenderThread:1998 [sender.py:send():382] send: telemetry
2024-02-08 18:00:35,673 DEBUG   SenderThread:1998 [sender.py:send():382] send: config
2024-02-08 18:00:35,674 DEBUG   SenderThread:1998 [sender.py:send():382] send: metric
2024-02-08 18:00:35,674 DEBUG   SenderThread:1998 [sender.py:send():382] send: telemetry
2024-02-08 18:00:35,674 DEBUG   SenderThread:1998 [sender.py:send():382] send: metric
2024-02-08 18:00:35,674 WARNING SenderThread:1998 [sender.py:send_metric():1343] Seen metric with glob (shouldn't happen)
2024-02-08 18:00:35,893 INFO    wandb-upload_0:1998 [upload_job.py:push():131] Uploaded file /tmp/tmp_9e35467wandb/s8xuyfes-wandb-metadata.json
2024-02-08 18:00:36,107 INFO    Thread-12 :1998 [dir_watcher.py:_on_file_modified():288] file/dir modified: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_180020-3wdew33h/files/conda-environment.yaml
2024-02-08 18:00:36,107 INFO    Thread-12 :1998 [dir_watcher.py:_on_file_created():271] file/dir created: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_180020-3wdew33h/files/output.log
2024-02-08 18:00:36,108 INFO    Thread-12 :1998 [dir_watcher.py:_on_file_created():271] file/dir created: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_180020-3wdew33h/files/wandb-metadata.json
2024-02-08 18:00:36,534 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: status_report
2024-02-08 18:00:38,108 INFO    Thread-12 :1998 [dir_watcher.py:_on_file_modified():288] file/dir modified: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_180020-3wdew33h/files/output.log
2024-02-08 18:00:38,575 DEBUG   SenderThread:1998 [sender.py:send():382] send: exit
2024-02-08 18:00:38,575 INFO    SenderThread:1998 [sender.py:send_exit():589] handling exit code: 1
2024-02-08 18:00:38,575 INFO    SenderThread:1998 [sender.py:send_exit():591] handling runtime: 17
2024-02-08 18:00:38,575 INFO    SenderThread:1998 [sender.py:_save_file():1392] saving file wandb-summary.json with policy end
2024-02-08 18:00:38,575 INFO    SenderThread:1998 [sender.py:send_exit():597] send defer
2024-02-08 18:00:38,575 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:00:38,576 INFO    HandlerThread:1998 [handler.py:handle_request_defer():172] handle defer: 0
2024-02-08 18:00:38,576 DEBUG   SenderThread:1998 [sender.py:send_request():409] send_request: defer
2024-02-08 18:00:38,576 INFO    SenderThread:1998 [sender.py:send_request_defer():613] handle sender defer: 0
2024-02-08 18:00:38,576 INFO    SenderThread:1998 [sender.py:transition_state():617] send defer: 1
2024-02-08 18:00:38,576 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:00:38,576 INFO    HandlerThread:1998 [handler.py:handle_request_defer():172] handle defer: 1
2024-02-08 18:00:38,576 DEBUG   SenderThread:1998 [sender.py:send_request():409] send_request: defer
2024-02-08 18:00:38,576 INFO    SenderThread:1998 [sender.py:send_request_defer():613] handle sender defer: 1
2024-02-08 18:00:38,576 INFO    SenderThread:1998 [sender.py:transition_state():617] send defer: 2
2024-02-08 18:00:38,576 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:00:38,576 INFO    HandlerThread:1998 [handler.py:handle_request_defer():172] handle defer: 2
2024-02-08 18:00:38,576 INFO    HandlerThread:1998 [system_monitor.py:finish():203] Stopping system monitor
2024-02-08 18:00:38,578 INFO    HandlerThread:1998 [interfaces.py:finish():202] Joined cpu monitor
2024-02-08 18:00:38,578 INFO    HandlerThread:1998 [interfaces.py:finish():202] Joined disk monitor
2024-02-08 18:00:38,578 DEBUG   SystemMonitor:1998 [system_monitor.py:_start():172] Starting system metrics aggregation loop
2024-02-08 18:00:38,578 DEBUG   SystemMonitor:1998 [system_monitor.py:_start():179] Finished system metrics aggregation loop
2024-02-08 18:00:38,578 DEBUG   SystemMonitor:1998 [system_monitor.py:_start():183] Publishing last batch of metrics
2024-02-08 18:00:38,618 INFO    HandlerThread:1998 [interfaces.py:finish():202] Joined gpu monitor
2024-02-08 18:00:38,618 INFO    HandlerThread:1998 [interfaces.py:finish():202] Joined memory monitor
2024-02-08 18:00:38,618 INFO    HandlerThread:1998 [interfaces.py:finish():202] Joined network monitor
2024-02-08 18:00:38,618 DEBUG   SenderThread:1998 [sender.py:send_request():409] send_request: defer
2024-02-08 18:00:38,619 INFO    SenderThread:1998 [sender.py:send_request_defer():613] handle sender defer: 2
2024-02-08 18:00:38,619 INFO    SenderThread:1998 [sender.py:transition_state():617] send defer: 3
2024-02-08 18:00:38,619 DEBUG   SenderThread:1998 [sender.py:send():382] send: stats
2024-02-08 18:00:38,620 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:00:38,620 INFO    HandlerThread:1998 [handler.py:handle_request_defer():172] handle defer: 3
2024-02-08 18:00:38,620 DEBUG   SenderThread:1998 [sender.py:send_request():409] send_request: defer
2024-02-08 18:00:38,620 INFO    SenderThread:1998 [sender.py:send_request_defer():613] handle sender defer: 3
2024-02-08 18:00:38,620 INFO    SenderThread:1998 [sender.py:transition_state():617] send defer: 4
2024-02-08 18:00:38,620 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:00:38,620 INFO    HandlerThread:1998 [handler.py:handle_request_defer():172] handle defer: 4
2024-02-08 18:00:38,620 DEBUG   SenderThread:1998 [sender.py:send_request():409] send_request: defer
2024-02-08 18:00:38,620 INFO    SenderThread:1998 [sender.py:send_request_defer():613] handle sender defer: 4
2024-02-08 18:00:38,620 INFO    SenderThread:1998 [sender.py:transition_state():617] send defer: 5
2024-02-08 18:00:38,620 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:00:38,621 INFO    HandlerThread:1998 [handler.py:handle_request_defer():172] handle defer: 5
2024-02-08 18:00:38,621 DEBUG   SenderThread:1998 [sender.py:send():382] send: summary
2024-02-08 18:00:38,633 INFO    SenderThread:1998 [sender.py:_save_file():1392] saving file wandb-summary.json with policy end
2024-02-08 18:00:38,633 DEBUG   SenderThread:1998 [sender.py:send_request():409] send_request: defer
2024-02-08 18:00:38,633 INFO    SenderThread:1998 [sender.py:send_request_defer():613] handle sender defer: 5
2024-02-08 18:00:38,633 INFO    SenderThread:1998 [sender.py:transition_state():617] send defer: 6
2024-02-08 18:00:38,634 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:00:38,634 INFO    HandlerThread:1998 [handler.py:handle_request_defer():172] handle defer: 6
2024-02-08 18:00:38,634 DEBUG   SenderThread:1998 [sender.py:send_request():409] send_request: defer
2024-02-08 18:00:38,634 INFO    SenderThread:1998 [sender.py:send_request_defer():613] handle sender defer: 6
2024-02-08 18:00:38,638 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: status_report
2024-02-08 18:00:38,773 INFO    SenderThread:1998 [sender.py:transition_state():617] send defer: 7
2024-02-08 18:00:38,773 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:00:38,773 INFO    HandlerThread:1998 [handler.py:handle_request_defer():172] handle defer: 7
2024-02-08 18:00:38,774 DEBUG   SenderThread:1998 [sender.py:send_request():409] send_request: defer
2024-02-08 18:00:38,774 INFO    SenderThread:1998 [sender.py:send_request_defer():613] handle sender defer: 7
2024-02-08 18:00:39,108 INFO    Thread-12 :1998 [dir_watcher.py:_on_file_modified():288] file/dir modified: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_180020-3wdew33h/files/config.yaml
2024-02-08 18:00:39,108 INFO    Thread-12 :1998 [dir_watcher.py:_on_file_created():271] file/dir created: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_180020-3wdew33h/files/wandb-summary.json
2024-02-08 18:00:39,575 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: poll_exit
2024-02-08 18:00:39,685 INFO    SenderThread:1998 [sender.py:transition_state():617] send defer: 8
2024-02-08 18:00:39,685 DEBUG   SenderThread:1998 [sender.py:send_request():409] send_request: poll_exit
2024-02-08 18:00:39,686 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:00:39,686 INFO    HandlerThread:1998 [handler.py:handle_request_defer():172] handle defer: 8
2024-02-08 18:00:39,686 DEBUG   SenderThread:1998 [sender.py:send_request():409] send_request: defer
2024-02-08 18:00:39,686 INFO    SenderThread:1998 [sender.py:send_request_defer():613] handle sender defer: 8
2024-02-08 18:00:39,686 INFO    SenderThread:1998 [job_builder.py:build():298] Attempting to build job artifact
2024-02-08 18:00:39,688 INFO    SenderThread:1998 [job_builder.py:_get_source_type():439] no source found
2024-02-08 18:00:39,688 INFO    SenderThread:1998 [sender.py:transition_state():617] send defer: 9
2024-02-08 18:00:39,688 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:00:39,688 INFO    HandlerThread:1998 [handler.py:handle_request_defer():172] handle defer: 9
2024-02-08 18:00:39,688 DEBUG   SenderThread:1998 [sender.py:send_request():409] send_request: defer
2024-02-08 18:00:39,688 INFO    SenderThread:1998 [sender.py:send_request_defer():613] handle sender defer: 9
2024-02-08 18:00:39,688 INFO    SenderThread:1998 [dir_watcher.py:finish():358] shutting down directory watcher
2024-02-08 18:00:40,109 INFO    Thread-12 :1998 [dir_watcher.py:_on_file_modified():288] file/dir modified: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_180020-3wdew33h/files/output.log
2024-02-08 18:00:40,109 INFO    SenderThread:1998 [dir_watcher.py:finish():388] scan: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_180020-3wdew33h/files
2024-02-08 18:00:40,109 INFO    SenderThread:1998 [dir_watcher.py:finish():402] scan save: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_180020-3wdew33h/files/config.yaml config.yaml
2024-02-08 18:00:40,109 INFO    SenderThread:1998 [dir_watcher.py:finish():402] scan save: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_180020-3wdew33h/files/requirements.txt requirements.txt
2024-02-08 18:00:40,109 INFO    SenderThread:1998 [dir_watcher.py:finish():402] scan save: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_180020-3wdew33h/files/conda-environment.yaml conda-environment.yaml
2024-02-08 18:00:40,109 INFO    SenderThread:1998 [dir_watcher.py:finish():402] scan save: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_180020-3wdew33h/files/wandb-metadata.json wandb-metadata.json
2024-02-08 18:00:40,110 INFO    SenderThread:1998 [dir_watcher.py:finish():402] scan save: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_180020-3wdew33h/files/output.log output.log
2024-02-08 18:00:40,111 INFO    SenderThread:1998 [dir_watcher.py:finish():402] scan save: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_180020-3wdew33h/files/wandb-summary.json wandb-summary.json
2024-02-08 18:00:40,111 INFO    SenderThread:1998 [sender.py:transition_state():617] send defer: 10
2024-02-08 18:00:40,112 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:00:40,112 INFO    HandlerThread:1998 [handler.py:handle_request_defer():172] handle defer: 10
2024-02-08 18:00:40,114 DEBUG   SenderThread:1998 [sender.py:send_request():409] send_request: defer
2024-02-08 18:00:40,114 INFO    SenderThread:1998 [sender.py:send_request_defer():613] handle sender defer: 10
2024-02-08 18:00:40,114 INFO    SenderThread:1998 [file_pusher.py:finish():175] shutting down file pusher
2024-02-08 18:00:40,385 INFO    wandb-upload_0:1998 [upload_job.py:push():131] Uploaded file /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_180020-3wdew33h/files/config.yaml
2024-02-08 18:00:40,437 INFO    wandb-upload_3:1998 [upload_job.py:push():131] Uploaded file /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_180020-3wdew33h/files/output.log
2024-02-08 18:00:40,470 INFO    wandb-upload_2:1998 [upload_job.py:push():131] Uploaded file /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_180020-3wdew33h/files/conda-environment.yaml
2024-02-08 18:00:40,560 INFO    wandb-upload_1:1998 [upload_job.py:push():131] Uploaded file /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_180020-3wdew33h/files/requirements.txt
2024-02-08 18:00:40,575 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: poll_exit
2024-02-08 18:00:40,576 DEBUG   SenderThread:1998 [sender.py:send_request():409] send_request: poll_exit
2024-02-08 18:00:40,674 INFO    wandb-upload_4:1998 [upload_job.py:push():131] Uploaded file /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_180020-3wdew33h/files/wandb-summary.json
2024-02-08 18:00:40,875 INFO    Thread-11 (_thread_body):1998 [sender.py:transition_state():617] send defer: 11
2024-02-08 18:00:40,875 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:00:40,875 INFO    HandlerThread:1998 [handler.py:handle_request_defer():172] handle defer: 11
2024-02-08 18:00:40,875 DEBUG   SenderThread:1998 [sender.py:send_request():409] send_request: defer
2024-02-08 18:00:40,875 INFO    SenderThread:1998 [sender.py:send_request_defer():613] handle sender defer: 11
2024-02-08 18:00:40,875 INFO    SenderThread:1998 [file_pusher.py:join():181] waiting for file pusher
2024-02-08 18:00:40,876 INFO    SenderThread:1998 [sender.py:transition_state():617] send defer: 12
2024-02-08 18:00:40,876 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:00:40,876 INFO    HandlerThread:1998 [handler.py:handle_request_defer():172] handle defer: 12
2024-02-08 18:00:40,876 DEBUG   SenderThread:1998 [sender.py:send_request():409] send_request: defer
2024-02-08 18:00:40,876 INFO    SenderThread:1998 [sender.py:send_request_defer():613] handle sender defer: 12
2024-02-08 18:00:40,876 INFO    SenderThread:1998 [file_stream.py:finish():595] file stream finish called
2024-02-08 18:00:40,954 INFO    SenderThread:1998 [file_stream.py:finish():599] file stream finish is done
2024-02-08 18:00:40,954 INFO    SenderThread:1998 [sender.py:transition_state():617] send defer: 13
2024-02-08 18:00:40,954 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:00:40,954 INFO    HandlerThread:1998 [handler.py:handle_request_defer():172] handle defer: 13
2024-02-08 18:00:40,954 DEBUG   SenderThread:1998 [sender.py:send_request():409] send_request: defer
2024-02-08 18:00:40,954 INFO    SenderThread:1998 [sender.py:send_request_defer():613] handle sender defer: 13
2024-02-08 18:00:40,954 INFO    SenderThread:1998 [sender.py:transition_state():617] send defer: 14
2024-02-08 18:00:40,955 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:00:40,955 INFO    HandlerThread:1998 [handler.py:handle_request_defer():172] handle defer: 14
2024-02-08 18:00:40,955 DEBUG   SenderThread:1998 [sender.py:send():382] send: final
2024-02-08 18:00:40,955 DEBUG   SenderThread:1998 [sender.py:send():382] send: footer
2024-02-08 18:00:40,955 DEBUG   SenderThread:1998 [sender.py:send_request():409] send_request: defer
2024-02-08 18:00:40,955 INFO    SenderThread:1998 [sender.py:send_request_defer():613] handle sender defer: 14
2024-02-08 18:00:40,955 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: poll_exit
2024-02-08 18:00:40,956 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: poll_exit
2024-02-08 18:00:40,956 DEBUG   SenderThread:1998 [sender.py:send_request():409] send_request: poll_exit
2024-02-08 18:00:40,956 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: server_info
2024-02-08 18:00:40,956 DEBUG   SenderThread:1998 [sender.py:send_request():409] send_request: poll_exit
2024-02-08 18:00:40,957 DEBUG   SenderThread:1998 [sender.py:send_request():409] send_request: server_info
2024-02-08 18:00:40,959 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: get_summary
2024-02-08 18:00:40,959 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: sampled_history
2024-02-08 18:00:40,959 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: internal_messages
2024-02-08 18:00:40,959 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: job_info
2024-02-08 18:00:41,013 DEBUG   SenderThread:1998 [sender.py:send_request():409] send_request: job_info
2024-02-08 18:00:41,013 INFO    MainThread:1998 [wandb_run.py:_footer_history_summary_info():3837] rendering history
2024-02-08 18:00:41,013 INFO    MainThread:1998 [wandb_run.py:_footer_history_summary_info():3869] rendering summary
2024-02-08 18:00:41,013 INFO    MainThread:1998 [wandb_run.py:_footer_sync_info():3796] logging synced files
2024-02-08 18:00:41,014 DEBUG   HandlerThread:1998 [handler.py:handle_request():146] handle_request: shutdown
2024-02-08 18:00:41,014 INFO    HandlerThread:1998 [handler.py:finish():866] shutting down handler
2024-02-08 18:00:41,960 INFO    WriterThread:1998 [datastore.py:close():294] close: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_180020-3wdew33h/run-3wdew33h.wandb
2024-02-08 18:00:42,013 INFO    SenderThread:1998 [sender.py:finish():1548] shutting down sender
2024-02-08 18:00:42,013 INFO    SenderThread:1998 [file_pusher.py:finish():175] shutting down file pusher
2024-02-08 18:00:42,013 INFO    SenderThread:1998 [file_pusher.py:join():181] waiting for file pusher