2024-02-08 18:45:21,177 INFO    StreamThr :789 [internal.py:wandb_internal():86] W&B internal server running at pid: 789, started at: 2024-02-08 18:45:21.176939
2024-02-08 18:45:21,181 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: status
2024-02-08 18:45:21,182 INFO    WriterThread:789 [datastore.py:open_for_write():85] open: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/run-kk66dbgv.wandb
2024-02-08 18:45:21,183 DEBUG   SenderThread:789 [sender.py:send():382] send: header
2024-02-08 18:45:21,183 DEBUG   SenderThread:789 [sender.py:send():382] send: run
2024-02-08 18:45:21,486 INFO    SenderThread:789 [dir_watcher.py:__init__():211] watching files in: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files
2024-02-08 18:45:21,486 INFO    SenderThread:789 [sender.py:_start_run_threads():1136] run started: kk66dbgv with start time 1707417921.176525
2024-02-08 18:45:21,490 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: check_version
2024-02-08 18:45:21,491 DEBUG   SenderThread:789 [sender.py:send_request():409] send_request: check_version
2024-02-08 18:45:21,534 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: run_start
2024-02-08 18:45:21,565 DEBUG   HandlerThread:789 [system_info.py:__init__():32] System info init
2024-02-08 18:45:21,565 DEBUG   HandlerThread:789 [system_info.py:__init__():47] System info init done
2024-02-08 18:45:21,565 INFO    HandlerThread:789 [system_monitor.py:start():194] Starting system monitor
2024-02-08 18:45:21,565 INFO    SystemMonitor:789 [system_monitor.py:_start():158] Starting system asset monitoring threads
2024-02-08 18:45:21,566 INFO    HandlerThread:789 [system_monitor.py:probe():214] Collecting system info
2024-02-08 18:45:21,566 INFO    SystemMonitor:789 [interfaces.py:start():190] Started cpu monitoring
2024-02-08 18:45:21,567 INFO    SystemMonitor:789 [interfaces.py:start():190] Started disk monitoring
2024-02-08 18:45:21,568 INFO    SystemMonitor:789 [interfaces.py:start():190] Started gpu monitoring
2024-02-08 18:45:21,568 INFO    SystemMonitor:789 [interfaces.py:start():190] Started memory monitoring
2024-02-08 18:45:21,569 INFO    SystemMonitor:789 [interfaces.py:start():190] Started network monitoring
2024-02-08 18:45:21,624 DEBUG   HandlerThread:789 [system_info.py:probe():196] Probing system
2024-02-08 18:45:21,626 DEBUG   HandlerThread:789 [gitlib.py:_init_repo():56] git repository is invalid
2024-02-08 18:45:21,626 DEBUG   HandlerThread:789 [system_info.py:probe():244] Probing system done
2024-02-08 18:45:21,626 DEBUG   HandlerThread:789 [system_monitor.py:probe():223] {'os': 'Linux-4.14.336-253.554.amzn2.x86_64-x86_64-with-glibc2.35', 'python': '3.10.13', 'heartbeatAt': '2024-02-08T18:45:21.624647', 'startedAt': '2024-02-08T18:45:21.172989', 'docker': None, 'cuda': None, 'args': (), 'state': 'running', 'program': '/home/sagemaker-user/output-7b-26k-lora/../lora_finetuning_push_to_hub_save_local_latest.py', 'codePathLocal': None, 'host': 'default', 'username': 'sagemaker-user', 'executable': '/opt/conda/bin/python3', 'cpu_count': 96, 'cpu_count_logical': 192, 'cpu_freq': {'current': 3165.433677083333, 'min': 0.0, 'max': 0.0}, 'cpu_freq_per_core': [{'current': 3299.529, 'min': 0.0, 'max': 0.0}, {'current': 3299.574, 'min': 0.0, 'max': 0.0}, {'current': 3299.907, 'min': 0.0, 'max': 0.0}, {'current': 3300.263, 'min': 0.0, 'max': 0.0}, {'current': 3300.897, 'min': 0.0, 'max': 0.0}, {'current': 3300.385, 'min': 0.0, 'max': 0.0}, {'current': 3298.81, 'min': 0.0, 'max': 0.0}, {'current': 3299.625, 'min': 0.0, 'max': 0.0}, {'current': 3299.926, 'min': 0.0, 'max': 0.0}, {'current': 3300.158, 'min': 0.0, 'max': 0.0}, {'current': 3300.475, 'min': 0.0, 'max': 0.0}, {'current': 3300.688, 'min': 0.0, 'max': 0.0}, {'current': 3295.206, 'min': 0.0, 'max': 0.0}, {'current': 3294.95, 'min': 0.0, 'max': 0.0}, {'current': 3296.334, 'min': 0.0, 'max': 0.0}, {'current': 3297.722, 'min': 0.0, 'max': 0.0}, {'current': 3296.096, 'min': 0.0, 'max': 0.0}, {'current': 3298.885, 'min': 0.0, 'max': 0.0}, {'current': 3297.66, 'min': 0.0, 'max': 0.0}, {'current': 3297.613, 'min': 0.0, 'max': 0.0}, {'current': 3300.244, 'min': 0.0, 'max': 0.0}, {'current': 3241.28, 'min': 0.0, 'max': 0.0}, {'current': 3298.967, 'min': 0.0, 'max': 0.0}, {'current': 3298.457, 'min': 0.0, 'max': 0.0}, {'current': 3298.049, 'min': 0.0, 'max': 0.0}, {'current': 3299.552, 'min': 0.0, 'max': 0.0}, {'current': 3299.807, 'min': 0.0, 'max': 0.0}, {'current': 3242.538, 'min': 0.0, 'max': 0.0}, {'current': 3299.129, 'min': 0.0, 'max': 0.0}, {'current': 3263.29, 'min': 0.0, 'max': 0.0}, {'current': 3298.421, 'min': 0.0, 'max': 0.0}, {'current': 3299.256, 'min': 0.0, 'max': 0.0}, {'current': 3298.723, 'min': 0.0, 'max': 0.0}, {'current': 3299.38, 'min': 0.0, 'max': 0.0}, {'current': 3299.22, 'min': 0.0, 'max': 0.0}, {'current': 3298.243, 'min': 0.0, 'max': 0.0}, {'current': 3259.228, 'min': 0.0, 'max': 0.0}, {'current': 3297.656, 'min': 0.0, 'max': 0.0}, {'current': 3299.572, 'min': 0.0, 'max': 0.0}, {'current': 3299.246, 'min': 0.0, 'max': 0.0}, {'current': 3299.507, 'min': 0.0, 'max': 0.0}, {'current': 3298.177, 'min': 0.0, 'max': 0.0}, {'current': 3299.762, 'min': 0.0, 'max': 0.0}, {'current': 3300.244, 'min': 0.0, 'max': 0.0}, {'current': 3299.764, 'min': 0.0, 'max': 0.0}, {'current': 3299.71, 'min': 0.0, 'max': 0.0}, {'current': 3299.323, 'min': 0.0, 'max': 0.0}, {'current': 3298.972, 'min': 0.0, 'max': 0.0}, {'current': 2825.298, 'min': 0.0, 'max': 0.0}, {'current': 3300.031, 'min': 0.0, 'max': 0.0}, {'current': 3299.524, 'min': 0.0, 'max': 0.0}, {'current': 3300.753, 'min': 0.0, 'max': 0.0}, {'current': 3300.281, 'min': 0.0, 'max': 0.0}, {'current': 3300.549, 'min': 0.0, 'max': 0.0}, {'current': 3299.256, 'min': 0.0, 'max': 0.0}, {'current': 3300.719, 'min': 0.0, 'max': 0.0}, {'current': 3299.975, 'min': 0.0, 'max': 0.0}, {'current': 3300.721, 'min': 0.0, 'max': 0.0}, {'current': 3300.6, 'min': 0.0, 'max': 0.0}, {'current': 3300.408, 'min': 0.0, 'max': 0.0}, {'current': 3299.691, 'min': 0.0, 'max': 0.0}, {'current': 3299.817, 'min': 0.0, 'max': 0.0}, {'current': 3044.848, 'min': 0.0, 'max': 0.0}, {'current': 3299.416, 'min': 0.0, 'max': 0.0}, {'current': 3300.089, 'min': 0.0, 'max': 0.0}, {'current': 3299.351, 'min': 0.0, 'max': 0.0}, {'current': 2807.753, 'min': 0.0, 'max': 0.0}, {'current': 2853.085, 'min': 0.0, 'max': 0.0}, {'current': 3299.456, 'min': 0.0, 'max': 0.0}, {'current': 3300.145, 'min': 0.0, 'max': 0.0}, {'current': 3299.532, 'min': 0.0, 'max': 0.0}, {'current': 3300.121, 'min': 0.0, 'max': 0.0}, {'current': 3298.716, 'min': 0.0, 'max': 0.0}, {'current': 2964.818, 'min': 0.0, 'max': 0.0}, {'current': 3299.325, 'min': 0.0, 'max': 0.0}, {'current': 3053.968, 'min': 0.0, 'max': 0.0}, {'current': 3027.575, 'min': 0.0, 'max': 0.0}, {'current': 3034.933, 'min': 0.0, 'max': 0.0}, {'current': 3046.955, 'min': 0.0, 'max': 0.0}, {'current': 3017.189, 'min': 0.0, 'max': 0.0}, {'current': 3052.512, 'min': 0.0, 'max': 0.0}, {'current': 3049.645, 'min': 0.0, 'max': 0.0}, {'current': 3056.957, 'min': 0.0, 'max': 0.0}, {'current': 3063.442, 'min': 0.0, 'max': 0.0}, {'current': 3026.186, 'min': 0.0, 'max': 0.0}, {'current': 3059.995, 'min': 0.0, 'max': 0.0}, {'current': 3058.868, 'min': 0.0, 'max': 0.0}, {'current': 3059.978, 'min': 0.0, 'max': 0.0}, {'current': 2639.982, 'min': 0.0, 'max': 0.0}, {'current': 3043.356, 'min': 0.0, 'max': 0.0}, {'current': 3032.312, 'min': 0.0, 'max': 0.0}, {'current': 3024.784, 'min': 0.0, 'max': 0.0}, {'current': 3309.767, 'min': 0.0, 'max': 0.0}, {'current': 3044.167, 'min': 0.0, 'max': 0.0}, {'current': 3074.821, 'min': 0.0, 'max': 0.0}, {'current': 2744.486, 'min': 0.0, 'max': 0.0}, {'current': 2948.546, 'min': 0.0, 'max': 0.0}, {'current': 3265.131, 'min': 0.0, 'max': 0.0}, {'current': 3260.141, 'min': 0.0, 'max': 0.0}, {'current': 3264.163, 'min': 0.0, 'max': 0.0}, {'current': 3299.133, 'min': 0.0, 'max': 0.0}, {'current': 3260.992, 'min': 0.0, 'max': 0.0}, {'current': 3299.601, 'min': 0.0, 'max': 0.0}, {'current': 3266.096, 'min': 0.0, 'max': 0.0}, {'current': 3299.245, 'min': 0.0, 'max': 0.0}, {'current': 3298.423, 'min': 0.0, 'max': 0.0}, {'current': 3262.508, 'min': 0.0, 'max': 0.0}, {'current': 3270.751, 'min': 0.0, 'max': 0.0}, {'current': 3265.57, 'min': 0.0, 'max': 0.0}, {'current': 3268.221, 'min': 0.0, 'max': 0.0}, {'current': 3262.709, 'min': 0.0, 'max': 0.0}, {'current': 3262.206, 'min': 0.0, 'max': 0.0}, {'current': 3270.565, 'min': 0.0, 'max': 0.0}, {'current': 3298.66, 'min': 0.0, 'max': 0.0}, {'current': 3271.159, 'min': 0.0, 'max': 0.0}, {'current': 3269.543, 'min': 0.0, 'max': 0.0}, {'current': 2891.532, 'min': 0.0, 'max': 0.0}, {'current': 3299.121, 'min': 0.0, 'max': 0.0}, {'current': 3267.57, 'min': 0.0, 'max': 0.0}, {'current': 3273.911, 'min': 0.0, 'max': 0.0}, {'current': 3271.579, 'min': 0.0, 'max': 0.0}, {'current': 3271.885, 'min': 0.0, 'max': 0.0}, {'current': 3269.181, 'min': 0.0, 'max': 0.0}, {'current': 3299.12, 'min': 0.0, 'max': 0.0}, {'current': 3272.274, 'min': 0.0, 'max': 0.0}, {'current': 3298.966, 'min': 0.0, 'max': 0.0}, {'current': 3298.849, 'min': 0.0, 'max': 0.0}, {'current': 3298.555, 'min': 0.0, 'max': 0.0}, {'current': 3298.44, 'min': 0.0, 'max': 0.0}, {'current': 3299.027, 'min': 0.0, 'max': 0.0}, {'current': 3299.417, 'min': 0.0, 'max': 0.0}, {'current': 3298.561, 'min': 0.0, 'max': 0.0}, {'current': 3298.684, 'min': 0.0, 'max': 0.0}, {'current': 3298.308, 'min': 0.0, 'max': 0.0}, {'current': 3299.07, 'min': 0.0, 'max': 0.0}, {'current': 3297.982, 'min': 0.0, 'max': 0.0}, {'current': 3298.738, 'min': 0.0, 'max': 0.0}, {'current': 3297.558, 'min': 0.0, 'max': 0.0}, {'current': 3297.74, 'min': 0.0, 'max': 0.0}, {'current': 3299.099, 'min': 0.0, 'max': 0.0}, {'current': 3299.072, 'min': 0.0, 'max': 0.0}, {'current': 3298.608, 'min': 0.0, 'max': 0.0}, {'current': 3299.045, 'min': 0.0, 'max': 0.0}, {'current': 3293.695, 'min': 0.0, 'max': 0.0}, {'current': 3299.228, 'min': 0.0, 'max': 0.0}, {'current': 3299.509, 'min': 0.0, 'max': 0.0}, {'current': 3298.722, 'min': 0.0, 'max': 0.0}, {'current': 3299.9, 'min': 0.0, 'max': 0.0}, {'current': 3299.551, 'min': 0.0, 'max': 0.0}, {'current': 3299.029, 'min': 0.0, 'max': 0.0}, {'current': 3299.307, 'min': 0.0, 'max': 0.0}, {'current': 3298.752, 'min': 0.0, 'max': 0.0}, {'current': 3299.526, 'min': 0.0, 'max': 0.0}, {'current': 3299.18, 'min': 0.0, 'max': 0.0}, {'current': 3299.048, 'min': 0.0, 'max': 0.0}, {'current': 3299.113, 'min': 0.0, 'max': 0.0}, {'current': 3299.319, 'min': 0.0, 'max': 0.0}, {'current': 3299.493, 'min': 0.0, 'max': 0.0}, {'current': 3299.269, 'min': 0.0, 'max': 0.0}, {'current': 3299.472, 'min': 0.0, 'max': 0.0}, {'current': 3299.484, 'min': 0.0, 'max': 0.0}, {'current': 3299.416, 'min': 0.0, 'max': 0.0}, {'current': 3299.596, 'min': 0.0, 'max': 0.0}, {'current': 3299.52, 'min': 0.0, 'max': 0.0}, {'current': 3298.897, 'min': 0.0, 'max': 0.0}, {'current': 3299.216, 'min': 0.0, 'max': 0.0}, {'current': 3299.001, 'min': 0.0, 'max': 0.0}, {'current': 3300.316, 'min': 0.0, 'max': 0.0}, {'current': 2995.097, 'min': 0.0, 'max': 0.0}, {'current': 2690.969, 'min': 0.0, 'max': 0.0}, {'current': 3300.22, 'min': 0.0, 'max': 0.0}, {'current': 3008.014, 'min': 0.0, 'max': 0.0}, {'current': 3299.622, 'min': 0.0, 'max': 0.0}, {'current': 2987.966, 'min': 0.0, 'max': 0.0}, {'current': 3021.177, 'min': 0.0, 'max': 0.0}, {'current': 3032.724, 'min': 0.0, 'max': 0.0}, {'current': 2997.024, 'min': 0.0, 'max': 0.0}, {'current': 3036.103, 'min': 0.0, 'max': 0.0}, {'current': 2998.071, 'min': 0.0, 'max': 0.0}, {'current': 3298.959, 'min': 0.0, 'max': 0.0}, {'current': 3043.183, 'min': 0.0, 'max': 0.0}, {'current': 3299.567, 'min': 0.0, 'max': 0.0}, {'current': 3027.171, 'min': 0.0, 'max': 0.0}, {'current': 2961.029, 'min': 0.0, 'max': 0.0}, {'current': 3059.873, 'min': 0.0, 'max': 0.0}, {'current': 3037.985, 'min': 0.0, 'max': 0.0}, {'current': 3009.778, 'min': 0.0, 'max': 0.0}, {'current': 3032.565, 'min': 0.0, 'max': 0.0}, {'current': 3272.763, 'min': 0.0, 'max': 0.0}, {'current': 3109.523, 'min': 0.0, 'max': 0.0}, {'current': 3299.902, 'min': 0.0, 'max': 0.0}, {'current': 3283.894, 'min': 0.0, 'max': 0.0}], 'disk': {'/': {'total': 32.0, 'used': 0.012481689453125}}, 'gpu': 'NVIDIA A10G', 'gpu_count': 8, 'gpu_devices': [{'name': 'NVIDIA A10G', 'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 'memory_total': 24146608128}], 'memory': {'total': 747.9597625732422}}
2024-02-08 18:45:21,626 INFO    HandlerThread:789 [system_monitor.py:probe():224] Finished collecting system info
2024-02-08 18:45:21,626 INFO    HandlerThread:789 [system_monitor.py:probe():227] Publishing system info
2024-02-08 18:45:21,626 DEBUG   HandlerThread:789 [system_info.py:_save_pip():52] Saving list of pip packages installed into the current environment
2024-02-08 18:45:21,627 DEBUG   HandlerThread:789 [system_info.py:_save_pip():68] Saving pip packages done
2024-02-08 18:45:21,627 DEBUG   HandlerThread:789 [system_info.py:_save_conda():75] Saving list of conda packages installed into the current environment
2024-02-08 18:45:22,487 INFO    Thread-12 :789 [dir_watcher.py:_on_file_created():271] file/dir created: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/conda-environment.yaml
2024-02-08 18:45:22,487 INFO    Thread-12 :789 [dir_watcher.py:_on_file_created():271] file/dir created: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/requirements.txt
2024-02-08 18:45:35,889 DEBUG   HandlerThread:789 [system_info.py:_save_conda():87] Saving conda packages done
2024-02-08 18:45:35,890 INFO    HandlerThread:789 [system_monitor.py:probe():229] Finished publishing system info
2024-02-08 18:45:35,894 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: status_report
2024-02-08 18:45:35,894 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: keepalive
2024-02-08 18:45:35,894 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: status_report
2024-02-08 18:45:35,894 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: keepalive
2024-02-08 18:45:35,895 DEBUG   SenderThread:789 [sender.py:send():382] send: files
2024-02-08 18:45:35,895 INFO    SenderThread:789 [sender.py:_save_file():1392] saving file wandb-metadata.json with policy now
2024-02-08 18:45:35,899 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: stop_status
2024-02-08 18:45:35,900 DEBUG   SenderThread:789 [sender.py:send_request():409] send_request: stop_status
2024-02-08 18:45:35,906 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: internal_messages
2024-02-08 18:45:36,041 DEBUG   SenderThread:789 [sender.py:send():382] send: telemetry
2024-02-08 18:45:36,042 DEBUG   SenderThread:789 [sender.py:send():382] send: config
2024-02-08 18:45:36,042 DEBUG   SenderThread:789 [sender.py:send():382] send: metric
2024-02-08 18:45:36,042 DEBUG   SenderThread:789 [sender.py:send():382] send: telemetry
2024-02-08 18:45:36,042 DEBUG   SenderThread:789 [sender.py:send():382] send: metric
2024-02-08 18:45:36,042 WARNING SenderThread:789 [sender.py:send_metric():1343] Seen metric with glob (shouldn't happen)
2024-02-08 18:45:36,244 INFO    wandb-upload_0:789 [upload_job.py:push():131] Uploaded file /tmp/tmph6r9wm0rwandb/nitc481h-wandb-metadata.json
2024-02-08 18:45:36,488 INFO    Thread-12 :789 [dir_watcher.py:_on_file_modified():288] file/dir modified: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/conda-environment.yaml
2024-02-08 18:45:36,488 INFO    Thread-12 :789 [dir_watcher.py:_on_file_created():271] file/dir created: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/wandb-metadata.json
2024-02-08 18:45:36,488 INFO    Thread-12 :789 [dir_watcher.py:_on_file_created():271] file/dir created: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/output.log
2024-02-08 18:45:36,754 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: status_report
2024-02-08 18:45:38,488 INFO    Thread-12 :789 [dir_watcher.py:_on_file_modified():288] file/dir modified: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/output.log
2024-02-08 18:45:38,842 DEBUG   SenderThread:789 [sender.py:send():382] send: exit
2024-02-08 18:45:38,842 INFO    SenderThread:789 [sender.py:send_exit():589] handling exit code: 1
2024-02-08 18:45:38,842 INFO    SenderThread:789 [sender.py:send_exit():591] handling runtime: 17
2024-02-08 18:45:38,842 INFO    SenderThread:789 [sender.py:_save_file():1392] saving file wandb-summary.json with policy end
2024-02-08 18:45:38,842 INFO    SenderThread:789 [sender.py:send_exit():597] send defer
2024-02-08 18:45:38,843 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:45:38,843 INFO    HandlerThread:789 [handler.py:handle_request_defer():172] handle defer: 0
2024-02-08 18:45:38,843 DEBUG   SenderThread:789 [sender.py:send_request():409] send_request: defer
2024-02-08 18:45:38,843 INFO    SenderThread:789 [sender.py:send_request_defer():613] handle sender defer: 0
2024-02-08 18:45:38,843 INFO    SenderThread:789 [sender.py:transition_state():617] send defer: 1
2024-02-08 18:45:38,843 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:45:38,843 INFO    HandlerThread:789 [handler.py:handle_request_defer():172] handle defer: 1
2024-02-08 18:45:38,843 DEBUG   SenderThread:789 [sender.py:send_request():409] send_request: defer
2024-02-08 18:45:38,843 INFO    SenderThread:789 [sender.py:send_request_defer():613] handle sender defer: 1
2024-02-08 18:45:38,843 INFO    SenderThread:789 [sender.py:transition_state():617] send defer: 2
2024-02-08 18:45:38,843 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:45:38,843 INFO    HandlerThread:789 [handler.py:handle_request_defer():172] handle defer: 2
2024-02-08 18:45:38,843 INFO    HandlerThread:789 [system_monitor.py:finish():203] Stopping system monitor
2024-02-08 18:45:38,844 DEBUG   SystemMonitor:789 [system_monitor.py:_start():172] Starting system metrics aggregation loop
2024-02-08 18:45:38,844 INFO    HandlerThread:789 [interfaces.py:finish():202] Joined cpu monitor
2024-02-08 18:45:38,844 DEBUG   SystemMonitor:789 [system_monitor.py:_start():179] Finished system metrics aggregation loop
2024-02-08 18:45:38,845 INFO    HandlerThread:789 [interfaces.py:finish():202] Joined disk monitor
2024-02-08 18:45:38,845 DEBUG   SystemMonitor:789 [system_monitor.py:_start():183] Publishing last batch of metrics
2024-02-08 18:45:38,884 INFO    HandlerThread:789 [interfaces.py:finish():202] Joined gpu monitor
2024-02-08 18:45:38,884 INFO    HandlerThread:789 [interfaces.py:finish():202] Joined memory monitor
2024-02-08 18:45:38,884 INFO    HandlerThread:789 [interfaces.py:finish():202] Joined network monitor
2024-02-08 18:45:38,885 DEBUG   SenderThread:789 [sender.py:send_request():409] send_request: defer
2024-02-08 18:45:38,885 INFO    SenderThread:789 [sender.py:send_request_defer():613] handle sender defer: 2
2024-02-08 18:45:38,885 INFO    SenderThread:789 [sender.py:transition_state():617] send defer: 3
2024-02-08 18:45:38,885 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:45:38,885 INFO    HandlerThread:789 [handler.py:handle_request_defer():172] handle defer: 3
2024-02-08 18:45:38,885 DEBUG   SenderThread:789 [sender.py:send():382] send: stats
2024-02-08 18:45:38,886 DEBUG   SenderThread:789 [sender.py:send_request():409] send_request: defer
2024-02-08 18:45:38,886 INFO    SenderThread:789 [sender.py:send_request_defer():613] handle sender defer: 3
2024-02-08 18:45:38,886 INFO    SenderThread:789 [sender.py:transition_state():617] send defer: 4
2024-02-08 18:45:38,886 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:45:38,886 INFO    HandlerThread:789 [handler.py:handle_request_defer():172] handle defer: 4
2024-02-08 18:45:38,886 DEBUG   SenderThread:789 [sender.py:send_request():409] send_request: defer
2024-02-08 18:45:38,886 INFO    SenderThread:789 [sender.py:send_request_defer():613] handle sender defer: 4
2024-02-08 18:45:38,886 INFO    SenderThread:789 [sender.py:transition_state():617] send defer: 5
2024-02-08 18:45:38,886 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:45:38,886 INFO    HandlerThread:789 [handler.py:handle_request_defer():172] handle defer: 5
2024-02-08 18:45:38,887 DEBUG   SenderThread:789 [sender.py:send():382] send: summary
2024-02-08 18:45:38,888 INFO    SenderThread:789 [sender.py:_save_file():1392] saving file wandb-summary.json with policy end
2024-02-08 18:45:38,888 DEBUG   SenderThread:789 [sender.py:send_request():409] send_request: defer
2024-02-08 18:45:38,888 INFO    SenderThread:789 [sender.py:send_request_defer():613] handle sender defer: 5
2024-02-08 18:45:38,888 INFO    SenderThread:789 [sender.py:transition_state():617] send defer: 6
2024-02-08 18:45:38,888 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:45:38,888 INFO    HandlerThread:789 [handler.py:handle_request_defer():172] handle defer: 6
2024-02-08 18:45:38,888 DEBUG   SenderThread:789 [sender.py:send_request():409] send_request: defer
2024-02-08 18:45:38,888 INFO    SenderThread:789 [sender.py:send_request_defer():613] handle sender defer: 6
2024-02-08 18:45:38,893 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: status_report
2024-02-08 18:45:39,019 INFO    SenderThread:789 [sender.py:transition_state():617] send defer: 7
2024-02-08 18:45:39,019 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:45:39,019 INFO    HandlerThread:789 [handler.py:handle_request_defer():172] handle defer: 7
2024-02-08 18:45:39,019 DEBUG   SenderThread:789 [sender.py:send_request():409] send_request: defer
2024-02-08 18:45:39,020 INFO    SenderThread:789 [sender.py:send_request_defer():613] handle sender defer: 7
2024-02-08 18:45:39,489 INFO    Thread-12 :789 [dir_watcher.py:_on_file_modified():288] file/dir modified: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/config.yaml
2024-02-08 18:45:39,489 INFO    Thread-12 :789 [dir_watcher.py:_on_file_created():271] file/dir created: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/wandb-summary.json
2024-02-08 18:45:39,842 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: poll_exit
2024-02-08 18:45:40,054 INFO    SenderThread:789 [sender.py:transition_state():617] send defer: 8
2024-02-08 18:45:40,054 DEBUG   SenderThread:789 [sender.py:send_request():409] send_request: poll_exit
2024-02-08 18:45:40,054 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:45:40,055 INFO    HandlerThread:789 [handler.py:handle_request_defer():172] handle defer: 8
2024-02-08 18:45:40,055 DEBUG   SenderThread:789 [sender.py:send_request():409] send_request: defer
2024-02-08 18:45:40,055 INFO    SenderThread:789 [sender.py:send_request_defer():613] handle sender defer: 8
2024-02-08 18:45:40,055 INFO    SenderThread:789 [job_builder.py:build():298] Attempting to build job artifact
2024-02-08 18:45:40,056 INFO    SenderThread:789 [job_builder.py:_get_source_type():439] no source found
2024-02-08 18:45:40,056 INFO    SenderThread:789 [sender.py:transition_state():617] send defer: 9
2024-02-08 18:45:40,056 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:45:40,056 INFO    HandlerThread:789 [handler.py:handle_request_defer():172] handle defer: 9
2024-02-08 18:45:40,056 DEBUG   SenderThread:789 [sender.py:send_request():409] send_request: defer
2024-02-08 18:45:40,057 INFO    SenderThread:789 [sender.py:send_request_defer():613] handle sender defer: 9
2024-02-08 18:45:40,057 INFO    SenderThread:789 [dir_watcher.py:finish():358] shutting down directory watcher
2024-02-08 18:45:40,489 INFO    Thread-12 :789 [dir_watcher.py:_on_file_modified():288] file/dir modified: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/output.log
2024-02-08 18:45:40,489 INFO    SenderThread:789 [dir_watcher.py:finish():388] scan: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files
2024-02-08 18:45:40,490 INFO    SenderThread:789 [dir_watcher.py:finish():402] scan save: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/config.yaml config.yaml
2024-02-08 18:45:40,490 INFO    SenderThread:789 [dir_watcher.py:finish():402] scan save: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/requirements.txt requirements.txt
2024-02-08 18:45:40,490 INFO    SenderThread:789 [dir_watcher.py:finish():402] scan save: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/conda-environment.yaml conda-environment.yaml
2024-02-08 18:45:40,490 INFO    SenderThread:789 [dir_watcher.py:finish():402] scan save: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/wandb-metadata.json wandb-metadata.json
2024-02-08 18:45:40,490 INFO    SenderThread:789 [dir_watcher.py:finish():402] scan save: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/output.log output.log
2024-02-08 18:45:40,490 INFO    SenderThread:789 [dir_watcher.py:finish():402] scan save: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/wandb-summary.json wandb-summary.json
2024-02-08 18:45:40,493 INFO    SenderThread:789 [sender.py:transition_state():617] send defer: 10
2024-02-08 18:45:40,493 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:45:40,493 INFO    HandlerThread:789 [handler.py:handle_request_defer():172] handle defer: 10
2024-02-08 18:45:40,502 DEBUG   SenderThread:789 [sender.py:send_request():409] send_request: defer
2024-02-08 18:45:40,502 INFO    SenderThread:789 [sender.py:send_request_defer():613] handle sender defer: 10
2024-02-08 18:45:40,502 INFO    SenderThread:789 [file_pusher.py:finish():175] shutting down file pusher
2024-02-08 18:45:40,709 INFO    wandb-upload_1:789 [upload_job.py:push():131] Uploaded file /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/config.yaml
2024-02-08 18:45:40,784 INFO    wandb-upload_2:789 [upload_job.py:push():131] Uploaded file /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/conda-environment.yaml
2024-02-08 18:45:40,825 INFO    wandb-upload_4:789 [upload_job.py:push():131] Uploaded file /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/wandb-summary.json
2024-02-08 18:45:40,825 INFO    wandb-upload_3:789 [upload_job.py:push():131] Uploaded file /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/output.log
2024-02-08 18:45:40,843 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: poll_exit
2024-02-08 18:45:40,843 DEBUG   SenderThread:789 [sender.py:send_request():409] send_request: poll_exit
2024-02-08 18:45:40,852 INFO    wandb-upload_0:789 [upload_job.py:push():131] Uploaded file /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/requirements.txt
2024-02-08 18:45:41,053 INFO    Thread-11 (_thread_body):789 [sender.py:transition_state():617] send defer: 11
2024-02-08 18:45:41,053 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:45:41,053 INFO    HandlerThread:789 [handler.py:handle_request_defer():172] handle defer: 11
2024-02-08 18:45:41,053 DEBUG   SenderThread:789 [sender.py:send_request():409] send_request: defer
2024-02-08 18:45:41,053 INFO    SenderThread:789 [sender.py:send_request_defer():613] handle sender defer: 11
2024-02-08 18:45:41,053 INFO    SenderThread:789 [file_pusher.py:join():181] waiting for file pusher
2024-02-08 18:45:41,054 INFO    SenderThread:789 [sender.py:transition_state():617] send defer: 12
2024-02-08 18:45:41,054 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:45:41,054 INFO    HandlerThread:789 [handler.py:handle_request_defer():172] handle defer: 12
2024-02-08 18:45:41,054 DEBUG   SenderThread:789 [sender.py:send_request():409] send_request: defer
2024-02-08 18:45:41,054 INFO    SenderThread:789 [sender.py:send_request_defer():613] handle sender defer: 12
2024-02-08 18:45:41,054 INFO    SenderThread:789 [file_stream.py:finish():595] file stream finish called
2024-02-08 18:45:41,126 INFO    SenderThread:789 [file_stream.py:finish():599] file stream finish is done
2024-02-08 18:45:41,126 INFO    SenderThread:789 [sender.py:transition_state():617] send defer: 13
2024-02-08 18:45:41,126 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:45:41,126 INFO    HandlerThread:789 [handler.py:handle_request_defer():172] handle defer: 13
2024-02-08 18:45:41,127 DEBUG   SenderThread:789 [sender.py:send_request():409] send_request: defer
2024-02-08 18:45:41,127 INFO    SenderThread:789 [sender.py:send_request_defer():613] handle sender defer: 13
2024-02-08 18:45:41,127 INFO    SenderThread:789 [sender.py:transition_state():617] send defer: 14
2024-02-08 18:45:41,127 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:45:41,127 INFO    HandlerThread:789 [handler.py:handle_request_defer():172] handle defer: 14
2024-02-08 18:45:41,127 DEBUG   SenderThread:789 [sender.py:send():382] send: final
2024-02-08 18:45:41,127 DEBUG   SenderThread:789 [sender.py:send():382] send: footer
2024-02-08 18:45:41,127 DEBUG   SenderThread:789 [sender.py:send_request():409] send_request: defer
2024-02-08 18:45:41,127 INFO    SenderThread:789 [sender.py:send_request_defer():613] handle sender defer: 14
2024-02-08 18:45:41,128 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: poll_exit
2024-02-08 18:45:41,128 DEBUG   SenderThread:789 [sender.py:send_request():409] send_request: poll_exit
2024-02-08 18:45:41,128 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: poll_exit
2024-02-08 18:45:41,129 DEBUG   SenderThread:789 [sender.py:send_request():409] send_request: poll_exit
2024-02-08 18:45:41,129 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: server_info
2024-02-08 18:45:41,129 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: get_summary
2024-02-08 18:45:41,130 DEBUG   SenderThread:789 [sender.py:send_request():409] send_request: server_info
2024-02-08 18:45:41,131 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: sampled_history
2024-02-08 18:45:41,132 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: internal_messages
2024-02-08 18:45:41,132 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: job_info
2024-02-08 18:45:41,196 DEBUG   SenderThread:789 [sender.py:send_request():409] send_request: job_info
2024-02-08 18:45:41,197 INFO    MainThread:789 [wandb_run.py:_footer_history_summary_info():3837] rendering history
2024-02-08 18:45:41,197 INFO    MainThread:789 [wandb_run.py:_footer_history_summary_info():3869] rendering summary
2024-02-08 18:45:41,197 INFO    MainThread:789 [wandb_run.py:_footer_sync_info():3796] logging synced files
2024-02-08 18:45:41,197 DEBUG   HandlerThread:789 [handler.py:handle_request():146] handle_request: shutdown
2024-02-08 18:45:41,197 INFO    HandlerThread:789 [handler.py:finish():866] shutting down handler
2024-02-08 18:45:42,132 INFO    WriterThread:789 [datastore.py:close():294] close: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/run-kk66dbgv.wandb
2024-02-08 18:45:42,197 INFO    SenderThread:789 [sender.py:finish():1548] shutting down sender
2024-02-08 18:45:42,197 INFO    SenderThread:789 [file_pusher.py:finish():175] shutting down file pusher
2024-02-08 18:45:42,197 INFO    SenderThread:789 [file_pusher.py:join():181] waiting for file pusher