2024-02-08 18:53:11,296 INFO    StreamThr :1516 [internal.py:wandb_internal():86] W&B internal server running at pid: 1516, started at: 2024-02-08 18:53:11.296140
2024-02-08 18:53:11,300 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: status
2024-02-08 18:53:11,301 INFO    WriterThread:1516 [datastore.py:open_for_write():85] open: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_185311-5uym8l7w/run-5uym8l7w.wandb
2024-02-08 18:53:11,302 DEBUG   SenderThread:1516 [sender.py:send():382] send: header
2024-02-08 18:53:11,302 DEBUG   SenderThread:1516 [sender.py:send():382] send: run
2024-02-08 18:53:11,515 INFO    SenderThread:1516 [dir_watcher.py:__init__():211] watching files in: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_185311-5uym8l7w/files
2024-02-08 18:53:11,515 INFO    SenderThread:1516 [sender.py:_start_run_threads():1136] run started: 5uym8l7w with start time 1707418391.295585
2024-02-08 18:53:11,519 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: check_version
2024-02-08 18:53:11,520 DEBUG   SenderThread:1516 [sender.py:send_request():409] send_request: check_version
2024-02-08 18:53:11,603 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: run_start
2024-02-08 18:53:11,632 DEBUG   HandlerThread:1516 [system_info.py:__init__():32] System info init
2024-02-08 18:53:11,633 DEBUG   HandlerThread:1516 [system_info.py:__init__():47] System info init done
2024-02-08 18:53:11,633 INFO    HandlerThread:1516 [system_monitor.py:start():194] Starting system monitor
2024-02-08 18:53:11,633 INFO    SystemMonitor:1516 [system_monitor.py:_start():158] Starting system asset monitoring threads
2024-02-08 18:53:11,633 INFO    HandlerThread:1516 [system_monitor.py:probe():214] Collecting system info
2024-02-08 18:53:11,634 INFO    SystemMonitor:1516 [interfaces.py:start():190] Started cpu monitoring
2024-02-08 18:53:11,635 INFO    SystemMonitor:1516 [interfaces.py:start():190] Started disk monitoring
2024-02-08 18:53:11,636 INFO    SystemMonitor:1516 [interfaces.py:start():190] Started gpu monitoring
2024-02-08 18:53:11,637 INFO    SystemMonitor:1516 [interfaces.py:start():190] Started memory monitoring
2024-02-08 18:53:11,637 INFO    SystemMonitor:1516 [interfaces.py:start():190] Started network monitoring
2024-02-08 18:53:11,692 DEBUG   HandlerThread:1516 [system_info.py:probe():196] Probing system
2024-02-08 18:53:11,694 DEBUG   HandlerThread:1516 [gitlib.py:_init_repo():56] git repository is invalid
2024-02-08 18:53:11,694 DEBUG   HandlerThread:1516 [system_info.py:probe():244] Probing system done
2024-02-08 18:53:11,694 DEBUG   HandlerThread:1516 [system_monitor.py:probe():223] {'os': 'Linux-4.14.336-253.554.amzn2.x86_64-x86_64-with-glibc2.35', 'python': '3.10.13', 'heartbeatAt': '2024-02-08T18:53:11.692683', 'startedAt': '2024-02-08T18:53:11.292051', 'docker': None, 'cuda': None, 'args': (), 'state': 'running', 'program': '/home/sagemaker-user/output-7b-26k-lora/../lora_finetuning_push_to_hub_save_local_latest.py', 'codePathLocal': None, 'host': 'default', 'username': 'sagemaker-user', 'executable': '/opt/conda/bin/python3', 'cpu_count': 96, 'cpu_count_logical': 192, 'cpu_freq': {'current': 3259.3440989583337, 'min': 0.0, 'max': 0.0}, 'cpu_freq_per_core': [{'current': 3299.457, 'min': 0.0, 'max': 0.0}, {'current': 3299.806, 'min': 0.0, 'max': 0.0}, {'current': 3205.787, 'min': 0.0, 'max': 0.0}, {'current': 3178.002, 'min': 0.0, 'max': 0.0}, {'current': 3200.706, 'min': 0.0, 'max': 0.0}, {'current': 3196.209, 'min': 0.0, 'max': 0.0}, {'current': 3188.711, 'min': 0.0, 'max': 0.0}, {'current': 3189.097, 'min': 0.0, 'max': 0.0}, {'current': 3206.992, 'min': 0.0, 'max': 0.0}, {'current': 3200.764, 'min': 0.0, 'max': 0.0}, {'current': 2810.939, 'min': 0.0, 'max': 0.0}, {'current': 2847.942, 'min': 0.0, 'max': 0.0}, {'current': 2923.114, 'min': 0.0, 'max': 0.0}, {'current': 3028.23, 'min': 0.0, 'max': 0.0}, {'current': 3012.472, 'min': 0.0, 'max': 0.0}, {'current': 3044.914, 'min': 0.0, 'max': 0.0}, {'current': 2953.37, 'min': 0.0, 'max': 0.0}, {'current': 2957.586, 'min': 0.0, 'max': 0.0}, {'current': 2984.294, 'min': 0.0, 'max': 0.0}, {'current': 2961.352, 'min': 0.0, 'max': 0.0}, {'current': 2901.559, 'min': 0.0, 'max': 0.0}, {'current': 2801.726, 'min': 0.0, 'max': 0.0}, {'current': 2985.17, 'min': 0.0, 'max': 0.0}, {'current': 2963.11, 'min': 0.0, 'max': 0.0}, {'current': 2912.001, 'min': 0.0, 'max': 0.0}, {'current': 2965.712, 'min': 0.0, 'max': 0.0}, {'current': 2966.821, 'min': 0.0, 'max': 0.0}, {'current': 2871.172, 'min': 0.0, 'max': 0.0}, {'current': 
2974.758, 'min': 0.0, 'max': 0.0}, {'current': 2989.099, 'min': 0.0, 'max': 0.0}, {'current': 2948.999, 'min': 0.0, 'max': 0.0}, {'current': 2895.266, 'min': 0.0, 'max': 0.0}, {'current': 3299.988, 'min': 0.0, 'max': 0.0}, {'current': 2924.435, 'min': 0.0, 'max': 0.0}, {'current': 2919.839, 'min': 0.0, 'max': 0.0}, {'current': 2875.943, 'min': 0.0, 'max': 0.0}, {'current': 3300.697, 'min': 0.0, 'max': 0.0}, {'current': 2805.016, 'min': 0.0, 'max': 0.0}, {'current': 3298.583, 'min': 0.0, 'max': 0.0}, {'current': 3298.604, 'min': 0.0, 'max': 0.0}, {'current': 2673.256, 'min': 0.0, 'max': 0.0}, {'current': 3296.503, 'min': 0.0, 'max': 0.0}, {'current': 3139.11, 'min': 0.0, 'max': 0.0}, {'current': 3137.942, 'min': 0.0, 'max': 0.0}, {'current': 2833.969, 'min': 0.0, 'max': 0.0}, {'current': 3153.277, 'min': 0.0, 'max': 0.0}, {'current': 3178.769, 'min': 0.0, 'max': 0.0}, {'current': 3207.604, 'min': 0.0, 'max': 0.0}, {'current': 2892.532, 'min': 0.0, 'max': 0.0}, {'current': 3299.772, 'min': 0.0, 'max': 0.0}, {'current': 3299.641, 'min': 0.0, 'max': 0.0}, {'current': 3300.096, 'min': 0.0, 'max': 0.0}, {'current': 3298.515, 'min': 0.0, 'max': 0.0}, {'current': 3298.26, 'min': 0.0, 'max': 0.0}, {'current': 3299.084, 'min': 0.0, 'max': 0.0}, {'current': 3298.903, 'min': 0.0, 'max': 0.0}, {'current': 3298.866, 'min': 0.0, 'max': 0.0}, {'current': 3300.276, 'min': 0.0, 'max': 0.0}, {'current': 3298.704, 'min': 0.0, 'max': 0.0}, {'current': 3299.342, 'min': 0.0, 'max': 0.0}, {'current': 3297.795, 'min': 0.0, 'max': 0.0}, {'current': 3297.923, 'min': 0.0, 'max': 0.0}, {'current': 3298.013, 'min': 0.0, 'max': 0.0}, {'current': 3297.43, 'min': 0.0, 'max': 0.0}, {'current': 3299.564, 'min': 0.0, 'max': 0.0}, {'current': 3300.54, 'min': 0.0, 'max': 0.0}, {'current': 2747.263, 'min': 0.0, 'max': 0.0}, {'current': 3299.353, 'min': 0.0, 'max': 0.0}, {'current': 3297.896, 'min': 0.0, 'max': 0.0}, {'current': 2533.725, 'min': 0.0, 'max': 0.0}, {'current': 3299.656, 'min': 0.0, 'max': 
0.0}, {'current': 3293.031, 'min': 0.0, 'max': 0.0}, {'current': 3027.834, 'min': 0.0, 'max': 0.0}, {'current': 3024.556, 'min': 0.0, 'max': 0.0}, {'current': 3067.379, 'min': 0.0, 'max': 0.0}, {'current': 3010.826, 'min': 0.0, 'max': 0.0}, {'current': 3101.81, 'min': 0.0, 'max': 0.0}, {'current': 2973.599, 'min': 0.0, 'max': 0.0}, {'current': 3061.27, 'min': 0.0, 'max': 0.0}, {'current': 3291.322, 'min': 0.0, 'max': 0.0}, {'current': 3017.723, 'min': 0.0, 'max': 0.0}, {'current': 2660.496, 'min': 0.0, 'max': 0.0}, {'current': 3004.775, 'min': 0.0, 'max': 0.0}, {'current': 3021.086, 'min': 0.0, 'max': 0.0}, {'current': 3027.592, 'min': 0.0, 'max': 0.0}, {'current': 3059.589, 'min': 0.0, 'max': 0.0}, {'current': 3019.568, 'min': 0.0, 'max': 0.0}, {'current': 3029.623, 'min': 0.0, 'max': 0.0}, {'current': 3080.312, 'min': 0.0, 'max': 0.0}, {'current': 3066.263, 'min': 0.0, 'max': 0.0}, {'current': 2998.37, 'min': 0.0, 'max': 0.0}, {'current': 2949.133, 'min': 0.0, 'max': 0.0}, {'current': 2964.0, 'min': 0.0, 'max': 0.0}, {'current': 3222.788, 'min': 0.0, 'max': 0.0}, {'current': 3299.63, 'min': 0.0, 'max': 0.0}, {'current': 2916.281, 'min': 0.0, 'max': 0.0}, {'current': 2825.282, 'min': 0.0, 'max': 0.0}, {'current': 3038.106, 'min': 0.0, 'max': 0.0}, {'current': 2895.235, 'min': 0.0, 'max': 0.0}, {'current': 3092.874, 'min': 0.0, 'max': 0.0}, {'current': 2924.994, 'min': 0.0, 'max': 0.0}, {'current': 2913.404, 'min': 0.0, 'max': 0.0}, {'current': 2935.638, 'min': 0.0, 'max': 0.0}, {'current': 2583.261, 'min': 0.0, 'max': 0.0}, {'current': 3101.162, 'min': 0.0, 'max': 0.0}, {'current': 3063.704, 'min': 0.0, 'max': 0.0}, {'current': 3093.23, 'min': 0.0, 'max': 0.0}, {'current': 3095.386, 'min': 0.0, 'max': 0.0}, {'current': 2925.773, 'min': 0.0, 'max': 0.0}, {'current': 2920.019, 'min': 0.0, 'max': 0.0}, {'current': 2916.15, 'min': 0.0, 'max': 0.0}, {'current': 2944.025, 'min': 0.0, 'max': 0.0}, {'current': 3259.667, 'min': 0.0, 'max': 0.0}, {'current': 3049.572, 
'min': 0.0, 'max': 0.0}, {'current': 3263.675, 'min': 0.0, 'max': 0.0}, {'current': 3074.497, 'min': 0.0, 'max': 0.0}, {'current': 2923.985, 'min': 0.0, 'max': 0.0}, {'current': 2910.425, 'min': 0.0, 'max': 0.0}, {'current': 2812.861, 'min': 0.0, 'max': 0.0}, {'current': 2874.988, 'min': 0.0, 'max': 0.0}, {'current': 3120.953, 'min': 0.0, 'max': 0.0}, {'current': 3124.25, 'min': 0.0, 'max': 0.0}, {'current': 3113.753, 'min': 0.0, 'max': 0.0}, {'current': 3119.282, 'min': 0.0, 'max': 0.0}, {'current': 2982.281, 'min': 0.0, 'max': 0.0}, {'current': 3048.291, 'min': 0.0, 'max': 0.0}, {'current': 2987.986, 'min': 0.0, 'max': 0.0}, {'current': 2733.968, 'min': 0.0, 'max': 0.0}, {'current': 3274.202, 'min': 0.0, 'max': 0.0}, {'current': 3120.154, 'min': 0.0, 'max': 0.0}, {'current': 3122.388, 'min': 0.0, 'max': 0.0}, {'current': 2592.46, 'min': 0.0, 'max': 0.0}, {'current': 3121.448, 'min': 0.0, 'max': 0.0}, {'current': 3085.363, 'min': 0.0, 'max': 0.0}, {'current': 3176.23, 'min': 0.0, 'max': 0.0}, {'current': 3098.413, 'min': 0.0, 'max': 0.0}, {'current': 3131.838, 'min': 0.0, 'max': 0.0}, {'current': 3297.418, 'min': 0.0, 'max': 0.0}, {'current': 3144.573, 'min': 0.0, 'max': 0.0}, {'current': 3142.177, 'min': 0.0, 'max': 0.0}, {'current': 3135.089, 'min': 0.0, 'max': 0.0}, {'current': 3124.315, 'min': 0.0, 'max': 0.0}, {'current': 3206.745, 'min': 0.0, 'max': 0.0}, {'current': 3197.608, 'min': 0.0, 'max': 0.0}, {'current': 3271.659, 'min': 0.0, 'max': 0.0}, {'current': 3055.483, 'min': 0.0, 'max': 0.0}, {'current': 3299.813, 'min': 0.0, 'max': 0.0}, {'current': 3299.316, 'min': 0.0, 'max': 0.0}, {'current': 3298.471, 'min': 0.0, 'max': 0.0}, {'current': 3275.344, 'min': 0.0, 'max': 0.0}, {'current': 3298.318, 'min': 0.0, 'max': 0.0}, {'current': 3272.185, 'min': 0.0, 'max': 0.0}, {'current': 3299.032, 'min': 0.0, 'max': 0.0}, {'current': 3273.055, 'min': 0.0, 'max': 0.0}, {'current': 3277.573, 'min': 0.0, 'max': 0.0}, {'current': 3274.44, 'min': 0.0, 'max': 0.0}, 
{'current': 3275.925, 'min': 0.0, 'max': 0.0}, {'current': 3279.092, 'min': 0.0, 'max': 0.0}, {'current': 3275.089, 'min': 0.0, 'max': 0.0}, {'current': 3277.671, 'min': 0.0, 'max': 0.0}, {'current': 3299.135, 'min': 0.0, 'max': 0.0}, {'current': 3299.31, 'min': 0.0, 'max': 0.0}, {'current': 3298.038, 'min': 0.0, 'max': 0.0}, {'current': 3218.557, 'min': 0.0, 'max': 0.0}, {'current': 3298.859, 'min': 0.0, 'max': 0.0}, {'current': 3298.545, 'min': 0.0, 'max': 0.0}, {'current': 3027.843, 'min': 0.0, 'max': 0.0}, {'current': 3299.687, 'min': 0.0, 'max': 0.0}, {'current': 3053.229, 'min': 0.0, 'max': 0.0}, {'current': 3299.26, 'min': 0.0, 'max': 0.0}, {'current': 3059.862, 'min': 0.0, 'max': 0.0}, {'current': 3090.937, 'min': 0.0, 'max': 0.0}, {'current': 3094.897, 'min': 0.0, 'max': 0.0}, {'current': 3083.774, 'min': 0.0, 'max': 0.0}, {'current': 3027.722, 'min': 0.0, 'max': 0.0}, {'current': 3303.02, 'min': 0.0, 'max': 0.0}, {'current': 3069.951, 'min': 0.0, 'max': 0.0}, {'current': 3049.694, 'min': 0.0, 'max': 0.0}, {'current': 2814.624, 'min': 0.0, 'max': 0.0}, {'current': 3097.913, 'min': 0.0, 'max': 0.0}, {'current': 2788.423, 'min': 0.0, 'max': 0.0}, {'current': 3299.195, 'min': 0.0, 'max': 0.0}, {'current': 3069.533, 'min': 0.0, 'max': 0.0}, {'current': 3074.679, 'min': 0.0, 'max': 0.0}, {'current': 3066.308, 'min': 0.0, 'max': 0.0}, {'current': 2598.471, 'min': 0.0, 'max': 0.0}, {'current': 3299.109, 'min': 0.0, 'max': 0.0}, {'current': 3299.455, 'min': 0.0, 'max': 0.0}, {'current': 3298.945, 'min': 0.0, 'max': 0.0}, {'current': 3298.926, 'min': 0.0, 'max': 0.0}, {'current': 3299.003, 'min': 0.0, 'max': 0.0}, {'current': 3299.535, 'min': 0.0, 'max': 0.0}], 'disk': {'/': {'total': 32.0, 'used': 0.01256561279296875}}, 'gpu': 'NVIDIA A10G', 'gpu_count': 8, 'gpu_devices': [{'name': 'NVIDIA A10G', 'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 
'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 'memory_total': 24146608128}], 'memory': {'total': 747.9597625732422}}
2024-02-08 18:53:11,694 INFO    HandlerThread:1516 [system_monitor.py:probe():224] Finished collecting system info
2024-02-08 18:53:11,694 INFO    HandlerThread:1516 [system_monitor.py:probe():227] Publishing system info
2024-02-08 18:53:11,694 DEBUG   HandlerThread:1516 [system_info.py:_save_pip():52] Saving list of pip packages installed into the current environment
2024-02-08 18:53:11,695 DEBUG   HandlerThread:1516 [system_info.py:_save_pip():68] Saving pip packages done
2024-02-08 18:53:11,695 DEBUG   HandlerThread:1516 [system_info.py:_save_conda():75] Saving list of conda packages installed into the current environment
2024-02-08 18:53:12,517 INFO    Thread-12 :1516 [dir_watcher.py:_on_file_created():271] file/dir created: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_185311-5uym8l7w/files/conda-environment.yaml
2024-02-08 18:53:12,517 INFO    Thread-12 :1516 [dir_watcher.py:_on_file_created():271] file/dir created: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_185311-5uym8l7w/files/requirements.txt
2024-02-08 18:53:25,997 DEBUG   HandlerThread:1516 [system_info.py:_save_conda():87] Saving conda packages done
2024-02-08 18:53:25,999 INFO    HandlerThread:1516 [system_monitor.py:probe():229] Finished publishing system info
2024-02-08 18:53:26,003 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: status_report
2024-02-08 18:53:26,003 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: keepalive
2024-02-08 18:53:26,003 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: status_report
2024-02-08 18:53:26,003 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: keepalive
2024-02-08 18:53:26,003 DEBUG   SenderThread:1516 [sender.py:send():382] send: files
2024-02-08 18:53:26,004 INFO    SenderThread:1516 [sender.py:_save_file():1392] saving file wandb-metadata.json with policy now
2024-02-08 18:53:26,008 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: stop_status
2024-02-08 18:53:26,008 DEBUG   SenderThread:1516 [sender.py:send_request():409] send_request: stop_status
2024-02-08 18:53:26,010 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: internal_messages
2024-02-08 18:53:26,169 DEBUG   SenderThread:1516 [sender.py:send():382] send: telemetry
2024-02-08 18:53:26,169 DEBUG   SenderThread:1516 [sender.py:send():382] send: config
2024-02-08 18:53:26,170 DEBUG   SenderThread:1516 [sender.py:send():382] send: metric
2024-02-08 18:53:26,170 DEBUG   SenderThread:1516 [sender.py:send():382] send: telemetry
2024-02-08 18:53:26,170 DEBUG   SenderThread:1516 [sender.py:send():382] send: metric
2024-02-08 18:53:26,170 WARNING SenderThread:1516 [sender.py:send_metric():1343] Seen metric with glob (shouldn't happen)
2024-02-08 18:53:26,382 INFO    wandb-upload_0:1516 [upload_job.py:push():131] Uploaded file /tmp/tmpo2kahburwandb/uswugwyb-wandb-metadata.json
2024-02-08 18:53:26,518 INFO    Thread-12 :1516 [dir_watcher.py:_on_file_modified():288] file/dir modified: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_185311-5uym8l7w/files/conda-environment.yaml
2024-02-08 18:53:26,518 INFO    Thread-12 :1516 [dir_watcher.py:_on_file_created():271] file/dir created: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_185311-5uym8l7w/files/output.log
2024-02-08 18:53:26,518 INFO    Thread-12 :1516 [dir_watcher.py:_on_file_created():271] file/dir created: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_185311-5uym8l7w/files/wandb-metadata.json
2024-02-08 18:53:26,821 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: status_report
2024-02-08 18:53:28,518 INFO    Thread-12 :1516 [dir_watcher.py:_on_file_modified():288] file/dir modified: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_185311-5uym8l7w/files/output.log
2024-02-08 18:53:28,894 DEBUG   SenderThread:1516 [sender.py:send():382] send: exit
2024-02-08 18:53:28,894 INFO    SenderThread:1516 [sender.py:send_exit():589] handling exit code: 1
2024-02-08 18:53:28,894 INFO    SenderThread:1516 [sender.py:send_exit():591] handling runtime: 17
2024-02-08 18:53:28,894 INFO    SenderThread:1516 [sender.py:_save_file():1392] saving file wandb-summary.json with policy end
2024-02-08 18:53:28,894 INFO    SenderThread:1516 [sender.py:send_exit():597] send defer
2024-02-08 18:53:28,894 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:53:28,895 INFO    HandlerThread:1516 [handler.py:handle_request_defer():172] handle defer: 0
2024-02-08 18:53:28,895 DEBUG   SenderThread:1516 [sender.py:send_request():409] send_request: defer
2024-02-08 18:53:28,895 INFO    SenderThread:1516 [sender.py:send_request_defer():613] handle sender defer: 0
2024-02-08 18:53:28,895 INFO    SenderThread:1516 [sender.py:transition_state():617] send defer: 1
2024-02-08 18:53:28,895 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:53:28,895 INFO    HandlerThread:1516 [handler.py:handle_request_defer():172] handle defer: 1
2024-02-08 18:53:28,895 DEBUG   SenderThread:1516 [sender.py:send_request():409] send_request: defer
2024-02-08 18:53:28,895 INFO    SenderThread:1516 [sender.py:send_request_defer():613] handle sender defer: 1
2024-02-08 18:53:28,895 INFO    SenderThread:1516 [sender.py:transition_state():617] send defer: 2
2024-02-08 18:53:28,895 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:53:28,895 INFO    HandlerThread:1516 [handler.py:handle_request_defer():172] handle defer: 2
2024-02-08 18:53:28,895 INFO    HandlerThread:1516 [system_monitor.py:finish():203] Stopping system monitor
2024-02-08 18:53:28,897 DEBUG   SystemMonitor:1516 [system_monitor.py:_start():172] Starting system metrics aggregation loop
2024-02-08 18:53:28,897 DEBUG   SystemMonitor:1516 [system_monitor.py:_start():179] Finished system metrics aggregation loop
2024-02-08 18:53:28,897 DEBUG   SystemMonitor:1516 [system_monitor.py:_start():183] Publishing last batch of metrics
2024-02-08 18:53:28,899 INFO    HandlerThread:1516 [interfaces.py:finish():202] Joined cpu monitor
2024-02-08 18:53:28,899 INFO    HandlerThread:1516 [interfaces.py:finish():202] Joined disk monitor
2024-02-08 18:53:28,936 INFO    HandlerThread:1516 [interfaces.py:finish():202] Joined gpu monitor
2024-02-08 18:53:28,936 INFO    HandlerThread:1516 [interfaces.py:finish():202] Joined memory monitor
2024-02-08 18:53:28,936 INFO    HandlerThread:1516 [interfaces.py:finish():202] Joined network monitor
2024-02-08 18:53:28,936 DEBUG   SenderThread:1516 [sender.py:send_request():409] send_request: defer
2024-02-08 18:53:28,936 INFO    SenderThread:1516 [sender.py:send_request_defer():613] handle sender defer: 2
2024-02-08 18:53:28,936 INFO    SenderThread:1516 [sender.py:transition_state():617] send defer: 3
2024-02-08 18:53:28,937 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:53:28,937 INFO    HandlerThread:1516 [handler.py:handle_request_defer():172] handle defer: 3
2024-02-08 18:53:28,937 DEBUG   SenderThread:1516 [sender.py:send():382] send: stats
2024-02-08 18:53:28,938 DEBUG   SenderThread:1516 [sender.py:send_request():409] send_request: defer
2024-02-08 18:53:28,938 INFO    SenderThread:1516 [sender.py:send_request_defer():613] handle sender defer: 3
2024-02-08 18:53:28,938 INFO    SenderThread:1516 [sender.py:transition_state():617] send defer: 4
2024-02-08 18:53:28,938 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:53:28,938 INFO    HandlerThread:1516 [handler.py:handle_request_defer():172] handle defer: 4
2024-02-08 18:53:28,938 DEBUG   SenderThread:1516 [sender.py:send_request():409] send_request: defer
2024-02-08 18:53:28,938 INFO    SenderThread:1516 [sender.py:send_request_defer():613] handle sender defer: 4
2024-02-08 18:53:28,938 INFO    SenderThread:1516 [sender.py:transition_state():617] send defer: 5
2024-02-08 18:53:28,938 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:53:28,938 INFO    HandlerThread:1516 [handler.py:handle_request_defer():172] handle defer: 5
2024-02-08 18:53:28,938 DEBUG   SenderThread:1516 [sender.py:send():382] send: summary
2024-02-08 18:53:28,939 INFO    SenderThread:1516 [sender.py:_save_file():1392] saving file wandb-summary.json with policy end
2024-02-08 18:53:28,940 DEBUG   SenderThread:1516 [sender.py:send_request():409] send_request: defer
2024-02-08 18:53:28,940 INFO    SenderThread:1516 [sender.py:send_request_defer():613] handle sender defer: 5
2024-02-08 18:53:28,940 INFO    SenderThread:1516 [sender.py:transition_state():617] send defer: 6
2024-02-08 18:53:28,940 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:53:28,940 INFO    HandlerThread:1516 [handler.py:handle_request_defer():172] handle defer: 6
2024-02-08 18:53:28,940 DEBUG   SenderThread:1516 [sender.py:send_request():409] send_request: defer
2024-02-08 18:53:28,940 INFO    SenderThread:1516 [sender.py:send_request_defer():613] handle sender defer: 6
2024-02-08 18:53:28,944 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: status_report
2024-02-08 18:53:29,088 INFO    SenderThread:1516 [sender.py:transition_state():617] send defer: 7
2024-02-08 18:53:29,089 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:53:29,089 INFO    HandlerThread:1516 [handler.py:handle_request_defer():172] handle defer: 7
2024-02-08 18:53:29,089 DEBUG   SenderThread:1516 [sender.py:send_request():409] send_request: defer
2024-02-08 18:53:29,089 INFO    SenderThread:1516 [sender.py:send_request_defer():613] handle sender defer: 7
2024-02-08 18:53:29,519 INFO    Thread-12 :1516 [dir_watcher.py:_on_file_modified():288] file/dir modified: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_185311-5uym8l7w/files/config.yaml
2024-02-08 18:53:29,519 INFO    Thread-12 :1516 [dir_watcher.py:_on_file_created():271] file/dir created: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_185311-5uym8l7w/files/wandb-summary.json
2024-02-08 18:53:29,894 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: poll_exit
2024-02-08 18:53:30,180 INFO    SenderThread:1516 [sender.py:transition_state():617] send defer: 8
2024-02-08 18:53:30,180 DEBUG   SenderThread:1516 [sender.py:send_request():409] send_request: poll_exit
2024-02-08 18:53:30,181 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:53:30,181 INFO    HandlerThread:1516 [handler.py:handle_request_defer():172] handle defer: 8
2024-02-08 18:53:30,181 DEBUG   SenderThread:1516 [sender.py:send_request():409] send_request: defer
2024-02-08 18:53:30,181 INFO    SenderThread:1516 [sender.py:send_request_defer():613] handle sender defer: 8
2024-02-08 18:53:30,181 INFO    SenderThread:1516 [job_builder.py:build():298] Attempting to build job artifact
2024-02-08 18:53:30,182 INFO    SenderThread:1516 [job_builder.py:_get_source_type():439] no source found
2024-02-08 18:53:30,182 INFO    SenderThread:1516 [sender.py:transition_state():617] send defer: 9
2024-02-08 18:53:30,182 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:53:30,182 INFO    HandlerThread:1516 [handler.py:handle_request_defer():172] handle defer: 9
2024-02-08 18:53:30,183 DEBUG   SenderThread:1516 [sender.py:send_request():409] send_request: defer
2024-02-08 18:53:30,183 INFO    SenderThread:1516 [sender.py:send_request_defer():613] handle sender defer: 9
2024-02-08 18:53:30,183 INFO    SenderThread:1516 [dir_watcher.py:finish():358] shutting down directory watcher
2024-02-08 18:53:30,519 INFO    Thread-12 :1516 [dir_watcher.py:_on_file_modified():288] file/dir modified: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_185311-5uym8l7w/files/output.log
2024-02-08 18:53:30,519 INFO    SenderThread:1516 [dir_watcher.py:finish():388] scan: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_185311-5uym8l7w/files
2024-02-08 18:53:30,520 INFO    SenderThread:1516 [dir_watcher.py:finish():402] scan save: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_185311-5uym8l7w/files/config.yaml config.yaml
2024-02-08 18:53:30,520 INFO    SenderThread:1516 [dir_watcher.py:finish():402] scan save: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_185311-5uym8l7w/files/requirements.txt requirements.txt
2024-02-08 18:53:30,520 INFO    SenderThread:1516 [dir_watcher.py:finish():402] scan save: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_185311-5uym8l7w/files/conda-environment.yaml conda-environment.yaml
2024-02-08 18:53:30,520 INFO    SenderThread:1516 [dir_watcher.py:finish():402] scan save: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_185311-5uym8l7w/files/wandb-metadata.json wandb-metadata.json
2024-02-08 18:53:30,520 INFO    SenderThread:1516 [dir_watcher.py:finish():402] scan save: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_185311-5uym8l7w/files/output.log output.log
2024-02-08 18:53:30,522 INFO    SenderThread:1516 [dir_watcher.py:finish():402] scan save: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_185311-5uym8l7w/files/wandb-summary.json wandb-summary.json
2024-02-08 18:53:30,524 INFO    SenderThread:1516 [sender.py:transition_state():617] send defer: 10
2024-02-08 18:53:30,524 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:53:30,524 INFO    HandlerThread:1516 [handler.py:handle_request_defer():172] handle defer: 10
2024-02-08 18:53:30,525 DEBUG   SenderThread:1516 [sender.py:send_request():409] send_request: defer
2024-02-08 18:53:30,525 INFO    SenderThread:1516 [sender.py:send_request_defer():613] handle sender defer: 10
2024-02-08 18:53:30,525 INFO    SenderThread:1516 [file_pusher.py:finish():175] shutting down file pusher
2024-02-08 18:53:30,748 INFO    wandb-upload_0:1516 [upload_job.py:push():131] Uploaded file /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_185311-5uym8l7w/files/config.yaml
2024-02-08 18:53:30,807 INFO    wandb-upload_1:1516 [upload_job.py:push():131] Uploaded file /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_185311-5uym8l7w/files/requirements.txt
2024-02-08 18:53:30,838 INFO    wandb-upload_4:1516 [upload_job.py:push():131] Uploaded file /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_185311-5uym8l7w/files/wandb-summary.json
2024-02-08 18:53:30,856 INFO    wandb-upload_3:1516 [upload_job.py:push():131] Uploaded file /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_185311-5uym8l7w/files/output.log
2024-02-08 18:53:30,857 INFO    wandb-upload_2:1516 [upload_job.py:push():131] Uploaded file /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_185311-5uym8l7w/files/conda-environment.yaml
2024-02-08 18:53:30,895 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: poll_exit
2024-02-08 18:53:30,895 DEBUG   SenderThread:1516 [sender.py:send_request():409] send_request: poll_exit
2024-02-08 18:53:31,057 INFO    Thread-11 (_thread_body):1516 [sender.py:transition_state():617] send defer: 11
2024-02-08 18:53:31,057 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:53:31,057 INFO    HandlerThread:1516 [handler.py:handle_request_defer():172] handle defer: 11
2024-02-08 18:53:31,058 DEBUG   SenderThread:1516 [sender.py:send_request():409] send_request: defer
2024-02-08 18:53:31,058 INFO    SenderThread:1516 [sender.py:send_request_defer():613] handle sender defer: 11
2024-02-08 18:53:31,058 INFO    SenderThread:1516 [file_pusher.py:join():181] waiting for file pusher
2024-02-08 18:53:31,058 INFO    SenderThread:1516 [sender.py:transition_state():617] send defer: 12
2024-02-08 18:53:31,058 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:53:31,058 INFO    HandlerThread:1516 [handler.py:handle_request_defer():172] handle defer: 12
2024-02-08 18:53:31,058 DEBUG   SenderThread:1516 [sender.py:send_request():409] send_request: defer
2024-02-08 18:53:31,058 INFO    SenderThread:1516 [sender.py:send_request_defer():613] handle sender defer: 12
2024-02-08 18:53:31,058 INFO    SenderThread:1516 [file_stream.py:finish():595] file stream finish called
2024-02-08 18:53:31,123 INFO    SenderThread:1516 [file_stream.py:finish():599] file stream finish is done
2024-02-08 18:53:31,123 INFO    SenderThread:1516 [sender.py:transition_state():617] send defer: 13
2024-02-08 18:53:31,124 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:53:31,124 INFO    HandlerThread:1516 [handler.py:handle_request_defer():172] handle defer: 13
2024-02-08 18:53:31,124 DEBUG   SenderThread:1516 [sender.py:send_request():409] send_request: defer
2024-02-08 18:53:31,124 INFO    SenderThread:1516 [sender.py:send_request_defer():613] handle sender defer: 13
2024-02-08 18:53:31,124 INFO    SenderThread:1516 [sender.py:transition_state():617] send defer: 14
2024-02-08 18:53:31,124 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:53:31,124 INFO    HandlerThread:1516 [handler.py:handle_request_defer():172] handle defer: 14
2024-02-08 18:53:31,124 DEBUG   SenderThread:1516 [sender.py:send():382] send: final
2024-02-08 18:53:31,124 DEBUG   SenderThread:1516 [sender.py:send():382] send: footer
2024-02-08 18:53:31,124 DEBUG   SenderThread:1516 [sender.py:send_request():409] send_request: defer
2024-02-08 18:53:31,124 INFO    SenderThread:1516 [sender.py:send_request_defer():613] handle sender defer: 14
2024-02-08 18:53:31,125 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: poll_exit
2024-02-08 18:53:31,125 DEBUG   SenderThread:1516 [sender.py:send_request():409] send_request: poll_exit
2024-02-08 18:53:31,126 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: poll_exit
2024-02-08 18:53:31,126 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: server_info
2024-02-08 18:53:31,126 DEBUG   SenderThread:1516 [sender.py:send_request():409] send_request: poll_exit
2024-02-08 18:53:31,126 DEBUG   SenderThread:1516 [sender.py:send_request():409] send_request: server_info
2024-02-08 18:53:31,128 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: get_summary
2024-02-08 18:53:31,128 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: sampled_history
2024-02-08 18:53:31,129 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: internal_messages
2024-02-08 18:53:31,129 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: job_info
2024-02-08 18:53:31,179 DEBUG   SenderThread:1516 [sender.py:send_request():409] send_request: job_info
2024-02-08 18:53:31,179 INFO    MainThread:1516 [wandb_run.py:_footer_history_summary_info():3837] rendering history
2024-02-08 18:53:31,180 INFO    MainThread:1516 [wandb_run.py:_footer_history_summary_info():3869] rendering summary
2024-02-08 18:53:31,180 INFO    MainThread:1516 [wandb_run.py:_footer_sync_info():3796] logging synced files
2024-02-08 18:53:31,180 DEBUG   HandlerThread:1516 [handler.py:handle_request():146] handle_request: shutdown
2024-02-08 18:53:31,180 INFO    HandlerThread:1516 [handler.py:finish():866] shutting down handler
2024-02-08 18:53:32,129 INFO    WriterThread:1516 [datastore.py:close():294] close: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_185311-5uym8l7w/run-5uym8l7w.wandb
2024-02-08 18:53:32,179 INFO    SenderThread:1516 [sender.py:finish():1548] shutting down sender
2024-02-08 18:53:32,180 INFO    SenderThread:1516 [file_pusher.py:finish():175] shutting down file pusher
2024-02-08 18:53:32,180 INFO    SenderThread:1516 [file_pusher.py:join():181] waiting for file pusher