2024-02-08 18:45:21,177 INFO StreamThr :789 [internal.py:wandb_internal():86] W&B internal server running at pid: 789, started at: 2024-02-08 18:45:21.176939
2024-02-08 18:45:21,181 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: status
2024-02-08 18:45:21,182 INFO WriterThread:789 [datastore.py:open_for_write():85] open: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/run-kk66dbgv.wandb
2024-02-08 18:45:21,183 DEBUG SenderThread:789 [sender.py:send():382] send: header
2024-02-08 18:45:21,183 DEBUG SenderThread:789 [sender.py:send():382] send: run
2024-02-08 18:45:21,486 INFO SenderThread:789 [dir_watcher.py:__init__():211] watching files in: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files
2024-02-08 18:45:21,486 INFO SenderThread:789 [sender.py:_start_run_threads():1136] run started: kk66dbgv with start time 1707417921.176525
2024-02-08 18:45:21,490 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: check_version
2024-02-08 18:45:21,491 DEBUG SenderThread:789 [sender.py:send_request():409] send_request: check_version
2024-02-08 18:45:21,534 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: run_start
2024-02-08 18:45:21,565 DEBUG HandlerThread:789 [system_info.py:__init__():32] System info init
2024-02-08 18:45:21,565 DEBUG HandlerThread:789 [system_info.py:__init__():47] System info init done
2024-02-08 18:45:21,565 INFO HandlerThread:789 [system_monitor.py:start():194] Starting system monitor
2024-02-08 18:45:21,565 INFO SystemMonitor:789 [system_monitor.py:_start():158] Starting system asset monitoring threads
2024-02-08 18:45:21,566 INFO HandlerThread:789 [system_monitor.py:probe():214] Collecting system info
2024-02-08 18:45:21,566 INFO SystemMonitor:789 [interfaces.py:start():190] Started cpu monitoring
2024-02-08 18:45:21,567 INFO SystemMonitor:789 [interfaces.py:start():190] Started disk monitoring
2024-02-08 18:45:21,568 INFO SystemMonitor:789 [interfaces.py:start():190] Started gpu monitoring
2024-02-08 18:45:21,568 INFO SystemMonitor:789 [interfaces.py:start():190] Started memory monitoring
2024-02-08 18:45:21,569 INFO SystemMonitor:789 [interfaces.py:start():190] Started network monitoring
2024-02-08 18:45:21,624 DEBUG HandlerThread:789 [system_info.py:probe():196] Probing system
2024-02-08 18:45:21,626 DEBUG HandlerThread:789 [gitlib.py:_init_repo():56] git repository is invalid
2024-02-08 18:45:21,626 DEBUG HandlerThread:789 [system_info.py:probe():244] Probing system done
2024-02-08 18:45:21,626 DEBUG HandlerThread:789 [system_monitor.py:probe():223] {'os': 'Linux-4.14.336-253.554.amzn2.x86_64-x86_64-with-glibc2.35', 'python': '3.10.13', 'heartbeatAt': '2024-02-08T18:45:21.624647', 'startedAt': '2024-02-08T18:45:21.172989', 'docker': None, 'cuda': None, 'args': (), 'state': 'running', 'program': '/home/sagemaker-user/output-7b-26k-lora/../lora_finetuning_push_to_hub_save_local_latest.py', 'codePathLocal': None, 'host': 'default', 'username': 'sagemaker-user', 'executable': '/opt/conda/bin/python3', 'cpu_count': 96, 'cpu_count_logical': 192, 'cpu_freq': {'current': 3165.433677083333, 'min': 0.0, 'max': 0.0}, 'cpu_freq_per_core': [{'current': 3299.529, 'min': 0.0, 'max': 0.0}, {'current': 3299.574, 'min': 0.0, 'max': 0.0}, {'current': 3299.907, 'min': 0.0, 'max': 0.0}, {'current': 3300.263, 'min': 0.0, 'max': 0.0}, {'current': 3300.897, 'min': 0.0, 'max': 0.0}, {'current': 3300.385, 'min': 0.0, 'max': 0.0}, {'current': 3298.81, 'min': 0.0, 'max': 0.0}, {'current': 3299.625, 'min': 0.0, 'max': 0.0}, {'current': 3299.926, 'min': 0.0, 'max': 0.0}, {'current': 3300.158, 'min': 0.0, 'max': 0.0}, {'current': 3300.475, 'min': 0.0, 'max': 0.0}, {'current': 3300.688, 'min': 0.0, 'max': 0.0}, {'current': 3295.206, 'min': 0.0, 'max': 0.0}, {'current': 3294.95, 'min': 0.0, 'max': 0.0}, {'current': 3296.334, 'min': 0.0, 'max': 0.0}, {'current': 3297.722, 'min': 0.0, 'max': 0.0}, {'current': 3296.096, 'min': 0.0, 'max': 0.0}, {'current': 3298.885, 'min': 0.0, 'max': 0.0}, {'current': 3297.66, 'min': 0.0, 'max': 0.0}, {'current': 3297.613, 'min': 0.0, 'max': 0.0}, {'current': 3300.244, 'min': 0.0, 'max': 0.0}, {'current': 3241.28, 'min': 0.0, 'max': 0.0}, {'current': 3298.967, 'min': 0.0, 'max': 0.0}, {'current': 3298.457, 'min': 0.0, 'max': 0.0}, {'current': 3298.049, 'min': 0.0, 'max': 0.0}, {'current': 3299.552, 'min': 0.0, 'max': 0.0}, {'current': 3299.807, 'min': 0.0, 'max': 0.0}, {'current': 3242.538, 'min': 0.0, 'max': 0.0}, {'current': 
3299.129, 'min': 0.0, 'max': 0.0}, {'current': 3263.29, 'min': 0.0, 'max': 0.0}, {'current': 3298.421, 'min': 0.0, 'max': 0.0}, {'current': 3299.256, 'min': 0.0, 'max': 0.0}, {'current': 3298.723, 'min': 0.0, 'max': 0.0}, {'current': 3299.38, 'min': 0.0, 'max': 0.0}, {'current': 3299.22, 'min': 0.0, 'max': 0.0}, {'current': 3298.243, 'min': 0.0, 'max': 0.0}, {'current': 3259.228, 'min': 0.0, 'max': 0.0}, {'current': 3297.656, 'min': 0.0, 'max': 0.0}, {'current': 3299.572, 'min': 0.0, 'max': 0.0}, {'current': 3299.246, 'min': 0.0, 'max': 0.0}, {'current': 3299.507, 'min': 0.0, 'max': 0.0}, {'current': 3298.177, 'min': 0.0, 'max': 0.0}, {'current': 3299.762, 'min': 0.0, 'max': 0.0}, {'current': 3300.244, 'min': 0.0, 'max': 0.0}, {'current': 3299.764, 'min': 0.0, 'max': 0.0}, {'current': 3299.71, 'min': 0.0, 'max': 0.0}, {'current': 3299.323, 'min': 0.0, 'max': 0.0}, {'current': 3298.972, 'min': 0.0, 'max': 0.0}, {'current': 2825.298, 'min': 0.0, 'max': 0.0}, {'current': 3300.031, 'min': 0.0, 'max': 0.0}, {'current': 3299.524, 'min': 0.0, 'max': 0.0}, {'current': 3300.753, 'min': 0.0, 'max': 0.0}, {'current': 3300.281, 'min': 0.0, 'max': 0.0}, {'current': 3300.549, 'min': 0.0, 'max': 0.0}, {'current': 3299.256, 'min': 0.0, 'max': 0.0}, {'current': 3300.719, 'min': 0.0, 'max': 0.0}, {'current': 3299.975, 'min': 0.0, 'max': 0.0}, {'current': 3300.721, 'min': 0.0, 'max': 0.0}, {'current': 3300.6, 'min': 0.0, 'max': 0.0}, {'current': 3300.408, 'min': 0.0, 'max': 0.0}, {'current': 3299.691, 'min': 0.0, 'max': 0.0}, {'current': 3299.817, 'min': 0.0, 'max': 0.0}, {'current': 3044.848, 'min': 0.0, 'max': 0.0}, {'current': 3299.416, 'min': 0.0, 'max': 0.0}, {'current': 3300.089, 'min': 0.0, 'max': 0.0}, {'current': 3299.351, 'min': 0.0, 'max': 0.0}, {'current': 2807.753, 'min': 0.0, 'max': 0.0}, {'current': 2853.085, 'min': 0.0, 'max': 0.0}, {'current': 3299.456, 'min': 0.0, 'max': 0.0}, {'current': 3300.145, 'min': 0.0, 'max': 0.0}, {'current': 3299.532, 'min': 0.0, 'max': 
0.0}, {'current': 3300.121, 'min': 0.0, 'max': 0.0}, {'current': 3298.716, 'min': 0.0, 'max': 0.0}, {'current': 2964.818, 'min': 0.0, 'max': 0.0}, {'current': 3299.325, 'min': 0.0, 'max': 0.0}, {'current': 3053.968, 'min': 0.0, 'max': 0.0}, {'current': 3027.575, 'min': 0.0, 'max': 0.0}, {'current': 3034.933, 'min': 0.0, 'max': 0.0}, {'current': 3046.955, 'min': 0.0, 'max': 0.0}, {'current': 3017.189, 'min': 0.0, 'max': 0.0}, {'current': 3052.512, 'min': 0.0, 'max': 0.0}, {'current': 3049.645, 'min': 0.0, 'max': 0.0}, {'current': 3056.957, 'min': 0.0, 'max': 0.0}, {'current': 3063.442, 'min': 0.0, 'max': 0.0}, {'current': 3026.186, 'min': 0.0, 'max': 0.0}, {'current': 3059.995, 'min': 0.0, 'max': 0.0}, {'current': 3058.868, 'min': 0.0, 'max': 0.0}, {'current': 3059.978, 'min': 0.0, 'max': 0.0}, {'current': 2639.982, 'min': 0.0, 'max': 0.0}, {'current': 3043.356, 'min': 0.0, 'max': 0.0}, {'current': 3032.312, 'min': 0.0, 'max': 0.0}, {'current': 3024.784, 'min': 0.0, 'max': 0.0}, {'current': 3309.767, 'min': 0.0, 'max': 0.0}, {'current': 3044.167, 'min': 0.0, 'max': 0.0}, {'current': 3074.821, 'min': 0.0, 'max': 0.0}, {'current': 2744.486, 'min': 0.0, 'max': 0.0}, {'current': 2948.546, 'min': 0.0, 'max': 0.0}, {'current': 3265.131, 'min': 0.0, 'max': 0.0}, {'current': 3260.141, 'min': 0.0, 'max': 0.0}, {'current': 3264.163, 'min': 0.0, 'max': 0.0}, {'current': 3299.133, 'min': 0.0, 'max': 0.0}, {'current': 3260.992, 'min': 0.0, 'max': 0.0}, {'current': 3299.601, 'min': 0.0, 'max': 0.0}, {'current': 3266.096, 'min': 0.0, 'max': 0.0}, {'current': 3299.245, 'min': 0.0, 'max': 0.0}, {'current': 3298.423, 'min': 0.0, 'max': 0.0}, {'current': 3262.508, 'min': 0.0, 'max': 0.0}, {'current': 3270.751, 'min': 0.0, 'max': 0.0}, {'current': 3265.57, 'min': 0.0, 'max': 0.0}, {'current': 3268.221, 'min': 0.0, 'max': 0.0}, {'current': 3262.709, 'min': 0.0, 'max': 0.0}, {'current': 3262.206, 'min': 0.0, 'max': 0.0}, {'current': 3270.565, 'min': 0.0, 'max': 0.0}, {'current': 3298.66, 
'min': 0.0, 'max': 0.0}, {'current': 3271.159, 'min': 0.0, 'max': 0.0}, {'current': 3269.543, 'min': 0.0, 'max': 0.0}, {'current': 2891.532, 'min': 0.0, 'max': 0.0}, {'current': 3299.121, 'min': 0.0, 'max': 0.0}, {'current': 3267.57, 'min': 0.0, 'max': 0.0}, {'current': 3273.911, 'min': 0.0, 'max': 0.0}, {'current': 3271.579, 'min': 0.0, 'max': 0.0}, {'current': 3271.885, 'min': 0.0, 'max': 0.0}, {'current': 3269.181, 'min': 0.0, 'max': 0.0}, {'current': 3299.12, 'min': 0.0, 'max': 0.0}, {'current': 3272.274, 'min': 0.0, 'max': 0.0}, {'current': 3298.966, 'min': 0.0, 'max': 0.0}, {'current': 3298.849, 'min': 0.0, 'max': 0.0}, {'current': 3298.555, 'min': 0.0, 'max': 0.0}, {'current': 3298.44, 'min': 0.0, 'max': 0.0}, {'current': 3299.027, 'min': 0.0, 'max': 0.0}, {'current': 3299.417, 'min': 0.0, 'max': 0.0}, {'current': 3298.561, 'min': 0.0, 'max': 0.0}, {'current': 3298.684, 'min': 0.0, 'max': 0.0}, {'current': 3298.308, 'min': 0.0, 'max': 0.0}, {'current': 3299.07, 'min': 0.0, 'max': 0.0}, {'current': 3297.982, 'min': 0.0, 'max': 0.0}, {'current': 3298.738, 'min': 0.0, 'max': 0.0}, {'current': 3297.558, 'min': 0.0, 'max': 0.0}, {'current': 3297.74, 'min': 0.0, 'max': 0.0}, {'current': 3299.099, 'min': 0.0, 'max': 0.0}, {'current': 3299.072, 'min': 0.0, 'max': 0.0}, {'current': 3298.608, 'min': 0.0, 'max': 0.0}, {'current': 3299.045, 'min': 0.0, 'max': 0.0}, {'current': 3293.695, 'min': 0.0, 'max': 0.0}, {'current': 3299.228, 'min': 0.0, 'max': 0.0}, {'current': 3299.509, 'min': 0.0, 'max': 0.0}, {'current': 3298.722, 'min': 0.0, 'max': 0.0}, {'current': 3299.9, 'min': 0.0, 'max': 0.0}, {'current': 3299.551, 'min': 0.0, 'max': 0.0}, {'current': 3299.029, 'min': 0.0, 'max': 0.0}, {'current': 3299.307, 'min': 0.0, 'max': 0.0}, {'current': 3298.752, 'min': 0.0, 'max': 0.0}, {'current': 3299.526, 'min': 0.0, 'max': 0.0}, {'current': 3299.18, 'min': 0.0, 'max': 0.0}, {'current': 3299.048, 'min': 0.0, 'max': 0.0}, {'current': 3299.113, 'min': 0.0, 'max': 0.0}, 
{'current': 3299.319, 'min': 0.0, 'max': 0.0}, {'current': 3299.493, 'min': 0.0, 'max': 0.0}, {'current': 3299.269, 'min': 0.0, 'max': 0.0}, {'current': 3299.472, 'min': 0.0, 'max': 0.0}, {'current': 3299.484, 'min': 0.0, 'max': 0.0}, {'current': 3299.416, 'min': 0.0, 'max': 0.0}, {'current': 3299.596, 'min': 0.0, 'max': 0.0}, {'current': 3299.52, 'min': 0.0, 'max': 0.0}, {'current': 3298.897, 'min': 0.0, 'max': 0.0}, {'current': 3299.216, 'min': 0.0, 'max': 0.0}, {'current': 3299.001, 'min': 0.0, 'max': 0.0}, {'current': 3300.316, 'min': 0.0, 'max': 0.0}, {'current': 2995.097, 'min': 0.0, 'max': 0.0}, {'current': 2690.969, 'min': 0.0, 'max': 0.0}, {'current': 3300.22, 'min': 0.0, 'max': 0.0}, {'current': 3008.014, 'min': 0.0, 'max': 0.0}, {'current': 3299.622, 'min': 0.0, 'max': 0.0}, {'current': 2987.966, 'min': 0.0, 'max': 0.0}, {'current': 3021.177, 'min': 0.0, 'max': 0.0}, {'current': 3032.724, 'min': 0.0, 'max': 0.0}, {'current': 2997.024, 'min': 0.0, 'max': 0.0}, {'current': 3036.103, 'min': 0.0, 'max': 0.0}, {'current': 2998.071, 'min': 0.0, 'max': 0.0}, {'current': 3298.959, 'min': 0.0, 'max': 0.0}, {'current': 3043.183, 'min': 0.0, 'max': 0.0}, {'current': 3299.567, 'min': 0.0, 'max': 0.0}, {'current': 3027.171, 'min': 0.0, 'max': 0.0}, {'current': 2961.029, 'min': 0.0, 'max': 0.0}, {'current': 3059.873, 'min': 0.0, 'max': 0.0}, {'current': 3037.985, 'min': 0.0, 'max': 0.0}, {'current': 3009.778, 'min': 0.0, 'max': 0.0}, {'current': 3032.565, 'min': 0.0, 'max': 0.0}, {'current': 3272.763, 'min': 0.0, 'max': 0.0}, {'current': 3109.523, 'min': 0.0, 'max': 0.0}, {'current': 3299.902, 'min': 0.0, 'max': 0.0}, {'current': 3283.894, 'min': 0.0, 'max': 0.0}], 'disk': {'/': {'total': 32.0, 'used': 0.012481689453125}}, 'gpu': 'NVIDIA A10G', 'gpu_count': 8, 'gpu_devices': [{'name': 'NVIDIA A10G', 'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 
'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 'memory_total': 24146608128}, {'name': 'NVIDIA A10G', 'memory_total': 24146608128}], 'memory': {'total': 747.9597625732422}}
2024-02-08 18:45:21,626 INFO HandlerThread:789 [system_monitor.py:probe():224] Finished collecting system info
2024-02-08 18:45:21,626 INFO HandlerThread:789 [system_monitor.py:probe():227] Publishing system info
2024-02-08 18:45:21,626 DEBUG HandlerThread:789 [system_info.py:_save_pip():52] Saving list of pip packages installed into the current environment
2024-02-08 18:45:21,627 DEBUG HandlerThread:789 [system_info.py:_save_pip():68] Saving pip packages done
2024-02-08 18:45:21,627 DEBUG HandlerThread:789 [system_info.py:_save_conda():75] Saving list of conda packages installed into the current environment
2024-02-08 18:45:22,487 INFO Thread-12 :789 [dir_watcher.py:_on_file_created():271] file/dir created: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/conda-environment.yaml
2024-02-08 18:45:22,487 INFO Thread-12 :789 [dir_watcher.py:_on_file_created():271] file/dir created: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/requirements.txt
2024-02-08 18:45:35,889 DEBUG HandlerThread:789 [system_info.py:_save_conda():87] Saving conda packages done
2024-02-08 18:45:35,890 INFO HandlerThread:789 [system_monitor.py:probe():229] Finished publishing system info
2024-02-08 18:45:35,894 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: status_report
2024-02-08 18:45:35,894 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: keepalive
2024-02-08 18:45:35,894 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: status_report
2024-02-08 18:45:35,894 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: keepalive
2024-02-08 18:45:35,895 DEBUG SenderThread:789 [sender.py:send():382] send: files
2024-02-08 18:45:35,895 INFO SenderThread:789 [sender.py:_save_file():1392] saving file wandb-metadata.json with policy now
2024-02-08 18:45:35,899 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: stop_status
2024-02-08 18:45:35,900 DEBUG SenderThread:789 [sender.py:send_request():409] send_request: stop_status
2024-02-08 18:45:35,906 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: internal_messages
2024-02-08 18:45:36,041 DEBUG SenderThread:789 [sender.py:send():382] send: telemetry
2024-02-08 18:45:36,042 DEBUG SenderThread:789 [sender.py:send():382] send: config
2024-02-08 18:45:36,042 DEBUG SenderThread:789 [sender.py:send():382] send: metric
2024-02-08 18:45:36,042 DEBUG SenderThread:789 [sender.py:send():382] send: telemetry
2024-02-08 18:45:36,042 DEBUG SenderThread:789 [sender.py:send():382] send: metric
2024-02-08 18:45:36,042 WARNING SenderThread:789 [sender.py:send_metric():1343] Seen metric with glob (shouldn't happen)
2024-02-08 18:45:36,244 INFO wandb-upload_0:789 [upload_job.py:push():131] Uploaded file /tmp/tmph6r9wm0rwandb/nitc481h-wandb-metadata.json
2024-02-08 18:45:36,488 INFO Thread-12 :789 [dir_watcher.py:_on_file_modified():288] file/dir modified: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/conda-environment.yaml
2024-02-08 18:45:36,488 INFO Thread-12 :789 [dir_watcher.py:_on_file_created():271] file/dir created: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/wandb-metadata.json
2024-02-08 18:45:36,488 INFO Thread-12 :789 [dir_watcher.py:_on_file_created():271] file/dir created: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/output.log
2024-02-08 18:45:36,754 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: status_report
2024-02-08 18:45:38,488 INFO Thread-12 :789 [dir_watcher.py:_on_file_modified():288] file/dir modified: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/output.log
2024-02-08 18:45:38,842 DEBUG SenderThread:789 [sender.py:send():382] send: exit
2024-02-08 18:45:38,842 INFO SenderThread:789 [sender.py:send_exit():589] handling exit code: 1
2024-02-08 18:45:38,842 INFO SenderThread:789 [sender.py:send_exit():591] handling runtime: 17
2024-02-08 18:45:38,842 INFO SenderThread:789 [sender.py:_save_file():1392] saving file wandb-summary.json with policy end
2024-02-08 18:45:38,842 INFO SenderThread:789 [sender.py:send_exit():597] send defer
2024-02-08 18:45:38,843 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:45:38,843 INFO HandlerThread:789 [handler.py:handle_request_defer():172] handle defer: 0
2024-02-08 18:45:38,843 DEBUG SenderThread:789 [sender.py:send_request():409] send_request: defer
2024-02-08 18:45:38,843 INFO SenderThread:789 [sender.py:send_request_defer():613] handle sender defer: 0
2024-02-08 18:45:38,843 INFO SenderThread:789 [sender.py:transition_state():617] send defer: 1
2024-02-08 18:45:38,843 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:45:38,843 INFO HandlerThread:789 [handler.py:handle_request_defer():172] handle defer: 1
2024-02-08 18:45:38,843 DEBUG SenderThread:789 [sender.py:send_request():409] send_request: defer
2024-02-08 18:45:38,843 INFO SenderThread:789 [sender.py:send_request_defer():613] handle sender defer: 1
2024-02-08 18:45:38,843 INFO SenderThread:789 [sender.py:transition_state():617] send defer: 2
2024-02-08 18:45:38,843 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:45:38,843 INFO HandlerThread:789 [handler.py:handle_request_defer():172] handle defer: 2
2024-02-08 18:45:38,843 INFO HandlerThread:789 [system_monitor.py:finish():203] Stopping system monitor
2024-02-08 18:45:38,844 DEBUG SystemMonitor:789 [system_monitor.py:_start():172] Starting system metrics aggregation loop
2024-02-08 18:45:38,844 INFO HandlerThread:789 [interfaces.py:finish():202] Joined cpu monitor
2024-02-08 18:45:38,844 DEBUG SystemMonitor:789 [system_monitor.py:_start():179] Finished system metrics aggregation loop
2024-02-08 18:45:38,845 INFO HandlerThread:789 [interfaces.py:finish():202] Joined disk monitor
2024-02-08 18:45:38,845 DEBUG SystemMonitor:789 [system_monitor.py:_start():183] Publishing last batch of metrics
2024-02-08 18:45:38,884 INFO HandlerThread:789 [interfaces.py:finish():202] Joined gpu monitor
2024-02-08 18:45:38,884 INFO HandlerThread:789 [interfaces.py:finish():202] Joined memory monitor
2024-02-08 18:45:38,884 INFO HandlerThread:789 [interfaces.py:finish():202] Joined network monitor
2024-02-08 18:45:38,885 DEBUG SenderThread:789 [sender.py:send_request():409] send_request: defer
2024-02-08 18:45:38,885 INFO SenderThread:789 [sender.py:send_request_defer():613] handle sender defer: 2
2024-02-08 18:45:38,885 INFO SenderThread:789 [sender.py:transition_state():617] send defer: 3
2024-02-08 18:45:38,885 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:45:38,885 INFO HandlerThread:789 [handler.py:handle_request_defer():172] handle defer: 3
2024-02-08 18:45:38,885 DEBUG SenderThread:789 [sender.py:send():382] send: stats
2024-02-08 18:45:38,886 DEBUG SenderThread:789 [sender.py:send_request():409] send_request: defer
2024-02-08 18:45:38,886 INFO SenderThread:789 [sender.py:send_request_defer():613] handle sender defer: 3
2024-02-08 18:45:38,886 INFO SenderThread:789 [sender.py:transition_state():617] send defer: 4
2024-02-08 18:45:38,886 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:45:38,886 INFO HandlerThread:789 [handler.py:handle_request_defer():172] handle defer: 4
2024-02-08 18:45:38,886 DEBUG SenderThread:789 [sender.py:send_request():409] send_request: defer
2024-02-08 18:45:38,886 INFO SenderThread:789 [sender.py:send_request_defer():613] handle sender defer: 4
2024-02-08 18:45:38,886 INFO SenderThread:789 [sender.py:transition_state():617] send defer: 5
2024-02-08 18:45:38,886 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:45:38,886 INFO HandlerThread:789 [handler.py:handle_request_defer():172] handle defer: 5
2024-02-08 18:45:38,887 DEBUG SenderThread:789 [sender.py:send():382] send: summary
2024-02-08 18:45:38,888 INFO SenderThread:789 [sender.py:_save_file():1392] saving file wandb-summary.json with policy end
2024-02-08 18:45:38,888 DEBUG SenderThread:789 [sender.py:send_request():409] send_request: defer
2024-02-08 18:45:38,888 INFO SenderThread:789 [sender.py:send_request_defer():613] handle sender defer: 5
2024-02-08 18:45:38,888 INFO SenderThread:789 [sender.py:transition_state():617] send defer: 6
2024-02-08 18:45:38,888 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:45:38,888 INFO HandlerThread:789 [handler.py:handle_request_defer():172] handle defer: 6
2024-02-08 18:45:38,888 DEBUG SenderThread:789 [sender.py:send_request():409] send_request: defer
2024-02-08 18:45:38,888 INFO SenderThread:789 [sender.py:send_request_defer():613] handle sender defer: 6
2024-02-08 18:45:38,893 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: status_report
2024-02-08 18:45:39,019 INFO SenderThread:789 [sender.py:transition_state():617] send defer: 7
2024-02-08 18:45:39,019 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:45:39,019 INFO HandlerThread:789 [handler.py:handle_request_defer():172] handle defer: 7
2024-02-08 18:45:39,019 DEBUG SenderThread:789 [sender.py:send_request():409] send_request: defer
2024-02-08 18:45:39,020 INFO SenderThread:789 [sender.py:send_request_defer():613] handle sender defer: 7
2024-02-08 18:45:39,489 INFO Thread-12 :789 [dir_watcher.py:_on_file_modified():288] file/dir modified: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/config.yaml
2024-02-08 18:45:39,489 INFO Thread-12 :789 [dir_watcher.py:_on_file_created():271] file/dir created: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/wandb-summary.json
2024-02-08 18:45:39,842 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: poll_exit
2024-02-08 18:45:40,054 INFO SenderThread:789 [sender.py:transition_state():617] send defer: 8
2024-02-08 18:45:40,054 DEBUG SenderThread:789 [sender.py:send_request():409] send_request: poll_exit
2024-02-08 18:45:40,054 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:45:40,055 INFO HandlerThread:789 [handler.py:handle_request_defer():172] handle defer: 8
2024-02-08 18:45:40,055 DEBUG SenderThread:789 [sender.py:send_request():409] send_request: defer
2024-02-08 18:45:40,055 INFO SenderThread:789 [sender.py:send_request_defer():613] handle sender defer: 8
2024-02-08 18:45:40,055 INFO SenderThread:789 [job_builder.py:build():298] Attempting to build job artifact
2024-02-08 18:45:40,056 INFO SenderThread:789 [job_builder.py:_get_source_type():439] no source found
2024-02-08 18:45:40,056 INFO SenderThread:789 [sender.py:transition_state():617] send defer: 9
2024-02-08 18:45:40,056 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:45:40,056 INFO HandlerThread:789 [handler.py:handle_request_defer():172] handle defer: 9
2024-02-08 18:45:40,056 DEBUG SenderThread:789 [sender.py:send_request():409] send_request: defer
2024-02-08 18:45:40,057 INFO SenderThread:789 [sender.py:send_request_defer():613] handle sender defer: 9
2024-02-08 18:45:40,057 INFO SenderThread:789 [dir_watcher.py:finish():358] shutting down directory watcher
2024-02-08 18:45:40,489 INFO Thread-12 :789 [dir_watcher.py:_on_file_modified():288] file/dir modified: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/output.log
2024-02-08 18:45:40,489 INFO SenderThread:789 [dir_watcher.py:finish():388] scan: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files
2024-02-08 18:45:40,490 INFO SenderThread:789 [dir_watcher.py:finish():402] scan save: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/config.yaml config.yaml
2024-02-08 18:45:40,490 INFO SenderThread:789 [dir_watcher.py:finish():402] scan save: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/requirements.txt requirements.txt
2024-02-08 18:45:40,490 INFO SenderThread:789 [dir_watcher.py:finish():402] scan save: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/conda-environment.yaml conda-environment.yaml
2024-02-08 18:45:40,490 INFO SenderThread:789 [dir_watcher.py:finish():402] scan save: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/wandb-metadata.json wandb-metadata.json
2024-02-08 18:45:40,490 INFO SenderThread:789 [dir_watcher.py:finish():402] scan save: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/output.log output.log
2024-02-08 18:45:40,490 INFO SenderThread:789 [dir_watcher.py:finish():402] scan save: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/wandb-summary.json wandb-summary.json
2024-02-08 18:45:40,493 INFO SenderThread:789 [sender.py:transition_state():617] send defer: 10
2024-02-08 18:45:40,493 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:45:40,493 INFO HandlerThread:789 [handler.py:handle_request_defer():172] handle defer: 10
2024-02-08 18:45:40,502 DEBUG SenderThread:789 [sender.py:send_request():409] send_request: defer
2024-02-08 18:45:40,502 INFO SenderThread:789 [sender.py:send_request_defer():613] handle sender defer: 10
2024-02-08 18:45:40,502 INFO SenderThread:789 [file_pusher.py:finish():175] shutting down file pusher
2024-02-08 18:45:40,709 INFO wandb-upload_1:789 [upload_job.py:push():131] Uploaded file /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/config.yaml
2024-02-08 18:45:40,784 INFO wandb-upload_2:789 [upload_job.py:push():131] Uploaded file /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/conda-environment.yaml
2024-02-08 18:45:40,825 INFO wandb-upload_4:789 [upload_job.py:push():131] Uploaded file /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/wandb-summary.json
2024-02-08 18:45:40,825 INFO wandb-upload_3:789 [upload_job.py:push():131] Uploaded file /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/output.log
2024-02-08 18:45:40,843 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: poll_exit
2024-02-08 18:45:40,843 DEBUG SenderThread:789 [sender.py:send_request():409] send_request: poll_exit
2024-02-08 18:45:40,852 INFO wandb-upload_0:789 [upload_job.py:push():131] Uploaded file /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/files/requirements.txt
2024-02-08 18:45:41,053 INFO Thread-11 (_thread_body):789 [sender.py:transition_state():617] send defer: 11
2024-02-08 18:45:41,053 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:45:41,053 INFO HandlerThread:789 [handler.py:handle_request_defer():172] handle defer: 11
2024-02-08 18:45:41,053 DEBUG SenderThread:789 [sender.py:send_request():409] send_request: defer
2024-02-08 18:45:41,053 INFO SenderThread:789 [sender.py:send_request_defer():613] handle sender defer: 11
2024-02-08 18:45:41,053 INFO SenderThread:789 [file_pusher.py:join():181] waiting for file pusher
2024-02-08 18:45:41,054 INFO SenderThread:789 [sender.py:transition_state():617] send defer: 12
2024-02-08 18:45:41,054 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:45:41,054 INFO HandlerThread:789 [handler.py:handle_request_defer():172] handle defer: 12
2024-02-08 18:45:41,054 DEBUG SenderThread:789 [sender.py:send_request():409] send_request: defer
2024-02-08 18:45:41,054 INFO SenderThread:789 [sender.py:send_request_defer():613] handle sender defer: 12
2024-02-08 18:45:41,054 INFO SenderThread:789 [file_stream.py:finish():595] file stream finish called
2024-02-08 18:45:41,126 INFO SenderThread:789 [file_stream.py:finish():599] file stream finish is done
2024-02-08 18:45:41,126 INFO SenderThread:789 [sender.py:transition_state():617] send defer: 13
2024-02-08 18:45:41,126 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:45:41,126 INFO HandlerThread:789 [handler.py:handle_request_defer():172] handle defer: 13
2024-02-08 18:45:41,127 DEBUG SenderThread:789 [sender.py:send_request():409] send_request: defer
2024-02-08 18:45:41,127 INFO SenderThread:789 [sender.py:send_request_defer():613] handle sender defer: 13
2024-02-08 18:45:41,127 INFO SenderThread:789 [sender.py:transition_state():617] send defer: 14
2024-02-08 18:45:41,127 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: defer
2024-02-08 18:45:41,127 INFO HandlerThread:789 [handler.py:handle_request_defer():172] handle defer: 14
2024-02-08 18:45:41,127 DEBUG SenderThread:789 [sender.py:send():382] send: final
2024-02-08 18:45:41,127 DEBUG SenderThread:789 [sender.py:send():382] send: footer
2024-02-08 18:45:41,127 DEBUG SenderThread:789 [sender.py:send_request():409] send_request: defer
2024-02-08 18:45:41,127 INFO SenderThread:789 [sender.py:send_request_defer():613] handle sender defer: 14
2024-02-08 18:45:41,128 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: poll_exit
2024-02-08 18:45:41,128 DEBUG SenderThread:789 [sender.py:send_request():409] send_request: poll_exit
2024-02-08 18:45:41,128 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: poll_exit
2024-02-08 18:45:41,129 DEBUG SenderThread:789 [sender.py:send_request():409] send_request: poll_exit
2024-02-08 18:45:41,129 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: server_info
2024-02-08 18:45:41,129 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: get_summary
2024-02-08 18:45:41,130 DEBUG SenderThread:789 [sender.py:send_request():409] send_request: server_info
2024-02-08 18:45:41,131 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: sampled_history
2024-02-08 18:45:41,132 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: internal_messages
2024-02-08 18:45:41,132 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: job_info
2024-02-08 18:45:41,196 DEBUG SenderThread:789 [sender.py:send_request():409] send_request: job_info
2024-02-08 18:45:41,197 INFO MainThread:789 [wandb_run.py:_footer_history_summary_info():3837] rendering history
2024-02-08 18:45:41,197 INFO MainThread:789 [wandb_run.py:_footer_history_summary_info():3869] rendering summary
2024-02-08 18:45:41,197 INFO MainThread:789 [wandb_run.py:_footer_sync_info():3796] logging synced files
2024-02-08 18:45:41,197 DEBUG HandlerThread:789 [handler.py:handle_request():146] handle_request: shutdown
2024-02-08 18:45:41,197 INFO HandlerThread:789 [handler.py:finish():866] shutting down handler
2024-02-08 18:45:42,132 INFO WriterThread:789 [datastore.py:close():294] close: /home/sagemaker-user/output-7b-26k-lora/wandb/run-20240208_184521-kk66dbgv/run-kk66dbgv.wandb
2024-02-08 18:45:42,197 INFO SenderThread:789 [sender.py:finish():1548] shutting down sender
2024-02-08 18:45:42,197 INFO SenderThread:789 [file_pusher.py:finish():175] shutting down file pusher
2024-02-08 18:45:42,197 INFO SenderThread:789 [file_pusher.py:join():181] waiting for file pusher