[INFO|2025-02-06 17:01:18] llamafactory.model.model_utils.checkpointing:157 >> Gradient checkpointing enabled.
[INFO|2025-02-06 17:01:18] llamafactory.model.adapter:157 >> Upcasting trainable params to float32.
[INFO|2025-02-06 17:01:18] llamafactory.model.adapter:157 >> Fine-tuning method: LoRA
[INFO|2025-02-06 17:01:18] llamafactory.model.model_utils.misc:157 >> Found linear modules: o_proj,gate_proj,up_proj,k_proj,v_proj,down_proj,q_proj
[WARNING|logging.py:328] 2025-02-06 17:01:19,863 >> Unsloth 2025.1.8 patched 28 layers with 28 QKV layers, 28 O layers and 28 MLP layers.
[INFO|2025-02-06 17:01:21] llamafactory.model.loader:157 >> trainable params: 20,185,088 || all params: 7,635,801,600 || trainable%: 0.2643
Detected kernel version 3.10.0, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.
[INFO|trainer.py:741] 2025-02-06 17:01:21,223 >> Using auto half precision backend
[WARNING|<string>:215] 2025-02-06 17:01:21,499 >> ==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 2,477 | Num Epochs = 1
O^O/ \_/ \    Batch size per device = 1 | Gradient Accumulation steps = 1
\        /    Total batch size = 1 | Total steps = 2,477
 "-____-"     Number of trainable parameters = 20,185,088
0%| | 0/2477 [00:00<?, ?it/s]/tmp/tmpi9dfeij2/main.c:6:23: fatal error: stdatomic.h: No such file or directory
#include <stdatomic.h>
^
compilation terminated.
Traceback (most recent call last):
File "/sie/anaconda3/envs/yecp_main/bin/llamafactory-cli", line 8, in <module>
sys.exit(main())
File "/sie/yecp/code/llama_factory_main/src/llamafactory/cli.py", line 112, in main
run_exp()
File "/sie/yecp/code/llama_factory_main/src/llamafactory/train/tuner.py", line 92, in run_exp
_training_function(config={"args": args, "callbacks": callbacks})
File "/sie/yecp/code/llama_factory_main/src/llamafactory/train/tuner.py", line 66, in _training_function
run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
File "/sie/yecp/code/llama_factory_main/src/llamafactory/train/sft/workflow.py", line 101, in run_sft
train_result = trainer.train(resume_from_checkpoint=training_args.resume_from_checkpoint)
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/transformers/trainer.py", line 2171, in train
return inner_training_loop(
File "<string>", line 382, in _fast_inner_training_loop
File "<string>", line 31, in _unsloth_training_step
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/unsloth/models/_utils.py", line 1069, in _unsloth_pre_compute_loss
return self._old_compute_loss(model, inputs, *args, **kwargs)
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/transformers/trainer.py", line 3731, in compute_loss
outputs = model(**inputs)
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/accelerate/utils/operations.py", line 823, in forward
return model_forward(*args, **kwargs)
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/accelerate/utils/operations.py", line 811, in __call__
return convert_to_fp32(self.model_forward(*args, **kwargs))
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/torch/amp/autocast_mode.py", line 43, in decorate_autocast
return func(*args, **kwargs)
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/torch/_compile.py", line 31, in inner
return disable_fn(*args, **kwargs)
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 600, in _fn
return fn(*args, **kwargs)
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/unsloth/models/llama.py", line 1130, in PeftModelForCausalLM_fast_forward
return self.base_model(
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/peft/tuners/tuners_utils.py", line 188, in forward
return self.model.forward(*args, **kwargs)
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/unsloth/models/llama.py", line 990, in _CausalLM_fast_forward
outputs = self.model(
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/unsloth/models/llama.py", line 821, in LlamaModel_fast_forward
hidden_states = Unsloth_Offloaded_Gradient_Checkpointer.apply(
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/torch/autograd/function.py", line 574, in apply
return super().apply(*args, **kwargs) # type: ignore[misc]
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/torch/amp/autocast_mode.py", line 455, in decorate_fwd
return fwd(*args, **kwargs)
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/unsloth_zoo/gradient_checkpointing.py", line 147, in forward
output = forward_function(hidden_states, *args)
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/unsloth/models/llama.py", line 507, in LlamaDecoderLayer_fast_forward
hidden_states = fast_rms_layernorm(self.input_layernorm, hidden_states)
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 600, in _fn
return fn(*args, **kwargs)
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/unsloth/kernels/rms_layernorm.py", line 210, in fast_rms_layernorm
out = Fast_RMS_Layernorm.apply(X, W, eps, gemma)
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/torch/autograd/function.py", line 574, in apply
return super().apply(*args, **kwargs) # type: ignore[misc]
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/unsloth/kernels/rms_layernorm.py", line 156, in forward
fx[(n_rows,)](
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/triton/runtime/jit.py", line 345, in <lambda>
return lambda *args, **kwargs: self.run(grid=grid, warmup=False, *args, **kwargs)
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/triton/runtime/jit.py", line 607, in run
device = driver.active.get_current_device()
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/triton/runtime/driver.py", line 23, in __getattr__
self._initialize_obj()
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/triton/runtime/driver.py", line 20, in _initialize_obj
self._obj = self._init_fn()
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/triton/runtime/driver.py", line 9, in _create_driver
return actives[0]()
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/triton/backends/nvidia/driver.py", line 371, in __init__
self.utils = CudaUtils() # TODO: make static
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/triton/backends/nvidia/driver.py", line 80, in __init__
mod = compile_module_from_src(Path(os.path.join(dirname, "driver.c")).read_text(), "cuda_utils")
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/triton/backends/nvidia/driver.py", line 57, in compile_module_from_src
so = _build(name, src_path, tmpdir, library_dirs(), include_dir, libraries)
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/triton/runtime/build.py", line 48, in _build
ret = subprocess.check_call(cc_cmd)
File "/sie/anaconda3/envs/yecp_main/lib/python3.10/subprocess.py", line 369, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/usr/bin/gcc', '/tmp/tmpi9dfeij2/main.c', '-O3', '-shared', '-fPIC', '-o', '/tmp/tmpi9dfeij2/cuda_utils.cpython-310-x86_64-linux-gnu.so', '-lcuda', '-L/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/triton/backends/nvidia/lib', '-L/lib64', '-L/lib', '-I/sie/anaconda3/envs/yecp_main/lib/python3.10/site-packages/triton/backends/nvidia/include', '-I/tmp/tmpi9dfeij2', '-I/sie/anaconda3/envs/yecp_main/include/python3.10']' returned non-zero exit status 1.
0%| | 0/2477 [00:00<?, ?it/s]
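The traceback ends in Triton invoking `/usr/bin/gcc` on a generated `main.c` that includes `<stdatomic.h>`, a C11 header that GCC ships from version 4.9 onward. A minimal probe, assuming the failure comes from the system compiler lacking that header, is sketched below. The helper name `has_stdatomic` and the probe source are hypothetical and not part of LLaMA-Factory, Unsloth, or Triton.

```python
import os
import shutil
import subprocess
import tempfile

# Tiny C file that only needs <stdatomic.h> to compile (hypothetical probe).
PROBE_SOURCE = (
    "#include <stdatomic.h>\n"
    "int main(void) { atomic_int x = 0; (void)x; return 0; }\n"
)


def has_stdatomic(cc: str) -> bool:
    """Return True if the compiler `cc` can compile a file including <stdatomic.h>."""
    with tempfile.TemporaryDirectory() as tmpdir:
        src = os.path.join(tmpdir, "probe.c")
        obj = os.path.join(tmpdir, "probe.o")
        with open(src, "w") as f:
            f.write(PROBE_SOURCE)
        try:
            # Compile only (-c); no extra -std flag, mirroring the gcc command in the log.
            result = subprocess.run(
                [cc, "-c", src, "-o", obj], capture_output=True, text=True
            )
        except OSError:
            # Compiler binary missing or not executable.
            return False
        return result.returncode == 0


if __name__ == "__main__":
    # Assumption: fall back to whatever gcc is first on PATH when CC is unset.
    cc = os.environ.get("CC") or shutil.which("gcc") or "gcc"
    print(cc, "supports <stdatomic.h>:", has_stdatomic(cc))
```

If the probe reports `False` for `/usr/bin/gcc`, installing a newer GCC and exporting `CC` to point at it before launching `llamafactory-cli` may be worth trying. Whether the installed Triton version honors `CC` is an assumption to verify, not something the log confirms.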
Others
No response
Reminder
System Info
llamafactory version: 0.9.2.dev0
python version: 3.10.16
GPU: A100, 80 GB
Python dependencies
Launch script
Script contents
Reproduction