
Sequence parallelism: batch dimension mismatch #3014

Open

likeaTT opened this issue Feb 3, 2025 · 2 comments

likeaTT commented Feb 3, 2025

Describe the bug

Is sequence parallelism currently supported only for sft, or is dpo supported as well? When I set sequence_parallel_size for dpo training, it fails with a tensor size mismatch error.

(screenshot of the error traceback)
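One way such a mismatch can arise (a hypothetical, simplified illustration, not swift's actual implementation): sequence parallelism splits each sequence evenly across ranks, so every sequence length in the batch must be divisible by `sequence_parallel_size`, and dpo batches chosen and rejected responses together, which makes consistent padding across both halves essential. The function and example values below are invented for illustration only.

```python
def split_sequence(tokens, sp_size):
    """Split one padded token sequence evenly across sp_size ranks.

    Raises ValueError when the length is not divisible by sp_size,
    mimicking the kind of tensor-size mismatch seen in the traceback.
    """
    if len(tokens) % sp_size:
        raise ValueError(
            f"sequence length {len(tokens)} is not divisible by "
            f"sequence_parallel_size={sp_size}")
    chunk = len(tokens) // sp_size
    return [tokens[i * chunk:(i + 1) * chunk] for i in range(sp_size)]

# DPO feeds chosen and rejected responses in one batch; if the two halves
# end up padded to different lengths, only one of them may split cleanly:
chosen = [1, 2, 3, 4, 5, 6]        # padded to length 6 -> splits into 2 x 3
rejected = [1, 2, 3, 4, 5, 6, 7]   # padded to length 7 -> cannot split by 2

print(split_sequence(chosen, 2))   # prints [[1, 2, 3], [4, 5, 6]]
try:
    split_sequence(rejected, 2)
except ValueError as e:
    print("error:", e)
```

This is only a shape-level analogy for where a divisibility or padding constraint could surface; the actual failure point in swift's dpo path may differ.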

Your hardware and system info

  • Platform: Linux-6.8.0-40-generic-x86_64-with-glibc2.35
  • Python version: 3.11.11
  • PyTorch version: 2.6.0+cu124 (GPU)
  • Transformers version: 4.48.0
  • Datasets version: 3.2.0
  • Accelerate version: 1.3.0
  • PEFT version: 0.14.0
  • TRL version: 0.14.0
  • GPU type: NVIDIA L40
  • DeepSpeed version: 0.16.3

Additional context

Training arguments:
swift rlhf \
    --train_type lora \
    --seed 42 \
    --rlhf_type dpo \
    --model /data/home/models/sft-llama-3.1-8b \
    --model_type llama3 \
    --template llama3 \
    --dataset data/swift_dfs_preference.json \
    --split_dataset_ratio 0.01 \
    --dataloader_num_workers 4 \
    --dataset_num_proc 4 \
    --max_length 8192 \
    --truncation_strategy left \
    --tools_prompt toolbench \
    --sequence_parallel_size 2 \
    --output_dir ckpt/dpo \
    --deepspeed zero3 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --learning_rate 1e-4 \
    --num_train_epochs 1 \
    --gradient_accumulation_steps $(expr 16 / $nproc_per_node) \
    --save_strategy steps \
    --save_steps 100 \
    --eval_strategy steps \
    --eval_steps 100 \
    --warmup_ratio 0.05 \
    --lora_rank 8 \
    --lora_alpha 32 \
    --beta 0.1 \
    --rpo_alpha 0.1 \
    --torch_dtype bfloat16 \
    --target_modules all-linear \
    --logging_steps 5
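The `--gradient_accumulation_steps $(expr 16 / $nproc_per_node)` expression divides a fixed global accumulation budget of 16 across the launcher's processes. For example, assuming `nproc_per_node=2` (a hypothetical value; the actual value is set outside the snippet above):

```shell
# Hypothetical process count; in the real launch this comes from the launcher.
nproc_per_node=2
gas=$(expr 16 / $nproc_per_node)
echo "$gas"   # prints 8
```

So with 2 processes each rank accumulates 8 micro-batches per optimizer step, keeping the effective global batch constant.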

tastelikefeet (Collaborator) commented

dpo does not raise this error on my side:

{
    "name": "dpo",
    "type": "python",
    "request": "launch",
    "console": "integratedTerminal",
    "module": "torch.distributed.launch",
    "justMyCode": false,
    "env": {
        "PYTHONPATH": "."
    },
    "args": [
        "--nproc_per_node=4",
        "swift/cli/rlhf.py",
        "--rlhf_type", "dpo",
        "--model", "LLM-Research/Meta-Llama-3.1-8B-Instruct",
        "--sft_type", "lora",
        "--dataset", "swift/RLAIF-V-Dataset#100",
        "--eval_steps", "1000",
        "--beta", "0.1",
        "--max_steps", "1000",
        "--save_steps", "1000",
        "--max_length", "8192",
        "--ignore_args_error", "true",
        "--sequence_parallel_size", "4",
        "--gradient_checkpointing_kwargs", "{\"use_reentrant\": false}",
        "--deepspeed", "zero3"
    ]
}
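For anyone reproducing this outside the VS Code debugger, that launch configuration corresponds roughly to the following command line (a sketch assembled from the config above, not an additionally verified invocation):

```shell
# Equivalent of the launch.json entry: torch.distributed.launch as the
# launcher module, PYTHONPATH=. from the config's "env" block.
PYTHONPATH=. python -m torch.distributed.launch --nproc_per_node=4 \
    swift/cli/rlhf.py \
    --rlhf_type dpo \
    --model LLM-Research/Meta-Llama-3.1-8B-Instruct \
    --sft_type lora \
    --dataset 'swift/RLAIF-V-Dataset#100' \
    --eval_steps 1000 \
    --beta 0.1 \
    --max_steps 1000 \
    --save_steps 1000 \
    --max_length 8192 \
    --ignore_args_error true \
    --sequence_parallel_size 4 \
    --gradient_checkpointing_kwargs '{"use_reentrant": false}' \
    --deepspeed zero3
```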

likeaTT (Author) commented Feb 5, 2025

> dpo does not raise this error on my side: (quoting the launch configuration above)
I tried it and it still fails. In addition, sft reports the same error.
