Commit 9b74ac8

[bugfix] Initialize chord dataset after accelerator setup in GRPOTrainer (#6638)
The get_chord_sft_dataloader() method relies on GRPOTrainer.accelerator, but it was previously called before the parent class's __init__ (super().__init__()) had finished initializing the accelerator. As a result, get_chord_sft_dataloader() raised an AttributeError for the not-yet-existent attribute GRPOTrainer.accelerator.
1 parent 175c9e0 commit 9b74ac8

File tree

1 file changed (+2, −0 lines)

swift/trainers/rlhf_trainer/grpo_trainer.py

Lines changed: 2 additions & 0 deletions
@@ -81,6 +81,7 @@ def __init__(self,
         reward_templates = kwargs.pop('reward_template', None)
         self._prepare_algorithm_params()
         super().__init__(model, ref_model, *_args, **kwargs)
+        self._prepare_chord_dataset()
         self.prepare_rollout()
         self._prepare_rewards(reward_funcs, reward_model, reward_templates)

@@ -1868,6 +1869,7 @@ def _prepare_algorithm_params(self):
         self.advantage_estimator = args.advantage_estimator
         self.kl_in_reward = args.kl_in_reward

+    def _prepare_chord_dataset(self):
         # CHORD, https://arxiv.org/abs/2508.11408
         self.chord_sft_iterator = None
         if self.chord_sft_dataset:
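The initialization-order bug this commit fixes can be illustrated with a minimal, self-contained sketch. The Parent/Buggy/Fixed class names below are hypothetical, not the actual trainer code: a subclass __init__ calls a helper that reads an attribute the parent's __init__ has not yet created, which raises AttributeError; moving the call after super().__init__() resolves it.

```python
class Parent:
    def __init__(self):
        # The parent sets up the attribute late in its own __init__,
        # analogous to the accelerator set up by the base trainer.
        self.accelerator = "accelerator"


class Buggy(Parent):
    def __init__(self):
        self._use_accelerator()  # AttributeError: parent init has not run yet
        super().__init__()

    def _use_accelerator(self):
        return self.accelerator


class Fixed(Parent):
    def __init__(self):
        super().__init__()       # let the parent finish first
        self._use_accelerator()  # the attribute now exists

    def _use_accelerator(self):
        return self.accelerator


if __name__ == "__main__":
    try:
        Buggy()
    except AttributeError as exc:
        print("buggy:", exc)
    print("fixed:", Fixed()._use_accelerator())
```

This mirrors the commit: the call to self._prepare_chord_dataset() is placed after super().__init__(model, ref_model, *_args, **kwargs) so that self.accelerator exists by the time the chord dataloader is built.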
