Pull requests: modelscope/ms-swift

Pull requests list

[megatron] fix: make bridge exported cloned weights store on CPU
#6714 opened Nov 21, 2025 by HollowMan6
1 of 4 tasks
[WIP] add mindspeed support
#6689 opened Nov 21, 2025 by ji-huazhong Draft
4 tasks
[WIP] Support GRPO TIS/MIS
#6678 opened Nov 20, 2025 by hjh0119 Draft
add muon clip optimizer
#6662 opened Nov 19, 2025 by vx120
1 task
Implement NPU_ENV handling in init.py to support megatron in NPU.
#6661 opened Nov 19, 2025 by vx120
1 task done
Support tree-rollout
#6634 opened Nov 17, 2025 by li2zhi
1 task done
[megatron] support blockwise fp8
#6633 opened Nov 17, 2025 by Jintao-Huang
Add npu fused operators supported in modeling_qwen2
#6610 opened Nov 15, 2025 by tongtong0613
4 tasks
Add conditional distillation support for GKD trainer
#6542 opened Nov 11, 2025 by woshixiaobai2019
3 tasks
[megatron] support megatron MTP
#6496 opened Nov 8, 2025 by Jintao-Huang
Add MFU logging support
#6434 opened Nov 5, 2025 by y2logic
1 task done
[WIP][Exp]Support ray dpo
#6395 opened Nov 1, 2025 by tastelikefeet
1 of 4 tasks
[megatron] update megatron_args default_val
#6252 opened Oct 22, 2025 by Jintao-Huang
feat: Enable for exporting unmerged HF Lora Adapter
#6225 opened Oct 20, 2025 by jason9693
1 of 4 tasks
[WIP] refactor template
#6085 opened Oct 11, 2025 by Jintao-Huang
update docs
#5691 opened Sep 6, 2025 by Jintao-Huang
[model] update minicpmv-4.5 video processor
#5679 opened Sep 5, 2025 by hjh0119
bug fix: RuntimeError when training GRPO with LoRA and PtEngine
#5645 opened Sep 3, 2025 by chenjianhuii
1 of 4 tasks
Bug fix: eval OOM due to deepcopy of torch model
#5607 opened Aug 29, 2025 by hellopahe
1 task done
[init]support gptq grpo in colocate mode
#5569 opened Aug 27, 2025 by ItGirls
1 of 4 tasks
Update dataset_info.json stale
#3723 opened Mar 31, 2025 by sandeep-sm
3 tasks
[WIP] support reasoning_content
#3159 opened Feb 18, 2025 by Jintao-Huang
loss_scale bug when meeting <image>
#3036 opened Feb 8, 2025 by mangoyuan Draft
1 of 4 tasks