[feat] Add UserLM template support #9021

Open
JimmyMa99 wants to merge 2 commits into modelscope:main from JimmyMa99:feat/userlm-template-upstream

Conversation

@JimmyMa99 (Contributor)

PR type

  • Bug Fix
  • New Feature
  • Document Updates
  • More Models or Datasets Support

PR information

This PR adds initial UserLM support to ms-swift.

Changes included in this PR:

  • add a dedicated userlm template type
  • register microsoft/UserLM-8b under the llama model family with the userlm template
  • implement native swift backend prompt construction for UserLM
  • support the UserLM training/inference pattern where the target turn is the final user message instead of
    the final assistant message
  • add a template-level test for prompt encoding behavior
  • update the supported models tables in both Chinese and English docs

Implementation notes:

  • this PR does not change the model loader
  • this PR does not depend on jinja for UserLM
  • the UserLM template now builds prompts with the native swift template path
  • during training, the template directly marks the final user turn as the supervised target
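The last note is the key behavioral difference from standard chat templates: supervision targets the final user turn rather than the final assistant turn. A minimal, framework-agnostic sketch of the idea (the function and turn structure here are illustrative, not ms-swift's actual internals), using `-100` as the standard ignored-label convention for cross-entropy loss:

```python
IGNORE_INDEX = -100  # label value ignored by cross-entropy loss

def build_labels(turns):
    """Given (role, token_ids) turns, supervise only the final user turn.

    Standard SFT masks everything except the last assistant turn; the
    UserLM pattern inverts this so the model learns to produce the
    *user* side of the conversation.
    """
    last_user = max(i for i, (role, _) in enumerate(turns) if role == 'user')
    labels = []
    for i, (_, ids) in enumerate(turns):
        labels.extend(ids if i == last_user else [IGNORE_INDEX] * len(ids))
    return labels

turns = [
    ('system', [1, 2]),
    ('user', [3, 4, 5]),   # earlier user turn: still masked
    ('assistant', [6, 7]),
    ('user', [8, 9]),      # final user turn: supervised target
]
print(build_labels(turns))  # -> [-100, -100, -100, -100, -100, -100, -100, 8, 9]
```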

Experiment results

Server-side validation was completed with a local copy of the model:

Model:

  • microsoft/UserLM-8b

Validation completed:

  • template encoding smoke passed
  • encoded prompt preserves the last assistant turn and appends the next user header
  • 1-step swift sft smoke passed with template=userlm and template_backend=swift

Minimal training command used:

CUDA_VISIBLE_DEVICES=0 PYTHONPATH=/root/ms-swift-head swift sft \
  --model /root/models/microsoft/UserLM-8b/microsoft/UserLM-8b \
  --dataset /root/ms-swift-head/examples/models/userlm/user_turn_sft.jsonl \
  --split_dataset_ratio 0 \
  --tuner_type lora \
  --torch_dtype bfloat16 \
  --num_train_epochs 1 \
  --max_steps 1 \
  --per_device_train_batch_size 1 \
  --learning_rate 1e-4 \
  --lora_rank 8 \
  --lora_alpha 32 \
  --target_modules all-linear \
  --gradient_accumulation_steps 1 \
  --logging_steps 1 \
  --save_steps 1 \
  --save_total_limit 1 \
  --max_length 2048 \
  --dataset_num_proc 1 \
  --dataloader_num_workers 1 \
  --output_dir /root/ms-swift-head/output/userlm-sft-smoke-swift2
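The `--dataset` path above points to a local JSONL file. For reference, a hypothetical one-line sample in the standard messages format, where the conversation ends on the user message to be supervised (the content is invented purely for illustration):

```python
import json

# Hypothetical sample for a user-turn SFT dataset: the conversation
# ends on a user message, which becomes the training target.
sample = {
    'messages': [
        {'role': 'system', 'content': 'You are simulating a user.'},
        {'role': 'user', 'content': 'How do I install ms-swift?'},
        {'role': 'assistant', 'content': 'Run: pip install ms-swift'},
        {'role': 'user', 'content': 'Does it support LoRA fine-tuning?'},
    ]
}
line = json.dumps(sample, ensure_ascii=False)
print(line)
```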

Result:

  • training finished successfully for 1 step
  • checkpoint saved successfully

@gemini-code-assist bot left a comment

Code Review

This pull request adds support for the microsoft/UserLM-8b model, including documentation updates, model registration, and the implementation of a new userlm template. Review feedback points out a hardcoded local path in the test suite that should be replaced with a model ID for better portability. Additionally, the reviewer recommends removing a redundant encode method override in the UserLMTemplate class and its associated imports to simplify the code and avoid unnecessary performance overhead.



def test_userlm():
    tokenizer = get_processor('/root/models/microsoft/UserLM-8b/microsoft/UserLM-8b')


Severity: high

The model path is hardcoded to a local directory (/root/models/...). This will cause the test to fail in any environment where this specific path does not exist, such as in CI/CD pipelines or on other developers' machines. Please use the model ID microsoft/UserLM-8b instead, which allows the library to download or locate the model automatically.

Suggested change:

- tokenizer = get_processor('/root/models/microsoft/UserLM-8b/microsoft/UserLM-8b')
+ tokenizer = get_processor('microsoft/UserLM-8b')

Comment on lines +6 to +7
from copy import deepcopy
from dataclasses import asdict, dataclass, field


Severity: medium

The imports deepcopy and asdict are only used in the redundant encode method override. If that method is removed, these imports should also be removed to keep the code clean.

Suggested change:

- from copy import deepcopy
- from dataclasses import asdict, dataclass, field
+ from dataclasses import dataclass, field

from ..register import TemplateMeta, register_template
from ..template_inputs import StdTemplateInputs
from ..utils import Context, Prompt, Word, findall
from ..template_inputs import StdTemplateInputs, TemplateInputs


Severity: medium

TemplateInputs is only used in the redundant encode method override and can be removed from the imports if the method is removed.

Suggested change:

- from ..template_inputs import StdTemplateInputs, TemplateInputs
+ from ..template_inputs import StdTemplateInputs

Comment on lines +54 to +65
def encode(self, inputs, return_template_inputs: bool = False, return_length: bool = False):
    from swift.infer_engine import InferRequest
    assert self._processor_inited, ('Please initialize the processor before calling the template.encode method: '
                                    'template.init_processor(processor).')
    if isinstance(inputs, InferRequest):
        inputs = asdict(inputs)
    if isinstance(inputs, dict):
        inputs = TemplateInputs.from_dict(inputs)
    elif isinstance(inputs, TemplateInputs):
        inputs = deepcopy(inputs)
    return super().encode(inputs, return_template_inputs=return_template_inputs, return_length=return_length)



Severity: medium

The encode method override in UserLMTemplate is redundant. The base Template.encode already handles the conversion of a dict to TemplateInputs and verifies that the processor is initialized. Furthermore, the use of deepcopy on TemplateInputs is unnecessary and can significantly impact performance when dealing with large inputs (e.g., multi-modal data). The InferRequest handling is also typically managed by the InferEngine before it calls the template. Removing this override simplifies the implementation without loss of functionality.
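The redundancy the reviewer describes follows a common pattern: when a base class already normalizes its input, a subclass override that repeats the normalization adds only overhead. A simplified, generic illustration of the point (not actual ms-swift code; all class and method names here are invented for the example):

```python
from dataclasses import dataclass

@dataclass
class Inputs:
    text: str

    @classmethod
    def from_dict(cls, d):
        return cls(**d)

class Template:
    def encode(self, inputs):
        # The base class already accepts dicts and converts them,
        # so subclasses do not need to repeat this step.
        if isinstance(inputs, dict):
            inputs = Inputs.from_dict(inputs)
        return f'<{inputs.text}>'

class UserLMTemplate(Template):
    # No encode override needed: repeating the dict conversion (or
    # adding a deepcopy) here would change nothing except performance.
    pass

print(UserLMTemplate().encode({'text': 'hi'}))  # -> <hi>
```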

@JimmyMa99 force-pushed the feat/userlm-template-upstream branch from c176cb9 to 6d2694c on April 5, 2026 at 17:36
@Jintao-Huang (Collaborator)

thanks!

please run:

pip install pre-commit
pre-commit run --all-files
