Adapt MindSpeed/Megatron 0.15.3 #25
Conversation
Pull request overview
This PR updates the mcore bridge to better interoperate with MindSpeed/Megatron 0.15.x Transformer Engine (TE) behavior, especially around LoRA adapter injection and TE debug/compat fields.
Changes:
- Add MindSpeed version detection and a helper to build “local” TE linear layers with MindSpeed 0.15.x-compatible semantics on NPU.
- Make LoRA tensor-parallel group resolution robust to MindSpeed TE variants that store the group under `parallel_group` instead of `tp_group`.
- Patch `TELinear.__repr__` to tolerate MindSpeed 0.15.x field name changes on NPU (`input_size`/`output_size`).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `src/mcore_bridge/tuners/lora.py` | Adds MindSpeed-aware helpers and updates LoRA adapter layer construction / tp group resolution for NPU + MindSpeed 0.15.x. |
| `src/mcore_bridge/patcher.py` | Updates TE `__repr__` patch for MindSpeed NPU field compatibility. |
```python
from contextlib import contextmanager, nullcontext
from importlib import metadata
from megatron.core import parallel_state
```
`from importlib import metadata` introduces a module-level name that collides with the existing `metadata` parameter used later in `sharded_state_dict(...)`. This shadowing is legal but makes the file harder to read and can lead to accidental misuse of the module vs. the parameter. Consider importing with an alias (e.g., `importlib_metadata`) and updating `_get_mindspeed_version()` accordingly.
```python
# semantics.
in_features = getattr(self, 'in_features', getattr(self, 'input_size', None))
out_features = getattr(self, 'out_features', getattr(self, 'output_size', None))
use_bias = getattr(self, 'use_bias', getattr(self, 'bias', None) is not None)
```
On the NPU path, `use_bias = getattr(self, 'use_bias', getattr(self, 'bias', None) is not None)` can misreport bias when `self.bias` is a boolean (e.g., `False` still makes the expression true because it is not `None`). To make `__repr__` robust across TE/MindSpeed variants, compute `use_bias` by first reading `bias_attr = getattr(self, 'bias', None)` and handling the boolean case explicitly (otherwise fall back to `bias_attr is not None`).
Suggested change:

```python
# before
use_bias = getattr(self, 'use_bias', getattr(self, 'bias', None) is not None)

# after
bias_attr = getattr(self, 'bias', None)
if hasattr(self, 'use_bias'):
    use_bias = self.use_bias
elif isinstance(bias_attr, bool):
    use_bias = bias_attr
else:
    use_bias = bias_attr is not None
```
Code Review
This pull request introduces compatibility for MindSpeed 0.15.x on NPU devices by updating the patched `TELinear` representation and refactoring LoRA layer updates. It adds logic to handle NPU-specific attribute names and introduces helper functions to manage version-dependent linear layer instantiation and tensor-parallel group resolution. Feedback focuses on improving the robustness of `tp_size` inference in the patched `__repr__` and on caching the MindSpeed version lookup for performance.
```python
parallel_mode = getattr(self, 'parallel_mode', None)
tp_size = 1 if parallel_mode == 'duplicated' else 'unknown'
```
When `tp_size` is missing, it is inferred from `parallel_mode`. However, `parallel_mode` can be `None` for local layers (as seen in the GPU path or default initialization), which should also imply a `tp_size` of 1. Currently, it defaults to `'unknown'` in this case.
Suggested change:

```python
# before
tp_size = 1 if parallel_mode == 'duplicated' else 'unknown'

# after
tp_size = 1 if parallel_mode in ('duplicated', None) else 'unknown'
```
```python
def _get_mindspeed_version():
    try:
        return version.parse(metadata.version('mindspeed'))
    except metadata.PackageNotFoundError:
        return None
    except Exception:
        return None
```
The `_get_mindspeed_version` function is called multiple times during the LoRA update process. Since `metadata.version` involves filesystem access and parsing, it is inefficient to call it repeatedly. Caching the result would improve performance.
Suggested change (note: the cache must distinguish "not yet probed" from "not installed", and the miss sentinel must be converted back to `None` on every return path, not just the first call):

```python
# before
def _get_mindspeed_version():
    try:
        return version.parse(metadata.version('mindspeed'))
    except metadata.PackageNotFoundError:
        return None
    except Exception:
        return None

# after
_MINDSPEED_VERSION = None

def _get_mindspeed_version():
    global _MINDSPEED_VERSION
    if _MINDSPEED_VERSION is None:
        try:
            _MINDSPEED_VERSION = version.parse(metadata.version('mindspeed'))
        except Exception:
            # Cache a sentinel so a missing package is not re-probed on every call.
            _MINDSPEED_VERSION = False
    return _MINDSPEED_VERSION or None
```