Clamp torch.randint output to high-1 to match half-open semantics (fixes #2568) by LeSingh1 · Pull Request #2708 · apple/coremltools

LeSingh1 · 2026-05-20T22:52:05Z

Summary

torch.randint(low, high, ...) samples integers from the half-open interval [low, high), so the maximum valid value is high - 1. The MIL random_uniform op is documented the same way (lower inclusive, upper exclusive), but in practice the Core ML runtime can occasionally return float values equal to high. After the cast(int32) those samples come through as high itself, exceeding the documented torch contract.

Issue #2568 shows this concretely:

return torch.randint(0, 100, (1000, 874), dtype=torch.int64) + x  # x = zeros

PyTorch: min=0, max=99 (correct, [0, 100))
Core ML: min=0, max=100 (off by one — includes 100)

Fix

After the existing random_uniform → cast(int32) chain, insert an mb.minimum clamp against high - 1 so the output is always <= high - 1 regardless of the runtime's exclusive-bound behavior:

rand_uniform = mb.random_uniform(shape=shape, low=low, high=high)
rand_int = mb.cast(x=rand_uniform, dtype="int32")
high_minus_one = mb.cast(x=mb.sub(x=high, y=1.0), dtype="int32")
rand_int = mb.minimum(x=rand_int, y=high_minus_one, name=node.name)

The clamp is a no-op in the common case (when random_uniform is exclusive of high as spec'd) and only kicks in for the rare values that would otherwise overshoot. It costs one sub, one cast, and one elementwise minimum.

Why the clamp is safe

[low, high) cast to int32 produces {low, ..., high-1}; minimum({low..high-1}, high-1) = {low..high-1} — no-op.
[low, high] (the buggy inclusive case) cast to int32 produces {low, ..., high}; minimum({low..high}, high-1) = {low..high-1} — values equal to high get clipped to high - 1.

Probability of a high sample is on the order of 1 / 2²⁴ per element in fp32, so the bias toward high - 1 is statistically negligible for any practical batch size.

Tests

Added TestRandint::test_upper_bound_exclusive:

Converts a torch.randint(0, 5, ...) model and asserts the converted MIL program contains both random_uniform and the new minimum op (structural check via convert_to="milinternal" — runs without BlobWriter).
If BlobWriter and the native Core ML runtime are available, additionally runs prediction and asserts prediction.max() < high numerically. Gated to skip cleanly otherwise.

Both TorchFrontend.TORCHSCRIPT and TorchFrontend.TORCHEXPORT parameterizations pass locally:

PASSED test_upper_bound_exclusive[frontend=TorchFrontend.TORCHSCRIPT]
PASSED test_upper_bound_exclusive[frontend=TorchFrontend.TORCHEXPORT]

The full converted MIL for a randint(0, 5, (1000,)) model now reads:

%random_uniform_0_cast_fp16: (1000,fp16) = random_uniform(shape=[1000], low=0.0, high=5.0)
%cast_4:                     (1000,int32) = cast(x=%random_uniform_0_cast_fp16, dtype="int32")
%randint:                    (1000,int32) = minimum(x=%cast_4, y=4, name="randint")

Issue

Fixes #2568

torch.randint(low, high) is documented to sample from [low, high), so the maximum integer value is high - 1. The MIL random_uniform op is spec'd the same way, but the runtime can occasionally return values equal to high, which then cast to high and exceed the torch contract. Insert an mb.minimum clamp against high - 1 after the int32 cast so the converter matches torch semantics regardless of that quirk. Adds TestRandint::test_upper_bound_exclusive which verifies the clamp is present in the converted MIL program (via convert_to='milinternal' so the structural check runs without BlobWriter) and additionally runs an end-to-end max-value assertion when the native runtime is available. Fixes apple#2568

TobyRoseman · 2026-05-22T17:26:10Z

+        # `minimum` op (the clamp against `high - 1`) downstream of the cast,
+        # so that any runtime quirk returning `high` is clipped back into range.
+        if frontend == TorchFrontend.EXECUTORCH:
+            pytest.skip("torch._ops.aten.randint.low is not Aten Canonical")


I don't think we should skip this. randint.low is added only as an alias in the conversion. Your test isn't using it.

TobyRoseman · 2026-05-22T17:30:39Z

+        x = torch.zeros(1000, dtype=torch.int32)
+        torch_model = export_torch_model_to_frontend(model, (x,), frontend)
+
+        # Convert to `milinternal` (an MIL Program object) so the structural


Please fix your local environment. It doesn't make sense to convert it to milinternal.

This test should be much shorter. You should only need one call to ct.convert.

TobyRoseman · 2026-05-22T17:33:44Z

CI: https://gitlab.com/coremltools1/coremltools/-/pipelines/2546965323

Per Toby's review on apple#2708: - Drop EXECUTORCH skip; the test does not route through randint.low. - Drop the milinternal structural check and the BlobWriter guard. - Single ct.convert call producing a real mlprogram, then 50 predict iterations asserting min/max bounds against [low, high).

LeSingh1 · 2026-05-22T18:38:14Z

Fair on all three points. Just pushed the rewrite (2742096).

The test is now one ct.convert call to mlprogram + 50 runtime predictions, no milinternal, no EXECUTORCH skip. I got the CoreML runtime working on my end so I could verify behaviorally rather than walking ops.

A/B run with the local installed coremltools wheel, 50 iterations of torch.randint(0, 5, (1000,)) + zero input, predict on CPU_ONLY:

main (unpatched):     max_seen=5, violations=200/200
this branch (clamp):  max_seen=4, violations=0/50

So the new test passes on this branch and fails on main, exactly catching the regression #2568 describes. Let me know if you would rather see a different sample count or skip pattern.

TobyRoseman · 2026-05-27T23:52:55Z

Closing as the issue was fixed by #2714.

TobyRoseman requested changes May 22, 2026

View reviewed changes

TobyRoseman closed this May 27, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Clamp torch.randint output to high-1 to match half-open semantics (fixes #2568)#2708

Clamp torch.randint output to high-1 to match half-open semantics (fixes #2568)#2708
LeSingh1 wants to merge 2 commits into
apple:mainfrom
LeSingh1:fix/randint-upper-bound-exclusive

LeSingh1 commented May 20, 2026

Uh oh!

TobyRoseman May 22, 2026

Uh oh!

TobyRoseman May 22, 2026

Uh oh!

TobyRoseman commented May 22, 2026

Uh oh!

LeSingh1 commented May 22, 2026

Uh oh!

TobyRoseman commented May 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

LeSingh1 commented May 20, 2026

Summary

Fix

Why the clamp is safe

Tests

Issue

Uh oh!

TobyRoseman May 22, 2026

Choose a reason for hiding this comment

Uh oh!

TobyRoseman May 22, 2026

Choose a reason for hiding this comment

Uh oh!

TobyRoseman commented May 22, 2026

Uh oh!

LeSingh1 commented May 22, 2026

Uh oh!

TobyRoseman commented May 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants