Skip to content

Remove deprecated PyTensor function functionality and reduce overhead#1351

Open
ricardoV94 wants to merge 3 commits intopymc-devs:v3from
ricardoV94:reduce_function_overhead
Open

Remove deprecated PyTensor function functionality and reduce overhead#1351
ricardoV94 wants to merge 3 commits intopymc-devs:v3from
ricardoV94:reduce_function_overhead

Conversation

@ricardoV94
Copy link
Copy Markdown
Member

@ricardoV94 ricardoV94 commented Apr 8, 2025

Removing all the deprecated stuff that nobody uses

I think after this there's not much more fat to trim. We could try to avoid the input/output list thing with linkers that don't need it but we still need to sort them/ recognize kwargs which is pretty much what's left.

The C part of the call takes 100ns, so the "overhead" in this case is 200-250ns.
For reference, calling numpy.zeros(100) takes like ~250ns, and a python identity function is ~20ns.

Results of the new benchmark

Before
--------------------------------------------------------------------------------------------------------------------- benchmark: 4 tests ---------------------------------------------------------------------------------------------------------------------
Name (time in ns)                                                                Min                    Max                  Mean              StdDev                Median                IQR            Outliers  OPS (Kops/s)            Rounds  Iterations
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_identity_function_overhead_benchmark[cvm_nogc-trust_input=True]        524.0000 (1.0)      55,453.8000 (8.24)       574.7866 (1.0)      326.9482 (3.11)       560.5500 (1.0)      24.5500 (1.0)      301;2121    1,739.7760 (1.0)       82217          20
test_identity_function_overhead_benchmark[cvm-trust_input=True]             590.9999 (1.13)      6,733.0000 (1.0)        654.5138 (1.14)     105.2706 (1.0)        650.9999 (1.16)     30.9999 (1.26)       89;486    1,527.8516 (0.88)      19552           1
test_identity_function_overhead_benchmark[cvm_nogc-trust_input=False]     1,432.0001 (2.73)     31,008.0000 (4.61)     1,608.2195 (2.80)     296.4835 (2.82)     1,583.0001 (2.82)     50.0002 (2.04)     722;1368      621.8057 (0.36)      37638           1
test_identity_function_overhead_benchmark[cvm-trust_input=False]          1,463.0000 (2.79)     11,541.9998 (1.71)     1,614.7948 (2.81)     329.0091 (3.13)     1,582.9999 (2.82)     60.0000 (2.44)       78;152      619.2737 (0.36)       3899           1
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
After
--------------------------------------------------------------------------------------------------------------------- benchmark: 4 tests ---------------------------------------------------------------------------------------------------------------------
Name (time in ns)                                                                Min                    Max                  Mean              StdDev                Median                IQR            Outliers  OPS (Kops/s)            Rounds  Iterations
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_identity_function_overhead_benchmark[cvm_nogc-trust_input=True]        322.7857 (1.0)       5,672.7143 (1.0)        358.2689 (1.0)       75.7251 (1.0)        354.9286 (1.0)      12.2143 (1.0)     1548;2274    2,791.1995 (1.0)      199641          14
test_identity_function_overhead_benchmark[cvm-trust_input=True]             439.9999 (1.36)     15,960.0002 (2.81)       505.0075 (1.41)     135.9068 (1.79)       500.9999 (1.41)     29.9999 (2.46)      136;793    1,980.1684 (0.71)      32064           1
test_identity_function_overhead_benchmark[cvm_nogc-trust_input=False]     1,312.0000 (4.06)      8,876.9998 (1.56)     1,446.4540 (4.04)     179.0074 (2.36)     1,442.0000 (4.06)     49.9999 (4.09)      114;493      691.3459 (0.25)      36521           1
test_identity_function_overhead_benchmark[cvm-trust_input=False]          1,382.0002 (4.28)      8,705.9998 (1.53)     1,525.5059 (4.26)     198.9647 (2.63)     1,512.9999 (4.26)     50.9999 (4.18)       42;142      655.5202 (0.23)       7106           1
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------


I also removed the strict zip, because passing strict=False is more than 100ns slower than not specifying it at all. Check out with this snippet:

a = tuple(range(10))
b = tuple(range(9))
%timeit tuple(zip(a, b, strict=False))
%timeit tuple(zip(a, b))
587 ns ± 8.29 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
458 ns ± 17.4 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

Closes #1332
Does the same as and closes #1976


📚 Documentation preview 📚: https://pytensor--1351.org.readthedocs.build/en/1351/

@ricardoV94 ricardoV94 force-pushed the reduce_function_overhead branch 2 times, most recently from 0dc3147 to 782bc2d Compare April 9, 2025 18:07
@ricardoV94 ricardoV94 force-pushed the reduce_function_overhead branch 4 times, most recently from 9f026c7 to 4c1f269 Compare April 24, 2025 17:56
@ricardoV94 ricardoV94 changed the title Reduce pytensor function call overhead Remove deprecated PyTensor function functionality and reduce overhead Apr 24, 2025
@ricardoV94 ricardoV94 force-pushed the reduce_function_overhead branch 3 times, most recently from a15d0e5 to 482cded Compare April 25, 2025 17:05
@ricardoV94 ricardoV94 force-pushed the reduce_function_overhead branch from 482cded to b5b538f Compare March 18, 2026 09:37
@ricardoV94 ricardoV94 changed the base branch from main to v3 March 18, 2026 09:38
@ricardoV94 ricardoV94 force-pushed the reduce_function_overhead branch 7 times, most recently from 2e801bc to aaf2419 Compare March 19, 2026 13:16
@ricardoV94 ricardoV94 marked this pull request as ready for review March 19, 2026 14:48
@ricardoV94 ricardoV94 force-pushed the reduce_function_overhead branch from aaf2419 to 7033561 Compare March 24, 2026 11:33
@ricardoV94 ricardoV94 requested a review from Armavica March 24, 2026 15:02
@Armavica
Copy link
Copy Markdown
Member

@ricardoV94 What is this test_overhead_benchmark function?

@ricardoV94
Copy link
Copy Markdown
Member Author

Something that changed name in #1993 probably, let me find and rerun

@ricardoV94
Copy link
Copy Markdown
Member Author

Ah dropped during rebase, it was new to this PR?...

@ricardoV94 ricardoV94 force-pushed the reduce_function_overhead branch from 7033561 to a6fd083 Compare March 24, 2026 16:29
@ricardoV94
Copy link
Copy Markdown
Member Author

ricardoV94 commented Mar 24, 2026

Re-introduced test_identity_function_overhead_benchmark. Updated benchmark on top. It's not as large as it used to be, in part because we already merged several of the changes since then, including 65d01e1

@ricardoV94 ricardoV94 force-pushed the reduce_function_overhead branch 2 times, most recently from d37fd8f to dc5ad26 Compare March 24, 2026 17:21
@ricardoV94 ricardoV94 added this to the 3.0 release milestone Mar 24, 2026
This is actually slower than just not specifying it
@ricardoV94 ricardoV94 force-pushed the reduce_function_overhead branch from dc5ad26 to 867f8b3 Compare March 24, 2026 18:09
@Armavica
Copy link
Copy Markdown
Member

Did you mean to add the DefaultOp and Eigvalsh as part of this PR?
No problem if that's the case but I am trying to understand if they are related to the function overhead

@ricardoV94
Copy link
Copy Markdown
Member Author

ricardoV94 commented Mar 25, 2026

Those Ops relied on special casing of inputs being None at runtime (but not typed as such at definition time) which required a general extra if input is not None guard before type.filter all the time.


# Function.__call__ is responsible for updating the inputs, unless the vm promises to do it itself
self.update_input_storage: tuple[int, Container] = ()
self._update_input_storage: tuple[int, Container] = ()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a tuple[tuple[int, Container], ...]

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I always get tripped by this

@Armavica
Copy link
Copy Markdown
Member

I understood most of it (not all) and it looks mostly okay to me, except in a couple of places where I have doubts

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Do not allow passing None for arbitrary arguments in Function

2 participants