[tmva][sofie] Restructure emitted code to be differentiable with Clad#18332
guitargeek wants to merge 2 commits into root-project:master
Conversation
tmva/sofie/inc/TMVA/SOFIE_common.hxx
Outdated
    return out;
}

inline void Copy(float const *b, float const *e, float *o)
Does providing a pullback for std::copy not work?
No, I tried a bit but then gave up. This was my approach:
#include <Math/CladDerivator.h>

#include <algorithm> // for std::copy
#include <iostream>  // for std::cout

namespace std {
void copy_pullback(double const *first, double const *last, double *out_first, double *_d_out, double *_d_first,
                   double *_d_last, double *_d_out_first)
{
   // Implementation doesn't matter yet, it doesn't compile anyway
}
} // namespace std

void fooImpl(double const *x, double *y)
{
   std::copy(x, x + 1, y);
}

void foo(double const *x, double *y)
{
   fooImpl(x, y);
}

double g(double *variables)
{
   double out;
   foo(variables, &out);
   return out * variables[1];
}

void clademo()
{
   // Call clad to generate the gradient of g.
   auto g_grad = clad::gradient(g, "variables");

   // Execute the generated gradient function.
   double variables[]{3., 4.};
   double grad_output[]{0., 0.};
   g_grad.execute(variables, grad_output);

   std::cout << "grad_output[0]: " << grad_output[0] << std::endl;
   std::cout << "grad_output[1]: " << grad_output[1] << std::endl;

   // Dump the generated gradient code to standard output.
   g_grad.dump();
}

It segfaults. I think Clad just doesn't play well with the STL algorithms that take iterators, so it's better to avoid them, no?
In any case, supporting this is not crucial for this PR. I was refactoring things to avoid this copy call in the generated code anyway.
Once this PR is functional for our use case (it actually is now, but I also want to make the ROOT CI pass again), I'll write up what was not perfect in Clad for this and open issues.
Ah, I see. Probably worth opening an issue in clad…
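For reference, a hedged sketch of the direction taken instead (my own illustration, not the PR's exact code): a plain-pointer `Copy` helper like the one in the diff above is straightforward to provide a pullback for via Clad's `clad::custom_derivatives` namespace; the exact pullback signature below is my assumption based on Clad's convention for void functions (original arguments followed by the adjoints of each argument).

```cpp
#include <cstddef>

// Plain-pointer copy helper, equivalent to std::copy(b, e, o) but without iterators.
inline void Copy(float const *b, float const *e, float *o)
{
   for (float const *p = b; p != e; ++p, ++o)
      *o = *p;
}

namespace clad::custom_derivatives {

// Hand-written pullback for Copy (signature assumed, see lead-in above).
inline void Copy_pullback(float const *b, float const *e, float *o, float *_d_b, float *_d_e, float *_d_o)
{
   std::size_t n = static_cast<std::size_t>(e - b);
   for (std::size_t i = 0; i < n; ++i) {
      _d_b[i] += _d_o[i]; // d(o[i])/d(b[i]) = 1, so the output adjoint flows back to the input
      _d_o[i] = 0.f;      // o[i] was overwritten in the forward pass, so its old adjoint is cleared
   }
}

} // namespace clad::custom_derivatives
```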
});

TMVA_SOFIE_Equal::Session s("Equal_FromONNX.dat");
std::vector<bool> output = s.infer(input1.data(), input2.data());
Did that fail to differentiate?
No, I didn't even try to differentiate the models in the test. I'm solely focusing on the SBI use case that we implement with LHCb. The reason I changed this is that `std::vector<bool>` is not a good output type parameter. See:
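As a minimal sketch of that point (my own illustration, not the material originally linked here): `std::vector<bool>` is a bit-packed specialization, so it exposes no contiguous `bool` storage to hand to C-style code, and its element access returns proxy objects rather than `bool&`.

```cpp
// Minimal illustration of why std::vector<bool> is a poor output buffer.
#include <vector>

void example()
{
   std::vector<float> vf(4);
   float *pf = vf.data(); // fine: contiguous storage that can be passed to C-style code
   (void)pf;

   std::vector<bool> vb(4);
   // bool *pb = vb.data(); // does not compile: the bool specialization provides no data() member
   auto ref = vb[0];        // a proxy object (std::vector<bool>::reference), not bool&
   (void)ref;
}
```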
Test Results: 21 files, 21 suites, 3d 1h 22m 35s ⏱️. Results for commit 02aa923. ♻️ This comment has been updated with latest results.
Force-pushed from 6b90cb6 to 87597cd
Proof of concept test for this PR

Take this ONNX file (remove the .txt extension): VRlL_real_500k_evts_model.onnx.txt

Here are the scripts to convert the model to C++ and then to differentiate it with Clad:

// onnx_to_cpp.C
void onnx_to_cpp()
{
using namespace TMVA::Experimental;
SOFIE::RModelParser_ONNX parser;
SOFIE::RModel model = parser.Parse("./VRlL_real_500k_evts_model.onnx");
model.SetOptimizationLevel(SOFIE::OptimizationLevel::kBasic);
model.Generate();
model.PrintRequiredInputTensors();
model.OutputGenerated("./VRlL_real_500k_evts_model.hxx");
}

// sofie_ad.C
#include "VRlL_real_500k_evts_model.hxx"
#include <Math/CladDerivator.h>
float my_func(TMVA_SOFIE_VRlL_real_500k_evts_model::Session const *session, float const *tensor_x,
float *tensor_theory_params)
{
float out = 0.;
TMVA_SOFIE_VRlL_real_500k_evts_model::doInfer(session, tensor_x, tensor_theory_params, &out);
return out;
}
void sofie_ad()
{
std::vector<float> input1{5.0, 2.0, 1.0, -1.0, 1.0};
std::vector<float> input2{0.0};
// Generated header file shall contain a Session class which requires
// initialization to load the corresponding weights.
TMVA_SOFIE_VRlL_real_500k_evts_model::Session s("VRlL_real_500k_evts_model.dat");
// Once instantiated the session object's infer method can be used
// std::vector<float> out = s.infer(input1.data(), input2.data());
auto func = [&](std::span<float> params) { return s.infer(input1.data(), params.data())[0]; };
auto numDiff = [&](int i) {
const float eps = 1e-4;
std::vector<float> p{input2};
p[i] = input2[i] - eps;
float funcValDown = func(p);
p[i] = input2[i] + eps;
float funcValUp = func(p);
return (funcValUp - funcValDown) / (2 * eps);
};
for (std::size_t i = 0; i < input2.size(); ++i) {
std::cout << i << ":" << std::endl;
std::cout << " numr : " << numDiff(i) << std::endl;
}
float grad_output[]{0., 0., 0., 0., 0.};
auto g_grad = clad::gradient<clad::opts::disable_tbr>(my_func, "tensor_theory_params");
g_grad.execute(&s, input1.data(), input2.data(), grad_output);
std::fill(std::begin(grad_output), std::end(grad_output), 0);
g_grad.execute(&s, input1.data(), input2.data(), grad_output);
std::cout << " clad : " << grad_output[0] << std::endl;
g_grad.dump();
}

Note that …

Usage with expected output (replace …)
Force-pushed from 89b638c to a3d545f
Force-pushed from 3f40542 to 78fcc20
Force-pushed from 4c9920f to 97903fa
Why did we decide not to pursue this?
@vgvassilev, sorry that was totally an accident. Maybe I confused it with another PR, or I wanted to close and re-open the PR to run the tests, but apparently I missed the "reopen" button.
Force-pushed from 9873d07 to 4f822ad
Force-pushed from 575f9e0 to 728e5ad
So it works now, and it even includes a test to validate the gradient of a fully-connected multi-layer network! I'll organize the changes a bit better before marking this as no longer a draft.
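A hedged sketch of what such a gradient-validation test might look like (my own illustration; the actual test in this PR may be structured differently, and `validateGradient` is a hypothetical helper): compare the Clad-generated gradient against central finite differences for each parameter.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: f(params) evaluates the model output,
// g(params, grad) fills the analytic gradient produced by the Clad-generated code.
template <typename Func, typename Grad>
void validateGradient(Func f, Grad g, std::vector<float> params, float tol = 1e-3f)
{
   std::vector<float> analytic(params.size(), 0.f);
   g(params.data(), analytic.data());

   const float eps = 1e-4f;
   for (std::size_t i = 0; i < params.size(); ++i) {
      std::vector<float> up = params;
      std::vector<float> down = params;
      up[i] += eps;
      down[i] -= eps;
      // Central finite difference as the reference value for parameter i.
      float numeric = (f(up.data()) - f(down.data())) / (2 * eps);
      assert(std::abs(numeric - analytic[i]) < tol);
   }
}
```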
Force-pushed from 4084f5b to d1dfa3f
…n Session ctor" This reverts commit 1f747b0.
The idea of this commit is to refactor the `doInfer()` function that implements the inference from a member function of the `Session` struct to a free function that takes the `Session` by `const`-reference.
Restructure emitted code to be differentiable with Clad.
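A hedged sketch of the shape of this restructuring (my own illustration based on the proof of concept above, not a verbatim dump of the emitted header; the generated names and signatures may differ):

```cpp
#include <vector>

// Placeholder namespace; the real one is derived from the model name.
namespace TMVA_SOFIE_Model {

// Before: inference was a member function of the stateful Session, so Clad had to
// differentiate through the implicit `this` pointer of an object that also owns buffers.
//
//    struct Session {
//       std::vector<float> infer(float const *tensor_x, float *tensor_theory_params);
//    };

// After: the Session only holds the weights loaded from the .dat file in its constructor,
struct Session {
   std::vector<float> fWeights;
};

// and the inference itself is a free function that takes the Session as a read-only
// argument, with the differentiable parameters and the output as explicit pointers.
// This is the function handed to clad::gradient in the proof of concept above.
inline void doInfer(Session const *session, float const *tensor_x, float *tensor_theory_params, float *out)
{
   // ... emitted inference code writes the model output into *out ...
   (void)session;
   (void)tensor_x;
   (void)tensor_theory_params;
   *out = 0.f;
}

} // namespace TMVA_SOFIE_Model
```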