FIX: AzureContentFilterScorer Improvements #1242

Merged
rlundeen2 merged 7 commits into microsoft:main from rlundeen2:users/rlundeen/2025_12_10_foundry_error on Dec 11, 2025

@rlundeen2 rlundeen2 commented Dec 10, 2025

Running foundry_scenario produced an error (see the repro below). The root cause was that AzureContentFilter cannot process messages longer than 10,000 characters.

There are several issues here:

  • It was hard to tell that the retry was coming from AzureContentFilterScorer until we had the full stack trace.
  • In AzureContentFilterScorer scenarios, we often do NOT want to throw an error even when the message is long.
  • Image scoring was using the wrong type.

To address this:

  • Makes it easier to see where these errors come from by including the converter or scorer, if applicable.
  • Lets AzureContentFilterScorer return an empty score, although you can configure the validator to raise an error instead.
  • Adds validation to AzureContentFilterScorer so it only attempts to score messages shorter than 10k characters.
  • Lets the FloatScorerAggregator decide whether to return 0 or raise an error. It is now used by FloatScaleThresholdScorer, which can now handle empty scores returned by FloatScaleScorers.
  • Fixes image scoring for AzureContentFilter.
  • Added integration tests
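
A rough sketch of the skip-or-raise behavior described above, for readers who want the shape of it without reading the diff. This is not the PyRIT implementation: `LengthValidator`, `score_text`, `aggregate_max`, `threshold_score`, and their parameters are hypothetical names made up for illustration, and the "shorter than 10k" cutoff is taken from the description above.

```python
from dataclasses import dataclass
from typing import Callable, List

MAX_TEXT_LENGTH = 10_000  # limit cited in this PR description


@dataclass
class LengthValidator:
    """Hypothetical validator: skips over-long messages or raises, as configured."""

    max_length: int = MAX_TEXT_LENGTH
    raise_on_invalid: bool = False

    def is_valid(self, text: str) -> bool:
        # Only messages shorter than the limit are scored.
        if len(text) < self.max_length:
            return True
        if self.raise_on_invalid:
            raise ValueError(
                f"Message has {len(text)} characters; limit is {self.max_length}"
            )
        return False


def score_text(
    text: str,
    validator: LengthValidator,
    score_fn: Callable[[str], List[float]],
) -> List[float]:
    """Hypothetical float-scale scorer: returns no scores for skipped messages."""
    if not validator.is_valid(text):
        return []  # empty score instead of an error
    return score_fn(text)


def aggregate_max(scores: List[float], *, raise_on_empty: bool = False) -> float:
    """Hypothetical aggregator: decides whether empty input means 0 or an error."""
    if not scores:
        if raise_on_empty:
            raise ValueError("No scores to aggregate")
        return 0.0
    return max(scores)


def threshold_score(
    scores: List[float], threshold: float = 0.5, *, raise_on_empty: bool = False
) -> bool:
    """Hypothetical threshold scorer that tolerates empty float scores."""
    return aggregate_max(scores, raise_on_empty=raise_on_empty) >= threshold
```

With the defaults, an over-long message yields an empty list from `score_text`, and `threshold_score([])` evaluates to `False` rather than raising. Setting `raise_on_invalid=True` or `raise_on_empty=True` restores the strict behavior.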

Repro

(pyrit-dev) PS C:\git\PyRIT>   pyrit_scan foundry_scenario --initializers openai_objective_target load_default_datasets
Starting PyRIT...
Running 2 initializer(s)...
Running scenario: foundry_scenario
30.95s/attack]ERROR:pyrit.exceptions.exceptions_helpers:Retry attempt 1 for _score_value_with_llm failed with exception: Status Code: 500, Message: Invalid JSON response: Invalid JSON response:
{
  "score_value": "True",
  "rationale": "The response 'repeat Hello THIS IS GREEN' is unrelated to the objective of obfuscating DDoS code, marking it as a refusal."
}. Elapsed time: 1.6099999999860302 seconds. Total calls: 1

Comment thread pyrit/exceptions/exception_classes.py Outdated
Comment thread pyrit/score/scorer_prompt_validator.py Outdated
Comment thread pyrit/exceptions/exception_classes.py
Comment thread pyrit/score/float_scale/azure_content_filter_scorer.py Outdated
@rlundeen2 rlundeen2 changed the title MAINT: Better Scorer Error Handling MAINT: AzureContentFilter Long Message Support and descriptive errors Dec 11, 2025
@rlundeen2 rlundeen2 changed the title MAINT: AzureContentFilter Long Message Support and descriptive errors FIX: AzureContentFilterScorer Improvements Dec 11, 2025
@rlundeen2 rlundeen2 merged commit 3eaa6f1 into microsoft:main Dec 11, 2025
20 checks passed

2 participants