Tests | Add collation transcoding tests #4051

Open
edwardneal wants to merge 2 commits into dotnet:main from edwardneal:tests/collation-transcoding-tests

@edwardneal
Contributor

Description

This PR replaces one test and introduces another.

CollatedDataReaderTest

Previously, CollatedDataReaderTest would create a brand new database with one of two collations (Kazakh_90_CI_AI and Georgian_Modern_Sort_CI_AS), then try to select an ASCII string from it. This had a few issues:

  • It only tested two collations and two code pages
  • The permissions needed to create a new database precluded an Azure SQL instance
  • The ASCII string was a very generous test case: it only contained characters which are present in every code page

I've replaced this with something a little more robust. We now test every collation on a SQL instance, retrieving the character string and its byte representation, then we make sure that they roundtrip. The test also uses a string containing the é character; this isn't present in Kazakh_90_CI_AI's code page, so SQL Server converts it to e. This is legitimate behaviour, but a test which compared the result against the original string would have failed on it.
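The roundtrip check described above can be sketched roughly as follows. This is a minimal illustration, not the test in this PR: `BuildCollationQuery`, `RoundTrips`, the column aliases and the sample string `café` are all hypothetical, and the server results are simulated so that the sketch runs without a database.

```csharp
using System;
using System.Text;

// Legacy code pages (1251, 1252, ...) are only available on modern .NET
// once the code pages provider has been registered.
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

// Hypothetical helper: the query text used by the PR is not shown in this conversation.
// A collation name cannot be passed as a parameter, so it is concatenated into the query;
// it is assumed to come from sys.fn_helpcollations() rather than from user input.
static string BuildCollationQuery(string collationName) =>
    "SELECT CONVERT(varchar(20), N'caf\u00E9' COLLATE " + collationName + ") AS CollatedString, " +
    "CONVERT(varbinary(20), CONVERT(varchar(20), N'caf\u00E9' COLLATE " + collationName + ")) AS CollatedBytes";

// Hypothetical helper: checks that the string and the bytes returned by the server
// agree with each other under the collation's code page, in both directions.
static bool RoundTrips(string serverString, byte[] serverBytes, int codePageId)
{
    Encoding encoding = Encoding.GetEncoding(codePageId);

    return encoding.GetBytes(serverString).AsSpan().SequenceEqual(serverBytes)
        && encoding.GetString(serverBytes) == serverString;
}

Console.WriteLine(BuildCollationQuery("Kazakh_90_CI_AI"));

// Simulated results for a code page 1252 collation: é survives as 0xE9.
Console.WriteLine(RoundTrips("caf\u00E9", new byte[] { 0x63, 0x61, 0x66, 0xE9 }, 1252));

// Simulated results for a code page 1251 collation: the server has already replaced é with e.
Console.WriteLine(RoundTrips("cafe", new byte[] { 0x63, 0x61, 0x66, 0x65 }, 1251));
```

Comparing the server-supplied string with the server-supplied bytes, rather than with the original literal, is what lets the Cyrillic case pass even though é has been replaced.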

CollatedStringInOutputParameter_DecodesSuccessfully

This is a new test which proves that we can roundtrip non-ASCII characters in output parameters when the code page of the value's collation represents them differently. The default English Windows code page (1252) represents é as 0xE9; code page 936 represents it as [0xA8, 0xA6]; and UTF-8 represents it as [0xC3, 0xA9].
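For reference, the differing byte representations can be reproduced on the client with a short standalone program. This is only an illustrative sketch and is not part of the PR; it assumes a modern .NET runtime where `CodePagesEncodingProvider` is available.

```csharp
using System;
using System.Text;

// Code pages 1252 and 936 are not registered by default on modern .NET.
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

const string text = "\u00E9"; // é

// 1252: default English Windows code page, 936: Simplified Chinese, 65001: UTF-8.
foreach (int codePage in new[] { 1252, 936, 65001 })
{
    Encoding encoding = Encoding.GetEncoding(codePage);
    byte[] bytes = encoding.GetBytes(text);
    string decoded = encoding.GetString(bytes);

    // Print the encoded bytes as hex and whether decoding restores the original string.
    Console.WriteLine($"Code page {codePage}: {BitConverter.ToString(bytes)}, roundtrips: {decoded == text}");
}
```

Each code page should roundtrip the character on its own terms, while decoding one code page's bytes with another's encoding would not, which is why the driver has to pick the encoding that matches the collation.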

Issues

Loosely related: #584 originally added the CollatedDataReaderTest test, and referred to making sure that the driver would maintain the collation/code page mappings. These two tests provide sufficient test coverage for that effort (which would also unblock globalization invariant mode).

Testing

The new tests run; the other tests and code remain untouched.

This previously only tested Kazakh_90_CI_AI and Georgian_Modern_Sort_CI_AS. The replacement tests every collation.
It also performs a more comprehensive check that the string/byte[] roundtrips with the varbinary/varchar from the database instance.
Finally, it no longer requires permission to drop and create databases: we can just use the COLLATE clause.
[InlineData("KAZAKH_90_CI_AI")]
[InlineData("Georgian_Modern_Sort_CI_AS")]
public static void CollatedDataReaderTest(string collation)
[ConditionalFact(typeof(DataTestUtility), nameof(DataTestUtility.AreConnStringsSetup))]
Contributor Author

Although this is treated as a Fact, it'll operate over every known collation. There are more than 5000 of these in SQL Server 2025, and I wasn't convinced that having each of them as a separate test case was valuable.

using (SqlCommand dbCmd = dbCon.CreateCommand())
{
string data = Guid.NewGuid().ToString();
Assert.True(codePageEncoding is not null,
Contributor Author

This flows from the decision to keep the test as a Fact rather than a Theory: I need to specify which collation is failing an assertion using a message, and Assert.NotNull doesn't accept one.

The same pattern plays out for the other instances of Assert.True.

Assert.True(collatedStringBytes.AsSpan().SequenceEqual(clientSideStringBytes),
$@"Collation ""{collationName}"", LCID {lcid}, code page {codePageId}: server-supplied byte array does not match client-side encoded bytes.");

// The character é does not exist in the Cyrillic character set, so do not compare the
Contributor Author

This note is to explain why we have a different set of assertions between this test and CollatedStringInOutputParameter_DecodesSuccessfully.

@cheenamalhotra added the Area\Tests label (Issues that are targeted to tests or test projects) on Mar 16, 2026
@cheenamalhotra added this to the 7.1.0-preview1 milestone on Mar 16, 2026
@benrr101 moved this from To triage to In progress in SqlClient Board on Mar 17, 2026
@benrr101 moved this from In progress to In review in SqlClient Board on Mar 17, 2026