[SPARK-56332][SQL][TESTS] Use sql.SparkSession in trait SQLTestData #55162

Open
zhengruifeng wants to merge 4 commits into apache:master from zhengruifeng:merge_sql_utils

@zhengruifeng

What changes were proposed in this pull request?

Use sql.SparkSession in trait SQLTestData

Why are the changes needed?

This is needed for merging SQLTestUtils and QueryTest.

Does this PR introduce any user-facing change?

No, test-only

How was this patch tested?

CI

Was this patch authored or co-authored using generative AI tooling?

Co-authored-by: Claude code (Opus 4.6)

… SQLTestData

Use `sql.SparkSession` instead of `classic.SparkSession` in `SQLTestData`. For datasets that require RDD-based creation (emptyTestData, testData, testData2, testData3, upperCaseData, lowerCaseData, lowerCaseDataWithDuplicates), cast to `classic.SparkSession`. For all other datasets, use `spark.createDataFrame` directly.

Co-authored-by: Isaac
Replace `.toDF()` on RDDs with `spark.createDataFrame(rdd)` and `$"..."` with `col("...")` to eliminate the SQLImplicits dependency.
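
As a rough illustration of the `$"..."` to `col("...")` change described above, the sketch below filters and projects a small DataFrame without importing `spark.implicits._`. The `TestData` shape (`key: Int`, `value: String`) and the `evenKeys` helper are assumptions made for this example and are not taken from the patch.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

// Assumed shape of the test case class; defined here only to keep the sketch self-contained.
case class TestData(key: Int, value: String)

object NoImplicitsSketch {
  // Hypothetical helper: builds a tiny dataset and keeps rows with an even key.
  def evenKeys(spark: SparkSession): DataFrame = {
    // createDataFrame on a local Seq needs a TypeTag, not SQLImplicits.
    val df = spark.createDataFrame((1 to 10).map(i => TestData(i, i.toString)))
    // col("key") replaces $"key", so no StringToColumn implicit is required.
    df.where(col("key") % 2 === 0).select(col("key"), col("value"))
  }
}
```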

 protected lazy val emptyTestData: DataFrame = {
-  val df = spark.sparkContext.parallelize(
-    Seq.empty[Int].map(i => TestData(i, i.toString))).toDF()
+  val df = spark.createDataFrame(

Can you please avoid using SparkContext.parallelize?

val df = spark.emptyDataset[TestData].toDF()

-  val df = spark.sparkContext.parallelize(
-    (1 to 100).map(i => TestData(i, i.toString))).toDF()
+  val df = spark.createDataFrame(
+    spark.asInstanceOf[classic.SparkSession].sparkContext.parallelize(

Same.
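
For reference, a minimal sketch of how both datasets might be built without `SparkContext.parallelize`, in line with the review comments above. It assumes `TestData` is a case class with fields `key: Int` and `value: String` and that only the `sql.SparkSession` API is available. The `TestDataSketch` object and its method signatures are hypothetical, and the encoder is passed explicitly so that no implicits import is needed.

```scala
import org.apache.spark.sql.{DataFrame, Encoders, SparkSession}

// Assumed shape of the test case class; not copied from the patch.
case class TestData(key: Int, value: String)

object TestDataSketch {
  // Empty dataset: emptyDataset needs an Encoder, supplied explicitly here
  // instead of through spark.implicits._.
  def emptyTestData(spark: SparkSession): DataFrame =
    spark.emptyDataset[TestData](Encoders.product[TestData]).toDF()

  // Non-empty dataset: createDataFrame accepts a local Seq of Products,
  // so no RDD (and no cast to classic.SparkSession) is involved.
  def testData(spark: SparkSession): DataFrame =
    spark.createDataFrame((1 to 100).map(i => TestData(i, i.toString)))
}
```

One trade-off to keep in mind: a DataFrame created from a local `Seq` is backed by a local relation rather than an RDD, so tests that depend on a specific partitioning of the input may behave differently.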
