Escape object/data names in Yul serializer#16517
Conversation
8aae7ab to
9619746
Compare
| // to prevent a reparse failure during optimization (exercises Data::toString). | ||
| object "a" { | ||
| code { {} } | ||
| data "my\"data" hex"deadbeef" |
There was a problem hiding this comment.
I'd add datasize("my\"data") in there to prove that you can successfully refer to such data, not just declare it.
There was a problem hiding this comment.
Yes, it seems like we can, I updated the test :) Yay. Let me know what you think!
| @@ -0,0 +1,13 @@ | |||
| // Object name containing a backslash must be escaped in serialization | |||
| // to prevent a reparse failure during optimization. | |||
| object "foo\\bar" { | |||
There was a problem hiding this comment.
Can we just use a name that has all the possible escapes? There's only a handful of them:
solidity/libsolutil/CommonData.cpp
Lines 204 to 219 in 57d3d68
And please use datasize() here as well.
There was a problem hiding this comment.
New test case has them all: test/libyul/yulSyntaxTests/objects/object_name_with_all_special_chars.yul :) Let me know what you think!
There was a problem hiding this comment.
I'm not sure seeing assembly outputs adds much to these tests. It shows that they compile, but we only expect issues in the parser anyway. Maybe these would be better off in test/libyul/yulSyntaxTests/objects/ instead?
There was a problem hiding this comment.
This would also allow us to add some coverage for invalid cases. For example I think we should have tests covering names that include an unescaped newline, tab, or unicode character. For example something like this should produce an error:
object "A" {
code {
sstore(0, datasize("B\nC"))
}
data "B
C" hex"1234"
}There was a problem hiding this comment.
Added test test/libyul/yulSyntaxTests/objects/data_name_with_literal_newline.yul to address this concern.
I also moved all the files into test/libyul/yulSyntaxTests/objects/ you are right, it makes a ton more sense. Sorry, I am not very familiar with the testing setup in Solidity :S
Co-authored-by: Kamil Śliwak <kamil.sliwak@codepoets.it>
clonker
left a comment
There was a problem hiding this comment.
Commits need squashing and the branch needs a rebase. We try avoiding merge commits in the history and an up-to-date rebase ensures that the changes integrate well with the current develop as well as keeps the history neat and linear.
Changes and test coverage are good!
Fix: Unescaped object/data names in Yul serializer cause crash on reparse
Fixes: #16516
Problem
Object::toString()andData::toString()concatenated thenamefield directly into the serialized output:The
nameis stored as the raw unescaped string value from the scanner, so a name containing"(e.g. parsed fromobject "foo\"bar" { ... }) would produce malformed Yul like:After optimization,
YulStack::reparse()serializes the AST viaprint()and re-parses it. The malformed output caused a fatal parser error: "Expected end of source but got identifier".Fix
Replace manual quoting with
util::escapeAndQuoteString(name)in bothObject::toString()andData::toString(). This escapes",\, and non-printable characters before re-serialization, making the output round-trip correctly through the parser.formatUseSrcComment()in the same file already usedescapeAndQuoteStringcorrectly for source names — this applies the same pattern to object and data names.Output on crashing input
Disclosures
Code generated by Caude Claude. Reviewed by myself.