DRILL-8501: Json Conversion UDF Not Respecting System JSON Options#2921
DRILL-8501: Json Conversion UDF Not Respecting System JSON Options#2921cgivre merged 3 commits intoapache:masterfrom
Conversation
|
Before I approve, did you consider making these JSON parsing settings parameters of the function itself? It feels odd to me that |
@jnturton I thought about doing exactly what you're describing. Here's the thing. We started some work a while ago to get rid of all the non-EVF2 readers in Drill. It turns out that there are a few places which still use the old non-EVF JSON reader. Specifically, this UDF, the Druid Storage Plugin and the MongoDB storage plugin. I started work on Drill-8316 which addresses the Druid plugin and Drill-8329 addresses converting the UDF. Neither one of these were a high priority so they're kind of sitting at the moment. I agree with your premise that the whole idea of having global settings for file formats (including parquet) is not the best idea. Give me a day or two and I can add additional params to the UDF so that you can just change those settings in the query. |
jnturton
left a comment
There was a problem hiding this comment.
I guess if this UDF is going to make use of the JsonReader (and perhaps one day it should stop doing that and just parse JSON itself using some lightweight parser) then it may as well be the JsonReader as configured by store.json.*.
|
@jnturton I added new versions of the UDF so that the user can specify in the function call whether they want |
|
Oh that's great, thanks for the enhancements +1. |
DRILL-8501: Json Conversion UDF Not Respecting System JSON Options
Description
The
convert_fromJSON()function was ignoring Drill system configuration variables for reading JSON. This PR adds support forallTextModeandreadNumbersAsDoubleto this function. Once merged, theconvert_fromJSON()function will follow the system settings.I also split one of the unit test files because it had all the UDF tests mixed with NaN tests.
Documentation
This PR adds the possibility of setting
allTextModeandreadNumbersAsDoublein the function call itself. The new usage is:SELECT convert_fromJSON(<field>, <allTextMode>, <readNumbersAsDouble>)Thus you could write a query:
Testing
Added unit tests.