Froid is now available as a feature of the SQL Server 2019 preview. The feature is called "Scalar UDF Inlining": https://blogs.msdn.microsoft.com/sqlserverstorageengine/2018...
Available to try out for free here: https://www.microsoft.com/en-us/sql-server/sql-server-2019
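For anyone unsure what "inlining" a scalar UDF buys you, here is a toy sketch of the idea. It uses SQLite through Python's standard library rather than SQL Server, and the rewrite is done by hand, so it only illustrates the concept and is not how Froid works internally. The `orders` table and the `discount_price` function are made up for illustration.

```python
import sqlite3

def discount_price(price, pct):
    """Hypothetical scalar UDF; the engine calls it once per row."""
    return price * (1 - pct / 100.0)

conn = sqlite3.connect(":memory:")
# Register the Python function as a two-argument scalar SQL function.
conn.create_function("discount_price", 2, discount_price)
conn.execute("CREATE TABLE orders (id INTEGER, price REAL, pct REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 100.0, 10.0), (2, 50.0, 20.0), (3, 80.0, 0.0)],
)

# Query that calls the UDF: opaque to the query engine, evaluated row by row.
with_udf = conn.execute(
    "SELECT id, discount_price(price, pct) FROM orders ORDER BY id"
).fetchall()

# Same query with the UDF body substituted in as a plain SQL expression,
# which the engine can evaluate (and optimize) like any other expression.
inlined = conn.execute(
    "SELECT id, price * (1 - pct / 100.0) FROM orders ORDER BY id"
).fetchall()

print(with_udf == inlined)
```

Both queries return the same rows; the point is that the second form leaves nothing opaque for the optimizer to work around.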
I'm amazed that the implementation was under 1500 LOC! Was that the research prototype or the shipped preview?
Congratulations on the VLDB paper! Hopefully I'll come say "hi" in LA :)
The shipped preview has only a bit more than 1500 LOC.
The VLDB paper was already presented in Rio in August this year, but I'll try to come over to LA anyway :)
If you could share any pointers about UDFs and their performance problems in Spark, I would love to investigate more.
The paper includes a brief discussion of synthesis-based techniques and the reasoning behind Froid's design choices.
- https://medium.com/teads-engineering/spark-performance-tunin...
- https://www.inovex.de/blog/efficient-udafs-with-pyspark/
There are definitely some differences between the kind of UDFs that Spark supports and the kind that Froid handles. For one, Spark UDFs cannot invoke a Spark SQL query in their definition AFAIK, whereas T-SQL functions can. Still, some of the techniques might be applicable. Definitely worth digging into further!