Polyglot Apache Flink UDF Programming with Iron Functions
Stream-processing technologies have evolved considerably over the years. The latest generation borrows many ideas from databases; as a result, streaming SQL has become one of the primary interfaces for data processing.
Apache Flink SQL is an extremely powerful way to define data streaming pipelines. Flink SQL is declarative, supports changelog semantics, and relies on decades of database optimization research.
However, sometimes SQL is just not enough (or too hard!). You may need to work with deeply nested data structures, and writing JSON parsing logic in SQL is still quite challenging. Or you may have certain battle-tested libraries you want to use (I’ve seen this many times). Finally, some engineers just prefer imperative languages they’re already comfortable using.
User-Defined Functions (UDFs) are the typical workaround in these cases. You author a piece of logic in Java, package it, register it with your SQL environment… and that’s it! You’ve found a way to bring imperative logic to the declarative world 🙂.
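To make the "author a piece of logic in Java" step concrete, here is a minimal sketch of what the body of a scalar UDF might look like. The class name `MaskEmail` and the masking rule are made up for illustration. To keep the snippet compilable without Flink on the classpath, it does not extend Flink's `ScalarFunction` base class; a real UDF would need to.

```java
// Hypothetical example, not taken from any real project.
// In an actual Flink job this class would be declared public and would extend
// org.apache.flink.table.functions.ScalarFunction; both are omitted here so
// the sketch compiles with nothing but the JDK.
class MaskEmail {

    // Flink looks for public methods named "eval" on scalar functions.
    // Keeps the first character and the domain, hides the rest of the local part.
    public String eval(String email) {
        if (email == null) {
            return null; // propagate SQL NULL
        }
        int at = email.indexOf('@');
        if (at <= 0) {
            return email; // not email-like, return unchanged
        }
        return email.charAt(0) + "***" + email.substring(at);
    }
}
```

Once such a class extends `ScalarFunction` and is packaged into a JAR, registering it from SQL would look roughly like `CREATE FUNCTION mask_email AS 'com.example.MaskEmail' LANGUAGE JAVA;`, where the package name `com.example` is an assumption for this example.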
The Iron Functions extension enables you to take it even further.
Announcing Irontools
Hello world, everyone! 👋 Today, I’m excited to introduce Irontools - a suite of Apache Flink® extensions that makes your Flink pipelines more efficient and unlocks new use cases.