Apache Flink 1.8.0 release resolves over 420 issues and adds new features

April 12, 2019

Apache Flink is a “framework and distributed processing engine for stateful computations over unbounded and bounded data streams”. Its use cases include event-driven applications, data analytics applications, and data pipeline applications.

On April 9, 2019, the latest release became available. Version 1.8.0 resolves more than 420 issues and adds new features and improvements.

Flink v1.8.0 new features

The release announcement by Aljoscha Krettek states that this new version of Apache Flink brings the project “closer to our goals of enabling fast data processing and building data-intensive applications for the Flink community in a seamless way”.

SEE ALSO: New JAX Mag issue: Putting the spotlight on Java tools

Some of the newest features include:

  • State schema evolution story: The community worked on this feature across two releases, and v1.8.0 finalizes the effort with support for POJO state schema evolution (see the POJO sketch after this list). All Flink serializers have also been updated to use the new serialization compatibility abstractions, so Flink serializers are no longer Java-serialized into savepoints. If you use custom TypeSerializer implementations for your state serializers, Flink recommends upgrading them to the new abstractions; pre-defined snapshot implementations are provided for common serializers.
  • Cleanup of old state based on time-to-live (FLINK-7811): Time-to-live (TTL) was introduced for keyed state with FLINK-9510. With this release, old TTL entries are continuously cleaned up for both the RocksDB state backend and the heap state backend (see the TTL sketch after this list).
  • SQL pattern detection with user-defined functions and aggregations (FLINK-10597) (FLINK-7599): The MATCH_RECOGNIZE clause has been extended with support for user-defined functions for custom logic during pattern detection, and with aggregations for complex CEP definitions (a MATCH_RECOGNIZE sketch follows this list).
  • RFC-compliant CSV format (FLINK-9964): SQL tables can now be read and written in an RFC-compliant CSV format.
  • KafkaDeserializationSchema gives direct access to Kafka ConsumerRecord (FLINK-8354): Allows access to all of the data Kafka provides for a record and will eventually deprecate the KeyedSerializationSchema functionality (see the deserialization sketch after this list).
  • Per-shard watermarking option in FlinkKinesisConsumer (FLINK-5697): Adds per-shard watermarks for the FlinkKinesisConsumer.
  • New consumer for DynamoDB Streams to capture table changes (FLINK-4582): Adds further connectivity to AWS services.
  • Support for global aggregates for subtask coordination (FLINK-10887): Allows parallel subtasks to share information via the GlobalAggregateManager.
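
To make the schema evolution item concrete, here is a minimal before/after sketch of POJO state evolution. The UserSession class and its fields are hypothetical, and the pre-upgrade version is shown as a comment since both versions cannot coexist in one file:

    // Version of the POJO at the time the savepoint was taken
    // (an earlier build of the same job):
    //
    // public class UserSession {
    //     public String userId;
    //     public long startTime;
    // }

    // Evolved version used after the upgrade. When the job is restored
    // from the old savepoint, Flink's POJO serializer migrates the state
    // and initializes the added field to its Java default (null here).
    public class UserSession {
        public String userId;
        public long startTime;
        public String referrer; // new field, null after restore
    }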
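
For the TTL cleanup item, a minimal sketch of configuring continuous cleanup on the heap state backend; the function, state name, and cleanup parameters are illustrative:

    import org.apache.flink.api.common.functions.RichFlatMapFunction;
    import org.apache.flink.api.common.state.StateTtlConfig;
    import org.apache.flink.api.common.state.ValueState;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.api.common.time.Time;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.util.Collector;

    public class LastSeenFunction extends RichFlatMapFunction<String, String> {

        private transient ValueState<Long> lastSeen;

        @Override
        public void open(Configuration parameters) {
            // Expire entries one day after their last write; incremental
            // cleanup checks a handful of entries per state access instead
            // of waiting for a full snapshot.
            StateTtlConfig ttlConfig = StateTtlConfig
                    .newBuilder(Time.days(1))
                    .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
                    .cleanupIncrementally(10, false)
                    .build();

            ValueStateDescriptor<Long> descriptor =
                    new ValueStateDescriptor<>("lastSeen", Long.class);
            descriptor.enableTimeToLive(ttlConfig);
            lastSeen = getRuntimeContext().getState(descriptor);
        }

        @Override
        public void flatMap(String value, Collector<String> out) throws Exception {
            lastSeen.update(System.currentTimeMillis());
            out.collect(value);
        }
    }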
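
For the MATCH_RECOGNIZE extensions, a sketch of a pattern query that uses an aggregation in both its MEASURES and DEFINE clauses. The Ticker table, its columns, and the price threshold are assumptions, and tableEnv stands for an already configured StreamTableEnvironment:

    import org.apache.flink.table.api.Table;

    // "Ticker" is assumed to be registered on tableEnv with columns
    // symbol, price, and a rowtime time attribute.
    Table result = tableEnv.sqlQuery(
        "SELECT * FROM Ticker " +
        "MATCH_RECOGNIZE ( " +
        "  PARTITION BY symbol " +
        "  ORDER BY rowtime " +
        "  MEASURES AVG(A.price) AS avgPrice " + // aggregation in MEASURES
        "  ONE ROW PER MATCH " +
        "  AFTER MATCH SKIP PAST LAST ROW " +
        "  PATTERN (A+ B) " +
        "  DEFINE A AS AVG(A.price) < 15 " +     // aggregation in DEFINE
        ")");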
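
And for the Kafka item, a minimal sketch of a KafkaDeserializationSchema that surfaces record metadata alongside the payload; the RecordWithMetadata holder class is hypothetical, not part of Flink:

    import java.nio.charset.StandardCharsets;
    import org.apache.flink.api.common.typeinfo.TypeInformation;
    import org.apache.flink.streaming.connectors.kafka.KafkaDeserializationSchema;
    import org.apache.kafka.clients.consumer.ConsumerRecord;

    public class MetadataSchema
            implements KafkaDeserializationSchema<MetadataSchema.RecordWithMetadata> {

        // Hypothetical holder for the value plus Kafka metadata.
        public static class RecordWithMetadata {
            public String topic;
            public int partition;
            public long offset;
            public String value;
        }

        @Override
        public boolean isEndOfStream(RecordWithMetadata nextElement) {
            return false; // unbounded stream
        }

        @Override
        public RecordWithMetadata deserialize(ConsumerRecord<byte[], byte[]> record) {
            // Direct access to the full ConsumerRecord, not just key/value bytes.
            RecordWithMetadata out = new RecordWithMetadata();
            out.topic = record.topic();
            out.partition = record.partition();
            out.offset = record.offset();
            out.value = new String(record.value(), StandardCharsets.UTF_8);
            return out;
        }

        @Override
        public TypeInformation<RecordWithMetadata> getProducedType() {
            return TypeInformation.of(RecordWithMetadata.class);
        }
    }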

Latest changes

  • Convenience Hadoop library changes (FLINK-11266): As per the release notes, “Convenience binaries that include hadoop are no longer released”.
  • FlinkKafkaConsumer filters out restored partitions no longer associated with a specified topic (FLINK-10342): Users can retain the previous behavior with the disableFilterRestoredPartitionsWithSubscribedTopics() configuration method on the FlinkKafkaConsumer (see the snippet after this list).
  • Table API Maven module changes (FLINK-11064): Update your dependencies to flink-table-planner plus either flink-table-api-java-bridge or flink-table-api-scala-bridge, depending on whether you use the Java or Scala Table API.
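
As referenced in the list above, a brief sketch of opting out of the new partition filtering; the topic name and connection properties are illustrative:

    import java.util.Properties;
    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

    public class KafkaFilterOptOut {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.setProperty("bootstrap.servers", "localhost:9092");
            props.setProperty("group.id", "my-group");

            FlinkKafkaConsumer<String> consumer = new FlinkKafkaConsumer<>(
                    "my-topic", new SimpleStringSchema(), props);

            // Restore the pre-1.8 behavior: keep restored partitions even if
            // they no longer match the subscribed topics.
            consumer.disableFilterRestoredPartitionsWithSubscribedTopics();
        }
    }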

SEE ALSO: Adopting Jakarta EE

Download the binaries of the latest stable release.

View the repo on GitHub.

For further reading about the new release, view the entire changelog and release notes.

Source: JAXenter