Skip to main content

This Week in Fluvio #65

· 2 min read

Fluvio is a distributed, programmable streaming platform written in Rust.


We released Fluvio 0.11.12 last week.

New release

Fluvio v0.11.12 is now available!

To update you can run fvm update

$ fvm update

info: Updating fluvio stable to version 0.11.12. Current version is 0.11.9.
info: Downloading (1/5): fluvio@0.11.12
info: Downloading (2/5): fluvio-cloud@0.2.26
info: Downloading (3/5): fluvio-run@0.11.12
info: Downloading (4/5): cdk@0.11.12
info: Downloading (5/5): smdk@0.11.12
done: Installed fluvio version 0.11.12
done: Now using fluvio version 0.11.12

If you don't have Fluvio in your machine run:

curl -fsS https://hub.infinyon.cloud/install/install.sh | bash

If you are enjoying Fluvio please share with your friends!

New features

Notable changes in this new version:

  • You can now use fluvio cluster upgrade to seamlessly upgrade local clusters to the latest Fluvio version.
  • Enhanced feedback for SPU-to-SPU connections during mirroring, with clearer error reporting on failures.
  • fluvio profile export <profile> now works with flv_profile_path, thanks to open source contributor @pangan21 for help with this fix, and see docs for how this works.
  • cdk build now defaults to more compatible native targets

Bug fixes

This release includes several important bug fixes:

  • Fixed a file descriptor leak occurring only on Linux.
  • Resolved rare concurrency issues in local clusters under high stress.
  • Improved error message for incompatible watch API versions.

See the CHANGELOG for details

Good First Issues

We love our open source community contributors. Here are some issues that you could contribute to. All the best.


Get in touch with us on Github Discussions or join our Discord channel and come say hello!

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #64

· 3 min read

Fluvio is a distributed, programmable streaming platform written in Rust.


We released Fluvio 0.11.11 last week.

New release

Fluvio v0.11.11 is now available!

To update you can run fvm update

$ fvm update

info: Updating fluvio stable to version 0.11.11. Current version is 0.11.9.
info: Downloading (1/5): fluvio@0.11.11
info: Downloading (2/5): fluvio-cloud@0.2.25
info: Downloading (3/5): fluvio-run@0.11.11
info: Downloading (4/5): cdk@0.11.11
info: Downloading (5/5): smdk@0.11.11
done: Installed fluvio version 0.11.11
done: Now using fluvio version 0.11.11

If you don't have Fluvio in your machine run:

curl -fsS https://hub.infinyon.cloud/install/install.sh | bash

If you are enjoying Fluvio please share with your friends!

New features

Notable changes in this new version:

  • Added new argument to register a SPU with a public server local, using --public-server-local or just -l. A public server local allows configuration of additional network contexts when registering spus. Example: fluvio cluster spu register --id 5001 -p 0.0.0.0:9110 -l spu:9010 --private-server spu:9011.

  • The new smdk clean command to clean your project like cargo clean.

  • More updates for an upcoming SDF release.

Upcoming features

InfinyOn Stateful Data Flows is going to be in Public Beta soon. Stateful Data Flows is a new product that will allow you to build end to end stream processing data flows on Fluvio streams.

We have released 10 developer preview iterations and shared with 50 to 100 developers. If you'd like access to the private beta, please fill out this form.

Bug fixes

This release includes a number of new features, bug fixes, documentation improvements, and improved error messaging.

See the CHANGELOG for details

Good First Issues

We love our open source community contributors. Here are some issues that you could contribute to. All the best.

New blog post


Get in touch with us on Github Discussions or join our Discord channel and come say hello!

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #63

· 2 min read

Fluvio is a distributed, programmable streaming platform written in Rust.


We released Fluvio 0.11.10 last week.

New release

Fluvio v0.11.10 is now available!

To update you can run fvm update

$ fvm update

info: Updating fluvio stable to version 0.11.10. Current version is 0.11.9.
info: Downloading (1/5): fluvio@0.11.10
info: Downloading (2/5): fluvio-cloud@0.2.25
info: Downloading (3/5): fluvio-run@0.11.10
info: Downloading (4/5): cdk@0.11.10
info: Downloading (5/5): smdk@0.11.10
done: Installed fluvio version 0.11.10
done: Now using fluvio version 0.11.10

If you don't have Fluvio in your machine run:

curl -fsS https://hub.infinyon.cloud/install/install.sh | bash

If you are enjoying Fluvio please share with your friends!

New features

We made the self hosted experience easier with the following:

  • Added auth, compression, consumer arguments for edge mirroring
  • Support to publish stateful data flows
  • Version checker for fluvio cluster resume
  • fvm self update support
  • Warning before cluster deletion

Upcoming features

InfinyOn Stateful Data Flows is going to be in Public Beta soon. Stateful Data Flows is a new product that will allow you to build end to end stream processing data flows on Fluvio streams.

We have released 10 developer preview iterations and shared with 50 to 100 developers. If you'd like access to the private beta, please fill out this form.

Bug fixes

This release includes a number of new features, bug fixes, documentation improvements, and improved error messaging.

See the CHANGELOG for details

Good First Issues

We love our open source community contributors. Here are some issues that you could contribute to. All the best.

New blog post


Get in touch with us on Github Discussions or join our Discord channel and come say hello!

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #62

· 3 min read

Fluvio is a distributed, programmable streaming platform written in Rust.


We released Fluvio 0.11.9 last week.

New release

Fluvio v0.11.9 is now available!

Thank you to our newest contributor:

To update you can run fvm update

$ fvm update

info: Updating fluvio stable to version 0.11.9. Current version is 0.11.8.
info: Downloading (1/5): fluvio@0.11.9
info: Downloading (2/5): fluvio-cloud@0.2.21
info: Downloading (3/5): fluvio-run@0.11.9
info: Downloading (4/5): cdk@0.11.9
info: Downloading (5/5): smdk@0.11.9
done: Installed fluvio version 0.11.9
done: Now using fluvio version 0.11.9

If you don't have Fluvio in your machine run:

curl -fsS https://hub.infinyon.cloud/install/install.sh | bash

If you are enjoying Fluvio please share with your friends!

New features

We made the self hosted experience easier with the following:

  • Added auth, compression, consumer arguments for edge mirroring
  • Support to publish stateful data flows
  • Version checker for fluvio cluster resume
  • fvm self update support
  • Warning before cluster deletion

Upcoming features

InfinyOn Stateful Data Flows is going to be in Public Beta soon. Stateful Data Flows is a new product that will allow you to build end to end stream processing data flows on Fluvio streams.

We have released 10 developer preview iterations and shared with 50 to 100 developers. If you'd like access to the private beta, please fill out this form.

Bug fixes

This release includes a number of new features, bug fixes, documentation improvements, and improved error messaging.

See the CHANGELOG for details

Good First Issues

We love our open source community contributors. Here are some issues that you could contribute to. All the best.

New blog post

We are building a series on Stateful Data Flow primitives. This is the introduction post: The absolute beginners guide to dataflow primitives


Get in touch with us on Github Discussions or join our Discord channel and come say hello!

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #61

· 3 min read

Fluvio is a distributed, programmable streaming platform written in Rust.


We had a couple of back to back releases and another one incoming this week. We released Fluvio 0.11.8 last week and Fluvio 0.11.7 the week before.

New release

Fluvio v0.11.8 is now available!

Thank you to our newest contributor:

To update you can run fvm update

$ fvm update

info: Updating fluvio stable to version 0.11.8. Current version is 0.11.6.
info: Downloading (1/5): fluvio@0.11.8
info: Downloading (2/5): fluvio-cloud@0.2.21
info: Downloading (3/5): fluvio-run@0.11.8
info: Downloading (4/5): cdk@0.11.8
info: Downloading (5/5): smdk@0.11.8
done: Installed fluvio version 0.11.8
done: Now using fluvio version 0.11.8

If you don't have Fluvio in your machine run:

curl -fsS https://hub.infinyon.cloud/install/install.sh | bash

If you are enjoying Fluvio please share with your friends!

New features

We made the self hosted experience easier with the following:

  • Forbid fluvio cluster start when it should be resumed.
  • Added default to wasi supported to build arch and enabled smart module and connector logging and observability.
  • We have released SPU to SPU mirroring. There is a blog in progress and we will share updated docs in the next update.

Upcoming features

InfinyOn Stateful Data Flows is going to be in Public Beta soon. Stateful Data Flows is a new product that will allow you to build end to end stream processing data flows on Fluvio streams.

We have released 8 developer preview iterations and shared with 50 to 100 developers. If you'd like access to the private beta, please fill out this form.

Bug fixes

This release includes a number of new features, bug fixes, documentation improvements, and improved error messaging.

See the CHANGELOG for details

Good First Issues

We love our open source community contributors. Here are some issues that you could contribute to. All the best.

New blog post

We are building a series on Stateful Data Flow primitives. This is the introduction post: The absolute beginners guide to dataflow primitives.


Get in touch with us on Github Discussions or join our Discord channel and come say hello!

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #60

· 3 min read

Fluvio is a distributed, programmable streaming platform written in Rust.


We are back after a month of intense development work. We released Fluvio 0.11.6 this week. Stateful Data Flows is going to launch in Public Beta in May 2024. We are heading into an exciting period with more community activities planned.

New release

We are pleased share that Fluvio v0.11.6 is now available!

Thank you to our newest contributor:

To update you can run fvm update

$ fvm update

info: Updating fluvio stable to version 0.11.6. Current version is 0.11.5.
info: Downloading (1/5): fluvio@0.11.6
info: Downloading (2/5): fluvio-cloud@0.2.19
info: Downloading (3/5): fluvio-run@0.11.6
info: Downloading (4/5): cdk@0.11.6
info: Downloading (5/5): smdk@0.11.6
done: Installed fluvio version 0.11.6
done: Now using fluvio version 0.11.6

If you don't have Fluvio in your machine run:

curl -fsS https://hub.infinyon.cloud/install/install.sh | bash

If you are enjoying Fluvio please share with your friends!

New features

We made the self hosted experience easier with the following:

  • Comprehensive offset management on Fluvio streams is implemented in this version, we are working on stabilizing and the next version would have it generally available in connectors, consumers and all of fluvio.
  • fluvio cluster shutdown and fluvio cluster resume preserving the starting configuration of the local cluster on resume.
  • Advanced mirroring and caching is in dev and being reviewed to be available in the next release.

Upcoming features

InfinyOn Stateful Data Flows is going to be in Public Beta in May 2024. Stateful Data Flows is a new product that will allow you to build end to end stream processing data flows on Fluvio streams.

We have released 8 developer preview iterations and shared with 50 to 100 developers. If you'd like access to the private beta, please fill out this form.

Bug fixes

This release includes a number of new features, bug fixes, documentation improvements, and improved error messaging.

See the CHANGELOG for details

Good First Issues

We love our open source community contributors. Here are some issues that you could contribute to. All the best.

New blog post

We are building a series on Stateful Data Flow primitives. This is the introduction post: The absolute beginners guide to dataflow primitives.


Get in touch with us on Github Discussions or join our Discord channel and come say hello!

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #59

· 3 min read

Fluvio is a distributed, programmable streaming platform written in Rust.


We have moved away from a weekly updates. Fluvio open source project has grown significantly in terms of code and components. Our release cadence is divided into Fluvio Core, InfinyOn Cloud, Clients and SDKs. Moving forward our goal is to update the community with the relevant releases.

Today is such an occasion.

We released Fluvio 0.11.5 this week.

New release

We are pleased share that Fluvio v0.11.5 is now available!

Thank you to new contributors to the fluvio project:

To update you can run fvm update

$ fvm update
info: Updating fluvio stable to version 0.11.4. Current version is 0.11.5.
info: Downloading (1/5): fluvio@0.11.5
info: Downloading (2/5): fluvio-cloud@0.2.18
info: Downloading (3/5): fluvio-run@0.11.5
info: Downloading (4/5): cdk@0.11.5
info: Downloading (5/5): smdk@0.11.5
done: Installed fluvio version 0.11.5
done: Now using fluvio version 0.11.5

If you don't have Fluvio in your machine run:

curl -fsS https://hub.infinyon.cloud/install/install.sh | bash

If you are enjoying Fluvio please share with your friends!

New features

We made the self hosted experience easier with the following:

  • Hub access to public smartmodules and connectors no longer requires a cloud login
    • cdk, smdk, and fluvio hub commands should allow access to list and download public components
  • Consumers connected to a topic will receive a notification and shut down if the connected topic is deleted
  • Improvements to support async in our fluvio python client

Upcoming features

InfinyOn Stateful Service Development Kit is 2 releases away from a beta release.

We have released 6 developer preview iterations and shared with 50 to 100 developers. If you'd like access to the private beta, please fill out this form.

Bug fixes

This release includes a number of bug fixes, documentation improvements, and improved error messaging.

See the CHANGELOG for details

New blog post

[Marvin Hansen] wrote this amazing blog after building with Fluvio. Real-time Streaming Analytics with Fluvio, DeepCausality, and Rust

Good First Issues

All the best. Here are some issues that you could contribute to:


Get in touch with us on Github Discussions or join our Discord channel and come say hello!

For the full list of changes this week, be sure to check out our CHANGELOG.

[fraidev]: https://github.com/fraidev) [Urbit-pilled]: https://github.com/urbit-pilled [Marvin Hansen]: https://github.com/marvin-hansen

This Week in Fluvio #58

· 3 min read

Fluvio is a distributed, programmable streaming platform written in Rust.


We paused the weekly updates because we had several big changes in flight. We will refactor this blog to Fluvio Release Updates and remove the weekly cadence in the next iteration.

A lot has happened in the past couple of months that is relevant to update.

New release

Fluvio moved to version 0.11.3 a couple of days ago. We have had a bunch of releases since we last published an update on our site.

Full Changelog of the release is here: Fluvio Changelog

New features

Fluvio has a bunch of exciting updates. The community has been asking for a single binary deployment that can be run locally, using Docker, using Nomad. That meant we needed to decouple our tight coupling with Kubernetes. We did that in 2023 and the community started building with Fluvio!

We have been busy making documentation updates since then. So this update was delayed. Below are the main updates:

  • Fluvio now has a version manager that manages multiple versions, installs, updates, etc. We are calling it Fluvio Version Manager(fvm)
  • fvm can be installed by simply running:
curl -fsS https://hub.infinyon.cloud/install/install.sh | bash
  • You can deploy Fluvio as a single compiled binary using fvm. The installer takes care of everything!
  • You can run a local self-hosted Fluvio cluster by simply running
fluvio cluster start

which will start a self hosted fluvio cluster using local resources

In light of this our Quick Start docs has been updated to reflect the changes. Getting started with fluvio is the easiest it has ever been.

Updated Quick Start Link

Upcoming features

There are some exciting community projects that are in development:

  • There are some awesome new contributions in the works which included integrations with Spider Web Crawler, OpenSearch, ElasticSearch, Qdrant, Surreal, OpenDAL etc.
  • We have an oversubscribed developer preview for Stateful Service Development Kit which makes stateful stream processing a reality.
  • We have docs on docker based deployment.

Good First Issues

If you are excited to contribute to Fluvio Open Source, here are 3 good first issues that you can consider:

All the best.

Get in touch with us on Github Discussions or join our Discord channel and come say hello!

This Week in Fluvio #57

· 2 min read

Fluvio is a distributed, programmable streaming platform written in Rust.


Last week we released Fluvio 0.10.14 with exciting updates!

We had 4 conversations with lead data engineers and architects to discuss their data pipelines. We have talked to over 14 data engineers and architects in the last couple of weeks and these have been awesome. If you'd like to a have a conversation about your data pipelines and discuss problems, validate ideas - email me at drc@infinyon.com

Latest release

  • We released topic level deduplication which enables us to implement exactly once semantics on topics and deduplicate based on keys.
  • We have also released timestamp injection in smartmodule context.
  • Full changelog is available here

Upcoming features

  • We are productizing a lean binary that runs on top of edge devices with minimum memory and storage with caching and mirroring to ensure data delivery from edge to cloud without losing data.
  • We are also building out the foundations for multi region deployment to support users in the EU
  • This is a way for us to get hybrid deployments and a single binary install of our stream processing engine.
  • Upcoming blog on deduplication is in progress

New blog post

New video

That's all folks. until next week.


Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #56

· 2 min read

Fluvio is a distributed, programmable streaming platform written in Rust.


Big updates are on the verge of being released this week. Fluvio 0.10.14 is going to have some awesome updates!

This week has been great with our efforts aligning towards our next release.

BTW, we have been offering free data architecture reviews to data practitioners to help with planning data pipelines. We have talked to over 10 data engineers and architects in the last couple of weeks and these have been awesome. If you'd like to a have a conversation about your data pipelines and discuss problems, validate ideas - email me at drc@infinyon.com

Upcoming features

We are testing timestamp access and manipulation in the transformation smart modules. This would complement our lookback functionality to look at the past data, and deduplication on read. We are so close to exactly once delivery guarantees with on stream deduplication based on user defined keys.

Our biggest update that is in the works is an engine that powers unbounded stateful processing in our platform.

This is a big step towards our vision for a composable unified stateful stream processing platform.

Developer experience improvements

We have been working on running our core stream processor at the edge to support ARM based sensors with limited memory and we brought back the raspberry pi out of our shelves and working on a demo.

We are also working on lean implementations of our runtime and control plane to build the foundations to support issues of data privacy and data sovereignty with hybrid deployment patterns.

New blog post

New video

That's all folks. until next week.


Get in touch with us on Github Discussions or join our Discord channel and come say hello!

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #55

· 2 min read

Fluvio is a distributed, programmable streaming platform written in Rust.


Fluvio 0.10.13 was released on 14th July 2023! we are back on time for this week in Fluvio - 55.

Latest release

  • We are almost there with the deduplication functionality on stream. We just completed implementing time bound look back and topic based deduplication interface.
  • We have fixed the smart module development kit and the connector development kit publish workflows, and chrono dependency.
  • Full changelog is available here

Upcoming features

We are wrapping up the docs to publish the graphite connector in the InfinyOn Labs repo.

Developer experience improvements

  • We are experimenting with running the fluvio binary on the edge devices and pushing it to the limits to enable more efficient data capture in memory and internet constraints without data loss as opposed to the existing message brokers.
  • We are also testing out inbound webhook implementation to capture data into topics and conceptualizing what an outbound webhook gateway might look like to read from streaming topics.
  • Finally, we have made progress on shaping and prototyping stateful materialized views on stream! This is one of the most asked functionality and we are working with select partners to test out the functionality in their context.

New blog post

New video

That's all folks. until next week.


Get in touch with us on Github Discussions or join our Discord channel and come say hello!

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #54

· 2 min read

Fluvio is a distributed, programmable streaming platform written in Rust.


Alright, alright, alright! we are back on time for this week in Fluvio - 54.

This week hase been great with our efforts aligning towards our next release.

Upcoming features

We have updates coming in the interface of the SQL sink connector that will provide the option to configure insert and upsert operations that we are about to release.

We spent some time working on a graphite connector which we will make available through InfinyOn Labs

Developer experience improvements

We are also looking at the Fluvio Python and Go clients to get a baseline of functionality.

We are experimenting with how far we can push our core stream processor in terms of efficiency under significant storage and memory constraints.

New blog post

New video

That's all folks. until next week.


Get in touch with us on Github Discussions or join our Discord channel and come say hello!

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #53

· 2 min read

Fluvio is a distributed, programmable streaming platform written in Rust.


After yet another hiatus this week in Fluvio is back! We have released a number of improvements in the last little while.

Our public roadmap is live! The high level functionality that we are building on Fluvio and InfinyOn Cloud is available on the roadmap.

We want to hear from you about the features and functionality you are looking for and encourage you to submit issues to the [Fluvio Repo] (https://github.com/infinyon/fluvio).

New release

Fluvio 0.10.12: Our latest release includes:

  • Updates to the CLI to pass topic-configuration to create topics.
  • Updated SQL Sink Connector to support upsert.
  • We launched a certified DuckDB connector based on Fluvio DUck!

New features

Fluvio stream processing core now enables the functionality to look-back which is exactly what you think! It's a way to time travel into the past retained records. We need this feature to dedupe records on read.

Upcoming features

If you did not guess it from the last section, we are building deduplication on read to deliver records exactly once.

We have also started working on stateful stream processing, specifically grouping and time window based aggregation.

Developer experience improvements

There are updates to logging of connectors to improve the traceability of when the legacy systems struggle to keep up with the streams and the connectors require a refresh. Fluvio Docs are updated with the log levels.

Community Engagement

We are offering free architecture reviews for data practitioners who are looking to improve their data flows and optimize for:

  • Simplicity
  • Efficiency
  • Cost

Send a note to our head of product and part time community advocate if you want a free architecture review session.

Email - drc@infinyon.com


Get in touch with us on Github Discussions or join our Discord channel and come say hello!

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #52

· 3 min read

Fluvio is a distributed, programmable streaming platform written in Rust.

Welcome to the 52nd edition of this week in Fluvio.This one is going to be short and sweet.

New release

https://github.com/infinyon/fluvio/releases/tag/v0.10.7

New features

We have released connector secrets and here are the details:

Summary

Specify user interfaces to cloud connector secrets. This includes a cli for populating the secrets, as well as how a connector configuration can refer to the secrets.

Motivation

Customers need to be able to provide secrets to connectors in a protected way in order to access their own or third-party services with the confidence that Infinyon infrastructure provides protection from loss of the secrets.

Approach

Cloud secrets are set via cli. Each secret is a named value with all secrets sharing a secrets namespace per account. Connector configuration files can refer to secrets by name, and the cloud connector infrastructure will provision the connector with the named secrets.

Due to security concerns, listing actual secret values or downloading them after they have been set is not allowed. However, a listing of secret names as well as what date they were last set is described in the interface.

New fluvio cloud secret subcommands

The secrets cli is an added subcommand to 'fluvio cloud' with the following cli ui interface:

fluvio cloud secret set --connector <NAME> <VALUE>
fluvio cloud secret set --connector <NAME> --file <FILENAME> # e.g. a tls cert file
fluvio cloud secret delete <NAME>
fluvio cloud secret list

Note: The current implementation limits the scope of the secrets to connectors only. Also see open questions below regarding set, update, create.

Displaying secrets: fluvio cloud secret list

One security principle in effect with secret list is that we never want to send customer secrets out of the cloud. They are only decrypted at the point of use inside the connector. But users still need to see what named secrets have been set, and potentially when they were last updated.

$ fluvio cloud secret list
SecretNames LastUpdate
CAT_FACTS_CLIENT_ID 12-10-2022 1:07pm
CAT_FACTS_SECRET 01-02-2023 12:01am

Connector config file references

The connector config files can reference cloud secrets by NAME as follows:

meta:
version: 0.1.0
name: my-connector
type: package-name
topic: a-topic

<CUSTOM>: # named section for custom config parameters, usually a short name like "http", or "mqtt"
param_client_id:
secret:
name: CAT_FACTS_CLIENT_ID
param_client_secret:
secret:
name: CAT_FACTS_SECRET

Upcoming features

We are working on a few interesting problems around deduplication and stream processing. We have also made solid progress on a public roadmap to share with the community.

We would love to know what do you recommend we build next to make your development flow easier. Please comment in our Discord channel and let us know. Tag Deb (DRC) on your feedback and feature requests.

New blog post

Why 87% of all data projects are doomed to fail, and how you can improve the odds of success


Get in touch with us on Github Discussions or join our Discord channel and come say hello!

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #51

· 4 min read

Fluvio is a distributed, programmable streaming platform written in Rust.


Welcome to the 51st edition of this week in Fluvio.

For this edition we have more details on Community Contribution

Community update

Carson Rajcan presented to our team a really cool contribution to the Fluvio OSS.

In the project Carson developed the functionality to apply SmartModule transformation for Producer - PR #3014

Here is Carson's experience...

The Problem

Get the Stream Processing Unit to apply SmartModule transformations for the Producer requests coming from the CLI (before commit).

At present, Fluvio SmartModules are only applied for Consumer requests after being read from disk. Shaping the data before it enters the topic spares the Consumer or other downstream services from this burden.

Why This Problem?

While investigating another issue, I had recently gained some insight into how the Stream Processing Unit handles Producer requests, as well as how it uses the Fluvio Smart Engine to transform records for the Consumer.

Critically, I noticed that the Smart Engine was well encapsulated -- it was going to be easy to repurpose.

Also that the Producer request handler was well organized, it wasn't going to be a nightmare to plug in some additional functionality.

The Workflow

Where was I going to start? Well, TDD is my friend.

There were plenty of test cases for Consumer SmartModule transformations that used the CLI. All I had to do was move the SmartModule options from the Consumer commands over to the Producer commands.

In this example, we are processing email addresses on the Consumer side.

Our test email is FooBar@test.com (emphasis on the usage of capital letters)

%copy%

echo "FooBar@test.com" | fluvio produce emails
fluvio consume emails -B -d --smartmodule lowercase

In this example, we are still processing email addresses, but now it is possible to do on the Producer side.

%copy%

echo "FooBar@test.com" | fluvio produce emails --smartmodule lowercase
fluvio consume emails -B -d

Both result in the same output with the email address using only lowercase letters:

foobar@test.com

The Task

With my TDD workflow ready to go, I made the changes from the outside-in.

In The CLI

  1. Added the SmartModule options to the CLI Produce Command.
  2. Used those arguments to build SmartModuleInvocation(s), a type used to model SmartModules during network requests.
  3. Added the SmartModuleInvocation(s) to the ProduceRequest

For the transfer

  1. Had to define how the SmartModuleInvocation(s) would be encoded & decoded

On the SPU

  1. Translated the SmartModuleInvocation(s) into a SmartModuleChain, a type which can be passed to the Smart Engine.
  2. Finally, I fed the SmartModuleChain and the Produce requests's records to the SmartEngine.

Problems I Faced

Types In A New Domain

Learning to translate between types in someone else's codebase can be challenging.

There are many types to become familiar with in the Stream Processing Unit. Notably, the SPU uses a few different Batch types to model using different Record types (Record, RawRecords, MemoryRecords).

You end up seeing Batch<BatchRecords>, Batch<RawRecords>, Batch<MemoryRecords> ... quite often. It took me a while to figure out when and where to use each, and how to convert between them.

Compression and SmartModules

What happens when a Producer sends compressed records and requests the SPU performs a SmartModule Transformation?

The records must be decompressed so they can be fed to the SmartEngine, then compressed again before storage. To pull this off I had to dig up the code that performs the compression, decompression and figure out how to utilize it while handling Producer requests.

What Next?

  1. SmartModules that perform filtering and aggregation can now be applied before commit to save storage.
  2. Time intensive SmartModule operations can be performed on write, rather than while consuming.

Upcoming features

Thank you for your feedback on Discord. We are working on a public road map that should be out soon.

Keep the feedback flowing in our Discord channel and let us know if you'd like to see the video of Carson walking through the code.


Get in touch with us on Github Discussions or join our Discord channel and come say hello!

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #50

· 3 min read

Fluvio is a distributed, programmable streaming platform written in Rust.

This Week in Fluvio is our weekly newsletter for development updates to Fluvio open source.


Welcome to the 50th edition of this week in Fluvio.

It's been 18 weeks since we last published our weekly newsletter. It was our first Fluvio Newsletter winter!

We are rebooting the newsletter and there is a lot more to come...

New release

https://github.com/infinyon/fluvio/releases/tag/v0.10.6

New features

Connector Development Kit Launch: The goal of the Connector Development Kit (CDK) is to enable developers to write connectors to read data from data sources across different services using fluvio and Rust. We have been internally building our connectors using the CDK and are now ready to share it with the open source community.

Generating a project

inline-embed file="embeds/cli/example/cdk-generate-example.md"

This is what the generated rust code for the connector looks like

inline-embed file="embeds/cdk/my-connector-code.md"

If you're interested in learning more, here are the docs to CDK

We are excited to see what you are going to build with the CDK!

We are currently working on a way to certify connectors built by the community and would love your inputs on our Discord channel.

Besides the Connector Development Kit, we have made other improvements to the packaging of Fluvio components.

Full changelog available here: https://github.com/infinyon/fluvio/blob/master/CHANGELOG.md

Upcoming features

We are aware of the frustrations of getting started with fluvio due to the clunky installation process. We are considering a clean binary install, as well as installing using a package manager. However, we have been busy with the CDK, SMDK and other Cloud Features.

We would love to know what do you recommend we build next to make your development flow easier. Please comment in our Discord channel and let us know. Tag Deb (DRC) on your feedback and feature requests.

Bug fixes

Enabled --file and --key-separator to be used together, fix --key handling when producing lines (#3092)

Community update

Carson Rajcan presented to our team a really cool contribution to the Fluvio OSS. In the project Carson developed the functionality to apply Smart Module transformation for Producer - PR #3014

As Winter is now over... We are planning community events to champion community contributors, as well as planning hackathons to build som cool stuff with the community using CDK, SMDK etc.

Let us know what events you are interested in our Discord channel.

InfinyOn Cloud updates

We are putting the finishing touches on connector secrets for you to use the InfinyOn certified connectors built using CDK to interact with InfinyOn Cloud while using passwords, API keys.

We are also shaping work for materialized views of time based aggregates from topics.

We are also exploring a developer user experience for our cloud platform. More updates on this in a couple of months.

New blog post

Using DuckDB with Fluvio


Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #49

· 2 min read

Fluvio is a distributed, programmable streaming platform written in Rust.

This Week in Fluvio is our weekly newsletter for development updates to Fluvio open source.


New release

New features

SmartModule transformation chaining was introduced in the last release as a preview with our SQL outbound connector

In this release, support is now available to the Rust client, fluvio and smdk CLI, and connectors wit the keyword transforms.

To get familiar, check out the example configs from our tutorials.

Bug fixes

Developer experience improvements


Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #48

· 3 min read

Fluvio is a distributed, programmable streaming platform written in Rust.

This Week in Fluvio is our weekly newsletter for development updates to Fluvio open source.


New release

Deprecations

The fluvio connector CLI and Fluvio's management of connectors has been removed in this release.

You can still use local connectors with your local Fluvio cluster. For more about local connectors see the local connectors docs

New features

  • SmartModule chaining (#2618)
  • SmartModule Development Kit CLI (#2632)
    • SmartModule packages
  • Add throughput control to Fluvio producer (#2512)
  • Added blocking on Producer if the batch queue is full (#2562)

Developer experience improvements

SmartModule Development Kit

The SmartModule Development kit reduces the number of steps required to get started with developing new custom SmartModules using the smdk CLI.

SmartModule Development Kit docs

SmartModule chaining preview

This release has a preview for SmartModule chaining. This functionality is offered with our Cloud SQL outbound connector.

To see it in action, you can follow the following tutorials:

InfinyOn Cloud updates

New UI

A new version of the InfinyOn Cloud platform UI has been released. We've added the capability to view realtime info about your cluster.

Here's a quick preview

A cropped screenshot of the new InfinyOn Cloud web UI

Check out the New UI tutorial for more information.

Cloud connectors

Management of connectors is now exclusive to InfinyOn Cloud. You can create connectors in InfinyOn with the fluvio cloud connector CLI.

Check out the Cloud connectors docs for more info

SmartModule Hub

SmartModule Hub is a new service for offering public SmartModules. This removes the requirement of installing a SmartModule development environment in order to take advantage of SmartModules. You can download SmartModules from the Hub directly to your cluster to use.

For developers, you can use smdk to publish SmartModules to the Hub to share publicly.

Check out the SmartModule Hub docs for more info

Recent events

We launched the new InfinyOn Cloud platform at KubeCon.

Thanks to those who were in attendance at KubeCon and stopped and said hi to us last week!

A group photo at KubeCon22 with members of the InfinyOn team

A photo of the InfinyOn booth. A screen with the InfinyOn Cloud dashboard displayed in front of an InfinyOn branded purple background


Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #47

· 2 min read

Fluvio is a distributed, programmable streaming platform written in Rust.

This Week in Fluvio is our weekly newsletter for development updates to Fluvio open source.


Progress report:

fluvio connector CLI deprecation

This is update #3 since we announced the deprecation of the fluvio connector subcommand.

We are in process of documenting the migration strategy for our Open Source users to continue managing their connectors locally using connector-run.

note

This is an archival newsletter entry, for the latest on running connectors see docs on the cdk cli

If you're interested in trying out the bleeding edge, you can run these commands to build the connector-run CLI, and run your connector in Kubernetes using your existing connector config file:

$ git clone https://github.com/infinyon/fluvio-connectors.git
$ cd fluvio-connectors
$ cargo run --release --bin connector-run -- apply --config /path/to/your/connector.yml

Please connect with us in our Discord channel or you can email us at team@infinyon.com if there are any questions, concerns, comments, etc.

We'll continue to make updates about this matter until resolved.

Recent events

The InfinyOn team spent the week in NYC for an in-person meetup to plan for the future.

For some of us, this was the first time meeting face-to-face. This was the first time we were all in the same room since our last event last year!

A group photo of the InfinyOn team standing in from of the entrance of the NYC Google office at Pier 57

We're not yet ready to talk about it, but we are looking forward to the reveal of this collaboration. Stay tuned!


Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #46

· 2 min read

Fluvio is a distributed, programmable streaming platform written in Rust.

This Week in Fluvio is our weekly newsletter for development updates to Fluvio open source.

Please subscribe to This Week in Fluvio to receive new posts in your inbox SUBSCRIBE_BUTTON

BANNER


Progress report:

fluvio connector CLI deprecation

This is an update from the previous issue, which we announced the deprecation of the fluvio connector subcommand.

We are in process of creating an external CLI tool and documenting a migration strategy for our Open Source users to continue managing their connectors locally.

Rust devs can get a sneak peak of this tool in our Connectors GitHub repo, which will continue to support your connector config files.

note

This is an archival newsletter entry, for the latest on running connectors see docs on the [cdk] cli

Please connect with us in our Discord channel or you can email us at team@infinyon.com if there are any questions, concerns, comments, etc.

We'll continue to make updates about this matter until resolved.


Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #45

· 2 min read

Fluvio is a distributed, programmable streaming platform written in Rust.

This Week in Fluvio is our weekly newsletter for development updates to Fluvio open source.


Upcoming deprecation

fluvio connector CLI deprecation

We intend to remove the fluvio connector subcommand, and related API from the Fluvio open source project, and migrate the support of managed connectors to InfinyOn Cloud.

We will announce the release to expect this change ahead of time. Currently, we are updating our documentation with equivalent workflows in preparation to support OSS Fluvio users though this migration.

note

This is an archival newsletter entry, for the latest on running connectors see docs on the [cdk] cli

Please connect with us in our Discord channel or you can email us at team@infinyon.com if there are any questions, concerns, comments, etc.

We'll continue to make updates about this matter until resolved.

Upcoming InfinyOn Cloud update

We intend to continue support for managed connectors in the CLI in this environment.

Accompanying the removal of the fluvio connector CLI will be an update to the fluvio cloud CLI.


Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #44

· One min read

Fluvio is a distributed, programmable streaming platform written in Rust.

This Week in Fluvio is our weekly newsletter for development updates to Fluvio open source.

Please subscribe to This Week in Fluvio to receive new posts in your inbox


New video

Here is the repo of our most recent webinar: Enhance your Kafka Infrastructure with Fluvio

The repo with the demo content can be found here


Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #43

· One min read

Fluvio is a distributed, programmable streaming platform written in Rust.

This Week in Fluvio is our weekly newsletter for development updates to Fluvio open source.


New release

Recent events

We held our webinar on April 16 for Enhance your Kafka Infrastructure with Fluvio.

Thanks to everyone who signed up for the event and participated in the Q&A!

For those who missed it, we'll have a link to the recording in a future This Week in Fluvio.


Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #42

· One min read

Fluvio is a distributed, programmable streaming platform written in Rust.

This Week in Fluvio is our weekly newsletter for development updates to Fluvio open source.


New release

New features

  • Added DeliverySemantic to fluvio-cli. (#2508)
  • SmartModule package: add missing metadata (#2532)

Bug fixes

  • Prevent collisions between namespaces (#2539)

Developer experience improvements

  • CLI: Added ability to delete multiple connectors, smart modules and topics with one command. (#2427)
  • Added --use-k8-port-forwarding option to fluvio cluster start. (#2516)
  • Added proxy support during packages installation (#2535)
  • Adds feedback and debug info to 'smart-module create' (#2513)

New blog post


Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #41

· 2 min read

Fluvio is a distributed, programmable streaming platform written in Rust.

This Week in Fluvio is our weekly newsletter for development updates to Fluvio open source.


New release

  • SmartModule development environment
  • Rust crate fluvio-jolt
    • This is a native Rust port of the Java library of the same name
    • JSON to JSON transformation where the "specification" for the transform is itself a JSON document
    • Compatible for use in SmartModules

Feature Highlight

This feature was added in the previous release but was not mentioned in last week's issue.

  • Support for at-least-once and at-most-once in the Producer Client. (#2481)
    • This feature introduces the notion of Delivery Semantic to Fluvio Producer. From now, you can choose in which manner you want your records to be transported from the producer to the SPU unit. It's either at-most-once guarantee or at-least-once. The first one is sending without waiting for the response, hence no reaction for errors. The latter one is sending and retrying until succeeded (with certain assumptions). Check out more details in Delivery Semantics section.

Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #40

· One min read

Fluvio is a distributed, programmable streaming platform written in Rust.

This Week in Fluvio is our weekly newsletter for development updates to Fluvio open source.


New release

New features

  • Support async response in multiplexed socket. (#2488)
  • Rename --smartmodule option in fluvio consume to --smart-module. `--smartmodule is still an alias for backward compatibility. (#2485)

Performance improvements

  • Drop write lock before async IO operations. (#2490)
  • Add Clone trait to DefaultProduceRequest. (#2501)
  • Add AtMostOnce and AtLeastOnce delivery semantics. (#2503)

Bug fixes

  • Restrict usage of --initial, --extra-params and --join-topic in fluvio consume. Those options only should be accepted when using specific smartmodules. (#2476)
  • Keep serving incoming requests even if socket closed to write. (#2484)

Developer experience improvements

  • Measure latency for stats using macro. (#2483)

Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #39

· One min read

Fluvio is a distributed, programmable streaming platform written in Rust.

This Week in Fluvio is our weekly newsletter for development updates to Fluvio open source.


New release

New features

  • Add fluvio connector config <connector-name> (#2464)
  • Add performance counters to producer (#2424)

Performance improvements

  • Prefer ExternalIP to InternalIP if configured in kubernetes (#2448)
  • Move stream publishers to connection-level context (#2452)
  • Upgrade to fluvio-future 0.4.0 (#2470)

Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #38

· One min read

Fluvio is a distributed, programmable streaming platform written in Rust.

This Week in Fluvio is our weekly newsletter for development updates to Fluvio open source.


New release

New features

  • Re-allow string, dictionaries and lists as options to parameters section in connector yaml. (#2446)

Bug fixes

  • Fix issue in producer when sending more than one batch in a request (#2443)
  • Fix bug in last_partition_offset update when handling smartmodules on SPU (#2432)

Developer experience improvements

  • Improve CLI error output when log_dir isn't writable (#2425)

Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #37

· 2 min read

Fluvio is a distributed, programmable streaming platform written in Rust.

This Week in Fluvio is our weekly newsletter for development updates to Fluvio open source.


New release

New features

  • Display multi-word subcommand aliases in CLI help info (#2033)
  • Add filter-map support to SmartProducer (#2418)

Performance improvements

  • Upgrade to Wasmtime 0.37 (#2400)

Bug fixes

  • Allow Cluster diagnostics to continue even if profile doesn't exist (#2400)

  • Add timeout when creating SPG (#2364)

  • Revert 0.9.28 updates to Connector yaml config (#2436)

    • Soon after the 0.9.28 release, we discovered an issue that slipped past our CI. For those interested, the following are a preview of what changes are coming soon to connectors.
      • Add top level producer and consumer entries to connector yaml configurations. (#2426)
      • Allow string, dictionaries and lists as options to parameters section in connector yaml. (#2426)

Developer experience improvements

  • Log fluvio version and git rev on client creation (#2403)
  • Fix wasi functions binding relying on order (#2428)

Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #36

· One min read

Fluvio is a distributed, programmable streaming platform written in Rust.

This Week in Fluvio is our weekly newsletter for development updates to Fluvio open source.

New release

Developer experience improvements

The Fluvio Java client is now hosted in Maven Central, which should reduce the friction for Java developers to install.

InfinyOn Cloud updates

We've added the capability to query your account's [CPU and memory usage via Cloud CLI]

New blog post


Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #35

· One min read

Fluvio is a distributed, programmable streaming platform written in Rust.

This Week in Fluvio is our weekly newsletter for development updates to Fluvio open source.

New blog post

Recent events

Last week We held a 30 minute webinar for "How to power event-driven applications with InfinyOn Cloud".

Thanks to everyone who attended!


Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #34

· One min read

Fluvio is a distributed, programmable streaming platform written in Rust.

This Week in Fluvio is our weekly newsletter for development updates to Fluvio open source.

New release

New features

  • Support installing clusters on Google Kubernetes Engine (#2364)

Developer experience improvements

  • [CI] Make Zig Install more reliable (#2388)
  • Add path setting hint for fish shell in install script (#2389)

Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #33

· One min read

Fluvio is a distributed, programmable streaming platform written in Rust.

This Week in Fluvio is our weekly newsletter for development updates to Fluvio open source.


Recent events

Our CTO, Sehyo Chang spoke at Cloud Native Wasm Day EU 2022.

The talk was called Building WASM Powered Distributed Stream Platform.

Thanks to everyone who attended the talk in person and we appreciate the positive response! We'll be sure to share a link to the recording when it is available.


Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #32

· One min read

Fluvio is a distributed, programmable streaming platform written in Rust.

This Week in Fluvio is our weekly newsletter for development updates to Fluvio open source.


New release

Performance improvements

  • In the previous week, we added in #2302 an ability to specify for Consumers and Producers which level of isolation to use. Isolation, basically, is a trade-off between latency and guarantees. If your workload needs low latency, but you can tolerate some losses, ReadUncommitted is recommended option for you. An alternative is ReadCommitted - latency includes replication but data consistency is stronger. Check out Data Consistency for more details.

Bug fixes

  • Increase default STORAGE_MAX_BATCH_SIZE (#2342)

Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #31

· One min read

Fluvio is a distributed, programmable streaming platform written in Rust.

This Week in Fluvio is our weekly newsletter for development updates to Fluvio open source.

New release

New features

  • Set timestamp in Records while producing. (#2288)
  • Add {{time}} option to --format in fluvio consume to display record timestamp (#2345)

Performance improvements

  • Support ReadCommitted isolation in SPU for Produce requests #2336
  • Producer must respect ReadCommitted isolation #2302

Bug fixes

  • Improve error messages and add --fix option to fluvio cluster check to autofix recoverable errors (#2308)

Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #30

· 2 min read

Fluvio is a distributed, programmable streaming platform written in Rust.

This Week in Fluvio is our weekly newsletter for development updates to Fluvio open source.


New release

  • Fluvio v0.9.24
  • DynamoDB Sink connector (Unsupported as of Fluvio v0.10.0)
    • The Dynamodb Sink Connector is a sink connector which reads events from a Fluvio topic, deserializes them as json and inserts those key value pairs based on the columns in the config.
  • Slack Sink connector (Unsupported as of Fluvio v0.10.0)
    • The Slack Connector is quite simple. It will stringify any record coming from a Fluvio stream and POST it to the slack via a slack webhook url

New features

  • Storage: Enforce size based retention for topic (#2179)
    • Previously, Fluvio supported only a time-based retention policy for data in a topic. For some workloads, it was inconvenient as it was needed to consider the incoming data pace to properly calculate retention time to fit the data into the available storage size. With this new feature, you can tell Fluvio what is the maximum size of the partition you want, and it will control it for you. Check out the details in Data Retention.
  • Export cluster profile to a file (#2327)
    • Can be used to initialize the connection to a Fluvio cluster via client APIs.

Bug fixes

  • Don't try to use directories as smartmodule if passed as argument (#2292)
  • CLI: Migrate all fluvio crates to comfy-table from prettytable-rs (#2285)

New blog post

  • Real-time Gaining Momentum in the Enterprise
    • Grant shares how legacy data infrastructures operate and how Infinyon with Fluvio's community are positioned to leverage modern technology for faster and higher quality data infrastructures.

Recent events


Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #29

· One min read

Fluvio is a distributed, programmable streaming platform written in Rust.

This Week in Fluvio is our weekly newsletter for development updates to Fluvio open source.


New release

Thanks

We held a webinar we held on April 12 for Real-time Event Streaming and Data Transformation for Financial Services.

Thanks again for those who showed up and an extra thanks to those who participated in the Q&A!

Recent events

Fluvio just reached 1000 stars on Github!

Thanks to everyone in the community who were generous enough to take this Unclicked Github Star button

And turn it into this: Clicked Github Star button

We really appreciate the support!


Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #28

· 2 min read

Fluvio is a distributed, programmable streaming platform written in Rust.

This Week in Fluvio is our weekly newsletter for development updates to Fluvio open source.


New release

New features

  • Add TYPE column to fluvio connector list (#2218)

Performance improvements

  • Increase default MAX_FETCH_BYTES in fluvio client (#2259)
  • Use Clap instead of StructOpt for all CLI (#2166)

Bug fixes

  • Add fluvio-channel to fluvio update process (#2221)
  • Disable versions from displaying in CLI subcommands (#1805)
  • Re-enable ZSH completions (#2283)

Recent events

We held our webinar on April 12 for Real-time Event Streaming and Data Transformation for Financial Services.

Thanks to everyone who signed up for the event and participated in the Q&A!

For those who missed it, we'll have a link to the recording in a future This Week in Fluvio.


Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #27

· One min read

This Week in Fluvio is our weekly newsletter for development updates to Fluvio open source.

Fluvio is a distributed, programmable streaming platform written in Rust.


Team updates

This week we have 2 new contributors joining the team.

Please welcome Alexander and Tejas!

Upcoming events

Webinar on Apr 12, 2022: Real-time Event Streaming and Data Transformation for Financial Services

This webinar will cover how to:

  • Quickly build high-performance data pipelines
  • Leverage programmable stream processing to clean and transform data in real-time
  • Eliminate the need for Extract, Transform and Load (ETL) tools

Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #26

· One min read

This Week in Fluvio is our weekly newsletter for development updates to Fluvio open source.

Fluvio is a distributed, programmable streaming platform written in Rust.


New release

New features

  • Add configuration for compression at topic level (#2249)
  • Add producer batch options to CLI fluvio produce CLI (#2257)

Upcoming events


Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #25

· 2 min read

This Week in Fluvio is our weekly newsletter for development updates to Fluvio open source.

Fluvio is a distributed, programmable streaming platform written in Rust.


New release

New features

  • Compression support (#2082)
  • Disk usage visibility in CLI via Size field added to the output of fluvio partition list (#2148)
  • Add support for partial CA Intermediate Trust Anchors (#2232)

Performance improvements

  • Make store time out configurable for cluster startup (#2212)
  • Optimize partition size computation (#2230)

Bug fixes

  • Fix Installer problem with self-signed certs (#2216)
  • Report SPU error codes to FutureRecordMetadata (#2228)

Miscellaneous

  • Data generator support for fluvio-test (#2237)

Recent events

  • Our CTO Sehyo Chang was on Data Engineer's lunch episode #58 where he introduced Fluvio, gave a short demo of streaming data transformation using the CLI and SmartModules, and concluded with a Q&A.

Upcoming events


Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #24

· 3 min read

Welcome to This Week in Fluvio, our weekly newsletter for development updates to Fluvio open source. Fluvio is a distributed, programmable streaming platform written in Rust.


Data Retention

If you are producing a lot of data into Fluvio, you might be interested in keeping that data fresh for your consumers. Older data can be automatically pruned.

The default retention on new topics is 7 days with a default segment size of 1 GB. So any data residing in older 1 GB segments will be pruned 7 days after the last write. The current segment is left alone.

Example data lifecycle

For a given topic with a retention of 7 days using 1 GB segments

  • Day 0: 2.5 GB is written (total topic data: 2.5 GB)
Topic Segment #Segment sizeDays since last write
01 GB0
11 GB0
20.5 GBN/A
  • Day 6: Another 2 GB is written (total topic data: 4.5 GB,)
Topic Segment #Segment sizeDays since last write
01 GB6
11 GB6
21 GB0
31 GB0
40.5 GBN/A
  • Day 7: 2 segments from Day 0 are 7 days old. They are pruned (total topic data: 2.5 GB)
Topic Segment #Segment sizeDays since last write
21 GB1
31 GB1
40.5 GBN/A
  • Day 14: 2 segments from Day 7 are 7 days old. They are pruned (total topic data: 0.5 GB)
Topic Segment #Segment sizeDays since last write
40.5 GBN/A

The newest segment is left alone and only begins to age once a new segment is being written to.

For more detail check out the docs for more about data retention in Fluvio


Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG. Until next week!

This Week in Fluvio #23

· 4 min read

Welcome to This Week in Fluvio, our weekly newsletter for development updates to Fluvio open source. Fluvio is a distributed, programmable streaming platform written in Rust.


New Release - Fluvio v0.9.20

Connector config key deprecation: create_topic

This release deprecates the create_topic key from the connectors config. You can remove this key from your configs. The default behavior is to create the topic given by topic.

# connect.yml
version: 0.2.0
name: cat-facts
type: http
topic: cat-facts
- create_topic: true
direction: source
parameters:
endpoint: https://catfact.ninja/fact
interval: 10
output_parts: body
output_type: text

Update connector config

When developing with connectors, you may need to reconfigure your settings. This is now easier to do.

Let's work through an example using the following config.

# connect.yml
api_version: 0.2.0
name: cat-facts
type: http
topic: cat-facts
direction: source
parameters:
endpoint: https://catfact.ninja/fact
interval: 10
output_parts: body
output_type: text

First we create our connector

$ fluvio connector create --config connect.yml

As expected, we are getting cat facts of different lengths.

$ fluvio consume cat-facts
Consuming records from the end of topic 'cat-facts'. This will wait for new records
{"fact":"The average cat food meal is the equivalent to about five mice.","length":63}
{"fact":"The first commercially cloned pet was a cat named \"Little Nicky.\" He cost his owner $50,000, making him one of the most expensive cats ever.","length":140}
{"fact":"Mohammed loved cats and reportedly his favorite cat, Muezza, was a tabby. Legend says that tabby cats have an \u201cM\u201d for Mohammed on top of their heads because Mohammad would often rest his hand on the cat\u2019s head.","length":210}

Later on I decided that I want to change my endpoint to add a query parameter. We no longer want facts longer than 50 characters.

- endpoint: https://catfact.ninja/fact
+ endpoint: https://catfact.ninja/fact?max_length=50

Running fluvio connector update will handle updating your connector settings. This will delete your existing connector, and recreate with the config settings.

$ fluvio connector update --config connect.yml

After the connector restarts, we see that our output reflects the changes made in our config.

$ fluvio consume cat-facts
Consuming records from the end of topic 'cat-facts'. This will wait for new records
{"fact":"A cat's field of vision is about 200 degrees.","length":45}
{"fact":"Blue-eyed, pure white cats are frequently deaf.","length":47}
{"fact":"Cats have supersonic hearing","length":28}

CLI Version pinning

Fluvio CLI Channels now fully support version pinning. This is done by preventing users who are currently using a version channel from running fluvio update. Users will get a short message informing them how to switch to a different version channel.

Example: If a user switched directly to version 0.9.20 (as opposed to running the stable or latest channel) then running fluvio update should not update to a new release.

First we create our version channel. (This could be any of our previous versions)

$ fluvio version create 0.9.20
🎣 Fetching '0.9.20' channel binary for fluvio...
⏳ Downloading Fluvio CLI with latest version: 0.9.19...
🔑 Downloaded and verified package file
✅ Successfully updated /home/user/.fluvio/bin/fluvio-0.9.20

And switch over to it

$ fluvio version switch 0.9.20
Switched to release channel "0.9.20"

Now if we try to update, we'll get an error message and some short instructions for how to switch to a new version channel.

%copy first-line%

$ fluvio update
Unsupported Feature: The `fluvio update` command is not supported when using a pinned version channel. To use a different version run:
fluvio version create X.Y.Z
fluvio version switch X.Y.Z

Get in touch with us on Github Discussions or join our Discord channel and come say hello!

For the full list of changes this week, be sure to check out our CHANGELOG.

Until next week!

This Week in Fluvio #22

· 6 min read

Welcome to This Week in Fluvio, our weekly newsletter for development updates to Fluvio open source. Fluvio is a distributed, programmable streaming platform written in Rust.


New Release - Fluvio v0.9.19

Connector versions

Now when you run fluvio connector list, the version of the connector is returned

Example connector config:

# test-connector-config.yaml
version: 0.1.1
name: my-test-connector
type: test-connector
topic: my-test-connector-topic
direction: source

%copy first-line%

$ fluvio connector create --config test-connector-config.yaml

%copy first-line%

$ fluvio connector list
NAME VERSION STATUS
my-test-connector 0.1.1 Running

SmartModule debugging support using WASI

This is for advanced users who are willing to compile Fluvio locally. Please follow the Fluvio Developer guide to get set up for local development.

SmartModule devs can now compile Fluvio with WASI support. This provides SmartModules access to stdout and stderr for debugging purposes.

$ git clone https://github.com/infinyon/fluvio.git

Build the development Fluvio cluster image with WASI support enabled

$ DEBUG_SMARTMODULE=true make build_k8_image

Build the development Fluvio CLI.

$ make build-cli

Start our development Fluvio cluster with WASI support

$ ./target/debug/fluvio cluster start --develop

Here's our example SmartModule. It is a slight modification of our filter example. For debugging purposes, we print the record to stdout before checking the contents of the record and applying filtering.

use fluvio_smartmodule::{smartmodule, Record, Result};

#[smartmodule(filter)]
pub fn filter(record: &Record) -> Result<bool> {
// Print every record to SPU logs
println!("DEBUG: {record:#?}");
let string = std::str::from_utf8(record.value.as_ref())?;
Ok(string.contains('a'))
}

Before you build the SmartModule, you need to add the wasm32-wasi target with rustup.

$ rustup target add wasm32-wasi

Build SmartModule using the wasm32-wasi target to use it against our WASI-enabled cluster.

$ cargo build --release --target wasm32-wasi

Load the WASI SmartModule into the cluster as wasi-sm

$ ./target/debug/fluvio smart-module create wasi-sm --wasm-file ./target/wasm32-wasi/release/fluvio_wasm_filter.wasm

Create the testing topic twif-22

$ ./target/debug/fluvio topic create twif-22
topic "twif-22" created

For our example producer input, we'll send 2 records to demonstrate the SmartModule output.

$ ./target/debug/fluvio produce twif-22
> a
Ok!
> b
Ok!

In the consumer output using our WASI SmartModule, only the first record prints, which is the correct behavior.

$ ./target/debug/fluvio consume twif-22 --filter wasi-sm
Consuming records from the end of topic 'twif-22'. This will wait for new records
a

To view our SmartModule debug output, we look at the SPU pod logs in Kubernetes. At the bottom of the log we can verify that the contents of each record was printed.

$ kubectl logs -f fluvio-spg-main-0
[...]
2022-02-15T00:45:25.502747Z INFO accept_incoming{self=FluvioApiServer("0.0.0.0:9005")}: fluvio_service::server: Received connection, spawning request handler
DEBUG: Record {
preamble: RecordHeader {
attributes: 0,
timestamp_delta: 0,
offset_delta: 0,
},
key: None,
value: a,
headers: 0,
}
DEBUG: Record {
preamble: RecordHeader {
attributes: 0,
timestamp_delta: 0,
offset_delta: 0,
},
key: None,
value: b,
headers: 0,
}

Connectors

~> Support for inbound and outbound Postgres connectors discontinued since Fluvio release v0.10.0.
This section is for historical purposes only

Postgres

We will provide a more hands-on blog post in the future, but for now we'll summarize the release.

Postgres Source connector

The Fluvio source connector allows you to connect to an external Postgres database and implement Change Data Capture (CDC) patterns by recording all database updates into a Fluvio topic.

There is a little bit of required configuration on the Postgres database side, but the Postgres source connector config looks like this:

# example-pg-source-connect.yml
version: 0.1.0
name: my-postgres-source
type: postgres-source
topic: postgres-topic
parameters:
publication: fluvio
slot: fluvio
secrets:
FLUVIO_PG_DATABASE_URL: postgres://postgres:mysecretpassword@localhost:5432/postgres

Postgres Sink connector

The Postgres sink connector consumes the CDC event data from the Postgres source connector and runs the corresponding SQL against the sink connector's Postgres database.

The Postgres sink connector looks like this:

# connect.yml
version: 0.1.0
name: my-postgres-sink
type: postgres-sink
topic: postgres-topic
parameters:
url: postgres://postgres:mysecretpassword@localhost:5432/postgres
secrets:
FLUVIO_PG_DATABASE_URL: postgres://postgres:mysecretpassword@localhost:5432/postgres

Postgres Connector Docs available now

A lot of work went into the release of our new Postgres connectors that we couldn't cover in depth here.

We encourage you to visit the docs, and expect a walkthrough using the Source and Sink connectors together in the future.

  • Docs for Postgres Source connector
  • Docs for Postgres Sink connector

HTTP

Our HTTP source connector has new options available output_type and output_parts to format its output.

Example HTTP connector config

%copy%

# connect.yml
version: 0.2.0
name: cat-facts
type: http
topic: cat-facts
direction: source
parameters:
endpoint: https://catfact.ninja/fact
interval: 10
output_parts: body # default
output_type: text # default

For example, our endpoint returns a JSON object in the body of the HTTP response.

%copy first-line%

$ fluvio consume cat-facts
Consuming records from the end of topic 'cat-facts'. This will wait for new records
{"fact":"In 1987 cats overtook dogs as the number one pet in America.","length":60}

If you want the full HTTP response, you can use output_parts: full

# connect.yml
[...]
- output_parts: body # default
+ output_parts: full
output_type: text # default

%copy first-line%

$ fluvio consume cat-facts
Consuming records from the end of topic 'cat-facts'. This will wait for new records
HTTP/1.1 200 OK
server: nginx
date: Wed, 16 Feb 2022 00:53:04 GMT
content-type: application/json
transfer-encoding: chunked
connection: keep-alive
vary: Accept-Encoding
cache-control: no-cache, private
x-ratelimit-limit: 100
x-ratelimit-remaining: 96
access-control-allow-origin: *
set-cookie: XSRF-TOKEN=REDACTED expires=Wed, 16-Feb-2022 02:53:04 GMT; path=/; samesite=lax
set-cookie: cat_facts_session=REDACTED expires=Wed, 16-Feb-2022 02:53:04 GMT; path=/; httponly; samesite=lax
x-frame-options: SAMEORIGIN
x-xss-protection: 1; mode=block
x-content-type-options: nosniff

{"fact":"Cats only use their meows to talk to humans, not each other. The only time they meow to communicate with other felines is when they are kittens to signal to their mother.","length":170}

If you plan to process the HTTP response details, it might be more useful to use output_type: json.

# connect.yml
[...]
output_parts: full
- output_type: text # default
+ output_type: json
$ fluvio consume cat-facts
Consuming records from the end of topic 'cat-facts'. This will wait for new records
{"status":{"version":"HTTP/1.1","code":200,"string":"OK"},"header":{"set-cookie":["XSRF-TOKEN=REDACTED expires=Wed, 16-Feb-2022 02:56:22 GMT; path=/; samesite=lax","cat_facts_session=REDACTED expires=Wed, 16-Feb-2022 02:56:22 GMT; path=/; httponly; samesite=lax"],"content-type":"application/json","x-frame-options":"SAMEORIGIN","x-content-type-options":"nosniff","x-xss-protection":"1; mode=block","vary":"Accept-Encoding","server":"nginx","x-ratelimit-remaining":"94","date":"Wed, 16 Feb 2022 00:56:22 GMT","transfer-encoding":"chunked","cache-control":"no-cache, private","x-ratelimit-limit":"100","access-control-allow-origin":"*","connection":"keep-alive"},"body":"{\"fact\":\"There are more than 500 million domestic cats in the world, with approximately 40 recognized breeds.\",\"length\":100}"}

Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #21

· 2 min read

Welcome to This Week in Fluvio, our weekly newsletter for development updates to Fluvio open source. Fluvio is a distributed, programmable streaming platform written in Rust.


New Release - Fluvio v0.9.18

This release was heavily focused on stability improvements

Local Connector development fix

Previously, if you were developing your own connector, creating connectors would fail because during the creation of the connector pod, Kubernetes would always try to pull from Docker Hub.

You can control this behavior in your connector config through the version key.

If version is dev, Kubernetes will expect an image in its local registry with a name matching the pattern infinyon/fluvio-connect-<your connector name>:latest.

Otherwise, the value of version will refer to the image tag to pull from Docker Hub.

e.g.

infinyon/fluvio-connect-<your connector name>:<version>

Connector config parameter behavior change

Connector key parameters that include underscores in their name will covert the underscores into hyphens.

For example, in this example config

# example-connector-config.yml
[...]
parameters:
some_parameter: foo

The keys under parameters are used as CLI arguments to your connector config.

Prior to this release, the argument would render as --some_parameter=foo.

Now the some_parameter key will will render as --some-parameter=foo


Get in touch with us on Github Discussions or join our Discord channel and come say hello!

For the full list of changes this week, be sure to check out our CHANGELOG.

Until next week!

This Week in Fluvio #20

· 7 min read

Welcome to This Week in Fluvio, our weekly newsletter for development updates to Fluvio open source. Fluvio is a distributed, programmable streaming platform written in Rust.


Local Connector Development

note

This is an archival newsletter entry. These days if you want to run or develop and run a connector see the cdk cli utility

This week we wanted to walkthrough the process of developing a connector.

We'll approach connector development in this order:

  1. Build and run your client locally
  2. Package your Client into Docker Image
  3. Load your Connector image into Kubernetes Cluster
  4. Create a new Connector in Fluvio

Tools needed to follow this guide

You'll need the following tools installed

Build and Run Your Client Locally

Let's start with building a simple client.

This client is written in Python, but we have client libraries for Rust, Javascript, Java and Go.

#!/usr/bin/env python3
#
# get-cat-facts.py
# An example Python-based Fluvio connector

from fluvio import Fluvio
import requests
import time

WAIT_SECONDS = 10
CAT_FACTS_API = 'https://catfact.ninja/fact'
CAT_FACTS_TOPIC = 'cat-facts-random'

if __name__ == '__main__':
# Before entering event loop
# Connect to cluster and create a producer before we enter loop
fluvio = Fluvio.connect()
producer = fluvio.topic_producer(CAT_FACTS_TOPIC)

# Event loop
while True:
# Get random cat fact
catfact = requests.get(CAT_FACTS_API)

# Save fact
producer.send_string(catfact.text)

# Print fact to container logs
print(catfact.text)

# Be polite and control the rate we send requests to external API
time.sleep(WAIT_SECONDS)

Before we run this code, we need to create the fluvio topic that our client produces data to

$ fluvio topic create cat-facts-random

Install fluvio package:

$ pip install fluvio requests

Running the Python code prints out a new cat fact every 10 seconds

$ python3 ./get-cat-facts.py
{"fact":"Cats bury their feces to cover their trails from predators.","length":59}
{"fact":"Cats step with both left legs, then both right legs when they walk or run.","length":74}

And we verify that these records have made it to the Fluvio cluster by consuming from the topic.

$ fluvio consume cat-facts-random -B
Consuming records from the beginning of topic 'cat-facts-random'
{"fact":"Cats bury their feces to cover their trails from predators.","length":59}
{"fact":"Cats step with both left legs, then both right legs when they walk or run.","length":74}

Package Your Client into Docker Image

Now that we have verfied that our client works locally, we need to package it for Kubernetes. We do that by defining our data connector runtime environment with a Dockerfile.

We use the python base image to keep things simple.

Then we create a new user fluvio with a home directory (in /home/fluvio).

This is required for all connectors. The Fluvio cluster shares information with the fluvio user on startup.

# Dockerfile
FROM python

# Copy our python script into the connector image
COPY get-cat-facts.py /usr/local/sbin/get-cat-facts.py
RUN chmod +x /usr/local/sbin/get-cat-facts.py

# This is required to connect to a cluster
# Connectors run as the `fluvio` user
ENV USER=fluvio
RUN useradd --create-home "$USER"
USER $USER

# Install dependencies
RUN pip install fluvio requests

# Start script on start
ENTRYPOINT get-cat-facts.py

Build and Test the Container

You can build the Docker image with this command.

$ docker build -t infinyon/fluvio-connect-cat-facts .

The image should have been created

$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
infinyon/fluvio-connect-cat-facts latest 08ced64017f0 5 seconds ago 936MB
...

CAUTION The image name infinyon/fluvio-connect-cat-facts will be significant when we create the connector in the Fluvio cluster with fluvio connector create. CAUTION

Start a the container with this docker command

$ docker run -it --rm -v $HOME/.fluvio:/home/fluvio/.fluvio --network host infinyon/fluvio-connect-cat-facts
{"fact":"In the 1750s, Europeans introduced cats into the Americas to control pests.","length":75}
...
<CTRL>-C

You can check out docker run --help if you want a description of what these do, so I'll describe why you want to use them instead.

Docker optionWhy you want it
-itWithout both -i and -t, you can't use ctrl+c to exit your container
--rmPrevents accumulating Docker-related mess on your host while you are testing your connector
-v $HOME/.fluvio:/home/fluvio/.fluvioShare existing Fluvio config - This is effectively what fluvio connector create does
--network hostThe default docker network can't reach the host's Fluvio cluster

Load Your Connector Image into Kubernetes Cluster

By default, connectors will attempt to pull images from the internet. However, development images need to be testable locally before making them available publicly. So we pre-load the connector images.

Let's create a cluster called fluvio

$ k3d cluster create fluvio

And import the image (use the name of your cluster)

$ k3d image import infinyon/fluvio-connect-cat-facts --cluster fluvio

The image should have been created

$ docker exec k3d-fluvio-server-0 sh -c "ctr image list -q"
docker.io/infinyon/fluvio-connect-cat-facts:latest
...

Load image to minikube:

$ minikube image load infinyon/fluvio-connect-cat-facts

Create a new Connector in Fluvio

Last step for testing our connector is verifying that it runs in the Fluvio cluster. We will create the config file and run the CLI command

The Connector config

Create a connector configuration file example-connector.yaml:

# example-connector.yaml
version: dev
name: cat-facts-connector
type: cat-facts
topic: cat-facts-random
direction: source
Connector config optionDescription
versionThis value must be dev for local development.
nameA unique name for this connector.

It will be displayed in fluvio connector list
typeThe value of this name will be used for tagging image before loading into Kubernetes.

Connector image names follow the pattern: infinyon/fluvio-connect-{type}
topicThe name of the Fluvio topic where the connector will publish the data records. The Fluvio topic will be automatically created if the Fluvio topic does not exist.
directionThe metadata that defines the direction of data flow (source or sink).

This is a source connector.

Lastly, create the connector

$ fluvio connector create --config example-connector.yaml
$ fluvio connector list
NAME STATUS
cat-facts-connector Running

We can look at the container logs, and verify the topic has our records.

$ fluvio connector logs cat-facts-connector
{"fact":"Cats eat grass to aid their digestion and to help them get rid of any fur in their stomachs.","length":92}
{"fact":"When a cat drinks, its tongue - which has tiny barbs on it - scoops the liquid up backwards.","length":92}
{"fact":"Cats and kittens should be acquired in pairs whenever possible as cat families interact best in pairs.","length":102}

And again, to verify we check the contents of the topic. We see the last 3 rows match.

$ fluvio consume cat-facts-random -B
Consuming records from the beginning of topic 'cat-facts-random'
{"fact":"Cats bury their feces to cover their trails from predators.","length":59}
{"fact":"Cats step with both left legs, then both right legs when they walk or run.","length":74}
{"fact":"Cats can jump up to 7 times their tail length.","length":46}
{"fact":"A cat can jump 5 times as high as it is tall.","length":45}
{"fact":"Cats eat grass to aid their digestion and to help them get rid of any fur in their stomachs.","length":92}
{"fact":"When a cat drinks, its tongue - which has tiny barbs on it - scoops the liquid up backwards.","length":92}
{"fact":"Cats and kittens should be acquired in pairs whenever possible as cat families interact best in pairs.","length":102}

Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #19

· 2 min read

Welcome to This Week in Fluvio, our weekly newsletter for development updates to Fluvio open source. Fluvio is a distributed, programmable streaming platform written in Rust.


New Release - Fluvio v0.9.17

Tune configuration defaults to Producer auto-batching

This is a small change to the default auto-batching producer configuration linger and batch size.

  • Linger time was set to 250ms but now it is 100ms.
  • Batch size was 16000 but now is 16384 (That's 214)

If you want to configure your producer to something other than the defaults, you can set a different value for your producer when you create the config.

Example of auto-batching producer w/ linger of 1 second and a batch size of 100 bytes.

%copy%

let fluvio_client = Fluvio::connect().await?;
let config = TopicProducerConfigBuilder::default()
.linger(Duration::from_secs(1))
.batch_size(100)
.build()
.expect("failed to build config");

let producer: TopicProducer = fluvio_client
.topic_producer_with_config(topic, config)
.await;

CLI consumer output change

This is a fix to the CLI consumer using the --format output feature. Due to the default behavior of handlebars, the templating engine we use for custom formatting on the CLI, we were unintentionally HTML escaping record data. But that is not longer the behavior.

Before:

# Fluvio v0.9.16 and older
$ fluvio consume twif19 --format "{{offset}} {{value}}" -B -d
Consuming records from the beginning of topic 'twif19'
0 {&quot;examplekey&quot;:&quot;examplevalue&quot;}

After:

# Fluvio v0.9.17+
$ fluvio consume twif19 --format "{{offset}} {{value}}" -B -d
Consuming records from the beginning of topic 'twif19'
0 {"examplekey":"examplevalue"}

Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #18

· 4 min read

Welcome to This Week in Fluvio, our weekly newsletter for development updates to Fluvio open source. Fluvio is a distributed, programmable streaming platform written in Rust.


New Release - Fluvio v0.9.16

Time-based Data Retention

We have added time-based retention policy for data stored in topics. When records are created, we keep track of its age. When a record's age reaches the same duration as the retention policy, it is purged.

You can configure the retention duration time when you create a topic.

%copy%

# Some example durations: '1h', '2d 10s', '7 days'
$ fluvio topic create <topic-name> --retention-time <time>

Along with the introduction of retention policy, new topics will be created with a default 7 day retention.

Docs about retention policy coming soon.

Auto-batching producer

For processing live data, using a batching workflow for sending records improves the efficiency of your data transfers by increasing throughput and reducing latency for each producer send. (As opposed to sending records individually)

Producer batch support already exists in the CLI using fluvio produce, but you can realistically only use this CLI feature if you produce with the --file option.

Using the Rust API, you could have used send_all, but this primarily enables sending multiple records whenever called. Using send_all by itself didn't ensure a consistent behavior.

At the end of the day, it meant that if you want time-based or size-based batching, this was extra effort for the developer to implement themselves.

In this release, we make it easier to use batching in the Rust API. To use create an auto-batching Producer, you need to create your TopicProducer configured with batch and/or linger.

Example:

%copy%

let fluvio_client = Fluvio::connect().await?;
let config = TopicProducerConfigBuilder::default()
.linger(Duration::from_millis(600000))
.batch_size(17)
.build()
.expect("failed to build config");

let producer: TopicProducer = fluvio_client
.topic_producer_with_config(topic, config)
.await;

For more detail on the available config options, see the Rust docs

CLI Release Channel

The ability to test pre-release changes in CLI is now easier to do with CLI channels.

More documentation is coming soon, but if you're familiar with Rust's release channels, you'll be familiar with Fluvio's CLI channels.

New Fluvio installations support the ability to switch back and forth between the most recent stable release or the latest development builds of the Fluvio CLI.

CLI channels will be especially useful for the current users who have reached out to us on Discord. Now we can more easily work together to quickly validate fixes to issues without the need to build the Fluvio code locally.

To try out channels now, you will need to re-install Fluvio with the instructions on the download page. This will download the channel-enabled frontend and the most recent stable release.

# Switch to the `latest` channel
$ fluvio version switch latest
# Switch to the `stable` channel
$ fluvio version switch stable

Consume to end offset (CLI)

In the CLI, to start consuming records for a specific starting offset, you would use the --offset flag. Now you can also provide a final offset to close the Consumer stream when reached with the --end-offset flag.

Example 1:

  • In Terminal 1, we open a consumer stream from the beginning of topic twif with an ending offset of 5.
  • In Terminal 2, we use fluvio produce to send over 10 records, which we will show first.

Terminal 1 - Producer:

$ fluvio produce twif
> 0
Ok!
> 1
Ok!
> 2
Ok!
> 3
Ok!
> 4
Ok!
> 5
Ok!
> 6
Ok!
> 7
Ok!
> 8
Ok!
> 9
Ok!
> 10
Ok!

Terminal 2 - Record indexing is 0-based, so we expect the stream to close when we receive the 6th record.

$ fluvio consume -B --end-offset 5 twif
Consuming records from the beginning of topic 'twif'
0
1
2
3
4
5

Consumer stream has closed

Example 2:

We can also use a starting offset and ending offset together. As a result you can capture chunks of continuous blocks of records.

Here we use the existing twif topic, and consume a small subset of the records we produced earlier between offset 3-7 (inclusive).

$ fluvio consume --offset 3 --end-offset 7 twif
Consuming records from offset 3 in topic 'twif'
3
4
5
6
7

Consumer stream has closed

Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #17

· 2 min read

Welcome to This Week in Fluvio, our weekly newsletter for development updates to Fluvio open source. Fluvio is a distributed, programmable streaming platform written in Rust.


New Release - Fluvio v0.9.15

TableFormat improvements

Prior to this release, rendering events with fluvio consume --output=full-table required events to arrive as individual JSON objects. But now we can additionally accept a list of events as an array of JSON objects.

Example topic data. Mix of JSON objects, and an array of objects.

{"key1":"a","key2":"1","key3":"Alice","id":123}
{"key1":"b","key2":"2","key3":"Bob","id":456}
{"key1":"c","key2":"3","key3":"Carol","id":789}
[{"key1":"x","key2":"10","key3":"Alice","id":123},{"key1":"y","key2":"20","key3":"Bob","id":456},{"key1":"c","key2":"30","key3":"Carol","id":789}]

Check out the TableFormat docs for more information about using TableFormat.

Migration to Rust edition 2021

This update affects those using our Rust API. Our crates have transitioned to the new Rust 2021 edition.

If you want to migrate your existing projects with our crates, you can follow the official Rust edition guide

And finally edit your Cargo.toml to use the new edition.

edition = "2021"

Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #16

· 2 min read

Welcome to This Week in Fluvio, our weekly newsletter for development updates to Fluvio open source. Fluvio is a distributed, programmable streaming platform written in Rust.


New Release - Fluvio v0.9.14

Connector logs

Logs from connectors are now accessible from the CLI

fluvio connector logs <connector name>

In this example output, we can see some error output from the connector that would have previously been inaccessible.

Example:

$ fluvio connector logs my-test-mqtt -f

2021-12-15T05:20:21.770042Z ERROR mqtt: Mqtt error MqttState(Deserialization(PayloadSizeLimitExceeded(16635)))
2021-12-15T05:20:27.684853Z ERROR mqtt: Mqtt error MqttState(Deserialization(PayloadSizeLimitExceeded(16635)))

CLI Consumer wait spinner

Using the CLI consumer now has a new spinner to give feedback about the stream waiting for data.

Here's a short clip of the spinner in action with streaming data.

Change to multi-word subcommands

Given that these are still long commands, we've added aliases too

Previous subcommandNew subcommandNew alias
smartmodulesmart-modulesm
tableformattable-formattf
derivedstreamderived-streamds

CLI shell autocompletions

This actually isn't new, but it was hidden. Thanks to bohlmannc for helping us resolve this!

$ fluvio completions -h
fluvio-completions 0.0.0
Generate command-line completions for Fluvio

Run the following two commands to enable fluvio command completions.

Open a new terminal for the changes to take effect.

$ fluvio completions bash > ~/fluvio_completions.sh
$ echo "source ~/fluvio_completions.sh" >> ~/.bashrc

USAGE:
fluvio completions <SUBCOMMAND>

FLAGS:
-h, --help Prints help information

SUBCOMMANDS:
bash Generate CLI completions for bash
fish Generate CLI completions for fish
help Prints this message or the help of the given subcommand(s)

Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

Until next week!

This Week in Fluvio #15

· 6 min read

Welcome to This Week in Fluvio, our weekly newsletter for development updates to Fluvio open source. Fluvio is a distributed, programmable streaming platform written in Rust.

BANNER

New Release - Fluvio v0.9.13

This is a big release

Apple M1 support

This is the first release with official Apple M1 support. If you have an Apple M1 machine and you'd like to try out Fluvio, please read our Getting Started page for MacOS.

Getting Started

SmartModules

We previewed this feature a couple weeks ago and now it's here!

All SmartModule types (filter, map, array-map, filter-map, aggregate, joins) can saved into your Fluvio cluster and you can use them just by referring to them by name.

Example creating of SmartModule:

$ fluvio smartmodule create my-filter --wasm-file ./path/to/my-filter.wasm

Example usage:

$ fluvio consume my-topic --filter <name>
$ fluvio consume my-topic --map <name>
$ fluvio consume my-topic --array-map <name>
$ fluvio consume my-topic --filter-map <name>
$ fluvio consume my-topic --aggregate <name>
$ fluvio consume my-topic --join <name>

You can still use SmartModules the original way, by providing a path to your wasm file. But if you're using the SmartModule a lot, we think persistent SmartModules will be more convenient to use.

SmartConnectors

This feature was teased last week, but now it is ready to be tried out.

Check out the new Connector Developer guide for more information about how to create your own connectors.

FilterMap SmartModule

The FilterMap SmartModule enables you to do filtering and reshaping your data at the same time.

Example FilterMap code

In this example, we take in integers.

If those integers are positive, we want to divide it by two and return. Filter out inputs if they are odd.

use fluvio_smartmodule::{smartmodule, Record, RecordData, Result};

#[smartmodule(filter_map)]
pub fn filter_map(record: &Record) -> Result<Option<(Option<RecordData>, RecordData)>> {
let key = record.key.clone();
let string = String::from_utf8_lossy(record.value.as_ref()).to_string();
let int: i32 = string.parse()?;

if int % 2 == 0 {
let output = int / 2;
Ok(Some((key.clone(), RecordData::from(output.to_string()))))
} else {
Ok(None)
}
}

Link to example code

Example input

$ fluvio produce filter-map-topic
> 19
Ok!
> 30
Ok!
> 29
Ok!
> -60
Ok!
> 23
Ok!
> 90
Ok!
> 17
Ok!
> ^C

Example output

$ fluvio consume filter-map-topic --filter-map divide-even-numbers
Consuming records from the end of topic 'filter-map-topic'. This will wait for new records
15
-30
45

For a deeper dive into FilterMap, check out our blog post which covers a use-case.

Join SmartModule

The Join SmartModule uses the stream you are consuming and the value at the end of another topic and allows you to return a new value

Example Join code

In this example, we have 2 topics, left-topic and right-topic, and our example Join SmartModule.

use fluvio_smartmodule::{smartmodule, Record, RecordData, Result};

#[smartmodule(join)]
pub fn join(left_record: &Record, right_record: &Record) -> Result<(Option<RecordData>, RecordData)> {
let left_value: i32 = std::str::from_utf8(left_record.value.as_ref())?.parse()?;
let right_value: i32 = std::str::from_utf8(right_record.value.as_ref())?.parse()?;
let value = left_value + right_value;

Ok((None, value.to_string().into()))
}

Example input for topic right-topic

$ fluvio produce right-topic
> 2
Ok!

Now, when new records come in to left-topic, we will add the new record to the newest record at right-topic.

$ fluvio produce left-topic
> 7
Ok!
> 11
Ok!
> 19
Ok!

Example consume output

$ fluvio consume left-topic --join example-join --join-topic right-topic
Consuming records from the end of topic 'left-topic'. This will wait for new records
9
13
21

And if we change the top value on right-topic

$ fluvio produce right-topic
> -5
Ok!

Then produce new values to left-topic

$ fluvio produce left-topic
> 3
Ok!
> 10
Ok!
> 27
Ok!

The resulting consume output will reflect the new values to left-topic adding itself to the new negative value at right-topic

$ fluvio consume left-topic --join example-join --join-topic right-topic
Consuming records from the end of topic 'left-topic'. This will wait for new records
9
13
21
-2
5
22

Link to example code

Fullscreen Consumer table

This is the first version of an interactive table display for the Consumer. It expects the same json object input as --output=table but the output is a full screen scrollable table. The columns are alphabetized, and the first column serves as a primary key for updating the row. This offers the possibility of viewing live updating data

First you need to create a tableformat to define how you want your table to be displayed. This would be highly dependent on the shape of your data. For example purposes, we will be displaying event-sourced data. This is what our example data looks like:

{"request_id":"123", "requester_name": "Alice", "state":"running", "run_time_minutes":"3"}
{"request_id":"456", "requester_name": "Alice", "state":"waiting", "run_time_minutes":"9"}
{"request_id":"789", "requester_name": "Bob", "state":"done", "run_time_minutes":"10"}

An example tableformat.

Here we only want to display the latest state of a request. We declare the request_id key as primaryKey, which means that any new events matching an existing request_id will update the row.

# tableformat-config.yaml
name: "current-requests"
input_format: "JSON"
columns:
- headerLabel: "ID"
keyPath: "request_id"
primaryKey: true
- headerLabel: "Runtime"
keyPath: "run_time_minutes"
- headerLabel: "State"
keyPath: "state"

Create the tableformat

$ fluvio tableformat create --config tableformat-config.yaml

Consuming from the topic using the full-table output, and our tableformat

$ fluvio consume request-events --output full-table --tableformat current-requests

Output:

('c' to clear table | 'q' or ESC to exit) | Items: 3──┐
│ID Runtime State │
│123 3 running │
│456 9 waiting │
│789 10 done
└──────────────────────────────────────────────────────┘

Docs for this feature will be coming soon!

A few bug fixes

A handful of user-facing issues were fixed

  • Creating a connector that creates a topic will not fail if the topic already exists (#1823)
  • Ability to create Kubernetes-based clusters on MacOS was restored (#1867)
  • Aggregate SmartModule fixed to properly accumulate from previous values instead defaulting to the initial value. (#1869) (Additional special thanks to our Discord community for reporting this bug!)

Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

Until next week!

This Week in Fluvio #14

· 3 min read

Welcome to This Week in Fluvio, our weekly newsletter for development updates to Fluvio open source. Fluvio is a distributed, programmable streaming platform written in Rust.


No new release

We didn't have a new release this week.

Coming soon: SmartConnectors

Last week we teased SmartModules. One of the things we've been able to do with SmartModules is integrate them with our Connectors. We're calling our data Connectors + SmartModules: SmartConnectors.

Using SmartConnectors is easy if you're already using SmartModules. You only need to add a reference to the SmartModule in your source or sink Connector config.

Let's look at an example. We're going to look at a SmartConnector from the perspective of the config

# config.yaml
version: 0.2.0
name: my-test-mqtt
type: mqtt
topic: public-mqtt
direction: source
parameters:
mqtt_topic: "testtopic/#"
map: "example-parse-mqtt-map"
secrets:
MQTT_URL: "mqtts://broker.hivemq.com:8883

In this example, we're using the my-test-mqtt connector we introduced in a previous TWiF to get a live bytestream from an MQTT broker and store it in a topic. But before we store it, we want to parse and transform the raw bytestream into our own types with a SmartModule.

Here's the SmartModule Map we're going to use to transform the connector output.

use fluvio_smartmodule::{smartmodule, Result, Record, RecordData};
use serde::{Deserialize, Serialize};

#[derive(Debug, Serialize, Deserialize)]
struct MqttEvent {
mqtt_topic: String,
payload: Vec<u8>,
}

#[derive(Debug, Serialize, Deserialize, Default)]
pub struct MyMessage {
key1: String,
key2: String,
}

#[smartmodule(map)]
pub fn map(record: &Record) -> Result<(Option<RecordData>, RecordData)> {
// Parse the MQTT event
let mqtt_event: MqttEvent = serde_json::from_slice(record.value.as_ref())?;

// Try to parse the payload bytes
// If we can, return it
let (value, value_record) = if let Ok(my_message) = serde_json::from_slice::<MyMessage>(&mqtt_event.payload) {
// Return the parsed payload
let parsed_value = serde_json::to_vec(&my_message)?;
let parsed_record = RecordData::from(parsed_value);

(record.key.clone(), parsed_record)
} else {
// otherwise return the default value of our type
let default_value = serde_json::to_vec(&MyMessage::default())?;
let default_record = RecordData::from(default_value);

(record.key.clone(), default_record)
};

Ok((value, value_record))
}

Compile and load the SmartModule WASM module

$ cargo build --release --target wasm32-unknown-unknown

Then create the SmartModule like this. Our connector is expecting a SmartModule map named example-parse-mqtt-map

$ fluvio smartmodule create example-parse-mqtt-map --wasm-file ./target/wasm32-unknown-unknown/release/module.wasm

We just need to make sure the SmartModule we're referencing has been created. Let's check with fluvio smartmodule list.

$ fluvio smartmodule list
NAME STATUS SIZE
example-parse-mqtt-map SmartModuleStatus 164590

Now we'll create our connector the usual way

$ fluvio connector create --config ./path/to/config.yaml
$ fluvio connector list
NAME STATUS
my-test-mqtt Running

So lastly, we'll look at the data in the topic. Just for comparison, I'll show data that would have been produced both without the SmartModule and with the SmartModule.

Example topic data without using our SmartModule

{"mqtt_topic":"testtopic/a_home/temp","payload":[51,56]}
{"mqtt_topic":"testtopic/a_home/menu/reg","payload":[50]}
{"mqtt_topic":"testtopic/a_home/menu/rele1","payload":[49]}
{"mqtt_topic":"testtopic/a_home/menu/rele2","payload":[49]}
{"mqtt_topic":"testtopic/a_home/menu/pwm1","payload":[51,48]}
{"mqtt_topic":"testtopic/a_home/menu/pwm2","payload":[51,48]}
{"mqtt_topic":"testtopic/a_home/menu/tc","payload":[49,53]}
{"mqtt_topic":"testtopic/a_home/menu/tzad","payload":[52,53]}
{"mqtt_topic":"testtopic/fluvio","payload":[123,34,107,101,121,49,34,58,34,83,109,97,114,116,67,111,110,110,101,99,116,111,114,32,69,120,97,109,112,108,101,34,44,34,107,101,121,50,34,58,34,72,101,108,108,111,32,119,111,114,108,100,33,34,125]}

Example topic data using our SmartModule

{"key1":"","key2":""}
{"key1":"","key2":""}
{"key1":"","key2":""}
{"key1":"","key2":""}
{"key1":"","key2":""}
{"key1":"","key2":""}
{"key1":"","key2":""}
{"key1":"","key2":""}
{"key1":"SmartConnector Example","key2":"Hello world!"}

And we see clear differences. With the inclusion of the SmartModule, we've transformed our external data to our needs.

SmartConnectors are coming soon, so keep an eye out for a future release.


Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

This Week in Fluvio #13

· 2 min read

Welcome to This Week in Fluvio, our weekly newsletter for development updates to Fluvio open source. Fluvio is a distributed, programmable streaming platform written in Rust.


No new release

We didn't have a new release this week.

We're working on a bigger feature and it isn't quite ready. But we can talk about it.

Coming soon: SmartModules

The SmartModules are what we are calling SmartStream code that can uploaded stored on SPUs.

Currently, using a SmartStream filter with a consumer on the CLI, you have to provide a local file path to your filter. Sometimes this can lead to very long commands.

$ fluvio consume my-topic --filter ./target/wasm32-unknown-unknown/release/path/to/my-data-filter.wasm

Soon, you can store your SmartModules on the Fluvio SPUs and give it a name.

$ fluvio smartmodule create <name> --wasm-file ./target/wasm32-unknown-unknown/release/path/to/my-data-filter.wasm

Afterwards, you can apply your SmartModule by referring to it by its name.

$ fluvio consume my-topic --filter <name>

Btw, this new functionality will be in addition to the existing behavior. You can continue using local wasm filters, but we think that if you are using the same filter frequently, you will find SmartModules to be a convenient option. Especially if you are using many devices using the same SmartModule.


Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

Until next week!

This Week in Fluvio #12

· 2 min read

Welcome to This Week in Fluvio, our weekly newsletter for development updates to Fluvio open source. Fluvio is a distributed, programmable streaming platform written in Rust.


New Release - Fluvio v0.9.12

ArrayMap examples

This is an addition to last week's release of ArrayMap. We have some new examples.

Improved user experience when loading invalid SmartStream code

Prior to this release, trying to use invalid SmartStream code (but valid Rust code) with a consumer would result in the stream closing without a specific reason being reported to the user.

A user can compile their own SmartStream code, but if they forget to decorate their function with the #[smartstream] attribute, it will still compile because it is valid Rust. However, not all valid Rust code is valid SmartStream code.

Example invalid SmartStream code:

%copy%

use fluvio_smartstream::{Record, Result};

// Note the missing #[smartstream] attribute that should be here!
pub fn filter(record: &Record) -> Result<bool> {
let string = std::str::from_utf8(record.value.as_ref())?;
Ok(string.contains('a'))
}

Previous experience using the invalid SmartStream filter:

%copy first-line%

$ fluvio consume fruits --tail --filter=crates/fluvio-smartstream/examples/target/wasm32-unknown-unknown/debug/fluvio_wasm_filter.wasm
Consuming records starting 10 from the end of topic 'fruits'
Consumer stream has closed

Now, if a user tries to use incorrectly compiled SmartStream code they will receive an error message.

%copy first-line%

$ fluvio consume fruits --tail --filter=crates/fluvio-smartstream/examples/target/wasm32-unknown-unknown/debug/fluvio_wasm_filter.wasm
Consuming records starting 10 from the end of topic 'fruits'
Error:
0: Fluvio client error
1: SmartStream error
2: WASM module is not a valid 'filter' SmartStream. Are you missing a #[smartstream(filter)] attribute?

Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

Until next week!

This Week in Fluvio #11

· 4 min read

Welcome to This Week in Fluvio, our weekly newsletter for development updates to Fluvio open source. Fluvio is a distributed, programmable streaming platform written in Rust.


New Release - Fluvio v0.9.11

Producer Auto-reconnect

Prior to this release, if a client encountered any network hiccups that caused a disconnection during a Producer session, the Producer would experience an error.

Now the Producer will try to reconnect automatically.

SmartStreams w/ inputs

Using SmartStreams is a little more flexible now.

Previously, a SmartStream filter only supported logic with inputs that were hardcoded and compiled before using. This meant that we needed a separate filter per pattern we wanted to match on.

However we've added the capability to pass in user inputs at the time of execution. So we can have a single filter covering multiple patterns based on how we use it.

This is an example filter that takes in a parameter. Our named parameter is key, and the default value is the string value of the letter a. Keep following, as we see how this default influences the behavior of the filter.

use fluvio_smartstream::{smartstream, SmartOpt, Record, Result};

#[derive(SmartOpt)]
pub struct FilterOpt {
key: String
}

impl Default for FilterOpt {
fn default() -> Self {
Self {
key: "a".to_string()
}
}
}

#[smartstream(filter, params)]
pub fn filter(record: &Record, opt: &FilterOpt) -> Result<bool> {
let string = std::str::from_utf8(record.value.as_ref())?;
Ok(string.contains(&opt.key))
}

Continuing this example, we are producing a list of fruits to our topic fruits

Producer:

$ echo "grape" | fluvio produce fruits
$ echo "strawberry" | fluvio produce fruits
$ echo "plum" | fluvio produce fruits
$ echo "raspberry" | fluvio produce fruits
$ echo "mango" | fluvio produce fruits

Here, a Consumer listens to the fruits topic using the filter without providing a value for key. The expected behavior is to fallback to the default value of key we defined.

We see that this matches every fruit except for plum, because it doesn't have the letter a in it.

$ fluvio consume --filter fluvio_wasm_filter_with_parameters.wasm fruits
Consuming records from the end of topic 'fruits'. This will wait for new records
grape
strawberry
raspberry
mango

This is another Consumer also listening to the fruits topic. This time passing in a new value for key with the --extra-params (-e for short) to override the default of a.

Instead of fruits with a, we want any fruit with the word berry. We see that now the output only includes strawberry and raspberry, as expected.

$ fluvio consume --filter fluvio_wasm_filter_with_parameters.wasm fruits --extra-params key=berry
Consuming records from the end of topic 'fruits'. This will wait for new records
strawberry
raspberry

SmartStreams ArrayMap

This is a new type of SmartStream API that will make it easier to chunk up large datasets into smaller pieces.

Here's an example that shows what an array_map SmartStream looks like:

use fluvio_smartstream::{smartstream, Record, RecordData, Result};

#[smartstream(array_map)]
pub fn array_map(record: &Record) -> Result<Vec<(Option<RecordData>, RecordData)>> {
// Deserialize a JSON array with any kind of values inside
let array: Vec<serde_json::Value> = serde_json::from_slice(record.value.as_ref())?;

// Convert each JSON value from the array back into a JSON string
let strings: Vec<String> = array
.into_iter()
.map(|value| serde_json::to_string(&value))
.collect::<core::result::Result<_, _>>()?;

// Create one record from each JSON string to send
let records: Vec<(Option<RecordData>, RecordData)> = strings
.into_iter()
.map(|s| (None, RecordData::from(s)))
.collect();
Ok(records)
}

Let's say we have a topic array-map-array, and we produce data in the form of a JSON array

$ echo '["Apple", "Banana", "Cranberry"]' | fluvio produce array-map-array

If we have a consumer listening to this topic and with the ArrayMap SmartStream, we get output that looks like this:

$ fluvio consume array-map-array --array-map fluvio_wasm_array_map_array.wasm
Consuming records from the end of topic 'array-map-array'. This will wait for new records
"Apple"
"Banana"
"Cranberry"

Docs for ArrayMap are available here. For a more hands-on explanation with a real-world example, please read our blog post demonstrating the capabilities of #[smartstream(array_map)].


Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

Until next week!

This Week in Fluvio #10

· 2 min read

Welcome to This Week in Fluvio, our weekly newsletter for development updates to Fluvio open source. Fluvio is a distributed, programmable streaming platform written in Rust.


No new release

We didn't have a new release this week.

Instead the team met up in person for the first time!

Team photo at Alcatraz

This is us trying to not look overheated after the audio tour at Alactraz.

New managed connector

MQTT

We have a new connector for the MQTT protocol available.

Example config

# config.yaml
version: 0.2.0
name: my-test-mqtt
type: mqtt
topic: public-mqtt
direction: source
parameters:
mqtt_topic: "testtopic/#"
secrets:
MQTT_URL: "mqtts://broker.hivemq.com:8883"
foo: bar
$ fluvio cluster connector create --config config.yaml
$ fluvio cluster connector list

-------------
NAME STATUS
my-test-mqtt Running

The test connector produces to a topic public-mqtt, where each record is an mqtt record corresponding to our configured mqtt topic.

The testtopic/fluvio payload, for example, says "hello world"

%copy first-line%

$ fluvio consume public-mqtt --tail -d
Consuming records starting 10 from the end of topic 'public-mqtt'
{"mqtt_topic":"testtopic/fluvio","payload":[104,101,108,108,111,32,119,111,114,108,100]}
{"mqtt_topic":"testtopic/a_home/temp","payload":[50,54]}
{"mqtt_topic":"testtopic/a_home/menu/reg","payload":[49]}
{"mqtt_topic":"testtopic/a_home/menu/rele1","payload":[48]}
{"mqtt_topic":"testtopic/a_home/menu/rele2","payload":[48]}
{"mqtt_topic":"testtopic/a_home/menu/pwm1","payload":[51,48]}
{"mqtt_topic":"testtopic/a_home/menu/pwm2","payload":[51,48]}
{"mqtt_topic":"testtopic/a_home/menu/tc","payload":[49,48,48]}
{"mqtt_topic":"testtopic/a_home/menu/tzad","payload":[50,49,52,55,52,56,51,54,52,55]}

In order to keep this connector generic, the payload is encoded as bytes. Rest assured that we'll provide more documentation for best practice in the future.

To stop the connector, you need to delete it:

$ fluvio cluster connector delete my-test-mqtt

See more info about connectors

Features coming soon

Table

Tables will enable materialized views with your structured JSON/YAML/TOML data. You will be able to select and format specific keys for display into a table.

This feature is not yet ready to use, but you may notice this command available CLI.

$ fluvio table

Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

Until next week!

This Week in Fluvio #9

· 4 min read

Welcome to This Week in Fluvio, our weekly newsletter for development updates to Fluvio open source. Fluvio is a distributed, programmable streaming platform written in Rust.

New Release - Fluvio v0.9.10

SmartStreams error handling improvements

SmartStreams users will now see error messages if there any problems loading their WASM modules.

Previously, these errors would occur within the Fluvio cluster while trying to consume from a topic, and the connection would abruptly close without an explanation.

Before

$ fluvio consume my-topic -B --filter=invalid-smartstream.wasm
Consuming records from the beginning of topic 'echo'
Consumer stream has closed

from: https://github.com/infinyon/fluvio/issues/1143#issuecomment-873432660

Now the client will report a more helpful error message

$ fluvio consume my-topic -B --filter=invalid-smartstream.wasm
Consuming records from the beginning of topic 'echo'
Error:
0: Fluvio client error
1: SmartStream error
2: WASM Module error: failed to parse WebAssembly module

Producer error handling improvements

Producer connection errors that occur within the Fluvio cluster will also report a more user friendly error message.

Before

$ fluvio produce my-topic
[...]
Error:
0: Consumer Error
1: Fluvio client error
2: Fluvio socket error
3: time out in serial: 0 request: 1

After

$ fluvio produce topic
[...]
Timed out while waiting on socket response

New feature

Managed Connectors

We have always wanted to provide our users with the capability to work with their data (wherever it is) in a meaningful way without the need to write and maintain custom code. Our new feature (which we've been calling Managed Connectors or connectors for short) provides the structure to accomplish this feat.

The idea is that a developer can create a custom connector that will do the following:

  • Read in from your data source and store in a Fluvio topic
  • Send out from a Fluvio topic to your data store

These connectors can then be re-used by anyone without the need to maintain custom code.

First look

The rest of this post is first look at a connector in action.

We create a new connector, with configurable details stored in a this example config file config.yaml.

version: 0.1.0
name: my-test-connector
type: test-connector
topic: my-test-connector
direction: source
parameters:
topic: my-test-connector
$ fluvio cluster connector create --config config.yaml
$ fluvio cluster connector list

-------------
NAME STATUS
my-test-connector Running

The test connector produces to a topic my-test-connector, where each record says Hello, Fluvio! - # where # is a number that counts up.

$ fluvio consume my-test-connector --tail -d
Consuming records starting 10 from the end of topic 'my-test-connector'
Hello, Fluvio! - 166
Hello, Fluvio! - 167
Hello, Fluvio! - 168
Hello, Fluvio! - 169
Hello, Fluvio! - 170
Hello, Fluvio! - 171
Hello, Fluvio! - 172
Hello, Fluvio! - 173
Hello, Fluvio! - 174
Hello, Fluvio! - 175

To stop the connector, you need to delete it:

$ fluvio cluster connector delete my-test-connector

The details about connectors and documentation for creating your own connectors are coming soon! We've created the section dedicated to connectors.

If you would like to help us prioritize what connectors to create, or if this sounds interesting and you'd like to talk to us about connectors then please get in touch with us!

Communicate with us on Github Discussions or join our Discord channel and come say hello!


Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

Until next week!

This Week in Fluvio #8

· 3 min read

Welcome to This Week in Fluvio, our weekly newsletter for development updates to Fluvio open source. Fluvio is a distributed, programmable streaming platform written in Rust.

New Release - Fluvio v0.9.9

Fluvio Consumer can now stream from all partitions in a topic

As of this release, you can now use the Fluvio Consumer CLI and Rust API to stream records from all partitions in a topic, rather than just one at a time. To use this functionality on the CLI, use the new --all-partitions flag.

To see this in action, let's produce some records to a topic with 2 partitions. First, let's create the topic:

$ fluvio topic create fruits -p2
topic "fruits" created

Let's produce some records to our topic. By default, these records will be distributed to the partitions in a round-robin fashion.

$ fluvio produce fruits
> apple
Ok!
> banana
Ok!
> cranberry
Ok!
> date
Ok!
> ^C

Before Fluvio 0.9.9, you would run the consumer and only see the records from one partition at a time! If you don't specify a partition, it simply chooses partition 0.

$ fluvio consume fruits -B
Consuming records from the beginning of topic 'fruits'
apple
cranberry

With the --all-partitions flag, you'll see the records from all the partitions!

$ fluvio consume fruits -B --all-partitions
Consuming records from the beginning of topic 'fruits'
apple
cranberry
banana
date

Notice that the records don't appear in the exact order that we produced them in! That's because ordering is only guaranteed for the records in the same partition. In this case, this means that apple will always come before cranberry, and banana will always come before date. To learn more about partitioning, check out the docs on key/value records and partitions.

Addition of {{partition}} to the Fluvio Consumer --format string

With the arrival of the multi-partition consumer, we also need a way determine which records came from which partitions! You can now include {{partition}} in the consumer's format string to print out the partition of each record. From the example above, if we wanted to know what partition each record belongs to, we can run the following:

$ fluvio consume fruits -B --all-partitions --format="Partition({{partition}}): {{key}}={{value}}"
Consuming records from the beginning of topic 'fruits'
Partition(0): null=apple
Partition(1): null=banana
Partition(0): null=cranberry
Partition(1): null=date

Get in touch with us on Github Discussions or join our Discord channel and come say hello! Watch videos on our InfinyOn Youtube Channel

For the full list of changes this week, be sure to check out our CHANGELOG.

Until next week!

This Week in Fluvio #7

· 2 min read

Welcome to This Week in Fluvio, our weekly newsletter for development updates to Fluvio open source. Fluvio is a distributed, programmable streaming platform written in Rust.

New Release - Fluvio v0.9.8

Improved progress indicator for fluvio cluster start

We've been making some improvements to the Fluvio CLI to make the experience nicer for new users. This week, we've made some changes to the fluvio cluster start command including the addition of an animated spinner and some better status messages during the cluster installation.

Introducing fluvio cluster diagnostics

Fluvio is still in alpha, and occasionally things go wrong. In the past when helping users, we've found that we need to go back and forth multiple times to ask for logs and other info that helps to debug issues. This was a very time-consuming process - until now.

This week, we've added a new command to the Fluvio CLI, fluvio cluster diagnostics. This command will collect information such as logs from the Fluvio SC and SPUs, Kubernetes metadata relating to Fluvio, and Fluvio metadata such as SPU and SPU Group state. It will collect this information and place it into a diagnostics-<datetime>.tar.gz file. If you find yourself stuck and need help debugging an error, we can ask you just once for this diagnostic information, rather than needing an extended back-and-forth debugging session in chat.

Conclusion

For the full list of changes this week, be sure to check out our CHANGELOG. If you have any questions or are interested in contributing, be sure to join our Discord channel and come say hello!

Until next week!

This Week in Fluvio #6

· 3 min read

Welcome to the sixth edition of This Week in Fluvio, our weekly newsletter for development updates to Fluvio open source. Fluvio is a distributed, programmable streaming platform written in Rust.

New Releases - Fluvio v0.9.6 and v9.9.7

Custom record formatting for Consumer

The Fluvio CLI's consumer now allows you to provide your own formatting string, which it will use as a template for printing records to the screen. The formatting string may contain the variables {{offset}}, {{key}}, and {{value}}, and will substitute each record's contents into the format string.

This functionality is provided by the --format option on the consumer. Here is an example where we can output each record in a CSV format:

$ fluvio consume my-topic --format="{{offset}},{{key}},{{value}}"
Consuming records from the beginning of topic 'my-topic'
0,null,Apple
1,null,Banana
2,null,Cranberry
3,null,Date

You can get pretty creative with this. For example, you could use a format string to capture your record's contents and metadata in a JSON object by doing this:

$ fluvio consume my-topic -B --format='{"offset":{{offset}},"key":"{{key}}","value":"{{value}}"}'
Consuming records from the beginning of topic 'my-topic'
{"offset":0,"key":"null","value":"Apple"}
{"offset":1,"key":"null","value":"Banana"}
{"offset":2,"key":"null","value":"Cranberry"}

Pretty status printout for cluster installation

When using the Fluvio CLI to provision your own local Fluvio cluster, you'll now see the installation progress indicated by neat and uniform progress messages.

Before:

$ fluvio cluster start --local
checking fluvio crd attempt: 0
fluvio crd installed
Starting sc server
Trying to connect to sc at: localhost:9003, attempt: 0
Got updated SC Version0.9.6
Connection to sc suceed!
Launching spu group with: 1
Starting SPU (1 of 1)
All SPUs(1) are ready
Setting local profile
Successfully installed Fluvio!

After:

$ fluvio cluster start --local
📝 Running pre-flight checks
✅ Supported helm version is installed
✅ Supported kubernetes version is installed
✅ Kubernetes config is loadable
✅ Fluvio system charts are installed
🖥️ Starting SC server
✅ SC Launched
🤖 Launching SPU Group with: 1
🤖 Starting SPU: (1/1)
✅ SPU group launched (1)
💙 Confirming SPUs
✅ All SPUs confirmed
👤 Profile set
🎯 Successfully installed Fluvio!

The emojis let you know that we mean business!

Proper error message for invalid Topic name

There are some limitations on what a Topic may be named, and now when you try to use an invalid Topic name, you'll see a descriptive error message that tells you what characters are and are not allowed.

For example, you cannot use spaces in a topic name, so if you try to create a topic called "hello world" you'll now get the following error:

$ fluvio topic create "hello world"
Error:
0: Invalid argument: Topic name must only contain lowercase alphanumeric characters or '-'.

Conclusion

For the full list of changes this week, be sure to check out our CHANGELOG. If you have any questions or are interested in contributing, be sure to join our Discord channel and come say hello!

Shoutout to the Rustconf shoutout!

Thanks @nellshamrell for the project updates shoutout for This Week in Fluvio! This-Week-in-Rust was a huge inspiration for this newsletter, and we thank you for your tireless work assembling great Rust content and for helping us spread the word about Fluvio!

Until next week!

This Week in Fluvio #5

· 2 min read

Welcome to the fifth edition of This Week in Fluvio, our weekly newsletter for development updates to Fluvio open source. Fluvio is a distributed, programmable streaming platform written in Rust.

New Release - Fluvio v0.9.5

Bugfix for SmartStream request handler

This week we discovered a bug in the SPU that was causing the request handler for consumer streams to become blocked, preventing concurrent consumer stream requests. This happened because upon receiving a stream request with a SmartStream attached, the main SPU task would start compiling the SmartStream WASM code before spawning a handler task for it. This meant that the SPU could not accept any new stream requests until the WASM was finished compiling. In 0.9.5, this has been fixed by immediately spawning a new task before processing any of the work for a stream request.

Internal improvements: Refactoring around SmartStreams

When we originally implemented SmartStreams, we were new to using embedded WASM runtimes, so we hadn't developed the cleanest patterns and had a fair amount of code duplication. We've since gained a lot of experience in best practices for how to structure common patterns for working with WebAssembly modules. This is also largely thanks to the excellent improvements to the wasmtime API since 0.28, which has made it much easier to work with WASM in a multithreaded server environment.

Conclusion

This was a relatively light week for releasing new features, which is partly due to having started some brand-new features that are still in-flight, stay tuned to hear more about those in upcoming weeks.

For the full list of changes this week, be sure to check out our CHANGELOG. If you have any questions or are interested in contributing, be sure to join our Discord channel and come say hello!

Until next week!

This Week in Fluvio #4

· 3 min read

Welcome to the fourth edition of This Week in Fluvio, our weekly newsletter for development updates to Fluvio open source. Fluvio is a distributed, programmable streaming platform written in Rust.

New Release - Fluvio v0.9.4

Compression for WASM binaries

This week we received an excellent contribution from @tomindisguise for compressing WebAssembly binaries before uploading them to Fluvio's Streaming Processing Units (SPUs). As some background, one of Fluvio's premiere features is the ability to upload user-defined code to perform inline processing on streaming data, a feature we call SmartStreams.

note

SmartStreams are great, but this is an archival newsletter entry. Today you should look at our new SDF functionality instead for a much richer range of data processing and analysis!

User code must be compiled to WebAssembly and uploaded to SPUs upon opening a stream, and this change helps by significantly reducing the size of the upload request. Thank you to @tomindisguise for the contribution!

Bugfix for applying SmartStreams while using --disable-continuous

This is a fix to a CLI bug that cropped up when we introduced SmartStreams. For some context, Fluvio used to have two types of requests for consuming data: a "fetch" request and a "stream" request. The fetch request would retrieve only data that was known in advance to be available, whereas a stream request keeps the connection open and continues to deliver new data to the consumer as it arrives to the topic.

When using the Fluvio CLI Consumer, the default behavior is to open a stream and continue listening for data, but one may optionally use the -d/--disable-continuous flag in order to make a one-time request and close the connection. These two behaviors internally mapped to the "stream request" and the "fetch request", respectively. The bug that cropped up is that we realized that SmartStream logic was only being applied to stream requests, not to fetch requests.

This led us to simply hurry up on deprecating the fetch request, which is something that we have been meaning to do for a while now. As of this fix, --disable-continuous is now implemented by using a stream request that closes upon reaching the known end of the stream. This should not cause any breakages to CLI users, but now using -d/--disable-continuous with a SmartStream will properly apply the logic to the stream as expected.

Now publishing aarch64 docker images

That's right, not only do we now offer support for aarch64 binaries themselves, we are also publishing Fluvio docker images for aarch64 targets. This means that Fluvio may now be deployed on AWS Graviton! We're very excited about this advancement, and we're looking forward to seeing it deployed on ARM in the near future!

Conclusion

For the full list of changes, be sure to check out our CHANGELOG. If you have any questions or are interested in contributing, be sure to join our Discord channel and come say hello!

Until next week!

This Week in Fluvio #3

· 4 min read

Welcome to the third edition of This Week in Fluvio, our weekly newsletter for development updates to Fluvio open source. Fluvio is a distributed, programmable streaming platform written in Rust.

New Release - Fluvio v0.9.3

SmartStreams Aggregates

note

Aggregates are great, but this is an archival newsletter entry. Today you should look at our new SDF functionality instead for a much richer range of data processing and analysis!

This week's release has one big feature and several quality-of-life user experience improvements. The big feature is the arrival of our SmartStreams Aggregate feature, which allows users to write functions that combine records into a long-running accumulator state. Good examples of accumulator use-cases are calculating sums or averages of numeric data, or combining structural key-value data.

To quickly illustrate what a SmartStream Aggregate looks like, let's take a look at the simplest example of just summing numbers together:

use fluvio_smartstream::{smartstream, Result, Record, RecordData};

#[smartstream(aggregate)]
pub fn aggregate(accumulator: RecordData, current: &Record) -> Result<RecordData> {
// Parse the accumulator and current record as strings
let accumulator_string = std::str::from_utf8(accumulator.as_ref())?;
let current_string = std::str::from_utf8(current.value.as_ref())?;

// Parse the strings into integers
let accumulator_int = accumulator_string.parse::<i32>().unwrap_or(0);
let current_int = current_string.parse::<i32>()?;

// Take the sum of the two integers and return it as a string
let sum = accumulator_int + current_int;
Ok(sum.to_string().into())
}

Every aggregate function has two inputs: an "accumulator" and the current record from the stream. The function's jobs is to combine these two values and produce a new accumulator value, which will be used when processing subsequent records in the stream. In this example, we simply parse the accumulator and input record as integers, then add them together.

Client fix: Out-of-bounds relative offsets no longer cause freezing

When consuming records in Fluvio, we generally open a stream by asking for a particular Topic and partition, and providing an Offset for the first record where we would like our stream to begin reading. Offsets may be specified in three ways: by giving an absolute index into the partition, by a relative-from-beginning offset, and by a relative-from-end offset. When you specify a relative offset, they are first resolved to absolute offsets and then used to make a stream request.

Prior to 0.9.3, there was a bug in the client where relative offsets could overflow the actual size of the stream, which would cause the consumer to simply freeze and not yield any records or errors. An example that would cause this problem would be if you had a topic with 10 records in it, and you asked for a stream starting "20 from the end" of that topic. This would incorrectly resolve to an absolute offset of 10 - 20 = -10.

This bug has been fixed in 0.9.3, with a new behavior where relative offsets that are too large simply "bottom out" at the start or end of the stream. That is, if you ask for "1000 from the end" in a stream of 100 elements, you'll just start streaming from the start, and if you ask for "1000 from the beginning" of a stream of 100 elements, you'll just start at the end, waiting for new records to arrive.

CLI usability: Added fluvio consume -T/--tail for "tailing" consumer streams

This is a nice usability improvement for Fluvio CLI users, where you may now "tail" your streams. This acts similarly to the UNIX tail command, which reads the last 10 lines of a file. fluvio consume --tail=X will open a stream that begins X elements from the end, letting you quickly see the most recent records in your stream. By default, using --tail with no argument (no X) will give you the last 10 elements for some easy context over the latest records (just like UNIX tail does).

As a quick example, this is what happens when you use --tail on a stream with 20 sequential integers.

$ fluvio consume ints --tail
Consuming records starting 10 from the end of topic 'ints'
11
12
13
14
15
16
17
18
19
20

Conclusion

For the full list of changes, be sure to check out our CHANGELOG. If you have any questions or are interested in contributing, be sure to join our Discord channel and come say hello!

Until next week!

This Week in Fluvio #2

· 3 min read

Welcome to the second edition of This Week in Fluvio, our weekly newsletter for development updates to Fluvio open source. Fluvio is a distributed, programmable streaming platform written in Rust.

New Release - Fluvio v0.9.1

In 0.9.1, we added support for running a Fluvio cluster locally on an Apple M1 machine (architecture aarch64-apple-darwin). This means that if you are running an M1-powered device, you can now install Fluvio and run a cluster directly on your machine, without needing to deploy to some Kubernetes instance somewhere.

On M1, you can now run the standard install.sh script and get both the Fluvio CLI and the fluvio-run cluster binary:

inline-embed file="embeds/download-cli/curl-bash-copy.md"

After this has run (and you've set up your PATH), you can now run a local cluster on M1 with the following command:

%copy first-line%

$ fluvio cluster start

See our complete getting-started guide for a full set of instructions for getting set up from scratch.

Release of Fluvio 0.9.2

That's right, we had two point releases this week. 0.9.2 was shipped because of a bug we discovered with a version handshake between Fluvio clients and servers that is supposed to ensure compatibility.

Basically, each build of Fluvio "knows" what version it is, e.g. 0.9.0 or 0.9.1. When a Fluvio client and server first connect, they make sure that they each have versions that are compatible with each other. Unfortunately, we had a build problem where the server (Fluvio SC) did not pick up the latest version number, and therefore the client thought it was talking to an old server! It turns out that this bug was present since the update from 0.8.5 to 0.9.0, so the 0.9.x clients thought they were talking to an 0.8.x server. This version gap spanned a major version bump, so the client rejected the connections! We suspect this bug happened because we did not have any code changes to the SC for 0.9.1 or 0.9.2, so it actually did not get recompiled with the new version number baked in. All this to say, we are now back on track with matching versions in the client/server builds.

In addition to these compatibility fixes, 0.9.2 also included some internal fixes to the Streaming Processing Units (SPUs) that should make them more reliable when being deployed in a Kubernetes cluster.

Conclusion

That's it for this week, short and sweet! If you have any questions or would like to get involved, feel free to join our Discord channel.

Until next week!

This Week in Fluvio #1

· 8 min read

Welcome to the very first edition of This Week in Fluvio, our weekly newsletter for development updates to [Fluvio open source]. Fluvio is a distributed, programmable streaming platform written in Rust.

New Release - Fluvio v0.9.0

Today we're releasing a new "major version" of Fluvio, which includes crate-level and protocol-level breaking changes from the 0.8.x line of releases. Note that although Fluvio is still pre-1.0, we do our best to stick to semantic versioning, treating the second digit as our major version. Let's dive in and see what new features and breaking changes we have to talk about.

New target support and the Tier system

We've been putting in a lot of work to support new build targets so that users can run the Fluvio CLI on more platforms and architectures. We have also introduced a Tier system that describes which targets must be able to build and pass tests. Here is a breakdown of our current Tiers, what they mean, and which executables belong to those tiers:

Tier 1

Tier 1 targets are those that must compile and pass tests. We have configured our CI to reject any changes that cause any of these targets to stop compiling successfully or to stop passing tests. The following executable/target combinations are currently considered Tier 1:

  • Fluvio CLI (fluvio)

    • x86_64-unknown-linux-musl (Linux x86 64-bit)
  • Fluvio cluster (fluvio-run)

    • x86_64-unknown-linux-musl (Linux x86 64-bit)

Tier 2

Tier 2 targets are those that must compile, but for which we don't yet run tests or block progress on those tests passing.

  • Fluvio CLI (fluvio)
    • aarch64-unknown-linux-musl (Linux ARM 64-bit)
    • x86_64-pc-windows-msvc (Windows x86 64-bit)
    • aarch64-apple-darwin (Apple M1)
    • x86_64-apple-darwin (Apple x86 64-bit)
    • arm-unknown-linux-gnueabihf (Raspberry Pi zero)
    • armv7-unknown-linux-gnueabihf (Raspberry Pi)

Better Kubernetes Support

With 0.9.0, Fluvio now supports any standards compliant Kubernetes distribution. It defaults to standard storage drivers, but can be configured to use different drivers. We have tested with the following Kubernetes distributions:

  • Minikube
  • Kind
  • K3d
  • AWS EKS

Please see Fluvio's [Kubernetes documentation] for more information.

Helm changes

Fluvio's CLI bundles helm charts for easy installation. Fluvio's charts are no longer published to the Fluvio registry.

Please use following commands to update your Fluvio installation:

fluvio cluster upgrade --sys      # upgrade CRD
fluvio cluster upgrade # upgrade rest

Error Handling for SmartStreams (#1198)

One of Fluvio's premiere features is SmartStreams, which allow users to write custom WebAssembly modules to perform server-side data processing. Until recently, there was no way for user code to return Errors to indicate that something had gone wrong while processing records.

Prior to 0.9.0, the only type of SmartStream was filters, which looked something like this:

use fluvio_smartstream::{smartstream, Record};

#[smartstream(filter)]
pub fn filter_odd(record: &Record) -> bool {
// Parse the input bytes as a UTF-8 string, or return false
let string_result = std::str::from_utf8(record.value.as_ref());
let string = match string_result {
Ok(s) => s,
_ => return false,
};

// Parse the string as an i32, or return false
let int_result = string.parse::<i32>();
let int = match int_result {
Ok(i) => i,
_ => return false,
};

int % 2 == 0
}

Note that this function is required to return a boolean, which indicates whether the Records should be kept in the stream (if true) or discarded (if false). However, there is no way to indicate whether a logic error has occurred during processing.

For example, what happens if the Record data we are given is not valid UTF-8 data? We have no way to report this situation to the consumer, and therefore the best course of action we have is to just return false and discard any records that are invalid. This means that we risk mixing up the logically distinct cases of:

  • "we have valid Records, and successfully discarded some of them", and
  • "we have an invalid Record, so ignore it"

and since we had no way to report this to the consumer, it is very difficult to debug this situation.

Enter SmartStream Error handling

With the 0.9.0 update, SmartStreams are now written like this!

use fluvio_smartstream::{smartstream, Record, Result};

#[smartstream(filter)]
pub fn filter_odd(record: &Record) -> Result<bool> {
let string = std::str::from_utf8(record.value.as_ref())?;
let int = string.parse::<i32>()?;
Ok(int % 2 == 0)
}

Notice that now, the filter function returns a Result<bool>, meaning that SmartStream authors now have tha ability to return an Err describing any problems that happened when running their code.

The fluvio_smartstream::Result type allows you to return any error type that implements std::error::Error*, which means that you can simply propagate most errors up and out using ?, like we do in the example above.

*and Send + Sync + 'static. We use the [eyre] crate to capture returned errors.

This SmartStream parses incoming records as integers, then filters out odd numbers. When we run a consumer with this SmartStream, we can see the filtered data, and we can see the consumer deliver our error to us if we give an input that can't be parsed as an integer.

SmartStream Maps

Another big feature that we've had in the works for a while is a new type of SmartStream, #[smartstream(map)], used to transform the data in each record in a stream. This feature has been available in "preview" since 0.8.5, but we did not want to release it until we had error-handling ready for SmartStreams, which happened this release! Let's take a look at what a SmartStream Map function looks like.

use fluvio_smartstream::{smartstream, Record, RecordData, Result};

#[smartstream(map)]
pub fn map(record: &Record) -> Result<(Option<RecordData>, RecordData)> {
let key = record.key.clone();

let string = std::str::from_utf8(record.value.as_ref())?;
let int = string.parse::<i32>()?;
let value = (int * 2).to_string();

Ok((key, value.into()))
}

See the [full source code for this example on GitHub]!

In this example, we are reading in Records and first parsing them as UTF-8 strings, then parsing those strings as integers. If either of those steps fails, the error is returned with ? and the consumer receives an error item in the stream.

Notice that the return type for Map is different from we have seen before with Filters. In order to edit the Records in our stream, we manipulate them in our function and then return the transformed output. The successful return value is a tuple of the new Key (optional) and Value for the output record. The RecordData type may be constructed from any type that has impl Into<Vec<u8>>, so you can just use .into() for a lot of types such as String when you want to return them.

When we give valid integers, our output comes back transformed as expected - the integers have been doubled. However, if we give invalid input, the Consumer CLI prints the error that was returned by the user code, along with diagnostic information such as the record's offset.

Internal improvements

For the full list of updates in this release, [check out our CHANGELOG]. Here are some highlights:

Improved install.sh to work for more targets (#1269)

Prior to 0.9.0, the one-line install script only worked for MacOS and Linux on x86 architectures. Now, it also works for non-x86 targets such as arm and armv7, which, notably, allows it to work directly on Raspberry Pi.

Updated ConsumerConfig builder API to match standard builder patterns (#1271)

Our ConsumerConfig type is used when constructing consumers programmatically. Prior to 0.9.0, we used an [owned-builder] pattern (chained by passing mut self), which is less flexible than the [mutable-builder] pattern (chained by passing &mut self) that we have now adopted.

Improved #[derive(fluvio_protocol::{Encoder, Decoder})] for enums (#1232)

We use the procedural macros fluvio_protocol::{Encoder, Decoder} to derive traits by the same name. These macros used to have a limitation where they did not work on enums that carry data, but now this works as expected.

Conclusion

That's it for this week, we'll be publishing a newsletter once a week from now on, so stay tuned for more Fluvio updates! If you have any questions or would like to get involved, feel free to join our Discord channel!

Until next week!