Sample dataset generator for Aiven for Apache Kafka®#
Learning to work with streaming data is much more fun with actual data, so to get you started on your Apache Kafka® journey this guide helps you stream fake data to a topic.
The following example assumes you have an Aiven for Apache Kafka® service running. You can create one following the dedicated instructions.
Fake data generator on Docker#
To learn data streaming, you need a continuous flow of data, and for that you can use the Dockerized fake data producer for Aiven for Apache Kafka®. To start using the generator:
Clone the repository:
git clone https://github.com/aiven/fake-data-producer-for-apache-kafka-docker
Create an access token with the Aiven CLI:

avn user access-token create \
  --description "Token used by Fake data generator" \
  --max-age-seconds 3600 \
  --json | jq -r '.[].full_token'

The above command uses jq (https://stedolan.github.io/jq/) to parse the result of the Aiven CLI command. If you don't have jq installed, you can remove the | jq -r '.[].full_token' section from the above command and parse the JSON result manually to extract the access token.
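If jq is not available, the token can also be extracted with Python's standard json module. A minimal sketch, assuming the CLI returns a JSON array of token objects; the echo below is a stand-in payload for the real command output:

```shell
# Stand-in for the real `avn user access-token create ... --json` output,
# assumed here to be a JSON array containing one token object:
echo '[{"full_token": "EXAMPLE_TOKEN_VALUE"}]' \
  | python3 -c 'import json, sys; print(json.load(sys.stdin)[0]["full_token"])'
```

In practice you would pipe the avn command itself into the python3 one-liner instead of the echo.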
Copy the file conf/env.conf filling the following placeholders:
my_project_name: the name of your Aiven project
my_kafka_service_name: the name of your Aiven for Apache Kafka® service
my_topic_name: the name of the target topic; this can be any name
my_aiven_email: the email address used as username to log in to Aiven services
my_aiven_token: the access token generated during the previous step
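As an illustration, a filled-in configuration could look like the sketch below. The variable names here are assumptions; the authoritative names, and any additional parameters, are in the conf/env.conf file shipped with the repository:

```shell
# Illustrative conf/env.conf values; the variable names are assumed,
# check the repository's own conf/env.conf for the authoritative ones.
PROJECT_NAME="my-project"        # my_project_name
SERVICE_NAME="my-kafka-demo"     # my_kafka_service_name
TOPIC="pizza-orders"             # my_topic_name
USERNAME="user@example.com"      # my_aiven_email
PASSWORD="<my_aiven_token>"      # my_aiven_token, the access token created above
```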
Build the Docker image with:
docker build -t fake-data-producer-for-apache-kafka-docker .
Every time you change any parameter in the conf/env.conf file, you need to rebuild the Docker image for the changes to take effect.
Start the streaming data flow with:
docker run fake-data-producer-for-apache-kafka-docker
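The command above runs the producer in the foreground. As a usage sketch with standard Docker commands (the container name fake-producer is an arbitrary choice), you can also run it detached, follow its output, and stop it when done:

```shell
# Run the producer in the background under a chosen name
docker run --detach --name fake-producer fake-data-producer-for-apache-kafka-docker

# Follow the generated messages as they are produced
docker logs --follow fake-producer

# Stop and remove the container when done
docker stop fake-producer
docker rm fake-producer
```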