
Apache Kafka® concepts

A comprehensive glossary of essential Apache Kafka® terms and their meanings.

Broker

A server that runs Apache Kafka and is responsible for storing messages, handling client requests, and delivering messages to consumers. Brokers typically run as part of a cluster for scalability and reliability. Each broker is an independent server process that is integral to Kafka's overall operation, and it is distinct from companion tools such as Apache Kafka Connect.

Consumer

An application that reads data from Apache Kafka, often processing or acting upon it. Various tools used with Apache Kafka ultimately function as either a producer or a consumer when communicating with Apache Kafka.

Consumer groups

Consumer groups let an application scale beyond a single instance. Multiple instances of the application join the same group and coordinate to share the work: each consumer in the group is assigned a subset of the topic's partitions, so the workload is spread across instances and each partition is read by only one consumer in the group at a time.
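
The sharing of partitions inside one consumer group can be illustrated with a minimal sketch. It is a plain-Python simulation, not Kafka client code: the function name `assign_round_robin` and the consumer names are hypothetical, and real clients choose among several assignment strategies, of which round-robin is only one.

```python
def assign_round_robin(partitions, consumers):
    """Give every partition exactly one owner within a single consumer group.

    Illustrative only: a simplified round-robin, not Kafka's actual assignor code.
    """
    members = sorted(consumers)
    if not members:
        raise ValueError("a consumer group needs at least one member")
    # Start every member with an empty list so idle consumers stay visible.
    assignment = {member: [] for member in members}
    for index, partition in enumerate(sorted(partitions)):
        owner = members[index % len(members)]
        assignment[owner].append(partition)
    return assignment


# Six partitions shared by three instances of the same application.
print(assign_round_robin(range(6), ["app-1", "app-2", "app-3"]))
```

Note that when a group has more consumers than the topic has partitions, the extra consumers receive no partitions and sit idle, which is why the partition count limits how far a group can scale.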

Event-driven architecture

Application architecture centered around responding to and processing events.

Event

A single discrete unit of data in Apache Kafka, consisting of a value (the message body) and, optionally, a key (which identifies the event and is commonly used to choose its partition) and headers (metadata about the message).
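
As a rough illustration of that shape, the sketch below models an event as a small Python data class. The class name `Event` and its field layout are assumptions made for this example and do not correspond to any particular client library's record type.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Event:
    """Hypothetical model of an event: a value plus an optional key and headers."""
    value: bytes                                  # the message body
    key: Optional[bytes] = None                   # optional identifier
    headers: Tuple[Tuple[str, bytes], ...] = ()   # optional (name, value) metadata pairs


reading = Event(
    value=b'{"temperature": 21.5}',
    key=b"sensor-42",
    headers=(("content-type", b"application/json"),),
)
print(reading.key, dict(reading.headers))
```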

Kafka node

See Broker

Kafka server

See Broker

Message

See Event

Partitioning

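A method used by Apache Kafka to split a topic into partitions that are distributed across multiple servers. Each partition has one server acting as its leader. Partitioning shards the topic's data, and message order is guaranteed only within each partition, not across the topic as a whole.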
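
One common way a message ends up in a particular partition is by hashing its key, which can be sketched as follows. This is an illustration under stated assumptions: the function name `partition_for` is hypothetical, and CRC32 is used only as a convenient stand-in (the default partitioner in Kafka's Java client uses a murmur2 hash), so the partition numbers produced here will not match a real cluster.

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition number in the range [0, num_partitions).

    CRC32 is a stand-in hash for illustration, not the hash Kafka clients use.
    """
    if num_partitions < 1:
        raise ValueError("a topic has at least one partition")
    return zlib.crc32(key) % num_partitions


# The same key always maps to the same partition, which keeps that key's
# messages in order relative to each other.
for key in (b"sensor-1", b"sensor-2", b"sensor-1"):
    print(key, "->", partition_for(key, num_partitions=3))
```

Because the mapping depends on the number of partitions, adding partitions to a topic later can send an existing key to a different partition.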

Producer

An application that writes data into Apache Kafka without concern for the data's consumers. The data can range from well-structured to simple text, often accompanied by metadata.

Pub/sub

A publish-subscribe messaging architecture in which publishers broadcast messages and every listening subscriber receives them, unlike point-to-point systems, where each message is delivered to a single receiver.

Queueing

A messaging model in which messages are received in the order they were produced. Apache Kafka tracks a position, called an offset, for each consumer group in each partition, marking how far that group has read.
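
The idea of tracking a read position can be sketched with a small in-memory model of a single partition. Everything here is hypothetical (the `PartitionLog` class, its method names, and the group names) and it is a simulation rather than Kafka client code; it assumes the stored position is the offset of the next message to read.

```python
class PartitionLog:
    """Toy model of one partition: an append-only list plus per-group read positions."""

    def __init__(self):
        self.records = []
        self.committed = {}  # group id -> offset of the next message to read

    def append(self, value):
        """Add a message to the end of the log and return its offset."""
        self.records.append(value)
        return len(self.records) - 1

    def poll(self, group, max_records=10):
        """Return (offset, value) pairs starting at the group's committed position."""
        start = self.committed.get(group, 0)
        batch = self.records[start:start + max_records]
        return list(enumerate(batch, start=start))

    def commit(self, group, next_offset):
        """Record how far the group has read."""
        self.committed[group] = next_offset


log = PartitionLog()
for value in ("a", "b", "c"):
    log.append(value)

first = log.poll("billing", max_records=2)  # reads offsets 0 and 1
log.commit("billing", first[-1][0] + 1)     # next read starts at offset 2
print(log.poll("billing"))                  # only the unread message remains
print(log.poll("audit"))                    # a different group starts from offset 0
```

Messages are not removed when they are read, so each group keeps its own position and can progress through the same log independently.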

Record

See Event

Replication

Apache Kafka's feature for copying data across multiple servers so that the data is preserved even if a server fails. The number of copies, known as the replication factor, is configurable per topic.

Topic

Logical channels in Apache Kafka through which messages are organized. Topics are named in a human-readable manner, like sensor-readings or kubernetes-logs.