
Apache Kafka Producer and Consumer

Author: Venkata Sudhakar

Apache Kafka is a distributed event streaming platform designed for high-throughput, fault-tolerant, and real-time data pipelines. At its core, Kafka is a distributed commit log where producers write messages to named topics, and consumers read messages from those topics. Unlike traditional message queues such as RabbitMQ, Kafka retains messages on disk for a configurable period (default 7 days), allowing multiple independent consumers to read the same message at different times and at different rates.

A Kafka topic is divided into one or more partitions, which are the unit of parallelism and scalability. Each partition is an ordered, immutable sequence of messages. Within a partition, every message has a unique sequential offset. When you have multiple partitions, producers can write to different partitions in parallel and consumers in a consumer group are each assigned one or more partitions, enabling horizontal scaling of both producers and consumers. Messages with the same key are always written to the same partition, ensuring ordering for that key.

The example below shows how to write a Kafka producer in Java using the KafkaProducer API to publish order events to a topic, demonstrating both the synchronous and asynchronous send patterns.
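The original listing did not survive, so the following is a reconstruction sketch consistent with the output shown. It assumes a broker at localhost:9092, a topic named "orders", and the kafka-clients library on the classpath; the order IDs and JSON payloads mirror the sample output.

```java
import java.util.Properties;
import java.util.concurrent.CountDownLatch;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.apache.kafka.common.serialization.StringSerializer;

public class OrderEventProducer {

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.ACKS_CONFIG, "all"); // wait for all in-sync replicas

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Synchronous send: get() blocks until the broker acknowledges,
            // so failures surface immediately as exceptions.
            ProducerRecord<String, String> record = new ProducerRecord<>(
                    "orders", "ORD-1001", "{\"orderId\":\"ORD-1001\",\"amount\":149.99}");
            RecordMetadata meta = producer.send(record).get();
            System.out.println("Sent to partition: " + meta.partition()
                    + " offset: " + meta.offset());

            // Asynchronous send: the callback fires when the broker acknowledges
            // (or an error occurs), without blocking the sending thread.
            CountDownLatch latch = new CountDownLatch(4);
            for (int i = 2; i <= 5; i++) {
                String orderId = "ORD-100" + i;
                String json = "{\"orderId\":\"" + orderId
                        + "\",\"amount\":" + ((1000 + i) * 10.0) + "}";
                producer.send(new ProducerRecord<>("orders", orderId, json),
                        (metadata, exception) -> {
                            if (exception != null) {
                                exception.printStackTrace();
                            } else {
                                System.out.println("Async sent: partition=" + metadata.partition()
                                        + " offset: " + metadata.offset());
                            }
                            latch.countDown();
                        });
            }
            producer.flush();       // push any buffered records to the broker
            latch.await();          // wait for all callbacks before exiting
            System.out.println("All messages sent.");
        }
    }
}
```

The synchronous pattern trades throughput for immediate error feedback; the asynchronous pattern batches sends and reports results through callbacks, which is the usual choice for high-volume producers.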


Running the producer gives the following output:

Sent to partition: 2 offset: 0
Async sent: partition=0 offset: 0
Async sent: partition=1 offset: 0
Async sent: partition=2 offset: 1
Async sent: partition=0 offset: 1
All messages sent.

The example below shows how to write a Kafka consumer in Java that reads order events from the topic, processes them, and commits offsets manually for reliable at-least-once processing.
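As with the producer, the original listing is missing; this is a reconstruction sketch consistent with the output shown, assuming a broker at localhost:9092, the "orders" topic, and a hypothetical group id "order-processors".

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class OrderEventConsumer {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "order-processors");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        // Disable auto-commit so offsets are committed only after processing.
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("orders"));
            System.out.println("Consumer started. Waiting for messages...");

            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(1000));
                if (records.isEmpty()) {
                    continue;
                }
                for (ConsumerRecord<String, String> record : records) {
                    System.out.println("Received: partition=" + record.partition()
                            + " offset=" + record.offset()
                            + " key=" + record.key()
                            + " value=" + record.value());
                    processOrder(record.key());
                }
                // Commit only after the whole batch is processed. If the consumer
                // crashes before this line, the batch is re-read on restart,
                // giving at-least-once semantics.
                consumer.commitSync();
                System.out.println("Committed offsets for " + records.count() + " records.");
            }
        }
    }

    private static void processOrder(String orderId) {
        System.out.println("Processing order: " + orderId);
    }
}
```

Committing after the batch rather than per record reduces commit overhead at the cost of re-processing a slightly larger window after a crash, so processing should be idempotent.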


Running the consumer gives the following output:

Consumer started. Waiting for messages...
Received: partition=2 offset=0 key=ORD-1001 value={"orderId":"ORD-1001","amount":149.99}
Processing order: ORD-1001
Received: partition=0 offset=0 key=ORD-1002 value={"orderId":"ORD-1002","amount":10020.0}
Processing order: ORD-1002
Received: partition=1 offset=0 key=ORD-1003 value={"orderId":"ORD-1003","amount":10030.0}
Processing order: ORD-1003
Committed offsets for 3 records.

Key Kafka concepts to understand:

Consumer Group - Multiple consumer instances sharing the same group.id form a consumer group. Kafka distributes topic partitions evenly across them, so each partition is consumed by exactly one instance in the group at a time. This enables parallel processing while guaranteeing order within each partition.

Offset Management - The offset is the consumer's position in a partition. With enable.auto.commit=false and manual commitSync(), offsets are committed only after successful processing, giving you at-least-once delivery semantics. If a consumer crashes mid-batch, on restart it re-reads from the last committed offset, possibly reprocessing some records; that is why at-least-once processing should be idempotent.

Partitioning - Messages with the same key always go to the same partition, preserving order for that key. With a null key, Kafka distributes messages round-robin across partitions.
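The determinism of key-based partitioning can be illustrated with plain Java. Note this demonstrates only the principle (a deterministic hash modulo the partition count); Kafka's default partitioner actually applies a murmur2 hash to the serialized key bytes, not String.hashCode().

```java
public class KeyPartitioningDemo {

    // Illustration only: Kafka's default partitioner uses murmur2 on the
    // serialized key, but the principle is the same - a deterministic
    // hash of the key modulo the number of partitions.
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        int numPartitions = 3;
        String[] keys = {"ORD-1001", "ORD-1002", "ORD-1003", "ORD-1001", "ORD-1002"};
        for (String key : keys) {
            // Repeated keys always land on the same partition, which is what
            // preserves per-key ordering.
            System.out.println(key + " -> partition " + partitionFor(key, numPartitions));
        }
    }
}
```

Because the mapping depends on the partition count, adding partitions to an existing topic changes where new messages for a key land, so per-key ordering only holds while the partition count is stable.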
