
Building Apache Kafka Event Streaming Apps with Spring Boot and Schema Registry: A Performance Guide

Learn to build high-performance event streaming apps with Apache Kafka, Spring Boot & Schema Registry. Complete guide with producers, consumers, error handling & testing.

I’ve been thinking a lot lately about how modern applications handle massive streams of data in real-time. The challenge isn’t just processing events—it’s doing so reliably, at scale, and with the flexibility to evolve over time. That’s why I want to share my approach to building high-performance event streaming systems using Apache Kafka, Spring Boot, and Schema Registry.

What if you could design systems that handle millions of events per second while maintaining data consistency across services? Let me show you how.

Starting with infrastructure, I use Docker Compose to set up a local development environment. Here’s a basic configuration:

version: '3.8'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.5.0
    ports: ["2181:2181"]
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181

  kafka:
    image: confluentinc/cp-kafka:7.5.0
    depends_on: [zookeeper]
    ports: ["9092:9092", "9101:9101"]
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1

  schema-registry:
    image: confluentinc/cp-schema-registry:7.5.0
    depends_on: [kafka]
    ports: ["8081:8081"]
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: kafka:29092

With the infrastructure running, I focus on schema design. Using Avro with Schema Registry ensures data consistency and enables evolution. Have you considered how your data contracts might change in six months?

Here’s a sample Avro schema for order events:

{
  "type": "record",
  "name": "OrderEvent",
  "namespace": "com.example.events",
  "fields": [
    {"name": "orderId", "type": "string"},
    {"name": "customerId", "type": "string"},
    {"name": "amount", "type": "double"},
    {"name": "timestamp", "type": "long"}
  ]
}
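To make the evolution question concrete: with Schema Registry's default BACKWARD compatibility mode, you can add fields only if they carry a default value. A hypothetical evolved version of this schema (the currency field is my own example, not part of the original) might look like this:

```json
{
  "type": "record",
  "name": "OrderEvent",
  "namespace": "com.example.events",
  "fields": [
    {"name": "orderId", "type": "string"},
    {"name": "customerId", "type": "string"},
    {"name": "amount", "type": "double"},
    {"name": "timestamp", "type": "long"},
    {"name": "currency", "type": "string", "default": "USD"}
  ]
}
```

The default lets consumers on the new schema read records written before the field existed; Schema Registry rejects the registration if the change would break existing readers.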

For Spring Boot producers, I configure efficient serialization and error handling:

@Configuration
public class KafkaProducerConfig {
    
    @Value("${spring.kafka.bootstrap-servers}")
    private String bootstrapServers;

    @Value("${spring.kafka.properties.schema.registry.url:http://localhost:8081}")
    private String schemaRegistryUrl;

    @Bean
    public ProducerFactory<String, OrderEvent> producerFactory() {
        Map<String, Object> config = new HashMap<>();
        config.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        config.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        config.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, KafkaAvroSerializer.class);
        config.put("schema.registry.url", schemaRegistryUrl);
        // Idempotent, fully-acknowledged sends prevent duplicates on broker retries
        config.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
        config.put(ProducerConfig.ACKS_CONFIG, "all");
        return new DefaultKafkaProducerFactory<>(config);
    }

    @Bean
    public KafkaTemplate<String, OrderEvent> kafkaTemplate() {
        return new KafkaTemplate<>(producerFactory());
    }
}
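With the factory in place, publishing is a thin service layer. Here's a sketch (the OrderEventPublisher class and its topic name are my own illustration, and it assumes Spring Kafka 3.x, where send() returns a CompletableFuture, plus a log field from Lombok's @Slf4j or similar):

```java
@Service
public class OrderEventPublisher {

    private final KafkaTemplate<String, OrderEvent> kafkaTemplate;

    public OrderEventPublisher(KafkaTemplate<String, OrderEvent> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    public void publish(OrderEvent event) {
        // Keying by orderId keeps all events for one order on the same
        // partition, preserving per-order ordering
        kafkaTemplate.send("orders", event.getOrderId().toString(), event)
            .whenComplete((result, ex) -> {
                if (ex != null) {
                    // Surface async send failures instead of losing them silently
                    log.error("Failed to publish order {}", event.getOrderId(), ex);
                }
            });
    }
}
```

Choosing the key deliberately matters more than it looks: a random or null key spreads an order's events across partitions, and any per-order ordering guarantee is gone.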

Consumers need to handle various failure scenarios. How would you manage duplicate messages or processing failures?

@KafkaListener(topics = "orders", groupId = "order-service")
public void consume(OrderEvent event, Acknowledgment ack) {
    // Injecting Acknowledgment requires ack-mode: manual on the container
    try {
        processOrder(event);
        ack.acknowledge();
    } catch (Exception e) {
        log.error("Failed to process order: {}", event.getOrderId(), e);
        // Offset is not committed here; route the record to a retry topic
        // or dead letter queue rather than silently dropping it
    }
}
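Rather than hand-rolling that retry logic, one answer is Spring Kafka's built-in error handling. A sketch using DefaultErrorHandler with a DeadLetterPublishingRecoverer (available since Spring Kafka 2.8; the retry count and back-off here are placeholder values, not recommendations):

```java
@Bean
public DefaultErrorHandler errorHandler(KafkaTemplate<Object, Object> template) {
    // After retries are exhausted, publish the failed record to "<topic>.DLT"
    DeadLetterPublishingRecoverer recoverer = new DeadLetterPublishingRecoverer(template);
    // Retry 3 times with a fixed 1-second back-off before dead-lettering
    return new DefaultErrorHandler(recoverer, new FixedBackOff(1000L, 3L));
}
```

Register it on your listener container factory with setCommonErrorHandler, and a thrown exception in the listener drives the retry-then-dead-letter flow. Because the broker-side offset only advances on successful processing, duplicates are still possible after a crash, so processOrder should be idempotent.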

Performance optimization involves careful configuration. I tune these parameters based on throughput requirements:

spring:
  kafka:
    producer:
      batch-size: 16384
      buffer-memory: 33554432
      compression-type: snappy
    consumer:
      max-poll-records: 500
      fetch-max-wait: 500

Monitoring is crucial. I integrate Spring Boot Actuator with Micrometer to track consumer lag and processing metrics:

@Bean
public MeterRegistryCustomizer<MeterRegistry> metricsCommonTags() {
    return registry -> registry.config().commonTags(
        "application", "order-service",
        "kafka-cluster", "production"
    );
}
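To get consumer lag into that registry, one option is Spring Kafka's MicrometerConsumerListener, which exports the Kafka client's own metrics (including records-lag-max). A sketch, where consumerConfig() stands in for your existing consumer property map:

```java
@Bean
public ConsumerFactory<String, OrderEvent> consumerFactory(MeterRegistry registry) {
    DefaultKafkaConsumerFactory<String, OrderEvent> factory =
        new DefaultKafkaConsumerFactory<>(consumerConfig());
    // Bind each created consumer's client metrics to Micrometer
    factory.addListener(new MicrometerConsumerListener<>(registry));
    return factory;
}
```

The client-side lag metric is a good first signal, though for alerting many teams also scrape broker-side lag with an external exporter, since a wedged consumer stops reporting its own metrics.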

Testing with TestContainers ensures reliability:

@Testcontainers
@SpringBootTest
class OrderServiceTest {
    
    @Container
    static KafkaContainer kafka = new KafkaContainer(
        DockerImageName.parse("confluentinc/cp-kafka:7.5.0")
    );

    // Point Spring Kafka at the container's dynamically mapped broker address
    @DynamicPropertySource
    static void kafkaProperties(DynamicPropertyRegistry registry) {
        registry.add("spring.kafka.bootstrap-servers", kafka::getBootstrapServers);
    }

    @Test
    void shouldPublishOrderEvent() {
        // Test implementation
    }
}

Building event streaming applications requires balancing performance, reliability, and maintainability. The patterns I’ve shared come from solving real-world problems at scale—they’re not just theoretical concepts.

What challenges have you faced with event-driven architectures? I’d love to hear your experiences and solutions. If this approach resonates with you, please share your thoughts in the comments and pass this along to others who might benefit from these patterns.
