Step 3: Convert JSON to Protobuf

Protocol Buffers (Protobuf) is another common binary format, widely used in high-performance microservice communication, especially with gRPC. Like Avro, it is schema-based, compact, and efficient.

This step teaches you how to convert a JSON message to the Protobuf format.

The Goal

You will create a pipeline that takes a JSON message, validates it against a Protobuf schema, and converts it to the binary Protobuf format.

The "Prepare -> Convert" Pattern

  1. Define Schema: Protobuf requires a schema in a .proto file that describes the message structure.
  2. Prepare: The incoming JSON must be structured to match the .proto definition.
  3. Convert: Use the to_protobuf processor to perform the conversion.
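The "Prepare" step is ordinary data reshaping, so it can be illustrated outside the pipeline. The following is a minimal Python sketch, not part of the pipeline itself. It assumes the incoming record has `sensor_id`, `temperature`, and an ISO 8601 `timestamp` field, and that the target schema expects `timestamp_unix_ms` as in this step's sensor.proto. The `prepare` helper and the `EPOCH` constant are hypothetical names used only for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical constant: the Unix epoch as a timezone-aware datetime.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def prepare(record: dict) -> dict:
    """Reshape a raw reading so its keys and types match SensorReading."""
    ts = datetime.fromisoformat(record["timestamp"])
    if ts.tzinfo is None:
        # Assumption: timestamps without an offset are treated as UTC.
        ts = ts.replace(tzinfo=timezone.utc)
    return {
        "sensor_id": str(record["sensor_id"]),
        "temperature": float(record["temperature"]),
        # Integer division avoids floating-point rounding of the millisecond value.
        "timestamp_unix_ms": (ts - EPOCH) // timedelta(milliseconds=1),
    }


raw = {
    "sensor_id": "sensor-1",
    "temperature": 25.5,
    "timestamp": "2024-01-01T00:00:00+00:00",
}
print(prepare(raw))
# {'sensor_id': 'sensor-1', 'temperature': 25.5, 'timestamp_unix_ms': 1704067200000}
```

The key point is that the output keys match the schema's field names exactly and the timestamp becomes a 64-bit-safe integer. The pipeline mapping shown in the Implementation section does the same job.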

Implementation

  1. Define the Protobuf Schema: Create a file named sensor.proto. This schema defines the structure for our sensor data.

    sensor.proto
    syntax = "proto3";

    message SensorReading {
      string sensor_id = 1;
      double temperature = 2;
      int64 timestamp_unix_ms = 3;
    }
  2. Create the Conversion Pipeline: Copy the following configuration into a file named json-to-protobuf.yaml.

    json-to-protobuf.yaml
    name: json-to-protobuf-converter
    description: A pipeline that converts JSON to Protobuf.

    config:
      input:
        generate:
          interval: 1s
          mapping: |
            root = {
              "sensor_id": "sensor-1",
              "temperature": 25.5,
              "timestamp": now().ts_format_iso8601()
            }

      pipeline:
        processors:
          # 1. PREPARE: Ensure the data matches the Protobuf schema.
          # The field names must match exactly, and we need to convert the
          # timestamp to the format expected by the schema (unix milliseconds).
          - mapping: |
              root = {
                "sensor_id": this.sensor_id,
                "temperature": this.temperature,
                "timestamp_unix_ms": this.timestamp.parse_timestamp().unix_milli()
              }

          # 2. CONVERT: Convert the prepared JSON object to Protobuf binary format.
          - to_protobuf:
              proto_path: "file://./sensor.proto"
              message: "SensorReading"

      output:
        stdout:
          codec: lines
  3. Deploy and Observe: Watch the logs. The output will be a stream of binary Protobuf data, just like with Avro.

Verification

Unreadable binary output in your console indicates that the conversion ran: the Protobuf wire format is not meant to be human-readable, although string values such as sensor-1 usually remain visible among the bytes. A gRPC service or any other microservice with access to the sensor.proto file can deserialize this binary data into a type-safe object in its native programming language.
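To see what those bytes contain, the Protobuf wire format for this particular message can be reproduced by hand. The sketch below uses only the Python standard library and is specific to the three fields of SensorReading. The function names are hypothetical, and a real consumer would normally use classes generated by protoc from sensor.proto rather than hand-written code like this.

```python
import struct
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Encode an integer as a Protobuf base-128 varint (int64 two's complement)."""
    value &= (1 << 64) - 1  # negative int64 values become 10-byte varints
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a varint starting at pos; return (value, next position)."""
    result, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_sensor_reading(sensor_id: str, temperature: float, timestamp_unix_ms: int) -> bytes:
    """Serialize the three SensorReading fields in field-number order.

    Simplification: every field is always written. Real proto3 serializers
    usually omit fields that hold their default value.
    """
    sid = sensor_id.encode("utf-8")
    return (
        b"\x0a" + encode_varint(len(sid)) + sid        # field 1, length-delimited
        + b"\x11" + struct.pack("<d", temperature)     # field 2, 64-bit little-endian
        + b"\x18" + encode_varint(timestamp_unix_ms)   # field 3, varint
    )


def decode_sensor_reading(data: bytes) -> dict:
    """Parse bytes produced for SensorReading back into a dict."""
    msg = {"sensor_id": "", "temperature": 0.0, "timestamp_unix_ms": 0}
    pos = 0
    while pos < len(data):
        tag, pos = decode_varint(data, pos)
        field, wire_type = tag >> 3, tag & 0x07
        if field == 1 and wire_type == 2:
            length, pos = decode_varint(data, pos)
            msg["sensor_id"] = data[pos:pos + length].decode("utf-8")
            pos += length
        elif field == 2 and wire_type == 1:
            (msg["temperature"],) = struct.unpack_from("<d", data, pos)
            pos += 8
        elif field == 3 and wire_type == 0:
            value, pos = decode_varint(data, pos)
            # Reinterpret the unsigned varint as a signed int64.
            msg["timestamp_unix_ms"] = value - (1 << 64) if value >= 1 << 63 else value
        else:
            raise ValueError(f"unexpected field {field} with wire type {wire_type}")
    return msg


encoded = encode_sensor_reading("sensor-1", 25.5, 1704067200000)
print(encoded)
print(decode_sensor_reading(encoded))
```

The first printed line shows why the console output looks garbled: only the sensor ID is stored as readable UTF-8 text, while the temperature and timestamp are packed binary values. The second line shows that a reader who knows the schema recovers the original fields.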

You have now learned how to convert JSON to another popular and powerful binary format.