Skip to main content

Define OpenSearch® timestamp data in SQL pipeline

Frequently results in Apache Flink® data pipelines include one or more timestamps, either contained in the source events or generated by window aggregations.

When the output of the Apache Flink® data pipeline is an Aiven for OpenSearch® index, you need to convert the Flink timestamps to a format recognizable by OpenSearch®, otherwise they will be interpreted as strings, losing the benefits of time filtering.

OpenSearch® recognises the following as correct date/time formats:

  • yyyy/MM/dd for a date field
  • HH:mm:ss for a time field
  • yyyy/MM/dd HH:mm:ss for a timestamp field

Therefore you need to structure the data pipeline output to follow one of the acceptable formats

When the result of the data pipeline contains a timestamp column like the below:

EVENT_TIME TIMESTAMP(3),
HOSTNAME STRING,
CPU DOUBLE

to push the data correctly to an OpenSearch® index, you'll need to set the target column format as STRING in the Flink table definition, like:

EVENT_TIME STRING,
HOSTNAME STRING,
CPU DOUBLE

and, assuming the EVENT_TIME is a timestamp, you'll need to specify it in the format understood by OpenSearch® using the DATE_FORMAT function, like:

DATE_FORMAT(EVENT_TIME, 'yyyy/MM/dd HH:mm:ss')

Once the pipeline is running, you can check that the EVENT_TIME field in OpenSearch® is recognized as a timestamp.