filter_lua: add support to access groups and metadata #10457

Open: 2 commits to be merged into master


edsiper (Member) commented Jun 9, 2025

Prior to this change, the Lua filter in Fluent Bit supported processing of individual log records with a function signature that allowed modifying the timestamp and the record only:

function append_tag(tag, timestamp, record)
    ...
    return 1, timestamp, record
end

This PR introduces an optional extended prototype that allows access to group metadata and record metadata, making the function more powerful and better suited for structured formats like OpenTelemetry Logs.

New function signature

The new supported Lua function prototype is:

function cb_metadata(tag, timestamp, group, metadata, record)
    ...
    return 1, timestamp, metadata, record
end

Arguments:

  • tag: the input tag of the log record.
  • timestamp: the timestamp of the log record.
  • group: a read-only table that contains group-level metadata (e.g., OpenTelemetry resource or scope info). This will be an empty table if the log is not part of a group.
  • metadata: a table representing the record-specific metadata. You may modify this if needed.
  • record: the actual log record table, same as in the original signature.

Return Values:

The function must return exactly 4 values, in the following order:

  • Return Code:
    • 1: Record was modified.
    • 0: Record was not modified.
    • -1: Record should be dropped.
  • Timestamp: The updated timestamp.
  • Metadata Table: A new or modified metadata table.
  • Record Table: A new or modified log record.
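
As a minimal sketch of the three return codes, the hypothetical callback below drops records whose `level` field is `debug`, leaves records without a `msg` field untouched, and marks everything else as modified. The callback name and the field names (`level`, `msg`, `processed`) are made up for illustration and are not part of this PR.

```lua
-- Hypothetical callback: the field names used here are illustrative only.
function cb_return_codes(tag, timestamp, group, metadata, record)
    if record["level"] == "debug" then
        -- -1: drop the record (all four values are still returned)
        return -1, timestamp, metadata, record
    end

    if record["msg"] == nil then
        -- 0: the record was not modified
        return 0, timestamp, metadata, record
    end

    -- 1: the record was modified
    record["processed"] = true
    return 1, timestamp, metadata, record
end
```

Whatever the return code, the callback keeps the same four-value shape, which makes each branch easy to check in isolation.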

How Fluent Bit Chooses the Function Signature

At load time, the Lua filter analyzes the function signature by checking the number of parameters. If the Lua function accepts:

  • 3 arguments: it assumes the classic mode (tag, timestamp, record)
  • 5 arguments: it uses the metadata-aware mode (tag, timestamp, group, metadata, record)

This ensures backward compatibility with existing Lua scripts.
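
The idea of dispatching on parameter count can be illustrated from the Lua side. This is only a sketch and not the filter's implementation: `detect_mode` is a hypothetical helper, and it assumes a runtime where `debug.getinfo` reports `nparams` (Lua 5.2 or later, or LuaJIT).

```lua
-- Illustrative only: classify a callback by its declared parameter count.
-- Assumes debug.getinfo(fn, "u").nparams is available (Lua 5.2+ or LuaJIT).
function detect_mode(fn)
    local nparams = debug.getinfo(fn, "u").nparams
    if nparams == 3 then
        return "classic"
    elseif nparams == 5 then
        return "metadata"
    end
    return "unsupported"
end

-- Classic prototype: (tag, timestamp, record)
function classic_cb(tag, timestamp, record)
    return 0, timestamp, record
end

-- Metadata-aware prototype: (tag, timestamp, group, metadata, record)
function extended_cb(tag, timestamp, group, metadata, record)
    return 0, timestamp, metadata, record
end

print(detect_mode(classic_cb))
print(detect_mode(extended_cb))
```

Because the choice depends only on how many parameters the function declares, an existing 3-argument script keeps working without changes.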

Support for Returning Arrays (Multiple Records)

As with the original Lua callback design, the function may optionally return multiple records as arrays.

When using the metadata-aware prototype, you must return:

return 1, timestamp, {metadata_1, metadata_2, ...}, {record_1, record_2, ...}

Example:

function cb_metadata(tag, ts, group, metadata, record)
    -- first record with its metadata (locals avoid leaking globals between calls)
    local m1 = {foo = "meta1"}
    local r1 = {msg = "first log", old_record = record}

    -- second record with its metadata
    local m2 = {foo = "meta2"}
    local r2 = {msg = "second log", old_record = record}

    return 1, ts, {m1, m2}, {r1, r2}
end

Note: The metadata and record arrays must be the same length.
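
One way to keep both arrays the same length is to build them in the same loop. The sketch below assumes a hypothetical record layout with an `items` array field and emits one output record per item; `cb_split`, `items`, `item`, and `split_index` are illustrative names only.

```lua
-- Hypothetical callback: splits record["items"] (an assumed array field)
-- into one output record per item, each with its own metadata table.
function cb_split(tag, timestamp, group, metadata, record)
    local items = record["items"]
    if type(items) ~= "table" or #items == 0 then
        -- nothing to split: leave the record untouched
        return 0, timestamp, metadata, record
    end

    local metadata_list, record_list = {}, {}
    for i, item in ipairs(items) do
        -- shallow-copy the metadata so each output record owns its table
        local m = {}
        for k, v in pairs(metadata) do
            m[k] = v
        end
        m["split_index"] = i

        metadata_list[i] = m
        record_list[i] = { item = item }
    end

    -- both arrays are filled in the same loop, so their lengths match
    return 1, timestamp, metadata_list, record_list
end
```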

OpenTelemetry Test

The following is a simple OpenTelemetry logs test: we ingest a log with curl, receive it with the Fluent Bit OpenTelemetry input plugin, process it with Lua, and print the results to stdout:

JSON log

{
  "resourceLogs": [
    {
      "resource": {
        "attributes": [
          { "key": "service.name", "value": { "stringValue": "my-app" } },
          { "key": "host.name", "value": { "stringValue": "localhost" } }
        ]
      },
      "scopeLogs": [
        {
          "scope": {
            "name": "example-logger",
            "version": "1.0.0"
          },
          "logRecords": [
            {
              "timeUnixNano": "1717920000000000000",
              "severityNumber": 9,
              "severityText": "INFO",
              "body": {
                "stringValue": "User logged in successfully"
              },
              "attributes": [
                { "key": "user.id", "value": { "stringValue": "12345" } },
                { "key": "env", "value": { "stringValue": "prod" } }
              ]
            }
          ]
        }
      ]
    }
  ]
}

Fluent Bit Configuration

The inline Lua script copies the OTLP service name into the log record (body) and, if the record metadata carries severity number 9, changes it to 13:

pipeline:
  inputs:
    - name: opentelemetry
      port: 4318
      processors:
        logs:
          - name: lua
            call: cb_groups_and_metadata
            code: |
              function cb_groups_and_metadata(tag, timestamp, group, metadata, record)
                -- copy the OTLP metadata 'service.name' to the record
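                -- group is an empty table for ungrouped logs, so guard each lookup
                local resource = group['resource']
                local attributes = resource and resource['attributes']
                if attributes and attributes['service.name'] then
                  record['service_name'] = attributes['service.name']
                end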

                -- change OTLP Log severity by modifying the record metadata
                local otlp = metadata['otlp']
                if otlp and otlp['severity_number'] == 9 then
                  -- change severity 9 (INFO) to 13 (WARN)
                  otlp['severity_number'] = 13
                  otlp['severity_text'] = 'WARN'
                end

                return 1, timestamp, metadata, record
              end


  outputs:
    - name: stdout
      match: '*'
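
Before wiring the script into a pipeline, the callback can be exercised with plain Lua and mock tables. The sketch below uses a variant of the callback that guards against missing keys. The shapes of `group` and `metadata` are inferred from the script above rather than captured from real Fluent Bit output, and the `log` key and `test` tag are placeholders.

```lua
-- Variant of the callback from the configuration, with nil guards
function cb_groups_and_metadata(tag, timestamp, group, metadata, record)
    local resource = group["resource"]
    local attributes = resource and resource["attributes"]
    if attributes and attributes["service.name"] then
        record["service_name"] = attributes["service.name"]
    end

    local otlp = metadata["otlp"]
    if otlp and otlp["severity_number"] == 9 then
        otlp["severity_number"] = 13
        otlp["severity_text"] = "WARN"
    end

    return 1, timestamp, metadata, record
end

-- Mock inputs: shapes are assumptions inferred from the script above
local group = { resource = { attributes = { ["service.name"] = "my-app" } } }
local metadata = { otlp = { severity_number = 9, severity_text = "INFO" } }
local record = { log = "User logged in successfully" }

local code, ts, md, rec = cb_groups_and_metadata("test", 0, group, metadata, record)
assert(code == 1)
assert(rec.service_name == "my-app")
assert(md.otlp.severity_number == 13 and md.otlp.severity_text == "WARN")

-- An ungrouped record (empty group and metadata) must not raise an error
assert(cb_groups_and_metadata("test", 0, {}, {}, {}) == 1)
```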

Use Curl to send the data to Fluent Bit

curl -X POST http://localhost:4318/v1/logs \
  -H "Content-Type: application/json" \
  --data-binary @otel-log.json

Final comments:

  • Group metadata is read-only and should not be modified.
  • If you don’t need group or metadata support, you can continue using the 3-argument prototype.
  • Mixed return types (single record vs array) are supported, but must follow the proper structure.
  • This PR REMOVES the MPACK version of the Lua code serializer just to avoid duplicating code and logic.
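
Since the group table should not be modified, a script that wants to reshape group data can copy it first and work on the copy. The following is a sketch under that assumption; `deep_copy` and `cb_attach_resource` are hypothetical names, and the `resource` key follows the example script above.

```lua
-- Recursively copy a value so edits to the copy never reach the original.
function deep_copy(value)
    if type(value) ~= "table" then
        return value
    end
    local copy = {}
    for k, v in pairs(value) do
        copy[k] = deep_copy(v)
    end
    return copy
end

-- Hypothetical callback: attaches a private copy of the resource info
-- to the record instead of referencing the shared group table.
function cb_attach_resource(tag, timestamp, group, metadata, record)
    if group["resource"] == nil then
        return 0, timestamp, metadata, record
    end

    record["resource"] = deep_copy(group["resource"])
    return 1, timestamp, metadata, record
end
```

Later changes to `record["resource"]` then stay local to that record and leave `group` as it was passed in.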

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

edsiper added 2 commits June 9, 2025 16:56
… access

Signed-off-by: Eduardo Silva <[email protected]>

ryn9 commented Jun 10, 2025

I am very excited for this capability :)

Questions:

Will the metadata table be an empty table if there is no metadata to be passed in?

Should the metadata table be returned as an empty table if there is no metadata to be passed back?

Will the processor be able to handle metadata that is returned malformed (i.e., a non-table type)?
I would suggest the processor drop malformed returned metadata.

pwhelan (Contributor) commented Jun 10, 2025

Will it be possible to overwrite the group data and metadata as well? Will it be possible to delete them?


ryn9 commented Jun 10, 2025

Will it be possible to overwrite the group data and metadata as well? Will it be possible to delete them?

Group data would not be modifiable.
It is not even in the return statement.

The issue with modifying the group info (which is being passed in as read-only) is that the upstream group information may be shared across many different messages.

This is an issue also seen in the OTEL collector, where to modify resource or scope info you generally first have to flatten the data down to one resource, one scope, one log.

Reference: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/transformprocessor#transformflattenlogs

edsiper (Member, Author) commented Jun 10, 2025

@pwhelan

Will it be possible to overwrite the group data and metadata as well? Will it be possible to delete them?

Not in this version. The main reason is that the Lua callback works per log record. I am planning a next generation of Lua scripting as a processor. For now, you can use the content modifier processor.

edsiper (Member, Author) commented Jun 10, 2025

...unless we add an option to the filter to process only Group records, so you have 2 callbacks: one for the group and the other for the log.
