This repository was archived by the owner on Jul 26, 2022. It is now read-only.

Log lines might end up with duplicate JSON keys #401

Closed
bgdnlp opened this issue Jun 5, 2020 · 16 comments
Labels
bug (Something isn't working) · good first issue (Good for newcomers)

Comments

@bgdnlp

bgdnlp commented Jun 5, 2020

In some situations the log JSON contains duplicate keys, presumably because the "message" dict/map is merged into the log one. For example, the "time" key appears twice here:

{
  "level":50,
  "time":1591347107037,
  "pid":19,
  "hostname":"kubernetes-external-secrets-6799c84569-kfjcj",
  "message":"Access to KMS is not allowed",
  "code":"AccessDeniedException",
  "time":"2020-06-05T08:51:47.036Z",
  "requestId":"1894b6c9-b698-4657-a3cb-ecd8f044afee",
  "statusCode":400,
  "retryable":false,
  "retryDelay":77.07928574264942,
  "stack":"AccessDeniedException: Access to KMS is not allowed\n    at Request.extractError (/app/node_modules/aws-sdk/lib/protocol/json.js:51:27)\n    at Request.callListeners (/app/node_modules/aws-sdk/lib/sequential_executor.js:106:20)\n    at Request.emit (/app/node_modules/aws-sdk/lib/sequential_executor.js:78:10)\n    at Request.emit (/app/node_modules/aws-sdk/lib/request.js:683:14)\n    at Request.transition (/app/node_modules/aws-sdk/lib/request.js:22:10)\n    at AcceptorStateMachine.runTo (/app/node_modules/aws-sdk/lib/state_machine.js:14:12)\n    at /app/node_modules/aws-sdk/lib/state_machine.js:26:10\n    at Request.<anonymous> (/app/node_modules/aws-sdk/lib/request.js:38:9)\n    at Request.<anonymous> (/app/node_modules/aws-sdk/lib/request.js:685:12)\n    at Request.callListeners (/app/node_modules/aws-sdk/lib/sequential_executor.js:116:18)",
  "type":"Error",
  "msg":"failure while polling the secret kubernetes-external-secrets/test"
}
@pierluigilenoci
Contributor

This problem happens because Docker itself adds a "time" field to its log, which collides when the logs are merged. If you want to produce a JSON log you need to use a different name for the "time" field. I would suggest avoiding any docker/logstash/fluentbit/fluentd/elasticsearch "reserved" key (for example "@timestamp").
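
For illustration, a rough sketch (hypothetical, not this project's code) of why the collision matters once a log collector parses the application's JSON line and merges it into Docker's json-file record:

// Docker's json-file driver wraps each line and adds its own "time" key.
const dockerRecord = {
  log: '{"level":30,"time":1591347107037,"msg":"stopping poller"}\n',
  stream: 'stdout',
  time: '2020-06-05T08:51:47.036Z', // added by Docker when writing to disk
}

// A naive merge of the parsed payload reuses the same key name, so one
// "time" overwrites (or, with cruder tooling, duplicates) the other.
const merged = { ...dockerRecord, ...JSON.parse(dockerRecord.log) }
console.log(merged.time) // 1591347107037 — Docker's timestamp is gone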

@pierluigilenoci
Contributor

@silasbw @Flydiverny can you please take a look?

@pierluigilenoci
Contributor

@Niksko @moolen @megakid can you help in some way?

@moolen
Member

moolen commented Nov 10, 2020

I can confirm that this is an issue, thanks for reporting it!

It seems that the error that was caught and logged here contains additional fields like retryable and retryDelay, as well as time. Those come from the aws-sdk (as the stack suggests):

https://github.com/godaddy/kubernetes-external-secrets/blob/e719c8724f4496d51f962b689b40ee49a0cd928a/lib/poller.js#L129-L131

Having a field that indicates the time is necessary for debugging IMO (e.g. when tailing logs directly from a pod without additional tooling like logstash/fluent*/elasticsearch/...). I think it should be up to the user how to organize log fields and make up conventions.
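
As an illustration of the mechanism (a hypothetical sketch, not the actual poller.js code): pino copies the merging object's properties into the line it is already building, so a time property on the logged error is not deduplicated against pino's own time field, which is how two "time" keys end up in the same line as in the output above.

const pino = require('pino')
const log = pino()

// Stand-in for the aws-sdk error; its enumerable fields end up top-level.
const awsError = {
  code: 'AccessDeniedException',
  time: '2020-06-05T08:51:47.036Z', // clashes with pino's built-in "time"
  retryable: false,
}

log.error(awsError, 'failure while polling the secret')
// => {"level":50,"time":1591347107037,...,"code":"AccessDeniedException",
//     "time":"2020-06-05T08:51:47.036Z","retryable":false,"msg":"failure while polling the secret"}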

@Flydiverny
Member

If the root cause here is the err object containing an additional time key, we could configure pino to use nestedKey for logged objects.

The log message would probably end up something like this:

{
  "level": 50,
  "time": 1591347107037,
  "pid": 19,
  "payload": {
    "hostname": "kubernetes-external-secrets-6799c84569-kfjcj",
    "message": "Access to KMS is not allowed",
    "code": "AccessDeniedException",
    "time": "2020-06-05T08:51:47.036Z",
    "requestId": "1894b6c9-b698-4657-a3cb-ecd8f044afee",
    "statusCode": 400,
    "retryable": false,
    "retryDelay": 77.07928574264942,
    "stack": "AccessDeniedException: Access to KMS is not allowed\n    at Request.extractError (/app/node_modules/aws-sdk/lib/protocol/json.js:51:27)\n    at Request.callListeners (/app/node_modules/aws-sdk/lib/sequential_executor.js:106:20)\n    at Request.emit (/app/node_modules/aws-sdk/lib/sequential_executor.js:78:10)\n    at Request.emit (/app/node_modules/aws-sdk/lib/request.js:683:14)\n    at Request.transition (/app/node_modules/aws-sdk/lib/request.js:22:10)\n    at AcceptorStateMachine.runTo (/app/node_modules/aws-sdk/lib/state_machine.js:14:12)\n    at /app/node_modules/aws-sdk/lib/state_machine.js:26:10\n    at Request.<anonymous> (/app/node_modules/aws-sdk/lib/request.js:38:9)\n    at Request.<anonymous> (/app/node_modules/aws-sdk/lib/request.js:685:12)\n    at Request.callListeners (/app/node_modules/aws-sdk/lib/sequential_executor.js:116:18)",
    "type": "Error"
  },
  "msg": "failure while polling the secret kubernetes-external-secrets/test"
}
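
A minimal sketch of that configuration (assuming pino's nestedKey option; the payload key name is just an example):

const pino = require('pino')

// With nestedKey, the merging object is placed under "payload" instead of
// being spread at the top level, so its "time" no longer collides with pino's.
const log = pino({ nestedKey: 'payload' })

const awsError = { code: 'AccessDeniedException', time: '2020-06-05T08:51:47.036Z' }
log.error(awsError, 'failure while polling the secret')
// => {"level":50,"time":1591347107037,"payload":{"code":"AccessDeniedException","time":"2020-06-05T08:51:47.036Z"},"msg":"failure while polling the secret"}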

I was under the impression this had to do with log merging (e.g. docker log + output from KES) later on, in which case I think it's up to the end user's configuration.

PR welcome :)

@Flydiverny added the bug (Something isn't working) and good first issue (Good for newcomers) labels on Nov 11, 2020
@pierluigilenoci
Contributor

pierluigilenoci commented Nov 11, 2020

@moolen

Having a field that indicates the time is necessary for debugging IMO

Docker itself adds a "time" field when it writes the log to disk, so adding one to the pod log is a bit redundant. If you want to keep it anyway, you can choose an alternative field name or nest it inside another field. Or avoid the JSON format altogether.

@Flydiverny

This is an example of a log produced in one of our clusters:

{
    "@timestamp": "2020-11-06T12:59:59.275Z",
    "log": "{\"level\":\"info\",\"time\":1604667599274,\"pid\":17,\"hostname\":\"kubernetes-external-secrets-65f77ddcc7-hdq22\",\"msg\":\"stopping poller for NAMESPACE/external-secrets-update-job\"}\n",
    "stream": "stdout",
    "level": "info",
    "time": 1604667599274,
    "pid": 17,
    "hostname": "kubernetes-external-secrets-65f77ddcc7-hdq22",
    "msg": "stopping poller for NAMESPACE/external-secrets-update-job",
    "kubernetes": {
      "pod_name": "kubernetes-external-secrets-65f77ddcc7-hdq22",
      "namespace_name": "NAMESPACE",
      "pod_id": "845cf1df-ce7f-4aa0-8cad-564ee2ccb325",
      "labels": {
        "app_kubernetes_io/instance": "kubernetes-external-secrets",
        "app_kubernetes_io/name": "kubernetes-external-secrets",
        "pod-template-hash": "65f77ddcc7"
      },
      "annotations": {
        "kubernetes_io/psp": "restricted",
        "seccomp_security_alpha_kubernetes_io/pod": "runtime/default"
      },
      "host": "ip-xx-xx-xx-xx.eu-west-1.compute.internal",
      "container_name": "kubernetes-external-secrets",
      "docker_id": "1946eab2256435daf4e006abf7348618a250a03f0681f95d092c898e104bd4f3",
      "container_hash": "godaddy/kubernetes-external-secrets@sha256:e827f7bc272117a97a8562ac70e37810266467f07d2fce798c8abc360b892c08",
      "container_image": "godaddy/kubernetes-external-secrets:6.0.0"
    }
}
In my case, fluent-bit is able to merge the fields, but it overwrites Docker's own "time" field with the value from the pod's log, which also creates an indexing problem because the formats differ. The default Docker format for the time field is "time": "2020-11-06T12:59:59.669350409Z".

In older versions of fluent-bit (or other software), the merge is less refined and a JSON object with two "time" fields is generated. That JSON is obviously malformed and is rejected by Elasticsearch.

IMHO kubernetes-external-secrets should work optimally, without forcing configuration changes, on a vanilla EKS cluster configured with the most common CNCF packages (specifically the EFK stack).

@bgdnlp
Author

bgdnlp commented Nov 20, 2020

Having a field that indicates time is necessary, true. The problem is that there are two of them, which results in invalid JSON, which in turn confuses log analysis tools that would otherwise work without modification.

One option is to nest the log, as @Flydiverny said. This is probably the technically correct option, but it might be a breaking change for anyone relying on the existing log structure.

Another would be to simply rename the "time" field in the record generated by kubernetes-external-secrets to something else. "message_time", "recorded_at", "kes_time", whatever. Probably easier to implement, but it might break a different set of tools.
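
A rough sketch of how such a rename could look with pino's timestamp option (message_time is just the name suggested above, not the project's actual config):

const pino = require('pino')

// The timestamp option accepts a function returning a partial JSON string
// (starting with a comma), which replaces pino's built-in "time" field.
const log = pino({
  timestamp: () => `,"message_time":${Date.now()}`,
})

log.info('stopping poller for NAMESPACE/external-secrets-update-job')
// => {"level":30,"message_time":1604667599274,"pid":17,"hostname":"...","msg":"stopping poller for NAMESPACE/external-secrets-update-job"}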

Either way, the sooner it's changed, the fewer people are impacted by it, I guess.

@github-actions

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 30 days.

@github-actions github-actions bot added the Stale label Feb 19, 2021
@pierluigilenoci
Contributor

@moolen @Flydiverny is there any hope that this bug will be fixed?

@pierluigilenoci
Contributor

pierluigilenoci commented Feb 19, 2021

Oops... I found a fix here https://github.com/external-secrets/kubernetes-external-secrets/blob/master/config/index.js, so I assume this ticket can be closed. Thanks.

@github-actions github-actions bot removed the Stale label Feb 20, 2021
@moolen moolen closed this as completed Feb 20, 2021
@pierluigilenoci
Contributor

pierluigilenoci commented Mar 5, 2021

@moolen I was wrong. The issue is still there.
@bgdnlp @Flydiverny can you confirm?

IMHO the ultimate solution would be to rename the time field, if it is really needed at all. For me it is not needed, because Docker itself adds its own "time" field when it writes the log to disk; hence the original problem.

Here is my log: #401 (comment)
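
If the field were dropped entirely instead, a minimal sketch (again assuming pino's timestamp option) would be:

const pino = require('pino')

// timestamp: false removes pino's "time" field altogether, leaving
// timestamping to Docker's log driver.
const log = pino({ timestamp: false })

log.info('stopping poller')
// => {"level":30,"pid":17,"hostname":"...","msg":"stopping poller"}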

@pierluigilenoci
Contributor

@moolen @bgdnlp @Flydiverny can I have any feedback?

@moolen
Member

moolen commented Mar 15, 2021

Hey @pierluigilenoci, let's change the time field, I'm OK with that.

@pierluigilenoci
Contributor

@moolen I used message_time as proposed by @bgdnlp
If you don't like it, choose the one you prefer. Just avoid time, timestamp, and @timestamp.
#660

@pierluigilenoci
Contributor

@Flydiverny this issue can be closed.

@moolen
Member

moolen commented Mar 18, 2021

Thank you @pierluigilenoci for your contribution ❤️!

@moolen moolen closed this as completed Mar 18, 2021