I have the following otel collector config:

receivers:
  filelog/nginx/1:
    include:
    - /opt/nginx/log/access.log
    include_file_path: true
    include_file_name: false
    operators:
    - type: router
      routes:
        - output: parse1
          expr: 'body matches "^\\[\\d"'
          attributes:
            debug_body_router: '${body}'
      default: parse2
    - type: regex_parser
      id: parse1
        # Example line:
        # [28/Jan/2025:04:32:25 +0000] 127.0.0.1 34096 /__cq/status ...
      regex: '^\[(?P<timestamp>\d{2}\/[A-Za-z]{3}\/\d{4}:\d{2}:\d{2}:\d{2}\s+[+\-]\d{4})\]\s+(?P<ip>\S+)\s+(?P<port>\d+)\s+(?P<path>\S+).*$'
      timestamp:
        parse_from: attributes.timestamp
        layout: '%d/%b/%Y:%H:%M:%S %z'
      attributes:
        debug_body_parse1: '${body}'
    - type: regex_parser
      id: parse2
        # Example lines:
        # 127.0.0.1 - - [29/Jan/2025:08:33:22 +0000] ...
        # 127.0.0.1 - MitigatorConnector [28/Jan/2025:14:23:46 +0000] ...
      regex: '^(?P<ip>\S+)\s+-\s+(?P<service>\S+)\s+\[(?P<timestamp>\d{2}\/[A-Za-z]{3}\/\d{4}:\d{2}:\d{2}:\d{2}\s+[+\-]\d{4})\]\s+(?P<msg>.*)$'
      timestamp:
        parse_from: attributes.timestamp
        layout: '%d/%b/%Y:%H:%M:%S %z'
      attributes:
        debug_body_parse2: '${body}'

exporters:
  debug:
    verbosity: "detailed" # comment this out to minimize logging

service:
  pipelines:
    logs:
      receivers: [filelog/nginx/1]
      processors: []
      exporters: [debug]
  telemetry:
    logs:
      level: debug

I see the following trace in the logs:

LogRecord #1
ObservedTimestamp: 2025-01-30 09:15:19.792630808 +0000 UTC
Timestamp: 2025-01-30 09:15:19 +0000 UTC
SeverityText: 
SeverityNumber: Unspecified(0)
Body: Str([30/Jan/2025:09:15:19 +0000] 127.0.0.1 37158 /__cq/cache/info /__cq/cache/info 200 214 "python-requests/2.25.1" 0.000 "-" "-" "-" "-" "14183" "167" "-" "localhost:9999" - -)
Attributes:
     -> path: Str(/__cq/cache/info)
     -> log.file.path: Str(/opt/nginx/log/access.log)
     -> debug_body_router: Str()
     -> timestamp: Str(30/Jan/2025:09:15:19 +0000)
     -> ip: Str(127.0.0.1)
     -> port: Str(37158)
Trace ID: 
Span ID: 

There are also "regex pattern does not match" errors for both parse2 and the router. I have a few concerns:

  • Why am I seeing a "regex pattern does not match" error for the router? Isn't that the whole point of a router? If a route does not match, it should move on to the next route instead of throwing an error.
  • If the regex pattern did not match for the router, why do I still see the attribute debug_body_router in the exported log?
  • Most importantly, how do I match the body correctly in the router expr?

Asked Jan 30 at 9:30 by kaushal agrawal

1 Answer

As a workaround, I got the desired behavior by removing the router altogether and putting an if condition on each regex_parser instead. Updated receiver config for reference:

receivers:
  filelog/nginx/1:
    include:
    - /opt/nginx/log/access.log
    include_file_path: true
    include_file_name: false
    operators:
    - type: regex_parser
      id: parse1
        # Example line:
        # [28/Jan/2025:04:32:25 +0000] 127.0.0.1 34096 /__cq/status ...
      if: 'body matches "^\\[\\d"'
      regex: '^\[(?P<timestamp>\d{2}\/[A-Za-z]{3}\/\d{4}:\d{2}:\d{2}:\d{2}\s+[+\-]\d{4})\]\s+(?P<ip>\S+)\s+(?P<port>\d+)\s+(?P<path>\S+).*$'
      timestamp:
        parse_from: attributes.timestamp
        layout: '%d/%b/%Y:%H:%M:%S %z'
    - type: regex_parser
      id: parse2
        # Example lines:
        # 127.0.0.1 - - [29/Jan/2025:08:33:22 +0000] ...
        # 127.0.0.1 - MitigatorConnector [28/Jan/2025:14:23:46 +0000] ...
      if: 'body not matches "^\\[\\d"'
      regex: '^(?P<ip>\S+)\s+-\s+(?P<service>\S+)\s+\[(?P<timestamp>\d{2}\/[A-Za-z]{3}\/\d{4}:\d{2}:\d{2}:\d{2}\s+[+\-]\d{4})\]\s+(?P<msg>.*)$'
      timestamp:
        parse_from: attributes.timestamp
        layout: '%d/%b/%Y:%H:%M:%S %z'
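
To sanity-check the patterns outside the collector, here is a minimal Python sketch that mimics the two if-guarded parsers above. It rests on some assumptions: Python's re module is not the regex engine the collector uses, so this only shows that these particular patterns and the strptime layout behave as intended on the sample lines, and parse_line is a hypothetical helper, not part of any collector API. Note that the if expression doubles its backslashes because the pattern sits inside a quoted string within the expression; the pattern itself is just ^\[\d.

```python
import re
from datetime import datetime

# Patterns copied from the receiver config above.
ROUTE_PREDICATE = re.compile(r"^\[\d")
PARSE1 = re.compile(
    r"^\[(?P<timestamp>\d{2}\/[A-Za-z]{3}\/\d{4}:\d{2}:\d{2}:\d{2}\s+[+\-]\d{4})\]"
    r"\s+(?P<ip>\S+)\s+(?P<port>\d+)\s+(?P<path>\S+).*$"
)
PARSE2 = re.compile(
    r"^(?P<ip>\S+)\s+-\s+(?P<service>\S+)\s+"
    r"\[(?P<timestamp>\d{2}\/[A-Za-z]{3}\/\d{4}:\d{2}:\d{2}:\d{2}\s+[+\-]\d{4})\]"
    r"\s+(?P<msg>.*)$"
)
LAYOUT = "%d/%b/%Y:%H:%M:%S %z"  # same layout string as in the config


def parse_line(body):
    """Hypothetical helper: pick a parser the way the two `if` guards do.

    Returns (parser_id, fields), where fields is None if the chosen
    pattern does not match the line.
    """
    if ROUTE_PREDICATE.search(body):
        parser_id, pattern = "parse1", PARSE1
    else:
        parser_id, pattern = "parse2", PARSE2
    match = pattern.match(body)
    if match is None:
        return parser_id, None
    fields = match.groupdict()
    # Assumes an English/C locale so that %b understands "Jan".
    fields["parsed_time"] = datetime.strptime(fields["timestamp"], LAYOUT)
    return parser_id, fields


if __name__ == "__main__":
    samples = [
        "[28/Jan/2025:04:32:25 +0000] 127.0.0.1 34096 /__cq/status ...",
        "127.0.0.1 - - [29/Jan/2025:08:33:22 +0000] ...",
        "127.0.0.1 - MitigatorConnector [28/Jan/2025:14:23:46 +0000] ...",
    ]
    for line in samples:
        print(parse_line(line))
```

Because the second guard is the exact negation of the first, every line goes to exactly one parser, which is what the router with a default route was meant to do.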

Tags: Otel collector router expr regex matching (Stack Overflow)