Introduction

Supported Functionality

  • Inclusion and exclusion of log files
  • Parsing logs based on Regex and JSON
  • Specifying custom OTLP operators using custom formatting
  • Filtering based on tag
  • Scrubbing/Redaction of text

Components of a log format

In OpsRamp, all collected logs are structured into three sections, making them easy to query and organize.

  • Log Resource Attributes: These attributes accompany each log. They contain details about the source from which the logs are collected, such as host, file name, file path, and other sources.
  • Log Record Attributes: These attributes are specific to the individual log records. They are user-defined and can vary based on the specific needs and requirements of the organization. By default, this section remains blank unless the user explicitly defines it.
  • Body: This field contains the raw log line as collected, unless transformations are applied with a Regex or JSON parser. You can use these parsers to extract fields from the log line, after which the body is structured as key-value pairs. Once the body contains key-value pairs, you can specify whether the extracted fields should be treated as record attributes or resource attributes. Any attributes and resource attributes set in this manner are indexed within OpsRamp. A sketch illustrating this organization follows the list.
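
For illustration only, the following sketch shows how a parsed web-server access log entry might be organized into these three sections. The attribute names (such as host.name and log.file.path) are hypothetical and depend on your configuration and agent version.

Log Resource Attributes: 
  host.name: "web-server-01"                 # hypothetical source host 
  log.file.name: "access.log" 
  log.file.path: "/var/log/nginx/access.log" 
Log Record Attributes: 
  status_code: "200"                         # promoted from the body by user configuration 
Body: 
  remote_addr: "172.26.32.120" 
  request: "GET /nginx_status HTTP/1.1" 
  status_code: "200" 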

File Configurations

In Linux and Windows systems, configuration files play a crucial role in defining the behaviour of applications and services.

For OpsRamp Agent, log configuration files are in the following paths:

Linux File Configurations

Default Location:

Path: /opt/opsramp/agent/conf/log/log-config.yaml

User-Defined Config Location:

Path: /opt/opsramp/agent/conf/log.d/log-config.yaml

Modify the user-defined config located at /opt/opsramp/agent/conf/log.d/log-config.yaml. If no configuration is specified in the user-defined config file, the system collects Syslog and Journald logs using the default configuration.

Windows File Configuration

Default Location:

Path: C:\Program Files (x86)\OpsRamp\Agent\conf\log\log-config.yaml

User-Defined Config Location:

Path: C:\Program Files (x86)\OpsRamp\Agent\conf\log.d\log-config.yaml

Modify the user-defined config located at C:\Program Files (x86)\OpsRamp\Agent\conf\log.d\log-config.yaml. If no configuration is specified in the user-defined config file, the default configuration is used, which collects application event error logs.

FAQ

  1. If the User Defined Config is the only configuration required by the user, what is the purpose of the Default Config file?

    • The default configuration works together with the auto-detection of applications. If the agent identifies any applications, log collection becomes active for them as well.
  2. How can we verify the applications detected through auto-detection?

    • Detected applications are recorded in the /opt/opsramp/agent/conf/log.d/log-config.yaml.sample in Linux and C:\Program Files (x86)\OpsRamp\Agent\conf\log.d\log-config.yaml.sample in Windows. This file is updated each time auto-detection occurs, allowing users to review the list of detected applications.

Note:

  • All configuration modifications must be made in the User-Defined Config.
  • If a User-Defined Config exists, the Default Config is entirely disregarded.
  • Auto-discovery of applications is deactivated for the User-Defined Config.

Simple File Log Collection

A simple log file is usually a basic text file containing log messages. With the simple file log collection approach, you can configure OpsRamp to send raw log lines directly to the portal without applying Regex or JSON parsing. Consider the following limitations before implementing this approach:

  • The timestamps in these logs are based on when they are collected, rather than when they are generated.
  • Log levels are determined using a basic string-matching technique within the log message for common log-level strings. However, this method may be unreliable if the log line contains multiple log-level substrings.
  • Timestamp parsing and severity mapping fields are disregarded in this type of log collection.
inputs:
  source_name:
    type: file
    source: ''
    include:
      - /var/log/application_name/log_name.log
    exclude: 
      - '' 
    fingerprint_size: 1000

File Log Collection Using Regex

File log collection using Regex involves configuring the system to extract structured information from log files based on predefined patterns. As an example, this section explains how to read and parse an Nginx access log.

inputs: 
  nginx_access: 
    type: file 
    source: nginx_access 
    include: 
      - /var/log/nginx/access.log 
    exclude: 
      - '' 
    parser_type: regex 
    regex: ^(?P<remote_addr>[^\s]*)\s-\s(?P<remote_user>[^\s]*)\s\[(?P<timestamp>[^]]*)]\s*"(?P<request>[^"]*)"\s*(?P<status_code>\d*)\s*(?P<body_bytes_sent>\d*)\s*"(?P<http_referer>[^"]*)"\s*"(?P<http_user_agent>[^"]*)"$ 
    timestamp: 
      layout_type: strptime 
      layout: '%d/%b/%Y:%H:%M:%S %z' 
      location: UTC

Here, the Regex parser is used to parse the Nginx logs. This involves setting parser_type to regex and providing a valid regular expression that matches the application's log line format. Note: You can validate the regex using tools like regex101.
The following example explains how Regex extracts information from the sample Nginx log:
Sample Nginx logs:

172.26.32.120 - - [05/Oct/2021:12:57:04 +0000] "GET /nginx_status HTTP/1.1" 200 110 "-" "Go-http-client/1.1"
172.26.32.120 - - [05/Oct/2021:12:58:04 +0000] "GET /nginx_health HTTP/1.1" 200 110 "-" "Go-http-client/1.1"
172.26.32.120 - - [05/Oct/2021:12:59:04 +0000] "GET /nginx_info HTTP/1.1" 200 110 "-" "Go-http-client/1.1" 
172.26.32.120 - - [05/Oct/2021:13:00:04 +0000] "GET /nginx_status HTTP/1.1" 200 110 "-" "Go-http-client/1.1"

Regex used:

^(?P<remote_addr>[^\s]*)\s-\s(?P<remote_user>[^\s]*)\s\[(?P<timestamp>[^]]*)]\s*"(?P<request>[^"]*)"\s*(?P<status_code>\d*)\s*(?P<body_bytes_sent>\d*)\s*"(?P<http_referer>[^"]*)"\s*"(?P<http_user_agent>[^"]*)"$

As shown above, anything enclosed in (?P<group_name>regex_expression) captures and extracts a specific field. These extracted fields can be further customized using advanced configurations such as custom formatting, filtering, and scrubbing, which are explained in the Advanced Configurations section.
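
For instance, applying this regex to the first sample line above produces the following key-value pairs in the body (an illustrative sketch, not literal agent output):

remote_addr: 172.26.32.120 
remote_user: - 
timestamp: 05/Oct/2021:12:57:04 +0000 
request: GET /nginx_status HTTP/1.1 
status_code: 200 
body_bytes_sent: 110 
http_referer: - 
http_user_agent: Go-http-client/1.1 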

The subsequent step in the configuration is the timestamp field. This feature enables you to synchronize the timestamp of the log in OpsRamp with the time it was originally generated. To accomplish this, parse the extracted time field using a layout type such as strptime, gotime, or epoch. See Supported Timestamp Formats for more details.
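
For illustration, the following sketches show what the timestamp block might look like for the other layout types, assuming the agent follows the OpenTelemetry stanza timestamp-parser conventions; verify the exact values against Supported Timestamp Formats.

    # hypothetical field holding epoch seconds, e.g. 1633438624 
    timestamp: 
      layout_type: epoch 
      layout: s 

    # hypothetical RFC3339 timestamps parsed with a Go reference layout 
    timestamp: 
      layout_type: gotime 
      layout: "2006-01-02T15:04:05Z07:00" 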

File Log Collection Using JSON

You can configure the system to extract structured information from log files formatted in JSON.
log-config.yaml:

eventstore: 
    type: "file" 
    source: "eventstore" 
    include: 
      - /var/log/eventstore/**/*.json 
    parser_type: "json"                                   	 
    custom_formatting: |-									 
      [ 
        { 
          "type":"move", 
          "from": "body.[\"@t\"]", 
          "to":"body.timestamp" 
        } 
      ] 
    timestamp: 
      layout_type: strptime 
      layout: "%Y-%m-%dT%H:%M:%S.%s%z"

The above code snippet parses EventStore logs, which are structured in JSON format rather than plain text. The JSON parser extracts the relevant fields from each JSON-formatted log record.
The following example shows how EventStore logs are parsed:
Sample EventStore logs:

{"@t":"2020-08-18T10:57:05.6398910Z","@mt":"SLOW BUS MSG [{bus}]: {message} - {elapsed}ms. Handler: {handler}.","@l":"Debug","bus":"MainBus","message":"BecomeShuttingDown","elapsed":102,"handler":"WideningHandler`2","SourceContext":"EventStore.Core.Bus.InMemoryBus","ProcessId":2950,"ThreadId":12} 
{"@t":"2020-08-18T10:57:05.6560627Z","@mt":"SLOW QUEUE MSG [{queue}]: {message} - {elapsed}ms. Q: {prevQueueCount}/{curQueueCount}.","@l":"Debug","queue":"MainQueue","message":"RequestShutdown","elapsed":124,"prevQueueCount":0,"curQueueCount":8,"SourceContext":"EventStore.Core.Bus.QueuedHandlerMRES","ProcessId":2950,"ThreadId":12} 
{"@t":"2020-08-18T10:57:05.6623165Z","@mt":"========== [{httpEndPoint}] Service '{service}' has shut down.","httpEndPoint":"127.0.0.1:1113","service":"StorageWriter","SourceContext":"EventStore.Core.Services.VNode.ClusterVNodeController","ProcessId":2950,"ThreadId":12}

After parsing, the move operator in custom_formatting renames the @t key to timestamp, which allows the timestamp block to parse the time at which the log was originally generated.
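
As an illustrative sketch (assuming the JSON parser and the move operator shown above), the body of the first sample record would then contain:

timestamp: "2020-08-18T10:57:05.6398910Z"    # previously the @t key 
"@mt": "SLOW BUS MSG [{bus}]: {message} - {elapsed}ms. Handler: {handler}." 
"@l": "Debug" 
bus: "MainBus" 
message: "BecomeShuttingDown" 
# ...the remaining keys (elapsed, handler, SourceContext, ProcessId, ThreadId) are unchanged 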

If the JSON parser is not used for JSON-formatted logs, not all fields are parsed, resulting in incomplete information. Consequently, the logs may not display correctly in the expanded view of the UI.

Advanced Configurations

Multiline Configuration

The multiline configuration refers to the ability to handle log entries that span multiple lines. If set, the multiline configuration block instructs the file source type to split log entries on a pattern other than newlines.
The multiline configuration block must contain exactly one line_start_pattern or line_end_pattern. These are regex patterns designed to match either the beginning of a new log entry or the end of a log entry.

For example:
Consider the following logs as input.
sample logs:

2022/09/07 20:20:54 error 373  Exception in thread "main" 
java.lang.NullPointerException 
        at com.example.myproject.Book.getTitle(Book.java:16) 
        at com.example.myproject.Author.getBookTitles(Author.java:25) 
        at com.example.myproject.Bootstrap.main(Bootstrap.java:14) 
2022/09/07 20:20:54 info 374  recovery sucess 
2022/09/07 20:20:54 info 375  random log line 
2022/09/07 20:20:54 warn 376  another random log line

Following is the configuration to parse these multiline logs: 

log-config.yaml 
inputs: 
  application: 
    type: file 
    include: 
      - "/var/log/application/*.log" 
    parser_type: "regex" 
    multiline: 
      line_start_pattern: ^(?P<timestamp>\d{4}\/\d{2}\/\d{2}\s*\d{2}:\d{2}:\d{2})\s*(?P<level>\w*)

When the collector attempts to read a line, it interprets everything from the matched pattern until the subsequent match as a single line.

In the above example, the first log entry, which spans multiple lines (the exception message and its stack trace), is treated as a single record by the collector.

Severity From

The severity_from refers to the method or source from which the severity level of log entries is derived or determined. In log management, the severity level indicates the importance or severity of a log entry.
log-config.yaml:

redis: 
    type: "file" 
    source: "redis" 
    include: 
      - /home/lokesh/experiments/otel_collector/test_otel_collector_integration/redis-server-tampered.log 
    multiline: 
      line_start_pattern: ^(?P<pid>\w*):(?P<role>\w*)\s*(?P<timestamp>\d*\s*\w*\s*\d*\s*\d*:\d*:\d*.\d*) 
    parser_type: "regex" 
    regex: ^(?P<pid>\w*):(?P<role>\w*)\s*(?P<timestamp>\d*\s*\w*\s*\d*\s*\d*:\d*:\d*.\d*)\s*(?P<level>\W)(?P<message>.*)$ 
    timestamp: 
      layout_type: strptime 
      layout: "%e %b %H:%M:%S.%f" 
    severity_from: "body.lvl"

Severity Mapping

Severity mapping is the process of correlating or mapping severity levels from log entries to standardized severity levels for better organization, analysis, and prioritization of logs. It enables you to define rules or criteria for assigning severity levels to logs that lack explicit severity information.

Following is the example of Redis logs that lack proper log levels:

18842:M 27 Jun 10:55:21.636 * DB saved on disk 
18842:M 27 Jun 10:55:21.636 * Removing the pid file. 
18842:M 27 Jun 10:55:21.637 # Redis is now ready to exit, bye bye... 
24187:C 27 Jun 10:55:21.786 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo

You can configure the severity mapping as follows:

redis: 
    type: "file" 
    source: "redis" 
    include: 
      - /home/lokesh/experiments/otel_collector/test_otel_collector_integration/redis-server-tampered.log 
    multiline: 
      line_start_pattern: ^(?P<pid>\w*):(?P<role>\w*)\s*(?P<timestamp>\d*\s*\w*\s*\d*\s*\d*:\d*:\d*.\d*) 
    parser_type: "regex" 
    regex: ^(?P<pid>\w*):(?P<role>\w*)\s*(?P<timestamp>\d*\s*\w*\s*\d*\s*\d*:\d*:\d*.\d*)\s*(?P<level>\W)(?P<message>.*)$ 
    timestamp: 
      layout_type: strptime 
      layout: "%e %b %H:%M:%S.%f" 
    severity_priority_order: [ "error", "fatal", "warn", "info", "debug", "trace" ] 
    severity_mapping: 
      warn: [ "#" ] 
      info: [ "*" ] 
      debug: [ "." ] 
      trace: [ "-" ] 
      error: [] 
      fatal: [] 

In the above scenario, the mapping of characters to severity levels is as follows. This mapping dictates how log entries containing these characters will be represented in terms of severity levels.

  • # is mapped to WARN
  • * is mapped to INFO
  • . is mapped to DEBUG
  • - is mapped to TRACE

If a severity level's mapping list is left blank, it is ignored. The original value on which the mapping was based is displayed as a log attribute named Custom Severity.

Encoding

Encoding is a method or scheme used to represent characters or data within log entries. It is useful when the log does not use standard UTF-8 encoding.

application_name: 
    type: "file" 
    source: "application_name" 
    encoding: "utf-8"                 
    include: 
      - /var/log/application/*.log 

The valid encoding types include:

  • nop: No encoding validation. Treats the file as a stream of raw bytes.
  • utf-8: UTF-8 encoding.
  • utf-16le: UTF-16 encoding with little-endian byte order.
  • utf-16be: UTF-16 encoding with big-endian byte order.
  • ascii: ASCII encoding.
  • big5: The Big5 Chinese character encoding.

User Defined Constant Labels

The User Defined Constant Labels are the custom labels or tags that users can define and assign to log entries or events. You can configure up to five resource labels. If more than five labels are specified, only the first five labels are considered, and any additional labels are ignored.

application: 
    type: "file" 
    include: 
      - /var/log/application/*.log 
    labels: 
      label_key_1: "label_value_1" 
      label_key_2: "label_value_2" 
      label_key_3: "label_value_3" 

Setting ParseTo

The Setting ParseTo specifies the target location where the parsed log fields should be stored or mapped within a log management system. Using ParseTo can introduce undefined behaviour and potentially disrupt functionalities such as level and timestamp parsing.
It is recommended to exercise caution when employing ParseTo and only utilize it if there is a clear benefit, especially when combined with custom formatting.
When using a Regex or JSON parser, you can specify where the parsed result is stored. The following are the three supported destinations:

  • Body - Parsed results are stored within the body of the log entry itself.
  • Attributes - Parsed results are stored as attributes associated with the log entry.
  • Resource - Parsed results are stored as resource attributes, which are specific to the resource or entity represented by the log entry.
application: 
    type: "file" 
    include: 
      - /var/log/application/*.log 
    parser_type: "regex" 
    regex: ^(?P<pid>\w*):(?P<role>\w*)\s*(?P<timestamp>\d*\s*\w*\s*\d*\s*\d*:\d*:\d*.\d*)\s*(?P<level>\W)(?P<message>.*)$ 
    parse_to: "body"  
    timestamp: 
      layout_type: strptime 
      layout: "%e %b %H:%M:%S.%f" 

Set Values

Setting values in record attributes and resource attributes from parsed fields involves extracting information from log entries using parsers like regex or JSON and then assigning this extracted information to attributes associated with the log entry or the resource being monitored.
This is useful if you want to set certain parsed fields as either record or resource attributes.

application: 
    type: "file" 
    include: 
      - /var/log/application/*.log 
    parser_type: "regex" 
    regex: ^(?P<pid>\w*):(?P<role>\w*)\s*(?P<timestamp>\d*\s*\w*\s*\d*\s*\d*:\d*:\d*.\d*)\s*(?P<level>\W)(?P<message>.*)$ 
    timestamp: 
      layout_type: strptime 
      layout: "%e %b %H:%M:%S.%f" 
    attributes: [ "role", "pid" ] 
    resource_attributes: [ "role" ] 

In the above example, role and pid are set as record attributes, while role is also designated as a resource attribute.

It is important to note that anything defined as an attribute or resource attribute will be indexed within OpsRamp.

Custom Formatting

If you require full control over the parsing process, you can employ custom formatting to tailor log entries to your needs. The custom_formatting field accepts a list of operators that transform the log entry and extract information from it.
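
For illustration, a hypothetical custom_formatting block that copies a parsed message field into the record attributes and moves a level field out of the body might look like the following; the field names are assumptions and must match your own parsed output.

custom_formatting: |- 
  [ 
    { 
      "type": "copy", 
      "from": "body.message", 
      "to": "attributes.message_copy" 
    }, 
    { 
      "type": "move", 
      "from": "body.level", 
      "to": "attributes.level" 
    } 
  ] 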

See OpenTelemetry collector operators for more details.

Filter Logs

Filtering logs involves selecting specific log entries from a larger set based on defined criteria. To filter logs based on specific criteria, define the filtering rules in the filters section of the respective input.
Filtering syntax:

filters:  
  - key: ""  
    include: "" 
  - key: "" 
    exclude: "" 
  - key: "" 
    include: "" 
  - attribute_type: "resource" 
    key: "" 
    exclude: "" 
  # ...additional filter rules as needed
  • exclude: Removes the records that match the specified regular expression pattern.
  • include: Keeps the records that match the specified regular expression pattern.
  • key: Tag to which the respective filtering rule is applied.
  • attribute_type: Accepted values are body, attributes, and resource (defaults to body if the field is omitted).

A filter rule must contain either include or exclude, but not both at the same time.
If you need to apply both include and exclude conditions, create separate filter rules and chain them together. Chaining allows more precise and flexible filtering criteria.

- key: "" 
  include: "" 
- key: "" 
  exclude: "" 

Filtering occurs in the sequence in which the rules are specified. If the attribute_type is designated as body and the key is left empty, the filtering process assumes that no regex or JSON parsers are applied and attempts to filter directly on the body of the log entry.

Note: Use this approach with caution, as it may result in errors if the body is not a raw string.

Masking/Scrubbing Logs

Masking or scrubbing logs involves obscuring sensitive or personally identifiable information (PII) found within log entries. This practice safeguards privacy and maintains compliance with data protection regulations. To mask any sensitive data present in the logs, define masking rules within the masking section of the respective log input.

Masking Syntax:

masking:   
  - text: "" 
    placeholder: "" 
  - text: "" 
    placeholder: ""
  • text: Regular expression for the text that needs to be masked.
  • placeholder: String used to replace the matched text.
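
For example, a hypothetical masking block that redacts e-mail addresses and IPv4 addresses could look like this:

masking: 
  - text: "[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}" 
    placeholder: "<redacted-email>" 
  - text: "\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}" 
    placeholder: "<redacted-ip>" 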

Multiple Regexes for Single Source

The Multiple Regexes for Single Source refers to the use of multiple regular expressions (regexes) to parse and extract information from log files. In cases where a log file contains entries with multiple formats, you can utilize the following configuration:

inputs: 
  source_name: 
    type: file 
    include: 
      - /var/log/file1.log 
      - /var/log/file2.log 
    multiline: 
      line_start_pattern: "" 
    parser_type: "regex" 
    multi_regex: # specify as many regex expressions as required (only works when parser_type is regex) 
      - regex: '^(?P<timestamp>\d{4}-\d{2}-\d{2}) (?P<level>[A-Z]*) (?P<message>.*)$' 
        timestamp: 
          layout_type: strptime 
          layout: "%b %e %H:%M:%S" 
        severity_from: "body.lvl" 
        severity_priority_order: 
          ["error", "fatal", "warn", "info", "debug", "trace"] 
        severity_mapping: 
          warn: ["#"] 
          info: ["*"] 
          debug: [] 
          trace: [] 
          error: [] 
          fatal: [] 
        filters: 
          - attribute_type: "body" 
            key: "" 
            include: "" 
          - key: "" 
            exclude: "" 
        masking: 
          - text: "" 
            placeholder: "" 
        attributes: [] 
        resource_attributes: [] 
        custom_formatting: |- 
          [ 
            { 
              "type": "copy", 
              "from": "body.message", 
              "to": "body.duplicate" 
            } 
          ] 
      - regex: '^(?P<timestamp>\d{4}-\d{2}-\d{2}) (?P<message>.*)$'

Points to remember: 

  • Ensure that the multiline line_start_pattern or line_end_pattern is configured to match all the regex patterns defined for the source, because log line splitting occurs before regex matching.
  • All the specified regex patterns are compared in the order in which they are defined, and the first one that matches is considered, with all subsequent patterns being ignored.

Additional File Type Settings

Fingerprint Size

The parameter fingerprint_size determines the number of bytes used from the beginning of a file to uniquely identify it across polls. Since this is the only criterion for identifying file uniqueness, it is essential to adjust this value if the initial 1000 bytes of multiple files are identical.

inputs: 
  source_name: 
    type: "file" 
    include: 
      - "/var/log/application_name/log_name.log" 
    ######################################################################################################### 
    # Fingerprint Size 
    ######################################################################################################### 
    fingerprint_size: 1000 
    ######################################################################################################### 

Poll and Flush intervals

Poll and flush intervals dictate how frequently data is collected and transmitted in systems.

  • Poll Interval: This refers to the frequency at which a system retrieves or collects data from a source.
  • Flush Interval: This determines how often collected data is transmitted or flushed to its destination.

For example,

  • poll_interval: It is the duration to check log files for new changes.
  • force_flush_period: The period that must elapse since the last read from a file before the currently buffered log is forwarded to the pipeline for processing. A setting of 0 disables forced flushing, which is usually not necessary.
inputs: 
  source_name: 
    type: "file" 
    include: 
      - "/var/log/application_name/log_name.log" 
    ######################################################################################################### 
    # Poll and Flush Settings  
    ######################################################################################################### 
    poll_interval: 200ms                # how often log files are checked for new changes 
    force_flush_period: 500ms           # flush buffered data after this period of inactivity; 0 disables forced flushing 
    ######################################################################################################### 

Retry on Failure

The Retry on Failure refers to the capability of log collection systems to automatically retry sending log data to a central repository or processing pipeline when a failure occurs during the initial attempt.

inputs: 
  source_name: 
    type: "file" 
    include: 
      - "/var/log/application_name/log_name.log" 
    ######################################################################################################### 
    # Retry On failure Settings  
    ######################################################################################################### 
    retry_on_failure: 
      enabled: false                   # retrying is disabled by default 
      initial_interval: 1s             # wait time before the first retry 
      max_interval: 30s                # upper bound on the retry backoff interval 
      max_elapsed_time: 5m             # give up retrying after this total duration 
    ######################################################################################################### 

Max Concurrent Files

When processing logs, the agent restricts the maximum number of files it can open at any given point in time. This setting determines that value on a per-source basis, with a default of 5 files. It is advisable to proceed with caution when considering increasing this number.

inputs: 
  source_name: 
    type: "file" 
    include: 
      - "/var/log/application_name/log_name.log" 
    ######################################################################################################### 
    # Max Concurrent Files Settings 
    ######################################################################################################### 
    max_concurrent_files: 5 
    ######################################################################################################### 

Refresh Interval

This configuration parameter specifies the interval at which the list of files is updated for the specified paths to monitor in the include configuration.

inputs: 
  source_name: 
    type: "file" 
    include: 
      - "/var/log/application_name/log_name.log" 
    ######################################################################################################### 
    # Refresh Interval 
    #########################################################################################################             
    refresh_interval: 60s   
    ######################################################################################################### 

Ordering Criteria

The agent selectively collects a limited number of files based on their modified time, adhering to a predetermined upper limit.

inputs: 
  source_name: 
    type: "file" 
    include: 
      - "/var/log/application_name/log_name.log" 
    ######################################################################################################### 
    # Ordering Criteria 
    ######################################################################################################### 
    ordering_criteria: 
      top_n: 5 
      max_time: 2h 
    ######################################################################################################### 


  For all files with type: file, it defaults to:

ordering_criteria: 
  top_n: 5 
  max_time: 2h 

In the case of Kubernetes and Docker, these limits are set to:

ordering_criteria: 
  top_n: 500 
  max_time: 2h 


  To understand the function of these settings, let us examine the following example:
  Suppose an application generates logs in the directory /var/log/application/*.log. The following configuration collects the logs from that path:

inputs: 
  application: 
    type: "file" 
    include: 
      - "/var/log/application/*.log 

In this scenario, let us assume the directory contains the following files:
-rw-r--r--  1 root  wheel  0 Apr 14 16:35 file_1.log 
-rw-r--r--  1 root  wheel  0 Apr 14 16:45 file_2.log 
-rw-r--r--  1 root  wheel  0 Apr 14 16:55 file_3.log 
-rw-r--r--  1 root  wheel  0 Apr 14 17:05 file_4.log 
-rw-r--r--  1 root  wheel  0 Apr 14 17:15 file_5.log 
-rw-r--r--  1 root  wheel  0 Apr 14 17:25 file_6.log 
-rw-r--r--  1 root  wheel  0 Apr 14 17:35 file_7.log 
-rw-r--r--  1 root  wheel  0 Apr 14 18:00 file_8.log 
-rw-r--r--  1 root  wheel  0 Apr 14 18:30 file_9.log 
-rw-r--r--  1 root  wheel  0 Apr 14 19:36 file_10.log 

Case 1: If the present time is 20:00
The agent will only watch the following files: 

  • file_10.log
  • file_9.log
  • file_8.log
    This is because max_time is set to 2 hours by default, and any file with a modified time older than 2 hours is ignored.

Case 2: If the present time is 19:40
The agent will only watch the following files:

  • file_10.log
  • file_9.log
  • file_8.log
  • file_7.log
  • file_6.log
    Here, the remaining files that are still within the 2-hour window are ignored because the default top_n limit of 5 has been reached.
    These defaults ensure that the agent does not watch historical files that never get updated and are kept only for bookkeeping. Setting these values to a high number is usually not advised, since it comes with a performance penalty.