
initial_read_offset is missing #16642

@e-gleba

Description


Relevant telegraf.conf

# https://github.com/influxdata/telegraf/tree/master/plugins/parsers/csv
# https://github.com/influxdata/telegraf/tree/master/plugins/inputs/tail
[[inputs.tail]]
## File names or a pattern to tail.
## These accept standard unix glob matching rules, but with the addition of
## ** as a "super asterisk". ie:
##   "/var/log/**.log"  -> recursively find all .log files in /var/log
##   "/var/log/*/*.log" -> find all .log files with a parent dir in /var/log
##   "/var/log/apache.log" -> just tail the apache log file
##   "/var/log/log[!1-2]*  -> tail files without 1-2
##   "/var/log/log[^1-2]*  -> identical behavior as above
## See https://github.com/gobwas/glob for more examples
##
files = ["/data/sun/**.csv"]

## Offset to start reading at
## The following methods are available:
##   beginning          -- start reading from the beginning of the file ignoring any persisted offset
##   end                -- start reading from the end of the file ignoring any persisted offset
##   saved-or-beginning -- use the persisted offset of the file or, if no offset persisted, start from the beginning of the file
##   saved-or-end       -- use the persisted offset of the file or, if no offset persisted, start from the end of the file
initial_read_offset = "beginning"

## Whether file is a named pipe
# pipe = false

## Method used to watch for file updates.  Can be either "inotify" or "poll".
## inotify is supported on linux, *bsd, and macOS, while Windows requires
## using poll. Poll checks for changes every 250ms.
watch_method = "inotify"

## Maximum lines of the file to process that have not yet been written by the
## output.  For best throughput set based on the number of metrics on each
## line and the size of the output's metric_batch_size.
# max_undelivered_lines = 1000

## Character encoding to use when interpreting the file contents.  Invalid
## characters are replaced using the unicode replacement character.  When set
## to the empty string the data is not decoded to text.
##   ex: character_encoding = "utf-8"
##       character_encoding = "utf-16le"
##       character_encoding = "utf-16be"
##       character_encoding = ""
# character_encoding = ""

## Data format to consume.
## Each data format has its own unique set of configuration options, read
## more about them here:
##   https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "csv"

## Indicates how many rows to treat as a header. By default, the parser assumes
## there is no header and will parse the first row as data. If set to anything more
## than 1, column names will be concatenated with the name listed in the next header row.
## If `csv_column_names` is specified, the column names in header will be overridden.
csv_header_row_count = 1

## For assigning custom names to columns
## If this is specified, all columns should have a name
## Unnamed columns will be ignored by the parser.
## If `csv_header_row_count` is set to 0, this config must be used
csv_column_names = []

## For assigning explicit data types to columns.
## Supported types: "int", "float", "bool", "string".
## Specify types in order by column (e.g. `["string", "int", "float"]`)
## If this is not specified, type conversion will be done on the types above.
csv_column_types = []

## Indicates the number of rows to skip before looking for metadata and header information.
csv_skip_rows = 0

## Indicates the number of rows to parse as metadata before looking for header information.
## By default, the parser assumes there are no metadata rows to parse.
## If set, the parser would use the provided separators in the csv_metadata_separators to look for metadata.
## Please note that by default, the (key, value) pairs will be added as tags.
## If fields are required, use the converter processor.
csv_metadata_rows = 0

## A list of metadata separators. If csv_metadata_rows is set,
## csv_metadata_separators must contain at least one separator.
## Please note that separators are case sensitive and the sequence of the separators are respected.
csv_metadata_separators = [":", "="]

## A set of metadata trim characters.
## If csv_metadata_trim_set is not set, no trimming is performed.
## Please note that the trim cutset is case sensitive.
csv_metadata_trim_set = ""

## Indicates the number of columns to skip before looking for data to parse.
## These columns will be skipped in the header as well.
csv_skip_columns = 0

## The separator between csv fields
## By default, the parser assumes a comma (",")
## Please note that if you use invalid delimiters (e.g. "\u0000"), commas
## will be changed to "\ufffd", the invalid delimiters changed to a comma
## during parsing, and afterwards the invalid characters and commas are
## returned to their original values.
csv_delimiter = ","

## The character reserved for marking a row as a comment row
## Commented rows are skipped and not parsed
csv_comment = ""

## If set to true, the parser will remove leading whitespace from fields
## By default, this is false
csv_trim_space = false

## Columns listed here will be added as tags. Any other columns
## will be added as fields.
csv_tag_columns = []

## Set to true to let the column tags overwrite the metadata and default tags.
csv_tag_overwrite = false

## The column to extract the name of the metric from. Will not be
## included as field in metric.
csv_measurement_column = ""

## The column to extract time information for the metric
## `csv_timestamp_format` must be specified if this is used.
## Will not be included as field in metric.
csv_timestamp_column = "time"

## The format of time data extracted from `csv_timestamp_column`
## this must be specified if `csv_timestamp_column` is specified
csv_timestamp_format = "2006-01-02"

## The timezone of time data extracted from `csv_timestamp_column`
## in case of there is no timezone information.
## It follows the  IANA Time Zone database.
csv_timezone = ""

## Indicates values to skip, such as an empty string value "".
## The field will be skipped entirely where it matches any values inserted here.
csv_skip_values = []

## If set to true, the parser will skip csv lines that cannot be parsed.
## By default, this is false
csv_skip_errors = false

## Reset the parser on given conditions.
## This option can be used to reset the parser's state e.g. when always reading a
## full CSV structure including header etc. Available modes are
##    "none"   -- do not reset the parser (default)
##    "always" -- reset the parser with each call (ignored in line-wise parsing)
##                Helpful when e.g. reading whole files in each gather-cycle.
# csv_reset_mode = "none"

## Set the tag that will contain the path of the tailed file. If you don't want this tag, set it to an empty string.
# path_tag = "path"

## Filters to apply to files before generating metrics
## "ansi_color" removes ANSI colors
# filters = []

## multiline parser/codec
## https://www.elastic.co/guide/en/logstash/2.4/plugins-filters-multiline.html
#[inputs.tail.multiline]
## The pattern should be a regexp which matches what you believe to be an indicator that the field is part of an event consisting of multiple lines of log data.
#pattern = "^\s"

## The field's value must be previous or next and indicates the relation to the
## multi-line event.
#match_which_line = "previous"

## The invert_match can be true or false (defaults to false).
## If true, a message not matching the pattern will constitute a match of the multiline filter and the what will be applied. (vice-versa is also true)
#invert_match = false

## The handling method for quoted text (defaults to 'ignore').
## The following methods are available:
##   ignore  -- do not consider quotation (default)
##   single-quotes -- consider text quoted by single quotes (')
##   double-quotes -- consider text quoted by double quotes (")
##   backticks     -- consider text quoted by backticks (`)
## When handling quotes, escaped quotes (e.g. \") are handled correctly.
#quotation = "ignore"

## The preserve_newline option can be true or false (defaults to false).
## If true, the newline character is preserved for multiline elements,
## this is useful to preserve message-structure e.g. for logging outputs.
#preserve_newline = false

## After the specified timeout, this plugin sends the multiline event even if
## no new pattern is found to start a new event. The default is 5s.
#timeout = 5s

# https://www.influxdata.com/blog/running-influxdb-2-0-and-telegraf-using-docker/
# Output Configuration for telegraf agent
[[outputs.influxdb_v2]]
## The URLs of the InfluxDB cluster nodes.
##
## Multiple URLs can be specified for a single cluster, only ONE of the
## urls will be written to each interval.
## urls exp: http://127.0.0.1:8086
urls = ["http://influxdb:8086"]

## Token for authentication.
token = "$DOCKER_INFLUXDB_INIT_ADMIN_TOKEN"

## Organization is the name of the organization you wish to write to; must exist.
organization = "$DOCKER_INFLUXDB_INIT_ORG"

## Destination bucket to write into.
bucket = "$DOCKER_INFLUXDB_INIT_BUCKET"

insecure_skip_verify = true

Logs from Telegraf

Attaching to influxdb, telegraf
telegraf  | 2025-03-15T14:35:12Z I! Using config file: /etc/telegraf/telegraf.conf
telegraf  | 2025-03-15T14:35:12Z E! error loading config file /etc/telegraf/telegraf.conf: plugin inputs.tail: line 3: configuration specified the fields ["initial_read_offset"], but they weren't used
telegraf exited with code 1
influxdb  | 2025-03-15T14:35:12.	info	found existing boltdb file, skipping setup wrapper	{"system": "docker", "bolt_path": "/var/lib/influxdb2/influxd.bolt"}
influxdb  | 2025-03-15T14:35:12.	info	found existing boltdb file, skipping setup wrapper	{"system": "docker", "bolt_path": "/var/lib/influxdb2/influxd.bolt"}
telegraf  | 2025-03-15T14:35:12Z I! Using config file: /etc/telegraf/telegraf.conf
telegraf  | 2025-03-15T14:35:12Z E! error loading config file /etc/telegraf/telegraf.conf: plugin inputs.tail: line 3: configuration specified the fields ["initial_read_offset"], but they weren't used

System info

telegraf 1.25 alpine, Linux fedora 6.13.6-200.fc41.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Mar 7 21:33:48 UTC 2025 x86_64 GNU/Linux

Docker

networks:
  default:
    driver: bridge

services:
  # https://www.influxdata.com/blog/running-influxdb-2-0-and-telegraf-using-docker/
  influxdb:
    image: influxdb:2.6-alpine
    container_name: influxdb
    restart: unless-stopped
    ports:
      - "8086:8086"
    volumes:
      - ./influxdbv2:/etc/influxdb2:Z
    env_file:
      - path: ./influxv2.env
        required: true

  telegraf:
    image: telegraf:1.25-alpine
    container_name: telegraf
    restart: unless-stopped
    volumes:
      - ./telegraf.toml:/etc/telegraf/telegraf.conf:Z
      - ./../data/sun:/data/sun:Z
    depends_on:
      - influxdb
    env_file:
      - path: ./influxv2.env
        required: true

Steps to reproduce

  1. Start docker compose
  2. Use the config
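The parse failure can also be reproduced without the full compose stack. This is a minimal sketch (the config path /tmp/tail-offset.conf and the CSV file name are illustrative; it assumes docker is available and uses the same image tag as the compose file): write a one-plugin config containing just the rejected field, then ask the pinned telegraf build to parse it.

```shell
# Minimal config containing only the field the loader rejects.
cat > /tmp/tail-offset.conf <<'EOF'
[[inputs.tail]]
  files = ["/tmp/example.csv"]
  initial_read_offset = "beginning"
  data_format = "csv"
  csv_header_row_count = 1

[[outputs.file]]
  files = ["stdout"]
EOF

# --test makes telegraf parse the config and exit; a build whose tail plugin
# does not know initial_read_offset aborts with the same "configuration
# specified the fields ... but they weren't used" error as in the logs above.
# The docker step is skipped when docker is unavailable.
if command -v docker >/dev/null 2>&1; then
  docker run --rm -v /tmp/tail-offset.conf:/etc/telegraf/telegraf.conf:ro \
    telegraf:1.25-alpine --config /etc/telegraf/telegraf.conf --test \
    || echo "config rejected, as in the report"
fi
```

If the one-plugin config is accepted, the problem lies elsewhere in the full telegraf.conf rather than in this field.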

Expected behavior

The config is parsed and Telegraf starts tailing the CSV files.

Actual behavior

Telegraf fails to load the config with the error shown above and exits with code 1.

Additional info

Here is my repo with the full setup: https://github.com/geugenm/satellite-weather-impact-analysis
