Closed
Labels
area/windows (related to Windows plugins: win_eventlog, win_perf_counters, win_services), bug (unexpected problem or unintended behavior), platform/windows
Description
Relevant telegraf.conf
# Configuration for telegraf agent
[agent]
## Default data collection interval for all inputs
interval = "10s"
## Rounds collection interval to 'interval'
## ie, if interval="10s" then always collect on :00, :10, :20, etc.
round_interval = true
## Telegraf will send metrics to outputs in batches of at most
## metric_batch_size metrics.
## This controls the size of writes that Telegraf sends to output plugins.
metric_batch_size = 1000
## Maximum number of unwritten metrics per output. Increasing this value
## allows for longer periods of output downtime without dropping metrics at the
## cost of higher maximum memory usage.
metric_buffer_limit = 10000
## Collection jitter is used to jitter the collection by a random amount.
## Each plugin will sleep for a random time within jitter before collecting.
## This can be used to avoid many plugins querying things like sysfs at the
## same time, which can have a measurable effect on the system.
collection_jitter = "0s"
## Default flushing interval for all outputs. Maximum flush_interval will be
## flush_interval + flush_jitter
flush_interval = "10s"
## Jitter the flush interval by a random amount. This is primarily to avoid
## large write spikes for users running a large number of telegraf instances.
## ie, a jitter of 5s and interval 10s means flushes will happen every 10-15s
flush_jitter = "0s"
## By default or when set to "0s", precision will be set to the same
## timestamp order as the collection interval, with the maximum being 1s.
## ie, when interval = "10s", precision will be "1s"
## when interval = "250ms", precision will be "1ms"
## Precision will NOT be used for service inputs. It is up to each individual
## service input to set the timestamp at the appropriate precision.
## Valid time units are "ns", "us" (or "µs"), "ms", "s".
precision = ""
## Log at debug level.
# debug = false
## Log only error level messages.
# quiet = false
## Log target controls the destination for logs and can be one of "file",
## "stderr" or, on Windows, "eventlog". When set to "file", the output file
## is determined by the "logfile" setting.
# logtarget = "file"
## Name of the file to be logged to when using the "file" logtarget. If set to
## the empty string then logs are written to stderr.
# logfile = ""
## The logfile will be rotated after the time interval specified. When set
## to 0 no time based rotation is performed. Logs are rotated only when
## written to, if there is no log activity rotation may be delayed.
# logfile_rotation_interval = "0d"
## The logfile will be rotated when it becomes larger than the specified
## size. When set to 0 no size based rotation is performed.
# logfile_rotation_max_size = "0MB"
## Maximum number of rotated archives to keep, any older logs are deleted.
## If set to -1, no archives are removed.
# logfile_rotation_max_archives = 5
## Pick a timezone to use when logging or type 'local' for local time.
## Example: America/Chicago
# log_with_timezone = ""
## Override default hostname, if empty use os.Hostname()
hostname = ""
## If set to true, do not set the "host" tag in the telegraf agent.
omit_hostname = false
[[outputs.influxdb_v2]]
## The URLs of the InfluxDB cluster nodes.
##
## Multiple URLs can be specified for a single cluster, only ONE of the
## urls will be written to each interval.
## ex: urls = ["https://us-west-2-1.aws.cloud2.influxdata.com"]
urls = ["https://our.url.com"]
## Token for authentication.
token = "OUR_TOKEN"
## Organization is the name of the organization you wish to write to; must exist.
organization = "OUR_ORG"
## Destination bucket to write into.
bucket = "OUR-BUCKET"
## The value of this tag will be used to determine the bucket. If this
## tag is not set the 'bucket' option is used as the default.
# bucket_tag = ""
## If true, the bucket tag will not be added to the metric.
# exclude_bucket_tag = false
## Timeout for HTTP messages.
# timeout = "5s"
## Additional HTTP headers
# http_headers = {"X-Special-Header" = "Special-Value"}
## HTTP Proxy override, if unset values the standard proxy environment
## variables are consulted to determine which proxy, if any, should be used.
# http_proxy = "http://corporate.proxy:3128"
## HTTP User-Agent
# user_agent = "telegraf"
## Content-Encoding for write request body, can be set to "gzip" to
## compress body or "identity" to apply no encoding.
# content_encoding = "gzip"
## Enable or disable uint support for writing uints influxdb 2.0.
# influx_uint_support = false
## Optional TLS Config for use on HTTP connections.
# tls_ca = "/etc/telegraf/ca.pem"
# tls_cert = "/etc/telegraf/cert.pem"
# tls_key = "/etc/telegraf/key.pem"
## Use TLS but skip chain & host verification
# insecure_skip_verify = false
[[inputs.cpu]]
## Whether to report per-cpu stats or not
percpu = true
## Whether to report total system cpu stats or not
totalcpu = true
## If true, collect raw CPU time metrics
collect_cpu_time = false
## If true, compute and report the sum of all non-idle CPU states
report_active = false
[[inputs.disk]]
## By default stats will be gathered for all mount points.
## Set mount_points will restrict the stats to only the specified mount points.
# mount_points = ["/"]
## Ignore mount points by filesystem type.
ignore_fs = ["tmpfs", "devtmpfs", "devfs", "iso9660", "overlay", "aufs", "squashfs"]
[[inputs.diskio]]
## By default, telegraf will gather stats for all devices including
## disk partitions.
## Setting devices will restrict the stats to the specified devices.
# devices = ["sda", "sdb", "vd*"]
## Uncomment the following line if you need disk serial numbers.
# skip_serial_number = false
#
## On systems which support it, device metadata can be added in the form of
## tags.
## Currently only Linux is supported via udev properties. You can view
## available properties for a device by running:
## 'udevadm info -q property -n /dev/sda'
## Note: Most, but not all, udev properties can be accessed this way. Properties
## that are currently inaccessible include DEVTYPE, DEVNAME, and DEVPATH.
# device_tags = ["ID_FS_TYPE", "ID_FS_USAGE"]
#
## Using the same metadata source as device_tags, you can also customize the
## name of the device via templates.
## The 'name_templates' parameter is a list of templates to try and apply to
## the device. The template may contain variables in the form of '$PROPERTY' or
## '${PROPERTY}'. The first template which does not contain any variables not
## present for the device is used as the device name tag.
## The typical use case is for LVM volumes, to get the VG/LV name instead of
## the near-meaningless DM-0 name.
# name_templates = ["$ID_FS_LABEL","$DM_VG_NAME/$DM_LV_NAME"]
[[inputs.mem]]
# no configuration
[[inputs.net]]
## By default, telegraf gathers stats from any up interface (excluding loopback)
## Setting interfaces will tell it to gather these explicit interfaces,
## regardless of status.
##
# interfaces = ["eth0"]
##
## On linux systems telegraf also collects protocol stats.
## Setting ignore_protocol_stats to true will skip reporting of protocol metrics.
##
# ignore_protocol_stats = false
##
[[inputs.processes]]
# no configuration
[[inputs.swap]]
# no configuration
[[inputs.system]]
## Uncomment to remove deprecated metrics.
# fielddrop = ["uptime_format"]
# Monitors internet speed in the network
[[inputs.internet_speed]]
## Sets if runs file download test
## Default: false
enable_file_download = true
[[inputs.win_eventlog]]
## Telegraf should have Administrator permissions to subscribe for some Windows Events channels
## (System log, for example)
## LCID (Locale ID) for event rendering
## 1033 to force English language
## 0 to use default Windows locale
# locale = 0
## Name of eventlog, used only if xpath_query is empty
## Example: "Application"
# eventlog_name = ""
## xpath_query can be in defined short form like "Event/System[EventID=999]"
## or you can form a XML Query. Refer to the Consuming Events article:
## https://docs.microsoft.com/en-us/windows/win32/wes/consuming-events
## XML query is the recommended form, because it is most flexible
## You can create or debug XML Query by creating Custom View in Windows Event Viewer
## and then copying resulting XML here
xpath_query = '''
<QueryList>
<Query Id="0" Path="Security">
<Select Path="Security">*</Select>
<Suppress Path="Security">*[System[( (EventID >= 5152 and EventID <= 5158) or EventID=5379 or EventID=4672)]]</Suppress>
</Query>
<Query Id="1" Path="Application">
<Select Path="Application">*[System[(Level < 4)]]</Select>
</Query>
<Query Id="2" Path="Windows PowerShell">
<Select Path="Windows PowerShell">*[System[(Level < 4)]]</Select>
</Query>
<Query Id="3" Path="System">
<Select Path="System">*</Select>
</Query>
<Query Id="4" Path="Setup">
<Select Path="Setup">*</Select>
</Query>
</QueryList>
'''
## System field names:
## "Source", "EventID", "Version", "Level", "Task", "Opcode", "Keywords", "TimeCreated",
## "EventRecordID", "ActivityID", "RelatedActivityID", "ProcessID", "ThreadID", "ProcessName",
## "Channel", "Computer", "UserID", "UserName", "Message", "LevelText", "TaskText", "OpcodeText"
## In addition to System, Data fields can be unrolled from additional XML nodes in event.
## Human-readable representation of those nodes is formatted into event Message field,
## but XML is more machine-parsable
# Process UserData XML to fields, if this node exists in Event XML
process_userdata = true
# Process EventData XML to fields, if this node exists in Event XML
process_eventdata = true
## Separator character to use for unrolled XML Data field names
separator = "_"
## Get only first line of Message field. For most events first line is usually more than enough
only_first_line_of_message = true
## Parse timestamp from TimeCreated.SystemTime event field.
## Will default to current time of telegraf processing on parsing error or if set to false
timestamp_from_event = true
## Fields to include as tags. Globbing supported ("Level*" for both "Level" and "LevelText")
event_tags = ["Source", "EventID", "Level", "LevelText", "Task", "TaskText", "Opcode", "OpcodeText", "Keywords", "Channel", "Computer"]
## Default list of fields to send. All fields are sent by default. Globbing supported
event_fields = ["*"]
## Fields to exclude. Also applied to data fields. Globbing supported
exclude_fields = ["TimeCreated", "Binary", "Data_Address*"]
## Skip those tags or fields if their value is empty or equals to zero. Globbing supported
exclude_empty = ["*ActivityID", "UserID"]
Logs from Telegraf
We have Telegraf running as a service. The output below comes from running the command `C:\Apps\Telegraf\telegraf.exe --config C:\Apps\Telegraf\telegraf_windows.conf --test --debug`.
2022-05-19T14:50:54Z I! Starting Telegraf 1.22.4
2022-05-19T14:50:54Z I! Loaded inputs: cpu disk diskio internet_speed mem net processes swap system win_eventlog
2022-05-19T14:50:54Z I! Loaded aggregators:
2022-05-19T14:50:54Z I! Loaded processors:
2022-05-19T14:50:54Z W! Outputs are not used in testing mode!
2022-05-19T14:50:54Z I! Tags enabled: host=DESKTOP-5F6URBF
2022-05-19T14:50:54Z D! [agent] Initializing plugins
2022-05-19T14:50:54Z W! [inputs.processes] Current platform is not supported
2022-05-19T14:50:54Z D! [agent] Starting service inputs
> system,host=DESKTOP-5F6URBF load1=0,load15=0,load5=0,n_cpus=8i 1652971854000000000
> system,host=DESKTOP-5F6URBF uptime=7221i 1652971854000000000
> system,host=DESKTOP-5F6URBF uptime_format=" 2:00" 1652971854000000000
> diskio,host=DESKTOP-5F6URBF,name=C: io_time=0i,iops_in_progress=0i,merged_reads=0i,merged_writes=0i,read_bytes=5004852736i,read_time=209i,reads=308692i,weighted_io_time=0i,write_bytes=878959104i,write_time=17i,writes=36026i 1652971854000000000
> disk,device=C:,fstype=NTFS,host=DESKTOP-5F6URBF,mode=rw,path=\C: free=862828892160i,inodes_free=0i,inodes_total=0i,inodes_used=0i,total=999559262208i,used=136730370048i,used_percent=13.679065886095259 1652971854000000000
> mem,host=DESKTOP-5F6URBF available=30120779776i,available_percent=87.88167885335757,total=34274242560i,used=4153462784i,used_percent=12.118321146642431 1652971854000000000
> swap,host=DESKTOP-5F6URBF free=32598978560i,total=39374516224i,used=6775537664i,used_percent=17.20792612524874 1652971854000000000
> swap,host=DESKTOP-5F6URBF in=0i,out=0i 1652971854000000000
2022-05-19T14:50:54Z D! [inputs.win_eventlog] Subscription handle id:1
> net,host=DESKTOP-5F6URBF,interface=Wi-Fi bytes_recv=7662036718i,bytes_sent=3544926366i,drop_in=0i,drop_out=0i,err_in=0i,err_out=0i,packets_recv=5839293i,packets_sent=3894043i 1652971854000000000
> cpu,cpu=cpu0,host=DESKTOP-5F6URBF usage_guest=0,usage_guest_nice=0,usage_idle=96.7741935483871,usage_iowait=0,usage_irq=0,usage_nice=0,usage_softirq=0,usage_steal=0,usage_system=0,usage_user=3.225806451612903 1652971855000000000
> cpu,cpu=cpu1,host=DESKTOP-5F6URBF usage_guest=0,usage_guest_nice=0,usage_idle=96.875,usage_iowait=0,usage_irq=0,usage_nice=0,usage_softirq=0,usage_steal=0,usage_system=3.125,usage_user=0 1652971855000000000
> cpu,cpu=cpu2,host=DESKTOP-5F6URBF usage_guest=0,usage_guest_nice=0,usage_idle=93.75,usage_iowait=0,usage_irq=0,usage_nice=0,usage_softirq=0,usage_steal=0,usage_system=3.125,usage_user=3.125 1652971855000000000
> cpu,cpu=cpu3,host=DESKTOP-5F6URBF usage_guest=0,usage_guest_nice=0,usage_idle=100,usage_iowait=0,usage_irq=0,usage_nice=0,usage_softirq=0,usage_steal=0,usage_system=0,usage_user=0 1652971855000000000
> cpu,cpu=cpu4,host=DESKTOP-5F6URBF usage_guest=0,usage_guest_nice=0,usage_idle=100,usage_iowait=0,usage_irq=0,usage_nice=0,usage_softirq=0,usage_steal=0,usage_system=0,usage_user=0 1652971855000000000
> cpu,cpu=cpu5,host=DESKTOP-5F6URBF usage_guest=0,usage_guest_nice=0,usage_idle=80,usage_iowait=0,usage_irq=0,usage_nice=0,usage_softirq=0,usage_steal=0,usage_system=16.666666666666668,usage_user=3.3333333333333335 1652971855000000000
> cpu,cpu=cpu6,host=DESKTOP-5F6URBF usage_guest=0,usage_guest_nice=0,usage_idle=93.75,usage_iowait=0,usage_irq=0,usage_nice=0,usage_softirq=0,usage_steal=0,usage_system=6.25,usage_user=0 1652971855000000000
> cpu,cpu=cpu7,host=DESKTOP-5F6URBF usage_guest=0,usage_guest_nice=0,usage_idle=100,usage_iowait=0,usage_irq=0,usage_nice=0,usage_softirq=0,usage_steal=0,usage_system=0,usage_user=0 1652971855000000000
> cpu,cpu=cpu-total,host=DESKTOP-5F6URBF usage_guest=0,usage_guest_nice=0,usage_idle=95.25691699604744,usage_iowait=0,usage_irq=0,usage_nice=0,usage_softirq=0,usage_steal=0,usage_system=3.5573122529644268,usage_user=1.1857707509881423 1652971855000000000
2022-05-19T14:50:54Z D! [inputs.internet_speed] Found server: [38690] 4.79km
Montréal, QC (Canada) by Altima Telecom 10G
2022-05-19T14:50:54Z D! [inputs.internet_speed] Starting Speed Test
2022-05-19T14:50:54Z D! [inputs.internet_speed] Running Ping...
2022-05-19T14:50:54Z D! [inputs.internet_speed] Running Download...
2022-05-19T14:50:58Z D! [inputs.internet_speed] Running Upload...
2022-05-19T14:51:00Z D! [inputs.internet_speed] Test finished.
2022-05-19T14:51:00Z D! [agent] Stopping service inputs
2022-05-19T14:51:00Z D! [agent] Input channel closed
2022-05-19T14:51:00Z D! [agent] Stopped Successfully
> internet_speed,host=DESKTOP-5F6URBF download=29.105152093974716,latency=3.96315,upload=28.83633147074662 1652971860000000000
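The run above uses `--test`, which gathers once and exits, so it cannot show how often the speed test runs under the service. A minimal sketch for checking that: enable debug logging to a file using the agent options that are already present (commented out) in the config above, then count the `[inputs.internet_speed] Starting Speed Test` lines over a known period. These settings would go into the existing `[agent]` section; the log file path is a hypothetical example.

```toml
[agent]
  ## Log at debug level so per-plugin debug messages are written.
  debug = true
  ## Write logs to a file instead of stderr.
  logtarget = "file"
  ## Hypothetical path. Single quotes make this a TOML literal string,
  ## so the backslashes are not treated as escape characters.
  logfile = 'C:\Apps\Telegraf\telegraf.log'
```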
System info
Telegraf 1.22.4 / Windows 10
Docker
No response
Steps to reproduce
These are our internal install instructions. Maybe something is off? `telegraf_windows.conf` is the telegraf.conf above:
- First off, create a new directory in C:\Apps\Telegraf
- Download the telegraf_windows.conf base config and place it in that folder
- Download the latest telegraf Windows binary
- Extract the zip file and move the telegraf.exe file to your C:\Apps\Telegraf folder
- Open up a PowerShell terminal as admin by right-clicking on the icon and choosing "Open as Administrator"
- Test your config: C:\Apps\Telegraf\telegraf.exe --config C:\Apps\Telegraf\telegraf_windows.conf --test
- Install the service: C:\Apps\Telegraf\telegraf.exe --service install --service-auto-restart --config C:\Apps\Telegraf\telegraf_windows.conf
- Start the service: net start telegraf
Expected behavior
A few MBs of data usage each day.
Actual behavior
Many GBs of data usage each day.
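A rough back-of-the-envelope estimate, using only the `--test --debug` log above, suggests where volumes of this size could come from. It assumes that the `download` and `upload` fields are in Mbit/s, that the download phase lasted about 4 s (14:50:54 to 14:50:58) and the upload phase about 2 s (14:50:58 to 14:51:00), and that a new speed test starts on every 10s agent `interval`. The timestamps have one-second resolution, so this is an order-of-magnitude sketch and not a measurement:

$$
\begin{aligned}
\text{download per test} &\approx 29.1\ \text{Mbit/s} \times 4\ \text{s} \approx 116\ \text{Mbit} \approx 14.6\ \text{MB} \\
\text{upload per test} &\approx 28.8\ \text{Mbit/s} \times 2\ \text{s} \approx 58\ \text{Mbit} \approx 7.2\ \text{MB} \\
\text{tests per day} &= \frac{86400\ \text{s}}{10\ \text{s}} = 8640 \\
\text{traffic per day} &\lesssim 8640 \times (14.6 + 7.2)\ \text{MB} \approx 188\ \text{GB}
\end{aligned}
$$

Under the same assumptions, even one test per hour would move roughly 24 × 21.8 MB ≈ 0.5 GB per day, so any schedule that runs the test continuously would far exceed "a few MBs".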
Additional info
Running Telegraf without `inputs.internet_speed` does not result in high data usage. In Influx, it looks like the speed test is run every few seconds.
The resolution window for this is 5m and the Download/Upload points are 10s apart... But it seems like the tests are run constantly on the host?
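If the speed test really is started on every collection, one likely reason (an assumption, not confirmed in this report) is that `inputs.internet_speed` inherits the agent-wide `interval = "10s"`. Telegraf input plugins accept a per-plugin `interval` setting that overrides the agent default, so a sketch of a less aggressive configuration could look like this; the `60m` value is only an illustrative choice:

```toml
# Monitors internet speed in the network
[[inputs.internet_speed]]
  ## Per-plugin collection interval, overriding the agent's "10s".
  ## "60m" is an assumed example value, not a recommendation from the plugin docs.
  interval = "60m"
  ## Kept as in the config above.
  enable_file_download = true
```

All other inputs would keep collecting every 10s; only the speed test would be scheduled less often.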