Closed
Description
What did you do?
Tried to start Prometheus.
What did you expect to see?
Prometheus up & running, web interface showing up.
What did you see instead? Under which circumstances?
Prometheus runs out of RAM during the "WAL segment loaded" phase of startup.
Environment
Debian 9
- System information:
Linux 4.9.0-11-amd64 x86_64
- Prometheus version:
prometheus, version 2.16.0 (branch: HEAD, revision: b90be6f32a33c03163d700e1452b54454ddce0ec)
build user: root@7ea0ae865f12
build date: 20200213-23:50:02
go version: go1.13.8
- Prometheus configuration file:
global:
  evaluation_interval: 60s
  scrape_interval: 60s
...
...
...
- Logs:
This is what happens during startup, after roughly 10 minutes:
... prometheus[39101]: level=info ts=2020-03-05T14:02:26.811Z caller=head.go:625 component=tsdb msg="WAL segment loaded" segment=41869 maxSegment=41871
... prometheus[39101]: level=info ts=2020-03-05T14:02:26.812Z caller=head.go:625 component=tsdb msg="WAL segment loaded" segment=41870 maxSegment=41871
... prometheus[39101]: level=info ts=2020-03-05T14:02:26.812Z caller=head.go:625 component=tsdb msg="WAL segment loaded" segment=41871 maxSegment=41871
... prometheus[39101]: fatal error: runtime: out of memory
... prometheus[39101]: runtime stack:
... prometheus[39101]: runtime.throw(0x253885d, 0x16)
... prometheus[39101]: /usr/local/go/src/runtime/panic.go:774 +0x72
... prometheus[39101]: runtime.sysMap(0xce78000000, 0x14000000, 0x3f5bc78)
... prometheus[39101]: /usr/local/go/src/runtime/mem_linux.go:169 +0xc5
... prometheus[39101]: runtime.(*mheap).sysAlloc(0x3f432c0, 0x11de6000, 0xc000, 0x4373e7)
... prometheus[39101]: /usr/local/go/src/runtime/malloc.go:701 +0x1cd
... prometheus[39101]: runtime.(*mheap).grow(0x3f432c0, 0x8ef3, 0xffffffff)
... prometheus[39101]: /usr/local/go/src/runtime/mheap.go:1255 +0xa3
... prometheus[39101]: runtime.(*mheap).allocSpanLocked(0x3f432c0, 0x8ef3, 0x3f5bc88, 0x20339d00000000)
... prometheus[39101]: /usr/local/go/src/runtime/mheap.go:1170 +0x266
... prometheus[39101]: runtime.(*mheap).alloc_m(0x3f432c0, 0x8ef3, 0x101, 0x7f5861cc3fff)
... prometheus[39101]: /usr/local/go/src/runtime/mheap.go:1022 +0xc2
... prometheus[39101]: runtime.(*mheap).alloc.func1()
... prometheus[39101]: /usr/local/go/src/runtime/mheap.go:1093 +0x4c
... prometheus[39101]: runtime.(*mheap).alloc(0x3f432c0, 0x8ef3, 0x7f5861010101, 0x7f5861d11008)
... prometheus[39101]: /usr/local/go/src/runtime/mheap.go:1092 +0x8a
... prometheus[39101]: runtime.largeAlloc(0x11de5ec0, 0x450101, 0x7f5861d11008)
... prometheus[39101]: /usr/local/go/src/runtime/malloc.go:1138 +0x97
... prometheus[39101]: runtime.mallocgc.func1()
... prometheus[39101]: /usr/local/go/src/runtime/malloc.go:1033 +0x46
... prometheus[39101]: runtime.systemstack(0x0)
... prometheus[39101]: /usr/local/go/src/runtime/asm_amd64.s:370 +0x66
... prometheus[39101]: runtime.mstart()
... prometheus[39101]: /usr/local/go/src/runtime/proc.go:1146
... prometheus[39101]: goroutine 225 [running]:
... prometheus[39101]: runtime.systemstack_switch()
... prometheus[39101]: /usr/local/go/src/runtime/asm_amd64.s:330 fp=0xc0022234c0 sp=0xc0022234b8 pc=0x45d180
... prometheus[39101]: runtime.mallocgc(0x11de5ec0, 0x1fd78e0, 0x5949d401, 0xc0003631d0)
... prometheus[39101]: /usr/local/go/src/runtime/malloc.go:1032 +0x895 fp=0xc002223560 sp=0xc0022234c0 pc=0x40c755
... prometheus[39101]: runtime.makeslice(0x1fd78e0, 0x0, 0x23bcbd8, 0xd)
... prometheus[39101]: /usr/local/go/src/runtime/slice.go:49 +0x6c fp=0xc002223590 sp=0xc002223560 pc=0x445bac
... prometheus[39101]: github.com/prometheus/prometheus/tsdb/index.(*MemPostings).Delete(0xc000f29a70, 0xce71eef470)
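The hexadecimal arguments in the runtime frames above can be decoded to see how large the failing allocation was. This is a minimal sketch, assuming that the first argument of `runtime.largeAlloc` and the second argument of `runtime.sysMap` are byte counts (which I believe matches the Go 1.13 runtime, but traceback arguments are raw machine words, so treat the numbers as approximate). The helper name `frame_args` is made up for illustration.

```python
import re

# Two frames copied verbatim from the traceback above.
TRACE = """\
runtime.sysMap(0xce78000000, 0x14000000, 0x3f5bc78)
runtime.largeAlloc(0x11de5ec0, 0x450101, 0x7f5861d11008)
"""

MIB = 1024 * 1024


def frame_args(trace, func):
    """Return the hex arguments of the first frame that calls `func`, as ints."""
    match = re.search(re.escape(func) + r"\(([^)]*)\)", trace)
    if match is None:
        raise ValueError(f"no frame found for {func}")
    return [int(arg, 16) for arg in match.group(1).split(", ")]


# Assumption: largeAlloc's first argument is the requested size in bytes.
requested = frame_args(TRACE, "runtime.largeAlloc")[0]
# Assumption: sysMap's second argument is the length of the mapping in bytes.
mapped = frame_args(TRACE, "runtime.sysMap")[1]

print(f"largeAlloc requested {requested / MIB:.1f} MiB")
print(f"sysMap tried to map  {mapped / MIB:.1f} MiB")
```

Under those assumptions the request works out to roughly 286 MiB and the mapping to 320 MiB. That is a single slice allocation made from `MemPostings.Delete`; it is the allocation that tipped the process over, not necessarily the one that consumed most of the memory.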
This is what the systemd service looks like:
[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network-online.target
[Service]
User=prometheus
Restart=always
RestartSec=5s
LimitNOFILE=infinity
ExecStart=/usr/local/bin/prometheus \
--config.file=/etc/prometheus/prometheus.yml \
--storage.tsdb.path=/var/lib/prometheus/data \
--web.listen-address="127.0.0.1:1234" \
--web.external-url="https://example.com" \
--web.enable-admin-api \
--storage.tsdb.retention.time=30d
ExecReload=/bin/kill -HUP $MAINPID
[Install]
WantedBy=multi-user.target
Here is the RAM usage of the server over the past hour. Note that RAM fills up until the server runs out, the service gets killed, and then it is restarted:
Please advise: how do I troubleshoot this issue further?
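One possible first step, offered as a sketch rather than an official procedure, is to measure how much WAL data has to be replayed at startup. The code below assumes that the WAL lives in the `wal` subdirectory of the `--storage.tsdb.path` from the unit file above and that segment files have eight-digit numeric names. `wal_usage` and `WAL_DIR` are hypothetical names, not part of Prometheus.

```python
import os
import re

# Assumption: WAL segment files are named with eight digits, e.g. 00041871.
SEGMENT_RE = re.compile(r"^\d{8}$")

# Assumption: <--storage.tsdb.path>/wal, taken from the unit file above.
WAL_DIR = "/var/lib/prometheus/data/wal"


def wal_usage(wal_dir):
    """Return (segment_count, total_bytes) for segment files directly in wal_dir."""
    count = 0
    total = 0
    with os.scandir(wal_dir) as entries:
        for entry in entries:
            # Skip checkpoint directories and anything that is not a segment.
            if entry.is_file() and SEGMENT_RE.match(entry.name):
                count += 1
                total += entry.stat().st_size
    return count, total


if os.path.isdir(WAL_DIR):
    segments, size = wal_usage(WAL_DIR)
    print(f"{segments} segments, {size / 2**30:.2f} GiB")
else:
    print(f"{WAL_DIR} not found; adjust WAL_DIR")
```

Run it as a user that can read the data directory. An unusually large total would suggest that replay alone needs a lot of memory, although it does not by itself explain the failure.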