This repository was archived by the owner on Sep 16, 2020. It is now read-only.
Add benchmark result to README.md #17
Closed
Description
FILEgrain benchmark
FILEgrain version: 969524a
amount of blobs
openjdk:8@sha256:5da842d59f76009fa27ffde06888ebd560c7ad17607d7cd1e52fc0757c9a45fb
$ ../du.sh
Pure blobs (excludes continuity): 704288157 [671.6615266799927MiB]
Tarred blobs (excludes continuity): 718243840 [684.970703125MiB]
Tarred + Gzipped blobs (excludes continuity): 273990124 [261.2973442077637MiB]
FILEgrain (gzipped): 273344520 [260.68164825439453MiB]
- Mount: 2 blobs, 5.416MiB
- then `sh`: 8 blobs, 7.31MiB
- then `java -version`: 30 blobs, 88.18MiB
- then `javac HelloWorld.java`: 50 blobs, 137.3MiB
kdeneon/all@sha256:e3e7f216a5f8f1fdcff4eab8807d7afcd291c050099ab3e8a8355b7b28a19247
$ ../du.sh
Pure blobs (excludes continuity): 5123620038 [4.771743005141616GiB]
Tarred blobs (excludes continuity): 5228851200 [4.869747161865234GiB]
Tarred + Gzipped blobs (excludes continuity): 2236129577 [2.082557954825461GiB]
FILEgrain (gzipped): 2235640216 [2.0821022018790245GiB]
- Mount: 2 blobs, 34.49MiB
- then `sh`: 8 blobs, 36.73MiB
- then `DISPLAY=:1 startkde`, with host-side `Xephyr -screen 1024x768 :1`: 4267 blobs, 742.7MiB
- then start Firefox via the KDE start-menu: 4506 blobs, 866.6MiB
kaggle/python@sha256:335103c998aea22a5608c2eeca7dcf109e0828ed233b75f5098182c5b058fe98
$ ../du.sh
Pure blobs (excludes continuity): 8937194028 [8.323410551995039GiB]
Tarred blobs (excludes continuity): 9025382400 [8.405542373657227GiB]
Tarred + Gzipped blobs (excludes continuity): 3818209353 [3.5559845650568604GiB]
FILEgrain (gzipped): 3822555054 [3.5600318145006895GiB]
- Mount: 2 blobs, 38.18MiB
- then `sh`: 8 blobs, 40.14MiB
- then `ipython -c 'print("hello")'`: 1033 blobs, 75.4MiB
- then `ipython -c 'import nltk'`: 2779 blobs, 352MiB
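To make the size comparison easier to read, here is a minimal sketch that compares the per-file gzip total ("FILEgrain (gzipped)") with the tar+gzip total for each image. The byte counts are copied from the `du.sh` output above; the names `SIZES` and `relative_overhead` are made up for this illustration and are not part of FILEgrain.

```python
# Byte counts copied from the du.sh output above.
SIZES = {
    "openjdk:8": {"tar_gz": 273990124, "filegrain_gz": 273344520},
    "kdeneon/all": {"tar_gz": 2236129577, "filegrain_gz": 2235640216},
    "kaggle/python": {"tar_gz": 3818209353, "filegrain_gz": 3822555054},
}

MIB = 1024 ** 2


def relative_overhead(tar_gz: int, filegrain_gz: int) -> float:
    """Size difference of per-file gzip relative to tar+gzip, in percent.

    Negative means the per-file total is smaller than the tar+gzip total.
    """
    return (filegrain_gz - tar_gz) / tar_gz * 100.0


for image, s in SIZES.items():
    pct = relative_overhead(s["tar_gz"], s["filegrain_gz"])
    print(f"{image}: {s['filegrain_gz'] / MIB:.1f} MiB, {pct:+.3f}% vs tar+gzip")
```

For all three images the two totals differ by well under one percent, so compressing blobs individually costs almost nothing in total size here.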
deduplication benchmark
$ (cd kdeneon-all-sha256-e3e7f216a5f8f1fdcff4eab8807d7afcd291c050099ab3e8a8355b7b28a19247/blobs/sha256; find .) > /tmp/a
$ (cd kaggle-python-sha256-335103c998aea22a5608c2eeca7dcf109e0828ed233b75f5098182c5b058fe98/blobs/sha256; find .) > /tmp/b
$ wc -l /tmp/a /tmp/b
156916 /tmp/a
131552 /tmp/b
288468 total
$ cat /tmp/a /tmp/b | sort | uniq | wc -l
279749
$ cat /tmp/a /tmp/b | sort | uniq -D | uniq | wc -l
8719
$ echo $((156916 + 131552 - 8719))
279749
$ sum=0; for f in $(cat /tmp/a /tmp/b | sort | uniq -D | uniq);do let s=$(stat -c %s kdeneon-all-sha256-e3e7f216a5f8f1fdcff4eab8807d7afcd291c050099ab3e8a8355b7b28a19247/blobs/sha256/$f); sum=$(($sum + $s)); done; echo $sum
79064496 [75.40177917480469MiB]
These images are unrelated, yet they share about 75MiB of common Debian files.
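The shell pipeline above can also be sketched in Python as a set intersection over blob file names. This is only an illustration: it assumes both images are stored as directories with a `blobs/sha256/` subdirectory, as in the commands above, and the function names `list_blobs` and `common_blobs` are hypothetical. Unlike `find .`, it counts regular files only, so the numbers may differ slightly from the shell output.

```python
import os
import sys


def list_blobs(layout_dir: str) -> dict:
    """Map each blob file name (the hex digest) to its size in bytes."""
    blob_dir = os.path.join(layout_dir, "blobs", "sha256")
    return {
        entry.name: entry.stat().st_size
        for entry in os.scandir(blob_dir)
        if entry.is_file()
    }


def common_blobs(layout_a: str, layout_b: str) -> tuple:
    """Return (number of shared blobs, total size of the shared blobs)."""
    a = list_blobs(layout_a)
    b = list_blobs(layout_b)
    shared = a.keys() & b.keys()  # same digest means same content
    return len(shared), sum(a[digest] for digest in shared)


if __name__ == "__main__" and len(sys.argv) == 3:
    count, size = common_blobs(sys.argv[1], sys.argv[2])
    print(f"{count} shared blobs, {size} bytes [{size / 1024 ** 2}MiB]")
```

Run it with the two image directories as arguments. Sizes are taken from the first directory only, which is safe because blobs with the same digest have the same content.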
FUSE
(on Fedora 26, 2 vCPUs, 2GB RAM, VMware Fusion on MacBookPro)
Result of `export TIMEFORMAT=%R; for f in $(seq 1 10); do bash -c "cd /; time tar cf - usr | tar tvf - > /dev/null"; done` on `openjdk:8`.
`docker run -it --rm`:
9.238
9.950
10.098
10.446
6.487
7.425
3.004
0.846
0.775
0.714
FILEgrain without `FOPEN_KEEP_CACHE` (old commit: b33bc29):
35.777 [pull & cache blobs]
20.870
13.877
19.071
18.319
18.053
19.357
14.154
22.630
17.400
FILEgrain with `FOPEN_KEEP_CACHE` (not so effective?):
28.318 [pull & cache blobs]
15.833
15.014
16.962
18.809
17.566
15.545
17.971
18.071
15.742
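To put a number on the "not so effective?" question, here is a small sketch that summarizes the timings listed above. The values are copied from the three runs; for the two FILEgrain series the first run, marked "[pull & cache blobs]", is left out so that only warm runs are compared. The variable names and `summarize` are made up for this illustration.

```python
from statistics import mean, median

# Wall-clock seconds copied from the results above.
DOCKER = [9.238, 9.950, 10.098, 10.446, 6.487, 7.425, 3.004, 0.846, 0.775, 0.714]
# First run ("pull & cache blobs") omitted for both FILEgrain series.
FILEGRAIN_NO_KEEP_CACHE = [20.870, 13.877, 19.071, 18.319, 18.053,
                           19.357, 14.154, 22.630, 17.400]
FILEGRAIN_KEEP_CACHE = [15.833, 15.014, 16.962, 18.809, 17.566,
                        15.545, 17.971, 18.071, 15.742]


def summarize(samples):
    """Return (mean, median) of a list of timings in seconds."""
    return mean(samples), median(samples)


for label, samples in [
    ("docker run", DOCKER),
    ("FILEgrain without FOPEN_KEEP_CACHE", FILEGRAIN_NO_KEEP_CACHE),
    ("FILEgrain with FOPEN_KEEP_CACHE", FILEGRAIN_KEEP_CACHE),
]:
    avg, med = summarize(samples)
    print(f"{label}: mean {avg:.3f}s, median {med:.3f}s")
```

On these samples the warm-run mean drops from about 18.2s to about 16.8s with `FOPEN_KEEP_CACHE`, which is a modest gain. With nine samples per series and visible run-to-run variation, this should be read as indicative only.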
Docker Registry I/O (TODO)
N/A: FILEgrain does not support the Docker Registry API yet.
TODO: integrate FILEgrain into containerd and run a real benchmark.
Appendix
du.sh
#!/bin/sh
# Run from an image directory that contains ./blobs and index.json.
set -e
# The helper script prints --exclude options for the manifest, config and layer blobs.
echo -n "Pure blobs (excludes continuity): "
du -bs $(../print-du-exclude-extra-blobs.py) ./blobs | awk '{print $1}'
echo -n "Tarred blobs (excludes continuity): "
tar cf - $(../print-du-exclude-extra-blobs.py) ./blobs | wc -c
echo -n "Tarred + Gzipped blobs (excludes continuity): "
tar cf - $(../print-du-exclude-extra-blobs.py) ./blobs | gzip -9 | wc -c
# Gzip every blob on its own and add up the compressed sizes.
sum=0
for f in $(find ./blobs -type f); do
    sum=$(($sum + $(gzip -9c "$f" | wc -c)))
done
echo "FILEgrain (gzipped): $sum"
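The last loop of `du.sh` is the part that produces the "FILEgrain (gzipped)" figure: each blob is compressed on its own and the compressed sizes are summed. A rough Python equivalent is sketched below. The function name `per_file_gzip_total` is made up for this illustration, and the result can differ by a few bytes per file from the `gzip` command because the gzip header written by Python's `gzip` module is not identical. It also reads each file fully into memory, which is fine for a sketch but not for very large blobs.

```python
import gzip
import os


def per_file_gzip_total(root: str) -> int:
    """Sum the gzip -9 compressed size of every regular file under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            with open(os.path.join(dirpath, name), "rb") as fh:
                # Compress each file independently, as the shell loop does.
                total += len(gzip.compress(fh.read(), compresslevel=9))
    return total


if __name__ == "__main__":
    if os.path.isdir("./blobs"):
        print("FILEgrain (gzipped):", per_file_gzip_total("./blobs"))
```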
print-du-exclude-extra-blobs.py
#!/usr/bin/python3
# Usage: du -bs $(this.py) ./blobs
# Prints one --exclude option per manifest, config and layer blob listed in index.json.
import json


def dig2blobpath(s):
    # 'sha256:abcd...' -> 'blobs/sha256/abcd...'
    algo, heks = s.split(':', 1)
    return 'blobs/' + algo + '/' + heks


excludes = []
with open('index.json') as index_file:
    index = json.load(index_file)
for m_entry in index['manifests']:
    m_blob = dig2blobpath(m_entry['digest'])
    excludes.append(m_blob)
    with open(m_blob) as manifest_file:
        m = json.load(manifest_file)
    excludes.append(dig2blobpath(m['config']['digest']))
    for l in m['layers']:
        excludes.append(dig2blobpath(l['digest']))
for f in excludes:
    print('--exclude ' + f)