This is a cache server for the Hugging Face Hub (huggingface.co). It is designed to be deployed on a private network to cache models and datasets downloaded from the Hub, reducing load on the public servers and speeding up downloads.
The cache server listens on port 36080 by default. You can change the port by setting the PORT environment variable.
To run the cache server, you can use the following command:
docker run -p 36080:36080 -v /path/to/cache:/var/cache/huggingface -d chocolatefrappe/huggingface-cacheserver:main
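If you override the port, remember to update the published port mapping to match. A minimal sketch, passing the PORT variable mentioned above (8080 is purely illustrative):
docker run -e PORT=8080 -p 8080:8080 -v /path/to/cache:/var/cache/huggingface -d chocolatefrappe/huggingface-cacheserver:main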
or via Docker Compose:
services:
  huggingface-cache-server:
    image: chocolatefrappe/huggingface-cacheserver:main
    ports:
      - mode: ingress
        target: 36080
        published: 36080
        protocol: tcp
    volumes:
      - type: volume
        source: huggingface-cache
        target: /var/cache/huggingface
    stop_grace_period: 1m
    restart: always

volumes:
  huggingface-cache:
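Once the container is up, you can check that downloads flow through the cache by pointing a Hugging Face client at it. A quick test, assuming the huggingface_hub CLI is installed locally and that the server proxies the standard huggingface.co download paths:
HF_ENDPOINT=http://localhost:36080 huggingface-cli download gpt2 config.json
If the file downloads successfully and a second request for it completes noticeably faster, the cache is doing its job.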
exo
Run your own AI cluster at home with everyday devices. Maintained by exo labs.
Models are downloaded from Hugging Face. If you are running exo in a country with strict internet censorship, you may need to download the models manually and put them in the ~/.cache/exo/downloads directory.
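One way to do the manual download is with the huggingface_hub CLI on a machine that does have access, then copying the files over. A sketch only: the <model-id> placeholder and the one-directory-per-model layout are assumptions, so check exo's documentation for the exact layout it expects.
huggingface-cli download <model-id> --local-dir ~/.cache/exo/downloads/<model-id>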
To download models through a proxy endpoint, set the HF_ENDPOINT environment variable. For example, to run exo against the local cache server above as its Hugging Face mirror:
HF_ENDPOINT=http://localhost:36080 exo
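To make the setting persistent, you can export it in your shell profile:
export HF_ENDPOINT=http://localhost:36080
HF_ENDPOINT is also honored by other tools built on huggingface_hub, so the same variable redirects, for example, huggingface-cli downloads through the cache as well.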