Description
Hey y'all,
Ever since upgrading my M1 Pro MacBook Pro to macOS Sonoma, I haven't been able to generate any embeddings using Chroma in my Docker container. Every time I run docker-compose up --build embedding-test with the basic code below, the container exits with code 139.
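For context on that exit status: shells and container runtimes conventionally report 128 plus the signal number when a process is killed by a signal, so 139 would correspond to signal 11, which is SIGSEGV (a segmentation fault) on Linux. A minimal sketch of that arithmetic follows; `describe_exit_code` is a hypothetical helper name used only for illustration.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Translate a shell-style exit status into a readable description."""
    if code > 128:
        # By convention, statuses above 128 mean "killed by signal (code - 128)".
        signum = code - 128
        try:
            return f"killed by {signal.Signals(signum).name} (signal {signum})"
        except ValueError:
            # The number does not map to a signal known on this platform.
            return f"killed by unknown signal {signum}"
    return f"exited normally with status {code}"


print(describe_exit_code(139))
```

If this reading is right, the crash happens in native code (for example a compiled extension) rather than as an ordinary Python exception, which would explain why no traceback appears in the logs.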
What I don't understand is that I can run this same Docker container on another Mac running Monterey and on an EC2 server without any issue. I can get the same script below running on my main machine if I run it outside of a Docker container and in a simple venv.
This is only the second GitHub issue I've ever submitted (the first being earlier this morning), so apologies in advance if this post is misplaced or incomplete. I'm also attaching my Dockerfile. If I can supply any other information that would help, let me know. I would appreciate any help on this issue.
I believe this may have something to do with #7006
embedding_test.py
import chromadb
from chromadb.utils import embedding_functions
# Let's define the embedding function
embedding = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name="intfloat/e5-base-v2",
)
persist_directory = "/code/scripts/testDB"
client = chromadb.PersistentClient(path=persist_directory)
client.delete_collection(name="Students")
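# Attach the embedding function so query_texts is embedded with the same model
collection = client.create_collection(name="Students", embedding_function=embedding)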
student_info = """
Alexandra Thompson, a 19-year-old computer science sophomore with a 3.7 GPA,
is a member of the programming and chess clubs who enjoys pizza, swimming, and hiking
in her free time in hopes of working at a tech company after graduating from the University of Washington.
"""
club_info = """
The university chess club provides an outlet for students to come together and enjoy playing
the classic strategy game of chess. Members of all skill levels are welcome, from beginners learning
the rules to experienced tournament players. The club typically meets a few times per week to play casual games,
participate in tournaments, analyze famous chess matches, and improve members' skills.
"""
university_info = """
The University of Washington, founded in 1861 in Seattle, is a public research university
with over 45,000 students across three campuses in Seattle, Tacoma, and Bothell.
As the flagship institution of the six public universities in Washington state,
UW encompasses over 500 buildings and 20 million square feet of space,
including one of the largest library systems in the world.
"""
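embeddings = embedding([student_info, club_info, university_info])
collection.add(
    # Store the raw texts as documents and pass the precomputed vectors separately
    documents=[student_info, club_info, university_info],
    embeddings=embeddings,
    metadatas=[
        {"source": "student info"},
        {"source": "club info"},
        {"source": "university info"},
    ],
    ids=["id1", "id2", "id3"],
)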
results = collection.query(query_texts=["What is the student name?"], n_results=2)
print(results)
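One way to narrow down where the crash occurs, offered as a debugging sketch rather than a confirmed fix: Python's standard-library `faulthandler` module can dump the Python traceback to stderr when the interpreter receives a fatal signal such as SIGSEGV. Placing the lines below at the very top of embedding_test.py (an assumption about where they would be most useful) should show which Python call was active when the process died.

```python
import faulthandler

# Install handlers for fatal signals (SIGSEGV, SIGFPE, SIGABRT, SIGBUS, SIGILL).
# On a crash, the traceback of every thread is written to stderr,
# so it should show up in `docker logs` for the container.
faulthandler.enable(all_threads=True)

print("faulthandler enabled:", faulthandler.is_enabled())
```

The same behavior can be turned on without editing the script by setting the environment variable PYTHONFAULTHANDLER=1 for the container.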
Dockerfile
FROM python:3.11.4-slim-bookworm
ADD requirements.txt /code/requirements.txt
WORKDIR /code
RUN apt-get update
RUN apt-get install build-essential -y
RUN apt-get install -y gdal-bin libgdal-dev
RUN pip install --upgrade pip
RUN pip install -r "requirements.txt"
RUN python -m spacy download en_core_web_sm
# Add the scripts from the local 'scripts' folder
ADD scripts/embedding_test.py /code/scripts/
WORKDIR /code/scripts
requirements.txt
beautifulsoup4==4.12.2
chromadb==0.4.13
geopandas==0.14.0
gspread==5.11.3
gspread_dataframe==3.3.1
langchain==0.0.306
openai==0.27.9
pandas==2.0.3
python-dotenv==1.0.0
pytz==2023.3
Requests==2.31.0
slack_bolt==1.18.0
slack_sdk==3.21.3
spacy==3.6.1
tenacity==8.2.3
tiktoken==0.4.0
sentence_transformers==2.2.2
lark==1.1.7
Versions
Chroma v0.4.13, macOS 14.0 Sonoma, Docker v4.24.0, python:3.11.4-slim-bookworm image
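Since the same image behaves differently across hosts, it may help to capture what the container itself reports. The sketch below collects the CPU architecture, platform string, and installed package versions from inside the running container. `collect_environment` is a hypothetical helper, and the package list (including `torch`, which is not pinned in requirements.txt but is normally pulled in by sentence_transformers) is an assumption for illustration.

```python
import platform
from importlib import metadata


def collect_environment(packages):
    """Return architecture, platform, and package versions for this interpreter."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record missing packages explicitly instead of failing.
            versions[name] = "not installed"
    return {
        "machine": platform.machine(),    # e.g. aarch64 or x86_64
        "platform": platform.platform(),  # kernel and libc details
        "python": platform.python_version(),
        "packages": versions,
    }


if __name__ == "__main__":
    info = collect_environment(["chromadb", "sentence-transformers", "torch"])
    for key, value in info.items():
        print(f"{key}: {value}")
```

Running this in the container on both the Sonoma machine and a machine where the script works would make any difference in architecture or resolved wheel versions visible side by side.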
Reproduce
docker-compose up --build embedding-test
Expected behavior
docker-compose up --build embedding-test
runs and generates embeddings successfully.
docker version
Client:
Cloud integration: v1.0.35+desktop.5
Version: 24.0.6
API version: 1.43
Go version: go1.20.7
Git commit: ed223bc
Built: Mon Sep 4 12:28:49 2023
OS/Arch: darwin/arm64
Context: desktop-linux
Server: Docker Desktop 4.24.0 (122432)
Engine:
Version: 24.0.6
API version: 1.43 (minimum version 1.12)
Go version: go1.20.7
Git commit: 1a79695
Built: Mon Sep 4 12:31:36 2023
OS/Arch: linux/arm64
Experimental: false
containerd:
Version: 1.6.22
GitCommit: 8165feabfdfe38c65b599c4993d227328c231fca
runc:
Version: 1.1.8
GitCommit: v1.1.8-0-g82f18fe
docker-init:
Version: 0.19.0
GitCommit: de40ad0
docker info
Client:
Version: 24.0.6
Context: desktop-linux
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc.)
Version: v0.11.2-desktop.5
Path: /Users/ryan/.docker/cli-plugins/docker-buildx
compose: Docker Compose (Docker Inc.)
Version: v2.22.0-desktop.2
Path: /Users/ryan/.docker/cli-plugins/docker-compose
dev: Docker Dev Environments (Docker Inc.)
Version: v0.1.0
Path: /Users/ryan/.docker/cli-plugins/docker-dev
extension: Manages Docker extensions (Docker Inc.)
Version: v0.2.20
Path: /Users/ryan/.docker/cli-plugins/docker-extension
init: Creates Docker-related starter files for your project (Docker Inc.)
Version: v0.1.0-beta.8
Path: /Users/ryan/.docker/cli-plugins/docker-init
sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc.)
Version: 0.6.0
Path: /Users/ryan/.docker/cli-plugins/docker-sbom
scan: Docker Scan (Docker Inc.)
Version: v0.26.0
Path: /Users/ryan/.docker/cli-plugins/docker-scan
scout: Docker Scout (Docker Inc.)
Version: v1.0.7
Path: /Users/ryan/.docker/cli-plugins/docker-scout
Server:
Containers: 1
Running: 1
Paused: 0
Stopped: 0
Images: 2
Server Version: 24.0.6
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Using metacopy: false
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 8165feabfdfe38c65b599c4993d227328c231fca
runc version: v1.1.8-0-g82f18fe
init version: de40ad0
Security Options:
seccomp
Profile: unconfined
cgroupns
Kernel Version: 6.4.16-linuxkit
Operating System: Docker Desktop
OSType: linux
Architecture: aarch64
CPUs: 9
Total Memory: 15.61GiB
Name: docker-desktop
ID: 5edac7f3-7a82-4e53-b391-ca3f50bf24f5
Docker Root Dir: /var/lib/docker
Debug Mode: false
HTTP Proxy: http.docker.internal:3128
HTTPS Proxy: http.docker.internal:3128
No Proxy: hubproxy.docker.internal
Experimental: false
Insecure Registries:
hubproxy.docker.internal:5555
127.0.0.0/8
Live Restore Enabled: false
Diagnostics ID
78FCD04C-5FB1-457D-97C2-F7F899584238/20231006183318
Additional Info
docker logs 8c38e53b2186
Downloading (…)a20e8/.gitattributes: 100%|██████████| 1.48k/1.48k [00:00<00:00, 16.0MB/s]
Downloading (…)_Pooling/config.json: 100%|██████████| 200/200 [00:00<00:00, 2.78MB/s]
Downloading (…)16616a20e8/README.md: 100%|██████████| 67.6k/67.6k [00:00<00:00, 12.3MB/s]
Downloading (…)616a20e8/config.json: 100%|██████████| 650/650 [00:00<00:00, 10.2MB/s]
Downloading model.safetensors: 100%|██████████| 438M/438M [00:51<00:00, 8.58MB/s]
Downloading (…)0e8/onnx/config.json: 100%|██████████| 632/632 [00:00<00:00, 929kB/s]
Downloading model.onnx: 100%|██████████| 436M/436M [00:50<00:00, 8.55MB/s]
Downloading (…)cial_tokens_map.json: 100%|██████████| 125/125 [00:00<00:00, 223kB/s]
Downloading (…)/onnx/tokenizer.json: 100%|██████████| 711k/711k [00:00<00:00, 3.54MB/s]
Downloading (…)okenizer_config.json: 100%|██████████| 314/314 [00:00<00:00, 1.29MB/s]
Downloading (…)a20e8/onnx/vocab.txt: 100%|██████████| 232k/232k [00:00<00:00, 3.20MB/s]
Downloading pytorch_model.bin: 100%|██████████| 438M/438M [00:51<00:00, 8.56MB/s]
Downloading (…)nce_bert_config.json: 100%|██████████| 57.0/57.0 [00:00<00:00, 74.2kB/s]
Downloading (…)cial_tokens_map.json: 100%|██████████| 125/125 [00:00<00:00, 579kB/s]
Downloading (…)a20e8/tokenizer.json: 100%|██████████| 711k/711k [00:00<00:00, 9.27MB/s]
Downloading (…)okenizer_config.json: 100%|██████████| 314/314 [00:00<00:00, 1.38MB/s]
Downloading (…)16616a20e8/vocab.txt: 100%|██████████| 232k/232k [00:00<00:00, 3.39MB/s]
Downloading (…)16a20e8/modules.json: 100%|██████████| 387/387 [00:00<00:00, 1.87MB/s]