codeboxgptpythondemo

Henk717 committed on Jun 26

Commit 86d84ef • 1 Parent(s): 1c97e32

Overhaul to use precompiled binaries

llama.cpp has been adding a lot of CUDA kernels, which raises compile-time concerns on HF Spaces.
To keep everything working smoothly, the Space now uses our precompiled binary. This ensures maximum compatibility across different GPUs and also greatly reduces the time it takes to build the Space.

Files changed (1)

Dockerfile +9 -10

@@ -1,4 +1,4 @@
-FROM nvidia/cuda:12.1.1-devel-ubuntu22.04
+FROM ubuntu
 ARG MODEL
 ARG IMGMODEL
 ARG WHISPERMODEL
@@ -6,14 +6,13 @@ ARG MMPROJ
 ARG MODEL_NAME
 ARG ADDITIONAL
 RUN mkdir /opt/koboldcpp
-RUN apt update && apt install git build-essential libopenblas-dev wget python3-pip -y
-RUN git clone https://github.com/lostruins/koboldcpp /opt/koboldcpp
+RUN apt update && apt install curl -y
 WORKDIR /opt/koboldcpp
 COPY default.json /opt/koboldcpp/default.json
-RUN make -j$(nproc) LLAMA_OPENBLAS=1 LLAMA_CUBLAS=1 LLAMA_PORTABLE=1 LLAMA_COLAB=1
-RUN wget -O model.ggml $MODEL || true
-RUN wget -O imgmodel.ggml $IMGMODEL || true
-RUN wget -O mmproj.ggml $MMPROJ || true
-RUN wget -O whispermodel.ggml $WHISPERMODEL || true
-CMD /bin/python3 ./koboldcpp.py --model model.ggml --whispermodel whispermodel.ggml --sdmodel imgmodel.ggml --sdthreads 4 --sdquant --sdclamped --mmproj mmproj.ggml $ADDITIONAL --port 7860 --hordemodelname $MODEL_NAME --hordemaxctx 1 --hordegenlen 1 --preloadstory default.json --ignoremissing
+RUN curl -fLo koboldcpp https://koboldai.org/cpplinuxcu12
+RUN chmod +x ./koboldcpp
+RUN curl -fLo model.ggml $MODEL || true
+RUN curl -fLo imgmodel.ggml $IMGMODEL || true
+RUN curl -fLo mmproj.ggml $MMPROJ || true
+RUN curl -fLo whispermodel.ggml $WHISPERMODEL || true
+CMD ./koboldcpp --model model.ggml --whispermodel whispermodel.ggml --sdmodel imgmodel.ggml --sdthreads 4 --sdquant --sdclamped --mmproj mmproj.ggml $ADDITIONAL --port 7860 --hordemodelname $MODEL_NAME --hordemaxctx 1 --hordegenlen 1 --preloadstory default.json --ignoremissing
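
Note on the download steps: each model download in the Dockerfile ends in `|| true`, so an unset build argument or a failed request does not abort the build, and the `--ignoremissing` flag in the CMD presumably lets the server start without that file. One side effect is that a mistyped URL also passes silently. The fragment below is a hypothetical variant, not part of this commit, sketching one way to separate the two cases: it skips the download only when the argument is empty and quotes the URL for the shell. It assumes the same `MODEL` build argument and `model.ggml` output name used in the Dockerfile.

```dockerfile
# Hypothetical variant, not part of this commit.
# Skip the download when the MODEL build argument is empty; otherwise let a
# failing curl (for example a 404, reported because of -f) fail the build.
ARG MODEL
RUN if [ -n "$MODEL" ]; then curl -fLo model.ggml "$MODEL"; fi
```

Whether a failed download should break the build is a judgment call; the committed `|| true` form favors a Space that always builds, while this sketch favors catching bad URLs early.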