Overhaul to use precompiled binaries
Llamacpp has been adding a lot of CUDA kernels, which causes compile-time concerns on the HF spaces.
To keep everything working smoothly, the space now uses our precompiled binary. This ensures maximum compatibility across the different GPUs and also drastically reduces the time it takes to build the space.
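The core of the change is that the image no longer compiles anything: the build reduces to downloading one prebuilt binary and marking it executable. A minimal offline sketch of that pattern (the real Dockerfile downloads https://koboldai.org/cpplinuxcu12; here a tiny placeholder script stands in for that download so the sketch runs without network access):

```shell
# Sketch of the new approach: instead of compiling llama.cpp's CUDA kernels,
# the image just fetches one prebuilt binary and marks it executable.
# A placeholder script stands in for the downloaded binary in this sketch.
printf '#!/bin/sh\necho "koboldcpp ready"\n' > koboldcpp
chmod +x ./koboldcpp   # same step as `RUN chmod +x ./koboldcpp` in the Dockerfile
./koboldcpp            # prints: koboldcpp ready
```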
- Dockerfile +9 -10

Dockerfile CHANGED
@@ -1,4 +1,4 @@
-FROM
+FROM ubuntu
 ARG MODEL
 ARG IMGMODEL
 ARG WHISPERMODEL
@@ -6,14 +6,13 @@ ARG MMPROJ
 ARG MODEL_NAME
 ARG ADDITIONAL
 RUN mkdir /opt/koboldcpp
-RUN apt update && apt install
-RUN git clone https://github.com/lostruins/koboldcpp /opt/koboldcpp
+RUN apt update && apt install curl -y
 WORKDIR /opt/koboldcpp
 COPY default.json /opt/koboldcpp/default.json
-RUN
-RUN
-RUN
-RUN
-RUN
-
-
+RUN curl -fLo koboldcpp https://koboldai.org/cpplinuxcu12
+RUN chmod +x ./koboldcpp
+RUN curl -fLo model.ggml $MODEL || true
+RUN curl -fLo imgmodel.ggml $IMGMODEL || true
+RUN curl -fLo mmproj.ggml $MMPROJ || true
+RUN curl -fLo whispermodel.ggml $WHISPERMODEL || true
+CMD ./koboldcpp --model model.ggml --whispermodel whispermodel.ggml --sdmodel imgmodel.ggml --sdthreads 4 --sdquant --sdclamped --mmproj mmproj.ggml $ADDITIONAL --port 7860 --hordemodelname $MODEL_NAME --hordemaxctx 1 --hordegenlen 1 --preloadstory default.json --ignoremissing
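The repeated `curl … || true` lines make each model download optional: if a build arg is empty or unreachable, curl fails, but `|| true` swallows the error so the build continues, and the `--ignoremissing` flag lets koboldcpp start without the absent files. A minimal offline sketch of that pattern (the empty MODEL variable is a stand-in for an unset build arg):

```shell
# Stand-in for an unset MODEL build arg; curl rejects the empty URL and fails,
# but `|| true` swallows the error so the (simulated) build step still succeeds.
MODEL=""
curl -fsLo model.ggml "$MODEL" || true
echo "build step still succeeded"
```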