Is this a duplicate of falcon-40B or a fork?

#1
by kehsani - opened

I duplicated this Space on CPU hardware and simply typed "hello" as the input text, but it just times out.

I suppose the model has to be downloaded into the Hugging Face Space and inference run on it there. My strong guess is that the capacity of a standard HF Space is not enough. If you need the model deployed on some system, you'll need stronger hardware (an A100 with at least 80 GB). https://huggingface.co/spaces/HuggingFaceH4/falcon-chat did a better job with their Space.
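To put a rough number on that guess, here is a back-of-the-envelope sketch of the memory needed just to hold the weights. It assumes about 40 billion parameters (taken from the name "falcon-40B") and counts only the weights, not activations or the KV cache, so real usage would be higher. The function name `weights_memory_gb` is only for illustration.

```python
def weights_memory_gb(n_params: float, bytes_per_param: int) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


# Assumption: ~40 billion parameters, inferred from the model name "falcon-40B".
N_PARAMS = 40e9

# Bytes per parameter for a few common precisions.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1}

for dtype, size in BYTES_PER_PARAM.items():
    print(f"{dtype}: ~{weights_memory_gb(N_PARAMS, size):.0f} GB")
```

At 16-bit precision that comes to about 80 GB for the weights alone, which is consistent with the 80 GB figure above and likely explains why a plain CPU Space times out.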
