Request for Model Sharding to Support Lower-End PCs
Hello NousResearch team,
I hope this message finds you well. I am reaching out to ask whether you would consider publishing a sharded version of the Yarn-Mistral-7b-64k model. The aim of this request is to make it easier for users with lower-end PCs to load and run the model.
As you're aware, the memory demands of large models can be a barrier to entry for individuals without access to high-end hardware: a 7B-parameter checkpoint stored as a single file is roughly 14 GB at fp16, and loading it requires holding that entire file in memory at once, whereas smaller shards can be loaded one at a time. By providing a sharded version of the Yarn-Mistral-7b-64k model, we can democratize access, enabling a broader range of users to experiment with and benefit from this impressive model.
Moreover, sharding would also benefit users who run models on cloud platforms like Google Colab or Kaggle, where RAM and disk are limited and users want to make the most of their allocated compute time.
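For context on what this would involve: Hugging Face checkpoints are sharded by splitting the weights across several files plus an index (`model.safetensors.index.json`) whose `weight_map` records which shard each weight lives in. Below is a minimal sketch of that packing logic; the weight names and sizes are made-up placeholders, not the real model's layout:

```python
def shard_state_dict(sizes, max_shard_bytes):
    """Greedily pack weights into shards no larger than max_shard_bytes.

    sizes: dict mapping weight name -> size in bytes (insertion-ordered).
    Returns (shards, weight_map): shards is a list of lists of weight names,
    and weight_map maps each weight name to its shard filename, mirroring
    the 'weight_map' field of a model.safetensors.index.json file.
    """
    shards, current, current_bytes = [], [], 0
    for name, size in sizes.items():
        # Start a new shard when adding this weight would exceed the cap.
        if current and current_bytes + size > max_shard_bytes:
            shards.append(current)
            current, current_bytes = [], 0
        current.append(name)
        current_bytes += size
    if current:
        shards.append(current)
    weight_map = {
        name: f"model-{i + 1:05d}-of-{len(shards):05d}.safetensors"
        for i, names in enumerate(shards)
        for name in names
    }
    return shards, weight_map

# Toy example: four 0.9 GB "layers" packed under a 2 GB shard cap
# end up as two shards of two layers each.
sizes = {f"layers.{i}.weight": 900_000_000 for i in range(4)}
shards, weight_map = shard_state_dict(sizes, max_shard_bytes=2_000_000_000)
print(len(shards))                        # → 2
print(weight_map["layers.3.weight"])      # → model-00002-of-00002.safetensors
```

In practice, this is essentially what `save_pretrained(..., max_shard_size="2GB")` in `transformers` produces, so publishing a sharded copy may amount to loading the existing checkpoint once and re-saving it with that argument.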
I understand that sharding a published checkpoint comes with its own maintenance overhead, but I believe the benefit to the community would be significant. If this is something you could consider, or if there are alternative solutions that would address the needs above, I'd love to hear your thoughts.
Thank you for your time and for your contributions to the AI community.