---
license: apache-2.0
---
|
|
|
base_model: https://huggingface.co/google/gemma-2b
|
|
|
Chinese chat demo of gemma-2b:
|
|
|
![image/png](https://cdn-uploads.huggingface.co/production/uploads/63e4a2ce5bbdd8d44b504628/RVxNl9oMDMQ8s2lbjz4wh.png)

Model languages: Chinese and English
|
|
|
The following describes the process of turning gemma-2b (a language model that only supports English) into a model that supports both Chinese and English.
|
|
|
step 1:

Use SentencePiece (BPE) to train a tokenizer on a Chinese corpus, producing tokenizer.model and tokenizer.vocab.
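A minimal sketch of this step, assuming a plain-text Chinese corpus in a file `zh_corpus.txt`; the file names, vocabulary size, and other hyper-parameters below are illustrative, not taken from the original training run:

```python
import sentencepiece as spm

# Train a BPE tokenizer on the Chinese corpus. This produces
# zh_tokenizer.model and zh_tokenizer.vocab in the working directory.
spm.SentencePieceTrainer.train(
    input="zh_corpus.txt",        # one sentence per line (assumed file name)
    model_prefix="zh_tokenizer",
    model_type="bpe",
    vocab_size=20000,             # illustrative size for the Chinese sub-vocabulary
    character_coverage=0.9995,    # high coverage is typical for CJK text
)
```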
|
|
|
step 2:

Merge the Chinese tokenizer.model with the original gemma-2b tokenizer.model.
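One common way to do the merge (used, for example, by the Chinese-LLaMA scripts) is to parse both models with the SentencePiece protobuf API and append every Chinese piece that the original vocabulary lacks. A sketch, with illustrative file names (requires the `protobuf` package):

```python
from sentencepiece import sentencepiece_model_pb2 as sp_pb2

# Load the original gemma-2b tokenizer and the newly trained Chinese one.
orig = sp_pb2.ModelProto()
with open("gemma_tokenizer.model", "rb") as f:
    orig.ParseFromString(f.read())

zh = sp_pb2.ModelProto()
with open("zh_tokenizer.model", "rb") as f:
    zh.ParseFromString(f.read())

# Append every Chinese piece that is not already in the original vocabulary.
existing = {p.piece for p in orig.pieces}
for p in zh.pieces:
    if p.piece not in existing:
        new_piece = sp_pb2.ModelProto.SentencePiece()
        new_piece.piece = p.piece
        new_piece.score = 0.0
        orig.pieces.append(new_piece)

with open("merged_tokenizer.model", "wb") as f:
    f.write(orig.SerializeToString())
```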
|
|
|
step 3:

Replace the corresponding files of the original model (e.g. gemma-2b) with the merged special_tokens_map.json, tokenizer.model, and tokenizer_config.json.
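After copying the files in, a quick sanity check (paths are illustrative) is to load the tokenizer and confirm that Chinese text is segmented into the newly added pieces rather than into bytes or single characters:

```python
from transformers import AutoTokenizer

# Directory containing the model weights plus the replaced tokenizer files.
tokenizer = AutoTokenizer.from_pretrained("./gemma-2b-zh")
print(tokenizer.tokenize("今天天气很好"))  # expect whole Chinese subwords
```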
|
|
|
step 4:

Use LLaMA-Factory for pre-training. Pay attention to the pre-training parameters: the vocabulary and the token embeddings must be resized to match the merged tokenizer.
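LLaMA-Factory can perform the resizing as part of its training setup, but the underlying operation is the standard transformers call below; this sketch (with illustrative paths) shows what "resize vocab / resize embedding" means:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("./gemma-2b-zh")   # merged tokenizer
model = AutoModelForCausalLM.from_pretrained("google/gemma-2b")

# Grow the embedding matrix (and the tied output head) to the merged
# vocabulary size; the new rows start randomly initialized and are then
# learned during Chinese/English pre-training.
model.resize_token_embeddings(len(tokenizer))
model.save_pretrained("./gemma-2b-zh-resized")
```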
|
|
|
step 5:

Fine-tune the model pre-trained in step 4 on instruction data; this significantly improves the model's ability to understand and follow instructions.
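Instruction data for this step is typically stored in the Alpaca-style instruction/input/output format that LLaMA-Factory accepts; the record below is an illustrative example, not taken from the actual training set:

```python
import json

examples = [
    {
        # "Translate the following sentence into English."
        "instruction": "把下面的句子翻译成英文。",
        "input": "今天天气很好。",
        "output": "The weather is nice today.",
    },
]

# Write the dataset in the JSON layout LLaMA-Factory reads for SFT.
with open("instruction_data.json", "w", encoding="utf-8") as f:
    json.dump(examples, f, ensure_ascii=False, indent=2)
```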
|
|
|
step 6:

Starting from the instruction-tuned model, run SFT on data for specific downstream tasks so that the model performs better on those tasks.
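Task-specific SFT reuses the same pipeline as step 5; only the data changes. As an example, a hypothetical sentiment-classification task can be cast into the same instruction format:

```python
import json

# Hypothetical labelled task data: (review text, sentiment label).
raw = [("这部电影太好看了", "positive"), ("服务态度很差", "negative")]

task_data = [
    {
        # "Decide the sentiment of the review below; answer positive or negative."
        "instruction": "判断下面评论的情感倾向，回答 positive 或 negative。",
        "input": text,
        "output": label,
    }
    for text, label in raw
]

with open("sentiment_sft.json", "w", encoding="utf-8") as f:
    json.dump(task_data, f, ensure_ascii=False, indent=2)
```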
|
|