AttributeError: 'NoneType' object has no attribute 'size'

#4
by Archeane - opened

Hi all, I'm getting the error below when running the sample code with RAGatouille:

Process Process-3:
Traceback (most recent call last):
  File "/home/jenny/.pyenv/versions/3.10.14/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/home/jenny/.pyenv/versions/3.10.14/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/jenny/Desktop/projects/sp-python/.venv/lib/python3.10/site-packages/colbert/infra/launcher.py", line 134, in setup_new_process
    return_val = callee(config, *args)
  File "/home/jenny/Desktop/projects/sp-python/.venv/lib/python3.10/site-packages/colbert/indexing/collection_indexer.py", line 33, in encode
    encoder.run(shared_lists)
  File "/home/jenny/Desktop/projects/sp-python/.venv/lib/python3.10/site-packages/colbert/indexing/collection_indexer.py", line 63, in run
    self.setup() # Computes and saves plan for whole collection
  File "/home/jenny/Desktop/projects/sp-python/.venv/lib/python3.10/site-packages/colbert/indexing/collection_indexer.py", line 101, in setup
    avg_doclen_est = self._sample_embeddings(sampled_pids)
  File "/home/jenny/Desktop/projects/sp-python/.venv/lib/python3.10/site-packages/colbert/indexing/collection_indexer.py", line 141, in _sample_embeddings
    self.num_sample_embs = torch.tensor([local_sample_embs.size(0)]).cuda()
AttributeError: 'NoneType' object has no attribute 'size'
[Aug 17, 21:13:42] [1] 		 #> Encoding 0 passages..

I have run:

pip install --upgrade ragatouille
pip install --upgrade colbert-ai

I still run into this issue. Here's the code to reproduce it:

from ragatouille import RAGPretrainedModel

RAG = RAGPretrainedModel.from_pretrained("answerdotai/answerai-colbert-small-v1")

docs = ['Hayao Miyazaki is a Japanese director and is founded ghibli studio', 'Walt Disney is an American author, director and founder of disney']

RAG.index(docs, index_name="ghibli")

query = 'Who directed spirited away?'
results = RAG.search(query)

The reranker approach works fine, but I need RAGatouille so I can index the docs ahead of time. I'd appreciate some assistance!

Answer.AI org

Hey! This is a RAGatouille-related issue, so could you open an issue on the GitHub repo?

Could you also provide information in the issue, such as:

  • Operating system
  • Full output of pip freeze
  • Platform information (number/type of GPU, etc.)
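
One way to gather these details in one go is a small script like the sketch below. It is only an illustration: the function name `collect_environment_report` is hypothetical, and it assumes `pip` is available to the current interpreter and that `torch` is importable for the GPU details (it falls back to "unknown" otherwise).

```python
import platform
import subprocess
import sys


def collect_environment_report() -> str:
    """Gather OS, Python, GPU, and installed-package details as plain text.

    Hypothetical helper for filling in a bug report; not part of RAGatouille.
    """
    lines = [
        f"OS: {platform.platform()}",
        f"Python: {sys.version.split()[0]}",
    ]

    # GPU details are only reported if torch can be imported.
    try:
        import torch

        lines.append(f"torch: {torch.__version__}")
        if torch.cuda.is_available():
            for i in range(torch.cuda.device_count()):
                lines.append(f"GPU {i}: {torch.cuda.get_device_name(i)}")
        else:
            lines.append("GPU: none visible to torch")
    except Exception:
        # Covers a missing torch as well as a broken installation.
        lines.append("GPU: unknown (torch not importable)")

    # Same as running `pip freeze` with the interpreter that runs this script.
    freeze = subprocess.run(
        [sys.executable, "-m", "pip", "freeze"],
        capture_output=True,
        text=True,
        check=False,
    )
    lines.append("pip freeze:")
    lines.append(freeze.stdout.strip() or "(no output)")
    return "\n".join(lines)


if __name__ == "__main__":
    print(collect_environment_report())
```

Running the script inside the same virtual environment that produced the traceback and pasting its output into the GitHub issue should cover the three items above.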
bclavie changed discussion status to closed
