More training inputs: inventory
Hi, on the issue of inventory stability: I believe storing the current inventory as text during training and feeding it into the model is a good way of preventing temporal inconsistencies. That would cover the main inventory, hotbar, armor, offhand, and whatever item the mouse cursor is currently transferring between slots.
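To make the idea concrete, here is a rough sketch of what that text could look like. Everything in it is my own assumption for illustration: the `Slot` and `InventoryState` names, the slot counts (27 main, 9 hotbar, 4 armor), the item names, and the line format are hypothetical and not something the project already defines.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Slot:
    item: str   # illustrative item name, e.g. "oak_log"
    count: int


def _empty(n: int):
    # Factory that builds a list of n empty slots for a dataclass field.
    return lambda: [None] * n


@dataclass
class InventoryState:
    # Slot counts are assumptions; adjust to whatever the game actually exposes.
    main: List[Optional[Slot]] = field(default_factory=_empty(27))
    hotbar: List[Optional[Slot]] = field(default_factory=_empty(9))
    armor: List[Optional[Slot]] = field(default_factory=_empty(4))
    offhand: Optional[Slot] = None
    cursor: Optional[Slot] = None  # stack currently held by the mouse


def slot_to_text(slot: Optional[Slot]) -> str:
    return "empty" if slot is None else f"{slot.item} x{slot.count}"


def to_text(state: InventoryState) -> str:
    """Serialize the inventory to a few short lines, listing only filled slots."""
    lines = []
    for name, slots in (("main", state.main),
                        ("hotbar", state.hotbar),
                        ("armor", state.armor)):
        filled = [f"{i}:{slot_to_text(s)}" for i, s in enumerate(slots) if s is not None]
        lines.append(f"{name}: " + (", ".join(filled) if filled else "empty"))
    lines.append(f"offhand: {slot_to_text(state.offhand)}")
    lines.append(f"cursor: {slot_to_text(state.cursor)}")
    return "\n".join(lines)


state = InventoryState()
state.hotbar[0] = Slot("oak_log", 12)
state.cursor = Slot("torch", 3)
print(to_text(state))
```

Listing only the filled slots keeps the text short, which should matter if it has to be fed in on every frame.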
Small issue though:
It might get tricky, since the model needs to read that list as input at runtime while also being able to edit it. But I thought of a possible solution: increase the frame resolution of the model, store all the information as color data in the extra pixels, and then just crop that region out for the viewer! That way it's in the context window as part of the image data. You would have to generate that region with code as part of the training data, though. Alternatively, you could make the model omnimodal so it can use a sidecar of text data. If it gets good enough at using the sidecar text, you could even store things like nearby entities/mobs in it, or other important state that you want to protect from hallucinations.
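Here is a minimal sketch of the "color data in extra pixels" idea, assuming frames are plain row-major lists of RGB tuples. The function names (`encode_strip`, `decode_strip`, `attach`, `split`) and the 2-byte length header are my own choices for illustration, not anything the project uses.

```python
def encode_strip(payload: bytes, width: int, rows: int):
    """Pack payload bytes into `rows` rows of `width` RGB pixels (3 bytes per pixel)."""
    total = width * rows * 3
    if len(payload) > min(total - 2, 0xFFFF):
        raise ValueError("payload does not fit in the strip")
    # 2-byte big-endian length header, then the payload, then zero padding.
    data = (len(payload).to_bytes(2, "big") + payload).ljust(total, b"\x00")
    pixels = [tuple(data[i:i + 3]) for i in range(0, total, 3)]
    return [pixels[r * width:(r + 1) * width] for r in range(rows)]


def decode_strip(strip) -> bytes:
    """Inverse of encode_strip: flatten the pixels and read back the payload."""
    data = bytes(channel for row in strip for pixel in row for channel in pixel)
    length = int.from_bytes(data[:2], "big")
    return data[2:2 + length]


def attach(frame, strip):
    """Append the data strip below the visible frame (what the model would see)."""
    return frame + strip


def split(padded, rows: int):
    """Crop the strip back off: returns (visible frame, hidden strip)."""
    return padded[:-rows], padded[-rows:]


# Tiny 8x4 black frame standing in for a real game frame.
frame = [[(0, 0, 0)] * 8 for _ in range(4)]
payload = b"hotbar: 0:oak_log x12"

padded = attach(frame, encode_strip(payload, width=8, rows=2))
visible, hidden = split(padded, rows=2)
print(decode_strip(hidden).decode())
```

One caveat with this exact version: it stores raw byte values, which probably would not survive lossy video compression or a model that only reproduces pixels approximately. A real version would likely need something coarser and more redundant, such as a few widely spaced colors per slot.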
To summarize: give the model a text file as extra context, so it stops hallucinating select things like the inventory!