openai tiktoken chromadb langchain gradio pypdf requests unstructured validators pytesseract pdf2image tabulate nltk python-dotenv faiss-cpu tokenizers