---
license: mit
language:
- en
library_name: fasttext
tags:
- schema
- word-embeddings
- embeddings
- fasttext
- unsupervised-learning
- tables
- web-table
- schema-data
---

# Pre-trained Web Table Embeddings

The models here represent schema terms and instance data terms in a semantic vector space, making them especially useful for representing schema and class information as well as for ML tasks on tabular text data.

The code for executing and evaluating the models is located in the [table-embeddings GitHub repository](https://github.com/guenthermi/table-embeddings).

## Quick Start

You can install the table_embeddings package to encode text from tables by running the following commands:

```bash
pip install cython
pip install git+https://github.com/guenthermi/table-embeddings.git
```

After that, you can encode text with the following Python snippet:

```python
from table_embeddings import TableEmbeddingModel

# Load the 64-dimensional W-plain model
model = TableEmbeddingModel.load_model('ddrg/web_table_embeddings_plain64')
# Embed a table header term
embedding = model.get_header_vector('headline')
```

## Model Types

| Model Type | Description | Download Links |
| ---------- | ----------- | -------------- |
| W-tax | Model of relations between table header and table body | ([64dim](https://huggingface.co/ddrg/web_table_embeddings_tax64), [150dim](https://huggingface.co/ddrg/web_table_embeddings_tax150)) |
| W-row | Model of row-wise relations in tables | ([64dim](https://huggingface.co/ddrg/web_table_embeddings_row64), [150dim](https://huggingface.co/ddrg/web_table_embeddings_row150)) |
| W-combo | Model of row-wise relations and relations between table header and table body | ([64dim](https://huggingface.co/ddrg/web_table_embeddings_combo64), [150dim](https://huggingface.co/ddrg/web_table_embeddings_combo150)) |
| W-plain | Model of row-wise relations in tables without pre-processing | ([64dim](https://huggingface.co/ddrg/web_table_embeddings_plain64), [150dim](https://huggingface.co/ddrg/web_table_embeddings_plain150)) |

## More Information

For examples of how to use the models, see the [GitHub repository](https://github.com/guenthermi/table-embeddings).

More information can be found in the paper [Pre-Trained Web Table Embeddings for Table Discovery](https://dl.acm.org/doi/10.1145/3464509.3464892):

```
@inproceedings{gunther2021pre,
  title={Pre-Trained Web Table Embeddings for Table Discovery},
  author={G{\"u}nther, Michael and Thiele, Maik and Gonsior, Julius and Lehner, Wolfgang},
  booktitle={Fourth Workshop in Exploiting AI Techniques for Data Management},
  pages={24--31},
  year={2021}
}
```
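
A common way to use header vectors such as the one produced in the Quick Start is to compare them by cosine similarity, for example to rank candidate column names against a query term. The following is a minimal sketch of that idea, not part of the table_embeddings package: the helper names `cosine_similarity` and `rank_headers` are hypothetical, and the toy vectors stand in for real embeddings. It assumes that `get_header_vector` returns a one-dimensional sequence of numbers.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_headers(
    query: Sequence[float], candidates: Dict[str, Sequence[float]]
) -> List[Tuple[str, float]]:
    """Return (name, similarity) pairs, most similar to the query first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 2-dimensional vectors used purely for illustration (not real embeddings).
query_vec = [1.0, 0.0]
candidate_vecs = {"title": [0.9, 0.1], "price": [0.0, 1.0]}
for name, score in rank_headers(query_vec, candidate_vecs):
    print(name, round(score, 3))
```

With a loaded model, the toy vectors could be replaced by the results of `model.get_header_vector(...)` for each term, under the assumption stated above about its return type.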