guenthermi committed on
Commit
86ea4dd
1 Parent(s): adac9dd

Create README.md

  1. README.md +62 -0
README.md ADDED
@@ -0,0 +1,62 @@
+ ---
+ license: mit
+ language:
+ - en
+ library_name: fasttext
+ tags:
+ - schema
+ - word-embeddings
+ - embeddings
+ - fasttext
+ - unsupervised-learning
+ - tables
+ - web-table
+ - schema-data
+ ---
+ # Pre-trained Web Table Embeddings
+
+ The models here represent schema terms and instance data terms in a semantic vector space. This makes them especially useful for representing schema and class information, as well as for ML tasks on tabular text data.
+
+ The code for executing and evaluating the models is located in the [table-embeddings GitHub repository](https://github.com/guenthermi/table-embeddings).
+
+ ## Quick Start
+
+ You can install the table_embeddings package to encode text from tables by running the following commands:
+
+ ```bash
+ pip install cython
+ pip install git+https://github.com/guenthermi/table-embeddings.git
+ ```
+
+ After that, you can encode text with the following Python snippet:
+
+ ```python
+ from table_embeddings import TableEmbeddingModel
+
+ # Load the 150-dimensional W-row model (see Model Types)
+ model = TableEmbeddingModel.load_model('ddrg/web_table_embeddings_row150')
+ # Look up the vector for the header term 'headline'
+ embedding = model.get_header_vector('headline')
+ ```
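Once header vectors are available, one common way to compare two terms is cosine similarity. The sketch below is not part of the table_embeddings package: `cosine_similarity` is a hypothetical helper, and the short lists are made-up stand-ins for vectors such as those returned by `get_header_vector`.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (hypothetical helper)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Convention chosen for this sketch: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)


# Made-up 3-dimensional stand-ins; the published models use 64 or 150 dimensions.
vec_a = [1.0, 0.0, 1.0]
vec_b = [1.0, 0.0, 0.0]
similarity = cosine_similarity(vec_a, vec_b)
```

In practice, `vec_a` and `vec_b` would be replaced by vectors obtained from the loaded model, assuming they can be iterated as sequences of floats.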
+
+ ## Model Types
+
+ | Model Type | Description | Download Links |
+ | ---------- | ----------- | -------------- |
+ | W-tax | Model of relations between table header and table body | [64dim](https://huggingface.co/ddrg/web_table_embeddings_tax64), [150dim](https://huggingface.co/ddrg/web_table_embeddings_tax150) |
+ | W-row | Model of row-wise relations in tables | [64dim](https://huggingface.co/ddrg/web_table_embeddings_row64), [150dim](https://huggingface.co/ddrg/web_table_embeddings_row150) |
+ | W-combo | Model of row-wise relations and relations between table header and table body | [64dim](https://huggingface.co/ddrg/web_table_embeddings_combo64), [150dim](https://huggingface.co/ddrg/web_table_embeddings_combo150) |
+ | W-plain | Model of row-wise relations in tables without pre-processing | [64dim](https://huggingface.co/ddrg/web_table_embeddings_plain64), [150dim](https://huggingface.co/ddrg/web_table_embeddings_plain150) |
+
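The eight download links above follow one naming pattern, so a model identifier can be assembled from a model type and a dimension. This is a minimal sketch of that pattern as read off the links; `MODEL_TYPES`, `DIMENSIONS`, and `model_repo_id` are hypothetical names introduced here and are not part of the table_embeddings package.

```python
# Model types and dimensions as listed in the table (assumed to be exhaustive).
MODEL_TYPES = ("tax", "row", "combo", "plain")
DIMENSIONS = (64, 150)


def model_repo_id(model_type, dim):
    """Build an identifier such as 'ddrg/web_table_embeddings_row150'."""
    if model_type not in MODEL_TYPES:
        raise ValueError(f"unknown model type: {model_type!r}")
    if dim not in DIMENSIONS:
        raise ValueError(f"unsupported dimension: {dim!r}")
    return f"ddrg/web_table_embeddings_{model_type}{dim}"


# Enumerate every identifier that appears in the table.
all_ids = [model_repo_id(t, d) for t in MODEL_TYPES for d in DIMENSIONS]
```

For the W-row model with 150 dimensions, `model_repo_id("row", 150)` yields the same string that the Quick Start snippet passes to `load_model`.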
+ ## More Information
+
+ For examples of how to use the models, see the [GitHub repository](https://github.com/guenthermi/table-embeddings).
+
+ More information can be found in the paper [Pre-Trained Web Table Embeddings for Table Discovery](https://dl.acm.org/doi/10.1145/3464509.3464892):
+
+ ```
+ @inproceedings{gunther2021pre,
+   title={Pre-Trained Web Table Embeddings for Table Discovery},
+   author={G{\"u}nther, Michael and Thiele, Maik and Gonsior, Julius and Lehner, Wolfgang},
+   booktitle={Fourth Workshop in Exploiting AI Techniques for Data Management},
+   pages={24--31},
+   year={2021}
+ }
+ ```