---
license: mit
datasets:
- WhereIsAI/github-issue-similarity
language:
- en
library_name: sentence-transformers
pipeline_tag: feature-extraction
---

# WhereIsAI/UAE-Code-Large-V1
📢 **WhereIsAI/UAE-Code-Large-V1** is licensed under MIT. Feel free to use it in any scenario. If you use it in academic papers, we would greatly appreciate it if you could cite us. 👉 [citation info](#citation).

This model builds on [WhereIsAI/UAE-Large-V1](https://huggingface.co/WhereIsAI/UAE-Large-V1) and is fine-tuned on the [GIS: GitHub Issue Similarity](https://huggingface.co/datasets/WhereIsAI/github-issue-similarity) dataset using the AnglE loss (https://arxiv.org/abs/2309.12871). It can be used to measure code/issue similarity.

Results (test set):

- Spearman correlation: 71.19
- Accuracy: 84.37
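
One concrete way to use the model for issue similarity is to embed a new issue together with existing ones and rank the candidates by cosine similarity. The snippet below is a minimal sketch of that idea with made-up issue titles (installing `angle-emb` is covered in the Usage section):

```python
# Minimal sketch: rank existing issues against a new issue by cosine similarity.
# The issue titles are hypothetical and used only for illustration.
import numpy as np
from angle_emb import AnglE

model = AnglE.from_pretrained('WhereIsAI/UAE-Code-Large-V1').cuda()

new_issue = 'Crash when parsing an empty YAML config file'
existing_issues = [
    'Segfault on empty configuration input',
    'Add dark mode to the settings page',
    'Parser raises KeyError for missing fields',
]

vecs = np.asarray(model.encode([new_issue] + existing_issues))
vecs = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)  # L2-normalize rows

scores = vecs[1:] @ vecs[0]  # cosine similarity of each candidate to the new issue
for score, issue in sorted(zip(scores, existing_issues), reverse=True):
    print(f'{score:.4f}  {issue}')
```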
## Usage

### 1. angle-emb

You can use it via `angle-emb` as follows:

Install:

```bash
python -m pip install -U angle-emb
```
Example:

```python
from scipy import spatial
from angle_emb import AnglE

model = AnglE.from_pretrained('WhereIsAI/UAE-Code-Large-V1').cuda()

quick_sort = '''# Approach 2: Quicksort using list comprehension
def quicksort(arr):
    if len(arr) <= 1:
        return arr
    else:
        pivot = arr[0]
        left = [x for x in arr[1:] if x < pivot]
        right = [x for x in arr[1:] if x >= pivot]
        return quicksort(left) + [pivot] + quicksort(right)
# Example usage
arr = [1, 7, 4, 1, 10, 9, -2]
sorted_arr = quicksort(arr)
print("Sorted Array in Ascending Order:")
print(sorted_arr)'''

bubble_sort = '''def bubblesort(elements):
    # Looping from size of array from last index[-1] to index [0]
    for n in range(len(elements)-1, 0, -1):
        swapped = False
        for i in range(n):
            if elements[i] > elements[i + 1]:
                swapped = True
                # swapping data if the element is less than next element in the array
                elements[i], elements[i + 1] = elements[i + 1], elements[i]
        if not swapped:
            # exiting the function if we didn't make a single swap
            # meaning that the array is already sorted.
            return
elements = [39, 12, 18, 85, 72, 10, 2, 18]
print("Unsorted list is,")
print(elements)
bubblesort(elements)
print("Sorted Array is, ")
print(elements)'''

vecs = model.encode([
    'def echo(): print("hello world")',
    quick_sort,
    bubble_sort
])

print('cos sim (0, 1):', 1 - spatial.distance.cosine(vecs[0], vecs[1]))
print('cos sim (0, 2):', 1 - spatial.distance.cosine(vecs[0], vecs[2]))
print('cos sim (1, 2):', 1 - spatial.distance.cosine(vecs[1], vecs[2]))
```
Output:

```
cos sim (0, 1): 0.34329649806022644
cos sim (0, 2): 0.3627094626426697
cos sim (1, 2): 0.6972219347953796
```
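For more than a few snippets, it can be more convenient to L2-normalize the embeddings and compute all pairwise cosine similarities with a single matrix product instead of repeated scipy calls. A minimal sketch, reusing the `vecs` array from the example above:

```python
import numpy as np

# `vecs` is the (3, hidden_dim) array returned by model.encode(...) above.
normed = np.asarray(vecs)
normed = normed / np.linalg.norm(normed, axis=1, keepdims=True)  # L2-normalize rows
sim_matrix = normed @ normed.T  # (3, 3) matrix of pairwise cosine similarities
print(sim_matrix)
```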
### 2. sentence-transformers

You can also use it via `sentence-transformers`:

```python
from scipy import spatial
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('WhereIsAI/UAE-Code-Large-V1').cuda()

quick_sort = '''# Approach 2: Quicksort using list comprehension
def quicksort(arr):
    if len(arr) <= 1:
        return arr
    else:
        pivot = arr[0]
        left = [x for x in arr[1:] if x < pivot]
        right = [x for x in arr[1:] if x >= pivot]
        return quicksort(left) + [pivot] + quicksort(right)
# Example usage
arr = [1, 7, 4, 1, 10, 9, -2]
sorted_arr = quicksort(arr)
print("Sorted Array in Ascending Order:")
print(sorted_arr)'''

bubble_sort = '''def bubblesort(elements):
    # Looping from size of array from last index[-1] to index [0]
    for n in range(len(elements)-1, 0, -1):
        swapped = False
        for i in range(n):
            if elements[i] > elements[i + 1]:
                swapped = True
                # swapping data if the element is less than next element in the array
                elements[i], elements[i + 1] = elements[i + 1], elements[i]
        if not swapped:
            # exiting the function if we didn't make a single swap
            # meaning that the array is already sorted.
            return
elements = [39, 12, 18, 85, 72, 10, 2, 18]
print("Unsorted list is,")
print(elements)
bubblesort(elements)
print("Sorted Array is, ")
print(elements)'''

vecs = model.encode([
    'def echo(): print("hello world")',
    quick_sort,
    bubble_sort
])

print('cos sim (0, 1):', 1 - spatial.distance.cosine(vecs[0], vecs[1]))
print('cos sim (0, 2):', 1 - spatial.distance.cosine(vecs[0], vecs[2]))
print('cos sim (1, 2):', 1 - spatial.distance.cosine(vecs[1], vecs[2]))
```
Output:

```
cos sim (0, 1): 0.34329649806022644
cos sim (0, 2): 0.3627094626426697
cos sim (1, 2): 0.6972219347953796
```
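`sentence-transformers` also provides a `util.cos_sim` helper that returns the full pairwise cosine-similarity matrix in one call, so the scipy step is optional. A minimal, self-contained sketch (the extra snippets are hypothetical):

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('WhereIsAI/UAE-Code-Large-V1')

snippets = [
    'def echo(): print("hello world")',
    'def add(a, b): return a + b',     # hypothetical extra snippet
    'def shout(s): print(s.upper())',  # hypothetical extra snippet
]

vecs = model.encode(snippets)
sim = util.cos_sim(vecs, vecs)  # torch tensor of shape (3, 3)
print(sim)
```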
## Citation

```bibtex
@article{li2023angle,
  title={AnglE-optimized Text Embeddings},
  author={Li, Xianming and Li, Jing},
  journal={arXiv preprint arXiv:2309.12871},
  year={2023}
}
```