""" from https://github.com/keithito/tacotron """
from text import cleaners
from text.symbols import symbols
from text import cleaners1
from text.symbols1 import symbols1

# Mappings from symbol to numeric ID and vice versa:
_symbol_to_id = {s: i for i, s in enumerate(symbols)}
_id_to_symbol = {i: s for i, s in enumerate(symbols)}
_symbol_to_id1 = {s: i for i, s in enumerate(symbols1)}
_id_to_symbol1 = {i: s for i, s in enumerate(symbols1)}

def text_to_sequence(text, cleaner_names):
  '''Converts a string of text to a sequence of IDs corresponding to the symbols in the text.
    Args:
      text: string to convert to a sequence
      cleaner_names: names of the cleaner functions to run the text through
    Returns:
      List of integers corresponding to the symbols in the text
  '''
  # Clean the raw text first, then map the cleaned symbols to IDs.
  clean_text = _clean_text(text, cleaner_names)
  return cleaned_text_to_sequence(clean_text)


def text_to_sequence1(text, cleaner_names):
  '''Same as text_to_sequence, but uses cleaners1 and the symbols1 table.'''
  clean_text = _clean_text1(text, cleaner_names)
  return cleaned_text_to_sequence1(clean_text)


def cleaned_text_to_sequence(cleaned_text):
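  '''Converts a string of cleaned text to a sequence of IDs corresponding to the symbols in the text.
    Space-separated tokens found in the symbol table map to one ID each; any other token is
    converted character by character. A space ID is placed between tokens.
    Args:
      cleaned_text: already-cleaned string to convert to a sequence
    Returns:
      List of integers corresponding to the symbols in the text
  '''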
  sequence = []
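  for symbol in cleaned_text.split(" "):
    if symbol in _symbol_to_id:
      # The whole token is a known (possibly multi-character) symbol.
      sequence.append(_symbol_to_id[symbol])
    else:
      # Otherwise fall back to one ID per character; unknown characters raise KeyError.
      for s in symbol:
        sequence.append(_symbol_to_id[s])
    sequence.append(_symbol_to_id[" "])
  # Drop the separator appended after the last token.
  if sequence[-1] == _symbol_to_id[" "]:
    sequence = sequence[:-1]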
  return sequence

def cleaned_text_to_sequence1(cleaned_text):
  '''Same as cleaned_text_to_sequence, but looks symbols up in the symbols1 table.'''
  sequence = []
  for symbol1 in cleaned_text.split(" "):
    if symbol1 in _symbol_to_id1:
      sequence.append(_symbol_to_id1[symbol1])
    else:
      for s in symbol1:
        sequence.append(_symbol_to_id1[s])
    sequence.append(_symbol_to_id1[" "])
  if sequence[-1] == _symbol_to_id1[" "]:
    sequence = sequence[:-1]
  return sequence



def sequence_to_text(sequence):
  '''Converts a sequence of IDs back to a string'''
  return ''.join(_id_to_symbol[symbol_id] for symbol_id in sequence)

  

def _clean_text(text, cleaner_names):
  for name in cleaner_names:
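    # Pass a default so an unknown name reaches the explicit error below
    # instead of raising AttributeError inside getattr.
    cleaner = getattr(cleaners, name, None)
    if cleaner is None:
      raise Exception('Unknown cleaner: %s' % name)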
    text = cleaner(text)
  return text

def _clean_text1(text, cleaner_names):
  for name in cleaner_names:
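    # Pass a default so an unknown name reaches the explicit error below
    # instead of raising AttributeError inside getattr.
    cleaner = getattr(cleaners1, name, None)
    if cleaner is None:
      raise Exception('Unknown cleaner: %s' % name)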
    text = cleaner(text)
  return text