trajdl.tokenizers.locseq module

class trajdl.tokenizers.locseq.LocSeqTokenizer(vocab: Dict[str, int])

Bases: AbstractLocSeqTokenizer

classmethod build(loc_seqs: Iterable[LocSeq], count_start_end_token: bool = False, min_count: int = 0, enable_progress_bar: bool = False) -> LocSeqTokenizer

Class method that builds a Tokenizer instance. Subclasses may adjust the parameters to suit their needs.

classmethod construct_vocab(loc_seqs: Iterable[LocSeq], count_start_end_token: bool = True, min_count: int = 0, enable_progress_bar: bool = False) -> Dict[str, int]

Class method that constructs the vocabulary from the input data.
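
As a rough illustration of how a vocabulary with a min_count threshold can be built, here is a minimal standalone sketch. It does not import trajdl. The function name construct_vocab_sketch, the use of plain lists of strings in place of LocSeq objects, and the rule that locations seen fewer than min_count times are dropped are all assumptions made for this example, not confirmed library behavior. Special start/end tokens and the progress bar are left out.

```python
from collections import Counter
from typing import Dict, Iterable, List


def construct_vocab_sketch(
    loc_seqs: Iterable[List[str]], min_count: int = 0
) -> Dict[str, int]:
    """Hypothetical helper: map each sufficiently frequent location to an index."""
    # Count how often every location appears across all sequences.
    counts = Counter(loc for seq in loc_seqs for loc in seq)

    vocab: Dict[str, int] = {}
    # Counter keeps first-seen order, so indices follow first appearance.
    for loc, n in counts.items():
        if n >= min_count:  # assumed filtering rule
            vocab[loc] = len(vocab)
    return vocab


example_vocab = construct_vocab_sketch(
    [["a", "b", "a"], ["b", "c"]], min_count=2
)
```

In this sketch, "c" appears only once and is therefore excluded when min_count is 2.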

loc2idx(loc: str) -> int

Converts a location to its index.
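
A location-to-index lookup over a Dict[str, int] vocabulary can be sketched as follows. This is a standalone illustration, not the trajdl implementation: the function name loc2idx_sketch, the "[UNK]" token name, and the choice to fall back to that token for unseen locations are all hypothetical.

```python
from typing import Dict

UNK_TOKEN = "[UNK]"  # hypothetical name for the unknown-location token


def loc2idx_sketch(vocab: Dict[str, int], loc: str) -> int:
    """Hypothetical helper: return the index of loc, or the UNK index if unseen."""
    if loc in vocab:
        return vocab[loc]
    # Assumed fallback; raises KeyError if the vocabulary has no UNK entry.
    return vocab[UNK_TOKEN]


sketch_vocab = {UNK_TOKEN: 0, "loc_a": 1, "loc_b": 2}
known_idx = loc2idx_sketch(sketch_vocab, "loc_b")
unknown_idx = loc2idx_sketch(sketch_vocab, "loc_zzz")
```

Whether the real tokenizer falls back to an unknown token or raises an error for unseen locations is not stated in this page, so check the source before relying on either behavior.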