trajdl.tokenizers.t2vec module

class trajdl.tokenizers.t2vec.T2VECTokenizer(grid: SimpleGridSystem, gps_boundary: RectangleBoundary, vocab: Dict[str, int], with_kd_tree: bool = False, hot_locations: List[str] = None, kdtree: KDTree = None)

Bases: AbstractTrajTokenizer

Tokenizer for t2vec, designed specifically for trajectory sequences.

classmethod build(grid: SimpleGridSystem, boundary: RectangleBoundary, trajectories: Iterable[Trajectory], max_vocab_size: int, min_freq: int, with_kd_tree: bool = False) → T2VECTokenizer

Class method that builds a Tokenizer instance; subclasses may adjust the parameters to suit their needs.

classmethod construct_vocab(grid: SimpleGridSystem, gps_boundary: RectangleBoundary, trajectories: Iterable[Trajectory], max_vocab_size: int, min_freq: int) → Dict[str, int]
Parameters:
  • grid (SimpleGridSystem) – a grid system based on the Web Mercator coordinate system

  • gps_boundary (trajdl_cpp.RectangleBoundary) – a boundary based on the WGS84 coordinate system
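
The roles of max_vocab_size and min_freq can be illustrated with a minimal standalone sketch. It does not use trajdl and is not the library's implementation: it starts from location sequences that have already been mapped to grid cells, and the special token names, their ids, and the choice to exclude them from the size cap are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List

# Hypothetical special tokens; the real library may use different names or ids.
SPECIAL_TOKENS = ["<pad>", "<bos>", "<eos>", "<unk>"]


def construct_vocab_sketch(
    loc_seqs: Iterable[List[str]], max_vocab_size: int, min_freq: int
) -> Dict[str, int]:
    """Keep the most frequent locations, subject to a frequency floor and a size cap."""
    counts = Counter(loc for seq in loc_seqs for loc in seq)
    # Sort by descending frequency, then by name so that ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    # Assumption: the cap applies to locations only, not to the special tokens.
    kept = [loc for loc, freq in ranked if freq >= min_freq][:max_vocab_size]
    return {tok: idx for idx, tok in enumerate(SPECIAL_TOKENS + kept)}


vocab = construct_vocab_sketch(
    [["12", "13", "14"], ["13", "14", "20"], ["14", "13"]],
    max_vocab_size=2,
    min_freq=2,
)
print(vocab)
```

In this toy input, cells "12" and "20" each appear once and are dropped by min_freq=2, so only "13" and "14" receive ids after the special tokens.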

k_nearest_hot_loc(loc_list: List[str], k: int) → Tuple[ndarray, List[List[str]]]

Search for the k nearest neighbors of each location in the given loc_list.
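
The shape of such a query can be sketched without the library. The example below is a brute-force stand-in for a KD-tree lookup, written with the standard library only; the cell ids, their planar coordinates, and the set of hot locations are made up, and the real method returns an ndarray of distances rather than nested lists.

```python
import heapq
import math
from typing import Dict, List, Tuple

# Hypothetical cell centres in planar metres, keyed by location id.
CELL_CENTERS: Dict[str, Tuple[float, float]] = {
    "10": (0.0, 0.0), "11": (100.0, 0.0), "12": (200.0, 0.0),
    "20": (0.0, 100.0), "21": (100.0, 100.0), "22": (200.0, 100.0),
}
# Hypothetical subset of frequently visited ("hot") locations.
HOT_LOCATIONS: List[str] = ["10", "12", "22"]


def k_nearest_sketch(
    loc_list: List[str], k: int
) -> Tuple[List[List[float]], List[List[str]]]:
    """Return, per query location, the k smallest distances and the matching hot ids."""
    all_dists, all_names = [], []
    for loc in loc_list:
        qx, qy = CELL_CENTERS[loc]
        scored = [
            (math.hypot(qx - CELL_CENTERS[h][0], qy - CELL_CENTERS[h][1]), h)
            for h in HOT_LOCATIONS
        ]
        nearest = heapq.nsmallest(k, scored)  # brute force instead of a KD-tree
        all_dists.append([d for d, _ in nearest])
        all_names.append([h for _, h in nearest])
    return all_dists, all_names


dists, names = k_nearest_sketch(["20", "12"], k=2)
print(names)  # [['10', '22'], ['12', '22']]
print(dists)  # [[100.0, 200.0], [0.0, 100.0]]
```

A KD-tree gives the same answers while avoiding the full scan over hot locations for every query, which matters once the hot set is large.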

loc2idx(loc: str) → int

Convert a location to its index.

tokenize_traj(traj: Trajectory | ndarray, add_start_end_token: bool = False, return_as: str = 'py') → List[int]

Transform a trajectory into a sequence of location indices.
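
The effect of add_start_end_token can be shown with a small standalone sketch. It operates on location strings rather than a Trajectory, and the vocabulary, the token names, and the fallback to an unknown token are assumptions; the library may resolve unseen locations differently.

```python
from typing import Dict, List

# Hypothetical vocabulary; token names and ids are illustrative only.
VOCAB: Dict[str, int] = {"<bos>": 0, "<eos>": 1, "<unk>": 2, "13": 3, "14": 4}


def tokenize_sketch(loc_seq: List[str], add_start_end_token: bool = False) -> List[int]:
    """Map location strings to ids, falling back to <unk> for unseen locations."""
    ids = [VOCAB.get(loc, VOCAB["<unk>"]) for loc in loc_seq]
    if add_start_end_token:
        # Wrap the sequence with the start and end markers.
        ids = [VOCAB["<bos>"]] + ids + [VOCAB["<eos>"]]
    return ids


plain = tokenize_sketch(["13", "99", "14"])
wrapped = tokenize_sketch(["13", "99", "14"], add_start_end_token=True)
print(plain)    # [3, 2, 4]
print(wrapped)  # [0, 3, 2, 4, 1]
```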

traj_to_loc_seq(traj: Trajectory | ndarray, add_start_end_token: bool) → List[str]

Transform a trajectory into a location sequence
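
Because the boundary is given in WGS84 while the grid lives in Web Mercator, one plausible reading of this step is: project each point, locate its grid cell, and emit the cell id as a string. The sketch below follows that reading with the standard spherical Web Mercator formulas. The function names, the row-major cell numbering, the step_m parameter, and the decision to skip out-of-boundary points are all assumptions, not trajdl behaviour.

```python
import math
from typing import Iterable, List, Tuple

EARTH_RADIUS_M = 6378137.0  # WGS84 semi-major axis, used by Web Mercator


def wgs84_to_web_mercator(lon: float, lat: float) -> Tuple[float, float]:
    """Forward spherical Mercator projection from degrees to metres."""
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


def traj_to_loc_seq_sketch(
    points: Iterable[Tuple[float, float]],
    min_lon: float, min_lat: float, max_lon: float, max_lat: float,
    step_m: float,
) -> List[str]:
    """Map (lon, lat) points to string ids of square cells of side step_m."""
    min_x, min_y = wgs84_to_web_mercator(min_lon, min_lat)
    max_x, _ = wgs84_to_web_mercator(max_lon, max_lat)
    num_cols = math.ceil((max_x - min_x) / step_m)
    seq = []
    for lon, lat in points:
        if not (min_lon <= lon <= max_lon and min_lat <= lat <= max_lat):
            continue  # assumption: points outside the boundary are skipped
        x, y = wgs84_to_web_mercator(lon, lat)
        col = int((x - min_x) // step_m)
        row = int((y - min_y) // step_m)
        seq.append(str(row * num_cols + col))  # hypothetical row-major cell id
    return seq


loc_seq = traj_to_loc_seq_sketch(
    [(0.0005, 0.0005), (0.0015, 0.0005), (0.0005, 0.0015), (1.0, 1.0)],
    min_lon=0.0, min_lat=0.0, max_lon=0.01, max_lat=0.01, step_m=100.0,
)
print(loc_seq)  # ['0', '1', '12']
```

The boundary here is roughly 1113 m wide, giving 12 columns of 100 m cells; the last point lies outside the boundary and produces no location.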