trajdl.tokenizers.t2vec module
- class trajdl.tokenizers.t2vec.T2VECTokenizer(grid: SimpleGridSystem, gps_boundary: RectangleBoundary, vocab: Dict[str, int], with_kd_tree: bool = False, hot_locations: List[str] = None, kdtree: KDTree = None)
Bases: AbstractTrajTokenizer
Tokenizer for t2vec, designed specifically for processing trajectory sequences.
- classmethod build(grid: SimpleGridSystem, boundary: RectangleBoundary, trajectories: Iterable[Trajectory], max_vocab_size: int, min_freq: int, with_kd_tree: bool = False) → T2VECTokenizer
Class method that builds a tokenizer instance. Subclasses may adjust the parameters to suit their needs.
- classmethod construct_vocab(grid: SimpleGridSystem, gps_boundary: RectangleBoundary, trajectories: Iterable[Trajectory], max_vocab_size: int, min_freq: int) → Dict[str, int]
- Parameters:
grid (SimpleGridSystem) – A grid system based on the Web Mercator coordinate system.
gps_boundary (trajdl_cpp.RectangleBoundary) – A boundary expressed in the WGS84 coordinate system.
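The relationship between the WGS84 boundary, the Web Mercator grid, and the `max_vocab_size` / `min_freq` arguments can be illustrated with a minimal, standard-library-only sketch. It does not use or reproduce the trajdl implementation: the helper names (`wgs84_to_web_mercator`, `cell_id`, `construct_vocab_sketch`), the `"col_row"` cell naming, the special tokens, and the choice to count only location tokens against `max_vocab_size` are all assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Dict, Iterable, List, Tuple

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator

# Hypothetical special tokens; the real tokenizer may use different ones.
SPECIAL_TOKENS = ["[PAD]", "[BOS]", "[EOS]", "[UNK]"]


def wgs84_to_web_mercator(lon: float, lat: float) -> Tuple[float, float]:
    """Project a WGS84 (lon, lat) pair in degrees to Web Mercator metres."""
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


def cell_id(lon: float, lat: float, min_lon: float, min_lat: float, step_m: float) -> str:
    """Name the grid cell containing a point, relative to the boundary's lower-left corner."""
    x0, y0 = wgs84_to_web_mercator(min_lon, min_lat)
    x, y = wgs84_to_web_mercator(lon, lat)
    col = int((x - x0) // step_m)
    row = int((y - y0) // step_m)
    return f"{col}_{row}"  # assumed naming scheme


def construct_vocab_sketch(
    trajectories: Iterable[List[Tuple[float, float]]],
    min_lon: float,
    min_lat: float,
    step_m: float,
    max_vocab_size: int,
    min_freq: int,
) -> Dict[str, int]:
    """Keep the most frequent cells, dropping those seen fewer than min_freq times."""
    counts = Counter(
        cell_id(lon, lat, min_lon, min_lat, step_m)
        for traj in trajectories
        for lon, lat in traj
    )
    vocab = {tok: idx for idx, tok in enumerate(SPECIAL_TOKENS)}
    # most_common yields cells in descending frequency, so we can stop early.
    for loc, freq in counts.most_common(max_vocab_size):
        if freq < min_freq:
            break
        vocab[loc] = len(vocab)
    return vocab
```

In this sketch, raising `min_freq` removes rarely visited cells from the vocabulary, while `max_vocab_size` caps the number of location tokens that survive.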
- k_nearest_hot_loc(loc_list: List[str], k: int) → Tuple[ndarray, List[List[str]]]
Search for the k nearest hot locations of each location in loc_list.
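The idea of a k-nearest hot-location lookup can be sketched as follows. This is an illustration only: it uses a brute-force search with the standard library rather than the KD-tree that the class signature suggests, and the function name `k_nearest_sketch`, the `centers` mapping from location name to cell-centre coordinates, and the list-based return values are assumptions, not the trajdl API.

```python
import heapq
import math
from typing import Dict, List, Tuple


def k_nearest_sketch(
    loc_list: List[str],
    hot_locations: List[str],
    centers: Dict[str, Tuple[float, float]],  # hypothetical: location name -> (x, y)
    k: int,
) -> Tuple[List[List[float]], List[List[str]]]:
    """For each query location, return distances to and names of its k nearest hot locations."""
    all_dists: List[List[float]] = []
    all_names: List[List[str]] = []
    for loc in loc_list:
        qx, qy = centers[loc]
        # Brute force: compute every distance, then keep the k smallest.
        nearest = heapq.nsmallest(
            k,
            (
                (math.hypot(qx - centers[h][0], qy - centers[h][1]), h)
                for h in hot_locations
            ),
        )
        all_dists.append([d for d, _ in nearest])
        all_names.append([name for _, name in nearest])
    return all_dists, all_names
```

A KD-tree answers the same queries without scanning every hot location, which matters once the vocabulary holds many cells; the brute-force version is shown only because it is short and dependency-free.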
- tokenize_traj(traj: Trajectory | ndarray, add_start_end_token: bool = False, return_as: str = 'py') → List[int]
Transform a trajectory into a sequence of integer location tokens, optionally wrapped in start and end tokens.
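As a rough illustration of the last step, mapping a location sequence to integer ids, consider the sketch below. It is not the trajdl implementation: the function name `tokenize_sketch`, the special token names, and the fallback to an unknown token are assumptions. The `with_kd_tree` option on the class suggests the real tokenizer may instead map out-of-vocabulary locations to a nearby hot location.

```python
from typing import Dict, List


def tokenize_sketch(
    loc_seq: List[str],
    vocab: Dict[str, int],
    add_start_end_token: bool = False,
) -> List[int]:
    """Look up each location in the vocabulary, falling back to a hypothetical [UNK] id."""
    unk_id = vocab["[UNK]"]
    ids = [vocab.get(loc, unk_id) for loc in loc_seq]
    if add_start_end_token:
        # Wrap the sequence in hypothetical begin/end markers.
        ids = [vocab["[BOS]"]] + ids + [vocab["[EOS]"]]
    return ids
```

The string sequence passed in here plays the role of the output of `traj_to_loc_seq`, so the two methods can be read as a two-stage pipeline: coordinates to location names, then names to ids.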
- traj_to_loc_seq(traj: Trajectory | ndarray, add_start_end_token: bool) → List[str]
Transform a trajectory into a location sequence