trajdl.algorithms.embeddings.base module#

class trajdl.algorithms.embeddings.base.BaseTokenEmbeddingLayer[source]#

Bases: Module

Base class for token embedding layers.

forward(x: torch.Tensor) torch.Tensor[source]#

Computes the embedding for the input tokens.

freeze_parameters() None[source]#

Freezes the parameters of the embedding layer, preventing them from being trained.

unfreeze_parameters() None[source]#

Unfreezes the parameters of the embedding layer, allowing them to be trained.

property is_frozen: bool#

Indicates whether the parameters are currently frozen.

abstract property embedding_dim: int#
abstract forward(x: LongTensor) Tensor[source]#

Subclasses must override this method to compute embeddings.

Parameters:

x (torch.LongTensor) – Input tensor containing token indices.

Returns:

Embedding tensor for the input tokens. It has one more dimension than x: a trailing dimension of size embedding_dim.

Return type:

torch.Tensor

freeze_parameters() None[source]#

Freeze the parameters to prevent training.

property is_frozen: bool#

Check if the parameters are frozen.

Returns:

True if parameters are frozen, otherwise False.

Return type:

bool

unfreeze_parameters() None[source]#

Unfreeze the parameters to allow training.
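The freeze/unfreeze interface above can be illustrated with a minimal sketch. It does not import trajdl and is not the library's implementation: FreezableEmbedding and its vocab_size argument are hypothetical, and toggling requires_grad is assumed to be how freezing is realized, since that is the usual PyTorch idiom.

```python
import torch
from torch import nn


class FreezableEmbedding(nn.Module):
    """Hypothetical stand-in that mirrors the documented interface."""

    def __init__(self, vocab_size: int, embedding_dim: int) -> None:
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)

    @property
    def embedding_dim(self) -> int:
        return self.embedding.embedding_dim

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Look up one vector per token index.
        return self.embedding(x)

    def freeze_parameters(self) -> None:
        # Assumption: freezing means disabling gradient tracking.
        for param in self.parameters():
            param.requires_grad_(False)

    def unfreeze_parameters(self) -> None:
        for param in self.parameters():
            param.requires_grad_(True)

    @property
    def is_frozen(self) -> bool:
        # Frozen only if no parameter tracks gradients.
        return all(not param.requires_grad for param in self.parameters())


layer = FreezableEmbedding(vocab_size=10, embedding_dim=4)
layer.freeze_parameters()
frozen_after_freeze = layer.is_frozen
layer.unfreeze_parameters()
frozen_after_unfreeze = layer.is_frozen
```

A typical use of this pattern is to freeze the embedding layer while the rest of a model warms up, then unfreeze it for fine-tuning.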

class trajdl.algorithms.embeddings.base.SimpleEmbedding(tokenizer: AbstractTokenizer, embedding_dim: int)[source]#

Bases: BaseTokenEmbeddingLayer

Token embedding layer that uses PyTorch’s nn.Embedding.

Parameters:
  • tokenizer (AbstractTokenizer) – Tokenizer that defines the token vocabulary.

  • embedding_dim (int) – The dimensionality of the embeddings.

forward(x: torch.Tensor) torch.Tensor[source]#

Computes the embeddings for the input tokens.

property embedding_dim: int#
forward(x: LongTensor) Tensor[source]#

Computes the embeddings for the input tokens.

Parameters:

x (torch.LongTensor) – Input tensor containing token indices.

Returns:

Embeddings for the input tokens. The result has one more dimension than x: a trailing dimension of size embedding_dim.

Return type:

torch.Tensor
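The shape behaviour of an nn.Embedding lookup can be sketched as follows. This example uses torch.nn.Embedding directly rather than SimpleEmbedding, and vocab_size is a hypothetical value; how SimpleEmbedding derives the vocabulary size from its tokenizer is not shown here.

```python
import torch
from torch import nn

vocab_size = 100   # hypothetical vocabulary size
embedding_dim = 8

embedding = nn.Embedding(vocab_size, embedding_dim)

# A batch of 2 sequences with 5 token indices each (int64 by default).
tokens = torch.randint(0, vocab_size, (2, 5))

# The lookup appends a trailing dimension of size embedding_dim.
out = embedding(tokens)
out_shape = tuple(out.shape)
```

Here out_shape is the input shape (2, 5) followed by embedding_dim.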

class trajdl.algorithms.embeddings.base.Word2VecEmbedding(tokenizer: AbstractTokenizer, model_path: str)[source]#

Bases: BaseTokenEmbeddingLayer

Token embedding layer that uses a Gensim Word2Vec model.

Parameters:
  • tokenizer (AbstractTokenizer) – Tokenizer that defines the token vocabulary.

  • model_path (str) – Path to the Word2Vec model file.

forward(x: torch.Tensor) torch.Tensor[source]#

Computes the Word2Vec embeddings for the input tokens.

property embedding_dim: int#
forward(x: LongTensor) Tensor[source]#

Computes the Word2Vec embeddings for the input tokens.

Parameters:

x (torch.LongTensor) – Input tensor containing token indices.

Returns:

Word2Vec embeddings for the input tokens. The result has one more dimension than x: a trailing dimension of size embedding_dim.

Return type:

torch.Tensor

load_pretrained_word2vec_embeddings(tokenizer: AbstractTokenizer, word2vec_model_path: str) Embedding[source]#

Loads pretrained Word2Vec embeddings from word2vec_model_path and returns them as an Embedding layer.
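As a rough illustration of loading pretrained vectors into an Embedding layer, the sketch below builds a weight matrix and passes it to nn.Embedding.from_pretrained. It is not trajdl's implementation: the vocab and pretrained dictionaries are hypothetical stand-ins for the tokenizer and the Gensim model, and zero-filling tokens without a pretrained vector is an assumption.

```python
import torch
from torch import nn

# Hypothetical token-to-index mapping (stand-in for a tokenizer).
vocab = {"<pad>": 0, "loc_a": 1, "loc_b": 2}

# Hypothetical pretrained vectors (stand-in for a Word2Vec model).
pretrained = {
    "loc_a": [0.1, 0.2, 0.3],
    "loc_b": [0.4, 0.5, 0.6],
}
dim = 3

# Row i of the weight matrix holds the vector for token index i.
# Assumption: tokens without a pretrained vector keep a zero row.
weights = torch.zeros(len(vocab), dim)
for token, index in vocab.items():
    if token in pretrained:
        weights[index] = torch.tensor(pretrained[token])

# freeze=True disables gradient updates for the loaded weights.
embedding = nn.Embedding.from_pretrained(weights, freeze=True)

vec = embedding(torch.tensor([2]))
```

Keeping the row order aligned with the tokenizer's indices is the key design point: the layer looks vectors up by index, not by token string.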