ernie_m

`mindnlp.transformers.models.ernie_m.configuration_ernie_m` ¶

ErnieM model configuration

`mindnlp.transformers.models.ernie_m.configuration_ernie_m.ErnieMConfig` ¶

Bases: PretrainedConfig

This is the configuration class to store the configuration of an [ErnieMModel]. It is used to instantiate an Ernie-M model according to the specified arguments, defining the model architecture. Instantiating a configuration with the defaults will yield a configuration similar to that of the Ernie-M susnato/ernie-m-base_pytorch architecture.

Configuration objects inherit from [PretrainedConfig] and can be used to control the model outputs. Read the documentation from [PretrainedConfig] for more information.

PARAMETER	DESCRIPTION
`vocab_size`	Vocabulary size of `input_ids` in [`ErnieMModel`], which is also the size of the token embedding matrix. Defines the number of different tokens that can be represented by the `input_ids` passed when calling [`ErnieMModel`]. TYPE: `int`, optional, defaults to 250002 DEFAULT: `250002`
`hidden_size`	Dimensionality of the embedding layer, encoder layers and pooler layer. TYPE: `int`, optional, defaults to 768 DEFAULT: `768`
`num_hidden_layers`	Number of hidden layers in the Transformer encoder. TYPE: `int`, optional, defaults to 12 DEFAULT: `12`
`num_attention_heads`	Number of attention heads for each attention layer in the Transformer encoder. TYPE: `int`, optional, defaults to 12 DEFAULT: `12`
`intermediate_size`	Dimensionality of the feed-forward (ff) layer in the encoder. Input tensors to feed-forward layers are firstly projected from hidden_size to intermediate_size, and then projected back to hidden_size. Typically intermediate_size is larger than hidden_size. TYPE: `int`, optional, defaults to 3072 DEFAULT: `3072`
`hidden_act`	The non-linear activation function in the feed-forward layer. `"gelu"`, `"relu"` and any other torch supported activation functions are supported. TYPE: `str`, optional, defaults to `"gelu"` DEFAULT: `'gelu'`
`hidden_dropout_prob`	The dropout probability for all fully connected layers in the embeddings and encoder. TYPE: `float`, optional, defaults to 0.1 DEFAULT: `0.1`
`attention_probs_dropout_prob`	The dropout probability used in `MultiHeadAttention` in all encoder layers to drop some attention targets. TYPE: `float`, optional, defaults to 0.1 DEFAULT: `0.1`
`act_dropout`	This dropout probability is used in `ErnieMEncoderLayer` after activation. TYPE: `float`, optional, defaults to 0.0 DEFAULT: `0.0`
`max_position_embeddings`	The maximum value of the dimensionality of position encoding, which dictates the maximum supported length of an input sequence. TYPE: `int`, optional, defaults to 514 DEFAULT: `514`
`layer_norm_eps`	The epsilon used by the layer normalization layers. TYPE: `float`, optional, defaults to 1e-05 DEFAULT: `1e-05`
`classifier_dropout`	The dropout ratio for the classification head. TYPE: `float`, optional DEFAULT: `None`
`initializer_range`	The standard deviation of the normal initializer for initializing all weight matrices. TYPE: `float`, optional, defaults to 0.02 DEFAULT: `0.02`
`pad_token_id`	The index of the padding token in the token vocabulary. TYPE: `int`, optional, defaults to 1 DEFAULT: `1`

A normal_initializer initializes weight matrices as normal distributions. See ErnieMPretrainedModel._init_weights() for how weights are initialized in ErnieMModel.
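
As a quick sanity check on how these defaults relate to one another, the sketch below derives two quantities in plain Python: the per-head width and the longest sequence the position table can index. The second figure assumes the `+ 2` position offset applied in `ErnieMEmbeddings.forward` (see the source listing for that class on this page). The names `head_dim`, `position_offset`, and `max_sequence_length` are illustrative and not part of the library.

```python
# Default values copied from the ErnieMConfig signature.
hidden_size = 768
num_attention_heads = 12
intermediate_size = 3072
max_position_embeddings = 514

# Each attention head works on an equal slice of the hidden dimension.
assert hidden_size % num_attention_heads == 0
head_dim = hidden_size // num_attention_heads

# Assumption: positions are shifted by 2 before the embedding lookup
# (`position_ids += 2` in ErnieMEmbeddings.forward), so table rows 0 and 1
# are skipped and two fewer positions are usable.
position_offset = 2
max_sequence_length = max_position_embeddings - position_offset

print(head_dim)                          # 64
print(max_sequence_length)               # 512
print(intermediate_size // hidden_size)  # 4
```

This would explain why the signature default is 514 even though the commonly quoted sequence limit is 512.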

Source code in mindnlp\transformers\models\ernie_m\configuration_ernie_m.py

class ErnieMConfig(PretrainedConfig):
    r"""
    This is the configuration class to store the configuration of a [`ErnieMModel`]. It is used to instantiate a
    Ernie-M model according to the specified arguments, defining the model architecture. Instantiating a configuration
    with the defaults will yield a similar configuration to that of the `Ernie-M`
    [susnato/ernie-m-base_pytorch](https://hf-mirror.com/susnato/ernie-m-base_pytorch) architecture.

    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
    documentation from [`PretrainedConfig`] for more information.

    Args:
        vocab_size (`int`, *optional*, defaults to 250002):
            Vocabulary size of `input_ids` in [`ErnieMModel`], which is also the size of the token embedding matrix.
            Defines the number of different tokens that can be represented by the `input_ids` passed when calling
            [`ErnieMModel`].
        hidden_size (`int`, *optional*, defaults to 768):
            Dimensionality of the embedding layer, encoder layers and pooler layer.
        num_hidden_layers (`int`, *optional*, defaults to 12):
            Number of hidden layers in the Transformer encoder.
        num_attention_heads (`int`, *optional*, defaults to 12):
            Number of attention heads for each attention layer in the Transformer encoder.
        intermediate_size (`int`, *optional*, defaults to 3072):
            Dimensionality of the feed-forward (ff) layer in the encoder. Input tensors to feed-forward layers are
            firstly projected from hidden_size to intermediate_size, and then projected back to hidden_size. Typically
            intermediate_size is larger than hidden_size.
        hidden_act (`str`, *optional*, defaults to `"gelu"`):
            The non-linear activation function in the feed-forward layer. `"gelu"`, `"relu"` and any other torch
            supported activation functions are supported.
        hidden_dropout_prob (`float`, *optional*, defaults to 0.1):
            The dropout probability for all fully connected layers in the embeddings and encoder.
        attention_probs_dropout_prob (`float`, *optional*, defaults to 0.1):
            The dropout probability used in `MultiHeadAttention` in all encoder layers to drop some attention target.
        act_dropout (`float`, *optional*, defaults to 0.0):
            This dropout probability is used in `ErnieMEncoderLayer` after activation.
        max_position_embeddings (`int`, *optional*, defaults to 514):
            The maximum value of the dimensionality of position encoding, which dictates the maximum supported length
            of an input sequence.
        layer_norm_eps (`float`, *optional*, defaults to 1e-05):
            The epsilon used by the layer normalization layers.
        classifier_dropout (`float`, *optional*):
            The dropout ratio for the classification head.
        initializer_range (`float`, *optional*, defaults to 0.02):
            The standard deviation of the normal initializer for initializing all weight matrices.
        pad_token_id (`int`, *optional*, defaults to 1):
            The index of padding token in the token vocabulary.

    A normal_initializer initializes weight matrices as normal distributions. See
    `ErnieMPretrainedModel._init_weights()` for how weights are initialized in `ErnieMModel`.
    """
    model_type = "ernie_m"
    attribute_map: Dict[str, str] = {"dropout": "classifier_dropout", "num_classes": "num_labels"}

    def __init__(
        self,
        vocab_size: int = 250002,
        hidden_size: int = 768,
        num_hidden_layers: int = 12,
        num_attention_heads: int = 12,
        intermediate_size: int = 3072,
        hidden_act: str = "gelu",
        hidden_dropout_prob: float = 0.1,
        attention_probs_dropout_prob: float = 0.1,
        max_position_embeddings: int = 514,
        initializer_range: float = 0.02,
        pad_token_id: int = 1,
        layer_norm_eps: float = 1e-05,
        classifier_dropout=None,
        is_decoder=False,
        act_dropout=0.0,
        **kwargs,
    ):
        """
        This method initializes an instance of the ErnieMConfig class.

        Args:
            self: The instance of the class.
            vocab_size (int): The size of the vocabulary. Default is 250002.
            hidden_size (int): The size of the hidden layers. Default is 768.
            num_hidden_layers (int): The number of hidden layers. Default is 12.
            num_attention_heads (int): The number of attention heads. Default is 12.
            intermediate_size (int): The size of the intermediate layer in the transformer. Default is 3072.
            hidden_act (str): The activation function for the hidden layers. Default is 'gelu'.
            hidden_dropout_prob (float): The dropout probability for the hidden layers. Default is 0.1.
            attention_probs_dropout_prob (float): The dropout probability for the attention probabilities. Default is 0.1.
            max_position_embeddings (int): The maximum position for the embeddings. Default is 514.
            initializer_range (float): The range for the weight initializers. Default is 0.02.
            pad_token_id (int): The ID for padding tokens. Default is 1.
            layer_norm_eps (float): The epsilon value for layer normalization. Default is 1e-05.
            classifier_dropout (Optional[float]): The dropout rate for the classifier layer. Default is None.
            is_decoder (bool): Whether the model is a decoder. Default is False.
            act_dropout (float): The dropout rate for the activation function. Default is 0.0.

        Returns:
            None.

        Raises:
            None
        """
        super().__init__(pad_token_id=pad_token_id, **kwargs)
        self.vocab_size = vocab_size
        self.hidden_size = hidden_size
        self.num_hidden_layers = num_hidden_layers
        self.num_attention_heads = num_attention_heads
        self.intermediate_size = intermediate_size
        self.hidden_act = hidden_act
        self.hidden_dropout_prob = hidden_dropout_prob
        self.attention_probs_dropout_prob = attention_probs_dropout_prob
        self.max_position_embeddings = max_position_embeddings
        self.initializer_range = initializer_range
        self.layer_norm_eps = layer_norm_eps
        self.classifier_dropout = classifier_dropout
        self.is_decoder = is_decoder
        self.act_dropout = act_dropout
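
The `attribute_map` class attribute in the listing above declares `dropout` and `num_classes` as alternative names for `classifier_dropout` and `num_labels`. The sketch below illustrates how such aliasing can work, assuming the base class resolves `attribute_map` in the way the upstream Hugging Face `PretrainedConfig` does. `AliasedConfig` is a hypothetical stand-in written for this example, not the MindNLP implementation.

```python
from typing import Any, Dict


class AliasedConfig:
    """Hypothetical stand-in for a config class with attribute aliases."""

    # Same mapping as ErnieMConfig.attribute_map: alias -> real attribute.
    attribute_map: Dict[str, str] = {"dropout": "classifier_dropout", "num_classes": "num_labels"}

    def __init__(self, classifier_dropout=None, num_labels=2):
        self.classifier_dropout = classifier_dropout
        self.num_labels = num_labels

    def __setattr__(self, key: str, value: Any) -> None:
        # Redirect writes on an alias to the real attribute name.
        key = type(self).attribute_map.get(key, key)
        super().__setattr__(key, value)

    def __getattr__(self, key: str) -> Any:
        # Called only when normal lookup fails, which is the case for aliases.
        attribute_map = type(self).attribute_map
        if key in attribute_map:
            return getattr(self, attribute_map[key])
        raise AttributeError(key)


config = AliasedConfig(classifier_dropout=0.2)
print(config.dropout)      # 0.2, read through the alias
config.num_classes = 5     # written through the alias
print(config.num_labels)   # 5
```

Under that assumption, code written against the original PaddleNLP-style names (`dropout`, `num_classes`) would read and write the same underlying values.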

`mindnlp.transformers.models.ernie_m.configuration_ernie_m.ErnieMConfig.__init__(vocab_size=250002, hidden_size=768, num_hidden_layers=12, num_attention_heads=12, intermediate_size=3072, hidden_act='gelu', hidden_dropout_prob=0.1, attention_probs_dropout_prob=0.1, max_position_embeddings=514, initializer_range=0.02, pad_token_id=1, layer_norm_eps=1e-05, classifier_dropout=None, is_decoder=False, act_dropout=0.0, **kwargs)` ¶

This method initializes an instance of the ErnieMConfig class.

PARAMETER	DESCRIPTION
`self`	The instance of the class.
`vocab_size`	The size of the vocabulary. Default is 250002. TYPE: `int` DEFAULT: `250002`
`hidden_size`	The size of the hidden layers. Default is 768. TYPE: `int` DEFAULT: `768`
`num_hidden_layers`	The number of hidden layers. Default is 12. TYPE: `int` DEFAULT: `12`
`num_attention_heads`	The number of attention heads. Default is 12. TYPE: `int` DEFAULT: `12`
`intermediate_size`	The size of the intermediate layer in the transformer. Default is 3072. TYPE: `int` DEFAULT: `3072`
`hidden_act`	The activation function for the hidden layers. Default is 'gelu'. TYPE: `str` DEFAULT: `'gelu'`
`hidden_dropout_prob`	The dropout probability for the hidden layers. Default is 0.1. TYPE: `float` DEFAULT: `0.1`
`attention_probs_dropout_prob`	The dropout probability for the attention probabilities. Default is 0.1. TYPE: `float` DEFAULT: `0.1`
`max_position_embeddings`	The maximum position for the embeddings. Default is 514. TYPE: `int` DEFAULT: `514`
`initializer_range`	The range for the weight initializers. Default is 0.02. TYPE: `float` DEFAULT: `0.02`
`pad_token_id`	The ID for padding tokens. Default is 1. TYPE: `int` DEFAULT: `1`
`layer_norm_eps`	The epsilon value for layer normalization. Default is 1e-05. TYPE: `float` DEFAULT: `1e-05`
`classifier_dropout`	The dropout rate for the classifier layer. Default is None. TYPE: `Optional[float]` DEFAULT: `None`
`is_decoder`	Whether the model is a decoder. Default is False. TYPE: `bool` DEFAULT: `False`
`act_dropout`	The dropout rate for the activation function. Default is 0.0. TYPE: `float` DEFAULT: `0.0`

RETURNS	DESCRIPTION
	None.

Source code in mindnlp\transformers\models\ernie_m\configuration_ernie_m.py

def __init__(
    self,
    vocab_size: int = 250002,
    hidden_size: int = 768,
    num_hidden_layers: int = 12,
    num_attention_heads: int = 12,
    intermediate_size: int = 3072,
    hidden_act: str = "gelu",
    hidden_dropout_prob: float = 0.1,
    attention_probs_dropout_prob: float = 0.1,
    max_position_embeddings: int = 514,
    initializer_range: float = 0.02,
    pad_token_id: int = 1,
    layer_norm_eps: float = 1e-05,
    classifier_dropout=None,
    is_decoder=False,
    act_dropout=0.0,
    **kwargs,
):
    """
    This method initializes an instance of the ErnieMConfig class.

    Args:
        self: The instance of the class.
        vocab_size (int): The size of the vocabulary. Default is 250002.
        hidden_size (int): The size of the hidden layers. Default is 768.
        num_hidden_layers (int): The number of hidden layers. Default is 12.
        num_attention_heads (int): The number of attention heads. Default is 12.
        intermediate_size (int): The size of the intermediate layer in the transformer. Default is 3072.
        hidden_act (str): The activation function for the hidden layers. Default is 'gelu'.
        hidden_dropout_prob (float): The dropout probability for the hidden layers. Default is 0.1.
        attention_probs_dropout_prob (float): The dropout probability for the attention probabilities. Default is 0.1.
        max_position_embeddings (int): The maximum position for the embeddings. Default is 514.
        initializer_range (float): The range for the weight initializers. Default is 0.02.
        pad_token_id (int): The ID for padding tokens. Default is 1.
        layer_norm_eps (float): The epsilon value for layer normalization. Default is 1e-05.
        classifier_dropout (Optional[float]): The dropout rate for the classifier layer. Default is None.
        is_decoder (bool): Whether the model is a decoder. Default is False.
        act_dropout (float): The dropout rate for the activation function. Default is 0.0.

    Returns:
        None.

    Raises:
        None
    """
    super().__init__(pad_token_id=pad_token_id, **kwargs)
    self.vocab_size = vocab_size
    self.hidden_size = hidden_size
    self.num_hidden_layers = num_hidden_layers
    self.num_attention_heads = num_attention_heads
    self.intermediate_size = intermediate_size
    self.hidden_act = hidden_act
    self.hidden_dropout_prob = hidden_dropout_prob
    self.attention_probs_dropout_prob = attention_probs_dropout_prob
    self.max_position_embeddings = max_position_embeddings
    self.initializer_range = initializer_range
    self.layer_norm_eps = layer_norm_eps
    self.classifier_dropout = classifier_dropout
    self.is_decoder = is_decoder
    self.act_dropout = act_dropout

`mindnlp.transformers.models.ernie_m.modeling_ernie_m` ¶

MindSpore ErnieM model.

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMAttention` ¶

Bases: Module

ErnieMAttention wraps the attention mechanism used in the ERNIE-M model. It inherits from nn.Module and delegates the self-attention computation, including the query, key, and value projections, to an ErnieMSelfAttention module, then applies an output projection layer. The prune_heads method removes the attention heads at the given indices. The forward method passes the input hidden states through self-attention and the output projection to produce the attention outputs.

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

class ErnieMAttention(nn.Module):

    """
    ErnieMAttention is a class that represents an attention mechanism used in the ERNIE-M model.
    It contains methods for initializing the attention mechanism, pruning attention heads, and forwarding attention outputs.
    This class inherits from nn.Module and utilizes an ErnieMSelfAttention module for self-attention calculations.
    The attention mechanism includes projection layers for query, key, and value, as well as an output projection layer.
    The `prune_heads` method allows for pruning specific attention heads based on provided indices.
    The `forward` method processes input hidden states through the self-attention mechanism and output projection
    layer to generate attention outputs.
    """
    def __init__(self, config, position_embedding_type=None):
        """
        Initialize the ErnieMAttention class.

        Args:
            self: The instance of the class.
            config: An object containing configuration parameters.
            position_embedding_type: Type of position embedding to be used, default is None.

        Returns:
            None.

        Raises:
            None.
        """
        super().__init__()
        self.self_attn = ErnieMSelfAttention(config, position_embedding_type=position_embedding_type)
        self.out_proj = nn.Linear(config.hidden_size, config.hidden_size)
        self.pruned_heads = set()

    def prune_heads(self, heads):
        """
        This method 'prune_heads' belongs to the class 'ErnieMAttention' and is responsible for pruning specific
        attention heads in the model based on the provided list of heads.

        Args:
            self: Instance of the 'ErnieMAttention' class. It is used to access attributes and methods within the class.
            heads: A list containing the indices of the attention heads that need to be pruned. Each element in the list
                should be an integer representing the index of the head to be pruned.

        Returns:
            None: This method does not return any value but modifies the attention heads in the model in-place.

        Raises:
            None:
                However, it is assumed that the functions called within this method, 
                such as 'find_pruneable_heads_and_indices' and 'prune_linear_layer', may raise exceptions related to 
                input validation or processing errors.
        """
        if len(heads) == 0:
            return
        heads, index = find_pruneable_heads_and_indices(
            heads, self.self_attn.num_attention_heads, self.self_attn.attention_head_size, self.pruned_heads
        )

        # Prune linear layers
        self.self_attn.q_proj = prune_linear_layer(self.self_attn.q_proj, index)
        self.self_attn.k_proj = prune_linear_layer(self.self_attn.k_proj, index)
        self.self_attn.v_proj = prune_linear_layer(self.self_attn.v_proj, index)
        self.out_proj = prune_linear_layer(self.out_proj, index, dim=1)

        # Update hyper params and store pruned heads
        self.self_attn.num_attention_heads = self.self_attn.num_attention_heads - len(heads)
        self.self_attn.all_head_size = self.self_attn.attention_head_size * self.self_attn.num_attention_heads
        self.pruned_heads = self.pruned_heads.union(heads)

    def forward(
        self,
        hidden_states: mindspore.Tensor,
        attention_mask: Optional[mindspore.Tensor] = None,
        head_mask: Optional[mindspore.Tensor] = None,
        encoder_hidden_states: Optional[mindspore.Tensor] = None,
        encoder_attention_mask: Optional[mindspore.Tensor] = None,
        past_key_value: Optional[Tuple[Tuple[mindspore.Tensor]]] = None,
        output_attentions: Optional[bool] = False,
    ) -> Tuple[mindspore.Tensor]:
        """
        This method forwards the ErnieMAttention module.

        Args:
            self: The instance of the ErnieMAttention class.
            hidden_states (mindspore.Tensor): The input hidden states tensor.
            attention_mask (Optional[mindspore.Tensor]): Optional tensor containing attention mask values.
            head_mask (Optional[mindspore.Tensor]): Optional tensor containing head mask values.
            encoder_hidden_states (Optional[mindspore.Tensor]): Optional tensor containing encoder hidden states.
            encoder_attention_mask (Optional[mindspore.Tensor]): Optional tensor containing encoder attention mask values.
            past_key_value (Optional[Tuple[Tuple[mindspore.Tensor]]]): Optional tuple containing past key and value tensors.
            output_attentions (Optional[bool]): Optional boolean indicating whether to output attentions.

        Returns:
            Tuple[mindspore.Tensor]: A tuple containing the attention output tensor.

        Raises:
            None
        """
        self_outputs = self.self_attn(
            hidden_states,
            attention_mask,
            head_mask,
            encoder_hidden_states,
            encoder_attention_mask,
            past_key_value,
            output_attentions,
        )
        attention_output = self.out_proj(self_outputs[0])
        outputs = (attention_output,) + self_outputs[1:]  # add attentions if we output them
        return outputs

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMAttention.__init__(config, position_embedding_type=None)` ¶

Initialize the ErnieMAttention class.

PARAMETER	DESCRIPTION
`self`	The instance of the class.
`config`	An object containing configuration parameters.
`position_embedding_type`	Type of position embedding to be used, default is None. DEFAULT: `None`

RETURNS	DESCRIPTION
	None.

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

def __init__(self, config, position_embedding_type=None):
    """
    Initialize the ErnieMAttention class.

    Args:
        self: The instance of the class.
        config: An object containing configuration parameters.
        position_embedding_type: Type of position embedding to be used, default is None.

    Returns:
        None.

    Raises:
        None.
    """
    super().__init__()
    self.self_attn = ErnieMSelfAttention(config, position_embedding_type=position_embedding_type)
    self.out_proj = nn.Linear(config.hidden_size, config.hidden_size)
    self.pruned_heads = set()

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMAttention.forward(hidden_states, attention_mask=None, head_mask=None, encoder_hidden_states=None, encoder_attention_mask=None, past_key_value=None, output_attentions=False)` ¶

Runs the forward pass of the ErnieMAttention module.

PARAMETER	DESCRIPTION
`self`	The instance of the ErnieMAttention class.
`hidden_states`	The input hidden states tensor. TYPE: `Tensor`
`attention_mask`	Optional tensor containing attention mask values. TYPE: `Optional[Tensor]` DEFAULT: `None`
`head_mask`	Optional tensor containing head mask values. TYPE: `Optional[Tensor]` DEFAULT: `None`
`encoder_hidden_states`	Optional tensor containing encoder hidden states. TYPE: `Optional[Tensor]` DEFAULT: `None`
`encoder_attention_mask`	Optional tensor containing encoder attention mask values. TYPE: `Optional[Tensor]` DEFAULT: `None`
`past_key_value`	Optional tuple containing past key and value tensors. TYPE: `Optional[Tuple[Tuple[Tensor]]]` DEFAULT: `None`
`output_attentions`	Optional boolean indicating whether to output attentions. TYPE: `Optional[bool]` DEFAULT: `False`

RETURNS	DESCRIPTION
`Tuple[Tensor]`	A tuple whose first element is the attention output tensor, followed by any additional outputs of the self-attention module (the attentions, if they are output).

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

def forward(
    self,
    hidden_states: mindspore.Tensor,
    attention_mask: Optional[mindspore.Tensor] = None,
    head_mask: Optional[mindspore.Tensor] = None,
    encoder_hidden_states: Optional[mindspore.Tensor] = None,
    encoder_attention_mask: Optional[mindspore.Tensor] = None,
    past_key_value: Optional[Tuple[Tuple[mindspore.Tensor]]] = None,
    output_attentions: Optional[bool] = False,
) -> Tuple[mindspore.Tensor]:
    """
    This method forwards the ErnieMAttention module.

    Args:
        self: The instance of the ErnieMAttention class.
        hidden_states (mindspore.Tensor): The input hidden states tensor.
        attention_mask (Optional[mindspore.Tensor]): Optional tensor containing attention mask values.
        head_mask (Optional[mindspore.Tensor]): Optional tensor containing head mask values.
        encoder_hidden_states (Optional[mindspore.Tensor]): Optional tensor containing encoder hidden states.
        encoder_attention_mask (Optional[mindspore.Tensor]): Optional tensor containing encoder attention mask values.
        past_key_value (Optional[Tuple[Tuple[mindspore.Tensor]]]): Optional tuple containing past key and value tensors.
        output_attentions (Optional[bool]): Optional boolean indicating whether to output attentions.

    Returns:
        Tuple[mindspore.Tensor]: A tuple containing the attention output tensor.

    Raises:
        None
    """
    self_outputs = self.self_attn(
        hidden_states,
        attention_mask,
        head_mask,
        encoder_hidden_states,
        encoder_attention_mask,
        past_key_value,
        output_attentions,
    )
    attention_output = self.out_proj(self_outputs[0])
    outputs = (attention_output,) + self_outputs[1:]  # add attentions if we output them
    return outputs

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMAttention.prune_heads(heads)` ¶

Prunes the attention heads listed in `heads` from this ErnieMAttention module.

PARAMETER	DESCRIPTION
`self`	Instance of the 'ErnieMAttention' class. It is used to access attributes and methods within the class.
`heads`	A list containing the indices of the attention heads that need to be pruned. Each element in the list should be an integer representing the index of the head to be pruned.

RETURNS	DESCRIPTION
`None`	This method does not return any value but modifies the attention heads in the model in-place.

RAISES	DESCRIPTION
`None`	This method raises no exceptions itself. The functions it calls, such as 'find_pruneable_heads_and_indices' and 'prune_linear_layer', may raise exceptions related to input validation or processing errors.
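
To make the index bookkeeping behind head pruning concrete, here is a minimal pure-Python sketch of the kind of computation a helper like `find_pruneable_heads_and_indices` performs: it works out which flat rows of a projection weight survive once some heads are removed. The function `pruneable_heads_and_indices` and the toy sizes are hypothetical and written for this example. The real helper operates on tensors and its exact behaviour should be checked in the MindNLP source.

```python
from typing import Iterable, List, Set, Tuple


def pruneable_heads_and_indices(
    heads: Iterable[int], n_heads: int, head_size: int, already_pruned: Set[int]
) -> Tuple[Set[int], List[int]]:
    """Return the heads still to prune and the flat weight rows to keep."""
    heads = set(heads) - already_pruned  # skip heads removed earlier
    keep = [[True] * head_size for _ in range(n_heads)]
    for head in heads:
        # Earlier prunings shifted the remaining heads to lower slots.
        slot = head - sum(1 for h in already_pruned if h < head)
        keep[slot] = [False] * head_size
    flat = [flag for row in keep for flag in row]
    index = [i for i, flag in enumerate(flat) if flag]
    return heads, index


# Toy layer: 4 heads of width 2, so 8 projection rows in total.
heads, index = pruneable_heads_and_indices([1, 3], n_heads=4, head_size=2, already_pruned=set())
print(sorted(heads))  # [1, 3]
print(index)          # [0, 1, 4, 5]

# Bookkeeping mirrored from prune_heads.
num_attention_heads = 4 - len(heads)
all_head_size = 2 * num_attention_heads
print(num_attention_heads, all_head_size)  # 2 4
```

In `prune_heads`, the surviving indices select rows of `q_proj`, `k_proj`, and `v_proj`, and columns (`dim=1`) of `out_proj`, so the input and output widths of the block stay consistent.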

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

def prune_heads(self, heads):
    """
    This method 'prune_heads' belongs to the class 'ErnieMAttention' and is responsible for pruning specific
    attention heads in the model based on the provided list of heads.

    Args:
        self: Instance of the 'ErnieMAttention' class. It is used to access attributes and methods within the class.
        heads: A list containing the indices of the attention heads that need to be pruned. Each element in the list
            should be an integer representing the index of the head to be pruned.

    Returns:
        None: This method does not return any value but modifies the attention heads in the model in-place.

    Raises:
        None:
            However, it is assumed that the functions called within this method, 
            such as 'find_pruneable_heads_and_indices' and 'prune_linear_layer', may raise exceptions related to 
            input validation or processing errors.
    """
    if len(heads) == 0:
        return
    heads, index = find_pruneable_heads_and_indices(
        heads, self.self_attn.num_attention_heads, self.self_attn.attention_head_size, self.pruned_heads
    )

    # Prune linear layers
    self.self_attn.q_proj = prune_linear_layer(self.self_attn.q_proj, index)
    self.self_attn.k_proj = prune_linear_layer(self.self_attn.k_proj, index)
    self.self_attn.v_proj = prune_linear_layer(self.self_attn.v_proj, index)
    self.out_proj = prune_linear_layer(self.out_proj, index, dim=1)

    # Update hyper params and store pruned heads
    self.self_attn.num_attention_heads = self.self_attn.num_attention_heads - len(heads)
    self.self_attn.all_head_size = self.self_attn.attention_head_size * self.self_attn.num_attention_heads
    self.pruned_heads = self.pruned_heads.union(heads)

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMEmbeddings` ¶

Bases: Module

Construct the embeddings from word and position embeddings.
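
When `position_ids` is not supplied, `forward` derives it from the input shape with a cumulative sum, shifts it by `past_key_values_length`, and then adds 2 before the embedding lookup. The sketch below reproduces that arithmetic with plain Python lists so the resulting ids are easy to inspect. `default_position_ids` is a hypothetical helper for illustration, and it stands in for the tensor operations used in the real module.

```python
from itertools import accumulate
from typing import List


def default_position_ids(batch_size: int, seq_length: int, past_key_values_length: int = 0) -> List[List[int]]:
    """Pure-Python analogue of the position-id arithmetic in `forward`."""
    rows = []
    for _ in range(batch_size):
        ones = [1] * seq_length
        running = list(accumulate(ones))                  # cumsum along the sequence: 1, 2, 3, ...
        ids = [c - o for c, o in zip(running, ones)]      # zero-based: 0, 1, 2, ...
        ids = [i + past_key_values_length for i in ids]   # continue after cached tokens
        ids = [i + 2 for i in ids]                        # offset applied before the lookup
        rows.append(ids)
    return rows


print(default_position_ids(batch_size=2, seq_length=4))
# [[2, 3, 4, 5], [2, 3, 4, 5]]
print(default_position_ids(batch_size=1, seq_length=2, past_key_values_length=3))
# [[5, 6]]
```

Because every id is shifted by 2, a sequence of length `n` needs a position table with at least `n + 2` rows.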

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

class ErnieMEmbeddings(nn.Module):
    """Construct the embeddings from word and position embeddings."""
    def __init__(self, config):
        """
        Args:
            self (object): The instance of the ErnieMEmbeddings class.
            config (object): An object containing configuration parameters for the ErnieMEmbeddings instance,
                including the hidden size, vocabulary size, maximum position embeddings, padding token ID, layer
                normalization epsilon, and hidden dropout probability.

        Returns:
            None.

        Raises:
            TypeError: If the config parameter is not of the expected type.
            ValueError: If the config parameter does not contain required attributes or if the padding token ID is not valid.
        """
        super().__init__()
        self.hidden_size = config.hidden_size
        self.word_embeddings = nn.Embedding(config.vocab_size, config.hidden_size, padding_idx=config.pad_token_id)
        self.position_embeddings = nn.Embedding(
            config.max_position_embeddings, config.hidden_size, padding_idx=config.pad_token_id
        )
        self.layer_norm = nn.LayerNorm([config.hidden_size], eps=config.layer_norm_eps)
        self.dropout = nn.Dropout(p=config.hidden_dropout_prob)
        self.padding_idx = config.pad_token_id

    def forward(
        self,
        input_ids: Optional[mindspore.Tensor] = None,
        position_ids: Optional[mindspore.Tensor] = None,
        inputs_embeds: Optional[mindspore.Tensor] = None,
        past_key_values_length: int = 0,
    ) -> mindspore.Tensor:
        """
        This method 'forward' in the class 'ErnieMEmbeddings' forwards the embeddings for the input tokens.

        Args:
            self: The instance of the class.
            input_ids (Optional[mindspore.Tensor]):
                The input token IDs. Default is None. If None, 'inputs_embeds' is used to generate the embeddings.
            position_ids (Optional[mindspore.Tensor]): The position IDs for the input tokens.
                Default is None. If None, position IDs are calculated based on the input shape.
            inputs_embeds (Optional[mindspore.Tensor]): The input embeddings.
                Default is None. If None, input embeddings are generated using 'word_embeddings' based on 'input_ids'.
            past_key_values_length (int): The length of past key values.
                Default is 0. It is used to adjust the 'position_ids' if past key values are present.

        Returns:
            mindspore.Tensor: The forwarded embeddings for the input tokens.

        Raises:
            ValueError: If the input shape is invalid or if 'position_ids' cannot be calculated.
            TypeError: If the input types are not as expected.
        """
        if inputs_embeds is None:
            inputs_embeds = self.word_embeddings(input_ids)
        if position_ids is None:
            input_shape = inputs_embeds.shape[:-1]
            ones = ops.ones(input_shape, dtype=mindspore.int64)
            seq_length = ops.cumsum(ones, dim=1)
            position_ids = seq_length - ones

            if past_key_values_length > 0:
                position_ids = position_ids + past_key_values_length
        # to mimic paddlenlp implementation
        position_ids += 2
        position_embeddings = self.position_embeddings(position_ids)
        embeddings = inputs_embeds + position_embeddings
        embeddings = self.layer_norm(embeddings)
        embeddings = self.dropout(embeddings)

        return embeddings

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMEmbeddings.__init__(config)` ¶

PARAMETER	DESCRIPTION
`self`	The instance of the ErnieMEmbeddings class. TYPE: `object`
`config`	An object containing configuration parameters for the ErnieMEmbeddings instance, including the hidden size, vocabulary size, maximum position embeddings, padding token ID, layer normalization epsilon, and hidden dropout probability. TYPE: `object`

RETURNS	DESCRIPTION
	None.

RAISES	DESCRIPTION
`TypeError`	If the config parameter is not of the expected type.
`ValueError`	If the config parameter does not contain required attributes or if the padding token ID is not valid.

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

def __init__(self, config):
    """
    Args:
        self (object): The instance of the ErnieMEmbeddings class.
        config (object): An object containing configuration parameters for the ErnieMEmbeddings instance,
            including the hidden size, vocabulary size, maximum position embeddings, padding token ID, layer
            normalization epsilon, and hidden dropout probability.

    Returns:
        None.

    Raises:
        TypeError: If the config parameter is not of the expected type.
        ValueError: If the config parameter does not contain required attributes or if the padding token ID is not valid.
    """
    super().__init__()
    self.hidden_size = config.hidden_size
    self.word_embeddings = nn.Embedding(config.vocab_size, config.hidden_size, padding_idx=config.pad_token_id)
    self.position_embeddings = nn.Embedding(
        config.max_position_embeddings, config.hidden_size, padding_idx=config.pad_token_id
    )
    self.layer_norm = nn.LayerNorm([config.hidden_size], eps=config.layer_norm_eps)
    self.dropout = nn.Dropout(p=config.hidden_dropout_prob)
    self.padding_idx = config.pad_token_id

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMEmbeddings.forward(input_ids=None, position_ids=None, inputs_embeds=None, past_key_values_length=0)` ¶

Computes the embeddings for the input tokens: word embeddings plus position embeddings, followed by layer normalization and dropout.

PARAMETER	DESCRIPTION
`self`	The instance of the class.
`input_ids`	The input token IDs. Default is None. If None, 'inputs_embeds' is used to generate the embeddings. TYPE: `Optional[Tensor]` DEFAULT: `None`
`position_ids`	The position IDs for the input tokens. Default is None. If None, position IDs are calculated based on the input shape. TYPE: `Optional[Tensor]` DEFAULT: `None`
`inputs_embeds`	The input embeddings. Default is None. If None, input embeddings are generated using 'word_embeddings' based on 'input_ids'. TYPE: `Optional[Tensor]` DEFAULT: `None`
`past_key_values_length`	The length of past key values. Default is 0. It is used to adjust the 'position_ids' if past key values are present. TYPE: `int` DEFAULT: `0`

RETURNS	DESCRIPTION
`Tensor`	The computed embeddings for the input tokens.

RAISES	DESCRIPTION
`ValueError`	If the input shape is invalid or if 'position_ids' cannot be calculated.
`TypeError`	If the input types are not as expected.

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

def forward(
    self,
    input_ids: Optional[mindspore.Tensor] = None,
    position_ids: Optional[mindspore.Tensor] = None,
    inputs_embeds: Optional[mindspore.Tensor] = None,
    past_key_values_length: int = 0,
) -> mindspore.Tensor:
    """
    Computes the embeddings for the input tokens: word embeddings plus position embeddings,
    followed by layer normalization and dropout.

    Args:
        self: The instance of the class.
        input_ids (Optional[mindspore.Tensor]):
            The input token IDs. Default is None. If None, 'inputs_embeds' is used to generate the embeddings.
        position_ids (Optional[mindspore.Tensor]): The position IDs for the input tokens.
            Default is None. If None, position IDs are calculated based on the input shape.
        inputs_embeds (Optional[mindspore.Tensor]): The input embeddings.
            Default is None. If None, input embeddings are generated using 'word_embeddings' based on 'input_ids'.
        past_key_values_length (int): The length of past key values.
            Default is 0. It is used to adjust the 'position_ids' if past key values are present.

    Returns:
        mindspore.Tensor: The computed embeddings for the input tokens.

    Raises:
        ValueError: If the input shape is invalid or if 'position_ids' cannot be calculated.
        TypeError: If the input types are not as expected.
    """
    if inputs_embeds is None:
        inputs_embeds = self.word_embeddings(input_ids)
    if position_ids is None:
        input_shape = inputs_embeds.shape[:-1]
        ones = ops.ones(input_shape, dtype=mindspore.int64)
        seq_length = ops.cumsum(ones, dim=1)
        position_ids = seq_length - ones

        if past_key_values_length > 0:
            position_ids = position_ids + past_key_values_length
    # to mimic paddlenlp implementation
    position_ids += 2
    position_embeddings = self.position_embeddings(position_ids)
    embeddings = inputs_embeds + position_embeddings
    embeddings = self.layer_norm(embeddings)
    embeddings = self.dropout(embeddings)

    return embeddings
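The default position-id path in the source above can be traced with plain Python lists. This is a minimal sketch, not the library code: `default_position_ids` is a hypothetical helper, and it covers only the case where `position_ids` is None. In the source, the constant offset of 2 is also added to position IDs supplied by the caller.

```python
def default_position_ids(batch_size, seq_length, past_key_values_length=0, offset=2):
    """Mirror the default position-id computation using nested lists (illustrative only)."""
    rows = []
    for _ in range(batch_size):
        ones = [1] * seq_length
        # Running sum of ones gives 1, 2, ..., seq_length (the cumsum step).
        running, cumsum = 0, []
        for value in ones:
            running += value
            cumsum.append(running)
        # Subtracting the ones again yields zero-based positions 0, 1, ...
        positions = [c - o for c, o in zip(cumsum, ones)]
        # Shift by the number of cached tokens, if any.
        if past_key_values_length > 0:
            positions = [p + past_key_values_length for p in positions]
        # Constant offset that the source applies "to mimic paddlenlp implementation".
        rows.append([p + offset for p in positions])
    return rows


print(default_position_ids(2, 3))                            # [[2, 3, 4], [2, 3, 4]]
print(default_position_ids(1, 3, past_key_values_length=5))  # [[7, 8, 9]]
```

Because of the offset, the largest index looked up in `position_embeddings` is `seq_length + past_key_values_length + 1`, which has to stay below `config.max_position_embeddings`.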

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMEncoder` ¶

Bases: Module

ErnieMEncoder represents a multi-layer Transformer-based encoder model for processing sequences of input data.

The ErnieMEncoder class inherits from nn.Module and implements a multi-layer Transformer-based encoder, with the ability to return hidden states and attention weights if specified. The class provides methods for initializing the model and processing input data through its layers.

ATTRIBUTE	DESCRIPTION
`config`	A configuration object containing the model's hyperparameters.
`layers`	A list of ErnieMEncoderLayer instances representing the individual layers of the encoder model.

METHOD	DESCRIPTION
`forward`	Processes input embeddings through the encoder layers, optionally returning hidden states and attention weights based on the specified parameters.


Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

class ErnieMEncoder(nn.Module):

    """
    ErnieMEncoder represents a multi-layer Transformer-based encoder model for processing sequences of input data.

    The ErnieMEncoder class inherits from nn.Module and implements a multi-layer Transformer-based encoder,
    with the ability to return hidden states and attention weights if specified.
    The class provides methods for initializing the model and processing input data through its layers.

    Attributes:
        config: A configuration object containing the model's hyperparameters.
        layers: A list of ErnieMEncoderLayer instances representing the individual layers of the encoder model.

    Methods:
        forward: Processes input embeddings through the encoder layers, optionally returning hidden states and
        attention weights based on the specified parameters.

    Please note that the actual code implementation is not included in this docstring.
    """
    def __init__(self, config):
        """
        Initializes an instance of the ErnieMEncoder class.

        Args:
            self (ErnieMEncoder): The instance of the ErnieMEncoder class.
            config (object): The configuration object containing settings for the ErnieMEncoder.
                This parameter is required for configuring the ErnieMEncoder instance.
                It should be an object that provides necessary configuration details.
                It is expected to have attributes such as num_hidden_layers to specify the number of hidden layers.

        Returns:
            None.

        Raises:
            None.
        """
        super().__init__()
        self.config = config
        self.layers = nn.ModuleList([ErnieMEncoderLayer(config) for _ in range(config.num_hidden_layers)])

    def forward(
        self,
        input_embeds: mindspore.Tensor,
        attention_mask: Optional[mindspore.Tensor] = None,
        head_mask: Optional[mindspore.Tensor] = None,
        past_key_values: Optional[Tuple[Tuple[mindspore.Tensor]]] = None,
        output_attentions: Optional[bool] = False,
        output_hidden_states: Optional[bool] = False,
        return_dict: Optional[bool] = True,
    ) -> Union[Tuple[mindspore.Tensor], BaseModelOutputWithPastAndCrossAttentions]:
        """
        Runs the input embeddings through every encoder layer in order.

        Args:
            self: The instance of the class.
            input_embeds (mindspore.Tensor): The input embeddings. Shape (batch_size, sequence_length, hidden_size).
            attention_mask (Optional[mindspore.Tensor]): The attention mask. Shape (batch_size, sequence_length).
            head_mask (Optional[mindspore.Tensor]): The head mask. Shape (num_layers, num_heads).
            past_key_values (Optional[Tuple[Tuple[mindspore.Tensor]]]): The past key values.
                Shape (num_layers, 2, batch_size, num_heads, sequence_length // num_heads, hidden_size // num_heads).
            output_attentions (Optional[bool]): Whether to output attention weights. Default is False.
            output_hidden_states (Optional[bool]): Whether to output hidden states. Default is False.
            return_dict (Optional[bool]): Whether to return a BaseModelOutputWithPastAndCrossAttentions. Default is True.

        Returns:
            Union[Tuple[mindspore.Tensor], BaseModelOutputWithPastAndCrossAttentions]:
                The encoded last hidden state, optional hidden states, and optional attention weights.

        Raises:
            None.
        """
        hidden_states = () if output_hidden_states else None
        attentions = () if output_attentions else None

        output = input_embeds
        if output_hidden_states:
            hidden_states = hidden_states + (output,)
        for i, layer in enumerate(self.layers):
            layer_head_mask = head_mask[i] if head_mask is not None else None
            past_key_value = past_key_values[i] if past_key_values is not None else None

            output, opt_attn_weights = layer(
                hidden_states=output,
                attention_mask=attention_mask,
                head_mask=layer_head_mask,
                past_key_value=past_key_value,
            )

            if output_hidden_states:
                hidden_states = hidden_states + (output,)
            if output_attentions:
                attentions = attentions + (opt_attn_weights,)

        last_hidden_state = output
        if not return_dict:
            return tuple(v for v in [last_hidden_state, hidden_states, attentions] if v is not None)

        return BaseModelOutputWithPastAndCrossAttentions(
            last_hidden_state=last_hidden_state, hidden_states=hidden_states, attentions=attentions
        )

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMEncoder.__init__(config)` ¶

Initializes an instance of the ErnieMEncoder class.

PARAMETER	DESCRIPTION
`self`	The instance of the ErnieMEncoder class. TYPE: `ErnieMEncoder`
`config`	The configuration object containing settings for the ErnieMEncoder. This parameter is required for configuring the ErnieMEncoder instance. It should be an object that provides necessary configuration details. It is expected to have attributes such as num_hidden_layers to specify the number of hidden layers. TYPE: `object`

RETURNS	DESCRIPTION
	None.

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

def __init__(self, config):
    """
    Initializes an instance of the ErnieMEncoder class.

    Args:
        self (ErnieMEncoder): The instance of the ErnieMEncoder class.
        config (object): The configuration object containing settings for the ErnieMEncoder.
            This parameter is required for configuring the ErnieMEncoder instance.
            It should be an object that provides necessary configuration details.
            It is expected to have attributes such as num_hidden_layers to specify the number of hidden layers.

    Returns:
        None.

    Raises:
        None.
    """
    super().__init__()
    self.config = config
    self.layers = nn.ModuleList([ErnieMEncoderLayer(config) for _ in range(config.num_hidden_layers)])

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMEncoder.forward(input_embeds, attention_mask=None, head_mask=None, past_key_values=None, output_attentions=False, output_hidden_states=False, return_dict=True)` ¶

Runs the input embeddings through every encoder layer in order.

PARAMETER	DESCRIPTION
`self`	The instance of the class.
`input_embeds`	The input embeddings. Shape (batch_size, sequence_length, hidden_size). TYPE: `Tensor`
`attention_mask`	The attention mask. Shape (batch_size, sequence_length). TYPE: `Optional[Tensor]` DEFAULT: `None`
`head_mask`	The head mask. Shape (num_layers, num_heads). TYPE: `Optional[Tensor]` DEFAULT: `None`
`past_key_values`	The past key values. Shape (num_layers, 2, batch_size, num_heads, sequence_length // num_heads, hidden_size // num_heads). TYPE: `Optional[Tuple[Tuple[Tensor]]]` DEFAULT: `None`
`output_attentions`	Whether to output attention weights. Default is False. TYPE: `Optional[bool]` DEFAULT: `False`
`output_hidden_states`	Whether to output hidden states. Default is False. TYPE: `Optional[bool]` DEFAULT: `False`
`return_dict`	Whether to return a BaseModelOutputWithPastAndCrossAttentions. Default is True. TYPE: `Optional[bool]` DEFAULT: `True`

RETURNS	DESCRIPTION
`Union[Tuple[Tensor], BaseModelOutputWithPastAndCrossAttentions]`	The encoded last hidden state, optional hidden states, and optional attention weights.

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

def forward(
    self,
    input_embeds: mindspore.Tensor,
    attention_mask: Optional[mindspore.Tensor] = None,
    head_mask: Optional[mindspore.Tensor] = None,
    past_key_values: Optional[Tuple[Tuple[mindspore.Tensor]]] = None,
    output_attentions: Optional[bool] = False,
    output_hidden_states: Optional[bool] = False,
    return_dict: Optional[bool] = True,
) -> Union[Tuple[mindspore.Tensor], BaseModelOutputWithPastAndCrossAttentions]:
    """
    Runs the input embeddings through every encoder layer in order.

    Args:
        self: The instance of the class.
        input_embeds (mindspore.Tensor): The input embeddings. Shape (batch_size, sequence_length, hidden_size).
        attention_mask (Optional[mindspore.Tensor]): The attention mask. Shape (batch_size, sequence_length).
        head_mask (Optional[mindspore.Tensor]): The head mask. Shape (num_layers, num_heads).
        past_key_values (Optional[Tuple[Tuple[mindspore.Tensor]]]): The past key values.
            Shape (num_layers, 2, batch_size, num_heads, sequence_length // num_heads, hidden_size // num_heads).
        output_attentions (Optional[bool]): Whether to output attention weights. Default is False.
        output_hidden_states (Optional[bool]): Whether to output hidden states. Default is False.
        return_dict (Optional[bool]): Whether to return a BaseModelOutputWithPastAndCrossAttentions. Default is True.

    Returns:
        Union[Tuple[mindspore.Tensor], BaseModelOutputWithPastAndCrossAttentions]:
            The encoded last hidden state, optional hidden states, and optional attention weights.

    Raises:
        None.
    """
    hidden_states = () if output_hidden_states else None
    attentions = () if output_attentions else None

    output = input_embeds
    if output_hidden_states:
        hidden_states = hidden_states + (output,)
    for i, layer in enumerate(self.layers):
        layer_head_mask = head_mask[i] if head_mask is not None else None
        past_key_value = past_key_values[i] if past_key_values is not None else None

        output, opt_attn_weights = layer(
            hidden_states=output,
            attention_mask=attention_mask,
            head_mask=layer_head_mask,
            past_key_value=past_key_value,
        )

        if output_hidden_states:
            hidden_states = hidden_states + (output,)
        if output_attentions:
            attentions = attentions + (opt_attn_weights,)

    last_hidden_state = output
    if not return_dict:
        return tuple(v for v in [last_hidden_state, hidden_states, attentions] if v is not None)

    return BaseModelOutputWithPastAndCrossAttentions(
        last_hidden_state=last_hidden_state, hidden_states=hidden_states, attentions=attentions
    )
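The accumulation and return logic of the encoder loop above can be exercised without MindSpore. The following is a sketch under stated assumptions: `run_layers` and `make_toy_layer` are hypothetical names, the toy layers operate on integers instead of tensors, and a plain dict stands in for `BaseModelOutputWithPastAndCrossAttentions`.

```python
def run_layers(layers, inputs, head_mask=None, output_hidden_states=False,
               output_attentions=False, return_dict=True):
    """Toy version of the encoder loop: same bookkeeping, no tensors."""
    hidden_states = () if output_hidden_states else None
    attentions = () if output_attentions else None

    output = inputs
    if output_hidden_states:
        # The embedding output is recorded before any layer runs.
        hidden_states = hidden_states + (output,)
    for i, layer in enumerate(layers):
        # Each layer receives its own slice of the head mask, if one was given.
        layer_head_mask = head_mask[i] if head_mask is not None else None
        output, attn = layer(output, head_mask=layer_head_mask)
        if output_hidden_states:
            hidden_states = hidden_states + (output,)
        if output_attentions:
            attentions = attentions + (attn,)

    if not return_dict:
        # Entries that were not requested are dropped, so tuple positions shift.
        return tuple(v for v in [output, hidden_states, attentions] if v is not None)
    # A dict stands in for the model-output class here (assumption).
    return {"last_hidden_state": output, "hidden_states": hidden_states, "attentions": attentions}


def make_toy_layer(index):
    """Build a stand-in layer that adds 1 and reports a fake attention label."""
    def layer(x, head_mask=None):
        return x + 1, f"attn-{index}"
    return layer


layers = [make_toy_layer(i) for i in range(3)]
print(run_layers(layers, 0, output_hidden_states=True, return_dict=False))
# (3, (0, 1, 2, 3))
```

Two details follow from this bookkeeping: `hidden_states` holds one more entry than there are layers, and with `return_dict=False` the attention weights sit at index 1 whenever hidden states were not requested.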

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMEncoderLayer` ¶

Bases: Module

The ErnieMEncoderLayer class represents a single layer of the ErnieM (Enhanced Representation through kNowledge Integration) encoder, which is designed for natural language processing tasks. This class inherits from the nn.Module class and implements the functionality for processing input hidden states using multi-head self-attention mechanism and feedforward neural network layers with layer normalization and dropout.

ATTRIBUTE	DESCRIPTION
`self_attn`	Instance of ErnieMAttention for multi-head self-attention mechanism.
`linear1`	Instance of nn.Linear for the first feedforward neural network layer.
`dropout`	Instance of nn.Dropout for applying dropout within the feedforward network.
`linear2`	Instance of nn.Linear for the second feedforward neural network layer.
`norm1`	Instance of nn.LayerNorm for the first layer normalization.
`norm2`	Instance of nn.LayerNorm for the second layer normalization.
`dropout1`	Instance of nn.Dropout applied to the self-attention output before the first residual connection.
`dropout2`	Instance of nn.Dropout applied to the feedforward output before the second residual connection.
`activation`	Activation function for the feedforward network.

METHOD	DESCRIPTION
`forward`	Applies the multi-head self-attention mechanism and feedforward network layers to the input hidden states, optionally producing attention weights.

PARAMETER	DESCRIPTION
`hidden_states`	The input hidden states. TYPE: `Tensor`
`attention_mask`	Optional tensor for masking the attention scores. TYPE: `Optional[Tensor]`
`head_mask`	Optional tensor for masking specific attention heads. TYPE: `Optional[Tensor]`
`past_key_value`	Optional tuple containing past key and value tensors for fast decoding. TYPE: `Optional[Tuple[Tuple[Tensor]]]`
`output_attentions`	Optional boolean indicating whether to return attention weights. TYPE: `Optional[bool]`

RETURNS	DESCRIPTION
	mindspore.Tensor or Tuple[mindspore.Tensor]: The processed hidden states and optionally the attention weights.

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

class ErnieMEncoderLayer(nn.Module):

    """
    The ErnieMEncoderLayer class represents a single layer of the ErnieM (Enhanced Representation through kNowledge 
    Integration) encoder, which is designed for natural language processing tasks. This class inherits from the nn.Module 
    class and implements the functionality for processing input hidden states using multi-head self-attention mechanism 
    and feedforward neural network layers with layer normalization and dropout.

    Attributes:
        self_attn: Instance of ErnieMAttention for multi-head self-attention mechanism.
        linear1: Instance of nn.Linear for the first feedforward neural network layer.
        dropout: Instance of nn.Dropout for applying dropout within the feedforward network.
        linear2: Instance of nn.Linear for the second feedforward neural network layer.
        norm1: Instance of nn.LayerNorm for the first layer normalization.
        norm2: Instance of nn.LayerNorm for the second layer normalization.
        dropout1: Instance of nn.Dropout applied to the self-attention output before the first residual connection.
        dropout2: Instance of nn.Dropout applied to the feedforward output before the second residual connection.
        activation: Activation function for the feedforward network.

    Methods:
        forward(self, hidden_states, attention_mask=None, head_mask=None, past_key_value=None, output_attentions=True):
            Applies the multi-head self-attention mechanism and feedforward network layers to the input hidden states, 
            optionally producing attention weights.

            Args:

            - hidden_states (mindspore.Tensor): The input hidden states.
            - attention_mask (Optional[mindspore.Tensor]): Optional tensor for masking the attention scores.
            - head_mask (Optional[mindspore.Tensor]): Optional tensor for masking specific attention heads.
            - past_key_value (Optional[Tuple[Tuple[mindspore.Tensor]]]):
            Optional tuple containing past key and value tensors for fast decoding.
            - output_attentions (Optional[bool]): Optional boolean indicating whether to return attention weights.

            Returns:

            - mindspore.Tensor or Tuple[mindspore.Tensor]: The processed hidden states and optionally the attention weights.
    """
    def __init__(self, config):
        """
        Initialize an instance of the ErnieMEncoderLayer class.

        Args:
            self (ErnieMEncoderLayer): The instance of the ErnieMEncoderLayer class.
            config (object): 
                An object containing configuration parameters for the encoder layer.

                - hidden_dropout_prob (float): The probability of dropout for hidden layers. Default is 0.1.
                - act_dropout (float): The probability of dropout for activation functions. 
                Default is the value of hidden_dropout_prob.
                - hidden_size (int): The size of the hidden layers.
                - intermediate_size (int): The size of the intermediate layers.
                - layer_norm_eps (float): The epsilon value for layer normalization.
                - hidden_act (str or function): The activation function to be used. 
                If a string, it will be converted to a function using ACT2FN dictionary.

        Returns:
            None.

        Raises:
            None.
        """
        super().__init__()
        # to mimic paddlenlp implementation
        dropout = 0.1 if config.hidden_dropout_prob is None else config.hidden_dropout_prob
        act_dropout = config.hidden_dropout_prob if config.act_dropout is None else config.act_dropout

        self.self_attn = ErnieMAttention(config)
        self.linear1 = nn.Linear(config.hidden_size, config.intermediate_size)
        self.dropout = nn.Dropout(p=act_dropout)
        self.linear2 = nn.Linear(config.intermediate_size, config.hidden_size)
        self.norm1 = nn.LayerNorm([config.hidden_size], eps=config.layer_norm_eps)
        self.norm2 = nn.LayerNorm([config.hidden_size], eps=config.layer_norm_eps)
        self.dropout1 = nn.Dropout(p=dropout)
        self.dropout2 = nn.Dropout(p=dropout)
        if isinstance(config.hidden_act, str):
            self.activation = ACT2FN[config.hidden_act]
        else:
            self.activation = config.hidden_act

    def forward(
        self,
        hidden_states: mindspore.Tensor,
        attention_mask: Optional[mindspore.Tensor] = None,
        head_mask: Optional[mindspore.Tensor] = None,
        past_key_value: Optional[Tuple[Tuple[mindspore.Tensor]]] = None,
        output_attentions: Optional[bool] = True,
    ):
        """
        Runs the forward pass of a single ErnieMEncoderLayer.

        This method applies the ErnieMEncoderLayer transformation to the input hidden states.

        Args:
            self: An instance of the ErnieMEncoderLayer class.
            hidden_states (mindspore.Tensor): The input hidden states. This should be a tensor.
            attention_mask (Optional[mindspore.Tensor]): The attention mask tensor. Defaults to None.
            head_mask (Optional[mindspore.Tensor]): The head mask tensor. Defaults to None.
            past_key_value (Optional[Tuple[Tuple[mindspore.Tensor]]]): The past key value tensor. Defaults to None.
            output_attentions (Optional[bool]): Whether to output attention weights. Defaults to True.

        Returns:
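            Union[mindspore.Tensor, Tuple[mindspore.Tensor, mindspore.Tensor]]: The processed hidden states,
                plus the attention weights when output_attentions is True.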

        Raises:
            None.
        """
        residual = hidden_states
        if output_attentions:
            hidden_states, attention_opt_weights = self.self_attn(
                hidden_states=hidden_states,
                attention_mask=attention_mask,
                head_mask=head_mask,
                past_key_value=past_key_value,
                output_attentions=output_attentions,
            )

        else:
            hidden_states = self.self_attn(
                hidden_states=hidden_states,
                attention_mask=attention_mask,
                head_mask=head_mask,
                past_key_value=past_key_value,
                output_attentions=output_attentions,
            )
        hidden_states = residual + self.dropout1(hidden_states)
        hidden_states = self.norm1(hidden_states)
        residual = hidden_states

        hidden_states = self.linear1(hidden_states)
        hidden_states = self.activation(hidden_states)
        hidden_states = self.dropout(hidden_states)
        hidden_states = self.linear2(hidden_states)
        hidden_states = residual + self.dropout2(hidden_states)
        hidden_states = self.norm2(hidden_states)

        if output_attentions:
            return hidden_states, attention_opt_weights
        return hidden_states

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMEncoderLayer.__init__(config)` ¶

Initialize an instance of the ErnieMEncoderLayer class.

PARAMETER	DESCRIPTION
`self`	The instance of the ErnieMEncoderLayer class. TYPE: `ErnieMEncoderLayer`
`config`	An object containing configuration parameters for the encoder layer. `hidden_dropout_prob` (float): The probability of dropout for hidden layers. Default is 0.1. `act_dropout` (float): The probability of dropout for activation functions. Default is the value of `hidden_dropout_prob`. `hidden_size` (int): The size of the hidden layers. `intermediate_size` (int): The size of the intermediate layers. `layer_norm_eps` (float): The epsilon value for layer normalization. `hidden_act` (str or function): The activation function to be used. If a string, it will be converted to a function using the ACT2FN dictionary. TYPE: `object`

RETURNS	DESCRIPTION
	None.

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

def __init__(self, config):
    """
    Initialize an instance of the ErnieMEncoderLayer class.

    Args:
        self (ErnieMEncoderLayer): The instance of the ErnieMEncoderLayer class.
        config (object): 
            An object containing configuration parameters for the encoder layer.

            - hidden_dropout_prob (float): The probability of dropout for hidden layers. Default is 0.1.
            - act_dropout (float): The probability of dropout for activation functions. 
            Default is the value of hidden_dropout_prob.
            - hidden_size (int): The size of the hidden layers.
            - intermediate_size (int): The size of the intermediate layers.
            - layer_norm_eps (float): The epsilon value for layer normalization.
            - hidden_act (str or function): The activation function to be used. 
            If a string, it will be converted to a function using ACT2FN dictionary.

    Returns:
        None.

    Raises:
        None.
    """
    super().__init__()
    # to mimic paddlenlp implementation
    dropout = 0.1 if config.hidden_dropout_prob is None else config.hidden_dropout_prob
    act_dropout = config.hidden_dropout_prob if config.act_dropout is None else config.act_dropout

    self.self_attn = ErnieMAttention(config)
    self.linear1 = nn.Linear(config.hidden_size, config.intermediate_size)
    self.dropout = nn.Dropout(p=act_dropout)
    self.linear2 = nn.Linear(config.intermediate_size, config.hidden_size)
    self.norm1 = nn.LayerNorm([config.hidden_size], eps=config.layer_norm_eps)
    self.norm2 = nn.LayerNorm([config.hidden_size], eps=config.layer_norm_eps)
    self.dropout1 = nn.Dropout(p=dropout)
    self.dropout2 = nn.Dropout(p=dropout)
    if isinstance(config.hidden_act, str):
        self.activation = ACT2FN[config.hidden_act]
    else:
        self.activation = config.hidden_act
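The fallback rules in this constructor are easy to misread, so here is a small standalone sketch of them. `resolve_dropouts`, `resolve_activation`, and `TOY_ACT2FN` are hypothetical names introduced for illustration; `TOY_ACT2FN` is a two-entry stand-in and not the library's `ACT2FN` mapping.

```python
import math

# Hypothetical stand-in for the ACT2FN lookup table (assumption, scalar functions only).
TOY_ACT2FN = {
    "relu": lambda x: max(x, 0.0),
    "gelu": lambda x: 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0))),
}


def resolve_dropouts(hidden_dropout_prob, act_dropout):
    """Mirror the two fallback expressions from the constructor above."""
    dropout = 0.1 if hidden_dropout_prob is None else hidden_dropout_prob
    # Note: the fallback reads the raw config value, not the resolved `dropout`.
    act = hidden_dropout_prob if act_dropout is None else act_dropout
    return dropout, act


def resolve_activation(hidden_act):
    """Accept either a name to look up or an already-callable activation."""
    if isinstance(hidden_act, str):
        return TOY_ACT2FN[hidden_act]
    return hidden_act


print(resolve_dropouts(0.2, None))        # (0.2, 0.2)
print(resolve_dropouts(None, None))       # (0.1, None)
print(resolve_activation("relu")(-3.0))   # 0.0
```

The second call shows a corner case of the source expressions: when both config values are None, only the hidden dropout receives the 0.1 default, and the activation dropout stays None.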

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMEncoderLayer.forward(hidden_states, attention_mask=None, head_mask=None, past_key_value=None, output_attentions=True)` ¶

Runs the forward pass of a single ErnieMEncoderLayer.

This method applies the ErnieMEncoderLayer transformation to the input hidden states.

PARAMETER	DESCRIPTION
`self`	An instance of the ErnieMEncoderLayer class.
`hidden_states`	The input hidden states. This should be a tensor. TYPE: `Tensor`
`attention_mask`	The attention mask tensor. Defaults to None. TYPE: `Optional[Tensor]` DEFAULT: `None`
`head_mask`	The head mask tensor. Defaults to None. TYPE: `Optional[Tensor]` DEFAULT: `None`
`past_key_value`	The past key value tensor. Defaults to None. TYPE: `Optional[Tuple[Tuple[Tensor]]]` DEFAULT: `None`
`output_attentions`	Whether to output attention weights. Defaults to True. TYPE: `Optional[bool]` DEFAULT: `True`

RETURNS	DESCRIPTION
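	mindspore.Tensor or Tuple[mindspore.Tensor]: The processed hidden states, plus the attention weights when output_attentions is True.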

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

def forward(
    self,
    hidden_states: mindspore.Tensor,
    attention_mask: Optional[mindspore.Tensor] = None,
    head_mask: Optional[mindspore.Tensor] = None,
    past_key_value: Optional[Tuple[Tuple[mindspore.Tensor]]] = None,
    output_attentions: Optional[bool] = True,
):
    """
    Runs the forward pass of a single ErnieMEncoderLayer.

    This method applies the ErnieMEncoderLayer transformation to the input hidden states.

    Args:
        self: An instance of the ErnieMEncoderLayer class.
        hidden_states (mindspore.Tensor): The input hidden states. This should be a tensor.
        attention_mask (Optional[mindspore.Tensor]): The attention mask tensor. Defaults to None.
        head_mask (Optional[mindspore.Tensor]): The head mask tensor. Defaults to None.
        past_key_value (Optional[Tuple[Tuple[mindspore.Tensor]]]): The past key value tensor. Defaults to None.
        output_attentions (Optional[bool]): Whether to output attention weights. Defaults to True.

    Returns:
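        Union[mindspore.Tensor, Tuple[mindspore.Tensor, mindspore.Tensor]]: The processed hidden states,
            plus the attention weights when output_attentions is True.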

    Raises:
        None.
    """
    residual = hidden_states
    if output_attentions:
        hidden_states, attention_opt_weights = self.self_attn(
            hidden_states=hidden_states,
            attention_mask=attention_mask,
            head_mask=head_mask,
            past_key_value=past_key_value,
            output_attentions=output_attentions,
        )

    else:
        hidden_states = self.self_attn(
            hidden_states=hidden_states,
            attention_mask=attention_mask,
            head_mask=head_mask,
            past_key_value=past_key_value,
            output_attentions=output_attentions,
        )
    hidden_states = residual + self.dropout1(hidden_states)
    hidden_states = self.norm1(hidden_states)
    residual = hidden_states

    hidden_states = self.linear1(hidden_states)
    hidden_states = self.activation(hidden_states)
    hidden_states = self.dropout(hidden_states)
    hidden_states = self.linear2(hidden_states)
    hidden_states = residual + self.dropout2(hidden_states)
    hidden_states = self.norm2(hidden_states)

    if output_attentions:
        return hidden_states, attention_opt_weights
    return hidden_states
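The layer above applies layer normalization after each residual addition (a post-LN arrangement). The sketch below reproduces that ordering on a single vector, assuming evaluation mode so that all three dropouts are identity, and omitting LayerNorm's learned scale and bias. `layer_norm` and `post_ln_layer` are hypothetical helpers, and the two lambdas are placeholders for the attention and feedforward sub-blocks.

```python
import math


def layer_norm(vector, eps=1e-5):
    """Normalize one vector to zero mean and unit variance (no learned scale or bias)."""
    mean = sum(vector) / len(vector)
    variance = sum((v - mean) ** 2 for v in vector) / len(vector)
    return [(v - mean) / math.sqrt(variance + eps) for v in vector]


def post_ln_layer(hidden, attention_fn, feed_forward_fn, eps=1e-5):
    """Residual add first, then normalize, for both sub-blocks."""
    # Sub-block 1: self-attention, residual connection, norm1.
    attended = attention_fn(hidden)
    hidden = layer_norm([r + a for r, a in zip(hidden, attended)], eps)
    # Sub-block 2: feedforward (linear1, activation, linear2), residual connection, norm2.
    projected = feed_forward_fn(hidden)
    return layer_norm([r + p for r, p in zip(hidden, projected)], eps)


out = post_ln_layer(
    [1.0, 2.0, 3.0, 4.0],
    attention_fn=lambda h: [0.5 * v for v in h],        # placeholder for self-attention
    feed_forward_fn=lambda h: [max(v, 0.0) for v in h],  # placeholder for the feedforward path
)
print(out)
```

Since normalization is the last step, the output of every layer has roughly zero mean and unit variance per position, whatever the sub-blocks produce.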

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMForInformationExtraction` ¶

Bases: ErnieMPreTrainedModel

ErnieMForInformationExtraction is a class that represents an ErnieM model for information extraction tasks. It inherits from ErnieMPreTrainedModel and includes methods for initializing the model and running the forward pass.

ATTRIBUTE	DESCRIPTION
`ernie_m`	The ErnieM model used for information extraction. TYPE: `ErnieMModel`
`linear_start`	Linear layer for predicting the start position in the input sequence. TYPE: `Linear`
`linear_end`	Linear layer for predicting the end position in the input sequence. TYPE: `Linear`
`sigmoid`	Sigmoid activation function for probability calculation. TYPE: `Sigmoid`

METHOD	DESCRIPTION
`__init__`	Initializes the ErnieMForInformationExtraction class with the provided configuration.
`forward`	Runs the forward pass of the model for information extraction tasks.

PARAMETER	DESCRIPTION
`input_ids`	Input tensor containing token ids. TYPE: `Tensor`
`attention_mask`	Tensor specifying which tokens should be attended to. TYPE: `Tensor`
`position_ids`	Tensor specifying the position ids of tokens. TYPE: `Tensor`
`head_mask`	Tensor for masking specific heads in the self-attention layers. TYPE: `Tensor`
`inputs_embeds`	Tensor for providing custom embeddings instead of token ids. TYPE: `Tensor`
`start_positions`	Labels for start positions in the input sequence. TYPE: `Tensor`
`end_positions`	Labels for end positions in the input sequence. TYPE: `Tensor`
`output_attentions`	Flag to output attention weights. TYPE: `bool`
`output_hidden_states`	Flag to output hidden states. TYPE: `bool`
`return_dict`	Flag to return outputs as a dictionary. TYPE: `bool`

RETURNS	DESCRIPTION
	Union[Tuple[mindspore.Tensor], QuestionAnsweringModelOutput]: Tuple of output tensors or a QuestionAnsweringModelOutput object.

RAISES	DESCRIPTION
`ValueError`	If start_positions or end_positions are not of the expected shape.
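The head described above (one linear layer per boundary, followed by a sigmoid) can be sketched in plain Python without MindSpore. This is only an illustration: `token_probabilities`, the weights, and the toy hidden vectors are hypothetical and stand in for `linear_start`, `linear_end`, and the encoder output.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def token_probabilities(sequence_output, weight, bias):
    """Score every token with one linear unit, then apply a sigmoid.

    sequence_output: list of per-token hidden vectors (seq_len x hidden_size).
    weight, bias: hypothetical parameters of a hidden_size -> 1 linear layer.
    """
    probs = []
    for hidden in sequence_output:
        logit = sum(w * h for w, h in zip(weight, hidden)) + bias
        probs.append(sigmoid(logit))
    return probs


# Toy encoder output: 3 tokens, hidden_size = 2 (made-up values).
sequence_output = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
start_prob = token_probabilities(sequence_output, weight=[2.0, -2.0], bias=0.0)
end_prob = token_probabilities(sequence_output, weight=[-2.0, 2.0], bias=0.0)
```

Each token receives an independent start probability and end probability, which is why the head uses a sigmoid rather than a softmax over the sequence.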

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

class ErnieMForInformationExtraction(ErnieMPreTrainedModel):

    """
    ErnieMForInformationExtraction wraps an ErnieM model for information extraction tasks.
    It inherits from ErnieMPreTrainedModel and provides methods for initializing the model and running the forward pass.

    Attributes:
        ernie_m (ErnieMModel): The ErnieM model used for information extraction.
        linear_start (nn.Linear): Linear layer for predicting the start position in the input sequence.
        linear_end (nn.Linear): Linear layer for predicting the end position in the input sequence.
        sigmoid (nn.Sigmoid): Sigmoid activation function for probability calculation.

    Methods:
        __init__: Initializes the ErnieMForInformationExtraction class with the provided configuration.
        forward: Constructs the forward pass of the model for information extraction tasks.

    Args:
        input_ids (mindspore.Tensor): Input tensor containing token ids.
        attention_mask (mindspore.Tensor): Tensor specifying which tokens should be attended to.
        position_ids (mindspore.Tensor): Tensor specifying the position ids of tokens.
        head_mask (mindspore.Tensor): Tensor for masking specific heads in the self-attention layers.
        inputs_embeds (mindspore.Tensor): Tensor for providing custom embeddings instead of token ids.
        start_positions (mindspore.Tensor): Labels for start positions in the input sequence.
        end_positions (mindspore.Tensor): Labels for end positions in the input sequence.
        output_attentions (bool): Flag to output attention weights.
        output_hidden_states (bool): Flag to output hidden states.
        return_dict (bool): Flag to return outputs as a dictionary.

    Returns:
        Union[Tuple[mindspore.Tensor], QuestionAnsweringModelOutput]: Tuple of output tensors or a QuestionAnsweringModelOutput object.

    Raises:
        ValueError: If start_positions or end_positions are not of the expected shape.

    """
    def __init__(self, config):
        """
        Initializes a new instance of the ErnieMForInformationExtraction class.

        Args:
            self: The instance of the class.
            config: An instance of the ErnieMConfig class containing the configuration parameters for the model.

        Returns:
            None

        Raises:
            None
        """
        super().__init__(config)
        self.ernie_m = ErnieMModel(config)
        self.linear_start = nn.Linear(config.hidden_size, 1)
        self.linear_end = nn.Linear(config.hidden_size, 1)
        self.sigmoid = nn.Sigmoid()
        self.post_init()

    def forward(
        self,
        input_ids: Optional[mindspore.Tensor] = None,
        attention_mask: Optional[mindspore.Tensor] = None,
        position_ids: Optional[mindspore.Tensor] = None,
        head_mask: Optional[mindspore.Tensor] = None,
        inputs_embeds: Optional[mindspore.Tensor] = None,
        start_positions: Optional[mindspore.Tensor] = None,
        end_positions: Optional[mindspore.Tensor] = None,
        output_attentions: Optional[bool] = None,
        output_hidden_states: Optional[bool] = None,
        return_dict: Optional[bool] = True,
    ) -> Union[Tuple[mindspore.Tensor], QuestionAnsweringModelOutput]:
        r"""
        Args:
            start_positions (`mindspore.Tensor` of shape `(batch_size, sequence_length)`, *optional*):
                Labels for position (index) for computing the start_positions loss. Position outside of the sequence are
                not taken into account for computing the loss.
            end_positions (`mindspore.Tensor` of shape `(batch_size,)`, *optional*):
                Labels for position (index) for computing the end_positions loss. Position outside of the sequence are not
                taken into account for computing the loss.
        """
        result = self.ernie_m(
            input_ids,
            attention_mask=attention_mask,
            position_ids=position_ids,
            head_mask=head_mask,
            inputs_embeds=inputs_embeds,
            output_attentions=output_attentions,
            output_hidden_states=output_hidden_states,
            return_dict=return_dict,
        )
        if return_dict:
            sequence_output = result.last_hidden_state
        elif not return_dict:
            sequence_output = result[0]

        start_logits = self.linear_start(sequence_output)
        start_logits = start_logits.squeeze(-1)
        start_prob = self.sigmoid(start_logits)
        end_logits = self.linear_end(sequence_output)
        end_logits = end_logits.squeeze(-1)
        end_prob = self.sigmoid(end_logits)

        total_loss = None
        if start_positions is not None and end_positions is not None:
            # If we are on multi-GPU, split add a dimension
            if len(start_positions.shape) > 1 and start_positions.shape[-1] == 1:
                start_positions = start_positions.squeeze(-1)
            if len(end_positions.shape) > 1 and end_positions.shape[-1] == 1:
                end_positions = end_positions.squeeze(-1)
            # sometimes the start/end positions are outside our model inputs, we ignore these terms
            ignored_index = start_logits.shape[1]
            start_positions = start_positions.clamp(0, ignored_index)
            end_positions = end_positions.clamp(0, ignored_index)

            start_loss = F.binary_cross_entropy_with_logits(start_prob, start_positions)
            end_loss = F.binary_cross_entropy_with_logits(end_prob, end_positions)
            total_loss = (start_loss + end_loss) / 2

        if not return_dict:
            return tuple(
                i
                for i in [total_loss, start_prob, end_prob, result.hidden_states, result.attentions]
                if i is not None
            )

        return QuestionAnsweringModelOutput(
            loss=total_loss,
            start_logits=start_prob,
            end_logits=end_prob,
            hidden_states=result.hidden_states,
            attentions=result.attentions,
        )

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMForInformationExtraction.__init__(config)` ¶

Initializes a new instance of the ErnieMForInformationExtraction class.

PARAMETER	DESCRIPTION
`self`	The instance of the class.
`config`	An instance of the ErnieMConfig class containing the configuration parameters for the model.

RETURNS	DESCRIPTION
	None

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

def __init__(self, config):
    """
    Initializes a new instance of the ErnieMForInformationExtraction class.

    Args:
        self: The instance of the class.
        config: An instance of the ErnieMConfig class containing the configuration parameters for the model.

    Returns:
        None

    Raises:
        None
    """
    super().__init__(config)
    self.ernie_m = ErnieMModel(config)
    self.linear_start = nn.Linear(config.hidden_size, 1)
    self.linear_end = nn.Linear(config.hidden_size, 1)
    self.sigmoid = nn.Sigmoid()
    self.post_init()

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMForInformationExtraction.forward(input_ids=None, attention_mask=None, position_ids=None, head_mask=None, inputs_embeds=None, start_positions=None, end_positions=None, output_attentions=None, output_hidden_states=None, return_dict=True)` ¶

PARAMETER	DESCRIPTION
`start_positions`	Labels for position (index) for computing the start_positions loss. Positions outside of the sequence are not taken into account for computing the loss. TYPE: `mindspore.Tensor` of shape `(batch_size, sequence_length)`, optional DEFAULT: `None`
`end_positions`	Labels for position (index) for computing the end_positions loss. Positions outside of the sequence are not taken into account for computing the loss. TYPE: `mindspore.Tensor` of shape `(batch_size,)`, optional DEFAULT: `None`
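When both label tensors are supplied, the source below averages a start loss and an end loss. A minimal sketch of that averaging, using textbook binary cross-entropy on probabilities, might look like the following. Note the assumption: the listed source passes the sigmoid outputs to `F.binary_cross_entropy_with_logits`, so its numeric values will differ from this plain-Python version; the helper name and toy values here are hypothetical.

```python
import math


def binary_cross_entropy(probs, labels, eps=1e-12):
    """Mean binary cross-entropy between probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total += -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    return total / len(probs)


# Made-up per-token probabilities and labels for one 3-token sequence.
start_prob = [0.9, 0.1, 0.2]
end_prob = [0.1, 0.2, 0.8]
start_labels = [1.0, 0.0, 0.0]
end_labels = [0.0, 0.0, 1.0]

start_loss = binary_cross_entropy(start_prob, start_labels)
end_loss = binary_cross_entropy(end_prob, end_labels)
total_loss = (start_loss + end_loss) / 2
```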

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

def forward(
    self,
    input_ids: Optional[mindspore.Tensor] = None,
    attention_mask: Optional[mindspore.Tensor] = None,
    position_ids: Optional[mindspore.Tensor] = None,
    head_mask: Optional[mindspore.Tensor] = None,
    inputs_embeds: Optional[mindspore.Tensor] = None,
    start_positions: Optional[mindspore.Tensor] = None,
    end_positions: Optional[mindspore.Tensor] = None,
    output_attentions: Optional[bool] = None,
    output_hidden_states: Optional[bool] = None,
    return_dict: Optional[bool] = True,
) -> Union[Tuple[mindspore.Tensor], QuestionAnsweringModelOutput]:
    r"""
    Args:
        start_positions (`mindspore.Tensor` of shape `(batch_size, sequence_length)`, *optional*):
            Labels for position (index) for computing the start_positions loss. Position outside of the sequence are
            not taken into account for computing the loss.
        end_positions (`mindspore.Tensor` of shape `(batch_size,)`, *optional*):
            Labels for position (index) for computing the end_positions loss. Position outside of the sequence are not
            taken into account for computing the loss.
    """
    result = self.ernie_m(
        input_ids,
        attention_mask=attention_mask,
        position_ids=position_ids,
        head_mask=head_mask,
        inputs_embeds=inputs_embeds,
        output_attentions=output_attentions,
        output_hidden_states=output_hidden_states,
        return_dict=return_dict,
    )
    if return_dict:
        sequence_output = result.last_hidden_state
    elif not return_dict:
        sequence_output = result[0]

    start_logits = self.linear_start(sequence_output)
    start_logits = start_logits.squeeze(-1)
    start_prob = self.sigmoid(start_logits)
    end_logits = self.linear_end(sequence_output)
    end_logits = end_logits.squeeze(-1)
    end_prob = self.sigmoid(end_logits)

    total_loss = None
    if start_positions is not None and end_positions is not None:
        # If we are on multi-GPU, split add a dimension
        if len(start_positions.shape) > 1 and start_positions.shape[-1] == 1:
            start_positions = start_positions.squeeze(-1)
        if len(end_positions.shape) > 1 and end_positions.shape[-1] == 1:
            end_positions = end_positions.squeeze(-1)
        # sometimes the start/end positions are outside our model inputs, we ignore these terms
        ignored_index = start_logits.shape[1]
        start_positions = start_positions.clamp(0, ignored_index)
        end_positions = end_positions.clamp(0, ignored_index)

        start_loss = F.binary_cross_entropy_with_logits(start_prob, start_positions)
        end_loss = F.binary_cross_entropy_with_logits(end_prob, end_positions)
        total_loss = (start_loss + end_loss) / 2

    if not return_dict:
        return tuple(
            i
            for i in [total_loss, start_prob, end_prob, result.hidden_states, result.attentions]
            if i is not None
        )

    return QuestionAnsweringModelOutput(
        loss=total_loss,
        start_logits=start_prob,
        end_logits=end_prob,
        hidden_states=result.hidden_states,
        attentions=result.attentions,
    )

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMForMultipleChoice` ¶

Bases: ErnieMPreTrainedModel

ErnieMForMultipleChoice is a multiple choice question answering model based on the ERNIE-M architecture. It inherits from ErnieMPreTrainedModel and implements the forward pass and the multiple choice classification loss.

ATTRIBUTE	DESCRIPTION
`ernie_m`	The ERNIE-M model used for processing inputs. TYPE: `ErnieMModel`
`dropout`	Dropout layer used in the classifier. TYPE: `Dropout`
`classifier`	Dense layer for classification. TYPE: `Linear`

METHOD	DESCRIPTION
`__init__`	Initializes the ErnieMForMultipleChoice model with the given configuration.
`forward`	Runs the forward pass for multiple choice question answering and computes the classification loss.

The forward method takes various input tensors and parameters, processes them through the ERNIE-M model, applies dropout, and computes the classification logits. If labels are provided, it calculates the cross-entropy loss. The method returns the loss and model outputs based on the return_dict parameter.

This class is designed to be used for multiple choice question answering tasks with ERNIE-M models.
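The shape handling in the forward pass can be illustrated with nested lists: inputs of shape `(batch_size, num_choices, sequence_length)` are flattened so every choice is encoded as its own row, and the one score per row is then regrouped into `(batch_size, num_choices)`. The helpers and the stand-in scoring function below are hypothetical and only mimic the `view` calls in the source.

```python
def flatten_choices(batch):
    """(batch_size, num_choices, seq_len) -> (batch_size * num_choices, seq_len)."""
    return [choice for example in batch for choice in example]


def reshape_logits(flat_logits, num_choices):
    """(batch_size * num_choices,) -> (batch_size, num_choices)."""
    return [
        flat_logits[i:i + num_choices]
        for i in range(0, len(flat_logits), num_choices)
    ]


# Two examples, two choices each, three token ids per choice (made-up ids).
input_ids = [
    [[1, 2, 3], [1, 4, 5]],
    [[6, 7, 8], [6, 9, 10]],
]
num_choices = len(input_ids[0])

flat = flatten_choices(input_ids)
# Stand-in for encoder + dropout + classifier: one score per flattened row.
flat_logits = [float(sum(row)) for row in flat]
reshaped_logits = reshape_logits(flat_logits, num_choices)
```

After regrouping, each row of `reshaped_logits` holds the competing scores for one example, which is the layout a cross-entropy loss over choices expects.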

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

class ErnieMForMultipleChoice(ErnieMPreTrainedModel):

    """
    ErnieMForMultipleChoice is a class that represents a multiple choice question answering model based on the
    ERNIE-M architecture.
    It inherits from ErnieMPreTrainedModel and implements methods for forwarding the model and computing the multiple
    choice classification loss.

    Attributes:
        ernie_m (ErnieMModel): The ERNIE-M model used for processing inputs.
        dropout (nn.Dropout): Dropout layer used in the classifier.
        classifier (nn.Linear): Dense layer for classification.

    Methods:
        __init__: Initializes the ErnieMForMultipleChoice model with the given configuration.
        forward: Constructs the model for multiple choice question answering and computes the classification loss.

    The forward method takes various input tensors and parameters, processes them through the ERNIE-M model,
    applies dropout, and computes the classification logits.
    If labels are provided, it calculates the cross-entropy loss. The method returns the loss and model outputs based on
    the return_dict parameter.

    This class is designed to be used for multiple choice question answering tasks with ERNIE-M models.
    """
    # Copied from transformers.models.bert.modeling_bert.BertForMultipleChoice.__init__ with Bert->ErnieM,bert->ernie_m
    def __init__(self, config):
        """
        Initializes an instance of the ErnieMForMultipleChoice class.

        Args:
            self: The object instance.
            config: An instance of the ErnieMConfig class containing the model configuration.

        Returns:
            None.

        Raises:
            None.
        """
        super().__init__(config)

        self.ernie_m = ErnieMModel(config)
        classifier_dropout = (
            config.classifier_dropout if config.classifier_dropout is not None else config.hidden_dropout_prob
        )
        self.dropout = nn.Dropout(p=classifier_dropout)
        self.classifier = nn.Linear(config.hidden_size, 1)

        # Initialize weights and apply final processing
        self.post_init()

    def forward(
        self,
        input_ids: Optional[mindspore.Tensor] = None,
        attention_mask: Optional[mindspore.Tensor] = None,
        position_ids: Optional[mindspore.Tensor] = None,
        head_mask: Optional[mindspore.Tensor] = None,
        inputs_embeds: Optional[mindspore.Tensor] = None,
        labels: Optional[mindspore.Tensor] = None,
        output_attentions: Optional[bool] = None,
        output_hidden_states: Optional[bool] = None,
        return_dict: Optional[bool] = True,
    ) -> Union[Tuple[mindspore.Tensor], MultipleChoiceModelOutput]:
        r"""
        Args:
            labels (`mindspore.Tensor` of shape `(batch_size,)`, *optional*):
                Labels for computing the multiple choice classification loss. Indices should be in `[0, ...,
                num_choices-1]` where `num_choices` is the size of the second dimension of the input tensors. (See
                `input_ids` above)
        """
        return_dict = return_dict if return_dict is not None else self.config.use_return_dict
        num_choices = input_ids.shape[1] if input_ids is not None else inputs_embeds.shape[1]

        input_ids = input_ids.view(-1, input_ids.shape[-1]) if input_ids is not None else None
        attention_mask = attention_mask.view(-1, attention_mask.shape[-1]) if attention_mask is not None else None
        position_ids = position_ids.view(-1, position_ids.shape[-1]) if position_ids is not None else None
        inputs_embeds = (
            inputs_embeds.view(-1, inputs_embeds.shape[-2], inputs_embeds.shape[-1])
            if inputs_embeds is not None
            else None
        )

        outputs = self.ernie_m(
            input_ids,
            attention_mask=attention_mask,
            position_ids=position_ids,
            head_mask=head_mask,
            inputs_embeds=inputs_embeds,
            output_attentions=output_attentions,
            output_hidden_states=output_hidden_states,
            return_dict=return_dict,
        )

        pooled_output = outputs[1]

        pooled_output = self.dropout(pooled_output)
        logits = self.classifier(pooled_output)
        reshaped_logits = logits.view(-1, num_choices)

        loss = None
        if labels is not None:
            loss = F.cross_entropy(reshaped_logits, labels)

        if not return_dict:
            output = (reshaped_logits,) + outputs[2:]
            return ((loss,) + output) if loss is not None else output

        return MultipleChoiceModelOutput(
            loss=loss,
            logits=reshaped_logits,
            hidden_states=outputs.hidden_states,
            attentions=outputs.attentions,
        )

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMForMultipleChoice.__init__(config)` ¶

Initializes an instance of the ErnieMForMultipleChoice class.

PARAMETER	DESCRIPTION
`self`	The object instance.
`config`	An instance of the ErnieMConfig class containing the model configuration.

RETURNS	DESCRIPTION
	None.

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

def __init__(self, config):
    """
    Initializes an instance of the ErnieMForMultipleChoice class.

    Args:
        self: The object instance.
        config: An instance of the ErnieMConfig class containing the model configuration.

    Returns:
        None.

    Raises:
        None.
    """
    super().__init__(config)

    self.ernie_m = ErnieMModel(config)
    classifier_dropout = (
        config.classifier_dropout if config.classifier_dropout is not None else config.hidden_dropout_prob
    )
    self.dropout = nn.Dropout(p=classifier_dropout)
    self.classifier = nn.Linear(config.hidden_size, 1)

    # Initialize weights and apply final processing
    self.post_init()

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMForMultipleChoice.forward(input_ids=None, attention_mask=None, position_ids=None, head_mask=None, inputs_embeds=None, labels=None, output_attentions=None, output_hidden_states=None, return_dict=True)` ¶

PARAMETER	DESCRIPTION
`labels`	Labels for computing the multiple choice classification loss. Indices should be in `[0, ..., num_choices-1]` where `num_choices` is the size of the second dimension of the input tensors. (See `input_ids` above) TYPE: `mindspore.Tensor` of shape `(batch_size,)`, optional DEFAULT: `None`
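As a rough illustration of how `labels` interact with the regrouped logits, the sketch below computes a softmax cross-entropy per example and averages over the batch. It assumes mean reduction and is a plain-Python stand-in, not the library's `F.cross_entropy`; the logits are made up.

```python
import math


def cross_entropy(reshaped_logits, labels):
    """Mean softmax cross-entropy; labels are indices in [0, num_choices - 1]."""
    losses = []
    for row, label in zip(reshaped_logits, labels):
        peak = max(row)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in row))
        losses.append(log_sum_exp - row[label])
    return sum(losses) / len(losses)


# batch_size = 2, num_choices = 4; one label index per example.
reshaped_logits = [[2.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
labels = [0, 3]
loss = cross_entropy(reshaped_logits, labels)

# With identical scores for all 4 choices the loss is log(4).
uniform_loss = cross_entropy([[0.0] * 4], [1])
```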

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

def forward(
    self,
    input_ids: Optional[mindspore.Tensor] = None,
    attention_mask: Optional[mindspore.Tensor] = None,
    position_ids: Optional[mindspore.Tensor] = None,
    head_mask: Optional[mindspore.Tensor] = None,
    inputs_embeds: Optional[mindspore.Tensor] = None,
    labels: Optional[mindspore.Tensor] = None,
    output_attentions: Optional[bool] = None,
    output_hidden_states: Optional[bool] = None,
    return_dict: Optional[bool] = True,
) -> Union[Tuple[mindspore.Tensor], MultipleChoiceModelOutput]:
    r"""
    Args:
        labels (`mindspore.Tensor` of shape `(batch_size,)`, *optional*):
            Labels for computing the multiple choice classification loss. Indices should be in `[0, ...,
            num_choices-1]` where `num_choices` is the size of the second dimension of the input tensors. (See
            `input_ids` above)
    """
    return_dict = return_dict if return_dict is not None else self.config.use_return_dict
    num_choices = input_ids.shape[1] if input_ids is not None else inputs_embeds.shape[1]

    input_ids = input_ids.view(-1, input_ids.shape[-1]) if input_ids is not None else None
    attention_mask = attention_mask.view(-1, attention_mask.shape[-1]) if attention_mask is not None else None
    position_ids = position_ids.view(-1, position_ids.shape[-1]) if position_ids is not None else None
    inputs_embeds = (
        inputs_embeds.view(-1, inputs_embeds.shape[-2], inputs_embeds.shape[-1])
        if inputs_embeds is not None
        else None
    )

    outputs = self.ernie_m(
        input_ids,
        attention_mask=attention_mask,
        position_ids=position_ids,
        head_mask=head_mask,
        inputs_embeds=inputs_embeds,
        output_attentions=output_attentions,
        output_hidden_states=output_hidden_states,
        return_dict=return_dict,
    )

    pooled_output = outputs[1]

    pooled_output = self.dropout(pooled_output)
    logits = self.classifier(pooled_output)
    reshaped_logits = logits.view(-1, num_choices)

    loss = None
    if labels is not None:
        loss = F.cross_entropy(reshaped_logits, labels)

    if not return_dict:
        output = (reshaped_logits,) + outputs[2:]
        return ((loss,) + output) if loss is not None else output

    return MultipleChoiceModelOutput(
        loss=loss,
        logits=reshaped_logits,
        hidden_states=outputs.hidden_states,
        attentions=outputs.attentions,
    )

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMForQuestionAnswering` ¶

Bases: ErnieMPreTrainedModel

ErnieMForQuestionAnswering is a class that represents a fine-tuned ErnieM model for question answering tasks. It is a subclass of ErnieMPreTrainedModel.

This class extends the functionality of the base ErnieM model by adding a question answering head on top of it. It takes as input the configuration of the model and initializes the necessary components. The class provides a method called 'forward' which performs the forward pass of the model for question answering.

The 'forward' method takes several input tensors such as 'input_ids', 'attention_mask', 'position_ids', 'head_mask', and 'inputs_embeds'. It also supports optional inputs like 'start_positions', 'end_positions', 'output_attentions', 'output_hidden_states', and 'return_dict'. The method returns the question answering model output, which includes the start and end logits, hidden states, attentions, and an optional total loss.

The 'forward' method internally calls the 'ernie_m' submodule (the base ErnieM model) to obtain the sequence output. It then passes the sequence output through the dense layer 'qa_outputs', splits the result along the last axis into start and end logits, and squeezes the trailing dimension of each. If 'start_positions' and 'end_positions' are provided, the method computes the cross-entropy loss between the predicted logits and the provided positions. The total loss is the average of the start and end losses.

The 'forward' method returns the model output in a structured manner based on the 'return_dict' parameter.

- If 'return_dict' is False, the method returns a tuple containing the total loss (only when it is not None), the start logits, the end logits, and any additional hidden states or attentions.
- If 'return_dict' is True, the method returns an instance of the 'QuestionAnsweringModelOutput' class, which encapsulates the output elements as attributes.

Note

- If 'start_positions' and 'end_positions' are not provided, the total loss will be None.
- The start and end positions are clamped to the length of the sequence, and positions outside the sequence are ignored when computing the loss.
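The split step described above can be pictured with nested lists: the `qa_outputs` layer yields two numbers per token, and taking the first and second of each pair gives the start and end logits. The helper name and values below are hypothetical, and the final argmax decoding is an assumption about how a caller might use the logits; it is not part of the class.

```python
def split_start_end(logits):
    """(batch, seq_len, 2) nested lists -> two (batch, seq_len) nested lists."""
    start_logits = [[pair[0] for pair in example] for example in logits]
    end_logits = [[pair[1] for pair in example] for example in logits]
    return start_logits, end_logits


# One example with three tokens; each token has a (start, end) score pair.
logits = [
    [[2.0, -1.0], [0.5, 0.0], [-1.0, 3.0]],
]
start_logits, end_logits = split_start_end(logits)

# Assumed decoding step: pick the highest-scoring start and end token.
predicted_start = max(range(len(start_logits[0])), key=start_logits[0].__getitem__)
predicted_end = max(range(len(end_logits[0])), key=end_logits[0].__getitem__)
```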

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

class ErnieMForQuestionAnswering(ErnieMPreTrainedModel):

    """
    ErnieMForQuestionAnswering is a class that represents a fine-tuned ErnieM model for question answering tasks.
    It is a subclass of ErnieMPreTrainedModel.

    This class extends the functionality of the base ErnieM model by adding a question answering head on top of it.
    It takes as input the configuration of the model and initializes the necessary components.
    The class provides a method called 'forward' which performs the forward pass of the model for question answering.

    The 'forward' method takes several input tensors such as 'input_ids', 'attention_mask', 'position_ids',
    'head_mask', and 'inputs_embeds'. It also supports optional inputs like 'start_positions', 'end_positions',
    'output_attentions', 'output_hidden_states', and 'return_dict'.
    The method returns the question answering model output, which includes the start and end logits, hidden states,
    attentions, and an optional total loss.

    The 'forward' method internally calls the 'ernie_m' method of the base ErnieM model to obtain the sequence output.
    It then passes the sequence output through a dense layer 'qa_outputs' to get the logits for the start and end
    positions. The logits are then processed to obtain the final start and end logits. If 'start_positions' and
    'end_positions' are provided, the method calculates the cross-entropy loss for the predicted logits and the provided
    positions. The total loss is computed as the average of the start and end losses.

    The 'forward' method returns the model output in a structured manner based on the 'return_dict' parameter.

    - If 'return_dict' is False, the method returns a tuple containing the total loss, start logits, end logits, and any
    additional hidden states or attentions.
    - If 'return_dict' is True, the method returns an instance of the 'QuestionAnsweringModelOutput' class, which
    encapsulates the output elements as attributes.

    Note:
        - If 'start_positions' and 'end_positions' are not provided, the total loss will be None.
        - The start and end positions are clamped to the length of the sequence and positions outside the sequence are
        ignored for computing the loss.

    """
    # Copied from transformers.models.bert.modeling_bert.BertForQuestionAnswering.__init__ with Bert->ErnieM,bert->ernie_m
    def __init__(self, config):
        """Initializes a new instance of the ErnieMForQuestionAnswering class.

        Args:
            self: The object itself.
            config: An instance of the ErnieMConfig class containing the model configuration.

        Returns:
            None.

        Raises:
            None.
        """
        super().__init__(config)
        self.num_labels = config.num_labels

        self.ernie_m = ErnieMModel(config, add_pooling_layer=False)
        self.qa_outputs = nn.Linear(config.hidden_size, config.num_labels)

        # Initialize weights and apply final processing
        self.post_init()

    def forward(
        self,
        input_ids: Optional[mindspore.Tensor] = None,
        attention_mask: Optional[mindspore.Tensor] = None,
        position_ids: Optional[mindspore.Tensor] = None,
        head_mask: Optional[mindspore.Tensor] = None,
        inputs_embeds: Optional[mindspore.Tensor] = None,
        start_positions: Optional[mindspore.Tensor] = None,
        end_positions: Optional[mindspore.Tensor] = None,
        output_attentions: Optional[bool] = None,
        output_hidden_states: Optional[bool] = None,
        return_dict: Optional[bool] = True,
    ) -> Union[Tuple[mindspore.Tensor], QuestionAnsweringModelOutput]:
        r"""
        Args:
            start_positions (`mindspore.Tensor` of shape `(batch_size,)`, *optional*):
                Labels for position (index) of the start of the labelled span for computing the token classification loss.
                Positions are clamped to the length of the sequence (`sequence_length`). Position outside of the sequence
                are not taken into account for computing the loss.
            end_positions (`mindspore.Tensor` of shape `(batch_size,)`, *optional*):
                Labels for position (index) of the end of the labelled span for computing the token classification loss.
                Positions are clamped to the length of the sequence (`sequence_length`). Position outside of the sequence
                are not taken into account for computing the loss.
        """
        return_dict = return_dict if return_dict is not None else self.config.use_return_dict

        outputs = self.ernie_m(
            input_ids,
            attention_mask=attention_mask,
            position_ids=position_ids,
            head_mask=head_mask,
            inputs_embeds=inputs_embeds,
            output_attentions=output_attentions,
            output_hidden_states=output_hidden_states,
            return_dict=return_dict,
        )

        sequence_output = outputs[0]

        logits = self.qa_outputs(sequence_output)
        start_logits, end_logits = logits.split(1, axis=-1)
        start_logits = start_logits.squeeze(-1)
        end_logits = end_logits.squeeze(-1)

        total_loss = None
        if start_positions is not None and end_positions is not None:
            # If we are on multi-GPU, split add a dimension
            if len(start_positions.shape) > 1:
                start_positions = start_positions.squeeze(-1)
            if len(end_positions.shape) > 1:
                end_positions = end_positions.squeeze(-1)
            # sometimes the start/end positions are outside our model inputs, we ignore these terms
            ignored_index = start_logits.shape[1]
            start_positions = start_positions.clamp(0, ignored_index)
            end_positions = end_positions.clamp(0, ignored_index)

            start_loss = F.cross_entropy(start_logits, start_positions, ignore_index=ignored_index)
            end_loss = F.cross_entropy(end_logits, end_positions, ignore_index=ignored_index)
            total_loss = (start_loss + end_loss) / 2

        if not return_dict:
            output = (start_logits, end_logits) + outputs[2:]
            return ((total_loss,) + output) if total_loss is not None else output

        return QuestionAnsweringModelOutput(
            loss=total_loss,
            start_logits=start_logits,
            end_logits=end_logits,
            hidden_states=outputs.hidden_states,
            attentions=outputs.attentions,
        )

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMForQuestionAnswering.__init__(config)` ¶

Initializes a new instance of the ErnieMForQuestionAnswering class.

PARAMETER	DESCRIPTION
`self`	The object itself.
`config`	An instance of the ErnieMConfig class containing the model configuration.

RETURNS	DESCRIPTION
	None.

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

def __init__(self, config):
    """Initializes a new instance of the ErnieMForQuestionAnswering class.

    Args:
        self: The object itself.
        config: An instance of the ErnieMConfig class containing the model configuration.

    Returns:
        None.

    Raises:
        None.
    """
    super().__init__(config)
    self.num_labels = config.num_labels

    self.ernie_m = ErnieMModel(config, add_pooling_layer=False)
    self.qa_outputs = nn.Linear(config.hidden_size, config.num_labels)

    # Initialize weights and apply final processing
    self.post_init()

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMForQuestionAnswering.forward(input_ids=None, attention_mask=None, position_ids=None, head_mask=None, inputs_embeds=None, start_positions=None, end_positions=None, output_attentions=None, output_hidden_states=None, return_dict=True)` ¶

PARAMETER	DESCRIPTION
`start_positions`	Labels for the position (index) of the start of the labelled span, used to compute the token classification loss. Positions are clamped to the length of the sequence (`sequence_length`). Positions outside of the sequence are not taken into account for computing the loss. TYPE: `mindspore.Tensor` of shape `(batch_size,)`, optional DEFAULT: `None`
`end_positions`	Labels for the position (index) of the end of the labelled span, used to compute the token classification loss. Positions are clamped to the length of the sequence (`sequence_length`). Positions outside of the sequence are not taken into account for computing the loss. TYPE: `mindspore.Tensor` of shape `(batch_size,)`, optional DEFAULT: `None`
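
The clamping rule described above can be illustrated with a minimal pure-Python sketch. It is not the MindNLP implementation: `clamp`, `cross_entropy`, and `qa_loss` are hypothetical helpers working on nested lists, and the mean-over-non-ignored-rows reduction is an assumption about how `F.cross_entropy` treats `ignore_index`.

```python
import math


def clamp(values, low, high):
    """Clamp every position into the closed range [low, high]."""
    return [min(max(v, low), high) for v in values]


def cross_entropy(logit_rows, targets, ignore_index):
    """Mean negative log-likelihood over rows whose target != ignore_index.

    Assumption: ignored rows are dropped from both the sum and the count.
    """
    losses = []
    for row, target in zip(logit_rows, targets):
        if target == ignore_index:
            continue  # span lies outside the model input: no loss term
        peak = max(row)
        log_z = peak + math.log(sum(math.exp(x - peak) for x in row))
        losses.append(log_z - row[target])
    return sum(losses) / len(losses) if losses else float("nan")


def qa_loss(start_logits, end_logits, start_positions, end_positions):
    """Average of the start and end losses, mirroring the source listing."""
    ignored_index = len(start_logits[0])  # equals sequence_length
    start_positions = clamp(start_positions, 0, ignored_index)
    end_positions = clamp(end_positions, 0, ignored_index)
    start_loss = cross_entropy(start_logits, start_positions, ignored_index)
    end_loss = cross_entropy(end_logits, end_positions, ignored_index)
    return (start_loss + end_loss) / 2


# Two examples, sequence_length = 4, uniform logits.
uniform = [[0.0] * 4, [0.0] * 4]
# The second start position (99) clamps to 4 == ignored_index and is skipped.
loss = qa_loss(uniform, uniform, [1, 99], [2, 3])
```

Note that in this sketch a negative position clamps to `0`, which is a valid index, so only positions at or beyond `sequence_length` end up ignored.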

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

def forward(
    self,
    input_ids: Optional[mindspore.Tensor] = None,
    attention_mask: Optional[mindspore.Tensor] = None,
    position_ids: Optional[mindspore.Tensor] = None,
    head_mask: Optional[mindspore.Tensor] = None,
    inputs_embeds: Optional[mindspore.Tensor] = None,
    start_positions: Optional[mindspore.Tensor] = None,
    end_positions: Optional[mindspore.Tensor] = None,
    output_attentions: Optional[bool] = None,
    output_hidden_states: Optional[bool] = None,
    return_dict: Optional[bool] = True,
) -> Union[Tuple[mindspore.Tensor], QuestionAnsweringModelOutput]:
    r"""
    Args:
        start_positions (`mindspore.Tensor` of shape `(batch_size,)`, *optional*):
            Labels for position (index) of the start of the labelled span for computing the token classification loss.
            Positions are clamped to the length of the sequence (`sequence_length`). Positions outside of the sequence
            are not taken into account for computing the loss.
        end_positions (`mindspore.Tensor` of shape `(batch_size,)`, *optional*):
            Labels for position (index) of the end of the labelled span for computing the token classification loss.
            Positions are clamped to the length of the sequence (`sequence_length`). Positions outside of the sequence
            are not taken into account for computing the loss.
    """
    return_dict = return_dict if return_dict is not None else self.config.use_return_dict

    outputs = self.ernie_m(
        input_ids,
        attention_mask=attention_mask,
        position_ids=position_ids,
        head_mask=head_mask,
        inputs_embeds=inputs_embeds,
        output_attentions=output_attentions,
        output_hidden_states=output_hidden_states,
        return_dict=return_dict,
    )

    sequence_output = outputs[0]

    logits = self.qa_outputs(sequence_output)
    start_logits, end_logits = logits.split(1, axis=-1)
    start_logits = start_logits.squeeze(-1)
    end_logits = end_logits.squeeze(-1)

    total_loss = None
    if start_positions is not None and end_positions is not None:
        # On multi-GPU, splitting the batch can add an extra dimension; squeeze it away
        if len(start_positions.shape) > 1:
            start_positions = start_positions.squeeze(-1)
        if len(end_positions.shape) > 1:
            end_positions = end_positions.squeeze(-1)
        # sometimes the start/end positions are outside our model inputs, we ignore these terms
        ignored_index = start_logits.shape[1]
        start_positions = start_positions.clamp(0, ignored_index)
        end_positions = end_positions.clamp(0, ignored_index)

        start_loss = F.cross_entropy(start_logits, start_positions, ignore_index=ignored_index)
        end_loss = F.cross_entropy(end_logits, end_positions, ignore_index=ignored_index)
        total_loss = (start_loss + end_loss) / 2

    if not return_dict:
        output = (start_logits, end_logits) + outputs[2:]
        return ((total_loss,) + output) if total_loss is not None else output

    return QuestionAnsweringModelOutput(
        loss=total_loss,
        start_logits=start_logits,
        end_logits=end_logits,
        hidden_states=outputs.hidden_states,
        attentions=outputs.attentions,
    )
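
When `return_dict` is falsy, the listing above returns a plain tuple instead of a `QuestionAnsweringModelOutput`. The following sketch only mirrors that tuple-building logic; `to_tuple_output` is a hypothetical helper, and the strings stand in for tensors.

```python
def to_tuple_output(loss, head_outputs, encoder_outputs):
    """Build the tuple returned when return_dict is falsy.

    head_outputs: the task-head results, e.g. (start_logits, end_logits).
    encoder_outputs: the tuple returned by the base model; its first two
    entries are dropped, as in `outputs[2:]` in the source listing.
    """
    output = tuple(head_outputs) + tuple(encoder_outputs[2:])
    # The loss is prepended only when labels were supplied.
    return (loss,) + output if loss is not None else output


# Placeholder values standing in for tensors (hypothetical).
encoder_outputs = ("sequence_output", "second_entry", "hidden_states", "attentions")
without_loss = to_tuple_output(None, ("start_logits", "end_logits"), encoder_outputs)
with_loss = to_tuple_output(0.5, ("start_logits", "end_logits"), encoder_outputs)
```

Because the loss is optional, callers that index into the tuple need to account for the one-position shift when labels are passed.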

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMForSequenceClassification` ¶

Bases: ErnieMPreTrainedModel

ErnieMForSequenceClassification is an ErnieM model with a sequence classification head: a dropout layer followed by a linear layer applied to the pooled output. It inherits from ErnieMPreTrainedModel and implements the constructor and the forward pass.

ATTRIBUTE	DESCRIPTION
`num_labels`	Number of labels for sequence classification.
`config`	Configuration object for the model.
`ernie_m`	ErnieMModel instance for processing input sequences.
`dropout`	Dropout layer for regularization.
`classifier`	Dense layer for classification predictions.

METHOD	DESCRIPTION
`__init__`	Initializes the ErnieMForSequenceClassification instance with the provided configuration.
`forward`	Constructs the model for making predictions on input sequences and computes the loss based on predicted labels.

Args:

input_ids (Optional[mindspore.Tensor]): Tensor of input token IDs.
attention_mask (Optional[mindspore.Tensor]): Tensor of attention masks.
position_ids (Optional[mindspore.Tensor]): Tensor of position IDs.
head_mask (Optional[mindspore.Tensor]): Tensor of head masks.
inputs_embeds (Optional[mindspore.Tensor]): Tensor of input embeddings.
past_key_values (Optional[List[mindspore.Tensor]]): List of past key values for caching.
use_cache (Optional[bool]): Flag for using caching.
output_hidden_states (Optional[bool]): Flag for outputting hidden states.
output_attentions (Optional[bool]): Flag for outputting attentions.
return_dict (Optional[bool]): Flag for returning output in a dictionary format.
labels (Optional[mindspore.Tensor]): Tensor of target labels for computing loss.

Returns:

Union[Tuple[mindspore.Tensor], SequenceClassifierOutput]: Tuple of model outputs and loss.

Raises:

ValueError: If the provided labels are not in the expected format or number.

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

class ErnieMForSequenceClassification(ErnieMPreTrainedModel):

    """
    ErnieMForSequenceClassification is a class that represents a fine-tuned ErnieM model for sequence classification tasks.
    It inherits from ErnieMPreTrainedModel and implements methods for initializing the model and forwarding predictions.

    Attributes:
        num_labels: Number of labels for sequence classification.
        config: Configuration object for the model.
        ernie_m: ErnieMModel instance for processing input sequences.
        dropout: Dropout layer for regularization.
        classifier: Dense layer for classification predictions.

    Methods:
        __init__: Initializes the ErnieMForSequenceClassification instance with the provided configuration.
        forward:
            Constructs the model for making predictions on input sequences and computes the loss based on predicted labels.

            Args:

            - input_ids (Optional[mindspore.Tensor]): Tensor of input token IDs.
            - attention_mask (Optional[mindspore.Tensor]): Tensor of attention masks.
            - position_ids (Optional[mindspore.Tensor]): Tensor of position IDs.
            - head_mask (Optional[mindspore.Tensor]): Tensor of head masks.
            - inputs_embeds (Optional[mindspore.Tensor]): Tensor of input embeddings.
            - past_key_values (Optional[List[mindspore.Tensor]]): List of past key values for caching.
            - use_cache (Optional[bool]): Flag for using caching.
            - output_hidden_states (Optional[bool]): Flag for outputting hidden states.
            - output_attentions (Optional[bool]): Flag for outputting attentions.
            - return_dict (Optional[bool]): Flag for returning output in a dictionary format.
            - labels (Optional[mindspore.Tensor]): Tensor of target labels for computing loss.

            Returns:

            - Union[Tuple[mindspore.Tensor], SequenceClassifierOutput]: Tuple of model outputs and loss.

            Raises:

            - ValueError: If the provided labels are not in the expected format or number.
    """
    # Copied from transformers.models.bert.modeling_bert.BertForSequenceClassification.__init__ with Bert->ErnieM,bert->ernie_m
    def __init__(self, config):
        """
        Initializes an instance of the ErnieMForSequenceClassification class.

        Args:
            self: The instance of the class.
            config (object): The configuration object containing settings for the model initialization.
                It must have the following attributes:

                - num_labels (int): The number of labels for classification.
                - classifier_dropout (float, optional): The dropout probability for the classifier layer.
                If not provided, it defaults to the hidden dropout probability.
                - hidden_dropout_prob (float): The default hidden dropout probability.

        Returns:
            None.

        Raises:
            ValueError: If the config object is missing the num_labels attribute.
            TypeError: If the config object does not have the expected attributes or if their types are incorrect.
        """
        super().__init__(config)
        self.num_labels = config.num_labels
        self.config = config

        self.ernie_m = ErnieMModel(config)
        classifier_dropout = (
            config.classifier_dropout if config.classifier_dropout is not None else config.hidden_dropout_prob
        )
        self.dropout = nn.Dropout(p=classifier_dropout)
        self.classifier = nn.Linear(config.hidden_size, config.num_labels)

        # Initialize weights and apply final processing
        self.post_init()

    def forward(
        self,
        input_ids: Optional[mindspore.Tensor] = None,
        attention_mask: Optional[mindspore.Tensor] = None,
        position_ids: Optional[mindspore.Tensor] = None,
        head_mask: Optional[mindspore.Tensor] = None,
        inputs_embeds: Optional[mindspore.Tensor] = None,
        past_key_values: Optional[List[mindspore.Tensor]] = None,
        use_cache: Optional[bool] = None,
        output_hidden_states: Optional[bool] = None,
        output_attentions: Optional[bool] = None,
        return_dict: Optional[bool] = True,
        labels: Optional[mindspore.Tensor] = None,
    ) -> Union[Tuple[mindspore.Tensor], SequenceClassifierOutput]:
        r"""
        Args:
            labels (`mindspore.Tensor` of shape `(batch_size,)`, *optional*):
                Labels for computing the sequence classification/regression loss. Indices should be in `[0, ...,
                config.num_labels - 1]`. If `config.num_labels == 1` a regression loss is computed (Mean-Square loss); if
                `config.num_labels > 1` a classification loss is computed (Cross-Entropy).
        """
        return_dict = return_dict if return_dict is not None else self.config.use_return_dict

        outputs = self.ernie_m(
            input_ids,
            attention_mask=attention_mask,
            position_ids=position_ids,
            head_mask=head_mask,
            inputs_embeds=inputs_embeds,
            past_key_values=past_key_values,
            output_hidden_states=output_hidden_states,
            output_attentions=output_attentions,
            return_dict=return_dict,
        )

        pooled_output = outputs[1]

        pooled_output = self.dropout(pooled_output)
        logits = self.classifier(pooled_output)

        loss = None
        if labels is not None:
            if self.config.problem_type is None:
                if self.num_labels == 1:
                    self.config.problem_type = "regression"
                elif self.num_labels > 1 and labels.dtype in (mindspore.int64, mindspore.int32):
                    self.config.problem_type = "single_label_classification"
                else:
                    self.config.problem_type = "multi_label_classification"

            if self.config.problem_type == "regression":
                if self.num_labels == 1:
                    loss = F.mse_loss(logits.squeeze(), labels.squeeze())
                else:
                    loss = F.mse_loss(logits, labels)
            elif self.config.problem_type == "single_label_classification":
                loss = F.cross_entropy(logits.view(-1, self.num_labels), labels.view(-1))
            elif self.config.problem_type == "multi_label_classification":
                loss = F.binary_cross_entropy_with_logits(logits, labels)
        if not return_dict:
            output = (logits,) + outputs[2:]
            return ((loss,) + output) if loss is not None else output

        return SequenceClassifierOutput(
            loss=loss,
            logits=logits,
            hidden_states=outputs.hidden_states,
            attentions=outputs.attentions,
        )

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMForSequenceClassification.__init__(config)` ¶

Initializes an instance of the ErnieMForSequenceClassification class.

PARAMETER	DESCRIPTION
`self`	The instance of the class.
`config`	The configuration object containing settings for the model initialization. It must have the following attributes: `num_labels` (int), the number of labels for classification; `classifier_dropout` (float, optional), the dropout probability for the classifier layer, which defaults to the hidden dropout probability if not provided; `hidden_dropout_prob` (float), the default hidden dropout probability. TYPE: `object`

RETURNS	DESCRIPTION
	None.

RAISES	DESCRIPTION
`ValueError`	If the config object is missing the num_labels attribute.
`TypeError`	If the config object does not have the expected attributes or if their types are incorrect.
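
The fallback between `classifier_dropout` and `hidden_dropout_prob` can be sketched in isolation. `resolve_classifier_dropout` is a hypothetical helper, not part of MindNLP; it only reproduces the conditional expression used in the constructor.

```python
def resolve_classifier_dropout(classifier_dropout, hidden_dropout_prob):
    """Pick the dropout probability for the classifier head.

    The check is `is not None`, so an explicit 0.0 is respected and does
    not fall back to hidden_dropout_prob.
    """
    return classifier_dropout if classifier_dropout is not None else hidden_dropout_prob


unset = resolve_classifier_dropout(None, 0.1)     # falls back to 0.1
disabled = resolve_classifier_dropout(0.0, 0.1)   # stays at 0.0
custom = resolve_classifier_dropout(0.3, 0.1)     # uses 0.3
```

Testing against `None` rather than truthiness is what makes it possible to switch classifier dropout off while keeping hidden dropout on.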

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

def __init__(self, config):
    """
    Initializes an instance of the ErnieMForSequenceClassification class.

    Args:
        self: The instance of the class.
        config (object): The configuration object containing settings for the model initialization.
            It must have the following attributes:

            - num_labels (int): The number of labels for classification.
            - classifier_dropout (float, optional): The dropout probability for the classifier layer.
            If not provided, it defaults to the hidden dropout probability.
            - hidden_dropout_prob (float): The default hidden dropout probability.

    Returns:
        None.

    Raises:
        ValueError: If the config object is missing the num_labels attribute.
        TypeError: If the config object does not have the expected attributes or if their types are incorrect.
    """
    super().__init__(config)
    self.num_labels = config.num_labels
    self.config = config

    self.ernie_m = ErnieMModel(config)
    classifier_dropout = (
        config.classifier_dropout if config.classifier_dropout is not None else config.hidden_dropout_prob
    )
    self.dropout = nn.Dropout(p=classifier_dropout)
    self.classifier = nn.Linear(config.hidden_size, config.num_labels)

    # Initialize weights and apply final processing
    self.post_init()

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMForSequenceClassification.forward(input_ids=None, attention_mask=None, position_ids=None, head_mask=None, inputs_embeds=None, past_key_values=None, use_cache=None, output_hidden_states=None, output_attentions=None, return_dict=True, labels=None)` ¶

PARAMETER	DESCRIPTION
`labels`	Labels for computing the sequence classification/regression loss. Indices should be in `[0, ..., config.num_labels - 1]`. If `config.num_labels == 1` a regression loss is computed (Mean-Square loss); if `config.num_labels > 1` a classification loss is computed (Cross-Entropy). TYPE: `mindspore.Tensor` of shape `(batch_size,)`, optional DEFAULT: `None`
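
How the loss is chosen when `config.problem_type` is unset can be sketched as a small decision function. `infer_problem_type` and `LOSS_BY_PROBLEM_TYPE` are hypothetical names introduced for illustration; the branches follow the source listing for `forward`, with the label dtype check reduced to a boolean.

```python
def infer_problem_type(num_labels, labels_are_integer):
    """Mirror the branch taken when config.problem_type is None.

    labels_are_integer stands in for the dtype check against
    mindspore.int64 / mindspore.int32 in the source listing.
    """
    if num_labels == 1:
        return "regression"
    if num_labels > 1 and labels_are_integer:
        return "single_label_classification"
    return "multi_label_classification"


# Loss function applied for each problem type in the source listing.
LOSS_BY_PROBLEM_TYPE = {
    "regression": "mse_loss",
    "single_label_classification": "cross_entropy",
    "multi_label_classification": "binary_cross_entropy_with_logits",
}

kind = infer_problem_type(num_labels=3, labels_are_integer=True)
loss_name = LOSS_BY_PROBLEM_TYPE[kind]
```

In the source, the inferred value is written back to `self.config.problem_type`, so the first labelled batch determines the loss used for later calls on the same model instance.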

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

def forward(
    self,
    input_ids: Optional[mindspore.Tensor] = None,
    attention_mask: Optional[mindspore.Tensor] = None,
    position_ids: Optional[mindspore.Tensor] = None,
    head_mask: Optional[mindspore.Tensor] = None,
    inputs_embeds: Optional[mindspore.Tensor] = None,
    past_key_values: Optional[List[mindspore.Tensor]] = None,
    use_cache: Optional[bool] = None,
    output_hidden_states: Optional[bool] = None,
    output_attentions: Optional[bool] = None,
    return_dict: Optional[bool] = True,
    labels: Optional[mindspore.Tensor] = None,
) -> Union[Tuple[mindspore.Tensor], SequenceClassifierOutput]:
    r"""
    Args:
        labels (`mindspore.Tensor` of shape `(batch_size,)`, *optional*):
            Labels for computing the sequence classification/regression loss. Indices should be in `[0, ...,
            config.num_labels - 1]`. If `config.num_labels == 1` a regression loss is computed (Mean-Square loss); if
            `config.num_labels > 1` a classification loss is computed (Cross-Entropy).
    """
    return_dict = return_dict if return_dict is not None else self.config.use_return_dict

    outputs = self.ernie_m(
        input_ids,
        attention_mask=attention_mask,
        position_ids=position_ids,
        head_mask=head_mask,
        inputs_embeds=inputs_embeds,
        past_key_values=past_key_values,
        output_hidden_states=output_hidden_states,
        output_attentions=output_attentions,
        return_dict=return_dict,
    )

    pooled_output = outputs[1]

    pooled_output = self.dropout(pooled_output)
    logits = self.classifier(pooled_output)

    loss = None
    if labels is not None:
        if self.config.problem_type is None:
            if self.num_labels == 1:
                self.config.problem_type = "regression"
            elif self.num_labels > 1 and labels.dtype in (mindspore.int64, mindspore.int32):
                self.config.problem_type = "single_label_classification"
            else:
                self.config.problem_type = "multi_label_classification"

        if self.config.problem_type == "regression":
            if self.num_labels == 1:
                loss = F.mse_loss(logits.squeeze(), labels.squeeze())
            else:
                loss = F.mse_loss(logits, labels)
        elif self.config.problem_type == "single_label_classification":
            loss = F.cross_entropy(logits.view(-1, self.num_labels), labels.view(-1))
        elif self.config.problem_type == "multi_label_classification":
            loss = F.binary_cross_entropy_with_logits(logits, labels)
    if not return_dict:
        output = (logits,) + outputs[2:]
        return ((loss,) + output) if loss is not None else output

    return SequenceClassifierOutput(
        loss=loss,
        logits=logits,
        hidden_states=outputs.hidden_states,
        attentions=outputs.attentions,
    )

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMForTokenClassification` ¶

Bases: ErnieMPreTrainedModel

This class represents a fine-tuned ErnieM model for token classification tasks. It inherits from the ErnieMPreTrainedModel class.

The ErnieMForTokenClassification class implements the necessary methods and attributes for token classification tasks. It takes a configuration object as input during initialization and sets up the model architecture accordingly. The model consists of an ErnieMModel instance, a dropout layer, and a classifier layer.

METHOD	DESCRIPTION
`__init__`	Initializes the ErnieMForTokenClassification instance with the given configuration. It sets the number of labels, creates an ErnieMModel object, initializes the dropout layer, and creates the classifier layer.
`forward`	Constructs the forward pass of the model. It takes various input tensors and returns the token classification output. Optionally, it can also compute the token classification loss if labels are provided.

ATTRIBUTE	DESCRIPTION
`num_labels`	The number of possible labels for the token classification task.

Example

>>> config = ErnieMConfig()
>>> model = ErnieMForTokenClassification(config)
>>> input_ids = ...
>>> attention_mask = ...
>>> output = model.forward(input_ids=input_ids, attention_mask=attention_mask)

Note

Inputs must be batched: `input_ids` is expected to have shape `(batch_size, sequence_length)`, and `labels`, when provided, must have the same shape.

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

class ErnieMForTokenClassification(ErnieMPreTrainedModel):

    """
    This class represents a fine-tuned ErnieM model for token classification tasks. It inherits from the ErnieMPreTrainedModel class.

    The ErnieMForTokenClassification class implements the necessary methods and attributes for token classification tasks.
    It takes a configuration object as input during initialization and sets up the model architecture accordingly.
    The model consists of an ErnieMModel instance, a dropout layer, and a classifier layer.

    Methods:
        __init__: Initializes the ErnieMForTokenClassification instance with the given configuration.
            It sets the number of labels, creates an ErnieMModel object, initializes the dropout layer, and
            creates the classifier layer.

        forward: Constructs the forward pass of the model. It takes various input tensors and returns the token
            classification output. Optionally, it can also compute the token classification loss if labels are provided.

    Attributes:
        num_labels: The number of possible labels for the token classification task.

    Example:
        ```python
        >>> config = ErnieMConfig()
        >>> model = ErnieMForTokenClassification(config)
        >>> input_ids = ...
        >>> attention_mask = ...
        >>> output = model.forward(input_ids=input_ids, attention_mask=attention_mask)
        ```

    Note:
        It is important to provide the input tensors in the correct shape and format to ensure proper model functioning.
    """
    # Copied from transformers.models.bert.modeling_bert.BertForTokenClassification.__init__ with Bert->ErnieM,bert->ernie_m
    def __init__(self, config):
        """
        Initializes an instance of the ErnieMForTokenClassification class.

        Args:
            self: The instance of the ErnieMForTokenClassification class.
            config: An instance of the configuration class containing the model configuration settings.

        Returns:
            None

        Raises:
            None.
        """
        super().__init__(config)
        self.num_labels = config.num_labels

        self.ernie_m = ErnieMModel(config, add_pooling_layer=False)
        classifier_dropout = (
            config.classifier_dropout if config.classifier_dropout is not None else config.hidden_dropout_prob
        )
        self.dropout = nn.Dropout(p=classifier_dropout)
        self.classifier = nn.Linear(config.hidden_size, config.num_labels)

        # Initialize weights and apply final processing
        self.post_init()

    def forward(
        self,
        input_ids: Optional[mindspore.Tensor] = None,
        attention_mask: Optional[mindspore.Tensor] = None,
        position_ids: Optional[mindspore.Tensor] = None,
        head_mask: Optional[mindspore.Tensor] = None,
        inputs_embeds: Optional[mindspore.Tensor] = None,
        past_key_values: Optional[List[mindspore.Tensor]] = None,
        output_hidden_states: Optional[bool] = None,
        output_attentions: Optional[bool] = None,
        return_dict: Optional[bool] = True,
        labels: Optional[mindspore.Tensor] = None,
    ) -> Union[Tuple[mindspore.Tensor], TokenClassifierOutput]:
        r"""
        Args:
            labels (`mindspore.Tensor` of shape `(batch_size, sequence_length)`, *optional*):
                Labels for computing the token classification loss. Indices should be in `[0, ..., config.num_labels - 1]`.
        """
        return_dict = return_dict if return_dict is not None else self.config.use_return_dict

        outputs = self.ernie_m(
            input_ids,
            attention_mask=attention_mask,
            position_ids=position_ids,
            head_mask=head_mask,
            inputs_embeds=inputs_embeds,
            past_key_values=past_key_values,
            output_attentions=output_attentions,
            output_hidden_states=output_hidden_states,
            return_dict=return_dict,
        )

        sequence_output = outputs[0]

        sequence_output = self.dropout(sequence_output)
        logits = self.classifier(sequence_output)

        loss = None
        if labels is not None:
            loss = F.cross_entropy(logits.view(-1, self.num_labels), labels.view(-1))

        if not return_dict:
            output = (logits,) + outputs[2:]
            return ((loss,) + output) if loss is not None else output

        return TokenClassifierOutput(
            loss=loss,
            logits=logits,
            hidden_states=outputs.hidden_states,
            attentions=outputs.attentions,
        )

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMForTokenClassification.__init__(config)` ¶

Initializes an instance of the ErnieMForTokenClassification class.

PARAMETER	DESCRIPTION
`self`	The instance of the ErnieMForTokenClassification class.
`config`	An instance of the configuration class containing the model configuration settings.

RETURNS	DESCRIPTION
	None

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

def __init__(self, config):
    """
    Initializes an instance of the ErnieMForTokenClassification class.

    Args:
        self: The instance of the ErnieMForTokenClassification class.
        config: An instance of the configuration class containing the model configuration settings.

    Returns:
        None

    Raises:
        None.
    """
    super().__init__(config)
    self.num_labels = config.num_labels

    self.ernie_m = ErnieMModel(config, add_pooling_layer=False)
    classifier_dropout = (
        config.classifier_dropout if config.classifier_dropout is not None else config.hidden_dropout_prob
    )
    self.dropout = nn.Dropout(p=classifier_dropout)
    self.classifier = nn.Linear(config.hidden_size, config.num_labels)

    # Initialize weights and apply final processing
    self.post_init()

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMForTokenClassification.forward(input_ids=None, attention_mask=None, position_ids=None, head_mask=None, inputs_embeds=None, past_key_values=None, output_hidden_states=None, output_attentions=None, return_dict=True, labels=None)` ¶

PARAMETER	DESCRIPTION
`labels`	Labels for computing the token classification loss. Indices should be in `[0, ..., config.num_labels - 1]`. TYPE: `mindspore.Tensor` of shape `(batch_size, sequence_length)`, optional DEFAULT: `None`
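
The token classification loss flattens the batch and sequence dimensions before applying cross-entropy, as in `logits.view(-1, self.num_labels)` and `labels.view(-1)`. Below is a minimal pure-Python sketch of that reshaping on nested lists; `token_classification_loss` is a hypothetical helper, and it applies no masking of padding positions.

```python
import math


def token_classification_loss(logits, labels):
    """Mean cross-entropy over every token in the batch.

    logits: nested list of shape (batch_size, sequence_length, num_labels).
    labels: nested list of shape (batch_size, sequence_length).
    """
    # Flatten (batch, seq, num_labels) -> (batch * seq, num_labels).
    flat_logits = [token for sequence in logits for token in sequence]
    # Flatten (batch, seq) -> (batch * seq,).
    flat_labels = [label for sequence in labels for label in sequence]

    total = 0.0
    for row, target in zip(flat_logits, flat_labels):
        peak = max(row)  # subtract the max for numerical stability
        log_z = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_z - row[target]
    return total / len(flat_labels)


# batch_size = 2, sequence_length = 3, num_labels = 2, uniform logits.
logits = [[[0.0, 0.0] for _ in range(3)] for _ in range(2)]
labels = [[0, 1, 0], [1, 1, 0]]
loss = token_classification_loss(logits, labels)
```

With uniform logits every token contributes the same term, so the mean equals the natural logarithm of `num_labels`.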

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

def forward(
    self,
    input_ids: Optional[mindspore.Tensor] = None,
    attention_mask: Optional[mindspore.Tensor] = None,
    position_ids: Optional[mindspore.Tensor] = None,
    head_mask: Optional[mindspore.Tensor] = None,
    inputs_embeds: Optional[mindspore.Tensor] = None,
    past_key_values: Optional[List[mindspore.Tensor]] = None,
    output_hidden_states: Optional[bool] = None,
    output_attentions: Optional[bool] = None,
    return_dict: Optional[bool] = True,
    labels: Optional[mindspore.Tensor] = None,
) -> Union[Tuple[mindspore.Tensor], TokenClassifierOutput]:
    r"""
    Args:
        labels (`mindspore.Tensor` of shape `(batch_size, sequence_length)`, *optional*):
            Labels for computing the token classification loss. Indices should be in `[0, ..., config.num_labels - 1]`.
    """
    return_dict = return_dict if return_dict is not None else self.config.use_return_dict

    outputs = self.ernie_m(
        input_ids,
        attention_mask=attention_mask,
        position_ids=position_ids,
        head_mask=head_mask,
        inputs_embeds=inputs_embeds,
        past_key_values=past_key_values,
        output_attentions=output_attentions,
        output_hidden_states=output_hidden_states,
        return_dict=return_dict,
    )

    sequence_output = outputs[0]

    sequence_output = self.dropout(sequence_output)
    logits = self.classifier(sequence_output)

    loss = None
    if labels is not None:
        loss = F.cross_entropy(logits.view(-1, self.num_labels), labels.view(-1))

    if not return_dict:
        output = (logits,) + outputs[2:]
        return ((loss,) + output) if loss is not None else output

    return TokenClassifierOutput(
        loss=loss,
        logits=logits,
        hidden_states=outputs.hidden_states,
        attentions=outputs.attentions,
    )

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMModel` ¶

Bases: ErnieMPreTrainedModel

This class represents an ERNIE-M (Enhanced Representation through kNowledge Integration) model for multi-purpose pre-training and fine-tuning on downstream tasks. It incorporates ERNIE-M embeddings, encoder, and optional pooling layer. The class provides methods for initializing, getting and setting input embeddings, pruning model heads, and forwarding the model with various input and output options. The class inherits from ErnieMPreTrainedModel and extends its functionality to support specific ERNIE-M model architecture and operations.

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

class ErnieMModel(ErnieMPreTrainedModel):

    """
    This class represents an ERNIE-M (Enhanced Representation through kNowledge Integration) model for multi-purpose
    pre-training and fine-tuning on downstream tasks. It incorporates ERNIE-M embeddings, encoder, and optional pooling
    layer. The class provides methods for initializing, getting and setting input embeddings, pruning model heads,
    and forwarding the model with various input and output options.
    The class inherits from ErnieMPreTrainedModel and extends its functionality to support specific ERNIE-M model
    architecture and operations.
    """
    def __init__(self, config, add_pooling_layer=True):
        """
        Initializes the ErnieMModel.

        Args:
            self: The instance of the class.
            config (object): The configuration object containing model settings.
            add_pooling_layer (bool): A flag indicating whether to add a pooling layer to the model.

        Returns:
            None.

        Raises:
            None.
        """
        super().__init__(config)
        self.initializer_range = config.initializer_range
        self.embeddings = ErnieMEmbeddings(config)
        self.encoder = ErnieMEncoder(config)
        self.pooler = ErnieMPooler(config) if add_pooling_layer else None
        self.post_init()

    def get_input_embeddings(self):
        """
        This method returns the input embeddings from the ErnieMModel.

        Args:
            self: ErnieMModel object. The instance of the ErnieMModel class.

        Returns:
            word_embeddings: The method returns the input embeddings from the ErnieMModel.

        Raises:
            None.
        """
        return self.embeddings.word_embeddings

    def set_input_embeddings(self, value):
        """
        Set the input embeddings for the ErnieMModel.

        Args:
            self (ErnieMModel): The instance of the ErnieMModel class.
            value: The input embeddings value to be set. It should be a tensor representing the input embeddings.

        Returns:
            None.

        Raises:
            None.
        """
        self.embeddings.word_embeddings = value

    def _prune_heads(self, heads_to_prune):
        """
        Prunes heads of the model. heads_to_prune: dict of {layer_num: list of heads to prune in this layer} See base
        class PreTrainedModel
        """
        for layer, heads in heads_to_prune.items():
            self.encoder.layers[layer].self_attn.prune_heads(heads)

    def forward(
        self,
        input_ids: Optional[mindspore.Tensor] = None,
        position_ids: Optional[mindspore.Tensor] = None,
        attention_mask: Optional[mindspore.Tensor] = None,
        head_mask: Optional[mindspore.Tensor] = None,
        inputs_embeds: Optional[mindspore.Tensor] = None,
        past_key_values: Optional[Tuple[Tuple[mindspore.Tensor]]] = None,
        use_cache: Optional[bool] = None,
        output_hidden_states: Optional[bool] = None,
        output_attentions: Optional[bool] = None,
        return_dict: Optional[bool] = None,
    ) -> Union[Tuple[mindspore.Tensor], BaseModelOutputWithPoolingAndCrossAttentions]:
        """
        Constructs the ERNIE-M model.

        Args:
            self: The object instance.
            input_ids (Optional[mindspore.Tensor]): The input tensor of token indices. Default is None.
            position_ids (Optional[mindspore.Tensor]): The tensor indicating the position of tokens. Default is None.
            attention_mask (Optional[mindspore.Tensor]):
                The tensor indicating which elements in the input do not need to be attended to. Default is None.
            head_mask (Optional[mindspore.Tensor]):
                The tensor indicating the heads in the multi-head attention layer to be masked. Default is None.
            inputs_embeds (Optional[mindspore.Tensor]): The input embeddings. Default is None.
            past_key_values (Optional[Tuple[Tuple[mindspore.Tensor]]]): The previous key values. Default is None.
            use_cache (Optional[bool]): Whether to use the cache. Default is None.
            output_hidden_states (Optional[bool]): Whether to output the hidden states. Default is None.
            output_attentions (Optional[bool]): Whether to output the attentions. Default is None.
            return_dict (Optional[bool]): Whether to return a dictionary. Default is None.

        Returns:
            Union[Tuple[mindspore.Tensor], BaseModelOutputWithPoolingAndCrossAttentions]:
                Depending on the value of `return_dict`, returns a tuple of tensors including the last hidden state and
                the pooler output, or a BaseModelOutputWithPoolingAndCrossAttentions object.

        Raises:
            ValueError: If both `input_ids` and `inputs_embeds` are specified.
        """
        if input_ids is not None and inputs_embeds is not None:
            raise ValueError("You cannot specify both input_ids and inputs_embeds at the same time.")

        # init the default bool value
        output_attentions = output_attentions if output_attentions is not None else self.config.output_attentions
        output_hidden_states = (
            output_hidden_states if output_hidden_states is not None else self.config.output_hidden_states
        )
        return_dict = return_dict if return_dict is not None else self.config.return_dict

        head_mask = self.get_head_mask(head_mask, self.config.num_hidden_layers)

        past_key_values_length = 0
        if past_key_values is not None:
            past_key_values_length = past_key_values[0][0].shape[2]

        # Adapted from paddlenlp.transformers.ernie_m.ErnieMModel
        if attention_mask is None:
            attention_mask = (input_ids == 0).to(self.dtype)
            attention_mask *= mindspore.tensor(np.finfo(mindspore.dtype_to_nptype(attention_mask.dtype)).min, attention_mask.dtype)
            if past_key_values is not None:
                batch_size = past_key_values[0][0].shape[0]
                past_mask = ops.zeros([batch_size, 1, 1, past_key_values_length], dtype=attention_mask.dtype)
                attention_mask = ops.concat([past_mask, attention_mask], dim=-1)
        # For 2D attention_mask from tokenizer
        elif attention_mask.ndim == 2:
            attention_mask = attention_mask.to(self.dtype)
            attention_mask = 1.0 - attention_mask
            attention_mask *= mindspore.tensor(np.finfo(mindspore.dtype_to_nptype(attention_mask.dtype)).min, attention_mask.dtype)

        extended_attention_mask = attention_mask.unsqueeze(1).unsqueeze(1)

        embedding_output = self.embeddings(
            input_ids=input_ids,
            position_ids=position_ids,
            inputs_embeds=inputs_embeds,
            past_key_values_length=past_key_values_length,
        )
        encoder_outputs = self.encoder(
            embedding_output,
            attention_mask=extended_attention_mask,
            head_mask=head_mask,
            past_key_values=past_key_values,
            output_attentions=output_attentions,
            output_hidden_states=output_hidden_states,
            return_dict=return_dict,
        )

        if not return_dict:
            sequence_output = encoder_outputs[0]
            pooler_output = self.pooler(sequence_output) if self.pooler is not None else None
            return (sequence_output, pooler_output) + encoder_outputs[1:]

        sequence_output = encoder_outputs["last_hidden_state"]
        pooler_output = self.pooler(sequence_output) if self.pooler is not None else None
        hidden_states = None if not output_hidden_states else encoder_outputs["hidden_states"]
        attentions = None if not output_attentions else encoder_outputs["attentions"]

        return BaseModelOutputWithPoolingAndCrossAttentions(
            last_hidden_state=sequence_output,
            pooler_output=pooler_output,
            hidden_states=hidden_states,
            attentions=attentions,
        )

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMModel.__init__(config, add_pooling_layer=True)` ¶

Initializes the ErnieMModel.

PARAMETER	DESCRIPTION
`self`	The instance of the class.
`config`	The configuration object containing model settings. TYPE: `object`
`add_pooling_layer`	A flag indicating whether to add a pooling layer to the model. TYPE: `bool` DEFAULT: `True`

RETURNS	DESCRIPTION
	None.

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

def __init__(self, config, add_pooling_layer=True):
    """
    Initializes the ErnieMModel.

    Args:
        self: The instance of the class.
        config (object): The configuration object containing model settings.
        add_pooling_layer (bool): A flag indicating whether to add a pooling layer to the model.

    Returns:
        None.

    Raises:
        None.
    """
    super().__init__(config)
    self.initializer_range = config.initializer_range
    self.embeddings = ErnieMEmbeddings(config)
    self.encoder = ErnieMEncoder(config)
    self.pooler = ErnieMPooler(config) if add_pooling_layer else None
    self.post_init()

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMModel.forward(input_ids=None, position_ids=None, attention_mask=None, head_mask=None, inputs_embeds=None, past_key_values=None, use_cache=None, output_hidden_states=None, output_attentions=None, return_dict=None)` ¶

Runs the forward pass of the ERNIE-M model.

PARAMETER	DESCRIPTION
`self`	The object instance.
`input_ids`	The input tensor of token indices. Default is None. TYPE: `Optional[Tensor]` DEFAULT: `None`
`position_ids`	The tensor indicating the position of tokens. Default is None. TYPE: `Optional[Tensor]` DEFAULT: `None`
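`attention_mask`	The tensor marking which positions attention should ignore. A 2D mask from the tokenizer uses 1 for positions to attend to and 0 for positions to ignore; if no mask is given, positions where `input_ids` equals 0 are masked. Default is None. TYPE: `Optional[Tensor]` DEFAULT: `None`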
`head_mask`	The tensor indicating the heads in the multi-head attention layer to be masked. Default is None. TYPE: `Optional[Tensor]` DEFAULT: `None`
`inputs_embeds`	The input embeddings. Default is None. TYPE: `Optional[Tensor]` DEFAULT: `None`
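`past_key_values`	The cached key and value states from previous steps. The cached sequence length is read from `past_key_values[0][0].shape[2]`. Default is None. TYPE: `Optional[Tuple[Tuple[Tensor]]]` DEFAULT: `None`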
`use_cache`	Whether to use the cache. Default is None. TYPE: `Optional[bool]` DEFAULT: `None`
`output_hidden_states`	Whether to output the hidden states. Default is None. TYPE: `Optional[bool]` DEFAULT: `None`
`output_attentions`	Whether to output the attentions. Default is None. TYPE: `Optional[bool]` DEFAULT: `None`
`return_dict`	Whether to return a dictionary. Default is None. TYPE: `Optional[bool]` DEFAULT: `None`

RETURNS	DESCRIPTION
`Union[Tuple[Tensor], BaseModelOutputWithPoolingAndCrossAttentions]`	Depending on the value of `return_dict`, returns either a tuple of tensors that starts with the last hidden state and the pooler output, or a BaseModelOutputWithPoolingAndCrossAttentions object.

RAISES	DESCRIPTION
`ValueError`	If both `input_ids` and `inputs_embeds` are specified.
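The mask handling in `forward` can be illustrated with a minimal NumPy sketch. It is not the library code: `build_additive_mask` is a hypothetical helper, and NumPy stands in for MindSpore tensors. It assumes, as in the source listing, that a missing mask is derived from `input_ids == 0` and that a 2D tokenizer mask uses 1 for positions to keep.

```python
import numpy as np

def build_additive_mask(input_ids, attention_mask=None, dtype=np.float32):
    """Return a (batch, 1, 1, seq_len) additive mask (hypothetical helper).

    Kept positions hold 0 and masked positions hold the most negative
    finite value of `dtype`, so adding the mask to the attention scores
    pushes masked positions towards zero probability after softmax.
    """
    min_value = np.finfo(dtype).min
    if attention_mask is None:
        # No mask given: token id 0 is treated as padding.
        mask = (input_ids == 0).astype(dtype) * min_value
    else:
        # 2D tokenizer mask: 1 = attend, 0 = ignore, so invert before scaling.
        mask = (1.0 - attention_mask.astype(dtype)) * min_value
    # Two singleton axes let the mask broadcast over heads and query positions.
    return mask[:, None, None, :]

input_ids = np.array([[5, 9, 0, 0]])
tokenizer_mask = np.array([[1, 1, 0, 0]])
from_ids = build_additive_mask(input_ids)
from_mask = build_additive_mask(input_ids, tokenizer_mask)
```

For this input both paths give the same result, because the two padded positions carry token id 0.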

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

def forward(
    self,
    input_ids: Optional[mindspore.Tensor] = None,
    position_ids: Optional[mindspore.Tensor] = None,
    attention_mask: Optional[mindspore.Tensor] = None,
    head_mask: Optional[mindspore.Tensor] = None,
    inputs_embeds: Optional[mindspore.Tensor] = None,
    past_key_values: Optional[Tuple[Tuple[mindspore.Tensor]]] = None,
    use_cache: Optional[bool] = None,
    output_hidden_states: Optional[bool] = None,
    output_attentions: Optional[bool] = None,
    return_dict: Optional[bool] = None,
) -> Union[Tuple[mindspore.Tensor], BaseModelOutputWithPoolingAndCrossAttentions]:
    """
    Constructs the ERNIE-M model.

    Args:
        self: The object instance.
        input_ids (Optional[mindspore.Tensor]): The input tensor of token indices. Default is None.
        position_ids (Optional[mindspore.Tensor]): The tensor indicating the position of tokens. Default is None.
        attention_mask (Optional[mindspore.Tensor]):
            The tensor indicating which elements in the input do not need to be attended to. Default is None.
        head_mask (Optional[mindspore.Tensor]):
            The tensor indicating the heads in the multi-head attention layer to be masked. Default is None.
        inputs_embeds (Optional[mindspore.Tensor]): The input embeddings. Default is None.
        past_key_values (Optional[Tuple[Tuple[mindspore.Tensor]]]): The previous key values. Default is None.
        use_cache (Optional[bool]): Whether to use the cache. Default is None.
        output_hidden_states (Optional[bool]): Whether to output the hidden states. Default is None.
        output_attentions (Optional[bool]): Whether to output the attentions. Default is None.
        return_dict (Optional[bool]): Whether to return a dictionary. Default is None.

    Returns:
        Union[Tuple[mindspore.Tensor], BaseModelOutputWithPoolingAndCrossAttentions]:
            Depending on the value of `return_dict`, returns a tuple of tensors including the last hidden state and
            the pooler output, or a BaseModelOutputWithPoolingAndCrossAttentions object.

    Raises:
        ValueError: If both `input_ids` and `inputs_embeds` are specified.
    """
    if input_ids is not None and inputs_embeds is not None:
        raise ValueError("You cannot specify both input_ids and inputs_embeds at the same time.")

    # init the default bool value
    output_attentions = output_attentions if output_attentions is not None else self.config.output_attentions
    output_hidden_states = (
        output_hidden_states if output_hidden_states is not None else self.config.output_hidden_states
    )
    return_dict = return_dict if return_dict is not None else self.config.return_dict

    head_mask = self.get_head_mask(head_mask, self.config.num_hidden_layers)

    past_key_values_length = 0
    if past_key_values is not None:
        past_key_values_length = past_key_values[0][0].shape[2]

    # Adapted from paddlenlp.transformers.ernie_m.ErnieMModel
    if attention_mask is None:
        attention_mask = (input_ids == 0).to(self.dtype)
        attention_mask *= mindspore.tensor(np.finfo(mindspore.dtype_to_nptype(attention_mask.dtype)).min, attention_mask.dtype)
        if past_key_values is not None:
            batch_size = past_key_values[0][0].shape[0]
            past_mask = ops.zeros([batch_size, 1, 1, past_key_values_length], dtype=attention_mask.dtype)
            attention_mask = ops.concat([past_mask, attention_mask], dim=-1)
    # For 2D attention_mask from tokenizer
    elif attention_mask.ndim == 2:
        attention_mask = attention_mask.to(self.dtype)
        attention_mask = 1.0 - attention_mask
        attention_mask *= mindspore.tensor(np.finfo(mindspore.dtype_to_nptype(attention_mask.dtype)).min, attention_mask.dtype)

    extended_attention_mask = attention_mask.unsqueeze(1).unsqueeze(1)

    embedding_output = self.embeddings(
        input_ids=input_ids,
        position_ids=position_ids,
        inputs_embeds=inputs_embeds,
        past_key_values_length=past_key_values_length,
    )
    encoder_outputs = self.encoder(
        embedding_output,
        attention_mask=extended_attention_mask,
        head_mask=head_mask,
        past_key_values=past_key_values,
        output_attentions=output_attentions,
        output_hidden_states=output_hidden_states,
        return_dict=return_dict,
    )

    if not return_dict:
        sequence_output = encoder_outputs[0]
        pooler_output = self.pooler(sequence_output) if self.pooler is not None else None
        return (sequence_output, pooler_output) + encoder_outputs[1:]

    sequence_output = encoder_outputs["last_hidden_state"]
    pooler_output = self.pooler(sequence_output) if self.pooler is not None else None
    hidden_states = None if not output_hidden_states else encoder_outputs["hidden_states"]
    attentions = None if not output_attentions else encoder_outputs["attentions"]

    return BaseModelOutputWithPoolingAndCrossAttentions(
        last_hidden_state=sequence_output,
        pooler_output=pooler_output,
        hidden_states=hidden_states,
        attentions=attentions,
    )

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMModel.get_input_embeddings()` ¶

This method returns the input embeddings from the ErnieMModel.

PARAMETER	DESCRIPTION
`self`	ErnieMModel object. The instance of the ErnieMModel class.

RETURNS	DESCRIPTION
`word_embeddings`	The word embedding layer of the model (`self.embeddings.word_embeddings`).

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

def get_input_embeddings(self):
    """
    This method returns the input embeddings from the ErnieMModel.

    Args:
        self: ErnieMModel object. The instance of the ErnieMModel class.

    Returns:
        word_embeddings: The method returns the input embeddings from the ErnieMModel.

    Raises:
        None.
    """
    return self.embeddings.word_embeddings

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMModel.set_input_embeddings(value)` ¶

Set the input embeddings for the ErnieMModel.

PARAMETER	DESCRIPTION
`self`	The instance of the ErnieMModel class. TYPE: `ErnieMModel`
`value`	The new input embeddings. The value is assigned to `self.embeddings.word_embeddings`, so it would normally be an embedding layer compatible with the one returned by `get_input_embeddings()`.

RETURNS	DESCRIPTION
	None.

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

def set_input_embeddings(self, value):
    """
    Set the input embeddings for the ErnieMModel.

    Args:
        self (ErnieMModel): The instance of the ErnieMModel class.
        value: The input embeddings value to be set. It should be a tensor representing the input embeddings.

    Returns:
        None.

    Raises:
        None.
    """
    self.embeddings.word_embeddings = value

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMPooler` ¶

Bases: Module

This class implements the pooler module of the ERNIE-M model, which pools the hidden states into a single representation of the input sequence by transforming the hidden state of the first token.

Inherits from

nn.Module

ATTRIBUTE	DESCRIPTION
`dense`	A fully connected layer that projects the input hidden states to a new hidden size. TYPE: `Linear`
`activation`	The activation function applied to the projected hidden states. TYPE: `Tanh`

METHOD	DESCRIPTION
`__init__`	Initializes the ERNIE-M pooler module.
`forward`	Computes the pooled output from the hidden states.

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

class ErnieMPooler(nn.Module):
    """
    This class represents the MPooler module of the ERNIE model, which is responsible for pooling the hidden states to
    obtain a single representation of the input sequence.

    Inherits from:
        nn.Module

    Attributes:
        dense (nn.Linear): A fully connected layer that projects the input hidden states to a new hidden size.
        activation (nn.Tanh): The activation function applied to the projected hidden states.

    Methods:
        __init__(config): Initializes the ERNIE MPooler module.
        forward(hidden_states): Constructs the MPooler module by pooling the hidden states.

    """
    def __init__(self, config):
        """
        Initializes a new instance of the ErnieMPooler class.

        Args:
            self: The object instance.
            config: An instance of the configuration class used to configure the ErnieMPooler.
                It provides various settings and parameters for the ErnieMPooler's behavior. This parameter is required.

        Returns:
            None.

        Raises:
            None.
        """
        super().__init__()
        self.dense = nn.Linear(config.hidden_size, config.hidden_size)
        self.activation = nn.Tanh()

    def forward(self, hidden_states: mindspore.Tensor) -> mindspore.Tensor:
        """
        Constructs the pooled output tensor for the ERNIE model.

        Args:
            self (ErnieMPooler): An instance of the ErnieMPooler class.
            hidden_states (mindspore.Tensor): A tensor containing the hidden states from the ERNIE model.
                It should have the shape (batch_size, sequence_length, hidden_size) where:

                - batch_size: The number of sequences in the batch.
                - sequence_length: The length of each input sequence.
                - hidden_size: The size of the hidden state vectors.

        Returns:
            mindspore.Tensor: A tensor representing the pooled output of the ERNIE model.
                The pooled output is obtained by applying dense and activation layers to the first token tensor
                extracted from the hidden states tensor.

        Raises:
            None
        """
        # We "pool" the model by simply taking the hidden state corresponding
        # to the first token.
        first_token_tensor = hidden_states[:, 0]
        pooled_output = self.dense(first_token_tensor)
        pooled_output = self.activation(pooled_output)
        return pooled_output

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMPooler.__init__(config)` ¶

Initializes a new instance of the ErnieMPooler class.

PARAMETER	DESCRIPTION
`self`	The object instance.
`config`	An instance of the configuration class used to configure the ErnieMPooler. It provides various settings and parameters for the ErnieMPooler's behavior. This parameter is required.

RETURNS	DESCRIPTION
	None.

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

def __init__(self, config):
    """
    Initializes a new instance of the ErnieMPooler class.

    Args:
        self: The object instance.
        config: An instance of the configuration class used to configure the ErnieMPooler.
            It provides various settings and parameters for the ErnieMPooler's behavior. This parameter is required.

    Returns:
        None.

    Raises:
        None.
    """
    super().__init__()
    self.dense = nn.Linear(config.hidden_size, config.hidden_size)
    self.activation = nn.Tanh()

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMPooler.forward(hidden_states)` ¶

Computes the pooled output tensor for the ERNIE-M model.

PARAMETER	DESCRIPTION
`self`	An instance of the ErnieMPooler class. TYPE: `ErnieMPooler`
`hidden_states`	A tensor containing the hidden states from the ERNIE-M model, with shape (batch_size, sequence_length, hidden_size), where batch_size is the number of sequences in the batch, sequence_length is the length of each input sequence, and hidden_size is the size of the hidden state vectors. TYPE: `Tensor`

RETURNS	DESCRIPTION
`Tensor`	A tensor representing the pooled output of the ERNIE-M model. The pooled output is obtained by applying the dense and activation layers to the first-token tensor extracted from the hidden states.
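The pooling step described above can be sketched in NumPy. This is an illustration under assumptions, not the library implementation: `pool_first_token` is a hypothetical helper, and `weight` and `bias` stand in for the parameters of the `dense` layer.

```python
import numpy as np

def pool_first_token(hidden_states, weight, bias):
    """Pool a (batch, seq_len, hidden) tensor into (batch, hidden).

    Only the hidden state of the first token is used; it is passed
    through a linear layer and a tanh activation.
    """
    first_token = hidden_states[:, 0]              # (batch, hidden)
    return np.tanh(first_token @ weight.T + bias)  # (batch, hidden)

rng = np.random.default_rng(0)
hidden_states = rng.normal(size=(2, 5, 8))  # batch=2, seq_len=5, hidden=8
weight = rng.normal(size=(8, 8)) * 0.02     # stand-in for dense.weight
bias = np.zeros(8)                          # stand-in for dense.bias
pooled = pool_first_token(hidden_states, weight, bias)
```

Because only index 0 along the sequence axis is read, changing any later token leaves the pooled output unchanged.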

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

def forward(self, hidden_states: mindspore.Tensor) -> mindspore.Tensor:
    """
    Constructs the pooled output tensor for the ERNIE model.

    Args:
        self (ErnieMPooler): An instance of the ErnieMPooler class.
        hidden_states (mindspore.Tensor): A tensor containing the hidden states from the ERNIE model.
            It should have the shape (batch_size, sequence_length, hidden_size) where:

            - batch_size: The number of sequences in the batch.
            - sequence_length: The length of each input sequence.
            - hidden_size: The size of the hidden state vectors.

    Returns:
        mindspore.Tensor: A tensor representing the pooled output of the ERNIE model.
            The pooled output is obtained by applying dense and activation layers to the first token tensor
            extracted from the hidden states tensor.

    Raises:
        None
    """
    # We "pool" the model by simply taking the hidden state corresponding
    # to the first token.
    first_token_tensor = hidden_states[:, 0]
    pooled_output = self.dense(first_token_tensor)
    pooled_output = self.activation(pooled_output)
    return pooled_output

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMPreTrainedModel` ¶

Bases: PreTrainedModel

An abstract class to handle weights initialization and a simple interface for downloading and loading pretrained models.

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

class ErnieMPreTrainedModel(PreTrainedModel):
    """
    An abstract class to handle weights initialization and a simple interface for downloading and loading pretrained
    models.
    """
    config_class = ErnieMConfig
    base_model_prefix = "ernie_m"

    def _init_weights(self, cell):
        """Initialize the weights"""
        if isinstance(cell, nn.Linear):
            # Slightly different from the TF version which uses truncated_normal for initialization
            # cf https://github.com/pytorch/pytorch/pull/5617
            cell.weight.set_data(initializer(Normal(self.config.initializer_range),
                                                    cell.weight.shape, cell.weight.dtype))
            if cell.bias is not None:
                cell.bias.set_data(initializer('zeros', cell.bias.shape, cell.bias.dtype))
        elif isinstance(cell, nn.Embedding):
            weight = np.random.normal(0.0, self.config.initializer_range, cell.weight.shape)
            if cell.padding_idx:
                weight[cell.padding_idx] = 0

            cell.weight.set_data(Tensor(weight, cell.weight.dtype))
        elif isinstance(cell, nn.LayerNorm):
            cell.weight.set_data(initializer('ones', cell.weight.shape, cell.weight.dtype))
            cell.bias.set_data(initializer('zeros', cell.bias.shape, cell.bias.dtype))
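
The initialization scheme in `_init_weights` can be sketched in NumPy as follows. The helper names are hypothetical and NumPy arrays stand in for MindSpore parameters. One deliberate difference, labeled as an assumption: the sketch tests `padding_idx is not None`, so a padding index of 0 is also zeroed, whereas the truthiness test in the listing above skips index 0.

```python
import numpy as np

def init_linear(out_features, in_features, initializer_range=0.02, rng=None):
    """Linear layer: normal(0, initializer_range) weights, zero bias."""
    if rng is None:
        rng = np.random.default_rng()
    weight = rng.normal(0.0, initializer_range, size=(out_features, in_features))
    bias = np.zeros(out_features)
    return weight, bias

def init_embedding(vocab_size, hidden_size, initializer_range=0.02,
                   padding_idx=None, rng=None):
    """Embedding: normal(0, initializer_range) rows, padding row set to zero."""
    if rng is None:
        rng = np.random.default_rng()
    weight = rng.normal(0.0, initializer_range, size=(vocab_size, hidden_size))
    # Assumption: explicit None check, so padding_idx == 0 is handled too.
    if padding_idx is not None:
        weight[padding_idx] = 0.0
    return weight

def init_layer_norm(hidden_size):
    """LayerNorm: scale of ones, shift of zeros."""
    return np.ones(hidden_size), np.zeros(hidden_size)

embedding = init_embedding(10, 4, padding_idx=0, rng=np.random.default_rng(0))
linear_weight, linear_bias = init_linear(3, 4, rng=np.random.default_rng(1))
gamma, beta = init_layer_norm(4)
```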

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMSelfAttention` ¶

Bases: Module

A module that implements the self-attention mechanism used in the ERNIE-M model.

ErnieMSelfAttention is a subclass of nn.Module. It projects the hidden states into queries, keys, and values, calculates the attention scores, and produces the context layer.

ATTRIBUTE	DESCRIPTION
`num_attention_heads`	The number of attention heads in the self-attention mechanism. TYPE: `int`
`attention_head_size`	The size of each attention head. TYPE: `int`
`all_head_size`	The total size of all attention heads combined. TYPE: `int`
`q_proj`	The projection layer for the query tensor. TYPE: `Linear`
`k_proj`	The projection layer for the key tensor. TYPE: `Linear`
`v_proj`	The projection layer for the value tensor. TYPE: `Linear`
`dropout`	The dropout layer applied to the attention probabilities. TYPE: `Dropout`
`position_embedding_type`	The type of position embedding used in the attention mechanism. TYPE: `str`
`distance_embedding`	The embedding layer for computing relative positions in the attention scores. TYPE: `Embedding`
`is_decoder`	Whether the self-attention mechanism is used in a decoder module. TYPE: `bool`

METHOD	DESCRIPTION
`transpose_for_scores`	Reshapes the input tensor for calculating attention scores.
`forward`	Runs self-attention by calculating attention scores and producing the context layer.

Example

>>> config = ErnieMConfig(hidden_size=768, num_attention_heads=12, attention_probs_dropout_prob=0.1)
>>> self_attention = ErnieMSelfAttention(config)
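
The head-splitting reshape performed by `transpose_for_scores` can be sketched in NumPy. `split_heads` is a hypothetical helper, not part of the library. It also reproduces the divisibility check from the constructor in simplified form, without the `embedding_size` exception.

```python
import numpy as np

def split_heads(x, num_attention_heads):
    """Reshape (batch, seq_len, hidden) to (batch, heads, seq_len, head_size)."""
    batch, seq_len, hidden_size = x.shape
    if hidden_size % num_attention_heads != 0:
        raise ValueError(
            f"The hidden size ({hidden_size}) is not a multiple of the number "
            f"of attention heads ({num_attention_heads})"
        )
    head_size = hidden_size // num_attention_heads
    # Split the last axis into (heads, head_size) ...
    x = x.reshape(batch, seq_len, num_attention_heads, head_size)
    # ... then move the head axis in front of the sequence axis.
    return x.transpose(0, 2, 1, 3)

x = np.arange(2 * 3 * 8, dtype=np.float32).reshape(2, 3, 8)
heads = split_heads(x, num_attention_heads=4)
```

Each head receives a contiguous slice of the hidden dimension; with 8 hidden units and 4 heads, head 1 sees units 2 and 3 of every token.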

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

class ErnieMSelfAttention(nn.Module):
    """
    A module that implements the self-attention mechanism used in ERNIE model.

    This module contains the `ErnieMSelfAttention` class, which represents the self-attention mechanism used in the
    ERNIE model. It is a subclass of `nn.Module` and is responsible for calculating the attention scores and producing
    the context layer.

    Attributes:
        num_attention_heads (int): The number of attention heads in the self-attention mechanism.
        attention_head_size (int): The size of each attention head.
        all_head_size (int): The total size of all attention heads combined.
        q_proj (nn.Linear): The projection layer for the query tensor.
        k_proj (nn.Linear): The projection layer for the key tensor.
        v_proj (nn.Linear): The projection layer for the value tensor.
        dropout (nn.Dropout): The dropout layer applied to the attention probabilities.
        position_embedding_type (str): The type of position embedding used in the attention mechanism.
        distance_embedding (nn.Embedding): The embedding layer for computing relative positions in the attention scores.
        is_decoder (bool): Whether the self-attention mechanism is used in a decoder module.

    Methods:
        transpose_for_scores:
            Reshapes the input tensor for calculating attention scores.

        forward:
            Constructs the self-attention mechanism by calculating attention scores and producing the context layer.

    Example:
        ```python
        >>> config = ErnieMConfig(hidden_size=768, num_attention_heads=12, attention_probs_dropout_prob=0.1)
        >>> self_attention = ErnieMSelfAttention(config)
        ```
        """
    def __init__(self, config, position_embedding_type=None):
        """
        Initializes the ErnieMSelfAttention class.

        Args:
            self: The object itself.
            config (object): An object containing configuration parameters for the self-attention mechanism.
            position_embedding_type (str, optional): The type of position embedding to use. Defaults to None.

        Returns:
            None.

        Raises:
            ValueError: If the hidden size is not a multiple of the number of attention heads.
        """
        super().__init__()
        if config.hidden_size % config.num_attention_heads != 0 and not hasattr(config, "embedding_size"):
            raise ValueError(
                f"The hidden size ({config.hidden_size}) is not a multiple of the number of attention "
                f"heads ({config.num_attention_heads})"
            )

        self.num_attention_heads = config.num_attention_heads
        self.attention_head_size = int(config.hidden_size / config.num_attention_heads)
        self.all_head_size = self.num_attention_heads * self.attention_head_size

        self.q_proj = nn.Linear(config.hidden_size, self.all_head_size)
        self.k_proj = nn.Linear(config.hidden_size, self.all_head_size)
        self.v_proj = nn.Linear(config.hidden_size, self.all_head_size)

        self.dropout = nn.Dropout(p=config.attention_probs_dropout_prob)
        self.position_embedding_type = position_embedding_type or getattr(
            config, "position_embedding_type", "absolute"
        )
        if self.position_embedding_type in ('relative_key', 'relative_key_query'):
            self.max_position_embeddings = config.max_position_embeddings
            self.distance_embedding = nn.Embedding(2 * config.max_position_embeddings - 1, self.attention_head_size)

        self.is_decoder = config.is_decoder

    def transpose_for_scores(self, x: mindspore.Tensor) -> mindspore.Tensor:
        """
        Transposes the input tensor for calculating attention scores in the ErnieMSelfAttention class.

        Args:
            self (ErnieMSelfAttention): The instance of the ErnieMSelfAttention class.
            x (mindspore.Tensor): The input tensor to be transposed.
                It should have a shape of (batch_size, sequence_length, hidden_size).

        Returns:
            mindspore.Tensor:
                The transposed tensor with shape (batch_size, num_attention_heads, sequence_length, attention_head_size).

        Raises:
            None.
        """
        new_x_shape = x.shape[:-1] + (self.num_attention_heads, self.attention_head_size)
        x = x.view(new_x_shape)
        return x.permute(0, 2, 1, 3)

    def forward(
        self,
        hidden_states: mindspore.Tensor,
        attention_mask: Optional[mindspore.Tensor] = None,
        head_mask: Optional[mindspore.Tensor] = None,
        encoder_hidden_states: Optional[mindspore.Tensor] = None,
        encoder_attention_mask: Optional[mindspore.Tensor] = None,
        past_key_value: Optional[Tuple[Tuple[mindspore.Tensor]]] = None,
        output_attentions: Optional[bool] = False,
    ) -> Tuple[mindspore.Tensor]:
        """
        This method forwards the self-attention mechanism for the ErnieMSelfAttention class.

        Args:
            self: The instance of the class.
            hidden_states (mindspore.Tensor): The input tensor representing the hidden states.
            attention_mask (Optional[mindspore.Tensor]):
                Optional tensor for masking attention scores. Defaults to None.
            head_mask (Optional[mindspore.Tensor]): Optional tensor for masking attention heads. Defaults to None.
            encoder_hidden_states (Optional[mindspore.Tensor]):
                Optional tensor representing hidden states from an encoder. Defaults to None.
            encoder_attention_mask (Optional[mindspore.Tensor]):
                Optional tensor for masking encoder attention scores. Defaults to None.
            past_key_value (Optional[Tuple[Tuple[mindspore.Tensor]]]):
                Optional tuple of past key and value tensors. Defaults to None.
            output_attentions (Optional[bool]):
                Flag indicating whether to output attentions. Defaults to False.

        Returns:
            Tuple[mindspore.Tensor]:
                A tuple containing the context layer tensor and optionally the attention probabilities tensor.

        Raises:
            ValueError: If the input tensor shapes are incompatible for matrix multiplication.
            ValueError: If the position_embedding_type specified is not supported.
            RuntimeError: If there is an issue with applying softmax or dropout operations.
            RuntimeError: If there is an issue with reshaping the context layer tensor.
        """
        mixed_query_layer = self.q_proj(hidden_states)

        # If this is instantiated as a cross-attention module, the keys
        # and values come from an encoder; the attention mask needs to be
        # such that the encoder's padding tokens are not attended to.
        is_cross_attention = encoder_hidden_states is not None

        if is_cross_attention and past_key_value is not None:
            # reuse k,v, cross_attentions
            key_layer = past_key_value[0]
            value_layer = past_key_value[1]
            attention_mask = encoder_attention_mask
        elif is_cross_attention:
            key_layer = self.transpose_for_scores(self.k_proj(encoder_hidden_states))
            value_layer = self.transpose_for_scores(self.v_proj(encoder_hidden_states))
            attention_mask = encoder_attention_mask
        elif past_key_value is not None:
            key_layer = self.transpose_for_scores(self.k_proj(hidden_states))
            value_layer = self.transpose_for_scores(self.v_proj(hidden_states))
            key_layer = ops.cat([past_key_value[0], key_layer], dim=2)
            value_layer = ops.cat([past_key_value[1], value_layer], dim=2)
        else:
            key_layer = self.transpose_for_scores(self.k_proj(hidden_states))
            value_layer = self.transpose_for_scores(self.v_proj(hidden_states))

        query_layer = self.transpose_for_scores(mixed_query_layer)

        use_cache = past_key_value is not None
        if self.is_decoder:
            # if cross_attention save Tuple(mindspore.Tensor, mindspore.Tensor) of all cross attention key/value_states.
            # Further calls to cross_attention layer can then reuse all cross-attention
            # key/value_states (first "if" case)
            # if uni-directional self-attention (decoder) save Tuple(mindspore.Tensor, mindspore.Tensor) of
            # all previous decoder key/value_states. Further calls to uni-directional self-attention
            # can concat previous decoder key/value_states to current projected key/value_states (third "elif" case)
            # if encoder bi-directional self-attention `past_key_value` is always `None`
            past_key_value = (key_layer, value_layer)

        # Take the dot product between "query" and "key" to get the raw attention scores.
        attention_scores = ops.matmul(query_layer, key_layer.swapaxes(-1, -2))

        if self.position_embedding_type in ('relative_key', 'relative_key_query'):
            query_length, key_length = query_layer.shape[2], key_layer.shape[2]
            if use_cache:
                position_ids_l = mindspore.tensor(key_length - 1, dtype=mindspore.int64).view(
                    -1, 1
                )
            else:
                position_ids_l = ops.arange(query_length, dtype=mindspore.int64).view(-1, 1)
            position_ids_r = ops.arange(key_length, dtype=mindspore.int64).view(1, -1)
            distance = position_ids_l - position_ids_r

            positional_embedding = self.distance_embedding(distance + self.max_position_embeddings - 1)
            positional_embedding = positional_embedding.to(dtype=query_layer.dtype)  # fp16 compatibility

            if self.position_embedding_type == "relative_key":
                relative_position_scores = ops.einsum("bhld,lrd->bhlr", query_layer, positional_embedding)
                attention_scores = attention_scores + relative_position_scores
            elif self.position_embedding_type == "relative_key_query":
                relative_position_scores_query = ops.einsum("bhld,lrd->bhlr", query_layer, positional_embedding)
                relative_position_scores_key = ops.einsum("bhrd,lrd->bhlr", key_layer, positional_embedding)
                attention_scores = attention_scores + relative_position_scores_query + relative_position_scores_key

        attention_scores = attention_scores / math.sqrt(self.attention_head_size)
        if attention_mask is not None:
            # Apply the attention mask (precomputed for all layers in ErnieMModel forward() function)
            attention_scores = attention_scores + attention_mask

        # Normalize the attention scores to probabilities.
        attention_probs = ops.softmax(attention_scores, dim=-1)

        # This is actually dropping out entire tokens to attend to, which might
        # seem a bit unusual, but is taken from the original Transformer paper.
        attention_probs = self.dropout(attention_probs)

        # Mask heads if we want to
        if head_mask is not None:
            attention_probs = attention_probs * head_mask

        context_layer = ops.matmul(attention_probs, value_layer)

        context_layer = context_layer.permute(0, 2, 1, 3)
        new_context_layer_shape = context_layer.shape[:-2] + (self.all_head_size,)
        context_layer = context_layer.view(new_context_layer_shape)

        outputs = (context_layer, attention_probs) if output_attentions else (context_layer,)

        if self.is_decoder:
            outputs = outputs + (past_key_value,)
        return outputs

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMSelfAttention.__init__(config, position_embedding_type=None)` ¶

Initializes the ErnieMSelfAttention class.

PARAMETER	DESCRIPTION
`self`	The object itself.
`config`	An object containing configuration parameters for the self-attention mechanism. TYPE: `object`
`position_embedding_type`	The type of position embedding to use. Defaults to None. TYPE: `str` DEFAULT: `None`

RETURNS	DESCRIPTION
	None.

RAISES	DESCRIPTION
`ValueError`	If the hidden size is not a multiple of the number of attention heads and the config has no `embedding_size` attribute.

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

def __init__(self, config, position_embedding_type=None):
    """
    Initializes the ErnieMSelfAttention class.

    Args:
        self: The object itself.
        config (object): An object containing configuration parameters for the self-attention mechanism.
        position_embedding_type (str, optional): The type of position embedding to use. Defaults to None.

    Returns:
        None.

    Raises:
        ValueError: If the hidden size is not a multiple of the number of attention heads
            and the config has no `embedding_size` attribute.
    """
    super().__init__()
    if config.hidden_size % config.num_attention_heads != 0 and not hasattr(config, "embedding_size"):
        raise ValueError(
            f"The hidden size ({config.hidden_size}) is not a multiple of the number of attention "
            f"heads ({config.num_attention_heads})"
        )

    self.num_attention_heads = config.num_attention_heads
    self.attention_head_size = int(config.hidden_size / config.num_attention_heads)
    self.all_head_size = self.num_attention_heads * self.attention_head_size

    self.q_proj = nn.Linear(config.hidden_size, self.all_head_size)
    self.k_proj = nn.Linear(config.hidden_size, self.all_head_size)
    self.v_proj = nn.Linear(config.hidden_size, self.all_head_size)

    self.dropout = nn.Dropout(p=config.attention_probs_dropout_prob)
    self.position_embedding_type = position_embedding_type or getattr(
        config, "position_embedding_type", "absolute"
    )
    if self.position_embedding_type in ('relative_key', 'relative_key_query'):
        self.max_position_embeddings = config.max_position_embeddings
        self.distance_embedding = nn.Embedding(2 * config.max_position_embeddings - 1, self.attention_head_size)

    self.is_decoder = config.is_decoder
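The size bookkeeping in the constructor above can be reproduced without any framework. The following is a minimal sketch, not part of the library: `head_geometry` is a hypothetical helper, and the config values (768, 12, 514) are illustrative assumptions rather than values read from a real checkpoint.

```python
from types import SimpleNamespace


def head_geometry(config):
    """Hypothetical helper mirroring the divisibility check and derived sizes."""
    # Same condition as the constructor: the check is skipped when the
    # config carries an `embedding_size` attribute.
    if config.hidden_size % config.num_attention_heads != 0 and not hasattr(config, "embedding_size"):
        raise ValueError(
            f"The hidden size ({config.hidden_size}) is not a multiple of the number of attention "
            f"heads ({config.num_attention_heads})"
        )
    attention_head_size = config.hidden_size // config.num_attention_heads
    all_head_size = config.num_attention_heads * attention_head_size
    return attention_head_size, all_head_size


# Illustrative values only (assumed, not taken from a published config).
cfg = SimpleNamespace(hidden_size=768, num_attention_heads=12, max_position_embeddings=514)
head_size, all_head_size = head_geometry(cfg)

# Rows needed by the relative-distance embedding table: distances range
# from -(max - 1) to +(max - 1), i.e. 2 * max - 1 distinct values.
distance_table_rows = 2 * cfg.max_position_embeddings - 1
print(head_size, all_head_size, distance_table_rows)  # 64 768 1027
```

With these numbers each of the 12 heads works on a 64-dimensional slice, and the three projections map 768 features to 768 features.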

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMSelfAttention.forward(hidden_states, attention_mask=None, head_mask=None, encoder_hidden_states=None, encoder_attention_mask=None, past_key_value=None, output_attentions=False)` ¶

Computes multi-head self-attention (or cross-attention when `encoder_hidden_states` is given) over the input hidden states.

PARAMETER	DESCRIPTION
`self`	The instance of the class.
`hidden_states`	The input tensor representing the hidden states. TYPE: `Tensor`
`attention_mask`	Optional tensor for masking attention scores. Defaults to None. TYPE: `Optional[Tensor]` DEFAULT: `None`
`head_mask`	Optional tensor for masking attention heads. Defaults to None. TYPE: `Optional[Tensor]` DEFAULT: `None`
`encoder_hidden_states`	Optional tensor representing hidden states from an encoder. Defaults to None. TYPE: `Optional[Tensor]` DEFAULT: `None`
`encoder_attention_mask`	Optional tensor for masking encoder attention scores. Defaults to None. TYPE: `Optional[Tensor]` DEFAULT: `None`
`past_key_value`	Optional tuple of past key and value tensors. Defaults to None. TYPE: `Optional[Tuple[Tuple[Tensor]]]` DEFAULT: `None`
`output_attentions`	Flag indicating whether to output attentions. Defaults to False. TYPE: `Optional[bool]` DEFAULT: `False`

RETURNS	DESCRIPTION
`Tuple[Tensor]`	A tuple containing the context layer tensor, optionally the attention probabilities tensor, and, when the module is a decoder, the cached `(key_layer, value_layer)` pair.

RAISES	DESCRIPTION
`ValueError`	If the input tensor shapes are incompatible for matrix multiplication.
`ValueError`	If the position_embedding_type specified is not supported.
`RuntimeError`	If there is an issue with applying softmax or dropout operations.
`RuntimeError`	If there is an issue with reshaping the context layer tensor.
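The core computation of this method (scaled dot product, additive mask, softmax, weighted sum, head merge) can be sketched in NumPy. This is an illustration under simplifying assumptions, not the library's implementation: dropout, `head_mask`, relative position scores and the key/value cache are all left out, and `scaled_dot_product_attention` is a hypothetical name.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def scaled_dot_product_attention(query, key, value, additive_mask=None):
    """query, key, value: arrays of shape (batch, heads, seq, head_size)."""
    head_size = query.shape[-1]
    # Raw scores, scaled by sqrt(head_size) as in the method.
    scores = query @ np.swapaxes(key, -1, -2) / np.sqrt(head_size)
    if additive_mask is not None:
        # Masked positions carry a large negative value, so softmax sends them to ~0.
        scores = scores + additive_mask
    probs = softmax(scores, axis=-1)
    context = probs @ value  # (batch, heads, seq, head_size)
    batch, heads, seq, _ = context.shape
    # Merge heads back: (batch, seq, heads * head_size).
    context = context.transpose(0, 2, 1, 3).reshape(batch, seq, heads * head_size)
    return context, probs


rng = np.random.default_rng(0)
q = rng.normal(size=(1, 2, 4, 8))
k = rng.normal(size=(1, 2, 4, 8))
v = rng.normal(size=(1, 2, 4, 8))

# Additive mask that hides the last key position from every query.
mask = np.zeros((1, 1, 1, 4))
mask[..., -1] = -1e9

context, probs = scaled_dot_product_attention(q, k, v, mask)
print(context.shape, probs.shape)  # (1, 4, 16) (1, 2, 4, 4)
```

The mask is added to the scores rather than multiplied, which is why it is expected to hold zeros for visible positions and large negative values for hidden ones.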

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

def forward(
    self,
    hidden_states: mindspore.Tensor,
    attention_mask: Optional[mindspore.Tensor] = None,
    head_mask: Optional[mindspore.Tensor] = None,
    encoder_hidden_states: Optional[mindspore.Tensor] = None,
    encoder_attention_mask: Optional[mindspore.Tensor] = None,
    past_key_value: Optional[Tuple[Tuple[mindspore.Tensor]]] = None,
    output_attentions: Optional[bool] = False,
) -> Tuple[mindspore.Tensor]:
    """
    Computes multi-head self-attention (or cross-attention when `encoder_hidden_states` is given) over the input hidden states.

    Args:
        self: The instance of the class.
        hidden_states (mindspore.Tensor): The input tensor representing the hidden states.
        attention_mask (Optional[mindspore.Tensor]):
            Optional tensor for masking attention scores. Defaults to None.
        head_mask (Optional[mindspore.Tensor]): Optional tensor for masking attention heads. Defaults to None.
        encoder_hidden_states (Optional[mindspore.Tensor]):
            Optional tensor representing hidden states from an encoder. Defaults to None.
        encoder_attention_mask (Optional[mindspore.Tensor]):
            Optional tensor for masking encoder attention scores. Defaults to None.
        past_key_value (Optional[Tuple[Tuple[mindspore.Tensor]]]):
            Optional tuple of past key and value tensors. Defaults to None.
        output_attentions (Optional[bool]):
            Flag indicating whether to output attentions. Defaults to False.

    Returns:
        Tuple[mindspore.Tensor]:
            A tuple containing the context layer tensor, optionally the attention probabilities tensor,
            and, when the module is a decoder, the cached `(key_layer, value_layer)` pair.

    Raises:
        ValueError: If the input tensor shapes are incompatible for matrix multiplication.
        ValueError: If the position_embedding_type specified is not supported.
        RuntimeError: If there is an issue with applying softmax or dropout operations.
        RuntimeError: If there is an issue with reshaping the context layer tensor.
    """
    mixed_query_layer = self.q_proj(hidden_states)

    # If this is instantiated as a cross-attention module, the keys
    # and values come from an encoder; the attention mask needs to be
    # such that the encoder's padding tokens are not attended to.
    is_cross_attention = encoder_hidden_states is not None

    if is_cross_attention and past_key_value is not None:
        # reuse k,v, cross_attentions
        key_layer = past_key_value[0]
        value_layer = past_key_value[1]
        attention_mask = encoder_attention_mask
    elif is_cross_attention:
        key_layer = self.transpose_for_scores(self.k_proj(encoder_hidden_states))
        value_layer = self.transpose_for_scores(self.v_proj(encoder_hidden_states))
        attention_mask = encoder_attention_mask
    elif past_key_value is not None:
        key_layer = self.transpose_for_scores(self.k_proj(hidden_states))
        value_layer = self.transpose_for_scores(self.v_proj(hidden_states))
        key_layer = ops.cat([past_key_value[0], key_layer], dim=2)
        value_layer = ops.cat([past_key_value[1], value_layer], dim=2)
    else:
        key_layer = self.transpose_for_scores(self.k_proj(hidden_states))
        value_layer = self.transpose_for_scores(self.v_proj(hidden_states))

    query_layer = self.transpose_for_scores(mixed_query_layer)

    use_cache = past_key_value is not None
    if self.is_decoder:
        # if cross_attention save Tuple(mindspore.Tensor, mindspore.Tensor) of all cross attention key/value_states.
        # Further calls to cross_attention layer can then reuse all cross-attention
        # key/value_states (first "if" case)
        # if uni-directional self-attention (decoder) save Tuple(mindspore.Tensor, mindspore.Tensor) of
        # all previous decoder key/value_states. Further calls to uni-directional self-attention
        # can concat previous decoder key/value_states to current projected key/value_states (third "elif" case)
        # if encoder bi-directional self-attention `past_key_value` is always `None`
        past_key_value = (key_layer, value_layer)

    # Take the dot product between "query" and "key" to get the raw attention scores.
    attention_scores = ops.matmul(query_layer, key_layer.swapaxes(-1, -2))

    if self.position_embedding_type in ('relative_key', 'relative_key_query'):
        query_length, key_length = query_layer.shape[2], key_layer.shape[2]
        if use_cache:
            position_ids_l = mindspore.tensor(key_length - 1, dtype=mindspore.int64).view(
                -1, 1
            )
        else:
            position_ids_l = ops.arange(query_length, dtype=mindspore.int64).view(-1, 1)
        position_ids_r = ops.arange(key_length, dtype=mindspore.int64).view(1, -1)
        distance = position_ids_l - position_ids_r

        positional_embedding = self.distance_embedding(distance + self.max_position_embeddings - 1)
        positional_embedding = positional_embedding.to(dtype=query_layer.dtype)  # fp16 compatibility

        if self.position_embedding_type == "relative_key":
            relative_position_scores = ops.einsum("bhld,lrd->bhlr", query_layer, positional_embedding)
            attention_scores = attention_scores + relative_position_scores
        elif self.position_embedding_type == "relative_key_query":
            relative_position_scores_query = ops.einsum("bhld,lrd->bhlr", query_layer, positional_embedding)
            relative_position_scores_key = ops.einsum("bhrd,lrd->bhlr", key_layer, positional_embedding)
            attention_scores = attention_scores + relative_position_scores_query + relative_position_scores_key

    attention_scores = attention_scores / math.sqrt(self.attention_head_size)
    if attention_mask is not None:
        # Apply the attention mask (precomputed for all layers in ErnieMModel forward() function)
        attention_scores = attention_scores + attention_mask

    # Normalize the attention scores to probabilities.
    attention_probs = ops.softmax(attention_scores, dim=-1)

    # This is actually dropping out entire tokens to attend to, which might
    # seem a bit unusual, but is taken from the original Transformer paper.
    attention_probs = self.dropout(attention_probs)

    # Mask heads if we want to
    if head_mask is not None:
        attention_probs = attention_probs * head_mask

    context_layer = ops.matmul(attention_probs, value_layer)

    context_layer = context_layer.permute(0, 2, 1, 3)
    new_context_layer_shape = context_layer.shape[:-2] + (self.all_head_size,)
    context_layer = context_layer.view(new_context_layer_shape)

    outputs = (context_layer, attention_probs) if output_attentions else (context_layer,)

    if self.is_decoder:
        outputs = outputs + (past_key_value,)
    return outputs
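The relative-position branch above turns every (query position, key position) pair into a row index of `distance_embedding`. A small NumPy sketch of that index arithmetic follows; `relative_distance_index` is a hypothetical helper and the tiny `max_position_embeddings=4` is chosen only to keep the output readable.

```python
import numpy as np


def relative_distance_index(query_length, key_length, max_position_embeddings, use_cache=False):
    """Hypothetical helper reproducing the index arithmetic of the relative-position branch."""
    if use_cache:
        # With a cache, only the newest query token is processed; it sits at
        # the last key position.
        position_ids_l = np.array([[key_length - 1]])
    else:
        position_ids_l = np.arange(query_length).reshape(-1, 1)
    position_ids_r = np.arange(key_length).reshape(1, -1)
    distance = position_ids_l - position_ids_r
    # Shift so the most negative distance, -(max - 1), maps to row 0.
    return distance + max_position_embeddings - 1


full = relative_distance_index(3, 3, max_position_embeddings=4)
cached = relative_distance_index(1, 3, max_position_embeddings=4, use_cache=True)
print(full.tolist())    # [[3, 2, 1], [4, 3, 2], [5, 4, 3]]
print(cached.tolist())  # [[5, 4, 3]]
```

Every index falls in the range 0 to 2 * max_position_embeddings - 2, which matches the 2 * max_position_embeddings - 1 rows allocated for the embedding table in the constructor.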

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMSelfAttention.transpose_for_scores(x)` ¶

Transposes the input tensor for calculating attention scores in the ErnieMSelfAttention class.

PARAMETER	DESCRIPTION
`self`	The instance of the ErnieMSelfAttention class. TYPE: `ErnieMSelfAttention`
`x`	The input tensor to be transposed. It should have a shape of (batch_size, sequence_length, hidden_size). TYPE: `Tensor`

RETURNS	DESCRIPTION
`Tensor`	The transposed tensor with shape (batch_size, num_attention_heads, sequence_length, attention_head_size).

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

def transpose_for_scores(self, x: mindspore.Tensor) -> mindspore.Tensor:
    """
    Transposes the input tensor for calculating attention scores in the ErnieMSelfAttention class.

    Args:
        self (ErnieMSelfAttention): The instance of the ErnieMSelfAttention class.
        x (mindspore.Tensor): The input tensor to be transposed.
            It should have a shape of (batch_size, sequence_length, hidden_size).

    Returns:
        mindspore.Tensor:
            The transposed tensor with shape (batch_size, num_attention_heads, sequence_length, attention_head_size).

    Raises:
        None.
    """
    new_x_shape = x.shape[:-1] + (self.num_attention_heads, self.attention_head_size)
    x = x.view(new_x_shape)
    return x.permute(0, 2, 1, 3)
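The reshape-then-permute step is easy to check on a small array. The sketch below uses NumPy instead of MindSpore tensors (an assumption made so it runs without the framework) and passes the head sizes as explicit arguments instead of reading them from `self`.

```python
import numpy as np


def transpose_for_scores(x, num_attention_heads, attention_head_size):
    # Split the last axis into (heads, head_size), then move heads ahead of the sequence axis.
    new_x_shape = x.shape[:-1] + (num_attention_heads, attention_head_size)
    return x.reshape(new_x_shape).transpose(0, 2, 1, 3)


# batch_size=2, sequence_length=5, hidden_size=12 split into 3 heads of size 4.
x = np.arange(2 * 5 * 12).reshape(2, 5, 12)
y = transpose_for_scores(x, num_attention_heads=3, attention_head_size=4)
print(y.shape)  # (2, 3, 5, 4)

# Head h of token s is the contiguous slice [h * 4, (h + 1) * 4) of that token's features.
print(np.array_equal(y[1, 2, 3], x[1, 3, 8:12]))  # True
```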

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.UIEM` ¶

Bases: ErnieMForInformationExtraction

UIEM model

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

class UIEM(ErnieMForInformationExtraction):
    """UIEM model"""
    def forward(
        self,
        input_ids: Optional[mindspore.Tensor] = None,
        attention_mask: Optional[mindspore.Tensor] = None,
        position_ids: Optional[mindspore.Tensor] = None,
        head_mask: Optional[mindspore.Tensor] = None,
        inputs_embeds: Optional[mindspore.Tensor] = None,
        start_positions: Optional[mindspore.Tensor] = None,
        end_positions: Optional[mindspore.Tensor] = None,
        output_attentions: Optional[bool] = None,
        output_hidden_states: Optional[bool] = None,
        return_dict: Optional[bool] = True,
    ) -> Union[Tuple[mindspore.Tensor], QuestionAnsweringModelOutput]:
        r"""
        Args:
            start_positions (`mindspore.Tensor` of shape `(batch_size, sequence_length)`, *optional*):
                Labels for position (index) for computing the start_positions loss. Positions outside of the sequence
                are not taken into account for computing the loss.
            end_positions (`mindspore.Tensor` of shape `(batch_size, sequence_length)`, *optional*):
                Labels for position (index) for computing the end_positions loss. Positions outside of the sequence
                are not taken into account for computing the loss.
        """
        result = self.ernie_m(
            input_ids,
            # attention_mask=attention_mask,
            position_ids=position_ids,
            # head_mask=head_mask,
            inputs_embeds=inputs_embeds,
            output_attentions=output_attentions,
            output_hidden_states=output_hidden_states,
            return_dict=return_dict,
        )
        if return_dict:
            sequence_output = result.last_hidden_state
        else:
            sequence_output = result[0]

        start_logits = self.linear_start(sequence_output)
        start_logits = start_logits.squeeze(-1)
        start_prob = self.sigmoid(start_logits)
        end_logits = self.linear_end(sequence_output)
        end_logits = end_logits.squeeze(-1)
        end_prob = self.sigmoid(end_logits)

        total_loss = None
        if start_positions is not None and end_positions is not None:
            # If we are on multi-GPU, splitting adds a dimension
            if len(start_positions.shape) > 1 and start_positions.shape[-1] == 1:
                start_positions = start_positions.squeeze(-1)
            if len(end_positions.shape) > 1 and end_positions.shape[-1] == 1:
                end_positions = end_positions.squeeze(-1)
            # sometimes the start/end positions are outside our model inputs, we ignore these terms
            ignored_index = start_logits.shape[1]
            start_positions = start_positions.clamp(0, ignored_index)
            end_positions = end_positions.clamp(0, ignored_index)

            start_loss = F.binary_cross_entropy_with_logits(start_prob, start_positions)
            end_loss = F.binary_cross_entropy_with_logits(end_prob, end_positions)
            total_loss = (start_loss + end_loss) / 2

        if not return_dict:
            return tuple(
                i
                for i in [total_loss, start_prob, end_prob, result.hidden_states, result.attentions]
                if i is not None
            )

        return QuestionAnsweringModelOutput(
            loss=total_loss,
            start_logits=start_prob,
            end_logits=end_prob,
            hidden_states=result.hidden_states,
            attentions=result.attentions,
        )

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.UIEM.forward(input_ids=None, attention_mask=None, position_ids=None, head_mask=None, inputs_embeds=None, start_positions=None, end_positions=None, output_attentions=None, output_hidden_states=None, return_dict=True)` ¶

PARAMETER	DESCRIPTION
`start_positions`	Labels for position (index) for computing the start_positions loss. Positions outside of the sequence are not taken into account for computing the loss. TYPE: `mindspore.Tensor` of shape `(batch_size, sequence_length)`, optional DEFAULT: `None`
`end_positions`	Labels for position (index) for computing the end_positions loss. Positions outside of the sequence are not taken into account for computing the loss. TYPE: `mindspore.Tensor` of shape `(batch_size, sequence_length)`, optional DEFAULT: `None`

Source code in mindnlp\transformers\models\ernie_m\modeling_ernie_m.py

def forward(
    self,
    input_ids: Optional[mindspore.Tensor] = None,
    attention_mask: Optional[mindspore.Tensor] = None,
    position_ids: Optional[mindspore.Tensor] = None,
    head_mask: Optional[mindspore.Tensor] = None,
    inputs_embeds: Optional[mindspore.Tensor] = None,
    start_positions: Optional[mindspore.Tensor] = None,
    end_positions: Optional[mindspore.Tensor] = None,
    output_attentions: Optional[bool] = None,
    output_hidden_states: Optional[bool] = None,
    return_dict: Optional[bool] = True,
) -> Union[Tuple[mindspore.Tensor], QuestionAnsweringModelOutput]:
    r"""
    Args:
        start_positions (`mindspore.Tensor` of shape `(batch_size, sequence_length)`, *optional*):
            Labels for position (index) for computing the start_positions loss. Positions outside of the sequence
            are not taken into account for computing the loss.
        end_positions (`mindspore.Tensor` of shape `(batch_size, sequence_length)`, *optional*):
            Labels for position (index) for computing the end_positions loss. Positions outside of the sequence
            are not taken into account for computing the loss.
    """
    result = self.ernie_m(
        input_ids,
        # attention_mask=attention_mask,
        position_ids=position_ids,
        # head_mask=head_mask,
        inputs_embeds=inputs_embeds,
        output_attentions=output_attentions,
        output_hidden_states=output_hidden_states,
        return_dict=return_dict,
    )
    if return_dict:
        sequence_output = result.last_hidden_state
    else:
        sequence_output = result[0]

    start_logits = self.linear_start(sequence_output)
    start_logits = start_logits.squeeze(-1)
    start_prob = self.sigmoid(start_logits)
    end_logits = self.linear_end(sequence_output)
    end_logits = end_logits.squeeze(-1)
    end_prob = self.sigmoid(end_logits)

    total_loss = None
    if start_positions is not None and end_positions is not None:
        # If we are on multi-GPU, splitting adds a dimension
        if len(start_positions.shape) > 1 and start_positions.shape[-1] == 1:
            start_positions = start_positions.squeeze(-1)
        if len(end_positions.shape) > 1 and end_positions.shape[-1] == 1:
            end_positions = end_positions.squeeze(-1)
        # sometimes the start/end positions are outside our model inputs, we ignore these terms
        ignored_index = start_logits.shape[1]
        start_positions = start_positions.clamp(0, ignored_index)
        end_positions = end_positions.clamp(0, ignored_index)

        start_loss = F.binary_cross_entropy_with_logits(start_prob, start_positions)
        end_loss = F.binary_cross_entropy_with_logits(end_prob, end_positions)
        total_loss = (start_loss + end_loss) / 2

    if not return_dict:
        return tuple(
            i
            for i in [total_loss, start_prob, end_prob, result.hidden_states, result.attentions]
            if i is not None
        )

    return QuestionAnsweringModelOutput(
        loss=total_loss,
        start_logits=start_prob,
        end_logits=end_prob,
        hidden_states=result.hidden_states,
        attentions=result.attentions,
    )
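In the listing above, the loss function named `binary_cross_entropy_with_logits` receives `start_prob` and `end_prob`, which have already passed through `self.sigmoid`. A loss of that kind conventionally applies a sigmoid to its input itself, so the sigmoid would be applied twice; whether that is intended is not stated in the source. The scalar sketch below, written in plain Python with hypothetical helper names, only illustrates how the two values differ numerically.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bce_with_logits(z, target):
    """Scalar binary cross-entropy that applies the sigmoid to `z` internally."""
    p = sigmoid(z)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


logit, target = 4.0, 1.0

# Loss on the raw logit: a confident, correct prediction gives a small loss.
loss_on_logit = bce_with_logits(logit, target)

# Loss on the probability: the input is confined to (0, 1), so the inner
# sigmoid output stays between 0.5 and about 0.73 and the loss cannot get small.
loss_on_prob = bce_with_logits(sigmoid(logit), target)

print(round(loss_on_logit, 4), round(loss_on_prob, 4))
```

For a positive target the second value is bounded below by roughly 0.31 no matter how confident the model is, which flattens the training signal compared with the logit-based loss.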

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m` ¶

MindSpore ErnieM model.

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMAttention` ¶

Bases: Module

This class implements the attention block of the MSErnieM model: a self-attention layer (`MSErnieMSelfAttention`) with a configurable position embedding type, followed by an output projection (`out_proj`). It inherits from nn.Module. The 'prune_heads' method removes the attention heads at the given indices, and the 'forward' method computes the attention output from the input hidden states, optional masks, and other optional inputs.

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

class MSErnieMAttention(nn.Module):

    """
    This class represents an attention module for MSErnieM model, which includes self-attention mechanism and projection
    layers.
    It inherits from nn.Module and provides methods to initialize the attention module, prune attention heads, and perform
    attention computation.
    The attention module consists of self-attention mechanism with configurable position embedding type and projection
    layers for output transformation.
    The 'prune_heads' method allows pruning specific attention heads based on provided indices.
    The 'forward' method computes the attention output given input hidden states, optional masks, and other optional
    inputs.
    """
    def __init__(self, config, position_embedding_type=None):
        """
        Initializes an instance of the MSErnieMAttention class.

        Args:
            self: The instance of the class.
            config (object): An object that contains the configuration settings for the attention layer.
            position_embedding_type (str, optional): The type of position embedding to use. Defaults to None.

        Returns:
            None

        Raises:
            None
        """
        super().__init__()
        self.self_attn = MSErnieMSelfAttention(config, position_embedding_type=position_embedding_type)
        self.out_proj = nn.Linear(config.hidden_size, config.hidden_size)
        self.pruned_heads = set()

    def prune_heads(self, heads):
        """
        Prunes the given attention heads from the self-attention projections and the output projection.

        Args:
            self (object): The instance of the class.
            heads (list): A list of integers representing the indices of heads to be pruned from the attention mechanism.

        Returns:
            None: The module is modified in place. An empty 'heads' list is a no-op.

        Raises:
            TypeError: If the 'heads' parameter is not a list of integers.
            IndexError: If the indices in 'heads' exceed the available attention heads in the mechanism.
        """
        if len(heads) == 0:
            return
        heads, index = find_pruneable_heads_and_indices(
            heads, self.self_attn.num_attention_heads, self.self_attn.attention_head_size, self.pruned_heads
        )

        # Prune linear layers
        self.self_attn.q_proj = prune_linear_layer(self.self_attn.q_proj, index)
        self.self_attn.k_proj = prune_linear_layer(self.self_attn.k_proj, index)
        self.self_attn.v_proj = prune_linear_layer(self.self_attn.v_proj, index)
        self.out_proj = prune_linear_layer(self.out_proj, index, dim=1)

        # Update hyper params and store pruned heads
        self.self_attn.num_attention_heads = self.self_attn.num_attention_heads - len(heads)
        self.self_attn.all_head_size = self.self_attn.attention_head_size * self.self_attn.num_attention_heads
        self.pruned_heads = self.pruned_heads.union(heads)

    def forward(
        self,
        hidden_states: mindspore.Tensor,
        attention_mask: Optional[mindspore.Tensor] = None,
        head_mask: Optional[mindspore.Tensor] = None,
        encoder_hidden_states: Optional[mindspore.Tensor] = None,
        encoder_attention_mask: Optional[mindspore.Tensor] = None,
        past_key_value: Optional[Tuple[Tuple[mindspore.Tensor]]] = None,
        output_attentions: Optional[bool] = False,
    ) -> Tuple[mindspore.Tensor]:
        """
        Runs self-attention on the input and applies the output projection to the result.

        Args:
            self (MSErnieMAttention): The instance of the MSErnieMAttention class.
            hidden_states (mindspore.Tensor): The input hidden states of the model.
                Shape: (batch_size, seq_length, hidden_size).
            attention_mask (Optional[mindspore.Tensor], optional):
                The attention mask tensor, indicating which tokens should be attended to and which should not.
                Shape: (batch_size, seq_length). Defaults to None.
            head_mask (Optional[mindspore.Tensor], optional):
                The head mask tensor, indicating which heads should be masked out.
                Shape: (num_heads, seq_length, seq_length). Defaults to None.
            encoder_hidden_states (Optional[mindspore.Tensor], optional):
                The hidden states of the encoder. Shape: (batch_size, seq_length, hidden_size). Defaults to None.
            encoder_attention_mask (Optional[mindspore.Tensor], optional):
                The attention mask tensor for the encoder, indicating which tokens should be attended to and which
                should not. Shape: (batch_size, seq_length). Defaults to None.
            past_key_value (Optional[Tuple[Tuple[mindspore.Tensor]]], optional):
                The tuple of cached key and value tensors from previous steps.
                Shape: ((batch_size, num_heads, seq_length, head_size),
                (batch_size, num_heads, seq_length, head_size)). Defaults to None.
            output_attentions (Optional[bool], optional): Whether to output attention weights. Defaults to False.

        Returns:
            Tuple[mindspore.Tensor]: A tuple containing the attention output tensor and other optional outputs.

        Raises:
            None.
        """
        self_outputs = self.self_attn(
            hidden_states,
            attention_mask,
            head_mask,
            encoder_hidden_states,
            encoder_attention_mask,
            past_key_value,
            output_attentions,
        )
        attention_output = self.out_proj(self_outputs[0])
        outputs = (attention_output,) + self_outputs[1:]  # add attentions if we output them
        return outputs

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMAttention.__init__(config, position_embedding_type=None)` ¶

Initializes an instance of the MSErnieMAttention class.

PARAMETER	DESCRIPTION
`self`	The instance of the class.
`config`	An object that contains the configuration settings for the attention layer. TYPE: `object`
`position_embedding_type`	The type of position embedding to use. Defaults to None. TYPE: `str` DEFAULT: `None`

RETURNS	DESCRIPTION
	None

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

def __init__(self, config, position_embedding_type=None):
    """
    Initializes an instance of the MSErnieMAttention class.

    Args:
        self: The instance of the class.
        config (object): An object that contains the configuration settings for the attention layer.
        position_embedding_type (str, optional): The type of position embedding to use. Defaults to None.

    Returns:
        None

    Raises:
        None
    """
    super().__init__()
    self.self_attn = MSErnieMSelfAttention(config, position_embedding_type=position_embedding_type)
    self.out_proj = nn.Linear(config.hidden_size, config.hidden_size)
    self.pruned_heads = set()

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMAttention.forward(hidden_states, attention_mask=None, head_mask=None, encoder_hidden_states=None, encoder_attention_mask=None, past_key_value=None, output_attentions=False)` ¶

Applies self-attention to the hidden states and projects the result through the output layer.

PARAMETER	DESCRIPTION
`self`	The instance of the MSErnieMAttention class. TYPE: `MSErnieMAttention`
`hidden_states`	The input hidden states of the model. Shape: (batch_size, seq_length, hidden_size). TYPE: `Tensor`
`attention_mask`	The attention mask tensor, indicating which tokens should be attended to and which should not. Shape: (batch_size, seq_length). Defaults to None. TYPE: `Optional[Tensor]` DEFAULT: `None`
`head_mask`	The head mask tensor, indicating which heads should be masked out. Shape: (num_heads, seq_length, seq_length). Defaults to None. TYPE: `Optional[Tensor]` DEFAULT: `None`
`encoder_hidden_states`	The hidden states of the encoder. Shape: (batch_size, seq_length, hidden_size). Defaults to None. TYPE: `Optional[Tensor]` DEFAULT: `None`
`encoder_attention_mask`	The attention mask tensor for the encoder, indicating which tokens should be attended to and which should not. Shape: (batch_size, seq_length). Defaults to None. TYPE: `Optional[Tensor]` DEFAULT: `None`
`past_key_value`	The tuple of cached key and value tensors from previous steps. Shape: ((batch_size, num_heads, seq_length, hidden_size), (batch_size, num_heads, seq_length, hidden_size)). Defaults to None. TYPE: `Optional[Tuple[Tuple[Tensor]]]` DEFAULT: `None`
`output_attentions`	Whether to output attention weights. Defaults to False. TYPE: `Optional[bool]` DEFAULT: `False`

RETURNS	DESCRIPTION
`Tuple[Tensor]`	A tuple whose first element is the projected attention output, followed by any further outputs of the self-attention layer (for example, the attention weights).

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

def forward(
    self,
    hidden_states: mindspore.Tensor,
    attention_mask: Optional[mindspore.Tensor] = None,
    head_mask: Optional[mindspore.Tensor] = None,
    encoder_hidden_states: Optional[mindspore.Tensor] = None,
    encoder_attention_mask: Optional[mindspore.Tensor] = None,
    past_key_value: Optional[Tuple[Tuple[mindspore.Tensor]]] = None,
    output_attentions: Optional[bool] = False,
) -> Tuple[mindspore.Tensor]:
    """
    Applies self-attention to the hidden states and projects the result through the output layer.

    Args:
        self (MSErnieMAttention): The instance of the MSErnieMAttention class.
        hidden_states (mindspore.Tensor): The input hidden states of the model.
            Shape: (batch_size, seq_length, hidden_size).
        attention_mask (Optional[mindspore.Tensor], optional):
            The attention mask tensor, indicating which tokens should be attended to and which should not.
            Shape: (batch_size, seq_length). Defaults to None.
        head_mask (Optional[mindspore.Tensor], optional):
            The head mask tensor, indicating which heads should be masked out.
            Shape: (num_heads, seq_length, seq_length). Defaults to None.
        encoder_hidden_states (Optional[mindspore.Tensor], optional):
            The hidden states of the encoder. Shape: (batch_size, seq_length, hidden_size). Defaults to None.
        encoder_attention_mask (Optional[mindspore.Tensor], optional):
            The attention mask tensor for the encoder, indicating which tokens should be attended to and which
            should not. Shape: (batch_size, seq_length). Defaults to None.
        past_key_value (Optional[Tuple[Tuple[mindspore.Tensor]]], optional):
            The tuple of cached key and value tensors from previous steps.
            Shape: ((batch_size, num_heads, seq_length, hidden_size),
            (batch_size, num_heads, seq_length, hidden_size)). Defaults to None.
        output_attentions (Optional[bool], optional): Whether to output attention weights. Defaults to False.

    Returns:
        Tuple[mindspore.Tensor]: A tuple containing the attention output tensor and other optional outputs.

    Raises:
        None.
    """
    self_outputs = self.self_attn(
        hidden_states,
        attention_mask,
        head_mask,
        encoder_hidden_states,
        encoder_attention_mask,
        past_key_value,
        output_attentions,
    )
    attention_output = self.out_proj(self_outputs[0])
    outputs = (attention_output,) + self_outputs[1:]  # add attentions if we output them
    return outputs
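
The way `forward` reassembles its result can be illustrated without MindSpore. The following is a minimal sketch in plain Python: `fake_self_attn` and `fake_out_proj` are hypothetical stand-ins for `self.self_attn` and `self.out_proj`, not MindNLP code. Only element 0 of the self-attention output is projected; everything after it is passed through unchanged.

```python
from typing import Callable, Sequence, Tuple


def attention_wrapper(
    self_attn: Callable[..., Tuple],
    out_proj: Callable,
    hidden_states: Sequence[float],
    output_attentions: bool = False,
) -> Tuple:
    """Project element 0 of the self-attention output; pass the rest through."""
    self_outputs = self_attn(hidden_states, output_attentions)
    attention_output = out_proj(self_outputs[0])
    return (attention_output,) + self_outputs[1:]


# Hypothetical stand-ins for the real sub-modules (assumptions for illustration).
def fake_self_attn(hidden_states, output_attentions):
    context = [2.0 * x for x in hidden_states]
    weights = "attention-weights"
    return (context, weights) if output_attentions else (context,)


def fake_out_proj(context):
    return [x + 1.0 for x in context]


without_weights = attention_wrapper(fake_self_attn, fake_out_proj, [1.0, 2.0])
with_weights = attention_wrapper(
    fake_self_attn, fake_out_proj, [1.0, 2.0], output_attentions=True
)
```

In this sketch the returned tuple has one element when attention weights are not requested and two when they are.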

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMAttention.prune_heads(heads)` ¶

Prunes the given attention heads from the attention mechanism.

PARAMETER	DESCRIPTION
`self`	The instance of the class. TYPE: `object`
`heads`	A list of integers representing the indices of heads to be pruned from the attention mechanism. TYPE: `list`

RETURNS	DESCRIPTION
`None`	This method does not return anything explicitly, as it operates by mutating the internal state of the class.

Note

The method body raises no exceptions explicitly. If `heads` is empty, it returns immediately without modifying the module.

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

def prune_heads(self, heads):
    """
    Prunes the given attention heads from the attention mechanism.

    Args:
        self (object): The instance of the class.
        heads (list): A list of integers representing the indices of heads to be pruned from the attention mechanism.

    Returns:
        None: This method does not return anything explicitly, as it operates by mutating the internal state of the class.

    Raises:
        None.
    """
    if len(heads) == 0:
        return
    heads, index = find_pruneable_heads_and_indices(
        heads, self.self_attn.num_attention_heads, self.self_attn.attention_head_size, self.pruned_heads
    )

    # Prune linear layers
    self.self_attn.q_proj = prune_linear_layer(self.self_attn.q_proj, index)
    self.self_attn.k_proj = prune_linear_layer(self.self_attn.k_proj, index)
    self.self_attn.v_proj = prune_linear_layer(self.self_attn.v_proj, index)
    self.out_proj = prune_linear_layer(self.out_proj, index, dim=1)

    # Update hyper params and store pruned heads
    self.self_attn.num_attention_heads = self.self_attn.num_attention_heads - len(heads)
    self.self_attn.all_head_size = self.self_attn.attention_head_size * self.self_attn.num_attention_heads
    self.pruned_heads = self.pruned_heads.union(heads)
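
To make the index bookkeeping concrete, here is a plain-Python sketch of what a helper such as `find_pruneable_heads_and_indices` is assumed to compute, judging from its call site above: the set of heads still to prune, and the flat feature indices that survive. The function name `pruneable_heads_and_kept_indices` and its list-based implementation are illustrative assumptions, not the library's code.

```python
from typing import Iterable, List, Set, Tuple


def pruneable_heads_and_kept_indices(
    heads: Iterable[int],
    n_heads: int,
    head_size: int,
    already_pruned: Set[int],
) -> Tuple[Set[int], List[int]]:
    """Return the heads still to prune and the flat feature indices to keep.

    `n_heads` is the number of heads currently present, i.e. after any
    earlier pruning, while `heads` uses the original head numbering.
    """
    to_prune = set(heads) - already_pruned
    # One keep-flag per feature, laid out as n_heads rows of head_size columns.
    keep = [[True] * head_size for _ in range(n_heads)]
    for head in to_prune:
        # Earlier prunings shifted later heads down, so adjust the row number.
        row = head - sum(1 for h in already_pruned if h < head)
        keep[row] = [False] * head_size
    flat = [flag for row in keep for flag in row]
    index = [i for i, flag in enumerate(flat) if flag]
    return to_prune, index


# First round: 4 heads of size 2, prune heads 1 and 3.
pruned, index = pruneable_heads_and_kept_indices(
    [1, 3], n_heads=4, head_size=2, already_pruned=set()
)

# Second round: 2 heads remain; prune original head 2 (head 1 is requested
# again but is skipped because it was already pruned).
pruned_again, index_again = pruneable_heads_and_kept_indices(
    [1, 2], n_heads=2, head_size=2, already_pruned={1, 3}
)
```

The kept indices select rows of the query, key, and value projections, and columns (`dim=1`) of the output projection, which is why `prune_heads` passes the same `index` to all four `prune_linear_layer` calls.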

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMEmbeddings` ¶

Bases: Module

Construct the embeddings from word and position embeddings.

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

class MSErnieMEmbeddings(nn.Module):
    """Construct the embeddings from word and position embeddings."""
    def __init__(self, config):
        """
        Initializes an instance of the MSErnieMEmbeddings class.

        Args:
            self: The object instance.
            config (object):
                A configuration object containing various parameters.

                - hidden_size (int): The size of the hidden state.
                - vocab_size (int): The size of the vocabulary.
                - pad_token_id (int): The ID of the padding token.
                - max_position_embeddings (int): The maximum number of positional embeddings.
                - layer_norm_eps (float): The epsilon value for layer normalization.
                - hidden_dropout_prob (float): The dropout probability for the hidden state.

        Returns:
            None

        Raises:
            None
        """
        super().__init__()
        self.hidden_size = config.hidden_size
        self.word_embeddings = nn.Embedding(config.vocab_size, config.hidden_size, padding_idx=config.pad_token_id)
        self.position_embeddings = nn.Embedding(
            config.max_position_embeddings, config.hidden_size, padding_idx=config.pad_token_id
        )
        self.layer_norm = nn.LayerNorm([config.hidden_size], eps=config.layer_norm_eps)
        self.dropout = nn.Dropout(p=config.hidden_dropout_prob)
        self.padding_idx = config.pad_token_id

    def forward(
        self,
        input_ids: Optional[mindspore.Tensor] = None,
        position_ids: Optional[mindspore.Tensor] = None,
        inputs_embeds: Optional[mindspore.Tensor] = None,
        past_key_values_length: int = 0,
    ) -> mindspore.Tensor:
        """
        Constructs the embeddings for MSErnieM model.

        Args:
            self (MSErnieMEmbeddings): The MSErnieMEmbeddings instance.
            input_ids (Optional[mindspore.Tensor]):
                The input tensor containing the indices of input tokens. Default is None.
            position_ids (Optional[mindspore.Tensor]):
                The input tensor containing the indices of position tokens. Default is None.
            inputs_embeds (Optional[mindspore.Tensor]):
                The input tensor containing the embeddings of input tokens. Default is None.
            past_key_values_length (int): The length of past key values. Default is 0.

        Returns:
            mindspore.Tensor: The forwarded embeddings tensor.

        Raises:
            None.
        """
        if inputs_embeds is None:
            inputs_embeds = self.word_embeddings(input_ids)
        if position_ids is None:
            input_shape = inputs_embeds.shape[:-1]
            ones = ops.ones(input_shape, dtype=mindspore.int64)
            seq_length = ops.cumsum(ones, dim=1)
            position_ids = seq_length - ones

            if past_key_values_length > 0:
                position_ids = position_ids + past_key_values_length
        # to mimic paddlenlp implementation
        position_ids += 2
        position_embeddings = self.position_embeddings(position_ids)
        embeddings = inputs_embeds + position_embeddings
        embeddings = self.layer_norm(embeddings)
        embeddings = self.dropout(embeddings)

        return embeddings

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMEmbeddings.__init__(config)` ¶

Initializes an instance of the MSErnieMEmbeddings class.

PARAMETER	DESCRIPTION
`self`	The object instance.
`config`	A configuration object containing various parameters: hidden_size (int): The size of the hidden state; vocab_size (int): The size of the vocabulary; pad_token_id (int): The ID of the padding token; max_position_embeddings (int): The maximum number of positional embeddings; layer_norm_eps (float): The epsilon value for layer normalization; hidden_dropout_prob (float): The dropout probability for the hidden state. TYPE: `object`

RETURNS	DESCRIPTION
	None

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

def __init__(self, config):
    """
    Initializes an instance of the MSErnieMEmbeddings class.

    Args:
        self: The object instance.
        config (object):
            A configuration object containing various parameters.

            - hidden_size (int): The size of the hidden state.
            - vocab_size (int): The size of the vocabulary.
            - pad_token_id (int): The ID of the padding token.
            - max_position_embeddings (int): The maximum number of positional embeddings.
            - layer_norm_eps (float): The epsilon value for layer normalization.
            - hidden_dropout_prob (float): The dropout probability for the hidden state.

    Returns:
        None

    Raises:
        None
    """
    super().__init__()
    self.hidden_size = config.hidden_size
    self.word_embeddings = nn.Embedding(config.vocab_size, config.hidden_size, padding_idx=config.pad_token_id)
    self.position_embeddings = nn.Embedding(
        config.max_position_embeddings, config.hidden_size, padding_idx=config.pad_token_id
    )
    self.layer_norm = nn.LayerNorm([config.hidden_size], eps=config.layer_norm_eps)
    self.dropout = nn.Dropout(p=config.hidden_dropout_prob)
    self.padding_idx = config.pad_token_id

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMEmbeddings.forward(input_ids=None, position_ids=None, inputs_embeds=None, past_key_values_length=0)` ¶

Computes the input embeddings for the MSErnieM model by summing word and position embeddings, then applying layer normalization and dropout.

PARAMETER	DESCRIPTION
`self`	The MSErnieMEmbeddings instance. TYPE: `MSErnieMEmbeddings`
`input_ids`	The input tensor containing the indices of input tokens. Default is None. TYPE: `Optional[Tensor]` DEFAULT: `None`
`position_ids`	The position index of each input token. If None, positions are generated from the sequence length and shifted by `past_key_values_length`. Default is None. TYPE: `Optional[Tensor]` DEFAULT: `None`
`inputs_embeds`	The input tensor containing the embeddings of input tokens. Default is None. TYPE: `Optional[Tensor]` DEFAULT: `None`
`past_key_values_length`	The length of past key values. Default is 0. TYPE: `int` DEFAULT: `0`

RETURNS	DESCRIPTION
`Tensor`	The computed embeddings tensor.

Note

The method body performs no explicit validation and raises no exceptions itself. At least one of `input_ids` or `inputs_embeds` must be provided, since the word embeddings are looked up from `input_ids` whenever `inputs_embeds` is None.

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

def forward(
    self,
    input_ids: Optional[mindspore.Tensor] = None,
    position_ids: Optional[mindspore.Tensor] = None,
    inputs_embeds: Optional[mindspore.Tensor] = None,
    past_key_values_length: int = 0,
) -> mindspore.Tensor:
    """
    Constructs the embeddings for MSErnieM model.

    Args:
        self (MSErnieMEmbeddings): The MSErnieMEmbeddings instance.
        input_ids (Optional[mindspore.Tensor]):
            The input tensor containing the indices of input tokens. Default is None.
        position_ids (Optional[mindspore.Tensor]):
            The input tensor containing the indices of position tokens. Default is None.
        inputs_embeds (Optional[mindspore.Tensor]):
            The input tensor containing the embeddings of input tokens. Default is None.
        past_key_values_length (int): The length of past key values. Default is 0.

    Returns:
        mindspore.Tensor: The forwarded embeddings tensor.

    Raises:
        None.
    """
    if inputs_embeds is None:
        inputs_embeds = self.word_embeddings(input_ids)
    if position_ids is None:
        input_shape = inputs_embeds.shape[:-1]
        ones = ops.ones(input_shape, dtype=mindspore.int64)
        seq_length = ops.cumsum(ones, dim=1)
        position_ids = seq_length - ones

        if past_key_values_length > 0:
            position_ids = position_ids + past_key_values_length
    # to mimic paddlenlp implementation
    position_ids += 2
    position_embeddings = self.position_embeddings(position_ids)
    embeddings = inputs_embeds + position_embeddings
    embeddings = self.layer_norm(embeddings)
    embeddings = self.dropout(embeddings)

    return embeddings
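
The default position IDs are derived with a cumulative sum over a tensor of ones, then shifted. The sketch below reproduces that arithmetic with plain Python lists instead of MindSpore tensors; `default_position_ids` is a hypothetical helper name introduced only for illustration.

```python
from itertools import accumulate
from typing import List


def default_position_ids(
    batch_size: int, seq_length: int, past_key_values_length: int = 0
) -> List[List[int]]:
    """Mirror the position-ID arithmetic of `forward` with nested lists."""
    ones = [[1] * seq_length for _ in range(batch_size)]
    # Cumulative sum along the sequence axis gives 1, 2, 3, ...
    cumulative = [list(accumulate(row)) for row in ones]
    # Subtracting the ones turns that into zero-based positions 0, 1, 2, ...
    position_ids = [
        [c - o for c, o in zip(c_row, o_row)]
        for c_row, o_row in zip(cumulative, ones)
    ]
    if past_key_values_length > 0:
        position_ids = [[p + past_key_values_length for p in row] for row in position_ids]
    # Fixed offset of 2, applied "to mimic paddlenlp implementation" per the source.
    return [[p + 2 for p in row] for row in position_ids]


ids = default_position_ids(batch_size=2, seq_length=3)
ids_cached = default_position_ids(batch_size=1, seq_length=2, past_key_values_length=5)
```

Because of the fixed offset, the largest index looked up in the position embedding table is `seq_length + past_key_values_length + 1`, so the usable sequence length is presumably two less than `max_position_embeddings`.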

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMEncoder` ¶

Bases: Module

This class represents an MSErnieMEncoder, which is a multi-layer transformer-based encoder model for natural language processing tasks.

The MSErnieMEncoder inherits from the nn.Module class and is designed to process input embeddings and generate hidden states, attentions, and last hidden state output.

ATTRIBUTE	DESCRIPTION
`config`	The configuration object that contains the model's hyperparameters and settings. TYPE: `object`
`layers`	A list of MSErnieMEncoderLayer instances that make up the layers of the encoder. TYPE: `ModuleList`

METHOD	DESCRIPTION
`__init__`	Initializes a new MSErnieMEncoder instance with the given configuration.
`forward`	Constructs the MSErnieMEncoder model by processing the input embeddings and generating the desired outputs.

Arguments of `forward`:

- input_embeds (mindspore.Tensor): The input embeddings for the model.
- attention_mask (Optional[mindspore.Tensor], optional): The attention mask tensor to mask certain positions. Defaults to None.
- head_mask (Optional[mindspore.Tensor], optional): The head mask tensor to mask certain heads. Defaults to None.
- past_key_values (Optional[Tuple[Tuple[mindspore.Tensor]]], optional): The cached key-value tensors from previous decoding steps. Defaults to None.
- output_attentions (Optional[bool], optional): Whether to output attention weights. Defaults to False.
- output_hidden_states (Optional[bool], optional): Whether to output hidden states. Defaults to False.

Return value of `forward`:

- Tuple[mindspore.Tensor]: A tuple containing the last hidden state, hidden states, and attentions (if enabled).

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

class MSErnieMEncoder(nn.Module):

    """
    This class represents an MSErnieMEncoder, which is a multi-layer transformer-based encoder model for
    natural language processing tasks.

    The MSErnieMEncoder inherits from the nn.Module class and is designed to process input embeddings and generate
    hidden states, attentions, and last hidden state output.

    Attributes:
        config (object): The configuration object that contains the model's hyperparameters and settings.
        layers (nn.ModuleList): A list of MSErnieMEncoderLayer instances that make up the layers of the encoder.

    Methods:
        __init__(self, config):
            Initializes a new MSErnieMEncoder instance with the given configuration.

        forward(self, input_embeds, attention_mask=None, head_mask=None, past_key_values=None, output_attentions=False, output_hidden_states=False):
            Constructs the MSErnieMEncoder model by processing the input embeddings and generating the desired outputs.

            Args:

            - input_embeds (mindspore.Tensor): The input embeddings for the model.
            - attention_mask (Optional[mindspore.Tensor], optional): The attention mask tensor to mask
            certain positions. Defaults to None.
            - head_mask (Optional[mindspore.Tensor], optional): The head mask tensor to mask certain heads.
            Defaults to None.
            - past_key_values (Optional[Tuple[Tuple[mindspore.Tensor]]], optional): The cached key-value tensors
            from previous decoding steps. Defaults to None.
            - output_attentions (Optional[bool], optional): Whether to output attention weights. Defaults to False.
            - output_hidden_states (Optional[bool], optional): Whether to output hidden states. Defaults to False.

            Returns:

            - Tuple[mindspore.Tensor]: A tuple containing the last hidden state, hidden states, and attentions (if enabled).

        """
    def __init__(self, config):
        """
        Initializes the MSErnieMEncoder class.

        Args:
            self: The object itself.
            config (object): An object containing the configuration parameters for the MSErnieMEncoder.
                The config object should have the following attributes:

                - num_hidden_layers (int): The number of hidden layers in the encoder.
                - other attributes specific to the MSErnieMEncoderLayer.

        Returns:
            None.

        Raises:
            None.
        """
        super().__init__()
        self.config = config
        self.layers = nn.ModuleList([MSErnieMEncoderLayer(config) for _ in range(config.num_hidden_layers)])

    def forward(
        self,
        input_embeds: mindspore.Tensor,
        attention_mask: Optional[mindspore.Tensor] = None,
        head_mask: Optional[mindspore.Tensor] = None,
        past_key_values: Optional[Tuple[Tuple[mindspore.Tensor]]] = None,
        output_attentions: Optional[bool] = False,
        output_hidden_states: Optional[bool] = False,
    ) -> Tuple[mindspore.Tensor]:
        """
        This method forwards the MSErnieMEncoder by processing the input embeddings and applying attention masks and
        head masks if provided.

        Args:
            self: The instance of the MSErnieMEncoder class.
            input_embeds (mindspore.Tensor): The input embeddings to be processed by the encoder.
            attention_mask (Optional[mindspore.Tensor]): An optional tensor representing the attention mask.
                If provided, it restricts the attention of the encoder.
            head_mask (Optional[mindspore.Tensor]): An optional tensor representing the head mask.
                If provided, it restricts the attention heads of the encoder.
            past_key_values (Optional[Tuple[Tuple[mindspore.Tensor]]]): An optional tuple of past key values,
                if provided, it allows the encoder to reuse previously computed key value states.
            output_attentions (Optional[bool]): An optional boolean indicating whether to output attentions.
                Default is False.
            output_hidden_states (Optional[bool]): An optional boolean indicating whether to output hidden states.
                Default is False.

        Returns:
            Tuple[mindspore.Tensor]: A tuple containing the processed output tensor.

        Raises:
            None.
        """
        hidden_states = () if output_hidden_states else None
        attentions = () if output_attentions else None

        output = input_embeds
        if output_hidden_states:
            hidden_states = hidden_states + (output,)
        for i, layer in enumerate(self.layers):
            layer_head_mask = head_mask[i] if head_mask is not None else None
            past_key_value = past_key_values[i] if past_key_values is not None else None

            output, opt_attn_weights = layer(
                hidden_states=output,
                attention_mask=attention_mask,
                head_mask=layer_head_mask,
                past_key_value=past_key_value,
            )

            if output_hidden_states:
                hidden_states = hidden_states + (output,)
            if output_attentions:
                attentions = attentions + (opt_attn_weights,)

        last_hidden_state = output
        return tuple(v for v in [last_hidden_state, hidden_states, attentions] if v is not None)

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMEncoder.__init__(config)` ¶

Initializes the MSErnieMEncoder class.

PARAMETER	DESCRIPTION
`self`	The object itself.
`config`	An object containing the configuration parameters for the MSErnieMEncoder. It must provide `num_hidden_layers` (int), the number of hidden layers in the encoder, along with the attributes required by MSErnieMEncoderLayer. TYPE: `object`

RETURNS	DESCRIPTION
	None.

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

def __init__(self, config):
    """
    Initializes the MSErnieMEncoder class.

    Args:
        self: The object itself.
        config (object): An object containing the configuration parameters for the MSErnieMEncoder.
            The config object should have the following attributes:

            - num_hidden_layers (int): The number of hidden layers in the encoder.
            - other attributes specific to the MSErnieMEncoderLayer.

    Returns:
        None.

    Raises:
        None.
    """
    super().__init__()
    self.config = config
    self.layers = nn.ModuleList([MSErnieMEncoderLayer(config) for _ in range(config.num_hidden_layers)])

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMEncoder.forward(input_embeds, attention_mask=None, head_mask=None, past_key_values=None, output_attentions=False, output_hidden_states=False)` ¶

Passes the input embeddings through each encoder layer in turn, applying the attention mask and the per-layer head masks if provided.

PARAMETER	DESCRIPTION
`self`	The instance of the MSErnieMEncoder class.
`input_embeds`	The input embeddings to be processed by the encoder. TYPE: `Tensor`
`attention_mask`	An optional tensor representing the attention mask. If provided, it restricts the attention of the encoder. TYPE: `Optional[Tensor]` DEFAULT: `None`
`head_mask`	An optional tensor representing the head mask. If provided, it restricts the attention heads of the encoder. TYPE: `Optional[Tensor]` DEFAULT: `None`
`past_key_values`	An optional tuple of past key values, if provided, it allows the encoder to reuse previously computed key value states. TYPE: `Optional[Tuple[Tuple[Tensor]]]` DEFAULT: `None`
`output_attentions`	An optional boolean indicating whether to output attentions. Default is False. TYPE: `Optional[bool]` DEFAULT: `False`
`output_hidden_states`	An optional boolean indicating whether to output hidden states. Default is False. TYPE: `Optional[bool]` DEFAULT: `False`

RETURNS	DESCRIPTION
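`Tuple[Tensor]`	A tuple containing the last hidden state, followed by the tuple of hidden states (the input embeddings plus the output of every layer) if `output_hidden_states` is True, and the tuple of per-layer attention weights if `output_attentions` is True.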

Note

The method body performs no explicit type checks and raises no exceptions itself.

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

def forward(
    self,
    input_embeds: mindspore.Tensor,
    attention_mask: Optional[mindspore.Tensor] = None,
    head_mask: Optional[mindspore.Tensor] = None,
    past_key_values: Optional[Tuple[Tuple[mindspore.Tensor]]] = None,
    output_attentions: Optional[bool] = False,
    output_hidden_states: Optional[bool] = False,
) -> Tuple[mindspore.Tensor]:
    """
    This method forwards the MSErnieMEncoder by processing the input embeddings and applying attention masks and
    head masks if provided.

    Args:
        self: The instance of the MSErnieMEncoder class.
        input_embeds (mindspore.Tensor): The input embeddings to be processed by the encoder.
        attention_mask (Optional[mindspore.Tensor]): An optional tensor representing the attention mask.
            If provided, it restricts the attention of the encoder.
        head_mask (Optional[mindspore.Tensor]): An optional tensor representing the head mask.
            If provided, it restricts the attention heads of the encoder.
        past_key_values (Optional[Tuple[Tuple[mindspore.Tensor]]]): An optional tuple of past key values,
            if provided, it allows the encoder to reuse previously computed key value states.
        output_attentions (Optional[bool]): An optional boolean indicating whether to output attentions.
            Default is False.
        output_hidden_states (Optional[bool]): An optional boolean indicating whether to output hidden states.
            Default is False.

    Returns:
        Tuple[mindspore.Tensor]: A tuple containing the processed output tensor.

    Raises:
        None.
    """
    hidden_states = () if output_hidden_states else None
    attentions = () if output_attentions else None

    output = input_embeds
    if output_hidden_states:
        hidden_states = hidden_states + (output,)
    for i, layer in enumerate(self.layers):
        layer_head_mask = head_mask[i] if head_mask is not None else None
        past_key_value = past_key_values[i] if past_key_values is not None else None

        output, opt_attn_weights = layer(
            hidden_states=output,
            attention_mask=attention_mask,
            head_mask=layer_head_mask,
            past_key_value=past_key_value,
        )

        if output_hidden_states:
            hidden_states = hidden_states + (output,)
        if output_attentions:
            attentions = attentions + (opt_attn_weights,)

    last_hidden_state = output
    return tuple(v for v in [last_hidden_state, hidden_states, attentions] if v is not None)
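
The shape of the returned tuple depends on both flags, which is easy to get wrong when indexing the result. The following plain-Python sketch mirrors the control flow of `forward` under simplifying assumptions: `run_encoder` and `make_layer` are hypothetical names, and each stand-in layer is a function returning an `(output, attention_weights)` pair rather than a real `MSErnieMEncoderLayer`.

```python
from typing import Callable, List, Optional, Tuple


def run_encoder(
    layers: List[Callable],
    input_embeds,
    output_attentions: bool = False,
    output_hidden_states: bool = False,
) -> Tuple:
    """Collect optional per-layer outputs the same way the encoder does."""
    hidden_states: Optional[Tuple] = () if output_hidden_states else None
    attentions: Optional[Tuple] = () if output_attentions else None

    output = input_embeds
    if output_hidden_states:
        hidden_states = hidden_states + (output,)  # input embeddings come first
    for layer in layers:
        output, attn_weights = layer(output)
        if output_hidden_states:
            hidden_states = hidden_states + (output,)
        if output_attentions:
            attentions = attentions + (attn_weights,)

    # Entries that were not requested are dropped, so later items shift left.
    return tuple(v for v in [output, hidden_states, attentions] if v is not None)


def make_layer(i: int) -> Callable:
    """Hypothetical stand-in layer: adds 1 and returns a labelled 'attention'."""
    return lambda x: (x + 1, f"attn-{i}")


layers = [make_layer(i) for i in range(3)]
only_last = run_encoder(layers, 0)
attn_only = run_encoder(layers, 0, output_attentions=True)
everything = run_encoder(layers, 0, output_attentions=True, output_hidden_states=True)
```

Note that the attention weights sit at index 1 when hidden states are not requested and at index 2 when they are, and that the hidden-states tuple holds one more entry than there are layers.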

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMEncoderLayer` ¶

Bases: Module

This class represents an encoder layer for the MSErnieM model. It includes self-attention, linear transformations, dropout, layer normalization, and activation functions for processing input hidden states.

The MSErnieMEncoderLayer class inherits from nn.Module and consists of an `__init__` method for initializing the layer's components and a forward method for performing the encoding operations on input tensors.

ATTRIBUTE	DESCRIPTION
`self_attn`	Self-attention mechanism for capturing dependencies within the input hidden states. TYPE: `MSErnieMAttention`
`linear1`	Linear transformation layer from hidden size to intermediate size. TYPE: `Linear`
`dropout`	Dropout layer applied to the activation output inside the feed-forward block. TYPE: `Dropout`
`linear2`	Linear transformation layer from intermediate size back to hidden size. TYPE: `Linear`
`norm1`	Layer normalization applied after the self-attention residual connection. TYPE: `LayerNorm`
`norm2`	Layer normalization applied after the feed-forward residual connection. TYPE: `LayerNorm`
`dropout1`	Dropout layer applied to the self-attention output before the first residual connection. TYPE: `Dropout`
`dropout2`	Dropout layer applied to the output of `linear2` before the second residual connection. TYPE: `Dropout`
`activation`	Activation function applied to the hidden states. TYPE: `function`

METHOD	DESCRIPTION
`__init__`	Constructor method for initializing the encoder layer with provided configuration settings.
`forward`	Method for processing input hidden states through the encoder layer's components.

The forward method performs a series of operations on the input hidden states, including self-attention, linear transformations, activation functions, dropout, and layer normalization. It returns the processed hidden states and optional attention outputs if specified.

Note

The MSErnieMEncoderLayer class is designed to be used within the MSErnieM model architecture for encoding input sequences.
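
The order of operations in `forward` (sublayer, residual add, then layer normalization, applied once for attention and once for the feed-forward block) can be sketched in plain Python. This is only an illustration under stated assumptions: it works on a single vector, omits all three dropout layers as they would be inactive at inference, uses a layer norm without learned scale or bias, and the `toy_attention` and `toy_feed_forward` functions are hypothetical stand-ins for the real sublayers.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (no learned scale or bias)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def encoder_layer(x, self_attn, feed_forward):
    """Post-LN ordering: sublayer -> residual add -> layer norm, applied twice."""
    # Attention block (dropout1 omitted).
    attn_out = self_attn(x)
    x = layer_norm([r + a for r, a in zip(x, attn_out)])
    # Feed-forward block: linear1 -> activation -> linear2 (dropout and dropout2 omitted).
    ff_out = feed_forward(x)
    return layer_norm([r + f for r, f in zip(x, ff_out)])


def toy_attention(x):
    """Hypothetical stand-in: every position receives the mean of the inputs."""
    mean = sum(x) / len(x)
    return [mean] * len(x)


def toy_feed_forward(x):
    """Hypothetical stand-in: scale, then ReLU."""
    return [max(0.0, 2.0 * v) for v in x]


out = encoder_layer([1.0, 2.0, 3.0, 4.0], toy_attention, toy_feed_forward)
print(out)
```

Normalizing after each residual add, rather than before the sublayer, is what distinguishes this post-LN layout from pre-LN variants.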

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

class MSErnieMEncoderLayer(nn.Module):

    """
    This class represents an encoder layer for the MSErnieM model. It includes self-attention, linear transformations,
    dropout, layer normalization, and activation functions for processing input hidden states.

    The MSErnieMEncoderLayer class inherits from nn.Module and consists of an __init__ method for initializing the
    layer's components and a forward method for performing the encoding operations on input tensors.

    Attributes:
        self_attn (MSErnieMAttention): Self-attention mechanism for capturing dependencies within the input hidden states.
        linear1 (nn.Linear): Linear transformation layer from hidden size to intermediate size.
        dropout (nn.Dropout): Dropout layer applied to the activation output inside the feed-forward block.
        linear2 (nn.Linear): Linear transformation layer from intermediate size back to hidden size.
        norm1 (nn.LayerNorm): Layer normalization applied after the self-attention residual connection.
        norm2 (nn.LayerNorm): Layer normalization applied after the feed-forward residual connection.
        dropout1 (nn.Dropout): Dropout layer applied to the self-attention output before the first residual connection.
        dropout2 (nn.Dropout): Dropout layer applied to the output of linear2 before the second residual connection.
        activation (function): Activation function applied to the hidden states.

    Methods:
        __init__: Constructor method for initializing the encoder layer with provided configuration settings.
        forward: Method for processing input hidden states through the encoder layer's components.

    The forward method performs a series of operations on the input hidden states, including self-attention,
    linear transformations, activation functions, dropout, and layer normalization. It returns the processed hidden
    states and optional attention outputs if specified.

    Note:
        The MSErnieMEncoderLayer class is designed to be used within the MSErnieM model architecture for encoding input sequences.
    """
    def __init__(self, config):
        """
        Initializes a MSErnieMEncoderLayer object with the provided configuration.

        Args:
            self (object): The MSErnieMEncoderLayer instance itself.
            config (object): An object containing configuration parameters for the encoder layer.
                This object should have the following attributes:

                - hidden_dropout_prob (float, optional): The dropout probability for the hidden layers. Default is 0.1.
                - act_dropout (float, optional): The dropout probability for the activation layers.
                Default is the value of hidden_dropout_prob.
                - hidden_size (int): The size of the hidden layers.
                - intermediate_size (int): The size of the intermediate layers.
                - layer_norm_eps (float): The epsilon value for layer normalization.
                - hidden_act (str or function): The activation function to use.
                If str, it should be a key in the ACT2FN dictionary.

        Returns:
            None.

        Raises:
            None.
        """
        super().__init__()
        # to mimic paddlenlp implementation
        dropout = 0.1 if config.hidden_dropout_prob is None else config.hidden_dropout_prob
        act_dropout = config.hidden_dropout_prob if config.act_dropout is None else config.act_dropout

        self.self_attn = MSErnieMAttention(config)
        self.linear1 = nn.Linear(config.hidden_size, config.intermediate_size)
        self.dropout = nn.Dropout(p=act_dropout)
        self.linear2 = nn.Linear(config.intermediate_size, config.hidden_size)
        self.norm1 = nn.LayerNorm([config.hidden_size], eps=config.layer_norm_eps)
        self.norm2 = nn.LayerNorm([config.hidden_size], eps=config.layer_norm_eps)
        self.dropout1 = nn.Dropout(p=dropout)
        self.dropout2 = nn.Dropout(p=dropout)
        if isinstance(config.hidden_act, str):
            self.activation = ACT2FN[config.hidden_act]
        else:
            self.activation = config.hidden_act

    def forward(
        self,
        hidden_states: mindspore.Tensor,
        attention_mask: Optional[mindspore.Tensor] = None,
        head_mask: Optional[mindspore.Tensor] = None,
        past_key_value: Optional[Tuple[Tuple[mindspore.Tensor]]] = None,
        output_attentions: Optional[bool] = True,
    ):
        """Runs the forward pass of the MSErnieMEncoderLayer.

        This method applies the MSErnieMEncoderLayer to the input hidden states.

        Args:
            self (MSErnieMEncoderLayer): The instance of the MSErnieMEncoderLayer class.
            hidden_states (mindspore.Tensor): The input hidden states.
                It is a tensor of shape (batch_size, sequence_length, hidden_size).
            attention_mask (Optional[mindspore.Tensor]): The attention mask tensor.
                It is an optional tensor of shape (batch_size, sequence_length).
            head_mask (Optional[mindspore.Tensor]): The head mask tensor.
                It is an optional tensor of shape (num_heads, sequence_length, sequence_length).
            past_key_value (Optional[Tuple[Tuple[mindspore.Tensor]]]): The past key-value tensor.
                It is an optional tuple of tuple of tensors.
            output_attentions (Optional[bool]): Whether to return attentions as well. Defaults to True.

        Returns:
            mindspore.Tensor or Tuple[mindspore.Tensor]: The output hidden states.
                If `output_attentions` is True, returns a tuple containing the hidden states and attentions.
                Otherwise, only returns the hidden states.

        Raises:
            None
        """
        residual = hidden_states
        outputs = self.self_attn(
                hidden_states=hidden_states,
                attention_mask=attention_mask,
                head_mask=head_mask,
                past_key_value=past_key_value,
                output_attentions=output_attentions,
            )

        hidden_states = outputs[0]
        hidden_states = residual + self.dropout1(hidden_states)
        hidden_states = self.norm1(hidden_states)
        residual = hidden_states

        hidden_states = self.linear1(hidden_states)
        hidden_states = self.activation(hidden_states)
        hidden_states = self.dropout(hidden_states)
        hidden_states = self.linear2(hidden_states)
        hidden_states = residual + self.dropout2(hidden_states)
        hidden_states = self.norm2(hidden_states)

        if output_attentions:
            return (hidden_states,) + outputs[1:]
        return hidden_states

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMEncoderLayer.__init__(config)` ¶

Initializes a MSErnieMEncoderLayer object with the provided configuration.

PARAMETER	DESCRIPTION
`self`	The MSErnieMEncoderLayer instance itself. TYPE: `object`
`config`	An object containing configuration parameters for the encoder layer. TYPE: `object`

The `config` object should have the following attributes:

- hidden_dropout_prob (float, optional): The dropout probability for the hidden layers. Default is 0.1.
- act_dropout (float, optional): The dropout probability for the activation layers. Default is the value of hidden_dropout_prob.
- hidden_size (int): The size of the hidden layers.
- intermediate_size (int): The size of the intermediate layers.
- layer_norm_eps (float): The epsilon value for layer normalization.
- hidden_act (str or function): The activation function to use. If str, it should be a key in the ACT2FN dictionary.

RETURNS	DESCRIPTION
	None.
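
The two dropout fallbacks in the constructor can be reproduced in isolation. The sketch below copies the two conditional expressions from the source; the helper name `resolve_dropouts` and the `SimpleNamespace` stand-in for the config object are assumptions made for illustration.

```python
from types import SimpleNamespace


def resolve_dropouts(config):
    """Return (dropout, act_dropout) using the same fallbacks as the constructor."""
    dropout = 0.1 if config.hidden_dropout_prob is None else config.hidden_dropout_prob
    act_dropout = config.hidden_dropout_prob if config.act_dropout is None else config.act_dropout
    return dropout, act_dropout


# act_dropout unset: it follows hidden_dropout_prob.
print(resolve_dropouts(SimpleNamespace(hidden_dropout_prob=0.2, act_dropout=None)))
# act_dropout explicitly 0.0: kept, because only None triggers the fallback.
print(resolve_dropouts(SimpleNamespace(hidden_dropout_prob=0.2, act_dropout=0.0)))
# Both unset: act_dropout falls back to the raw config value, not to the 0.1 default.
print(resolve_dropouts(SimpleNamespace(hidden_dropout_prob=None, act_dropout=None)))
```

Note the last case: `act_dropout` falls back to `config.hidden_dropout_prob` itself rather than to the resolved `dropout`, so it stays `None` when both settings are unset.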

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

def __init__(self, config):
    """
    Initializes a MSErnieMEncoderLayer object with the provided configuration.

    Args:
        self (object): The MSErnieMEncoderLayer instance itself.
        config (object): An object containing configuration parameters for the encoder layer.
            This object should have the following attributes:

            - hidden_dropout_prob (float, optional): The dropout probability for the hidden layers. Default is 0.1.
            - act_dropout (float, optional): The dropout probability for the activation layers.
            Default is the value of hidden_dropout_prob.
            - hidden_size (int): The size of the hidden layers.
            - intermediate_size (int): The size of the intermediate layers.
            - layer_norm_eps (float): The epsilon value for layer normalization.
            - hidden_act (str or function): The activation function to use.
            If str, it should be a key in the ACT2FN dictionary.

    Returns:
        None.

    Raises:
        None.
    """
    super().__init__()
    # to mimic paddlenlp implementation
    dropout = 0.1 if config.hidden_dropout_prob is None else config.hidden_dropout_prob
    act_dropout = config.hidden_dropout_prob if config.act_dropout is None else config.act_dropout

    self.self_attn = MSErnieMAttention(config)
    self.linear1 = nn.Linear(config.hidden_size, config.intermediate_size)
    self.dropout = nn.Dropout(p=act_dropout)
    self.linear2 = nn.Linear(config.intermediate_size, config.hidden_size)
    self.norm1 = nn.LayerNorm([config.hidden_size], eps=config.layer_norm_eps)
    self.norm2 = nn.LayerNorm([config.hidden_size], eps=config.layer_norm_eps)
    self.dropout1 = nn.Dropout(p=dropout)
    self.dropout2 = nn.Dropout(p=dropout)
    if isinstance(config.hidden_act, str):
        self.activation = ACT2FN[config.hidden_act]
    else:
        self.activation = config.hidden_act

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMEncoderLayer.forward(hidden_states, attention_mask=None, head_mask=None, past_key_value=None, output_attentions=True)` ¶

Runs the forward pass of the MSErnieMEncoderLayer.

This method applies the MSErnieMEncoderLayer to the input hidden states.

PARAMETER	DESCRIPTION
`self`	The instance of the MSErnieMEncoderLayer class. TYPE: `MSErnieMEncoderLayer`
`hidden_states`	The input hidden states. It is a tensor of shape (batch_size, sequence_length, hidden_size). TYPE: `Tensor`
`attention_mask`	The attention mask tensor. It is an optional tensor of shape (batch_size, sequence_length). TYPE: `Optional[Tensor]` DEFAULT: `None`
`head_mask`	The head mask tensor. It is an optional tensor of shape (num_heads, sequence_length, sequence_length). TYPE: `Optional[Tensor]` DEFAULT: `None`
`past_key_value`	The past key-value tensor. It is an optional tuple of tuple of tensors. TYPE: `Optional[Tuple[Tuple[Tensor]]]` DEFAULT: `None`
`output_attentions`	Whether to return attentions as well. Defaults to True. TYPE: `Optional[bool]` DEFAULT: `True`

RETURNS	DESCRIPTION
	mindspore.Tensor or Tuple[mindspore.Tensor]: The output hidden states. If `output_attentions` is True, returns a tuple containing the hidden states and attentions. Otherwise, only returns the hidden states.
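
Since the return type depends on `output_attentions`, calling code may want to normalize both shapes into one. The following is a small sketch of one way to do that; the helper name `split_layer_output` is hypothetical and plain strings stand in for tensors.

```python
def split_layer_output(result):
    """Return (hidden_states, attentions_or_None) for either return shape."""
    if isinstance(result, tuple):
        hidden_states = result[0]
        attentions = result[1] if len(result) > 1 else None
        return hidden_states, attentions
    # A bare value means output_attentions was falsy.
    return result, None


# Placeholder strings stand in for tensors.
print(split_layer_output(("hidden", "attn")))
print(split_layer_output("hidden"))
```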

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

def forward(
    self,
    hidden_states: mindspore.Tensor,
    attention_mask: Optional[mindspore.Tensor] = None,
    head_mask: Optional[mindspore.Tensor] = None,
    past_key_value: Optional[Tuple[Tuple[mindspore.Tensor]]] = None,
    output_attentions: Optional[bool] = True,
):
    """Runs the forward pass of the MSErnieMEncoderLayer.

    This method applies the MSErnieMEncoderLayer to the input hidden states.

    Args:
        self (MSErnieMEncoderLayer): The instance of the MSErnieMEncoderLayer class.
        hidden_states (mindspore.Tensor): The input hidden states.
            It is a tensor of shape (batch_size, sequence_length, hidden_size).
        attention_mask (Optional[mindspore.Tensor]): The attention mask tensor.
            It is an optional tensor of shape (batch_size, sequence_length).
        head_mask (Optional[mindspore.Tensor]): The head mask tensor.
            It is an optional tensor of shape (num_heads, sequence_length, sequence_length).
        past_key_value (Optional[Tuple[Tuple[mindspore.Tensor]]]): The past key-value tensor.
            It is an optional tuple of tuple of tensors.
        output_attentions (Optional[bool]): Whether to return attentions as well. Defaults to True.

    Returns:
        mindspore.Tensor or Tuple[mindspore.Tensor]: The output hidden states.
            If `output_attentions` is True, returns a tuple containing the hidden states and attentions.
            Otherwise, only returns the hidden states.

    Raises:
        None
    """
    residual = hidden_states
    outputs = self.self_attn(
            hidden_states=hidden_states,
            attention_mask=attention_mask,
            head_mask=head_mask,
            past_key_value=past_key_value,
            output_attentions=output_attentions,
        )

    hidden_states = outputs[0]
    hidden_states = residual + self.dropout1(hidden_states)
    hidden_states = self.norm1(hidden_states)
    residual = hidden_states

    hidden_states = self.linear1(hidden_states)
    hidden_states = self.activation(hidden_states)
    hidden_states = self.dropout(hidden_states)
    hidden_states = self.linear2(hidden_states)
    hidden_states = residual + self.dropout2(hidden_states)
    hidden_states = self.norm2(hidden_states)

    if output_attentions:
        return (hidden_states,) + outputs[1:]
    return hidden_states

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMForInformationExtraction` ¶

Bases: MSErnieMPreTrainedModel

The 'MSErnieMForInformationExtraction' class is a model for information extraction tasks using the MSERNIE-M (multi-lingual) model. It extends the 'MSErnieMPreTrainedModel' class.

This class wraps an MSErnieMModel encoder and adds two linear heads that score every token as a possible span start or span end. Its forward method returns per-token start and end probabilities and, when both start_positions and end_positions are given, the average of the start and end binary cross-entropy losses.

The 'MSErnieMForInformationExtraction' class inherits the configuration parameters and methods from 'MSErnieMPreTrainedModel' and extends it to support information extraction tasks. The class is designed to handle input tensors for input_ids, attention_mask, position_ids, head_mask, and inputs_embeds, and provides output in the form of a tuple containing total loss, start probability, end probability, and additional model outputs.

The class is suitable for tasks such as named entity recognition, question answering, and other information extraction tasks where start and end positions within a sequence need to be identified and predicted.

This class is part of MindNLP, which is built on MindSpore, and is designed to provide a high-level interface for utilizing the MSERNIE-M model for information extraction tasks.
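
The way per-token scores become probabilities and a loss can be sketched in plain Python for a single sequence. This is an illustration under stated assumptions: the labels are taken to be per-token 0/1 values with the same length as the sequence, the loss is averaged over positions, and the function names and toy numbers are hypothetical.

```python
import math


def sigmoid(z):
    """Map a raw score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_cross_entropy(probs, targets, eps=1e-12):
    """Mean binary cross-entropy; probs and targets must have the same length."""
    losses = [
        -(t * math.log(p + eps) + (1.0 - t) * math.log(1.0 - p + eps))
        for p, t in zip(probs, targets)
    ]
    return sum(losses) / len(losses)


# Toy per-token scores for one sequence of four tokens.
start_logits = [2.0, -1.0, 0.0, -3.0]
end_logits = [-2.0, 0.5, 3.0, -1.0]
# Assumed label layout: 1.0 marks the token where a span starts or ends.
start_labels = [1.0, 0.0, 0.0, 0.0]
end_labels = [0.0, 0.0, 1.0, 0.0]

start_prob = [sigmoid(z) for z in start_logits]
end_prob = [sigmoid(z) for z in end_logits]

# Same combination as in forward: the mean of the two losses.
total_loss = (
    binary_cross_entropy(start_prob, start_labels)
    + binary_cross_entropy(end_prob, end_labels)
) / 2
print(start_prob, end_prob, total_loss)
```

Using a sigmoid per token, rather than a softmax over the sequence, lets several tokens be marked as starts or ends at once, which suits inputs that contain more than one span.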

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

class MSErnieMForInformationExtraction(MSErnieMPreTrainedModel):

    """
    The 'MSErnieMForInformationExtraction' class is a model for information extraction tasks using the MSERNIE-M
    (multi-lingual) model. It extends the 'MSErnieMPreTrainedModel' class.

    This class wraps an MSErnieMModel encoder and adds two linear heads that score every token as a possible span
    start or span end. Its forward method returns per-token start and end probabilities and, when both
    start_positions and end_positions are given, the average of the start and end binary cross-entropy losses.

    The 'MSErnieMForInformationExtraction' class inherits the configuration parameters and methods from
    'MSErnieMPreTrainedModel' and extends it to support information extraction tasks. The class is designed to handle
    input tensors for input_ids, attention_mask, position_ids, head_mask, and inputs_embeds, and provides output in the
    form of a tuple containing total loss, start probability, end probability, and additional model outputs.

    The class is suitable for tasks such as named entity recognition, question answering, and other information
    extraction tasks where start and end positions within a sequence need to be identified and predicted.

    This class is part of MindNLP, which is built on MindSpore, and is designed to provide a high-level interface
    for utilizing the MSERNIE-M model for information extraction tasks.
    """
    def __init__(self, config):
        """
        Initializes an instance of the MSErnieMForInformationExtraction class.

        Args:
            self (MSErnieMForInformationExtraction): The instance of the MSErnieMForInformationExtraction class.
            config (object): The configuration object for the model.

        Returns:
            None.

        Raises:
            TypeError: If the config parameter is not of the expected type.
            ValueError: If the config parameter does not contain the required attributes.
        """
        super().__init__(config)
        self.ernie_m = MSErnieMModel(config)
        self.linear_start = nn.Linear(config.hidden_size, 1)
        self.linear_end = nn.Linear(config.hidden_size, 1)
        self.sigmoid = nn.Sigmoid()
        self.post_init()

    def forward(
        self,
        input_ids: Optional[mindspore.Tensor] = None,
        attention_mask: Optional[mindspore.Tensor] = None,
        position_ids: Optional[mindspore.Tensor] = None,
        head_mask: Optional[mindspore.Tensor] = None,
        inputs_embeds: Optional[mindspore.Tensor] = None,
        start_positions: Optional[mindspore.Tensor] = None,
        end_positions: Optional[mindspore.Tensor] = None,
        output_attentions: Optional[bool] = None,
        output_hidden_states: Optional[bool] = None,
    ) -> Tuple[mindspore.Tensor]:
        r"""
        Args:
            start_positions (`mindspore.Tensor` of shape `(batch_size, sequence_length)`, *optional*):
                Labels for position (index) for computing the start_positions loss. Positions outside the sequence are
                not taken into account for computing the loss.
            end_positions (`mindspore.Tensor` of shape `(batch_size, sequence_length)`, *optional*):
                Labels for position (index) for computing the end_positions loss. Positions outside the sequence are
                not taken into account for computing the loss.
        """
        result = self.ernie_m(
            input_ids,
            attention_mask=attention_mask,
            position_ids=position_ids,
            head_mask=head_mask,
            inputs_embeds=inputs_embeds,
            output_attentions=output_attentions,
            output_hidden_states=output_hidden_states,
        )

        sequence_output = result[0]

        start_logits = self.linear_start(sequence_output)
        start_logits = start_logits.squeeze(-1)
        start_prob = self.sigmoid(start_logits)
        end_logits = self.linear_end(sequence_output)
        end_logits = end_logits.squeeze(-1)
        end_prob = self.sigmoid(end_logits)

        total_loss = None
        if start_positions is not None and end_positions is not None:
            # On multi-GPU, splitting the batch can add a trailing dimension; squeeze it away
            if len(start_positions.shape) > 1 and start_positions.shape[-1] == 1:
                start_positions = start_positions.squeeze(-1)
            if len(end_positions.shape) > 1 and end_positions.shape[-1] == 1:
                end_positions = end_positions.squeeze(-1)
            # sometimes the start/end positions are outside our model inputs, we ignore these terms
            ignored_index = start_logits.shape[1]
            start_positions = start_positions.clamp(0, ignored_index)
            end_positions = end_positions.clamp(0, ignored_index)

            start_loss = ops.binary_cross_entropy(start_prob, start_positions)
            end_loss = ops.binary_cross_entropy(end_prob, end_positions)
            total_loss = (start_loss + end_loss) / 2

        return (total_loss, start_prob, end_prob) + result[1:]

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMForInformationExtraction.__init__(config)` ¶

Initializes an instance of the MSErnieMForInformationExtraction class.

PARAMETER	DESCRIPTION
`self`	The instance of the MSErnieMForInformationExtraction class. TYPE: `MSErnieMForInformationExtraction`
`config`	The configuration object for the model. TYPE: `object`

RETURNS	DESCRIPTION
	None.

RAISES	DESCRIPTION
`TypeError`	If the config parameter is not of the expected type.
`ValueError`	If the config parameter does not contain the required attributes.

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

def __init__(self, config):
    """
    Initializes an instance of the MSErnieMForInformationExtraction class.

    Args:
        self (MSErnieMForInformationExtraction): The instance of the MSErnieMForInformationExtraction class.
        config (object): The configuration object for the model.

    Returns:
        None.

    Raises:
        TypeError: If the config parameter is not of the expected type.
        ValueError: If the config parameter does not contain the required attributes.
    """
    super().__init__(config)
    self.ernie_m = MSErnieMModel(config)
    self.linear_start = nn.Linear(config.hidden_size, 1)
    self.linear_end = nn.Linear(config.hidden_size, 1)
    self.sigmoid = nn.Sigmoid()
    self.post_init()

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMForInformationExtraction.forward(input_ids=None, attention_mask=None, position_ids=None, head_mask=None, inputs_embeds=None, start_positions=None, end_positions=None, output_attentions=None, output_hidden_states=None)` ¶

PARAMETER	DESCRIPTION
`start_positions`	Labels for position (index) for computing the start_positions loss. Positions outside the sequence are not taken into account for computing the loss. TYPE: `mindspore.Tensor` of shape `(batch_size, sequence_length)`, optional DEFAULT: `None`
`end_positions`	Labels for position (index) for computing the end_positions loss. Positions outside the sequence are not taken into account for computing the loss. TYPE: `mindspore.Tensor` of shape `(batch_size, sequence_length)`, optional DEFAULT: `None`

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

def forward(
    self,
    input_ids: Optional[mindspore.Tensor] = None,
    attention_mask: Optional[mindspore.Tensor] = None,
    position_ids: Optional[mindspore.Tensor] = None,
    head_mask: Optional[mindspore.Tensor] = None,
    inputs_embeds: Optional[mindspore.Tensor] = None,
    start_positions: Optional[mindspore.Tensor] = None,
    end_positions: Optional[mindspore.Tensor] = None,
    output_attentions: Optional[bool] = None,
    output_hidden_states: Optional[bool] = None,
) -> Tuple[mindspore.Tensor]:
    r"""
    Args:
        start_positions (`mindspore.Tensor` of shape `(batch_size, sequence_length)`, *optional*):
            Labels for position (index) for computing the start_positions loss. Positions outside the sequence are
            not taken into account for computing the loss.
        end_positions (`mindspore.Tensor` of shape `(batch_size, sequence_length)`, *optional*):
            Labels for position (index) for computing the end_positions loss. Positions outside the sequence are
            not taken into account for computing the loss.
    """
    result = self.ernie_m(
        input_ids,
        attention_mask=attention_mask,
        position_ids=position_ids,
        head_mask=head_mask,
        inputs_embeds=inputs_embeds,
        output_attentions=output_attentions,
        output_hidden_states=output_hidden_states,
    )

    sequence_output = result[0]

    start_logits = self.linear_start(sequence_output)
    start_logits = start_logits.squeeze(-1)
    start_prob = self.sigmoid(start_logits)
    end_logits = self.linear_end(sequence_output)
    end_logits = end_logits.squeeze(-1)
    end_prob = self.sigmoid(end_logits)

    total_loss = None
    if start_positions is not None and end_positions is not None:
        # On multi-GPU, splitting the batch can add a trailing dimension; squeeze it away
        if len(start_positions.shape) > 1 and start_positions.shape[-1] == 1:
            start_positions = start_positions.squeeze(-1)
        if len(end_positions.shape) > 1 and end_positions.shape[-1] == 1:
            end_positions = end_positions.squeeze(-1)
        # sometimes the start/end positions are outside our model inputs, we ignore these terms
        ignored_index = start_logits.shape[1]
        start_positions = start_positions.clamp(0, ignored_index)
        end_positions = end_positions.clamp(0, ignored_index)

        start_loss = ops.binary_cross_entropy(start_prob, start_positions)
        end_loss = ops.binary_cross_entropy(end_prob, end_positions)
        total_loss = (start_loss + end_loss) / 2

    return (total_loss, start_prob, end_prob) + result[1:]

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMForMultipleChoice` ¶

Bases: MSErnieMPreTrainedModel

This class represents a Multiple Choice classification model based on the MSErnieM architecture. It inherits from the MSErnieMPreTrainedModel and is designed to facilitate multiple choice question answering tasks.

The class implements the initialization method to set up the model and a forward method to process input data and produce classification predictions. The forward method handles input tensors for input_ids, attention_mask, position_ids, head_mask, inputs_embeds, and labels, and provides options for output_attentions and output_hidden_states.

The forward method flattens the choice dimension so that every (example, choice) pair is encoded independently by the MSErnieM model, applies dropout and a linear classifier to the pooled output to obtain one score per choice, and reshapes the scores to (batch_size, num_choices). When labels are provided, it also computes the cross-entropy loss over the choices.

Overall, the MSErnieMForMultipleChoice class encapsulates the functionality for performing multiple choice classification using the MSErnieM architecture and provides flexibility for processing various input tensors and generating classification predictions.
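
The flatten-then-reshape pattern used for multiple choice can be sketched with nested lists in place of tensors. This is an illustration only: the helper names are hypothetical, and the `sum`-based scorer is an assumed stand-in for the encoder, pooler, dropout, and classifier.

```python
def flatten_choices(batch):
    """(batch_size, num_choices, seq_len) -> (batch_size * num_choices, seq_len)."""
    return [seq for example in batch for seq in example]


def reshape_logits(flat_logits, num_choices):
    """(batch_size * num_choices,) -> (batch_size, num_choices)."""
    return [flat_logits[i:i + num_choices] for i in range(0, len(flat_logits), num_choices)]


# Two examples with two candidate answers each, three tokens per sequence.
input_ids = [
    [[1, 2, 3], [1, 2, 4]],
    [[5, 6, 7], [5, 6, 8]],
]
num_choices = len(input_ids[0])

flat = flatten_choices(input_ids)
# Stand-in scorer producing one number per flattened sequence.
flat_logits = [float(sum(seq)) for seq in flat]
reshaped = reshape_logits(flat_logits, num_choices)
print(reshaped)
```

Each row of the reshaped result holds the scores of one example's choices, so a label is simply the index of the correct choice within that row.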

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

class MSErnieMForMultipleChoice(MSErnieMPreTrainedModel):

    """
    This class represents a Multiple Choice classification model based on the MSErnieM architecture.
    It inherits from the MSErnieMPreTrainedModel and is designed to facilitate multiple choice question answering tasks.

    The class implements the initialization method to set up the model and a forward method to process input data and
    produce classification predictions. The forward method handles input tensors for input_ids, attention_mask,
    position_ids, head_mask, inputs_embeds, and labels, and provides options for output_attentions and output_hidden_states.

    The forward method computes the multiple choice classification loss based on the input data and generates reshaped
    logits for each choice. It utilizes the MSErnieM model to process the input data and applies dropout and dense layers
    for classification. Additionally, it handles the cross-entropy loss calculation for training the model.

    Overall, the MSErnieMForMultipleChoice class encapsulates the functionality for performing multiple choice
    classification using the MSErnieM architecture and provides flexibility for processing various input tensors and
    generating classification predictions.
    """
    # Copied from transformers.models.bert.modeling_bert.BertForMultipleChoice.__init__ with Bert->ErnieM,bert->ernie_m
    def __init__(self, config):
        """
        Initializes an instance of MSErnieMForMultipleChoice.

        Args:
            self (object): The instance of the class.
            config (object): The configuration object containing various parameters for the model initialization.

        Returns:
            None.

        Raises:
            ValueError: If the provided configuration object is invalid or missing required parameters.
            TypeError: If the configuration parameters are of incorrect type.
        """
        super().__init__(config)

        self.ernie_m = MSErnieMModel(config)
        classifier_dropout = (
            config.classifier_dropout if config.classifier_dropout is not None else config.hidden_dropout_prob
        )
        self.dropout = nn.Dropout(p=classifier_dropout)
        self.classifier = nn.Linear(config.hidden_size, 1)

        # Initialize weights and apply final processing
        self.post_init()

    def forward(
        self,
        input_ids: Optional[mindspore.Tensor] = None,
        attention_mask: Optional[mindspore.Tensor] = None,
        position_ids: Optional[mindspore.Tensor] = None,
        head_mask: Optional[mindspore.Tensor] = None,
        inputs_embeds: Optional[mindspore.Tensor] = None,
        labels: Optional[mindspore.Tensor] = None,
        output_attentions: Optional[bool] = None,
        output_hidden_states: Optional[bool] = None,
    ) -> Tuple[mindspore.Tensor]:
        r"""
        Args:
            labels (`mindspore.Tensor` of shape `(batch_size,)`, *optional*):
                Labels for computing the multiple choice classification loss. Indices should be in `[0, ...,
                num_choices-1]` where `num_choices` is the size of the second dimension of the input tensors. (See
                `input_ids` above)
        """
        num_choices = input_ids.shape[1] if input_ids is not None else inputs_embeds.shape[1]

        input_ids = input_ids.view(-1, input_ids.shape[-1]) if input_ids is not None else None
        attention_mask = attention_mask.view(-1, attention_mask.shape[-1]) if attention_mask is not None else None
        position_ids = position_ids.view(-1, position_ids.shape[-1]) if position_ids is not None else None
        inputs_embeds = (
            inputs_embeds.view(-1, inputs_embeds.shape[-2], inputs_embeds.shape[-1])
            if inputs_embeds is not None
            else None
        )

        outputs = self.ernie_m(
            input_ids,
            attention_mask=attention_mask,
            position_ids=position_ids,
            head_mask=head_mask,
            inputs_embeds=inputs_embeds,
            output_attentions=output_attentions,
            output_hidden_states=output_hidden_states,
        )

        pooled_output = outputs[1]

        pooled_output = self.dropout(pooled_output)
        logits = self.classifier(pooled_output)
        reshaped_logits = logits.view(-1, num_choices)

        loss = None
        if labels is not None:
            loss = F.cross_entropy(reshaped_logits, labels)

        output = (reshaped_logits,) + outputs[2:]
        return ((loss,) + output) if loss is not None else output

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMForMultipleChoice.__init__(config)` ¶

Initializes an instance of MSErnieMForMultipleChoice.

PARAMETER	DESCRIPTION
`self`	The instance of the class. TYPE: `object`
`config`	The configuration object containing various parameters for the model initialization. TYPE: `object`

RETURNS	DESCRIPTION
	None.

RAISES	DESCRIPTION
`ValueError`	If the provided configuration object is invalid or missing required parameters.
`TypeError`	If the configuration parameters are of incorrect type.

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

def __init__(self, config):
    """
    Initializes an instance of MSErnieMForMultipleChoice.

    Args:
        self (object): The instance of the class.
        config (object): The configuration object containing various parameters for the model initialization.

    Returns:
        None.

    Raises:
        ValueError: If the provided configuration object is invalid or missing required parameters.
        TypeError: If the configuration parameters are of incorrect type.
    """
    super().__init__(config)

    self.ernie_m = MSErnieMModel(config)
    classifier_dropout = (
        config.classifier_dropout if config.classifier_dropout is not None else config.hidden_dropout_prob
    )
    self.dropout = nn.Dropout(p=classifier_dropout)
    self.classifier = nn.Linear(config.hidden_size, 1)

    # Initialize weights and apply final processing
    self.post_init()
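
The dropout selection in `__init__` above can be illustrated with a small standalone sketch. The helper name `resolve_classifier_dropout` and the `SimpleNamespace` stand-ins for the configuration object are assumptions made for this example only, and the numeric values are illustrative rather than library defaults.

```python
from types import SimpleNamespace


def resolve_classifier_dropout(config) -> float:
    """Hypothetical helper mirroring the fallback expression used in __init__."""
    # An explicit classifier_dropout wins; None falls back to hidden_dropout_prob.
    return config.classifier_dropout if config.classifier_dropout is not None else config.hidden_dropout_prob


# Stand-in config objects (illustrative values, not library defaults).
default_cfg = SimpleNamespace(classifier_dropout=None, hidden_dropout_prob=0.1)
custom_cfg = SimpleNamespace(classifier_dropout=0.3, hidden_dropout_prob=0.1)
zero_cfg = SimpleNamespace(classifier_dropout=0.0, hidden_dropout_prob=0.1)

default_p = resolve_classifier_dropout(default_cfg)
custom_p = resolve_classifier_dropout(custom_cfg)
zero_p = resolve_classifier_dropout(zero_cfg)
```

Because the check is `is not None` rather than a truthiness test, an explicit `classifier_dropout` of `0.0` is kept and does not fall back to `hidden_dropout_prob`.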

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMForMultipleChoice.forward(input_ids=None, attention_mask=None, position_ids=None, head_mask=None, inputs_embeds=None, labels=None, output_attentions=None, output_hidden_states=None)` ¶

PARAMETER	DESCRIPTION
`labels`	Labels for computing the multiple choice classification loss. Indices should be in `[0, ..., num_choices-1]` where `num_choices` is the size of the second dimension of the input tensors. (See `input_ids` above) TYPE: `mindspore.Tensor` of shape `(batch_size,)`, optional DEFAULT: `None`

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

def forward(
    self,
    input_ids: Optional[mindspore.Tensor] = None,
    attention_mask: Optional[mindspore.Tensor] = None,
    position_ids: Optional[mindspore.Tensor] = None,
    head_mask: Optional[mindspore.Tensor] = None,
    inputs_embeds: Optional[mindspore.Tensor] = None,
    labels: Optional[mindspore.Tensor] = None,
    output_attentions: Optional[bool] = None,
    output_hidden_states: Optional[bool] = None,
) -> Tuple[mindspore.Tensor]:
    r"""
    Args:
        labels (`mindspore.Tensor` of shape `(batch_size,)`, *optional*):
            Labels for computing the multiple choice classification loss. Indices should be in `[0, ...,
            num_choices-1]` where `num_choices` is the size of the second dimension of the input tensors. (See
            `input_ids` above)
    """
    num_choices = input_ids.shape[1] if input_ids is not None else inputs_embeds.shape[1]

    input_ids = input_ids.view(-1, input_ids.shape[-1]) if input_ids is not None else None
    attention_mask = attention_mask.view(-1, attention_mask.shape[-1]) if attention_mask is not None else None
    position_ids = position_ids.view(-1, position_ids.shape[-1]) if position_ids is not None else None
    inputs_embeds = (
        inputs_embeds.view(-1, inputs_embeds.shape[-2], inputs_embeds.shape[-1])
        if inputs_embeds is not None
        else None
    )

    outputs = self.ernie_m(
        input_ids,
        attention_mask=attention_mask,
        position_ids=position_ids,
        head_mask=head_mask,
        inputs_embeds=inputs_embeds,
        output_attentions=output_attentions,
        output_hidden_states=output_hidden_states,
    )

    pooled_output = outputs[1]

    pooled_output = self.dropout(pooled_output)
    logits = self.classifier(pooled_output)
    reshaped_logits = logits.view(-1, num_choices)

    loss = None
    if labels is not None:
        loss = F.cross_entropy(reshaped_logits, labels)

    output = (reshaped_logits,) + outputs[2:]
    return ((loss,) + output) if loss is not None else output
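
The reshaping performed by `forward` above may be easier to follow on plain nested lists. The sketch below is not the library's implementation: `flatten_choices` and `regroup_scores` are hypothetical helpers, and the "score" is a stand-in for the encoder plus the `Linear(hidden_size, 1)` classifier.

```python
from typing import List


def flatten_choices(batch: List[List[List[int]]]) -> List[List[int]]:
    """Collapse (batch_size, num_choices, seq_len) into (batch_size * num_choices, seq_len)."""
    return [choice for example in batch for choice in example]


def regroup_scores(flat_scores: List[float], num_choices: int) -> List[List[float]]:
    """Turn one score per flattened row back into (batch_size, num_choices)."""
    return [flat_scores[i:i + num_choices] for i in range(0, len(flat_scores), num_choices)]


# Two examples, three candidate answers each, four token ids per candidate.
input_ids = [
    [[1, 2, 3, 4], [1, 2, 5, 6], [1, 2, 7, 8]],
    [[9, 9, 3, 4], [9, 9, 5, 6], [9, 9, 7, 8]],
]
num_choices = len(input_ids[0])  # size of the second dimension, as in forward
flat = flatten_choices(input_ids)

# Stand-in for the model: one scalar per flattened row (here simply the sum of ids).
flat_scores = [float(sum(row)) for row in flat]
reshaped_logits = regroup_scores(flat_scores, num_choices)
```

Each row of `reshaped_logits` holds the scores of all candidates for one example, which is why `labels` has shape `(batch_size,)` and indexes into `[0, ..., num_choices-1]`.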

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMForQuestionAnswering` ¶

Bases: MSErnieMPreTrainedModel

MSErnieMForQuestionAnswering represents a model for question answering tasks using the MSErnieM architecture. This class inherits from MSErnieMPreTrainedModel and implements methods for initializing the model and running the forward pass for question answering.

ATTRIBUTE	DESCRIPTION
`num_labels`	The number of logits produced per token, taken from `config.num_labels`. `forward` splits them into start and end logits, so the value must be 2. TYPE: `int`
`ernie_m`	The MSErnieMModel instance used for processing inputs. TYPE: `MSErnieMModel`
`qa_outputs`	Dense layer for outputting logits for question answering. TYPE: `Linear`

METHOD	DESCRIPTION
`__init__`	Initializes the MSErnieMForQuestionAnswering instance with the provided configuration.
`forward`	Constructs the question answering outputs based on the input tensors and labels provided.

Note

The start_positions and end_positions parameters give the start and end token indices of the labelled span and are used to compute the span loss. Position indices are clamped to the length of the sequence, and positions outside of the sequence are not considered for loss computation.

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

class MSErnieMForQuestionAnswering(MSErnieMPreTrainedModel):

    """
    MSErnieMForQuestionAnswering represents a model for question answering tasks using the MSErnieM architecture.
    This class inherits from MSErnieMPreTrainedModel and implements methods for initializing the model and running
    the forward pass for question answering.

    Attributes:
        num_labels (int): The number of logits produced per token, taken from `config.num_labels`.
            `forward` splits them into start and end logits, so the value must be 2.
        ernie_m (MSErnieMModel): The MSErnieMModel instance used for processing inputs.
        qa_outputs (nn.Linear): Dense layer for outputting logits for question answering.

    Methods:
        __init__: Initializes the MSErnieMForQuestionAnswering instance with the provided configuration.
        forward:
            Constructs the question answering outputs based on the input tensors and labels provided.

    Note:
        The start_positions and end_positions parameters give the start and end token indices of the labelled
        span and are used to compute the span loss.
        Position indices are clamped to the length of the sequence, and positions outside of the sequence
        are not considered for loss computation.
    """
    # Copied from transformers.models.bert.modeling_bert.BertForQuestionAnswering.__init__ with Bert->ErnieM,bert->ernie_m
    def __init__(self, config):
        """
        Initializes an instance of the MSErnieMForQuestionAnswering class.

        Args:
            self: The instance of the class.
            config:
                An instance of the configuration class containing the model configuration.

                - Type: object
                - Purpose: To provide the configuration settings for the model initialization.
                - Restrictions: Must be a valid configuration object.

        Returns:
            None

        Raises:
            TypeError: If the provided config parameter is not of the expected type.
            ValueError: If the config parameter is missing essential attributes.
            RuntimeError: If an error occurs during initialization or post-initialization steps.
        """
        super().__init__(config)
        self.num_labels = config.num_labels

        self.ernie_m = MSErnieMModel(config, add_pooling_layer=False)
        self.qa_outputs = nn.Linear(config.hidden_size, config.num_labels)

        # Initialize weights and apply final processing
        self.post_init()

    def forward(
        self,
        input_ids: Optional[mindspore.Tensor] = None,
        attention_mask: Optional[mindspore.Tensor] = None,
        position_ids: Optional[mindspore.Tensor] = None,
        head_mask: Optional[mindspore.Tensor] = None,
        inputs_embeds: Optional[mindspore.Tensor] = None,
        start_positions: Optional[mindspore.Tensor] = None,
        end_positions: Optional[mindspore.Tensor] = None,
        output_attentions: Optional[bool] = None,
        output_hidden_states: Optional[bool] = None,
    ) -> Tuple[mindspore.Tensor]:
        r"""
        Args:
            start_positions (`mindspore.Tensor` of shape `(batch_size,)`, *optional*):
                Labels for position (index) of the start of the labelled span for computing the token classification loss.
                Positions are clamped to the length of the sequence (`sequence_length`). Positions outside of the sequence
                are not taken into account for computing the loss.
            end_positions (`mindspore.Tensor` of shape `(batch_size,)`, *optional*):
                Labels for position (index) of the end of the labelled span for computing the token classification loss.
                Positions are clamped to the length of the sequence (`sequence_length`). Positions outside of the sequence
                are not taken into account for computing the loss.
        """
        outputs = self.ernie_m(
            input_ids,
            attention_mask=attention_mask,
            position_ids=position_ids,
            head_mask=head_mask,
            inputs_embeds=inputs_embeds,
            output_attentions=output_attentions,
            output_hidden_states=output_hidden_states,
        )

        sequence_output = outputs[0]

        logits = self.qa_outputs(sequence_output)
        start_logits, end_logits = logits.split(1, axis=-1)
        start_logits = start_logits.squeeze(-1)
        end_logits = end_logits.squeeze(-1)

        total_loss = None
        if start_positions is not None and end_positions is not None:
            # On multi-GPU, splitting the batch can add a trailing dimension; squeeze it away
            if len(start_positions.shape) > 1:
                start_positions = start_positions.squeeze(-1)
            if len(end_positions.shape) > 1:
                end_positions = end_positions.squeeze(-1)
            # sometimes the start/end positions are outside our model inputs, we ignore these terms
            ignored_index = start_logits.shape[1]
            start_positions = start_positions.clamp(0, ignored_index)
            end_positions = end_positions.clamp(0, ignored_index)

            start_loss = F.cross_entropy(start_logits, start_positions, ignore_index=ignored_index)
            end_loss = F.cross_entropy(end_logits, end_positions, ignore_index=ignored_index)
            total_loss = (start_loss + end_loss) / 2

        output = (start_logits, end_logits) + outputs[2:]
        return ((total_loss,) + output) if total_loss is not None else output

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMForQuestionAnswering.__init__(config)` ¶

Initializes an instance of the MSErnieMForQuestionAnswering class.

PARAMETER	DESCRIPTION
`self`	The instance of the class.
`config`	An instance of the configuration class containing the model configuration. Type: object Purpose: To provide the configuration settings for the model initialization. Restrictions: Must be a valid configuration object.

RETURNS	DESCRIPTION
	None

RAISES	DESCRIPTION
`TypeError`	If the provided config parameter is not of the expected type.
`ValueError`	If the config parameter is missing essential attributes.
`RuntimeError`	If an error occurs during initialization or post-initialization steps.

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

def __init__(self, config):
    """
    Initializes an instance of the MSErnieMForQuestionAnswering class.

    Args:
        self: The instance of the class.
        config:
            An instance of the configuration class containing the model configuration.

            - Type: object
            - Purpose: To provide the configuration settings for the model initialization.
            - Restrictions: Must be a valid configuration object.

    Returns:
        None

    Raises:
        TypeError: If the provided config parameter is not of the expected type.
        ValueError: If the config parameter is missing essential attributes.
        RuntimeError: If an error occurs during initialization or post-initialization steps.
    """
    super().__init__(config)
    self.num_labels = config.num_labels

    self.ernie_m = MSErnieMModel(config, add_pooling_layer=False)
    self.qa_outputs = nn.Linear(config.hidden_size, config.num_labels)

    # Initialize weights and apply final processing
    self.post_init()

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMForQuestionAnswering.forward(input_ids=None, attention_mask=None, position_ids=None, head_mask=None, inputs_embeds=None, start_positions=None, end_positions=None, output_attentions=None, output_hidden_states=None)` ¶

PARAMETER	DESCRIPTION
`start_positions`	Labels for position (index) of the start of the labelled span for computing the token classification loss. Positions are clamped to the length of the sequence (`sequence_length`). Positions outside of the sequence are not taken into account for computing the loss. TYPE: `mindspore.Tensor` of shape `(batch_size,)`, optional DEFAULT: `None`
`end_positions`	Labels for position (index) of the end of the labelled span for computing the token classification loss. Positions are clamped to the length of the sequence (`sequence_length`). Positions outside of the sequence are not taken into account for computing the loss. TYPE: `mindspore.Tensor` of shape `(batch_size,)`, optional DEFAULT: `None`

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

def forward(
    self,
    input_ids: Optional[mindspore.Tensor] = None,
    attention_mask: Optional[mindspore.Tensor] = None,
    position_ids: Optional[mindspore.Tensor] = None,
    head_mask: Optional[mindspore.Tensor] = None,
    inputs_embeds: Optional[mindspore.Tensor] = None,
    start_positions: Optional[mindspore.Tensor] = None,
    end_positions: Optional[mindspore.Tensor] = None,
    output_attentions: Optional[bool] = None,
    output_hidden_states: Optional[bool] = None,
) -> Tuple[mindspore.Tensor]:
    r"""
    Args:
        start_positions (`mindspore.Tensor` of shape `(batch_size,)`, *optional*):
            Labels for position (index) of the start of the labelled span for computing the token classification loss.
            Positions are clamped to the length of the sequence (`sequence_length`). Positions outside of the sequence
            are not taken into account for computing the loss.
        end_positions (`mindspore.Tensor` of shape `(batch_size,)`, *optional*):
            Labels for position (index) of the end of the labelled span for computing the token classification loss.
            Positions are clamped to the length of the sequence (`sequence_length`). Positions outside of the sequence
            are not taken into account for computing the loss.
    """
    outputs = self.ernie_m(
        input_ids,
        attention_mask=attention_mask,
        position_ids=position_ids,
        head_mask=head_mask,
        inputs_embeds=inputs_embeds,
        output_attentions=output_attentions,
        output_hidden_states=output_hidden_states,
    )

    sequence_output = outputs[0]

    logits = self.qa_outputs(sequence_output)
    start_logits, end_logits = logits.split(1, axis=-1)
    start_logits = start_logits.squeeze(-1)
    end_logits = end_logits.squeeze(-1)

    total_loss = None
    if start_positions is not None and end_positions is not None:
        # On multi-GPU, splitting the batch can add a trailing dimension; squeeze it away
        if len(start_positions.shape) > 1:
            start_positions = start_positions.squeeze(-1)
        if len(end_positions.shape) > 1:
            end_positions = end_positions.squeeze(-1)
        # sometimes the start/end positions are outside our model inputs, we ignore these terms
        ignored_index = start_logits.shape[1]
        start_positions = start_positions.clamp(0, ignored_index)
        end_positions = end_positions.clamp(0, ignored_index)

        start_loss = F.cross_entropy(start_logits, start_positions, ignore_index=ignored_index)
        end_loss = F.cross_entropy(end_logits, end_positions, ignore_index=ignored_index)
        total_loss = (start_loss + end_loss) / 2

    output = (start_logits, end_logits) + outputs[2:]
    return ((total_loss,) + output) if total_loss is not None else output
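
The clamp-and-ignore step in `forward` above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: `clamp` and `span_loss` are hypothetical helpers, the loss is averaged over the non-ignored rows, and returning NaN when every row is ignored is a choice made for this sketch.

```python
import math
from typing import List


def clamp(value: int, low: int, high: int) -> int:
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(value, high))


def span_loss(logits: List[List[float]], positions: List[int]) -> float:
    """Mean cross-entropy over rows whose clamped target is a real token index."""
    ignored_index = len(logits[0])  # sequence_length, one past the last valid index
    losses = []
    for row, pos in zip(logits, positions):
        target = clamp(pos, 0, ignored_index)
        if target == ignored_index:
            continue  # out-of-range position: contributes nothing to the loss
        log_norm = math.log(sum(math.exp(x) for x in row))
        losses.append(log_norm - row[target])
    # Sketch choice: NaN if every row was ignored; check the framework's own behaviour.
    return sum(losses) / len(losses) if losses else float("nan")


start_logits = [[0.0, 0.0, 0.0, 0.0], [2.0, 0.0, 0.0, 0.0]]
end_logits = [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 2.0]]
start_positions = [1, 7]  # 7 is outside the 4-token sequence, so it is clamped to 4 and ignored
end_positions = [2, 3]

start_loss = span_loss(start_logits, start_positions)
end_loss = span_loss(end_logits, end_positions)
total_loss = (start_loss + end_loss) / 2
```

Using `sequence_length` as both the clamp ceiling and the ignored index means any out-of-range label lands exactly on the one value the loss skips.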

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMForSequenceClassification` ¶

Bases: MSErnieMPreTrainedModel

This class represents a modified version of the MSErnieM model for sequence classification tasks. It inherits from the MSErnieMPreTrainedModel class.

ATTRIBUTE	DESCRIPTION
`num_labels`	The number of labels for the sequence classification task. TYPE: `int`
`config`	The configuration object for the model. TYPE: `ErnieMConfig`
`ernie_m`	The MSErnieM model. TYPE: `MSErnieMModel`
`dropout`	The dropout layer for regularization. TYPE: `Dropout`
`classifier`	The dense layer for classification. TYPE: `Linear`

METHOD	DESCRIPTION
`__init__`	Initializes the MSErnieMForSequenceClassification instance.
`forward`	Constructs the model and computes the loss and logits for the given input.

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

class MSErnieMForSequenceClassification(MSErnieMPreTrainedModel):

    """
    This class represents a modified version of the MSErnieM model for sequence classification tasks.
    It inherits from the MSErnieMPreTrainedModel class.

    Attributes:
        num_labels (int): The number of labels for the sequence classification task.
        config (ErnieMConfig): The configuration object for the model.
        ernie_m (MSErnieMModel): The MSErnieM model.
        dropout (nn.Dropout): The dropout layer for regularization.
        classifier (nn.Linear): The dense layer for classification.

    Methods:
        __init__: Initializes the MSErnieMForSequenceClassification instance.
        forward: Constructs the model and computes the loss and logits for the given input.
    """
    # Copied from transformers.models.bert.modeling_bert.BertForSequenceClassification.__init__ with Bert->ErnieM,bert->ernie_m
    def __init__(self, config):
        """
        Initializes an instance of the 'MSErnieMForSequenceClassification' class.

        Args:
            self: The instance of the class.
            config:
                An object of type 'Config' containing the configuration parameters for the model.

                - Type: Config
                - Purpose: Specifies the configuration of the model.
                - Restrictions: None

        Returns:
            None

        Raises:
            None
        """
        super().__init__(config)
        self.num_labels = config.num_labels
        self.config = config

        self.ernie_m = MSErnieMModel(config)
        classifier_dropout = (
            config.classifier_dropout if config.classifier_dropout is not None else config.hidden_dropout_prob
        )
        self.dropout = nn.Dropout(p=classifier_dropout)
        self.classifier = nn.Linear(config.hidden_size, config.num_labels)

        # Initialize weights and apply final processing
        self.post_init()

    def forward(
        self,
        input_ids: Optional[mindspore.Tensor] = None,
        attention_mask: Optional[mindspore.Tensor] = None,
        position_ids: Optional[mindspore.Tensor] = None,
        head_mask: Optional[mindspore.Tensor] = None,
        inputs_embeds: Optional[mindspore.Tensor] = None,
        past_key_values: Optional[List[mindspore.Tensor]] = None,
        use_cache: Optional[bool] = None,
        output_hidden_states: Optional[bool] = None,
        output_attentions: Optional[bool] = None,
        labels: Optional[mindspore.Tensor] = None,
    ) -> Tuple[mindspore.Tensor]:
        r"""
        Args:
            labels (`mindspore.Tensor` of shape `(batch_size,)`, *optional*):
                Labels for computing the sequence classification/regression loss. Indices should be in `[0, ...,
                config.num_labels - 1]`. If `config.num_labels == 1` a regression loss is computed (Mean-Square loss); if
                `config.num_labels > 1` a classification loss is computed (Cross-Entropy).
        """
        outputs = self.ernie_m(
            input_ids,
            attention_mask=attention_mask,
            position_ids=position_ids,
            head_mask=head_mask,
            inputs_embeds=inputs_embeds,
            past_key_values=past_key_values,
            output_hidden_states=output_hidden_states,
            output_attentions=output_attentions,
        )

        pooled_output = outputs[1]

        pooled_output = self.dropout(pooled_output)
        logits = self.classifier(pooled_output)

        loss = None
        if labels is not None:
            if self.config.problem_type is None:
                if self.num_labels == 1:
                    self.config.problem_type = "regression"
                elif self.num_labels > 1 and labels.dtype in (mindspore.int64, mindspore.int32):
                    self.config.problem_type = "single_label_classification"
                else:
                    self.config.problem_type = "multi_label_classification"

            if self.config.problem_type == "regression":
                if self.num_labels == 1:
                    loss = F.mse_loss(logits.squeeze(), labels.squeeze())
                else:
                    loss = F.mse_loss(logits, labels)
            elif self.config.problem_type == "single_label_classification":
                loss = F.cross_entropy(logits.view(-1, self.num_labels), labels.view(-1))
            elif self.config.problem_type == "multi_label_classification":
                loss = F.binary_cross_entropy_with_logits(logits, labels)

        output = (logits,) + outputs[2:]
        return ((loss,) + output) if loss is not None else output

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMForSequenceClassification.__init__(config)` ¶

Initializes an instance of the 'MSErnieMForSequenceClassification' class.

PARAMETER	DESCRIPTION
`self`	The instance of the class.
`config`	An object of type 'Config' containing the configuration parameters for the model. Type: Config Purpose: Specifies the configuration of the model. Restrictions: None

RETURNS	DESCRIPTION
	None

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

def __init__(self, config):
    """
    Initializes an instance of the 'MSErnieMForSequenceClassification' class.

    Args:
        self: The instance of the class.
        config:
            An object of type 'Config' containing the configuration parameters for the model.

            - Type: Config
            - Purpose: Specifies the configuration of the model.
            - Restrictions: None

    Returns:
        None

    Raises:
        None
    """
    super().__init__(config)
    self.num_labels = config.num_labels
    self.config = config

    self.ernie_m = MSErnieMModel(config)
    classifier_dropout = (
        config.classifier_dropout if config.classifier_dropout is not None else config.hidden_dropout_prob
    )
    self.dropout = nn.Dropout(p=classifier_dropout)
    self.classifier = nn.Linear(config.hidden_size, config.num_labels)

    # Initialize weights and apply final processing
    self.post_init()

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMForSequenceClassification.forward(input_ids=None, attention_mask=None, position_ids=None, head_mask=None, inputs_embeds=None, past_key_values=None, use_cache=None, output_hidden_states=None, output_attentions=None, labels=None)` ¶

PARAMETER	DESCRIPTION
`labels`	Labels for computing the sequence classification/regression loss. Indices should be in `[0, ..., config.num_labels - 1]`. If `config.num_labels == 1` a regression loss is computed (Mean-Square loss); if `config.num_labels > 1` a classification loss is computed (Cross-Entropy). TYPE: `mindspore.Tensor` of shape `(batch_size,)`, optional DEFAULT: `None`

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

def forward(
    self,
    input_ids: Optional[mindspore.Tensor] = None,
    attention_mask: Optional[mindspore.Tensor] = None,
    position_ids: Optional[mindspore.Tensor] = None,
    head_mask: Optional[mindspore.Tensor] = None,
    inputs_embeds: Optional[mindspore.Tensor] = None,
    past_key_values: Optional[List[mindspore.Tensor]] = None,
    use_cache: Optional[bool] = None,
    output_hidden_states: Optional[bool] = None,
    output_attentions: Optional[bool] = None,
    labels: Optional[mindspore.Tensor] = None,
) -> Tuple[mindspore.Tensor]:
    r"""
    Args:
        labels (`mindspore.Tensor` of shape `(batch_size,)`, *optional*):
            Labels for computing the sequence classification/regression loss. Indices should be in `[0, ...,
            config.num_labels - 1]`. If `config.num_labels == 1` a regression loss is computed (Mean-Square loss); if
            `config.num_labels > 1` a classification loss is computed (Cross-Entropy).
    """
    outputs = self.ernie_m(
        input_ids,
        attention_mask=attention_mask,
        position_ids=position_ids,
        head_mask=head_mask,
        inputs_embeds=inputs_embeds,
        past_key_values=past_key_values,
        output_hidden_states=output_hidden_states,
        output_attentions=output_attentions,
    )

    pooled_output = outputs[1]

    pooled_output = self.dropout(pooled_output)
    logits = self.classifier(pooled_output)

    loss = None
    if labels is not None:
        if self.config.problem_type is None:
            if self.num_labels == 1:
                self.config.problem_type = "regression"
            elif self.num_labels > 1 and labels.dtype in (mindspore.int64, mindspore.int32):
                self.config.problem_type = "single_label_classification"
            else:
                self.config.problem_type = "multi_label_classification"

        if self.config.problem_type == "regression":
            if self.num_labels == 1:
                loss = F.mse_loss(logits.squeeze(), labels.squeeze())
            else:
                loss = F.mse_loss(logits, labels)
        elif self.config.problem_type == "single_label_classification":
            loss = F.cross_entropy(logits.view(-1, self.num_labels), labels.view(-1))
        elif self.config.problem_type == "multi_label_classification":
            loss = F.binary_cross_entropy_with_logits(logits, labels)

    output = (logits,) + outputs[2:]
    return ((loss,) + output) if loss is not None else output
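
The way `forward` above picks a loss can be summarised as a small decision function. The names `infer_problem_type` and `LOSS_BY_PROBLEM_TYPE` are hypothetical and exist only in this sketch; the boolean `labels_are_integer` stands in for the `labels.dtype in (mindspore.int64, mindspore.int32)` check.

```python
from typing import Optional


def infer_problem_type(num_labels: int, labels_are_integer: bool, configured: Optional[str] = None) -> str:
    """Hypothetical helper mirroring the branch order used in forward."""
    if configured is not None:
        return configured  # an explicit config.problem_type is never overridden
    if num_labels == 1:
        return "regression"
    if num_labels > 1 and labels_are_integer:
        return "single_label_classification"
    return "multi_label_classification"


# Loss function names as they appear in the source above.
LOSS_BY_PROBLEM_TYPE = {
    "regression": "mse_loss",
    "single_label_classification": "cross_entropy",
    "multi_label_classification": "binary_cross_entropy_with_logits",
}

regression_case = infer_problem_type(1, labels_are_integer=False)
single_label_case = infer_problem_type(3, labels_are_integer=True)
multi_label_case = infer_problem_type(3, labels_are_integer=False)
configured_case = infer_problem_type(3, labels_are_integer=True, configured="regression")
```

Note that the source writes the inferred value back to `self.config.problem_type`, so the first call that supplies `labels` fixes the problem type for later calls on the same model instance.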

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMForTokenClassification` ¶

Bases: MSErnieMPreTrainedModel

This class represents a token classification model based on MSErnieM architecture. It is designed for tasks that involve assigning labels to individual tokens within a sequence.

The MSErnieMForTokenClassification class inherits from MSErnieMPreTrainedModel and extends its functionality by adding a token classification layer on top of the base model.

The class's constructor initializes the model and sets up the necessary components. It takes a config object as input and initializes the base model with the provided configuration. The number of labels for token classification is also stored for later use. The dropout layer and the token classification layer are defined. Lastly, the post_init method is called to perform any additional initialization steps.

The forward method is the main entry point for using the model for token classification. It takes input tensors such as input_ids, attention_mask, position_ids, head_mask, inputs_embeds, past_key_values, and labels, along with the boolean flags output_hidden_states and output_attentions.

The method first passes the input tensors through the base model (self.ernie_m) to obtain the sequence output. The sequence output is then passed through a dropout layer to prevent overfitting. Finally, the token classification layer (self.classifier) is applied to generate logits for each token in the sequence.

If labels are provided, the method calculates the token classification loss using the cross-entropy function. The loss is computed by reshaping the logits and labels tensors to have a shape of (batch_size * sequence_length, num_labels) and applying the cross-entropy function.

The method returns a tuple containing the logits for each token, as well as any additional outputs from the base model. If a loss is calculated, it is included in the output tuple.

Note

The MSErnieMForTokenClassification class assumes that the input tensors are of type mindspore.Tensor, and the labels tensor should have a shape of (batch_size, sequence_length) with indices in the range [0, ..., config.num_labels - 1].
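
The flatten-then-cross-entropy step described above can be sketched on nested lists. This is a minimal illustration, not the library's implementation: `token_classification_loss` is a hypothetical helper, and averaging over all tokens is an assumption about the reduction.

```python
import math
from typing import List


def token_classification_loss(logits: List[List[List[float]]], labels: List[List[int]]) -> float:
    """Mean cross-entropy after flattening the batch and sequence dimensions."""
    num_labels = len(logits[0][0])
    # (batch_size, sequence_length, num_labels) -> (batch_size * sequence_length, num_labels)
    flat_logits = [token for sequence in logits for token in sequence]
    # (batch_size, sequence_length) -> (batch_size * sequence_length,)
    flat_labels = [label for sequence in labels for label in sequence]

    total = 0.0
    for row, target in zip(flat_logits, flat_labels):
        if not 0 <= target < num_labels:
            raise ValueError(f"label {target} is outside [0, {num_labels - 1}]")
        log_norm = math.log(sum(math.exp(x) for x in row))
        total += log_norm - row[target]
    return total / len(flat_labels)


# One sequence of two tokens, two possible labels, uniform logits.
example_logits = [[[0.0, 0.0], [0.0, 0.0]]]
example_labels = [[0, 1]]
example_loss = token_classification_loss(example_logits, example_labels)
```

With uniform logits over two labels, every token contributes `log(2)`, so the mean loss is also `log(2)`.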

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

class MSErnieMForTokenClassification(MSErnieMPreTrainedModel):

    """
    This class represents a token classification model based on MSErnieM architecture.
    It is designed for tasks that involve assigning labels to individual tokens within a sequence.

    The `MSErnieMForTokenClassification` class inherits from `MSErnieMPreTrainedModel` and extends its functionality
    by adding a token classification layer on top of the base model.

    The class's constructor initializes the model and sets up the necessary components.
    It takes a `config` object as input and initializes the base model with the provided configuration.
    The number of labels for token classification is also stored for later use.
    The dropout layer and the token classification layer are defined. Lastly, the `post_init` method is called to
    perform any additional initialization steps.

    The `forward` method is the main entry point for using the model for token classification.
    It takes input tensors such as `input_ids`, `attention_mask`, `position_ids`, `head_mask`, `inputs_embeds`,
    `past_key_values`, and `labels`, along with the boolean flags `output_hidden_states` and `output_attentions`.

    The method first passes the input tensors through the base model (`self.ernie_m`) to obtain the sequence output.
    The sequence output is then passed through a dropout layer to prevent overfitting.
    Finally, the token classification layer (`self.classifier`) is applied to generate logits for each token in the
    sequence.

    If `labels` are provided, the method calculates the token classification loss using the cross-entropy function.
    The loss is computed by reshaping the logits and labels tensors to have a shape of
    `(batch_size * sequence_length, num_labels)` and applying the cross-entropy function.

    The method returns a tuple containing the logits for each token, as well as any additional outputs from the base model.
    If a loss is calculated, it is included in the output tuple.

    Note:
        The `MSErnieMForTokenClassification` class assumes that the input tensors are of type `mindspore.Tensor`,
        and the labels tensor should have a shape of `(batch_size, sequence_length)` with indices in the range
        `[0, ..., config.num_labels - 1]`.

    """
    # Copied from transformers.models.bert.modeling_bert.BertForTokenClassification.__init__ with Bert->ErnieM,bert->ernie_m
    def __init__(self, config):
        """
        Initializes a new instance of the MSErnieMForTokenClassification class.

        Args:
            self: The instance of the class.
            config:
                An object containing configuration parameters for the model.

                - Type: configuration object whose settings are read as attributes (for example `config.num_labels`)
                - Purpose: Configuration settings for the model.
                - Restrictions: Must expose `num_labels`, `hidden_size`, `classifier_dropout` and `hidden_dropout_prob`.

        Returns:
            None.

        Raises:
            AttributeError: If `config` lacks an attribute that this method reads, such as `num_labels`.
        """
        super().__init__(config)
        self.num_labels = config.num_labels

        self.ernie_m = MSErnieMModel(config, add_pooling_layer=False)
        classifier_dropout = (
            config.classifier_dropout if config.classifier_dropout is not None else config.hidden_dropout_prob
        )
        self.dropout = nn.Dropout(p=classifier_dropout)
        self.classifier = nn.Linear(config.hidden_size, config.num_labels)

        # Initialize weights and apply final processing
        self.post_init()

    def forward(
        self,
        input_ids: Optional[mindspore.Tensor] = None,
        attention_mask: Optional[mindspore.Tensor] = None,
        position_ids: Optional[mindspore.Tensor] = None,
        head_mask: Optional[mindspore.Tensor] = None,
        inputs_embeds: Optional[mindspore.Tensor] = None,
        past_key_values: Optional[List[mindspore.Tensor]] = None,
        output_hidden_states: Optional[bool] = None,
        output_attentions: Optional[bool] = None,
        labels: Optional[mindspore.Tensor] = None,
    ) -> Tuple[mindspore.Tensor]:
        r"""
        Args:
            labels (`mindspore.Tensor` of shape `(batch_size, sequence_length)`, *optional*):
                Labels for computing the token classification loss. Indices should be in `[0, ..., config.num_labels - 1]`.
        """
        outputs = self.ernie_m(
            input_ids,
            attention_mask=attention_mask,
            position_ids=position_ids,
            head_mask=head_mask,
            inputs_embeds=inputs_embeds,
            past_key_values=past_key_values,
            output_attentions=output_attentions,
            output_hidden_states=output_hidden_states,
        )

        sequence_output = outputs[0]

        sequence_output = self.dropout(sequence_output)
        logits = self.classifier(sequence_output)

        loss = None
        if labels is not None:
            loss = F.cross_entropy(logits.view(-1, self.num_labels), labels.view(-1))

        output = (logits,) + outputs[2:]
        return ((loss,) + output) if loss is not None else output

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMForTokenClassification.__init__(config)` ¶

Initializes a new instance of the MSErnieMForTokenClassification class.

PARAMETER	DESCRIPTION
`self`	The instance of the class.
`config`	The model configuration. It must provide `num_labels`, `hidden_size`, `hidden_dropout_prob`, and `classifier_dropout` (which may be `None`). TYPE: `ErnieMConfig`

RETURNS	DESCRIPTION
	None.

RAISES	DESCRIPTION
`AttributeError`	If `config` lacks a required attribute such as `num_labels`.

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

def __init__(self, config):
    """
    Initializes a new instance of the MSErnieMForTokenClassification class.

    Args:
        self: The instance of the class.
        config (ErnieMConfig):
            The model configuration. It must provide `num_labels`, `hidden_size`, `hidden_dropout_prob`,
            and `classifier_dropout` (which may be `None`).

    Returns:
        None.

    Raises:
        AttributeError: If `config` lacks a required attribute such as `num_labels`.
    """
    super().__init__(config)
    self.num_labels = config.num_labels

    self.ernie_m = MSErnieMModel(config, add_pooling_layer=False)
    classifier_dropout = (
        config.classifier_dropout if config.classifier_dropout is not None else config.hidden_dropout_prob
    )
    self.dropout = nn.Dropout(p=classifier_dropout)
    self.classifier = nn.Linear(config.hidden_size, config.num_labels)

    # Initialize weights and apply final processing
    self.post_init()
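
The constructor picks the dropout rate for the classification head by preferring `classifier_dropout` and falling back to `hidden_dropout_prob`. The following is a minimal sketch of that selection in plain Python. `HeadConfig`, its default values, and `resolve_classifier_dropout` are hypothetical names introduced for illustration only; they are not part of mindnlp.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HeadConfig:
    """Hypothetical stand-in holding only the two fields read by the selection logic."""
    hidden_dropout_prob: float = 0.1  # illustrative default, not taken from ErnieMConfig
    classifier_dropout: Optional[float] = None


def resolve_classifier_dropout(config: HeadConfig) -> float:
    # Prefer the dedicated setting; fall back to the shared hidden dropout.
    # The check is `is not None`, so an explicit 0.0 is respected.
    if config.classifier_dropout is not None:
        return config.classifier_dropout
    return config.hidden_dropout_prob


print(resolve_classifier_dropout(HeadConfig()))
print(resolve_classifier_dropout(HeadConfig(classifier_dropout=0.0)))
```

Testing against `None` rather than relying on truthiness matters here: a user who sets `classifier_dropout=0.0` to disable dropout on the head would otherwise silently get `hidden_dropout_prob` instead.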

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMForTokenClassification.forward(input_ids=None, attention_mask=None, position_ids=None, head_mask=None, inputs_embeds=None, past_key_values=None, output_hidden_states=None, output_attentions=None, labels=None)` ¶

PARAMETER	DESCRIPTION
`labels`	Labels for computing the token classification loss. Indices should be in `[0, ..., config.num_labels - 1]`. TYPE: `mindspore.Tensor` of shape `(batch_size, sequence_length)`, optional DEFAULT: `None`

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

def forward(
    self,
    input_ids: Optional[mindspore.Tensor] = None,
    attention_mask: Optional[mindspore.Tensor] = None,
    position_ids: Optional[mindspore.Tensor] = None,
    head_mask: Optional[mindspore.Tensor] = None,
    inputs_embeds: Optional[mindspore.Tensor] = None,
    past_key_values: Optional[List[mindspore.Tensor]] = None,
    output_hidden_states: Optional[bool] = None,
    output_attentions: Optional[bool] = None,
    labels: Optional[mindspore.Tensor] = None,
) -> Tuple[mindspore.Tensor]:
    r"""
    Args:
        labels (`mindspore.Tensor` of shape `(batch_size, sequence_length)`, *optional*):
            Labels for computing the token classification loss. Indices should be in `[0, ..., config.num_labels - 1]`.
    """
    outputs = self.ernie_m(
        input_ids,
        attention_mask=attention_mask,
        position_ids=position_ids,
        head_mask=head_mask,
        inputs_embeds=inputs_embeds,
        past_key_values=past_key_values,
        output_attentions=output_attentions,
        output_hidden_states=output_hidden_states,
    )

    sequence_output = outputs[0]

    sequence_output = self.dropout(sequence_output)
    logits = self.classifier(sequence_output)

    loss = None
    if labels is not None:
        loss = F.cross_entropy(logits.view(-1, self.num_labels), labels.view(-1))

    output = (logits,) + outputs[2:]
    return ((loss,) + output) if loss is not None else output
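
The loss step flattens the per-token logits and labels before applying cross-entropy. The sketch below reproduces that arithmetic with nested Python lists instead of tensors, assuming the loss is averaged over all tokens. `token_cross_entropy` is a hypothetical helper written for this illustration, not a mindnlp function.

```python
import math
from typing import List


def token_cross_entropy(logits: List[List[List[float]]], labels: List[List[int]]) -> float:
    """Mean cross-entropy over all tokens of a (batch, seq, num_labels) nested list."""
    # Flatten (batch, seq, num_labels) -> (batch * seq, num_labels)
    flat_logits = [token for sequence in logits for token in sequence]
    # Flatten (batch, seq) -> (batch * seq,)
    flat_labels = [label for sequence in labels for label in sequence]

    total = 0.0
    for scores, label in zip(flat_logits, flat_labels):
        # Numerically stable log-sum-exp of the scores for this token
        peak = max(scores)
        log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores))
        # Negative log-probability of the correct label
        total += log_norm - scores[label]
    return total / len(flat_labels)


# batch_size=1, sequence_length=2, num_labels=3
logits = [[[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]]]
labels = [[1, 0]]
loss = token_cross_entropy(logits, labels)
print(loss)
```

The first token has uniform logits, so its contribution is `log(3)`; the second token favours the correct label and contributes less.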

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMModel` ¶

Bases: MSErnieMPreTrainedModel

This class represents the MSErnieMModel, a subclass of MSErnieMPreTrainedModel. It is the bare Ernie-M encoder: it returns the hidden states of the input sequence (and, optionally, a pooled output) without any task-specific head.

The MSErnieMModel class includes methods for initializing the model, getting and setting input embeddings, pruning attention heads, and running the forward pass.

METHOD	DESCRIPTION
`__init__`	Initializes the MSErnieMModel with the given configuration. By default, it adds a pooling layer to the model.
`get_input_embeddings`	Returns the word embeddings used as input to the model.
`set_input_embeddings`	Sets the word embeddings used as input to the model.
`_prune_heads`	Prunes the specified heads in the model.
`forward`	Runs the forward pass on the given inputs.

Note

The MSErnieMModel class inherits from the MSErnieMPreTrainedModel, which provides additional functionality and methods.

Example

>>> config = ErnieMConfig()
>>> model = MSErnieMModel(config)
>>> input_ids = ...
>>> position_ids = ...
>>> attention_mask = ...
>>> output = model.forward(input_ids, position_ids, attention_mask)

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

class MSErnieMModel(MSErnieMPreTrainedModel):

    """
    This class represents the MSErnieMModel, a subclass of MSErnieMPreTrainedModel.
    It is the bare Ernie-M encoder: it returns the hidden states of the input sequence (and, optionally,
    a pooled output) without any task-specific head.

    The MSErnieMModel class includes methods for initializing the model, getting and setting input embeddings,
    pruning attention heads, and running the forward pass.

    Methods:
        __init__: Initializes the MSErnieMModel with the given configuration.
            By default, it adds a pooling layer to the model.
        get_input_embeddings: Returns the word embeddings used as input to the model.
        set_input_embeddings: Sets the word embeddings used as input to the model.
        _prune_heads: Prunes the specified heads in the model.
        forward: Runs the forward pass on the given inputs.

    Note:
        The MSErnieMModel class inherits from the MSErnieMPreTrainedModel, which provides additional functionality
        and methods.

    Example:
        ```python
        >>> config = ErnieMConfig()
        >>> model = MSErnieMModel(config)
        >>> input_ids = ...
        >>> position_ids = ...
        >>> attention_mask = ...
        >>> output = model.forward(input_ids, position_ids, attention_mask)
        ```
    """
    def __init__(self, config, add_pooling_layer=True):
        """
        Initializes a new MSErnieMModel instance.

        Args:
            self: The instance of the MSErnieMModel class.
            config:
                An object containing configuration settings for the model.

                - Type: object
                - Purpose: Specifies the configuration settings for the model.
            add_pooling_layer:
                A boolean flag indicating whether to add a pooling layer.

                - Type: bool
                - Purpose: Specifies whether to include a pooling layer in the model.
                - Restrictions: Must be a boolean value.

        Returns:
            None.

        Raises:
            None.
        """
        super().__init__(config)
        self.initializer_range = config.initializer_range
        self.embeddings = MSErnieMEmbeddings(config)
        self.encoder = MSErnieMEncoder(config)
        self.pooler = MSErnieMPooler(config) if add_pooling_layer else None
        self.post_init()

    def get_input_embeddings(self):
        """
        Method: get_input_embeddings

        Description:
            This method returns the input embeddings from the MSErnieMModel class.

        Args:
            self: MSErnieMModel
                The instance of the MSErnieMModel class.

                - Type: MSErnieMModel object
                - Purpose: To access the embeddings from the model.
                - Restrictions: None

        Returns:
            The word embedding layer stored at `self.embeddings.word_embeddings`.

        Raises:
            None.
        """
        return self.embeddings.word_embeddings

    def set_input_embeddings(self, value):
        """
        Sets the input embeddings for the MSErnieMModel.

        Args:
            self (MSErnieMModel): The instance of the MSErnieMModel.
            value: The new word embedding layer, assigned to `self.embeddings.word_embeddings`.

        Returns:
            None.

        Raises:
            None.
        """
        self.embeddings.word_embeddings = value

    def _prune_heads(self, heads_to_prune):
        """
        Prunes heads of the model.

        Args:
            heads_to_prune: dict of {layer_num: list of heads to prune in this layer}.
                See base class PreTrainedModel.
        """
        for layer, heads in heads_to_prune.items():
            self.encoder.layers[layer].self_attn.prune_heads(heads)

    def forward(
        self,
        input_ids: Optional[mindspore.Tensor] = None,
        position_ids: Optional[mindspore.Tensor] = None,
        attention_mask: Optional[mindspore.Tensor] = None,
        head_mask: Optional[mindspore.Tensor] = None,
        inputs_embeds: Optional[mindspore.Tensor] = None,
        past_key_values: Optional[Tuple[Tuple[mindspore.Tensor]]] = None,
        use_cache: Optional[bool] = None,
        output_hidden_states: Optional[bool] = None,
        output_attentions: Optional[bool] = None,
    ) -> Tuple[mindspore.Tensor]:
        '''
        Runs the forward pass of the MSErnieMModel.

        Args:
            self: The object itself.
            input_ids (Optional[mindspore.Tensor]):
                The input tensor containing the indices of input sequence tokens in the vocabulary.
            position_ids (Optional[mindspore.Tensor]):
                The input tensor containing the position indices of each input sequence token in the sequence.
            attention_mask (Optional[mindspore.Tensor]):
                The input tensor containing the attention mask to avoid performing attention on padding tokens.
            head_mask (Optional[mindspore.Tensor]):
                The input tensor containing the mask to nullify selected heads of the self-attention modules.
            inputs_embeds (Optional[mindspore.Tensor]):
                The input tensor containing the embedded representation of the input sequence.
            past_key_values (Optional[Tuple[Tuple[mindspore.Tensor]]]):
                The input tensor containing the cached key and value tensors of the self-attention mechanism.
            use_cache (Optional[bool]): Whether to use the cache for the decoding steps of the model.
            output_hidden_states (Optional[bool]): Whether to return the hidden states of all layers.
            output_attentions (Optional[bool]): Whether to return the attention weights.

        Returns:
            Tuple[mindspore.Tensor]: A tuple containing the sequence output tensor, the pooled output tensor
                (`None` if the model was created with `add_pooling_layer=False`), and the remaining encoder outputs.

        Raises:
            ValueError: If both input_ids and inputs_embeds are provided.

        '''
        if input_ids is not None and inputs_embeds is not None:
            raise ValueError("You cannot specify both input_ids and inputs_embeds at the same time.")

        # init the default bool value
        output_attentions = output_attentions if output_attentions is not None else self.config.output_attentions
        output_hidden_states = (
            output_hidden_states if output_hidden_states is not None else self.config.output_hidden_states
        )

        head_mask = self.get_head_mask(head_mask, self.config.num_hidden_layers)

        past_key_values_length = 0
        if past_key_values is not None:
            past_key_values_length = past_key_values[0][0].shape[2]

        # Adapted from paddlenlp.transformers.ernie_m.ErnieMModel
        if attention_mask is None:
            attention_mask = (input_ids == 0).to(self.dtype)
            attention_mask = attention_mask * float(ops.finfo(attention_mask.dtype).min)
            if past_key_values is not None:
                batch_size = past_key_values[0][0].shape[0]
                past_mask = ops.zeros([batch_size, 1, 1, past_key_values_length], dtype=attention_mask.dtype)
                attention_mask = ops.concat([past_mask, attention_mask], dim=-1)
        # For 2D attention_mask from tokenizer
        elif attention_mask.ndim == 2:
            attention_mask = attention_mask.to(self.dtype)
            attention_mask = 1.0 - attention_mask
            attention_mask = attention_mask * float(ops.finfo(attention_mask.dtype).min)

        extended_attention_mask = attention_mask.unsqueeze(1).unsqueeze(1)

        embedding_output = self.embeddings(
            input_ids=input_ids,
            position_ids=position_ids,
            inputs_embeds=inputs_embeds,
            past_key_values_length=past_key_values_length,
        )
        encoder_outputs = self.encoder(
            embedding_output,
            attention_mask=extended_attention_mask,
            head_mask=head_mask,
            past_key_values=past_key_values,
            output_attentions=output_attentions,
            output_hidden_states=output_hidden_states,
        )

        sequence_output = encoder_outputs[0]
        pooler_output = self.pooler(sequence_output) if self.pooler is not None else None
        return (sequence_output, pooler_output) + encoder_outputs[1:]

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMModel.__init__(config, add_pooling_layer=True)` ¶

Initializes a new MSErnieMModel instance.

PARAMETER	DESCRIPTION
`self`	The instance of the MSErnieMModel class.
`config`	An object containing configuration settings for the model. Type: object Purpose: Specifies the configuration settings for the model.
`add_pooling_layer`	A boolean flag indicating whether to add a pooling layer. Type: bool Purpose: Specifies whether to include a pooling layer in the model. Restrictions: Must be a boolean value. DEFAULT: `True`

RETURNS	DESCRIPTION
	None.

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

def __init__(self, config, add_pooling_layer=True):
    """
    Initializes a new MSErnieMModel instance.

    Args:
        self: The instance of the MSErnieMModel class.
        config:
            An object containing configuration settings for the model.

            - Type: object
            - Purpose: Specifies the configuration settings for the model.
        add_pooling_layer:
            A boolean flag indicating whether to add a pooling layer.

            - Type: bool
            - Purpose: Specifies whether to include a pooling layer in the model.
            - Restrictions: Must be a boolean value.

    Returns:
        None.

    Raises:
        None.
    """
    super().__init__(config)
    self.initializer_range = config.initializer_range
    self.embeddings = MSErnieMEmbeddings(config)
    self.encoder = MSErnieMEncoder(config)
    self.pooler = MSErnieMPooler(config) if add_pooling_layer else None
    self.post_init()

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMModel.forward(input_ids=None, position_ids=None, attention_mask=None, head_mask=None, inputs_embeds=None, past_key_values=None, use_cache=None, output_hidden_states=None, output_attentions=None)` ¶

Runs the forward pass of the MSErnieMModel.

PARAMETER	DESCRIPTION
`self`	The object itself.
`input_ids`	The input tensor containing the indices of input sequence tokens in the vocabulary. TYPE: `Optional[Tensor]` DEFAULT: `None`
`position_ids`	The input tensor containing the position indices of each input sequence token in the sequence. TYPE: `Optional[Tensor]` DEFAULT: `None`
`attention_mask`	The input tensor containing the attention mask to avoid performing attention on padding tokens. TYPE: `Optional[Tensor]` DEFAULT: `None`
`head_mask`	The input tensor containing the mask to nullify selected heads of the self-attention modules. TYPE: `Optional[Tensor]` DEFAULT: `None`
`inputs_embeds`	The input tensor containing the embedded representation of the input sequence. TYPE: `Optional[Tensor]` DEFAULT: `None`
`past_key_values`	The input tensor containing the cached key and value tensors of the self-attention mechanism. TYPE: `Optional[Tuple[Tuple[Tensor]]]` DEFAULT: `None`
`use_cache`	Whether to use the cache for the decoding steps of the model. TYPE: `Optional[bool]` DEFAULT: `None`
`output_hidden_states`	Whether to return the hidden states of all layers. TYPE: `Optional[bool]` DEFAULT: `None`
`output_attentions`	Whether to return the attention weights. TYPE: `Optional[bool]` DEFAULT: `None`

RETURNS	DESCRIPTION
`Tuple[Tensor]`	A tuple containing the sequence output tensor, the pooled output tensor (`None` if the model was created with `add_pooling_layer=False`), and the remaining encoder outputs.

RAISES	DESCRIPTION
`ValueError`	If both input_ids and inputs_embeds are provided.
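
The forward pass converts the attention mask into an additive mask before passing it to the encoder: kept positions become 0 and masked positions become the most negative value of the dtype. The sketch below mirrors that conversion with nested lists. The helper names and the float32 constant are assumptions made for this illustration; the real code works on `mindspore.Tensor` objects and uses `ops.finfo`.

```python
from typing import List

# Most negative finite float32 value; stands in for finfo(dtype).min (assumption: float32).
FLOAT32_MIN = -3.4028234663852886e38


def additive_mask_from_padding(mask_2d: List[List[int]]) -> List[List[float]]:
    """Turn a 0/1 mask (1 = real token) into an additive mask (0 = keep, large negative = drop)."""
    return [[(1.0 - keep) * FLOAT32_MIN for keep in row] for row in mask_2d]


def default_mask_from_ids(input_ids: List[List[int]], pad_id: int = 0) -> List[List[float]]:
    """Fallback when no mask is given: positions whose id equals pad_id are masked out."""
    return [[float(tok == pad_id) * FLOAT32_MIN for tok in row] for row in input_ids]


def extend(mask: List[List[float]]) -> List[List[List[List[float]]]]:
    """Insert two singleton axes: (batch, seq) -> (batch, 1, 1, seq)."""
    return [[[row]] for row in mask]


tokenizer_mask = additive_mask_from_padding([[1, 1, 0]])
fallback_mask = default_mask_from_ids([[5, 7, 0]])
extended = extend(tokenizer_mask)
```

Adding this mask to the raw attention scores drives the masked positions towards zero weight after the softmax, while the two singleton axes let one mask broadcast across all heads and query positions.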

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

def forward(
    self,
    input_ids: Optional[mindspore.Tensor] = None,
    position_ids: Optional[mindspore.Tensor] = None,
    attention_mask: Optional[mindspore.Tensor] = None,
    head_mask: Optional[mindspore.Tensor] = None,
    inputs_embeds: Optional[mindspore.Tensor] = None,
    past_key_values: Optional[Tuple[Tuple[mindspore.Tensor]]] = None,
    use_cache: Optional[bool] = None,
    output_hidden_states: Optional[bool] = None,
    output_attentions: Optional[bool] = None,
) -> Tuple[mindspore.Tensor]:
    '''
    Runs the forward pass of the MSErnieMModel.

    Args:
        self: The object itself.
        input_ids (Optional[mindspore.Tensor]):
            The input tensor containing the indices of input sequence tokens in the vocabulary.
        position_ids (Optional[mindspore.Tensor]):
            The input tensor containing the position indices of each input sequence token in the sequence.
        attention_mask (Optional[mindspore.Tensor]):
            The input tensor containing the attention mask to avoid performing attention on padding tokens.
        head_mask (Optional[mindspore.Tensor]):
            The input tensor containing the mask to nullify selected heads of the self-attention modules.
        inputs_embeds (Optional[mindspore.Tensor]):
            The input tensor containing the embedded representation of the input sequence.
        past_key_values (Optional[Tuple[Tuple[mindspore.Tensor]]]):
            The input tensor containing the cached key and value tensors of the self-attention mechanism.
        use_cache (Optional[bool]): Whether to use the cache for the decoding steps of the model.
        output_hidden_states (Optional[bool]): Whether to return the hidden states of all layers.
        output_attentions (Optional[bool]): Whether to return the attention weights.

    Returns:
        Tuple[mindspore.Tensor]: A tuple containing the sequence output tensor, the pooled output tensor
            (`None` if the model was created with `add_pooling_layer=False`), and the remaining encoder outputs.

    Raises:
        ValueError: If both input_ids and inputs_embeds are provided.

    '''
    if input_ids is not None and inputs_embeds is not None:
        raise ValueError("You cannot specify both input_ids and inputs_embeds at the same time.")

    # init the default bool value
    output_attentions = output_attentions if output_attentions is not None else self.config.output_attentions
    output_hidden_states = (
        output_hidden_states if output_hidden_states is not None else self.config.output_hidden_states
    )

    head_mask = self.get_head_mask(head_mask, self.config.num_hidden_layers)

    past_key_values_length = 0
    if past_key_values is not None:
        past_key_values_length = past_key_values[0][0].shape[2]

    # Adapted from paddlenlp.transformers.ernie_m.ErnieMModel
    if attention_mask is None:
        attention_mask = (input_ids == 0).to(self.dtype)
        attention_mask = attention_mask * float(ops.finfo(attention_mask.dtype).min)
        if past_key_values is not None:
            batch_size = past_key_values[0][0].shape[0]
            past_mask = ops.zeros([batch_size, 1, 1, past_key_values_length], dtype=attention_mask.dtype)
            attention_mask = ops.concat([past_mask, attention_mask], dim=-1)
    # For 2D attention_mask from tokenizer
    elif attention_mask.ndim == 2:
        attention_mask = attention_mask.to(self.dtype)
        attention_mask = 1.0 - attention_mask
        attention_mask = attention_mask * float(ops.finfo(attention_mask.dtype).min)

    extended_attention_mask = attention_mask.unsqueeze(1).unsqueeze(1)

    embedding_output = self.embeddings(
        input_ids=input_ids,
        position_ids=position_ids,
        inputs_embeds=inputs_embeds,
        past_key_values_length=past_key_values_length,
    )
    encoder_outputs = self.encoder(
        embedding_output,
        attention_mask=extended_attention_mask,
        head_mask=head_mask,
        past_key_values=past_key_values,
        output_attentions=output_attentions,
        output_hidden_states=output_hidden_states,
    )

    sequence_output = encoder_outputs[0]
    pooler_output = self.pooler(sequence_output) if self.pooler is not None else None
    return (sequence_output, pooler_output) + encoder_outputs[1:]
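
Because the model returns plain tuples, callers rely on positions: slot 0 is the sequence output and slot 1 is the pooled output, which is why the token classification head reads `outputs[0]` and forwards `outputs[2:]`. The sketch below illustrates that layout with placeholder strings in place of tensors. The function names and the labels of the extra encoder outputs are assumptions made for illustration.

```python
from typing import Optional, Tuple


def base_model_outputs(encoder_outputs: Tuple, pooled: Optional[str]) -> Tuple:
    # Mirrors: (sequence_output, pooler_output) + encoder_outputs[1:]
    sequence_output = encoder_outputs[0]
    return (sequence_output, pooled) + encoder_outputs[1:]


def head_outputs(logits: str, outputs: Tuple, loss: Optional[str] = None) -> Tuple:
    # Mirrors the token classification head: skip slots 0 and 1, keep the rest.
    output = (logits,) + outputs[2:]
    return ((loss,) + output) if loss is not None else output


# Placeholder strings stand in for tensors.
encoder_outputs = ("sequence", "hidden_states", "attentions")
outputs = base_model_outputs(encoder_outputs, pooled=None)
without_loss = head_outputs("logits", outputs)
with_loss = head_outputs("logits", outputs, loss="loss")
```

Note that slot 1 is always present, even when it holds `None`, so the indices of the later entries do not shift when the pooling layer is disabled.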

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMModel.get_input_embeddings()` ¶

Description

This method returns the input embeddings from the MSErnieMModel class.

PARAMETER	DESCRIPTION
`self`	MSErnieMModel The instance of the MSErnieMModel class. Type: MSErnieMModel object Purpose: To access the embeddings from the model. Restrictions: None

RETURNS	DESCRIPTION
	The word embedding layer stored at `self.embeddings.word_embeddings`.

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

def get_input_embeddings(self):
    """
    Method: get_input_embeddings

    Description:
        This method returns the input embeddings from the MSErnieMModel class.

    Args:
        self: MSErnieMModel
            The instance of the MSErnieMModel class.

            - Type: MSErnieMModel object
            - Purpose: To access the embeddings from the model.
            - Restrictions: None

    Returns:
        The word embedding layer stored at `self.embeddings.word_embeddings`.

    Raises:
        None.
    """
    return self.embeddings.word_embeddings

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMModel.set_input_embeddings(value)` ¶

Sets the input embeddings for the MSErnieMModel.

PARAMETER	DESCRIPTION
`self`	The instance of the MSErnieMModel. TYPE: `MSErnieMModel`
`value`	The new word embedding layer, assigned to `self.embeddings.word_embeddings`.

RETURNS	DESCRIPTION
	None.

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

def set_input_embeddings(self, value):
    """
    Sets the input embeddings for the MSErnieMModel.

    Args:
        self (MSErnieMModel): The instance of the MSErnieMModel.
        value: The new word embedding layer, assigned to `self.embeddings.word_embeddings`.

    Returns:
        None.

    Raises:
        None.
    """
    self.embeddings.word_embeddings = value

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMPooler` ¶

Bases: Module

A class representing a pooling layer for the MSErnieM model in MindSpore.

This class implements the pooling layer of the MSErnieM model. The pooling layer takes the hidden states of the model as input, selects the hidden state of the first token, and applies a dense layer followed by a tanh activation to it. The resulting pooled output is returned.

ATTRIBUTE	DESCRIPTION
`dense`	A dense layer used in the pooling layer. TYPE: `Linear`
`activation`	An activation function used in the pooling layer. TYPE: `Tanh`

METHOD	DESCRIPTION
`__init__`	Initializes the MSErnieMPooler instance.
`forward`	Computes the pooled output from the hidden states.

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

class MSErnieMPooler(nn.Module):

    """A class representing a pooling layer for the MSErnieM model in MindSpore.

    This class implements the pooling layer of the MSErnieM model.
    The pooling layer takes the hidden states of the model as input, selects the hidden state of the first token,
    and applies a dense layer followed by a tanh activation to it. The resulting pooled output is returned.

    Attributes:
        dense (nn.Linear): A dense layer used in the pooling layer.
        activation (nn.Tanh): An activation function used in the pooling layer.

    Methods:
        __init__: Initializes the MSErnieMPooler instance.
        forward: Computes the pooled output from the hidden states.

    """
    def __init__(self, config):
        """
        Initializes a new instance of the MSErnieMPooler class.

        Args:
            self: The object itself.
            config:
                An instance of the configuration class for MSErnieMPooler.

                - Type: Any valid configuration class.
                - Purpose: Specifies the configuration settings for the MSErnieMPooler instance.
                - Restrictions: None.

        Returns:
            None.

        Raises:
            None.
        """
        super().__init__()
        self.dense = nn.Linear(config.hidden_size, config.hidden_size)
        self.activation = nn.Tanh()

    def forward(self, hidden_states: mindspore.Tensor) -> mindspore.Tensor:
        """
        Constructs the pooled output tensor from the provided hidden states.

        Args:
            self (MSErnieMPooler): The instance of the MSErnieMPooler class.
            hidden_states (mindspore.Tensor): The input tensor representing the hidden states of the input sequence.
                It should be of shape (batch_size, sequence_length, hidden_size).

        Returns:
            mindspore.Tensor: The pooled output tensor obtained from the hidden states.
                It is a 2D tensor of shape (batch_size, hidden_size) representing the pooled output features.

        Note:
            The method performs no explicit validation of the shape or type of `hidden_states`.
        """
        # We "pool" the model by simply taking the hidden state corresponding
        # to the first token.
        first_token_tensor = hidden_states[:, 0]
        pooled_output = self.dense(first_token_tensor)
        pooled_output = self.activation(pooled_output)
        return pooled_output

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMPooler.__init__(config)` ¶

Initializes a new instance of the MSErnieMPooler class.

PARAMETER	DESCRIPTION
`self`	The object itself.
`config`	An instance of the configuration class for MSErnieMPooler. Type: Any valid configuration class. Purpose: Specifies the configuration settings for the MSErnieMPooler instance. Restrictions: None.

RETURNS	DESCRIPTION
	None.

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

def __init__(self, config):
    """
    Initializes a new instance of the MSErnieMPooler class.

    Args:
        self: The object itself.
        config:
            An instance of the configuration class for MSErnieMPooler.

            - Type: Any valid configuration class.
            - Purpose: Specifies the configuration settings for the MSErnieMPooler instance.
            - Restrictions: None.

    Returns:
        None.

    Raises:
        None.
    """
    super().__init__()
    self.dense = nn.Linear(config.hidden_size, config.hidden_size)
    self.activation = nn.Tanh()

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMPooler.forward(hidden_states)` ¶

Constructs the pooled output tensor from the provided hidden states.

PARAMETER	DESCRIPTION
`self`	The instance of the MSErnieMPooler class. TYPE: `MSErnieMPooler`
`hidden_states`	The input tensor representing the hidden states of the input sequence. It should be of shape (batch_size, sequence_length, hidden_size). TYPE: `Tensor`

RETURNS	DESCRIPTION
`Tensor`	mindspore.Tensor: The pooled output tensor obtained from the hidden states. It is a 2D tensor of shape (batch_size, hidden_size) representing the pooled output features.

Note

The method performs no explicit validation of the shape or type of `hidden_states`.

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

def forward(self, hidden_states: mindspore.Tensor) -> mindspore.Tensor:
    """
    Constructs the pooled output tensor from the provided hidden states.

    Args:
        self (MSErnieMPooler): The instance of the MSErnieMPooler class.
        hidden_states (mindspore.Tensor): The input tensor representing the hidden states of the input sequence.
            It should be of shape (batch_size, sequence_length, hidden_size).

    Returns:
        mindspore.Tensor: The pooled output tensor obtained from the hidden states.
            It is a 2D tensor of shape (batch_size, hidden_size) representing the pooled output features.

    Note:
        The method performs no explicit validation of the shape or type of `hidden_states`.
    """
    # We "pool" the model by simply taking the hidden state corresponding
    # to the first token.
    first_token_tensor = hidden_states[:, 0]
    pooled_output = self.dense(first_token_tensor)
    pooled_output = self.activation(pooled_output)
    return pooled_output
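
The pooling step selects the first token's hidden state, applies a dense layer, and then applies tanh. As a rough illustration of that arithmetic, the sketch below does the same with nested lists. `pool_first_token` and its explicit `weight` and `bias` arguments are hypothetical; in the class these parameters live inside `self.dense`.

```python
import math
from typing import List


def pool_first_token(
    hidden_states: List[List[List[float]]],
    weight: List[List[float]],
    bias: List[float],
) -> List[List[float]]:
    """hidden_states: (batch, seq, hidden); weight: (out, hidden); returns (batch, out)."""
    pooled = []
    for sequence in hidden_states:
        first_token = sequence[0]  # equivalent to hidden_states[:, 0]
        row = []
        for w_row, b in zip(weight, bias):
            # Dense layer: dot(first_token, w_row) + b, followed by tanh
            row.append(math.tanh(sum(x * w for x, w in zip(first_token, w_row)) + b))
        pooled.append(row)
    return pooled


# One sequence of two tokens with hidden size 2; only the first token is used.
states = [[[0.5, -1.0], [9.0, 9.0]]]
identity = [[1.0, 0.0], [0.0, 1.0]]
pooled = pool_first_token(states, identity, [0.0, 0.0])
```

With identity weights and zero bias the result is simply `tanh` of the first token's hidden state, which makes it easy to see that the second token has no influence on the output.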

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMPreTrainedModel` ¶

Bases: PreTrainedModel

An abstract class to handle weights initialization and a simple interface for downloading and loading pretrained models.

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

class MSErnieMPreTrainedModel(PreTrainedModel):
    """
    An abstract class to handle weights initialization and a simple interface for downloading and loading pretrained
    models.
    """
    config_class = ErnieMConfig
    base_model_prefix = "ernie_m"

    def _init_weights(self, cell):
        """Initialize the weights"""
        if isinstance(cell, nn.Linear):
            # Slightly different from the TF version which uses truncated_normal for initialization
            # cf https://github.com/pytorch/pytorch/pull/5617
            cell.weight.set_data(initializer(Normal(self.config.initializer_range),
                                                    cell.weight.shape, cell.weight.dtype))
            if cell.bias is not None:
                cell.bias.set_data(initializer('zeros', cell.bias.shape, cell.bias.dtype))
        elif isinstance(cell, nn.Embedding):
            weight = np.random.normal(0.0, self.config.initializer_range, cell.weight.shape)
            if cell.padding_idx:
                weight[cell.padding_idx] = 0

            cell.weight.set_data(Tensor(weight, cell.weight.dtype))
        elif isinstance(cell, nn.LayerNorm):
            cell.weight.set_data(initializer('ones', cell.weight.shape, cell.weight.dtype))
            cell.bias.set_data(initializer('zeros', cell.bias.shape, cell.bias.dtype))
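
The embedding branch of `_init_weights` can be sketched in plain Python as follows. `init_embedding` is a hypothetical helper, and the sketch deliberately differs from the listing in one respect: the listing tests `if cell.padding_idx:`, which Python evaluates as false when the index is 0, whereas the sketch uses an explicit `is not None` check so that a padding row at index 0 is also zeroed.

```python
import random


def init_embedding(num_rows, dim, std, padding_idx=None, seed=0):
    """Draw an embedding table from N(0, std) and zero the padding row."""
    rng = random.Random(seed)
    weight = [[rng.gauss(0.0, std) for _ in range(dim)] for _ in range(num_rows)]
    # Explicit None check (an assumption of this sketch): index 0 is a valid
    # padding index, but a bare truthiness test would skip it.
    if padding_idx is not None:
        weight[padding_idx] = [0.0] * dim
    return weight


table = init_embedding(num_rows=4, dim=3, std=0.02, padding_idx=0)
print(table[0])
```

Whether the truthiness test matters in practice depends on the padding index the model is configured with, so treat the difference as something to check rather than a confirmed defect.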

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMSelfAttention` ¶

Bases: Module

The MSErnieMSelfAttention class implements the multi-head self-attention layer of the ERNIE-M model. This class inherits from nn.Module.

Self-attention lets the model weigh every other token in a sequence when computing the representation of each token, which is how it captures long-range dependencies. The layer projects the hidden states into queries, keys and values, computes scaled dot-product attention scores, and returns the attention-weighted sum of the values.

The class provides methods for initializing the layer, reshaping projected tensors into a per-head layout (transpose_for_scores), and running the attention computation (forward). It supports 'absolute', 'relative_key' and 'relative_key_query' position embeddings, cross-attention over encoder hidden states, cached keys and values, and optional output of the attention probabilities.

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

class MSErnieMSelfAttention(nn.Module):

    """
    The `MSErnieMSelfAttention` class represents a self-attention mechanism for the MS ERNIE model.
    This class inherits from `nn.Module`.

    This class implements the self-attention mechanism, which is a crucial component in natural language processing
    tasks like machine translation and text summarization. The self-attention mechanism allows the model to weigh the
    significance of different words in a sequence when processing each word, enabling the model to capture long-range
    dependencies and improve performance on various language understanding tasks.

    The class includes methods for initializing the self-attention mechanism, transposing input tensors for calculating
    attention scores, and forwarding the self-attention mechanism using the provided input
    tensors. Additionally, it supports position embeddings and optional output of attention probabilities.

    The `MSErnieMSelfAttention` class ensures that the self-attention mechanism is efficiently implemented and
    seamlessly integrated into the MS ERNIE model, contributing to the model's effectiveness in natural language
    understanding and generation tasks.
    """
    def __init__(self, config, position_embedding_type=None):
        """
        Initializes the MSErnieMSelfAttention instance.

        Args:
            self (MSErnieMSelfAttention): The MSErnieMSelfAttention instance.
            config (object): An object containing configuration settings for the self-attention mechanism.
            position_embedding_type (str, optional): The type of position embedding to be used, defaults to None.
                Possible values are 'absolute', 'relative_key', or 'relative_key_query'.

        Returns:
            None.

        Raises:
            ValueError: If the hidden size in the configuration is not a multiple of the number of attention heads
                and the configuration does not have an 'embedding_size' attribute.
        """
        super().__init__()
        if config.hidden_size % config.num_attention_heads != 0 and not hasattr(config, "embedding_size"):
            raise ValueError(
                f"The hidden size ({config.hidden_size}) is not a multiple of the number of attention "
                f"heads ({config.num_attention_heads})"
            )

        self.num_attention_heads = config.num_attention_heads
        self.attention_head_size = int(config.hidden_size / config.num_attention_heads)
        self.all_head_size = self.num_attention_heads * self.attention_head_size

        self.q_proj = nn.Linear(config.hidden_size, self.all_head_size)
        self.k_proj = nn.Linear(config.hidden_size, self.all_head_size)
        self.v_proj = nn.Linear(config.hidden_size, self.all_head_size)

        self.dropout = nn.Dropout(p=config.attention_probs_dropout_prob)
        self.position_embedding_type = position_embedding_type or getattr(
            config, "position_embedding_type", "absolute"
        )
        if self.position_embedding_type in ('relative_key', 'relative_key_query'):
            self.max_position_embeddings = config.max_position_embeddings
            self.distance_embedding = nn.Embedding(2 * config.max_position_embeddings - 1, self.attention_head_size)

        self.is_decoder = config.is_decoder

    def transpose_for_scores(self, x: mindspore.Tensor) -> mindspore.Tensor:
        """
        Method transposes the input tensor for scores in a self-attention mechanism.

        Args:
            self (MSErnieMSelfAttention): An instance of the MSErnieMSelfAttention class.
            x (mindspore.Tensor): The input tensor to be transposed. It represents the scores to be processed.
                It is expected to have a shape compatible with the transposition operation.

        Returns:
            mindspore.Tensor: A new tensor obtained by transposing the input tensor for scores.
                The shape of the returned tensor is transformed based on the number of attention heads and head size.

        Raises:
            None
        """
        new_x_shape = x.shape[:-1] + (self.num_attention_heads, self.attention_head_size)
        x = x.view(new_x_shape)
        return x.permute(0, 2, 1, 3)

    def forward(
        self,
        hidden_states: mindspore.Tensor,
        attention_mask: Optional[mindspore.Tensor] = None,
        head_mask: Optional[mindspore.Tensor] = None,
        encoder_hidden_states: Optional[mindspore.Tensor] = None,
        encoder_attention_mask: Optional[mindspore.Tensor] = None,
        past_key_value: Optional[Tuple[Tuple[mindspore.Tensor]]] = None,
        output_attentions: Optional[bool] = False,
    ) -> Tuple[mindspore.Tensor]:
        """
        Method to forward self-attention mechanism in the MSErnieMSelfAttention class.

        Args:
            self: The instance of the class.
            hidden_states (mindspore.Tensor): The input hidden states to the self-attention mechanism.
            attention_mask (Optional[mindspore.Tensor], optional):
                Mask tensor indicating which positions should be attended to and which should not. Defaults to None.
            head_mask (Optional[mindspore.Tensor], optional):
                Mask tensor indicating which heads to mask. Defaults to None.
            encoder_hidden_states (Optional[mindspore.Tensor], optional):
                Hidden states from an encoder in case of cross-attention. Defaults to None.
            encoder_attention_mask (Optional[mindspore.Tensor], optional): Mask tensor for encoder_hidden_states.
                Defaults to None.
            past_key_value (Optional[Tuple[Tuple[mindspore.Tensor]]], optional):
                Tuple containing the past key and value tensors. Defaults to None.
            output_attentions (Optional[bool], optional): Flag to output attentions. Defaults to False.

        Returns:
            Tuple[mindspore.Tensor]:
                A tuple containing the context layer and attention probabilities if output_attentions is True,
                otherwise just the context layer.

        Raises:
            ValueError: If the position_embedding_type is not 'relative_key' or 'relative_key_query'.
            TypeError: If there are issues with the input types or dimensions during the computations.
            RuntimeError: If there are runtime issues during the self-attention mechanism.
        """
        mixed_query_layer = self.q_proj(hidden_states)

        # If this is instantiated as a cross-attention module, the keys
        # and values come from an encoder; the attention mask needs to be
        # such that the encoder's padding tokens are not attended to.
        is_cross_attention = encoder_hidden_states is not None

        if is_cross_attention and past_key_value is not None:
            # reuse k,v, cross_attentions
            key_layer = past_key_value[0]
            value_layer = past_key_value[1]
            attention_mask = encoder_attention_mask
        elif is_cross_attention:
            key_layer = self.transpose_for_scores(self.k_proj(encoder_hidden_states))
            value_layer = self.transpose_for_scores(self.v_proj(encoder_hidden_states))
            attention_mask = encoder_attention_mask
        elif past_key_value is not None:
            key_layer = self.transpose_for_scores(self.k_proj(hidden_states))
            value_layer = self.transpose_for_scores(self.v_proj(hidden_states))
            key_layer = ops.cat([past_key_value[0], key_layer], dim=2)
            value_layer = ops.cat([past_key_value[1], value_layer], dim=2)
        else:
            key_layer = self.transpose_for_scores(self.k_proj(hidden_states))
            value_layer = self.transpose_for_scores(self.v_proj(hidden_states))

        query_layer = self.transpose_for_scores(mixed_query_layer)

        use_cache = past_key_value is not None
        if self.is_decoder:
            # if cross_attention save Tuple(mindspore.Tensor, mindspore.Tensor) of all cross attention key/value_states.
            # Further calls to cross_attention layer can then reuse all cross-attention
            # key/value_states (first "if" case)
            # if uni-directional self-attention (decoder) save Tuple(mindspore.Tensor, mindspore.Tensor) of
            # all previous decoder key/value_states. Further calls to uni-directional self-attention
            # can concat previous decoder key/value_states to current projected key/value_states (third "elif" case)
            # if encoder bi-directional self-attention `past_key_value` is always `None`
            past_key_value = (key_layer, value_layer)

        # Take the dot product between "query" and "key" to get the raw attention scores.
        attention_scores = ops.matmul(query_layer, key_layer.swapaxes(-1, -2))

        if self.position_embedding_type in ('relative_key', 'relative_key_query'):
            query_length, key_length = query_layer.shape[2], key_layer.shape[2]
            if use_cache:
                position_ids_l = mindspore.tensor(key_length - 1, dtype=mindspore.int64).view(
                    -1, 1
                )
            else:
                position_ids_l = ops.arange(query_length, dtype=mindspore.int64).view(-1, 1)
            position_ids_r = ops.arange(key_length, dtype=mindspore.int64).view(1, -1)
            distance = position_ids_l - position_ids_r

            positional_embedding = self.distance_embedding(distance + self.max_position_embeddings - 1)
            positional_embedding = positional_embedding.to(dtype=query_layer.dtype)  # fp16 compatibility

            if self.position_embedding_type == "relative_key":
                relative_position_scores = ops.einsum("bhld,lrd->bhlr", query_layer, positional_embedding)
                attention_scores = attention_scores + relative_position_scores
            elif self.position_embedding_type == "relative_key_query":
                relative_position_scores_query = ops.einsum("bhld,lrd->bhlr", query_layer, positional_embedding)
                relative_position_scores_key = ops.einsum("bhrd,lrd->bhlr", key_layer, positional_embedding)
                attention_scores = attention_scores + relative_position_scores_query + relative_position_scores_key

        attention_scores = attention_scores / ops.sqrt(ops.scalar_to_tensor(self.attention_head_size, attention_scores.dtype))
        if attention_mask is not None:
            # Apply the attention mask (precomputed for all layers in the ErnieMModel forward() function)
            attention_scores = attention_scores + attention_mask

        # Normalize the attention scores to probabilities.
        attention_probs = ops.softmax(attention_scores, dim=-1)

        # This is actually dropping out entire tokens to attend to, which might
        # seem a bit unusual, but is taken from the original Transformer paper.
        attention_probs = self.dropout(attention_probs)

        # Mask heads if we want to
        if head_mask is not None:
            attention_probs = attention_probs * head_mask

        context_layer = ops.matmul(attention_probs, value_layer)

        context_layer = context_layer.permute(0, 2, 1, 3)
        new_context_layer_shape = context_layer.shape[:-2] + (self.all_head_size,)
        context_layer = context_layer.view(new_context_layer_shape)

        outputs = (context_layer, attention_probs) if output_attentions else (context_layer,)

        if self.is_decoder:
            outputs = outputs + (past_key_value,)
        return outputs
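
For the 'relative_key' and 'relative_key_query' modes, the listing looks up `distance_embedding` with `distance + max_position_embeddings - 1`. The following sketch shows how that offset maps signed distances onto valid row indices of a table with `2 * max_position_embeddings - 1` rows. `relative_position_indices` is a hypothetical helper covering only the non-cached branch (`use_cache` false).

```python
def relative_position_indices(query_length, key_length, max_position_embeddings):
    """Return a [query_length][key_length] table of embedding row indices."""
    return [
        [
            # Signed distance (query position minus key position), shifted so
            # that the most negative distance maps to row 0.
            (q_pos - k_pos) + max_position_embeddings - 1
            for k_pos in range(key_length)
        ]
        for q_pos in range(query_length)
    ]


indices = relative_position_indices(3, 3, max_position_embeddings=4)
for row in indices:
    print(row)
```

With `max_position_embeddings=4` the table has 7 rows (indices 0 to 6). A token attending to itself always maps to row 3, and each step toward earlier or later keys moves the index up or down by one.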

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMSelfAttention.__init__(config, position_embedding_type=None)` ¶

Initializes the MSErnieMSelfAttention instance.

PARAMETER	DESCRIPTION
`self`	The MSErnieMSelfAttention instance. TYPE: `MSErnieMSelfAttention`
`config`	An object containing configuration settings for the self-attention mechanism. TYPE: `object`
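`position_embedding_type`	The type of position embedding to use. Possible values are 'absolute', 'relative_key', or 'relative_key_query'. When None, the value of `config.position_embedding_type` is used, falling back to 'absolute' if the config does not define it. TYPE: `str` DEFAULT: `None`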

RETURNS	DESCRIPTION
	None.

RAISES	DESCRIPTION
`ValueError`	If the hidden size in the configuration is not a multiple of the number of attention heads and the configuration does not have an 'embedding_size' attribute.
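
The head-size arithmetic performed by the constructor can be sketched as below. `split_head_sizes` is a hypothetical helper; unlike the constructor, it does not skip the divisibility check when the config carries an `embedding_size` attribute.

```python
def split_head_sizes(hidden_size, num_attention_heads):
    """Return (attention_head_size, all_head_size) for a multi-head layer."""
    if hidden_size % num_attention_heads != 0:
        raise ValueError(
            f"The hidden size ({hidden_size}) is not a multiple of the number "
            f"of attention heads ({num_attention_heads})"
        )
    attention_head_size = hidden_size // num_attention_heads
    all_head_size = num_attention_heads * attention_head_size
    return attention_head_size, all_head_size


# Default ErnieMConfig values: hidden_size=768, num_attention_heads=12.
head_size, all_head_size = split_head_sizes(768, 12)
print(head_size, all_head_size)
```

With the default configuration each of the 12 heads works on a 64-dimensional slice, and the query, key and value projections keep the full width of 768.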

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

def __init__(self, config, position_embedding_type=None):
    """
    Initializes the MSErnieMSelfAttention instance.

    Args:
        self (MSErnieMSelfAttention): The MSErnieMSelfAttention instance.
        config (object): An object containing configuration settings for the self-attention mechanism.
        position_embedding_type (str, optional): The type of position embedding to be used, defaults to None.
            Possible values are 'absolute', 'relative_key', or 'relative_key_query'.

    Returns:
        None.

    Raises:
        ValueError: If the hidden size in the configuration is not a multiple of the number of attention heads
            and the configuration does not have an 'embedding_size' attribute.
    """
    super().__init__()
    if config.hidden_size % config.num_attention_heads != 0 and not hasattr(config, "embedding_size"):
        raise ValueError(
            f"The hidden size ({config.hidden_size}) is not a multiple of the number of attention "
            f"heads ({config.num_attention_heads})"
        )

    self.num_attention_heads = config.num_attention_heads
    self.attention_head_size = int(config.hidden_size / config.num_attention_heads)
    self.all_head_size = self.num_attention_heads * self.attention_head_size

    self.q_proj = nn.Linear(config.hidden_size, self.all_head_size)
    self.k_proj = nn.Linear(config.hidden_size, self.all_head_size)
    self.v_proj = nn.Linear(config.hidden_size, self.all_head_size)

    self.dropout = nn.Dropout(p=config.attention_probs_dropout_prob)
    self.position_embedding_type = position_embedding_type or getattr(
        config, "position_embedding_type", "absolute"
    )
    if self.position_embedding_type in ('relative_key', 'relative_key_query'):
        self.max_position_embeddings = config.max_position_embeddings
        self.distance_embedding = nn.Embedding(2 * config.max_position_embeddings - 1, self.attention_head_size)

    self.is_decoder = config.is_decoder

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMSelfAttention.forward(hidden_states, attention_mask=None, head_mask=None, encoder_hidden_states=None, encoder_attention_mask=None, past_key_value=None, output_attentions=False)` ¶

Method to forward self-attention mechanism in the MSErnieMSelfAttention class.

PARAMETER	DESCRIPTION
`self`	The instance of the class.
`hidden_states`	The input hidden states to the self-attention mechanism. TYPE: `Tensor`
`attention_mask`	Mask tensor indicating which positions should be attended to and which should not. Defaults to None. TYPE: `Optional[Tensor]` DEFAULT: `None`
`head_mask`	Mask tensor indicating which heads to mask. Defaults to None. TYPE: `Optional[Tensor]` DEFAULT: `None`
`encoder_hidden_states`	Hidden states from an encoder in case of cross-attention. Defaults to None. TYPE: `Optional[Tensor]` DEFAULT: `None`
`encoder_attention_mask`	Mask tensor for encoder_hidden_states. Defaults to None. TYPE: `Optional[Tensor]` DEFAULT: `None`
`past_key_value`	Tuple containing the past key and value tensors. Defaults to None. TYPE: `Optional[Tuple[Tuple[Tensor]]]` DEFAULT: `None`
`output_attentions`	Flag to output attentions. Defaults to False. TYPE: `Optional[bool]` DEFAULT: `False`

RETURNS	DESCRIPTION
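`Tuple[Tensor]`	Tuple[mindspore.Tensor]: A tuple containing the context layer, followed by the attention probabilities if output_attentions is True. When the layer is configured as a decoder (config.is_decoder), the (key, value) cache tuple is appended as the last element.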

RAISES	DESCRIPTION
`ValueError`	If the position_embedding_type is not 'relative_key' or 'relative_key_query'.
`TypeError`	If there are issues with the input types or dimensions during the computations.
`RuntimeError`	If there are runtime issues during the self-attention mechanism.

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

def forward(
    self,
    hidden_states: mindspore.Tensor,
    attention_mask: Optional[mindspore.Tensor] = None,
    head_mask: Optional[mindspore.Tensor] = None,
    encoder_hidden_states: Optional[mindspore.Tensor] = None,
    encoder_attention_mask: Optional[mindspore.Tensor] = None,
    past_key_value: Optional[Tuple[Tuple[mindspore.Tensor]]] = None,
    output_attentions: Optional[bool] = False,
) -> Tuple[mindspore.Tensor]:
    """
    Method to forward self-attention mechanism in the MSErnieMSelfAttention class.

    Args:
        self: The instance of the class.
        hidden_states (mindspore.Tensor): The input hidden states to the self-attention mechanism.
        attention_mask (Optional[mindspore.Tensor], optional):
            Mask tensor indicating which positions should be attended to and which should not. Defaults to None.
        head_mask (Optional[mindspore.Tensor], optional):
            Mask tensor indicating which heads to mask. Defaults to None.
        encoder_hidden_states (Optional[mindspore.Tensor], optional):
            Hidden states from an encoder in case of cross-attention. Defaults to None.
        encoder_attention_mask (Optional[mindspore.Tensor], optional): Mask tensor for encoder_hidden_states.
            Defaults to None.
        past_key_value (Optional[Tuple[Tuple[mindspore.Tensor]]], optional):
            Tuple containing the past key and value tensors. Defaults to None.
        output_attentions (Optional[bool], optional): Flag to output attentions. Defaults to False.

    Returns:
        Tuple[mindspore.Tensor]:
            A tuple containing the context layer and attention probabilities if output_attentions is True,
            otherwise just the context layer.

    Raises:
        ValueError: If the position_embedding_type is not 'relative_key' or 'relative_key_query'.
        TypeError: If there are issues with the input types or dimensions during the computations.
        RuntimeError: If there are runtime issues during the self-attention mechanism.
    """
    mixed_query_layer = self.q_proj(hidden_states)

    # If this is instantiated as a cross-attention module, the keys
    # and values come from an encoder; the attention mask needs to be
    # such that the encoder's padding tokens are not attended to.
    is_cross_attention = encoder_hidden_states is not None

    if is_cross_attention and past_key_value is not None:
        # reuse k,v, cross_attentions
        key_layer = past_key_value[0]
        value_layer = past_key_value[1]
        attention_mask = encoder_attention_mask
    elif is_cross_attention:
        key_layer = self.transpose_for_scores(self.k_proj(encoder_hidden_states))
        value_layer = self.transpose_for_scores(self.v_proj(encoder_hidden_states))
        attention_mask = encoder_attention_mask
    elif past_key_value is not None:
        key_layer = self.transpose_for_scores(self.k_proj(hidden_states))
        value_layer = self.transpose_for_scores(self.v_proj(hidden_states))
        key_layer = ops.cat([past_key_value[0], key_layer], dim=2)
        value_layer = ops.cat([past_key_value[1], value_layer], dim=2)
    else:
        key_layer = self.transpose_for_scores(self.k_proj(hidden_states))
        value_layer = self.transpose_for_scores(self.v_proj(hidden_states))

    query_layer = self.transpose_for_scores(mixed_query_layer)

    use_cache = past_key_value is not None
    if self.is_decoder:
        # if cross_attention save Tuple(mindspore.Tensor, mindspore.Tensor) of all cross attention key/value_states.
        # Further calls to cross_attention layer can then reuse all cross-attention
        # key/value_states (first "if" case)
        # if uni-directional self-attention (decoder) save Tuple(mindspore.Tensor, mindspore.Tensor) of
        # all previous decoder key/value_states. Further calls to uni-directional self-attention
        # can concat previous decoder key/value_states to current projected key/value_states (third "elif" case)
        # if encoder bi-directional self-attention `past_key_value` is always `None`
        past_key_value = (key_layer, value_layer)

    # Take the dot product between "query" and "key" to get the raw attention scores.
    attention_scores = ops.matmul(query_layer, key_layer.swapaxes(-1, -2))

    if self.position_embedding_type in ('relative_key', 'relative_key_query'):
        query_length, key_length = query_layer.shape[2], key_layer.shape[2]
        if use_cache:
            position_ids_l = mindspore.tensor(key_length - 1, dtype=mindspore.int64).view(
                -1, 1
            )
        else:
            position_ids_l = ops.arange(query_length, dtype=mindspore.int64).view(-1, 1)
        position_ids_r = ops.arange(key_length, dtype=mindspore.int64).view(1, -1)
        distance = position_ids_l - position_ids_r

        positional_embedding = self.distance_embedding(distance + self.max_position_embeddings - 1)
        positional_embedding = positional_embedding.to(dtype=query_layer.dtype)  # fp16 compatibility

        if self.position_embedding_type == "relative_key":
            relative_position_scores = ops.einsum("bhld,lrd->bhlr", query_layer, positional_embedding)
            attention_scores = attention_scores + relative_position_scores
        elif self.position_embedding_type == "relative_key_query":
            relative_position_scores_query = ops.einsum("bhld,lrd->bhlr", query_layer, positional_embedding)
            relative_position_scores_key = ops.einsum("bhrd,lrd->bhlr", key_layer, positional_embedding)
            attention_scores = attention_scores + relative_position_scores_query + relative_position_scores_key

    attention_scores = attention_scores / ops.sqrt(ops.scalar_to_tensor(self.attention_head_size, attention_scores.dtype))
    if attention_mask is not None:
        # Apply the attention mask (precomputed for all layers in the ErnieMModel forward() function)
        attention_scores = attention_scores + attention_mask

    # Normalize the attention scores to probabilities.
    attention_probs = ops.softmax(attention_scores, dim=-1)

    # This is actually dropping out entire tokens to attend to, which might
    # seem a bit unusual, but is taken from the original Transformer paper.
    attention_probs = self.dropout(attention_probs)

    # Mask heads if we want to
    if head_mask is not None:
        attention_probs = attention_probs * head_mask

    context_layer = ops.matmul(attention_probs, value_layer)

    context_layer = context_layer.permute(0, 2, 1, 3)
    new_context_layer_shape = context_layer.shape[:-2] + (self.all_head_size,)
    context_layer = context_layer.view(new_context_layer_shape)

    outputs = (context_layer, attention_probs) if output_attentions else (context_layer,)

    if self.is_decoder:
        outputs = outputs + (past_key_value,)
    return outputs
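
The core of `forward` (scores, scaling, additive mask, softmax, weighted sum) can be sketched for a single head in plain Python. This is an illustration under simplifying assumptions, not the library code: `attend` and `softmax` are hypothetical helpers, and dropout, head masking, relative position terms and the key/value cache are left out.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, key, value, mask=None):
    """Single-head scaled dot-product attention.

    query: [q_len][d], key: [k_len][d], value: [k_len][d_v]
    mask:  optional additive mask [q_len][k_len] (large negative = blocked)
    """
    scale = math.sqrt(len(query[0]))
    context, all_probs = [], []
    for i, q in enumerate(query):
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in key]
        if mask is not None:
            scores = [s + m for s, m in zip(scores, mask[i])]
        probs = softmax(scores)
        all_probs.append(probs)
        context.append(
            [sum(p * v[j] for p, v in zip(probs, value)) for j in range(len(value[0]))]
        )
    return context, all_probs


query = [[1.0, 0.0]]
key = [[1.0, 0.0], [0.0, 1.0]]
value = [[10.0, 0.0], [0.0, 10.0]]

context, probs = attend(query, key, value)
masked_context, masked_probs = attend(query, key, value, mask=[[0.0, -1e9]])
print(probs[0], masked_probs[0])
```

Adding a large negative number to a score before the softmax drives that position's probability to zero, which is why the mask is added to the scores rather than multiplied into the probabilities.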

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMSelfAttention.transpose_for_scores(x)` ¶

Method transposes the input tensor for scores in a self-attention mechanism.

PARAMETER	DESCRIPTION
`self`	An instance of the MSErnieMSelfAttention class. TYPE: `MSErnieMSelfAttention`
`x`	The projected query, key, or value tensor to be split into heads, typically of shape (batch_size, sequence_length, all_head_size). TYPE: `Tensor`

RETURNS	DESCRIPTION
`Tensor`	mindspore.Tensor: A tensor of shape (batch_size, num_attention_heads, sequence_length, attention_head_size), obtained by splitting the last dimension into heads and moving the head axis ahead of the sequence axis.

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

def transpose_for_scores(self, x: mindspore.Tensor) -> mindspore.Tensor:
    """
    Method transposes the input tensor for scores in a self-attention mechanism.

    Args:
        self (MSErnieMSelfAttention): An instance of the MSErnieMSelfAttention class.
        x (mindspore.Tensor): The input tensor to be transposed. It represents the scores to be processed.
            It is expected to have a shape compatible with the transposition operation.

    Returns:
        mindspore.Tensor: A new tensor obtained by transposing the input tensor for scores.
            The shape of the returned tensor is transformed based on the number of attention heads and head size.

    Raises:
        None
    """
    new_x_shape = x.shape[:-1] + (self.num_attention_heads, self.attention_head_size)
    x = x.view(new_x_shape)
    return x.permute(0, 2, 1, 3)
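
The reshape and permute in `transpose_for_scores` can be sketched on nested lists to make the resulting layout concrete. `split_heads` is a hypothetical helper that assumes a 3-D input of shape (batch, seq_len, hidden).

```python
def split_heads(x, num_heads):
    """[batch][seq_len][hidden] -> [batch][num_heads][seq_len][head_size]."""
    head_size = len(x[0][0]) // num_heads
    return [
        [
            # Each head takes a contiguous slice of every token's features.
            [token[h * head_size:(h + 1) * head_size] for token in sequence]
            for h in range(num_heads)
        ]
        for sequence in x
    ]


x = [[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]]  # shape (1, 3, 4)
heads = split_heads(x, num_heads=2)
print(heads[0][0])
print(heads[0][1])
```

After the split, head 0 sees the first two features of every token and head 1 sees the last two, so each head can compute its own attention scores over the full sequence.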

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSUIEM` ¶

Bases: MSErnieMForInformationExtraction

UIEM model

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

class MSUIEM(MSErnieMForInformationExtraction):
    """UIEM model"""
    def forward(
        self,
        input_ids: Optional[mindspore.Tensor] = None,
        attention_mask: Optional[mindspore.Tensor] = None,
        position_ids: Optional[mindspore.Tensor] = None,
        head_mask: Optional[mindspore.Tensor] = None,
        inputs_embeds: Optional[mindspore.Tensor] = None,
        start_positions: Optional[mindspore.Tensor] = None,
        end_positions: Optional[mindspore.Tensor] = None,
        output_attentions: Optional[bool] = None,
        output_hidden_states: Optional[bool] = None,
    ) -> Tuple[mindspore.Tensor]:
        r"""
        Args:
            start_positions (`mindspore.Tensor` of shape `(batch_size, sequence_length)`, *optional*):
                Labels for position (index) for computing the start_positions loss. Positions outside of the sequence
                are not taken into account for computing the loss.
            end_positions (`mindspore.Tensor` of shape `(batch_size, sequence_length)`, *optional*):
                Labels for position (index) for computing the end_positions loss. Positions outside of the sequence
                are not taken into account for computing the loss.
        """
        result = self.ernie_m(
            input_ids,
            attention_mask=attention_mask,
            position_ids=position_ids,
            head_mask=head_mask,
            inputs_embeds=inputs_embeds,
            output_attentions=output_attentions,
            output_hidden_states=output_hidden_states,
        )
        sequence_output = result[0]

        start_logits = self.linear_start(sequence_output)
        start_logits = start_logits.squeeze(-1)
        start_prob = self.sigmoid(start_logits)
        end_logits = self.linear_end(sequence_output)
        end_logits = end_logits.squeeze(-1)
        end_prob = self.sigmoid(end_logits)

        total_loss = None
        if start_positions is not None and end_positions is not None:
            # If we are on multi-GPU, splitting adds a dimension
            if len(start_positions.shape) > 1 and start_positions.shape[-1] == 1:
                start_positions = start_positions.squeeze(-1)
            if len(end_positions.shape) > 1 and end_positions.shape[-1] == 1:
                end_positions = end_positions.squeeze(-1)
            # sometimes the start/end positions are outside our model inputs, we ignore these terms
            ignored_index = start_logits.shape[1]
            start_positions = start_positions.clamp(0, ignored_index)
            end_positions = end_positions.clamp(0, ignored_index)

            start_loss = ops.binary_cross_entropy(start_prob, start_positions)
            end_loss = ops.binary_cross_entropy(end_prob, end_positions)
            total_loss = (start_loss + end_loss) / 2

        output = (start_prob, end_prob) + result[1:]
        return ((total_loss,) + output) if total_loss is not None else output

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSUIEM.forward(input_ids=None, attention_mask=None, position_ids=None, head_mask=None, inputs_embeds=None, start_positions=None, end_positions=None, output_attentions=None, output_hidden_states=None)` ¶

PARAMETER	DESCRIPTION
`start_positions`	Labels for position (index) for computing the start_positions loss. Positions outside of the sequence are not taken into account for computing the loss. TYPE: `mindspore.Tensor` of shape `(batch_size, sequence_length)`, optional DEFAULT: `None`
`end_positions`	Labels for position (index) for computing the end_positions loss. Positions outside of the sequence are not taken into account for computing the loss. TYPE: `mindspore.Tensor` of shape `(batch_size, sequence_length)`, optional DEFAULT: `None`

Source code in mindnlp\transformers\models\ernie_m\modeling_graph_ernie_m.py

def forward(
    self,
    input_ids: Optional[mindspore.Tensor] = None,
    attention_mask: Optional[mindspore.Tensor] = None,
    position_ids: Optional[mindspore.Tensor] = None,
    head_mask: Optional[mindspore.Tensor] = None,
    inputs_embeds: Optional[mindspore.Tensor] = None,
    start_positions: Optional[mindspore.Tensor] = None,
    end_positions: Optional[mindspore.Tensor] = None,
    output_attentions: Optional[bool] = None,
    output_hidden_states: Optional[bool] = None,
) -> Tuple[mindspore.Tensor]:
    r"""
    Args:
        start_positions (`mindspore.Tensor` of shape `(batch_size, sequence_length)`, *optional*):
            Labels for position (index) for computing the start_positions loss. Positions outside of the sequence
            are not taken into account for computing the loss.
        end_positions (`mindspore.Tensor` of shape `(batch_size, sequence_length)`, *optional*):
            Labels for position (index) for computing the end_positions loss. Positions outside of the sequence
            are not taken into account for computing the loss.
    """
    result = self.ernie_m(
        input_ids,
        attention_mask=attention_mask,
        position_ids=position_ids,
        head_mask=head_mask,
        inputs_embeds=inputs_embeds,
        output_attentions=output_attentions,
        output_hidden_states=output_hidden_states,
    )
    sequence_output = result[0]

    start_logits = self.linear_start(sequence_output)
    start_logits = start_logits.squeeze(-1)
    start_prob = self.sigmoid(start_logits)
    end_logits = self.linear_end(sequence_output)
    end_logits = end_logits.squeeze(-1)
    end_prob = self.sigmoid(end_logits)

    total_loss = None
    if start_positions is not None and end_positions is not None:
        # If we are on multi-GPU, split add a dimension
        if len(start_positions.shape) > 1 and start_positions.shape[-1] == 1:
            start_positions = start_positions.squeeze(-1)
        if len(end_positions.shape) > 1 and end_positions.shape[-1] == 1:
            end_positions = end_positions.squeeze(-1)
        # sometimes the start/end positions are outside our model inputs, we ignore these terms
        ignored_index = start_logits.shape[1]
        start_positions = start_positions.clamp(0, ignored_index)
        end_positions = end_positions.clamp(0, ignored_index)

        start_loss = ops.binary_cross_entropy(start_prob, start_positions)
        end_loss = ops.binary_cross_entropy(end_prob, end_positions)
        total_loss = (start_loss + end_loss) / 2

    output = (start_prob, end_prob) + result[1:]
    return ((total_loss,) + output) if total_loss is not None else output
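
The loss in `forward` applies a sigmoid to each token's start and end logit and averages two binary cross-entropy terms. The following is a minimal pure-Python sketch of that arithmetic, not the MindSpore implementation. It assumes mean reduction and assumes the labels are 0/1 vectors with one entry per token; the logits and labels are made-up values for illustration.

```python
import math


def sigmoid(x):
    """Logistic function, mapping a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def binary_cross_entropy(probs, labels):
    """Mean binary cross-entropy over all positions (assumed reduction)."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for p, y in zip(probs, labels):
        total += -(y * math.log(p + eps) + (1.0 - y) * math.log(1.0 - p + eps))
    return total / len(probs)


# Hypothetical per-token logits for one 4-token sequence.
start_logits = [2.0, -1.0, 0.5, -3.0]
end_logits = [-2.0, 0.0, 3.0, -1.0]

start_prob = [sigmoid(v) for v in start_logits]
end_prob = [sigmoid(v) for v in end_logits]

# Assumed label layout: 1.0 marks the start (token 0) and the end (token 2).
start_labels = [1.0, 0.0, 0.0, 0.0]
end_labels = [0.0, 0.0, 1.0, 0.0]

start_loss = binary_cross_entropy(start_prob, start_labels)
end_loss = binary_cross_entropy(end_prob, end_labels)
total_loss = (start_loss + end_loss) / 2
print(round(total_loss, 4))
```

Because each token gets an independent probability, more than one position can be marked as a start or an end, which is what distinguishes this head from a softmax over positions.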

`mindnlp.transformers.models.ernie_m.tokenization_ernie_m` ¶

Tokenization classes for Ernie-M.

`mindnlp.transformers.models.ernie_m.tokenization_ernie_m.ErnieMTokenizer` ¶

Bases: PreTrainedTokenizer

Constructs an Ernie-M tokenizer. It uses the sentencepiece library to split words into sub-words.

PARAMETER	DESCRIPTION
`sentencepiece_model_ckpt`	The file path of the sentencepiece model. TYPE: `str`
`vocab_file`	The file path of the vocabulary. TYPE: `str`, optional DEFAULT: `None`
`do_lower_case`	Whether or not to lowercase the input when tokenizing. TYPE: `bool`, optional, defaults to `False` DEFAULT: `False`
`unk_token`	A special token representing the `unknown (out-of-vocabulary)` token. An unknown token is set to be `unk_token` in order to be converted to an ID. TYPE: `str`, optional, defaults to `"[UNK]"` DEFAULT: `'[UNK]'`
`sep_token`	A special token separating two different sentences in the same input. TYPE: `str`, optional, defaults to `"[SEP]"` DEFAULT: `'[SEP]'`
`pad_token`	A special token used to make arrays of tokens the same size for batching purposes. TYPE: `str`, optional, defaults to `"[PAD]"` DEFAULT: `'[PAD]'`
`cls_token`	A special token used for sequence classification. It is the first token of the sequence when built with special tokens. TYPE: `str`, optional, defaults to `"[CLS]"` DEFAULT: `'[CLS]'`
`mask_token`	A special token representing a masked token. This is the token used in the masked language modeling task, where the model tries to predict the original unmasked token. TYPE: `str`, optional, defaults to `"[MASK]"` DEFAULT: `'[MASK]'`

Source code in mindnlp\transformers\models\ernie_m\tokenization_ernie_m.py

class ErnieMTokenizer(PreTrainedTokenizer):
    r"""
    Constructs an Ernie-M tokenizer. It uses the `sentencepiece` library to split words into sub-words.

    Args:
        sentencepiece_model_ckpt (`str`):
            The file path of the sentencepiece model.
        vocab_file (`str`, *optional*):
            The file path of the vocabulary.
        do_lower_case (`bool`, *optional*, defaults to `False`):
            Whether or not to lowercase the input when tokenizing.
        unk_token (`str`, *optional*, defaults to `"[UNK]"`):
            A special token representing the `unknown (out-of-vocabulary)` token. An unknown token is set to be
            `unk_token` in order to be converted to an ID.
        sep_token (`str`, *optional*, defaults to `"[SEP]"`):
            A special token separating two different sentences in the same input.
        pad_token (`str`, *optional*, defaults to `"[PAD]"`):
            A special token used to make arrays of tokens the same size for batching purposes.
        cls_token (`str`, *optional*, defaults to `"[CLS]"`):
            A special token used for sequence classification. It is the first token of the sequence when built with
            special tokens.
        mask_token (`str`, *optional*, defaults to `"[MASK]"`):
            A special token representing a masked token. This is the token used in the masked language modeling task
            which the model tries to predict the original unmasked ones.
    """
    # Ernie-M model doesn't have token_type embedding.
    model_input_names: List[str] = ["input_ids"]

    vocab_files_names = VOCAB_FILES_NAMES
    pretrained_init_configuration = PRETRAINED_INIT_CONFIGURATION
    max_model_input_sizes = PRETRAINED_POSITIONAL_EMBEDDINGS_SIZES
    pretrained_vocab_files_map = PRETRAINED_VOCAB_FILES_MAP
    resource_files_names = RESOURCE_FILES_NAMES

    def __init__(
        self,
        sentencepiece_model_ckpt,
        vocab_file=None,
        do_lower_case=False,
        encoding="utf8",
        unk_token="[UNK]",
        sep_token="[SEP]",
        pad_token="[PAD]",
        cls_token="[CLS]",
        mask_token="[MASK]",
        sp_model_kwargs: Optional[Dict[str, Any]] = None,
        **kwargs,
    ) -> None:
        """
        Initialize the ErnieMTokenizer class.

        Args:
            sentencepiece_model_ckpt (str): The path to the sentencepiece model checkpoint file.
            vocab_file (str): The path to the vocabulary file. Defaults to None.
            do_lower_case (bool): A flag indicating whether to convert tokens to lowercase. Defaults to False.
            encoding (str): The character encoding to be used. Defaults to 'utf8'.
            unk_token (str): The token representing unknown words. Defaults to '[UNK]'.
            sep_token (str): The token representing sentence separation. Defaults to '[SEP]'.
            pad_token (str): The token representing padding. Defaults to '[PAD]'.
            cls_token (str): The token representing classification. Defaults to '[CLS]'.
            mask_token (str): The token representing masking. Defaults to '[MASK]'.
            sp_model_kwargs (Optional[Dict[str, Any]]): Additional keyword arguments for the SentencePiece model. Defaults to None.

        Returns:
            None.

        Raises:
            None.
        """
        # Mask token behave like a normal word, i.e. include the space before it and
        # is included in the raw text, there should be a match in a non-normalized sentence.

        self.sp_model_kwargs = {} if sp_model_kwargs is None else sp_model_kwargs

        self.do_lower_case = do_lower_case
        self.sentencepiece_model_ckpt = sentencepiece_model_ckpt
        self.sp_model = spm.SentencePieceProcessor(**self.sp_model_kwargs)
        self.sp_model.Load(sentencepiece_model_ckpt)

        # to mimic paddlenlp.transformers.ernie_m.tokenizer.ErnieMTokenizer functioning
        if vocab_file is not None:
            self.vocab = self.load_vocab(filepath=vocab_file)
        else:
            self.vocab = {self.sp_model.id_to_piece(id): id for id in range(self.sp_model.get_piece_size())}
        self.reverse_vocab = {v: k for k, v in self.vocab.items()}

        super().__init__(
            do_lower_case=do_lower_case,
            unk_token=unk_token,
            sep_token=sep_token,
            pad_token=pad_token,
            cls_token=cls_token,
            mask_token=mask_token,
            vocab_file=vocab_file,
            encoding=encoding,
            sp_model_kwargs=self.sp_model_kwargs,
            **kwargs,
        )

        self.SP_CHAR_MAPPING = {}

        for ch in range(65281, 65375):
            if ch in [ord('～')]:
                self.SP_CHAR_MAPPING[chr(ch)] = chr(ch)
                continue
            self.SP_CHAR_MAPPING[chr(ch)] = chr(ch - 65248)

    def get_offset_mapping(self, text):
        """
        This method is part of the ErnieMTokenizer class and is used to obtain the offset mapping for the given text.

        Args:
            self: The instance of the ErnieMTokenizer class.
            text (str): The input text for which the offset mapping is to be generated.

        Returns:
            list of tuple: One `(start, end)` character span per token, or None if `text` is None.

        Raises:
            None
        """
        if text is None:
            return None

        split_tokens = self.tokenize(text)
        normalized_text, char_mapping = "", []

        for i, ch in enumerate(text):
            if ch in self.SP_CHAR_MAPPING:
                ch = self.SP_CHAR_MAPPING.get(ch)
            else:
                ch = unicodedata.normalize("NFKC", ch)
            if self.is_whitespace(ch):
                continue
            normalized_text += ch
            char_mapping.extend([i] * len(ch))

        text, token_mapping, offset = normalized_text, [], 0

        if self.do_lower_case:
            text = text.lower()

        for token in split_tokens:
            if token[:1] == "▁":
                token = token[1:]
            start = text[offset:].index(token) + offset
            end = start + len(token)

            token_mapping.append((char_mapping[start], char_mapping[end - 1] + 1))
            offset = end
        return token_mapping

    @property
    def vocab_size(self):
        """
        Method to retrieve the size of the vocabulary stored in the ErnieMTokenizer instance.

        Args:
            self (ErnieMTokenizer): The instance of the ErnieMTokenizer class.
                It represents the tokenizer object containing the vocabulary.

        Returns:
            int: The number of unique tokens in the vocabulary.
                Returns the length of the vocabulary stored in the tokenizer.

        Raises:
            None.
        """
        return len(self.vocab)

    def get_vocab(self):
        """
        Get the vocabulary of the tokenizer.

        Args:
            self: The instance of the ErnieMTokenizer class.

        Returns:
            dict: A dictionary representing the vocabulary of the tokenizer. It contains the original vocabulary 
                along with any added tokens.

        Raises:
            None.
        """
        return dict(self.vocab, **self.added_tokens_encoder)

    def __getstate__(self):
        """
        Method: __getstate__

        Description:
            This method is used to retrieve the state of an instance of the ErnieMTokenizer class.
            It returns a dictionary representing the current state of the instance, with the 'sp_model' attribute set
            to None.

        Args:
            self: An instance of the ErnieMTokenizer class.

        Returns:
            dict: A copy of the instance's '__dict__' with the 'sp_model' attribute set to None.

        Raises:
            None.

        """
        state = self.__dict__.copy()
        state["sp_model"] = None
        return state

    def __setstate__(self, d):
        """
        Sets the state of the ErnieMTokenizer object from a serialized state dictionary.

        Args:
            self (ErnieMTokenizer): The instance of the ErnieMTokenizer class.
            d (dict): The serialized state dictionary containing the attributes to be set.

        Returns:
            None.

        Raises:
            None.

        Note:
            This method is automatically called when an ErnieMTokenizer object is loaded from a serialized state.
            It sets the attributes of the object using the values from the serialized state dictionary.

            The 'self.__dict__' attribute is updated with the values from the 'd' dictionary.

            If the 'sp_model_kwargs' attribute is not present in the serialized state, it is initialized as an empty dictionary.

            The SentencePieceProcessor object 'self.sp_model' is initialized using the 'spm.SentencePieceProcessor' class.
            The 'self.sp_model_kwargs' dictionary is passed as keyword arguments to the SentencePieceProcessor constructor.

            Finally, the sentencepiece model is loaded into the SentencePieceProcessor object using 'self.sentencepiece_model_ckpt'.

            Note that this method assumes the 'spm' module has been imported and is available in the current namespace.
        """
        self.__dict__ = d

        # for backward compatibility
        if not hasattr(self, "sp_model_kwargs"):
            self.sp_model_kwargs = {}

        self.sp_model = spm.SentencePieceProcessor(**self.sp_model_kwargs)
        self.sp_model.Load(self.sentencepiece_model_ckpt)

    def clean_text(self, text):
        """Replaces full-width characters with their half-width equivalents using `SP_CHAR_MAPPING`."""
        return "".join((self.SP_CHAR_MAPPING.get(c, c) for c in text))

    def _tokenize(self, text, enable_sampling=False, nbest_size=64, alpha=0.1):
        """Tokenize a string."""
        if self.sp_model_kwargs.get("enable_sampling") is True:
            enable_sampling = True
        if self.sp_model_kwargs.get("alpha") is not None:
            alpha = self.sp_model_kwargs.get("alpha")
        if self.sp_model_kwargs.get("nbest_size") is not None:
            nbest_size = self.sp_model_kwargs.get("nbest_size")

        if not enable_sampling:
            pieces = self.sp_model.EncodeAsPieces(text)
        else:
            pieces = self.sp_model.SampleEncodeAsPieces(text, nbest_size, alpha)
        new_pieces = []
        for pi, piece in enumerate(pieces):
            if piece == SPIECE_UNDERLINE:
                if not pieces[pi + 1].startswith(SPIECE_UNDERLINE) and pi != 0:
                    new_pieces.append(SPIECE_UNDERLINE)
                    continue
                continue
            lst_i = 0
            for i, chunk in enumerate(piece):
                if chunk == SPIECE_UNDERLINE:
                    continue
                if self.is_ch_char(chunk) or self.is_punct(chunk):
                    if i > lst_i and piece[lst_i:i] != SPIECE_UNDERLINE:
                        new_pieces.append(piece[lst_i:i])
                    new_pieces.append(chunk)
                    lst_i = i + 1
                elif chunk.isdigit() and i > 0 and not piece[i - 1].isdigit():
                    if i > lst_i and piece[lst_i:i] != SPIECE_UNDERLINE:
                        new_pieces.append(piece[lst_i:i])
                    lst_i = i
                elif not chunk.isdigit() and i > 0 and piece[i - 1].isdigit():
                    if i > lst_i and piece[lst_i:i] != SPIECE_UNDERLINE:
                        new_pieces.append(piece[lst_i:i])
                    lst_i = i
            if len(piece) > lst_i:
                new_pieces.append(piece[lst_i:])
        return new_pieces

    def convert_tokens_to_string(self, tokens):
        """Converts a sequence of tokens (strings for sub-words) in a single string."""
        out_string = "".join(tokens).replace(SPIECE_UNDERLINE, " ").strip()
        return out_string

    def convert_ids_to_string(self, ids):
        """
        Converts a sequence of ids into a single string.
        """
        tokens = self.convert_ids_to_tokens(ids)
        out_string = "".join(tokens).replace(SPIECE_UNDERLINE, " ").strip()
        return out_string

    # to mimic paddlenlp.transformers.ernie_m.tokenizer.ErnieMTokenizer functioning
    def _convert_token_to_id(self, token):
        """
        Converts a token to its corresponding ID using the provided vocabulary in the ErnieMTokenizer class.

        Args:
            self (ErnieMTokenizer): The instance of the ErnieMTokenizer class.
            token (str): The token to be converted to an ID.

        Returns:
            int: The ID of the token, or the ID of the unknown token (self.unk_token) if the token is not in the
                vocabulary. None if the unknown token is also missing from the vocabulary.

        Raises:
            None.
        """
        return self.vocab.get(token, self.vocab.get(self.unk_token))

    # to mimic paddlenlp.transformers.ernie_m.tokenizer.ErnieMTokenizer functioning
    def _convert_id_to_token(self, index):
        """Converts an index (integer) in a token (str) using the vocab."""
        return self.reverse_vocab.get(index, self.unk_token)

    def build_inputs_with_special_tokens(self, token_ids_0, token_ids_1=None):
        r"""
        Build model inputs from a sequence or a pair of sequences for sequence classification tasks by concatenating and
        adding special tokens. An ErnieM sequence has the following format:

        - single sequence: `[CLS] X [SEP]`
        - pair of sequences: `[CLS] A [SEP] [SEP] B [SEP]`

        Args:
            token_ids_0 (`List[int]`):
                List of IDs to which the special tokens will be added.
            token_ids_1 (`List[int]`, *optional*):
                Optional second list of IDs for sequence pairs.

        Returns:
            `List[int]`: List of input_id with the appropriate special tokens.
        """
        if token_ids_1 is None:
            return [self.cls_token_id] + token_ids_0 + [self.sep_token_id]
        _cls = [self.cls_token_id]
        _sep = [self.sep_token_id]
        return _cls + token_ids_0 + _sep + _sep + token_ids_1 + _sep

    def build_offset_mapping_with_special_tokens(self, offset_mapping_0, offset_mapping_1=None):
        r"""
        Build offset map from a pair of offset map by concatenating and adding offsets of special tokens. An Ernie-M
        offset_mapping has the following format:

        - single sequence: `(0,0) X (0,0)`
        - pair of sequences: `(0,0) A (0,0) (0,0) B (0,0)`

        Args:
            offset_mapping_0 (`List[tuple]`):
                List of char offsets to which the special tokens will be added.
            offset_mapping_1 (`List[tuple]`, *optional*):
                Optional second list of wordpiece offsets for offset mapping pairs.

        Returns:
            `List[tuple]`: List of wordpiece offsets with the appropriate offsets of special tokens.
        """
        if offset_mapping_1 is None:
            return [(0, 0)] + offset_mapping_0 + [(0, 0)]

        return [(0, 0)] + offset_mapping_0 + [(0, 0), (0, 0)] + offset_mapping_1 + [(0, 0)]

    def get_special_tokens_mask(self, token_ids_0, token_ids_1=None, already_has_special_tokens=False):
        r"""
        Retrieves sequence ids from a token list that has no special tokens added. This method is called when adding
        special tokens using the tokenizer `encode` method.

        Args:
            token_ids_0 (`List[int]`):
                List of ids of the first sequence.
            token_ids_1 (`List[int]`, *optional*):
                Optional second list of IDs for sequence pairs.
            already_has_special_tokens (`bool`, *optional*, defaults to `False`):
                Whether or not the token list is already formatted with special tokens for the model.

        Returns:
            `List[int]`:
                The list of integers in the range [0, 1]: 1 for a special token, 0 for a sequence token.
        """
        if already_has_special_tokens:
            if token_ids_1 is not None:
                raise ValueError(
                    "You should not supply a second sequence if the provided sequence of "
                    "ids is already formatted with special tokens for the model."
                )
            return [1 if x in [self.sep_token_id, self.cls_token_id] else 0 for x in token_ids_0]

        if token_ids_1 is not None:
            return [1] + ([0] * len(token_ids_0)) + [1, 1] + ([0] * len(token_ids_1)) + [1]
        return [1] + ([0] * len(token_ids_0)) + [1]

    def create_token_type_ids_from_sequences(
        self, token_ids_0: List[int], token_ids_1: Optional[List[int]] = None
    ) -> List[int]:
        """
        Create the token type IDs corresponding to the sequences passed. [What are token type
        IDs?](../glossary#token-type-ids) Should be overridden in a subclass if the model has a special way of
        building those.

        Args:
            token_ids_0 (`List[int]`):
                The first tokenized sequence.
            token_ids_1 (`List[int]`, *optional*):
                The second tokenized sequence.

        Returns:
            `List[int]`: The token type ids.
        """
        # called when `add_special_tokens` is True, so align with `build_inputs_with_special_tokens` method
        if token_ids_1 is None:
            # [CLS] X [SEP]
            return (len(token_ids_0) + 2) * [0]

        # [CLS] A [SEP] [SEP] B [SEP]
        return [0] * (len(token_ids_0) + 1) + [1] * (len(token_ids_1) + 3)

    def is_ch_char(self, char):
        """
        is_ch_char
        """
        if "\u4e00" <= char <= "\u9fff":
            return True
        return False

    def is_alpha(self, char):
        """
        is_alpha
        """
        if ("a" <= char <= "z") or ("A" <= char <= "Z"):
            return True
        return False

    def is_punct(self, char):
        """
        is_punct
        """
        if char in ",;:.?!~，；：。？！《》【】":
            return True
        return False

    def is_whitespace(self, char):
        """
        is whitespace
        """
        if char in (' ', '\t', '\n', '\r'):
            return True
        if len(char) == 1:
            cat = unicodedata.category(char)
            if cat == "Zs":
                return True
        return False

    def load_vocab(self, filepath):
        """
        This method loads a vocabulary from a specified file path into a token-to-index mapping within the ErnieMTokenizer class.

        Args:
            self (ErnieMTokenizer): The instance of the ErnieMTokenizer class.
            filepath (str): The path to the file containing the vocabulary. The file should be encoded in UTF-8 format.

        Returns:
            dict: A dictionary mapping tokens to their corresponding indices in the loaded vocabulary.

        Raises:
            IOError: If the specified file path is invalid or inaccessible.
            ValueError: If the index conversion to integer fails during token-to-index mapping.
        """
        token_to_idx = {}
        with io.open(filepath, "r", encoding="utf-8") as f:
            for index, line in enumerate(f):
                token = line.rstrip("\n")
                token_to_idx[token] = int(index)

        return token_to_idx

    def save_vocabulary(self, save_directory: str, filename_prefix: Optional[str] = None) -> Tuple[str]:
        """
        Save the vocabulary and tokenizer model.

        Args:
            self: The instance of the ErnieMTokenizer class.
            save_directory (str): The directory where the vocabulary and tokenizer model will be saved.
            filename_prefix (Optional[str]): The prefix to be added to the filename. Defaults to None.

        Returns:
            Tuple[str]: A tuple containing the file path of the saved vocabulary.

        Raises:
            OSError: If the save_directory does not exist or is not a valid directory.
            IOError: If there is an issue with writing the vocabulary or tokenizer model files.
            Warning: If the vocabulary indices are not consecutive, indicating a potential corruption in the vocabulary.
        """
        index = 0
        if os.path.isdir(save_directory):
            vocab_file = os.path.join(
                save_directory, (filename_prefix + "-" if filename_prefix else "") + VOCAB_FILES_NAMES["vocab_file"]
            )
        else:
            vocab_file = (filename_prefix + "-" if filename_prefix else "") + save_directory
        with open(vocab_file, "w", encoding="utf-8") as writer:
            for token, token_index in sorted(self.vocab.items(), key=lambda kv: kv[1]):
                if index != token_index:
                    logger.warning(
                        f"Saving vocabulary to {vocab_file}: vocabulary indices are not consecutive."
                        " Please check that the vocabulary is not corrupted!"
                    )
                    index = token_index
                writer.write(token + "\n")
                index += 1

        tokenizer_model_file = os.path.join(save_directory, "sentencepiece.bpe.model")
        with open(tokenizer_model_file, "wb") as fi:
            content_spiece_model = self.sp_model.serialized_model_proto()
            fi.write(content_spiece_model)

        return (vocab_file,)
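
The `get_offset_mapping` method in the listing above drops whitespace, remembers which original index each kept character came from, and then locates each token in the stripped text. The sketch below reproduces only that core idea under simplifying assumptions: it skips the NFKC normalization, the full-width mapping and the lowercasing step, and the token list is a hypothetical SentencePiece output rather than the result of a real model.

```python
def offset_mapping(text, tokens):
    """Map each token to a (start, end) character span in `text`."""
    normalized, char_mapping = "", []
    for i, ch in enumerate(text):
        if ch.isspace():
            continue  # whitespace is removed before matching
        normalized += ch
        char_mapping.append(i)  # index of this character in the original text

    spans, offset = [], 0
    for token in tokens:
        if token[:1] == "\u2581":  # strip the SentencePiece word-boundary marker
            token = token[1:]
        start = normalized.index(token, offset)
        end = start + len(token)
        spans.append((char_mapping[start], char_mapping[end - 1] + 1))
        offset = end
    return spans


text = "Hello world"
tokens = ["\u2581Hello", "\u2581wor", "ld"]  # hypothetical tokenizer output
spans = offset_mapping(text, tokens)
print(spans)  # [(0, 5), (6, 9), (9, 11)]
```

Slicing the original text with each span (`text[start:end]`) recovers the surface form of the token, which is how such offsets are typically used for span extraction.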

`mindnlp.transformers.models.ernie_m.tokenization_ernie_m.ErnieMTokenizer.vocab_size` `property` ¶

Method to retrieve the size of the vocabulary stored in the ErnieMTokenizer instance.

PARAMETER	DESCRIPTION
`self`	The instance of the ErnieMTokenizer class. It represents the tokenizer object containing the vocabulary. TYPE: `ErnieMTokenizer`

RETURNS	DESCRIPTION
`int`	The number of unique tokens in the vocabulary. Returns the length of the vocabulary stored in the tokenizer.

`mindnlp.transformers.models.ernie_m.tokenization_ernie_m.ErnieMTokenizer.__getstate__()` ¶

Description

This method is used to retrieve the state of an instance of the ErnieMTokenizer class. It returns a dictionary representing the current state of the instance, with the 'sp_model' attribute set to None.

PARAMETER	DESCRIPTION
`self`	An instance of the ErnieMTokenizer class.

RETURNS	DESCRIPTION
`dict`	A copy of the instance's `__dict__` with the 'sp_model' attribute set to None.

Source code in mindnlp\transformers\models\ernie_m\tokenization_ernie_m.py

def __getstate__(self):
    """
    Method: __getstate__

    Description:
        This method is used to retrieve the state of an instance of the ErnieMTokenizer class.
        It returns a dictionary representing the current state of the instance, with the 'sp_model' attribute set
        to None.

    Args:
        self: An instance of the ErnieMTokenizer class.

    Returns:
        dict: A copy of the instance's '__dict__' with the 'sp_model' attribute set to None.

    Raises:
        None.

    """
    state = self.__dict__.copy()
    state["sp_model"] = None
    return state

`mindnlp.transformers.models.ernie_m.tokenization_ernie_m.ErnieMTokenizer.__init__(sentencepiece_model_ckpt, vocab_file=None, do_lower_case=False, encoding='utf8', unk_token='[UNK]', sep_token='[SEP]', pad_token='[PAD]', cls_token='[CLS]', mask_token='[MASK]', sp_model_kwargs=None, **kwargs)` ¶

Initialize the ErnieMTokenizer class.

PARAMETER	DESCRIPTION
`sentencepiece_model_ckpt`	The path to the sentencepiece model checkpoint file. TYPE: `str`
`vocab_file`	The path to the vocabulary file. Defaults to None. TYPE: `str` DEFAULT: `None`
`do_lower_case`	A flag indicating whether to convert tokens to lowercase. Defaults to False. TYPE: `bool` DEFAULT: `False`
`encoding`	The character encoding to be used. Defaults to 'utf8'. TYPE: `str` DEFAULT: `'utf8'`
`unk_token`	The token representing unknown words. Defaults to '[UNK]'. TYPE: `str` DEFAULT: `'[UNK]'`
`sep_token`	The token representing sentence separation. Defaults to '[SEP]'. TYPE: `str` DEFAULT: `'[SEP]'`
`pad_token`	The token representing padding. Defaults to '[PAD]'. TYPE: `str` DEFAULT: `'[PAD]'`
`cls_token`	The token representing classification. Defaults to '[CLS]'. TYPE: `str` DEFAULT: `'[CLS]'`
`mask_token`	The token representing masking. Defaults to '[MASK]'. TYPE: `str` DEFAULT: `'[MASK]'`
`sp_model_kwargs`	Additional keyword arguments for the SentencePiece model. Defaults to None. TYPE: `Optional[Dict[str, Any]]` DEFAULT: `None`

RETURNS	DESCRIPTION
`None`	None.

Source code in mindnlp\transformers\models\ernie_m\tokenization_ernie_m.py

def __init__(
    self,
    sentencepiece_model_ckpt,
    vocab_file=None,
    do_lower_case=False,
    encoding="utf8",
    unk_token="[UNK]",
    sep_token="[SEP]",
    pad_token="[PAD]",
    cls_token="[CLS]",
    mask_token="[MASK]",
    sp_model_kwargs: Optional[Dict[str, Any]] = None,
    **kwargs,
) -> None:
    """
    Initialize the ErnieMTokenizer class.

    Args:
        sentencepiece_model_ckpt (str): The path to the sentencepiece model checkpoint file.
        vocab_file (str): The path to the vocabulary file. Defaults to None.
        do_lower_case (bool): A flag indicating whether to convert tokens to lowercase. Defaults to False.
        encoding (str): The character encoding to be used. Defaults to 'utf8'.
        unk_token (str): The token representing unknown words. Defaults to '[UNK]'.
        sep_token (str): The token representing sentence separation. Defaults to '[SEP]'.
        pad_token (str): The token representing padding. Defaults to '[PAD]'.
        cls_token (str): The token representing classification. Defaults to '[CLS]'.
        mask_token (str): The token representing masking. Defaults to '[MASK]'.
        sp_model_kwargs (Optional[Dict[str, Any]]): Additional keyword arguments for the SentencePiece model. Defaults to None.

    Returns:
        None.

    Raises:
        None.
    """
    # Mask token behave like a normal word, i.e. include the space before it and
    # is included in the raw text, there should be a match in a non-normalized sentence.

    self.sp_model_kwargs = {} if sp_model_kwargs is None else sp_model_kwargs

    self.do_lower_case = do_lower_case
    self.sentencepiece_model_ckpt = sentencepiece_model_ckpt
    self.sp_model = spm.SentencePieceProcessor(**self.sp_model_kwargs)
    self.sp_model.Load(sentencepiece_model_ckpt)

    # to mimic paddlenlp.transformers.ernie_m.tokenizer.ErnieMTokenizer functioning
    if vocab_file is not None:
        self.vocab = self.load_vocab(filepath=vocab_file)
    else:
        self.vocab = {self.sp_model.id_to_piece(id): id for id in range(self.sp_model.get_piece_size())}
    self.reverse_vocab = {v: k for k, v in self.vocab.items()}

    super().__init__(
        do_lower_case=do_lower_case,
        unk_token=unk_token,
        sep_token=sep_token,
        pad_token=pad_token,
        cls_token=cls_token,
        mask_token=mask_token,
        vocab_file=vocab_file,
        encoding=encoding,
        sp_model_kwargs=self.sp_model_kwargs,
        **kwargs,
    )

    self.SP_CHAR_MAPPING = {}

    for ch in range(65281, 65375):
        if ch in [ord('～')]:
            self.SP_CHAR_MAPPING[chr(ch)] = chr(ch)
            continue
        self.SP_CHAR_MAPPING[chr(ch)] = chr(ch - 65248)
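
The loop at the end of `__init__` fills `SP_CHAR_MAPPING` with the full-width ASCII variants (U+FF01 to U+FF5E) and their half-width counterparts, which sit exactly 65248 (0xFEE0) code points lower; the full-width tilde is mapped to itself. As a standalone illustration, the helper names below are hypothetical and not part of the tokenizer:

```python
def build_sp_char_mapping():
    """Full-width to half-width table, mirroring the loop in __init__."""
    mapping = {}
    for code in range(0xFF01, 0xFF5F):  # same bounds as range(65281, 65375)
        ch = chr(code)
        # The full-width tilde is kept unchanged; everything else shifts by 0xFEE0.
        mapping[ch] = ch if ch == "\uff5e" else chr(code - 0xFEE0)
    return mapping


def to_half_width(text, mapping):
    """Apply the table character by character, as clean_text does."""
    return "".join(mapping.get(c, c) for c in text)


mapping = build_sp_char_mapping()
converted = to_half_width("\uff21\uff22\uff23\uff11\uff12\uff13\uff01", mapping)
print(converted)  # ABC123!
```

Characters outside the table, including ordinary ASCII and CJK text, pass through unchanged because of the `mapping.get(c, c)` fallback.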

`mindnlp.transformers.models.ernie_m.tokenization_ernie_m.ErnieMTokenizer.__setstate__(d)` ¶

Sets the state of the ErnieMTokenizer object from a serialized state dictionary.

PARAMETER	DESCRIPTION
`self`	The instance of the ErnieMTokenizer class. TYPE: `ErnieMTokenizer`
`d`	The serialized state dictionary containing the attributes to be set. TYPE: `dict`

RETURNS	DESCRIPTION
	None.

Note

This method is automatically called when an ErnieMTokenizer object is loaded from a serialized state. It sets the attributes of the object using the values from the serialized state dictionary.

The 'self.__dict__' attribute is replaced with the 'd' dictionary.

If the 'sp_model_kwargs' attribute is not present in the serialized state, it is initialized as an empty dictionary.

The SentencePieceProcessor object 'self.sp_model' is initialized using the 'spm.SentencePieceProcessor' class. The 'self.sp_model_kwargs' dictionary is passed as keyword arguments to the SentencePieceProcessor constructor.

Finally, the sentencepiece model is loaded into the SentencePieceProcessor object using 'self.sentencepiece_model_ckpt'.

Note that this method assumes the 'spm' module has been imported and is available in the current namespace.

Source code in mindnlp\transformers\models\ernie_m\tokenization_ernie_m.py

def __setstate__(self, d):
    """
    Sets the state of the ErnieMTokenizer object from a serialized state dictionary.

    Args:
        self (ErnieMTokenizer): The instance of the ErnieMTokenizer class.
        d (dict): The serialized state dictionary containing the attributes to be set.

    Returns:
        None.

    Raises:
        None.

    Note:
        This method is automatically called when an ErnieMTokenizer object is loaded from a serialized state.
        It sets the attributes of the object using the values from the serialized state dictionary.

        The 'self.__dict__' attribute is updated with the values from the 'd' dictionary.

        If the 'sp_model_kwargs' attribute is not present in the serialized state, it is initialized as an empty dictionary.

        The SentencePieceProcessor object 'self.sp_model' is initialized using the 'spm.SentencePieceProcessor' class.
        The 'self.sp_model_kwargs' dictionary is passed as keyword arguments to the SentencePieceProcessor constructor.

        Finally, the sentencepiece model is loaded into the SentencePieceProcessor object using 'self.sentencepiece_model_ckpt'.

        Note that this method assumes the 'spm' module has been imported and is available in the current namespace.
    """
    self.__dict__ = d

    # for backward compatibility
    if not hasattr(self, "sp_model_kwargs"):
        self.sp_model_kwargs = {}

    self.sp_model = spm.SentencePieceProcessor(**self.sp_model_kwargs)
    self.sp_model.Load(self.sentencepiece_model_ckpt)
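
The restore logic above can be tried without SentencePiece installed. The following is a minimal sketch, assuming a hypothetical `FakeProcessor` stand-in for `spm.SentencePieceProcessor` and a hypothetical `ToyTokenizer` class; neither is part of mindnlp. It feeds `__setstate__` a state dictionary that lacks `sp_model_kwargs`, which is the backward-compatibility case.

```python
class FakeProcessor:
    """Hypothetical stand-in for spm.SentencePieceProcessor (not the real API)."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs
        self.loaded_from = None

    def Load(self, path):
        # The real processor would parse the model file; here we only record the path.
        self.loaded_from = path


class ToyTokenizer:
    """Hypothetical class that copies the __setstate__ logic shown above."""

    def __setstate__(self, d):
        self.__dict__ = d

        # Older serialized states may not contain sp_model_kwargs.
        if not hasattr(self, "sp_model_kwargs"):
            self.sp_model_kwargs = {}

        self.sp_model = FakeProcessor(**self.sp_model_kwargs)
        self.sp_model.Load(self.sentencepiece_model_ckpt)


# Simulate what unpickling does: create a bare instance, then restore its state.
old_state = {"sentencepiece_model_ckpt": "toy.model"}  # no sp_model_kwargs key
restored = ToyTokenizer.__new__(ToyTokenizer)
restored.__setstate__(old_state)

print(restored.sp_model_kwargs)       # {}
print(restored.sp_model.loaded_from)  # toy.model
```

The processor itself is never stored in the state; it is rebuilt from the checkpoint path each time the object is restored.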

`mindnlp.transformers.models.ernie_m.tokenization_ernie_m.ErnieMTokenizer.build_inputs_with_special_tokens(token_ids_0, token_ids_1=None)` ¶

Build model inputs from a single sequence or a pair of sequences for sequence classification tasks by concatenating them and adding special tokens. An ErnieM sequence has the following format:

single sequence: [CLS] X [SEP]
pair of sequences: [CLS] A [SEP] [SEP] B [SEP]

PARAMETER	DESCRIPTION
`token_ids_0`	List of IDs to which the special tokens will be added. TYPE: `List[int]`
`token_ids_1`	Optional second list of IDs for sequence pairs. TYPE: `List[int]`, optional DEFAULT: `None`

RETURNS	DESCRIPTION
	`List[int]`: List of input IDs with the appropriate special tokens.

Source code in mindnlp\transformers\models\ernie_m\tokenization_ernie_m.py

def build_inputs_with_special_tokens(self, token_ids_0, token_ids_1=None):
    r"""
    Build model inputs from a sequence or a pair of sequences for sequence classification tasks by concatenating and
    adding special tokens. An ErnieM sequence has the following format:

    - single sequence: `[CLS] X [SEP]`
    - pair of sequences: `[CLS] A [SEP] [SEP] B [SEP]`

    Args:
        token_ids_0 (`List[int]`):
            List of IDs to which the special tokens will be added.
        token_ids_1 (`List[int]`, *optional*):
            Optional second list of IDs for sequence pairs.

    Returns:
        `List[int]`: List of input_id with the appropriate special tokens.
    """
    if token_ids_1 is None:
        return [self.cls_token_id] + token_ids_0 + [self.sep_token_id]
    _cls = [self.cls_token_id]
    _sep = [self.sep_token_id]
    return _cls + token_ids_0 + _sep + _sep + token_ids_1 + _sep
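
To make the two layouts concrete, here is a small standalone sketch of the same concatenation. The IDs `CLS_ID = 0` and `SEP_ID = 2` and the function name `build_inputs` are illustrative assumptions; the real values come from the tokenizer's `cls_token_id` and `sep_token_id`.

```python
CLS_ID, SEP_ID = 0, 2  # hypothetical special-token IDs, for illustration only


def build_inputs(token_ids_0, token_ids_1=None):
    """Standalone copy of the concatenation logic shown above."""
    if token_ids_1 is None:
        return [CLS_ID] + token_ids_0 + [SEP_ID]
    # Pairs are joined by two separators, not one.
    return [CLS_ID] + token_ids_0 + [SEP_ID, SEP_ID] + token_ids_1 + [SEP_ID]


single = build_inputs([11, 12, 13])
pair = build_inputs([11, 12], [21, 22, 23])

print(single)  # [0, 11, 12, 13, 2]
print(pair)    # [0, 11, 12, 2, 2, 21, 22, 23, 2]
```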

`mindnlp.transformers.models.ernie_m.tokenization_ernie_m.ErnieMTokenizer.build_offset_mapping_with_special_tokens(offset_mapping_0, offset_mapping_1=None)` ¶

Build an offset map from a single offset map or a pair of offset maps by concatenating them and adding the offsets of special tokens. An Ernie-M offset_mapping has the following format:

single sequence: (0,0) X (0,0)
pair of sequences: (0,0) A (0,0) (0,0) B (0,0)

PARAMETER	DESCRIPTION
`offset_mapping_0`	List of char offsets to which the special tokens will be added. TYPE: `List[tuple]`
`offset_mapping_1`	Optional second list of wordpiece offsets for offset mapping pairs. TYPE: `List[tuple]`, optional DEFAULT: `None`

RETURNS	DESCRIPTION
	`List[tuple]`: List of wordpiece offsets with the appropriate offsets of special tokens.

Source code in mindnlp\transformers\models\ernie_m\tokenization_ernie_m.py

def build_offset_mapping_with_special_tokens(self, offset_mapping_0, offset_mapping_1=None):
    r"""
    Build offset map from a pair of offset map by concatenating and adding offsets of special tokens. An Ernie-M
    offset_mapping has the following format:

    - single sequence: `(0,0) X (0,0)`
    - pair of sequences: `(0,0) A (0,0) (0,0) B (0,0)`

    Args:
        offset_mapping_0 (`List[tuple]`):
            List of char offsets to which the special tokens will be added.
        offset_mapping_1 (`List[tuple]`, *optional*):
            Optional second list of wordpiece offsets for offset mapping pairs.

    Returns:
        `List[tuple]`: List of wordpiece offsets with the appropriate offsets of special tokens.
    """
    if offset_mapping_1 is None:
        return [(0, 0)] + offset_mapping_0 + [(0, 0)]

    return [(0, 0)] + offset_mapping_0 + [(0, 0), (0, 0)] + offset_mapping_1 + [(0, 0)]

`mindnlp.transformers.models.ernie_m.tokenization_ernie_m.ErnieMTokenizer.clean_text(text)` ¶

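Normalizes text by replacing each character found in `SP_CHAR_MAPPING` (full-width ASCII variants) with its half-width equivalent; all other characters pass through unchanged. Although the docstring mentions invalid character removal and whitespace cleanup, the code shown here only applies this mapping.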

Source code in mindnlp\transformers\models\ernie_m\tokenization_ernie_m.py

def clean_text(self, text):
    """Performs invalid character removal and whitespace cleanup on text."""
    return "".join((self.SP_CHAR_MAPPING.get(c, c) for c in text))
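
The effect of the mapping is easiest to see on a small input. The sketch below rebuilds `SP_CHAR_MAPPING` the same way the constructor does (code points 65281 to 65374, with the full-width tilde mapped to itself) and applies it as a free function; the module-level names are illustrative, not part of the library.

```python
# Rebuild the mapping as in __init__: full-width forms U+FF01..U+FF5E map to
# ASCII by subtracting 65248 (0xFEE0), except the full-width tilde U+FF5E.
SP_CHAR_MAPPING = {}
for code_point in range(65281, 65375):
    ch = chr(code_point)
    if ch == "\uff5e":  # full-width tilde is kept as-is
        SP_CHAR_MAPPING[ch] = ch
        continue
    SP_CHAR_MAPPING[ch] = chr(code_point - 65248)


def clean_text(text):
    """Replace mapped characters; leave everything else untouched."""
    return "".join(SP_CHAR_MAPPING.get(c, c) for c in text)


cleaned = clean_text("\uff21\uff22\uff23\uff11\uff12\uff13\uff01 ok")
print(cleaned)  # ABC123! ok
```

Plain ASCII input and whitespace are returned unchanged, since neither appears as a key in the mapping.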

`mindnlp.transformers.models.ernie_m.tokenization_ernie_m.ErnieMTokenizer.convert_ids_to_string(ids)` ¶

Converts a sequence of token IDs into a single string.

Source code in mindnlp\transformers\models\ernie_m\tokenization_ernie_m.py

def convert_ids_to_string(self, ids):
    """
    Converts a sequence of tokens (strings for sub-words) in a single string.
    """
    tokens = self.convert_ids_to_tokens(ids)
    out_string = "".join(tokens).replace(SPIECE_UNDERLINE, " ").strip()
    return out_string

`mindnlp.transformers.models.ernie_m.tokenization_ernie_m.ErnieMTokenizer.convert_tokens_to_string(tokens)` ¶

Converts a sequence of tokens (sub-word strings) into a single string.

Source code in mindnlp\transformers\models\ernie_m\tokenization_ernie_m.py

def convert_tokens_to_string(self, tokens):
    """Converts a sequence of tokens (strings for sub-words) in a single string."""
    out_string = "".join(tokens).replace(SPIECE_UNDERLINE, " ").strip()
    return out_string
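
As a quick illustration of the join-and-replace step, the sketch below applies the same expression to a hand-written token list. It assumes `SPIECE_UNDERLINE` is the SentencePiece word-boundary marker `▁` (U+2581); the tokens are made up and not produced by a real model.

```python
SPIECE_UNDERLINE = "\u2581"  # assumed value of the constant: the '▁' marker


def convert_tokens_to_string(tokens):
    """Join sub-word pieces and turn boundary markers back into spaces."""
    return "".join(tokens).replace(SPIECE_UNDERLINE, " ").strip()


text = convert_tokens_to_string(["\u2581Hello", "\u2581wor", "ld"])
print(text)  # Hello world
```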

`mindnlp.transformers.models.ernie_m.tokenization_ernie_m.ErnieMTokenizer.create_token_type_ids_from_sequences(token_ids_0, token_ids_1=None)` ¶

Create the token type IDs corresponding to the sequences passed. Should be overridden in a subclass if the model has a special way of building those.

PARAMETER	DESCRIPTION
`token_ids_0`	The first tokenized sequence. TYPE: `List[int]`
`token_ids_1`	The second tokenized sequence. TYPE: `List[int]`, optional DEFAULT: `None`

RETURNS	DESCRIPTION
`List[int]`	The token type IDs.

Source code in mindnlp\transformers\models\ernie_m\tokenization_ernie_m.py

def create_token_type_ids_from_sequences(
    self, token_ids_0: List[int], token_ids_1: Optional[List[int]] = None
) -> List[int]:
    """
    Create the token type IDs corresponding to the sequences passed. [What are token type
    IDs?](../glossary#token-type-ids) Should be overridden in a subclass if the model has a special way of
    building those.

    Args:
        token_ids_0 (`List[int]`):
            The first tokenized sequence.
        token_ids_1 (`List[int]`, *optional*):
            The second tokenized sequence.

    Returns:
        `List[int]`: The token type ids.
    """
    # called when `add_special_tokens` is True, so align with `build_inputs_with_special_tokens` method
    if token_ids_1 is None:
        # [CLS] X [SEP]
        return (len(token_ids_0) + 2) * [0]

    # [CLS] A [SEP] [SEP] B [SEP]
    return [0] * (len(token_ids_0) + 1) + [1] * (len(token_ids_1) + 3)
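
The arithmetic above has one detail worth seeing spelled out: for a pair, only `[CLS]` and sequence A receive type 0, so both middle `[SEP]` tokens fall into segment 1. The sketch below reproduces the computation as a free function and lines it up against a hand-written token layout (the token names are placeholders).

```python
def token_type_ids(token_ids_0, token_ids_1=None):
    """Standalone copy of the length arithmetic shown above."""
    if token_ids_1 is None:
        # [CLS] X [SEP]
        return (len(token_ids_0) + 2) * [0]
    # [CLS] A [SEP] [SEP] B [SEP]
    return [0] * (len(token_ids_0) + 1) + [1] * (len(token_ids_1) + 3)


layout = ["[CLS]", "a1", "a2", "[SEP]", "[SEP]", "b1", "[SEP]"]
type_ids = token_type_ids([11, 12], [21])
aligned = list(zip(layout, type_ids))

print(type_ids)  # [0, 0, 0, 1, 1, 1, 1]
print(aligned[3])  # ('[SEP]', 1)
```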

`mindnlp.transformers.models.ernie_m.tokenization_ernie_m.ErnieMTokenizer.get_offset_mapping(text)` ¶

Computes, for each token produced by `tokenize`, the character offsets in the original text that the token covers.

PARAMETER	DESCRIPTION
`self`	The instance of the ErnieMTokenizer class.
`text`	The input text for which the offset mapping is to be generated. TYPE: `str`

RETURNS	DESCRIPTION
`List[tuple]`	One `(start, end)` pair of character offsets into `text` per token, with `end` exclusive, or `None` if `text` is `None`.

Source code in mindnlp\transformers\models\ernie_m\tokenization_ernie_m.py

def get_offset_mapping(self, text):
    """
    This method is part of the ErnieMTokenizer class and is used to obtain the offset mapping for the given text.

    Args:
        self: The instance of the ErnieMTokenizer class.
        text (str): The input text for which the offset mapping is to be generated.

    Returns:
        List[tuple]: One (start, end) pair of character offsets per token, or None if text is None.

    Raises:
        None
    """
    if text is None:
        return None

    split_tokens = self.tokenize(text)
    normalized_text, char_mapping = "", []

    for i, ch in enumerate(text):
        if ch in self.SP_CHAR_MAPPING:
            ch = self.SP_CHAR_MAPPING.get(ch)
        else:
            ch = unicodedata.normalize("NFKC", ch)
        if self.is_whitespace(ch):
            continue
        normalized_text += ch
        char_mapping.extend([i] * len(ch))

    text, token_mapping, offset = normalized_text, [], 0

    if self.do_lower_case:
        text = text.lower()

    for token in split_tokens:
        if token[:1] == "▁":
            token = token[1:]
        start = text[offset:].index(token) + offset
        end = start + len(token)

        token_mapping.append((char_mapping[start], char_mapping[end - 1] + 1))
        offset = end
    return token_mapping
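
The alignment step can be followed on a toy input. This is a simplified sketch, not the library method: it takes the token list as an argument instead of calling `tokenize`, skips the `SP_CHAR_MAPPING` lookup, and uses a hypothetical function name `offsets_for_tokens`. The core idea is the same: drop whitespace, remember which original index each kept character came from, then locate each token in the compacted text.

```python
import unicodedata


def _is_space(ch):
    # Mirrors is_whitespace: explicit ASCII whitespace or Unicode category Zs.
    if ch in (" ", "\t", "\n", "\r"):
        return True
    return len(ch) == 1 and unicodedata.category(ch) == "Zs"


def offsets_for_tokens(text, tokens, do_lower_case=False):
    normalized, char_mapping = "", []
    for i, ch in enumerate(text):
        ch = unicodedata.normalize("NFKC", ch)
        if _is_space(ch):
            continue
        normalized += ch
        # NFKC may expand one character into several; all map back to index i.
        char_mapping.extend([i] * len(ch))

    if do_lower_case:
        normalized = normalized.lower()

    token_mapping, offset = [], 0
    for token in tokens:
        if token[:1] == "\u2581":  # strip the SentencePiece boundary marker
            token = token[1:]
        start = normalized.index(token, offset)
        end = start + len(token)
        token_mapping.append((char_mapping[start], char_mapping[end - 1] + 1))
        offset = end
    return token_mapping


source = "Hello  world"  # two spaces between the words
offsets = offsets_for_tokens(source, ["\u2581Hello", "\u2581wor", "ld"])
print(offsets)  # [(0, 5), (7, 10), (10, 12)]
```

Each pair can be used to slice the original string, so `source[7:10]` gives back `"wor"` even though the whitespace was removed before matching.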

`mindnlp.transformers.models.ernie_m.tokenization_ernie_m.ErnieMTokenizer.get_special_tokens_mask(token_ids_0, token_ids_1=None, already_has_special_tokens=False)` ¶

Retrieves sequence ids from a token list that has no special tokens added. This method is called when adding special tokens using the tokenizer encode method.

PARAMETER	DESCRIPTION
`token_ids_0`	List of ids of the first sequence. TYPE: `List[int]`
`token_ids_1`	Optional second list of IDs for sequence pairs. TYPE: `List[int]`, optional DEFAULT: `None`
`already_has_special_tokens`	Whether or not the token list is already formatted with special tokens for the model. TYPE: `bool`, optional, defaults to `False` DEFAULT: `False`

RETURNS	DESCRIPTION
	`List[int]`: The list of integers in the range [0, 1]: 1 for a special token, 0 for a sequence token.

Source code in mindnlp\transformers\models\ernie_m\tokenization_ernie_m.py

def get_special_tokens_mask(self, token_ids_0, token_ids_1=None, already_has_special_tokens=False):
    r"""
    Retrieves sequence ids from a token list that has no special tokens added. This method is called when adding
    special tokens using the tokenizer `encode` method.

    Args:
        token_ids_0 (`List[int]`):
            List of ids of the first sequence.
        token_ids_1 (`List[int]`, *optional*):
            Optional second list of IDs for sequence pairs.
        already_has_special_tokens (`bool`, *optional*, defaults to `False`):
            Whether or not the token list is already formatted with special tokens for the model.

    Returns:
        `List[int]`:
            The list of integers in the range [0, 1]: 1 for a special token, 0 for a sequence token.
    """
    if already_has_special_tokens:
        if token_ids_1 is not None:
            raise ValueError(
                "You should not supply a second sequence if the provided sequence of "
                "ids is already formatted with special tokens for the model."
            )
        return [1 if x in [self.sep_token_id, self.cls_token_id] else 0 for x in token_ids_0]

    if token_ids_1 is not None:
        return [1] + ([0] * len(token_ids_0)) + [1, 1] + ([0] * len(token_ids_1)) + [1]
    return [1] + ([0] * len(token_ids_0)) + [1]
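
Both branches of the mask computation can be exercised with a standalone copy of the logic. In this sketch the special-token IDs `CLS_ID = 0` and `SEP_ID = 2` are assumptions chosen for illustration, and the function is a free function rather than a tokenizer method.

```python
CLS_ID, SEP_ID = 0, 2  # hypothetical special-token IDs


def special_tokens_mask(token_ids_0, token_ids_1=None, already_has_special_tokens=False):
    if already_has_special_tokens:
        if token_ids_1 is not None:
            raise ValueError("Do not pass a second sequence when special tokens are already present.")
        # Mark positions by looking at the IDs themselves.
        return [1 if x in (SEP_ID, CLS_ID) else 0 for x in token_ids_0]

    # Otherwise the mask is built purely from the sequence lengths.
    if token_ids_1 is not None:
        return [1] + [0] * len(token_ids_0) + [1, 1] + [0] * len(token_ids_1) + [1]
    return [1] + [0] * len(token_ids_0) + [1]


pair_mask = special_tokens_mask([11, 12], [21])
formatted_mask = special_tokens_mask([0, 11, 12, 2], already_has_special_tokens=True)

print(pair_mask)       # [1, 0, 0, 1, 1, 0, 1]
print(formatted_mask)  # [1, 0, 0, 1]
```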

`mindnlp.transformers.models.ernie_m.tokenization_ernie_m.ErnieMTokenizer.get_vocab()` ¶

Get the vocabulary of the tokenizer.

PARAMETER	DESCRIPTION
`self`	The instance of the ErnieMTokenizer class.

RETURNS	DESCRIPTION
`dict`	A dictionary representing the vocabulary of the tokenizer. It contains the original vocabulary along with any added tokens.

Source code in mindnlp\transformers\models\ernie_m\tokenization_ernie_m.py

def get_vocab(self):
    """
    Get the vocabulary of the tokenizer.

    Args:
        self: The instance of the ErnieMTokenizer class.

    Returns:
        dict: A dictionary representing the vocabulary of the tokenizer. It contains the original vocabulary 
            along with any added tokens.

    Raises:
        None.
    """
    return dict(self.vocab, **self.added_tokens_encoder)

`mindnlp.transformers.models.ernie_m.tokenization_ernie_m.ErnieMTokenizer.is_alpha(char)` ¶

Returns `True` if `char` is an ASCII letter (`a`-`z` or `A`-`Z`), otherwise `False`.

Source code in mindnlp\transformers\models\ernie_m\tokenization_ernie_m.py

def is_alpha(self, char):
    """
    is_alpha
    """
    if ("a" <= char <= "z") or ("A" <= char <= "Z"):
        return True
    return False

`mindnlp.transformers.models.ernie_m.tokenization_ernie_m.ErnieMTokenizer.is_ch_char(char)` ¶

Returns `True` if `char` lies in the CJK Unified Ideographs range U+4E00 to U+9FFF, otherwise `False`.

Source code in mindnlp\transformers\models\ernie_m\tokenization_ernie_m.py

def is_ch_char(self, char):
    """
    is_ch_char
    """
    if "\u4e00" <= char <= "\u9fff":
        return True
    return False

`mindnlp.transformers.models.ernie_m.tokenization_ernie_m.ErnieMTokenizer.is_punct(char)` ¶

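Returns `True` if `char` is one of the ASCII or Chinese punctuation marks `,;:.?!~，；：。？！《》【】`, otherwise `False`.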

Source code in mindnlp\transformers\models\ernie_m\tokenization_ernie_m.py

def is_punct(self, char):
    """
    is_punct
    """
    if char in ",;:.?!~，；：。？！《》【】":
        return True
    return False

`mindnlp.transformers.models.ernie_m.tokenization_ernie_m.ErnieMTokenizer.is_whitespace(char)` ¶

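Returns `True` if `char` is a space, tab, newline or carriage return, or a single character in the Unicode category `Zs` (space separator); otherwise `False`.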

Source code in mindnlp\transformers\models\ernie_m\tokenization_ernie_m.py

def is_whitespace(self, char):
    """
    is whitespace
    """
    if char in (' ', '\t', '\n', '\r'):
        return True
    if len(char) == 1:
        cat = unicodedata.category(char)
        if cat == "Zs":
            return True
    return False
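
The `Zs` check matters for non-ASCII spaces. The sketch below copies the predicate into a free function and probes a few characters; the chosen test characters are illustrative.

```python
import unicodedata


def is_whitespace(char):
    """Standalone copy of the predicate shown above."""
    if char in (" ", "\t", "\n", "\r"):
        return True
    if len(char) == 1:
        # Zs is the Unicode "space separator" category.
        if unicodedata.category(char) == "Zs":
            return True
    return False


checks = {
    "space": is_whitespace(" "),
    "ideographic space U+3000": is_whitespace("\u3000"),
    "no-break space U+00A0": is_whitespace("\u00a0"),
    "zero width space U+200B": is_whitespace("\u200b"),  # category Cf, not Zs
    "vertical tab": is_whitespace("\x0b"),               # category Cc, not listed
}
for name, result in checks.items():
    print(f"{name}: {result}")
```

Note that this is narrower than `str.isspace`: control characters such as the vertical tab are not treated as whitespace here.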

`mindnlp.transformers.models.ernie_m.tokenization_ernie_m.ErnieMTokenizer.load_vocab(filepath)` ¶

This method loads a vocabulary from a specified file path into a token-to-index mapping within the ErnieMTokenizer class.

PARAMETER	DESCRIPTION
`self`	The instance of the ErnieMTokenizer class. TYPE: `ErnieMTokenizer`
`filepath`	The path to the file containing the vocabulary. The file should be encoded in UTF-8 format. TYPE: `str`

RETURNS	DESCRIPTION
`dict`	A dictionary mapping tokens to their corresponding indices in the loaded vocabulary.

RAISES	DESCRIPTION
`IOError`	If the specified file path is invalid or inaccessible.
`ValueError`	If the index conversion to integer fails during token-to-index mapping.

Source code in mindnlp\transformers\models\ernie_m\tokenization_ernie_m.py

def load_vocab(self, filepath):
    """
    This method loads a vocabulary from a specified file path into a token-to-index mapping within the ErnieMTokenizer class.

    Args:
        self (ErnieMTokenizer): The instance of the ErnieMTokenizer class.
        filepath (str): The path to the file containing the vocabulary. The file should be encoded in UTF-8 format.

    Returns:
        dict: A dictionary mapping tokens to their corresponding indices in the loaded vocabulary.

    Raises:
        IOError: If the specified file path is invalid or inaccessible.
        ValueError: If the index conversion to integer fails during token-to-index mapping.
    """
    token_to_idx = {}
    with io.open(filepath, "r", encoding="utf-8") as f:
        for index, line in enumerate(f):
            token = line.rstrip("\n")
            token_to_idx[token] = int(index)

    return token_to_idx
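
The vocabulary file format implied above is one token per line, with the zero-based line number serving as the token's index. A minimal sketch, using a temporary file with made-up tokens and a free-function copy of the loader:

```python
import os
import tempfile


def load_vocab(filepath):
    """Map each line's token to its zero-based line number."""
    token_to_idx = {}
    with open(filepath, "r", encoding="utf-8") as f:
        for index, line in enumerate(f):
            token_to_idx[line.rstrip("\n")] = index
    return token_to_idx


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "vocab.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("<pad>\n<s>\n</s>\n\u2581hello\n")  # illustrative tokens only
    vocab = load_vocab(path)

print(vocab)  # {'<pad>': 0, '<s>': 1, '</s>': 2, '▁hello': 3}
```

Because the index comes from `enumerate`, a token that appears twice in the file keeps the index of its last occurrence.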

`mindnlp.transformers.models.ernie_m.tokenization_ernie_m.ErnieMTokenizer.save_vocabulary(save_directory, filename_prefix=None)` ¶

Save the vocabulary and tokenizer model.

PARAMETER	DESCRIPTION
`self`	The instance of the ErnieMTokenizer class.
`save_directory`	The directory where the vocabulary and tokenizer model will be saved. TYPE: `str`
`filename_prefix`	The prefix to be added to the filename. Defaults to None. TYPE: `Optional[str]` DEFAULT: `None`

RETURNS	DESCRIPTION
`Tuple[str]`	A tuple containing the file path of the saved vocabulary.

RAISES	DESCRIPTION
`OSError`	If the save_directory does not exist or is not a valid directory.
`IOError`	If there is an issue with writing the vocabulary or tokenizer model files.
`Warning`	Not raised as an exception. A warning is logged if the vocabulary indices are not consecutive, which may indicate a corrupted vocabulary.

Source code in mindnlp\transformers\models\ernie_m\tokenization_ernie_m.py

def save_vocabulary(self, save_directory: str, filename_prefix: Optional[str] = None) -> Tuple[str]:
    """
    Save the vocabulary and tokenizer model.

    Args:
        self: The instance of the ErnieMTokenizer class.
        save_directory (str): The directory where the vocabulary and tokenizer model will be saved.
        filename_prefix (Optional[str]): The prefix to be added to the filename. Defaults to None.

    Returns:
        Tuple[str]: A tuple containing the file path of the saved vocabulary.

    Raises:
        OSError: If the save_directory does not exist or is not a valid directory.
        IOError: If there is an issue with writing the vocabulary or tokenizer model files.
        Warning: If the vocabulary indices are not consecutive, indicating a potential corruption in the vocabulary.
    """
    index = 0
    if os.path.isdir(save_directory):
        vocab_file = os.path.join(
            save_directory, (filename_prefix + "-" if filename_prefix else "") + VOCAB_FILES_NAMES["vocab_file"]
        )
    else:
        vocab_file = (filename_prefix + "-" if filename_prefix else "") + save_directory
    with open(vocab_file, "w", encoding="utf-8") as writer:
        for token, token_index in sorted(self.vocab.items(), key=lambda kv: kv[1]):
            if index != token_index:
                logger.warning(
                    f"Saving vocabulary to {vocab_file}: vocabulary indices are not consecutive."
                    " Please check that the vocabulary is not corrupted!"
                )
                index = token_index
            writer.write(token + "\n")
            index += 1

    tokenizer_model_file = os.path.join(save_directory, "sentencepiece.bpe.model")
    with open(tokenizer_model_file, "wb") as fi:
        content_spiece_model = self.sp_model.serialized_model_proto()
        fi.write(content_spiece_model)

    return (vocab_file,)
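
The consecutive-index check in the write loop can be isolated for inspection. The sketch below is a hypothetical helper, `find_index_gaps`, that walks the vocabulary in index order the same way and reports where the warning would fire instead of logging; it is not part of the tokenizer.

```python
def find_index_gaps(vocab):
    """Return (token, actual_index, expected_index) for each break in the sequence."""
    gaps = []
    expected = 0
    for token, token_index in sorted(vocab.items(), key=lambda kv: kv[1]):
        if expected != token_index:
            gaps.append((token, token_index, expected))
            expected = token_index  # resynchronize, as the original loop does
        expected += 1
    return gaps


clean = find_index_gaps({"a": 0, "b": 1, "c": 2})
broken = find_index_gaps({"a": 0, "b": 1, "d": 3})

print(clean)   # []
print(broken)  # [('d', 3, 2)]
```

Since the file stores only tokens, one per line, a gap means the indices recovered by reloading the saved file will differ from the in-memory ones.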

`mindnlp.transformers.models.ernie_m.configuration_ernie_m` ¶

`mindnlp.transformers.models.ernie_m.configuration_ernie_m.ErnieMConfig` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMAttention` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMAttention.init(config, position_embedding_type=None)` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMAttention.forward(hidden_states, attention_mask=None, head_mask=None, encoder_hidden_states=None, encoder_attention_mask=None, past_key_value=None, output_attentions=False)` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMAttention.prune_heads(heads)` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMEmbeddings` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMEmbeddings.init(config)` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMEmbeddings.forward(input_ids=None, position_ids=None, inputs_embeds=None, past_key_values_length=0)` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMEncoder` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMEncoder.init(config)` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMEncoder.forward(input_embeds, attention_mask=None, head_mask=None, past_key_values=None, output_attentions=False, output_hidden_states=False, return_dict=True)` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMEncoderLayer` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMEncoderLayer.init(config)` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMEncoderLayer.forward(hidden_states, attention_mask=None, head_mask=None, past_key_value=None, output_attentions=True)` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMForInformationExtraction` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMForInformationExtraction.init(config)` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMForInformationExtraction.forward(input_ids=None, attention_mask=None, position_ids=None, head_mask=None, inputs_embeds=None, start_positions=None, end_positions=None, output_attentions=None, output_hidden_states=None, return_dict=True)` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMForMultipleChoice` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMForMultipleChoice.init(config)` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMForMultipleChoice.forward(input_ids=None, attention_mask=None, position_ids=None, head_mask=None, inputs_embeds=None, labels=None, output_attentions=None, output_hidden_states=None, return_dict=True)` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMForQuestionAnswering` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMForQuestionAnswering.init(config)` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMForQuestionAnswering.forward(input_ids=None, attention_mask=None, position_ids=None, head_mask=None, inputs_embeds=None, start_positions=None, end_positions=None, output_attentions=None, output_hidden_states=None, return_dict=True)` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMForSequenceClassification` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMForSequenceClassification.__init__(config)` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMForTokenClassification` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMForTokenClassification.__init__(config)` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMForTokenClassification.forward(input_ids=None, attention_mask=None, position_ids=None, head_mask=None, inputs_embeds=None, past_key_values=None, output_hidden_states=None, output_attentions=None, return_dict=True, labels=None)` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMModel` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMModel.__init__(config, add_pooling_layer=True)` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMModel.forward(input_ids=None, position_ids=None, attention_mask=None, head_mask=None, inputs_embeds=None, past_key_values=None, use_cache=None, output_hidden_states=None, output_attentions=None, return_dict=None)` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMModel.get_input_embeddings()` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMModel.set_input_embeddings(value)` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMPooler` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMPooler.__init__(config)` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMPooler.forward(hidden_states)` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMPreTrainedModel` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMSelfAttention` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMSelfAttention.__init__(config, position_embedding_type=None)` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMSelfAttention.forward(hidden_states, attention_mask=None, head_mask=None, encoder_hidden_states=None, encoder_attention_mask=None, past_key_value=None, output_attentions=False)` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.ErnieMSelfAttention.transpose_for_scores(x)` ¶
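The listing gives only the signature of `transpose_for_scores`. In BERT-style self-attention, a helper with this name conventionally splits the hidden axis into attention heads, turning a `[batch, seq_len, hidden_size]` tensor into `[batch, num_heads, seq_len, head_dim]`. The sketch below assumes that convention and uses plain nested lists rather than MindSpore tensors. The `num_heads` argument is hypothetical: the real method takes only `x` and presumably reads the head count from the module.

```python
def transpose_for_scores(x, num_heads):
    """Illustrative only: split the hidden axis into heads.

    x is a nested list shaped [batch][seq_len][hidden_size].
    Returns a nested list shaped [batch][num_heads][seq_len][head_dim].
    num_heads is an assumed explicit argument, not part of the documented signature.
    """
    hidden_size = len(x[0][0])
    if hidden_size % num_heads != 0:
        raise ValueError("hidden_size must be divisible by num_heads")
    head_dim = hidden_size // num_heads
    return [
        [
            # For head h, take that head's slice of every token vector.
            [token[h * head_dim:(h + 1) * head_dim] for token in sequence]
            for h in range(num_heads)
        ]
        for sequence in x
    ]


# One sequence of two tokens with hidden_size 4, split into 2 heads of size 2.
example = [[[0, 1, 2, 3], [4, 5, 6, 7]]]
split = transpose_for_scores(example, num_heads=2)
```

With the default configuration values listed earlier (`hidden_size` 768 and `num_attention_heads` 12), each head would have 64 dimensions under this convention.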

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.UIEM` ¶

`mindnlp.transformers.models.ernie_m.modeling_ernie_m.UIEM.forward(input_ids=None, attention_mask=None, position_ids=None, head_mask=None, inputs_embeds=None, start_positions=None, end_positions=None, output_attentions=None, output_hidden_states=None, return_dict=True)` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMAttention` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMAttention.__init__(config, position_embedding_type=None)` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMAttention.forward(hidden_states, attention_mask=None, head_mask=None, encoder_hidden_states=None, encoder_attention_mask=None, past_key_value=None, output_attentions=False)` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMAttention.prune_heads(heads)` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMEmbeddings` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMEmbeddings.__init__(config)` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMEmbeddings.forward(input_ids=None, position_ids=None, inputs_embeds=None, past_key_values_length=0)` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMEncoder` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMEncoder.__init__(config)` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMEncoder.forward(input_embeds, attention_mask=None, head_mask=None, past_key_values=None, output_attentions=False, output_hidden_states=False)` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMEncoderLayer` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMEncoderLayer.__init__(config)` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMEncoderLayer.forward(hidden_states, attention_mask=None, head_mask=None, past_key_value=None, output_attentions=True)` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMForInformationExtraction` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMForInformationExtraction.__init__(config)` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMForInformationExtraction.forward(input_ids=None, attention_mask=None, position_ids=None, head_mask=None, inputs_embeds=None, start_positions=None, end_positions=None, output_attentions=None, output_hidden_states=None)` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMForMultipleChoice` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMForMultipleChoice.__init__(config)` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMForMultipleChoice.forward(input_ids=None, attention_mask=None, position_ids=None, head_mask=None, inputs_embeds=None, labels=None, output_attentions=None, output_hidden_states=None)` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMForQuestionAnswering` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMForQuestionAnswering.__init__(config)` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMForQuestionAnswering.forward(input_ids=None, attention_mask=None, position_ids=None, head_mask=None, inputs_embeds=None, start_positions=None, end_positions=None, output_attentions=None, output_hidden_states=None)` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMForSequenceClassification` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMForSequenceClassification.__init__(config)` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMForSequenceClassification.forward(input_ids=None, attention_mask=None, position_ids=None, head_mask=None, inputs_embeds=None, past_key_values=None, use_cache=None, output_hidden_states=None, output_attentions=None, labels=None)` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMForTokenClassification` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMForTokenClassification.__init__(config)` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMForTokenClassification.forward(input_ids=None, attention_mask=None, position_ids=None, head_mask=None, inputs_embeds=None, past_key_values=None, output_hidden_states=None, output_attentions=None, labels=None)` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMModel` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMModel.__init__(config, add_pooling_layer=True)` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMModel.forward(input_ids=None, position_ids=None, attention_mask=None, head_mask=None, inputs_embeds=None, past_key_values=None, use_cache=None, output_hidden_states=None, output_attentions=None)` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMModel.get_input_embeddings()` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMModel.set_input_embeddings(value)` ¶

`mindnlp.transformers.models.ernie_m.modeling_graph_ernie_m.MSErnieMPooler` ¶