mpnet

`mindnlp.transformers.models.mpnet.configuration_mpnet` ¶

MPNet model configuration

`mindnlp.transformers.models.mpnet.configuration_mpnet.MPNetConfig` ¶

Bases: PretrainedConfig

This is the configuration class that stores the configuration of an `MPNetModel` or a `TFMPNetModel`. It is used to instantiate an MPNet model according to the specified arguments, which define the model architecture. Instantiating a configuration with the defaults yields a configuration similar to that of the microsoft/mpnet-base architecture.

Source code in mindnlp\transformers\models\mpnet\configuration_mpnet.py

class MPNetConfig(PretrainedConfig):
    r"""
    This is the configuration class that stores the configuration of an [`MPNetModel`] or a [`TFMPNetModel`]. It is used to
    instantiate an MPNet model according to the specified arguments, which define the model architecture. Instantiating a
    configuration with the defaults yields a configuration similar to that of the MPNet
    [microsoft/mpnet-base](https://huggingface.co/microsoft/mpnet-base) architecture.
    """
    model_type = "mpnet"

    def __init__(
        self,
        vocab_size=30527,
        hidden_size=768,
        num_hidden_layers=12,
        num_attention_heads=12,
        intermediate_size=3072,
        hidden_act="gelu",
        hidden_dropout_prob=0.1,
        attention_probs_dropout_prob=0.1,
        max_position_embeddings=512,
        initializer_range=0.02,
        layer_norm_eps=1e-12,
        relative_attention_num_buckets=32,
        pad_token_id=1,
        bos_token_id=0,
        eos_token_id=2,
        **kwargs,
    ):
        """Initializes a new instance of the MPNetConfig class.

        Args:
            vocab_size (int, optional): The size of the vocabulary. Defaults to 30527.
            hidden_size (int, optional): The size of the hidden states. Defaults to 768.
            num_hidden_layers (int, optional): The number of hidden layers. Defaults to 12.
            num_attention_heads (int, optional): The number of attention heads. Defaults to 12.
            intermediate_size (int, optional): The size of the intermediate layer in the feedforward network. Defaults to 3072.
            hidden_act (str, optional): The activation function for the hidden layers. Defaults to 'gelu'.
            hidden_dropout_prob (float, optional): The dropout probability for the hidden layers. Defaults to 0.1.
            attention_probs_dropout_prob (float, optional): The dropout probability for the attention probabilities. Defaults to 0.1.
            max_position_embeddings (int, optional): The maximum number of positional embeddings. Defaults to 512.
            initializer_range (float, optional): The range for the random weight initialization. Defaults to 0.02.
            layer_norm_eps (float, optional): The epsilon value for layer normalization. Defaults to 1e-12.
            relative_attention_num_buckets (int, optional): The number of buckets for relative attention. Defaults to 32.
            pad_token_id (int, optional): The token ID for padding. Defaults to 1.
            bos_token_id (int, optional): The token ID for the beginning of sequence. Defaults to 0.
            eos_token_id (int, optional): The token ID for the end of sequence. Defaults to 2.

        Returns:
            None.

        Raises:
            None.
        """
        super().__init__(pad_token_id=pad_token_id, bos_token_id=bos_token_id, eos_token_id=eos_token_id, **kwargs)

        self.vocab_size = vocab_size
        self.hidden_size = hidden_size
        self.num_hidden_layers = num_hidden_layers
        self.num_attention_heads = num_attention_heads
        self.hidden_act = hidden_act
        self.intermediate_size = intermediate_size
        self.hidden_dropout_prob = hidden_dropout_prob
        self.attention_probs_dropout_prob = attention_probs_dropout_prob
        self.max_position_embeddings = max_position_embeddings
        self.initializer_range = initializer_range
        self.layer_norm_eps = layer_norm_eps
        self.relative_attention_num_buckets = relative_attention_num_buckets
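
The constructor above follows a common pattern: it passes the special token IDs and any extra keyword arguments to the base class, then stores the architecture fields on the instance. The sketch below illustrates that pattern with hypothetical stand-in classes (`BaseConfigSketch` and `TinyMPNetConfigSketch` are not part of mindnlp). It assumes the base class keeps unrecognized keyword arguments as plain attributes, which this page does not confirm for `PretrainedConfig`.

```python
class BaseConfigSketch:
    """Hypothetical stand-in for PretrainedConfig (not the real class)."""

    def __init__(self, pad_token_id=None, bos_token_id=None, eos_token_id=None, **kwargs):
        self.pad_token_id = pad_token_id
        self.bos_token_id = bos_token_id
        self.eos_token_id = eos_token_id
        # Assumption: leftover keyword arguments are kept as plain attributes.
        for key, value in kwargs.items():
            setattr(self, key, value)


class TinyMPNetConfigSketch(BaseConfigSketch):
    """Hypothetical, trimmed-down mirror of MPNetConfig with two fields."""

    model_type = "mpnet"

    def __init__(self, vocab_size=30527, hidden_size=768,
                 pad_token_id=1, bos_token_id=0, eos_token_id=2, **kwargs):
        # Special token IDs and extra kwargs go to the base class,
        # as in the MPNetConfig source shown above.
        super().__init__(pad_token_id=pad_token_id, bos_token_id=bos_token_id,
                         eos_token_id=eos_token_id, **kwargs)
        self.vocab_size = vocab_size
        self.hidden_size = hidden_size


# Override one default and pass one extra keyword argument (illustrative name).
config = TinyMPNetConfigSketch(hidden_size=384, num_labels=3)
print(config.hidden_size, config.vocab_size, config.pad_token_id, config.num_labels)
```

Arguments that are not overridden keep the defaults from the signature, so only the values that differ from the base architecture need to be passed.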

`mindnlp.transformers.models.mpnet.configuration_mpnet.MPNetConfig.__init__(vocab_size=30527, hidden_size=768, num_hidden_layers=12, num_attention_heads=12, intermediate_size=3072, hidden_act='gelu', hidden_dropout_prob=0.1, attention_probs_dropout_prob=0.1, max_position_embeddings=512, initializer_range=0.02, layer_norm_eps=1e-12, relative_attention_num_buckets=32, pad_token_id=1, bos_token_id=0, eos_token_id=2, **kwargs)` ¶

Initializes a new instance of the MPNetConfig class.

| PARAMETER | DESCRIPTION | TYPE | DEFAULT |
|---|---|---|---|
| `vocab_size` | The size of the vocabulary. | `int` | `30527` |
| `hidden_size` | The size of the hidden states. | `int` | `768` |
| `num_hidden_layers` | The number of hidden layers. | `int` | `12` |
| `num_attention_heads` | The number of attention heads. | `int` | `12` |
| `intermediate_size` | The size of the intermediate layer in the feedforward network. | `int` | `3072` |
| `hidden_act` | The activation function for the hidden layers. | `str` | `'gelu'` |
| `hidden_dropout_prob` | The dropout probability for the hidden layers. | `float` | `0.1` |
| `attention_probs_dropout_prob` | The dropout probability for the attention probabilities. | `float` | `0.1` |
| `max_position_embeddings` | The maximum number of positional embeddings. | `int` | `512` |
| `initializer_range` | The range for the random weight initialization. | `float` | `0.02` |
| `layer_norm_eps` | The epsilon value for layer normalization. | `float` | `1e-12` |
| `relative_attention_num_buckets` | The number of buckets for relative attention. | `int` | `32` |
| `pad_token_id` | The token ID for padding. | `int` | `1` |
| `bos_token_id` | The token ID for the beginning of sequence. | `int` | `0` |
| `eos_token_id` | The token ID for the end of sequence. | `int` | `2` |

RETURNS	DESCRIPTION
	None.
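
Several of these parameters imply derived sizes. The sketch below works through two of them for the default values, using hypothetical helper functions that are not part of mindnlp. It assumes the usual multi-head attention convention in which `hidden_size` is split evenly across `num_attention_heads`; this page does not state how the MPNet implementation enforces that.

```python
def attention_head_size(hidden_size: int = 768, num_attention_heads: int = 12) -> int:
    """Hypothetical helper: per-head width implied by two MPNetConfig fields."""
    # Assumption: the hidden size must divide evenly across the heads.
    if hidden_size % num_attention_heads != 0:
        raise ValueError(
            f"hidden_size ({hidden_size}) is not a multiple of "
            f"num_attention_heads ({num_attention_heads})"
        )
    return hidden_size // num_attention_heads


def word_embedding_entries(vocab_size: int = 30527, hidden_size: int = 768) -> int:
    """Hypothetical helper: entries in a vocab_size x hidden_size embedding table."""
    return vocab_size * hidden_size


print(attention_head_size())     # 64
print(word_embedding_entries())  # 23444736
```

A check like this can catch an inconsistent pair of values, for example `hidden_size=768` with `num_attention_heads=10`, before a model is built from the configuration.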


`mindnlp.transformers.models.mpnet.modeling_mpnet` ¶

MindSpore MPNet model.