BERT Model Source Code Analysis (Part 3)


(self.embedding_output, self.embedding_table) = embedding_lookup(
    input_ids=input_ids,
    vocab_size=config.vocab_size,
    embedding_size=config.hidden_size,
    initializer_range=config.initializer_range,
    word_embedding_name="word_embeddings",
    use_one_hot_embeddings=use_one_hot_embeddings)
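To make the lookup concrete, here is a minimal sketch of what an embedding lookup does, assuming TensorFlow 1.x as used by modeling.py. The function simple_embedding_lookup and its internals are illustrative, not the exact BERT implementation: it either gathers rows from a [vocab_size, embedding_size] table by token id or, when one-hot embeddings are requested (useful on TPUs), multiplies a one-hot matrix by the table.

import tensorflow as tf  # TensorFlow 1.x

def simple_embedding_lookup(input_ids, vocab_size, embedding_size,
                            use_one_hot_embeddings=False):
    # input_ids: int32 Tensor of shape [batch_size, seq_length].
    embedding_table = tf.get_variable(
        "word_embeddings", [vocab_size, embedding_size],
        initializer=tf.truncated_normal_initializer(stddev=0.02))
    flat_ids = tf.reshape(input_ids, [-1])                      # [batch*seq]
    if use_one_hot_embeddings:
        # One-hot matmul path: efficient on TPUs for small vocabularies.
        one_hot_ids = tf.one_hot(flat_ids, depth=vocab_size)    # [batch*seq, vocab]
        flat_output = tf.matmul(one_hot_ids, embedding_table)   # [batch*seq, emb]
    else:
        # Default path: gather table rows by id.
        flat_output = tf.gather(embedding_table, flat_ids)      # [batch*seq, emb]
    output_shape = tf.concat([tf.shape(input_ids), [embedding_size]], axis=0)
    output = tf.reshape(flat_output, output_shape)              # [batch, seq, emb]
    return output, embedding_table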
The next step adds positional embeddings and token type embeddings, then applies layer normalization and dropout. embedding_postprocessor performs this post-processing on the word embedding tensor.
# Add positional embeddings and token type embeddings, then layer
# normalize and perform dropout.
self.embedding_output = embedding_postprocessor(
    input_tensor=self.embedding_output,
    use_token_type=True,
    token_type_ids=token_type_ids,
    token_type_vocab_size=config.type_vocab_size,
    token_type_embedding_name="token_type_embeddings",
    use_position_embeddings=True,
    position_embedding_name="position_embeddings",
    initializer_range=config.initializer_range,
    max_position_embeddings=config.max_position_embeddings,
    dropout_prob=config.hidden_dropout_prob)
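A simplified sketch of this post-processing, again in TensorFlow 1.x; simple_postprocess and its internals are illustrative (the real embedding_postprocessor adds shape checks and optional behavior). Segment (token type) embeddings and learned position embeddings are added to the word embeddings, followed by layer normalization and dropout.

import tensorflow as tf  # TensorFlow 1.x

def simple_postprocess(embeddings, token_type_ids, type_vocab_size,
                       max_position_embeddings, dropout_prob):
    # embeddings: float Tensor [batch_size, seq_length, width] from the word lookup.
    # Assumes the sequence length and width are statically known.
    seq_length = embeddings.shape[1].value
    width = embeddings.shape[2].value

    # Segment (token type) embeddings, added element-wise.
    type_table = tf.get_variable(
        "token_type_embeddings", [type_vocab_size, width],
        initializer=tf.truncated_normal_initializer(stddev=0.02))
    output = embeddings + tf.gather(type_table, token_type_ids)        # [batch, seq, width]

    # Learned position embeddings, broadcast across the batch dimension.
    position_table = tf.get_variable(
        "position_embeddings", [max_position_embeddings, width],
        initializer=tf.truncated_normal_initializer(stddev=0.02))
    position_emb = tf.slice(position_table, [0, 0], [seq_length, -1])  # [seq, width]
    output += tf.expand_dims(position_emb, axis=0)                     # [1, seq, width]

    # Layer normalization, then dropout.
    output = tf.contrib.layers.layer_norm(output, begin_norm_axis=-1,
                                          begin_params_axis=-1)
    return tf.nn.dropout(output, keep_prob=1.0 - dropout_prob)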
with tf.variable_scope("encoder"):
The 2D padding mask is converted into a 3D attention mask that is used when computing attention scores:
# This converts a 2D mask of shape [batch_size, seq_length] to a 3D
# mask of shape [batch_size, seq_length, seq_length] which is used
# for the attention scores.
attention_mask = create_attention_mask_from_input_mask(
    input_ids, input_mask)
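The idea behind the conversion can be sketched in a few lines (an approximation of create_attention_mask_from_input_mask, not the function itself): the 2D mask says which key positions are real tokens, and broadcasting it against a column of ones gives every query position the same row of allowed key positions.

import tensorflow as tf  # TensorFlow 1.x

def simple_attention_mask(input_ids, input_mask):
    # input_ids, input_mask: int32 Tensors of shape [batch_size, seq_length].
    batch_size = tf.shape(input_ids)[0]
    from_seq_length = tf.shape(input_ids)[1]
    to_mask = tf.cast(tf.expand_dims(input_mask, axis=1), tf.float32)  # [batch, 1, to_seq]
    broadcast_ones = tf.ones([batch_size, from_seq_length, 1], tf.float32)
    return broadcast_ones * to_mask                                    # [batch, from_seq, to_seq]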
# Run the stacked transformer.
# `sequence_output` shape = [batch_size, seq_length, hidden_size].
The call to transformer_model runs the stacked Transformer encoder:
self.all_encoder_layers = transformer_model(
    input_tensor=self.embedding_output,
    attention_mask=attention_mask,
    hidden_size=config.hidden_size,
    num_hidden_layers=config.num_hidden_layers,
    num_attention_heads=config.num_attention_heads,
    intermediate_size=config.intermediate_size,
    intermediate_act_fn=get_activation(config.hidden_act),
    hidden_dropout_prob=config.hidden_dropout_prob,
    attention_probs_dropout_prob=config.attention_probs_dropout_prob,
    initializer_range=config.initializer_range,
    do_return_all_layers=True)
The index [-1] selects the last item of the list, i.e., the output of the final encoder layer:
self.sequence_output = self.all_encoder_layers[-1]
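With do_return_all_layers=True, all_encoder_layers is a Python list containing one [batch_size, seq_length, hidden_size] tensor per layer, and [-1] picks the top layer. A small illustration with random stand-in tensors (the shapes and the last-four concatenation are only an example of how downstream code might combine layers, e.g. for feature extraction):

import tensorflow as tf  # TensorFlow 1.x

num_layers, batch_size, seq_length, hidden_size = 12, 2, 8, 16
all_layers = [tf.random_normal([batch_size, seq_length, hidden_size])
              for _ in range(num_layers)]

last_layer = all_layers[-1]                       # [batch, seq, hidden]
last_four = tf.concat(all_layers[-4:], axis=-1)   # [batch, seq, 4*hidden]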
# The "pooler" converts the encoded sequence tensor of shape
# [batch_size, seq_length, hidden_size] to a tensor of shape
# [batch_size, hidden_size]. This is necessary for segment-level
# (or segment-pair-level) classification tasks where we need a fixed
# dimensional representation of the segment.
The pooler reshapes the encoded tensor from 3D to 2D. For sentence-level classification tasks this conversion is necessary, because we need a fixed-dimensional representation of the segment.
with tf.variable_scope("pooler"):
# We "pool" the model by simply taking the hidden state corresponding
# to the first token. We assume that this has been pre-trained.
We "pool" the model by taking the hidden state that corresponds to the first token, assuming this behavior has been learned during pre-training. tf.squeeze removes dimensions of size 1 from a tensor's shape.
first_token_tensor = tf.squeeze(self.sequence_output[:, 0:1, :], axis=1)
self.pooled_output = tf.layers.dense(
    first_token_tensor,   # input: the first-token hidden state
    config.hidden_size,   # number of output units: the hidden size
    activation=tf.tanh,   # activation function: hyperbolic tangent (tanh)
    kernel_initializer=create_initializer(config.initializer_range))
# End of the constructor.
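As a stand-alone illustration of the pooling step (dummy shapes and random values, TensorFlow 1.x; not part of modeling.py): slice out the first token, squeeze away the length-1 axis, and project it through a dense layer with tanh activation.

import tensorflow as tf  # TensorFlow 1.x

batch_size, seq_length, hidden_size = 2, 8, 16
sequence_output = tf.random_normal([batch_size, seq_length, hidden_size])

first_token = tf.squeeze(sequence_output[:, 0:1, :], axis=1)            # [batch, hidden]
pooled = tf.layers.dense(first_token, hidden_size, activation=tf.tanh)  # [batch, hidden]
print(pooled.shape)   # (2, 16)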
def get_pooled_output(self):  # returns the pooled output
    return self.pooled_output

def get_sequence_output(self):  # returns the sequence output
    """Gets the final hidden layer of the encoder.

    Returns:
      float Tensor of shape [batch_size, seq_length, hidden_size] corresponding
      to the final hidden layer of the transformer encoder.
    """
    return self.sequence_output

def get_all_encoder_layers(self):  # returns the outputs of all encoder layers
    return self.all_encoder_layers
def get_embedding_output(self):  # returns the output of the embedding layer
    """Gets output of the embedding lookup (i.e., input to the transformer).

    Returns:
      float Tensor of shape [batch_size, seq_length, hidden_size] corresponding
      to the output of the embedding layer, after summing the word
      embeddings with the positional embeddings and the token type embeddings,
      then performing layer normalization. This is the input to the transformer.
    """
    return self.embedding_output
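Putting the getters to use, here is a usage sketch adapted from the example in the BertModel class docstring; the BertConfig values are small illustrative numbers (num_attention_heads changed to 8 so it divides hidden_size evenly).

import tensorflow as tf  # TensorFlow 1.x
import modeling          # modeling.py from google-research/bert

# Inputs that have already been converted into WordPiece token ids.
input_ids = tf.constant([[31, 51, 99], [15, 5, 0]])
input_mask = tf.constant([[1, 1, 1], [1, 1, 0]])
token_type_ids = tf.constant([[0, 0, 1], [0, 2, 0]])

config = modeling.BertConfig(vocab_size=32000, hidden_size=512,
                             num_hidden_layers=8, num_attention_heads=8,
                             intermediate_size=1024)

model = modeling.BertModel(config=config, is_training=True,
                           input_ids=input_ids, input_mask=input_mask,
                           token_type_ids=token_type_ids)

pooled_output = model.get_pooled_output()        # [batch_size, hidden_size]
sequence_output = model.get_sequence_output()    # [batch_size, seq_length, hidden_size]
embedding_output = model.get_embedding_output()  # [batch_size, seq_length, hidden_size]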
