The Embedding Layer in PyTorch

迪丽瓦拉

2025-05-30 18:05:55

In PyTorch, word vectors are implemented by the Embedding layer. Its constructor signature (as of PyTorch 2.0) is:

torch.nn.Embedding(num_embeddings, embedding_dim, padding_idx=None, max_norm=None, norm_type=2.0, scale_grad_by_freq=False, sparse=False, _weight=None)

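The layer creates a randomly initialized weight matrix that serves as a lookup table of word vectors, with size [num_embeddings, embedding_dim]. Here num_embeddings is the number of entries in the lookup table (typically the vocabulary size), and embedding_dim is the dimension of each vector in the table.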
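To make the lookup-table view concrete, here is a minimal sketch. It assumes only that PyTorch is installed; the variable names (indices, output, same) are illustrative. It checks that calling the layer returns the same values as indexing its weight matrix directly, and that the output shape is the index shape with embedding_dim appended.

```python
import torch
import torch.nn as nn

# A lookup table with 10 entries, each a vector of size 3
embedding = nn.Embedding(num_embeddings=10, embedding_dim=3)

# A batch of 2 samples with 4 indices each (int64 by default)
indices = torch.tensor([[1, 2, 4, 5], [4, 3, 2, 9]])
output = embedding(indices)

print(embedding.weight.shape)  # torch.Size([10, 3])
print(output.shape)            # torch.Size([2, 4, 3])

# The forward pass is a row lookup into the weight matrix
same = torch.equal(output, embedding.weight[indices])
print(same)
```

In other words, the layer performs no arithmetic on its input; it only selects rows, which is why the indices must be valid positions in the range [0, num_embeddings).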

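Parameters:

num_embeddings: int, the size of the lookup table
embedding_dim: int, the dimension of each vector in the table
padding_idx: int, the padding index; the vector at this index is initialized to zeros and is not updated during training
max_norm: float, the maximum norm; any embedding vector whose norm exceeds max_norm is renormalized so that its norm equals max_norm
norm_type: float, the p of the p-norm used for max_norm (p = norm_type); the default is 2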
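The effect of max_norm can be checked with a short sketch. The value max_norm=1.0 and the chosen indices are arbitrary assumptions for illustration. Note that with max_norm set, the forward pass renormalizes the looked-up rows of the weight matrix in place, so only rows that are actually queried are affected.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Rows whose 2-norm exceeds 1.0 are rescaled when they are looked up
embedding = nn.Embedding(10, 3, max_norm=1.0, norm_type=2.0)
before = embedding.weight.detach().clone()

indices = torch.tensor([1, 2, 4, 5])
vectors = embedding(indices)

# Norms of the returned vectors never exceed max_norm (up to rounding)
norms = vectors.norm(p=2, dim=-1)
print(norms)

# Row 0 was not looked up, so it keeps its original values
untouched = torch.equal(embedding.weight[0], before[0])
print(untouched)
```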

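Note that the index tensor passed to the layer must have an integer dtype. Long (int64) is the usual choice; PyTorch 2.0 also accepts Int (int32). Floating-point indices raise an error.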
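The dtype requirement can be verified with a small sketch. It assumes a recent PyTorch release (such as 2.0) in which int32 indices are accepted; the variable names are illustrative, and the exact exception type for float indices is caught broadly rather than assumed.

```python
import torch
import torch.nn as nn

embedding = nn.Embedding(10, 3)

long_idx = torch.tensor([1, 2, 4], dtype=torch.long)    # int64
int_idx = torch.tensor([1, 2, 4], dtype=torch.int32)    # int32
float_idx = torch.tensor([1.0, 2.0, 4.0])               # float32

# Integer indices of either width select the same rows
same = torch.equal(embedding(long_idx), embedding(int_idx))
print(same)

# Floating-point indices are rejected
try:
    embedding(float_idx)
    float_ok = True
except (RuntimeError, TypeError) as err:
    float_ok = False
    print("float indices rejected:", type(err).__name__)
```

If indices arrive as floats from earlier processing, converting them with .long() before the lookup is a common fix.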
Let's look at a few examples of using the Embedding layer.

Example 1:

# -*- coding: utf-8 -*-
import torch
import torch.nn as nn

# an Embedding module containing 10 tensors of size 3
embedding = nn.Embedding(10, 3)
print(embedding.weight)
# a batch of 2 samples of 4 indices each
input = torch.LongTensor([[1, 2, 4, 5], [4, 3, 2, 9]])
print(embedding(input))

Output:

Parameter containing:
tensor([[ 1.9260, -1.3492,  0.3753],
        [ 1.2182,  1.8350,  0.7975],
        [ 0.1568,  0.9562,  1.1164],
        [ 1.4660,  0.8763,  0.1681],
        [ 0.4175,  0.4029, -1.3495],
        [-0.5182,  0.1465,  0.0280],
        [ 0.7748,  0.1848, -0.4229],
        [ 0.3740, -0.2761,  1.5017],
        [-0.4583,  0.2934,  0.2217],
        [-0.1402, -0.5671, -1.7069]], requires_grad=True)
tensor([[[ 1.2182,  1.8350,  0.7975],
         [ 0.1568,  0.9562,  1.1164],
         [ 0.4175,  0.4029, -1.3495],
         [-0.5182,  0.1465,  0.0280]],

        [[ 0.4175,  0.4029, -1.3495],
         [ 1.4660,  0.8763,  0.1681],
         [ 0.1568,  0.9562,  1.1164],
         [-0.1402, -0.5671, -1.7069]]], grad_fn=<EmbeddingBackward0>)

Example 2:

# -*- coding: utf-8 -*-
import torch
import torch.nn as nn

# Embedding with padding_idx
embedding = nn.Embedding(5, 3, padding_idx=0)
print(embedding.weight)
input = torch.LongTensor([[0, 2, 0, 4]])
print(embedding(input))

Output:

Parameter containing:
tensor([[ 0.0000,  0.0000,  0.0000],
        [-1.4159, -1.2288,  0.2698],
        [-2.2287, -1.5313,  0.3296],
        [-1.0393, -0.3102,  0.2819],
        [-0.2162, -0.2060,  0.3289]], requires_grad=True)
tensor([[[ 0.0000,  0.0000,  0.0000],
         [-2.2287, -1.5313,  0.3296],
         [ 0.0000,  0.0000,  0.0000],
         [-0.2162, -0.2060,  0.3289]]], grad_fn=<EmbeddingBackward0>)

As the output shows, the vector at padding_idx in the Embedding layer is the zero vector.
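The padding vector also stays at zero during training, because it receives no gradient. The following is a minimal sketch of that behavior; the sum loss and the SGD learning rate are arbitrary assumptions chosen only to produce a gradient and an update.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
embedding = nn.Embedding(5, 3, padding_idx=0)
optimizer = torch.optim.SGD(embedding.parameters(), lr=0.1)

indices = torch.tensor([[0, 2, 0, 4]])

# A toy loss: the sum of all looked-up values
loss = embedding(indices).sum()
loss.backward()

grad = embedding.weight.grad
print(grad)  # row 0 (padding) is all zeros; rows 2 and 4 are all ones

# After an optimizer step the padding row is still the zero vector
optimizer.step()
print(embedding.weight[0])
```

This is why padding positions in a batch of variable-length sequences do not influence the learned embeddings.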

References

Embedding documentation: https://pytorch.org/docs/stable/generated/torch.nn.Embedding.html
