-
Notifications
You must be signed in to change notification settings - Fork 5.8k
【Hackathon 6th No.17】为 Paddle 新增 sparse.mask_as API -part #64320
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
你的PR提交成功,感谢你对开源项目的贡献! |
冲突了 |
Update 20240524
@luotao1 已解决冲突 ~ @zhwesky2010 请评审 ~ |
python/paddle/sparse/binary.py
Outdated
>>> csr = paddle.sparse.sparse_csr_tensor(crows, cols, values, dense_shape) | ||
>>> np.random.seed(2024) | ||
>>> d = np.random.rand(*dense_shape) | ||
>>> x = paddle.to_tensor(d, dtype=csr.dtype) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
不可以直接用paddle.rand创建吗
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK ~
|
||
def check(self, shape, dtype, place, check_grad=True): | ||
paddle.disable_static() | ||
data_np, mask_np = generate_data(shape, dtype) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个命名可以改一下,一个是dense_np,一个是sparse_np
mask一般是0/1矩阵,这里是一个稀疏矩阵
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个命名可以改一下,一个是dense_np,一个是sparse_np
mask一般是0/1矩阵,这里是一个稀疏矩阵
这里需要澄清一下:
- data_np 和 mask_np 此时都是 dense
- 为了比对后面 numpy 的值,mask 此时还不是稀疏矩阵,后面会转为 paddle 的稀疏矩阵
- 这里的 mask 可以是任何值的矩阵
In [3]: def generate_data(shape, dtype):
...: """
...: Generate `data` and `mask` with the same shape and dtype.
...: """
...: _mask = np.random.randint(0, 2, shape)
...: if np.sum(_mask) == 0:
...: _mask.flat[0] = 1
...: mask = (np.random.randint(-100, 100, shape) * _mask).astype(dtype)
...: data = np.random.randint(-100, 100, shape).astype(dtype)
...: return data, mask
...:
In [4]: data, mask = generate_data((3, 4), 'float32')
In [5]: data
Out[5]:
array([[ 2., -57., 86., 2.],
[-77., -59., -55., 10.],
[-38., 70., 6., -63.]], dtype=float32)
In [6]: mask
Out[6]:
array([[ 0., 0., 0., 91.],
[ 0., 99., 94., -41.],
[ 0., 57., 99., 52.]], dtype=float32)
后面 mask_as 的时候,会有具体转换的操作
if self.format == 'coo':
mask = paddle.to_tensor(
mask_np, dtype=dtype, place=place
).to_sparse_coo(len(shape))
else:
mask = paddle.to_tensor(
mask_np, dtype=dtype, place=place
).to_sparse_csr()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里的命名我一起改一下吧 ~
self.assertEqual(sparse_grad.dtype, data.dtype) | ||
|
||
# make a dense tensor to compare the grad from sparse_out | ||
dense_tensor = paddle.to_tensor(data_np) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这些命名风格容易降低可读性
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
确实 ... ... 这里各种矩阵的名称都有,我看看改一下 ~
const DenseTensor& x, | ||
const SparseCooTensor& mask, | ||
SparseCooTensor* out) { | ||
MaskCooKernel<T, Context>(dev_ctx, x, mask, out); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
看起来是一模一样的,直接将当前的MaskCooKernel改名吧:MaskCooKernel->MaskAsCooKernel,这样也减少以后的调用与理解成本
const DenseTensor& x, | ||
const SparseCooTensor& mask, | ||
SparseCooTensor* out) { | ||
MaskCooKernel<T, Context>(dev_ctx, x, mask, out); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
建议直接改名
@@ -329,3 +582,39 @@ PD_REGISTER_KERNEL(mask_helper_coo, | |||
phi::dtype::complex<double>) { | |||
kernel->InputAt(0).SetDataLayout(phi::DataLayout::SPARSE_COO); | |||
} | |||
|
|||
PD_REGISTER_KERNEL(mask_as_coo, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
目前的mask_coo
kernel注册是否有用在某些地方,如果没有的话,建议直接全部改名为mask_as_coo
,比免维护多套完全一样的内容
) | ||
self.assertEqual(dense_data_grad.dtype, dense_data_pd.dtype) | ||
|
||
# make a dense tensor to compare the grad from sparse_out_pd |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
不用这么麻烦,直接把expect_grad模拟写出来就行吧,就是np.ones再乘以mask的相应位置:np.ones(shape)* (dense_mask_np != 0)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
是的,这样做也可以 ~
这相当于已知先验知识,即梯度为 1
~
我这里相当于未知此先验,从 paddle 自身梯度回传的角度来测试 ~
两者都可以 ~
要改成 np.ones(shape)* (dense_mask_np != 0)
这种形式吗?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
是的,这样做也可以 ~
这相当于已知先验知识,即梯度为
1
~我这里相当于未知此先验,从 paddle 自身梯度回传的角度来测试 ~
两者都可以 ~
要改成
np.ones(shape)* (dense_mask_np != 0)
这种形式吗?
改一下吧,写法越简单越好
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@megemini 补充一下API中文文档与API映射文档,这个对应到Pytorch是 |
1792 - test_sparse_mask_as_op (Timeout) |
OK ~ coverage 好像时间都比较长 ~~~ |
另外,这个接口是否需要 patch 到 Tensor 上?类似 torch 的调用方式? |
由于其他稀疏系列的API都没有patch,这个也就不用patch吧 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
PR Category
User Experience
PR Types
New features
Description
NO.17 为 Paddle 新增 sparse.mask_as API
关联 RFC PaddlePaddle/community#901
### 20240515- 此 PR 暂时为 RFC 提供 review 参考,未实现 docstring 与单测