feat(translator_commons): add dictionary_exclude to exclude words #1008
Conversation
Previously, users had to delete words from the dictionary themselves.
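I have a question: do excluded words still take part in sentence composition?
Specifying the list of excluded words in the configuration should be enough.
The semantics are still "equivalent to deleting the words from dict.yaml":
Suggestion: specify the blacklist in an external .txt or .dict.yaml file rather than in schema.yaml?
Assuming the blacklist is fairly small, a .custom.yaml file effectively serves as the "external file":
As for txt and dict.yaml, I currently don't know how to implement that :(
We can go with the current approach, which supports small lists, for now. Later work can make this option compatible with external files, so that it accepts a file name (a string value) instead of a list.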
return false;
}
while (filter_ && !filter_(Peek())) {
do {
The logic here hasn't changed, has it?
It was probably written this way because some books recommend against using do-while.
The loop body must call reset; FindNextEntry does no work once an entry has already been read.
Correction: the issue is that Peek keeps returning the same entry, so in the end every candidate gets removed.
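The failure mode described in this comment can be sketched in isolation. The class below is a hypothetical, heavily simplified stand-in and not the real `DictEntryIterator`: `Entry`, `EntryIterator`, and the member layout are assumptions made for illustration. It only shows why a cached `Peek()` result has to be cleared before advancing inside the filter loop.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical entry type; the real one carries much more data.
struct Entry {
  std::string text;
};

// Hypothetical iterator that caches the current entry, as described above.
class EntryIterator {
 public:
  using Filter = std::function<bool(const Entry&)>;

  explicit EntryIterator(std::vector<std::string> words)
      : words_(std::move(words)) {}

  bool exhausted() const { return pos_ >= words_.size(); }

  // Builds the entry on first use, then keeps returning the cached copy.
  // Must only be called while !exhausted().
  const Entry& Peek() {
    if (!entry_) {
      entry_ = std::make_shared<Entry>(Entry{words_[pos_]});
    }
    return *entry_;
  }

  // Advances the position only; the cache is left untouched.
  void FindNextEntry() { ++pos_; }

  void AddFilter(Filter filter) {
    filter_ = std::move(filter);
    // The new filter may reject the current entry or all remaining ones.
    while (!exhausted() && !filter_(Peek())) {
      entry_.reset();  // drop the stale cache so Peek() sees the next entry
      FindNextEntry();
    }
  }

 private:
  std::vector<std::string> words_;
  std::size_t pos_ = 0;
  std::shared_ptr<Entry> entry_;
  Filter filter_;
};
```

In this model, removing the `entry_.reset()` call makes `Peek()` hand the same rejected entry to the filter on every pass, so the loop runs until the iterator is exhausted and no candidate survives.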
@@ -137,6 +137,7 @@ void DictEntryIterator::AddFilter(DictEntryFilter filter) {
   // the introduced filter could invalidate the current or even all the
   // remaining entries
   while (!exhausted() && !filter_(Peek())) {
     entry_.reset();
What is this for?
Same as above.
for (auto& v : *collector) {
  v.second.Sort();
  if (blacklist && !blacklist->empty()) {
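I haven't analyzed the code yet, so let me just ask first.
If we sort and then filter, is the order still correct after filtering?
Also, is the case where the iterator is empty after filtering handled properly?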
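Filtering shouldn't affect the order.
I tested the empty-iterator case and it should be fine. Logically this is the same as the existing filter by charset, so if there is a problem now, it was already there before. (The reset above fixes exactly such a pre-existing bug.)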
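The claim that filtering after sorting keeps the order can be checked with a small standalone sketch. It does not use the PR's code: `Candidate` and `SortThenFilter` are hypothetical names, and a plain `std::vector` stands in for the collector. It relies on the standard guarantee that `std::remove_if` preserves the relative order of the elements it keeps.

```cpp
#include <algorithm>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical candidate type used only for this illustration.
struct Candidate {
  std::string text;
  double weight;
};

// Sorts by descending weight, then drops every blacklisted word.
// Returns an empty vector when all candidates are excluded.
std::vector<Candidate> SortThenFilter(
    std::vector<Candidate> entries,
    const std::unordered_set<std::string>& blacklist) {
  std::sort(entries.begin(), entries.end(),
            [](const Candidate& a, const Candidate& b) {
              return a.weight > b.weight;
            });
  if (!blacklist.empty()) {
    // remove_if keeps the survivors in their existing (sorted) order.
    entries.erase(
        std::remove_if(entries.begin(), entries.end(),
                       [&blacklist](const Candidate& c) {
                         return blacklist.count(c.text) > 0;
                       }),
        entries.end());
  }
  return entries;
}
```

Removing elements from a sorted sequence cannot break its ordering, so no second sort is needed; the only special case is a result with no elements, which the caller has to tolerate.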
If a frontend program wanted to add entries, would that be cumbersome? Or is the frontend only expected to add them to the schema file by hand?
ping
LGTM
closes #883
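This PR adds
translator/dictionary_exclude
Its semantics are strictly equivalent to deleting the corresponding entries from .dict.yaml, but it does not affect sentence composition or the user dictionary, so the user can still type the word by selecting its characters one at a time. In the example above, luna_pinyin cannot output 「零零」 while the user dictionary is empty, but the user can form the word by selecting 「零」 twice, after which 「零零」 can be output directly again.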
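The behavior described above can be modeled with a toy lookup. This is only a sketch of the stated semantics under simplifying assumptions and not how librime implements it: the `Lookup` struct, its members, and `Commit` are hypothetical, and sentence composition is reduced to the user committing a word.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Toy model: a static dictionary, an exclusion set, and a user dictionary.
struct Lookup {
  std::vector<std::string> static_dict;     // stands in for .dict.yaml entries
  std::unordered_set<std::string> excluded; // stands in for dictionary_exclude
  std::vector<std::string> user_dict;       // learned words, never filtered

  // User-dictionary words come first and are not subject to exclusion;
  // static-dictionary words are skipped when they are excluded.
  std::vector<std::string> Candidates() const {
    std::vector<std::string> out(user_dict.begin(), user_dict.end());
    for (const std::string& word : static_dict) {
      if (excluded.count(word) == 0) {
        out.push_back(word);
      }
    }
    return out;
  }

  // Models the user composing a word character by character and committing it.
  void Commit(const std::string& word) { user_dict.push_back(word); }
};
```

With `static_dict` holding 「零零」 and 「零」 and `excluded` holding 「零零」, `Candidates()` yields only 「零」 until `Commit("零零")` is called, after which 「零零」 is offered again from the user dictionary.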
Unresolved issues: