This repository has been archived on 2025-09-14. You can view files and clone it, but cannot push or open issues or pull requests.
2019-12-31 15:19:14 +08:00

# Domain Classification
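This program classifies the domain names found in DNS logs of a specified format, using a URL library and a CDN library.
Build it with the makefile; the build requires an environment that supports C++11.
The compiled executable is `DomainDeal`. Invoke it as `DomainDeal file`, where `file` is the input file.
The input file must follow the specified format: the first line is ignored, and each remaining line contains at least 15 tab-separated fields, the 3rd of which is the domain name to classify. A sample input file is `test.txt`.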
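
The input format described above can be sketched as follows. This is only an illustration, not the code in this repository: the function names `extractDomain` and `readDomains` are hypothetical, and dropping lines with fewer than 15 fields is an assumption about how malformed lines might be handled.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the 3rd tab-separated field of a log line,
// or an empty string when the line has fewer than 15 fields (assumed policy).
std::string extractDomain(const std::string& line) {
    std::vector<std::string> fields;
    std::size_t start = 0;
    while (true) {
        std::size_t tab = line.find('\t', start);
        if (tab == std::string::npos) {
            fields.push_back(line.substr(start));  // last field on the line
            break;
        }
        fields.push_back(line.substr(start, tab - start));
        start = tab + 1;
    }
    if (fields.size() < 15) return "";
    return fields[2];  // 3rd field, zero-based index 2
}

// Hypothetical helper: skips the first line of the stream, then collects
// the domain from every remaining well-formed line, without deduplication.
std::vector<std::string> readDomains(std::istream& in) {
    std::vector<std::string> domains;
    std::string line;
    if (!std::getline(in, line)) return domains;  // first line is ignored
    while (std::getline(in, line)) {
        std::string domain = extractDomain(line);
        if (!domain.empty()) domains.push_back(domain);
    }
    return domains;
}
```

Passing a `std::ifstream` opened on a file such as `test.txt` to `readDomains` would yield the list of domains to classify, in file order.
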
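Output is written to the `data` directory, which contains multiple files: `statis.txt` holds the statistics, `other` holds the domains that could not be classified, `spcdn` holds the CDN library matches, and every other file holds the classification results for the URL library category named by its filename.
Apart from the statistics file, each line of every file is a domain name belonging to that category; the lines are not deduplicated.
The `lib` directory stores the two libraries: `UrlDomainList.dat` is the URL classification library and `CdnDomainList.dat` is the CDN CNAME matching library. Both are binary files.
The program matches against the URL library first and against the CDN library second.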
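
The matching priority can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's implementation: the function `classify` is hypothetical, the two libraries are modelled as in-memory containers with exact-match lookup, and the binary layout of the `.dat` files is not reproduced.

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical classifier: returns the name of the output file a domain
// would be written to. urlLib maps a domain to its URL category; cdnLib
// is a set of CDN domains. Both containers are assumptions for this sketch.
std::string classify(const std::string& domain,
                     const std::unordered_map<std::string, std::string>& urlLib,
                     const std::unordered_set<std::string>& cdnLib) {
    // The URL library is consulted first.
    auto it = urlLib.find(domain);
    if (it != urlLib.end()) return it->second;
    // The CDN library is consulted only when the URL library has no match.
    if (cdnLib.count(domain) > 0) return "spcdn";
    // Anything left is unclassified.
    return "other";
}
```

With this ordering, a domain present in both libraries is reported under its URL category rather than under `spcdn`.
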