first draft

liuwentan
2023-07-05 21:47:58 +08:00
parent 69ea78debb
commit 2d6ffdd166
6 changed files with 772 additions and 168 deletions

docs/imgs/thread_mode.png: new binary file, 44 KiB (not shown)

@@ -1,21 +1,34 @@
# Concepts
**Item**: A filter on a network attribute; the smallest unit of a rule.
- Eg1: specify that the UserAgent field in the HTTP protocol contains the substrings "Chrome" and "11.8.1".
   HTTP UserAgent: Chrome & 11.8.1
- Eg2: specify that the domain name in the HTTP protocol ends with ".emodao.com".
   HTTP HOST: *.emodao.com
- Eg3: specify that the client IP address belongs to the class C segment 202.118.101.*.
   Source IP: 202.118.101.0/24

Items come in several types, such as string, IP and numeric range, each stored in a corresponding table; more details can be found in [Item table](./table_schema.md#item-table).
**Group(Object)**: A collection of Items. The constraints on groups are as follows:
- An Item belongs to only one group, but one group can have multiple items. Multiple items under the same group are in a logical 'OR' relationship, e.g. (g1 = item1 | item2).
- A Group can be included or excluded by other groups. Multiple included groups under the same superior group are in a logical 'OR' relationship, e.g. (g3 = incl-g1 | incl-g2). An included group and an excluded group under the same superior group are in a logical 'AND' relationship, e.g. (g4 = incl-g1 & excl-g2).
- Groups support multi-level nesting.
- A Group can be referenced by multiple compiles.

The relationship between groups is stored in the [group2group table](./table_schema.md#4-group2group-table), while the relationship between group and compile is stored in the [group2compile table](./table_schema.md#5-group2compile-table).
**Compile(Policy)**: A conjunctive normal form (CNF) consisting of multiple groups and virtual tables.
- A Compile can contain up to 8 clauses; the clauses in the same compile are combined through logical 'AND' and logical 'NOT' relationships.
- A Clause consists of several Literals, and the relationship between them is a logical 'OR'. A Literal consists of a virtual table and a group. During configuration loading, a unique Clause ID is generated from the combination of virtual table ID and group ID in the same clause.
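The compile, clause and literal relationship described above can be sketched as a small CNF evaluator. This is only an illustration under stated assumptions, not Maat's implementation: the struct layouts, the `MAX_LITERALS` limit, the placement of `not_flag` on the clause, and the idea of passing the hit (virtual table, group) pairs as an array are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_CLAUSES 8   /* a compile can contain up to 8 clauses */
#define MAX_LITERALS 4  /* illustrative limit, not a Maat constant */

/* A literal pairs a virtual table with a group (hypothetical layout). */
struct literal {
    int vtable_id;
    long long group_id;
};

/* A clause is the logical OR of its literals; not_flag negates the clause. */
struct clause {
    struct literal literals[MAX_LITERALS];
    size_t n_literal;
    bool not_flag;
};

/* A compile is the logical AND of its clauses. */
struct compile {
    struct clause clauses[MAX_CLAUSES];
    size_t n_clause;
};

/* True if the literal appears among the hit (virtual table, group) pairs. */
static bool literal_hit(const struct literal *lit,
                        const struct literal *hits, size_t n_hit)
{
    for (size_t i = 0; i < n_hit; i++) {
        if (hits[i].vtable_id == lit->vtable_id &&
            hits[i].group_id == lit->group_id) {
            return true;
        }
    }
    return false;
}

/* True when every clause of the compile is satisfied by the hit pairs. */
static bool compile_match(const struct compile *c,
                          const struct literal *hits, size_t n_hit)
{
    for (size_t i = 0; i < c->n_clause; i++) {
        const struct clause *cl = &c->clauses[i];
        bool any = false;
        for (size_t j = 0; j < cl->n_literal && !any; j++) {
            any = literal_hit(&cl->literals[j], hits, n_hit);
        }
        if (any == cl->not_flag) {
            return false; /* a plain clause missed, or a NOT clause was hit */
        }
    }
    return true;
}
```

A compile with one plain clause and one NOT clause matches only when the first clause is hit and the second is not.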


@@ -1,8 +1,142 @@
# Table Data
Input must be encoded as UTF-8 without BOM; for example, in MySQL use utf8mb4 rather than utf8.
Maat supports three configuration loading modes.
- [Redis mode](#1-redis-mode)
- [Iris mode](#2-iris-mode)
- [Json mode](#3-json-mode)
## 1.<a name='Redis mode'></a> Redis mode
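Maat can distribute configuration through Redis's master-replica synchronization mechanism. This section describes the storage structure MAAT requires when loading configuration from Redis. As with a database, the Redis storage structure does not need to reflect the logical hierarchy of compiles, groups and regions; the configuration update thread reconstructs the relationships between these levels from the row-and-column configuration.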
![Sync Rule with Redis](./imgs/data-sync-with-redis.png)
### 1.1 Transactional Write
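Table 26: Data structures defined by MAAT in Redis
| Redis KEY | Name | Structure | Purpose |
| ------------------------------------------------------ | ---------------- | -------------- | ------------------------- |
| MAAT_VERSION | primary version | INTEGER | Identifies the version of the configuration in Redis. When the version in Redis is greater than the configuration version in MAAT, MAAT reads MAAT_UPDATE_STATUS. |
| MAAT_PRE_VERSION | preliminary version | INTEGER | |
| MAAT_TRANSACTION_xx | transaction configuration status | LIST | Temporarily stores the configuration status within a transaction; xx is MAAT_PRE_VERSION. When the transaction ends, its entries are moved into MAAT_UPDATE_STATUS and the list itself is deleted. |
| MAAT_UPDATE_STATUS | configuration status | sorted set; member is the configuration rule, score is the version number (see 11.3) | MAAT reads it with the ZRANGEBYSCORE command. |
| MAAT_RULE_TIMER | primary configuration timeout information | sorted set; member is the configuration rule, score is the timeout time (see 11.4) | The MAAT configuration update thread periodically checks for timeouts and sets the timeout status. |
| MAAT_VERSION_TIMER | version creation time | sorted set | Stores the creation time of each version; score is the version creation time, member is the version. Used to keep MAAT_UPDATE_STATUS small. |
| MAAT_LABEL_INDEX | label index | sorted set; element is the configuration table name and compile ID, score is label_id | |
| EFFECTIVE_RULE:TableName,ID OBSOLETE_RULE:TableName,ID | primary configuration | string | The configuration in effect; its structure is the same as the row structure in 10.3. MAAT loads it entry by entry. |
| SEQUENCE_REGION | region ID sequence | INTEGER | Used by producers to generate unique region_id values |
| SEQUENCE_GROUP | group ID sequence | INTEGER | Used by producers to generate unique group_id values |
| EXPIRE_OP_LOCK | distributed lock | string "locked" | Ensures that at most one writer performs eviction. |

The Maat command API can write configuration directly into Redis: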
```c
struct maat_cmd_line {
    const char *table_name;
    const char *table_line;
    long long rule_id;  // for MAAT_OP_DEL, only rule_id and table_name are necessary.
    int expire_after;   // expires after this many seconds; 0 means never time out.
};
int maat_cmd_set_line(struct maat *maat_inst, const struct maat_cmd_line *line_rule);

/* Example: maat_inst is an already created maat instance;
   the table name and timeout below are illustrative values. */
const char *table_name = "HTTP_URL";
int expire_after = 0;  // never time out
char table_line[1024] = {0};
long long item_id = 100;
long long group_id = 200;
const char *keywords = "Hello&Maat";
int expr_type = 1;     // EXPR_TYPE_AND
int match_method = 0;  // MATCH_METHOD_SUB
int is_hexbin = 0;
int op = 1;            // add
snprintf(table_line, sizeof(table_line), "%lld\t%lld\t%s\t%d\t%d\t%d\t%d",
         item_id, group_id, keywords, expr_type, match_method, is_hexbin, op);
struct maat_cmd_line line_rule;
line_rule.rule_id = item_id;
line_rule.table_line = table_line;
line_rule.table_name = table_name;
line_rule.expire_after = expire_after;
int ret = maat_cmd_set_line(maat_inst, &line_rule);
```
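### 1.2 Primary Version, Preliminary Version and Lua Script
When a producer writes configuration, it first increments the preliminary version by 1 and uses it as the score of the configuration status it writes; once the write completes, it increments the primary version by 1. Transactions based on WATCH MAAT_VERSION are abandoned. This approach greatly improves write performance and guarantees that writes succeed except in the case of ID conflicts.
With multiple producers, the configuration status may be inconsistent with the primary version. Let the primary version be v and the version declared in the configuration status by some update be u. During an incremental update the consumer faces the following cases:
- If v=u, the versions are consistent and the configuration loads normally.
- If v>u: this case does not occur, because the primary version is incremented only after the configuration status has been modified. In other words, every write increments the preliminary version first and the primary version afterwards, so the primary version is always less than or equal to the largest version in the configuration status.
  - Error: with three producers this reasoning breaks down, as shown in the table below.
- If v<u: of two producers, the one that started writing first did not finish first. This round updates only to version v; the update to u is left for the next poll.

During a full update the consumer ignores the configuration status and reads all effective configuration directly. Because the configuration write and the primary version increment execute in the same transaction, the full configuration read is guaranteed to match the primary version.
With multiple producers, configuration update status messages may be lost: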
| **Time** | **Producer1** | **Producer2** | **Producer3** | **Consumer** |
| -------- | ------------------------ | ------------------------ | ------------------------ | ----------------------------------------------- |
| **0** | Prepare update: mv=3924, tv=3925 | | | |
| **1** | | Prepare update: mv=3924, tv=3926 | | |
| **2** | | | Prepare update: mv=3924, tv=3927 | |
| **3** | | | Update done: mv=3925, tv=3927 | Gets version 3925; ZRANGEBYSCORE finds no status for 3925 |
| **4** | | Update done: mv=3926, tv=3926 | | Maat version rises to 3926; error: noncontigous |
| **5** | Update done: mv=3927, tv=3925 | | | 3925 is skipped |
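At the end of the transaction, a Lua script compares the transaction version (transaction_version, tv) with the primary version (maat_version, mv):
- tv==mv: no correction is needed.
- tv>mv: the increment from this update will, on the next …
- tv<mv: how can the rules whose configuration status was written by this transaction be identified? Only then can their score be changed to mv.

To handle the case where transaction_version < maat_version at the end of a transaction, the configuration update status is stored in a Redis list MAAT_TRANSACTION_xx, where xx is taken from MAAT_PRE_VERSION. When the transaction ends, a Lua script synchronizes it into MAAT_UPDATE_STATUS and deletes MAAT_TRANSACTION_xx.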
![Add and Delete Operation](./imgs/add-del-rule-with-redis.png)
#### 1.3 MAAT_UPDATE_STATUS
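This structure uses a sorted set to store the change status of the primary configuration. The score is the version number and the member is the configuration status; bytes 0 to 2 of the member describe the update instruction:
1. ADD: a configuration is added; the structure is ADD,TableName,ID
2. DEL: a configuration is deleted; the structure is DEL,TableName,ID

After MAAT detects a change in MAAT_VERSION, it reads the updated configuration status with ZRANGEBYSCORE (in ascending VERSION order) and checks the score of the first entry. If that score > Maat version + 1, updates have been missed (for example after a long network interruption) and the full update procedure is started.
For a DEL status, if the corresponding primary configuration cannot be found, updates have likewise been missed (the network interruption lasted longer than the OBSOLETE_RULE timeout) and the full update procedure is started.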
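The member format and the gap check described above can be sketched as follows. This is a minimal sketch, not Maat's code: `struct status_entry`, `parse_status_member`, `choose_update_mode`, the buffer sizes, and the assumption that table names contain no comma are all hypothetical.

```c
#include <stdio.h>
#include <string.h>

enum update_mode { UPDATE_NONE, UPDATE_INCREMENTAL, UPDATE_FULL };

/* Parsed form of a MAAT_UPDATE_STATUS member such as "ADD,HTTP_URL,100". */
struct status_entry {
    char op[4];          /* "ADD" or "DEL" */
    char table_name[64]; /* assumed to contain no comma */
    long long rule_id;
};

/* Returns 0 on success, -1 if the member is malformed. */
static int parse_status_member(const char *member, struct status_entry *out)
{
    if (sscanf(member, "%3[A-Z],%63[^,],%lld",
               out->op, out->table_name, &out->rule_id) != 3) {
        return -1;
    }
    if (strcmp(out->op, "ADD") != 0 && strcmp(out->op, "DEL") != 0) {
        return -1;
    }
    return 0;
}

/* first_score is the score of the first entry returned by ZRANGEBYSCORE. */
static enum update_mode choose_update_mode(long long local_version,
                                           long long redis_version,
                                           long long first_score)
{
    if (redis_version <= local_version) {
        return UPDATE_NONE;        /* nothing new in Redis */
    }
    if (first_score > local_version + 1) {
        return UPDATE_FULL;        /* a version was missed */
    }
    return UPDATE_INCREMENTAL;
}
```

Keeping the decision in a pure function makes the "missed update" rule easy to check without a Redis connection.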
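#### MAAT_EXPIRE_TIMER
This structure uses a sorted set to store timeout information for the primary configuration. The score is the absolute timeout time and the member has the form TableName,ID.
#### MAAT_VERSION_TIMER
This structure uses a sorted set to store the creation time of each version. The score is the version creation time and the member is the version number (that is, the score in MAAT_UPDATE_STATUS). It is used to keep MAAT_UPDATE_STATUS small.
#### Primary Configuration Structure
There are two naming schemes for configuration keys:
1. EFFECTIVE_RULE:TableName,ID denotes a configuration that is in effect;
2. OBSOLETE_RULE:TableName,ID denotes a configuration that has been deleted; Redis removes these once they time out (EXPIRE).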
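As a small illustration of the two key forms, a producer-side helper might build them like this. The helper name `format_rule_key` and its signature are hypothetical; only the key layout comes from the text above.

```c
#include <stdio.h>

/* Formats "EFFECTIVE_RULE:TableName,ID" or "OBSOLETE_RULE:TableName,ID".
 * Returns 0 on success, -1 if the buffer is too small. */
static int format_rule_key(char *buf, size_t size, int effective,
                           const char *table_name, long long rule_id)
{
    int n = snprintf(buf, size, "%s:%s,%lld",
                     effective ? "EFFECTIVE_RULE" : "OBSOLETE_RULE",
                     table_name, rule_id);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Checking the `snprintf` return value catches truncated keys, which would otherwise silently address the wrong entry.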
### Load From Redis
A worker thread of the Maat instance periodically polls MAAT_VERSION in Redis and performs an update if it is greater than the instance's own MAAT_VERSION.
![Load from Redis](./imgs/load-data-from-redis.png)
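### Read/Write Performance
To guarantee transactions, Redis must run in standalone + master-replica mode. Writes of configuration with a timeout reach 5,000 entries/second; without a timeout, 10,000 entries/second.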
## 2.<a name='Iris mode'></a> Iris mode
Maat can watch files in the full and incremental directories to pick up configuration changes at runtime. The file format used in this mode is described below.
@@ -60,106 +194,7 @@
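2. Add the summary information for this configuration to the configuration summary table. It must match the compile_id in the table file and must not conflict with an existing compile_id. Update the line count on the first line of the file.
3. In the table index file, update the line counts of the configuration summary table and the region table.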
## Redis Configuration Loading Interface
Redis can be initialized with the reset_redis4maat.sh tool in the source tree or with the Maat_cmd_flushDB function.
When a configuration is deleted, exec_serial_rule also deletes its associated MAAT_EXPIRE_TIMER index entry.
## 3.<a name='Json mode'></a> Json mode
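Use the Maat_summon_feather_json or Maat_set_feather_opt function (via the MAAT_OPT_JSON_FILE_PATH option) to load JSON-format configuration. After initialization, whenever Maat detects a change in the file's MD5 value, it reloads the changed JSON file as a full update.
@@ -457,10 +492,6 @@ Maat's configuration management thread scans the incremental index file directory, at ini…
The file scan interval and the configuration effective interval can be set through Maat_set_feather_opt; see the "Function Interface" chapter of this document.
### Reference Counting
To keep multi-threaded reads and writes of multiple variables from being slowed down by cache coherence traffic and false sharing, the reference counting mechanism allocates each thread its own 64-byte-aligned reference count variable.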
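The per-thread, 64-byte-aligned counters can be sketched in C11 as below. This is an assumption-laden illustration rather than Maat's code: the names `ref_slot`, `ref_counter`, `ref_inc`, `ref_dec`, `ref_total` and the `MAX_THREADS` limit are hypothetical, and a real implementation would still need atomics or a quiescent point before summing slots written by other threads.

```c
#include <stdalign.h>
#include <stddef.h>

#define CACHE_LINE 64
#define MAX_THREADS 16 /* illustrative limit */

/* One counter per thread; alignment pads each slot to a full cache line,
 * so two threads never write to the same line (no false sharing). */
struct ref_slot {
    alignas(CACHE_LINE) long long count;
};

struct ref_counter {
    struct ref_slot slots[MAX_THREADS];
};

/* Each thread touches only its own slot. */
static void ref_inc(struct ref_counter *rc, int thread_id)
{
    rc->slots[thread_id].count++;
}

static void ref_dec(struct ref_counter *rc, int thread_id)
{
    rc->slots[thread_id].count--;
}

/* Total references across all threads. */
static long long ref_total(const struct ref_counter *rc)
{
    long long total = 0;
    for (size_t i = 0; i < MAX_THREADS; i++) {
        total += rc->slots[i].count;
    }
    return total;
}
```

The design trades memory (64 bytes per thread) for increments that never contend on a shared cache line.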
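### Deferred Deletion Mechanism
Maat uses a deferred deletion mechanism to guarantee thread safety without using locks.
@@ -475,12 +506,4 @@ c) when modification is needed, access after acquiring the mutex
In the scanning thread:
a) when a read is needed, access after acquiring the mutex
### Forced Unload Mechanism
Internally, rulescan uses reference counting to manage automata that are pending deletion. The reference count is incremented and decremented over the span of a single scan function call rather than a whole streaming scan, which satisfies the conditions MAAT needs to implement forced unloading.
Forced unloading means that when the configuration is updated during a streaming scan, the automaton referenced by that scan is forcibly unloaded and its memory reclaimed. Subsequent streaming string scans that still reference the old automaton return immediately without producing any hit.
Because combined scanning reference-counts MAAT's scanner, rule combination before and after the replacement each uses the then-current bool matcher and is unaffected.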


@@ -1,21 +1,47 @@
# Table Schema
Maat tables are divided into two categories: physical tables that actually exist in the database and virtual tables that reference physical tables.
The types of physical tables are as follows:
- [item table](#1-item-table)
- [compile table](#4-compile-table)
- [group2compile table](#3-group2compile-table)
- [group2group table](#2-group2group-table)
- [plugin table](#5-plugin-table)
- [ip_plugin table](#6-ip_plugin-table)
- [fqdn_plugin table](#7-fqdn_plugin-table)
- [bool_plugin table](#8-bool_plugin-table)

Different physical tables can be combined into one table, see [conjunction table](#112-12-conjunction-table).
A virtual table can only reference one physical table or conjunction table, see [virtual table](#111-11-virtual-table).
## 1. <a name='Itemtable'></a> Item table
Item tables are further subdivided into the following subtable types:
- [expr item table](#11-expr-item-table)
- [expr_plus item table](#12-expr_plus-item-table)
- [ip_plus item table](#13-ip_plus-item-table)
- [intval item table](#14-numeric-range-item-table)
- [intval_plus item table](#14-numeric-range-item-table)
- [flag item table](#15-flag-item-table)
- [flag_plus item table](#16-flag_plus-item-table)

Each item table must have the following columns:
- item_id: In a maat instance, the item ID is globally unique, meaning that item IDs in different tables must not be duplicated.
- group_id: Indicates the group to which the item belongs; an item belongs to only one group.
- is_valid: In incremental updates, 1 (valid) means add and 0 (invalid) means delete.

Different types of item tables also define additional fields according to their respective needs. The range of item_id (group_id, compile_id) is 0 ~ 2^63, which is stored in 8 bytes.
### 1.1 <a name='exprtable'></a> expr item table
Describe matching rules for strings.
- table format

| **FieldName** | **type** | **NULL** | **constraint** |
| ---------------- | -------------- | -------- | ------- |
| **item_id** | LONG LONG | N | primary key |
@@ -26,22 +52,78 @@ Describe matching rules for strings.
| **is_hexbin** | INT | N | 0(not HEX & case insensitive, this is default value) 1(HEX & case sensitive) 2(not HEX & case sensitive) |
| **is_valid** | INT | N | 0(invalid), 1(valid) |

- table schema(stored in table_info.conf)
```c
{
"table_id":3, //[0 ~ 1023], don't allow duplicate
"table_name":"HTTP_URL", //db table's name
"table_type":"expr",
"valid_column":7, //7th column(is_valid field)
"custom": {
"item_id":1, //1st column(item_id field)
"group_id":2, //2nd column(group_id field)
"keywords":3, //3rd column(keywords field)
"expr_type":4, //4th column(expr_type field)
"match_method":5,//5th column(match_method field)
"is_hexbin":6 //6th column(is_hexbin field)
}
}
/* If you want to combine multiple physical tables into one table, db_tables should be added as follows.
The value of table_name can be a user-defined string, the value of db_tables is the table name that actually exists in database. */
{
"table_id":3, //[0 ~ 1023], don't allow duplicate
"table_name":"HTTP_REGION", //user-defined string
"db_tables":["HTTP_URL", "HTTP_HOST"],
"table_type":"expr",
"valid_column":7,
"custom": {
"item_id":1,
"group_id":2,
"keywords":3,
"expr_type":4,
"match_method":5,
"is_hexbin":6
}
}
```
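The column numbers in the `custom` object are 1-based positions within a table row. Assuming rows are tab-separated, as in the `maat_cmd_set_line` example of the table data document, extracting a column by its configured index could look like the sketch below. The helper `get_column` is hypothetical and not part of the Maat API.

```c
#include <stddef.h>
#include <string.h>

/* Copies the 1-based column `col` of a tab-separated line into buf.
 * Returns 0 on success, -1 if the column is missing or buf is too small. */
static int get_column(const char *line, int col, char *buf, size_t size)
{
    if (col < 1 || size == 0) {
        return -1;
    }
    const char *start = line;
    for (int i = 1; i < col; i++) {
        start = strchr(start, '\t');
        if (start == NULL) {
            return -1;          /* fewer columns than requested */
        }
        start++;                /* step past the tab */
    }
    size_t len = strcspn(start, "\t\n");
    if (len >= size) {
        return -1;              /* value does not fit */
    }
    memcpy(buf, start, len);
    buf[len] = '\0';
    return 0;
}
```

With the schema above, `"keywords":3` means the keywords are read with `get_column(line, 3, ...)` and `"valid_column":7` selects the is_valid field.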
**expr_type** column represents the expression type:
1. keywords matching(0), match_method column as follows
   - substring matching (0)
     For example: substring: "China", scan_data: "Hello China" will hit, "Hello World" will not hit.
   - suffix matching (1)
     For example: suffix: ".baidu.com", scan_data: "www.baidu.com" will hit, "www.google.com" will not hit.
   - prefix matching (2)
     For example: prefix: "^abc", scan_data: "abcdef" will hit, "1abcdef" will not hit.
   - exact matching (3)
     For example: string: "World", scan_data: "World" will hit, "Hello World" will not hit.
2. AND expression(1), supports up to 8 substrings.
   For example: AND expr: "yesterday&today", scan_data: "Goodbye yesterday, Hello today!" will hit, "Goodbye yesterday, Hello tomorrow!" will not hit.
3. Regular expression(2)
   For example: Regex expr: "(W|w)orld", scan_data: "Hello world" will hit, "Hello World" will hit too.
4. substring matching with offset(3)
   - offsets start at 0; [offset_start, offset_end] is a closed interval
   - multiple substrings with offset are combined with logical AND

   For example: substring expr: "1-1:48&3-4:4C4C", scan_data: "HELLO" will hit, "HLLO" will not hit.
   **Note**: 48('H') 4C('L')

&ensp;&ensp;Since Maat 4.0, only UTF-8 is supported and no encoding conversion is performed. For binary format configurations, the keyword is hexadecimal; for example, the keyword "hello" is represented as "68656C6C6F". A keyword can't contain invisible characters such as spaces, tabs, and CR, which are ASCII codes 0x00 to 0x1F and 0x7F. If these characters need to be used, they must be escaped; refer to the "keywords escape table". Characters led by backslashes outside this table are processed as ordinary strings; for example, '\t' is processed as the string "\t".
The symbol '&' means the conjunction operation in an AND expression, so if a keyword contains '&', it must be escaped as '\&'.
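The four match_method values for keyword matching can be illustrated with plain C string functions. This is a simplified, case-sensitive sketch and not how Maat's scanner works internally: the enum names and `keyword_match` are hypothetical, and the prefix keyword is passed without the leading '^' used in the example above.

```c
#include <stdbool.h>
#include <string.h>

/* Values follow the match_method column; the names are illustrative. */
enum match_method {
    MM_SUB = 0,
    MM_SUFFIX = 1,
    MM_PREFIX = 2,
    MM_EXACT = 3
};

/* Case-sensitive check of one keyword against scan data. */
static bool keyword_match(enum match_method m, const char *keyword,
                          const char *data)
{
    size_t klen = strlen(keyword);
    size_t dlen = strlen(data);

    switch (m) {
    case MM_SUB:
        return strstr(data, keyword) != NULL;
    case MM_SUFFIX:
        return klen <= dlen &&
               memcmp(data + dlen - klen, keyword, klen) == 0;
    case MM_PREFIX:
        return klen <= dlen && memcmp(data, keyword, klen) == 0;
    case MM_EXACT:
        return klen == dlen && memcmp(data, keyword, klen) == 0;
    }
    return false;
}
```

The four branches correspond one-to-one to the substring, suffix, prefix and exact examples in the list above.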
@@ -55,17 +137,366 @@ The symbol '&' means conjunction operation in AND expression. So if the keywords
Length constraint
- Single substring no less than 3 bytes
- No less than 3 bytes for a single substring in AND expression
- Support up to 8 substrings in one AND expression, expr = substr1 & substr2 & substr3 & substr4 & substr5 & substr6 & substr7 & substr8
- The length of one AND expression should not exceed 1024 bytes(including '&')
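These constraints can be checked before a rule is written. The sketch below is a hypothetical producer-side validator, not a Maat function; it assumes that an escaped `\&` counts as one keyword byte and that all other bytes count as written.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_EXPR_LEN 1024  /* whole AND expression, including '&' */
#define MAX_SUBSTR 8       /* substrings per AND expression */
#define MIN_SUBSTR_LEN 3   /* bytes per substring */

/* Returns true if the AND expression satisfies the length constraints. */
static bool and_expr_valid(const char *expr)
{
    size_t total = strlen(expr);
    if (total == 0 || total > MAX_EXPR_LEN) {
        return false;
    }

    size_t n_sub = 1;
    size_t cur_len = 0;
    for (size_t i = 0; i < total; i++) {
        if (expr[i] == '\\' && expr[i + 1] == '&') {
            cur_len++;   /* escaped '&' is part of the keyword */
            i++;         /* skip the '&' */
        } else if (expr[i] == '&') {
            if (cur_len < MIN_SUBSTR_LEN) {
                return false;
            }
            n_sub++;     /* unescaped '&' separates substrings */
            cur_len = 0;
        } else {
            cur_len++;
        }
    }
    return cur_len >= MIN_SUBSTR_LEN && n_sub <= MAX_SUBSTR;
}
```

Rejecting invalid expressions on the producer side avoids writing rules that the loader would have to discard.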
Sample
- table schema
- rule
- scanning
table schema stored in table_info.conf
```json
[
{
"table_id":0,
"table_name":"COMPILE",
"table_type":"compile",
"valid_column":8,
"custom": {
"compile_id":1,
"tags":6,
"clause_num":9
}
},
{
"table_id":1,
"table_name":"GROUP2COMPILE",
"table_type":"group2compile",
"associated_compile_table_id":0,
"valid_column":3,
"custom": {
"group_id":1,
"compile_id":2,
"not_flag":4,
"virtual_table_name":5,
"clause_index":6
}
},
{
"table_id":2,
"table_name":"GROUP2GROUP",
"table_type":"group2group",
"valid_column":4,
"custom": {
"group_id":1,
"super_group_id":2,
"is_exclude":3
}
},
{
"table_id":3,
"table_name":"HTTP_URL",
"table_type":"expr",
"valid_column":7,
"custom": {
"item_id":1,
"group_id":2,
"keywords":3,
"expr_type":4,
"match_method":5,
"is_hexbin":6
}
}
]
```
rule stored in maat_json.json
```json
{
"compile_table": "COMPILE",
"group2compile_table": "GROUP2COMPILE",
"group2group_table": "GROUP2GROUP",
"rules": [
{
"compile_id": 123,
"service": 1,
"action": 1,
"do_blacklist": 1,
"do_log": 1,
"user_region": "anything",
"is_valid": "yes",
"groups": [
{
"group_name": "Untitled",
"regions": [
{
"table_name": "HTTP_URL",
"table_type": "expr",
"table_content":
{
"keywords": "multiple disciplines",
"expr_type": "none",
"match_method": "sub",
"format": "uncase plain"
}
}
]
}
]
},
{
"compile_id": 124,
"service": 1,
"action": 1,
"do_blacklist": 1,
"do_log": 1,
"user_region": "anything",
"is_valid": "yes",
"groups": [
{
"group_name": "Untitled",
"regions": [
{
"table_name": "HTTP_URL",
"table_type": "expr",
"table_content":
{
"keywords": "baidu.com",
"expr_type": "none",
"match_method": "suffix",
"format": "uncase plain"
}
}
]
}
]
},
{
"compile_id": 125,
"service": 1,
"action": 1,
"do_blacklist": 1,
"do_log": 1,
"user_region": "anything",
"is_valid": "yes",
"groups": [
{
"group_name": "Untitled",
"regions": [
{
"table_name": "HTTP_URL",
"table_type": "expr",
"table_content":
{
"keywords": "www",
"expr_type": "none",
"match_method": "prefix",
"format": "uncase plain"
}
}
]
}
]
},
{
"compile_id": 126,
"service": 1,
"action": 1,
"do_blacklist": 1,
"do_log": 1,
"user_region": "anything",
"is_valid": "yes",
"groups": [
{
"group_name": "Untitled",
"regions": [
{
"table_name": "HTTP_URL",
"table_type": "expr",
"table_content":
{
"keywords": "abc&123",
"expr_type": "and",
"match_method": "sub",
"format": "uncase plain"
}
}
]
}
]
},
{
"compile_id": 127,
"service": 1,
"action": 1,
"do_blacklist": 1,
"do_log": 1,
"user_region": "anything",
"is_valid": "yes",
"groups": [
{
"group_name": "Untitled",
"regions": [
{
"table_name": "HTTP_URL",
"table_type": "expr",
"table_content":
{
"keywords": "action=search\\&query=(.*)",
"expr_type": "regex",
"match_method": "sub",
"format": "uncase plain"
}
}
]
}
]
},
{
"compile_id": 128,
"service": 1,
"action": 1,
"do_blacklist": 1,
"do_log": 1,
"user_region": "anything",
"is_valid": "yes",
"groups": [
{
"group_name": "Untitled",
"regions": [
{
"table_name": "HTTP_URL",
"table_type": "expr",
"table_content":
{
"keywords": "1-1:48&3-4:4C4C",
"expr_type": "offset",
"match_method": "sub",
"format": "uncase plain"
}
}
]
}
]
}
]
}
```
scanning
```c
#include <assert.h>
#include <string.h> // strlen
#include "maat.h"
#define ARRAY_SIZE 16
const char *json_filename = "./maat_json.json";
const char *table_info_path = "./table_info.conf";
int main()
{
// initialize maat options which will be used by maat_new()
struct maat_options *opts = maat_options_new();
maat_options_set_json_file(opts, json_filename);
maat_options_set_logger(opts, "./sample_test.log", LOG_LEVEL_INFO);
// create maat instance; the table schema in table_info.conf and the rules in maat_json.json will be loaded.
struct maat *maat_instance = maat_new(opts, table_info_path);
assert(maat_instance != NULL);
maat_options_free(opts);
const char *table_name = "HTTP_URL"; //maat_json.json has HTTP_URL rule
int table_id = maat_get_table_id(maat_instance, table_name);
assert(table_id == 3); // defined in table_info.conf
int thread_id = 0;
long long results[ARRAY_SIZE] = {0};
size_t n_hit_result = 0;
struct maat_state *state = maat_state_new(maat_instance, thread_id);
assert(state != NULL);
const char *scan_data1 = "There are multiple disciplines";
int ret = maat_scan_string(maat_instance, table_id, scan_data1, strlen(scan_data1),
results, ARRAY_SIZE, &n_hit_result, state);
assert(ret == MAAT_SCAN_HIT);
assert(n_hit_result == 1);
assert(results[0] == 123);
maat_state_reset(state);
const char *scan_data2 = "mail.baidu.com"; // hits suffix rule 124 only; a "www" prefix would also hit rule 125
ret = maat_scan_string(maat_instance, table_id, scan_data2, strlen(scan_data2),
results, ARRAY_SIZE, &n_hit_result, state);
assert(ret == MAAT_SCAN_HIT);
assert(n_hit_result == 1);
assert(results[0] == 124);
maat_state_reset(state);
const char *scan_data3 = "www.google.com";
ret = maat_scan_string(maat_instance, table_id, scan_data3, strlen(scan_data3),
results, ARRAY_SIZE, &n_hit_result, state);
assert(ret == MAAT_SCAN_HIT);
assert(n_hit_result == 1);
assert(results[0] == 125);
maat_state_reset(state);
const char *scan_data4 = "alphabet abc, digit 123";
ret = maat_scan_string(maat_instance, table_id, scan_data4, strlen(scan_data4),
results, ARRAY_SIZE, &n_hit_result, state);
assert(ret == MAAT_SCAN_HIT);
assert(n_hit_result == 1);
assert(results[0] == 126);
maat_state_reset(state);
const char *scan_data5 = "http://www.cyberessays.com/search_results.php?action=search&query=username"; // avoids "abc" and "123", which would also hit rule 126
ret = maat_scan_string(maat_instance, table_id, scan_data5, strlen(scan_data5),
results, ARRAY_SIZE, &n_hit_result, state);
assert(ret == MAAT_SCAN_HIT);
assert(n_hit_result == 1);
assert(results[0] == 127);
maat_state_reset(state);
const char *scan_data6 = "HELLO WORLD";
ret = maat_scan_string(maat_instance, table_id, scan_data6, strlen(scan_data6),
results, ARRAY_SIZE, &n_hit_result, state);
assert(ret == MAAT_SCAN_HIT);
assert(n_hit_result == 1);
assert(results[0] == 128);
maat_state_free(state);
return 0;
}
```
### 1.2 <a name='ExprPlusItemTable'></a> expr_plus item table
Describe extended matching rules for strings by adding the district column.
- table format
| **FieldName** | **type** | **NULL** | **constraint** |
| ---------------- | -------------- | -------- | ------- |
| **item_id** | LONG LONG | N | primary key |
| **group_id** | LONG LONG | N | group2group or group2compile table's group_id |
| **district** | VARCHAR2(1024) | N | describe the effective position of the keywords |
| **keywords** | VARCHAR2(1024) | N | field to match during scanning |
| **expr_type** | INT | N | 0(keywords), 1(AND expr), 2(regular expr), 3(substring with offset) |
| **match_method** | INT | N | only useful when expr_type is 0 |
| **is_hexbin** | INT | N | 0(not HEX & case insensitive, this is default value) 1(HEX & case sensitive) 2(not HEX & case sensitive) |
| **is_valid** | INT | N | 0(invalid), 1(valid) |
For example, if the district is User-Agent and keywords is Chrome, scanning in the following way will hit.
```c
const char *scan_data = "Chrome is fast";
const char *district = "User-Agent";
maat_state_set_scan_district(..., district, ...);
maat_scan_string(..., scan_data, ...);
```
### 1.3 <a name='IPPlusItemTable'></a> ip_plus item table
Describe matching rules for IP addresses. Both the address and the port are represented as strings; IPv4 is dotted decimal and IPv6 is colon-separated hexadecimal.
- table format

| **FieldName** | **type** | **NULL** | **constraint** |
| ------------- | ------------ | -------- | -------------- |
@@ -81,26 +512,50 @@ Describe matching rules for IP address. Both the address and port are represente
| protocol | INT | N | default(-1) TCP(6) UDP(17), user-defined field |
| is_valid | INT | N | 0(invalid), 1(valid) |
### 1.4 <a name='NumericItemTable'></a> numeric range item table
Determine whether an integer is within a certain numerical range.
- table format

| **FieldName** | **type** | **NULL** | **constraint** |
| ------------- | -------- | -------- | -------------- |
| item_id | INT | N | primary key |
| group_id | INT | N | group2group or group2compile table's group_id |
| low_boundary | INT | N | lower bound of the numerical range(including lb), 0 ~ (2^32 - 1)|
| up_boundary | INT | N | upper bound of the numerical range(including ub), 0 ~ (2^32 - 1)|
| is_valid | INT | N | 0(invalid), 1(valid) |
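Because both boundaries are inclusive, the hit test for a numeric range item is a closed-interval check. The struct and function below are a hypothetical illustration of that rule, not Maat's internal representation.

```c
#include <stdbool.h>
#include <stdint.h>

/* One row of a numeric range item table (illustrative layout). */
struct intval_item {
    long long item_id;
    long long group_id;
    uint32_t low_boundary; /* inclusive, 0 ~ (2^32 - 1) */
    uint32_t up_boundary;  /* inclusive, 0 ~ (2^32 - 1) */
};

/* True if value lies in [low_boundary, up_boundary]. */
static bool intval_hit(const struct intval_item *item, uint32_t value)
{
    return item->low_boundary <= value && value <= item->up_boundary;
}
```

Using `uint32_t` for the boundaries covers the documented range without overflow at the upper end.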
### 1.5 <a name="FlagItemTable"></a> flag item table
- table format
| **FieldName** | **type** | **NULL** | **constraint** |
| ------------- | -------- | -------- | -------------- |
| item_id | INT | N | primary key |
| group_id | INT | N | group2group or group2compile table's group_id |
| flag | INT | N | flag, 0 ~ (2^32 - 1)|
| flag_mask | INT | N | flag_mask, 0 ~ (2^32 - 1)|
| is_valid | INT | N | 0(invalid), 1(valid) |
### 1.6 <a name="FlagPlusItemTable"></a> flag_plus item table
- table format
| **FieldName** | **type** | **NULL** | **constraint** |
| ------------- | -------- | -------- | -------------- |
| item_id | INT | N | primary key |
| group_id | INT | N | group2group or group2compile table's group_id |
| district | INT | N | describe the effective position of the flag |
| flag | INT | N | flag, 0 ~ (2^32 - 1)|
| flag_mask | INT | N | flag_mask, 0 ~ (2^32 - 1)|
| is_valid | INT | N | 0(invalid), 1(valid) |
### 2. <a name='group2grouptable'></a> group2group table
Describe the relationship between groups.
- table format

| **FieldName** | **type** | **NULL** | **constraint** |
| ----------------- | --------- | -------- | ---------------|
@@ -109,11 +564,11 @@ Describe the relationship between groups.
| is_exlude | Bool | N | 0(include) 1(exclude) | | is_exlude | Bool | N | 0(include) 1(exclude) |
| is_valid | Bool | N | 0(invalid), 1(valid) | | is_valid | Bool | N | 0(invalid), 1(valid) |
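
The include/exclude relationship can be sketched as follows, based on the rules given in the Concepts section (included groups are OR-ed, and an excluded group is AND-ed with them). The sketch assumes that "excluded" means the excluded sub-group must not be hit; the `sub_group_ref` struct and `super_group_hit` function are hypothetical and are not Maat APIs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of one group2group row plus the scan status of the sub-group. */
struct sub_group_ref {
    int group_id;
    bool is_exclude; /* mirrors the is_exlude column: false = include, true = exclude */
    bool is_valid;   /* mirrors the is_valid column */
    bool hit;        /* whether the sub-group itself was hit by the scan */
};

/*
 * Assumed semantics: the superior group is hit when at least one valid
 * included sub-group is hit and no valid excluded sub-group is hit.
 */
static bool super_group_hit(const struct sub_group_ref *refs, size_t n)
{
    bool any_included_hit = false;

    for (size_t i = 0; i < n; i++) {
        if (!refs[i].is_valid) {
            continue; /* invalid rows do not take part in the evaluation */
        }
        if (refs[i].is_exclude) {
            if (refs[i].hit) {
                return false; /* an excluded sub-group was hit */
            }
        } else if (refs[i].hit) {
            any_included_hit = true;
        }
    }
    return any_included_hit;
}
```

With `g4 = incl-g1 & excl-g2`, this evaluates to a hit only when g1 is hit and g2 is not.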
### 5. Group2compile table ### 3. <a name='group2compiletable'></a> group2compile table
Describe the relationship between group and compile. Describe the relationship between group and compile.
#### table schema - table format
| **FieldName** | **type** | **NULL** | **constraint** | | **FieldName** | **type** | **NULL** | **constraint** |
| ------------- | ------------- | -------- | ------- | | ------------- | ------------- | -------- | ------- |
@@ -126,11 +581,11 @@ Describe the relationship between group and compile.
NOTE: If group_id is invalid in xx_item table, it must be marked as invalid in this table. NOTE: If group_id is invalid in xx_item table, it must be marked as invalid in this table.
### 6. Compile table ### 4. <a name='compiletable'></a> compile table
Describes the specific policy. One maat instance can have multiple compile tables with different names.
#### table schema - table format
| **FieldName** | **type** | **NULL** | **constraint** | | **FieldName** | **type** | **NULL** | **constraint** |
| ---------------- | -------------- | -------- | --------------- | | ---------------- | -------------- | -------- | --------------- |
@@ -146,14 +601,54 @@ Describe the specific policy, One maat instance can has multiple compile tables
| evaluation_order | DOUBLE | N | | default 0 | | evaluation_order | DOUBLE | N | | default 0 |
### 7. Plugin table ### 5. <a name='plugintable'></a> plugin table
There is no fixed format for configuration of the plugin table, which is determined by business side. The plugin table support three types of keys: pointer, integer and ip_addr. There is no fixed format for configuration of the plugin table, which is determined by business side. The plugin table supports two sets of callback functions, registered with **maat_table_callback_register** and **maat_plugin_table_ex_schema_register** respectively.
maat_table_callback_register
```c
/*
When the plugin table configurations are updated, start will be called first and only once, then update will be called by each configuration item, and finish will be called last and only once.
If configurations have been loaded but maat_table_callback_register has not yet been called, maat will cache the loaded configurations and perform the callbacks(start, update, finish) when registration is complete.
*/
typedef void maat_start_callback_t(int update_type, ...);
//table_line points to one complete configuration line, such as: "1\tHeBei\tShijiazhuang\t1\t0"
typedef void maat_update_callback_t(..., const char *table_line, ...);
typedef void maat_finish_callback_t(...);
int maat_table_callback_register(...,
maat_start_callback_t *start,
maat_update_callback_t *update,
maat_finish_callback_t *finish,
...);
```
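
Since `table_line` is passed to the update callback as one complete tab-separated line, a callback typically has to split it into columns itself. The helper below is a minimal sketch of that step; `split_table_line` is a hypothetical name, not a Maat function, and because it modifies the buffer in place the caller is assumed to work on its own copy of the `const char *table_line`.

```c
#include <stddef.h>
#include <string.h>

/*
 * Splits a tab-separated configuration line in place.
 * Each tab is overwritten with '\0' and fields[] receives pointers to the
 * start of each column. Returns the number of columns stored (at most
 * max_fields).
 */
static size_t split_table_line(char *line, char *fields[], size_t max_fields)
{
    size_t n = 0;
    char *cursor = line;

    while (cursor != NULL && n < max_fields) {
        fields[n++] = cursor;
        char *tab = strchr(cursor, '\t');
        if (tab == NULL) {
            break; /* last column reached */
        }
        *tab = '\0';
        cursor = tab + 1;
    }
    return n;
}
```

For the sample line `"1\tHeBei\tShijiazhuang\t1\t0"`, this yields five columns, with `HeBei` as the second one.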
maat_plugin_table_ex_schema_register
```c
/*
*/
typedef void maat_ex_new_func_t(..., const char *key, const char *table_line, ...);
typedef void maat_ex_free_func_t(...);
typedef void maat_ex_dup_func_t(...);
int maat_plugin_table_ex_schema_register(...,
maat_ex_new_func_t *new_func,
maat_ex_free_func_t *free_func,
maat_ex_dup_func_t *dup_func,
...);
```
The ex_data callbacks support three types of keys: pointer, integer and ip_addr.
**pointer key(compatible with maat3)** **pointer key(compatible with maat3)**
(1) schema (1) schema
``` ```json
{ {
"table_id":1, "table_id":1,
"table_name":"TEST_PLUGIN_POINTER_KEY_TYPE", "table_name":"TEST_PLUGIN_POINTER_KEY_TYPE",
@@ -168,7 +663,7 @@ There is no fixed format for configuration of the plugin table, which is determi
``` ```
(2) plugin table configuration (2) plugin table configuration
``` ```json
{ {
"table_name": "TEST_PLUGIN_POINTER_KEY_TYPE", "table_name": "TEST_PLUGIN_POINTER_KEY_TYPE",
"table_content": [ "table_content": [
@@ -180,7 +675,12 @@ There is no fixed format for configuration of the plugin table, which is determi
} }
``` ```
(3) get_ex_data (3) register callback
```c
```
(4) get ex_data
``` ```
const char *key1 = "HeBei"; const char *key1 = "HeBei";
const char *table_name = "TEST_PLUGIN_POINTER_KEY_TYPE"; const char *table_name = "TEST_PLUGIN_POINTER_KEY_TYPE";
@@ -245,7 +745,7 @@ support integers of different lengths, such as int(4 bytes), long long(8 bytes).
} }
``` ```
(3) get_ex_data (3) get ex_data
``` ```
//int //int
int key1 = 101; int key1 = 101;
@@ -295,7 +795,7 @@ The addr_type column indicates whether the key is a v4 or v6 address.
} }
``` ```
(3) get_ex_data (3) get ex_data
``` ```
uint32_t ipv4_addr; uint32_t ipv4_addr;
inet_pton(AF_INET, "100.64.1.1", &ipv4_addr); inet_pton(AF_INET, "100.64.1.1", &ipv4_addr);
@@ -306,11 +806,11 @@ maat_plugin_table_get_ex_data(maat_instance, table_id, (char *)&ipv4_addr, sizeo
``` ```
### 8. IP Plugin table ### 6. <a name='ip_plugintable'></a> ip_plugin table
Similar to the plugin table, but the key passed to maat_ip_plugin_table_get_ex_data is an IP address.
### 9. FQDN Plugin table ### 7. <a name='FQDNPlugintable'></a> fqdn_plugin table
Scan the input string according to the domain name hierarchy '.' Scan the input string according to the domain name hierarchy '.'
@@ -328,13 +828,13 @@ For example:
If the input string is example.com.cn, the results are returned in the order 3, 1, 2, 4. The "ample" in rule 5 is not part of the domain name hierarchy, so rule 5 is not returned.
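
The idea that a rule only matches on a domain label boundary can be sketched as a suffix check. This is an illustration under assumptions, not Maat's matching code: `fqdn_suffix_match` is a hypothetical helper, the comparison is case-sensitive, and the rule strings in the usage note are made up because the actual rules of the example are not shown here.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Returns true when rule equals input, or when input ends with rule and the
 * character just before the matched suffix is the hierarchy separator '.'.
 */
static bool fqdn_suffix_match(const char *input, const char *rule)
{
    size_t in_len = strlen(input);
    size_t rule_len = strlen(rule);

    if (rule_len == 0 || rule_len > in_len) {
        return false;
    }

    const char *tail = input + (in_len - rule_len);
    if (strcmp(tail, rule) != 0) {
        return false; /* input does not end with rule */
    }
    /* Exact match, or the suffix starts right after a '.' label boundary. */
    return rule_len == in_len || tail[-1] == '.';
}
```

Under these assumptions, `com.cn` matches the input `example.com.cn`, while `ample.com.cn` does not, because it begins in the middle of the label `example`.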
### 10. BoolPlugin table ### 8. <a name='boolplugintable'></a> bool_plugin table
Scans an input integer array, such as [100,1000,2,3], against boolean expressions.
A boolean expression rule is a list of numbers separated by "&", for example "1&2&1000".
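
A possible reading of this rule format is sketched below: an expression is treated as hit when every number in it appears in the input array. This AND semantics is an assumption drawn from the "&" separator, and `bool_expr_match` and `array_contains` are hypothetical helpers, not Maat APIs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Linear search for v in items[0..n). */
static bool array_contains(const unsigned long long *items, size_t n,
                           unsigned long long v)
{
    for (size_t i = 0; i < n; i++) {
        if (items[i] == v) {
            return true;
        }
    }
    return false;
}

/*
 * Returns true when every number of an "&"-separated expression such as
 * "1&2&1000" is present in items. Empty or malformed expressions return false.
 */
static bool bool_expr_match(const char *expr, const unsigned long long *items,
                            size_t n_items)
{
    const char *p = expr;

    if (*p == '\0') {
        return false;
    }
    while (*p != '\0') {
        char *end = NULL;
        unsigned long long v = strtoull(p, &end, 10);
        if (end == p) {
            return false; /* no digits where a number was expected */
        }
        if (!array_contains(items, n_items, v)) {
            return false; /* one operand is missing from the input */
        }
        p = end;
        if (*p == '&') {
            p++;
        } else if (*p != '\0') {
            return false; /* unexpected separator */
        }
    }
    return true;
}
```

Under this reading, the input [100,1000,2,3] hits "2&1000" but not "1&2&1000", because 1 is not in the array.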
### 11. Virtual Table ### 1.11 <a name='virtualtable'></a> virtual table
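A virtual table is a virtualized configuration table whose content is a view of specific physical configuration tables. In practice, an attribute of network traffic is usually used as the virtual table name, such as HTTP_HOST or SSL_SNI. A virtual table can be built on multiple physical tables of different types, but it cannot be built on another virtual table.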
@@ -345,7 +845,7 @@ For example:
| **keyword_group_1** | compile_1 | 1 | 0 | 0 | REQUEST_BODY | | **keyword_group_1** | compile_1 | 1 | 0 | 0 | REQUEST_BODY |
| **keyword_group_1** | compile_1 | 1 | 0 | 0 | RESPONSE_BODY | | **keyword_group_1** | compile_1 | 1 | 0 | 0 | RESPONSE_BODY |
### 12. Conjunction Table ### 1.12 <a name='conjunctiontable'></a> conjunction table
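Tables with different names but the same table id. The goal is to provide a virtual layer between the database table files and the MAAT API, so that a single scan call through the API can scan multiple configuration tables of the same type.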
@@ -358,7 +858,7 @@ For example:
Conjunction is supported for all table types, including the various item tables and callback (plugin) tables. Conjunction of group tables and compile tables is meaningless.
## Foreign Files ## 2. <a name='ForeignFiles'></a>Foreign Files
A specific field in a callback (plugin) configuration can point to external content. Currently, pointing to a key in Redis is supported.
@@ -376,7 +876,7 @@ For example:
For how to declare a foreign key to content, see the section on the table description file in this document.
## Tags ## 3. <a name='Tags'></a>Tags
Configurations are loaded selectively by matching the tags accepted by Maat against the configuration tags. The configuration tags are a set of tag arrays, denoted "tag_sets"; the tags accepted by Maat are a single tag array, denoted "tags".

66
docs/thread_mode.md Normal file

@@ -0,0 +1,66 @@
# Thread mode
Maat creates a monitor loop thread internally when maat_new is called to create a maat instance. Scanning threads are created by the external caller of maat, so all maat_scan_xx APIs are per-thread.
![The subordinate object](./imgs/thread_mode.png)
Sample
```c
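#include <pthread.h>
#include <string.h>
/* The maat public header and a test framework that provides EXPECT_EQ are also required. */

/* Capacity of the results buffer; assumed value, the original sample does not define it. */
#define ARRAY_SIZE 10

const char *table_info_path = "table_info.conf";
/* Placeholder name: replace with a table defined in table_info.conf. */
const char *table_name = "EXAMPLE_TABLE";
size_t thread_num = 5;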
struct thread_param {
int thread_id;
struct maat *maat_inst;
const char *table_name;
};
void *string_scan_thread(void *arg)
{
struct thread_param *param = (struct thread_param *)arg;
struct maat *maat_inst = param->maat_inst;
const char *table_name = param->table_name;
const char *scan_data = "String TEST should hit";
long long results[ARRAY_SIZE] = {0};
size_t n_hit_result = 0;
struct maat_state *state = maat_state_new(maat_inst, param->thread_id);
int table_id = maat_get_table_id(maat_inst, table_name);
int ret = maat_scan_string(maat_inst, table_id, scan_data, strlen(scan_data),
results, ARRAY_SIZE, &n_hit_result, state);
EXPECT_EQ(ret, MAAT_SCAN_HIT);
EXPECT_EQ(n_hit_result, 1);
EXPECT_EQ(results[0], 123);
maat_state_free(state);
return NULL;
}
int main()
{
struct maat_options *opts = maat_options_new();
maat_options_set_caller_thread_number(opts, thread_num);
struct maat *maat_inst = maat_new(opts, table_info_path);
pthread_t threads[thread_num];
struct thread_param thread_params[thread_num];
for (size_t i = 0; i < thread_num; i++) {
thread_params[i].maat_inst = maat_inst;
thread_params[i].thread_id = i;
thread_params[i].table_name = table_name;
pthread_create(&threads[i], NULL, string_scan_thread, thread_params+i);
}
for (size_t i = 0; i < thread_num; i++) {
pthread_join(threads[i], NULL);
}
return 0;
}
```


@@ -195,4 +195,6 @@ int main()
* [Scan API](./docs/scan_api.md) * [Scan API](./docs/scan_api.md)
* [Thread mode](./docs/thread_mode.md)
* [Tools](./docs/tools.md) * [Tools](./docs/tools.md)