release 24.09 test

doufenghu
2024-11-07 20:48:12 +08:00
parent b07fbdc544
commit 3539138f95
45 changed files with 25563 additions and 90 deletions

.gitlab-ci.yml

@@ -0,0 +1,33 @@
image: 192.168.40.153:8082/common/maven:3.8.1-openjdk-11-slim
stages:
- build
- deploy
variables:
VERSION: "$CI_COMMIT_TAG"
build:
stage: build
script:
- echo "Building the project..."
- tar -czvf tsg-olap-data-initialization-$VERSION.tar.gz --exclude='*.log' --exclude='.gitlab-ci.yml' --exclude='.idea' .
- ls -lah
artifacts:
paths:
- tsg-olap-data-initialization-$CI_COMMIT_TAG.tar.gz
only:
- tags
deploy:
stage: deploy
script:
- echo "Uploading to Nexus..."
- ls -lah
- curl -v -u $NEXUS_USER:$NEXUS_PASSWORD --upload-file tsg-olap-data-initialization-$VERSION.tar.gz "$NEXUS_URL/deployment/tsg-olap-data-initialization-$VERSION.tar.gz"
dependencies:
- build
only:
- tags
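
A hedged illustration of how this pipeline is intended to be driven; the tag value below is only an example, and `$NEXUS_URL`, `$NEXUS_USER`, and `$NEXUS_PASSWORD` are assumed to be CI/CD variables configured outside this file:

```
# Example release flow (illustrative): the pipeline only runs for tags,
# so pushing a tag such as 24.09 triggers the build and deploy stages.
git tag 24.09
git push origin 24.09

# With VERSION=$CI_COMMIT_TAG, the build stage should produce
#   tsg-olap-data-initialization-24.09.tar.gz
# and the deploy stage uploads it to (URL shape taken from the curl above):
#   $NEXUS_URL/deployment/tsg-olap-data-initialization-24.09.tar.gz
```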

CHANGELOG.md

@@ -0,0 +1,15 @@
# Changelog
## [24.10] - 2024-11-07
### Added
### Fixed
### Deleted
## [24.09] - 2024-11-07
### Initial release
- Released all configuration files and initialization scripts for the TSG OLAP installation package.

README.md

@@ -1,93 +1,8 @@
# TSG OLAP Data Initialization
## Directory Overview
## Getting started
To make it easy for you to get started with GitLab, here's a list of recommended next steps.
Already a pro? Just edit this README.md and make it your own. Want to make it easy? [Use the template at the bottom](#editing-this-readme)!
## Add your files
- [ ] [Create](https://docs.gitlab.com/ee/user/project/repository/web_editor.html#create-a-file) or [upload](https://docs.gitlab.com/ee/user/project/repository/web_editor.html#upload-a-file) files
- [ ] [Add files using the command line](https://docs.gitlab.com/ee/gitlab-basics/add-file.html#add-a-file-using-the-command-line) or push an existing Git repository with the following command:
```
cd existing_repo
git remote add origin https://git.mesalab.cn/galaxy/deployment/tsg-olap-data-initialization.git
git branch -M main
git push -uf origin main
```
## Integrate with your tools
- [ ] [Set up project integrations](https://git.mesalab.cn/galaxy/deployment/tsg-olap-data-initialization/-/settings/integrations)
## Collaborate with your team
- [ ] [Invite team members and collaborators](https://docs.gitlab.com/ee/user/project/members/)
- [ ] [Create a new merge request](https://docs.gitlab.com/ee/user/project/merge_requests/creating_merge_requests.html)
- [ ] [Automatically close issues from merge requests](https://docs.gitlab.com/ee/user/project/issues/managing_issues.html#closing-issues-automatically)
- [ ] [Enable merge request approvals](https://docs.gitlab.com/ee/user/project/merge_requests/approvals/)
- [ ] [Set auto-merge](https://docs.gitlab.com/ee/user/project/merge_requests/merge_when_pipeline_succeeds.html)
## Test and Deploy
Use the built-in continuous integration in GitLab.
- [ ] [Get started with GitLab CI/CD](https://docs.gitlab.com/ee/ci/quick_start/index.html)
- [ ] [Analyze your code for known vulnerabilities with Static Application Security Testing (SAST)](https://docs.gitlab.com/ee/user/application_security/sast/)
- [ ] [Deploy to Kubernetes, Amazon EC2, or Amazon ECS using Auto Deploy](https://docs.gitlab.com/ee/topics/autodevops/requirements.html)
- [ ] [Use pull-based deployments for improved Kubernetes management](https://docs.gitlab.com/ee/user/clusters/agent/)
- [ ] [Set up protected environments](https://docs.gitlab.com/ee/ci/environments/protected_environments.html)
***
# Editing this README
When you're ready to make this README your own, just edit this file and use the handy template below (or feel free to structure it however you want - this is just a starting point!). Thanks to [makeareadme.com](https://www.makeareadme.com/) for this template.
## Suggestions for a good README
Every project is different, so consider which of these sections apply to yours. The sections used in the template are suggestions for most open source projects. Also keep in mind that while a README can be too long and detailed, too long is better than too short. If you think your README is too long, consider utilizing another form of documentation rather than cutting out information.
## Name
Choose a self-explaining name for your project.
## Description
Let people know what your project can do specifically. Provide context and add a link to any reference visitors might be unfamiliar with. A list of Features or a Background subsection can also be added here. If there are alternatives to your project, this is a good place to list differentiating factors.
## Badges
On some READMEs, you may see small images that convey metadata, such as whether or not all the tests are passing for the project. You can use Shields to add some to your README. Many services also have instructions for adding a badge.
## Visuals
Depending on what you are making, it can be a good idea to include screenshots or even a video (you'll frequently see GIFs rather than actual videos). Tools like ttygif can help, but check out Asciinema for a more sophisticated method.
## Installation
Within a particular ecosystem, there may be a common way of installing things, such as using Yarn, NuGet, or Homebrew. However, consider the possibility that whoever is reading your README is a novice and would like more guidance. Listing specific steps helps remove ambiguity and gets people to using your project as quickly as possible. If it only runs in a specific context like a particular programming language version or operating system or has dependencies that have to be installed manually, also add a Requirements subsection.
## Usage
Use examples liberally, and show the expected output if you can. It's helpful to have inline the smallest example of usage that you can demonstrate, while providing links to more sophisticated examples if they are too long to reasonably include in the README.
## Support
Tell people where they can go to for help. It can be any combination of an issue tracker, a chat room, an email address, etc.
## Roadmap
If you have ideas for releases in the future, it is a good idea to list them in the README.
## Contributing
State if you are open to contributions and what your requirements are for accepting them.
For people who want to make changes to your project, it's helpful to have some documentation on how to get started. Perhaps there is a script that they should run or some environment variables that they need to set. Make these steps explicit. These instructions could also be useful to your future self.
You can also document commands to lint the code or run tests. These steps help to ensure high code quality and reduce the likelihood that the changes inadvertently break something. Having instructions for running tests is especially helpful if it requires external setup, such as starting a Selenium server for testing in a browser.
## Authors and acknowledgment
Show your appreciation to those who have contributed to the project.
## License
For open source projects, say how it is licensed.
## Project status
If you have run out of energy or time for your project, put a note at the top of the README saying that development has slowed down or stopped completely. Someone may choose to fork your project or volunteer to step in as a maintainer or owner, allowing your project to keep going. You can also make an explicit request for maintainers.
| Directory Name | Description |
|:-----------------|:--------------|
| shell-scripts | Stores installation and initialization scripts. |
| config-templates | Stores configuration file templates. |

File diff suppressed because it is too large


@@ -0,0 +1,22 @@
SELECT log_id, recv_time, vsys_id, assessment_date, lot_number, file_name, assessment_file, assessment_type, features, `size`, file_checksum_sha
FROM tsg_galaxy_v3.assessment_event where recv_time >= toUnixTimestamp('2030-01-01 00:00:00') AND recv_time <toUnixTimestamp('2030-01-01 00:00:01');
SELECT vsys_id, recv_time, log_id, profile_id, rule_id, start_time, end_time, attack_type, severity, conditions, destination_ip, destination_country, source_ip_list, source_country_list, sessions, session_rate, packets, packet_rate, bytes, bit_rate
FROM tsg_galaxy_v3.dos_event where recv_time >= toUnixTimestamp('2030-01-01 00:00:00') AND recv_time <toUnixTimestamp('2030-01-01 00:00:01');
SELECT recv_time, log_id, decoded_as, session_id, start_timestamp_ms, end_timestamp_ms, duration_ms, tcp_handshake_latency_ms, ingestion_time, processing_time, insert_time, device_id, out_link_id, in_link_id, device_tag, data_center, device_group, sled_ip, address_type, direction, vsys_id, t_vsys_id, flags, flags_identify_info, c2s_ttl, s2c_ttl, security_rule_list, security_action, monitor_rule_list, shaping_rule_list, proxy_rule_list, statistics_rule_list, sc_rule_list, sc_rsp_raw, sc_rsp_decrypted, proxy_action, proxy_pinning_status, proxy_intercept_status, proxy_passthrough_reason, proxy_client_side_latency_ms, proxy_server_side_latency_ms, proxy_client_side_version, proxy_server_side_version, proxy_cert_verify, proxy_intercept_error, monitor_mirrored_pkts, monitor_mirrored_bytes, client_ip, client_ip_tags, client_port, client_os_desc, client_geolocation, client_country, client_super_administrative_area, client_administrative_area, client_sub_administrative_area, client_asn, subscriber_id, imei, imsi, phone_number, apn, server_ip, server_ip_tags, server_port, server_os_desc, server_geolocation, server_country, server_super_administrative_area, server_administrative_area, server_sub_administrative_area, server_asn, server_fqdn, server_fqdn_tags, server_domain, app_transition, app, app_category, app_debug_info, app_content, app_extra_info, fqdn_category_list, ip_protocol, decoded_path, dns_message_id, dns_qr, dns_opcode, dns_aa, dns_tc, dns_rd, dns_ra, dns_rcode, dns_qdcount, dns_ancount, dns_nscount, dns_arcount, dns_qname, dns_qtype, dns_qclass, dns_cname, dns_sub, dns_rr, dns_response_latency_ms, http_url, http_host, http_request_line, http_response_line, http_request_body, http_response_body, http_proxy_flag, http_sequence, http_cookie, http_referer, http_user_agent, http_request_content_length, http_request_content_type, http_response_content_length, http_response_content_type, http_set_cookie, http_version, http_status_code, http_response_latency_ms, http_session_duration_ms, http_action_file_size, ssl_version, ssl_sni, ssl_san, ssl_cn, ssl_handshake_latency_ms, ssl_ja3_hash, ssl_ja3s_hash, ssl_cert_issuer, ssl_cert_subject, ssl_esni_flag, ssl_ech_flag, dtls_cookie, dtls_version, dtls_sni, dtls_san, dtls_cn, dtls_handshake_latency_ms, dtls_ja3_fingerprint, dtls_ja3_hash, dtls_cert_issuer, dtls_cert_subject, mail_protocol_type, mail_account, mail_from_cmd, mail_to_cmd, mail_from, mail_password, mail_to, mail_cc, mail_bcc, mail_subject, mail_subject_charset, mail_attachment_name, mail_attachment_name_charset, mail_starttls_flag, mail_eml_file, ftp_account, ftp_url, ftp_link_type, quic_version, quic_sni, quic_user_agent, rdp_cookie, rdp_security_protocol, rdp_client_channels, rdp_keyboard_layout, rdp_client_version, rdp_client_name, rdp_client_product_id, rdp_desktop_width, rdp_desktop_height, rdp_requested_color_depth, rdp_certificate_type, rdp_certificate_count, rdp_certificate_permanent, rdp_encryption_level, rdp_encryption_method, ssh_version, ssh_auth_success, ssh_client_version, ssh_server_version, ssh_cipher_alg, ssh_mac_alg, ssh_compression_alg, ssh_kex_alg, ssh_host_key_alg, ssh_host_key, ssh_hassh, sip_call_id, sip_originator_description, sip_responder_description, sip_user_agent, sip_server, sip_originator_sdp_connect_ip, sip_originator_sdp_media_port, sip_originator_sdp_media_type, sip_originator_sdp_content, sip_responder_sdp_connect_ip, sip_responder_sdp_media_port, sip_responder_sdp_media_type, sip_responder_sdp_content, sip_duration_s, sip_bye, sip_bye_reason, 
rtp_payload_type_c2s, rtp_payload_type_s2c, rtp_pcap_path, rtp_originator_dir, stratum_cryptocurrency, stratum_mining_pools, stratum_mining_program, stratum_mining_subscribe, sent_pkts, received_pkts, sent_bytes, received_bytes, tcp_c2s_ip_fragments, tcp_s2c_ip_fragments, tcp_c2s_lost_bytes, tcp_s2c_lost_bytes, tcp_c2s_o3_pkts, tcp_s2c_o3_pkts, tcp_c2s_rtx_pkts, tcp_s2c_rtx_pkts, tcp_c2s_rtx_bytes, tcp_s2c_rtx_bytes, tcp_rtt_ms, tcp_client_isn, tcp_server_isn, packet_capture_file, in_src_mac, out_src_mac, in_dest_mac, out_dest_mac, encapsulation, dup_traffic_flag, tunnel_id_list, tunnel_endpoint_a_desc, tunnel_endpoint_b_desc
FROM tsg_galaxy_v3.monitor_event where recv_time >= toUnixTimestamp('2030-01-01 00:00:00') AND recv_time <toUnixTimestamp('2030-01-01 00:00:01');
SELECT recv_time, log_id, decoded_as, session_id, start_timestamp_ms, end_timestamp_ms, duration_ms, tcp_handshake_latency_ms, ingestion_time, processing_time, insert_time, device_id, out_link_id, in_link_id, device_tag, data_center, device_group, sled_ip, address_type, direction, vsys_id, t_vsys_id, flags, flags_identify_info, c2s_ttl, s2c_ttl, security_rule_list, security_action, monitor_rule_list, shaping_rule_list, proxy_rule_list, statistics_rule_list, sc_rule_list, sc_rsp_raw, sc_rsp_decrypted, proxy_action, proxy_pinning_status, proxy_intercept_status, proxy_passthrough_reason, proxy_client_side_latency_ms, proxy_server_side_latency_ms, proxy_client_side_version, proxy_server_side_version, proxy_cert_verify, proxy_intercept_error, monitor_mirrored_pkts, monitor_mirrored_bytes, client_ip, client_ip_tags, client_port, client_os_desc, client_geolocation, client_country, client_super_administrative_area, client_administrative_area, client_sub_administrative_area, client_asn, subscriber_id, imei, imsi, phone_number, apn, server_ip, server_ip_tags, server_port, server_os_desc, server_geolocation, server_country, server_super_administrative_area, server_administrative_area, server_sub_administrative_area, server_asn, server_fqdn, server_fqdn_tags, server_domain, app_transition, app, app_category, app_debug_info, app_content, app_extra_info, fqdn_category_list, ip_protocol, decoded_path, http_url, http_host, http_request_line, http_response_line, http_request_body, http_response_body, http_proxy_flag, http_sequence, http_cookie, http_referer, http_user_agent, http_request_content_length, http_request_content_type, http_response_content_length, http_response_content_type, http_set_cookie, http_version, http_status_code, http_response_latency_ms, http_session_duration_ms, http_action_file_size, doh_url, doh_host, doh_request_line, doh_response_line, doh_cookie, doh_referer, doh_user_agent, doh_content_length, doh_content_type, doh_set_cookie, doh_version, doh_message_id, doh_qr, doh_opcode, doh_aa, doh_tc, doh_rd, doh_ra, doh_rcode, doh_qdcount, doh_ancount, doh_nscount, doh_arcount, doh_qname, doh_qtype, doh_qclass, doh_cname, doh_sub, doh_rr, sent_pkts, received_pkts, sent_bytes, received_bytes, tcp_c2s_ip_fragments, tcp_s2c_ip_fragments, tcp_c2s_lost_bytes, tcp_s2c_lost_bytes, tcp_c2s_o3_pkts, tcp_s2c_o3_pkts, tcp_c2s_rtx_pkts, tcp_s2c_rtx_pkts, tcp_c2s_rtx_bytes, tcp_s2c_rtx_bytes, tcp_rtt_ms, tcp_client_isn, tcp_server_isn, packet_capture_file, in_src_mac, out_src_mac, in_dest_mac, out_dest_mac, encapsulation, dup_traffic_flag, tunnel_id_list, tunnel_endpoint_a_desc, tunnel_endpoint_b_desc
FROM tsg_galaxy_v3.proxy_event where recv_time >= toUnixTimestamp('2030-01-01 00:00:00') AND recv_time <toUnixTimestamp('2030-01-01 00:00:01');
SELECT recv_time, log_id, decoded_as, session_id, start_timestamp_ms, end_timestamp_ms, duration_ms, tcp_handshake_latency_ms, ingestion_time, processing_time, insert_time, device_id, out_link_id, in_link_id, device_tag, data_center, device_group, sled_ip, address_type, direction, vsys_id, t_vsys_id, flags, flags_identify_info, c2s_ttl, s2c_ttl, security_rule_list, security_action, monitor_rule_list, sc_rule_list, sc_rsp_raw, sc_rsp_decrypted, shaping_rule_list, proxy_rule_list, statistics_rule_list, proxy_action, proxy_pinning_status, proxy_intercept_status, proxy_passthrough_reason, proxy_client_side_latency_ms, proxy_server_side_latency_ms, proxy_client_side_version, proxy_server_side_version, proxy_cert_verify, proxy_intercept_error, monitor_mirrored_pkts, monitor_mirrored_bytes, client_ip, client_ip_tags, client_port, client_os_desc, client_geolocation, client_country, client_super_administrative_area, client_administrative_area, client_sub_administrative_area, client_asn, subscriber_id, imei, imsi, phone_number, apn, server_ip, server_ip_tags, server_port, server_os_desc, server_geolocation, server_country, server_super_administrative_area, server_administrative_area, server_sub_administrative_area, server_asn, server_fqdn, server_fqdn_tags, server_domain, app_transition, app, app_category, app_debug_info, app_content, app_extra_info, fqdn_category_list, ip_protocol, decoded_path, dns_message_id, dns_qr, dns_opcode, dns_aa, dns_tc, dns_rd, dns_ra, dns_rcode, dns_qdcount, dns_ancount, dns_nscount, dns_arcount, dns_qname, dns_qtype, dns_qclass, dns_cname, dns_sub, dns_rr, dns_response_latency_ms, http_url, http_host, http_request_line, http_response_line, http_request_body, http_response_body, http_proxy_flag, http_sequence, http_cookie, http_referer, http_user_agent, http_request_content_length, http_request_content_type, http_response_content_length, http_response_content_type, http_set_cookie, http_version, http_status_code, http_response_latency_ms, http_session_duration_ms, http_action_file_size, ssl_version, ssl_sni, ssl_san, ssl_cn, ssl_handshake_latency_ms, ssl_ja3_hash, ssl_ja3s_hash, ssl_cert_issuer, ssl_cert_subject, ssl_esni_flag, ssl_ech_flag, dtls_cookie, dtls_version, dtls_sni, dtls_san, dtls_cn, dtls_handshake_latency_ms, dtls_ja3_fingerprint, dtls_ja3_hash, dtls_cert_issuer, dtls_cert_subject, mail_protocol_type, mail_account, mail_from_cmd, mail_to_cmd, mail_from, mail_password, mail_to, mail_cc, mail_bcc, mail_subject, mail_subject_charset, mail_attachment_name, mail_attachment_name_charset, mail_starttls_flag, mail_eml_file, ftp_account, ftp_url, ftp_link_type, quic_version, quic_sni, quic_user_agent, rdp_cookie, rdp_security_protocol, rdp_client_channels, rdp_keyboard_layout, rdp_client_version, rdp_client_name, rdp_client_product_id, rdp_desktop_width, rdp_desktop_height, rdp_requested_color_depth, rdp_certificate_type, rdp_certificate_count, rdp_certificate_permanent, rdp_encryption_level, rdp_encryption_method, ssh_version, ssh_auth_success, ssh_client_version, ssh_server_version, ssh_cipher_alg, ssh_mac_alg, ssh_compression_alg, ssh_kex_alg, ssh_host_key_alg, ssh_host_key, ssh_hassh, sip_call_id, sip_originator_description, sip_responder_description, sip_user_agent, sip_server, sip_originator_sdp_connect_ip, sip_originator_sdp_media_port, sip_originator_sdp_media_type, sip_originator_sdp_content, sip_responder_sdp_connect_ip, sip_responder_sdp_media_port, sip_responder_sdp_media_type, sip_responder_sdp_content, sip_duration_s, sip_bye, sip_bye_reason, 
rtp_payload_type_c2s, rtp_payload_type_s2c, rtp_pcap_path, rtp_originator_dir, stratum_cryptocurrency, stratum_mining_pools, stratum_mining_program, stratum_mining_subscribe, sent_pkts, received_pkts, sent_bytes, received_bytes, tcp_c2s_ip_fragments, tcp_s2c_ip_fragments, tcp_c2s_lost_bytes, tcp_s2c_lost_bytes, tcp_c2s_o3_pkts, tcp_s2c_o3_pkts, tcp_c2s_rtx_pkts, tcp_s2c_rtx_pkts, tcp_c2s_rtx_bytes, tcp_s2c_rtx_bytes, tcp_rtt_ms, tcp_client_isn, tcp_server_isn, packet_capture_file, in_src_mac, out_src_mac, in_dest_mac, out_dest_mac, encapsulation, dup_traffic_flag, tunnel_id_list, tunnel_endpoint_a_desc, tunnel_endpoint_b_desc
FROM tsg_galaxy_v3.security_event where recv_time >= toUnixTimestamp('2030-01-01 00:00:00') AND recv_time <toUnixTimestamp('2030-01-01 00:00:01');
SELECT recv_time, log_id, decoded_as, session_id, start_timestamp_ms, end_timestamp_ms, duration_ms, tcp_handshake_latency_ms, ingestion_time, processing_time, insert_time, device_id, out_link_id, in_link_id, device_tag, data_center, device_group, sled_ip, address_type, direction, vsys_id, t_vsys_id, flags, flags_identify_info, c2s_ttl, s2c_ttl, security_rule_list, security_action, monitor_rule_list, sc_rule_list, sc_rsp_raw, sc_rsp_decrypted, shaping_rule_list, proxy_rule_list, statistics_rule_list, proxy_action, proxy_pinning_status, proxy_intercept_status, proxy_passthrough_reason, proxy_client_side_latency_ms, proxy_server_side_latency_ms, proxy_client_side_version, proxy_server_side_version, proxy_cert_verify, proxy_intercept_error, monitor_mirrored_pkts, monitor_mirrored_bytes, client_ip, client_ip_tags, client_port, client_os_desc, client_geolocation, client_country, client_super_administrative_area, client_administrative_area, client_sub_administrative_area, client_asn, subscriber_id, imei, imsi, phone_number, apn, server_ip, server_ip_tags, server_port, server_os_desc, server_geolocation, server_country, server_super_administrative_area, server_administrative_area, server_sub_administrative_area, server_asn, server_fqdn, server_fqdn_tags, server_domain, app_transition, app, app_category, app_debug_info, app_content, app_extra_info, fqdn_category_list, ip_protocol, decoded_path, dns_message_id, dns_qr, dns_opcode, dns_aa, dns_tc, dns_rd, dns_ra, dns_rcode, dns_qdcount, dns_ancount, dns_nscount, dns_arcount, dns_qname, dns_qtype, dns_qclass, dns_cname, dns_sub, dns_rr, dns_response_latency_ms, http_url, http_host, http_request_line, http_response_line, http_request_body, http_response_body, http_proxy_flag, http_sequence, http_cookie, http_referer, http_user_agent, http_request_content_length, http_request_content_type, http_response_content_length, http_response_content_type, http_set_cookie, http_version, http_status_code, http_response_latency_ms, http_session_duration_ms, http_action_file_size, ssl_version, ssl_sni, ssl_san, ssl_cn, ssl_handshake_latency_ms, ssl_ja3_hash, ssl_ja3s_hash, ssl_cert_issuer, ssl_cert_subject, ssl_esni_flag, ssl_ech_flag, dtls_cookie, dtls_version, dtls_sni, dtls_san, dtls_cn, dtls_handshake_latency_ms, dtls_ja3_fingerprint, dtls_ja3_hash, dtls_cert_issuer, dtls_cert_subject, mail_protocol_type, mail_account, mail_from_cmd, mail_to_cmd, mail_from, mail_password, mail_to, mail_cc, mail_bcc, mail_subject, mail_subject_charset, mail_attachment_name, mail_attachment_name_charset, mail_starttls_flag, mail_eml_file, ftp_account, ftp_url, ftp_link_type, quic_version, quic_sni, quic_user_agent, rdp_cookie, rdp_security_protocol, rdp_client_channels, rdp_keyboard_layout, rdp_client_version, rdp_client_name, rdp_client_product_id, rdp_desktop_width, rdp_desktop_height, rdp_requested_color_depth, rdp_certificate_type, rdp_certificate_count, rdp_certificate_permanent, rdp_encryption_level, rdp_encryption_method, ssh_version, ssh_auth_success, ssh_client_version, ssh_server_version, ssh_cipher_alg, ssh_mac_alg, ssh_compression_alg, ssh_kex_alg, ssh_host_key_alg, ssh_host_key, ssh_hassh, sip_call_id, sip_originator_description, sip_responder_description, sip_user_agent, sip_server, sip_originator_sdp_connect_ip, sip_originator_sdp_media_port, sip_originator_sdp_media_type, sip_originator_sdp_content, sip_responder_sdp_connect_ip, sip_responder_sdp_media_port, sip_responder_sdp_media_type, sip_responder_sdp_content, sip_duration_s, sip_bye, sip_bye_reason, 
rtp_payload_type_c2s, rtp_payload_type_s2c, rtp_pcap_path, rtp_originator_dir, stratum_cryptocurrency, stratum_mining_pools, stratum_mining_program, stratum_mining_subscribe, sent_pkts, received_pkts, sent_bytes, received_bytes, tcp_c2s_ip_fragments, tcp_s2c_ip_fragments, tcp_c2s_lost_bytes, tcp_s2c_lost_bytes, tcp_c2s_o3_pkts, tcp_s2c_o3_pkts, tcp_c2s_rtx_pkts, tcp_s2c_rtx_pkts, tcp_c2s_rtx_bytes, tcp_s2c_rtx_bytes, tcp_rtt_ms, tcp_client_isn, tcp_server_isn, packet_capture_file, in_src_mac, out_src_mac, in_dest_mac, out_dest_mac, encapsulation, dup_traffic_flag, tunnel_id_list, tunnel_endpoint_a_desc, tunnel_endpoint_b_desc
FROM tsg_galaxy_v3.session_record where recv_time >= toUnixTimestamp('2030-01-01 00:00:00') AND recv_time <toUnixTimestamp('2030-01-01 00:00:01');
SELECT recv_time, log_id, decoded_as, session_id, ingestion_time, processing_time, insert_time, address_type, vsys_id, client_ip, client_port, server_ip, server_port, sent_pkts, received_pkts, sent_bytes, received_bytes, dns_message_id, dns_qr, dns_opcode, dns_aa, dns_tc, dns_rd, dns_ra, dns_rcode, dns_qdcount, dns_ancount, dns_nscount, dns_arcount, dns_qname, dns_qtype, dns_qclass, dns_cname, dns_sub, dns_rr, dns_response_latency_ms, http_url, http_host, http_request_line, http_response_line, http_request_body, http_response_body, http_proxy_flag, http_sequence, http_cookie, http_referer, http_user_agent, http_request_content_length, http_request_content_type, http_response_content_length, http_response_content_type, http_set_cookie, http_version, http_status_code, http_response_latency_ms, http_session_duration_ms, http_action_file_size, mail_protocol_type, mail_account, mail_from_cmd, mail_to_cmd, mail_from, mail_password, mail_to, mail_cc, mail_bcc, mail_subject, mail_subject_charset, mail_attachment_name, mail_attachment_name_charset, mail_starttls_flag, mail_eml_file, sip_call_id, sip_originator_description, sip_responder_description, sip_user_agent, sip_server, sip_originator_sdp_connect_ip, sip_originator_sdp_media_port, sip_originator_sdp_media_type, sip_originator_sdp_content, sip_responder_sdp_connect_ip, sip_responder_sdp_media_port, sip_responder_sdp_media_type, sip_responder_sdp_content, sip_duration_s, sip_bye, sip_bye_reason
FROM tsg_galaxy_v3.transaction_record where recv_time >= toUnixTimestamp('2030-01-01 00:00:00') AND recv_time <toUnixTimestamp('2030-01-01 00:00:01');
SELECT recv_time, log_id, decoded_as, session_id, start_timestamp_ms, end_timestamp_ms, duration_ms, tcp_handshake_latency_ms, ingestion_time, processing_time, insert_time, device_id, out_link_id, in_link_id, device_tag, data_center, device_group, sled_ip, address_type, direction, vsys_id, t_vsys_id, flags, flags_identify_info, client_ip, client_port, client_os_desc, client_geolocation, client_country, client_super_administrative_area, client_administrative_area, client_sub_administrative_area, client_asn, server_ip, server_port, server_os_desc, server_geolocation, server_country, server_super_administrative_area, server_administrative_area, server_sub_administrative_area, server_asn, ip_protocol, sip_call_id, sip_originator_description, sip_responder_description, sip_user_agent, sip_server, sip_originator_sdp_connect_ip, sip_originator_sdp_media_port, sip_originator_sdp_media_type, sip_originator_sdp_content, sip_responder_sdp_connect_ip, sip_responder_sdp_media_port, sip_responder_sdp_media_type, sip_responder_sdp_content, sip_duration_s, sip_bye, sip_bye_reason, rtp_payload_type_c2s, rtp_payload_type_s2c, rtp_pcap_path, rtp_originator_dir, sent_pkts, received_pkts, sent_bytes, received_bytes
FROM tsg_galaxy_v3.voip_record where recv_time >= toUnixTimestamp('2030-01-01 00:00:00') AND recv_time <toUnixTimestamp('2030-01-01 00:00:01');
SELECT log_id, recv_time, vsys_id, timestamp_us, egress_action, job_id, sled_ip, device_group, traffic_link_id, source_ip, source_port, destination_ip, destination_port, packet, packet_length, measurements
FROM tsg_galaxy_v3.datapath_telemetry_record where recv_time >= toUnixTimestamp('2030-01-01 00:00:00') AND recv_time <toUnixTimestamp('2030-01-01 00:00:01');
SELECT log_id, recv_time, vsys_id, device_id, device_group, data_center, direction, ip_protocol, client_ip, server_ip, internal_ip, external_ip, client_country, server_country, client_asn, server_asn, server_fqdn, server_domain, app, app_category, c2s_ttl, s2c_ttl, c2s_link_id, s2c_link_id, sessions, bytes, sent_bytes, received_bytes, pkts, sent_pkts, received_pkts, asymmetric_c2s_flows, asymmetric_s2c_flows, c2s_fragments, s2c_fragments, c2s_tcp_lost_bytes, s2c_tcp_lost_bytes, c2s_tcp_retransmitted_pkts, s2c_tcp_retransmitted_pkts
FROM tsg_galaxy_v3.traffic_sketch_metric where recv_time >= toUnixTimestamp('2030-01-01 00:00:00') AND recv_time <toUnixTimestamp('2030-01-01 00:00:01');
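
These statements read as smoke-test queries that exercise every column of the ClickHouse tables (the one-second window in 2030 keeps the result sets empty). A minimal sketch of running them in batch, assuming they are saved in a file such as `verify_queries.sql`; the host and user below are placeholders, not values from this commit:

```
# Hypothetical invocation: file name, host and user are assumptions.
clickhouse-client \
  --host 192.168.44.12 \
  --user default \
  --ask-password \
  --multiquery < verify_queries.sql
```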


@@ -0,0 +1,50 @@
flink.job.name={{ job_name }}
#Kafka source configuration
#Port 9092: no authentication, 9095: SSL, 9094: SASL
source.kafka.broker={{ kafka_source_servers }}
source.kafka.topic={{ kafka_source_topic }}
source.kafka.group.id={{ kafka_source_group_id }}
#earliest: consume from the beginning, latest: only new messages
source.kafka.auto.offset.reset=latest
source.kafka.session.timeout.ms=60000
#Maximum number of records fetched from a partition per poll
source.kafka.max.poll.records=1000
#Maximum number of bytes the consumer fetches from a single partition at once
source.kafka.max.partition.fetch.bytes=31457280
source.kafka.enable.auto.commit=true
#Kafka SASL username
source.kafka.user=olap
#Kafka SASL and SSL password
source.kafka.pin=galaxy2024
#Required for SSL
source.kafka.tools.library={{ deploy_dir }}/flink/topology/data/
map.filter.expression=FileChunk.offset <= 1073741824
#Window configuration
combiner.window.parallelism={{ combiner_window_parallelism }}
#Window size, in seconds
combiner.window.size=10
#Sink parameters
sink.parallelism={{ hos_sink_parallelism }}
#Options: hos, oss, hbase
sink.type=hos
sink.async=false
#HOS sink configuration
#For nginx or a single HOS instance use ip:port; for multiple HOS instances use ip1:port,ip2:port,...
sink.hos.endpoint={{ hos_sink_servers }}
sink.hos.bucket={{ hos_sink_bucket }}
sink.hos.token={{ hos_token }}
sink.hos.batch.size=1048576
sink.hos.batch.interval.ms=10000
#HTTP client configuration
sink.http.client.retries.number=3
sink.http.client.max.total=20
sink.http.client.max.per.route=10
sink.http.client.connect.timeout.ms=10000
sink.http.client.request.timeout.ms=10000
sink.http.client.socket.timeout.ms=60000
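
The `{{ ... }}` placeholders indicate this properties file is rendered by a templating step at deploy time; the concrete files below look like rendered copies of it. Purely as an illustration of that substitution, a sed sketch using values from the standalone EML combiner config; the template file name is an assumption, and the real deployment presumably uses its own tooling (e.g. Ansible):

```
# Illustrative only: shows which values get substituted into the template.
sed -e 's|{{ job_name }}|agg_traffic_eml_file_chunk_combiner|' \
    -e 's|{{ kafka_source_servers }}|192.168.44.12:9092|' \
    -e 's|{{ kafka_source_topic }}|TRAFFIC-EML-FILE-STREAM-RECORD|' \
    -e 's|{{ kafka_source_group_id }}|agg_traffic_eml_file_chunk_combiner_1|' \
    -e 's|{{ combiner_window_parallelism }}|1|' \
    -e 's|{{ hos_sink_parallelism }}|1|' \
    -e 's|{{ hos_sink_servers }}|192.168.44.12:8186|' \
    -e 's|{{ hos_sink_bucket }}|traffic_eml_file_bucket|' \
    -e 's|{{ hos_token }}|c21f969b5f03d33d43e04f8f136e7682|' \
    file_chunk_combiner.properties.j2 > file_chunk_combiner.properties
```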


@@ -0,0 +1,50 @@
flink.job.name=agg_traffic_eml_file_chunk_combiner
#Kafka source configuration
#Port 9092: no authentication, 9095: SSL, 9094: SASL
source.kafka.broker=192.168.44.12:9092
source.kafka.topic=TRAFFIC-EML-FILE-STREAM-RECORD
source.kafka.group.id=agg_traffic_eml_file_chunk_combiner_1
#earliest: consume from the beginning, latest: only new messages
source.kafka.auto.offset.reset=latest
source.kafka.session.timeout.ms=60000
#Maximum number of records fetched from a partition per poll
source.kafka.max.poll.records=1000
#Maximum number of bytes the consumer fetches from a single partition at once
source.kafka.max.partition.fetch.bytes=31457280
source.kafka.enable.auto.commit=true
#Kafka SASL username
source.kafka.user=olap
#Kafka SASL and SSL password
source.kafka.pin=galaxy2024
#Required for SSL
source.kafka.tools.library=/opt/tsg/olap/topology/data/
map.filter.expression=FileChunk.offset <= 1073741824
#Window configuration
combiner.window.parallelism=1
#Window size, in seconds
combiner.window.size=10
#Sink parameters
sink.parallelism=1
#Options: hos, oss, hbase
sink.type=hos
sink.async=false
#HOS sink configuration
#For nginx or a single HOS instance use ip:port; for multiple HOS instances use ip1:port,ip2:port,...
sink.hos.endpoint=192.168.44.12:8186
sink.hos.bucket=traffic_eml_file_bucket
sink.hos.token=c21f969b5f03d33d43e04f8f136e7682
sink.hos.batch.size=1048576
sink.hos.batch.interval.ms=10000
#HTTP client configuration
sink.http.client.retries.number=3
sink.http.client.max.total=20
sink.http.client.max.per.route=10
sink.http.client.connect.timeout.ms=10000
sink.http.client.request.timeout.ms=10000
sink.http.client.socket.timeout.ms=60000


@@ -0,0 +1,50 @@
flink.job.name=agg_traffic_http_file_chunk_combiner
#Kafka source configuration
#Port 9092: no authentication, 9095: SSL, 9094: SASL
source.kafka.broker=192.168.44.12:9092
source.kafka.topic=TRAFFIC-HTTP-FILE-STREAM-RECORD
source.kafka.group.id=agg_traffic_http_file_chunk_combiner_1
#earliest: consume from the beginning, latest: only new messages
source.kafka.auto.offset.reset=latest
source.kafka.session.timeout.ms=60000
#Maximum number of records fetched from a partition per poll
source.kafka.max.poll.records=1000
#Maximum number of bytes the consumer fetches from a single partition at once
source.kafka.max.partition.fetch.bytes=31457280
source.kafka.enable.auto.commit=true
#Kafka SASL username
source.kafka.user=olap
#Kafka SASL and SSL password
source.kafka.pin=galaxy2024
#Required for SSL
source.kafka.tools.library=/opt/tsg/olap/topology/data/
map.filter.expression=FileChunk.offset <= 1073741824
#Window configuration
combiner.window.parallelism=3
#Window size, in seconds
combiner.window.size=10
#Sink parameters
sink.parallelism=3
#Options: hos, oss, hbase
sink.type=hos
sink.async=false
#HOS sink configuration
#For nginx or a single HOS instance use ip:port; for multiple HOS instances use ip1:port,ip2:port,...
sink.hos.endpoint=192.168.44.12:8186
sink.hos.bucket=traffic_http_file_bucket
sink.hos.token=c21f969b5f03d33d43e04f8f136e7682
sink.hos.batch.size=1048576
sink.hos.batch.interval.ms=10000
#HTTP client configuration
sink.http.client.retries.number=3
sink.http.client.max.total=20
sink.http.client.max.per.route=10
sink.http.client.connect.timeout.ms=10000
sink.http.client.request.timeout.ms=10000
sink.http.client.socket.timeout.ms=60000


@@ -0,0 +1,50 @@
flink.job.name=agg_traffic_policy_capture_file_chunk_combiner
#Kafka source configuration
#Port 9092: no authentication, 9095: SSL, 9094: SASL
source.kafka.broker=192.168.44.12:9092
source.kafka.topic=TRAFFIC-POLICY-CAPTURE-FILE-STREAM-RECORD
source.kafka.group.id=agg_traffic_policy_capture_file_chunk_combiner_1
#earliest: consume from the beginning, latest: only new messages
source.kafka.auto.offset.reset=latest
source.kafka.session.timeout.ms=60000
#Maximum number of records fetched from a partition per poll
source.kafka.max.poll.records=1000
#Maximum number of bytes the consumer fetches from a single partition at once
source.kafka.max.partition.fetch.bytes=31457280
source.kafka.enable.auto.commit=true
#Kafka SASL username
source.kafka.user=olap
#Kafka SASL and SSL password
source.kafka.pin=galaxy2024
#Required for SSL
source.kafka.tools.library=/opt/tsg/olap/topology/data/
map.filter.expression=FileChunk.offset <= 1073741824
#Window configuration
combiner.window.parallelism=3
#Window size, in seconds
combiner.window.size=10
#Sink parameters
sink.parallelism=3
#Options: hos, oss, hbase
sink.type=hos
sink.async=false
#HOS sink configuration
#For nginx or a single HOS instance use ip:port; for multiple HOS instances use ip1:port,ip2:port,...
sink.hos.endpoint=192.168.44.12:8186
sink.hos.bucket=traffic_policy_capture_file_bucket
sink.hos.token=c21f969b5f03d33d43e04f8f136e7682
sink.hos.batch.size=1048576
sink.hos.batch.interval.ms=10000
#HTTP client configuration
sink.http.client.retries.number=3
sink.http.client.max.total=20
sink.http.client.max.per.route=10
sink.http.client.connect.timeout.ms=10000
sink.http.client.request.timeout.ms=10000
sink.http.client.socket.timeout.ms=60000


@@ -0,0 +1,50 @@
flink.job.name=agg_traffic_rtp_file_chunk_combiner
#Kafka source configuration
#Port 9092: no authentication, 9095: SSL, 9094: SASL
source.kafka.broker=192.168.44.12:9092
source.kafka.topic=TRAFFIC-RTP-FILE-STREAM-RECORD
source.kafka.group.id=agg_traffic_rtp_file_chunk_combiner_1
#earliest: consume from the beginning, latest: only new messages
source.kafka.auto.offset.reset=latest
source.kafka.session.timeout.ms=60000
#Maximum number of records fetched from a partition per poll
source.kafka.max.poll.records=1000
#Maximum number of bytes the consumer fetches from a single partition at once
source.kafka.max.partition.fetch.bytes=31457280
source.kafka.enable.auto.commit=true
#Kafka SASL username
source.kafka.user=olap
#Kafka SASL and SSL password
source.kafka.pin=galaxy2024
#Required for SSL
source.kafka.tools.library=/opt/tsg/olap/topology/data/
map.filter.expression=FileChunk.offset <= 1073741824
#Window configuration
combiner.window.parallelism=3
#Window size, in seconds
combiner.window.size=10
#Sink parameters
sink.parallelism=3
#Options: hos, oss, hbase
sink.type=hos
sink.async=false
#HOS sink configuration
#For nginx or a single HOS instance use ip:port; for multiple HOS instances use ip1:port,ip2:port,...
sink.hos.endpoint=192.168.44.12:8186
sink.hos.bucket=traffic_rtp_file_bucket
sink.hos.token=c21f969b5f03d33d43e04f8f136e7682
sink.hos.batch.size=1048576
sink.hos.batch.interval.ms=10000
#HTTP client configuration
sink.http.client.retries.number=3
sink.http.client.max.total=20
sink.http.client.max.per.route=10
sink.http.client.connect.timeout.ms=10000
sink.http.client.request.timeout.ms=10000
sink.http.client.socket.timeout.ms=60000


@@ -0,0 +1,14 @@
#!/usr/bin/env bash
export MAIN_CLASS="com.zdjizhi.FileChunkCombiner"
export PARALLELISM="1"
export TASK_MODE="yarn-session"
export FLINK_JOB_OPTS="
-Djobmanager.memory.process.size=1024m
-Dtaskmanager.memory.process.size=1024m
-Dtaskmanager.numberOfTaskSlots=1
-Dtaskmanager.memory.framework.off-heap.size=256m
-Dtaskmanager.memory.jvm-metaspace.size=128m
"


@@ -0,0 +1,14 @@
#!/usr/bin/env bash
export MAIN_CLASS="com.zdjizhi.FileChunkCombiner"
export PARALLELISM="3"
export TASK_MODE="yarn-session"
export FLINK_JOB_OPTS="
-Djobmanager.memory.process.size=1024m
-Dtaskmanager.memory.process.size=2048m
-Dtaskmanager.numberOfTaskSlots=3
-Dtaskmanager.memory.framework.off-heap.size=256m
-Dtaskmanager.memory.jvm-metaspace.size=128m
"


@@ -0,0 +1,14 @@
#!/usr/bin/env bash
export MAIN_CLASS="com.zdjizhi.FileChunkCombiner"
export PARALLELISM="3"
export TASK_MODE="yarn-session"
export FLINK_JOB_OPTS="
-Djobmanager.memory.process.size=1024m
-Dtaskmanager.memory.process.size=2048m
-Dtaskmanager.numberOfTaskSlots=3
-Dtaskmanager.memory.framework.off-heap.size=256m
-Dtaskmanager.memory.jvm-metaspace.size=128m
"


@@ -0,0 +1,14 @@
#!/usr/bin/env bash
export MAIN_CLASS="com.zdjizhi.FileChunkCombiner"
export PARALLELISM="3"
export TASK_MODE="yarn-session"
export FLINK_JOB_OPTS="
-Djobmanager.memory.process.size=1024m
-Dtaskmanager.memory.process.size=2048m
-Dtaskmanager.numberOfTaskSlots=3
-Dtaskmanager.memory.framework.off-heap.size=256m
-Dtaskmanager.memory.jvm-metaspace.size=128m
"


@@ -0,0 +1,50 @@
flink.job.name=agg_traffic_eml_file_chunk_combiner
#Kafka source configuration
#Port 9092: no authentication, 9095: SSL, 9094: SASL
source.kafka.broker=192.168.44.11:9094,192.168.44.14:9094,192.168.44.15:9094
source.kafka.topic=TRAFFIC-EML-FILE-STREAM-RECORD
source.kafka.group.id=agg_traffic_eml_file_chunk_combiner_1
#earliest: consume from the beginning, latest: only new messages
source.kafka.auto.offset.reset=latest
source.kafka.session.timeout.ms=60000
#Maximum number of records fetched from a partition per poll
source.kafka.max.poll.records=1000
#Maximum number of bytes the consumer fetches from a single partition at once
source.kafka.max.partition.fetch.bytes=31457280
source.kafka.enable.auto.commit=true
#Kafka SASL username
source.kafka.user=olap
#Kafka SASL and SSL password
source.kafka.pin=galaxy2024
#Required for SSL
source.kafka.tools.library=/opt/tsg/olap/topology/data/
map.filter.expression=FileChunk.offset <= 1073741824
#Window configuration
combiner.window.parallelism=1
#Window size, in seconds
combiner.window.size=10
#Sink parameters
sink.parallelism=1
#Options: hos, oss, hbase
sink.type=hos
sink.async=false
#HOS sink configuration
#For nginx or a single HOS instance use ip:port; for multiple HOS instances use ip1:port,ip2:port,...
sink.hos.endpoint=192.168.44.11:8186,192.168.44.14:8186
sink.hos.bucket=traffic_eml_file_bucket
sink.hos.token=c21f969b5f03d33d43e04f8f136e7682
sink.hos.batch.size=1048576
sink.hos.batch.interval.ms=10000
#HTTP client configuration
sink.http.client.retries.number=3
sink.http.client.max.total=20
sink.http.client.max.per.route=10
sink.http.client.connect.timeout.ms=10000
sink.http.client.request.timeout.ms=10000
sink.http.client.socket.timeout.ms=60000


@@ -0,0 +1,50 @@
flink.job.name=agg_traffic_http_file_chunk_combiner
#Kafka source configuration
#Port 9092: no authentication, 9095: SSL, 9094: SASL
source.kafka.broker=192.168.44.11:9094,192.168.44.14:9094,192.168.44.15:9094
source.kafka.topic=TRAFFIC-HTTP-FILE-STREAM-RECORD
source.kafka.group.id=agg_traffic_http_file_chunk_combiner_1
#earliest: consume from the beginning, latest: only new messages
source.kafka.auto.offset.reset=latest
source.kafka.session.timeout.ms=60000
#Maximum number of records fetched from a partition per poll
source.kafka.max.poll.records=1000
#Maximum number of bytes the consumer fetches from a single partition at once
source.kafka.max.partition.fetch.bytes=31457280
source.kafka.enable.auto.commit=true
#Kafka SASL username
source.kafka.user=olap
#Kafka SASL and SSL password
source.kafka.pin=galaxy2024
#Required for SSL
source.kafka.tools.library=/opt/tsg/olap/topology/data/
map.filter.expression=FileChunk.offset <= 1073741824
#Window configuration
combiner.window.parallelism=3
#Window size, in seconds
combiner.window.size=10
#Sink parameters
sink.parallelism=3
#Options: hos, oss, hbase
sink.type=hos
sink.async=false
#HOS sink configuration
#For nginx or a single HOS instance use ip:port; for multiple HOS instances use ip1:port,ip2:port,...
sink.hos.endpoint=192.168.44.11:8186,192.168.44.14:8186
sink.hos.bucket=traffic_http_file_bucket
sink.hos.token=c21f969b5f03d33d43e04f8f136e7682
sink.hos.batch.size=1048576
sink.hos.batch.interval.ms=10000
#HTTP client configuration
sink.http.client.retries.number=3
sink.http.client.max.total=20
sink.http.client.max.per.route=10
sink.http.client.connect.timeout.ms=10000
sink.http.client.request.timeout.ms=10000
sink.http.client.socket.timeout.ms=60000


@@ -0,0 +1,50 @@
flink.job.name=agg_traffic_policy_capture_file_chunk_combiner
#Kafka source configuration
#Port 9092: no authentication, 9095: SSL, 9094: SASL
source.kafka.broker=192.168.44.11:9094,192.168.44.14:9094,192.168.44.15:9094
source.kafka.topic=TRAFFIC-POLICY-CAPTURE-FILE-STREAM-RECORD
source.kafka.group.id=agg_traffic_policy_capture_file_chunk_combiner_1
#earliest: consume from the beginning, latest: only new messages
source.kafka.auto.offset.reset=latest
source.kafka.session.timeout.ms=60000
#Maximum number of records fetched from a partition per poll
source.kafka.max.poll.records=1000
#Maximum number of bytes the consumer fetches from a single partition at once
source.kafka.max.partition.fetch.bytes=31457280
source.kafka.enable.auto.commit=true
#Kafka SASL username
source.kafka.user=olap
#Kafka SASL and SSL password
source.kafka.pin=galaxy2024
#Required for SSL
source.kafka.tools.library=/opt/tsg/olap/topology/data/
map.filter.expression=FileChunk.offset <= 1073741824
#Window configuration
combiner.window.parallelism=3
#Window size, in seconds
combiner.window.size=10
#Sink parameters
sink.parallelism=3
#Options: hos, oss, hbase
sink.type=hos
sink.async=false
#HOS sink configuration
#For nginx or a single HOS instance use ip:port; for multiple HOS instances use ip1:port,ip2:port,...
sink.hos.endpoint=192.168.44.11:8186,192.168.44.14:8186
sink.hos.bucket=traffic_policy_capture_file_bucket
sink.hos.token=c21f969b5f03d33d43e04f8f136e7682
sink.hos.batch.size=1048576
sink.hos.batch.interval.ms=10000
#HTTP client configuration
sink.http.client.retries.number=3
sink.http.client.max.total=20
sink.http.client.max.per.route=10
sink.http.client.connect.timeout.ms=10000
sink.http.client.request.timeout.ms=10000
sink.http.client.socket.timeout.ms=60000


@@ -0,0 +1,50 @@
flink.job.name=agg_traffic_rtp_file_chunk_combiner
#Kafka source configuration
#Port 9092: no authentication, 9095: SSL, 9094: SASL
source.kafka.broker=192.168.44.11:9094,192.168.44.14:9094,192.168.44.15:9094
source.kafka.topic=TRAFFIC-RTP-FILE-STREAM-RECORD
source.kafka.group.id=agg_traffic_rtp_file_chunk_combiner_1
#earliest: consume from the beginning, latest: only new messages
source.kafka.auto.offset.reset=latest
source.kafka.session.timeout.ms=60000
#Maximum number of records fetched from a partition per poll
source.kafka.max.poll.records=1000
#Maximum number of bytes the consumer fetches from a single partition at once
source.kafka.max.partition.fetch.bytes=31457280
source.kafka.enable.auto.commit=true
#Kafka SASL username
source.kafka.user=olap
#Kafka SASL and SSL password
source.kafka.pin=galaxy2024
#Required for SSL
source.kafka.tools.library=/opt/tsg/olap/topology/data/
map.filter.expression=FileChunk.offset <= 1073741824
#Window configuration
combiner.window.parallelism=3
#Window size, in seconds
combiner.window.size=10
#Sink parameters
sink.parallelism=3
#Options: hos, oss, hbase
sink.type=hos
sink.async=false
#HOS sink configuration
#For nginx or a single HOS instance use ip:port; for multiple HOS instances use ip1:port,ip2:port,...
sink.hos.endpoint=192.168.44.11:8186,192.168.44.14:8186
sink.hos.bucket=traffic_rtp_file_bucket
sink.hos.token=c21f969b5f03d33d43e04f8f136e7682
sink.hos.batch.size=1048576
sink.hos.batch.interval.ms=10000
#HTTP client configuration
sink.http.client.retries.number=3
sink.http.client.max.total=20
sink.http.client.max.per.route=10
sink.http.client.connect.timeout.ms=10000
sink.http.client.request.timeout.ms=10000
sink.http.client.socket.timeout.ms=60000


@@ -0,0 +1,14 @@
#!/usr/bin/env bash
export MAIN_CLASS="com.zdjizhi.FileChunkCombiner"
export PARALLELISM="1"
export TASK_MODE="yarn-per-job"
export FLINK_JOB_OPTS="
-Djobmanager.memory.process.size=1024m
-Dtaskmanager.memory.process.size=1024m
-Dtaskmanager.numberOfTaskSlots=1
-Dtaskmanager.memory.framework.off-heap.size=256m
-Dtaskmanager.memory.jvm-metaspace.size=128m
"


@@ -0,0 +1,14 @@
#!/usr/bin/env bash
export MAIN_CLASS="com.zdjizhi.FileChunkCombiner"
export PARALLELISM="3"
export TASK_MODE="yarn-per-job"
export FLINK_JOB_OPTS="
-Djobmanager.memory.process.size=1024m
-Dtaskmanager.memory.process.size=2048m
-Dtaskmanager.numberOfTaskSlots=3
-Dtaskmanager.memory.framework.off-heap.size=256m
-Dtaskmanager.memory.jvm-metaspace.size=128m
"


@@ -0,0 +1,14 @@
#!/usr/bin/env bash
export MAIN_CLASS="com.zdjizhi.FileChunkCombiner"
export PARALLELISM="3"
export TASK_MODE="yarn-per-job"
export FLINK_JOB_OPTS="
-Djobmanager.memory.process.size=1024m
-Dtaskmanager.memory.process.size=2048m
-Dtaskmanager.numberOfTaskSlots=3
-Dtaskmanager.memory.framework.off-heap.size=256m
-Dtaskmanager.memory.jvm-metaspace.size=128m
"


@@ -0,0 +1,14 @@
#!/usr/bin/env bash
export MAIN_CLASS="com.zdjizhi.FileChunkCombiner"
export PARALLELISM="3"
export TASK_MODE="yarn-per-job"
export FLINK_JOB_OPTS="
-Djobmanager.memory.process.size=1024m
-Dtaskmanager.memory.process.size=2048m
-Dtaskmanager.numberOfTaskSlots=3
-Dtaskmanager.memory.framework.off-heap.size=256m
-Dtaskmanager.memory.jvm-metaspace.size=128m
"


@@ -0,0 +1,4 @@
grootstream:
properties:
hos.path: http://192.168.44.12:9098/hos
scheduler.knowledge_base.update.interval.minutes: 5


@@ -0,0 +1,54 @@
sources:
kafka_source:
type: kafka
properties:
topic: {{ kafka_source_topic }}
kafka.bootstrap.servers: {{ kafka_source_servers }}
kafka.session.timeout.ms: 60000
kafka.max.poll.records: 3000
kafka.max.partition.fetch.bytes: 31457280
kafka.security.protocol: SASL_PLAINTEXT
kafka.sasl.mechanism: PLAIN
kafka.sasl.jaas.config: 454f65ea6eef1256e3067104f82730e737b68959560966b811e7ff364116b03124917eb2b0f3596f14733aa29ebad9352644ce1a5c85991c6f01ba8a5e8f177a7ff0b2d3889a424249967b3870b50993d9644f239f0de82cdb13bdb502959e16afadffa49ef1e1d2b9c9b5113e619817
kafka.group.id: {{ kafka_source_group_id }}
kafka.auto.offset.reset: latest
format: raw
sinks:
kafka_sink:
type: kafka
properties:
topic: {{ kafka_sink_topic }}
kafka.client.id: {{ kafka_sink_topic }}
kafka.bootstrap.servers: {{ kafka_sink_servers }}
kafka.retries: 0
kafka.linger.ms: 10
kafka.request.timeout.ms: 30000
kafka.batch.size: 262144
kafka.buffer.memory: 134217728
kafka.max.request.size: 10485760
kafka.compression.type: snappy
kafka.security.protocol: SASL_PLAINTEXT
kafka.sasl.mechanism: PLAIN
kafka.sasl.jaas.config: 454f65ea6eef1256e3067104f82730e737b68959560966b811e7ff364116b03124917eb2b0f3596f14733aa29ebad9352644ce1a5c85991c6f01ba8a5e8f177a7ff0b2d3889a424249967b3870b50993d9644f239f0de82cdb13bdb502959e16afadffa49ef1e1d2b9c9b5113e619817
format: raw
clickhouse_sink:
type: clickhouse
properties:
host: {{ clickhouse_sink_host }}
table: tsg_galaxy_v3.traffic_sketch_metric_local
batch.size: 100000
batch.interval: 30s
connection.user: e54c9568586180eede1506eecf3574e9
connection.password: 86cf0e2ffba3f541a6c6761313e5cc7e
application:
env:
name: {{ job_name }}
shade.identifier: aes
pipeline:
object-reuse: true
{{ topology }}


@@ -0,0 +1,76 @@
sources:
kafka_source:
type: kafka
properties:
topic: {{ kafka_source_topic }}
kafka.bootstrap.servers: {{ kafka_source_servers }}
kafka.session.timeout.ms: 60000
kafka.max.poll.records: 3000
kafka.max.partition.fetch.bytes: 31457280
kafka.security.protocol: SASL_PLAINTEXT
kafka.sasl.mechanism: PLAIN
kafka.sasl.jaas.config: 454f65ea6eef1256e3067104f82730e737b68959560966b811e7ff364116b03124917eb2b0f3596f14733aa29ebad9352644ce1a5c85991c6f01ba8a5e8f177a7ff0b2d3889a424249967b3870b50993d9644f239f0de82cdb13bdb502959e16afadffa49ef1e1d2b9c9b5113e619817
kafka.group.id: {{ kafka_source_group_id }}
kafka.auto.offset.reset: latest
format: msgpack
processing_pipelines:
etl_processor:
type: projection
functions:
- function: SNOWFLAKE_ID
lookup_fields: [ '' ]
output_fields: [ log_id ]
parameters:
data_center_id_num: {{ data_center_id_num }}
- function: UNIX_TIMESTAMP_CONVERTER
lookup_fields: [ __timestamp ]
output_fields: [ recv_time ]
parameters:
precision: seconds
- function: BASE64_ENCODE_TO_STRING
output_fields: [ packet ]
parameters:
value_field: packet
sinks:
kafka_sink:
type: kafka
properties:
topic: {{ kafka_sink_topic }}
kafka.bootstrap.servers: {{ kafka_sink_servers }}
kafka.retries: 0
kafka.linger.ms: 10
kafka.request.timeout.ms: 30000
kafka.batch.size: 262144
kafka.buffer.memory: 134217728
kafka.max.request.size: 10485760
kafka.compression.type: snappy
kafka.security.protocol: SASL_PLAINTEXT
kafka.sasl.mechanism: PLAIN
kafka.sasl.jaas.config: 454f65ea6eef1256e3067104f82730e737b68959560966b811e7ff364116b03124917eb2b0f3596f14733aa29ebad9352644ce1a5c85991c6f01ba8a5e8f177a7ff0b2d3889a424249967b3870b50993d9644f239f0de82cdb13bdb502959e16afadffa49ef1e1d2b9c9b5113e619817
format: raw
json.ignore.parse.errors: false
log.failures.only: true
clickhouse_sink:
type: clickhouse
properties:
host: {{ clickhouse_sink_host }}
table: tsg_galaxy_v3.datapath_telemetry_record_local
batch.size: 5000
batch.interval: 30s
connection.user: e54c9568586180eede1506eecf3574e9
connection.password: 86cf0e2ffba3f541a6c6761313e5cc7e
connection.connect_timeout: 30
connection.query_timeout: 300
application:
env:
name: {{ job_name }}
shade.identifier: aes
pipeline:
object-reuse: true
{{ topology }}


@@ -0,0 +1,65 @@
sources:
kafka_source:
type: kafka
properties:
topic: {{ kafka_source_topic }}
kafka.bootstrap.servers: "{{ kafka_source_servers }}"
kafka.session.timeout.ms: 60000
kafka.max.poll.records: 3000
kafka.max.partition.fetch.bytes: 31457280
kafka.security.protocol: SASL_PLAINTEXT
kafka.ssl.keystore.location:
kafka.ssl.keystore.password:
kafka.ssl.truststore.location:
kafka.ssl.truststore.password:
kafka.ssl.key.password:
kafka.sasl.mechanism: PLAIN
kafka.sasl.jaas.config: 454f65ea6eef1256e3067104f82730e737b68959560966b811e7ff364116b03124917eb2b0f3596f14733aa29ebad9352644ce1a5c85991c6f01ba8a5e8f177a7ff0b2d3889a424249967b3870b50993d9644f239f0de82cdb13bdb502959e16afadffa49ef1e1d2b9c9b5113e619817
kafka.buffer.memory:
kafka.group.id: dos_event_kafka_to_clickhouse-20231221
kafka.auto.offset.reset: latest
kafka.max.request.size:
kafka.compression.type: none
format: json
sinks:
kafka_sink:
type: kafka
properties:
topic: {{ kafka_sink_topic }}
kafka.client.id: {{ kafka_sink_topic }}
kafka.bootstrap.servers: {{ kafka_sink_servers }}
kafka.retries: 0
kafka.linger.ms: 10
kafka.request.timeout.ms: 30000
kafka.batch.size: 262144
kafka.buffer.memory: 134217728
kafka.max.request.size: 10485760
kafka.compression.type: snappy
kafka.security.protocol: SASL_PLAINTEXT
kafka.sasl.mechanism: PLAIN
kafka.sasl.jaas.config: 454f65ea6eef1256e3067104f82730e737b68959560966b811e7ff364116b03124917eb2b0f3596f14733aa29ebad9352644ce1a5c85991c6f01ba8a5e8f177a7ff0b2d3889a424249967b3870b50993d9644f239f0de82cdb13bdb502959e16afadffa49ef1e1d2b9c9b5113e619817
format: json
json.ignore.parse.errors: false
log.failures.only: true
clickhouse_sink:
type: clickhouse
properties:
host: {{ clickhouse_sink_host }}
table: tsg_galaxy_v3.dos_event_local
batch.size: 100000
batch.interval: 30s
connection.user: e54c9568586180eede1506eecf3574e9
connection.password: 86cf0e2ffba3f541a6c6761313e5cc7e
application:
env:
name: {{ job_name }}
shade.identifier: aes
pipeline:
object-reuse: true # [boolean] Object Reuse, default is false
{{ topology }}


@@ -0,0 +1,133 @@
sources:
kafka_source:
type: kafka
properties:
topic: PROXY-EVENT
kafka.bootstrap.servers: {{ kafka_source_servers }}
kafka.client.id: PROXY-EVENT
kafka.session.timeout.ms: 60000
kafka.max.poll.records: 3000
kafka.max.partition.fetch.bytes: 31457280
kafka.security.protocol: SASL_PLAINTEXT
kafka.sasl.mechanism: PLAIN
kafka.sasl.jaas.config: 454f65ea6eef1256e3067104f82730e737b68959560966b811e7ff364116b03124917eb2b0f3596f14733aa29ebad9352644ce1a5c85991c6f01ba8a5e8f177a80bea937958aaa485c2acc2b475603495a23eb59f055e037c0b186acb22886bd0275ca91f1633441d9943e7962942252
kafka.group.id: {{ kafka_source_group_id }}
kafka.auto.offset.reset: latest
format: json
json.ignore.parse.errors: false
processing_pipelines:
etl_processor:
type: projection
functions:
- function: SNOWFLAKE_ID
lookup_fields: ['']
output_fields: [log_id]
parameters:
data_center_id_num: {{ data_center_id_num }}
- function: UNIX_TIMESTAMP_CONVERTER
lookup_fields: [__timestamp]
output_fields: [recv_time]
parameters:
precision: seconds
- function: EVAL
output_fields: [ingestion_time]
parameters:
value_expression: recv_time
- function: BASE64_DECODE_TO_STRING
output_fields: [mail_subject]
parameters:
value_field: mail_subject
charset_field: mail_subject_charset
- function: BASE64_DECODE_TO_STRING
output_fields: [mail_attachment_name]
parameters:
value_field: mail_attachment_name
charset_field: mail_attachment_name_charset
- function: PATH_COMBINE
lookup_fields: [rtp_pcap_path]
output_fields: [rtp_pcap_path]
parameters:
path: [props.hos.path, props.hos.bucket.name.rtp_file, rtp_pcap_path]
- function: PATH_COMBINE
lookup_fields: [http_request_body]
output_fields: [http_request_body]
parameters:
path: [props.hos.path, props.hos.bucket.name.http_file, http_request_body]
- function: PATH_COMBINE
lookup_fields: [http_response_body]
output_fields: [http_response_body]
parameters:
path: [props.hos.path, props.hos.bucket.name.http_file, http_response_body]
- function: PATH_COMBINE
lookup_fields: [mail_eml_file]
output_fields: [mail_eml_file]
parameters:
path: [props.hos.path, props.hos.bucket.name.eml_file, mail_eml_file]
- function: PATH_COMBINE
lookup_fields: [packet_capture_file]
output_fields: [packet_capture_file]
parameters:
path: [props.hos.path, props.hos.bucket.name.policy_capture_file, packet_capture_file]
- function: CURRENT_UNIX_TIMESTAMP
output_fields: [ processing_time ]
parameters:
precision: seconds
sinks:
kafka_sink:
type: kafka
properties:
topic: PROXY-EVENT
kafka.bootstrap.servers: {{ kafka_sink_servers }}
kafka.client.id: PROXY-EVENT
kafka.retries: 0
kafka.linger.ms: 10
kafka.request.timeout.ms: 30000
kafka.batch.size: 262144
kafka.buffer.memory: 134217728
kafka.max.request.size: 10485760
kafka.compression.type: snappy
kafka.security.protocol: SASL_PLAINTEXT
kafka.sasl.mechanism: PLAIN
kafka.sasl.jaas.config: 454f65ea6eef1256e3067104f82730e737b68959560966b811e7ff364116b03124917eb2b0f3596f14733aa29ebad9352644ce1a5c85991c6f01ba8a5e8f177a80bea937958aaa485c2acc2b475603495a23eb59f055e037c0b186acb22886bd0275ca91f1633441d9943e7962942252
format: json
json.ignore.parse.errors: false
log.failures.only: true
clickhouse_sink:
type: clickhouse
properties:
host: {{ clickhouse_sink_host }}
table: tsg_galaxy_v3.proxy_event_local
batch.size: 100000
batch.interval: 30s
connection.user: e54c9568586180eede1506eecf3574e9
connection.password: 86cf0e2ffba3f541a6c6761313e5cc7e
connection.connect_timeout: 30
connection.query_timeout: 300
application:
env:
name: {{ job_name }}
shade.identifier: aes
pipeline:
object-reuse: true
properties:
hos.bucket.name.rtp_file: traffic_rtp_file_bucket
hos.bucket.name.http_file: traffic_http_file_bucket
hos.bucket.name.eml_file: traffic_eml_file_bucket
hos.bucket.name.policy_capture_file: traffic_policy_capture_file_bucket
{{ topology }}
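
The PATH_COMBINE steps above rewrite relative file references into full HOS locations using `props.hos.path` and the bucket names declared under `application.properties`. A small illustration of the URL shape that presumably results; the relative path below is invented:

```
# Illustrative only: shape of the value PATH_COMBINE appears to assemble.
hos_path="http://192.168.44.12:9098/hos"      # hos.path from the grootstream properties file
bucket="traffic_http_file_bucket"             # props.hos.bucket.name.http_file
http_request_body="2024/11/07/abc123.body"    # hypothetical relative path from the event
echo "${hos_path}/${bucket}/${http_request_body}"
# -> http://192.168.44.12:9098/hos/traffic_http_file_bucket/2024/11/07/abc123.body
```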


@@ -0,0 +1,131 @@
sources:
kafka_source:
type: kafka
properties:
topic: SESSION-RECORD
kafka.bootstrap.servers: {{ kafka_source_servers }}
kafka.client.id: SESSION-RECORD
kafka.session.timeout.ms: 60000
kafka.max.poll.records: 3000
kafka.max.partition.fetch.bytes: 31457280
kafka.security.protocol: SASL_PLAINTEXT
kafka.sasl.mechanism: PLAIN
kafka.sasl.jaas.config: 454f65ea6eef1256e3067104f82730e737b68959560966b811e7ff364116b03124917eb2b0f3596f14733aa29ebad9352644ce1a5c85991c6f01ba8a5e8f177a80bea937958aaa485c2acc2b475603495a23eb59f055e037c0b186acb22886bd0275ca91f1633441d9943e7962942252
kafka.group.id: {{ kafka_source_group_id }}
kafka.auto.offset.reset: latest
format: json
json.ignore.parse.errors: false
processing_pipelines:
etl_processor:
type: projection
functions:
- function: SNOWFLAKE_ID
lookup_fields: ['']
output_fields: [log_id]
parameters:
data_center_id_num: {{ data_center_id_num }}
- function: UNIX_TIMESTAMP_CONVERTER
lookup_fields: [__timestamp]
output_fields: [recv_time]
parameters:
precision: seconds
- function: EVAL
output_fields: [ingestion_time]
parameters:
value_expression: recv_time
- function: BASE64_DECODE_TO_STRING
output_fields: [mail_subject]
parameters:
value_field: mail_subject
charset_field: mail_subject_charset
- function: BASE64_DECODE_TO_STRING
output_fields: [mail_attachment_name]
parameters:
value_field: mail_attachment_name
charset_field: mail_attachment_name_charset
- function: PATH_COMBINE
lookup_fields: [rtp_pcap_path]
output_fields: [rtp_pcap_path]
parameters:
path: [props.hos.path, props.hos.bucket.name.rtp_file, rtp_pcap_path]
- function: PATH_COMBINE
lookup_fields: [http_request_body]
output_fields: [http_request_body]
parameters:
path: [props.hos.path, props.hos.bucket.name.http_file, http_request_body]
- function: PATH_COMBINE
lookup_fields: [http_response_body]
output_fields: [http_response_body]
parameters:
path: [props.hos.path, props.hos.bucket.name.http_file, http_response_body]
- function: PATH_COMBINE
lookup_fields: [mail_eml_file]
output_fields: [mail_eml_file]
parameters:
path: [props.hos.path, props.hos.bucket.name.eml_file, mail_eml_file]
- function: PATH_COMBINE
lookup_fields: [packet_capture_file]
output_fields: [packet_capture_file]
parameters:
path: [props.hos.path, props.hos.bucket.name.policy_capture_file, packet_capture_file]
- function: CURRENT_UNIX_TIMESTAMP
output_fields: [ processing_time ]
parameters:
precision: seconds
sinks:
kafka_sink:
type: kafka
properties:
topic: SESSION-RECORD
kafka.bootstrap.servers: {{ kafka_sink_servers }}
kafka.client.id: SESSION-RECORD
kafka.retries: 0
kafka.linger.ms: 10
kafka.request.timeout.ms: 30000
kafka.batch.size: 262144
kafka.buffer.memory: 134217728
kafka.max.request.size: 10485760
kafka.compression.type: snappy
kafka.security.protocol: SASL_PLAINTEXT
kafka.sasl.mechanism: PLAIN
kafka.sasl.jaas.config: 454f65ea6eef1256e3067104f82730e737b68959560966b811e7ff364116b03124917eb2b0f3596f14733aa29ebad9352644ce1a5c85991c6f01ba8a5e8f177a80bea937958aaa485c2acc2b475603495a23eb59f055e037c0b186acb22886bd0275ca91f1633441d9943e7962942252
format: json
json.ignore.parse.errors: false
log.failures.only: true
clickhouse_sink:
type: clickhouse
properties:
host: {{ clickhouse_sink_host }}
table: tsg_galaxy_v3.session_record_local
batch.size: 100000
batch.interval: 30s
connection.user: e54c9568586180eede1506eecf3574e9
connection.password: 86cf0e2ffba3f541a6c6761313e5cc7e
connection.connect_timeout: 30
connection.query_timeout: 300
application:
env:
name: {{ job_name }}
shade.identifier: aes
pipeline:
object-reuse: true
properties:
hos.bucket.name.rtp_file: traffic_rtp_file_bucket
hos.bucket.name.http_file: traffic_http_file_bucket
hos.bucket.name.eml_file: traffic_eml_file_bucket
hos.bucket.name.policy_capture_file: traffic_policy_capture_file_bucket
{{ topology }}
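The `{{ ... }}` placeholders in these pipeline files (for example `kafka_source_servers`, `clickhouse_sink_host`, `job_name`, `topology`) look like Jinja2-style template variables filled in at deployment time. As a hedged sketch of an assumed workflow only, one such template could be rendered locally with Ansible's ad-hoc `template` module to inspect the generated YAML; the file name and every variable value below are illustrative, not values taken from this repository.

```
# Minimal sketch (assumed workflow): render one pipeline template on localhost
# so the resulting YAML can be reviewed before deployment. Names/values are examples.
ansible localhost -m template \
  -a "src=etl_session_record.yaml.j2 dest=/tmp/etl_session_record.yaml" \
  -e kafka_source_servers=192.168.40.11:9094 \
  -e kafka_sink_servers=192.168.40.11:9094 \
  -e clickhouse_sink_host=192.168.40.12 \
  -e kafka_source_group_id=etl_session_record \
  -e data_center_id_num=1 \
  -e job_name=etl_session_record \
  -e topology=""
```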


@@ -0,0 +1,87 @@
sources:
kafka_source:
type: kafka
properties:
topic: TRAFFIC-SKETCH-METRIC
kafka.bootstrap.servers: {{ kafka_source_servers }}
kafka.session.timeout.ms: 60000
kafka.max.poll.records: 3000
kafka.max.partition.fetch.bytes: 31457280
kafka.security.protocol: SASL_PLAINTEXT
kafka.sasl.mechanism: PLAIN
kafka.sasl.jaas.config: 454f65ea6eef1256e3067104f82730e737b68959560966b811e7ff364116b03124917eb2b0f3596f14733aa29ebad9352644ce1a5c85991c6f01ba8a5e8f177a7ff0b2d3889a424249967b3870b50993d9644f239f0de82cdb13bdb502959e16afadffa49ef1e1d2b9c9b5113e619817
kafka.group.id: etl_traffic_sketch_metric
kafka.auto.offset.reset: latest
kafka.compression.type: none
format: json
processing_pipelines:
etl_processor: # [object] Processing Pipeline
type: projection
remove_fields:
output_fields:
functions: # [array of object] Function List
- function: UNIX_TIMESTAMP_CONVERTER
lookup_fields: [ timestamp_ms ]
output_fields: [ recv_time ]
parameters:
precision: seconds
interval: 60
- function: EVAL
output_fields: [ internal_ip ]
parameters:
value_expression: "(direction == 'Outbound')? client_ip : server_ip"
- function: EVAL
output_fields: [ external_ip ]
parameters:
value_expression: "(direction == 'Outbound')? server_ip : client_ip"
- function: SNOWFLAKE_ID
lookup_fields: [ '' ]
output_fields: [ log_id ]
filter:
parameters:
data_center_id_num: 1
sinks:
kafka_sink:
type: kafka
properties:
topic: TRAFFIC-SKETCH-METRIC
kafka.bootstrap.servers: {{ kafka_sink_servers }}
kafka.client.id: TRAFFIC-SKETCH-METRIC
kafka.retries: 0
kafka.linger.ms: 10
kafka.request.timeout.ms: 30000
kafka.batch.size: 262144
kafka.buffer.memory: 134217728
kafka.max.request.size: 10485760
kafka.compression.type: snappy
kafka.security.protocol: SASL_PLAINTEXT
kafka.sasl.mechanism: PLAIN
kafka.sasl.jaas.config: 454f65ea6eef1256e3067104f82730e737b68959560966b811e7ff364116b03124917eb2b0f3596f14733aa29ebad9352644ce1a5c85991c6f01ba8a5e8f177a80bea937958aaa485c2acc2b475603495a23eb59f055e037c0b186acb22886bd0275ca91f1633441d9943e7962942252
format: json
json.ignore.parse.errors: false
log.failures.only: true
clickhouse_sink:
type: clickhouse
properties:
host: {{ clickhouse_servers }}
table: tsg_galaxy_v3.traffic_sketch_metric_local
batch.size: 100000
batch.interval: 30s
connection.user: e54c9568586180eede1506eecf3574e9
connection.password: 86cf0e2ffba3f541a6c6761313e5cc7e
application:
env: # [object] Environment Variables
name: etl_traffic_sketch_metric # [string] Job Name
shade.identifier: aes
pipeline:
object-reuse: true # [boolean] Object Reuse, default is false
{{ topology }}


@@ -0,0 +1,131 @@
sources:
kafka_source:
type: kafka
properties:
topic: TRANSACTION-RECORD
kafka.bootstrap.servers: {{ kafka_source_servers }}
kafka.client.id: TRANSACTION-RECORD
kafka.session.timeout.ms: 60000
kafka.max.poll.records: 3000
kafka.max.partition.fetch.bytes: 31457280
kafka.security.protocol: SASL_PLAINTEXT
kafka.sasl.mechanism: PLAIN
kafka.sasl.jaas.config: 454f65ea6eef1256e3067104f82730e737b68959560966b811e7ff364116b03124917eb2b0f3596f14733aa29ebad9352644ce1a5c85991c6f01ba8a5e8f177a80bea937958aaa485c2acc2b475603495a23eb59f055e037c0b186acb22886bd0275ca91f1633441d9943e7962942252
kafka.group.id: {{ kafka_source_group_id }}
kafka.auto.offset.reset: latest
format: json
json.ignore.parse.errors: false
processing_pipelines:
etl_processor:
type: projection
functions:
- function: SNOWFLAKE_ID
lookup_fields: ['']
output_fields: [log_id]
parameters:
data_center_id_num: {{ data_center_id_num }}
- function: UNIX_TIMESTAMP_CONVERTER
lookup_fields: [__timestamp]
output_fields: [recv_time]
parameters:
precision: seconds
- function: EVAL
output_fields: [ingestion_time]
parameters:
value_expression: recv_time
- function: BASE64_DECODE_TO_STRING
output_fields: [mail_subject]
parameters:
value_field: mail_subject
charset_field: mail_subject_charset
- function: BASE64_DECODE_TO_STRING
output_fields: [mail_attachment_name]
parameters:
value_field: mail_attachment_name
charset_field: mail_attachment_name_charset
- function: PATH_COMBINE
lookup_fields: [rtp_pcap_path]
output_fields: [rtp_pcap_path]
parameters:
path: [props.hos.path, props.hos.bucket.name.rtp_file, rtp_pcap_path]
- function: PATH_COMBINE
lookup_fields: [http_request_body]
output_fields: [http_request_body]
parameters:
path: [props.hos.path, props.hos.bucket.name.http_file, http_request_body]
- function: PATH_COMBINE
lookup_fields: [http_response_body]
output_fields: [http_response_body]
parameters:
path: [props.hos.path, props.hos.bucket.name.http_file, http_response_body]
- function: PATH_COMBINE
lookup_fields: [mail_eml_file]
output_fields: [mail_eml_file]
parameters:
path: [props.hos.path, props.hos.bucket.name.eml_file, mail_eml_file]
- function: PATH_COMBINE
lookup_fields: [packet_capture_file]
output_fields: [packet_capture_file]
parameters:
path: [props.hos.path, props.hos.bucket.name.policy_capture_file, packet_capture_file]
- function: CURRENT_UNIX_TIMESTAMP
output_fields: [ processing_time ]
parameters:
precision: seconds
sinks:
kafka_sink:
type: kafka
properties:
topic: TRANSACTION-RECORD
kafka.bootstrap.servers: {{ kafka_sink_servers }}
kafka.client.id: TRANSACTION-RECORD
kafka.retries: 0
kafka.linger.ms: 10
kafka.request.timeout.ms: 30000
kafka.batch.size: 262144
kafka.buffer.memory: 134217728
kafka.max.request.size: 10485760
kafka.compression.type: snappy
kafka.security.protocol: SASL_PLAINTEXT
kafka.sasl.mechanism: PLAIN
kafka.sasl.jaas.config: 454f65ea6eef1256e3067104f82730e737b68959560966b811e7ff364116b03124917eb2b0f3596f14733aa29ebad9352644ce1a5c85991c6f01ba8a5e8f177a80bea937958aaa485c2acc2b475603495a23eb59f055e037c0b186acb22886bd0275ca91f1633441d9943e7962942252
format: json
json.ignore.parse.errors: false
log.failures.only: true
clickhouse_sink:
type: clickhouse
properties:
host: {{ clickhouse_sink_host }}
table: tsg_galaxy_v3.transaction_record_local
batch.size: 100000
batch.interval: 30s
connection.user: e54c9568586180eede1506eecf3574e9
connection.password: 86cf0e2ffba3f541a6c6761313e5cc7e
connection.connect_timeout: 30
connection.query_timeout: 300
application:
env:
name: {{ job_name }}
shade.identifier: aes
pipeline:
object-reuse: true
properties:
hos.bucket.name.rtp_file: traffic_rtp_file_bucket
hos.bucket.name.http_file: traffic_http_file_bucket
hos.bucket.name.eml_file: traffic_eml_file_bucket
hos.bucket.name.policy_capture_file: traffic_policy_capture_file_bucket
{{ topology }}


@@ -0,0 +1,131 @@
sources:
kafka_source:
type: kafka
properties:
topic: VOIP-CONVERSATION-RECORD
kafka.bootstrap.servers: {{ kafka_source_servers }}
kafka.client.id: VOIP-CONVERSATION-RECORD
kafka.session.timeout.ms: 60000
kafka.max.poll.records: 3000
kafka.max.partition.fetch.bytes: 31457280
kafka.security.protocol: SASL_PLAINTEXT
kafka.sasl.mechanism: PLAIN
kafka.sasl.jaas.config: 454f65ea6eef1256e3067104f82730e737b68959560966b811e7ff364116b03124917eb2b0f3596f14733aa29ebad9352644ce1a5c85991c6f01ba8a5e8f177a80bea937958aaa485c2acc2b475603495a23eb59f055e037c0b186acb22886bd0275ca91f1633441d9943e7962942252
kafka.group.id: {{ kafka_source_group_id }}
kafka.auto.offset.reset: latest
format: json
json.ignore.parse.errors: false
processing_pipelines:
etl_processor:
type: projection
functions:
- function: SNOWFLAKE_ID
lookup_fields: ['']
output_fields: [log_id]
parameters:
data_center_id_num: {{ data_center_id_num }}
- function: UNIX_TIMESTAMP_CONVERTER
lookup_fields: [__timestamp]
output_fields: [recv_time]
parameters:
precision: seconds
- function: EVAL
output_fields: [ingestion_time]
parameters:
value_expression: recv_time
- function: BASE64_DECODE_TO_STRING
output_fields: [mail_subject]
parameters:
value_field: mail_subject
charset_field: mail_subject_charset
- function: BASE64_DECODE_TO_STRING
output_fields: [mail_attachment_name]
parameters:
value_field: mail_attachment_name
charset_field: mail_attachment_name_charset
- function: PATH_COMBINE
lookup_fields: [rtp_pcap_path]
output_fields: [rtp_pcap_path]
parameters:
path: [props.hos.path, props.hos.bucket.name.rtp_file, rtp_pcap_path]
- function: PATH_COMBINE
lookup_fields: [http_request_body]
output_fields: [http_request_body]
parameters:
path: [props.hos.path, props.hos.bucket.name.http_file, http_request_body]
- function: PATH_COMBINE
lookup_fields: [http_response_body]
output_fields: [http_response_body]
parameters:
path: [props.hos.path, props.hos.bucket.name.http_file, http_response_body]
- function: PATH_COMBINE
lookup_fields: [mail_eml_file]
output_fields: [mail_eml_file]
parameters:
path: [props.hos.path, props.hos.bucket.name.eml_file, mail_eml_file]
- function: PATH_COMBINE
lookup_fields: [packet_capture_file]
output_fields: [packet_capture_file]
parameters:
path: [props.hos.path, props.hos.bucket.name.policy_capture_file, packet_capture_file]
- function: CURRENT_UNIX_TIMESTAMP
output_fields: [ processing_time ]
parameters:
precision: seconds
sinks:
kafka_sink:
type: kafka
properties:
topic: VOIP-CONVERSATION-RECORD
kafka.bootstrap.servers: {{ kafka_sink_servers }}
kafka.client.id: VOIP-CONVERSATION-RECORD
kafka.retries: 0
kafka.linger.ms: 10
kafka.request.timeout.ms: 30000
kafka.batch.size: 262144
kafka.buffer.memory: 134217728
kafka.max.request.size: 10485760
kafka.compression.type: snappy
kafka.security.protocol: SASL_PLAINTEXT
kafka.sasl.mechanism: PLAIN
kafka.sasl.jaas.config: 454f65ea6eef1256e3067104f82730e737b68959560966b811e7ff364116b03124917eb2b0f3596f14733aa29ebad9352644ce1a5c85991c6f01ba8a5e8f177a80bea937958aaa485c2acc2b475603495a23eb59f055e037c0b186acb22886bd0275ca91f1633441d9943e7962942252
format: json
json.ignore.parse.errors: false
log.failures.only: true
clickhouse_sink:
type: clickhouse
properties:
host: {{ clickhouse_sink_host }}
table: tsg_galaxy_v3.voip_record_local
batch.size: 100000
batch.interval: 30s
connection.user: e54c9568586180eede1506eecf3574e9
connection.password: 86cf0e2ffba3f541a6c6761313e5cc7e
connection.connect_timeout: 30
connection.query_timeout: 300
application:
env:
name: {{ job_name }}
shade.identifier: aes
pipeline:
object-reuse: true
properties:
hos.bucket.name.rtp_file: traffic_rtp_file_bucket
hos.bucket.name.http_file: traffic_http_file_bucket
hos.bucket.name.eml_file: traffic_eml_file_bucket
hos.bucket.name.policy_capture_file: traffic_policy_capture_file_bucket
{{ topology }}

24
groot-stream/udf.plugins Normal file

@@ -0,0 +1,24 @@
com.geedgenetworks.core.udf.AsnLookup
com.geedgenetworks.core.udf.CurrentUnixTimestamp
com.geedgenetworks.core.udf.DecodeBase64
com.geedgenetworks.core.udf.Domain
com.geedgenetworks.core.udf.Drop
com.geedgenetworks.core.udf.EncodeBase64
com.geedgenetworks.core.udf.Eval
com.geedgenetworks.core.udf.Flatten
com.geedgenetworks.core.udf.FromUnixTimestamp
com.geedgenetworks.core.udf.GenerateStringArray
com.geedgenetworks.core.udf.GeoIpLookup
com.geedgenetworks.core.udf.JsonExtract
com.geedgenetworks.core.udf.PathCombine
com.geedgenetworks.core.udf.Rename
com.geedgenetworks.core.udf.SnowflakeId
com.geedgenetworks.core.udf.StringJoiner
com.geedgenetworks.core.udf.UnixTimestampConverter
com.geedgenetworks.core.udf.udaf.NumberSum
com.geedgenetworks.core.udf.udaf.CollectList
com.geedgenetworks.core.udf.udaf.CollectSet
com.geedgenetworks.core.udf.udaf.LongCount
com.geedgenetworks.core.udf.udaf.Mean
com.geedgenetworks.core.udf.udaf.LastValue
com.geedgenetworks.core.udf.udaf.FirstValue

11
hbase/update_hbase.sh Normal file

@@ -0,0 +1,11 @@
docker exec -i HMaster hbase shell <<EOF
alter 'troubleshooting_file_bucket',{NAME => 'data',METADATA => {'DFS_REPLICATION' => '1'}},{NAME => 'meta', METADATA => {'DFS_REPLICATION' => '1'}}
alter 'index_time_troubleshooting_file_bucket',{NAME => 'data',METADATA => {'DFS_REPLICATION' => '1'}},{NAME => 'meta', METADATA => {'DFS_REPLICATION' => '1'}}
alter 'index_filename_troubleshooting_file_bucket',{NAME => 'data',METADATA => {'DFS_REPLICATION' => '1'}},{NAME => 'meta', METADATA => {'DFS_REPLICATION' => '1'}}
alter 'index_partfile_troubleshooting_file_bucket',{NAME => 'data',METADATA => {'DFS_REPLICATION' => '1'}},{NAME => 'meta', METADATA => {'DFS_REPLICATION' => '1'}}
alter 'assessment_file_bucket',{NAME => 'data',METADATA => {'DFS_REPLICATION' => '1'}},{NAME => 'meta', METADATA => {'DFS_REPLICATION' => '1'}}
alter 'index_time_assessment_file_bucket',{NAME => 'data',METADATA => {'DFS_REPLICATION' => '1'}},{NAME => 'meta', METADATA => {'DFS_REPLICATION' => '1'}}
alter 'index_filename_assessment_file_bucket',{NAME => 'data',METADATA => {'DFS_REPLICATION' => '1'}},{NAME => 'meta', METADATA => {'DFS_REPLICATION' => '1'}}
alter 'index_partfile_assessment_file_bucket',{NAME => 'data',METADATA => {'DFS_REPLICATION' => '1'}},{NAME => 'meta', METADATA => {'DFS_REPLICATION' => '1'}}
EOF
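After the alters above complete, the new `DFS_REPLICATION` setting can be confirmed from the same HBase shell. A minimal verification sketch for one of the buckets (the remaining tables can be checked the same way):

```
# Sketch: confirm DFS_REPLICATION is now 1 on both the 'data' and 'meta' column families.
docker exec -i HMaster hbase shell <<EOF
describe 'troubleshooting_file_bucket'
EOF
```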

6
hos/bucket_upgrade.sh Normal file

@@ -0,0 +1,6 @@
curl -X PUT http://{{ vrrp_instance.oss.virtual_ipaddress }}:9098/hos/traffic_policy_capture_file_bucket -H 'token:{{ hos_token }}' -H 'x-hos-region-count:64 * number of HBase servers' -H 'x-hos-ttl:30' -H 'x-hos-replication:1'
curl -X PUT http://{{ vrrp_instance.oss.virtual_ipaddress }}:9098/hos/traffic_rtp_file_bucket -H 'token:{{ hos_token }}' -H 'x-hos-region-count:64 * number of HBase servers' -H 'x-hos-ttl:30' -H 'x-hos-replication:1'
curl -X PUT http://{{ vrrp_instance.oss.virtual_ipaddress }}:9098/hos/traffic_http_file_bucket -H 'token:{{ hos_token }}' -H 'x-hos-region-count:64 * number of HBase servers' -H 'x-hos-ttl:30' -H 'x-hos-replication:1'
curl -X PUT http://{{ vrrp_instance.oss.virtual_ipaddress }}:9098/hos/traffic_eml_file_bucket -H 'token:{{ hos_token }}' -H 'x-hos-region-count:16' -H 'x-hos-ttl:30' -H 'x-hos-replication:1'
curl -X DELETE http://{{ vrrp_instance.oss.virtual_ipaddress }}:9098/hos/traffic_file_bucket -H 'token:{{ hos_token }}'

8
hos/create_bucket.sh Normal file

@@ -0,0 +1,8 @@
curl -X PUT http://{{ vrrp_instance.oss.virtual_ipaddress }}:9098/hos/traffic_policy_capture_file_bucket -H 'token:{{ hos_token }}' -H 'x-hos-region-count:64 * number of HBase servers' -H 'x-hos-ttl:30' -H 'x-hos-replication:1'
curl -X PUT http://{{ vrrp_instance.oss.virtual_ipaddress }}:9098/hos/traffic_rtp_file_bucket -H 'token:{{ hos_token }}' -H 'x-hos-region-count:64 * number of HBase servers' -H 'x-hos-ttl:30' -H 'x-hos-replication:1'
curl -X PUT http://{{ vrrp_instance.oss.virtual_ipaddress }}:9098/hos/traffic_http_file_bucket -H 'token:{{ hos_token }}' -H 'x-hos-region-count:64 * number of HBase servers' -H 'x-hos-ttl:30' -H 'x-hos-replication:1'
curl -X PUT http://{{ vrrp_instance.oss.virtual_ipaddress }}:9098/hos/traffic_eml_file_bucket -H 'token:{{ hos_token }}' -H 'x-hos-region-count:16' -H 'x-hos-ttl:30' -H 'x-hos-replication:1'
curl -X PUT http://{{ vrrp_instance.oss.virtual_ipaddress }}:9098/hos/troubleshooting_file_bucket -H 'token:{{ hos_token }}' -H 'x-hos-region-count:16' -H 'x-hos-ttl:30' -H 'x-hos-replication:1'
curl -X PUT http://{{ vrrp_instance.oss.virtual_ipaddress }}:9098/hos/assessment_file_bucket -H 'token:{{ hos_token }}' -H 'x-hos-region-count:16' -H 'x-hos-wal:open' -H 'x-hos-ttl:30' -H 'x-hos-replication:1'
curl -X PUT http://{{ vrrp_instance.oss.virtual_ipaddress }}:9098/hos/knowledge_base_bucket -H 'token:{{ hos_token }}' -H 'x-hos-region-count:16' -H 'x-hos-wal:open' -H 'x-hos-replication:1 for standalone, 2 for cluster'
curl -X PUT http://{{ vrrp_instance.oss.virtual_ipaddress }}:9098/hos/report_snapshot_bucket -H 'token:{{ hos_token }}' -H 'x-hos-region-count:16' -H 'x-hos-wal:open' -H 'x-hos-replication:1 for standalone, 2 for cluster'
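These calls do not check the HOS response. As a hedged sketch only (the endpoint and headers are taken from the line above; the status check itself is an addition), one of the creation requests could be wrapped so a non-2xx result is reported:

```
# Sketch: capture the HTTP status of one bucket-creation request and flag failures.
status=$(curl -s -o /dev/null -w '%{http_code}' -X PUT \
  "http://{{ vrrp_instance.oss.virtual_ipaddress }}:9098/hos/traffic_eml_file_bucket" \
  -H 'token:{{ hos_token }}' -H 'x-hos-region-count:16' -H 'x-hos-ttl:30' -H 'x-hos-replication:1')
if [ "$status" -lt 200 ] || [ "$status" -ge 300 ]; then
  echo "Bucket creation returned HTTP $status" >&2
fi
```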


@@ -0,0 +1,97 @@
# Service port
server:
port: 8186
max-http-header-size: 20MB
tomcat:
max-threads: 400
# Tomcat cache size in KB; the system default is 10MB, configured as 10G
tomcat:
cacheMaxSize: 1000000
# HBase parameters
hbase:
zookeeperQuorum: 192.168.44.11:2181,192.168.44.14:2181,192.168.44.15:2181
zookeeperPort: 2181
zookeeperNodeParent: /hbase
clientRetriesNumber: 9
rpcTimeout: 100000
connectPool: 10
clientWriteBuffer: 10485760
clientKeyValueMaxsize: 1073741824
mobThreshold: 10485760
# Maximum number of parts
maxParts: 100000
# Number of parts fetched per request
getPartBatch: 10
# HBase index table prefixes; any table with one of the following prefixes is an index table
timeIndexTablePrefix: index_time_
filenameIndexTablePrefix: index_filename_
partFileIndexTablePrefix: index_partfile_
systemBucketMeta: system:bucket_meta
# Number of regions when creating a table
regionCount: 16
filenameHead: 0,1,2,3,4,5,6,7,8,9,a,b,c,d,e,f
partHead: 0,1,2,3,4,5,6,7,8,9,a,b,c,d,e,f
# Directory used to get file sizes
dataPath: /hbase
# Hadoop NameNode nodes: a single IP for standalone, ip1,ip2 for a cluster
hadoopNameNodes: 192.168.44.10,192.168.44.11
# Replication factor: 1 for standalone, 2 for a cluster
hadoopReplication: 2
# Hadoop port
hadoopPort: 9000
hadoopUser: root
hadoopNameServices: ns1
hadoopNameNodesNs1: nn1,nn2
asyncPut: 0
# Whether to enable authentication (0 = enabled); when enabled, the service must be accessed with S3 credentials or a token
auth:
open: 0
# Token used for HTTP access
token: ENC(vknRT6U4I739rLIha9CvojM+4uFyXZLEYpO2HZayLnRak1HPW0K2yZ3vnQBA2foo)
# S3 credentials
s3:
accesskey: ENC(FUQDvVP+zqCiwHQhXcRvbw==)
secretkey: ENC(FUQDvVP+zqCiwHQhXcRvbw==)
hos:
# File size threshold
maxFileSize: 5073741800
# Large-file threshold
uploadThreshold: 104857600
# Keep-alive connection timeout
keepAliveTimeout: 60000
# Maximum number of objects in a bulk delete
deleteMultipleNumber: 1000
# Maximum number of results for list-objects and similar operations
maxResultLimit: 100000
# Maximum number of chunks for multipart upload
maxPartNumber: 10000
# Maximum number of append uploads
maxAppendNumber: 100000
# Whether to enable quick upload
isQuickUpload: 0
# Whether to enable quick file download: 1 = enabled; set to 0 on clusters where HBase memory is below 20GB
isQuickDownloadFile: 0
# User whitelist (HBase namespaces), used to obtain storage quotas
users: default
# Whether to enable rate limiting: 0 = off, 1 = on
openRateLimiter: 0
# Rate limit: requests per second
rateLimiterQps: 20000
# Maximum size of an uploaded file
spring:
servlet:
multipart:
max-file-size: 5GB
max-request-size: 5GB
# Prometheus settings
application:
name: HosServiceApplication
# Prometheus settings
management:
endpoints:
web:
exposure:
include: '*'
metrics:
tags:
application: ${spring.application.name}


@@ -0,0 +1,21 @@
qgw.serverAddr=http://{{ vrrp_instance.default.virtual_ipaddress }}:9999
hos.serverAddr=http://{{ vrrp_instance.oss.virtual_ipaddress }}:9098
hos.token={{ hos_token }}
kafka.server={{ groups.kafka[0] }}:9092
# Delay time: only check files older than this many seconds (in seconds)
check.time.delay=180
hos.traffic.buckets=traffic_policy_capture_file_bucket,traffic_rtp_file_bucket,traffic_http_file_bucket,traffic_eml_file_bucket
kafka.traffic.topics=TRAFFIC-POLICY-CAPTURE-FILE-STREAM-RECORD,TRAFFIC-RTP-FILE-STREAM-RECORD,TRAFFIC-HTTP-FILE-STREAM-RECORD,TRAFFIC-EML-FILE-STREAM-RECORD
kafka.troubleshooting.topic=TROUBLESHOOTING-FILE-STREAM-RECORD
file.chunk.combiner.window.time=15000
traffic.file.count=10
threads=1
max.threads=10
print.out.interval=1000
http.max.total=100
http.default.max.per.route=100
http.connect.timeout=5000
http.connection.request.timeout=10000
http.socket.timeout=-1
hos.log.types=security_event,monitor_event,proxy_event,session_record,voip_record,assessment_event,transaction_record,troubleshooting
hos.log.types.file.types.url.fields=security_event:http-http_response_body&http_request_body,pcap-packet_capture_file&rtp_pcap_path,eml-mail_eml_file;proxy_event:http-http_response_body&http_request_body;session_record:http-http_response_body&http_request_body,pcap-packet_capture_file&rtp_pcap_path,eml-mail_eml_file;voip_record:pcap-rtp_pcap_path;assessment_event:other-assessment_file;transaction_record:http-http_response_body&http_request_body,eml-mail_eml_file;monitor_event:http-http_response_body&http_request_body,pcap-packet_capture_file&rtp_pcap_path,eml-mail_eml_file

Binary file not shown.

138
hos/hosutil/hosutil.sh Normal file

@@ -0,0 +1,138 @@
#!/bin/bash
version="1.4"
jar="galaxy-hos-util-$version.jar"
usage() {
cat <<EOF
Usage: ./hosutil.sh [command] [-h] [options...]
Available commands:
download Download individual or batch files
upload Upload individual or batch files
check Check file availability
combiner Verify if the file-chunk-combiner data stream is correct
version Print the version
Options for 'download' command:
-b, --bucket The bucket to access.
-d, --directory Directory to save files; created if it does not exist. Default is ./download/.
-k, --keys Files to download. Can be a single or multiple files separated by commas.
-p, --prefix Prefix for batch downloading files based on file name.
-s, --start_time Start time in UTC format (yyyyMMdd, yyyy-MM-dd, yyyyMMddHHmmss). Default is the previous day's time.
-e, --end_time End time in UTC format (yyyyMMdd, yyyy-MM-dd, yyyyMMddHHmmss). Default is current time.
-c, --count Number of files to download. Default is 1000, maximum is 100000.
-t, --threads Number of threads. Default is 1, maximum is 10.
Options for 'upload' command:
-b, --bucket The bucket to access.
-d, --directory Directory where files to upload are located. Default is ./upload/.
-t, --threads Number of threads. Default is 1, maximum is 10.
Options for 'check' command:
-s, --start_time Start time in UTC format (yyyyMMdd, yyyy-MM-dd, yyyyMMddHHmmss). Default is the previous day's time.
-e, --end_time End time in UTC format (yyyyMMdd, yyyy-MM-dd, yyyyMMddHHmmss). Default is current time.
-c, --count Number of logs to evaluate. Default is 1000, maximum is 100000.
-d, --data_center Specify the data centers to evaluate, separated by commas. If not specified, all data centers are evaluated.
-l, --log_type Specify the logs to evaluate, separated by commas. If not specified, all logs are evaluated.
Supported logs: security_event, monitor_event, proxy_event, session_record, voip_record, assessment_event, transaction_record, troubleshooting.
-f, --file_type Specify file types. If not specified, all types are evaluated. Supported types: eml, http, pcap, other.
Only session_record, security_event, monitor_event, transaction_record support multiple types.
-t --threads Number of threads. Default is 1, maximum is 10.
Options for 'combiner' command:
-j, --job Job to verify. Options: traffic, troubleshooting. Default is traffic. (The troubleshooting job was removed in version 24.05.)
EOF
}
# Initialize default values
bucket=""
directory=""
keys=""
prefix=""
start_time=""
end_time=""
count=1000
threads=1
log_type=""
file_type=""
data_center=""
job_name="traffic"
# Check required parameters
check_required() {
case "$operation" in
download|upload)
if [ -z "$bucket" ]; then
echo "Error: bucket is required for $operation."
exit 1
fi
;;
*)
# No specific parameters need to be checked for other operations
;;
esac
}
# Download function
download() {
directory=${directory:-"./download/"}
check_required
java -jar $jar download $bucket $directory keys=$keys prefix=$prefix max_keys=$count time_range=$start_time/$end_time thread_num=$threads
}
# Upload function
upload() {
directory=${directory:-"./upload/"}
check_required
java -jar $jar upload $bucket $directory thread_num=$threads
}
# Check function
check() {
java -jar $jar check data_center=$data_center log_type=$log_type file_type=$file_type max_logs=$count time_range=$start_time/$end_time thread_num=$threads
}
# Combiner function
combiner() {
java -jar $jar combiner $job_name
}
# Main operation flow
if [ $# -eq 0 ];then
usage
exit 0
fi
operation=$1
shift
while getopts ":hb:d:k:p:s:e:c:t:l:f:j:" opt; do
case $opt in
h) usage; exit 0 ;;
b) bucket=$OPTARG ;;
d) if [ "$operation" == "check" ]; then data_center=$OPTARG; else directory=$OPTARG; fi ;;
k) keys=$OPTARG ;;
p) prefix=$OPTARG ;;
s) start_time=$OPTARG ;;
e) end_time=$OPTARG ;;
c) count=$OPTARG ;;
t) threads=$OPTARG ;;
l) log_type=$OPTARG ;;
f) file_type=$OPTARG ;;
j) job_name=$OPTARG ;;
\?) echo "Invalid option: -$OPTARG" >&2; usage; exit 1 ;;
:) echo "Option -$OPTARG requires an argument" >&2; usage; exit 1 ;;
esac
done
case "$operation" in
download) download ;;
upload) upload ;;
check) check ;;
combiner) combiner ;;
version) echo $version ;;
*) usage; exit 1 ;;
esac
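Based on the options documented in `usage()` above, a couple of illustrative invocations (bucket names, time ranges, and counts are examples, not values required by the script):

```
# Download up to 500 HTTP body files written on 2024-11-06, using 4 threads.
./hosutil.sh download -b traffic_http_file_bucket -s 20241106 -e 20241107 -c 500 -t 4

# Spot-check session_record and voip_record logs for missing pcap files over the same window.
./hosutil.sh check -s 20241106 -e 20241107 -l session_record,voip_record -f pcap -c 1000
```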

111
kafka/kafka-operation.sh Normal file

@@ -0,0 +1,111 @@
#!/bin/bash
LOCAL_IP=127.0.0.1:9094
ZK_SERVER=127.0.0.1:2181/kafka
KAFKA_SERVER=127.0.0.1:9092
PARTITIONS=1
usage() {
echo "------------------------------------------------------------------------"
echo -e "生产数据\n"
echo "交互式kafka-operation.sh producer MOCK-DATA-SESSION-RECORD" | column -t
echo "读取文件kafka-operation.sh producer MOCK-DATA-SESSION-RECORD < mock_data.txt" | column -t
echo "------------------------------------------------------------------------"
echo -e "消费数据\n"
echo "从当前消费kafka-operation.sh consumer MOCK-DATA-SESSION-RECORD" | column -t
echo "从头消费kafka-operation.sh consumer-begin MOCK-DATA-SESSION-RECORD" | column -t
echo "------------------------------------------------------------------------"
echo -e "创建Topic\n"
echo "kafka-operation.sh create-topic [topic name] [partition num] [replication num]" | column -t
echo "样例kafka-operation.sh create-topic MOCK-DATA-SESSION-RECORD 3 1" | column -t
echo "------------------------------------------------------------------------"
echo -e "删除Topic\n"
echo "kafka-operation.sh delete-topic [topic name]" | column -t
echo "样例kafka-operation.sh delete-topic MOCK-DATA-SESSION-RECORD" | column -t
echo "------------------------------------------------------------------------"
echo -e "查看Topic列表\n"
echo "kafka-operation.sh list" | column -t
echo "------------------------------------------------------------------------"
echo -e "查看Topic详情\n"
echo "kafka-operation.sh desc-topic [topic name]" | column -t
echo "样例kafka-operation.sh desc-topic MOCK-DATA-SESSION-RECORD" | column -t
echo "------------------------------------------------------------------------"
echo -e "查看消费组列表\n"
echo "kafka-operation.sh groups" | column -t
echo "------------------------------------------------------------------------"
echo -e "查看消费组详情\n"
echo "kafka-operation.sh desc-group [consumer group name]" | column -t
echo "样例kafka-operation.sh desc-group etl-session-record-kafka-to-kafka" | column -t
echo "------------------------------------------------------------------------"
echo -e "增加Topic分区\n"
echo "kafka-operation.sh add-partition [topic name] [new partition num]" | column -t
echo "样例kafka-operation.sh add-partition MOCK-DATA-SESSION-RECORD 5" | column -t
echo "------------------------------------------------------------------------"
echo -e "Topic Leader平衡\n"
echo "kafka-operation.sh election-leader" | column -t
echo "------------------------------------------------------------------------"
echo -e "查看Quotas\n"
echo "kafka-operation.sh desc-quota" | column -t
echo "------------------------------------------------------------------------"
echo -e "增加Quotas\n"
echo "kafka-operation.sh add-quota [quatos rule] [user] [client.id]" | column -t
echo "kafka-operation.sh add-quota 'producer_byte_rate=10485760' tsg_olap SESSION-RECORD" | column -t
echo "------------------------------------------------------------------------"
echo -e "删除Quotas\n"
echo "kafka-operation.sh delete-quota [quatos rule] [user] [client.id]" | column -t
echo "kafka-operation.sh delete-quota 'producer_byte_rate' tsg_olap SESSION-RECORD" | column -t
echo "------------------------------------------------------------------------"
}
if [ $# -eq 0 ]; then
usage
exit 0
fi
case $1 in
producer)
kafka-console-producer.sh --producer.config $KAFKA_HOME/config/producer.properties --broker-list $LOCAL_IP --topic $2
;;
consumer)
kafka-console-consumer.sh --consumer.config $KAFKA_HOME/config/consumer.properties --bootstrap-server $LOCAL_IP --topic $2
;;
consumer-begin)
kafka-console-consumer.sh --consumer.config $KAFKA_HOME/config/consumer.properties --from-beginning --bootstrap-server $LOCAL_IP --topic $2
;;
create-topic)
kafka-topics.sh --if-not-exists --create --bootstrap-server $KAFKA_SERVER --topic $2 --partitions $3 --replication-factor $4
;;
delete-topic)
kafka-topics.sh --if-exists --delete --bootstrap-server $KAFKA_SERVER --topic $2
;;
list)
kafka-topics.sh --list --bootstrap-server $KAFKA_SERVER
;;
desc-topic)
kafka-topics.sh --if-exists --bootstrap-server $KAFKA_SERVER --describe --topic $2
;;
groups)
kafka-consumer-groups.sh --all-groups --all-topics --list --bootstrap-server $KAFKA_SERVER
;;
desc-group)
kafka-consumer-groups.sh --bootstrap-server $KAFKA_SERVER --describe --group $2
;;
add-partition)
kafka-topics.sh --if-exists --bootstrap-server $KAFKA_SERVER --alter --topic $2 --partitions $3
;;
election-leader)
kafka-leader-election.sh --bootstrap-server $KAFKA_SERVER --all-topic-partitions --election-type PREFERRED
;;
desc-quota)
kafka-configs.sh --bootstrap-server $KAFKA_SERVER --describe --entity-type users --entity-type clients
;;
add-quota)
kafka-configs.sh --bootstrap-server $KAFKA_SERVER --alter --add-config $2 --entity-type users --entity-name $3 --entity-type clients --entity-name $4
;;
delete-quota)
kafka-configs.sh --bootstrap-server $KAFKA_SERVER --alter --delete-config $2 --entity-type users --entity-name $3 --entity-type clients --entity-name $4
;;
*)
usage
esac
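For reference, the topics consumed and produced by the ETL pipeline configurations above could be created with this script; the partition and replication counts below are illustrative, not mandated by this repository:

```
# Sketch: create the topics referenced by the pipeline configs (example sizing).
for topic in SESSION-RECORD TRANSACTION-RECORD VOIP-CONVERSATION-RECORD TRAFFIC-SKETCH-METRIC; do
  ./kafka-operation.sh create-topic "$topic" 3 1
done
```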


@@ -0,0 +1,8 @@
KafkaServer {
org.apache.kafka.common.security.plain.PlainLoginModule required
username="admin"
password="galaxy2019"
user_admin="galaxy2019"
user_firewall="galaxy2019"
user_olap="galaxy2019";
};
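This is the broker-side `KafkaServer` section; the `user_<name>="<password>"` entries define the accounts that SASL/PLAIN clients may authenticate with. As a hedged sketch only (the file name and choice of account are assumptions), a matching client-side JAAS file for the `olap` user could be written like this:

```
# Sketch: write a client-side JAAS file matching the broker's SASL/PLAIN accounts.
cat > kafka_client_jaas.conf <<'EOF'
KafkaClient {
  org.apache.kafka.common.security.plain.PlainLoginModule required
  username="olap"
  password="galaxy2019";
};
EOF
```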

Binary file not shown.

File diff suppressed because it is too large

Binary file not shown.

Binary file not shown.