This repository has been archived on 2025-09-14. You can view files and clone it, but cannot push or open issues or pull requests.

Configuration Template Examples

session_record.yaml.j2 (session log ETL scenarios)

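  • Multi-data-center deployment: the Data Transporter at each sub-center preprocesses the data, which is then aggregated at the national center (NDC)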
    • etl_session_record_kafka_to_ndc_kafka (A-DT)
      • Topology: kafka_source -> etl_processor -> kafka_sink
      • Data Flow: SESSION-RECORD -> SESSION-RECORD-PROCESSED
  • Multi-data-center deployment, national center side: load the session logs and write them to ClickHouse
    • session_record_processed_kafka_to_clickhouse (A-NDC)
      • Topology: kafka_source -> clickhouse_sink
      • Data Flow: SESSION-RECORD-PROCESSED -> session_record_local
  • Centralized deployment: ingest the session logs, preprocess them, and write them to ClickHouse
    • etl_session_record_kafka_to_clickhouse (B)
      • Topology: kafka_source -> etl_processor -> clickhouse_sink
      • Data Flow: SESSION-RECORD -> session_record_local
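
Each Topology line above is a linear chain of operator names joined by `->`. As a minimal illustration only (this is not the tool's actual config loader, and the function name `parse_topology` is hypothetical), the three chains can be split into ordered nodes and upstream/downstream edges like this:

```python
def parse_topology(line):
    """Split 'a -> b -> c' into its node list and (upstream, downstream) edges."""
    nodes = [part.strip() for part in line.split("->")]
    # A chain needs at least a source and a sink, and no empty node names.
    if len(nodes) < 2 or not all(nodes):
        raise ValueError(f"not a valid topology chain: {line!r}")
    edges = list(zip(nodes, nodes[1:]))
    return nodes, edges


# Topology lines copied from the session_record.yaml.j2 job list above.
TOPOLOGIES = {
    "etl_session_record_kafka_to_ndc_kafka": "kafka_source -> etl_processor -> kafka_sink",
    "session_record_processed_kafka_to_clickhouse": "kafka_source -> clickhouse_sink",
    "etl_session_record_kafka_to_clickhouse": "kafka_source -> etl_processor -> clickhouse_sink",
}

if __name__ == "__main__":
    for job, line in TOPOLOGIES.items():
        nodes, edges = parse_topology(line)
        print(job, edges)
```

Read this way, the A-NDC job is the only one without an `etl_processor` stage, which matches its role of loading records that a sub-center has already preprocessed.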

data_transporter.yaml.j2 (data backhaul scenario)

  • troubleshooting_file_stream_kafka_to_ndc_kafka
    • Topology: kafka_source -> kafka_sink (format:raw)
    • Data Flow: TROUBLESHOOTING-FILE-STREAM-RECORD -> TROUBLESHOOTING-FILE-STREAM-RECORD

realtime_log_streaming_cn_session_record.yaml.template (pushing to other vendors / third parties)

install_cn_udf.sh installs the CN UDFs; grootstream.yaml defines the CN knowledge base.

  • etl_session_record_kafka_to_cn_kafka
    • Topology: kafka_source -> etl_processor -> post_output_field_processor -> kafka_sink
    • Data Flow: SESSION-RECORD(SESSION-RECORD-PROCESSED) -> SESSION-RECORD-CN
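
The Data Flow lines in this document can be read together as a small graph of topics and tables. The sketch below is only an illustration under stated assumptions: the `reachable` helper is hypothetical, jobs from different deployment scenarios are mixed into one graph, and `SESSION-RECORD(SESSION-RECORD-PROCESSED)` is assumed to mean that the CN job can read from either topic.

```python
from collections import deque

# (job, source, destination) triples copied from the Data Flow lines.
# Assumption: the CN job may consume either topic, so it is listed once per
# possible source. The troubleshooting job is omitted because its source and
# destination topic names are identical.
FLOWS = [
    ("etl_session_record_kafka_to_ndc_kafka", "SESSION-RECORD", "SESSION-RECORD-PROCESSED"),
    ("session_record_processed_kafka_to_clickhouse", "SESSION-RECORD-PROCESSED", "session_record_local"),
    ("etl_session_record_kafka_to_clickhouse", "SESSION-RECORD", "session_record_local"),
    ("etl_session_record_kafka_to_cn_kafka", "SESSION-RECORD", "SESSION-RECORD-CN"),
    ("etl_session_record_kafka_to_cn_kafka", "SESSION-RECORD-PROCESSED", "SESSION-RECORD-CN"),
]


def reachable(start, flows):
    """Return every destination reachable from `start` by following flows."""
    seen = set()
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for _job, src, dst in flows:
            if src == current and dst not in seen:
                seen.add(dst)
                queue.append(dst)
    return seen


if __name__ == "__main__":
    print(sorted(reachable("SESSION-RECORD", FLOWS)))
```

Under these assumptions, records from `SESSION-RECORD` can end up in `SESSION-RECORD-PROCESSED`, `session_record_local`, and `SESSION-RECORD-CN`, depending on which jobs a deployment actually runs.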