
ObservabilityGuy

LoongCollector Security Log Ingestion Practice: Standardized Log Collection for Enterprise Firewall Scenarios

Background

Log Standardization: An Inevitable Requirement for Security Protection

In today's highly interconnected digital environment, cybersecurity threats are becoming more diverse, more covert, and increasingly cross-platform. To build complete security situational awareness, enterprises need to collect logs from many sources, such as firewalls, endpoint devices, and identity authentication systems. These logs often come from security products supplied by several different vendors. Diverse log sources, unstructured formats, and fragmented fields have long plagued security teams, driving up the cost of data integration and lowering the efficiency of analysis.

LoongCollector: Enable Flexible Configuration Solutions

LoongCollector is a lightweight log collection tool that supports unified ingestion of multi-source logs by handling mainstream log formats and offering a range of parsing methods. Its design goal goes beyond collection: standardized ingestion and flexible parsing rules lay the foundation for subsequent security analysis, threat detection, and compliance auditing. The following sections walk through the ingestion of enterprise-grade firewall logs, using Chaitin Web Application Firewall (WAF) logs, FortiGate security logs, and Palo Alto security logs as examples.

Prerequisites
Preparation
● Create a Logstore for log storage and configure indexes for query and analysis
● Install LoongCollector on the collection server and create a machine group that includes it

How LoongCollector works
Firewall logs from cybersecurity vendors can be forwarded to a collection server over the Syslog protocol. LoongCollector receives and parses them on that server and then writes them to Alibaba Cloud Simple Log Service (SLS). For more information on the working principle, see the Documentation.

Practice on Ingesting Enterprise-Grade Firewall Logs
Note: The following section focuses on the scenario description and practical verification. The specific collection configurations of LoongCollector, such as port configuration, field settings, and plug-in processing, can be supplemented or modified based on actual deployment requirements.

Chaitin WAF logs
Chaitin WAF (SafeLine) is an intelligent Web Application Firewall from Chaitin Technology that focuses on efficient and accurate web security protection. Chaitin is one of the leading vendors of web application protection in the Chinese cybersecurity market. In addition to the Enterprise Edition, SafeLine is available as a free Community Edition for trial use.

Enable Syslog Outbound Transmission
SafeLine WAF events can be forwarded to third-party machines in JSON format by configuring Syslog forwarding. Logs contain details such as the request protocol, source IP address, port, timestamp, hostname, request method, event ID, attack type, and risk level.

● Go to the System Settings page of SafeLine and use the Syslog Settings option to enable Syslog outbound transmission.
● SafeLine transmits Syslog logs over UDP, and the message format complies with RFC 5424.
● After the configuration is complete, click the Test button. If the Syslog server receives the test message, the configuration is successful.
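
Independently of the Test button, the receiving side can also be exercised with a hand-built message. The following is a minimal Python sketch, assuming a listener on UDP port 5144 (the port used in the collection configuration in this article). The hostname waf-test, the app name safeline, and the helper names build_rfc5424 and send_udp are hypothetical, and the payload is not a real SafeLine event.

```python
import socket
from datetime import datetime, timezone


def build_rfc5424(msg, hostname="waf-test", app="safeline", facility=16, severity=6):
    """Build an RFC 5424 style line: <PRI>VERSION TIMESTAMP HOST APP PROCID MSGID SD MSG."""
    pri = facility * 8 + severity  # local0.info -> 134
    ts = datetime.now(timezone.utc).isoformat(timespec="milliseconds").replace("+00:00", "Z")
    # PROCID, MSGID and STRUCTURED-DATA are left as the nil value "-"
    return f"<{pri}>1 {ts} {hostname} {app} - - - {msg}"


def send_udp(message, host="127.0.0.1", port=5144):
    """Send one datagram to the syslog listener (host and port are assumptions)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("utf-8"), (host, port))


# Example usage (uncomment on a machine where the listener is running):
# send_udp(build_rfc5424('{"attack_type": "test", "risk_level": "low"}'))
```

Because UDP is connectionless, a successful send does not prove delivery; the check is whether the test message shows up on the receiving side.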

Configure LoongCollector for Collection
The LoongCollector configuration below receives Chaitin WAF logs over Syslog and uses a processing plug-in to expand the default content field (_content_), which carries the JSON payload, into key-value (KV) pairs. For more information about specific operation steps, see Documentation.

{
    "inputs": [
        {
            "type": "service_syslog",
            "detail": {
                "Address": "udp://0.0.0.0:5144", // If one server receives several log types, use a different port for each type
                "ParseProtocol": "rfc5424"
            }
        }
    ],
    "processors": [
        {
            "type": "processor_json",
            "detail": {
                "ExpandArray": false,
                "ExpandConnector": ".", // Configure as needed
                "ExpandDepth": 0,
                "IgnoreFirstConnector": true,
                "KeepSource": false, // Whether to keep the original field, configure as needed
                "KeepSourceIfParseError": true,
                "NoKeyError": true,
                "Prefix": "", // Configure as needed
                "SourceKey": "_content_", // Name of the original field whose JSON content is expanded
                "UseSourceKeyAsPrefix": false // Configure as needed
            }
        }
    ]
}

Collection Result
The sample log below is taken from the official website.

{
  "scheme": "http",                 // The request protocol is HTTP
  "src_ip": "12.123.123.123",       // Source IP address
  "src_port": 53008,                // Source port number
  "socket_ip": "10.2.71.103",       // Socket IP address
  "upstream_addr": "10.2.34.20",    // Upstream address
  "req_start_time": 1712819316749,  // Request start time
  "rsp_start_time": null,           // Response start time
  "req_end_time": 1712819316749,    // Request end time
  "rsp_end_time": null,             // Response end time
  "host": "safeline-ce.chaitin.net",// Hostname
  "method": "GET",                  // The request method is GET
  "query_string": "",               // Query string
  "event_id": "32be0ce3ba6c44be9ed7e1235f9eebab",            // Event ID
  "session": "",                    // Session
  "site_uuid": "35",                // Site UUID
  "site_url": "http://safeline-ce.chaitin.net:8083",         // Site URL
  "req_detector_name": "1276d0f467e4",                       // Request detector name
  "req_detect_time": 286,           // Request detection time
  "req_proxy_name": "16912fe30d8f", // Request proxy name
  "req_rule_id": "m_rule/9bf31c7ff062936a96d3c8bd1f8f2ff3",  // Request rule ID
  "req_location": "urlpath",        // The request location is URL path
  "req_payload": "",                // The request payload is empty
  "req_decode_path": "",            // Request decode path
  "req_rule_module": "m_rule",      // The request rule module is m_rule
  "req_http_body_is_truncate": 0,   // Request HTTP body
  "rsp_http_body_is_truncate": 0,   // Response HTTP body
  "req_skynet_rule_id_list": [      // Request Skynet rule ID list
    65595,
    65595
  ],
  "http_body_is_abandoned": 0,      // Whether the HTTP body is abandoned
  "country": "US",                  // Country
  "province": "",                   // Province
  "city": "",                       // City
  "timestamp": 1712819316,          // Timestamp (Unix time, in seconds)
  "payload": "",                    // Payload
  "location": "urlpath",            // The location is URL path
  "rule_id": "m_rule/9bf31c7ff062936a96d3c8bd1f8f2ff3",     // Rule ID
  "decode_path": "",                // Decode path
  "cookie": "sl-session=Z0WLa8mjGGZPki+QHX+HNQ==",          // Cookie
  "user_agent": "PostmanRuntime/7.28.4",                    // User agent
  "referer": "",                     // Referrer
  "timestamp_human": "2024-04-11 15:08:36",                 // Timestamp
  "resp_reason_phrase": "",         // Response
  "module": "m_rule",               // The module is m_rule
  "reason": "",                     // Reason
  "proxy_name": "16912fe30d8f",     // Proxy name
  "node": "1276d0f467e4",           // Node
  "dest_port": 8083,                // Destination port number
  "dest_ip": "10.2.34.20",          // Destination IP address
  "urlpath": "/webshell.php",       // URL path
  "protocol": "http",               // The protocol is HTTP
  "attack_type": "backdoor",        // Attack type
  "risk_level": "high",             // Risk level
  "action": "deny",                 // Action
  "req_header_raw": "GET /webshell.php HTTP/1.1\r\nHost: safeline-ce.chaitin.net:8083\r\nUser-Agent: PostmanRuntime/7.28.4\r\nAccept: */*\r\nAccept-Encoding: gzip, deflate, br\r\nCache-Control: no-cache\r\nCookie: sl-session=Z0WLa8mjGGZPki+QHX+HNQ==\r\nPostman-Token: 8e67bec1-6e79-458c-8ee5-0498f3f724db\r\nX-Real-Ip: 12.123.123.123\r\nSL-CE-SUID: 35\r\n\r\n",                      // Raw content of request header
  "body": "",                       // Body
  "req_block_reason": "web",        // Request block reason
  "req_attack_type": "backdoor",    // Request attack type
  "req_risk_level": "high",         // Request risk level
  "req_action": "deny"              // Action
}

The collection result shows that the log has been parsed into standard JSON KV pairs, and that fields such as req_header_raw have been correctly escaped and displayed. Based on the normalized logs, you can create indexes, run queries and analysis, and configure alerts and visualizations.
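
To make the expansion step concrete, the following Python sketch roughly emulates what the JSON processing stage does to the _content_ field. It is an illustration under stated assumptions, not the plug-in's implementation: the function name expand_json is hypothetical, the nested geo object in the usage example is invented to show the connector, and the handling of arrays and non-string values (kept as JSON text) is an assumption.

```python
import json


def expand_json(event, source_key="_content_", connector=".", keep_source=False):
    """Expand a JSON string stored under source_key into top-level fields."""
    out = dict(event)
    try:
        parsed = json.loads(out.get(source_key))
    except (TypeError, ValueError):
        return out  # parsing failed: keep the source field untouched
    if not isinstance(parsed, dict):
        return out
    if not keep_source:
        del out[source_key]

    def walk(prefix, value):
        if isinstance(value, dict):
            for key, child in value.items():
                walk(f"{prefix}{connector}{key}" if prefix else key, child)
        else:
            # Strings are kept; numbers, nulls and arrays are stored as JSON text
            out[prefix] = value if isinstance(value, str) else json.dumps(value)

    walk("", parsed)
    return out


event = {"_content_": '{"src_ip": "12.123.123.123", "src_port": 53008, '
                      '"req_skynet_rule_id_list": [65595, 65595], "geo": {"country": "US"}}'}
fields = expand_json(event)
```

In this sketch, fields contains src_ip, src_port, req_skynet_rule_id_list, and geo.country as separate keys, and the original _content_ key is dropped because keep_source is False.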

Connect to FortiGate Logs
Fortinet is a global leader in cybersecurity and positions itself as an advocate of an "integrated security architecture". FortiGate is Fortinet's next-generation firewall (NGFW). It integrates multiple security functions, such as firewall, IPS, VPN, sandbox, and WAF, to provide protection for scenarios ranging from small and medium-sized enterprises (SMEs) to data centers.

Configure Syslog Forwarding
For more information on configuring Syslog forwarding for FortiGate logs, see Configure Syslog Forwarding. In addition to the default format, this forwarding function also supports the CEF and CSV formats.

Collect logs in the default format
Log Sample of Fortinet Default Format
The following sample is a FortiGate web filtering log. For more information about log types, see Web filtering logs.

date=2019-05-13 time=16:29:45 logid="0316013056" type="utm" subtype="webfilter" eventtype="ftgd_blk" level="warning" vd="vdom1" eventtime=1557790184975119738 policyid=1 sessionid=381780 srcip=10.1.100.11 srcport=44258 srcintf="port12" srcintfrole="undefined" dstip=185.244.31.158 dstport=80 dstintf="port11" dstintfrole="undefined" proto=6 service="HTTP" hostname="morrishittu.ddns.net" profile="test-webfilter" action="blocked" reqtype="direct" url="/" sentbyte=84 rcvdbyte=0 direction="outgoing" msg="URL belongs to a denied category in policy" method="domain" cat=26 catdesc="Malicious Websites" crscore=30 craction=4194304 crlevel="high"


Collection Configuration and Results
FortiGate logs can be collected with the following configuration. Remove the comments before applying it.

{
    "inputs": [
        {
            "Type": "service_syslog", // Receive syslog logs forwarded by FortiGate
            "Address": "udp://0.0.0.0:9002",
            "ParseProtocol": "rfc5424",
            "IgnoreParseFailure": true
        }
    ],
    "processors": [
        {
            "Type": "processor_split_key_value", // Expand in key-value pair mode
            "ErrIfSourceKeyNotFound": true,
            "ErrIfSeparatorNotFound": true,
            "Quote": "\"", // Configure the quote symbol as '"'
            "SourceKey": "_content_", // The original field forwarded by syslog
            "Delimiter": " ", // Key-value pairs are separated by spaces
            "KeepSource": false,
            "UseSourceKeyAsPrefix": false,
            "ErrIfKeyIsEmpty": true,
            "Separator": "=", // Keys and values are separated by '='
            "DiscardWhenSeparatorNotFound": false
        }
    ]
}

The collection result shows that the log is correctly structured and expanded. For example, the value URL belongs to a denied category in policy, which contains spaces, is mapped to the msg field as a whole because the quote symbol is configured as '"'.
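
The effect of the quote setting can be reproduced offline. The following Python sketch uses the standard shlex module to split a shortened copy of the sample log on spaces while respecting double quotes, and then splits each token on the first "=". The function name split_key_value is hypothetical, and the sketch only mirrors the idea of the key-value processing step, not its implementation.

```python
import shlex


def split_key_value(content):
    """Split 'k=v k2="v with spaces"' into a dict, honoring double quotes."""
    fields = {}
    for token in shlex.split(content):
        key, sep, value = token.partition("=")
        if sep and key:  # skip tokens without a separator or with an empty key
            fields[key] = value
    return fields


sample = (
    'date=2019-05-13 time=16:29:45 logid="0316013056" type="utm" '
    'subtype="webfilter" level="warning" srcip=10.1.100.11 action="blocked" url="/" '
    'msg="URL belongs to a denied category in policy" catdesc="Malicious Websites"'
)
parsed = split_key_value(sample)
```

Without quote handling, a plain split on spaces would cut the msg value after the word URL and turn the remaining words into tokens that have no "=" separator.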

Collect Logs in CEF Format
CEF (Common Event Format) is a standardized log format proposed by ArcSight for security events. It unifies the security log formats generated by different devices so that SIEM (Security Information and Event Management) systems can parse them easily. The CEF format is as follows:

Jan 11 10:25:39 host CEF:Version|Device Vendor|Device Product|Device Version|Device Event Class ID|Name|Severity|[Extension]

Log Sample of Fortinet CEF Format
The following is an example of a traffic log in the Fortinet CEF format. The log is available on the FortiGate official website.

Dec 27 11:12:30 FGT-A-LOG CEF: 0|Fortinet|Fortigate|v6.0.3|00013|traffic:forward accept|3|deviceExternalId=FGT5HD3915800610 FTNTFGTlogid=0000000013 cat=traffic:forward FTNTFGTsubtype=forward FTNTFGTlevel=notice FTNTFGTvd=vdom1 FTNTFGTeventtime=1545937950 src=10.1.100.11 spt=58843 deviceInboundInterface=port12 FTNTFGTsrcintfrole=undefined dst=172.16.200.55 dpt=53 deviceOutboundInterface=port11 FTNTFGTdstintfrole=undefined FTNTFGTpoluuid=c2d460aa-fe6f-51e8-9505-41b5117dfdd4 externalId=440 proto=17 act=accept FTNTFGTpolicyid=1 FTNTFGTpolicytype=policy app=DNS FTNTFGTdstcountry=Reserved FTNTFGTsrccountry=Reserved FTNTFGTtrandisp=snat sourceTranslatedAddress=172.16.200.1 sourceTranslatedPort=58843 FTNTFGTappid=16195 FTNTFGTapp=DNS FTNTFGTappcat=Network.Service FTNTFGTapprisk=elevated FTNTFGTapplist=g-default FTNTFGTduration=180 out=70 in=528 FTNTFGTsentpkt=1 FTNTFGTrcvdpkt=1 FTNTFGTcustom_name1=HN123456 FTNTFGTcustom_name2=accounting_dpt

Collection Configuration

{
    "inputs": [
        {
            "Type": "service_syslog", // Receive logs in Fortinet CEF format
            "Address": "udp://0.0.0.0:9003",
            "ParseProtocol": "rfc5424",
            "IgnoreParseFailure": true
        }
    ],
    "processors": [
        {
            "Type": "processor_parse_delimiter_native",
            "SourceKey": "content",
            "Separator": "|", // Extract specific CEF fields based on the delimiter
            "Quote": "\"", // The quote symbol is '"'
            "Keys": [
                "time", // The corresponding content is Dec 27 11:12:30 FGT-A-LOG CEF: 0
                "Vendor",
                "Product",
                "Version",
                "Signature_ID",
                "Name",
                "Severity",
                "Extension" // Corresponding to the subsequent fields of Severity
            ],
            "KeepingSourceWhenParseFail": true
        },
        {
            "Type": "processor_split_key_value",
            "ErrIfSourceKeyNotFound": true,
            "ErrIfSeparatorNotFound": true,
            "Quote": "\"", // Quote symbol
            "SourceKey": "Extension", // Expand KV pairs based on the Extension field
            "Delimiter": " ", // Separate different KV pairs
            "KeepSource": false,
            "ErrIfKeyIsEmpty": true,
            "Separator": "=", // Separate key and value
            "DiscardWhenSeparatorNotFound": false
        },
        {
            "Type": "processor_regex",
            "FullMatch": false,
            "SourceKey": "time", // Corresponding to Dec 27 11:12:30 FGT-A-LOG CEF: 0
            "Regex": "^([A-Z][a-z]{2}\\s+\\d{1,2}\\s+\\d{2}:\\d{2}:\\d{2})\\s+(\\S+)\\s+CEF:\\s*(\\d+)",
            "Keys": [
                "Time", // Extract time
                "Host", // Extract host
                "CEF_Version" // Extract CEF version
            ],
            "KeepSource": false,
            "KeepSourceIfParseError": true,
            "NoKeyError": false,
            "NoMatchError": true
        }
    ]
}
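
The three processing stages in this configuration (split on "|", expand the Extension field into key-value pairs, then extract time, host, and CEF version from the first segment) can be traced offline with a short Python sketch. The function name parse_cef is hypothetical, the sample is a shortened copy of the Fortinet log above, and the extension regex is a simplification that does not handle escaped "=" characters inside values.

```python
import re

# Same pattern as the Regex setting in the configuration above
HEADER_RE = re.compile(
    r"^([A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+(\S+)\s+CEF:\s*(\d+)"
)
# A value runs until the next " key=" or the end of the string
EXT_RE = re.compile(r"(\w+)=(.*?)(?=\s+\w+=|$)")


def parse_cef(line):
    parts = line.split("|", 7)  # 7 header delimiters, the remainder is Extension
    if len(parts) != 8:
        raise ValueError("not a CEF line")
    prefix, vendor, product, version, signature_id, name, severity, extension = parts
    match = HEADER_RE.match(prefix)
    if match is None:
        raise ValueError("unexpected syslog/CEF prefix")
    event = {
        "Time": match.group(1), "Host": match.group(2), "CEF_Version": match.group(3),
        "Vendor": vendor, "Product": product, "Version": version,
        "Signature_ID": signature_id, "Name": name, "Severity": severity,
    }
    event.update(dict(EXT_RE.findall(extension)))
    return event


sample = (
    "Dec 27 11:12:30 FGT-A-LOG CEF: 0|Fortinet|Fortigate|v6.0.3|00013|"
    "traffic:forward accept|3|deviceExternalId=FGT5HD3915800610 src=10.1.100.11 "
    "spt=58843 dst=172.16.200.55 dpt=53 proto=17 act=accept app=DNS"
)
event = parse_cef(sample)
```

Limiting the split to seven delimiters keeps any "|" that might appear inside the extension from producing extra columns.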

Collection Result

Palo Alto Network Log Ingestion
Palo Alto Networks is a leading global cybersecurity company. With the next-generation firewall (NGFW) at its core, it has built a comprehensive security portfolio covering network, cloud, endpoint, and Internet of Things (IoT) scenarios, and it is widely regarded as a benchmark for innovation in network security.

Configure Syslog Forwarding
To collect Palo Alto NGFW logs, first create a Syslog server configuration (FQDN, port, and transmission protocol), and then configure the log types to forward and the certificate. For more information, see the Palo Alto Networks firewall configuration documentation.

Configure LoongCollector Collection
Palo Alto Networks firewalls can forward various types of logs (such as traffic logs, threat logs, URL filtering logs, and data filtering logs) to external servers. The forwarded logs contain standard fields and support severity levels, custom formats, and escape sequences. For ease of parsing, fields are delimited by commas, so each log is a comma-separated value (CSV) string. A FUTURE_USE label marks a field that is not currently used in the Syslog receiving scenario.

Sample of Raw Forwarding Log
The following is a sample of a raw forwarded log (sensitive content has been masked).

Feb 27 10:25:14 xxxx-X01xxxx.xxx.com 1,2025/02/27 10:25:14,026701001784,THREAT,vulnerability,2816,2025/02/27 10:25:14,30.**.***.192,142.***.**.227,113.**.**.129,142.***.**.227,LAN-To-WAN,,,ssl,vsys1,Inside,Outside,ae1,ethernet1/3,Global Log Forwarding to PlatformTest,2025/02/27 10:25:14,631169,1,25970,443,47499,443,0x402400,tcp,alert,"xxxxxx-xxx-xxxx-xx.xxxx.xxxxxx.com/",Non-RFC Compliant SSL Traffic on Port 443(56112),content-delivery-networks,informational,client-to-server,7365339184546050694,0x8000000000000000,United States,United States,,,0,,,0,,,,,,,,0,57,34,99,0,,SIP-PA03,,,,,0,,0,,N/A,protocol-anomaly,AppThreat-xxxx-xxxx,0x0,0,4294967295,,,cxxxxxx3-xxxx-xxxx-xxxxx-bxxxxxx1xxxx,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,2025-02-27T10:25:14.811+08:00,,,,encrypted-tunnel,networking,browser-based,4,"used-by-malware,able-to-transfer-file,has-known-vulnerability,tunnel-other-application,pervasive-use",,ssl,no,no,,,NonProxyTraffic

Configure Collection
You can configure rsyslog on the local server to receive the Palo Alto logs and write them to log files under /var/log/rsyslog/. Then configure LoongCollector file collection to collect these files, and use the parse-csv command of SPL to split each log into fields.


* | parse-csv content as FUTURE_USE, Receive_Time, Serial_Number, Type, Threat_Content_Type, FUTURE_USE_1, Generated_Time, Source_Address, Destination_Address, NAT_Source_IP, NAT_Destination_IP, Rule_Name, Source_User, Destination_User, Application, Virtual_System, Source_Zone, Destination_Zone, Inbound_Interface, Outbound_Interface, Log_Action, FUTURE_USE_2, Session_ID, Repeat_Count, Source_Port, Destination_Port, NAT_Source_Port, NAT_Destination_Port, Flags, IP_Protocol, Action, URL_Filename, Threat_ID, Category, Severity, Direction, Sequence_Number, Action_Flags, Source_Location, Destination_Location, FUTURE_USE_3, Content_Type, PCAP_ID, File_Digest, Cloud, URL_Index, User_Agent, File_Type, X_Forwarded_For, Referer, Sender, Subject, Recipient, Report_ID, Device_Group_Hierarchy_Level_1, Device_Group_Hierarchy_Level_2, Device_Group_Hierarchy_Level_3, Device_Group_Hierarchy_Level_4, Virtual_System_Name, Device_Name, FUTURE_USE_4, Source_VM_UUID, Destination_VM_UUID, HTTP_Method, Tunnel_ID_IMSI, Monitor_Tag_IMEI, Parent_Session_ID, Parent_Start_Time, Tunnel_Type, Threat_Category, Content_Version, FUTURE_USE_5, SCTP_Association_ID, Payload_Protocol_ID, HTTP_Headers, URL_Category_List, Rule_UUID, HTTP_2_Connection, Dynamic_User_Group_Name, XFF_Address, Source_Device_Category, Source_Device_Profile, Source_Device_Model, Source_Device_Vendor, Source_Device_OS_Family, Source_Device_OS_Version, Source_Hostname, Source_MAC_Address, Destination_Device_Category, Destination_Device_Profile, Destination_Device_Model, Destination_Device_Vendor, Destination_Device_OS_Family, Destination_Device_OS_Version, Destination_Hostname, Destination_MAC_Address, Container_ID, POD_Namespace, POD_Name, Source_External_Dynamic_List, Destination_External_Dynamic_List, Host_ID, Serial_Number_2, Domain_EDL, Source_Dynamic_Address_Group, Destination_Dynamic_Address_Group, Partial_Hash, High_Resolution_Timestamp, Reason, Justification, A_Slice_Service_Type, Application_Subcategory, Application_Category, Application_Technology, Application_Risk, Application_Characteristic, Application_Container, Tunneled_Application, Application_SaaS, Application_Sanctioned_State, Cloud_Report_ID, Cluster_Name, Flow_Type | project-away content
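
The positional mapping performed by parse-csv can be illustrated with Python's standard csv module. This is a sketch on a shortened, hypothetical record, not the SPL implementation: the hostname fw.example.com and the function name parse_pan_csv are assumptions, and Application_Characteristic is placed right after the first five columns only to keep the example short.

```python
import csv

# First five names from the parse-csv statement, plus one quoted field.
# In the real layout Application_Characteristic appears much later in the record.
FIELDS = [
    "FUTURE_USE", "Receive_Time", "Serial_Number", "Type",
    "Threat_Content_Type", "Application_Characteristic",
]


def parse_pan_csv(record, fields):
    """Map comma-separated values to field names by position."""
    values = next(csv.reader([record]))
    return dict(zip(fields, values))


record = (
    'Feb 27 10:25:14 fw.example.com 1,2025/02/27 10:25:14,026701001784,'
    'THREAT,vulnerability,"used-by-malware,able-to-transfer-file"'
)
row = parse_pan_csv(record, FIELDS)
```

Two details are visible here: the commas inside the quoted value do not start new columns, and because the line written by rsyslog starts with the syslog timestamp and hostname, that header ends up in the first column (FUTURE_USE) of this sketch.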

Collection and Parsing Result

Conclusion
The LoongCollector ingestion practice above shows that unified collection of multi-source logs is more than a simple "pipeline" function: it must also account for format compatibility and extensibility. Handled this way, it helps reduce the "data silo" problem in enterprise security analysis, especially in threat analysis and compliance scenarios. As cloud-native architectures become more widespread, LoongCollector can also be used to collect logs from security scenarios beyond firewall products. On this foundation, subsequent storage, query, analysis, and visualization can be performed through the SLS intelligent computing engine. For security teams, choosing a log collection tool that meets current needs and also supports future expansion is a crucial step in building a comprehensive defense system.
