Alerting and Monitoring Tools
1.1 Overview
Purpose: Alerting and monitoring help identify threats in real time, optimize system performance,
and ensure compliance with security standards.
Key Tools: SIEM systems (e.g., Splunk, IBM QRadar, ArcSight) are widely used to collect,
correlate, and analyze logs from various sources.
Definition: Agent-based collection installs an agent on each host to collect event data, which is
then sent to a SIEM system for analysis.
How It Works:
o The agent service runs on endpoints (e.g., Windows, Linux, macOS) to gather logs.
o Collected data is filtered, aggregated, and transmitted securely to the SIEM server.
Use Cases:
o Windows Event Logs: Collected using agents like Microsoft Defender for Endpoint.
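The filter-and-aggregate step described above can be sketched roughly as follows. This is a minimal illustration, not how any particular agent works: the event fields (host, level, message), the SEVERITY ordering, and the function names are all assumptions made for the example, and secure transport (e.g., TLS) is left out.

```python
import json
from collections import Counter

# Hypothetical severity ordering; real agents define their own levels.
SEVERITY = {"debug": 0, "info": 1, "warning": 2, "error": 3, "critical": 4}


def filter_events(events, min_level="warning"):
    """Keep only events at or above the minimum severity."""
    floor = SEVERITY[min_level]
    return [e for e in events if SEVERITY[e["level"]] >= floor]


def aggregate_events(events):
    """Collapse identical (host, level, message) events into one record with a count."""
    counts = Counter((e["host"], e["level"], e["message"]) for e in events)
    return [
        {"host": host, "level": level, "message": message, "count": n}
        for (host, level, message), n in counts.items()
    ]


def build_payload(events, min_level="warning"):
    """Serialize the filtered, aggregated batch as JSON, ready to be transmitted."""
    batch = aggregate_events(filter_events(events, min_level))
    return json.dumps({"events": batch}, sort_keys=True)
```

Filtering and aggregating on the endpoint reduces the volume sent to the SIEM, at the cost of discarding detail that might later be useful for investigation.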
Definition: Agentless collection gathers logs from network devices (routers, switches, firewalls)
without requiring an agent on each device.
How It Works:
o Devices push log data to a SIEM server using protocols like Syslog.
Examples: Cisco routers and Palo Alto firewalls forwarding logs using Syslog.
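As a rough illustration of what a device pushes, the sketch below builds a traditional BSD-style (RFC 3164) Syslog line. The priority value at the start of each message is computed as facility * 8 + severity. The function names and sample values are assumptions for the example; real devices format and send these messages themselves.

```python
def syslog_priority(facility, severity):
    """Return the Syslog PRI value: facility * 8 + severity."""
    if not (0 <= facility <= 23 and 0 <= severity <= 7):
        raise ValueError("facility must be 0-23 and severity must be 0-7")
    return facility * 8 + severity


def format_bsd_syslog(facility, severity, timestamp, hostname, tag, message):
    """Build an RFC 3164-style line: <PRI>TIMESTAMP HOSTNAME TAG: MESSAGE."""
    pri = syslog_priority(facility, severity)
    return f"<{pri}>{timestamp} {hostname} {tag}: {message}"


# Facility 4 (security/authorization) with severity 2 (critical) gives PRI 34.
line = format_bsd_syslog(4, 2, "Oct 11 22:14:15", "fw01", "auth", "admin login failed")
```

On the receiving side, a SIEM reverses the calculation (facility = PRI // 8, severity = PRI % 8) to classify each message.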
2.3 Sensors
Definition: Sensors capture network traffic data and packet flows for analysis.
How It Works:
o Uses tools like Wireshark or Zeek (formerly Bro) for packet capture.
Use Case: Monitoring network traffic for signs of data exfiltration or DDoS attacks.
Definition: Log aggregation centralizes logs from various sources into a SIEM for correlation and analysis.
Normalization:
o Purpose: Converts logs from different formats into a standardized format for easier
searchability.
o Example: Parsing logs from Windows Event Viewer and Apache web server for unified
analysis.
Time Synchronization: Ensures timestamps in logs from different systems are aligned to a common
time zone so that events can be placed in the correct order.
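Normalization and time alignment can be sketched together: two differently shaped records are mapped onto one schema with UTC timestamps. This is only an illustration. The target schema, the function names, and the shape of the Windows record (an already-parsed dictionary with EventID, TimeCreated, TargetUserName, and IpAddress keys) are assumptions; the Apache pattern covers only the Common Log Format.

```python
import re
from datetime import datetime, timezone

# Matches the Apache Common Log Format; other formats need their own pattern.
APACHE_RE = re.compile(
    r'(?P<ip>\S+) \S+ (?P<user>\S+) \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3})'
)


def normalize_apache(line):
    """Map one Apache access-log line onto the common schema (None if unparsable)."""
    m = APACHE_RE.match(line)
    if m is None:
        return None
    ts = datetime.strptime(m["ts"], "%d/%b/%Y:%H:%M:%S %z")
    return {
        "timestamp": ts.astimezone(timezone.utc).isoformat(),
        "source": "apache",
        "src_ip": m["ip"],
        "user": None if m["user"] == "-" else m["user"],
        "action": f'{m["method"]} {m["path"]}',
        "outcome": "success" if int(m["status"]) < 400 else "failure",
    }


def normalize_windows(event):
    """Map a hypothetical pre-parsed Windows logon event onto the common schema."""
    ts = datetime.fromisoformat(event["TimeCreated"])  # expects an offset-aware value
    # Event ID 4624 is a successful logon; 4625 is a failed logon.
    outcome = {4624: "success", 4625: "failure"}.get(event["EventID"], "unknown")
    return {
        "timestamp": ts.astimezone(timezone.utc).isoformat(),
        "source": "windows",
        "src_ip": event.get("IpAddress"),
        "user": event.get("TargetUserName"),
        "action": "logon",
        "outcome": outcome,
    }
```

Once both sources share field names and a single time reference, one query or correlation rule can cover them.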
4.1 Alerting
Definition: The process of detecting potential incidents based on predefined correlation rules.
How It Works:
o Correlation Rules: Use logical expressions to identify suspicious patterns (e.g., multiple
failed login attempts followed by a successful one).
o Threat Intelligence Feeds: Enrich alerts with information about known threat indicators
(e.g., IP addresses associated with malware).
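The correlation rule from the example above (multiple failed logins followed by a successful one) and a threat-intelligence lookup can be sketched as follows. The threshold of 3 failures, the 5-minute window, the event fields, and the function names are assumptions chosen for illustration; SIEM products express such rules in their own rule languages.

```python
from datetime import datetime, timedelta


def detect_failures_then_success(events, threshold=3, window=timedelta(minutes=5)):
    """Return alerts where a user has >= threshold recent failures followed by a success.

    Events must be sorted by timestamp.
    """
    alerts = []
    recent_failures = {}  # user -> timestamps of failures still inside the window
    for e in events:
        user, ts = e["user"], e["timestamp"]
        failures = [t for t in recent_failures.get(user, []) if ts - t <= window]
        if e["outcome"] == "failure":
            failures.append(ts)
        elif e["outcome"] == "success":
            if len(failures) >= threshold:
                alerts.append({
                    "user": user,
                    "src_ip": e["src_ip"],
                    "time": ts,
                    "failed_attempts": len(failures),
                })
            failures = []  # a successful login resets the counter
        recent_failures[user] = failures
    return alerts


def enrich_with_threat_intel(alerts, known_bad_ips):
    """Add a flag showing whether the alert's source IP appears in a threat feed."""
    return [dict(a, known_bad_ip=a["src_ip"] in known_bad_ips) for a in alerts]
```

Enrichment does not change whether the rule fires; it adds context that helps an analyst prioritize the alert.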
Process:
o Analysis: Validates whether an alert is a true positive or false positive.
o Eradication and Recovery: Removes the threat and restores systems to normal
operation.
Use Case: A SIEM alert indicates malware on a server. The IT team isolates the server, scans for
malware, and restores it from a clean backup.
4.3 Reporting
Metrics:
5.1 Archiving
Definition: Retains historical logs and network traffic data for future analysis.
Benefits:
Retention Policy: Balances data volume with SIEM performance by archiving older logs.
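A retention policy of this kind can be sketched as a simple age cutoff. The 90-day value, the record shape, and the function name are assumptions for the example; actual retention periods depend on regulatory and organizational requirements.

```python
from datetime import datetime, timedelta, timezone


def split_by_retention(records, now, hot_days=90):
    """Split records into those kept in the SIEM and those moved to the archive."""
    cutoff = now - timedelta(days=hot_days)
    hot = [r for r in records if r["timestamp"] >= cutoff]
    archive = [r for r in records if r["timestamp"] < cutoff]
    return hot, archive
```

Passing the current time in as a parameter keeps the function deterministic and easy to test.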
Purpose: Alert tuning reduces false positives to prevent alert fatigue among security analysts.
Techniques:
o Addressing False Negatives: Ensure that legitimate threats are not overlooked.
Example: Tuning SIEM alerts to differentiate between normal and suspicious user behavior.
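One way to reason about tuning is to replay labeled historical alerts against candidate thresholds and count false positives and false negatives for each. The sketch below assumes each historical alert has a numeric score (for example, a failed-login count) and an analyst verdict; the data shape and function names are hypothetical.

```python
def evaluate_threshold(history, threshold):
    """Count outcomes if an alert fires whenever score >= threshold.

    history is a list of (score, is_real_threat) pairs.
    """
    fired = [(score, real) for score, real in history if score >= threshold]
    missed = [(score, real) for score, real in history if score < threshold]
    return {
        "threshold": threshold,
        "true_positives": sum(1 for _, real in fired if real),
        "false_positives": sum(1 for _, real in fired if not real),
        "false_negatives": sum(1 for _, real in missed if real),
    }


def pick_threshold(history, candidates):
    """Pick the candidate with the fewest false positives that misses no real threat."""
    results = [evaluate_threshold(history, t) for t in candidates]
    safe = [r for r in results if r["false_negatives"] == 0]
    if not safe:
        return None
    return min(safe, key=lambda r: r["false_positives"])
```

Requiring zero false negatives on past data is a deliberately conservative choice for the sketch; in practice the trade-off between missed threats and analyst workload is a policy decision.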
System Monitoring: Tracks the health of computer resources and network devices using SNMP
traps and NetFlow.
NetFlow Analysis: Provides metadata on network traffic to identify anomalies like unusual data
transfers.
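As a rough sketch of flow-based anomaly detection, the code below totals bytes per source address and flags sources far above the median. The flow record shape (src, bytes), the factor of 10, and the function names are assumptions; real NetFlow records carry more fields, and production baselines are usually built over time rather than from a single batch.

```python
from collections import defaultdict
from statistics import median


def bytes_by_source(flows):
    """Sum transferred bytes per source address."""
    totals = defaultdict(int)
    for flow in flows:
        totals[flow["src"]] += flow["bytes"]
    return dict(totals)


def unusual_senders(flows, factor=10):
    """Return sources whose total bytes exceed factor times the median total."""
    totals = bytes_by_source(flows)
    if not totals:
        return []
    baseline = median(totals.values())
    return sorted(src for src, total in totals.items() if total > factor * baseline)
```

The median is used as the baseline because a single very large sender shifts it far less than it would shift the mean.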
Application Monitoring:
Definition: System monitors assess the health and status of hosts using event logs and SNMP
traps.
Use Cases:
Tools:
Cloud Monitoring: Tools like AWS CloudWatch and Azure Monitor track cloud resource
utilization.
Definition: Data loss prevention (DLP) controls the movement of sensitive data to prevent unauthorized access or sharing.
Monitoring:
o Policy Violations: Track incidents where data is copied to unauthorized media.
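Tracking policy violations can be sketched as a filter over file-copy events. The event fields (user, classification, destination), the APPROVED_MEDIA set, and the function names are hypothetical and stand in for whatever a real DLP product records.

```python
from collections import Counter

# Hypothetical list of destinations that policy allows for sensitive data.
APPROVED_MEDIA = {"network_share", "encrypted_usb"}


def find_violations(copy_events, approved=APPROVED_MEDIA):
    """Return events where sensitive data was copied to an unapproved destination."""
    return [
        e for e in copy_events
        if e["classification"] == "sensitive" and e["destination"] not in approved
    ]


def violations_per_user(violations):
    """Count violations per user to highlight repeat incidents."""
    return dict(Counter(e["user"] for e in violations))
```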
8.1 Overview
Definition: Compliance scanning compares system configurations against industry benchmarks
(e.g., CIS Benchmarks, NIST standards).
Purpose: Ensure compliance with regulatory standards and internal security policies.
Scenario: A financial institution conducts monthly compliance scans to ensure adherence to
PCI-DSS standards.
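A compliance scan can be thought of as a set of checks applied to a system's configuration. In the sketch below, the setting names and required values in BENCHMARK are invented for illustration and are not taken from CIS, NIST, or PCI-DSS; the function names are likewise assumptions.

```python
# Hypothetical benchmark: setting name -> check that returns True when compliant.
BENCHMARK = {
    "password_min_length": lambda v: isinstance(v, int) and v >= 14,
    "ssh_permit_root_login": lambda v: v == "no",
    "firewall_enabled": lambda v: v is True,
}


def scan_config(config, benchmark=BENCHMARK):
    """Evaluate every benchmark check; a missing setting is treated as non-compliant."""
    findings = []
    for setting, check in benchmark.items():
        value = config.get(setting)
        findings.append({
            "setting": setting,
            "value": value,
            "compliant": bool(check(value)),
        })
    return findings


def compliance_rate(findings):
    """Fraction of checks that passed (0.0 when there are no findings)."""
    if not findings:
        return 0.0
    return sum(f["compliant"] for f in findings) / len(findings)
```

Running the same scan on a schedule, as in the monthly scenario above, makes it possible to track the compliance rate over time and spot configuration drift.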
Conclusion
Effective alerting and monitoring are critical for detecting and mitigating cybersecurity threats. By
leveraging agent-based and agentless collection methods, log aggregation, alert tuning, and continuous
monitoring, organizations can enhance their security posture and respond more effectively to incidents.
Regular archiving, compliance scanning, and reporting ensure that systems remain secure and compliant
with industry standards.