Performance Guideline For Syslog-Ng Store Box: June 27, 2018
Abstract
Performance analysis of syslog-ng Store Box
syslog-ng.com 2
1. Preface
This whitepaper enables syslog-ng Store Box (SSB) end-users, integrators, and sales personnel to make
predictions about the performance of the SSB appliance based on various environmental and configuration
parameters.
Log collecting performance
During RAID synchronization the performance can drop significantly, because of the heavy load of the disk
subsystem. As a result, a new SSB installation may give misleading performance numbers. Always wait for
the RAID synchronization to finish before testing the performance.
If you use DNS for the log source, but the DNS server is unreachable, the performance of SSB decreases
significantly.
Sending logs from one log source into multiple logspaces degrades performance. If possible, send logs from
one (or more) log sources into a single logspace, and do not duplicate the logs. If you need to filter or aggregate
log messages in different ways, consider using the filtered logspace and multiple logspace features. (For details,
see Procedure 8.4, Creating filtered logspaces in The syslog-ng Store Box 5 LTS Administrator Guide and
Procedure 8.6, Creating multiple logspaces in The syslog-ng Store Box 5 LTS Administrator Guide.)
Parsing syslog headers adds an extra 18% overhead. You can improve the raw performance of SSB by selecting
the Do not parse option in the log source. However, in this case you cannot search and filter the host, program,
pid, and other fields of these messages.
If you disable flow control on a log source, SSB does not throttle back clients, which appears to increase
performance. However, it may lead to message loss.
■ Number of plain TCP connections to the log sources of SSB, up to around 5000 connections.
■ Number of SSL/TLS TCP connections to the log sources of SSB, up to around 1000 connections.
■ SSB can process about 5% more messages using the IETF-syslog protocol than using the legacy
BSD-syslog protocol.
■ Enabling debug logging in SSB has no effect: debug logs are related to tracing web access and related
operations.
■ The Trusted, Use DNS, Use FQDN settings have limited effect on performance, provided that DNS
is correctly set up. Internally there is a DNS cache in syslog-ng. (For details on these settings, see
Procedure 7.3, Creating syslog message sources in SSB in The syslog-ng Store Box 5 LTS
Administrator Guide.)
Overall performance
Depending on its exact configuration and the mix of log formats received, the largest SSB appliance can collect
and index up to 100,000 messages per second (100k EPS) for sustained periods.
Introduction to SSB search algorithms
Level 2: Log messages are stored in a single file per day. There is also a file that lists index files (level 3)
related to distinct time intervals inside the day.
Level 3: Index file that holds an ordered list of tokens processed when SSB received the logs. For each token
there is a list of unique identifiers that points to the messages that contained the token.
Tokens are the words separated by the delimiters set for the logspace (for details, see Procedure 8.1.2,
Configuring the indexer service in The syslog-ng Store Box 5 LTS Administrator Guide).
On Level 3, SSB looks up the tokens that match the basic expressions in the search query. Since the tokens are
stored in alphabetic order, this lookup is very fast for exact searches. If the token contains wildcards (* or ?),
then potential matches are checked individually.
At this point SSB has the list of message identifiers it needs to calculate AND, OR, NOT expressions and
finalize search results per day. Getting the final result simply means repeating the procedure for all the days
that are requested in the search interval.
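The lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual SSB implementation: the token list, the message identifiers, and the function names are hypothetical, and Python's fnmatchcase stands in for the * and ? wildcard matching (it also treats [ ] as special, which the sample tokens do not use).

```python
from bisect import bisect_left
from fnmatch import fnmatchcase

# Hypothetical stand-in for one Level 3 index file: tokens in alphabetic
# order, each paired with the IDs of the messages that contained it.
TOKENS = ["apple", "login", "restart", "system", "username"]
POSTINGS = [{1, 4}, {2, 3, 4}, {5}, {1, 5}, {2, 3}]
ALL_IDS = {1, 2, 3, 4, 5}


def lookup_exact(token):
    """Exact search: binary search in the ordered token list."""
    i = bisect_left(TOKENS, token)
    if i < len(TOKENS) and TOKENS[i] == token:
        return set(POSTINGS[i])
    return set()


def lookup_wildcard(pattern):
    """Wildcard search: check every token individually against the pattern."""
    ids = set()
    for token, posting in zip(TOKENS, POSTINGS):
        if fnmatchcase(token, pattern):
            ids |= posting
    return ids


def lookup(expr):
    """Dispatch a basic expression to the exact or wildcard lookup."""
    if "*" in expr or "?" in expr:
        return lookup_wildcard(expr)
    return lookup_exact(expr)


# Boolean operators work on the sets of message identifiers.
both = lookup("login") & lookup("user*")      # AND -> {2, 3}
either = lookup("restart") | lookup("apple")  # OR  -> {1, 4, 5}
negated = ALL_IDS - lookup("login")           # NOT -> {1, 5}
```

In this sketch a multi-day search would simply repeat the same steps once per day and concatenate the per-day results.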
Search performance (SSB T4)
Overview
In this section we describe SSB search algorithm performance, measured from starting a search to returning
the first 100 results. When a search is executed, SSB calculates the unique identifiers of every search result,
without loading the individual messages. The actual messages are loaded temporarily only when requested on
the user interface or the RPC API.
This means that it can easily happen that calculating the results takes under a second, but fetching all the resulting
messages takes minutes, because it takes time to read the messages from the disk and return them. This also
means that the size of the messages has no impact on the memory usage of search.
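The separation between calculating identifiers and loading messages can be illustrated with a minimal sketch. The function name fetch_page and the load_message callback are hypothetical and are not part of the SSB RPC API; the point is only that the cost of reading messages is paid per requested result, not per search.

```python
from typing import Callable, Iterator, List


def fetch_page(result_ids: List[int], load_message: Callable[[int], str],
               offset: int = 0, limit: int = 100) -> Iterator[str]:
    """Load message bodies only for the requested slice of result IDs."""
    for message_id in result_ids[offset:offset + limit]:
        yield load_message(message_id)


# Hypothetical message store and a loader that records what it reads.
store = {i: f"message {i}" for i in range(1000)}
loaded = []


def load(message_id: int) -> str:
    loaded.append(message_id)
    return store[message_id]


ids = list(range(0, 1000, 2))            # 500 matching identifiers
page = list(fetch_page(ids, load))       # only the first 100 are read
```

Here the search result holds 500 identifiers, but only the 100 messages on the requested page are actually read from the store.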
We conducted our tests using a real-life logspace containing 200 million log entries (about 9.1Gb
compressed). We executed the searches directly on SSB to avoid network and caching effects. The test hardware
was an SSB T4 appliance, but the response times are very similar on SSB T10 appliances as well. For SSB T1,
response times are higher by a factor of 2.5 on average.
Example1: username
Example2: restart
The simplest search expression is a specific token, like login. Tokens are the words separated by the delimiters
set for the logspace (for details, see Procedure 8.1.2, Configuring the indexer service in The syslog-ng Store
Box 5 LTS Administrator Guide).
For 200 million logs, searching for a token takes between 1 and 5 seconds, and the memory used in bytes is
roughly equal to the number of results.
Wildcards:
Example1: user?ame
Example2: system*
Example3: *tool
You can specify part of a token, or add * or ? characters after and/or in front of the token. For example, *pple
or appl?.
Search times depend on how many letters are known of the token, especially at the front of the token. The worst
case is when the search expression starts with the wildcard, for example *pple, which would take between
30 and 60 seconds to search for in 200 million messages. Searching for appl* takes around 9 seconds. The absolute
worst case is *? where no letter is known, which takes 80 seconds.
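One plausible reason why known leading letters help is that they narrow the range of the ordered token list that has to be checked. The following sketch illustrates that idea only; it is an assumption about the mechanism, not a description of the SSB internals, and the token list and function names are hypothetical.

```python
from bisect import bisect_left
from fnmatch import fnmatchcase


def literal_prefix(pattern):
    """Return the part of the pattern before the first wildcard."""
    for i, ch in enumerate(pattern):
        if ch in "*?":
            return pattern[:i]
    return pattern


def candidates(sorted_tokens, pattern):
    """Return (matching tokens, number of tokens that had to be checked)."""
    prefix = literal_prefix(pattern)
    # Jump to the first token that could start with the known prefix.
    lo = bisect_left(sorted_tokens, prefix)
    hi = lo
    while hi < len(sorted_tokens) and sorted_tokens[hi].startswith(prefix):
        hi += 1
    window = sorted_tokens[lo:hi]
    return [t for t in window if fnmatchcase(t, pattern)], len(window)


tokens = sorted(["apple", "apply", "applet", "grapple",
                 "pineapple", "system", "tool"])

prefix_known = candidates(tokens, "appl*")   # checks 3 of 7 tokens
leading_wild = candidates(tokens, "*pple")   # checks all 7 tokens
```

With a leading wildcard the prefix is empty, so every token in the list must be compared against the pattern.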
Complex search expressions
Memory consumption is the sum of the number of results and, at most, the size of the biggest index file
involved in the search. The size of the index file depends on the Memory limit setting of the logspace. The
higher the limit, the larger the index files. (For details, see Procedure 8.1.2, Configuring the indexer service in
The syslog-ng Store Box 5 LTS Administrator Guide.)
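As a rough worked example of this rule, the following sketch adds the two terms. It assumes about one byte per result, as stated for simple token searches; the result count and index file size are hypothetical figures chosen for illustration, not measurements.

```python
def estimate_search_memory_bytes(result_count, largest_index_file_bytes):
    """Upper-bound estimate: about 1 byte per result plus the largest index file."""
    return result_count + largest_index_file_bytes


# Hypothetical inputs: 2 million results, largest index file of 256 MiB.
largest_index = 256 * 1024 * 1024
estimate = estimate_search_memory_bytes(2_000_000, largest_index)
print(f"{estimate / (1024 * 1024):.1f} MiB")
```

In this example the index file dominates the estimate, which is why the Memory limit setting matters more than the number of results.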
Excluding tokens:
You can exclude tokens from a search using the NOT keyword, as in NOT apple, NOT *pple, and so on.
Such a search takes slightly longer than searching for the same expression without NOT. The memory used is
roughly the same.
Response time can be calculated by adding up the response times of the searches included in the OR expression.
The actual OR operation is extremely efficient, so there is little additional overhead.
The maximal response time can be calculated by adding up the response times of included searches in the AND
expression. The actual AND operation is extremely efficient, so there is little additional overhead.
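The additive rule for OR and AND expressions can be illustrated with the response times quoted earlier for 200 million messages (1 to 5 seconds for an exact token, around 9 seconds for appl*, 30 to 60 seconds for a leading wildcard). The function and constant names below are hypothetical, and the small overhead of the Boolean operation itself is ignored.

```python
def combined_range(components):
    """Sum per-term (low, high) response-time ranges, in seconds."""
    low = sum(lo for lo, _ in components)
    high = sum(hi for _, hi in components)
    return low, high


# Ranges taken from the single-expression measurements in this document.
EXACT_TOKEN = (1, 5)
PREFIX_WILDCARD = (9, 9)
LEADING_WILDCARD = (30, 60)

exact_or_prefix = combined_range([EXACT_TOKEN, PREFIX_WILDCARD])    # (10, 14)
exact_or_leading = combined_range([EXACT_TOKEN, LEADING_WILDCARD])  # (31, 65)
```

For an AND expression the upper value of the range should be read as the maximal response time.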
The same response times and memory consumption are expected as for regular searches.
Log collection and search
We simulated users by executing the same RPC API queries that a front end (for example, a web browser)
would send. These queries split the time interval in question into 30 equal parts and calculate the number of
results per interval (these are the bars on the search interface), and also fetch the first 200 results (part of
which is shown at the bottom of the search interface).
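The per-interval counting that the simulated front end requests can be sketched as follows. This is a client-side illustration only: the function names and sample timestamps are hypothetical, and the sketch does not reproduce the actual RPC API calls.

```python
from datetime import datetime, timedelta


def split_interval(start, end, parts=30):
    """Split [start, end) into equally long (bucket_start, bucket_end) pairs."""
    step = (end - start) / parts
    return [(start + i * step, start + (i + 1) * step) for i in range(parts)]


def histogram(timestamps, start, end, parts=30):
    """Count timestamps per bucket; these counts correspond to the bars."""
    counts = [0] * parts
    span = end - start
    for ts in timestamps:
        if start <= ts < end:
            # Integer arithmetic on timedeltas avoids rounding problems.
            counts[(ts - start) * parts // span] += 1
    return counts


start = datetime(2018, 6, 1)
end = datetime(2018, 7, 1)               # 30 days, so one bucket per day
buckets = split_interval(start, end)
counts = histogram(
    [start, start + timedelta(hours=12), start + timedelta(days=1),
     end - timedelta(seconds=1)],
    start, end,
)
```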
Measurement quantities:
■ Rate in thousands of messages per second, that is, how many log messages we can send into SSB
while running searches at the same time.
■ Number of searches finished in 3 minutes of the test.
Case 1
The search queries are very simple exact token searches. Memory limit for the logspace is 1024Mb. The search
expression was X<n> OR Y<m>, with varying n and m values to avoid the effects of the SSB query cache.
Figure 1. Case 1: T1
Figure 2. Case 1: T4
Case 2
We added *Z to the start of the query to force much more disk I/O. Throughput was the same as in Case 1,
while the number of completed searches was lower, as expected.
Figure 4. Case 2: T1
Performance of SSB versions
■ The structure of index files was optimized in SSB 4 F2, greatly increasing the search performance
of SSB. If searching is slow in your SSB and you are using SSB 4 LTS, consider upgrading to a
newer version. (Note that this will affect only the index files created after the upgrade.)
■ The search algorithms were optimized in SSB 4 F2, decreasing the memory usage of search by an
average of 80%. If search causes high memory consumption in your SSB and you are using SSB 4
LTS, consider upgrading to a newer version.
Figure 5. Average response time
7. Summary
The test measurements show that the processing capabilities and search performance of syslog-ng Store Box
have increased significantly since version 4 LTS, and that SSB is capable of receiving and processing high-volume
log traffic. The largest SSB appliance is capable of scaling up to 100,000 events per second (100k EPS).
If the search performance of SSB is not adequate in your environment (search is slow, or greatly increases the
memory consumption), check the version of your SSB. If you are using SSB 4 LTS, consider upgrading to a
newer version.
If you have questions about the performance of SSB, or need help in optimizing the configuration of your SSB
appliance, contact professionalservices@balabit.com.
For more information, visit syslog-ng.com, read the syslog-ng blog, or follow us on Twitter via @balabit,
LinkedIn or Facebook.
To learn more about commercial and open source One Identity products, request an evaluation version, or find
a reseller, visit the following links:
All questions, comments or inquiries should be directed to <info@balabit.com> or by post to the following address: One Identity LLC 1117 Budapest, Alíz Str. 2 Phone:
+36 1 398 6700 Fax: +36 1 208 0875 Web: https://www.balabit.com/
Copyright © 2018 One Identity LLC All rights reserved. This document is protected by copyright and is distributed under licenses restricting its use, copying, distribution,
and decompilation. No part of this document may be reproduced in any form by any means without prior written authorization of One Identity.
All trademarks and product names mentioned herein are the trademarks of their respective owners.