Grafana Loki is configured in a YAML file that contains information about the Loki server and its individual components.
Some of these settings need to be tweaked for network observability according to your cluster size, number of flows and sampling.
Use the following commands to update your Loki configuration, depending on how Loki was installed.
If you installed Loki from the zero-click-loki manifests, update zero-click-loki/2-loki.yaml or zero-click-loki/2-loki-tls.yaml with your custom configuration.
Then replace the related config using the file you edited, for example:
oc replace --force -f zero-click-loki/2-loki.yaml
The pod will restart automatically.
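To confirm the restart, you can watch the pods; this assumes Loki runs in the netobserv namespace, as in the other commands in this section:
oc get pods -n netobserv -w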
If you installed Loki in microservices mode, update loki-microservices/1-prerequisites/config.yaml with your custom configuration.
Then replace the config using:
oc replace --force -f config.yaml
Restart all pods of the loki instance:
oc delete pods --selector app.kubernetes.io/instance=loki -n netobserv
If you installed Loki using the Loki Operator, the LokiStack needs to be configured with the Unmanaged management state first to allow configmap updates.
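As a sketch, assuming your LokiStack instance is named loki in the netobserv namespace (adjust the resource name to your deployment), you can set it to Unmanaged with:
oc patch lokistack loki -n netobserv --type merge -p '{"spec":{"managementState":"Unmanaged"}}'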
Run the following command to get the lokistack-config configmap in the netobserv namespace:
oc get configmap lokistack-config -n netobserv -o yaml | yq '.binaryData | map_values(. | @base64d)' > binaryData.txt
Update the binaryData.txt file accordingly.
Then run the following commands to update the lokistack-config configmap in the netobserv namespace using the updated file:
BINARY_CONFIG=$(yq -o=json -I=0 'map_values(. | @base64)' binaryData.txt) && echo $BINARY_CONFIG
oc patch configmap lokistack-config -n netobserv -p '{"binaryData":'$BINARY_CONFIG'}'
Restart all pods of the lokistack instance:
oc delete pods --selector app.kubernetes.io/name=lokistack -n netobserv
The query frontend splits larger queries into multiple smaller queries, executing them in parallel on downstream queriers and stitching the results back together. This prevents large (multi-day, etc.) queries from causing out-of-memory issues in a single querier and helps execute them faster.
Check the official Grafana documentation for more details.
Some queries may be limited by the query scheduler. You will need to update the following configuration:
query_range:
  parallelise_shardable_queries: true
query_scheduler:
  max_outstanding_requests_per_tenant: 100
Ensure parallelise_shardable_queries is set to true and increase max_outstanding_requests_per_tenant according to your needs (default = 100). It’s reasonable to set a high value here, such as 2048; however, it will decrease multi-user query performance.
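For example, a configuration tuned along these lines (2048 is an illustrative value; adjust it to your workload) would look like:
query_range:
  parallelise_shardable_queries: true
query_scheduler:
  max_outstanding_requests_per_tenant: 2048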
Check query_scheduler configuration for more details.
The messages containing batches of records received by the Loki distributor and exchanged between components have a maximum size in bytes, set by the following parameters:
server:
  grpc_server_max_recv_msg_size: 4194304
  grpc_server_max_send_msg_size: 4194304
By default the size is 4194304 bytes (4MB). It’s reasonable to increase it to 8388608 (8MB).
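The doubled configuration would then be:
server:
  grpc_server_max_recv_msg_size: 8388608
  grpc_server_max_send_msg_size: 8388608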
While collecting flows and enriching them, latency appears between the current time and the record timestamps. This particularly applies when using Kafka on large clusters.
Loki can be configured to reject samples that are too old using the following configuration:
limits_config:
  reject_old_samples_max_age: 168h
On top of that, Loki writes logs in chunks, ordered by time. If a received message is too old compared to the most recent one, it is considered out-of-order.
To accept messages within a specific time range, use the following configuration:
ingester:
  max_chunk_age: 2h
Be careful: Loki calculates the earliest time that out-of-order entries may have and still be accepted as:
time_of_most_recent_line - (max_chunk_age / 2)
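For example, with max_chunk_age: 2h, if the most recent line in a stream is timestamped 10:00, the earliest accepted out-of-order timestamp is 10:00 - (2h / 2) = 09:00; anything older is rejected for that stream.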
Check accept out-of-order writes documentation for more info.
The number of active streams can be limited per user per ingester (unlimited by default) or per user across the cluster (default = 5000).
To update these limits, you can tweak the following values:
limits_config:
  max_streams_per_user: 0
  max_global_streams_per_user: 5000
It’s not recommended to disable both by setting them to 0. When running multiple ingesters, you may set max_streams_per_user to 5000 and disable max_global_streams_per_user, or increase the max_global_streams_per_user value instead.
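For example, the first option would look like this (a sketch for a multi-ingester deployment; adjust the values to your needs):
limits_config:
  max_streams_per_user: 5000
  max_global_streams_per_user: 0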
Check limits_config for more details.
Ingestion is limited in terms of sample size per second via ingestion_rate_mb, and per-distributor local burst size via ingestion_burst_size_mb:
limits_config:
  ingestion_rate_mb: 4
  ingestion_burst_size_mb: 6
It’s common to set more than 10MB for each. You can safely increase these two values, but keep an eye on your ingester performance and your storage size.
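For example (10 is an illustrative value, in line with the guidance above):
limits_config:
  ingestion_rate_mb: 10
  ingestion_burst_size_mb: 10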
Check limits_config for more details.
Metric queries are limited in terms of unique series count. You can increase max_query_series up to 10000 if you get a "maximum of series reached for a single query" error in the console plugin.
We don’t recommend going above 10000, since it affects both query frontend stability and browser rendering performance.
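For example:
limits_config:
  max_query_series: 10000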
Check limits_config for more details.