IT Cloud

Eugeny Shtoltc
253741748224

container_fs_limit_bytes{beta_kubernetes_io_arch="amd64", beta_kubernetes_io_os="linux", container="POD", container_name="POD", device="/dev/vda1", id="/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod5a815a40_f2de_11ea_88d2_0242ac110032.slice/docker-76711789af076c8f2331d8212dad4c044d263c5cc3fa333347921bd6de7950a4.scope", image="k8s.gcr.io/pause:3.1", instance="controlplane", job="kubernetes-cadvisor", kubernetes_io_arch="amd64", kubernetes_io_hostname="controlplane", kubernetes_io_os="linux", name="k8s_POD_kube-proxy-nhzhn_kube-system_5a815a40-f2de-11ea-88d2-0242ac110032_0", namespace="kube-system", pod="kube-proxy-nhzhn", pod_name="kube-proxy-nhzhn"}

253741748224

The same metric also reports RAM-backed storage through the tmpfs device: "container_fs_limit_bytes{device="tmpfs"} / 1000 / 1000 / 1000"

{beta_kubernetes_io_arch="amd64", beta_kubernetes_io_os="linux", device="tmpfs", id="/", instance="controlplane", job="kubernetes-cadvisor", kubernetes_io_arch="amd64", kubernetes_io_hostname="controlplane", kubernetes_io_os="linux"} 0.209702912

{beta_kubernetes_io_arch="amd64", beta_kubernetes_io_os="linux", device="tmpfs", id="/", instance="node01", job="kubernetes-cadvisor", kubernetes_io_arch="amd64", kubernetes_io_hostname="node01", kubernetes_io_os="linux"} 0.409296896

If we want the smallest disk, we need to exclude the RAM device from the selection: "min(container_fs_limit_bytes{device!="tmpfs"} / 1000 / 1000 / 1000)"

{} 253.74174822400002
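The same selection can be sketched outside PromQL. This is only an illustration in Python, using the byte values from the outputs above; the function name min_disk_gb is hypothetical and not part of Prometheus.

```python
# (device, limit in bytes), copied from the query results above.
limits = [
    ("/dev/vda1", 253741748224),
    ("tmpfs", 209702912),
    ("tmpfs", 409296896),
]

def min_disk_gb(series):
    """Smallest limit in GB (10^9 bytes) among devices that are not tmpfs."""
    disks = [value for device, value in series if device != "tmpfs"]
    return min(disks) / 1000 / 1000 / 1000

print(min_disk_gb(limits))
```

Without the device filter, the minimum would be the 0.2 GB tmpfs entry rather than the real disk.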

In addition to metrics that report a current value, there are counter metrics. Their names usually end in "_total". If we plot one, we see a steadily rising line. To get a useful value, we take the growth over a period of time with the rate function; the period goes in square brackets right after the metric name, inside the parentheses: rate(name_metric_total[time]). The time is usually given in seconds or minutes. The suffix "s" denotes seconds, for example 40s or 60s; "m" denotes minutes, for example 2m or 5m. Note that the period must not be shorter than the exporter polling interval, otherwise the metric will not be displayed.
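To make the idea concrete, here is a simplified sketch in Python of what a rate over a window computes. It is an assumption-laden model, not the Prometheus implementation: the real rate() also handles counter resets and extrapolates to the window edges. The sample data and the name simple_rate are hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)

def simple_rate(samples: List[Sample], window_seconds: float) -> float:
    """Per-second growth of a counter over the trailing window."""
    if not samples:
        raise ValueError("no samples")
    newest_ts = samples[-1][0]
    in_window = [s for s in samples if s[0] >= newest_ts - window_seconds]
    if len(in_window) < 2:
        # A window shorter than the polling interval holds fewer than two
        # samples, so there is nothing to take a difference of.
        raise ValueError("window must cover at least two samples")
    (t0, v0), (t1, v1) = in_window[0], in_window[-1]
    return (v1 - v0) / (t1 - t0)

# Hypothetical counter polled every 15 seconds.
samples = [(0.0, 100.0), (15.0, 130.0), (30.0, 160.0), (45.0, 220.0), (60.0, 250.0)]
print(simple_rate(samples, 60))
```

Calling simple_rate(samples, 10) raises an error, which mirrors the remark about windows shorter than the polling interval.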

The names of the metrics available for querying can be seen at the /metrics path:

controlplane $ curl https://2886795314-9090-ollie08.environments.katacoda.com/metrics 2>/dev/null | head
# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 3.536e-05
go_gc_duration_seconds{quantile="0.25"} 7.5348e-05
go_gc_duration_seconds{quantile="0.5"} 0.000163193
go_gc_duration_seconds{quantile="0.75"} 0.001391603
go_gc_duration_seconds{quantile="1"} 0.246707852
go_gc_duration_seconds_sum 0.388611299
go_gc_duration_seconds_count 74
# HELP go_goroutines Number of goroutines that currently exist.

Setting up the Prometheus and Grafana stack

We have examined the metrics in an already configured Prometheus; now we will start Prometheus and configure it ourselves:

essh@kubernetes-master:~$ docker run -d --net=host --name prometheus prom/prometheus
09416fc74bf8b54a35609a1954236e686f8f6dfc598f7e05fa12234f287070ab
essh@kubernetes-master:~$ docker ps -f name=prometheus
CONTAINER ID IMAGE NAMES
09416fc74bf8 prom/prometheus prometheus

The UI with graphs for displaying metrics:

essh@kubernetes-master:~$ firefox localhost:9090

Add the go_gc_duration_seconds{quantile="0"} metric from the list:

essh@kubernetes-master:~$ curl localhost:9090/metrics 2>/dev/null | head -n 4
# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 1.0097e-05
go_gc_duration_seconds{quantile="0.25"} 1.7841e-05

Open the UI at localhost:9090 and select Graph in the menu. Let's add a chart: pick a metric from the "insert metric at cursor" list. Here we see the same metrics as in the localhost:9090/metrics list, but grouped by name, for example, just go_gc_duration_seconds. Select the go_gc_duration_seconds metric and display it with the Execute button. The Console tab shows the metrics:

go_gc_duration_seconds{instance="localhost:9090", job="prometheus", quantile="0"} 0.000009186
go_gc_duration_seconds{instance="localhost:9090", job="prometheus", quantile="0.25"} 0.000012056
go_gc_duration_seconds{instance="localhost:9090", job="prometheus", quantile="0.5"} 0.000023256
go_gc_duration_seconds{instance="localhost:9090", job="prometheus", quantile="0.75"} 0.000068848
go_gc_duration_seconds{instance="localhost:9090", job="prometheus", quantile="1"} 0.00021869

and the Graph tab shows their graphical representation.

Prometheus now collects metrics from the current node: go_*, net_*, process_*, prometheus_*, promhttp_*, scrape_* and up. To collect metrics from Docker, we tell the Docker daemon to expose its metrics in Prometheus format on port 9323:

essh@kubernetes-master:~$ curl http://localhost:9323/metrics 2>/dev/null | head -n 20
# HELP builder_builds_failed_total Number of failed image builds
# TYPE builder_builds_failed_total counter
builder_builds_failed_total{reason="build_canceled"} 0
builder_builds_failed_total{reason="build_target_not_reachable_error"} 0
builder_builds_failed_total{reason="command_not_supported_error"} 0
builder_builds_failed_total{reason="dockerfile_empty_error"} 0
builder_builds_failed_total{reason="dockerfile_syntax_error"} 0
builder_builds_failed_total{reason="error_processing_commands_error"} 0
builder_builds_failed_total{reason="missing_onbuild_arguments_error"} 0
builder_builds_failed_total{reason="unknown_instruction_error"} 0
# HELP builder_builds_triggered_total Number of triggered image builds
# TYPE builder_builds_triggered_total counter
builder_builds_triggered_total 0
# HELP engine_daemon_container_actions_seconds The number of seconds it takes to process each container action
# TYPE engine_daemon_container_actions_seconds histogram
engine_daemon_container_actions_seconds_bucket{action="changes", le="0.005"} 1
engine_daemon_container_actions_seconds_bucket{action="changes", le="0.01"} 1
engine_daemon_container_actions_seconds_bucket{action="changes", le="0.025"} 1
engine_daemon_container_actions_seconds_bucket{action="changes", le="0.05"} 1
engine_daemon_container_actions_seconds_bucket{action="changes", le="0.1"} 1

For the Docker daemon to apply these parameters, it must be restarted. The restart stops all containers, and when the daemon starts again the containers are brought back up according to their restart policy:

essh@kubernetes-master:~$ sudo chmod a+w /etc/docker/daemon.json
essh@kubernetes-master:~$ echo '{"metrics-addr": "127.0.0.1:9323", "experimental": true}' | jq -M -f /dev/null > /etc/docker/daemon.json
essh@kubernetes-master:~$ cat /etc/docker/daemon.json
{
  "metrics-addr": "127.0.0.1:9323",
  "experimental": true
}
essh@kubernetes-master:~$ systemctl restart docker

So far Prometheus only collects metrics from different sources on the same server. To collect metrics from different nodes and see the aggregated result, we need to run a metrics-collecting agent on each node:

essh@kubernetes-master:~$ docker run -d \
-v "/proc:/host/proc" \
-v "/sys:/host/sys" \
-v "/:/rootfs" \
--net="host" \
--name=explorer \
quay.io/prometheus/node-exporter:v0.13.0 \
-collector.procfs /host/proc \
-collector.sysfs /host/sys \
-collector.filesystem.ignored-mount-points "^/(sys|proc|dev|host|etc)($|/)"
1faf800c878447e6110f26aa3c61718f5e7276f93023ab4ed5bc1e782bf39d56

The agent should be registered to listen on the node's address, but for now everything is local: localhost:9100. Now let's tell Prometheus to scrape the agent and Docker:

essh@kubernetes-master:~$ mkdir prometheus && cd $_
essh@kubernetes-master:~/prometheus$ cat << EOF > ./prometheus.yml
global:
  scrape_interval: 1s
  evaluation_interval: 1s
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['127.0.0.1:9090', '127.0.0.1:9100', '127.0.0.1:9323']
        labels:
          group: 'prometheus'
EOF

essh@kubernetes-master:~/prometheus$ docker rm -f prometheus
prometheus
essh@kubernetes-master:~/prometheus$ docker run \
-d \
--net=host \
--restart always \
--name prometheus \
-v $(pwd)/prometheus.yml:/etc/prometheus/prometheus.yml \
prom/prometheus
7dd991397d43597ded6be388f73583386dab3d527f5278b7e16403e7ea633eef
essh@kubernetes-master:~/prometheus$ docker ps \
-f name=prometheus
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
7dd991397d43 prom/prometheus "/bin/prometheus --c…" 53 seconds ago Up 53 seconds prometheus

1702 host metrics are now available:

essh@kubernetes-master:~/prometheus$ curl http://localhost:9100/metrics | grep -v '#' | wc -l
1702
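The counting pipeline above can be approximated in Python. This is a sketch under one stated difference: it skips lines that start with "#" (the HELP and TYPE comments) and blank lines, whereas grep -v '#' drops any line containing the character. The function name count_samples is hypothetical.

```python
def count_samples(exposition_text: str) -> int:
    """Count sample lines, skipping blank lines and # HELP / # TYPE comments."""
    count = 0
    for line in exposition_text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            count += 1
    return count

# A shortened excerpt in the same format as the /metrics output shown earlier.
example = """# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 3.536e-05
go_gc_duration_seconds{quantile="0.25"} 7.5348e-05
go_gc_duration_seconds_sum 0.388611299
go_gc_duration_seconds_count 74
"""
print(count_samples(example))
```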

With such variety, it is hard to find the metrics needed for everyday tasks, for example, the amount of memory in use, node_memory_Active. There are metric aggregators for this:

http://localhost:9090/consoles/node.html

http://localhost:9090/consoles/node-cpu.html

But it is better to use Grafana. Let's install it as well:

essh@kubernetes-master:~/prometheus$ docker run \
-d \
--name=grafana \
--net=host \
grafana/grafana
Unable to find image 'grafana/grafana:latest' locally
latest: Pulling from grafana/grafana
9d48c3bd43c5: Already exists
df58635243b1: Pull complete
09b2e1de003c: Pull complete
f21b6d64aaf0: Pull complete
719d3f6b4656: Pull complete
d18fca935678: Pull complete
7c7f1ccbce63: Pull complete
Digest: sha256:a10521576058f40427306fcb5be48138c77ea7c55ede24327381211e653f478a
Status: Downloaded newer image for grafana/grafana:latest
6f9ca05c7efb2f5cd8437ddcb4c708515707dbed12eaa417c2dca111d7cb17dc
essh@kubernetes-master:~/prometheus$ firefox localhost:3000

Log in with the user name admin and the password admin, after which we are prompted to change the password. Next comes the configuration itself.

First, we are prompted to select a data source: select Prometheus, enter localhost:9090, choose the Browser access mode rather than Server (that is, the connection goes over the network from the browser), and indicate that basic authentication is used. That is all: click Save and Test, and Prometheus is connected.

Clearly, the admin login and password should not be handed out to everyone. Instead, create users or integrate Grafana with an external user directory such as Microsoft Active Directory.

In the Dashboards tab, I activate all three preconfigured dashboards. From the New Dashboard list in the top menu, select the Prometheus 2.0 Stats dashboard. However, there is no data yet:

I click the "+" menu item and select "Dashboard", which offers to create a dashboard. A dashboard can contain several widgets, for example, charts that can be positioned and customized, so click the add chart button and select its type. On the chart itself, choose a size and click edit; the most important thing here is the choice of the displayed metric. Choosing Prometheus

A complete setup is available:

essh@kubernetes-master:~/prometheus$ wget \
https://raw.githubusercontent.com/grafana/grafana/master/devenv/docker/ha_test/docker-compose.yaml
--2019-10-30 07:29:52-- https://raw.githubusercontent.com/grafana/grafana/master/devenv/docker/ha_test/docker-compose.yaml
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.112.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.112.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2996 (2.9K) [text/plain]
Saving to: 'docker-compose.yaml'
docker-compose.yaml 100%[=========>] 2.93K --.-KB/s in 0s
2019-10-30 07:29:52 (23.4 MB/s) - 'docker-compose.yaml' saved [2996/2996]

Obtaining application metrics

Up to this point, we have looked at the case where Prometheus polled standard exporters and received standard metrics. Now let's create an application and expose our own metrics. First, let's take a Node.js server and write an application for it. To do this, create a Node.js project:

vagrant@ubuntu:~$ mkdir nodejs && cd $_
vagrant@ubuntu:~/nodejs$ npm init
This utility will walk you through creating a package.json file.
It only covers the most common items, and tries to guess sensible defaults.
See `npm help json` for definitive documentation on these fields
and exactly what they do.
Use `npm install <pkg> --save` afterwards to install a package and
save it as a dependency in the package.json file.
name: (nodejs)
version: (1.0.0)
description:
entry point: (index.js)
test command:
git repository:
keywords:
author: ESSch
license: (ISC)
About to write to /home/vagrant/nodejs/package.json:
{
  "name": "nodejs",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "ESSch",
  "license": "ISC"
}
Is this ok? (yes) yes

First, let's create a web server. I'll use the Express library to create it:

vagrant@ubuntu:~/nodejs$ npm install Express --save
npm WARN deprecated Express@3.0.1: Package unsupported. Please use the express package (all lowercase) instead.
nodejs@1.0.0 /home/vagrant/nodejs
└── Express@3.0.1
npm WARN nodejs@1.0.0 No description
npm WARN nodejs@1.0.0 No repository field.

vagrant@ubuntu:~/nodejs$ cat << EOF > index.js
const express = require('express');
const app = express();
app.get('/healt', function (req, res) {
    res.send({status: "Healt"});
});
app.listen(9999, () => {
    console.log({status: "start"});
});
EOF
vagrant@ubuntu:~/nodejs$ node index.js &
[1] 18963
vagrant@ubuntu:~/nodejs$ {status: 'start'}
vagrant@ubuntu:~/nodejs$ curl localhost:9999/healt
{"status":"Healt"}

Our server is ready to work with Prometheus. We need to configure Prometheus for it.
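What Prometheus expects from an application is plain text in the same format as the /metrics outputs shown earlier. The following is a minimal sketch of producing that text, written in Python rather than Node.js; the metric name app_requests_total, its label and its value are hypothetical, and a real application would normally use a Prometheus client library instead.

```python
def render_metrics(metrics):
    """Render metrics in the Prometheus text exposition format.

    `metrics` is a list of (name, type, help_text, samples) tuples, where
    samples is a list of (labels_dict, value) pairs.
    """
    lines = []
    for name, metric_type, help_text, samples in metrics:
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} {metric_type}")
        for labels, value in samples:
            if labels:
                rendered = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
                lines.append(f"{name}{{{rendered}}} {value}")
            else:
                lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"

# Hypothetical application counter: requests served per path.
app_metrics = [
    ("app_requests_total", "counter", "Number of handled HTTP requests",
     [({"path": "/healt"}, 3)]),
]
print(render_metrics(app_metrics), end="")
```

Serving this text at a /metrics path of the application is what would let Prometheus scrape it like any other target.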

The Prometheus scaling problem arises when the data no longer fits on one server, or more precisely, when one server cannot write the data fast enough or cannot process it with acceptable performance. Thanos solves this without requiring a federation setup: it gives the user an interface and an API and relays requests to the Prometheus instances. The user gets a web interface similar to that of Prometheus. Thanos itself talks to agents that are installed next to the instances as sidecars, as Istio does. Both Thanos and the agents are available as containers and as a Helm chart. For example, an agent can be started as a container pointed at Prometheus, while Prometheus is set up through its config file followed by a restart.

docker run --rm quay.io/thanos/thanos:v0.7.0 --help
docker run -d --net=host --rm \
-v $(pwd)/prometheus0_eu1.yml:/etc/prometheus/prometheus.yml \
--name prometheus-0-sidecar-eu1 \
-u root \
quay.io/thanos/thanos:v0.7.0 \
sidecar \
--http-address 0.0.0.0:19090 \
--grpc-address 0.0.0.0:19190 \
--reloader.config-file /etc/prometheus/prometheus.yml \
--prometheus.url http://127.0.0.1:9090

Notifications are an important part of monitoring. A notification involves a trigger that fires and a provider that delivers it. A trigger is, as a rule, written in Prometheus as a PromQL expression with a condition. When the trigger fires (the metric condition is met), Prometheus signals the provider to send a notification. The standard provider is Alertmanager, which can send messages to various receivers such as email and Slack.

For example, the metric "up", which takes the values 0 or 1, can be used to send a message if the server has been down for more than 1 minute. For this, a rule is written:

groups:
- name: example
  rules:
  - alert: InstanceDown
    expr: up == 0
    for: 1m
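The effect of the "for" clause in this rule can be modeled with a short Python sketch. It is a simplification under stated assumptions: real Prometheus evaluates rules on its evaluation interval and tracks pending and firing states per series; the function name firing_since and the sample data are hypothetical.

```python
def firing_since(samples, hold_seconds):
    """Return the timestamp at which the alert starts firing, or None.

    `samples` is a time-ordered list of (timestamp_seconds, up_value).
    The condition up == 0 must hold without interruption for hold_seconds.
    """
    pending_since = None
    for ts, up in samples:
        if up == 0:
            if pending_since is None:
                pending_since = ts  # condition just became true: pending
            if ts - pending_since >= hold_seconds:
                return ts  # held long enough: firing
        else:
            pending_since = None  # target recovered: reset the timer
    return None

# Hypothetical scrape results: a short outage, recovery, then a long outage.
history = [(0, 1), (15, 0), (30, 0), (45, 1), (60, 0), (75, 0),
           (90, 0), (105, 0), (120, 0)]
print(firing_since(history, 60))
```

The short outage between 15 and 30 seconds never fires, because the target recovers before a full minute has passed.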

When the metric has been equal to 0 for more than 1 minute, the trigger fires and Prometheus sends a request to Alertmanager. Alertmanager determines what to do with the event. We can specify that when the InstanceDown event is received, a message is sent by email. To do this, configure Alertmanager accordingly:

global:
  smtp_smarthost: 'localhost:25'
  smtp_from: 'youraddress@example.org'
route:
  receiver: example-email
receivers:
- name: example-email
  email_configs:
  - to: 'youraddress@example.org'

Alertmanager relies on the delivery protocol available on this machine, so the corresponding software must be installed. Take Simple Mail Transfer Protocol (SMTP), for example. To test it, let's install a console mail server, sendmail, alongside Alertmanager.

Fast and clear analysis of system logs

The open-source full-text search engine Lucene is used for fast log search. Two low-level products are built on it: Solr and Elasticsearch, which are quite similar in capabilities but differ in usability and license. Many popular bundles are built on them, for example, distributions with Elasticsearch: ELK (Elasticsearch (Apache Lucene), Logstash, Kibana) and EFK (Elasticsearch, Fluentd, Kibana), as well as products such as Graylog2. Both Graylog2 and the bundles (ELK/EFK) are widely used because they need less configuration outside test environments; for example, EFK can be installed in a Kubernetes cluster with almost a single command:

helm install efk-stack stable/elastic-stack --set logstash.enabled=false --set fluentd.enabled=true --set fluentd-elastics

An alternative that has not yet seen wide adoption is systems built on the previously discussed Prometheus, for example, PLG (Promtail as the agent, Loki as the Prometheus-like store, Grafana).

Comparison of Elasticsearch and Solr (the systems are comparable):

Elastic:

** Commercial with open source and the ability to commit (via approval);

** Supports more complex queries, more analytics, out-of-the-box support for distributed queries, a more complete RESTful JSON-based API, chaining, machine learning, SQL (paid);

*** Full-text search;

*** Real-time index;

*** Monitoring (paid);

*** Monitoring via Elastic HQ;

*** Machine learning (paid);

*** Simple indexing;

*** More data types and structures;

** Lucene engine;

** Parent-child (JOIN);

** Native scalability;

** Documentation from 2010;

Solr:

** OpenSource;

** High speed with JOIN;

*** Full-text search;

*** Real-time index;

*** Monitoring in the admin panel;

*** Machine learning through modules;

*** Input data: Word, PDF and others;

*** Requires a schema for indexing;

*** Data: nested objects;

** Lucene engine;

** JSON join;

** Scalable: SolrCloud (setup) && ZooKeeper (setup);

** Documentation since 2004.

Microservice architecture is now increasingly common: thanks to the loose coupling between components and their simplicity, it simplifies development, testing, and debugging. But the system as a whole becomes harder to analyze because it is distributed. To analyze its overall state, logs are collected in a central place and converted into a readable form. There is also a need to analyze other data, for example, the NGINX access_log to collect traffic metrics, or the mail server log to detect password-guessing attempts. Take ELK as an example of such a solution. ELK is a bundle of three products: Logstash, Elasticsearch and Kibana, the first and last of which are built around the central one and make it convenient to use. More generally, ELK is called the Elastic Stack, since the log preparation tool Logstash can be replaced by analogs such as Fluentd or Rsyslog, and the Kibana visualizer can be replaced by Grafana. For example, although Kibana provides great analysis capabilities, Grafana provides notifications when events occur and can be used together with other products, for example, cAdvisor, which analyzes the state of the system and individual containers.

The ELK products can be installed separately, downloaded as individual containers between which you need to configure communication, or downloaded as a single container.

For Elasticsearch to work properly, the data needs to arrive in JSON format. If the data is submitted as plain text (each log entry on one line, separated from the previous one by a line break), only full-text search is possible, because each entry is interpreted as a single string. There are two ways to deliver logs in JSON format. The first is to configure the product under study to output this format; NGINX, for example, offers this option. But often this is impossible, because an accumulated base of logs already exists and they are traditionally written as text. Such cases call for the second way: post-processing logs from text into JSON, which Logstash handles. It is important to note that if the data can be transferred in structured form (JSON, XML and others) right away, this should be done: with detailed parsing, any deviation from the format makes parsing fail, and with superficial parsing we lose valuable information. Either way, parsing is the bottleneck in this system, although it can be scaled to a limited extent, down to a service or a log file. Fortunately, more and more products are starting to support structured logging; for example, recent versions of NGINX support logs in JSON format.

For systems that do not support this format, conversion can be done by programs such as Logstash, Filebeat and Fluentd. The first is included in the vendor's standard Elastic Stack distribution and can be installed in one step as ELK in a Docker container. It supports reading data from files, the network and standard streams on both input and output, and, most importantly, the native Elasticsearch protocol. Logstash monitors log files by modification date or receives telnet data over the network from distributed systems, for example, containers, and after transformation sends it to the output, usually Elasticsearch. It is simple and comes standard with the Elastic Stack, which makes it easy and hassle-free to configure. But because of the Java virtual machine inside, it is heavy, and it is not very functional, although it supports plugins, for example, synchronization with MySQL to send new data. Filebeat provides slightly more options. Fluentd can serve as an enterprise tool for all occasions thanks to its rich functionality (reading logs, system logs, etc.), its scalability, and the ability to roll it out across Kubernetes clusters with a Helm chart and monitor the entire data center with the standard package; more on this in the relevant section.

To manage logs, you can use Curator, which can archive or delete old logs in Elasticsearch, making it work more efficiently.

Logically, log collection is carried out by special collectors: Logstash, Fluentd, Filebeat or others.

Fluentd is the least demanding and simpler analogue of Logstash. It is configured in /etc/td-agent/td-agent.conf, which contains four blocks:

** source – contains information about the data source;

** match – contains settings for transferring received data;

** include – contains information about file types;

** system – contains system settings.

Logstash provides a much more capable configuration language. The Logstash agent daemon, logstash, monitors changes in files. If the logs are located not locally but on a distributed system, Logstash is installed on each server and run in agent mode: bin/logstash agent -f /env/conf/my.conf. Since running Logstash only as an agent for shipping logs is wasteful, you can use a product from the same developers, Logstash Forwarder (formerly Lumberjack), which forwards logs to the Logstash server via the lumberjack protocol. You can use the Packetbeat agent to track and retrieve data from MySQL (https://www.8host.com/blog/sbor-metrik-infrastruktury-s-pomoshhyu-packetbeat-i-elk-v-ubuntu-14-04/).

Logstash also provides filters for transforming data of different types:

** grok – sets regular expressions that extract fields from a string, often to convert logs from text format to JSON;

** date – for archived logs, takes the creation date from the log entry itself instead of using the current date;

** kv – for logs of the form key=value;

** mutate – selects only the required fields and changes the data in the fields, for example, replaces the "/" character with "_";

** multiline – for multi-line logs with delimiters.
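Two of the filters listed above, kv and mutate, can be illustrated with a small Python sketch. It only models the idea and is not how Logstash implements them; the function names and the sample log line are hypothetical.

```python
def parse_kv(message):
    """Split a "key=value key=value" log line into a dict of fields (kv)."""
    fields = {}
    for token in message.split():
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens that contain no "="
            fields[key] = value
    return fields

def mutate_replace(fields, name, old, new):
    """Return a copy of the fields with `old` replaced by `new` in one field."""
    updated = dict(fields)
    if name in updated:
        updated[name] = updated[name].replace(old, new)
    return updated

event = parse_kv("user=alice path=/var/log status=200")
print(mutate_replace(event, "path", "/", "_"))
```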

For example, a log in the format "date type number", such as "01.01.2021 INFO 1", can be decomposed into separate fields taken from its "message" field:

filter {
  grok {
    type => "my_log"
    match => ["message", "%{MYDATE:date} %{WORD:loglevel} %{ID:id:int}"]
  }
}

The %{ID:id:int} template takes the pattern class ID, puts the matched value into the id field, and converts the string value to the int type.
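The decomposition described above can be sketched with an ordinary regular expression in Python. The named groups stand in for the MYDATE, WORD and ID patterns; the exact regular expressions are assumptions, since the definitions of MYDATE and ID are not shown in the text.

```python
import re

# Hypothetical stand-ins for the MYDATE, WORD and ID patterns.
LINE = re.compile(r"^(?P<date>\d{2}\.\d{2}\.\d{4}) (?P<loglevel>\w+) (?P<id>\d+)$")

def parse_line(message):
    """Split "date type number" into fields; return None if it does not match."""
    match = LINE.match(message)
    if match is None:
        return None
    fields = match.groupdict()
    fields["id"] = int(fields["id"])  # the int conversion described above
    return fields

print(parse_line("01.01.2021 INFO 1"))
```

A line that deviates from the format returns None, which is the fragility of detailed parsing mentioned earlier.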

In the "output" block, we can specify where the data goes: to the console with the "stdout" block, to a file with "file", over HTTP through the JSON REST API with "elasticsearch", or by mail with "email". You can also specify conditions on the fields obtained in the filter block. For instance:

output {
  if [type] == "Info" {
    elasticsearch {
      host => localhost
      index => "log-%{+YYYY.MM.dd}"
    }
  }
}

Here the Elasticsearch index (a database, by analogy with SQL) changes every day. A new index does not need to be created explicitly; this is common for NoSQL databases, since there is no strict requirement to describe the structure (properties and types). It is still recommended to describe it, otherwise all fields will hold string values unless a number is specified. To display Elasticsearch data, Kibana is used, a web UI plugin written in AngularJS. To show a timeline in its charts, at least one field must be described with the date type, and aggregate functions need a numeric field, integer or floating point. Also, if new fields are added, indexing and displaying them requires reindexing the entire index, so describing the structure as completely as possible helps avoid this very time-consuming operation.

Splitting the index by day speeds up Elasticsearch, and in Kibana several indices can be selected by a pattern, here log-*; this also lifts the limit of one million documents per index.
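The daily naming scheme and the wildcard selection can be illustrated in Python. This is a sketch of the naming idea only, with hypothetical function names; Logstash and Kibana do this internally.

```python
from datetime import date
from fnmatch import fnmatchcase

def index_for(day: date) -> str:
    """Index name for a given day, following the log-YYYY.MM.dd scheme."""
    return f"log-{day:%Y.%m.%d}"

def select_indices(names, pattern="log-*"):
    """Names matched by a wildcard pattern, as an index pattern would select."""
    return [name for name in names if fnmatchcase(name, pattern)]

existing = [index_for(date(2021, 1, 1)), index_for(date(2021, 1, 2)), ".kibana_1"]
print(select_indices(existing))
```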

Consider the Logstash output plugin in more detail:

output {
  if [type] == "Info" {
    elasticsearch {
      cluster => elasticsearch
      action => "create"
      hosts => ["localhost:9200"]
      index => "log-%{+YYYY.MM.dd}"
      document_type => ....
      document_id => "%{id}"
    }
  }
}

Interaction with Elasticsearch goes through the JSON REST API, for which drivers exist in most modern languages. But to avoid writing code, we will use Logstash, which can also convert text data to JSON based on regular expressions. There are also predefined templates, similar to character classes in regular expressions, such as %{IP:client} and others, which can be viewed at https://github.com/elastic/logstash/tree/v1.1.9/patterns. For standard services with standard settings, many ready-made configs can be found on the Internet, for example, for NGINX: https://github.com/zooniverse/static/blob/master/logstash-Nginx.conf. This is described in more detail in the article https://habr.com/post/165059/.

Elasticsearch is a NoSQL database, so you do not need to specify a format (a set of fields and their types). It still needs one for searching, so it infers the format itself, and every format change triggers reindexing, during which work is impossible. To maintain a unified structure, the Serilog logger (.NET) has an EventType field in which a set of fields and their types can be encoded; for other loggers this has to be implemented separately. To analyze logs from a microservice application, it is important to assign an ID for the duration of request processing, that is, a request ID that stays unchanged and is passed from microservice to microservice, so that the entire path of the request can be traced.
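Passing a request ID between services can be sketched as follows. Everything here is hypothetical and for illustration only: the service names, the X-Request-ID header name and the log fields are assumptions, and the two "services" are plain functions instead of separate processes.

```python
import json
import uuid

def log_line(service, message, request_id):
    """One structured log entry as a JSON string."""
    return json.dumps({"service": service, "message": message,
                       "request_id": request_id})

def billing_service(headers, logs):
    # A downstream service reuses the ID it received instead of creating one.
    request_id = headers["X-Request-ID"]
    logs.append(log_line("billing", "invoice created", request_id))

def gateway_service(headers, logs):
    # The first service creates the ID only if the caller did not send one.
    request_id = headers.get("X-Request-ID") or str(uuid.uuid4())
    logs.append(log_line("gateway", "request accepted", request_id))
    billing_service({"X-Request-ID": request_id}, logs)
    return request_id

logs = []
rid = gateway_service({}, logs)
entries = [json.loads(line) for line in logs]
trace = [entry for entry in entries if entry["request_id"] == rid]
print([entry["service"] for entry in trace])
```

Filtering the collected logs by request_id then yields the full path of one request across services.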

Install Elasticsearch (https://habr.com/post/280488/) and check that curl -X GET localhost:9200 works:

sudo sysctl -w vm.max_map_count=262144
$ curl 'localhost:9200/_cat/indices?v'
health status index     uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   graylog_0 h2NICPMTQlqQRZhfkvsXRw   4   0          0            0        1kb            1kb
green  open   .kibana_1 iMJl7vyOTuu1eG8DlWl1OQ   1   0          3            0     11.9kb         11.9kb
yellow open   indexname le87KQZwT22lFll8LSRdjw   5   1          1            0      4.5kb          4.5kb
yellow open   db        i6I2DmplQ7O40AUzyA-a6A   5   1          0            0      1.2kb          1.2kb

Create an entry in the blog database and post table: curl -X PUT "$ES_URL/blog/post/1?pretty" -d '

ElasticSearch search engine

In the previous section, we looked at the ELK stack, which consists of Elasticsearch, Logstash, and Kibana. The full set is often extended with Filebeat, an add-on to Logstash better tailored to working with text logs. Although Logstash does its job quickly, it is not used unless necessary, and logs in JSON format are sent through the bulk upload API directly to Elasticsearch.

If we have an application, pure Elasticsearch is used as a search engine, and Kibana serves as a tool for writing and debugging queries (the Dev Tools block). Although relational databases have a long history of development, the principle remains that the more normalized the data, the slower the queries, because the data has to be joined on every request. This problem is addressed by creating a View that stores the resulting selection. But although modern databases have acquired impressive functionality, up to full-text search, they still cannot match search engines in search efficiency and functionality. Here is an example from my work: several tables with metrics are joined into one in a query, and a search is performed by the parameters selected in the admin panel, such as a date range, a pagination page, and a term contained in a column. This is not much data: the output is a table of half a million rows, and the search by date and substring takes milliseconds. But pagination is slow: for the first pages the query takes about two minutes, for the last ones more than four. At the same time, simply combining the data query and the pagination count into one request does not work. The same query, without any optimization, runs in Elasticsearch in 22 milliseconds and returns both the data and the total count needed for pagination.
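The shape of such a paginated search can be sketched in Python as a request body for the Elasticsearch search API. The field names content and created_at are hypothetical; the from, size, bool, match and range keys are standard query DSL. The helper total_hits accepts both the older integer form of hits.total and the newer object form.

```python
def build_page_query(term, date_from, date_to, page, page_size=20):
    """Search body: one page of hits filtered by term and date range."""
    return {
        "from": (page - 1) * page_size,  # offset of the first hit on the page
        "size": page_size,
        "query": {
            "bool": {
                "must": [{"match": {"content": term}}],
                "filter": [{"range": {"created_at": {"gte": date_from,
                                                     "lte": date_to}}}],
            }
        },
    }

def total_hits(response):
    """Total number of matches, needed to draw the pagination controls."""
    total = response["hits"]["total"]
    return total["value"] if isinstance(total, dict) else total

body = build_page_query("error", "2021-01-01", "2021-01-31", page=3)
print(body["from"], body["size"])
```

One response carries both the page of documents (hits.hits) and the total count, which is why no separate counting query is needed.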
