- how our logging cluster is built, which lets us understand what is happening with payments and transactions (and with components and services in general);
- the work of data engineers in machine learning;
- implementing and transforming CI/CD.
We are sharing our experience so that you don't repeat our mistakes. We hope you find it useful!
Our mistakes are the key to your success
Maxim Ogryzkov, Senior System Administrator
This talk is about processing logs from several data centers with access through a single interface. We'll discuss the reasons for the cluster upgrade and its consequences. I'll cover how logs are delivered from different systems and environments, and what Apache Kafka has to do with it. I'll also explain why we don't use Logstash and how to knock down a cluster with a single request in Kibana.
1:17 What the talk is about: the logging cluster
1:43 How do logs get into the cluster?
3:50 Why we chose Apache Kafka
5:02 Rsyslog: advantages of using it
9:00 Where to store logs from different DCs?
12:08 What if the amount of data is too large?
14:00 Cluster upgrade
20:30 Our mistakes and solutions
22:35 Translog
24:25 Bulk request
26:28 Opendistro-performance-analyzer
28:28 Index Shrink
29:49 Librdkafka
31:37 Summary: what our cluster looks like now
Data Engineers in Machine Learning
Evgeny Vinogradov, Head of Data Warehouse Development Department
A story about what production-grade work on ML experiments looks like: which problems are solved at the model level and which only at the data level, and how to keep the training process under control.
1:40 About the speaker
2:41 Who is involved in DS projects?
8:30 What is a Data Science project?
14:15 The workflow in a DS project
15:42 The process of collecting a dataset
20:26 How everything works in Apache Kafka
29:10 What happens after collecting a dataset
29:21 How to choose a model?
30:40 Examples of problems that a data engineer can solve
34:38 What technologies does all this work on?
35:03 Conclusions of the talk
CI/CD for data engineers: a round trip
Anton Spirin, Senior Developer of BI
A talk on implementing CI/CD principles in BI development: the goals, how they changed, and the difficulties we overcame.
2:00 About the speaker
2:44 Description of the problem
4:28 Who is a data engineer?
5:43 CI/CD: what is the engineer's job?
6:55 More about the stack and information systems
8:00 Starting point: where we started
10:34 The first stage of changes
15:50 Everything seems fine, but... the second stage of improvements
19:01 Almost a demo: Jenkinsfile, Pipelines
20:44 What did we end up with?
22:43 How long did it take? Release statistics
23:37 Our challenges and what could have been done differently. Future plans
All talks from the big IT conference YuMoneyDay. Materials on PM, testing, and mobile development are on the way.