Operations | Monitoring | ITSM | DevOps | Cloud

Database Sharding: What is it, and How it Works?

Today’s world runs on data. We are constantly improving our solutions thanks to the plethora of data available to us in the public domain. Our society has seen a behavioral change when it comes to formulating remedies. We are increasingly adopting data-driven decisions, and rightly so. Now, talking about this whole data logic, where do you think this enormous amount of data gets stored? Well, the answer is a database!

Leveraging Calico flow logs for enhanced observability

In my previous blog post, I discussed how transitioning from legacy monolithic applications to microservices-based applications running on Kubernetes brings a range of benefits, but also increases the application’s attack surface. I zoomed in on creating security policies to harden the distributed microservice application, but another key challenge this transition brings is observing and monitoring workload communication and both known and unknown security gaps.

What is Network Traffic Analysis?

How much network traffic is received by a business in the United States on average? More specifically, how many gigabytes do you think it is? The numbers may surprise you. According to Statista, the average traffic received was nearly 200 billion gigabytes (178.21 billion GB), and it is expected to grow to 224.08 billion GB in 2023. Another interesting traffic statistic, with numbers provided by Broadband Search, is that users in America generate 3.1 million GB every minute.

The Future of Logz.io: Simple, Cost-effective Observability

Asaf and I founded Logz.io in 2015 to provide developers with the ultimate open source log management experience. With our product, logging with the ELK Stack was simple, efficient, and automated for the first time – so customers could save engineering costs and accelerate MTTR.

Lightrun's Product Updates - Q2 2023

During the second quarter of this year, Lightrun continued to deliver a wealth of developer productivity solutions and enhancements, aiming for better troubleshooting of distributed workload applications, reduced MTTR for complex issues, and cost optimization in cloud computing. Read on for the main new features and the key product enhancements released in Q2 of 2023!

The Cost of Upgrading Hundreds of Kubernetes Clusters

At Qovery, we manage hundreds of Kubernetes clusters for our customers on different cloud providers. For most non-operational people, it’s hard to understand what that means behind the scenes: the amount of work it represents, the pitfalls we can encounter, and the associated complexity. Our customers come for several reasons, but they’re all happy to have Qovery handle Kubernetes maintenance and upgrades. On our side, that is too many clusters to manage manually.

ITIL & Risk Management: How Do They Relate?

ITIL and Risk Management are closely related. They're both focused on helping organizations run their IT departments efficiently and, most importantly, safely. But here's the thing: the relationship between the two hasn’t always been clearly defined. That is, until the latest version of ITIL launched in 2019. A new version of ITIL is always exciting in the IT Service Management (ITSM) world, and incorporating knowledge on dedicated Risk Management practices was a very welcome inclusion.

Benefits of GitOps in IT app development

The GitOps model has gained popularity as a software development approach. It enables IT teams to deliver higher-quality software faster and more efficiently. By streamlining and automating the development process, GitOps provides substantial productivity improvements while ensuring comprehensive observability for monitoring and control.

Open-sourcing sysgrok - An AI assistant for analyzing, understanding, and optimizing systems

In this post I will introduce sysgrok, a research prototype in which we are investigating how large language models (LLMs), like OpenAI's GPT models, can be applied to problems in the domains of performance optimization, root cause analysis, and systems engineering. You can find it on GitHub.