Analyze Your Mailchimp Campaigns Using Telegraf

Monitoring your email campaigns helps you track key performance indicators (KPIs) such as open rates, click-through rates, and conversion rates. These metrics show how well your campaigns are performing, reveal areas for improvement, and let you gauge the level of engagement your emails generate.

Managing Telemetry Data Overflow in Kubernetes with Resource Quotas and Limits

One of the inherent challenges you'll face when working with Kubernetes is that a typical cluster includes many resources that produce telemetry data. Because producing and moving telemetry data consumes resources, you can end up in situations where different workloads are competing for the resources necessary to manage telemetry data.

Real Production Readiness with Internal Developer Portals

In cultures of continuous improvement, the criteria by which teams define a release's fitness for production are flexible by definition. Engineering organizations strive to balance risk and velocity, aiming for high-quality releases on a cadence that doesn’t impede overall business throughput.

5 Cloud Outages Tracker Tools To Monitor Vendors in 2024

Whether you’re a business owner, a tech enthusiast, or simply a user who relies on cloud services for daily tasks, a cloud outage tracker can be a useful tool. It informs you of downtime, degraded performance, and maintenance affecting the services that modern businesses rely on. Here’s a list of cloud outage tracker tools that can help you prepare for and mitigate the effects of inevitable disruptions in the cloud.

Partitioning Data for Query Performance in InfluxDB 3.0

Query performance is critical in any database. Data partitioning is a mechanism that helps prune unnecessary data, allowing queries to run faster. However, there are always trade-offs between large and small numbers of partitions. For instance, fine-grained partitioning on high cardinality columns can reduce performance. This post describes different partitioning schemes supported by InfluxDB 3.0 and explains their trade-offs.
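The pruning idea can be illustrated with a small, generic sketch. This is not InfluxDB's API or storage layout; the day-based partition key and the function names `partition_by_day` and `query_range` are assumptions made purely for illustration. Rows are grouped by calendar day, and a time-range query skips every partition that cannot overlap the requested range.

```python
from collections import defaultdict
from datetime import datetime


def partition_by_day(rows):
    """Group (timestamp, value) rows into partitions keyed by calendar day."""
    partitions = defaultdict(list)
    for ts, value in rows:
        partitions[ts.date()].append((ts, value))
    return partitions


def query_range(partitions, start, end):
    """Return rows with start <= ts < end and the number of partitions scanned.

    Partitions whose day lies entirely outside [start, end] are pruned
    without looking at their rows.
    """
    scanned = 0
    result = []
    for day, rows in partitions.items():
        if day < start.date() or day > end.date():
            continue  # pruned: this partition cannot contain matching rows
        scanned += 1
        result.extend(row for row in rows if start <= row[0] < end)
    return result, scanned


# Hypothetical sample data: one reading per day over three days.
SAMPLE_ROWS = [
    (datetime(2024, 1, 1, 10, 0), 1.0),
    (datetime(2024, 1, 2, 10, 0), 2.0),
    (datetime(2024, 1, 3, 10, 0), 3.0),
]
```

With this toy layout, a query confined to a single day touches one partition instead of three. The trade-off mentioned above shows up when the partition key is very fine-grained: the number of partitions grows, and the bookkeeping per partition can outweigh the benefit of pruning.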

Coming Soon: Cloudsmith Migration Toolkit

One of our core motivations in building Cloudsmith is to make software developers' lives easier. We want Cloudsmith to be one of those great products that feel intuitive and automate everything. As we’re picking up more and larger customers, we’re seeing an increased need for migration tools. We want to make it as easy as possible for teams who are stuck using JFrog Artifactory, Sonatype Nexus, or other legacy tools to move over to the joy of SaaS artifact management using Cloudsmith.

Finding relationships in your data with embeddings

With the world still working out the limits of LLMs and ever more powerful models being released each month, it’s a little hard to know where to begin. Whether it’s summarising and generating text, building a useful chat assistant, or comparing the relatedness of strings with embeddings, almost all of this can now be done via a few simple API calls. It has never been easier to incorporate these new technologies into your own product.
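As a rough illustration of comparing relatedness with embeddings, the sketch below computes cosine similarity between vectors. Cosine similarity is one common measure and is assumed here; the article may use a different one. The function name `cosine_similarity` and the short toy vectors are hypothetical stand-ins for the much longer vectors an embeddings API would return.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


# Toy "embeddings" (assumed values, not output from a real model).
EMBEDDING_CAT = [0.9, 0.1, 0.0]
EMBEDDING_KITTEN = [0.8, 0.2, 0.1]
EMBEDDING_INVOICE = [0.0, 0.1, 0.9]
```

Scores closer to 1 indicate vectors pointing in nearly the same direction, so in this toy data the first two vectors score higher against each other than either does against the third.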

NGINX Access and Error Logs

NGINX, a widely used web server and reverse proxy, maintains two crucial logs that provide valuable insights into its performance and user interactions: the access log and the error log. These logs play a pivotal role in monitoring and troubleshooting web server activities. The access log records every request made to the server, capturing details such as the requested URL, client's IP address, response status code, and user agent.
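To make the access log fields concrete, here is a minimal parsing sketch. It assumes the server uses the default "combined" log format; a custom `log_format` directive would need a different pattern. The names `LOG_PATTERN` and `parse_access_line` and the sample line are illustrative, not taken from the article.

```python
import re

# Regex for the "combined" format (assumed):
# $remote_addr - $remote_user [$time_local] "$request" $status
# $body_bytes_sent "$http_referer" "$http_user_agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_access_line(line):
    """Return a dict of fields for one access log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])  # status code as an integer
    return fields


# Hypothetical log line using a documentation-range IP address.
SAMPLE_LINE = (
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] '
    '"GET /index.html HTTP/1.1" 200 612 "-" "curl/8.0.1"'
)
```

Calling `parse_access_line(SAMPLE_LINE)` yields the client IP, requested URL, status code, and user agent as separate fields, which can then be counted or filtered, for example to list all requests that returned a 5xx status.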

Alerts Are Fundamentally Messy

Good alerting hygiene consists of a few components: chasing down alert conditions, reflecting on incidents, and thinking about what makes a signal good or bad. The hope is that we can get our alerts to the stage where they page us when they should and stay quiet when they shouldn’t. However, the reality of alerting in a socio-technical system must cater not only to the mess around the signal, but also to the longer-term interpretation of alerts by the people and automation acting on them.