September 2022

Monitor Azure Container Apps with Datadog

Azure Container Apps is a serverless platform that enables you to deploy containerized applications and microservices—regardless of their code or framework—without managing any underlying cloud infrastructure or orchestrators. By using serverless containers, Azure Container Apps can automatically scale based on HTTP requests or events supported by Kubernetes event-driven autoscaling (KEDA) in order to accommodate peak demand and meet your budgeting goals.

Run Datadog Synthetic tests in Azure Pipelines

Continuous integration (CI) demands continuous testing: shifting left helps prevent faulty code from spreading, which is one of the core aims of CI. Datadog’s new Azure DevOps extension enables you to seamlessly incorporate integration and end-to-end tests into existing CI/CD workflows on Azure Pipelines, a dedicated CI/CD service that automatically runs builds, performs tests, and deploys your services and applications via cloud-hosted pipelines.

6 Best Network Inventory Documentation Tools

Although network inventory documentation isn’t the most glamorous task, it is essential for the success of any IT department. According to Steady Networks, “Workers spend about 22 minutes each day dealing with some type of IT issue.” An accurate network inventory can greatly reduce the time it takes for internal IT to identify and solve issues.

Migrating to an open-source private cloud platform: key considerations

Private clouds combine the many benefits of cloud computing, like elasticity, scalability and agility, with the security, access control and resource customisation of on-prem infrastructure. Private clouds allow financial institutions to have greater control over hardware and software choices. They make it easier to enforce compliance with regulatory standards.

Azure VM health monitoring - what's right for you?

Virtual Machines (VMs) are virtual computers with dedicated amounts of RAM, CPU power, and storage borrowed from a physical host computer. A Virtual Machine is a computer file, typically an image, that acts like a real computer. A Virtual Machine can have any operating system that runs in a window as a separate computing environment. Users can choose between a Linux distribution and Windows Server as the operating system.

Create Alerts on Cloud Monitoring

Want to learn about alerts in Cloud Monitoring, or how to create metric-based alerts for Google Cloud products? In this video, we introduce alerts in Cloud Monitoring, how they work, and the different types of alerting policies. Watch to learn how to create metric-based alerts for Google Cloud products.

Data Center Redundancy 101

The world depends on data centers in all aspects of daily life. To meet all-time high levels of demand that continue to grow with no end in sight, downtime is unacceptable for most organizations. The cost of downtime is rising and 40% of businesses report that just one hour of downtime can cost anywhere from $1 million to $5 million, not including the other associated fees. Large companies report that an interruption during peak business hours can cost almost $1 million per minute.

Chaos testing: Reliability for cloud-native apps

Reliability is a critical concern for software delivery teams. Every second of lackluster performance or service interruption comes with high costs. The consequences can extend beyond just monetary expenses and have a huge impact on a company’s reputation. In a survey conducted in 2022, participants reported that over 60% of digital infrastructure failures resulted in losses of $100,000 or more.

Validate your skills with our new Datadog Certification Program

The Datadog platform has evolved to meet the needs of organizations that are investing in cloud-based solutions and modernizations. These organizations need professionals that are highly skilled and understand how to get the most out of the Datadog platform. With our suite of products, features, and tools in mind, we wanted to offer a path for individuals to demonstrate their knowledge of our best-in-class monitoring platform and their understanding of observability best practices.

How Kubernetes is shaping a cloud native future for operators

Leveraging Kubernetes to modernize application infrastructures presents several opportunities. Learn more about how Kubernetes is offering complete automation with reduced human intervention and shaping a cloud-native future. Executive Speakers: Ravi Alluboyina - Vice President of Engineering, Robin.io, A Rakuten Symphony Company; David Lu - VP, Software Defined Networking Platforms and Systems, AT&T

Understanding Domain-Agnostic v. Domain-Centric AIOps Platforms

No matter what we do, we’ll always be surrounded by choices. Do I save money and take the bus, or do I spend money filling up my gas tank? Do I make dinner at home, or do I eat dinner out? Whatever the outcome, it’s our needs – what we require and what we can afford – that help guide us to where we should go. Technology is no exception. Especially in AIOps.

Interlink Enterprise AIOps App - Visualize and manage operational health in a single app.

The power of Enterprise AIOps at your fingertips. The Interlink Software Enterprise AIOps mobile app meets the performance and usability levels of consumer apps, delivering single pane visibility of the operational IT health of your organization to a wide variety of personas.

Ubuntu Arrives on Amazon WorkSpaces: The First Fully Managed Ubuntu VDI on a Public Cloud

29th September 2022 – Canonical is proud to announce the availability of Ubuntu WorkSpaces on AWS, a fully managed virtual desktop infrastructure (VDI) on the public cloud and the first third-party Linux OS available on the platform. Ubuntu Desktop’s availability on Amazon WorkSpaces was announced today at the AWS End User Computing Innovation Day in Seattle, WA.

The Time is Now to Learn from Availability to Optimize Customer Experience

We’ve just launched our inaugural State of Availability Report and the results are sobering. We discovered that: We’d hoped that at this point in the global digital transformation, organizations had gotten further ahead with mastering availability but there’s still a long way to go.

What role does ingenuity play in software development? ft. Bryan Finster of Defense Unicorns

Is it possible that new tech and clever ideas can get in the way of adding value to our business? Bryan Finster, Distinguished Engineer at Defense Unicorns, sits down with Rob and shares his perspective on solving business problems with code. Learn valuable insights on how we can keep the end user in mind while using technology to make their lives better.

Understanding Hyperconverged Infrastructure at the Edge from Adoption to Acceleration

You may be tired of the regular three-tiered infrastructure and the management issues it can bring in distributed systems and maintenance. Or perhaps you’ve looked at your infrastructure and realized that you need to move away from its current configuration. If that’s the case, hyperconverged infrastructure (HCI) may be a good solution because it removes a lot of management overhead, acting like a hypervisor that can handle networking and storage.

SBOMs The New Standard in Supply Chain Security - DevOpsCon NY 2022

Software supply chain attacks using software vulnerabilities remain a key avenue of initial access for attackers. Organizations had to scramble to find out if critical vulnerabilities like Log4J were running on their systems. In response, Software Bills of Materials (SBOMs) are being quickly adopted by enterprises around the globe, so what are they all about? The Linux Foundation research team revealed that 78% of organizations expect to produce or consume SBOMs in 2022.

JFrog's Newest Board Member Shares Her Thoughts on DevOps, Security & IoT

At JFrog, we are passionate about hiring talented people who will help us leap higher and think big, further our innovation, and win in the market – it’s written in our Codex. For this reason, we continue to grow our board of directors and advisors because having solid guidance and the right mix of talent on our board is important to help us, our community and shareholders reach the next level of success in a market that is defined by rapid transformation.

Observability and Auto-Remediation

Organizations today are under pressure to stay ahead and maintain IT applications and infrastructure optimally. That means their IT teams are tasked to make sure that functions move along smoothly while minimizing downtime. To keep the lights on, enterprises add whatever domain-specific tools they need. However, these tools are often reactive, and not nearly robust enough to handle complex application topologies.

Transform Data into Actionable Insights to Empower Digital Experiences

Today we are living and working in a world that is digital-first and hybrid by design, with cloud, SaaS and legacy technologies working together, and employees working from everywhere. In this world, a click is everything. That action comes with intent and expectation—of a flawless digital experience. These experiences are the heartbeat of the fierce and competitive landscape we all work in.

Make Alerts Meaningful Again! Minimizing Alert Noise with Netreo

Alert noise, as well as false positives or too few alerts, undermine the effectiveness of any monitoring solution. Inaccurate alerts condition users to draw poor conclusions. Too many alerts contribute to serious alerts going undetected. Too many false positive or non-actionable alerts cause the significance of all alerts to diminish over time. And too few alerts can lead to misreading system performance and missing critical problems.

Securing Terraform Modules with tfsec

Infrastructure as Code (IaC) patterns have enabled velocity, repeatability, and codification of best practices for our environments. However, using IaC has introduced new challenges, especially around security. Securing manually deployed infrastructure is already difficult. This problem rapidly multiplies when organizations adopt IaC patterns, since they must now contend with the complexity of code and the proliferation of environments enabled by this increased velocity.

Remote vs office, Stack Overflow diversity, CTO skills, Meta layoffs, hiring fraud, and more - News

Atlassian releases the State of Teams report, DevOps World 2022 is postponed, Stack Overflow has a diversity problem, CTOs should be technical, layoffs abound, and hiring fraud is just getting weird. These stories and more on Sleuth News. 0:06: Remote vs Office? 0:43: Remote vs Office? (commentary) About Sleuth: Give Sleuth a try and see why it's a deploy-based Accelerate / DORA metrics tracker both managers and developers love.

Improving Software Failure: Measure, Change, Learn

How do you treat software development failure? Do you take time to measure and learn from software failure? Or do you try to fix it quickly only after your customers complain about it? Failure can be an opportunity to learn and get better. So how can you measure and learn from software failure, and turn failure into at least a partially positive experience? Failure happens all the time, but if you're not measuring it, how do you know what you’re missing?

How Batch Processing is Evolving to Keep Up with the Modern World

Batch processing, as the name suggests, is tackling things in batches rather than having to cater to them individually. When you are running a business, there are hundreds of tasks that are redundant and can be tackled together. This could be a task related to the warehouse or running multiple scripts in one go. Today running a business is very different than what it was years ago. Today businesses require you to choose batch processing. Batch processing will revolutionize your business.

Product metrics @ incident.io, a year (and a half) in

We’ve been celebrating a few big milestones 🎉 at incident.io in the last few months. We were recently discussing product metrics (as you do for fun on a Friday afternoon 🤓) , and Lawrence was very surprised with a particular stat around the number of workflows that have been run using incident.io.

Introducing Logz.io's New Metrics Integration for HashiCorp Consul with OpenTelemetry

HashiCorp Consul began as an open-source project for service discovery. It has evolved to provide other valuable functionality like secure service mesh to help secure microservice architectures based on service identity, but also the ability to achieve repeatable application deployment lifecycles via Network Infrastructure Automation and control access to the service mesh via Consul API Gateway. These features are considered the four core pillars of Consul service networking.

The Complex But Elegant Relationship Between AIOps and Observability

Digital transformation requires organizational evolution. Constant demand for rapid delivery of upgrades and new products forces change. Surely, the old days of managing monolithic applications housed in private servers are over. Applications consist of virtualized, containerized, and serverless code that’s networked via APIs across a hybrid infrastructure of public and private clouds.

Got an incident? Pull the Andon Cord

Andon Cord catapulted Toyota into 40 years of unprecedented quality and domination. What is Andon Cord and how did they do it? In the early 1900s, Taiichi Ohno architected and introduced Andon cord in Toyota's manufacturing plants. The problem: This costs a lot of money. Production costs have always been high. In 1984, it cost NUMMI $15,000 per minute. That's $42,758 in today's value.

The benefits of running Microsoft SQL Server on Ubuntu Pro

Since November 2021, Canonical and Microsoft have been offering a jointly supported Microsoft SQL Server on Ubuntu Pro solution. With this offering, you can set up an optimised configuration of SQL Server on Ubuntu in a few steps. As database professionals, we should ensure the highest possible standards for database security and availability. In this blog, we will detail how the combination of SQL Server and Ubuntu Pro can help you achieve those goals.

Practical Guide on Setting up Prometheus and Grafana for Monitoring Your Microservices

Observability is a very important aspect of software that’s often taken for granted. You need to have visibility into what your application is doing at different levels to better understand an issue when it occurs. There are multiple open-source tools and initiatives to help you achieve improved visibility. When we talk about observability, there are three parts to consider: logs, traces and metrics.

New video: How to visualize your traces - tools and new ideas

In microservices, distributed tracing is a method for aggregating all the operations that occur in your distributed systems that were triggered by a specific request. If these traces are visualized, developers can gain insights into how their service behaves when it’s run with other services, which helps them understand why errors occur.
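
As a rough illustration of the aggregation step described above, the sketch below groups spans by a shared trace ID and renders a simple text timeline. The span tuple layout, the sample data, and the function names are assumptions made for this example; they are not taken from the video or from any particular tracing tool.

```python
from collections import defaultdict

# Hypothetical span records: (trace_id, service, operation, start_ms, duration_ms)
SPANS = [
    ("t1", "gateway", "GET /checkout", 0, 120),
    ("t2", "gateway", "GET /health", 5, 3),
    ("t1", "payments", "charge", 30, 60),
    ("t1", "orders", "create", 10, 15),
]


def group_by_trace(spans):
    """Collect the spans that share a trace ID, ordered by start time."""
    traces = defaultdict(list)
    for span in spans:
        traces[span[0]].append(span)
    for trace_spans in traces.values():
        trace_spans.sort(key=lambda s: s[3])
    return dict(traces)


def render_timeline(trace_spans, width=40):
    """Render one trace as a text waterfall, one line per span."""
    end = max(start + dur for _, _, _, start, dur in trace_spans)
    lines = []
    for _, service, _, start, dur in trace_spans:
        offset = start * width // end
        length = max(1, dur * width // end)
        lines.append(f"{service:<10}|{' ' * offset}{'#' * length}")
    return lines


if __name__ == "__main__":
    for line in render_timeline(group_by_trace(SPANS)["t1"]):
        print(line)
```

Sorting by start time is what lets the waterfall show which downstream call ran inside which upstream request; real tracing tools also use parent span IDs, which this sketch leaves out.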

DevOps vs. SRE: What's the Difference?

Despite there being significant differences in the roles, DevOps and Site Reliability Engineering are often lumped together because many people assume they do similar work. Although both attempt to reduce the issues arising from software development processes, their goals, skill sets, and approaches are actually quite different. DevOps engineers focus on the development pipeline, and their goal is to enable better development processes and workflows.

Top 20 CI/CD Pipeline Interview Questions & Answers

The key to acing a CI/CD interview is preparation. The first step in preparation is to learn as much as you can about the prospective company, including its background, offerings, and hiring practices. In order to help you master your next interview and land your dream job, this blog post includes CI/CD interview questions, all neatly organized into themes. Refreshing your technical knowledge is the next item on the list because it will help you stand apart.

Sponsored Post

What Are Runbooks and How Do They Apply to Network Operation Centers (NOCs)?

Much like in other production environments, the production of cloud services is based on and orchestrated by a plethora of tools that make up part of the overall cloud infrastructure. Because cloud services are complex and intricate, a vast range of detailed steps needs to be performed in a certain order for the production environment to run smoothly, whether that means carrying out maintenance procedures, updates and upgrades, or resolving issues to prevent downtime.

IoT project lifecycle: App-centric software development [Part II]

The traditional embedded Linux development model ties applications to the OS. Such a constraint means apps have to target a specific release, which lowers development velocity. Furthermore, broken upgrades in one part of the device may affect refreshes in the rest of the OS. On the other hand, embedded developers are increasingly looking at open-source software to enable rapid app-centric software deployment and global collaboration.

Load Testing: How Fast Can We Go?

Speedscale creates load tests from recorded traffic so generating load is pretty core to what we do. As a brief overview, we record traffic from your service in one environment and replay it in another, optionally increasing load several fold. During a replay the Speedscale load generator makes requests against the system under test (SUT), with the responses from external dependencies like APIs or a payment processor optionally mocked out for consistency. Your service is the SUT here.
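
The record-and-replay idea above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Speedscale's implementation: the recorded requests, the stand-in system under test, the mocked payment processor, and the `replay` helper are all hypothetical, and no real network traffic is involved.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical recorded traffic; a real recording would hold full requests.
RECORDED = [{"path": "/cart", "user": "a"}, {"path": "/pay", "user": "b"}]


def mock_payment_processor(request):
    """Mocked external dependency: always returns the same canned response."""
    return {"status": "approved"}


def system_under_test(request, payment=mock_payment_processor):
    """Stand-in for the service being load tested (the SUT)."""
    if request["path"] == "/pay":
        return 200 if payment(request)["status"] == "approved" else 502
    return 200


def replay(recorded, multiplier=1, workers=4):
    """Replay the recorded requests `multiplier` times using a thread pool."""
    workload = recorded * multiplier
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(system_under_test, workload))


if __name__ == "__main__":
    statuses = replay(RECORDED, multiplier=3)
    print(len(statuses), "requests replayed")
```

Mocking the payment processor keeps its responses identical across runs, so differences between replays come from the SUT rather than from a dependency.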

Part 6: Observability Maturity Model Summary

For decades, IT operations teams have relied on monitoring for insight into the availability and performance of their systems. But the shift to more advanced IT technologies and practices is driving the need for more than monitoring – and so observability evolved. With infrastructures and applications that span multiple dynamic, distributed and modular IT environments, organizations need a deeper, more precise understanding of everything that happens within these systems.

Get Certified with Argo CD - Now at Level 2 with "Gitops at Scale"

At the start of this year, Codefresh announced the first ever GitOps certification for Argo CD. The certification consists of 3 different levels covering GitOps end to end. With the launch of Level 1 (Fundamentals), we have seen an unprecedented interest from people that want to learn about GitOps resulting in over 10,000 students and 3,000 certified engineers!

Installation Guide: Aqua Security on DKP

In this installation guide, learn how to get Aqua Security up and running on D2iQ Kubernetes Platform. D2iQ, the leading enterprise Kubernetes provider for smart cloud-native applications, has partnered with Aqua Security, the largest pure-play cloud-native security company, to enable organizations to create a seamless DevSecOps experience that accelerates the deployment of secure smart cloud-native applications – and helps stop cloud native attacks.

Automate deployment of a Vue.js application to Firebase

A quick search of the internet will reveal many services available for freely hosting single page applications or static sites. Firebase is one of these services. Firebase is a development platform developed by Google that provides file storage, hosting, database, authentication, and analytics. It is free, provides an SSL certificate by default, and offers impressive speed across multiple regions. I have chosen Firebase for hosting our demo application in this tutorial.

Deploy your Next.js application on Vercel using Sentry and GitHub Actions

Thanks to the power of open source tooling and cloud services, shipping an application to production has never been this easy. In this blog post, we are going to go from bootstrapping a Next.js application to deploying it on Vercel. We will use GitHub Actions to handle the Continuous Integration and Sentry to monitor the application once it is deployed, so that we are warned of any problems as soon as they arise.

How to get complete CI/CD pipeline observability

It's not like it used to be back in the day! Before CI/CD, we were building on-premises, service-oriented products following system style architecture and we were able to map out the build system and end-to-end process in a PowerPoint or Visio document. Although time-consuming and inefficient, it was relatively straightforward and the build pipeline was unlikely to change drastically. But that's no longer the case.

PostgreSQL Monitoring Upgrade

Netdata for PostgreSQL monitoring just got a huge upgrade, collecting 100+ PostgreSQL metrics and displaying these across 60+ different composite charts. You can check the reference documentation for the full list of metrics, and see them running live in the demo space. If you are using PostgreSQL in production, it is crucial that you monitor it for potential issues. And the more comprehensive the monitoring the better!

Scorecards for Resources

Cortex’s Resource Catalog allows engineers to track all of their infrastructure components, from databases to Kafka topics, in a single place. The Resource Catalog demystifies infrastructure, giving developers clear insights into exactly how their service architecture works. Using the Resource Catalog, it’s easy to find information about which infrastructure assets are running and how all the distinct components connect.

Featured Post

The Economic Crunch is Here: Time to Get AIOps Right

Economic warning signs are flashing, and organisations of all sizes are balancing the need for fiscal discipline and efficiency while fighting to retain customers, when a single negative interaction can send them running to a competitor. Business digital operations are more complex than ever, and the problem is compounded by companies still adapting to remote work and pandemic-driven digitisation. Our recent report confirms that delivery teams are facing increased pressures, unreasonable business demands, and higher rates of burnout.

Open-source cloud for beginners with OpenStack

In the beginning, there was Amazon Web Services (AWS). And AWS set a standard for cloud computing. AWS was fast, flexible, convenient to use and geo-redundant. Definitely much better than legacy IT infrastructure or VMware. A lot of enterprises all over the world started migrating their business applications to AWS.

Service Catalog: Simplifying Service Management and Ownership

With the adoption of cloud and microservices, modern IT infrastructures operate with a mesh of services that cater to multiple user requirements. It can get very difficult to simultaneously keep track of numerous services. A Service Catalog helps organize service-related information in a single pane, achieve end-to-end service ownership and get real-time performance insights.

Multi-cloud trends in the media sector

The cloud is helping organisations in the media and entertainment sector unlock new levels of efficiency and speed as they find themselves under pressure to produce and distribute content faster than ever before. In our latest blog we look at how a Network-as-a-Service model can enhance their multi-cloud strategy.

Kubernetes questions for beginners

Whether you are applying for your dream job or simply want to learn more about Kubernetes, we have created the complete Kubernetes Q&A to get you started. Throughout this blog, we aim to provide you with the answers to the most common Kubernetes questions. From the basics of what Kubernetes is to how you can run Kubernetes locally, this blog will cover questions for those just starting out with Kubernetes to those that are looking to refresh their knowledge.

The 10-Step Framework To Developing A Cloud Cost Strategy

Cloud cost intelligence is a game-changing way to understand where your money is going and what that means for your business. Any company with a cloud services bill can benefit from a cloud cost intelligence strategy that outlines which spending decisions are beneficial and which should be scrapped in favor of something else. But let’s say you have done your background research and you understand what cloud cost intelligence is and why it matters. Where do you go from there?

The Best Way to Control Kubernetes and Cloud Costs

Although reducing costs is one of the benefits organizations seek in deploying Kubernetes in the cloud, many organizations find it difficult or impossible to monitor and control their costs. The problem typically stems from a lack of visibility. For example, 53% of respondents in the Anodot State of Cloud Cost Report 2022 said their biggest challenge to controlling costs is gaining visibility into their cloud usage and associated costs.

Civo All Things Kubernetes London Meetup

On Tuesday 10th May 2022, at Investigo in Central London, we hosted our first in-person meetup where we discussed and had multiple talks on "All Things Kubernetes". Around 60 community members joined us, where they networked with others in the space and engaged in cloud-native conversations. Check out our All Things Kubernetes London Meetup and learn more about the talks from our team, and the behind the scenes of all the action!

Top 6 services to Integrate with MetricFire

At MetricFire, we focus on integration with your infrastructure. As one of our business values, MetricFire strives to ensure you can integrate your infrastructure with our Hosted Graphite monitoring service easily. Our engineers are committed to going above and beyond in finding solutions to get our customers the insights they need.

9 Key Reasons to Use or Not Kubernetes for Your Dev Environments

We all know that Kubernetes is the best container orchestration tool in the industry. However, it is not frequently used in development environments due to its complexity and time to set up. That deprives you of many benefits you can gain from Kubernetes in the development environment. In this article, we will discuss the pros and cons of deploying Kubernetes in a development environment. We will go through various factors which decide the suitability of Kubernetes for your organization.

Sponsored Post

Exploring PagerDuty Alternatives for Incident Response

Incident response refers to effectively responding to infrastructure issues and resolving them in the shortest time frame possible. Due to several loss-inducing high-profile outages over the last few years, organizations have sought to create rigorous processes with specialized tools to resolve incidents quickly and learn from their failures. As one of the first platforms to enter the incident response space, PagerDuty is a dominant player, but over the years, competing platforms have begun carving out their own niche in the incident response space.

Fundamentals: Application Acceleration and the Benefits for your Service Delivery

Application acceleration is all about improving the responsiveness of a digital service. When clients access web applications, they are expecting near-immediate feedback from servers. Maintaining that level of performance requires ensuring the right resources are available to process requests, shortening the information retrieval process, and maintaining system uptime by warding off threats.

How to Run Solr Cloud on Docker Containers | Setup Tutorial for Beginners - Sematext

Solr is one of the most powerful and popular open-source search engines. And being able to put Solr in Docker is an absolute must for anyone looking to get into DevOps. In this video tutorial, we will discuss the benefits of putting Solr in containers, the 2 types of architecture Solr can utilize, and how to containerize Solr Cloud in Docker.

The Difference Between Monitoring and Observability and Why It Matters

Organizations are adopting cloud native and multi-cloud architectures to drive innovation, achieve faster time to market, improve yield, and deliver exceptional experiences to their customers. However, for all the business benefits of modernizing, the process does not come without challenges.

Logic App Best practices, Tips, and Tricks: #16 roll back to a previous version of an Azure Logic App Consumption

Today I’m going to speak about another critical best practice, Tips and Tricks that you need to know, especially when you are developing your Logic Apps Consumption directly on the Azure Portal: If we are developing in the Azure Portal, can we roll back to a previous version of an Azure Logic App Consumption?

DORA metrics: Where are you on the journey? @Sleuth TV

We're wrapping up Season 1 of Sleuth TV Live! We've covered a lot of ground, so we'll pull it all together with a talk about the DORA metrics journey and where you can go from here. In episode 6 of Sleuth TV Live, Sleuth's CTO Don Brown and Head of Customer Success Leigh Ann Whitmarsh discussed four levels along the DORA metrics journey, with tips and real-life examples of the opportunities available to maximize your investment in DORA metrics.

Enlightning - What the Flux?! GitOps at Your Fingertips

GitOps with Flux brings you security, reliability, velocity and more - no more pagers on Saturdays! No more breaches to the cluster that you can’t roll back. No more worrying about how you’ll fare in the next security audit. Pinky & Mae will share an overview of Flux and how it works as well as their personal experience on why Flux has been an essential part of achieving a best-in-class delivery and platform team.

Public cloud for telco - Part 2: Google Cloud Platform

This is the second blog in a series focusing on how telecom operators can leverage public clouds to meet their business demands. In a previous blog, we talked about Amazon Web Services (AWS) and how its services made it possible for telcos to shift towards public clouds. In this blog, you’ll get to know about Google Cloud Platform (GCP) and its role in enabling the telecommunications industry to leverage the cloud’s capabilities.

Want to improve your incident response plan? Focus on better incident communication.

Resolving the incident is only half the battle when it comes to responding to incidents. For many teams, incident communication is an afterthought, leaving stakeholders inside and outside the organization guessing what happened. But ensuring that important information about the incident is disseminated clearly and quickly is essential.

Polyglot persistence vs multi-model databases for microservices

Microservice architecture is an application system design pattern in which an entire business application is composed of individual functional scoped services, which can scale on demand. Each team focuses on an individual service and builds it according to their skillset or language of choice. In addition to flexibility, this pattern provides: These features have made microservices architecture a popular choice for enterprises.

Understanding the Observability Maturity Model

Based on research and conversations with enterprises from various industries, StackState created the Observability Maturity Model. This model defines the four stages of observability maturity. The ultimate destination is level four, Proactive Observability with AIOps. However, even moving from level one to two, or from level two to three, is a huge improvement in your ability to get essential insights into your IT environment.

Get More From Your JFrog Platform Using the Cloud Marketplaces

These days, organizations of all types, sizes and industries are using the cloud for a wide variety of business reasons and use cases. Benefits include agility, elasticity, cost savings, ease of deployment, ease of management, ease of procurement and ability to leverage cutting edge technologies. JFrog customers are no different. Indeed, JFrog recognizes that many customers want a hybrid cloud approach with the ability to work across multi-clouds.

Part 5: Proactive Observability With AIOps - Level 4

Level 4, Proactive Observability With AIOps, is the most advanced level of observability. At this stage, artificial intelligence for IT operations (AIOps) is added to the mix. AIOps, in the context of monitoring and observability, is about applying AI and machine learning (ML) to sort through mountains of data looking for patterns.

Robin.io Webinar: Simplifying Kubernetes Storage and Data Management For DevOps

As container and Kubernetes adoption grows, developers and DevOps teams are expanding the use cases beyond stateless applications to stateful applications in order to drive operational consistency, extend the agility of containerization to data, gain faster collaboration, and simplify the delivery of data services. However, when it comes to provisioning storage for complex stateful applications that span multiple pods and services, careful storage management and day-to-day data management capabilities and expertise are critical requirements.

Canonical joins the Connectivity Standards Alliance

September 21st, 2022 – Canonical, the publisher of Ubuntu, announces today that it has joined the Connectivity Standards Alliance as a participant member. In this role, Canonical will help the alliance to develop open standards for the Internet of Things (IoT) and advocate for the role of open-source software in this domain. Canonical is the first company offering a major independent Linux distribution to join the alliance.

Speed up XCUITest execution with parallelism and test splitting

In this article, I’ll show you how to reduce the execution time of XCUITest (UI tests on iOS simulators) by splitting and running them in parallel. Automated tests and CI/CD platforms like CircleCI are necessary for iOS application development. It is important not only to introduce them once but to improve them continuously. When application code grows and automated tests increase, the execution time of build and test in CI/CD gets longer.

StackState 5.0 UI: Gain a Rapid Understanding and Speed Up Discovery

Do you experience this: Your brain seems to explode because there is so much you try to fit into “working” memory? It can happen on a Friday afternoon, after a busy work week. Or on a Monday, looking at your calendar while figuring out how to fit in all those meetings and still get real work done.

Tips for working with web services on Ubuntu WSL #WSL #Ubuntu #Linux

Systemd support has arrived in WSL - unlocking a huge number of quality of life features for managing processes and services. This is particularly useful for web developers who want to set up and develop service applications inside WSL before deploying them to the cloud. In this video, Canonical Product Manager Oliver Smith shares how you can leverage this, and easily install a complete Ubuntu terminal environment in minutes on Windows with Windows Subsystem for Linux (WSL).

The Future of CI/CD: Challenge on the Horizon

Over the last 10 years, software development has shifted. Modern teams build applications on top of third-party dependencies, open source libraries, and more, which has dramatically increased complexity. With so much complexity in software development, how can today’s dev teams build with speed and agility while avoiding risk? Join CircleCI CTO, Rob Zuber, in the first of three executive webinars aimed at empowering CircleCI customers to optimize their software delivery practices.

Expanding the security frontier with JFrog Xray

How are you currently addressing the challenges of securing your software supply chain? In today’s world, it’s essential to go beyond standard security tools when safeguarding your applications from development to production. But you don’t need a plethora of point solutions to do that. The JFrog DevSecOps Platform can take care of it. Expand the software security frontier and join the JFrog Product Management team, as they discuss how JFrog Xray provides intelligent supply chain security and compliance at DevOps speed.

Current state of OpenTelemetry and how it fits in the DevOps ecosystem | Q&A

OpenTelemetry is an open-source project under the Cloud Native Computing Foundation (CNCF) that aims to standardize the generation and collection of telemetry data. This telemetry data helps developers, DevOps, and IT teams keep a check on their application health. The telemetry data collected by OpenTelemetry consists of logs, metrics, and traces. Together, they are used for performance monitoring and observability in distributed systems. At SigNoz, we are building an OpenTelemetry-native APM.

How We Built Qovery - Part 1

I am excited to launch a new series of engineering articles that digs into all the details of how we built Qovery, a platform built for DevOps, SRE, Platform Engineers, and Developers since January 2020. Since day 1, the Qovery team has strived to make Qovery as open as possible and fight against the black box effect! In this series of 5 articles, I will explain as much as possible how things work behind the scenes.

Happy IT Professionals Day! A huge shoutout to all the IT pros

It’s the coolest day of the year again—time to honor and celebrate the people who work behind the scenes to keep our businesses running: IT professionals! Established in 2015, IT Professionals Day is celebrated each year on the third Tuesday of September to celebrate the unsung heroes of the IT world. Join us as we wish all IT pros worldwide a very happy IT professionals day! Kudos to the IT pros!

Common use cases for digital twins in automotive

Digital twins have become somewhat of a buzzword in the past couple of years. But what exactly are they? A digital twin, as its name indicates, is a non-physical copy of a physical object. Just like a digital scan of a physical picture. This virtual element enables a real-time view of all relevant data coming from said object. Depending on the system being studied, specific sensors can be tracked and monitored.

Part 4: Causal Observability - Level 3

It’s not surprising that most failures are caused by a change somewhere in a system, such as a new code deployment, configuration change, auto-scaling activity or auto-healing event. As you investigate the root cause of an incident, the best place to start is to find what changed. To understand what change caused a problem and what effects propagated across your stack, you need to be able to see how the relationships between stack components have changed over time.

How to deploy a React app to Kubernetes using Docker

Containerization lets you run applications in lightweight, isolated environments that behave much like small virtual machines. For a web developer, setting up local development environments can be tiresome. Tools like Docker and Kubernetes help developers quickly set up and deploy applications. This guide uses Docker to deploy a React app to Kubernetes.

Monitoring 10,000 clouds with HashiCorp

While there are many advantages to cloud computing, the cloud is crazy complex. Organizations need to be able to see exactly what’s happening in a variety of cloud environments in real-time, to get a clear picture of the health and operational status of relevant cloud-based components and devices. In this fireside chat, learn how HashiCorp puts Sumo Logic at the center of its security monitoring strategy to help monitor and secure thousands of public and hybrid cloud environments for its customers.

Grafana alerts as code: Get started with Terraform and Grafana Alerting

Alerting infrastructure is often complex, with many pieces of the pipeline that often live in different places. Scaling this across many teams and organizations is an especially challenging task. As organizations grow in size, the observability component tends to grow along with it. For example, you may have many components, each of which needs a different set of alerts. You may have several teams, each with a different channel where notifications should be delivered.

Qovery V3: Advanced Settings Building the Path to Beta Testing

Right at the beginning of the summer, we launched our console V3 in Alpha testing. As explained in this article, the main goal of V3 was to solve the UX issues present in V2; it's also fully open source and rewritten from scratch in React. We gathered a lot of feedback, and our Frontend team is continuing to add every feature already available in V2 to go from Alpha to Beta testing at the end of September.

Ubuntu Core set to redefine industrial computing with new edge AI platform NVIDIA IGX

Enterprises struggle to bring AI and automation to the edge due to strict requirements and regulations across verticals. Long-term support, zero-trust security, and built-in functional safety are only a few challenges faced by players who wish to accelerate their technology adoption.

Spot PC security and compliance

End user computing is a popular target for malware attacks. Virtual desktops are no exception. As noted in previous posts, Spot PC emphasizes a “security in layers” approach to securing virtual desktop sessions. This includes using Windows 365 and Azure Virtual Desktop (AVD) and their built-in user identity and security management offered by Microsoft Azure Active Directory. Spot PC also enables Defender for Cloud for every managed virtual machine.

Software Delivery Platforms to Benefit DevOps Practices

In an era where applications are taking over the world, delivering your service to customers with scalability and security is of the utmost importance. A software delivery platform helps manage the data flow, traffic, and security of the data on both sides of the application. If you are studying software delivery platforms, you have most likely heard about the Codefresh software delivery platform for continuous integration and continuous deployment of applications.

An overview of Monitoring Azure Kubernetes Services

Modern applications are increasingly built using containers and microservices packaged with their dependencies and configurations. Kubernetes is open-source software for deploying and managing those containers at scale, and its name comes from the Greek word for a ship’s helmsman or pilot. Monitoring the health and performance of your Azure Kubernetes Service (AKS) cluster is critical to ensure that your applications are up and running as expected.

The Cloud And SD-WAN

As more and more businesses move to the cloud, they discover that their traditional wide area networks (WANs) can no longer serve their needs. SD-WAN is a networking solution that offers many benefits over traditional WANs, including reduced costs, improved performance, and increased flexibility. But what does SD-WAN stand for? Below, we'll take a deeper look at what SD-WAN solutions are and how they can help your business take advantage of the cloud.

Our journey to become a Powerful Incident Management Platform

Over the last couple of years, Spike.sh has largely been a Simple Incident Management Platform helping engineering teams across the world. Our focus on simplicity has been well received by all of you and we couldn't be happier about it. After speaking with users earlier this year, we quickly realised there is a lot more we can do to help our responders.

8 Steps To Take Before You Can Start Forecasting Cloud Costs

Accurate, fair budgets make everyone in the company happy. Engineers love when they can build products that make the company money, executives enjoy seeing nice, wide margins, and finance departments celebrate when everything goes according to plan. But keeping the budget for cloud computing reasonable can seem like a lofty goal when your company’s cloud spending seems to change with the direction of the wind.

Partnership: Save Your Cloud Costs with Usage AI and Qovery

You are using Qovery, and you love the product, but what about saving some money on your cloud bills on top? Today, we are making your dreams come true with a brand new partnership with Usage AI, and today, I will explain everything you need to know about it.

What the heck is an incident?

Incident management is easily one of the most annoying things anyone ever has to deal with. Only a handful of people will ever want to walk into a burning building to put the fire out. It’s the same with most engineering teams: only a handful are willing to get in, find the root cause, and mitigate the incident.

Introducing the CircleCI Config SDK

We are excited to announce the new CircleCI Config SDK is now available as an open-source TypeScript library. Developers can now write and manage their CircleCI config.yml files using TypeScript and JavaScript. For developers used to the ecosystem and flexibility of a full-fledged programming language, sometimes YAML can feel limiting or intimidating. With the Config SDK you can define and generate your YAML config from type-safe and annotated JavaScript.

Automated Ocean nodes monitoring & alerting is now available

Spot by NetApp is pleased to launch a new API that returns all Ocean managed nodes in a cluster. This new “Get Nodes” API provides detailed information about the nodes in the cluster as well as Ocean related information. Ocean is an automated cloud infrastructure SaaS for containers. With any automated system, it is essential to have ways to monitor what it is doing on your behalf.

Robin.io continues high performance spree; named Leader and Outperformer once again by GigaOm

2022 has been a phenomenal year for Robin.io. We had some big-ticket moments at the Mobile World Congress Barcelona; we became a part of the prestigious Rakuten group, and now, we are continuing the sprint, having been named Leader and Outperformer by the 2022 GigaOm Radar Report for Mobile Edge Solutions.

Making Peace with the Grim Reaper - Liveness & Readiness Probes | Guy Menahem & Anais Ulrichs

Learn all about liveness and readiness probes (done right) from Guy Menahem - Solution Architect at Komodor, the first Kubernetes-native troubleshooting platform, with vast experience working with DBs from old-timey mainframes to cloud-native systems.

ASUS IoT and Canonical partner on Ubuntu Certification for IoT Applications

TAIPEI, Taiwan, September 14, 2022 — ASUS IoT, a global AIoT solution provider, today announced a partnership agreement with Canonical to certify the device manufacturer’s boards and systems with Ubuntu 20.04 LTS. ASUS IoT devices are used in a wide range of edge computing applications. New devices like the PE100A will be certified for optimised performance with Ubuntu, ensuring faster development times and ease of configuration.

FAQ: MLOps with Charmed Kubeflow

Charmed Kubeflow is Canonical’s Kubeflow distribution and MLOps platform. The latest release shipped on 8 September. Our engineering team hosted a couple of livestreams to answer the questions from the community: a beta-release webcast and a technical deep-dive. In case you missed them, you can read the most frequently asked questions (FAQ) about MLOps and access helpful resources in this blog post. Note that you can also watch the videos on YouTube: Beta-release & a technical deep-dive.

What is DevOps? A Comprehensive Guide

The term DevOps is a combination of the words “development” and “operations.” In practice, DevOps is a collaborative approach to the work that is performed by an enterprise’s IT operations staff and their application developers. Collaboration and communication between these two teams, who might otherwise function separately, are meant to increase the speed and quality of product or application releases.

The Ultimate Kubernetes Cost Monitoring And Management Guide

While Kubernetes enables your team to deliver more value, more rapidly, cost discussions around Kubernetes — and Kubernetes cost monitoring — can be difficult. You have disposable and replaceable compute resources constantly coming and going, on a range of types of infrastructure. Yet at the end of the month, you just get a billing line item for EKS cost and a bunch of EC2 instances.

4 Ways to reproduce issues in microservices

Let’s say we have an issue in production. We’ve all been there, right? The first thing we want to be able to do is reproduce the issue. By reproducing, we can confirm it’s a recurring issue, rather than a sporadic one, and that it requires a fix to ensure that our product is working properly. When shifting from a monolith to microservices, reproducing issues becomes more of a challenge.

The DORA metrics backstory

DORA metrics are becoming the industry standard for measuring engineering efficiency, but where did they come from? We talk a lot about DORA metrics here at Sleuth: what they are and how to measure them. But we haven’t shared much about the context of DORA metrics, such as their history and why we use them. So let’s do that. This article provides a summary.

Tips to make your Retrospectives Meaningful

If done right, retrospectives can help you inspect past actions, help adapt to future requirements and guide teams towards continuous improvement. However, organizations find it difficult to adopt the right mindset to execute retrospectives effectively. This blog will help you understand what retrospectives are and provide valuable tips to make your retrospectives meaningful. This blog will cover,

PostgreSQL Monitoring with Netdata

PostgreSQL is a popular open source object-relational database system designed to work for a wide range of workloads from single machines to data warehouses to web services with many concurrent users. PostgreSQL runs on all major operating systems and is used by teams and organizations across the world, including Netdata. If you are using PostgreSQL in production, it is crucial that you monitor it for potential issues. And the more comprehensive the monitoring the better!

How Rush Capped Time to Resolution by Integrating Sentry With Their CI/CD Pipeline

Rated as the top order tracking and revenue generation app on Shopify, Rush lets businesses build and personalize their own dashboards to manage the post-sale process with real-time data, custom product recommendations, and user feedback. Their business model focuses on low touch and user-centered design (UCD), which leaves little room for issues impacting how people interact with the platform.

Canary vs blue-green deployment to reduce enterprise downtime

Even before the cloud, no one liked deployment downtime. With applications hosted in traditional data centers that restricted access for local users, many organizations scheduled deployments when users were less likely to be using the applications, like the middle of the night. With widespread adoption of cloud-based, 24x7 environments available from all time zones, every hour of the day, easy-to-find deployment windows are gone.

Identify and manage impacted customers with our new Zendesk integration

Customer support tickets are a key indicator of which customers are being actively impacted by an incident. Incident-related support tickets are an important component of impact assessment, incident prioritization, and effective stakeholder communications. FireHydrant's new Zendesk integration allows Enterprise tier users to: With our Zendesk integration you can streamline customer impact assessments and incident communications, resulting in reduced support response times and incident durations.

Here's how to drive velocity and business success with self-service

“Give customers the power to help themselves. Self-service options allow for faster problem resolution while reducing strain on your support teams.” – our friends at ServiceNow. Self-service is a crucial component of any DevOps strategy. Many IT organizations still depend on manual and ticket-driven workflows with strong reliance on dedicated teams to make simple and frequent change requests. Unfortunately, these traditional models don’t scale.

How to Break Stuff with Chaos Engineering and Chaos Mesh

In 2011, a Netflix engineering team introduced the concept of chaos engineering with its release of Chaos Monkey. This was initially an in-house tool developed to orchestrate fault injection that Netflix eventually made open source. However, the reliance of Chaos Monkey on Spinnaker, another Netflix engineering innovation, establishes some limitations.

DevOps top programming languages support engineering metrics goals

This post, authored by CircleCI Senior Technical Content Marketing Manager Jacob Schmitt, was originally published on The New Stack. One of the privileges of working in the continuous integration space is the unique perspective it offers into how software teams organize their work to deliver value quickly without sacrificing quality, security, or developer happiness. At CircleCI, we support more than 2 million developers running 90 million build, test, and deploy jobs each month.

Why Enterprises Choose Canonical Ubuntu on AWS

Canonical is excited to partner with AWS and feature on this week’s episode of AWS on Air. Watch us live on September 16, at 12pm PT. As the publisher of the Linux distribution Ubuntu, Canonical supports, secures, and manages Ubuntu infrastructure and devices for thousands of businesses. Ubuntu runs from cloud to edge. It is the platform that everybody uses on the public cloud including AWS, and the preferred workstation experience for builders all over the world!

Experimenting our way to success ft. Aniel Sud, CTO of Optimizely

In this episode, Rob is joined by Optimizely CTO, Aniel Sud, to discuss the importance of experimenting for growth. Entrepreneurship and innovation require courage, but having courage can bring on emotions that make it difficult for us to experiment objectively. How do we hold our strong opinions loosely to press forward with new information?

Difference between Docker Image & Docker Container

A Docker image is a set of instructions used to create a Docker container and execute code in it. Docker images act as a template for building and running a Docker container, and they are the starting point when using Docker. A Docker image contains read-only files: once an image is created, it cannot be changed or modified. It is a read-only template that holds the instructions for deploying containers.

Scaling up your CI/CD Pipeline as your Organization Grows

As enterprises adopt cloud-based technologies, their IT infrastructure becomes more complex. The need to constantly scale up the CI/CD pipeline presents new challenges for IT teams. This blog post will share best practices and learnings from our work with enterprise customers who have successfully scaled up their CI/CD pipelines and discuss how to solve these challenges with Cloudify in production environments.

Resolve Systems Recognized in the 2022 Gartner® Hype Cycle for I&O Automation Report

Resolve Systems announces that Gartner has named the company as a Sample Vendor in its Hype Cycle for I&O Automation, 2022 report in the service orchestration and automation platforms (SOAP) category. We believe this report helps I&O leaders to understand and evaluate automation-centric technologies that deliver faster value, improve efficiency, and optimize costs.

Building Workflows, Part 1 - Core concepts and the Workflow Builder

At incident.io, we’re building tools to help people respond to incidents, often by automating their organisations’ process. Much of this is powered by our Workflows product, which customers can use to achieve things like: Workflows as a product feature are incredibly powerful, and we’re proud of the value they provide to our customers. Behind the scenes, though, building something like workflows can be difficult.

Building Workflows, Part 2 - the executor and evaluation

This is the second in a two-part series on how we built our workflow engine, and continues from Building workflows (part 1). Having covered core workflow concepts and a deep-dive into the Workflow Builder in part one, this post describes the workflow executor, and concludes the series with an evaluation of the project against our goals.

Should you use open-source databases?

You are not the only one asking this seemingly popular question! Several companies are torn between the rise in appeal of open-source databases and the undeniable challenges inherent to their adoption. Let’s explore the trends, the drivers and the challenges related to open-source database adoption.

AIOps for Real: Characteristics of a Platform That Add Value and Drive Change

When you’re investing in automation solutions, ultimately, tangible results need to follow quickly. Getting a return on investment (ROI) out of an automation project after two years is something that would have been OK in the not-so-distant past but is no longer acceptable nowadays. With the current speed of change, where new technologies come and go and existing ones evolve at lightning speed, IT teams require much faster time to value on automation investments.

How to Build and Maintain a Winning Culture of Success - Steve Smith (DecisionPoint Systems)

The binding element for any company is its culture and how that culture translates into growth, profits and success for employees, customers and investors. In his thought leadership session, DecisionPoint Systems CEO Steve Smith reveals his individual formula for sales success and the expectations customers have of individual employees, as a group and as a company.

Introducing Webforms - Involve end users directly into your Incident Management process

Over the years we’ve received requests from our customers for a feature that enables their customers and end users to create/report incidents directly on Squadcast. To our valued customers - we heard you! We are excited to introduce Webforms to do exactly that. In the past, we’ve addressed the challenges pertaining to On-call processes and best practices that teams can implement.

What's difficult about problem detection? - Three Key Takeaways

Welcome to episode 4 of our webinar series, From Theory to Practice. Blameless’s Matt Davis and Kurt Andersen were joined by Joanna Mazgaj, Director of Production Support at Tala, and Laura Nolan, Principal Software Engineer at Stanza Systems. They tackled a tricky and often overlooked aspect of incident management: problem detection.

Kubernetes Load Testing: Speedscale vs NeoLoad

In this article, you’ll be introduced to two tools: Speedscale and NeoLoad. Both of these tools offer you a way to load test your applications. This post will compare their ease of setup, development experience, fit within a modern infrastructure, and integration into CI/CD. Load testing is not a new concept in any way: the term was common even before Google Trends started recording data in 2004.

Seamlessly Secure Your Cloud-Native Applications with D2iQ + Aqua

As businesses embrace cloud-native application development as a basis for modernization, the shift creates significant security challenges. D2iQ has partnered with Aqua Security to enable organizations to create a seamless DevSecOps experience that accelerates the deployment of secure smart cloud-native applications to prevent and foil cloud-native cyber attacks.

20+ Google Cloud Monitoring Tools And Best Practices For 2022

Google Cloud Platform (GCP) offers a great alternative to Amazon Web Services (AWS) and Microsoft Azure. In case you use Google Products extensively at work, such as Google Workspace, moving to GCP may be a natural choice too. Perhaps you already use GCP with AWS, Azure, or another cloud provider as part of your hybrid cloud strategy. But maybe you struggle to manage GCP costs, monitor security and compliance, and observe performance.

How to Manage Staging Environments to Speed Up Your Deployments By 5x

The staging environment plays a crucial role in product development. It's the last checkpoint before product updates go live for customers. Every successful product has a robust and effective staging environment behind it. However, traditional staging environments cannot keep pace with the modern CI/CD workflow. This article will go through how traditional shared and static staging environments hinder faster deployments and efficiency.

Automate testing for a Vue.js application

One of the leading frameworks in the JavaScript community, Vue.js is a progressive framework for building reusable components for the web user interface. Its intuitive API and robust flexibility for handling front-end logic are just two of the reasons that Vue has been adopted by developers worldwide. In this tutorial, I will lead you through building a simple listing application that shows the names and roles of users. I will show you how to write tests for the application.

Jenkins vs. Jenkins X: Which Automation Server Should You Choose?

Today’s DevOps teams can reach project goals quickly because they rely on continuous integration and continuous delivery (CI/CD) tools that automate tasks. The problem is that there are tons of CI/CD tools out there. How do you know which one serves your team’s needs best? Jenkins stood out as a top contender for years after its release in 2011. More recently (2018), Jenkins X appeared as an alternative.

Consider these 9 microservices best practices to help you ditch your monolith

Microservice architectures have become extremely popular in recent years, and for good reason. When managed properly, they improve scalability, encourage faster development and deployment, and reduce data and domain coupling. Companies of all sizes, from small startups to large enterprises, have migrated their monolithic applications over to microservice and service-oriented architectures. Making the move from monolith to microservices is a big shift, though.

Part 2: Monitoring - Level 1

The first level of the Observability Maturity Model, Monitoring, is not new to IT. But as reliable IT system operation becomes more and more critical, the importance of monitoring continues to increase. A monitor tracks a specific parameter of an individual component in the system to make sure it stays within an acceptable range; if the value moves out of the range, the monitor triggers an action, such as an alert, state change or warning.
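The monitor described above can be sketched in a few lines of Python. This is only an illustration of the idea, not StackState's implementation: the `Monitor` class, the `web-01` component, the `cpu_percent` parameter, and the 0 to 85 range are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Monitor:
    """Tracks one parameter of one component against an acceptable range."""
    component: str                    # hypothetical component name
    parameter: str                    # hypothetical parameter name
    low: float                        # lower bound of the acceptable range
    high: float                       # upper bound of the acceptable range
    on_breach: Callable[[str], None]  # action to trigger, e.g. send an alert

    def check(self, value: float) -> bool:
        """Return True if the value is in range; otherwise trigger the action."""
        if self.low <= value <= self.high:
            return True
        self.on_breach(
            f"{self.component}.{self.parameter}={value} "
            f"outside [{self.low}, {self.high}]"
        )
        return False


# Collect alerts in a list so the example has no external dependencies.
alerts: list[str] = []
cpu_monitor = Monitor("web-01", "cpu_percent", 0.0, 85.0, alerts.append)

cpu_monitor.check(42.0)  # in range, no action is triggered
cpu_monitor.check(97.5)  # out of range, one alert is recorded
```

In a real system the `on_breach` callback would more likely page someone or change a component's state; a plain list keeps the sketch self-contained.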

Observability and Resilience in Microservices-based Environments [Komodor + Epsagon Webinar]

Kubernetes has made it easier to manage and scale microservices. However, keeping track of so many moving parts is often challenging for Dev & Ops teams. Achieving clear observability for better monitoring and troubleshooting is key to improving the development process.

How do Resolve's intelligent IT automation solutions help IT reduce complexity? | Resolve

Today's IT organizations are dealing with a level of business system complexity like we've never seen before. The IT-verse is packed with a growing variety of technology challenges, making it almost impossible to keep up with business demands in a timely fashion. In this video, Resolve CEO Vijay Kurkal explains how Resolve can help your IT team rein in all of this complexity with one platform and thousands of pre-built automations to free up your staff to do more meaningful work.

Managing Squadcast resources with our expanded Terraform provider

Hey folks! We’re excited to announce that we’ve vastly expanded the capabilities of our Terraform provider. Previously, our Terraform provider was limited to creating and managing services as a resource. We have now covered the entire spectrum of resources available on Squadcast, from creating and managing users and escalation policies to managing SLOs, via our Terraform provider. What does that mean for you?

Zoom Phone Delivers Local Survivability with Ribbon SBCs

Migrating to a cloud-based phone system is compelling because it eliminates the costs and time associated with deploying and managing legacy phone system/PBX hardware and proprietary business phones. However, moving 100% of an organization’s communications infrastructure off-site can present new challenges. If the connection to the cloud is lost, a site could lose both external communications and intra-site communications.

Asynchronous Correlation with Serverless360 BAM

One of the critical differences between distributed tracing and business activity monitoring is that distributed tracing usually assumes that your transaction executes from start to end in a reasonably short time. An example would be that your call to an API might then drop a message on a queue which a function processes and loads into a database.

Comparing Ways to Connect to Microsoft Azure

Curious about Microsoft Azure and the best ways to connect? Azure is a hybrid Cloud Service Provider (CSP) with customized, scalable, cloud-based packages. These encompass Software as a Service (SaaS), based on subscription-based software licensing and delivery, Platform as a Service (PaaS), allowing companies to develop, deploy, manage, and update applications, and Infrastructure as a Service (IaaS), providing high-level application programming interfaces (APIs).

Running Cloudify Github Actions Locally

Cloudify offers a set of GitHub actions that can be used to interact with your managers. You can combine and use those actions based on your needs. You can check them out in the GitHub marketplace. This brings us to the main point: a developer needs a way to test or debug GitHub workflows locally, without modifying the workflow in the repository (adding extra commits just for debugging) and then going through the logs in the GitHub Actions tab.

Track your carbon footprint with Hardware Sentry's offering in the Datadog Marketplace

As we enter a critical period in the effort to mitigate climate change, organizations are facing mounting regulatory pressure—along with a biological imperative—to reduce their carbon footprint. And for those that maintain significant on-prem infrastructure, energy costs associated with operating hardware components can significantly affect their bottom line.

Multi-cloud trends in the retail sector

In retail, the cloud changes everything because it offers flexibility and the opportunity to do things better, or to do things that were previously impossible, for a lower cost. This is especially true in a multi-cloud and hybrid cloud environment, where retailers can connect their legacy private data centres with public clouds to enhance their offering and make it more competitive.

Cloudsmith: The Single Source of Truth For Your Artifact Management

Say hello to Cloudsmith! Cloudsmith is the only cloud-native, global, universal artifact management platform for engineers looking to set up a secure artifact repository in 60 seconds. Cloudsmith offers support for 28+ formats, has 410+ points of presence, is ISO 27001 certified, and integrates with all of the tools you already use and love.

Cybersecurity Companies Have A Customer Profitability Problem - Here's Why

The age of growth-at-all-costs is over. Profitability matters — and it matters now. For SaaS companies who rely heavily on the public cloud, understanding what’s driving (or hurting) profitability can be tricky. Different customers have different needs and usage patterns, drive different levels of cost, and impact profitability unequally. Cybersecurity has played a central role in CloudZero from the beginning —or really, before it.

Civo Update - September 2022

In August, we announced a brand new tech event, Civo Navigate! This event will bring together the brightest minds in cloud native technology to provide a packed event with over 50 speakers. Civo Navigate is focused on bringing together the cloud native community and creating an environment where everyone can share ideas, collaborate and learn more. If you are interested in joining us in person, register now to get in-person access to all keynote sessions, workshops, and breakout sessions.

Redgate Clone: exclusive early access to next-gen database provisioning

At the end of last year, we announced that Redgate’s database cloning technology was getting an upgrade: Multi-RDBMS, instance-level clones, and support for containerized workflows. This next generation of database provisioning provides DevOps test data to more teams for fast, quality releases across your software organization. Today, we invite you to join the early access program (EAP) for Redgate Clone.

Analyze Pacemaker logs in Cloud Logging

As an SAP system administrator, you've probably asked yourself: why did my Compute Instance restart? Why did Pacemaker restart my instance? Why did/didn’t my SAP system fail over? By streaming Pacemaker logs into Cloud Logging, you can now find the answers to these questions by using a Cloud Logging query template to filter out the noise generated by Pacemaker logs.

Blameless Expands Microsoft Partnership to Deliver Faster, More Intuitive Incident Response Collaboration

At Blameless, the world’s leading software engineering teams rely on us during incident management. A key part of our offering is the ability to seamlessly integrate with a customer’s unique tech stack. As such, we value partnerships with companies like Microsoft that enhance our user experience and meet the needs of our customers. We understand how essential it is to integrate with communication tools like Microsoft Teams, because it’s the first place a user goes to start an incident.

Developer Week Cloud Austin: Hit or Miss?

Hi there, Albane here, Product Marketing Manager at Qovery, writing to you from Texas 🤠 With Romaric (CEO at Qovery) and Morgan (Co-founder at Qovery), we spent the last two days promoting Qovery at the Developer Week Cloud that took place in Austin, Texas. After weeks of preparation, quite a few hours of travel, and a whole setup, let me give you some insights about our first-ever conference as an exhibitor!
Sponsored Post

Top Tools to Help Debug Kubernetes Applications

When building cloud-based applications, managing the infrastructure becomes a bigger challenge as you scale. Kubernetes brings order to the chaos, letting you control and automate the containers used to deploy your application. Debugging in the cloud presents further challenges, and the complexities of distributed applications make it hard for many debugging setups to keep pace. Tools designed to run locally aren't effective. However, there are Kubernetes debugging tools that can handle the shift in paradigm. In this article, you'll read about several options that make debugging Kubernetes applications much easier.

What's new in Kubernetes v1.25?

Kubernetes as a project is growing at a rapid pace, resulting in many features being added, deprecated, and removed throughout the process. With the announcement of Kubernetes v1.25, there were a total of 40 enhancements and 2 features being either deprecated or removed. This blog will look at the significant changes coming with the release of Kubernetes v1.25 and how this will impact future use.

Announcing macOS Runners in Bitbucket Pipelines

We are happy to announce that Bitbucket Pipelines now supports macOS self-hosted runners. We have moved from beta to an official release. You can now create a self-hosted runner and run it on your macOS infrastructure to run macOS and iOS builds. Since you’re using your own runner, you won't be charged for Bitbucket Pipelines build minutes.

Enlightning - Delivering Your Platform Your Way Using Kratix

Your organization wants to ship valuable products faster. You either have or are considering an internal platform to centralize common services and provide a self-service developer experience. To deliver this experience, as a platform team, you need to balance software engineers vying for the newest tools and business stakeholders requiring assurances and visibility. Configuring the right solution to meet organization specific needs is hard. You have to frequently context switch as you divide your requirements across a number of different tools.

Mobile app security testing: tools and best practices

To minimize the security risks of an application, developers need their apps to stand up to stringent security testing. Fortunately, there are tools available that simplify and even automate these security tests. There are also best practices to guide and inform the testing process. In this article, I will cover the most common security issues for mobile apps and highlight popular security tests.

A technical deep dive into Kubeflow 1.6

Kubeflow 1.6 is finally here! 🎉🎉🎉 The open source MLOps platform of choice keeps evolving year over year, growing in popularity and available features. Learn about the technical aspects of the new release and listen to a deep dive into the new features with the engineering team of Charmed Kubeflow. We will be talking about pipelines, Katib and the news about the scheduler.

Should you measure developer productivity? @Sleuth TV

In episode 4 of Sleuth TV Live, Sleuth's CTO Don Brown and CEO Dylan Etkin had an engaging, insightful conversation about how DORA metrics connect to your people and how they play into developers' happiness. Give Sleuth a try and see why it's a deploy-based Accelerate / DORA metrics tracker both managers and developers love.

SRE vs DevOps: Can they coexist or do they compete?

Systems fail, sometimes publicly and at great cost. Airlines have experienced system-wide ticketing outages, causing hundreds of flight cancellations and significant inconvenience to customers. Retailers have experienced website crashes on the busiest shopping days of the year, costing millions in lost revenue and customer goodwill. It is vital to understand both DevOps and SRE and the roles they play in preventing such outages.

Send Amazon VPC flow logs to Amazon Kinesis Data Firehose and Datadog

Amazon Virtual Private Cloud (Amazon VPC) is an isolated and secure virtual network in which you can deploy resources, such as Amazon Elastic Compute Cloud (EC2) and Amazon Relational Database Service (RDS) instances, while restricting their exposure to the internet. As part of your monitoring strategy, you can collect and analyze VPC flow logs, which record network traffic flow between VPC components.
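
As a small illustration of analyzing flow logs, the sketch below parses a single record into named fields. It assumes the default (version 2) flow log format with space-separated fields. Custom formats use different field sets, and `parse_flow_log` is a hypothetical helper, not part of any AWS or Datadog SDK.

```python
# Field order of the default (version 2) VPC flow log format.
FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]


def parse_flow_log(line: str) -> dict:
    """Split one default-format flow log record into a field-name -> value dict."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


# Sample record with made-up account, interface and address values.
sample = (
    "2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 "
    "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK"
)
record = parse_flow_log(sample)
```

Here `record["dstport"]` is `"22"` and `record["action"]` is `"ACCEPT"`, meaning the sample describes accepted traffic to port 22. Values are kept as strings, so convert ports, packet counts and byte counts to integers before aggregating them.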

Charmed Kubeflow 1.6 is now available from Canonical

8 September 2022 – Canonical, the publisher of Ubuntu, announces today the release of Charmed Kubeflow 1.6, an end-to-end MLOps platform with optimised complex model training capabilities. Charmed Kubeflow is Canonical’s enterprise-ready distribution of Kubeflow, an open-source machine learning toolkit designed for use with Kubernetes. Charmed Kubeflow 1.6 follows the same release cadence as the Kubeflow upstream project.

CloudHedge Partners with Tech Mahindra to Accelerate Modernization of Legacy Apps to Cloud using OmniDeq

Iselin, New Jersey – 8th September, 2022 – CloudHedge Technologies, the leader in application modernization, partners with Tech Mahindra, a leading provider of digital transformation, consulting and business reengineering services and solutions. The partnership will accelerate modernization of business applications for enterprise customers using OmniDeq™.

New Puppet Enterprise LTS release increases security and compliance

We’re pleased to introduce updates to Puppet Enterprise that give infrastructure operations teams the insights they need to manage and protect infrastructure and complex workflows in a simple yet powerful way. With Puppet Enterprise 2021.7, teams gain automatic access control and a host of system insights related to runs and events.

How to create your ChatOps bot

Communication within organizations has evolved from email threads to real-time chats. With the vast potential of real-time messaging, organizations are beginning to explore how they could make these chat applications do more. We now even have a name for this phenomenon — ChatOps. To get a taste of the power of ChatOps in action, let’s build a ChatOps bot that sits in a Mattermost channel with Errbot.

D2iQ Kubernetes Platform and Amazon EKS: Better Together

Amazon Web Services (AWS) Elastic Kubernetes Service (EKS) offers a great foundation to build cloud-native applications by minimizing the expertise needed to operate Kubernetes. However, production-grade enterprise platforms require more than Kubernetes and need to be augmented with additional capabilities to meet requirements. These additional services can be added easily by D2iQ, which is a close AWS partner.

What's the outlook for digital transformation? | Resolve

Digital transformation has gone from utopian buzz phrase to now a near-term set of goals and initiatives for most companies. What's driving this sense of urgency? Resolve CEO, Vijay Kurkal, explains how every functional area is undergoing its own transformation and what that means for the enterprise as a whole.

Deploy a Dockerized Laravel application

As web applications become more complex, software engineering teams must rely on many different products and services to create the best developer experience. The application development ecosystem has grown beyond version control and hosting deployment. Manually managing the deployment of new features across all services can create a serious bottleneck in the software development lifecycle. It also introduces the risk of human error.

What Is Cisco ACI?

Cisco ACI is an enterprise-class, software-defined networking (SDN) solution that provides complete control of the data center network. Using a policy-based approach, Cisco ACI delivers security, performance, and scalability for today’s demanding applications. Cisco ACI is part of the broader Cisco SDN portfolio, which also includes Nexus switches and Application Centric Infrastructure (ACI) controllers. If you are looking for a way to improve your data center network, Cisco ACI may be the answer.

Using Observability with Kubernetes to Automate Site Reliability Engineering

In this video, Anthony Evans, solution architect, explains how the StackState topology-powered observability platform can help SREs to automate site reliability, putting their organizations on the path to becoming a zero-downtime enterprise. See how StackState helps to unify and correlate data across your stack, visualize your entire IT environment, instantly pinpoint root cause, reduce alert storms and with AIOps capabilities, even prevent problems proactively. It's all here!

10 Essential Cloud DevOps Tools for AWS

Building, testing, and monitoring applications in the cloud is a unique challenge. While many organizations have embraced a DevOps methodology, their DevOps machine is still not at the level of maturity they might like it to be. According to a recent survey, 53% work on a team with a 'low level' of DevOps based on maturity factors.

Customizing and Securing Your Epinio Installation

As I’ve written about before, Epinio is built to be very flexible. In this blog, I will highlight several places we can hook into the rest of your existing infrastructure to give you a better experience. If you’re new to Epinio, it’s the application development engine for Kubernetes that lets you go from code to URL in one step.

5 reasons why you shouldn't buy incident.io

Not many companies will tell you why you shouldn’t use their product, but any product that tries to be everything to everyone is doomed to failure. When you build without a specific user in mind, your target becomes the intersection of many viewpoints, and what you build is the lowest common denominator. What usually follows is software that can technically do everything, but feels unfocused, complex, and unpleasant to use. Something everyone is equally unhappy with.

Network Monitoring & eBPF

I’m not going to lie, I have a strong hatred towards the Berkeley Packet Filter (BPF). There are a lot of reasons mainly having to do with having to support BPF on a network monitoring tool. There’s also the challenge of writing BPF filters and the weird way they work. So when I first heard about eBPF, I was more than a little reluctant to be excited. As I dug in further, I became much more excited about the technology and the benefits it can bring. So, what is eBPF then?

One codebase, many projects: Drupal multisite fleet management

For years, Chromatic managed virtual servers on behalf of our clients. As tools like Platform.sh matured, our team realized we were spending much of our clients’ budgets on simply maintaining those servers—instead of making their sites better (which is kind of our reason for existing).

Platform.sh hires Leah Goldfarb as Environmental Impact Officer to oversee greener web hosting strategy

Platform.sh has announced Leah Goldfarb as its new Environmental Impact Officer. In what is the first hire of its kind for the PaaS industry, Goldfarb will oversee the company's greener web hosting strategy as Platform.sh looks to further reduce the digital carbon footprint of its large enterprise clients.

Monitor user-facing bugs with LambdaTest's subscription in the Datadog Marketplace

As your products and client base scale, maintaining effective test suites and providing rapid response to user-facing issues becomes increasingly challenging. Without thorough testing, bugs are more likely to go undetected, which creates poor user experiences and slower release cycles. LambdaTest is a cloud-based platform that supports real-time and automated testing for over 3,000 browsers, real devices, and operating systems.

25 AWS Monitoring Tools And Best Practices For 2022

Cloud computing offers several advantages over legacy on-premises systems, including cost, scalability, and performance. Today, Amazon Web Services (AWS) offers over 200 cloud services that can integrate seamlessly with your existing workflows, making it one of the most popular public cloud platforms. AWS strives to make its tools easy to use, but managing resources and services can be challenging.

SecurityDAM's NOC Management Takes Off With MoovingON.ai Platform

When SecurityDAM (acquired by Radware), a DDoS protection service provider, needed to upgrade their NOC operations, they tested out multiple solutions before choosing MoovingON.ai. From increasing efficiency and visibility to improving ticket resolution times and runbook automation, MoovingON.ai provided the NOC manager and team with everything they needed to run operations more smoothly and effectively.

DevOps 101: Unlocking the value of frequent deployments

In this DevOps 101 series, I introduce the concept of DevOps and talk about how you can include the database as a natural partner. In the previous post in the series, I discussed how automation introduces faster and more frequent deployments as a key benefit. We’re now going to take a deep dive into the value you can unlock through frequent deployments using database DevOps, along with how you get started doing them.

RIA Vendor Selection Matrix for AIOps 2022

In July, the research firm Research In Action (RIA) published the 2022 edition of their annual Vendor Selection Matrix™. Although AIOps is a well-established technology (Moogsoft has customers who have been reaping the benefits of AIOps for many years), selecting a vendor can still be quite difficult, given the plethora of vendors who quickly re-branded their solutions as AIOps. So a vendor selection guide is a valuable resource.

Keep control of your Organization's Usage with our New Organization Usage page. #Blackfire

Predicting the traffic of our #applications is as challenging as forecasting the weather. Whether it’s a sudden spike in your apps’ traffic, a set of bugs pushed in #production, or even a full-on #cyberattack, unexpected surges can bring significant consequences. We may have the facts to anticipate a solid estimate, but we can’t plan for the anomalies. This is a concern shared by many #blackfire Monitoring customers. And it’s directly linked to ensuring you have the right amount of traces to maintain your applications’ continuous instrumentation—while controlling costs.

What makes Resolve the leader in intelligent IT process automation? | Resolve

IT is complicated. The good news is that it no longer has to be. In this video, Resolve CEO Vijay Kurkal explains the advantages that Resolve's automation solutions can bring to IT organizations, from streamlining large volumes of service desk requests to auto-remediating incidents, with seamless integrations across multi-system silos.

Changes are Observability's Biggest Blind Spot

Classically, the space of observability lies within layers of information on a dashboard. It operates by using the fundamental trio of data — metrics, logs and traces — from each layer of the environment to assess the health of an IT infrastructure. However, a time component is critical, making the stack observable at any point in time. Gathering reliable data and insights into your IT infrastructure remains the primary role of observability tools and services.

Tech Story: Papershift x Qovery Infrastructure Scaling Made Easy

A few days ago, I chatted with Florian Suchan (CTO and Co-Founder at Papershift) about their journey to easy infrastructure autoscaling. As you will see in the article, they tried a wide range of solutions before finding the right fit; if you feel you’re going through the same journey, this article is for you!

Business Activity Monitoring with Flat File Messages

Let us consider that you have implemented an integrated solution and are using Serverless360 Business Activity Monitoring (BAM) to help provide your support users and business users with visibility of what is happening in the business transaction. BAM provides you with distributed tracing to attain maximum visibility on the integration solution that the functional operations team needs.

Introducing Netdata Source Plugin for Grafana: Enhanced high-fidelity troubleshooting data source for the Open Source community!

The open-source community is about to benefit greatly from Netdata’s new Grafana data source plugin, which makes use of a powerful data collection engine. This new plugin maximizes the troubleshooting capabilities of Netdata in Grafana, making them more widely available. Some of the key capabilities provided to you with this plugin include the following.

Radware's NOC Management Takes Off With MoovingON.ai Platform

When Radware, a DDoS protection solutions provider, needed to upgrade their NOC operations, they tested out multiple solutions before choosing moovingon.ai. From increasing efficiency and visibility to improving ticket resolution times and runbook automation, moovingon.ai provided the NOC manager and team with everything they needed to run operations more smoothly and effectively.

We're making our on-call calculator free

We've all done it: "that'll be simple, I'll just write a quick script and..." In the case of calculating on-call pay, we really have done it before: our team have built the on-call pay scripts for several companies, and each attempt was a painful, error-prone process. While we believe everyone on-call should be paid for their inconvenience, relying on someone's side project or back-of-napkin maths to calculate pay leads to mistakes, frustration, and wasted time.

What is a Security Operation Center and how do SOC teams work?

With the growing complexity of IT environments, it is essential to have robust security processes that can safeguard them from cyber threats. In this blog, we will explore how security operation centers (SOCs) help you monitor, identify and prevent cyber threats to safeguard your IT environments. This blog covers the following pointers.

Authors' Cut-No More Pipeline Blues: Accelerate CI/CD with Observability

It’s no secret that CI/CD pipelines make the lives of engineering and operations easier by accelerating the feedback loop for higher quality code and apps. They build code, run tests, and safely deploy new versions of your application. But just like any aspect of development, poor integration, invisible bottlenecks, and bugs can plague your pipelines. And debugging them? Well, it’s complicated.

The Right Car (Or Cloud Service) For Your Family (Or Work) Adventures

Managing gas bills has a lot in common with managing cloud spend. In the summer, our gas bill increased with our summer adventures. Sitting home with the kids out of school and the warm summer months was not an option. Likewise, while discounts on gas certainly help, in reality they do not put much of a dent in the overall bill. The cheapest gallon of gas is still the one not used. If you’re a modern cloud-native business, you can’t just stop using the cloud.

Real World Insights - My Take on the Observability Maturity Model

A prelude to our upcoming six-part Observability Maturity Model Fundamentals blog series. By Lodewijk Bogaards. At StackState, we have spent eight years in the monitoring and observability spaces. During this time, we have spoken with countless DevOps engineers, architects, SREs, heads of IT operations and CTOs, and we have heard the same struggles over and over.

JFrog Joins Rust Foundation as Platinum Member

The technology ecosystem is continually evolving, but one truth remains: if there is a new and emerging coding language that captures the hearts and minds of developers, JFrog will be there. JFrog provides a DevOps Platform to store and secure its artifacts while engaging with the community and foundations that support developers using that language. We have a long history of working with Java, Python, C/C++ and more recently Go, Swift and Rust.

The Software Supply Chain Risks You Need to Know

Code that an organization’s developers create is only the beginning of modern software development. In fact, first-party code is likely to be only a small proportion of an application – sometimes as little as 10% of the application’s artifact ecosystem. An enterprise’s software supply chain is made of many parts, from many sources: open source packages, commercial software, infrastructure-as-code (IaC) files, and more.

Automate deployment of React applications to Firebase

Many platforms offer free hosting services for React and other JavaScript frameworks. These frameworks can be used for building single-page applications, which is handy when you need to launch a minimum viable product or a quick proof of concept. Your fellow developers are taking advantage of these tools, and you can too. To narrow down options, I will focus on Firebase in this tutorial.

Digital Transformation-A Journey, Not a Destination

By VMware CIO Jason Conyard and VMware CTO Jerry Ibrahim. Digital transformation is not a destination, it’s a journey. VMware as an organization has been architecting the transformation quite successfully, not just for itself but for its partners and customers as well. What exactly is digital transformation? The implementation of digital technology by an organization …

Data Collection Strategies for Infrastructure Monitoring - Troubleshooting Specifics

Monitoring and troubleshooting; unfortunately, these terms are still used interchangeably, which can lead to misunderstandings about data collection strategies. In this article we aim to clarify some important definitions, processes, and common data collection strategies for monitoring solutions. We will specify the limitations of the described strategies, as well as key benefits which can potentially be also used for troubleshooting needs.

Why you need an incident timeline

We get it – incidents happen. What differentiates resilient teams from others is how they learn from them: using them as an opportunity to find the biggest improvements in how they work. Incident timelines are one of the most simple and effective tools available to you when it comes to learning from an incident. It’s vital that you ensure they’re accurate and useful, in order to make the biggest improvements after an incident.

Best practices to publish open-source software operators

Running or operating applications requires several tasks throughout their lifecycle: scaling instances, checking the health, integrating with other applications, running backups, and applying updates – to name a few examples. It’s a time and labour-intensive process. To automate these tasks, developers can implement scripts for repeated execution. This is where the software operator comes in.

7 Critical Considerations for Evaluating Infrastructure Monitoring Platforms

I remember how excited I was to build my first Network Operations Center (NOC). It was a new idea at the time (yes, I know I’m dating myself), and boy, did we feel like we were cutting edge. The mere idea that we needed a place and a set of tools to monitor our entire infrastructure (because it’s never really been about just the network) was a big transition at the time. How things have changed.

7 Common Kubernetes Pitfalls

Kubernetes is the industry's most popular open-source platform for container orchestration. It helps you automate many tasks related to container management. Companies use it to solve their problems related to deployment, scalability, testing, management, etc. However, Kubernetes is complex and has a steep learning curve. In this article, we will go through some common Kubernetes pitfalls most companies fall into.

4 Tips With Qovery To Reduce Your Cloud Costs

While the cloud offers significant benefits compared to traditional on-premise infrastructure, its inherent elasticity and scalability lead to uncontrolled costs. Cloud costs can be opaque and difficult to analyze — and without some system of identifying the source of costs and how to manage them — they can quickly undermine your profit margins. Since Qovery makes it easy to create on-demand environments, it can drastically grow your cloud costs.

What is a "service" in a microservices architecture?

The past ten years marked a significant change in how software teams build and deploy applications. We moved away from bulky, slow, monolithic applications toward lightweight, scalable, distributed service-based applications. Meanwhile, tools like Docker, Kubernetes, and other container platforms helped accelerate this process. Despite this sudden growth, a fundamental question remains: what exactly is a service, and how does it fit into a microservice architecture?

What are the four Golden Signals?

When it comes to building reliable and scalable software, few organizations have as much authority and expertise as Google. Their Site Reliability Engineering Handbook, first published in 2016, details their practices to maintain reliability as Google scaled. But when you have over a million servers running thousands of services across more than twenty data centers, how do you monitor them in a consistent, logical, and relevant way?

Fundamentals: Load Balancing and the Right Distribution Algorithm for You

With the right load balancing in place, the demand of increasing web traffic can become manageable, but how do you determine which load balancing algorithm is best suited for your applications? Does the ease of use of static load balancing better suit the services you provide, or would your system benefit from a more complex and dynamic set of algorithms to maximize efficiency? In this blog post, we discuss what to consider when deciding on the right load-balancing algorithm.
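
To make the static versus dynamic distinction concrete, here is a minimal Python sketch of one algorithm of each kind: round robin (static, ignores backend state) and least connections (dynamic, picks the backend with the fewest active requests). The class names and the backend names `app-1` to `app-3` are assumptions made for the example, not taken from any particular load balancer.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Static: hands out backends in a fixed rotation, regardless of load."""

    def __init__(self, backends):
        self._rotation = cycle(backends)

    def pick(self):
        return next(self._rotation)


class LeastConnectionsBalancer:
    """Dynamic: picks the backend that currently has the fewest active connections."""

    def __init__(self, backends):
        self.active = {backend: 0 for backend in backends}

    def pick(self):
        # Ties are broken by insertion order of the backends.
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1
        return backend

    def release(self, backend):
        # Call when a request to this backend has finished.
        self.active[backend] -= 1


backends = ["app-1", "app-2", "app-3"]

rr = RoundRobinBalancer(backends)
rr_picks = [rr.pick() for _ in range(4)]

lc = LeastConnectionsBalancer(backends)
first = lc.pick()
second = lc.pick()
lc.release(first)   # the first request finishes early
third = lc.pick()   # goes back to the now-idle first backend
```

Round robin needs no bookkeeping, which is where the ease of use of static algorithms comes from. Least connections has to track request completion, but it adapts when some requests take much longer than others.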

The Scaling Limitations of Graphite and Solutions to Overcome Them

Graphite is a free and open-source software (FOSS) tool that monitors and graphs numeric time-series data. Graphite began as an internal project at Orbitz in 2006 and eventually grew into the company's foundational monitoring tool. In 2008, Orbitz released Graphite under the open source Apache 2.0 license. Graphite made it possible to know more than simply whether applications were up and running.

Four tests to measure and improve reliability: what matters and how it works

Legendary race car driver Carroll Smith once said, "until we have established reliability, there is no sense at all in wasting time trying to make the thing go faster." Even though he was referring to cars, the same goes for technology: no amount of code optimization or new features can replace stable systems. Unfortunately, much like race cars, it's hard to know that a system is unreliable until it blows a tire, the brakes stop working, or the steering wheel comes off the column.

How to add a Golden Signal to a service in Gremlin RM

In this video, we show you how to add a Golden Signal to a service. Gremlin uses your Golden Signals to ensure your services are still healthy and responsive during reliability tests. You can configure Golden Signals to use an existing monitor in your observability tools, such as Datadog, New Relic, or Prometheus. We recommend adding all four Golden Signals to each of your services to ensure comprehensive coverage.

How DBAs are Using SolarWinds DPM and Marginalia to reduce MTTR

Database performance is critical to business revenue. Slow and non-responsive databases can result in hundreds of thousands or millions of dollars in lost revenue from poor customer experience and downtime. High expectations in performance require IT infrastructure to function at full speed. In 2006, an Amazon study found that every 100ms in added page load time cost them 1% in sales.

Canonical Kubernetes 1.25 is now generally available

The Canonical Kubernetes team is delighted to announce that Canonical Kubernetes 1.25 is now generally available, with Charmed Kubernetes joining our MicroK8s release last week, following the release of upstream Kubernetes on 23 August. We consistently follow the upstream release cadence to provide our users and customers with the latest improvements and fixes, together with security maintenance and enterprise support for Kubernetes on Ubuntu.

Logic App Best practices, Tips, and Tricks: #14 Implement good governance policies

Welcome again to another Logic Apps Best practices, Tips, and Tricks. In my previous blog posts, I talked about some of the most essential best practices to follow while working with Azure Logic Apps, along with some tips and tricks. Today I’m going to talk about another critical best practice that you need to implement while administering your cloud integration resources: implement good governance policies.

FIPS Certified vs FIPS Compliant #security #fips #development

How are FIPS Certified and FIPS Compliant implementations different? What makes the most sense for your organisation? The answer may surprise you. As consumers, we are prone to accept something that’s certified as best-in-class. When it comes to FIPS, which offering provides the best security posture? Watch this short video to learn about the difference.

How to get started with the new Grafana Ansible collection for Grafana Cloud

More than 20,000 companies around the world use Ansible as their Infrastructure as Code (IaC) and configuration management tool. With the rising popularity of managing infrastructure through IaC and configuration management tools, Ansible is one of the best open source options to choose from. That is why we are excited to announce a new Grafana Ansible collection available to all Grafana Cloud users, including those in the generous free tier.

Continuous performance testing for mobile apps

As of 2021, roughly 5.7 million mobile apps are available in app stores: 2.2 million for iOS and 3.48 million for Android. With so many apps available, customers have a wide variety of choices, so customer satisfaction is paramount for avoiding churn and retaining users.

How to avoid losing your Slack message history

We understand how critical a message archive can be to your organization. Empowering you with complete control over your data—including your message history—is a key tenet of our mission here at Mattermost! If you’re part of one of the many teams and communities that use Slack to collaborate – take note: After September 1st, 2022, you will no longer be able to access your Slack message history older than 90 days on your free workspaces.

How Netdata's Machine Learning works

Following on from the recent launch of our Anomaly Advisor feature, and in keeping with our approach to machine learning, here is a detailed Python notebook outlining exactly how the machine learning powering the Anomaly Advisor works under the hood. Or, if you’d rather watch a video walkthrough of the notebook, check it out below. Try it for yourself: get started by signing in to Netdata and connecting a node.

Device Onboarding with Netreo's Auto Configuration

Wouldn’t it be great if there were some attribute you could query and set on a device? Then you could automagically configure that device based on the attribute you just set, for fully automated device onboarding! Welcome to Dynamic Device Attribute Pollers and Auto-Configuration Parameters! Rolls right off the tongue, doesn’t it?

How to Improve Your Organization's Value Delivery with Health Markers

At VMware Tanzu Labs, we help customers get iterative value from modern app platforms, build better apps, and develop capabilities to continue these improvements long after we’re gone. This involves not just a focus on technology, but also people and processes. For example, we consider the way a team is organized and how/where people on the team focus their time.

Top 8 CI/CD Best Practices for Building Successful Applications

Continuous integration (CI) is the software development practice in which developers frequently integrate their code changes into a central repository. Improved software quality, faster quality audits and bug fixes, and quick validation and release cycles are all major goals of continuous integration. Continuous delivery (CD), which builds on top of continuous integration, includes automating both builds and the complete software release process.

How Netdata's machine learning works

In this video, we walk through the Netdata Anomaly Advisor deep-dive Python notebook. The aim of this notebook is to explain, in detail, how the unsupervised anomaly detection in the Netdata agent actually works under the hood. No buzzwords, no magic, no mystery :) Try it for yourself: get started by signing in to Netdata and connecting a node. Once initial models have been trained (usually after the agent has about one hour of data, with zero configuration needed), you'll be able to start exploring in the Anomaly Advisor tab of Netdata.