Middleware Pipelines Overview

Middleware Pipelines define how telemetry moves from source to storage and how it is collected, filtered, modified, and stored along the way. This gives teams control over logs, metrics, traces, and real user monitoring (RUM) data before anything is persisted.

A pipeline follows this path:

Source → Ingestion Control → Server → Backend Control → Storage

[Figure: Middleware Agent Pipeline configuration interface]

What is a Pipeline?

A pipeline in Middleware is the data path that controls the telemetry lifecycle from intake to storage. It lets teams manage data quality and relevance before the data is stored.

With pipelines, you can:

  • Reduce noise
  • Control data volume
  • Remove sensitive fields
  • Apply smart filtering and transformations
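The kind of work listed above can be sketched in a few lines of Python. This is only an illustration of the idea, not Middleware's implementation: the record shape, the `process_record` function, and the field and severity names are all hypothetical.

```python
from typing import Optional

# Hypothetical names, chosen only for this illustration.
SENSITIVE_FIELDS = {"password", "authorization", "ssn"}
NOISY_LEVELS = {"debug", "trace"}


def process_record(record: dict) -> Optional[dict]:
    """Return a cleaned copy of the record, or None if it should be dropped."""
    # Reduce noise and volume: drop low-severity records entirely.
    if str(record.get("level", "")).lower() in NOISY_LEVELS:
        return None
    # Remove sensitive fields from whatever is kept.
    return {k: v for k, v in record.items() if k.lower() not in SENSITIVE_FIELDS}
```

Returning `None` for a dropped record keeps "drop" and "transform" in a single function, which makes it easy to chain several such steps.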

Why Pipelines Exist

Without pipelines, all data from all sources is sent and stored without any filtering. Pipelines add control points so you can decide what is useful and what should be dropped or transformed.

With pipelines, you can:

  • Drop unwanted data before it reaches storage
  • Apply different rules for different hosts, clusters, or integrations
  • Apply backend rules without redeploying agents
  • Apply OTEL-native filters directly on agents for clusters
  • Control cost, performance, and compliance
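As a rough sketch of what "different rules for different hosts, clusters, or integrations" might look like, the example below scopes each drop rule to a source. The `DropRule` type, the `source_type` and `source_name` record keys, and the sample rules are assumptions made for illustration and are not part of the Middleware product.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class DropRule:
    source_type: str                     # "host", "cluster", or "serverless"
    source_name: str                     # hypothetical identifier; "*" matches any
    should_drop: Callable[[dict], bool]  # predicate applied to matching records


def is_dropped(record: dict, rules: list[DropRule]) -> bool:
    """Return True if any rule scoped to this record's source says to drop it."""
    for rule in rules:
        if rule.source_type != record.get("source_type"):
            continue
        if rule.source_name not in ("*", record.get("source_name")):
            continue
        if rule.should_drop(record):
            return True
    return False


# Hypothetical rules: debug logs are dropped only for the "staging" cluster,
# and health-check requests are dropped for every host.
RULES = [
    DropRule("cluster", "staging", lambda r: r.get("level") == "debug"),
    DropRule("host", "*", lambda r: r.get("path") == "/healthz"),
]
```

Because the rules are plain data evaluated centrally, changing them does not require touching the sources, which mirrors the point about applying backend rules without redeploying agents.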

Pipeline Source Types

Every pipeline starts by selecting a source. Middleware supports three source types:

Source Type | What it represents
Host | A single server or machine running the Middleware Agent
Cluster | A Kubernetes cluster running the Middleware Agent
Serverless | Serverless applications including managed databases, cloud services, etc.

Each source type supports different control depth based on where data is collected and where filtering can be applied.

Pipeline Control Layers

Middleware provides three control layers inside pipelines:

Layer | Where it runs | Purpose
Ingestion Control | Middleware ingestion layer | Modify, route, enrich, or sample data before it enters the platform
OTEL-Native Filters | Agent / cluster side | Drop or filter data at source before it is sent
Backend Control | Middleware backend | Drop data before storage based on rules

Not every source supports all three layers. Availability depends on how data is collected for that source type.
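One way to picture the layers is as a chain of functions that a record passes through, where any layer may drop the record. The sketch below is illustrative only: the layer functions, their rules, and the ordering (agent-side filter first, then ingestion, then backend) are assumptions inferred from the descriptions above, and a source that lacks a layer is modeled simply by leaving that layer out of the list.

```python
from typing import Callable, Optional

Layer = Callable[[dict], Optional[dict]]


def otel_native_filter(record: dict) -> Optional[dict]:
    # Agent / cluster side: drop at source (hypothetical rule).
    return None if record.get("level") == "debug" else record


def ingestion_control(record: dict) -> Optional[dict]:
    # Ingestion layer: enrich before the data enters the platform (hypothetical rule).
    return {**record, "env": record.get("env", "unknown")}


def backend_control(record: dict) -> Optional[dict]:
    # Backend: drop before storage (hypothetical rule).
    return None if record.get("service") == "legacy-batch" else record


def run_layers(record: dict, layers: list[Layer]) -> Optional[dict]:
    """Pass the record through each available layer; None means it was dropped."""
    current: Optional[dict] = record
    for layer in layers:
        current = layer(current)
        if current is None:
            return None
    return current
```

A source that supports all three layers would call `run_layers(record, [otel_native_filter, ingestion_control, backend_control])`, while a source without agent-side filtering would pass only the last two.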

To create a pipeline, add one or more processors. The same processor can be added more than once.
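The processor requirement can be modeled as an ordered, non-empty list in which the same processor may appear several times with different settings. The `Pipeline` class and the `drop_field` processor below are hypothetical names used for this sketch and do not reflect Middleware's actual configuration format.

```python
from typing import Callable, Optional

Processor = Callable[[dict], Optional[dict]]


def drop_field(name: str) -> Processor:
    """Build a processor that removes one field from each record."""
    def processor(record: dict) -> Optional[dict]:
        return {k: v for k, v in record.items() if k != name}
    return processor


class Pipeline:
    def __init__(self, processors: list[Processor]):
        # A pipeline must contain at least one processor.
        if not processors:
            raise ValueError("a pipeline needs at least one processor")
        self.processors = list(processors)

    def run(self, record: dict) -> Optional[dict]:
        """Apply processors in order; None means a processor dropped the record."""
        current: Optional[dict] = record
        for processor in self.processors:
            current = processor(current)
            if current is None:
                return None
        return current


# The same processor type is added twice, each time with a different field.
pipeline = Pipeline([drop_field("password"), drop_field("token")])
```

Here `pipeline.run({"msg": "hi", "password": "x", "token": "y"})` yields a record containing only `msg`, and constructing `Pipeline([])` raises a `ValueError`.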