OpsAI Prompt Library
OpsAI works through natural conversation, so the quality of the answer depends largely on how you ask. This page is a copy-and-adapt library of prompts, organized by the surface you are working in.
OpsAI exposes three prompting surfaces, each backed by its own agents:
| Surface | What it does | Where you use it |
|---|---|---|
| Ask OpsAI | Full conversation and investigation - query telemetry, run root-cause analysis across logs, metrics, traces, RUM, and Kubernetes, read code, and open fixes. Backed by the conversation and investigation teams. | The OpsAI chat panel |
| Dashboard widget builder | Builds and edits widgets on the dashboard you are currently in. State-aware: it already knows which dashboard you are on and discovers the right metrics for you. | Inside a dashboard, in the AI widget builder |
| Alert builder | Turns a plain-language request into a ready-to-apply alert rule config. State-aware: it can refine the alert draft you are currently editing. | The AI alert creation panel |
Pick the surface that matches your task, copy a prompt, and replace the placeholders (service name, app name, dashboard, time range) with your own values.
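If you reuse the same prompts often, the placeholder substitution can be scripted. The sketch below uses Python's standard `string.Template`; the `$service` and `$window` placeholder names and the `fill_prompt` helper are illustrative assumptions, not an OpsAI syntax or API. The resulting text is simply pasted into the relevant OpsAI panel.

```python
from string import Template

# Hypothetical prompt template. The $placeholders are illustrative names
# chosen for this sketch; OpsAI itself only ever sees the final text.
LOG_PROMPT = Template("Show me error logs for the $service in the last $window.")


def fill_prompt(template: Template, **values: str) -> str:
    """Substitute every placeholder; raises KeyError if one is left unfilled."""
    return template.substitute(**values)


prompt = fill_prompt(LOG_PROMPT, service="payment-service", window="1 hour")
print(prompt)
```

Using `substitute` rather than `safe_substitute` means a forgotten placeholder fails loudly instead of reaching OpsAI as a literal `$service`.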
You can talk to OpsAI directly from the OpsAI panel - no setup is required to start a conversation. To act on errors, code, or clusters, connect the relevant integration first (GitHub/Bitbucket for code, the Kube Agent for Kubernetes). See OpsAI Overview to get set up.
How to write effective prompts
A few habits make every prompt sharper, regardless of the surface:
- Name the resource. Mention the service, application, host, cluster, namespace, dashboard, or RUM app you care about. "the checkout service" beats "my app".
- Scope the time window. OpsAI defaults to a recent window. Say "in the last 1 hour", "over the past 24 hours", or "between 2pm and 3pm today" to control it.
- State the goal, not just the data. "Find why latency spiked and what changed" gets you an investigation; "show p99 latency" gets you a number.
- Add filters. Environment (prod, staging), status code, error type, region, or tag values narrow the results.
- Follow up. OpsAI keeps context across the conversation. Start broad, then drill in: "now group that by endpoint", "show only 5xx", "what changed right before this?".
- Ask it to act. End with "find the root cause and propose a fix", "add these as widgets", or "create the alert" to move from analysis to action.
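The habits above can be treated as a checklist when composing a prompt. The following is a minimal sketch that assembles one from its parts; `PromptSpec` and `build_prompt` are hypothetical names introduced here for illustration and are not part of OpsAI. The output is plain text that you would type or paste into the chat panel.

```python
from dataclasses import dataclass, field


@dataclass
class PromptSpec:
    """Hypothetical container for the parts of a well-formed prompt."""
    goal: str                     # what you want found or explained
    resource: str                 # service, host, cluster, dashboard, or RUM app
    time_window: str = ""         # e.g. "in the last 1 hour"
    filters: list[str] = field(default_factory=list)  # e.g. ["prod", "5xx"]
    action: str = ""              # e.g. "find the root cause and propose a fix"


def build_prompt(spec: PromptSpec) -> str:
    """Join the parts into one or two plain-language sentences."""
    parts = [f"{spec.goal} for the {spec.resource}"]
    if spec.time_window:
        parts.append(spec.time_window)
    if spec.filters:
        parts.append("filtered to " + ", ".join(spec.filters))
    sentence = " ".join(parts) + "."
    if spec.action:
        sentence += f" Then {spec.action}."
    return sentence


example = build_prompt(
    PromptSpec(
        goal="Find why latency spiked and what changed",
        resource="checkout service",
        time_window="in the last 1 hour",
        filters=["prod", "5xx"],
        action="find the root cause and propose a fix",
    )
)
print(example)
```

Leaving `time_window` empty mirrors what happens in the product: OpsAI falls back to its recent default window.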
Ask OpsAI
The chat panel is the general-purpose surface. It routes your request to the right agents - the conversation team for queries, the investigation team for root-cause analysis, and the fix flow for code changes. Use the sections below by module.
Logs
OpsAI can search, filter, and summarize log data, and pull error logs into an investigation.
Show me error logs for the payment-service in the last 1 hour.
Search logs across all services for "connection refused" in the last 30 minutes.
Summarize the most common error messages in the orders-service logs today.
Show logs for the checkout service filtered to severity ERROR and group them by message.
Find log spikes in the last 6 hours and tell me which service they came from.
Pull the logs around the time this error occurred and explain what went wrong.

Metrics
OpsAI can discover which metrics exist for a resource, then query and aggregate them over a time range.
What metrics are available for the api-gateway service?
Show CPU usage for the checkout service over the past 24 hours.
Query memory usage by host for the last 3 hours and show the top 5.
What is the p99 latency for the orders-service right now, and how does it compare to yesterday?
Show the request rate and error rate for payment-service side by side for the last hour.
List the resource types I can query in this environment.

Services and traces (APM)
OpsAI understands your APM traces and spans, and can use them to explore performance and errors in distributed applications.
Which services have the highest error rate in the last hour?
Show me the slowest endpoints in the checkout service over the past 24 hours.
Trace a slow request through the orders-service and tell me where the time is spent.
Find traces with errors in the payment-service and group them by exception type.
Why did latency spike for the api-gateway around 3pm? Correlate traces, logs, and any deploys.
Which downstream service is causing failures in the checkout flow?

Errors and incidents
OpsAI surfaces grouped errors and incidents, and can open any one of them for a full root-cause investigation.
List the top errors across my services in the last 24 hours.
Show me the most frequent errors in the payment-service and which release introduced them.
Get the full details for this error: <fingerprint or issue URL>.
Investigate this error, find the root cause, and propose a fix.
Which new errors appeared after the last deployment of the orders-service?
Summarize this incident and tell me the likely cause and impact.

When you ask OpsAI to "propose a fix" for an APM or RUM error and your repository is connected, it gathers the relevant code, pinpoints the file and line, and can open a pull request with the change for review. See GitHub below.
Real User Monitoring (RUM)
OpsAI can analyze frontend sessions, browser errors, and user journeys, and correlate them with backend traces.
List my RUM applications.
Show the RUM errors for my web app in the last 24 hours.
Which pages have the most JavaScript errors this week?
Find sessions where users hit an error during checkout and replay what happened.
Correlate this browser error with the backend trace and logs, then suggest a code fix.
What is the slowest page load for my frontend app, and what is causing it?

Kubernetes
With the Middleware Kube Agent connected, OpsAI can read live cluster state, diagnose workloads, and - when you grant write access - remediate issues.
List my configured Kubernetes clusters.
Show the status of all pods in the monitoring namespace.
Why is the payment-service deployment not ready? Diagnose it.
Describe the failing pod and show me its recent events and logs.
Check resource usage across nodes and tell me which ones are under pressure.
This deployment keeps crash-looping - find the root cause and fix it.
Scale the orders-service deployment to 4 replicas.

Kubernetes remediation only happens when you explicitly grant access to write tools. OpsAI inspects the live cluster state first, applies the change, and then verifies it. For read-only use, it simply diagnoses and recommends.
GitHub (code and pull requests)
When your repository is connected, OpsAI can read code, search the repo, manage issues and pull requests, and open a PR with a proposed fix.
Find the file that handles payment retries in my repo and explain what it does.
Search the repository for where we call the Stripe API.
Show me the changes in pull request #482.
Based on this error, read the relevant code and open a pull request with a fix.
Open a GitHub issue describing this incident and assign it to me.
Who last changed this file, and what did the change do?

OpsAI reads only the files related to the error through the MCP integration and does not store your source code or error context. Connect GitHub or Bitbucket to enable code reads and PR creation.
Account, users, and usage
OpsAI can answer questions about your Middleware account - projects, teams, members, and usage.
Who are the members of my organization?
List the projects in my account.
What teams do I have set up?
Show my current Middleware usage and how close I am to my limits.
Show me my account details.

Middleware product help
OpsAI is grounded in Middleware's documentation, so you can ask it how to use the product itself.
How do I configure RUM in Middleware?
What's the difference between APM and RUM in Middleware?
How do I install the Kube Agent with OpsAI enabled?
How do I connect my GitHub repository to OpsAI?
How do I set up auto-investigation for my services?

End-to-end investigations
These prompts combine modules into a single request. OpsAI plans the steps, pulls data from each source, and returns a root cause with a recommended or applied fix.
The checkout flow is failing in production. Investigate across traces, logs, and RUM, find the root cause, and propose a fix.
Latency on the api-gateway doubled in the last hour. Tell me what changed - deploys, config, or a downstream service - and how to fix it.
Users are reporting errors on the payment page. Correlate the browser errors with the backend service, identify the root cause, and open a pull request with the fix.
This Kubernetes alert just fired. Diagnose the cluster, explain the impact, and remediate it.
Find the noisiest error across all my services this week and walk me through fixing it end to end.

For a hands-off workflow, enable Auto Investigation per source (clusters for Kubernetes, services for APM, apps for RUM) in OpsAI Settings. OpsAI then analyzes new problems automatically and proposes a fix you can review and apply.
Dashboard widget builder
The widget builder works on the dashboard you are currently in. It already knows which dashboard that is, so you do not need to name it. It only builds and edits widgets - create, update, delete, and arrange them - and it discovers the correct resource, metric, filters, and group-by for you before building.
Good to know:
- It operates on the current, existing dashboard only. It does not create or delete dashboards, or switch you to another one.
- If you do not say how many widgets you want, it creates up to 10 in a single request. Ask for more explicitly if you need them.
- Updates are strict - it changes only the field you mention and leaves every other widget setting as-is.
- After creating, updating, or deleting widgets, it tidies the layout automatically unless you tell it not to.
Create widgets
Add a widget showing CPU usage by host.
Create a line chart of request rate for the checkout service over the last hour.
Add three widgets for the orders-service: p99 latency, error rate, and throughput.
Show memory usage grouped by Kubernetes pod as a time series.
Add a widget for 5xx error count broken down by service.
Create a widget tracking average response time for the api-gateway, filtered to the prod environment.

Update widgets
Change the latency widget to show p95 instead of p99.
Update the CPU widget to group by service instead of host.
Rename the "Errors" widget to "5xx Errors" and keep everything else the same.
Switch the request-rate widget from a line chart to a bar chart.
Add a prod-environment filter to the memory usage widget.

Delete and arrange widgets
Delete the unused throughput widget.
Remove the duplicate CPU widget.
Reorganize the widgets into a clean two-column layout.
Put the latency and error-rate widgets side by side at the top.
Make the request-rate widget full width.

The widget builder discovers real metrics, filters, and group-by dimensions from your data before building - it never guesses. If something you ask for is not available in the current dashboard's data, it will tell you what is missing instead of inventing it.
Alert builder
The alert builder turns a plain-language description into a ready-to-apply alert rule configuration. It maps your request to the right alert type, discovers the real metrics and attributes, picks sensible thresholds (including from your live data when you ask), and hands the config to the UI for you to apply. It can also refine the alert draft you are currently editing.
Good to know:
- It supports these alert types: metric (the default for infrastructure and KPI thresholds, including Kubernetes via a k8s.* source), host (host inventory), log, trace, rum, llm, billing, anomaly, forecast, and error_tracking.
- It builds the config only - you apply it from the panel. It will not silently create or trigger the alert.
- You can ask for data-driven thresholds ("based on recent behavior") and it will sample your data to pick warning and critical levels.
- Notification channels (Slack, Teams, email, PagerDuty, Opsgenie, webhook) are chosen in the product UI - mention them and it will note where to set them, but the channel is selected there, not in the prompt.
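To get a feel for what a data-driven threshold can look like, here is a minimal sketch that derives warning and critical levels from recent samples. This is an assumption for illustration only: the page does not describe how the alert builder actually samples data or picks levels, and the 95th/99th percentile choice and the `suggest_thresholds` name are hypothetical.

```python
from statistics import quantiles


def suggest_thresholds(samples: list[float]) -> tuple[float, float]:
    """Return (warning, critical) as the 95th and 99th percentiles.

    Illustrative heuristic only; not OpsAI's documented method.
    """
    # n=100 yields 99 cut points; index 94 is p95 and index 98 is p99.
    cuts = quantiles(samples, n=100, method="inclusive")
    return cuts[94], cuts[98]


# Hypothetical CPU utilisation samples (percent) from a recent window.
recent_cpu = [float(x) for x in range(101)]
warning, critical = suggest_thresholds(recent_cpu)
print(f"warning at {warning}%, critical at {critical}%")
```

A percentile-based pair like this is handy as a sanity check against the draft the alert builder returns, since you still review and apply the config yourself.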
Metric and infrastructure alerts
Alert me when the checkout-service error rate goes above 5% for 5 minutes.
Create a CPU alert for hosts: warning at 80%, critical at 90%, evaluated over 5 minutes.
Alert when memory usage on any host in the prod cluster stays above 85% for 10 minutes.
Set a p99 latency alert on the api-gateway with a critical threshold of 800ms.
Create a Kubernetes alert for pod restarts above 5 in 15 minutes in the payments namespace.

Data-driven (behavior-based) alerts
Create a CPU alert for the orders-service and pick the thresholds based on recent behavior.
Sanity-check the last hour of request rate, then set a sensible warning and critical threshold.
Set a latency alert on the checkout service based on its normal baseline, not a fixed number.

Log, trace, RUM, and error-tracking alerts
Alert me when "OutOfMemoryError" appears in the payment-service logs more than 3 times in 5 minutes.
Create a trace alert for the checkout service when the error rate on the /pay endpoint exceeds 2%.
Alert when RUM JavaScript errors on the web app exceed 50 in 10 minutes.
Set an error-tracking alert for the orders-service when new errors spike.

Anomaly, forecast, and billing alerts
Create an anomaly alert on request rate for the api-gateway.
Forecast disk usage and alert me before it runs out in the next 24 hours.
Alert me when my log ingestion is on track to exceed my billing limit this month.

Refine an existing alert draft
Lower the critical threshold to 75% and leave everything else unchanged.
Add a prod-environment filter to this alert.
Change the evaluation window to 10 minutes.
Also send this alert to Slack.

The alert builder produces a draft config for review - apply it from the alert panel. For routing alerts to Slack, Teams, PagerDuty, Opsgenie, email, or a webhook, select the channel from the product UI or your notification settings.
Tips for better results
- Match the surface to the task. Use Ask OpsAI to investigate and query, the widget builder to shape a dashboard, and the alert builder to define monitoring rules.
- Start with the symptom, not the solution. Describe what you see ("errors on checkout") and let OpsAI investigate, rather than assuming the cause.
- Narrow iteratively. If the first answer is too broad, reply with a filter: "only prod", "just 5xx", "group by endpoint".
- Be explicit about time. Without a window, OpsAI uses a recent default - say the range when timing matters.
- Ask for the next step. End with "what should I do next?", "add it as a widget", or "create the alert" to turn analysis into action.
Need assistance or want to learn more about Middleware? Contact our support team or join our Slack channel.