Get started with problem detection

100% Open Source

Last login:

prequel@problem-detection:~$ kubectl preq pg17-postgresql-0

1245932 lines       done! [4.27GB in 195ms; 21.91GB/s]
Problems detected   done! [3 in 195ms; 15/s]
CRE-2024-0121       medium   [41 hits @ Sun Mar 30 01:46:00]
CRE-2025-0301       high     [2 hits @ Sun Mar 30 01:46:01]
CRE-2025-0403       critical [1 hits @ Sun Mar 30 01:46:02]

Saving report to prequel-report-2025-03-30T01:46:38.json
For your Code and your Stack
preq is an open-source tool that helps Site Reliability Engineers (SREs), developers, and observability experts find and fix issues before they become incidents. Powered by the Common Reliability Enumeration (CRE) standard, preq continuously scans your system’s logs and telemetry for known failure patterns and misconfigurations – all using community-curated knowledge.

Think of CREs as the reliability counterpart to security’s CVEs – but instead of vulnerabilities, CREs catalog software bugs, misconfigurations, and anti-patterns that can cause reliability issues.

preq is your new secret weapon: proactive, community-driven problem detection. Read the docs and give it a try.
Community-Driven
When community members discover new problems, they submit CREs to the public repo with a unique ID (e.g. CRE-2024-0007).
rules/
├── Stable/
│   ├── cre-2024-0007
│   │   └── rabbitmq-mnesia-overloaded
│   ├── cre-2024-0016
│   │   └── gke-metrics-export-failed
│   ├── cre-2025-0021
│   │   └── keda-nil-pointer
Automated
Run preq as a standalone tool or as a Kubernetes plugin. Available for Linux, macOS, and Windows.
+--------------+-----------+------------+
|   problem    |  service  |    hits    |
+--------------+-----------+------------+
| CRE-0225-01  |  RabbitMQ |     3      |
| CRE-0189-02  |  temporal |     1      |
| CRE-0451-03  |   nginx   |     5      |
| CRE-0317-04  |    otel   |     1      |
| CRE-0999-05  |   celery  |     1      |
+--------------+-----------+------------+
|    Total     |           |     11     |
+--------------+-----------+------------+

Open Standard
Each CRE includes the problem’s description, likely cause, potential impact, and recommended fixes.

Use, fork, and extend detection rules without vendor lock-in.
cre:
 id: CRE-2024-0007
 severity: 0
 title: RabbitMQ Mnesia overloaded recovering persistent queues
 category: message-queue-problems
 author: Prequel
 description: |
   - The RabbitMQ cluster is processing a large number of persistent mirrored queues at boot.
 cause: |
   - The Erlang process, Mnesia, is overloaded while recovering persistent queues on boot.
 impact: |
   - RabbitMQ is unable to process any new messages and can cause outages in consumers and producers.
 tags:
   - cre-2024-0007
   - known-problem
   - rabbitmq
 mitigation: |
   - Adjust mirroring policies to limit the number of mirrored queues
   - Remove high-availability policies from queues
   - Add additional CPU resources and restart the RabbitMQ cluster
   - Use [lazy queues](https://www.rabbitmq.com/docs/lazy-queues) to avoid incurring the costs of writing data to disk
 references:
   - https://groups.google.com/g/rabbitmq-users/c/ekV9tTBRZms/m/1EXw-ruuBQAJ
 applications:
   - name: "rabbitmq"
     version: "3.9.x"
metadata:
 kind: prequel
 id: 5UD1RZxGC5LJQnVpAkV11A
 generation: 1
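Before submitting a rule, it can help to check that its ID has the same shape as the IDs shown on this page and that the ID is repeated in the tags. The sketch below is illustrative only: the `CRE-YYYY-NNNN` pattern is inferred from examples such as CRE-2024-0007 rather than taken from a published specification, and `is_valid_cre_id` and `id_matches_tags` are hypothetical helper names, not part of preq.

```python
import re

# Assumed ID shape, inferred from examples such as CRE-2024-0007.
CRE_ID_PATTERN = re.compile(r"CRE-\d{4}-\d{4}")


def is_valid_cre_id(cre_id: str) -> bool:
    """Return True if cre_id looks like CRE-YYYY-NNNN."""
    return CRE_ID_PATTERN.fullmatch(cre_id) is not None


def id_matches_tags(cre_id: str, tags: list[str]) -> bool:
    """Return True if the lowercased ID appears among the rule's tags."""
    return cre_id.lower() in tags
```

A check like this would flag an ID with a missing digit, because it neither matches the pattern nor lines up with its own tag.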
Declarative, expressive, powerful
Go beyond keyword searches. Prequel rules enable complex detectors that include sequences, correlation, negative conditions and multiple event types.
rule:
 sequence:
   window: 30s
   event:
     src: log
     container_name: rabbitmq
   order:
     - Discarding message(.+)in an old incarnation(.+)of this node
     - Mnesia is overloaded
   negate:
     - SIGTERM received - shutting down
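
To make the idea of a sequence rule concrete, here is a minimal Python sketch of one plausible reading of the rule above. It is not preq's implementation. It assumes that the `order` patterns are regular expressions that must match in order, that the whole match must fit inside `window` seconds of the first hit, and that any `negate` match seen before the sequence completes cancels it. The function name `find_sequence` and the `(timestamp, line)` event shape are hypothetical.

```python
import re
from typing import Iterable, Optional

Event = tuple[float, str]  # (timestamp in seconds, log line)


def find_sequence(
    events: Iterable[Event],
    order: list[str],
    negate: list[str],
    window: float,
) -> Optional[tuple[float, float]]:
    """Return (start, end) timestamps of the first match, or None."""
    ordered = sorted(events)
    order_res = [re.compile(p) for p in order]
    negate_res = [re.compile(p) for p in negate]

    for i, (start, first_line) in enumerate(ordered):
        if not order_res[0].search(first_line):
            continue
        step, end = 1, start  # step = index of the next pattern to match
        negated = False
        for ts, line in ordered[i + 1:]:
            # Stop once the sequence is complete or the window has elapsed.
            if step == len(order_res) or ts - start > window:
                break
            # Assumed semantics: a negate hit cancels this candidate.
            if any(n.search(line) for n in negate_res):
                negated = True
                break
            if order_res[step].search(line):
                step, end = step + 1, ts
        if step == len(order_res) and not negated:
            return (start, end)
    return None
```

With the patterns from the rule above and a 30-second window, a "Discarding message" line followed five seconds later by "Mnesia is overloaded" would match, while the same pair with a SIGTERM line in between, or spaced 45 seconds apart, would not.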
