- 10 Dec 2024
- 4 minute read
Troubleshooting Rule Backups
- Updated 10 Dec 2024
When building rules in Slate, refer to the Rule Efficiency and Troubleshooting Rules documentation for best practices.
Rules are a powerful tool for automating processes in your instance; however, it is important to optimize them for efficient execution and to avoid recurring timeout errors. When rules cannot finish executing queued records after multiple attempts, a backup occurs.
Below are several steps to help identify the source of rule backups. If the backup persists, or if its source cannot be isolated and resolved, a Support Desk ticket may be necessary.
What are some factors that can lead to rule backups?
Misconfigured or inefficient filter criteria in rules
Complex custom SQL or formula rules
Large imports during high volume or peak times of the day
How do I know if I have a rules backup?
You can identify a rules backup from the Rules Health section in your database, where the Max Age column shows how long records have been waiting in the queue. There is no specific Max Age threshold that constitutes a backup, but if this number climbs into multiple hours and errors appear consistently in the Rules Log, begin running through the following troubleshooting steps.
Where do I start troubleshooting?
If you begin to see your rules Max Age climbing as well as consistent errors in your Rules Log, the following tools will help with diagnosing the source.
After each mitigation step in which you inactivate a rule, give your database time to process and run through the rule loops a few times, as a backup can take a while to clear. A Rules Health duration below 20 minutes indicates that the rules have caught up. Otherwise, a new timeout error will appear in the Rules Log following each failed attempt.
Rule Log
Click on the error in the Rule Log and see if a GUID is identified for a specific rule. If a rule GUID displays on the error, inactivate that rule. If multiple GUIDs appear across multiple errors, inactivate all of the rules identified.
In some cases, depending on the kind of error that has occurred, the rule errors will not identify a specific rule GUID. When rules are unable to complete execution within the 15 minute timeout window, a generalized timeout may occur. This points not to specific rules but to the cumulative performance of the rules, which needs to be improved.
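If you have copied several error messages out of the Rule Log, the GUID check described above can be sketched in a few lines of Python. This is only an illustration: the sample messages are hypothetical, the actual wording of Rule Log errors may differ, and `rule_guids_to_inactivate` is a name made up for this example. The only firm assumption is that GUIDs appear in the standard 8-4-4-4-12 hexadecimal layout.

```python
import re

# Standard 8-4-4-4-12 hexadecimal GUID layout.
GUID_PATTERN = re.compile(
    r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"
)


def rule_guids_to_inactivate(error_messages):
    """Return the distinct GUIDs found in the messages, in first-seen order."""
    seen = {}
    for message in error_messages:
        for guid in GUID_PATTERN.findall(message):
            # Lowercase so the same GUID in different casing is counted once.
            seen.setdefault(guid.lower(), None)
    return list(seen)


# Hypothetical messages; the real Rule Log wording may differ.
sample_errors = [
    "Timeout in rule 3f2504e0-4f89-11d3-9a0c-0305e82c3301",
    "Error evaluating rule 3F2504E0-4F89-11D3-9A0C-0305E82C3301",
    "Execution timeout expired.",
]

print(rule_guids_to_inactivate(sample_errors))
```

An empty result corresponds to the generalized timeout case, where no single rule is named and the cumulative performance of the rules needs attention.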
Check Rules
Use the Check Rules tool to take a broad look at rule performance and identify groups of rules most likely to be contributing to a timeout.
Within this tool, the Duration column lists the time it takes to assess the rule’s filter criteria against your pool of records, and the Count column is the number of records returned by those criteria.
Start by addressing any rules that display an error ("ERR") in the Count column with 0.0 in the Duration column, as this indicates the rule couldn't be assessed at all and may be misconfigured or have inefficient filters.
Next, address any rules where the Duration column displays in red, indicating a duration greater than 30 seconds. Also address any rules where the Count column displays "ERR" and the Duration column displays 60 seconds, which means the individual rule timed out because it couldn't be assessed within the allotted 60 second timeframe.
Review any rules where the Type column indicates either a formula or custom SQL in the rule’s action. When rules execute, it is not only the filter criteria but also the action that must perform within timeout tolerances, and formula or SQL rules may contribute significantly to a timeout even if the filter duration is low.
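The review order above can be sketched as a small triage function, which may help if you have transcribed the Check Rules table into a spreadsheet or script. This is a minimal sketch under stated assumptions: the row keys (`name`, `type`, `count`, `duration`) and the type labels `"formula"` and `"custom sql"` are hypothetical and not taken from Slate; only the thresholds (0.0, 30, and 60 seconds) come from the text.

```python
def triage_check_rules(rows):
    """Sort rule names into the three review groups described above."""
    groups = {
        "could_not_assess": [],   # "ERR" with a duration of 0.0
        "slow_or_timed_out": [],  # over 30 seconds, including 60 second timeouts
        "review_action": [],      # formula or custom SQL in the action
    }
    for row in rows:
        errored = row["count"] == "ERR"
        duration = float(row["duration"])
        if errored and duration == 0.0:
            groups["could_not_assess"].append(row["name"])
        elif duration > 30:
            groups["slow_or_timed_out"].append(row["name"])
        # The action type is checked independently of the filter duration,
        # because the action can be slow even when the filter is fast.
        if row["type"].lower() in ("formula", "custom sql"):
            groups["review_action"].append(row["name"])
    return groups


# Hypothetical rows transcribed from the Check Rules tool.
sample_rows = [
    {"name": "Rule A", "type": "Field", "count": "ERR", "duration": 0.0},
    {"name": "Rule B", "type": "Field", "count": 1200, "duration": 42.5},
    {"name": "Rule C", "type": "Formula", "count": 15, "duration": 0.3},
    {"name": "Rule D", "type": "Custom SQL", "count": "ERR", "duration": 60.0},
]

for group, names in triage_check_rules(sample_rows).items():
    print(group, names)
```

A rule can land in more than one group. In the sample, Rule D both timed out and uses custom SQL, so it appears under both headings.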
Database Activity Monitor
The Database Activity Monitor displays information about various processes running in Slate in real time. This tool can be useful for identifying the underlying cause of frequent rule timeouts, presuming that the previous steps have already been taken without success.
To use the Database Activity Monitor:
It is most productive to monitor this tool while rules are actively attempting to run. To determine if rules are currently running, look for any mention of “rule” as a keyword. For example, the Statement column could read "insert into #rule_...", or the Procedure column might display "CREATE PROCEDURE [dbo].[ruleMulti..."
Use your browser's refresh button often while keeping an eye on the total elapsed time column. Rule execution must complete in under 15 minutes (900 seconds) to succeed, so any one process taking upwards of 300 seconds may be a cause for concern. Take a screenshot or note down any long-running processes. You may also attempt to identify the specific rule by the displayed SQL or by a GUID (similar to how it appears in the Rule Log), although the GUID will not always be present.
If something other than the rules is taking significant time to run (300+ seconds), look into it. Does this indicate a pending import, scheduled export, or some other database action? Note this down and troubleshoot those processes as needed.
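If you note down what you see on each refresh, the checks above can be sketched as a simple filter. This is an illustration only: the dictionary keys (`statement`, `procedure`, `elapsed_seconds`) and the non-rule sample statements are hypothetical, and the Database Activity Monitor does not necessarily expose its data in this form. The 300 second concern threshold and the "rule" keyword test come from the text.

```python
CONCERN_SECONDS = 300  # elapsed time the steps above treat as worth noting


def flag_long_running(processes):
    """Return (kind, elapsed, text) for processes at or above the threshold.

    kind is "rule" when the statement or procedure mentions "rule",
    otherwise "other". Results are sorted longest-running first.
    """
    flagged = []
    for proc in processes:
        if proc["elapsed_seconds"] < CONCERN_SECONDS:
            continue
        text = f'{proc.get("statement", "")} {proc.get("procedure", "")}'.strip()
        kind = "rule" if "rule" in text.lower() else "other"
        flagged.append((kind, proc["elapsed_seconds"], text))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical observations noted from one refresh of the monitor.
sample_processes = [
    {"statement": "insert into #rule_...", "procedure": "", "elapsed_seconds": 412},
    {"statement": "select ...", "procedure": "", "elapsed_seconds": 12},
    {"statement": "", "procedure": "CREATE PROCEDURE [dbo].[ruleMulti...", "elapsed_seconds": 95},
    {"statement": "update ...", "procedure": "", "elapsed_seconds": 650},
]

for kind, elapsed, text in flag_long_running(sample_processes):
    print(kind, elapsed, text)
```

Entries of kind "other" correspond to the non-rule processes, such as a pending import or scheduled export, that the last step suggests investigating separately.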
If these steps do not reveal an issue that you can address, submit a ticket to the Support Desk for further assistance.