DmesgPlugin update: threshold for MCE/RAS errors #236

Open
alexandraBara wants to merge 3 commits into development from dmesg_update

alexandraBara commented Jun 23, 2026

Collaborator

Summary

- Adds a user-defined threshold (mce_threshold) for MCE/RAS errors. If the threshold is reached or exceeded, the corresponding WARNINGs are reported as ERRORs.
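
The escalation rule can be illustrated with a minimal sketch. This is not the code from this PR: the function name `classify_mce_counts` and its return shape are hypothetical, and the `>=` comparison is an assumption inferred from the sample run, where 1 correctable MCE with mce_threshold=1 is already reported as ERROR.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Tuple


def classify_mce_counts(
    sources: Iterable[str], mce_threshold: Optional[int] = None
) -> Dict[str, Tuple[int, str]]:
    """Count correctable MCEs per source and pick a severity for each.

    Hypothetical helper for illustration only. `sources` holds one entry
    per correctable MCE seen in dmesg, e.g. "CPU0" or "CPU1/mc0".
    """
    counts = Counter(sources)
    result = {}
    for source, count in counts.items():
        # Assumption: a source is escalated once its count reaches the
        # threshold (>=). With no threshold set, everything stays a warning.
        escalate = mce_threshold is not None and count >= mce_threshold
        result[source] = (count, "ERROR" if escalate else "WARNING")
    return result


if __name__ == "__main__":
    events = ["CPU0", "CPU1/mc0", "CPU1/mc0", "CPU1/mc0"]
    for source, (count, severity) in classify_mce_counts(events, 1).items():
        print(f"{severity}: {source} has {count} correctable MCE(s), mce_threshold=1")
```

Leaving `mce_threshold` unset keeps every source at WARNING in this sketch, which would preserve the behavior from before the threshold existed.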

Test plan

  • pytest test/unit
  • pytest test/functional (if applicable)
  • pre-commit run --all-files

Checklist

  • Added/updated tests (or explained why not)
  • Updated docs/README if behavior changed
  • No secrets or credentials committed

Sample run:

(venv) alexbara@ausalexbara02:~/node-scraper_public$ node-scraper  --plugin-configs=dmesg_mce_threshold_config.json   run-plugins DmesgPlugin   --data /home/alexbara/node-scraper_public/dmesg_sample.log   --collection False
  ...--------------------
  2026-06-23 11:59:35 CDT       INFO               nodescraper | Running plugin DmesgPlugin
  2026-06-23 11:59:35 CDT       INFO               nodescraper | Running data analyzer: DmesgAnalyzer
  2026-06-23 11:59:35 CDT      ERROR               nodescraper | CPU0 has 1 correctable MCE(s), mce_threshold=1
  2026-06-23 11:59:35 CDT      ERROR               nodescraper | CPU1/mc0 has 3 correctable MCE(s), mce_threshold=1
  2026-06-23 11:59:35 CDT      ERROR               nodescraper | (DmesgPlugin) task detected errors (4 warnings: RAS Correctable Error, RAS Corrected PCIe Error, MCE Corrected Error, RAS Corrected Error; 2 errors: CPU0 has 1 correctable MCE(s), mce_threshold=1, CPU1/mc0 has 3 correctable MCE(s), mce_threshold=1)
  2026-06-23 11:59:35 CDT       INFO               nodescraper | Closing connections
  2026-06-23 11:59:35 CDT       INFO               nodescraper | Running result collators
  2026-06-23 11:59:35 CDT       INFO               nodescraper | Running TableSummary result collator
  2026-06-23 11:59:35 CDT       INFO               nodescraper |

+-------------------------+--------+---------+
| Connection              | Status | Message |
+-------------------------+--------+---------+
| InBandConnectionManager | UNSET  |         |
+-------------------------+--------+---------+

+-------------+--------+--------------------------------------------------------------------------------+
| Plugin      | Status | Message                                                                        |
+-------------+--------+--------------------------------------------------------------------------------+
| DmesgPlugin | ERROR  | Analysis error: task detected errors (4 warnings: RAS Correctable Error, RAS   |
|             |        | Corrected PCIe Error, MCE Corrected Error, RAS Corrected Error; 2 errors: CPU0 |
|             |        | has 1 correctable MCE(s), mce_threshold=1, CPU1/mc0 has 3 correctable MCE(s),  |
|             |        | mce_threshold=1)                                                               |
+-------------+--------+--------------------------------------------------------------------------------+

  2026-06-23 11:59:35 CDT       INFO               nodescraper | Data written to csv file: ./scraper_logs_ausalexbara02_2026_06_23-11_59_35_AM/nodescraper.csv

dmesg_mce_threshold_config.json:

{
  "global_args": {},
  "plugins": {
    "DmesgPlugin": {
      "analysis_args": {
        "mce_threshold": 1,
        "check_unknown_dmesg_errors": false
      }
    }
  },
  "result_collators": {}
}
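
For reviewers who want to check the config shape outside of node-scraper, the nested lookup can be sketched with the standard library alone. The helper `read_mce_threshold` is hypothetical and does not reflect how node-scraper itself parses plugin configs. It only walks the keys shown in the file above.

```python
import json
from typing import Optional

# Same content as dmesg_mce_threshold_config.json above.
CONFIG_TEXT = """
{
  "global_args": {},
  "plugins": {
    "DmesgPlugin": {
      "analysis_args": {
        "mce_threshold": 1,
        "check_unknown_dmesg_errors": false
      }
    }
  },
  "result_collators": {}
}
"""


def read_mce_threshold(config_text: str, plugin: str = "DmesgPlugin") -> Optional[int]:
    """Return analysis_args.mce_threshold for a plugin, or None if unset.

    Hypothetical helper for illustration only.
    """
    config = json.loads(config_text)
    analysis_args = (
        config.get("plugins", {}).get(plugin, {}).get("analysis_args", {})
    )
    return analysis_args.get("mce_threshold")


if __name__ == "__main__":
    print(read_mce_threshold(CONFIG_TEXT))
```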
