SONARJAVA-6524 Generate built-in profiles from rule metadata #5705

Draft
romainbrenguier wants to merge 4 commits into master from romain/generated-profiles-no-json

@romainbrenguier romainbrenguier commented Jun 25, 2026

Contributor

Summary

  • generate the Sonar way and Sonar agentic AI profile JSON files during the build
  • move built-in profile membership into the per-rule metadata files and stop tracking the generated profile JSONs
  • update plugin tests to validate the generated classpath resources instead of src/main/resources files

Testing

  • mvn -pl sonar-java-plugin -am -DskipLicenseValidation -Dsurefire.failIfNoSpecifiedTests=false -Dtest=MetadataTest,JavaAgenticWayProfileTest,JavaSonarWayProfileTest test

Summary by Gitar

  • Build Infrastructure:
    • Introduced ProfileJsonGenerator.java to automate the creation of profile JSON files during the build process.
    • Updated pom.xml to include generated resource directories and configured exec-maven-plugin to execute the generator.
  • Documentation:
    • Added README.md in src/main/resources/profiles/ detailing the new rule management and build process.
  • Test Updates:
    • Updated JavaAgenticWayProfileTest to reflect the change in the total count of active rules from 465 to 467.

This will update automatically on new commits.

@hashicorp-vault-sonar-prod hashicorp-vault-sonar-prod Bot changed the title Generate built-in profiles from rule metadata SONARJAVA-6524 Generate built-in profiles from rule metadata Jun 25, 2026
@hashicorp-vault-sonar-prod

hashicorp-vault-sonar-prod Bot commented Jun 25, 2026

Contributor

SONARJAVA-6524

Comment thread sonar-java-plugin/src/main/build/ProfileJsonGenerator.java Outdated
Comment thread sonar-java-plugin/src/main/build/ProfileJsonGenerator.java Outdated
@romainbrenguier romainbrenguier force-pushed the romain/generated-profiles-no-json branch 2 times, most recently from c0ca171 to f826968 Compare June 26, 2026 13:19
Comment thread sonar-java-plugin/pom.xml
Comment thread sonar-java-plugin/src/main/build/ProfileJsonGenerator.java
@romainbrenguier romainbrenguier force-pushed the romain/generated-profiles-no-json branch from 6c724fb to 4ebb9c8 Compare June 29, 2026 08:21
Comment thread sonar-java-plugin/pom.xml
Comment thread sonar-java-plugin/src/main/build/ProfileJsonGenerator.java
@romainbrenguier

Contributor Author

Thanks for the comprehensive review. I've addressed both remaining suggestions:

  1. Removed redundant copy-generated-profiles execution (sonar-java-plugin/pom.xml:397-411) - The <resources> entry already handles copying the generated profiles during process-resources, so the separate copy-resources execution was indeed redundant.

  2. Added warning for misnamed rule-key files (sonar-java-plugin/src/main/build/ProfileJsonGenerator.java:67-73) - Added .peek() to log a warning to stderr when non-rule-key files are encountered in profile directories, making it easier to catch typos or stray files at build time.

Both changes improve the build configuration clarity and help prevent silent profile-membership mistakes.

@gitar-bot

gitar-bot Bot commented Jun 29, 2026

Code Review ✅ Approved 7 resolved / 7 findings

Automates profile JSON generation from rule metadata during the build and transitions to directory-based profile composition. All previous findings regarding parsing fragility, stale resource handling, and silent failures have been resolved.

✅ 7 resolved
Quality: Profile generator silently drops rules with unknown profile names

📄 sonar-java-plugin/src/main/build/ProfileJsonGenerator.java:64-72 📄 sonar-java-plugin/src/main/build/ProfileJsonGenerator.java:84-97
collectKeysByProfile looks up each profile name extracted from a rule's defaultQualityProfiles with keysByProfile.get(profile) and only adds the rule key when the returned list is non-null. Any profile name that is not exactly one of the two keys in PROFILES ("Sonar way", "Sonar agentic AI") is therefore silently ignored.

This migration moves profile membership into ~500 hand-edited rule metadata files, so a typo such as "Sonar Way", "sonar way", or "Sonar agentic Al" in any single rule would silently exclude that rule from the built-in profile with no error. The safety nets are weak: MetadataTest.ensure_sane_Sonar_way_profile only asserts the Sonar way size is > 400, so a handful of dropped rules would go completely unnoticed (the agentic test uses an exact size, but Sonar way does not). Likewise, a rule whose JSON omits defaultQualityProfiles entirely is silently excluded.

Recommend failing the build (or at minimum warning) when a rule references a profile name that is not in PROFILES, so accidental omissions surface at build time instead of shipping an incomplete profile.
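
The recommendation above could be sketched roughly as follows. This is a hypothetical illustration, not the PR's actual `ProfileJsonGenerator` code: the class name `ProfileNameValidation`, the method signature, and the input map (rule key to declared profile names) are assumptions; only the two profile names come from the review.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names and signature are assumptions, not the PR's code.
final class ProfileNameValidation {

  // The two built-in profile names quoted in the review.
  static final List<String> PROFILES = List.of("Sonar way", "Sonar agentic AI");

  /**
   * Groups rule keys by profile name. Instead of skipping an unknown
   * profile name, it fails fast and names the offending rule.
   */
  static Map<String, List<String>> collectKeysByProfile(Map<String, List<String>> profilesByRule) {
    Map<String, List<String>> keysByProfile = new LinkedHashMap<>();
    PROFILES.forEach(profile -> keysByProfile.put(profile, new ArrayList<>()));

    profilesByRule.forEach((ruleKey, profiles) -> {
      for (String profile : profiles) {
        List<String> keys = keysByProfile.get(profile);
        if (keys == null) {
          // A typo such as "Sonar Way" ends up here and aborts the build.
          throw new IllegalStateException("Rule " + ruleKey
            + " references unknown profile \"" + profile + "\"; expected one of " + PROFILES);
        }
        keys.add(ruleKey);
      }
    });
    return keysByProfile;
  }
}
```

Throwing rather than logging is a design choice in this sketch; a warning would be the softer alternative the review also mentions.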

Quality: Regex-based JSON parsing in ProfileJsonGenerator is fragile

📄 sonar-java-plugin/src/main/build/ProfileJsonGenerator.java:33-35 📄 sonar-java-plugin/src/main/build/ProfileJsonGenerator.java:84-97 📄 sonar-java-plugin/src/main/build/ProfileJsonGenerator.java:99-105
ProfileJsonGenerator extracts sqKey and defaultQualityProfiles via hand-written regular expressions rather than a JSON parser. This works for the current well-formatted metadata, but it is brittle: JSON_STRING_PATTERN blindly captures every quoted token inside the defaultQualityProfiles array, so any future change such as an inline comment, an escaped quote, or reformatting could yield wrong profile names or miss entries. Because the generator runs as a single-file source launch (java ProfileJsonGenerator.java), it cannot easily depend on Gson; even so, the fragility warrants an explanatory comment and tighter patterns. Consider at least documenting the assumption that metadata files are machine-generated and strictly formatted, and validating extracted profile names against the known set (see related finding) so malformed input cannot silently produce an incorrect profile.

Bug: Stale source profile JSONs collide with generated ones

📄 sonar-java-plugin/pom.xml:148-155 📄 sonar-java-plugin/pom.xml:397-411 📄 sonar-java-plugin/src/main/build/ProfileJsonGenerator.java:42 📄 sonar-java-plugin/src/main/build/ProfileJsonGenerator.java:56-57
The PR's stated goal is to "stop tracking the generated profile JSONs," but the old hand-maintained files are still present in source: sonar-java-plugin/src/main/resources/org/sonar/l10n/java/rules/java/Sonar_way_profile.json and Sonar_agentic_AI_profile.json (the diff shows 0 deletions). ProfileJsonGenerator now writes freshly generated files to the SAME packaged path (org/sonar/l10n/java/rules/java/Sonar_way_profile.json).

In the pom, both src/main/resources and ${project.build.directory}/generated-resources/profiles are declared as resource directories (lines 148-155), and there is also a copy-generated-profiles copy-resources execution. Both the stale src copy and the generated copy resolve to the identical target path in target/classes. Which one ends up packaged depends entirely on maven-resources-plugin copy ordering and its overwrite timestamp semantics (by default a resource is only copied when the source is newer than the destination). This is fragile: the plugin may ship the stale, hand-maintained profile instead of the generated one, and at minimum the two definitions can silently diverge while both remain authoritative-looking.

Delete the old Sonar_way_profile.json / Sonar_agentic_AI_profile.json from src/main/resources so the generated artifact is the single source of truth, and ensure the per-rule profile membership files fully reproduce the previous profile contents.

Edge Case: numericKey throws cryptic NumberFormatException on stray files

📄 sonar-java-plugin/src/main/build/ProfileJsonGenerator.java:61-75
collectRuleKeys lists every regular file in a profile directory and feeds each filename to numericKey, which does Integer.parseInt(ruleKey.substring(1)). Any file whose name is not exactly S<digits> — e.g. a .gitkeep, .DS_Store, editor swap file, or a typo'd rule key such as S891O (letter O) — causes a NumberFormatException that aborts the build with an opaque message ("For input string ...") and no indication of the offending directory/file.

Consider filtering to files matching S\d+ (and/or sorting with a fallback comparator) and throwing a descriptive error that names the bad file, so contributors immediately understand the problem.
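
A minimal sketch of the stricter variant suggested above, assuming each profile directory contains only empty files named after rule keys. The class `RuleKeyCollector` is hypothetical; `collectRuleKeys` and `numericKey` reuse the names mentioned in the finding, but their bodies here are illustrative. The pattern is bounded to nine digits (an assumption of this sketch) so that `Integer.parseInt` cannot overflow.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch; not the PR's actual implementation.
final class RuleKeyCollector {

  // "S" followed by 1 to 9 digits; the upper bound keeps parseInt safe.
  private static final Pattern RULE_KEY = Pattern.compile("S\\d{1,9}");

  /** Lists rule keys in a profile directory, sorted by their numeric part. */
  static List<String> collectRuleKeys(Path profileDir) throws IOException {
    List<Path> files;
    try (Stream<Path> listing = Files.list(profileDir)) {
      files = listing.filter(Files::isRegularFile).collect(Collectors.toList());
    }

    List<String> keys = new ArrayList<>();
    for (Path file : files) {
      String name = file.getFileName().toString();
      if (!RULE_KEY.matcher(name).matches()) {
        // Name the directory and the file so the contributor can fix it.
        throw new IllegalStateException("Unexpected file in profile directory "
          + profileDir + ": \"" + name + "\" (expected S<digits>, e.g. S106)");
      }
      keys.add(name);
    }
    keys.sort(Comparator.comparingInt(RuleKeyCollector::numericKey));
    return keys;
  }

  // Only called on names that already matched RULE_KEY.
  static int numericKey(String ruleKey) {
    return Integer.parseInt(ruleKey.substring(1));
  }
}
```

This sketch rejects every stray file, including `.gitkeep` or `.DS_Store`; a build that tolerates such files would filter and warn instead of throwing.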

Bug: MetadataTest reads deleted src/main/resources profile JSON

📄 sonar-java-plugin/src/main/resources/org/sonar/l10n/java/rules/java/.gitignore:1
This PR deletes src/main/resources/org/sonar/l10n/java/rules/java/Sonar_way_profile.json (and the agentic one) and adds a .gitignore for *_profile.json, so the profile JSONs now only exist as generated artifacts under target/generated-resources / target/classes. However MetadataTest.ensure_sane_Sonar_way_profile() still reads the profile via a hard-coded filesystem path: Path.of("src/main/resources/" + JavaSonarWayProfile.SONAR_WAY_PATH) and opens it with Files.newReader(profilePath.toFile(), ...). Since that file no longer exists in the source tree, the test will fail with FileNotFoundException. This test is explicitly listed in the PR's test command (-Dtest=MetadataTest,...). The PR description says tests should be updated 'to validate the generated classpath resources instead of src/main/resources files', but MetadataTest was not updated. Point the test at the generated output (e.g. target/classes + SONAR_WAY_PATH) or load it from the classpath via getResourceAsStream(SONAR_WAY_PATH).
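
The classpath-based option mentioned above might look like the following sketch. `ClasspathProfileReader` and `readResource` are hypothetical names, and the `SONAR_WAY_PATH` constant here is a stand-in that assumes `JavaSonarWayProfile.SONAR_WAY_PATH` holds the packaged path quoted elsewhere in the review, without a leading slash.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; not the actual MetadataTest code.
final class ClasspathProfileReader {

  // Assumed stand-in for JavaSonarWayProfile.SONAR_WAY_PATH.
  static final String SONAR_WAY_PATH = "org/sonar/l10n/java/rules/java/Sonar_way_profile.json";

  /** Reads a resource as UTF-8 text, failing clearly if it is absent. */
  static String readResource(ClassLoader loader, String resourcePath) throws IOException {
    // ClassLoader.getResourceAsStream expects a path without a leading slash.
    try (InputStream in = loader.getResourceAsStream(resourcePath)) {
      if (in == null) {
        throw new IllegalStateException("Resource not found on classpath: " + resourcePath
          + " (was the profile generated before the tests ran?)");
      }
      return new String(in.readAllBytes(), StandardCharsets.UTF_8);
    }
  }
}
```

A test could call `readResource(getClass().getClassLoader(), SONAR_WAY_PATH)` and parse the result, which keeps it independent of whether the file lives in `src/main/resources` or is generated into `target/classes`.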

...and 2 more resolved from earlier reviews


@sonarqube-next


Replace metadata-based profile generation with directory-based approach.
Each rule's profile membership is now represented by a file in
profile-specific directories (profiles/sonar_way/, profiles/sonar_agentic_ai/).

This eliminates merge conflicts when parallel PRs add rules to profiles,
as each PR creates a new file instead of editing a shared JSON array.

Changes:
- Add ProfileJsonGenerator to scan profile directories and generate JSONs
- Create profile directories with 534 (Sonar way) and 467 (Agentic AI) rule files
- Update pom.xml to generate and copy profiles during build
- Add README.md with usage instructions
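
As a rough illustration of the generation step described in this commit message, the sketch below renders a profile JSON from an already sorted list of rule keys. The class `ProfileJsonSketch`, the method `toProfileJson`, and the output shape (a `name` field plus a `ruleKeys` array) are assumptions made for the example; the PR's actual output format is not shown in this conversation.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch; the JSON layout is an assumption, not taken from the PR.
final class ProfileJsonSketch {

  /**
   * Renders a profile as JSON. No escaping is performed, which assumes the
   * profile name and rule keys contain no quotes or backslashes.
   */
  static String toProfileJson(String profileName, List<String> sortedRuleKeys) {
    String keys = sortedRuleKeys.stream()
      .map(key -> "    \"" + key + "\"")
      .collect(Collectors.joining(",\n"));
    return "{\n"
      + "  \"name\": \"" + profileName + "\",\n"
      + "  \"ruleKeys\": [\n"
      + keys + "\n"
      + "  ]\n"
      + "}\n";
  }
}
```

Because each rule contributes one file and the array is derived at build time, two PRs that add different rules touch different files, which is the merge-conflict benefit the message describes.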
…7) to match the 468 files in sonar_agentic_ai profile directory; updated MetadataTest to read generated Sonar_way_profile.json from target/classes/ instead of src/main/resources/ since it is now generated during the build
Comment: <details>
<summary><b>Code Review</b> <kbd>👍 Approved with suggestions</kbd> <kbd>5 resolved / 7 findings</kbd></summary>

Automates built-in profile generation by moving rule membership metadata into individual rule files, resolving issues with stale JSON tracking and brittle manual updates. Consolidate the duplicate copy operations in the build configuration and refine the rule-key validation logic to prevent silent file drops.

<details>
<summary>💡 <b>Quality:</b> Generated profiles copied twice via <resources> and copy-resources</summary>

<kbd>📄 <a href="https://github.com/SonarSource/sonar-java/pull/5705/files#diff-a2a59812e774224a494679a03de77f5fe24ceb84295e379d6b9583ef97a1ee15R148-R155">sonar-java-plugin/pom.xml:148-155</a></kbd> <kbd>📄 <a href="https://github.com/SonarSource/sonar-java/pull/5705/files#diff-a2a59812e774224a494679a03de77f5fe24ceb84295e379d6b9583ef97a1ee15R397-R411">sonar-java-plugin/pom.xml:397-411</a></kbd>

The build both declares `${project.build.directory}/generated-resources/profiles` as a `<resource>` directory (which the default `process-resources` execution already copies into `${project.build.outputDirectory}`) and adds a separate `copy-generated-profiles` maven-resources-plugin execution that copies the same directory to the same `outputDirectory`. The two mechanisms are redundant. Keeping only one (the `<resources>` entry is sufficient) would reduce confusion and avoid double-processing the same files.

<details>
<summary>Fix</summary>

````
<!-- Remove the redundant copy-generated-profiles execution; the
     <resources> entry for generated-resources/profiles already copies
     the files into ${project.build.outputDirectory} during process-resources. -->
````

</details>

</details>

<details>
<summary>💡 <b>Quality:</b> Misnamed rule-key files are silently dropped from profiles</summary>

<kbd>📄 <a href="https://github.com/SonarSource/sonar-java/pull/5705/files#diff-527d6d3ff6d0b2988ebdcb2fe8ecc63ce3bf3ce782e105cb2ddd1881b66929edR67">sonar-java-plugin/src/main/build/ProfileJsonGenerator.java:67</a></kbd> <kbd>📄 <a href="https://github.com/SonarSource/sonar-java/pull/5705/files#diff-527d6d3ff6d0b2988ebdcb2fe8ecc63ce3bf3ce782e105cb2ddd1881b66929edR73-R76">sonar-java-plugin/src/main/build/ProfileJsonGenerator.java:73-76</a></kbd>

`collectRuleKeys` filters profile-directory entries with `isValidRuleKey` (`S\d+`). Any file that does not exactly match — e.g. a typo like `s106` (lowercase), `S106 ` (trailing space), or `S106.txt` — is silently skipped, so the corresponding rule disappears from the generated profile with no error or warning. Given the whole design relies on humans creating empty files named after rule keys, a silent drop makes profile-membership mistakes hard to detect. Consider logging a warning for files in a profile directory that do not match the expected rule-key pattern (excluding known files such as README/.gitignore).

<details>
<summary>Fix</summary>

````
files
  .filter(Files::isRegularFile)
  .map(Path::getFileName)
  .map(Path::toString)
  .peek(name -> {
    if (!isValidRuleKey(name)) {
      System.err.println("Ignoring non-rule-key file in profile directory: " + name);
    }
  })
  .filter(ProfileJsonGenerator::isValidRuleKey)
  .sorted(Comparator.comparingInt(ProfileJsonGenerator::numericKey))
  .collect(Collectors.toList());
````

</details>

</details>



</details>

@romainbrenguier romainbrenguier force-pushed the romain/generated-profiles-no-json branch from 358c470 to a68bc01 Compare June 30, 2026 08:51
These files are now generated during the Maven build from the
profile directories (sonar_way/ and sonar_agentic_ai/), so they
should not be tracked in git.

The generated files are placed in target/classes/ during the build.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>