What is the purpose?

An opinionated, AI-native development workflow for Java Enterprise: reusable Skills, Agents, Commands, and third-party MCP servers combined with a human-in-the-loop model to modernize real-world SDLC practices.

Starting with this release, the project introduces a simple way to describe any SDLC action through three phases: Plan, Build, and Operate. Software engineers can use this structure when writing a User prompt in an AI user interface or terminal.

Build
  /implement-issue
    @robot-tech-lead
      /create-feature-branch
      /create-worktree
      /review-alignment
      @robot-java-coder
      @robot-java-spring-boot-coder
      @robot-java-quarkus-coder
      @robot-java-micronaut-coder
      @robot-no-java

We will go into more detail later, but first, let's review the most interesting features added in this release:

Thanks to our community members in Singapore, Hong Kong, Hanoi, London, and New York. πŸ‘‹πŸ‘‹πŸ‘‹

If you have questions about the project, how to customize it for your team, how to use the skills in daily work, or how to solve tooling issues, use GitHub Discussions.

Help this project grow: If this project helps your team, become a sponsor.

Enriching the workflow with Commands and Agents, not only Skills

The project started more than a year ago with a set of reusable rules / system prompts. That approach worked well after removing the restriction that associated rules with particular files, as described in ADR-002. With the rise of Skills, it was a good decision to convert that material into skills and use the new capabilities provided by Skill registries like https://www.skills.sh/ and other registries.

In this release, we go further by adding new semantics for expressing the actions a software engineer performs while solving a problem.

That model is organized around three delivery paths:

Plan
  /create-issue
  /update-issue
    @robot-business-analyst
      @043-planning-github-issues
      @044-planning-jira
      @014-agile-user-story
  /create-adr
    @robot-architect
      @030-architecture-adr-general
      @031-architecture-adr-functional-requirements
      @032-architecture-adr-non-functional-requirements
  /create-diagram
    @robot-architect
      @033-architecture-diagrams
  /create-spec
    @robot-tech-lead
      @042-planning-openspec
  /explore-design
    @robot-architect
      @034-architecture-design-exploration
  /review-alignment
    @robot-business-analyst

Build
  /implement-issue
    @robot-tech-lead
      /create-feature-branch
      /create-worktree
      /review-alignment
      @robot-java-coder
      @robot-java-spring-boot-coder
      @robot-java-quarkus-coder
      @robot-java-micronaut-coder
      @robot-no-java

Operate
  /profile
    @robot-java-performance
      @161-java-profiling-detect
      @162-java-profiling-analyze
      @163-java-profiling-refactor
      @164-java-profiling-verify
  /benchmark
    @robot-java-performance
      @151-java-performance-jmeter
      @152-java-performance-gatling

Of course, you can continue using the project in the classic way: add the Java class and a particular skill to the context, or describe the action in natural language and let the AI agent harness tools trigger the right skill. However, combining commands with agents and skills gives you more benefits.

Example:

Create AGENTS.md #It will trigger the skill @200-agents-md
/update-issue from github #xxx and use User Story format.
/create-spec using ideas from github issue #xxx
/review-alignment between the issue #xxx and the change #yyy
/implement-issue based on OpenSpec change #yyy

In upcoming releases, this model will be enriched in different ways, but its pillars are established in this release.

In other projects, you can find useful Skills, Agents, or Commands, but not always a fully connected workflow designed with Java in mind.

What are the Top 10 Skills from this project in Skills.sh?

The project has 106 skills and uses Skills.sh as its main skill registry. It has served 11.0K installs in total. These are the current top 10 skills used by users there:

  1. 110-java-maven-best-practices - search query: maven
  2. 121-java-object-oriented-design - search query: java object oriented
  3. 124-java-secure-coding - search query: java security
  4. 131-java-testing-unit-testing - search query: java unit testing
  5. 142-java-functional-programming - search query: java functional programming
  6. 128-java-generics - search query: java generics
  7. 111-java-maven-dependencies - search query: maven
  8. 141-java-refactoring-with-modern-features
  9. 125-java-concurrency - search query: java concurrency
  10. 143-java-functional-exception-handling - search query: java functional programming

What is your favorite Skill from this project? You can share it here: https://github.com/jabrena/cursor-rules-java/discussions/804

Applying Zero Trust with your Agent skills

Skills are not ordinary Markdown files. They are executable guidance for AI agents. A skill can tell an agent how to read code, run commands, inspect evidence, write files, install tools, or make a technical recommendation.

That is useful, but it also means generated skills need a zero trust review mindset. In 0.15.0, the project introduced its first validators for generated skills. In 0.16.0, that support has grown into a broader validation stack with multiple independent gates:

  • MarkdownValidator checks that project documentation and generated Markdown remain parseable and healthy.
  • skill-check validates the skill package structure.
  • cisco-ai-skill-scanner scans generated skills recursively with behavioral scanning and a strict policy.
  • SkillSpector adds another static quality and security review.
  • Snyk Agent Scan adds supply-chain and prompt-risk signals.

The point is not to claim that a generated skill is perfect. The point is to make suspicious behavior visible before maintainers or users rely on it.

Common skill risks include:

  • Prompt injection patterns
  • Data exfiltration instructions
  • Suspicious command execution
  • Hidden or obfuscated content
  • Excessive agency
  • Supply-chain risk
  • Description-behavior mismatch
  • Insecure credential handling
  • System modification and privilege escalation
  • Untrusted content and indirect prompt injection
  • Tool poisoning and tool shadowing

Note: The project runs an analysis for all skills on every commit using the tools described above. https://github.com/jabrena/cursor-rules-java/blob/main/.github/workflows/maven.yaml

If you are interested in this kind of validation, I recommend reading the following article: How to validate skills?

Improving the approach to test the behavior of Agent Skills

During the evolution of this project, files change over time for different reasons. After each change, it is necessary to validate them again, so the release process includes time to ensure they continue to add value for software engineers and AI agents running in pipelines.

During this release, we ran a Spike to validate an idea for improving the testing process. We added Gherkin support for all skills created or updated in this release, and the results were successful. Testing time was reduced, and more importantly, the project now generates evidence for specific deterministic behaviors from the skills under test.

Let's review two examples to show the value of the new tests.

Example to validate a skill

All skills have an acceptance-test inventory file, and it lives in acceptance-tests-prompts-skills.md. When a generated skill changes for any reason, it is now possible to run only the matching prompt for that changed skill. Let's review the scenario for @111-java-maven-dependencies.

@111-java-maven-dependencies:

The inventory has a prompt to validate the skill:

execute @skills-generator/src/test/resources/gherkin/skills/111-java-maven-dependencies.feature
and verify that acceptance-tests pass.

That prompt is linked with the following Gherkin file:

Feature: Validate changes from usage of Maven dependencies skill

Background:
  Given the skill "111-java-maven-dependencies"

@acceptance-test
Scenario: Add JSpecify and Error Prone + NullAway to Maven demo
  Given the example project "examples/maven/maven-demo"
  And the example project has a baseline "pom.xml"
  And the folder "examples/maven/maven-demo" has no git changes
  And the dependency selection answers are:
    | question                  | answer                           |
    | code-quality dependencies | JSpecify; Error Prone + NullAway |
    | main project package name | info.jab                         |
  When the skill "111-java-maven-dependencies" is applied to "examples/maven/maven-demo"
  Then "./mvnw validate" succeeds in "examples/maven/maven-demo"
  And the "pom.xml" declares selected dependencies and compiler plugin arguments
  And ".mvn/jvm.config" contains the required JVM arguments
  And "./mvnw clean verify" fails during compile in "examples/maven/maven-demo"
  And the verification failure is accepted because "-Xlint:all" and "-Werror" convert existing warnings into errors
  And any git changes produced during skill execution and verification are reset

For that particular skill, the scenario fixes the example project, the selected dependency answers, the expected pom.xml changes, the expected .mvn/jvm.config changes, the validation command, the accepted compiler failure, and the cleanup expectation. The goal is not to test every possible conversation. The goal is to prove that the changed skill still follows its intended workflow against a stable fixture. A Gherkin file cannot cover every possible use case, so it focuses on the most important ones.

"Program testing can be used to show the presence of bugs, but never to show their absence!"

  • Edsger W. Dijkstra

Let's review another, more complex scenario and one of the key features included in this release.

Example to validate a command

All commands have an acceptance-test inventory file, and it lives in acceptance-tests-prompts-skills.md. When a generated command changes for any reason, it is now possible to run only the matching prompt for that changed command. Let's review the scenario about the command /implement-issue.

To demonstrate the new capabilities, let's try to solve the first problem from the project Latency problems:

# Problem 1

## User Story Statement

- **As an** API consumer / data analyst
- **I want to** consume God APIs (Greek, Roman & Nordic), filter gods whose names start with a requested letter, convert each filtered god name into a decimal representation, and return the sum of those values
- **So that** I can perform cross-pantheon analysis and aggregate mythology data for research, reporting, or educational applications.

**Notes:**

- Decimal conversion: For each god name, each character is converted to its Unicode integer value and those integers are concatenated as strings (for example, `Zeus` -> `90101117115`). The final result is the numeric sum of all per-name string representations.
- Case sensitivity: The `filter` parameter accepts exactly one Unicode code point and matching is case-sensitive. The documented source data returns god names with uppercase initial letters, such as `Nike`, `Nemesis`, `Neptun`, and `Njord`, so `filter=N` is the meaningful value for the documented aggregate examples. A lowercase `filter=n` is valid but returns no matches for the current documented data.
- HTTP timeouts: Outbound calls use Spring `RestClient` with connect/read timeouts set once in configuration (defaults in `application.yml`; optional environment variable overrides). There is no automatic retry of failed or timed-out requests; aggregation continues with the sources that return in time.
- Configuration: Single default configuration with environment variable overrides for operational flexibility.
- Data sources:
  - Greek API: https://my-json-server.typicode.com/jabrena/latency-problems/greek
  - Roman API: https://my-json-server.typicode.com/jabrena/latency-problems/roman
  - Nordic API: https://my-json-server.typicode.com/jabrena/latency-problems/nordic

Given this User story and the OpenSpec change defined here: https://github.com/jabrena/cursor-rules-java/tree/main/examples/openspec, you can implement it using the new /implement-issue command. Let's see how to do it and how to validate it.

/implement-issue:

Using this prompt:

execute @skills-generator/src/test/resources/gherkin/commands/implement-issue.feature
and verify that acceptance-tests pass.

You can run the following Gherkin file:

Feature: Validate implement-issue command with the God Analysis API OpenSpec example

Background:
  Given the command prompt file ".cursor/commands/implement-issue.md"
  And the OpenSpec project path "examples/openspec/god-analysis-api"
  And the OpenSpec change path "examples/openspec/god-analysis-api/openspec/changes/add-god-analysis-api"
  And the implementation target directory "examples/openspec/god-analysis-api/demo"
  And the implementation target directory starts empty except for ".gitkeep"

@acceptance-test
Scenario: Implement God Analysis API from a validated OpenSpec change
  Remark: Acceptance execution must use the implement-issue command contract and must not implement outside the requested demo directory.
  Given the OpenSpec change "add-god-analysis-api" contains "proposal.md", "design.md", "tasks.md", and "specs/god-analysis-api/spec.md"
  And the OpenSpec change is validated with "openspec validate --all" from "examples/openspec/god-analysis-api"
  And the command prompt source ".cursor/commands/implement-issue.md" is read before execution
  When the user executes the prompt "/implement-issue examples/openspec/god-analysis-api/openspec/changes/add-god-analysis-api implement in examples/openspec/god-analysis-api/demo"
  Then the command loads the selected OpenSpec "tasks.md" as the execution contract
  And the command confirms the selected OpenSpec change is current, validated, and internally consistent
  And the command identifies the implementation as a Spring Boot MVC Java service from the OpenSpec design and technology constraints
  And the command routes implementation work through "@robot-tech-lead" and the appropriate Java Spring Boot implementation agent
  And the command reports using the current branch as the isolation strategy before implementation starts
  And all generated implementation files are created under "examples/openspec/god-analysis-api/demo"
  And the implementation provides "GET /api/v1/gods/stats/sum"
  And the implementation covers the documented happy path sum "78179288397447443426"
  And the implementation covers the documented partial timeout sum "78101109179220212216"
  And the implementation rejects missing, empty, multi-character, and invalid query parameters with HTTP 400
  And the implementation does not add WebFlux, WebClient, Rest Assured, Resilience4j Retry, Spring Retry, or custom retry loops for US-001
  And the command runs the focused Maven verification command from "examples/openspec/god-analysis-api/demo"
  And the command marks OpenSpec tasks complete only after their acceptance criteria and verification gates pass
  And the command reports changed files, validation evidence, updated OpenSpec task status, risks, and blockers
  And any git changes produced under "examples/openspec/god-analysis-api/demo" during command execution and verification are reset

When the prompt is executed, under the hood the Gherkin file triggers the agents and skills:

Build
  /implement-issue
    @robot-tech-lead
      /create-feature-branch
      /create-worktree
      /review-alignment
      @robot-java-coder
      @robot-java-spring-boot-coder
      @robot-java-quarkus-coder
      @robot-java-micronaut-coder
      @robot-no-java

In this case, the command internally uses the agent @robot-tech-lead, which redirects to the specific agent @robot-java-spring-boot-coder based on the analysis of the specification. That agent handles specific Java skills and specific Spring Boot skills. This is the result for a Spring Boot implementation:

asciicast

Running the test with Codex CLI for the Spring Boot variant

Running the test with VS Code + Codex plugin

But if you refine the prompt a bit, you can implement the requirement in Quarkus:

execute @skills-generator/src/test/resources/gherkin/commands/implement-issue.feature
and verify that acceptance-tests pass.
Implement it using Quarkus, not Spring Boot, as the default requirement.

In this case, the agent @robot-tech-lead redirects the workload to the specific agent @robot-java-quarkus-coder, which handles specific Java skills and specific Quarkus skills. This is the result for a Quarkus implementation:

asciicast

Running the test with Codex CLI for the Quarkus variant

Or, if required, the agent @robot-tech-lead redirects to the specific agent @robot-java-micronaut-coder, which handles specific Java skills and specific Micronaut skills. This is the result for a Micronaut implementation:

execute @skills-generator/src/test/resources/gherkin/commands/implement-issue.feature
and verify that acceptance-tests pass.
Implement it using Micronaut, not Spring Boot, as the default requirement.

And the project will implement the feature without any issues:

asciicast

Running the test with Codex CLI for the Micronaut variant

As you can see, one of the unique features of this project is the ability to implement requirements across multiple Java frameworks. With this idea in mind, you can explore moving from one framework to another during a Spike, evaluate how complex the change is, identify which annotations change, and discover which features are framework-specific. If you have good tests, the journey becomes easier.

Another benefit discovered through this new testing approach is that by following the test execution, we can find issues with skills and reinforce them after each test run. One example is the lack of support for Mongock in Spring Boot 4.0.x, but now the skill is able to provide a workaround, consistent with the solutions for Quarkus and Micronaut.

Improving the way to install Agents and Commands

With the rise of Skills, there is a need for public registries for them. But what happens to Agents, Commands, or other files? The reality is that Agents and Commands are often treated as second-class citizens.

To take advantage of the public registry and the process for generating skills from XML sources, it is relatively easy to embed commands and agents in a Meta Skill. Once you have installed the skills, you can use the following inventory and installation workflows:

Then you can use them to install assets or generate the inventory files.

Example:

install @004-commands-installation cursor
install @004-commands-installation claude-code
install @004-commands-installation codex
install @004-commands-installation github-copilot

Note: It is a good practice after releasing a new version to download all Skills and then install the Agents and Commands aligned.

New capabilities for Java Enterprise Frameworks

In this new release, we have added a few new Java framework capabilities:

  • Project creation: starter skills for Spring Boot, Quarkus, and Micronaut, helping teams bootstrap Maven-based services with the expected Java version, framework baseline, package structure, and verification commands from the first commit: @300-frameworks-spring-boot-create-project, @400-frameworks-quarkus-create-project, @500-frameworks-micronaut-create-project.
  • Spring Modulith: dedicated guidance for modular monolith design with Spring Boot: @305-frameworks-spring-boot-modulith.
  • MongoDB migrations: Mongock migration skills for Spring Boot, Quarkus, and Micronaut: @316-frameworks-spring-mongodb-migrations-mongock, @416-frameworks-quarkus-mongodb-migrations-mongock, @516-frameworks-micronaut-mongodb-migrations-mongock.

The value is that teams can pick the framework lane first, then apply the matching skills for project creation, REST APIs, validation, security, persistence, messaging, migrations, MongoDB, and tests. If you use one of the main Java frameworks, you can review the following skills:

Spring Boot skills:

Quarkus skills:

Micronaut skills:

Increasing engineering awareness with EU regulations

EU regulations are becoming part of the daily software engineering context, not something that only appears at the end of a release. Modern Java systems process personal data, integrate third-party services, expose APIs, run distributed infrastructure, and increasingly include AI models, RAG pipelines, agents, tool calling, or generated decision support.

For software development teams, the value is practical: regulations help turn vague risk into reviewable engineering questions. They push teams to identify the system scope, data flows, operational controls, human oversight, audit evidence, incident paths, and ownership boundaries before a change reaches production.

This becomes even more important with GenAI tooling. When prompts, embeddings, generated code, agent actions, and tool calls enter the SDLC, teams need a repeatable way to ask what data is being used, what decisions are automated, what evidence is kept, and which legal, security, privacy, risk, or product owners must review the work.

0.16.0 introduces a new alpha family of regulation engineering review skills:

These skills are engineering review aids. They do not provide legal advice and they do not replace qualified legal, compliance, privacy, security, risk, product, or governance owners.

For distributed systems using GenAI tools, a practical review set is:

  • EU AI Act when the system uses AI models, LLMs, RAG, AI agents, tool calling, or generated decision support.
  • GDPR when the system processes personal data in prompts, logs, embeddings, retrieval sources, exports, backups, or generated outputs.
  • NIS2 when the system supports essential or important services, cybersecurity incident flows, supply-chain dependencies, or critical operations.
  • DORA when the system supports financial entities, important business services, ICT risk, third-party ICT providers, or operational resilience evidence.
  • Cyber Resilience Act when a product, library, agent tool, or software component may be distributed as a product with digital elements.
  • Data Act when the system exposes data access, sharing, portability, cloud switching, connected-product data, or interoperability workflows.
  • Digital Services Act when the system supports hosting, platforms, marketplaces, content moderation, recommender systems, advertising, or transparency reporting.
  • Digital Markets Act when the system supports gatekeeper-platform concerns, core platform services, interoperability, data access, ranking, or self-preferencing controls.

If you are interested in this set of skills, I recommend reading the following article: Introduction to EU regulations Part I

Β What trends from Radar #34 follow this project?

The Thoughtworks Technology Radar Vol. 34 (April 2026) maps the current technology landscape across four rings: Adopt, Trial, Assess, and Caution. Several blips align directly with the direction this project has been taking. Lets review what recommendations matches with this project.

Adopt

  • Curated shared instructions for software teams β€” The Radar explicitly calls out AGENTS.md as a distribution mechanism for AI guidance, anchored into service templates so every new repository inherits the latest agent workflows. This project has been doing exactly that from the start. See @200-agents-md.
  • Context engineering β€” Treating the context window as a design surface rather than a static text box is now a foundational concern. The skill system in this project is a practical application of that principle: skills are loaded on demand, not front-loaded into a monolithic prompt.
  • Zero trust architecture β€” The Radar recommends ZTA as a non-negotiable default for agent deployments: never trust, always verify, least-privilege access. The Applying Zero Trust with Agent Skills section in this release directly addresses this applied to skill-based agent workflows.

Trial

  • Agent Skills β€” The Radar places Agent Skills in Trial, noting they are an open standard for modularizing context, reducing token consumption and providing a controlled alternative to MCP. This project is one of the early skill registries on skills.sh, shipping skills for Java Enterprise workflows.
  • Feedback sensors for coding agents β€” Deterministic quality gates wired into agent workflows so failures trigger self-correction. This project uses skill-check and skill-scanner as post-generation feedback sensors, and is now expanding coverage with Gherkin acceptance tests for Skills, Agents and Commands.
  • Progressive context disclosure β€” Agents should load only what is needed for the current task. The 001-commands-inventory, 002-agents-inventory, and 003-skills-inventory skills serve as lightweight discovery indexes before detailed skill content is loaded.
  • Mutation testing β€” The Radar highlights Pitest for Java as a way to verify that a passing test suite genuinely validates behavior. The Java testing skills in this project include guidance on mutation testing as a quality signal. Use @112-java-maven-plugins to add and configure the Pitest Maven plugin.
  • Mapping code smells to refactoring techniques β€” The Radar recommends Agent Skills and slash commands for mapping legacy patterns to specific refactoring approaches. This project covers this for Java across multiple dimensions: @141-java-refactoring-with-modern-features for modernising existing code, @121-java-object-oriented-design for OOP principles and code smells, @122-java-type-design for type-level design decisions, @142-java-functional-programming for functional style, and @143-java-functional-exception-handling for functional error handling patterns.

Assess

  • Architecture drift reduction with LLMs β€” The Radar mentions ArchUnit as a deterministic structural tool to combine with LLM-powered evaluation. Use @111-java-maven-dependencies to add ArchUnit to a Java project.
  • Code intelligence as agentic tooling β€” The Radar calls out the Serena MCP server for semantic code retrieval. This project already enables serena as one of its configured MCP servers, giving agents structured access to the codebase rather than relying on text search.

Cross-cutting theme: Putting coding agents on a leash

The Radar's editorial theme explicitly names OpenSpec alongside GitHub Spec-Kit as a spec-driven development framework for structuring workflows through planning, design and implementation. It also calls out HITL as a necessary counterweight to agent autonomy, and Agent Skills as a safer alternative to unrestricted MCP access. All three are central to this project.

Next steps

The next phase is already visible in the v0.17.0 milestone. The backlog continues the same direction as 0.16.0: make agent workflows more useful, more deterministic, and easier to adopt in real teams.

Functionally, the next workstreams are:

  • Expand executable acceptance coverage for Skills, Agents, and Commands, so important behavior is checked with stable Gherkin scenarios instead of relying only on package shape or manual review.
  • Improve analysis methods, including the hamburger method from Gojko Adzic and the two-step method from Kent Beck, so discovery work can become more structured before teams generate ADRs, Specs or implementation tasks.
  • Improve a few references with fewer examples and add more triggers to increase auto-discovery.
  • Add new capabilities such as Karpathy's LLM Wiki; when analyzing a particular feature, take into consideration other aspects of the whole distributed system.
  • Extend Maven guidance with JavaMoney support in the Maven plugin workflow, improving how teams introduce money and currency handling into enterprise builds.
  • Complete the EU regulation review family, so teams can map distributed-system and GenAI decisions against a broader set of engineering evidence patterns.

Enjoy.