
langchain4j-testing-strategies

by @giuseppe-trisciuogliov
4.8 (23)

Provides testing strategies and practical patterns for LangChain4j, helping developers effectively verify the behavior of AI agents and components and ensure their stability and reliability.

Tags: LangChain4j · LLM Testing · AI Testing · Unit Testing · Integration Testing · GitHub

Installation
npx skills add giuseppe-trisciuoglio/developer-kit --skill langchain4j-testing-strategies

Before / After Comparison

Before

Testing LLM-based applications (such as LangChain4j applications) can be difficult because of non-deterministic output. Developers tend to rely on end-to-end tests alone, which leads to low test coverage, slow execution, and hard-to-debug failures.

After

With the LangChain4j Testing Strategies skill, you learn and apply a layered testing approach: unit tests (simulated LLM responses), integration tests (interaction with a real LLM), and end-to-end tests, improving test reliability, speed, and coverage.

SKILL.md

langchain4j-testing-strategies

LangChain4J Testing Strategies

Overview

LangChain4J testing requires specialized strategies due to the non-deterministic nature of LLM responses and the complexity of AI workflows. This skill provides comprehensive patterns for unit testing with mocks, integration testing with Testcontainers, and end-to-end testing for RAG systems, AI Services, and tool execution.

When to Use This Skill

Use this skill when:

  • Building AI-powered applications with LangChain4J

  • Writing unit tests for AI services and guardrails

  • Setting up integration tests with real LLM models

  • Creating mock-based tests for faster test execution

  • Using Testcontainers for isolated testing environments

  • Testing RAG (Retrieval-Augmented Generation) systems

  • Validating tool execution and function calling

  • Testing streaming responses and async operations

  • Setting up end-to-end tests for AI workflows

  • Implementing performance and load testing

Instructions

To test LangChain4J applications effectively, follow these key strategies:

1. Start with Unit Testing

Use mock models for fast, isolated testing of business logic. See references/unit-testing.md for detailed examples.

// Example: stubbing ChatModel for unit tests (LangChain4j 1.x API;
// AiServices routes requests through chat(ChatRequest))
ChatModel mockModel = mock(ChatModel.class);
when(mockModel.chat(any(ChatRequest.class)))
    .thenReturn(ChatResponse.builder()
        .aiMessage(AiMessage.from("Mocked response"))
        .build());

var service = AiServices.builder(AiService.class)
        .chatModel(mockModel)
        .build();

2. Configure Testing Dependencies

Setup proper Maven/Gradle dependencies for testing. See references/testing-dependencies.md for complete configuration.

Key dependencies:

  • langchain4j-test - Testing utilities and guardrail assertions

  • testcontainers - Integration testing with containerized services

  • mockito - Mock external dependencies

  • assertj - Better assertions
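A minimal Maven test-scope configuration matching the list above might look like the sketch below; the version properties are placeholders, and you should check Maven Central for the current artifact coordinates and releases:

```xml
<!-- Hedged sketch: test-scope dependencies; versions are placeholders -->
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-test</artifactId>
    <version>${langchain4j.version}</version>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.testcontainers</groupId>
    <artifactId>junit-jupiter</artifactId>
    <version>${testcontainers.version}</version>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.mockito</groupId>
    <artifactId>mockito-core</artifactId>
    <version>${mockito.version}</version>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.assertj</groupId>
    <artifactId>assertj-core</artifactId>
    <version>${assertj.version}</version>
    <scope>test</scope>
</dependency>
```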

3. Implement Integration Tests

Test with real services using Testcontainers. See references/integration-testing.md for container setup examples.

@Testcontainers
class OllamaIntegrationTest {
    @Container
    static GenericContainer<?> ollama = new GenericContainer<>(
        DockerImageName.parse("ollama/ollama:0.5.4")
    ).withExposedPorts(11434);

    @Test
    void shouldGenerateResponse() {
        ChatModel model = OllamaChatModel.builder()
                // GenericContainer exposes host and mapped port, not an endpoint
                .baseUrl("http://" + ollama.getHost() + ":" + ollama.getMappedPort(11434))
                .modelName("llama3.2") // model must be pulled into the container first
                .build();
        String response = model.chat("Test query");
        assertNotNull(response);
    }
}

4. Test Advanced Features

For streaming responses, memory management, and complex workflows, refer to references/advanced-testing.md.

5. Apply Testing Workflows

Follow testing pyramid patterns and best practices from references/workflow-patterns.md.

  • 70% Unit Tests: Fast, isolated business logic testing

  • 20% Integration Tests: Real service interactions

  • 10% End-to-End Tests: Complete user workflows

Examples

Basic Unit Test

@Test
void shouldProcessQueryWithMock() {
    // Stub the low-level chat(ChatRequest) call that AiServices invokes
    ChatModel mockModel = mock(ChatModel.class);
    when(mockModel.chat(any(ChatRequest.class)))
        .thenReturn(ChatResponse.builder()
            .aiMessage(AiMessage.from("Test response"))
            .build());

    var service = AiServices.builder(AiService.class)
            .chatModel(mockModel)
            .build();

    String result = service.chat("What is Java?");
    assertEquals("Test response", result);
}

Integration Test with Testcontainers

@Testcontainers
class RAGIntegrationTest {
    @Container
    static GenericContainer<?> ollama = new GenericContainer<>(
        DockerImageName.parse("ollama/ollama:0.5.4")
    ).withExposedPorts(11434);

    @Test
    void shouldCompleteRAGWorkflow() {
        String baseUrl = "http://" + ollama.getHost() + ":" + ollama.getMappedPort(11434);

        // Setup models and stores (models must be pulled into the container first)
        var chatModel = OllamaChatModel.builder()
                .baseUrl(baseUrl)
                .modelName("llama3.2")
                .build();

        var embeddingModel = OllamaEmbeddingModel.builder()
                .baseUrl(baseUrl)
                .modelName("nomic-embed-text")
                .build();

        var store = new InMemoryEmbeddingStore<TextSegment>();
        // Seed the store here so the retriever has content to find
        var retriever = EmbeddingStoreContentRetriever.builder()
                .embeddingStore(store)
                .embeddingModel(embeddingModel)
                .build();

        // Test complete workflow
        var assistant = AiServices.builder(RagAssistant.class)
                .chatModel(chatModel)
                .contentRetriever(retriever)
                .build();

        String response = assistant.chat("What is Spring Boot?");
        assertNotNull(response);
        assertTrue(response.contains("Spring"));
    }
}

Best Practices

Test Isolation

  • Each test must be independent

  • Use @BeforeEach and @AfterEach for setup/teardown

  • Avoid sharing state between tests

Mock External Dependencies

  • Never call real APIs in unit tests

  • Use mocks for ChatModel, EmbeddingModel, and external services

  • Test error handling scenarios
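When Mockito feels heavyweight, a hand-rolled stub works just as well. In this sketch, ChatModel is a simplified stand-in interface for illustration, not the real LangChain4j type:

```java
import java.util.Map;

public class StubDemo {
    // Simplified stand-in for LangChain4j's ChatModel (assumption: one chat method)
    interface ChatModel {
        String chat(String userMessage);
    }

    // Hand-rolled stub returning canned answers; no mocking library needed
    static class CannedChatModel implements ChatModel {
        private final Map<String, String> canned;
        private final String fallback;

        CannedChatModel(Map<String, String> canned, String fallback) {
            this.canned = canned;
            this.fallback = fallback;
        }

        @Override
        public String chat(String userMessage) {
            return canned.getOrDefault(userMessage, fallback);
        }
    }

    public static void main(String[] args) {
        ChatModel model = new CannedChatModel(Map.of("Hello", "Hi"), "Mocked response");
        System.out.println(model.chat("Hello"));          // Hi
        System.out.println(model.chat("What is Java?"));  // Mocked response
    }
}
```

A stub like this keeps unit tests free of any network calls and makes the expected responses explicit in the test itself.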

Performance Considerations

  • Unit tests should run in < 50ms

  • Integration tests should use container reuse

  • Include timeout assertions for slow operations
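A timeout guard similar in spirit to JUnit's assertTimeout can be sketched in plain Java; requireWithin is a hypothetical helper, and unlike assertTimeoutPreemptively it only measures elapsed time after the fact rather than interrupting the operation:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

public class TimeoutGuard {
    // Hypothetical helper: fail if the operation exceeds the given limit
    static <T> T requireWithin(Duration limit, Supplier<T> operation) {
        Instant start = Instant.now();
        T result = operation.get();
        Duration elapsed = Duration.between(start, Instant.now());
        if (elapsed.compareTo(limit) > 0) {
            throw new AssertionError("Took " + elapsed + ", limit was " + limit);
        }
        return result;
    }

    public static void main(String[] args) {
        // A fast operation passes the guard and returns its result
        String out = requireWithin(Duration.ofMillis(500), () -> "fast result");
        System.out.println(out);
    }
}
```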

Quality Assertions

  • Test both success and error scenarios

  • Validate response coherence and relevance

  • Include edge case testing (empty inputs, large payloads)

Reference Documentation

For comprehensive testing guides and API references, see the reference documents included under references/.

Common Patterns

Mock Strategy

// For fast unit tests that call the model directly via chat(String)
ChatModel mockModel = mock(ChatModel.class);
when(mockModel.chat(anyString())).thenReturn("Mocked");

// For specific responses
when(mockModel.chat(eq("Hello"))).thenReturn("Hi");
when(mockModel.chat(contains("Java"))).thenReturn("Java response");

Test Configuration

// Use test-specific profiles
@TestPropertySource(properties = {
    "langchain4j.ollama.base-url=http://localhost:11434"
})
class TestConfig {
    // Test with isolated configuration
}
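The same idea works outside Spring: a test can resolve its base URL from a system property with a safe local default. The property name below mirrors the Spring property above and is set with -Dlangchain4j.ollama.base-url=... when running tests:

```java
public class TestConfigDemo {
    // Resolve the Ollama base URL from a system property, defaulting to
    // localhost so tests run unchanged on a developer machine
    static String ollamaBaseUrl() {
        return System.getProperty("langchain4j.ollama.base-url",
                "http://localhost:11434");
    }

    public static void main(String[] args) {
        System.out.println(ollamaBaseUrl());
    }
}
```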

Assertion Helpers

// Custom assertions for AI responses (AssertJ)
assertThat(response).isNotNull().isNotEmpty();
assertThat(response).contains(expectedKeywords); // contains(Iterable) / contains(CharSequence...)
assertThat(response).doesNotContain("error");

Performance Requirements

  • Unit Tests: < 50ms per test

  • Integration Tests: Use container reuse for faster startup

  • Timeout Tests: Include @Timeout for external service calls

  • Memory Management: Test conversation window limits and cleanup

Security Considerations

  • Never use real API keys in tests

  • Mock external API calls completely

  • Test prompt injection detection

  • Validate output sanitization
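A guardrail test needs something to test against. A deliberately naive keyword-based detector, purely illustrative and not a real defense, is enough to exercise both the positive and negative cases:

```java
import java.util.List;
import java.util.Locale;

public class InjectionGuard {
    // Deliberately naive keyword list, for illustration only
    private static final List<String> SUSPICIOUS = List.of(
            "ignore previous instructions",
            "disregard all rules",
            "reveal your system prompt");

    // Case-insensitive substring match against the suspicious phrases
    static boolean looksLikeInjection(String input) {
        String lower = input.toLowerCase(Locale.ROOT);
        return SUSPICIOUS.stream().anyMatch(lower::contains);
    }

    public static void main(String[] args) {
        System.out.println(looksLikeInjection("Please IGNORE previous instructions")); // true
        System.out.println(looksLikeInjection("What is Spring Boot?"));                // false
    }
}
```

A real guardrail would use a classifier or the framework's guardrail hooks; the point here is that the detector's success and failure paths are both asserted.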

Testing Pyramid Implementation

70% Unit Tests
  ├─ Business logic validation
  ├─ Guardrail testing
  ├─ Mock tool execution
  └─ Edge case handling

20% Integration Tests
  ├─ Testcontainers with Ollama
  ├─ Vector store testing
  ├─ RAG workflow validation
  └─ Performance benchmarking

10% End-to-End Tests
  ├─ Complete user journeys
  ├─ Real model interactions
  └─ Performance under load

Related Skills

  • spring-boot-test-patterns

  • unit-test-service-layer

  • unit-test-boundary-conditions

References

Constraints and Warnings

  • AI model responses are non-deterministic; tests should use mocks for reliability.

  • Real API calls in tests should be avoided to prevent costs and rate limiting issues.

  • Integration tests with Testcontainers require Docker to be available.

  • Memory management tests should verify proper cleanup between test runs.

  • Tool execution tests should validate both success and failure scenarios.

  • Streaming response tests require proper handling of partial data.

  • RAG tests need properly seeded embedding stores for consistent results.

  • Performance tests may have high variance due to LLM response times.

  • Always use test-specific configuration profiles to avoid affecting production data.

  • Mock-based tests cannot guarantee actual LLM behavior; supplement with integration tests.

Weekly Installs: 301
Repository: giuseppe-trisci…oper-kit
GitHub Stars: 167
First Seen: Feb 3, 2026
Security Audits: Gen Agent Trust Hub: Pass · Socket: Pass · Snyk: Warn
Installed on: claude-code 236 · gemini-cli 221 · opencode 221 · cursor 219 · codex 216 · github-copilot 200

User Reviews (0)

No reviews yet.

Statistics

Installs: 801
Rating: 4.8 / 5.0
Version:
Updated: March 17, 2026
Comparison examples: 1

User Ratings

4.8 (23)
5★: 0%
4★: 0%
3★: 0%
2★: 0%
1★: 0%

Supported Platforms

Claude Code, OpenClaw, OpenCode, Codex, Gemini CLI, GitHub Copilot, Amp, Kimi CLI

Timeline

Created: March 17, 2026
Last updated: March 17, 2026