
langchain4j-testing-strategies

by @giuseppe-trisciuogliov
4.8 (23)

Provides testing strategies and practical patterns for LangChain4j, helping developers effectively verify the behavior of AI agents and components and ensure their stability and reliability.

Tags: LangChain4j · LLM Testing · AI Testing · Unit Testing · Integration Testing · GitHub

Installation
npx skills add giuseppe-trisciuoglio/developer-kit --skill langchain4j-testing-strategies

Before / After Comparison

Before

Testing LLM-based applications (such as LangChain4j applications) can be difficult because of non-deterministic output. Developers tend to rely on end-to-end tests alone, which leads to low test coverage, slow execution, and hard-to-debug failures.

After

With the LangChain4j Testing Strategies skill, you learn and apply a layered testing approach: unit tests (simulated LLM responses), integration tests (interaction with a real LLM), and end-to-end tests, improving test reliability, speed, and coverage.

SKILL.md

langchain4j-testing-strategies

LangChain4J Testing Strategies

Overview

LangChain4J testing requires specialized strategies due to the non-deterministic nature of LLM responses and the complexity of AI workflows. This skill provides comprehensive patterns for unit testing with mocks, integration testing with Testcontainers, and end-to-end testing for RAG systems, AI Services, and tool execution.

When to Use This Skill

Use this skill when:

  • Building AI-powered applications with LangChain4J

  • Writing unit tests for AI services and guardrails

  • Setting up integration tests with real LLM models

  • Creating mock-based tests for faster test execution

  • Using Testcontainers for isolated testing environments

  • Testing RAG (Retrieval-Augmented Generation) systems

  • Validating tool execution and function calling

  • Testing streaming responses and async operations

  • Setting up end-to-end tests for AI workflows

  • Implementing performance and load testing

Instructions

To test LangChain4J applications effectively, follow these key strategies:

1. Start with Unit Testing

Use mock models for fast, isolated testing of business logic. See references/unit-testing.md for detailed examples.

// Example: stubbing ChatModel for unit tests (LangChain4j 1.x API;
// AiServices routes requests through chat(ChatRequest))
ChatModel mockModel = mock(ChatModel.class);
when(mockModel.chat(any(ChatRequest.class)))
    .thenReturn(ChatResponse.builder()
        .aiMessage(AiMessage.from("Mocked response"))
        .build());

var service = AiServices.builder(AiService.class)
        .chatModel(mockModel)
        .build();

2. Configure Testing Dependencies

Setup proper Maven/Gradle dependencies for testing. See references/testing-dependencies.md for complete configuration.

Key dependencies:

  • langchain4j-test - Testing utilities and guardrail assertions

  • testcontainers - Integration testing with containerized services

  • mockito - Mock external dependencies

  • assertj - Better assertions
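A minimal Maven test-scope configuration matching the list above might look like the sketch below; the version properties are placeholders, and you should check Maven Central for the current artifact coordinates and releases:

```xml
<!-- Hedged sketch: test-scope dependencies; versions are placeholders -->
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-test</artifactId>
    <version>${langchain4j.version}</version>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.testcontainers</groupId>
    <artifactId>junit-jupiter</artifactId>
    <version>${testcontainers.version}</version>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.mockito</groupId>
    <artifactId>mockito-core</artifactId>
    <version>${mockito.version}</version>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.assertj</groupId>
    <artifactId>assertj-core</artifactId>
    <version>${assertj.version}</version>
    <scope>test</scope>
</dependency>
```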

3. Implement Integration Tests

Test with real services using Testcontainers. See references/integration-testing.md for container setup examples.

@Testcontainers
class OllamaIntegrationTest {
    @Container
    static GenericContainer<?> ollama = new GenericContainer<>(
        DockerImageName.parse("ollama/ollama:0.5.4")
    ).withExposedPorts(11434);

    @Test
    void shouldGenerateResponse() {
        ChatModel model = OllamaChatModel.builder()
                // GenericContainer exposes host and mapped port, not an endpoint
                .baseUrl("http://" + ollama.getHost() + ":" + ollama.getMappedPort(11434))
                .modelName("llama3.2") // model must be pulled into the container first
                .build();
        String response = model.chat("Test query");
        assertNotNull(response);
    }
}

4. Test Advanced Features

For streaming responses, memory management, and complex workflows, refer to references/advanced-testing.md.

5. Apply Testing Workflows

Follow testing pyramid patterns and best practices from references/workflow-patterns.md.

  • 70% Unit Tests: Fast, isolated business logic testing

  • 20% Integration Tests: Real service interactions

  • 10% End-to-End Tests: Complete user workflows

Examples

Basic Unit Test

@Test
void shouldProcessQueryWithMock() {
    // Stub the low-level chat(ChatRequest) call that AiServices invokes
    ChatModel mockModel = mock(ChatModel.class);
    when(mockModel.chat(any(ChatRequest.class)))
        .thenReturn(ChatResponse.builder()
            .aiMessage(AiMessage.from("Test response"))
            .build());

    var service = AiServices.builder(AiService.class)
            .chatModel(mockModel)
            .build();

    String result = service.chat("What is Java?");
    assertEquals("Test response", result);
}

Integration Test with Testcontainers

@Testcontainers
class RAGIntegrationTest {
    @Container
    static GenericContainer<?> ollama = new GenericContainer<>(
        DockerImageName.parse("ollama/ollama:0.5.4")
    ).withExposedPorts(11434);

    @Test
    void shouldCompleteRAGWorkflow() {
        String baseUrl = "http://" + ollama.getHost() + ":" + ollama.getMappedPort(11434);

        // Setup models and stores (models must be pulled into the container first)
        var chatModel = OllamaChatModel.builder()
                .baseUrl(baseUrl)
                .modelName("llama3.2")
                .build();

        var embeddingModel = OllamaEmbeddingModel.builder()
                .baseUrl(baseUrl)
                .modelName("nomic-embed-text")
                .build();

        var store = new InMemoryEmbeddingStore<TextSegment>();
        // Seed the store here so the retriever has content to find
        var retriever = EmbeddingStoreContentRetriever.builder()
                .embeddingStore(store)
                .embeddingModel(embeddingModel)
                .build();

        // Test complete workflow
        var assistant = AiServices.builder(RagAssistant.class)
                .chatModel(chatModel)
                .contentRetriever(retriever)
                .build();

        String response = assistant.chat("What is Spring Boot?");
        assertNotNull(response);
        assertTrue(response.contains("Spring"));
    }
}

Best Practices

Test Isolation

  • Each test must be independent

  • Use @BeforeEach and @AfterEach for setup/teardown

  • Avoid sharing state between tests

Mock External Dependencies

  • Never call real APIs in unit tests

  • Use mocks for ChatModel, EmbeddingModel, and external services

  • Test error handling scenarios
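When Mockito feels heavyweight, a hand-rolled stub works just as well. In this sketch, ChatModel is a simplified stand-in interface for illustration, not the real LangChain4j type:

```java
import java.util.Map;

public class StubDemo {
    // Simplified stand-in for LangChain4j's ChatModel (assumption: one chat method)
    interface ChatModel {
        String chat(String userMessage);
    }

    // Hand-rolled stub returning canned answers; no mocking library needed
    static class CannedChatModel implements ChatModel {
        private final Map<String, String> canned;
        private final String fallback;

        CannedChatModel(Map<String, String> canned, String fallback) {
            this.canned = canned;
            this.fallback = fallback;
        }

        @Override
        public String chat(String userMessage) {
            return canned.getOrDefault(userMessage, fallback);
        }
    }

    public static void main(String[] args) {
        ChatModel model = new CannedChatModel(Map.of("Hello", "Hi"), "Mocked response");
        System.out.println(model.chat("Hello"));          // Hi
        System.out.println(model.chat("What is Java?"));  // Mocked response
    }
}
```

A stub like this keeps unit tests free of any network calls and makes the expected responses explicit in the test itself.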

Performance Considerations

  • Unit tests should run in < 50ms

  • Integration tests should use container reuse

  • Include timeout assertions for slow operations
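A timeout guard similar in spirit to JUnit's assertTimeout can be sketched in plain Java; requireWithin is a hypothetical helper, and unlike assertTimeoutPreemptively it only measures elapsed time after the fact rather than interrupting the operation:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

public class TimeoutGuard {
    // Hypothetical helper: fail if the operation exceeds the given limit
    static <T> T requireWithin(Duration limit, Supplier<T> operation) {
        Instant start = Instant.now();
        T result = operation.get();
        Duration elapsed = Duration.between(start, Instant.now());
        if (elapsed.compareTo(limit) > 0) {
            throw new AssertionError("Took " + elapsed + ", limit was " + limit);
        }
        return result;
    }

    public static void main(String[] args) {
        // A fast operation passes the guard and returns its result
        String out = requireWithin(Duration.ofMillis(500), () -> "fast result");
        System.out.println(out);
    }
}
```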

Quality Assertions

  • Test both success and error scenarios

  • Validate response coherence and relevance

  • Include edge case testing (empty inputs, large payloads)

Reference Documentation

For comprehensive testing guides and API references, see the reference documents included under references/.

Common Patterns

Mock Strategy

// For fast unit tests that call the model directly via chat(String)
ChatModel mockModel = mock(ChatModel.class);
when(mockModel.chat(anyString())).thenReturn("Mocked");

// For specific responses
when(mockModel.chat(eq("Hello"))).thenReturn("Hi");
when(mockModel.chat(contains("Java"))).thenReturn("Java response");

Test Configuration

// Use test-specific profiles
@TestPropertySource(properties = {
    "langchain4j.ollama.base-url=http://localhost:11434"
})
class TestConfig {
    // Test with isolated configuration
}
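The same idea works outside Spring: a test can resolve its base URL from a system property with a safe local default. The property name below mirrors the Spring property above and is set with -Dlangchain4j.ollama.base-url=... when running tests:

```java
public class TestConfigDemo {
    // Resolve the Ollama base URL from a system property, defaulting to
    // localhost so tests run unchanged on a developer machine
    static String ollamaBaseUrl() {
        return System.getProperty("langchain4j.ollama.base-url",
                "http://localhost:11434");
    }

    public static void main(String[] args) {
        System.out.println(ollamaBaseUrl());
    }
}
```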

Assertion Helpers

// Custom assertions for AI responses (AssertJ)
assertThat(response).isNotNull().isNotEmpty();
assertThat(response).contains(expectedKeywords); // contains(Iterable) / contains(CharSequence...)
assertThat(response).doesNotContain("error");

Performance Requirements

  • Unit Tests: < 50ms per test

  • Integration Tests: Use container reuse for faster startup

  • Timeout Tests: Include @Timeout for external service calls

  • Memory Management: Test conversation window limits and cleanup

Security Considerations

  • Never use real API keys in tests

  • Mock external API calls completely

  • Test prompt injection detection

  • Validate output sanitization
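A guardrail test needs something to test against. A deliberately naive keyword-based detector, purely illustrative and not a real defense, is enough to exercise both the positive and negative cases:

```java
import java.util.List;
import java.util.Locale;

public class InjectionGuard {
    // Deliberately naive keyword list, for illustration only
    private static final List<String> SUSPICIOUS = List.of(
            "ignore previous instructions",
            "disregard all rules",
            "reveal your system prompt");

    // Case-insensitive substring match against the suspicious phrases
    static boolean looksLikeInjection(String input) {
        String lower = input.toLowerCase(Locale.ROOT);
        return SUSPICIOUS.stream().anyMatch(lower::contains);
    }

    public static void main(String[] args) {
        System.out.println(looksLikeInjection("Please IGNORE previous instructions")); // true
        System.out.println(looksLikeInjection("What is Spring Boot?"));                // false
    }
}
```

A real guardrail would use a classifier or the framework's guardrail hooks; the point here is that the detector's success and failure paths are both asserted.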

Testing Pyramid Implementation

70% Unit Tests
  ├─ Business logic validation
  ├─ Guardrail testing
  ├─ Mock tool execution
  └─ Edge case handling

20% Integration Tests
  ├─ Testcontainers with Ollama
  ├─ Vector store testing
  ├─ RAG workflow validation
  └─ Performance benchmarking

10% End-to-End Tests
  ├─ Complete user journeys
  ├─ Real model interactions
  └─ Performance under load

Related Skills

  • spring-boot-test-patterns

  • unit-test-service-layer

  • unit-test-boundary-conditions

References

Constraints and Warnings

  • AI model responses are non-deterministic; tests should use mocks for reliability.

  • Real API calls in tests should be avoided to prevent costs and rate limiting issues.

  • Integration tests with Testcontainers require Docker to be available.

  • Memory management tests should verify proper cleanup between test runs.

  • Tool execution tests should validate both success and failure scenarios.

  • Streaming response tests require proper handling of partial data.

  • RAG tests need properly seeded embedding stores for consistent results.

  • Performance tests may have high variance due to LLM response times.

  • Always use test-specific configuration profiles to avoid affecting production data.

  • Mock-based tests cannot guarantee actual LLM behavior; supplement with integration tests.

Weekly Installs: 301
Repository: giuseppe-trisci…oper-kit
GitHub Stars: 167
First Seen: Feb 3, 2026
Security Audits: Gen Agent Trust Hub: Pass · Socket: Pass · Snyk: Warn
Installed on: claude-code 236 · gemini-cli 221 · opencode 221 · cursor 219 · codex 216 · github-copilot 200

User Reviews (0)

No reviews yet.

Statistics

Installs: 801
Rating: 4.8 / 5.0
Version:
Updated: March 17, 2026
Comparison examples: 1

User Ratings

4.8 (23)
5★: 0%
4★: 0%
3★: 0%
2★: 0%
1★: 0%

Supported Platforms

Claude Code, OpenClaw, OpenCode, Codex, Gemini CLI, GitHub Copilot, Amp, Kimi CLI

Timeline

Created: March 17, 2026
Last updated: March 17, 2026