claude-context-mcp/evaluation/case_study/README.md
Cheney Zhang 5df059a8c1 add case study (#184)
Signed-off-by: ChengZi <chen.zhang@zilliz.com>
2025-08-26 17:51:32 +08:00

Case Study

This directory contains case studies comparing two methods: the combined method (grep + Claude Context semantic search) and the traditional grep-only method.

These cases are selected from Princeton NLP's SWE-bench_Verified dataset. The results and logs are generated by the run_evaluation.py script. For more details, please refer to the evaluation README.md file.

  • 📁 django_14170: Query optimization in YearLookup breaks filtering by "__iso_year"
  • 📁 pydata_xarray_6938: .swap_dims() can modify original object

Each case study includes:

  • Original Issue: The GitHub issue description and requirements
  • Problem Analysis: Technical breakdown of the bug and expected solution
  • Method Comparison: Detailed comparison of both approaches
  • Conversation Logs: The interaction records showing how the LLM agent calls the tools and generates the final answer
  • Results: Performance metrics and outcome analysis

Key Results

In these case studies, the combined method (grep + Claude Context semantic search) is more efficient and more accurate than the traditional grep-only method.

Why Grep Fails

  1. Information Overload - Generates hundreds of irrelevant matches
  2. No Semantic Understanding - Only literal text matching
  3. Context Loss - Can't understand code relationships
  4. Inefficient Navigation - Produces many irrelevant results

How Grep + Semantic Search Wins

  1. Intelligent Filtering - Automatically ranks by relevance
  2. Conceptual Understanding - Grasps code meaning and relationships
  3. Efficient Navigation - Direct targeting of relevant sections
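
The contrast between literal matching and relevance ranking can be illustrated with a toy sketch. This is not how Claude Context is implemented: the snippets are invented for illustration, the function names are hypothetical, and a simple bag-of-words cosine score stands in for real embedding-based semantic search.

```python
import math
import re
from collections import Counter

# Hypothetical mini-corpus of (file, snippet) pairs, invented for illustration.
SNIPPETS = [
    ("lookups.py", "class YearLookup: optimize year filter using between bounds"),
    ("tests.py", "test year filter on dates"),
    ("docs.txt", "changelog: year 2020 release notes"),
    ("utils.py", "def parse year string to int"),
]


def grep(pattern, snippets):
    """Literal matching: every snippet containing the pattern, unranked."""
    return [loc for loc, text in snippets if pattern in text]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z_]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank(query, snippets):
    """Toy stand-in for semantic search: order snippets by similarity to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(text))), loc) for loc, text in snippets]
    scored.sort(key=lambda pair: -pair[0])
    return [loc for score, loc in scored if score > 0]


if __name__ == "__main__":
    # grep returns every file mentioning "year", in corpus order.
    print(grep("year", SNIPPETS))
    # The ranker puts the snippet closest to the query first.
    print(rank("optimize year filter lookup", SNIPPETS))
```

Here grep returns all four files for the pattern "year", leaving the agent to sift through them, while the ranked list surfaces the lookup code first. A real semantic index would additionally match on meaning rather than shared tokens, which this sketch does not attempt.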