DaisukeYoda f0321f0b04 fix: make duplication scoring more strict and lower LSH threshold
## Problems Fixed

1. **Duplication scoring too lenient**: 5.8% duplication was scored as
   100/100 (perfect) because thresholds were too high
2. **LSH threshold preventing clone detection**: 0.78 threshold filtered
   too many candidates, reducing detection accuracy

## Changes

### Duplication Scoring Thresholds (domain/analyze.go)
- `DuplicationThresholdLow`: 10.0 → 3.0
- `DuplicationThresholdMedium`: 25.0 → 10.0
- `DuplicationThresholdHigh`: 40.0 → 20.0

New scoring behavior:
- 0-3% duplication → score 100 (excellent)
- 3-10% duplication → score 70 (good, needs attention)
- 10-20% duplication → score 40 (poor)
- >20% duplication → score 0 (critical)
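
The banding above can be sketched as a small step function. This is only an illustration: `duplicationScore` is a hypothetical helper, not the function in `domain/analyze.go`, and the handling of values that land exactly on a threshold is an assumption, since the commit message does not say which band they fall into.

```go
package main

import "fmt"

// Threshold values as listed in the commit message (percent duplication).
const (
	DuplicationThresholdLow    = 3.0
	DuplicationThresholdMedium = 10.0
	DuplicationThresholdHigh   = 20.0
)

// duplicationScore is a hypothetical helper that maps a duplication
// percentage to the step scores described above.
// Assumption: a value exactly on a threshold falls into the better band.
func duplicationScore(pct float64) int {
	switch {
	case pct <= DuplicationThresholdLow:
		return 100 // excellent
	case pct <= DuplicationThresholdMedium:
		return 70 // good, needs attention
	case pct <= DuplicationThresholdHigh:
		return 40 // poor
	default:
		return 0 // critical
	}
}

func main() {
	for _, pct := range []float64{1.5, 5.8, 15.0, 25.0} {
		fmt.Printf("%.1f%% duplication -> score %d\n", pct, duplicationScore(pct))
	}
}
```

Under these assumptions, the 5.8% case from this commit lands in the second band and scores 70.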

### LSH Threshold (.pyscn.toml)
- `lsh_similarity_threshold`: 0.78 → 0.50
- Lower threshold allows more clone candidates for APTED verification
- Improves recall without significantly impacting precision
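
To see why a lower threshold admits more candidates, here is a rough sketch of a similarity pre-filter. It assumes MinHash-style signatures where the fraction of matching positions estimates similarity. `estimateSimilarity` and `isCandidate` are hypothetical names, and pyscn's actual LSH stage may work differently.

```go
package main

import "fmt"

// estimateSimilarity returns the fraction of positions at which two
// signatures agree. With MinHash signatures this estimates Jaccard
// similarity. Hypothetical helper, not part of pyscn.
func estimateSimilarity(a, b []uint64) float64 {
	if len(a) == 0 || len(a) != len(b) {
		return 0
	}
	matches := 0
	for i := range a {
		if a[i] == b[i] {
			matches++
		}
	}
	return float64(matches) / float64(len(a))
}

// isCandidate reports whether a pair passes the pre-filter and would be
// handed on to the more expensive tree edit distance check.
func isCandidate(a, b []uint64, threshold float64) bool {
	return estimateSimilarity(a, b) >= threshold
}

func main() {
	sigA := []uint64{11, 22, 33, 44, 55, 66, 77, 88}
	sigB := []uint64{11, 22, 33, 44, 55, 0, 1, 2} // 5 of 8 positions match

	fmt.Println("threshold 0.78:", isCandidate(sigA, sigB, 0.78))
	fmt.Println("threshold 0.50:", isCandidate(sigA, sigB, 0.50))
}
```

In this toy pair the estimated similarity is 0.625, so the pair is dropped at 0.78 but kept at 0.50 and left for the verification step to accept or reject.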

## Impact

Before: `Duplication: 100/100  (5.8% duplication, 5 groups)`
After: `Duplication: 70/100 👍 (5.8% duplication, 5 groups)`

The score now accurately reflects that 5.8% duplication with 5 clone
groups requires attention, while still maintaining an overall healthy
codebase grade.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-05 16:50:19 +09:00

pyscn - Python Code Quality Analyzer


pyscn is a code quality analyzer for Python vibe coders.

Building with Cursor, Claude, or ChatGPT? pyscn performs structural analysis to keep your codebase maintainable.

Quick Start

# Run analysis without installation
uvx pyscn analyze .
# or
pipx run pyscn analyze .

Demo

https://github.com/user-attachments/assets/b8e52d90-2a8e-4b49-a7e5-8a9a46f6a672

Features

  • 🔍 CFG-based dead code detection: Find unreachable code after exhaustive if-elif-else chains
  • 📋 Clone detection with APTED + LSH: Identify refactoring opportunities with tree edit distance
  • 🔗 Coupling metrics (CBO): Track architecture quality and module dependencies
  • 📊 Cyclomatic complexity analysis: Spot functions that need breaking down

100,000+ lines/sec • Built with Go + tree-sitter

Common Commands

pyscn analyze

Run comprehensive analysis with HTML report

pyscn analyze .                              # All analyses with HTML report
pyscn analyze --json .                       # Generate JSON report
pyscn analyze --select complexity .          # Only complexity analysis
pyscn analyze --select deps .                # Only dependency analysis
pyscn analyze --select complexity,deps,deadcode . # Multiple analyses

pyscn check

Fast CI-friendly quality gate

pyscn check .                      # Quick pass/fail check
pyscn check --max-complexity 15 .  # Custom thresholds

pyscn init

Create configuration file

pyscn init                         # Generate .pyscn.toml

💡 Run pyscn --help or pyscn <command> --help for complete options

Configuration

Create a .pyscn.toml file or add [tool.pyscn] to your pyproject.toml:

# .pyscn.toml
[complexity]
max_complexity = 15

[dead_code]
min_severity = "warning"

[output]
directory = "reports"

⚙️ Run pyscn init to generate a full configuration file with all available options

Installation

# Install with pipx (recommended)
pipx install pyscn

# Or run directly with uvx
uvx pyscn

Alternative installation methods

Build from source

git clone https://github.com/ludo-technologies/pyscn.git
cd pyscn
make build

Go install

go install github.com/ludo-technologies/pyscn/cmd/pyscn@latest

CI/CD Integration

# .github/workflows/code-quality.yml
name: Code Quality
on: [push, pull_request]

jobs:
  quality-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install pyscn
      - name: Quick quality check
        run: pyscn check .
      - name: Generate detailed report
        run: pyscn analyze --json --select complexity,deadcode,deps src/
      - name: Upload report
        uses: actions/upload-artifact@v4
        with:
          name: code-quality-report
          path: .pyscn/reports/

Documentation

📚 Development Guide • Architecture • Testing

License

MIT License — see LICENSE


Built with ❤️ using Go and tree-sitter
