The Config.Validate() method was not calling CloneConfig.Validate(),
which could allow invalid clone detection configurations to pass through.
This fix ensures proper validation of all clone-related settings including
thresholds, filtering, and analysis parameters.
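The delegation described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the fields `SimilarityThreshold` and `MinLines` and the `Clones` field name are hypothetical stand-ins for the real clone settings.

```go
package main

import (
	"errors"
	"fmt"
)

// CloneConfig is a simplified stand-in for the clone detection settings.
// Both fields are hypothetical and chosen only for illustration.
type CloneConfig struct {
	SimilarityThreshold float64
	MinLines            int
}

// Validate rejects out-of-range clone settings.
func (c *CloneConfig) Validate() error {
	if c.SimilarityThreshold < 0 || c.SimilarityThreshold > 1 {
		return errors.New("similarity threshold must be within [0, 1]")
	}
	if c.MinLines < 1 {
		return errors.New("min lines must be at least 1")
	}
	return nil
}

// Config is the top-level configuration that embeds the clone settings.
type Config struct {
	Clones CloneConfig
}

// Validate delegates to the nested CloneConfig so that invalid clone
// settings can no longer pass through unnoticed.
func (c *Config) Validate() error {
	if err := c.Clones.Validate(); err != nil {
		return fmt.Errorf("invalid clone configuration: %w", err)
	}
	return nil
}

func main() {
	cfg := Config{Clones: CloneConfig{SimilarityThreshold: 1.5, MinLines: 5}}
	fmt.Println(cfg.Validate())
}
```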
Updated CBO (Coupling Between Objects) analysis to use industry-standard
thresholds for more accurate code quality assessment:
**CBO Risk Thresholds:**
- Low Risk: CBO ≤ 3 (was ≤ 5)
- Medium Risk: 3 < CBO ≤ 7 (was 5 < CBO ≤ 10)
- High Risk: CBO > 7 (was > 10)
**Coupling Ratio Thresholds:**
- Low: 5% of classes (was 10%)
- Medium: 15% of classes (was 30%)
- High: 30% of classes (was 50%)
**Coupling Penalties:**
- Low: 6 points (was 5)
- Medium: 12 points (was 10)
- High: 20 points (was 16, now aligned with other high penalties)
These changes make coupling detection stricter and align it with industry
best practices, helping to identify maintainability issues earlier.
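As a rough sketch, the new CBO risk thresholds listed above map to a classifier like the one below. The function and constant names are hypothetical; only the boundary values (3 and 7) come from this change.

```go
package main

import "fmt"

// Boundary values taken from the updated CBO risk thresholds.
// The constant names are hypothetical.
const (
	lowCBOThreshold    = 3 // CBO <= 3 is low risk
	mediumCBOThreshold = 7 // 3 < CBO <= 7 is medium risk
)

// classifyCBO maps a CBO value to a risk level.
func classifyCBO(cbo int) string {
	switch {
	case cbo <= lowCBOThreshold:
		return "low"
	case cbo <= mediumCBOThreshold:
		return "medium"
	default:
		return "high"
	}
}

func main() {
	for _, cbo := range []int{3, 4, 7, 8} {
		fmt.Printf("CBO=%d -> %s\n", cbo, classifyCBO(cbo))
	}
}
```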
**Files Updated:**
- domain/analyze.go: Updated coupling ratio thresholds and penalties
- domain/cbo.go: Updated default CBO thresholds with industry standard comments
- internal/analyzer/cbo.go: Updated default CBO options
- domain/analyze_test.go: Updated test expectations for new thresholds
- internal/analyzer/cbo_test.go: Updated test expectations
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
## Problem
LSH (Locality-Sensitive Hashing) mode was producing inflated similarity
scores compared to APTED mode, resulting in false positive clone detection.
## Root Cause
MinHash Jaccard similarity tends to be more lenient than APTED tree edit
distance similarity due to:
- Coarse feature extraction (maxSubtreeHeight: 3, kGramSize: 4)
- Set-based similarity ignoring structural differences
- Common patterns (If, For, Assign) creating false overlaps
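To illustrate the set-based leniency, here is a small sketch that computes exact Jaccard similarity (not the MinHash estimate the detector uses) over hypothetical feature lists. Two fragments with the same node kinds in a different order score 1.0, even though a tree edit distance would see them as different.

```go
package main

import "fmt"

// toSet converts a feature list into a set, discarding order and duplicates.
func toSet(features []string) map[string]struct{} {
	set := make(map[string]struct{}, len(features))
	for _, f := range features {
		set[f] = struct{}{}
	}
	return set
}

// jaccard returns |A ∩ B| / |A ∪ B| for the two feature lists.
// Two empty inputs are treated as identical (an assumption of this sketch).
func jaccard(a, b []string) float64 {
	setA, setB := toSet(a), toSet(b)
	if len(setA) == 0 && len(setB) == 0 {
		return 1.0
	}
	intersection := 0
	for f := range setA {
		if _, ok := setB[f]; ok {
			intersection++
		}
	}
	union := len(setA) + len(setB) - intersection
	return float64(intersection) / float64(union)
}

func main() {
	// Same node kinds, different structure: the set view cannot tell them apart.
	x := []string{"If", "For", "Assign"}
	y := []string{"Assign", "If", "For"}
	fmt.Println(jaccard(x, y))
}
```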
## Solution
Increased LSHSimilarityThreshold from 0.78 to 0.88 across all layers:
- internal/analyzer/clone_detector.go:216
- domain/clone.go:350
- internal/config/clone_config.go:200
- cmd/pyscn/init.go:84 (template)
The new threshold (0.88) is Type2Threshold (0.85) + 0.03, ensuring:
- Stricter candidate filtering in the LSH stage
- Reduced false positives
- Final APTED verification still provides accurate similarity
## Impact
- Fragments >= 500: LSH mode now produces comparable results to APTED
- Better precision without sacrificing recall
- Maintains performance benefits of LSH acceleration
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Add SetBatchSizeLarge() method to CloneDetector for controlled access
- Update clone_memory_test.go to use setter instead of direct field access
- Maintains encapsulation of cloneDetectorConfig struct
Address issues found in code review:
1. Fix encapsulation violation:
- Change CloneDetectorConfig embedding from public to private
- Add SetUseLSH() method for controlled access
- Update service layer to use SetUseLSH() instead of direct field access
2. Remove dead code:
- Delete unused MemoryLimit field from CloneDetectorConfig
- Remove from type definition and default configuration
This prevents external packages from directly accessing internal
configuration fields while maintaining all functionality.
All tests pass.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Remove indirection and deprecated code related to CloneDetectorConfig:
- Embed CloneDetectorConfig directly into CloneDetector struct
- Replace all cd.config.X with cd.X (44 occurrences)
- Remove unused NewCloneDetectorFromConfig() function
- Remove DEPRECATED comments from CloneDetectorConfig and NewCloneDetector
- Update service layer to use detector.UseLSH directly
- Clean up unnecessary comments in clone_adapters.go
This simplifies the architecture while maintaining all functionality.
All tests pass.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
## Problem
LSH (Locality-Sensitive Hashing) acceleration was not working because:
1. LSH settings in [clones] section of .pyscn.toml were not being loaded
2. toml_loader.go expected nested [lsh] section, but config used flat structure
3. cloneConfigToCloneRequest() was not converting LSH settings to CloneRequest
4. Auto-enable logic based on fragment count was not implemented
This caused clone detection to always use slow APTED algorithm, even for
large projects where LSH would provide significant speedup.
## Solution
1. Added ClonesConfig struct to read flat [clones] section structure
2. Implemented mergeClonesSection() to load all settings including LSH
3. Extended CloneRequest with LSH fields (LSHEnabled, LSHAutoThreshold, etc.)
4. Added auto-enable logic in clone_service.go:
- "auto": enable LSH when fragments >= threshold (default: 500)
- "true": always enable LSH
- "false": always disable LSH
5. Added diagnostic messages showing LSH decision
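The three-way decision in step 4 can be sketched as below. The function name and signature are hypothetical and the real logic in clone_service.go may be organized differently; the mode strings and the `>=` comparison against the threshold follow the description above.

```go
package main

import "fmt"

// shouldUseLSH decides whether LSH acceleration is enabled.
// mode is one of "auto", "true", or "false"; autoThreshold is the
// fragment count at which "auto" switches LSH on (default: 500).
func shouldUseLSH(mode string, fragmentCount, autoThreshold int) (bool, error) {
	switch mode {
	case "true":
		return true, nil
	case "false":
		return false, nil
	case "auto":
		return fragmentCount >= autoThreshold, nil
	default:
		// Rejecting unknown values is an assumption of this sketch.
		return false, fmt.Errorf("unknown LSH mode %q", mode)
	}
}

func main() {
	for _, n := range []int{120, 500, 2000} {
		enabled, _ := shouldUseLSH("auto", n, 500)
		fmt.Printf("fragments=%d lsh=%v\n", n, enabled)
	}
}
```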
## Changes
- domain/clone.go: Add LSH config fields to CloneRequest
- internal/config/toml_loader.go: Add ClonesConfig struct and merge logic
- service/clone_config_loader.go: Convert LSH settings to CloneRequest
- service/clone_service.go: Implement auto-enable logic based on fragment count
- .pyscn.toml: Document LSH settings (no functional change)
## Testing
- Verified LSH auto-detection with different thresholds
- Confirmed settings load correctly from .pyscn.toml
- All existing tests pass
## Related
- Fixes issue discovered during performance investigation
- Prepares for config refactoring in #124
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Align with Ruff's behavior where dedicated config files take precedence
over pyproject.toml. This makes the configuration priority more intuitive:
when users explicitly create a .pyscn.toml file, it should override any
settings in pyproject.toml.
Changes:
- Update TomlConfigLoader.LoadConfig() to check .pyscn.toml first
- Update GetSupportedConfigFiles() to reflect new priority order
- Update documentation in ARCHITECTURE.md to reflect new behavior
This is a breaking change for projects that have both .pyscn.toml and
pyproject.toml with different settings.
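A minimal sketch of the new lookup order, assuming a first-match-wins search. `pickConfigFile` and the injected `exists` callback are hypothetical; the actual TomlConfigLoader.LoadConfig() may resolve paths and merge settings differently.

```go
package main

import "fmt"

// configPriority lists candidate files from highest to lowest priority:
// the dedicated .pyscn.toml now wins over pyproject.toml.
var configPriority = []string{".pyscn.toml", "pyproject.toml"}

// pickConfigFile returns the first candidate that exists, or "" if none do.
// The exists callback is injected so the sketch does not touch the filesystem.
func pickConfigFile(exists func(name string) bool) string {
	for _, name := range configPriority {
		if exists(name) {
			return name
		}
	}
	return ""
}

func main() {
	present := map[string]bool{".pyscn.toml": true, "pyproject.toml": true}
	fmt.Println(pickConfigFile(func(name string) bool { return present[name] }))
}
```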
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Fix critical issues in CFG builder for more accurate dead code detection:
1. Fix blockTerminates() to check only last statement
- Previously checked all statements, causing false positives
- Now only checks the final statement to avoid detecting terminators
in nested control flow (e.g., return inside if-statement within block)
- Prevents incorrect dead code detection in complex scenarios
2. Replace error logging with panic for parser bugs
- Changed from logging to fail-fast panic when elif_clause has missing fields
- Parser bugs are programming errors, not user errors, so immediate failure
with detailed error message is appropriate
- Improves debugging by failing at the exact point of error
These changes improve the accuracy and debuggability of dead code analysis
without breaking any existing functionality.
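The last-statement check from item 1 can be sketched as follows. The `Stmt` type is a simplified, hypothetical stand-in for the real AST node, and the terminator set (return, raise, break, continue) is assumed from the related dead code fix.

```go
package main

import "fmt"

// Stmt is a simplified stand-in for an AST statement node.
type Stmt struct {
	Kind string // e.g. "return", "if", "assign"
	Body []Stmt // nested statements, unused by the check below
}

// isTerminator reports whether a statement kind ends control flow in a block.
func isTerminator(kind string) bool {
	switch kind {
	case "return", "raise", "break", "continue":
		return true
	}
	return false
}

// blockTerminates inspects only the final statement of the block, so a
// terminator nested inside earlier control flow does not count.
func blockTerminates(block []Stmt) bool {
	if len(block) == 0 {
		return false
	}
	return isTerminator(block[len(block)-1].Kind)
}

func main() {
	nested := []Stmt{
		{Kind: "if", Body: []Stmt{{Kind: "return"}}},
		{Kind: "assign"},
	}
	direct := []Stmt{{Kind: "assign"}, {Kind: "return"}}
	fmt.Println(blockTerminates(nested), blockTerminates(direct))
}
```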
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Fixes #117
## Problem
Dead code detection failed to identify unreachable code after exhaustive
if-elif-else chains where all branches terminate with return/raise/break/
continue statements.
## Root Cause
1. Parser wasn't properly handling elif_clause and else_clause nodes from
tree-sitter, causing else clauses to be lost in elif chains
2. CFG builder always created fall-through edges to merge blocks even
when all conditional branches terminated
## Solution
### Parser Fixes (ast_builder.go)
- Added buildElifClause() to properly parse elif nodes with Test/Body/Orelse
- Added buildElseClause() to parse else nodes with Body
- Modified buildIfStatement() to collect all alternative branches and chain
them correctly via attachElseToElifChain()
- Tree-sitter returns both elif_clause and else_clause as siblings under the
"alternative" field; both are now properly collected and linked
### CFG Builder Enhancements (cfg_builder.go)
- Added blockTerminates() to check if block ends with terminating statements
- Added allBranchesTerminate() to check if all conditional branches terminate
- Modified processIfStatement() to detect exhaustive termination and create
unreachable blocks instead of connecting to merge
- Modified processIfStatementElif() with same termination detection
- Updated convertElifClauseToIf() to use parser-populated fields
- Both functions now handle else_clause nodes by extracting their Body
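The exhaustive-termination check can be sketched like this. The `Branch` type and the `hasElse` parameter are hypothetical simplifications; the real allBranchesTerminate() works on CFG builder state rather than on plain values.

```go
package main

import "fmt"

// Branch is a simplified stand-in for one arm of an if/elif/else chain.
type Branch struct {
	LastStmt string // kind of the final statement in the arm's body
}

// terminates reports whether the arm ends with a terminating statement.
func terminates(b Branch) bool {
	switch b.LastStmt {
	case "return", "raise", "break", "continue":
		return true
	}
	return false
}

// allBranchesTerminate is true only when the chain has an else arm (so it is
// exhaustive) and every arm ends in a terminator. In that case code after the
// chain is unreachable and no fall-through edge to the merge block is needed.
func allBranchesTerminate(branches []Branch, hasElse bool) bool {
	if !hasElse || len(branches) == 0 {
		return false
	}
	for _, b := range branches {
		if !terminates(b) {
			return false
		}
	}
	return true
}

func main() {
	chain := []Branch{{LastStmt: "return"}, {LastStmt: "raise"}, {LastStmt: "return"}}
	fmt.Println(allBranchesTerminate(chain, true))
}
```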
### Test Updates (dead_code_test.go)
- Updated UnreachableElif: expectedDead 0 → 1
- Added ExhaustiveIfElseReturn test case
- Added NestedExhaustiveReturn test case
- Added MixedTerminators test case (return/raise mix)
- Updated ComplexControlFlow: expectedDead 3 → 4 (more accurate)
## Testing
✅ All 18 dead code detection tests passing
✅ All analyzer tests passing (16.8s)
✅ All parser tests passing (0.26s)
## Pattern
Follows existing unreachable block creation pattern used for return/break/
continue statements (lines 323-324, 738-740, 759-761 in cfg_builder.go).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Remove incomplete quality analysis feature
The quality analysis feature was using fixed/mock values and incomplete
implementations, making it misleading for users. This commit removes:
- QualityMetricsResult and RefactoringTarget domain types
- AnalyzeQuality field from SystemAnalysisRequest
- AnalyzeQuality method from service interface and implementation
- Maintainability and TechnicalDebt calculations (always returned fixed values)
- Quality formatting in all output formats (text, JSON, CSV, HTML)
- Quality configuration options
- Deprecated unused helper functions that were kept for backward compatibility
The removal focuses the codebase on its core functionality:
- Dependency analysis with circular dependency detection
- Architecture validation and layer analysis
- Robert Martin's coupling metrics (Ca, Ce, I, A, D)
All tests pass and the codebase is cleaner and more maintainable.
Fixes #109
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix CI: Remove MaintainabilityIndex assertion from system_metrics_test
The MaintainabilityIndex field was removed as part of the quality analysis
cleanup, but a new test file was added to main after the PR was created.
This commit fixes the test to remove the assertion on the deleted field.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* chore: drop quality config toggle
---------
Co-authored-by: DaisukeYoda <daisukeyoda@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
Implements Phase 1 of Issue #34 - CFG support for Python comprehensions.
- Add processComprehension() method to model comprehensions as implicit loops
- Support list/dict/set comprehensions and generator expressions
- Handle filter conditions with proper conditional edges
- Fix AST builder to correctly extract if clauses in comprehensions
- Add comprehensive test coverage for all comprehension types
The implementation models comprehensions with the following CFG structure:
init -> loop_header -> [filter] -> body -> append -> loop_header -> exit
This significantly improves static analysis accuracy for Python code that
uses comprehensions, which are a common Python idiom.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Implement actual system analysis functionality replacing stub returns:
* Add extractModuleMetrics for real dependency metrics
* Add generateCycleBreakingSuggestions for circular dependency resolution
* Add classifyModulesByQuality for module quality classification
* Add generateRefactoringTargets with priority calculation
* Add identifyHotSpots for complex coupled modules
- Enable quality analysis in analyze command (AnalyzeQuality: true)
- Fix dependency resolution for project modules:
* Add resolveAbsoluteImportWithProject to prioritize project modules
* Search current directory, project root, and parent directory
- Add configuration loading to deps command:
* Load architecture rules from config files
* Merge CLI flags with configuration
- Improve architecture layer detection accuracy:
* Prioritize last part of module path (e.g., "router" → presentation)
* Remove business-specific names from layer patterns
* Add project prefix support in pattern matching
This resolves the issue where system analysis was returning mostly empty
values for dependencies, quality metrics, and recommendations.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* feat(clone): improve clone detection grouping and display
Major improvements to clone detection quality and performance:
Display improvements:
- Changed "Clone Pairs" to "Unique Fragments" for accurate representation
- Added TotalClones field to AnalyzeSummary struct
- Fixed misleading 10000 pairs display (was showing MaxClonePairs limit)
Grouping algorithm improvements:
- Implemented new Centroid-based grouping algorithm to avoid transitive similarity issues
- Changed default grouping from Connected Components to K-Core for better balance
- K-Core provides better quality groups while maintaining good performance
Threshold adjustments for better clone quality:
- Type3 threshold: 0.70 → 0.80
- Type4 threshold: 0.60 → 0.75
Results:
- Before: 417 clones with 11% average similarity (poor quality groups)
- After: 26 clones with 81% average similarity (high quality groups)
Available grouping modes:
- k_core (default): Good balance of quality and performance
- centroid: High quality but slower (O(n²), best for small projects)
- star: Medium quality and speed
- complete_linkage: High precision but slow
- connected: Fastest but lowest quality
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: address code review feedback on clone detection
- Add MaxClonePairs limit check in generatePairsFromGroups function
- Add explanatory comments for dual Centroid implementations
- Clarify that clone_detector.go has optimized direct implementation
- Ensure centroid_grouping.go maintains GroupingStrategy compatibility
* refactor: use unified GroupingStrategy interface for all grouping modes
- Remove special handling for Centroid mode in DetectClonesWithContext
- Delete redundant detectClonesWithCentroid method
- All grouping strategies now use unified CreateGroupingStrategy interface
- Optimize CentroidGrouping with similarity index for better performance
- Maintain architectural consistency with strategy pattern
* test: update tests for new clone type thresholds
- Update Type3 threshold tests from 0.70 to 0.80
- Update Type4 threshold tests from 0.60 to 0.75
- Adjust test expectations for ClassifyCloneType:
- 0.75 similarity now maps to Type4 (was Type3)
- 0.65 similarity is no longer a valid clone (was Type4)
- Update IsSignificantClone test for 0.60 threshold (now below minimum)
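The adjusted expectations can be sketched with a reduced classifier that covers only the two thresholds changed here. This is an illustration under assumptions: the real ClassifyCloneType also distinguishes Type1 and Type2, which are omitted, and the inclusive `>=` comparison is inferred from 0.75 mapping to Type4.

```go
package main

import "fmt"

// Thresholds after this change (previously 0.70 and 0.60).
const (
	type3Threshold = 0.80
	type4Threshold = 0.75
)

// classifyCloneType returns the clone type for a similarity score, or ""
// when the score is below the minimum. Type1 and Type2 are left out of
// this sketch, so anything at or above type3Threshold reports "Type3".
func classifyCloneType(similarity float64) string {
	switch {
	case similarity >= type3Threshold:
		return "Type3"
	case similarity >= type4Threshold:
		return "Type4"
	default:
		return ""
	}
}

func main() {
	for _, s := range []float64{0.80, 0.75, 0.65} {
		fmt.Printf("%.2f -> %q\n", s, classifyCloneType(s))
	}
}
```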
* fix: resolve lint and test failures in PR #97
- Remove unused methods determineGroupCloneType and generatePairsFromGroups from clone_detector.go
- Update test expectations to match new clone thresholds (Type-3: 0.80, Type-4: 0.75)
- Fix test cases in clone_detector_test.go for threshold value tests
These changes ensure all CI checks pass for the improved clone detection grouping feature.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: resolve memory leak and performance issues in centroid grouping
- Replace pointer-based map keys with string keys to fix memory leak
- Add group size limit (50) for performance optimization on large codebases
- Use makePairKey function with fragmentID for consistent string-based keys
- Prevent potential GC issues with *CodeFragment keys in similarity index
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: resolve configuration consistency issues in clone type thresholds
This commit addresses the critical configuration management inconsistencies
identified in PR97 where clone type thresholds had different values across
multiple locations, causing unpredictable behavior.
Changes:
- Fix init.go template values to match constants (0.80→0.80, 0.75→0.75)
- Remove hardcoded thresholds from centroid_grouping.go classification logic
- Add dynamic threshold configuration to GroupingConfig and CentroidGrouping
- Ensure clone detector passes thresholds to all grouping strategies
- Add comprehensive tests for threshold consistency and configuration flow
Impact:
- User-configured thresholds now properly affect clone classification
- Consistent behavior across CLI flags, config files, and defaults
- Eliminates Type3/Type4 threshold conflicts (0.80 vs 0.75, 0.75 vs 0.65)
- Predictable clone detection results matching user expectations
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* style: fix indentation and formatting issues in analyze command and related files
This commit applies consistent code formatting across the analyze command
infrastructure, addressing indentation inconsistencies introduced during
recent refactoring work.
Changes:
- Fix mixed spaces/tabs indentation in cmd/pyscn/analyze.go
- Standardize struct field alignment in domain/analyze.go
- Correct formatting in service layer files for analyze functionality
- Apply consistent code style throughout system analysis components
Technical details:
- Convert spaces to tabs for Go struct field alignment
- Maintain consistent indentation patterns across all modified files
- No functional changes, purely cosmetic formatting improvements
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: add missing fragmentID and makePairKey methods to CentroidGrouping
This commit resolves undefined function references in centroid_grouping.go
by adding the missing helper methods as instance methods of CentroidGrouping.
Changes:
- Add fragmentID() method to generate stable fragment identifiers
- Add makePairKey() method for creating fragment pair keys
- Convert function calls to method calls (c.makePairKey, c.fragmentID)
- Add fmt import for string formatting
Technical details:
- fragmentID creates location-based IDs: "filepath|startLine|endLine|startCol|endCol"
- makePairKey ensures deterministic ordering for fragment pairs
- Methods are now encapsulated within CentroidGrouping for better design
- Removes dependency on external fragmentID function from star_medoid_grouping.go
Impact:
- Fixes potential build errors when centroid grouping is used independently
- Improves code organization and reduces cross-file dependencies
- Maintains same functionality with better encapsulation
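A sketch of the two helpers, written as plain functions rather than CentroidGrouping methods for brevity. The ID format follows the description above; the `CodeFragment` fields and the "::" pair separator are assumptions of this example.

```go
package main

import "fmt"

// CodeFragment is a simplified stand-in holding only location data.
type CodeFragment struct {
	FilePath           string
	StartLine, EndLine int
	StartCol, EndCol   int
}

// fragmentID builds a stable, location-based identifier in the form
// "filepath|startLine|endLine|startCol|endCol".
func fragmentID(f *CodeFragment) string {
	return fmt.Sprintf("%s|%d|%d|%d|%d",
		f.FilePath, f.StartLine, f.EndLine, f.StartCol, f.EndCol)
}

// makePairKey orders the two IDs lexicographically so that (a, b) and
// (b, a) produce the same string key.
func makePairKey(a, b *CodeFragment) string {
	idA, idB := fragmentID(a), fragmentID(b)
	if idA > idB {
		idA, idB = idB, idA
	}
	return idA + "::" + idB
}

func main() {
	a := &CodeFragment{FilePath: "a.py", StartLine: 1, EndLine: 10, StartCol: 0, EndCol: 4}
	b := &CodeFragment{FilePath: "b.py", StartLine: 5, EndLine: 20, StartCol: 0, EndCol: 8}
	fmt.Println(makePairKey(a, b) == makePairKey(b, a))
}
```

String keys like these avoid using `*CodeFragment` pointers as map keys, which is the concern raised in the earlier memory leak fix.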
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: DaisukeYoda <daisukeyoda@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
* feat(deps): add auto-detect architecture validation
- Add autoDetectArchitecture() to automatically identify common layer patterns
- Detect standard layers: presentation, application, domain, infrastructure
- Apply default layered architecture rules when no config exists
- Add --auto-detect flag (default: true) to deps command
- Fix module pattern matching to include submodules
- Fix context cancellation handling in architecture analysis
- Improve config merging to preserve rules while applying CLI overrides
This allows architecture validation without configuration files,
making it easier to adopt and reducing configuration complexity.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* feat(deps): make architecture validation default enabled
- Change --validate flag default from false to true
- Update help text to reflect validation is now default
- Users can disable with --no-validate if needed
- Improves code quality by default without user intervention
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix(deps): disable auto-detection by default to avoid false positives
- Set --validate flag default to false (opt-in)
- Set --auto-detect flag default to false (opt-in)
- Fix layer patterns: remove "app" from application layer
- Make auto-detected rules more permissive for real-world projects
- Update help text to reflect opt-in behavior
Auto-detection was causing too many false positives in real projects.
Users should explicitly enable validation when their project structure
matches standard patterns or when they have custom rules defined.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: remove --validate flag and make architecture validation always enabled
- Remove depsValidate flag variable
- Set AnalyzeArchitecture to always true
- Update help text to reflect validation is always performed
- Remove validate flag from CLI options
This ensures architecture validation cannot be disabled,
preventing the mistake of shipping unusable features.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix(formatter): improve Architecture tab HTML layout to match Dependencies tab
- Refactor Architecture HTML to use consistent GenerateMetricCard structure
- Add proper status badges for violations and compliance score
- Implement table-based layout for violations and layer dependencies
- Add color coding for compliance score (green/yellow/red)
- Show layer coupling and cohesion metrics in organized tables
- Limit violations display to 20 with overflow indicator
- Use consistent section headers and metric grid layout
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: add white background to tab-content for consistent appearance
- Add background: white to .tab-content CSS class
- Ensures both Dependencies and Architecture tabs have white backgrounds
- Fixes issue where Architecture tab had transparent background showing gradient
* fix: remove extra closing div tags in HTML content generation
- Remove extra </div> from writeHTMLDependenciesContent
- Remove extra </div> from writeHTMLArchitectureContent
- Fixes tab content structure to match properly within tabs container
* fix: exclude __init__.py-to-submodule dependencies and allow same-layer deps
- Skip dependencies from __init__.py to its own submodules
- This is a common Python pattern for re-exporting public APIs
- Allow same-layer dependencies in auto-detected architecture rules
- Reduces false positives from 344 to 15 violations (97.4% compliance)
* fix: improve layer detection by prioritizing specific patterns
- findLayerForModule now selects the most specific matching pattern
- Specificity determined by number of dots in pattern (app.services > app)
- Fixes misclassification of services modules as infrastructure
- Violations reduced from 344 to 13 (96% reduction)
- Compliance score: 97.7%
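The dot-count specificity rule can be sketched as below. The `layerRule` type and the prefix-based `matches` helper are hypothetical; the real findLayerForModule may match patterns differently, and only the "more dots wins" rule comes from this change.

```go
package main

import (
	"fmt"
	"strings"
)

// layerRule associates a layer name with module patterns (hypothetical type).
type layerRule struct {
	Layer    string
	Patterns []string
}

// matches treats a pattern as matching the module itself or any submodule.
// This prefix rule is an assumption of the sketch.
func matches(module, pattern string) bool {
	return module == pattern || strings.HasPrefix(module, pattern+".")
}

// findLayerForModule picks the layer whose matching pattern has the most
// dots, so "app.services" beats "app" for "app.services.user".
func findLayerForModule(module string, rules []layerRule) string {
	best, bestSpecificity := "", -1
	for _, rule := range rules {
		for _, pattern := range rule.Patterns {
			if !matches(module, pattern) {
				continue
			}
			if s := strings.Count(pattern, "."); s > bestSpecificity {
				best, bestSpecificity = rule.Layer, s
			}
		}
	}
	return best
}

func main() {
	rules := []layerRule{
		{Layer: "infrastructure", Patterns: []string{"app"}},
		{Layer: "application", Patterns: []string{"app.services"}},
	}
	fmt.Println(findLayerForModule("app.services.user", rules))
}
```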
* fix: remove unnecessary --auto-detect flag and improve pattern detection
- Remove --auto-detect flag as auto-detection happens by default when no config exists
- Add 'app' and 'application' to application layer patterns for better detection
- Update help text to reflect automatic behavior
The deps command now automatically detects architecture patterns when no rules
are defined in config files, making the --auto-detect flag redundant.
* fix: remove incorrectly added 'app' and 'application' patterns
Remove the patterns that were causing all modules starting with 'app' to be
misclassified as application layer, which broke the layer detection.
* fix(arch,ui): improve layer matching specificity and HTML cards
- Use original pattern specificity with tie-breaker for layer assignment
- Include layer names in violation summary table
- Show Violations as large metric number (not small badge)
- Unify metric-grid class; harden tab switching script
* feat(analyze-summary): add system Dependencies and Architecture metrics to Summary tab and clarify CBO labels
- Summary now shows Total Modules, Total Dependencies, Max Depth, Cycles
- Summary shows Architecture Violations, Compliance, Layers, Total Rules
- Rename CBO labels to avoid confusion with module dependencies
- Update tabs to use concise names (no parentheses)
* feat(score): include module dependencies and architecture in Health Score
- Add deps/arch metrics to AnalyzeSummary
- Revise scoring to 5 domains (Complexity, Dead Code, Clones, CBO, Deps/Arch)
- Pull System analysis metrics into summary during report generation
* fix(analyze): correct CBO goroutine block and separate system analysis goroutine
- Close missing braces and return paths in CBO analysis block
- Start system (deps+arch) goroutine independently after CBO block
- Prevent potential build failures from malformed block structure
---------
Co-authored-by: DaisukeYoda <daisukeyoda@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
- Fix error check for CSV writer.Write calls
- Replace if-HasSuffix with strings.TrimSuffix
- Remove unused append result for highComplexityModules
- Remove all unused functions across codebase
- Remove unused import in config loader
- Fix DOT format type consistency
🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
Add comprehensive dependency analysis with circular dependency detection,
coupling metrics, and unified HTML reporting across all commands.
- Add deps command for dependency analysis
- Implement Tarjan's algorithm for circular dependency detection
- Calculate Robert Martin's coupling metrics (Ca, Ce, I, A, D)
- Create unified HTML template for consistent reporting
- Fix import extraction with regex-based fallback
- Simplify command interfaces following KISS principle
- Standardize output file naming and location
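For reference, two of Robert Martin's metrics follow the standard definitions I = Ce / (Ca + Ce) and D = |A + I - 1|. The sketch below shows them under those definitions; the function names are hypothetical, and returning 0 for a module with no couplings is an assumption of this example.

```go
package main

import (
	"fmt"
	"math"
)

// instability computes I = Ce / (Ca + Ce), where Ca is afferent (incoming)
// and Ce is efferent (outgoing) coupling. A module with no couplings is
// reported as 0 here by convention.
func instability(ca, ce int) float64 {
	if ca+ce == 0 {
		return 0
	}
	return float64(ce) / float64(ca+ce)
}

// distance computes D = |A + I - 1|, the normalized distance from the
// main sequence, given abstractness A and instability I.
func distance(abstractness, inst float64) float64 {
	return math.Abs(abstractness + inst - 1)
}

func main() {
	i := instability(1, 3)
	fmt.Printf("I=%.2f D=%.2f\n", i, distance(0.25, i))
}
```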
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Remove all "like ruff" references from code comments to avoid
external dependencies in documentation. The TOML-only configuration
strategy stands on its own merits without needing to reference
other tools.
Files updated:
- service/clone_config_loader.go
- service/config_loader.go
- service/dead_code_config_loader.go
- cmd/pyscn/clone.go
- internal/config/toml_loader.go
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
This commit addresses critical bugs identified in PR review and
unifies configuration loading across all commands:
Critical Bug Fixes:
- Fix boolean merging bug where config values always override defaults
even when unset (use pointer types to detect unset vs false)
- Fix YAML syntax being written to TOML files in tests
- Fix missing CLI flag processing causing validation bypass
- Update help examples from deprecated --format to new --json syntax
- Fix test expectations for TOML section syntax
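The pointer-based fix for boolean merging can be sketched as follows: a `*bool` that is nil means "not set in the config file", so the default survives, while an explicit false still overrides. The helper names are hypothetical.

```go
package main

import "fmt"

// boolPtr is a small helper for building optional boolean values.
func boolPtr(b bool) *bool { return &b }

// mergeBool returns the override when the config file set it explicitly,
// and the default otherwise. With a plain bool, an unset field would be
// indistinguishable from false and would always clobber a true default.
func mergeBool(def bool, override *bool) bool {
	if override == nil {
		return def
	}
	return *override
}

func main() {
	fmt.Println(mergeBool(true, nil))            // unset: default is kept
	fmt.Println(mergeBool(true, boolPtr(false))) // explicit false wins
}
```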
Configuration Unification:
- Unify all commands to use TOML-only configuration loading
- Eliminate double-loading conflicts between clone/deadcode/complexity
- Remove remaining YAML/JSON config file discovery
- Ensure consistent pyproject.toml > .pyscn.toml > defaults priority
All previously failing tests now pass with proper TOML configuration
handling and unified loading strategy across the entire codebase.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Update all references from pyqol and pyscan to pyscn
- Rename directories: cmd/pyqol → cmd/pyscn, python/src/pyscan → python/src/pyscn
- Update Go module path to github.com/ludo-technologies/pyscn
- Update PyPI package name to pyscn
- Update all documentation and configuration files
- Successfully published to PyPI as pyscn
This change provides a shorter, more memorable package name (5 characters)
that is easier to type in commands like 'pyscn analyze' while maintaining
the core meaning of Python code scanning/analysis.
- Remove IncludeThirdParty field from CBOOptions struct in analyzer
- Remove IncludeThirdParty from CBOAnalysisOptions in domain
- Update DefaultCBOOptions to remove unused field initialization
- Clean up buildCBOOptions method in service layer
- Update tests to remove references to deleted field
This field was defined but never used anywhere in the codebase,
causing code bloat and potential confusion. The field can be
re-implemented later if third-party dependency filtering is needed.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Replace string parsing with structural AST analysis for Call nodes
- Add extractClassNameFromCallNode and extractClassNameFromAttribute methods
- Fix function calls being incorrectly counted as class dependencies
- Enhance walkNode to traverse Args and Value fields for nested calls
- Only count actual class instantiations (local, imported, builtins) as dependencies
- Ensure A(B()) detects both A and B dependencies correctly
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Add builtinFunctions map to CBOAnalyzer struct
- Separate built-in types (list, int, dict) from built-in functions (print, len, str)
- Always exclude built-in functions from dependencies regardless of IncludeBuiltins setting
- Built-in types only included when IncludeBuiltins=true
- Fix test failures: Logger class now CBO=0, builtin test now CBO=3
- Add isBuiltinFunction() helper method for cleaner dependency filtering
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Fixed critical issues identified in code review:
- Type annotation processing: Added support for tree-sitter specific nodes (generic_type, type_parameter) and proper handling of subscript types like List[User]
- Instantiation detection: Fixed NodeCall processing to handle assignment-based instantiation (self.logger = Logger()) by parsing Call(Name(...)) format
- Wildcard pattern matching: Implemented regex-based matching for patterns like *Test* using regexp.MatchString
- Removed unused ImportDependencies field from CBOResult and CBOMetrics structs and all related code
- Changed zero classes error policy to return valid empty response with warnings instead of error
- Fixed builtin types handling for both type annotations and instantiation contexts
All CBO analyzer tests now pass (24/24). End-to-end testing confirms proper detection of:
- Class inheritance dependencies
- Type annotation dependencies (including generics)
- Object instantiation dependencies
- Builtin type usage
- Pattern-based filtering
- Risk level assessment
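Wildcard patterns such as *Test* can be matched with regexp.MatchString roughly as sketched below. The translation shown here (escape everything, turn each `*` into `.*`, anchor both ends) is one plausible approach and an assumption; the analyzer's actual conversion may differ.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// matchWildcard reports whether name matches a glob-style pattern in which
// "*" stands for any run of characters. All other regex metacharacters in
// the pattern are escaped so they match literally.
func matchWildcard(pattern, name string) (bool, error) {
	quoted := regexp.QuoteMeta(pattern) // "*Test*" becomes `\*Test\*`
	expr := "^" + strings.ReplaceAll(quoted, `\*`, ".*") + "$"
	return regexp.MatchString(expr, name)
}

func main() {
	for _, name := range []string{"UserTestCase", "User"} {
		ok, err := matchWildcard("*Test*", name)
		fmt.Println(name, ok, err)
	}
}
```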
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Replace global viper instance with new instances in loadConfigFromFile
and SaveConfig functions to prevent race conditions during concurrent
execution of analyze command.
Fixes #62
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Add error handling to resolveOutputDirectory function
- Fix Windows path traversal with volume-aware termination
- Fix test race conditions by setting cmd.Dir in E2E tests
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
## Overview
Implements a comprehensive configuration discovery system inspired by Ruff,
eliminating environment variable dependencies and providing seamless project
integration through YAML configuration files.
## Key Features
- **Hierarchical Discovery**: Target directory → upward traversal → XDG config → home directory
- **Multiple Formats**: Support for .pyqol.yaml, .pyqol.yml, pyqol.yaml, pyqol.yml, and JSON variants
- **Output Directory Control**: Configure report file destinations via output.directory
- **Test Environment**: All tests now use temporary directories, no project file pollution
## Configuration Discovery Order
1. Target directory & parent directories (search upward to filesystem root)
2. XDG config directory ($XDG_CONFIG_HOME/pyqol/ or ~/.config/pyqol/)
3. Home directory (~/.pyqol.yaml for backward compatibility)
## Implementation Details
- Added LoadConfigWithTarget() for target-aware config discovery
- Updated all commands to use generateFileNameWithTarget()
- Modified E2E tests to use temporary config files
- Fixed unit tests that were generating files in project directories
- Enhanced utils.go with configuration-aware filename generation
## Breaking Changes
- Environment variable PYQOL_OUTPUT_DIR is no longer supported
- All file output now controlled via .pyqol.yaml configuration
## Documentation Updates
- Updated README.md with configuration examples and usage
- Enhanced DEVELOPMENT.md with configuration system documentation
- Updated ARCHITECTURE.md to reflect new configuration discovery
- Added practical examples for configuration-based workflows
## Testing
- All tests pass without generating files in project directories
- E2E tests use temporary config files for output control
- Unit tests updated to avoid project directory pollution
- Configuration discovery thoroughly tested
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Replace --format flag with individual format flags (--html, --json, --csv, --yaml)
- Add automatic file generation for non-text formats with timestamps
- Add HTML report auto-opening with --no-open flag to disable
- Add unified analyze command with comprehensive reporting
- Add browser opening utility for cross-platform support
- Update all commands (clone, complexity, deadcode) to use new format system
- Add OutputPath and NoOpen fields to domain models
- Update configuration loaders to handle new flag system
- Add HTML format support to all formatters
- Update tests to match new flag expectations
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Fix missing td matrix update in computeForestDistance else branch
- This was causing trees with same labels but different structures
to return distance 0.0 (100% similarity)
- Update test expectation for more accurate similarity calculation
- Average similarity in real projects now shows realistic 0.84 instead of 1.00
- Clone detection now reports accurate clone pair counts instead of fallback values
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Implement thread-safe FlagTracker with sync.RWMutex for concurrent access
- Replace raw map[string]bool with FlagTracker throughout service layer
- Add comprehensive tests including concurrent access testing
- Create CloneConfigurationLoaderWithFlags for consistency
- Ensure all flag tracking operations are thread-safe
This addresses the potential race conditions identified in PR review.
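A minimal sketch of a tracker guarded by sync.RWMutex is shown below. The method names `Set` and `WasSet` and the constructor are hypothetical; only the type name FlagTracker and the use of sync.RWMutex come from this change.

```go
package main

import (
	"fmt"
	"sync"
)

// FlagTracker records which CLI flags were explicitly set and is safe
// for concurrent use.
type FlagTracker struct {
	mu    sync.RWMutex
	flags map[string]bool
}

// NewFlagTracker returns an empty tracker.
func NewFlagTracker() *FlagTracker {
	return &FlagTracker{flags: make(map[string]bool)}
}

// Set marks a flag as explicitly provided. It takes the write lock.
func (t *FlagTracker) Set(name string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.flags[name] = true
}

// WasSet reports whether a flag was provided. Readers share the read lock.
func (t *FlagTracker) WasSet(name string) bool {
	t.mu.RLock()
	defer t.mu.RUnlock()
	return t.flags[name]
}

func main() {
	tracker := NewFlagTracker()
	var wg sync.WaitGroup
	for _, name := range []string{"json", "html", "csv"} {
		wg.Add(1)
		go func(n string) {
			defer wg.Done()
			tracker.Set(n)
		}(name)
	}
	wg.Wait()
	fmt.Println(tracker.WasSet("json"), tracker.WasSet("yaml"))
}
```

A raw map[string]bool written from several goroutines would be a data race; wrapping it behind the mutex keeps reads cheap while serializing writes.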