mirror of https://github.com/ludo-technologies/pyscn.git synced 2025-10-06 00:59:45 +03:00

Development Guide

This guide covers what you need to know to contribute to pyscn.

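Getting Started
Project Structure
Development Workflow
Configuration System
Building and Testing
Code Style
Git Workflow
Continuous Integration
Performance Considerations
Debugging
Release Process
Getting Help
Quick Reference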

Getting Started

Prerequisites

Go 1.22+ (recommended: 1.24)
Git
GitHub CLI (gh)
Make (optional but recommended)

Initial Setup

# Clone the repository
git clone https://github.com/ludo-technologies/pyscn.git
cd pyscn

# Install dependencies
go mod download

# Run tests to verify setup
go test ./...

# Build the binary
go build ./cmd/pyscn

Project Structure

pyscn/
├── cmd/
│   └── pyscn/         # CLI entry point
│       └── main.go    # Main function
├── internal/          # Private packages
│   ├── parser/        # Tree-sitter integration
│   │   ├── python.go  # Python-specific parsing
│   │   └── ast.go     # AST definitions
│   ├── analyzer/      # Analysis algorithms
│   │   ├── cfg.go     # Control Flow Graph
│   │   ├── dead.go    # Dead code detection
│   │   └── apted.go   # Clone detection
│   └── config/        # Configuration
├── pkg/               # Public packages
│   └── api/           # Public API
├── testdata/          # Test fixtures
├── docs/              # Documentation
└── scripts/           # Utility scripts

Development Workflow

1. Create a Feature Branch

Follow our branching strategy (see BRANCHING.md):

# Create branch from main
git checkout main
git pull origin main
git checkout -b feature/tree-sitter-integration

# Branch naming patterns:
# feature/{description}  # New features  
# fix/{description}      # Bug fixes
# docs/{description}     # Documentation
# refactor/{description} # Code improvements  
# chore/{description}    # Maintenance

2. Implement the Feature

Follow the implementation checklist:

Write tests first (TDD approach)
Implement the feature
Ensure all tests pass
Add documentation
Run linters

3. Submit Pull Request

# Push your branch
git push origin feature/tree-sitter-integration

# Create PR via GitHub CLI
gh pr create --title "feat: Add tree-sitter integration" \
  --body "Brief description of the changes and motivation"

Configuration System

Configuration Files

pyscn uses a TOML-only configuration system similar to Ruff. Within a directory, configuration files are checked in the following priority order:

.pyscn.toml (dedicated config file; takes precedence)
pyproject.toml with a [tool.pyscn] section (fallback)

If neither file is found, the search continues upward through parent directories until it reaches the filesystem root.

When both .pyscn.toml and pyproject.toml exist in the same directory, .pyscn.toml is used and pyproject.toml is ignored.

Configuration File Names

Supported configuration file names (in priority order):

.pyscn.toml (dedicated config file)
pyproject.toml (with [tool.pyscn] section)

Configuration Example

# .pyscn.toml or [tool.pyscn] section in pyproject.toml
[output]
directory = "reports"  # Output directory for generated reports

[complexity]
low_threshold = 9
medium_threshold = 19

[clones]
similarity_threshold = 0.8
min_lines = 5
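To make the two complexity thresholds concrete, here is a small Go sketch of how such values might be applied. The `ComplexityConfig` type and `riskLevel` method are hypothetical, and the boundary semantics (values up to low_threshold count as low, values up to medium_threshold count as medium, anything above is high) are an assumption; check the analyzer source for the actual rules.

```go
package main

import "fmt"

// ComplexityConfig mirrors the [complexity] table from the example above.
// Hypothetical type for illustration only.
type ComplexityConfig struct {
	LowThreshold    int
	MediumThreshold int
}

// riskLevel buckets a complexity score, assuming inclusive upper bounds.
func (c ComplexityConfig) riskLevel(complexity int) string {
	switch {
	case complexity <= c.LowThreshold:
		return "low"
	case complexity <= c.MediumThreshold:
		return "medium"
	default:
		return "high"
	}
}

func main() {
	cfg := ComplexityConfig{LowThreshold: 9, MediumThreshold: 19}
	for _, n := range []int{3, 9, 10, 19, 20} {
		fmt.Printf("complexity %d -> %s\n", n, cfg.riskLevel(n))
	}
}
```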

Using Configuration in Tests

For E2E and integration tests, create temporary configuration files:

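// Create a config file that redirects report output into the test's temp directory
configFile := filepath.Join(testDir, ".pyscn.toml")
// %q quotes and escapes the path (e.g. backslashes on Windows) so the TOML string stays valid
configContent := fmt.Sprintf("[output]\ndirectory = %q\n", outputDir)
if err := os.WriteFile(configFile, []byte(configContent), 0o644); err != nil {
    t.Fatalf("failed to write config file: %v", err)
}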

This ensures test-generated files are placed in temporary directories, not in the project directory.

Building and Testing

Build Commands

# Build for current platform
go build -o pyscn ./cmd/pyscn

# Cross-compile for other platforms (amd64 targets shown)
GOOS=linux GOARCH=amd64 go build -o pyscn-linux-amd64 ./cmd/pyscn
GOOS=darwin GOARCH=amd64 go build -o pyscn-darwin-amd64 ./cmd/pyscn
GOOS=windows GOARCH=amd64 go build -o pyscn-windows-amd64.exe ./cmd/pyscn

# Build with version info
go build -ldflags "-X main.version=v0.1.0" ./cmd/pyscn
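The -X linker flag only takes effect if package main declares a matching string variable. A minimal sketch of what that could look like is shown below; the variable name `version` follows from the `main.version` path in the command above, while the "dev" default and the `versionString` helper are assumptions for illustration.

```go
package main

import "fmt"

// version is overridden at build time by:
//
//	go build -ldflags "-X main.version=v0.1.0" ./cmd/pyscn
//
// It must be a package-level string variable (not a constant) for -X to apply.
// The "dev" default is an assumption used when no flag is passed.
var version = "dev"

// versionString formats the version for display (hypothetical helper).
func versionString() string {
	return "pyscn " + version
}

func main() {
	fmt.Println(versionString())
}
```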

Testing

# Run all tests
go test ./...

# Run tests with coverage
go test -cover ./...

# Run tests with race detection
go test -race ./...

# Run specific package tests
go test ./internal/parser

# Run tests with verbose output
go test -v ./...

# Generate coverage report
go test -coverprofile=coverage.out ./...
go tool cover -html=coverage.out

Benchmarking

# Run benchmarks
go test -bench=. ./...

# Run specific benchmark
go test -bench=BenchmarkCFG ./internal/analyzer

# Benchmark with memory profiling
go test -bench=. -benchmem ./...
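For reference, a benchmark picked up by the commands above has the shape sketched below. `countLines` is a stand-in workload, not a pyscn function; in a real package the benchmark would live in a `_test.go` file, and `testing.Benchmark` is used here only so the sketch runs as a standalone program.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// countLines is a placeholder workload standing in for real parsing code.
func countLines(src string) int {
	return strings.Count(src, "\n")
}

// BenchmarkCountLines has the signature `go test -bench` looks for.
func BenchmarkCountLines(b *testing.B) {
	src := strings.Repeat("x = 1\n", 1000)
	b.ReportAllocs() // same effect as passing -benchmem for this benchmark
	b.ResetTimer()   // exclude the setup above from the measurement
	for i := 0; i < b.N; i++ {
		if countLines(src) != 1000 {
			b.Fatal("unexpected line count")
		}
	}
}

func main() {
	// testing.Benchmark runs a single benchmark function outside `go test`.
	result := testing.Benchmark(BenchmarkCountLines)
	fmt.Println(result.String(), result.MemString())
}
```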

Code Style

Go Conventions

Follow Effective Go
Use gofmt for formatting
Use meaningful variable names
Keep functions small and focused
Document exported functions

Code Quality Tools

# Format code
go fmt ./...

# Vet code for issues
go vet ./...

# Run golangci-lint (if installed)
golangci-lint run

# Check for security issues
gosec ./...

Testing Standards

Write table-driven tests
Use meaningful test names
Test edge cases
Aim for >80% code coverage
Mock external dependencies

Example test structure:

func TestFunctionName(t *testing.T) {
    tests := []struct {
        name    string
        input   InputType
        want    OutputType
        wantErr bool
    }{
        {
            name:  "valid input",
            input: InputType{...},
            want:  OutputType{...},
        },
        {
            name:    "invalid input",
            input:   InputType{...},
            wantErr: true,
        },
    }

    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) {
            got, err := FunctionName(tt.input)
            if (err != nil) != tt.wantErr {
                t.Errorf("FunctionName() error = %v, wantErr %v", err, tt.wantErr)
                return
            }
            if !reflect.DeepEqual(got, tt.want) {
                t.Errorf("FunctionName() = %v, want %v", got, tt.want)
            }
        })
    }
}

Git Workflow

Commit Messages

Follow the Conventional Commits specification (detailed in BRANCHING.md):

<type>(<scope>): <subject>

<body>

<footer>

Types:

feat: New feature
fix: Bug fix
refactor: Code refactoring
test: Testing changes
docs: Documentation
perf: Performance improvements
ci: CI/CD changes
chore: Maintenance tasks

Examples:

git commit -m "feat(parser): add tree-sitter Python integration"
git commit -m "fix(cfg): handle break statements in loops"
git commit -m "test(analyzer): add benchmarks for APTED algorithm"
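A subject line in this format can be checked mechanically. The following Go sketch validates the first line of a commit message against the types listed above; `isValidSubject` is a hypothetical helper, and the allowed scope characters (lowercase letters, digits, hyphens) are an assumption rather than a rule from BRANCHING.md.

```go
package main

import (
	"fmt"
	"regexp"
)

// commitSubject matches "<type>(<scope>): <subject>" with an optional scope.
// The type list comes from this guide; the scope character set is assumed.
var commitSubject = regexp.MustCompile(
	`^(feat|fix|refactor|test|docs|perf|ci|chore)(\([a-z0-9-]+\))?: .+$`)

// isValidSubject reports whether the first line of a commit message conforms.
func isValidSubject(subject string) bool {
	return commitSubject.MatchString(subject)
}

func main() {
	for _, s := range []string{
		"feat(parser): add tree-sitter Python integration",
		"fix: handle break statements in loops",
		"added some stuff",
	} {
		fmt.Printf("%-50q valid=%t\n", s, isValidSubject(s))
	}
}
```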

Pull Request Process

Create focused PRs: One feature/fix per PR
Write descriptive PR titles: Include issue number
Fill out PR template: Describe changes and testing
Request reviews: Tag relevant maintainers
Address feedback: Respond to all comments
Keep PRs updated: Rebase on main if needed

Code Review Guidelines

When reviewing PRs:

Check test coverage
Verify documentation updates
Run code locally
Provide constructive feedback
Approve when satisfied

Continuous Integration

GitHub Actions runs the following checks on every PR:

Go 1.22 and 1.23 compatibility
Unit tests
Race condition detection
Code coverage reporting
Linting (go vet)
Build verification

Performance Considerations

Benchmarking Targets

Parsing: >100,000 lines/second
CFG Construction: >10,000 lines/second
APTED Comparison: <1 second for 1000-node trees
Memory Usage: <10x file size

Profiling

# CPU profiling (profile flags require a single package)
go test -cpuprofile=cpu.prof -bench=. ./internal/analyzer
go tool pprof cpu.prof

# Memory profiling
go test -memprofile=mem.prof -bench=. ./internal/analyzer
go tool pprof mem.prof

# Explore the CPU profile in the web UI, including the flame graph view
go tool pprof -http=:8080 cpu.prof

Debugging

Debug Build

# Build with debug symbols
go build -gcflags="all=-N -l" ./cmd/pyscn

# Run with debug logging
PYSCN_DEBUG=1 ./pyscn analyze test.py

# Use delve debugger
dlv debug ./cmd/pyscn -- analyze test.py

Logging

Use structured logging for debugging:

import "log/slog"

slog.Debug("parsing file", 
    "file", filename,
    "size", fileSize,
    "duration", duration)

Release Process

Releases are automated via GitHub Actions when a tag is pushed:

# Create and push a tag
git tag -a v0.1.0 -m "Release v0.1.0"
git push origin v0.1.0

# This triggers:
# 1. Build for all platforms
# 2. Run full test suite
# 3. Create GitHub release
# 4. Upload binaries

Getting Help

Issues: GitHub Issues
Discussions: GitHub Discussions
Documentation: This guide and /docs directory

Quick Reference

# Development cycle
./scripts/tasks.sh list           # View tasks
git checkout -b feature/issue-N   # Start feature
go test ./...                      # Test changes
git commit -m "feat: ..."          # Commit
gh pr create                       # Create PR
./scripts/tasks.sh done N          # Close issue

Happy coding! 🚀
