Merge pull request #18 from robertjakob/feature/executive-summary-agent
Update Executive Summary Agent with two-step reasoning process
Agent1_Peer_Review/Executive SummaryAgent.md (new file, 57 lines)
@@ -0,0 +1,57 @@
Executive Summary Agent Implementation:

The Executive Summary Agent creates a comprehensive executive review summary through a two-step reasoning process:

Inputs:
- Original PDF manuscript in /manuscripts/ (user-submitted manuscript)
- User context in /context/context.json (user priorities and focus areas; see the example below)
- Quality-controlled JSON in /results/quality_control_results.json (AI review pipeline output)
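For reference, a minimal /context/context.json consistent with the validate_context method in this commit could look like the following (the two keys match the agent code; the values are illustrative placeholders):

```json
{
  "target_publication_outlets": {
    "user_input": "npj Digital Medicine"
  },
  "review_focus_areas": {
    "user_input": "methodological rigor and reporting completeness"
  }
}
```
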
Process:

1. Independent Review Generation
   - Analyzes the manuscript without bias
   - Generates a comprehensive review including:
     * Summary of the manuscript
     * Strengths and weaknesses
     * Critical suggestions for improvement
   - Focuses on target journal requirements and user priorities

2. Balanced Summary Generation
   - Synthesizes insights from both the independent review and the quality control results
   - Creates a unified executive summary in three paragraphs:
     * First paragraph: Overview of the manuscript's content and contribution
     * Second paragraph: Balanced assessment of strengths and weaknesses
     * Third paragraph: Actionable recommendations for improvement
   - Ensures natural flow while incorporating key insights from both sources
   - Avoids mechanical listing of points
   - Maintains consistency with the detailed assessment

3. Score Calculation
   - Calculates overall review scores from the quality control results (a worked sketch follows this list):
     * Section Score: average of the S1-S10 scores
     * Rigor Score: average of the R1-R7 scores
     * Writing Score: average of the W1-W7 scores
     * Final Score: average of the three category scores
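A worked sketch of this aggregation (hypothetical per-agent scores on the 1-5 scale implied by the run script's `/5` output; the authoritative logic is calculate_scores in the agent code below):

```python
# Hypothetical per-agent scores; real values come from
# results/quality_control_results.json.
section_scores = [4, 3, 4, 4, 3, 4, 4, 4, 4, 4]  # S1-S10
rigor_scores = [3, 3, 4, 3, 3, 4, 3]             # R1-R7
writing_scores = [4, 4, 4, 5, 4, 4, 4]           # W1-W7

section = sum(section_scores) / len(section_scores)  # 3.8
rigor = sum(rigor_scores) / len(rigor_scores)        # ~3.29
writing = sum(writing_scores) / len(writing_scores)  # ~4.14
final = (section + rigor + writing) / 3              # ~3.74

print(f"Final Score: {final:.1f}/5")  # Final Score: 3.7/5
```
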
4. Output Generation
   Creates a JSON file in the results folder containing:
   - Manuscript title (extracted from content)
   - Executive summary (three-paragraph synthesis)
   - Independent review (for transparency)
   - Calculated scores (Section, Rigor, Writing, Final)
Key Features:
- Two-step reasoning process for robust analysis
- Natural balance between independent review and quality control findings
- Focus on the most significant points regardless of source
- Professional language and concise format (about half a page)
- Alignment with user priorities from the context file
- Uses GPT-4.1 for high-quality analysis
Implementation Notes:
- Does not modify existing files or pipeline components
- Maintains clear separation of concerns
- Provides transparent access to both the independent review and the final synthesis
- Ensures recommendations are actionable and specific

@@ -51,6 +51,22 @@ The Quality Control Agent serves as a final validation layer that:
- Overall quality assessment
- Uses GPT-4.1 for high-quality structured output

### Executive Summary Agent

The Executive Summary Agent provides a high-level synthesis through a two-step reasoning process:

1. Independent Review Generation
   - Analyzes the manuscript without bias
   - Generates a comprehensive review including summary, strengths/weaknesses, and suggestions
   - Focuses on target journal requirements and user priorities

2. Balanced Summary Generation
   - Synthesizes insights from both the independent review and the quality control results
   - Creates a unified executive summary in three paragraphs:
     * Overview of content and contribution
     * Balanced assessment of strengths and weaknesses
     * Actionable recommendations
   - Ensures natural flow while incorporating key insights
   - Maintains consistency with the detailed assessment

## Installation

1. Clone the repository

@@ -70,6 +86,10 @@ python run_analysis.py
```bash
python run_quality_control.py
```
4. Generate executive summary:
```bash
python run_executive_summary.py
```
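Running the last command prints the output location and the aggregated scores; a run looks roughly like this (the score values here are illustrative):

```
Executive Summary Generation Complete!
Results saved to: results/executive_summary.json

Overall Scores:
Section Score: 3.8/5
Rigor Score: 3.3/5
Writing Score: 4.1/5
Final Score: 3.7/5
```
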
## Output
@@ -78,6 +98,7 @@ The system generates JSON files in the `results/` directory containing:
- Combined results (`combined_results.json`)
- Manuscript data (`manuscript_data.json`)
- Quality control results (`quality_control_results.json`)
- Executive summary (`executive_summary.json`)

Each agent's analysis follows a consistent JSON structure:

@@ -110,6 +131,28 @@ Each agent's analysis follows a consistent JSON structure:
}
```

The executive summary follows a specific structure:
```json
{
  "manuscript_title": str,
  "executive_summary": str,  // Three-paragraph synthesis
  "independent_review": {
    "summary": str,
    "strengths_weaknesses": {
      "strengths": [str],
      "weaknesses": [str]
    },
    "critical_suggestions": [str]
  },
  "scores": {
    "section_score": float,
    "rigor_score": float,
    "writing_score": float,
    "final_score": float
  }
}
```
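For illustration, a filled-in instance could look like this (all values are hypothetical placeholders, with strings truncated for brevity):

```json
{
  "manuscript_title": "A Hypothetical Systematic Review of Digital Health Apps",
  "executive_summary": "Paragraph one (overview)... Paragraph two (assessment)... Paragraph three (recommendations)...",
  "independent_review": {
    "summary": "...",
    "strengths_weaknesses": {
      "strengths": ["Clear research question"],
      "weaknesses": ["Incomplete reporting of the search strategy"]
    },
    "critical_suggestions": ["Report the full search strings for each database"]
  },
  "scores": {
    "section_score": 3.8,
    "rigor_score": 3.3,
    "writing_score": 4.1,
    "final_score": 3.7
  }
}
```
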
## Configuration

- Environment variables are managed in `.env`

@@ -120,18 +163,19 @@ Each agent's analysis follows a consistent JSON structure:

### Project Structure

```
V6_multi_agent3/
Agent1_Peer_Review/
├── src/
│   ├── reviewer_agents/
│   │   ├── section/        # Section agents (S1-S10)
│   │   ├── rigor/          # Rigor agents (R1-R7)
│   │   ├── writing/        # Writing agents (W1-W7)
│   │   ├── quality/        # Quality control agent
│   │   ├── controller_agent.py
│   │   └── executive_summary_agent.py
│   ├── core/               # Core functionality and configuration
│   └── utils/              # Utility functions
├── manuscripts/            # Input manuscripts
├── results/                # Analysis results
├── context/                # User context and preferences
└── tests/                  # Test suite
```

@@ -177,6 +221,6 @@ For detailed guidelines on how to contribute, please see [CONTRIBUTING.md](CONTR
**Share your feedback**: Contact us at rjakob@ethz.ch with your experiences and suggestions

-**Use more powerful models**: The default implementation uses GPT-4.1-nano for accessibility, but you can configure the system to use more sophisticated models with your own API keys.
+**Use more powerful models**: The default implementation uses GPT-4.1 for accessibility, but you can configure the system to use more sophisticated models with your own API keys.

Together, we can build the best review agent team and improve the quality of scientific publishing!

Agent1_Peer_Review/run_executive_summary.py (new file, 46 lines)
@@ -0,0 +1,46 @@
#!/usr/bin/env python3
"""
Script to run the Executive Summary Agent and generate a high-level summary of the review results.
"""

import os
import json

from src.reviewer_agents.executive_summary_agent import ExecutiveSummaryAgent


def main():
    # Initialize the Executive Summary Agent
    agent = ExecutiveSummaryAgent()

    # Define input paths
    inputs = {
        'manuscript_path': 'manuscripts/Systematic Review.pdf',
        'context_path': 'context/context.json',
        'quality_control_results_path': 'results/quality_control_results.json'
    }

    # Define output path
    output_path = 'results/executive_summary.json'

    try:
        # Process the inputs and generate the executive summary
        results = agent.process(inputs)

        # Save the results
        agent.save_results(results, output_path)

        print("\nExecutive Summary Generation Complete!")
        print(f"Results saved to: {output_path}")

        # Print the scores
        print("\nOverall Scores:")
        print(f"Section Score: {results['scores']['section_score']:.1f}/5")
        print(f"Rigor Score: {results['scores']['rigor_score']:.1f}/5")
        print(f"Writing Score: {results['scores']['writing_score']:.1f}/5")
        print(f"Final Score: {results['scores']['final_score']:.1f}/5")

    except Exception as e:
        print(f"Error generating executive summary: {str(e)}")
        raise


if __name__ == "__main__":
    main()

Agent1_Peer_Review/src/reviewer_agents/executive_summary_agent.py (new file, 254 lines)
@@ -0,0 +1,254 @@
import json
import os
from typing import Dict, Any

import PyPDF2

from ..core.base_agent import BaseReviewerAgent


class ExecutiveSummaryAgent(BaseReviewerAgent):
    """
    Executive Summary Agent that generates a high-level summary of the review results
    and calculates overall scores based on the quality control results.
    """

    def __init__(self, model: str = "gpt-4.1"):
        super().__init__(model)
        self.required_inputs = {
            'manuscript_path': str,
            'context_path': str,
            'quality_control_results_path': str
        }

    def validate_inputs(self, inputs: Dict[str, Any]) -> bool:
        """Validate that all required input files exist and are accessible."""
        for key, path in inputs.items():
            if not os.path.exists(path):
                raise FileNotFoundError(f"Required input file not found: {path}")
        return True

    def load_json_file(self, file_path: str) -> Dict:
        """Load and parse a JSON file."""
        with open(file_path, 'r', encoding='utf-8') as f:
            return json.load(f)

    def extract_pdf_text(self, pdf_path: str) -> str:
        """Extract text from PDF file."""
        text = ""
        with open(pdf_path, 'rb') as file:
            pdf_reader = PyPDF2.PdfReader(file)
            for page in pdf_reader.pages:
                text += page.extract_text() + "\n"
        return text

    def extract_title(self, pdf_path: str) -> str:
        """Extract title from the first page of the PDF."""
        with open(pdf_path, 'rb') as file:
            pdf_reader = PyPDF2.PdfReader(file)
            first_page = pdf_reader.pages[0]
            text = first_page.extract_text()
        # Assuming the title is in the first few lines
        lines = text.split('\n')
        for line in lines[:5]:  # Check the first 5 lines
            if line.strip() and len(line.strip()) > 10:  # Basic title validation
                return line.strip()
        return "Title not found"

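    # Illustrative behavior of extract_title (hypothetical first page, not part
    # of this commit): given first-page lines ["", "A Systematic Review of
    # Digital Health Apps", "Jane Doe"], the loop returns the first stripped
    # line longer than 10 characters: "A Systematic Review of Digital Health Apps".
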
    def calculate_scores(self, quality_control_results: Dict) -> Dict[str, float]:
        """Calculate overall scores from quality control results."""
        scores = {
            'section_score': 0.0,
            'rigor_score': 0.0,
            'writing_score': 0.0,
            'final_score': 0.0
        }

        # Calculate section score (S1-S10)
        section_scores = []
        for i in range(1, 11):
            section_key = f'S{i}'
            if section_key in quality_control_results.get('section_results', {}):
                section_scores.append(quality_control_results['section_results'][section_key]['score'])
        if section_scores:
            scores['section_score'] = sum(section_scores) / len(section_scores)

        # Calculate rigor score (R1-R7)
        rigor_scores = []
        for i in range(1, 8):
            rigor_key = f'R{i}'
            if rigor_key in quality_control_results.get('rigor_results', {}):
                rigor_scores.append(quality_control_results['rigor_results'][rigor_key]['score'])
        if rigor_scores:
            scores['rigor_score'] = sum(rigor_scores) / len(rigor_scores)

        # Calculate writing score (W1-W7)
        writing_scores = []
        for i in range(1, 8):
            writing_key = f'W{i}'
            if writing_key in quality_control_results.get('writing_results', {}):
                writing_scores.append(quality_control_results['writing_results'][writing_key]['score'])
        if writing_scores:
            scores['writing_score'] = sum(writing_scores) / len(writing_scores)

        # Calculate final score (average of the three category scores)
        category_scores = [scores['section_score'], scores['rigor_score'], scores['writing_score']]
        if category_scores:
            scores['final_score'] = sum(category_scores) / len(category_scores)

        return scores

    def validate_context(self, context: Dict) -> Dict:
        """Validate and sanitize context data, providing defaults for missing or invalid values."""
        # Initialize default values
        sanitized_context = {
            'target_publication_outlets': {
                'user_input': 'the target journal'
            },
            'review_focus_areas': {
                'user_input': 'general aspects'
            }
        }

        # Validate target publication outlets
        if isinstance(context.get('target_publication_outlets'), dict):
            user_input = context['target_publication_outlets'].get('user_input')
            if isinstance(user_input, str) and user_input.strip():
                sanitized_context['target_publication_outlets']['user_input'] = user_input.strip()

        # Validate review focus areas
        if isinstance(context.get('review_focus_areas'), dict):
            user_input = context['review_focus_areas'].get('user_input')
            if isinstance(user_input, str) and user_input.strip():
                sanitized_context['review_focus_areas']['user_input'] = user_input.strip()

        return sanitized_context

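    # Illustrative behavior of validate_context (hypothetical input, not part
    # of this commit): a context dict that omits "review_focus_areas" keeps the
    # user's journal but falls back to the default focus:
    #   {"target_publication_outlets": {"user_input": "npj Digital Medicine"}}
    #   -> {"target_publication_outlets": {"user_input": "npj Digital Medicine"},
    #       "review_focus_areas": {"user_input": "general aspects"}}
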
    def generate_independent_review(self, manuscript_text: str, context: Dict) -> str:
        """Generate an independent high-level review of the manuscript using GPT-4.1."""
        # Sanitize context
        sanitized_context = self.validate_context(context)
        target_journal = sanitized_context['target_publication_outlets']['user_input']
        focus_areas = sanitized_context['review_focus_areas']['user_input']

        prompt = f"""You are an expert reviewer for {target_journal}. Read the following manuscript content and user priorities, then independently write a high-level review in three paragraphs:

Manuscript Content:
{manuscript_text[:6000]}

User Priorities:
- Target Journal: {target_journal}
- Focus Areas: {focus_areas}

Write:
1. A summary of what the manuscript is about
2. The main strengths and weaknesses, with special attention to {focus_areas}
3. The most critical suggestions for improvement, considering {target_journal} standards

Be concise, professional, and focus on the most important points. Do not reference any other reviews or JSON files yet."""
        response = self.llm(prompt)
        return response.strip()

    def generate_balanced_summary(self, independent_review: str, quality_control_results: Dict, context: Dict) -> str:
        """Balance the agent's own review with the quality-controlled review JSON."""
        # Sanitize context
        sanitized_context = self.validate_context(context)
        target_journal = sanitized_context['target_publication_outlets']['user_input']
        focus_areas = sanitized_context['review_focus_areas']['user_input']

        prompt = f"""You are an Executive Summary Agent for {target_journal}. You have two sources:
1. Your own independent review of the manuscript (below)
2. The quality-controlled review JSON (below)

First, extract the manuscript's title from the content. Then, write a unified executive summary in three paragraphs that:
- Provides a clear, concise overview of the manuscript
- Presents a balanced assessment of strengths and weaknesses
- Offers specific, actionable recommendations for improvement

IMPORTANT: While the quality-controlled review JSON provides valuable insights, your executive summary should:
- Draw naturally from both your independent review and the quality control findings
- Focus on the most significant and impactful points, regardless of source
- Present a cohesive narrative that flows naturally
- Avoid mechanically listing points from either source

Your Own Review:
{independent_review}

User Priorities:
- Target Journal: {target_journal}
- Focus Areas: {focus_areas}

Quality-Controlled Review (JSON):
{json.dumps(quality_control_results, indent=2)}

First, extract the manuscript's title. Then write a cohesive executive summary that:
1. Summarizes the manuscript's content and contribution, highlighting its key insights and significance
2. Evaluates its strengths and weaknesses, with special attention to {focus_areas}
3. Provides clear, actionable recommendations for improvement

Format your response as a JSON object with two fields:
1. "title": The extracted manuscript title
2. "executive_summary": The three-paragraph summary

Keep the summary within half a page (about 250 words), use professional language, and be specific and constructive. Write as a single, unified document that flows naturally while incorporating insights from both sources."""
        response = self.llm(prompt)
        return response.strip()

    def process(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
        """
        Main processing method that:
        1. Validates inputs
        2. Extracts necessary information
        3. Generates an independent review
        4. Synthesizes a balanced executive summary
        5. Calculates scores
        6. Produces final output
        """
        # Validate inputs
        self.validate_inputs(inputs)

        try:
            # Load input data
            context = self.load_json_file(inputs['context_path'])
        except (json.JSONDecodeError, FileNotFoundError, PermissionError) as e:
            print(f"Warning: Could not load context file: {str(e)}. Using default values.")
            context = {}

        try:
            quality_control_results = self.load_json_file(inputs['quality_control_results_path'])
        except (json.JSONDecodeError, FileNotFoundError, PermissionError) as e:
            raise RuntimeError(f"Failed to load quality control results: {str(e)}")

        # Extract manuscript text
        manuscript_text = self.extract_pdf_text(inputs['manuscript_path'])

        # Step 1: Generate independent review
        independent_review = self.generate_independent_review(manuscript_text, context)

        # Step 2: Synthesize balanced executive summary and extract title
        summary_response = self.generate_balanced_summary(independent_review, quality_control_results, context)
        try:
            summary_data = json.loads(summary_response)
            title = summary_data.get('title', 'Title not found')
            summary = summary_data.get('executive_summary', '')
        except json.JSONDecodeError:
            print("Warning: Could not parse summary response as JSON. Using raw response.")
            title = 'Title not found'
            summary = summary_response

        # Calculate scores
        scores = self.calculate_scores(quality_control_results)

        # Prepare output
        output = {
            'manuscript_title': title,
            'executive_summary': summary,
            'independent_review': independent_review,
            'scores': scores
        }

        return output

    def save_results(self, results: Dict[str, Any], output_path: str) -> None:
        """Save the results to a JSON file."""
        os.makedirs(os.path.dirname(output_path), exist_ok=True)
        with open(output_path, 'w', encoding='utf-8') as f:
            json.dump(results, f, indent=2)
        print(f"Executive summary results saved to {output_path}")
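
Note: BaseReviewerAgent (src/core/base_agent.py) is not part of this diff. A minimal sketch consistent with how it is used above, a constructor taking a model name and an llm(prompt) method returning the completion text, might look like the following; the real class may differ, and the OpenAI client usage here is an assumption based on the README's .env/API-key configuration:

```python
# Minimal sketch of the assumed base-class interface; NOT the repository's
# actual implementation. Assumes the OpenAI Python SDK with OPENAI_API_KEY
# set in the environment.
from openai import OpenAI


class BaseReviewerAgent:
    def __init__(self, model: str):
        self.model = model
        self._client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def llm(self, prompt: str) -> str:
        # Single-turn chat completion; subclasses pass a fully formed prompt.
        response = self._client.chat.completions.create(
            model=self.model,
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content
```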