Merge pull request #18 from robertjakob/feature/executive-summary-agent

Update Executive Summary Agent with two-step reasoning process
This commit is contained in:
Robert Jakob
2025-05-10 18:05:38 +02:00
committed by GitHub
4 changed files with 404 additions and 3 deletions


@@ -0,0 +1,57 @@
Executive Summary Agent Implementation:
The Executive Summary Agent creates a comprehensive executive review summary through a two-step reasoning process, followed by deterministic score calculation and output generation:

Inputs:
- Original PDF manuscript in /manuscripts/ (user-submitted manuscript)
- User context in /context/context.json (user priorities and focus areas)
- Quality-controlled JSON in /results/quality_control_results.json (AI review pipeline output)

Process:
1. Independent Review Generation
   - Analyzes the manuscript on its own, without reference to other agents' outputs
   - Generates a comprehensive review including:
     * Summary of the manuscript
     * Strengths and weaknesses
     * Critical suggestions for improvement
   - Focuses on target journal requirements and user priorities
2. Balanced Summary Generation
   - Synthesizes insights from both the independent review and the quality control results
   - Creates a unified executive summary in three paragraphs:
     * First paragraph: overview of the manuscript's content and contribution
     * Second paragraph: balanced assessment of strengths and weaknesses
     * Third paragraph: actionable recommendations for improvement
   - Ensures natural flow while incorporating key insights from both sources
   - Avoids mechanical listing of points
   - Maintains consistency with the detailed assessment
3. Score Calculation
   - Calculates overall review scores from the quality control results:
     * Section Score: average of the S1-S10 scores
     * Rigor Score: average of the R1-R7 scores
     * Writing Score: average of the W1-W7 scores
     * Final Score: average of the three category scores
4. Output Generation
   - Creates a JSON file in the results folder containing:
     * Manuscript title (extracted from the content)
     * Executive summary (three-paragraph synthesis)
     * Independent review (for transparency)
     * Calculated scores (Section, Rigor, Writing, Final)

Key Features:
- Two-step reasoning process for robust analysis
- Natural balance between the independent review and the quality control findings
- Focus on the most significant points regardless of source
- Professional language and concise format (about half a page, roughly 250 words)
- Alignment with user priorities from the context file
- Uses GPT-4.1 for high-quality analysis

Implementation Notes:
- Does not modify existing files or pipeline components
- Maintains clear separation of concerns
- Provides transparent access to both the independent review and the final synthesis
- Ensures recommendations are actionable and specific


@@ -51,6 +51,22 @@ The Quality Control Agent serves as a final validation layer that:
- Overall quality assessment
- Uses GPT-4.1 for high-quality structured output
### Executive Summary Agent
The Executive Summary Agent provides a high-level synthesis through a two-step reasoning process:

1. Independent Review Generation
   - Analyzes the manuscript on its own, without reference to other agents' outputs
   - Generates a comprehensive review including summary, strengths/weaknesses, and suggestions
   - Focuses on target journal requirements and user priorities
2. Balanced Summary Generation
   - Synthesizes insights from both the independent review and the quality control results
   - Creates a unified executive summary in three paragraphs:
     * Overview of content and contribution
     * Balanced assessment of strengths and weaknesses
     * Actionable recommendations
   - Ensures natural flow while incorporating key insights
   - Maintains consistency with the detailed assessment
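The agent reads two fields from `context/context.json`, each carrying a `user_input` string (this shape follows the agent's `validate_context` method; the values below are purely illustrative):

```json
{
  "target_publication_outlets": {
    "user_input": "Journal of Medical Internet Research"
  },
  "review_focus_areas": {
    "user_input": "methodological rigor and reproducibility"
  }
}
```

Missing or malformed fields fall back to generic defaults, so the context file is optional but recommended.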
## Installation
1. Clone the repository
@@ -70,6 +86,10 @@ python run_analysis.py
```bash
python run_quality_control.py
```
4. Generate executive summary:
```bash
python run_executive_summary.py
```
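On success, the script reports where the summary was written and prints the calculated scores. The output looks roughly like this (scores illustrative):

```
Executive Summary Generation Complete!
Results saved to: results/executive_summary.json

Overall Scores:
Section Score: 4.2/5
Rigor Score: 3.8/5
Writing Score: 4.0/5
Final Score: 4.0/5
```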
## Output
@@ -78,6 +98,7 @@ The system generates JSON files in the `results/` directory containing:
- Combined results (`combined_results.json`)
- Manuscript data (`manuscript_data.json`)
- Quality control results (`quality_control_results.json`)
- Executive summary (`executive_summary.json`)
Each agent's analysis follows a consistent JSON structure:
@@ -110,6 +131,28 @@ Each agent's analysis follows a consistent JSON structure:
}
```
The executive summary follows a specific structure:
```json
{
"manuscript_title": str,
"executive_summary": str, // Three-paragraph synthesis
"independent_review": {
"summary": str,
"strengths_weaknesses": {
"strengths": [str],
"weaknesses": [str]
},
"critical_suggestions": [str]
},
"scores": {
"section_score": float,
"rigor_score": float,
"writing_score": float,
"final_score": float
}
}
```
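All scores are averages on the pipeline's 5-point scale. For example (illustrative numbers), category averages of 4.2 (section), 3.8 (rigor), and 4.0 (writing) yield a final score of (4.2 + 3.8 + 4.0) / 3 = 4.0.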
## Configuration
- Environment variables are managed in `.env`
@@ -120,18 +163,19 @@ Each agent's analysis follows a consistent JSON structure:
### Project Structure
```
Agent1_Peer_Review/
├── src/
│ ├── reviewer_agents/
│ │ ├── section/ # Section agents (S1-S10)
│ │ ├── rigor/ # Rigor agents (R1-R7)
│ │ ├── writing/ # Writing agents (W1-W7)
│ │ ├── quality/ # Quality control agent
│ │ ├── controller_agent.py
│ │ └── executive_summary_agent.py
│ ├── core/ # Core functionality and configuration
│ └── utils/ # Utility functions
├── manuscripts/ # Input manuscripts
├── results/ # Analysis results
├── context/ # User context and preferences
└── tests/ # Test suite
```
@@ -177,6 +221,6 @@ For detailed guidelines on how to contribute, please see [CONTRIBUTING.md](CONTRIBUTING.md)
**Share your feedback**: Contact us at rjakob@ethz.ch with your experiences and suggestions
**Use more powerful models**: The default implementation uses GPT-4.1 for accessibility, but you can configure the system to use more sophisticated models with your own API keys.
Together, we can build the best review agent team and improve the quality of scientific publishing!


@@ -0,0 +1,46 @@
#!/usr/bin/env python3
"""
Script to run the Executive Summary Agent and generate a high-level summary of the review results.
"""

from src.reviewer_agents.executive_summary_agent import ExecutiveSummaryAgent


def main():
    # Initialize the Executive Summary Agent
    agent = ExecutiveSummaryAgent()

    # Define input paths
    inputs = {
        'manuscript_path': 'manuscripts/Systematic Review.pdf',
        'context_path': 'context/context.json',
        'quality_control_results_path': 'results/quality_control_results.json'
    }

    # Define output path
    output_path = 'results/executive_summary.json'

    try:
        # Process the inputs and generate the executive summary
        results = agent.process(inputs)

        # Save the results
        agent.save_results(results, output_path)
        print("\nExecutive Summary Generation Complete!")
        print(f"Results saved to: {output_path}")

        # Print the scores
        print("\nOverall Scores:")
        print(f"Section Score: {results['scores']['section_score']:.1f}/5")
        print(f"Rigor Score: {results['scores']['rigor_score']:.1f}/5")
        print(f"Writing Score: {results['scores']['writing_score']:.1f}/5")
        print(f"Final Score: {results['scores']['final_score']:.1f}/5")
    except Exception as e:
        print(f"Error generating executive summary: {str(e)}")
        raise


if __name__ == "__main__":
    main()


@@ -0,0 +1,254 @@
import json
import os
from typing import Dict, Any

import PyPDF2

from ..core.base_agent import BaseReviewerAgent


class ExecutiveSummaryAgent(BaseReviewerAgent):
    """
    Executive Summary Agent that generates a high-level summary of the review results
    and calculates overall scores based on the quality control results.
    """

    def __init__(self, model: str = "gpt-4.1"):
        super().__init__(model)
        self.required_inputs = {
            'manuscript_path': str,
            'context_path': str,
            'quality_control_results_path': str
        }
    def validate_inputs(self, inputs: Dict[str, Any]) -> bool:
        """Validate that all required inputs are present and point to existing files."""
        # Check against the declared required inputs rather than whatever keys were passed in
        for key in self.required_inputs:
            if key not in inputs:
                raise KeyError(f"Missing required input: {key}")
            if not os.path.exists(inputs[key]):
                raise FileNotFoundError(f"Required input file not found: {inputs[key]}")
        return True
    def load_json_file(self, file_path: str) -> Dict:
        """Load and parse a JSON file."""
        with open(file_path, 'r', encoding='utf-8') as f:
            return json.load(f)
    def extract_pdf_text(self, pdf_path: str) -> str:
        """Extract text from a PDF file."""
        text = ""
        with open(pdf_path, 'rb') as file:
            pdf_reader = PyPDF2.PdfReader(file)
            for page in pdf_reader.pages:
                # extract_text() can return None for pages with no extractable text
                text += (page.extract_text() or "") + "\n"
        return text
    def extract_title(self, pdf_path: str) -> str:
        """Extract the title from the first page of the PDF."""
        with open(pdf_path, 'rb') as file:
            pdf_reader = PyPDF2.PdfReader(file)
            first_page = pdf_reader.pages[0]
            text = first_page.extract_text() or ""
        # Assume the title appears in the first few lines
        lines = text.split('\n')
        for line in lines[:5]:  # Check first 5 lines
            if line.strip() and len(line.strip()) > 10:  # Basic title validation
                return line.strip()
        return "Title not found"
    def calculate_scores(self, quality_control_results: Dict) -> Dict[str, float]:
        """Calculate overall scores from quality control results."""
        scores = {
            'section_score': 0.0,
            'rigor_score': 0.0,
            'writing_score': 0.0,
            'final_score': 0.0
        }

        # Calculate section score (S1-S10)
        section_scores = []
        for i in range(1, 11):
            section_key = f'S{i}'
            if section_key in quality_control_results.get('section_results', {}):
                section_scores.append(quality_control_results['section_results'][section_key]['score'])
        if section_scores:
            scores['section_score'] = sum(section_scores) / len(section_scores)

        # Calculate rigor score (R1-R7)
        rigor_scores = []
        for i in range(1, 8):
            rigor_key = f'R{i}'
            if rigor_key in quality_control_results.get('rigor_results', {}):
                rigor_scores.append(quality_control_results['rigor_results'][rigor_key]['score'])
        if rigor_scores:
            scores['rigor_score'] = sum(rigor_scores) / len(rigor_scores)

        # Calculate writing score (W1-W7)
        writing_scores = []
        for i in range(1, 8):
            writing_key = f'W{i}'
            if writing_key in quality_control_results.get('writing_results', {}):
                writing_scores.append(quality_control_results['writing_results'][writing_key]['score'])
        if writing_scores:
            scores['writing_score'] = sum(writing_scores) / len(writing_scores)

        # Final score is the mean of the three category scores
        category_scores = [scores['section_score'], scores['rigor_score'], scores['writing_score']]
        scores['final_score'] = sum(category_scores) / len(category_scores)

        return scores
    def validate_context(self, context: Dict) -> Dict:
        """Validate and sanitize context data, providing defaults for missing or invalid values."""
        # Initialize default values
        sanitized_context = {
            'target_publication_outlets': {
                'user_input': 'the target journal'
            },
            'review_focus_areas': {
                'user_input': 'general aspects'
            }
        }

        # Validate target publication outlets
        if isinstance(context.get('target_publication_outlets'), dict):
            user_input = context['target_publication_outlets'].get('user_input')
            if isinstance(user_input, str) and user_input.strip():
                sanitized_context['target_publication_outlets']['user_input'] = user_input.strip()

        # Validate review focus areas
        if isinstance(context.get('review_focus_areas'), dict):
            user_input = context['review_focus_areas'].get('user_input')
            if isinstance(user_input, str) and user_input.strip():
                sanitized_context['review_focus_areas']['user_input'] = user_input.strip()

        return sanitized_context
    def generate_independent_review(self, manuscript_text: str, context: Dict) -> str:
        """Generate an independent high-level review of the manuscript."""
        # Sanitize context
        sanitized_context = self.validate_context(context)
        target_journal = sanitized_context['target_publication_outlets']['user_input']
        focus_areas = sanitized_context['review_focus_areas']['user_input']

        prompt = f"""You are an expert reviewer for {target_journal}. Read the following manuscript content and user priorities, then independently write a high-level review in three paragraphs:

Manuscript Content:
{manuscript_text[:6000]}

User Priorities:
- Target Journal: {target_journal}
- Focus Areas: {focus_areas}

Write:
1. A summary of what the manuscript is about
2. The main strengths and weaknesses, with special attention to {focus_areas}
3. The most critical suggestions for improvement, considering {target_journal} standards

Be concise, professional, and focus on the most important points. Do not reference any other reviews or JSON files yet."""

        response = self.llm(prompt)
        return response.strip()
    def generate_balanced_summary(self, independent_review: str, quality_control_results: Dict, context: Dict) -> str:
        """Balance the agent's own review with the quality-controlled review JSON."""
        # Sanitize context
        sanitized_context = self.validate_context(context)
        target_journal = sanitized_context['target_publication_outlets']['user_input']
        focus_areas = sanitized_context['review_focus_areas']['user_input']

        prompt = f"""You are an Executive Summary Agent for {target_journal}. You have two sources:

1. Your own independent review of the manuscript (below)
2. The quality-controlled review JSON (below)

First, extract the manuscript's title from the materials below. Then write a unified executive summary in three paragraphs that:
1. Summarizes the manuscript's content and contribution, highlighting its key insights and significance
2. Evaluates its strengths and weaknesses, with special attention to {focus_areas}
3. Provides clear, actionable recommendations for improvement

IMPORTANT: While the quality-controlled review JSON provides valuable insights, your executive summary should:
- Draw naturally from both your independent review and the quality control findings
- Focus on the most significant and impactful points, regardless of source
- Present a cohesive narrative that flows naturally
- Avoid mechanically listing points from either source

Your Own Review:
{independent_review}

User Priorities:
- Target Journal: {target_journal}
- Focus Areas: {focus_areas}

Quality-Controlled Review (JSON):
{json.dumps(quality_control_results, indent=2)}

Format your response as a JSON object with two fields:
1. "title": The extracted manuscript title
2. "executive_summary": The three-paragraph summary

Keep the summary within half a page (about 250 words), use professional language, and be specific and constructive. Write as a single, unified document that flows naturally while incorporating insights from both sources."""

        response = self.llm(prompt)
        return response.strip()
    def process(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
        """
        Main processing method that:
        1. Validates inputs
        2. Extracts necessary information
        3. Generates an independent review
        4. Synthesizes a balanced executive summary
        5. Calculates scores
        6. Produces final output
        """
        # Validate inputs
        self.validate_inputs(inputs)

        try:
            # Load input data
            context = self.load_json_file(inputs['context_path'])
        except (json.JSONDecodeError, FileNotFoundError, PermissionError) as e:
            print(f"Warning: Could not load context file: {str(e)}. Using default values.")
            context = {}

        try:
            quality_control_results = self.load_json_file(inputs['quality_control_results_path'])
        except (json.JSONDecodeError, FileNotFoundError, PermissionError) as e:
            raise RuntimeError(f"Failed to load quality control results: {str(e)}")

        # Extract manuscript text
        manuscript_text = self.extract_pdf_text(inputs['manuscript_path'])

        # Step 1: Generate independent review
        independent_review = self.generate_independent_review(manuscript_text, context)

        # Step 2: Synthesize balanced executive summary and extract title
        summary_response = self.generate_balanced_summary(independent_review, quality_control_results, context)
        try:
            summary_data = json.loads(summary_response)
            title = summary_data.get('title', 'Title not found')
            summary = summary_data.get('executive_summary', '')
        except json.JSONDecodeError:
            print("Warning: Could not parse summary response as JSON. Using raw response.")
            title = 'Title not found'
            summary = summary_response

        # Calculate scores
        scores = self.calculate_scores(quality_control_results)

        # Prepare output
        output = {
            'manuscript_title': title,
            'executive_summary': summary,
            'independent_review': independent_review,
            'scores': scores
        }
        return output
    def save_results(self, results: Dict[str, Any], output_path: str) -> None:
        """Save the results to a JSON file."""
        # os.makedirs('') raises, so only create a directory when the path contains one
        output_dir = os.path.dirname(output_path)
        if output_dir:
            os.makedirs(output_dir, exist_ok=True)
        with open(output_path, 'w', encoding='utf-8') as f:
            json.dump(results, f, indent=2)
        print(f"Executive summary results saved to {output_path}")