Merge pull request #18 from robertjakob/feature/executive-summary-agent
Update Executive Summary Agent with two-step reasoning process
Agent1_Peer_Review/Executive SummaryAgent.md (new file, 57 lines)
@@ -0,0 +1,57 @@
Executive Summary Agent Implementation:

The Executive Summary Agent creates a comprehensive executive review summary through a two-step reasoning process:

Inputs:
- Original PDF manuscript in /manuscripts/ (user-submitted manuscript)
- User context in /context/context.json (user priorities and focus areas; see the example below)
- Quality-controlled JSON in /results/quality_control_results.json (AI review pipeline output)
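For reference, a minimal /context/context.json consistent with the validate_context method in this commit could look like the following (the two keys match the agent code; the values are illustrative placeholders):

```json
{
  "target_publication_outlets": {
    "user_input": "npj Digital Medicine"
  },
  "review_focus_areas": {
    "user_input": "methodological rigor and reporting completeness"
  }
}
```
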
Process:

1. Independent Review Generation
   - Analyzes the manuscript without bias
   - Generates a comprehensive review including:
     * Summary of the manuscript
     * Strengths and weaknesses
     * Critical suggestions for improvement
   - Focuses on target journal requirements and user priorities

2. Balanced Summary Generation
   - Synthesizes insights from both the independent review and the quality control results
   - Creates a unified executive summary in three paragraphs:
     * First paragraph: Overview of the manuscript's content and contribution
     * Second paragraph: Balanced assessment of strengths and weaknesses
     * Third paragraph: Actionable recommendations for improvement
   - Ensures natural flow while incorporating key insights from both sources
   - Avoids mechanical listing of points
   - Maintains consistency with the detailed assessment

3. Score Calculation
   - Calculates overall review scores from the quality control results (a worked sketch follows this list):
     * Section Score: average of the S1-S10 scores
     * Rigor Score: average of the R1-R7 scores
     * Writing Score: average of the W1-W7 scores
     * Final Score: average of the three category scores
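A worked sketch of this aggregation (hypothetical per-agent scores on the 1-5 scale implied by the run script's `/5` output; the authoritative logic is calculate_scores in the agent code below):

```python
# Hypothetical per-agent scores; real values come from
# results/quality_control_results.json.
section_scores = [4, 3, 4, 4, 3, 4, 4, 4, 4, 4]  # S1-S10
rigor_scores = [3, 3, 4, 3, 3, 4, 3]             # R1-R7
writing_scores = [4, 4, 4, 5, 4, 4, 4]           # W1-W7

section = sum(section_scores) / len(section_scores)  # 3.8
rigor = sum(rigor_scores) / len(rigor_scores)        # ~3.29
writing = sum(writing_scores) / len(writing_scores)  # ~4.14
final = (section + rigor + writing) / 3              # ~3.74

print(f"Final Score: {final:.1f}/5")  # Final Score: 3.7/5
```
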
4. Output Generation
   Creates a JSON file in the results folder containing:
   - Manuscript title (extracted from content)
   - Executive summary (three-paragraph synthesis)
   - Independent review (for transparency)
   - Calculated scores (Section, Rigor, Writing, Final)
Key Features:
- Two-step reasoning process for robust analysis
- Natural balance between independent review and quality control findings
- Focus on the most significant points regardless of source
- Professional language and concise format (about half a page)
- Alignment with user priorities from the context file
- Uses GPT-4.1 for high-quality analysis
Implementation Notes:
- Does not modify existing files or pipeline components
- Maintains clear separation of concerns
- Provides transparent access to both the independent review and the final synthesis
- Ensures recommendations are actionable and specific

@@ -51,6 +51,22 @@ The Quality Control Agent serves as a final validation layer that:
- Overall quality assessment
- Uses GPT-4.1 for high-quality structured output

### Executive Summary Agent

The Executive Summary Agent provides a high-level synthesis through a two-step reasoning process:

1. Independent Review Generation
   - Analyzes the manuscript without bias
   - Generates a comprehensive review including summary, strengths/weaknesses, and suggestions
   - Focuses on target journal requirements and user priorities

2. Balanced Summary Generation
   - Synthesizes insights from both the independent review and the quality control results
   - Creates a unified executive summary in three paragraphs:
     * Overview of content and contribution
     * Balanced assessment of strengths and weaknesses
     * Actionable recommendations
   - Ensures natural flow while incorporating key insights
   - Maintains consistency with the detailed assessment

## Installation

1. Clone the repository

@@ -70,6 +86,10 @@ python run_analysis.py
```bash
python run_quality_control.py
```
4. Generate executive summary:
```bash
python run_executive_summary.py
```
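Running the last command prints the output location and the aggregated scores; a run looks roughly like this (the score values here are illustrative):

```
Executive Summary Generation Complete!
Results saved to: results/executive_summary.json

Overall Scores:
Section Score: 3.8/5
Rigor Score: 3.3/5
Writing Score: 4.1/5
Final Score: 3.7/5
```
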
## Output
@@ -78,6 +98,7 @@ The system generates JSON files in the `results/` directory containing:
- Combined results (`combined_results.json`)
- Manuscript data (`manuscript_data.json`)
- Quality control results (`quality_control_results.json`)
- Executive summary (`executive_summary.json`)

Each agent's analysis follows a consistent JSON structure:

@@ -110,6 +131,28 @@ Each agent's analysis follows a consistent JSON structure:
}
```

The executive summary follows a specific structure:
```json
{
  "manuscript_title": str,
  "executive_summary": str,  // Three-paragraph synthesis
  "independent_review": {
    "summary": str,
    "strengths_weaknesses": {
      "strengths": [str],
      "weaknesses": [str]
    },
    "critical_suggestions": [str]
  },
  "scores": {
    "section_score": float,
    "rigor_score": float,
    "writing_score": float,
    "final_score": float
  }
}
```
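For illustration, a filled-in instance could look like this (all values are hypothetical placeholders, with strings truncated for brevity):

```json
{
  "manuscript_title": "A Hypothetical Systematic Review of Digital Health Apps",
  "executive_summary": "Paragraph one (overview)... Paragraph two (assessment)... Paragraph three (recommendations)...",
  "independent_review": {
    "summary": "...",
    "strengths_weaknesses": {
      "strengths": ["Clear research question"],
      "weaknesses": ["Incomplete reporting of the search strategy"]
    },
    "critical_suggestions": ["Report the full search strings for each database"]
  },
  "scores": {
    "section_score": 3.8,
    "rigor_score": 3.3,
    "writing_score": 4.1,
    "final_score": 3.7
  }
}
```
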
## Configuration

- Environment variables are managed in `.env`

@@ -120,18 +163,19 @@ Each agent's analysis follows a consistent JSON structure:

### Project Structure

```
V6_multi_agent3/
Agent1_Peer_Review/
├── src/
│   ├── reviewer_agents/
│   │   ├── section/        # Section agents (S1-S10)
│   │   ├── rigor/          # Rigor agents (R1-R7)
│   │   ├── writing/        # Writing agents (W1-W7)
│   │   ├── quality/        # Quality control agent
│   │   ├── controller_agent.py
│   │   └── executive_summary_agent.py
│   ├── core/               # Core functionality and configuration
│   └── utils/              # Utility functions
├── manuscripts/            # Input manuscripts
├── results/                # Analysis results
├── context/                # User context and preferences
└── tests/                  # Test suite
```

@@ -177,6 +221,6 @@ For detailed guidelines on how to contribute, please see [CONTRIBUTING.md](CONTR
**Share your feedback**: Contact us at rjakob@ethz.ch with your experiences and suggestions

-**Use more powerful models**: The default implementation uses GPT-4.1-nano for accessibility, but you can configure the system to use more sophisticated models with your own API keys.
+**Use more powerful models**: The default implementation uses GPT-4.1 for accessibility, but you can configure the system to use more sophisticated models with your own API keys.

Together, we can build the best review agent team and improve the quality of scientific publishing!

Agent1_Peer_Review/run_executive_summary.py (new file, 46 lines)
@@ -0,0 +1,46 @@
#!/usr/bin/env python3
"""
Script to run the Executive Summary Agent and generate a high-level summary of the review results.
"""

import os
import json

from src.reviewer_agents.executive_summary_agent import ExecutiveSummaryAgent


def main():
    # Initialize the Executive Summary Agent
    agent = ExecutiveSummaryAgent()

    # Define input paths
    inputs = {
        'manuscript_path': 'manuscripts/Systematic Review.pdf',
        'context_path': 'context/context.json',
        'quality_control_results_path': 'results/quality_control_results.json'
    }

    # Define output path
    output_path = 'results/executive_summary.json'

    try:
        # Process the inputs and generate the executive summary
        results = agent.process(inputs)

        # Save the results
        agent.save_results(results, output_path)

        print("\nExecutive Summary Generation Complete!")
        print(f"Results saved to: {output_path}")

        # Print the scores
        print("\nOverall Scores:")
        print(f"Section Score: {results['scores']['section_score']:.1f}/5")
        print(f"Rigor Score: {results['scores']['rigor_score']:.1f}/5")
        print(f"Writing Score: {results['scores']['writing_score']:.1f}/5")
        print(f"Final Score: {results['scores']['final_score']:.1f}/5")

    except Exception as e:
        print(f"Error generating executive summary: {str(e)}")
        raise


if __name__ == "__main__":
    main()

Agent1_Peer_Review/src/reviewer_agents/executive_summary_agent.py (new file, 254 lines)
@@ -0,0 +1,254 @@
import json
import os
from typing import Dict, Any

import PyPDF2

from ..core.base_agent import BaseReviewerAgent


class ExecutiveSummaryAgent(BaseReviewerAgent):
    """
    Executive Summary Agent that generates a high-level summary of the review results
    and calculates overall scores based on the quality control results.
    """

    def __init__(self, model: str = "gpt-4.1"):
        super().__init__(model)
        self.required_inputs = {
            'manuscript_path': str,
            'context_path': str,
            'quality_control_results_path': str
        }

    def validate_inputs(self, inputs: Dict[str, Any]) -> bool:
        """Validate that all required input files exist and are accessible."""
        for key, path in inputs.items():
            if not os.path.exists(path):
                raise FileNotFoundError(f"Required input file not found: {path}")
        return True

    def load_json_file(self, file_path: str) -> Dict:
        """Load and parse a JSON file."""
        with open(file_path, 'r', encoding='utf-8') as f:
            return json.load(f)

    def extract_pdf_text(self, pdf_path: str) -> str:
        """Extract text from PDF file."""
        text = ""
        with open(pdf_path, 'rb') as file:
            pdf_reader = PyPDF2.PdfReader(file)
            for page in pdf_reader.pages:
                text += page.extract_text() + "\n"
        return text

    def extract_title(self, pdf_path: str) -> str:
        """Extract title from the first page of the PDF."""
        with open(pdf_path, 'rb') as file:
            pdf_reader = PyPDF2.PdfReader(file)
            first_page = pdf_reader.pages[0]
            text = first_page.extract_text()
        # Assuming the title is in the first few lines
        lines = text.split('\n')
        for line in lines[:5]:  # Check the first 5 lines
            if line.strip() and len(line.strip()) > 10:  # Basic title validation
                return line.strip()
        return "Title not found"

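    # Illustrative behavior of extract_title (hypothetical first page, not part
    # of this commit): given first-page lines ["", "A Systematic Review of
    # Digital Health Apps", "Jane Doe"], the loop returns the first stripped
    # line longer than 10 characters: "A Systematic Review of Digital Health Apps".
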
    def calculate_scores(self, quality_control_results: Dict) -> Dict[str, float]:
        """Calculate overall scores from quality control results."""
        scores = {
            'section_score': 0.0,
            'rigor_score': 0.0,
            'writing_score': 0.0,
            'final_score': 0.0
        }

        # Calculate section score (S1-S10)
        section_scores = []
        for i in range(1, 11):
            section_key = f'S{i}'
            if section_key in quality_control_results.get('section_results', {}):
                section_scores.append(quality_control_results['section_results'][section_key]['score'])
        if section_scores:
            scores['section_score'] = sum(section_scores) / len(section_scores)

        # Calculate rigor score (R1-R7)
        rigor_scores = []
        for i in range(1, 8):
            rigor_key = f'R{i}'
            if rigor_key in quality_control_results.get('rigor_results', {}):
                rigor_scores.append(quality_control_results['rigor_results'][rigor_key]['score'])
        if rigor_scores:
            scores['rigor_score'] = sum(rigor_scores) / len(rigor_scores)

        # Calculate writing score (W1-W7)
        writing_scores = []
        for i in range(1, 8):
            writing_key = f'W{i}'
            if writing_key in quality_control_results.get('writing_results', {}):
                writing_scores.append(quality_control_results['writing_results'][writing_key]['score'])
        if writing_scores:
            scores['writing_score'] = sum(writing_scores) / len(writing_scores)

        # Calculate final score (average of the three category scores)
        category_scores = [scores['section_score'], scores['rigor_score'], scores['writing_score']]
        if category_scores:
            scores['final_score'] = sum(category_scores) / len(category_scores)

        return scores

    def validate_context(self, context: Dict) -> Dict:
        """Validate and sanitize context data, providing defaults for missing or invalid values."""
        # Initialize default values
        sanitized_context = {
            'target_publication_outlets': {
                'user_input': 'the target journal'
            },
            'review_focus_areas': {
                'user_input': 'general aspects'
            }
        }

        # Validate target publication outlets
        if isinstance(context.get('target_publication_outlets'), dict):
            user_input = context['target_publication_outlets'].get('user_input')
            if isinstance(user_input, str) and user_input.strip():
                sanitized_context['target_publication_outlets']['user_input'] = user_input.strip()

        # Validate review focus areas
        if isinstance(context.get('review_focus_areas'), dict):
            user_input = context['review_focus_areas'].get('user_input')
            if isinstance(user_input, str) and user_input.strip():
                sanitized_context['review_focus_areas']['user_input'] = user_input.strip()

        return sanitized_context

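    # Illustrative behavior of validate_context (hypothetical input, not part
    # of this commit): a context dict that omits "review_focus_areas" keeps the
    # user's journal but falls back to the default focus:
    #   {"target_publication_outlets": {"user_input": "npj Digital Medicine"}}
    #   -> {"target_publication_outlets": {"user_input": "npj Digital Medicine"},
    #       "review_focus_areas": {"user_input": "general aspects"}}
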
    def generate_independent_review(self, manuscript_text: str, context: Dict) -> str:
        """Generate an independent high-level review of the manuscript using GPT-4.1."""
        # Sanitize context
        sanitized_context = self.validate_context(context)
        target_journal = sanitized_context['target_publication_outlets']['user_input']
        focus_areas = sanitized_context['review_focus_areas']['user_input']

        prompt = f"""You are an expert reviewer for {target_journal}. Read the following manuscript content and user priorities, then independently write a high-level review in three paragraphs:

Manuscript Content:
{manuscript_text[:6000]}

User Priorities:
- Target Journal: {target_journal}
- Focus Areas: {focus_areas}

Write:
1. A summary of what the manuscript is about
2. The main strengths and weaknesses, with special attention to {focus_areas}
3. The most critical suggestions for improvement, considering {target_journal} standards

Be concise, professional, and focus on the most important points. Do not reference any other reviews or JSON files yet."""
        response = self.llm(prompt)
        return response.strip()

    def generate_balanced_summary(self, independent_review: str, quality_control_results: Dict, context: Dict) -> str:
        """Balance the agent's own review with the quality-controlled review JSON."""
        # Sanitize context
        sanitized_context = self.validate_context(context)
        target_journal = sanitized_context['target_publication_outlets']['user_input']
        focus_areas = sanitized_context['review_focus_areas']['user_input']

        prompt = f"""You are an Executive Summary Agent for {target_journal}. You have two sources:
1. Your own independent review of the manuscript (below)
2. The quality-controlled review JSON (below)

First, extract the manuscript's title from the content. Then, write a unified executive summary in three paragraphs that:
- Provides a clear, concise overview of the manuscript
- Presents a balanced assessment of strengths and weaknesses
- Offers specific, actionable recommendations for improvement

IMPORTANT: While the quality-controlled review JSON provides valuable insights, your executive summary should:
- Draw naturally from both your independent review and the quality control findings
- Focus on the most significant and impactful points, regardless of source
- Present a cohesive narrative that flows naturally
- Avoid mechanically listing points from either source

Your Own Review:
{independent_review}

User Priorities:
- Target Journal: {target_journal}
- Focus Areas: {focus_areas}

Quality-Controlled Review (JSON):
{json.dumps(quality_control_results, indent=2)}

First, extract the manuscript's title. Then write a cohesive executive summary that:
1. Summarizes the manuscript's content and contribution, highlighting its key insights and significance
2. Evaluates its strengths and weaknesses, with special attention to {focus_areas}
3. Provides clear, actionable recommendations for improvement

Format your response as a JSON object with two fields:
1. "title": The extracted manuscript title
2. "executive_summary": The three-paragraph summary

Keep the summary within half a page (about 250 words), use professional language, and be specific and constructive. Write as a single, unified document that flows naturally while incorporating insights from both sources."""
        response = self.llm(prompt)
        return response.strip()

    def process(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
        """
        Main processing method that:
        1. Validates inputs
        2. Extracts necessary information
        3. Generates an independent review
        4. Synthesizes a balanced executive summary
        5. Calculates scores
        6. Produces final output
        """
        # Validate inputs
        self.validate_inputs(inputs)

        try:
            # Load input data
            context = self.load_json_file(inputs['context_path'])
        except (json.JSONDecodeError, FileNotFoundError, PermissionError) as e:
            print(f"Warning: Could not load context file: {str(e)}. Using default values.")
            context = {}

        try:
            quality_control_results = self.load_json_file(inputs['quality_control_results_path'])
        except (json.JSONDecodeError, FileNotFoundError, PermissionError) as e:
            raise RuntimeError(f"Failed to load quality control results: {str(e)}")

        # Extract manuscript text
        manuscript_text = self.extract_pdf_text(inputs['manuscript_path'])

        # Step 1: Generate independent review
        independent_review = self.generate_independent_review(manuscript_text, context)

        # Step 2: Synthesize balanced executive summary and extract title
        summary_response = self.generate_balanced_summary(independent_review, quality_control_results, context)
        try:
            summary_data = json.loads(summary_response)
            title = summary_data.get('title', 'Title not found')
            summary = summary_data.get('executive_summary', '')
        except json.JSONDecodeError:
            print("Warning: Could not parse summary response as JSON. Using raw response.")
            title = 'Title not found'
            summary = summary_response

        # Calculate scores
        scores = self.calculate_scores(quality_control_results)

        # Prepare output
        output = {
            'manuscript_title': title,
            'executive_summary': summary,
            'independent_review': independent_review,
            'scores': scores
        }

        return output

    def save_results(self, results: Dict[str, Any], output_path: str) -> None:
        """Save the results to a JSON file."""
        os.makedirs(os.path.dirname(output_path), exist_ok=True)
        with open(output_path, 'w', encoding='utf-8') as f:
            json.dump(results, f, indent=2)
        print(f"Executive summary results saved to {output_path}")
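
Note: BaseReviewerAgent (src/core/base_agent.py) is not part of this diff. A minimal sketch consistent with how it is used above, a constructor taking a model name and an llm(prompt) method returning the completion text, might look like the following; the real class may differ, and the OpenAI client usage here is an assumption based on the README's .env/API-key configuration:

```python
# Minimal sketch of the assumed base-class interface; NOT the repository's
# actual implementation. Assumes the OpenAI Python SDK with OPENAI_API_KEY
# set in the environment.
from openai import OpenAI


class BaseReviewerAgent:
    def __init__(self, model: str):
        self.model = model
        self._client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def llm(self, prompt: str) -> str:
        # Single-turn chat completion; subclasses pass a fully formed prompt.
        response = self._client.chat.completions.create(
            model=self.model,
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content
```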