Add quality control agent implementation with token optimization plan

2025-05-31 22:15:21 +03:00 · 2025-05-10 01:37:30 +02:00
parent e6f7ccd2a7
commit edba7eeba2
4 changed files with 544 additions and 0 deletions
--- a/Agent1_Peer_Review/QualityAgentImplementation
+++ b/Agent1_Peer_Review/QualityAgentImplementation
@@ -0,0 +1,215 @@
+{\rtf1\ansi\ansicpg1252\cocoartf2822
+\cocoatextscaling0\cocoaplatform0{\fonttbl\f0\froman\fcharset0 TimesNewRomanPS-BoldMT;\f1\froman\fcharset0 TimesNewRomanPSMT;\f2\fswiss\fcharset0 Helvetica;
+\f3\fmodern\fcharset0 CourierNewPSMT;}
+{\colortbl;\red255\green255\blue255;\red0\green0\blue0;}
+{\*\expandedcolortbl;;\cssrgb\c0\c0\c0;}
+\paperw11900\paperh16840\margl1440\margr1440\vieww30040\viewh17760\viewkind0
+\deftab720
+\pard\pardeftab720\sa320\partightenfactor0
+
+\f0\b\fs32 \cf2 \expnd0\expndtw0\kerning0
+Quality Control agent implementation
+\f1\b0 \
+We have the following aggregated AI review outputs from the three agent classes:\
+\pard\pardeftab720\li960\fi-480\sa320\partightenfactor0
+\cf2 -
+\fs18\fsmilli9333 \'a0\'a0\'a0\'a0\'a0\'a0 
+\fs32 /Users/robertjakob/rigorous-3/A1_Peer_Review/results/rigor_results.json\
+-
+\fs18\fsmilli9333 \'a0\'a0\'a0\'a0\'a0\'a0 
+\fs32 /Users/robertjakob/rigorous-3/A1_Peer_Review/results/section_results.json\
+-
+\fs18\fsmilli9333 \'a0\'a0\'a0\'a0\'a0\'a0 
+\fs32 /Users/robertjakob/rigorous-3/A1_Peer_Review/results/writing_results.json\
+\pard\pardeftab720\sa320\partightenfactor0
+\cf2 What I need now is to set up a quality control agent that takes the following inputs:\
+\pard\pardeftab720\li960\fi-480\sa320\partightenfactor0
+
+\f2 \cf2 -
+\f1\fs18\fsmilli9333 \'a0\'a0\'a0\'a0\'a0\'a0\'a0\'a0\'a0\'a0\'a0\'a0\'a0 
+\fs32 Original PDF manuscript in /Users/robertjakob/rigorous-3/A1_Peer_Review/manuscripts\
+
+\f2 -
+\f1\fs18\fsmilli9333 \'a0\'a0\'a0\'a0\'a0\'a0\'a0\'a0\'a0\'a0\'a0\'a0\'a0 
+\fs32 Additional context input in /Users/robertjakob/rigorous-6/Agent1_Peer_Review/context/context.json\
+
+\f2 -
+\f1\fs18\fsmilli9333 \'a0\'a0\'a0\'a0\'a0\'a0\'a0\'a0\'a0\'a0\'a0\'a0\'a0 
+\fs32 The three AI Review JSON outputs mentioned above\
+\pard\pardeftab720\sa320\partightenfactor0
+\cf2 The 
+\f0\b Quality Control agent
+\f1\b0 's task is:\
+\pard\pardeftab720\li960\fi-480\sa320\partightenfactor0
+\cf2 1.
+\fs18\fsmilli9333 \'a0\'a0\'a0\'a0 
+\fs32 Carefully read and analyze the inputs\
+2.
+\fs18\fsmilli9333 \'a0\'a0\'a0\'a0 
+\fs32 Critically reassess the three AI review JSON outputs, determining which points are genuinely helpful, accurate, and applicable given the original PDF manuscript and the context\
+3.
+\fs18\fsmilli9333 \'a0\'a0\'a0\'a0 
+\fs32 Based on its own assessment, produce a final, streamlined report summarizing valid and constructive feedback, structured clearly under the following section headings in JSON Format.\
+\pard\pardeftab720\sa320\partightenfactor0
+
+\f0\b \cf2 Agent Reports:
+\f1\b0 \
+\pard\pardeftab720\li960\fi-480\sa320\partightenfactor0
+
+\f2\fs26\fsmilli13333 \cf2 \'b7
+\f1\fs18\fsmilli9333 \'a0\'a0\'a0\'a0\'a0\'a0 
+\f0\b\fs32 Section-Specific Agents (S1\'96S10):
+\f1\b0 \
+\pard\pardeftab720\li1920\fi-480\sa320\partightenfactor0
+
+\f3\fs26\fsmilli13333 \cf2 o
+\f1\fs18\fsmilli9333 \'a0\'a0 
+\fs32 S1 \'96 Title and Keywords\
+
+\f3\fs26\fsmilli13333 o
+\f1\fs18\fsmilli9333 \'a0\'a0 
+\fs32 S2 \'96 Abstract\
+
+\f3\fs26\fsmilli13333 o
+\f1\fs18\fsmilli9333 \'a0\'a0 
+\fs32 S3 \'96 Introduction\
+
+\f3\fs26\fsmilli13333 o
+\f1\fs18\fsmilli9333 \'a0\'a0 
+\fs32 S4 \'96 Literature Review\
+
+\f3\fs26\fsmilli13333 o
+\f1\fs18\fsmilli9333 \'a0\'a0 
+\fs32 S5 \'96 Methodology\
+
+\f3\fs26\fsmilli13333 o
+\f1\fs18\fsmilli9333 \'a0\'a0 
+\fs32 S6 \'96 Results\
+
+\f3\fs26\fsmilli13333 o
+\f1\fs18\fsmilli9333 \'a0\'a0 
+\fs32 S7 \'96 Discussion\
+
+\f3\fs26\fsmilli13333 o
+\f1\fs18\fsmilli9333 \'a0\'a0 
+\fs32 S8 \'96 Conclusion\
+
+\f3\fs26\fsmilli13333 o
+\f1\fs18\fsmilli9333 \'a0\'a0 
+\fs32 S9 \'96 References\
+
+\f3\fs26\fsmilli13333 o
+\f1\fs18\fsmilli9333 \'a0\'a0 
+\fs32 S10 \'96 Supplementary Materials\
+\pard\pardeftab720\li960\fi-480\sa320\partightenfactor0
+
+\f2\fs26\fsmilli13333 \cf2 \'b7
+\f1\fs18\fsmilli9333 \'a0\'a0\'a0\'a0\'a0\'a0 
+\f0\b\fs32 Rigor Agents (R1\'96R7):
+\f1\b0 \
+\pard\pardeftab720\li1920\fi-480\sa320\partightenfactor0
+
+\f3\fs26\fsmilli13333 \cf2 o
+\f1\fs18\fsmilli9333 \'a0\'a0 
+\fs32 R1 \'96 Originality and Contribution\
+
+\f3\fs26\fsmilli13333 o
+\f1\fs18\fsmilli9333 \'a0\'a0 
+\fs32 R2 \'96 Impact and Significance\
+
+\f3\fs26\fsmilli13333 o
+\f1\fs18\fsmilli9333 \'a0\'a0 
+\fs32 R3 \'96 Ethics and Compliance\
+
+\f3\fs26\fsmilli13333 o
+\f1\fs18\fsmilli9333 \'a0\'a0 
+\fs32 R4 \'96 Data and Code Availability\
+
+\f3\fs26\fsmilli13333 o
+\f1\fs18\fsmilli9333 \'a0\'a0 
+\fs32 R5 \'96 Statistical Rigor\
+
+\f3\fs26\fsmilli13333 o
+\f1\fs18\fsmilli9333 \'a0\'a0 
+\fs32 R6 \'96 Technical Accuracy\
+
+\f3\fs26\fsmilli13333 o
+\f1\fs18\fsmilli9333 \'a0\'a0 
+\fs32 R7 \'96 Consistency\
+\pard\pardeftab720\li960\fi-480\sa320\partightenfactor0
+
+\f2\fs26\fsmilli13333 \cf2 \'b7
+\f1\fs18\fsmilli9333 \'a0\'a0\'a0\'a0\'a0\'a0 
+\f0\b\fs32 Writing Agents (W1\'96W8):
+\f1\b0 \
+\pard\pardeftab720\li1920\fi-480\sa320\partightenfactor0
+
+\f3\fs26\fsmilli13333 \cf2 o
+\f1\fs18\fsmilli9333 \'a0\'a0 
+\fs32 W1 \'96 Language and Style\
+
+\f3\fs26\fsmilli13333 o
+\f1\fs18\fsmilli9333 \'a0\'a0 
+\fs32 W2 \'96 Narrative and Structure\
+
+\f3\fs26\fsmilli13333 o
+\f1\fs18\fsmilli9333 \'a0\'a0 
+\fs32 W3 \'96 Clarity and Conciseness\
+
+\f3\fs26\fsmilli13333 o
+\f1\fs18\fsmilli9333 \'a0\'a0 
+\fs32 W4 \'96 Terminology Consistency\
+
+\f3\fs26\fsmilli13333 o
+\f1\fs18\fsmilli9333 \'a0\'a0 
+\fs32 W5 \'96 Inclusive Language\
+
+\f3\fs26\fsmilli13333 o
+\f1\fs18\fsmilli9333 \'a0\'a0 
+\fs32 W6 \'96 Citation Formatting\
+
+\f3\fs26\fsmilli13333 o
+\f1\fs18\fsmilli9333 \'a0\'a0 
+\fs32 W7 \'96 Target Audience Alignment\
+
+\f3\fs26\fsmilli13333 o
+\f1\fs18\fsmilli9333 \'a0\'a0 
+\fs32 W8 \'96 Visual Presentation\
+\pard\pardeftab720\sa320\partightenfactor0
+\cf2 \'a0\
+Additional important notes:\
+\pard\pardeftab720\li960\fi-480\sa320\partightenfactor0
+\cf2 -
+\fs18\fsmilli9333 \'a0\'a0\'a0\'a0\'a0\'a0 
+\fs32 The Quality Control Agent should add additional helpful review suggestions in each section.\
+-
+\fs18\fsmilli9333 \'a0\'a0\'a0\'a0\'a0\'a0 
+\fs32 If feedback in one agent category section is not applicable (e.g., no supplementary material), the Quality Control Agent should clearly note this, for example as "Not applicable \'96 no supplementary material detected."\
+-
+\fs18\fsmilli9333 \'a0\'a0\'a0\'a0\'a0\'a0 
+\fs32 The Quality Control Agent should keep the existing format, in which each feedback item in a category gives the Remarks first, then the related Original Text, then the Improved Version, and finally the Explanation for the improvement. A section may contain several such items but should be limited to around 3.\
+-
+\fs18\fsmilli9333 \'a0\'a0\'a0\'a0\'a0\'a0 
+\fs32 For each agent category section, the Quality Control Agent should also write a short paragraph summarizing critical remarks and tips for improvement and, importantly, highlighting positive aspects of the manuscript.\
+-
+\fs18\fsmilli9333 \'a0\'a0\'a0\'a0\'a0\'a0 
+\fs32 The Quality Control Agent should avoid mentioning the same issue twice and focus on the most severe issues and the most helpful remarks and suggestions (in total, we probably want to aim for around 3 suggestions per category).\
+-
+\fs18\fsmilli9333 \'a0\'a0\'a0\'a0\'a0\'a0 
+\fs32 The Quality Control Agent should also 
+\f2 reassess the 1-5 score for each section and include the revised score in the quality-controlled JSON.\
+\pard\pardeftab720\li960\fi-480\sa320\partightenfactor0
+
+\f1 \cf2 -
+\fs18\fsmilli9333 \'a0\'a0\'a0\'a0\'a0\'a0 
+\fs32 All other functionality of the code should remain intact. From a workflow perspective, the Quality Control Agent should start once the previous code has successfully produced 
+\fs18\fsmilli9333 \'a0\'a0 
+\fs32 /Users/robertjakob/rigorous-3/A1_Peer_Review/results/rigor_results.json, 
+\fs18\fsmilli9333 \'a0\'a0\'a0 
+\fs32 /Users/robertjakob/rigorous-3/A1_Peer_Review/results/section_results.json, and
+\fs18\fsmilli9333 \'a0\'a0 
+\fs32 /Users/robertjakob/rigorous-3/A1_Peer_Review/results/writing_results.json\
+\
+\pard\pardeftab720\li960\fi-480\sa320\partightenfactor0
+\cf2 \
+}
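The report format described in the plan above (a reassessed 1-5 score, a summary paragraph, and up to about three suggestions, each with remarks, original text, improved version, and explanation) can be checked mechanically. The following is a minimal sketch, not part of the commit: `check_section_entry` and `SUGGESTION_KEYS` are hypothetical names, and the field names are assumed to follow the example JSON built in `quality_control_agent.py`.

```python
from typing import Any, Dict, List

# Assumed field names, taken from the example JSON in quality_control_agent.py.
SUGGESTION_KEYS = ("remarks", "original_text", "improved_version", "explanation")


def check_section_entry(entry: Dict[str, Any], max_suggestions: int = 3) -> List[str]:
    """Return a list of problems found in one section entry (empty if none)."""
    problems: List[str] = []

    # Sections marked as not applicable only need an explanatory message.
    if entry.get("status") == "not_applicable":
        if not entry.get("message"):
            problems.append("not_applicable entry lacks a message")
        return problems

    score = entry.get("score")
    if not isinstance(score, int) or isinstance(score, bool) or not 1 <= score <= 5:
        problems.append(f"score must be an integer from 1 to 5, got {score!r}")

    if not entry.get("summary"):
        problems.append("summary is missing or empty")

    suggestions = entry.get("suggestions", [])
    if len(suggestions) > max_suggestions:
        problems.append(
            f"{len(suggestions)} suggestions exceed the limit of {max_suggestions}"
        )

    # Every suggestion needs all four parts named in the plan.
    for i, suggestion in enumerate(suggestions):
        missing = [key for key in SUGGESTION_KEYS if not suggestion.get(key)]
        if missing:
            problems.append(f"suggestion {i} is missing: {', '.join(missing)}")

    return problems
```

A check like this could run on each entry of the model's reply before the report is saved, so that malformed sections are caught early instead of surfacing in the final JSON.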
--- a/Agent1_Peer_Review/run_quality_control.py
+++ b/Agent1_Peer_Review/run_quality_control.py
@@ -0,0 +1,66 @@
+import os
+import json
+import time
+from src.reviewer_agents.quality import QualityControlAgent
+
+def wait_for_files(file_paths: list, timeout: int = 300, check_interval: int = 5) -> bool:
+    """Wait until all files exist and are non-empty, polling every check_interval seconds.
+    Returns True if all files exist and are not empty, False if timeout is reached.
+    """
+    deadline = time.monotonic() + timeout
+    while not all(os.path.exists(p) and os.path.getsize(p) > 0 for p in file_paths):
+        if time.monotonic() >= deadline:
+            return False
+        time.sleep(check_interval)
+    return True
+
+def main():
+    # Define paths according to the implementation plan
+    base_dir = os.path.dirname(os.path.abspath(__file__))
+    context_path = os.path.join(base_dir, 'context', 'context.json')
+    
+    # Define manuscript and results directories
+    manuscript_dir = os.path.join(base_dir, 'manuscripts')
+    results_dir = os.path.join(base_dir, 'results')
+    
+    # Define result file paths
+    rigor_results_path = os.path.join(results_dir, 'rigor_results.json')
+    section_results_path = os.path.join(results_dir, 'section_results.json')
+    writing_results_path = os.path.join(results_dir, 'writing_results.json')
+    
+    # Check if files exist
+    print("Checking for required files...")
+    if not wait_for_files([rigor_results_path, section_results_path, writing_results_path]):
+        raise FileNotFoundError("Required result files not found")
+    print("All required files found")
+    
+    # Find the most recent manuscript
+    manuscript_files = [f for f in os.listdir(manuscript_dir) if f.endswith('.pdf')]
+    if not manuscript_files:
+        raise FileNotFoundError("No PDF manuscripts found in the manuscripts directory")
+    manuscript_path = max((os.path.join(manuscript_dir, f) for f in manuscript_files), key=os.path.getmtime)
+    
+    # Initialize the quality control agent
+    agent = QualityControlAgent()
+    
+    # Prepare inputs
+    inputs = {
+        'manuscript_path': manuscript_path,
+        'context_path': context_path,
+        'rigor_results_path': rigor_results_path,
+        'section_results_path': section_results_path,
+        'writing_results_path': writing_results_path
+    }
+    
+    # Run the quality control analysis
+    results = agent.process(inputs)
+    
+    # Save the results
+    output_path = os.path.join(results_dir, 'quality_control_results.json')
+    with open(output_path, 'w', encoding='utf-8') as f:
+        json.dump(results, f, indent=2)
+    
+    print(f"Quality control analysis completed. Results saved to: {output_path}")
+
+if __name__ == '__main__':
+    main() 
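`wait_for_files` treats any non-empty file as ready, so a result file that an upstream agent is still writing could be picked up half-finished. One way to rule that out on the producer side is to write to a temporary file and rename it into place. This is a minimal sketch under that assumption: `write_json_atomic` is a hypothetical helper that does not appear in this commit.

```python
import json
import os
import tempfile
from typing import Any


def write_json_atomic(path: str, data: Any) -> None:
    """Write JSON to a temporary file, then rename it over the target path."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so both share a filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f, indent=2)
        # os.replace swaps the file in one step; readers see either the old
        # file (or none) or the complete new one, never a partial write.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With writers following this pattern, the existence and size check in `wait_for_files` is enough to conclude that a result file is complete.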
--- a/Agent1_Peer_Review/src/reviewer_agents/quality/__init__.py
+++ b/Agent1_Peer_Review/src/reviewer_agents/quality/__init__.py
@@ -0,0 +1,3 @@
+from .quality_control_agent import QualityControlAgent
+
+__all__ = ['QualityControlAgent'] 
--- a/Agent1_Peer_Review/src/reviewer_agents/quality/quality_control_agent.py
+++ b/Agent1_Peer_Review/src/reviewer_agents/quality/quality_control_agent.py
@@ -0,0 +1,260 @@
+import json
+import os
+from typing import Dict, List, Any
+import openai
+import PyPDF2
+from ...core.base_agent import BaseReviewerAgent
+
+class QualityControlAgent(BaseReviewerAgent):
+    """
+    Quality Control Agent that reviews and validates the outputs from all other agents.
+    It ensures the quality and consistency of the review process and provides a final,
+    streamlined report.
+    """
+    
+    def __init__(self, model: str = "gpt-4.1"):
+        super().__init__(model)
+        self.required_inputs = {
+            'manuscript_path': str,
+            'context_path': str,
+            'rigor_results_path': str,
+            'section_results_path': str,
+            'writing_results_path': str
+        }
+        
+        # Define section mappings with full names
+        self.section_mappings = {
+            'section_results': {
+                'S1': 'Title and Keywords',
+                'S2': 'Abstract',
+                'S3': 'Introduction',
+                'S4': 'Literature Review',
+                'S5': 'Methodology',
+                'S6': 'Results',
+                'S7': 'Discussion',
+                'S8': 'Conclusion',
+                'S9': 'References',
+                'S10': 'Supplementary Materials'
+            },
+            'rigor_results': {
+                'R1': 'Originality and Contribution',
+                'R2': 'Impact and Significance',
+                'R3': 'Ethics and Compliance',
+                'R4': 'Data and Code Availability',
+                'R5': 'Statistical Rigor',
+                'R6': 'Technical Accuracy',
+                'R7': 'Consistency'
+            },
+            'writing_results': {
+                'W1': 'Language and Style',
+                'W2': 'Narrative and Structure',
+                'W3': 'Clarity and Conciseness',
+                'W4': 'Terminology Consistency',
+                'W5': 'Inclusive Language',
+                'W6': 'Citation Formatting',
+                'W7': 'Target Audience Alignment',
+                'W8': 'Visual Presentation'
+            }
+        }
+        
+    def validate_inputs(self, inputs: Dict[str, Any]) -> bool:
+        """Validate that all required input files exist and are accessible."""
+        for key in self.required_inputs:
+            if key not in inputs or not os.path.exists(inputs[key]):
+                raise FileNotFoundError(f"Required input file not found: {key} -> {inputs.get(key)}")
+        return True
+
+    def load_json_file(self, file_path: str) -> Dict:
+        """Load and parse a JSON file."""
+        with open(file_path, 'r', encoding='utf-8') as f:
+            return json.load(f)
+
+    def extract_pdf_text(self, pdf_path: str) -> str:
+        """Extract text from PDF file."""
+        text = ""
+        with open(pdf_path, 'rb') as file:
+            pdf_reader = PyPDF2.PdfReader(file)
+            for page in pdf_reader.pages:
+                text += (page.extract_text() or "") + "\n"
+        return text
+
+    def process(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
+        """
+        Main processing method that:
+        1. Validates inputs
+        2. Loads and analyzes all review outputs
+        3. Produces a quality-controlled final report
+        """
+        # Validate inputs
+        self.validate_inputs(inputs)
+        
+        # Load all input data
+        context = self.load_json_file(inputs['context_path'])
+        rigor_results = self.load_json_file(inputs['rigor_results_path'])
+        section_results = self.load_json_file(inputs['section_results_path'])
+        writing_results = self.load_json_file(inputs['writing_results_path'])
+        
+        # Extract manuscript text
+        manuscript_text = self.extract_pdf_text(inputs['manuscript_path'])
+        
+        # Process each category separately
+        final_results = {}
+        
+        # Process section results
+        print("Processing section results...")
+        section_prompt = self.generate_category_prompt(
+            'section_results',
+            section_results,
+            manuscript_text,
+            context
+        )
+        section_analysis = json.loads(self.llm(section_prompt))
+        final_results['section_results'] = section_analysis.get('section_results', {})
+        
+        # Process rigor results
+        print("Processing rigor results...")
+        rigor_prompt = self.generate_category_prompt(
+            'rigor_results',
+            rigor_results,
+            manuscript_text,
+            context
+        )
+        rigor_analysis = json.loads(self.llm(rigor_prompt))
+        final_results['rigor_results'] = rigor_analysis.get('rigor_results', {})
+        
+        # Process writing results
+        print("Processing writing results...")
+        writing_prompt = self.generate_category_prompt(
+            'writing_results',
+            writing_results,
+            manuscript_text,
+            context
+        )
+        writing_analysis = json.loads(self.llm(writing_prompt))
+        final_results['writing_results'] = writing_analysis.get('writing_results', {})
+        
+        # Format the output
+        formatted_output = self.format_output(final_results)
+        
+        return formatted_output
+
+    def generate_category_prompt(self, category: str, results: Dict, manuscript_text: str, context: Dict) -> str:
+        """
+        Generate a prompt for analyzing a specific category of results.
+        """
+        # Get section mappings for this category
+        sections = self.section_mappings[category]
+        
+        # Create section headers
+        section_headers = []
+        for code, name in sections.items():
+            section_headers.append(f"o   {code} – {name}\n")
+        
+        # Create example JSON structure for this category
+        example_json = {
+            category: {
+                list(sections.keys())[0]: {
+                    "section_name": sections[list(sections.keys())[0]],
+                    "score": 4,
+                    "summary": "Critical remarks, tips, and positive aspects...",
+                    "suggestions": [
+                        {
+                            "remarks": "Issue description",
+                            "original_text": "Original text from manuscript",
+                            "improved_version": "Suggested improvement",
+                            "explanation": "Explanation for the improvement"
+                        }
+                    ]
+                }
+            }
+        }
+        
+        prompt = f"""You are a Quality Control Agent responsible for reviewing and validating the outputs from AI review agents. Your task is to analyze the {category.replace('_', ' ')} category:
+
+Category Sections:
+{''.join(section_headers)}
+
+For each section, you should:
+1. Validate the accuracy and relevance of the feedback
+2. Identify the most critical and helpful suggestions (aim for ~3 per section)
+3. Add any additional valuable insights
+4. Note if any section is not applicable
+5. Reassess the 1-5 score for each section
+
+Structure your analysis in the following format for each section:
+- A summary paragraph highlighting:
+  * Critical remarks
+  * Tips for improvement
+  * Positive aspects of the manuscript
+- For each suggestion (up to 3 per section):
+  * Remarks
+  * Original Text
+  * Improved Version
+  * Explanation for the improvement
+
+Important guidelines:
+- Avoid duplicate issues
+- Focus on the most severe and helpful remarks
+- Clearly mark non-applicable sections
+- Maintain the existing JSON structure
+- Ensure all feedback is constructive and actionable
+
+Please analyze the following inputs:
+
+Manuscript Text (Preview):
+{manuscript_text[:1000]}...
+
+Context:
+{json.dumps(context, indent=2)}
+
+{category.replace('_', ' ').title()} Results:
+{json.dumps(results, indent=2)}
+
+Provide your analysis in a structured JSON format that exactly matches this structure:
+{json.dumps(example_json, indent=2)}
+
+For each section:
+1. Include the full section name
+2. Provide a score (1-5)
+3. Include a summary paragraph
+4. Include up to 3 suggestions with remarks, original text, improved version, and explanation
+5. If a section is not applicable, set status to "not_applicable" and include an appropriate message
+
+Ensure your response is valid JSON and includes all required fields."""
+
+        return prompt
+
+    def format_output(self, analysis_results: Dict[str, Any]) -> Dict[str, Any]:
+        """
+        Format the analysis results into the required JSON structure.
+        """
+        try:
+            # Validate the structure
+            if not isinstance(analysis_results, dict):
+                raise ValueError("Analysis results must be a dictionary")
+            
+            # Ensure all required sections are present with full names
+            for category, sections in self.section_mappings.items():
+                if category not in analysis_results:
+                    raise ValueError(f"Missing category: {category}")
+                
+                for code, name in sections.items():
+                    if code not in analysis_results[category]:
+                        analysis_results[category][code] = {
+                            'status': 'not_applicable',
+                            'message': f'Not applicable - no {name} content detected',
+                            'score': 0,
+                            'section_name': name
+                        }
+                    else:
+                        # Add section name to existing results
+                        analysis_results[category][code]['section_name'] = name
+            
+            return analysis_results
+            
+        except Exception as e:
+            return {
+                'status': 'error',
+                'message': f'Error formatting output: {str(e)}',
+                'results': analysis_results
+            }
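`process` passes each model reply straight to `json.loads`, which raises if the reply is wrapped in a Markdown code fence or surrounded by prose. A more forgiving parser could look like the sketch below. It assumes the reply is a plain string, and `parse_llm_json` is a hypothetical helper, not part of this commit or of `BaseReviewerAgent`.

```python
import json
import re
from typing import Any

# Build the fence marker programmatically to keep this example easy to embed.
FENCE = "`" * 3
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE, re.DOTALL)


def parse_llm_json(reply: str) -> Any:
    """Parse JSON from a model reply, tolerating code fences and surrounding prose."""
    # Prefer the contents of the first fenced block, if there is one.
    match = FENCED_BLOCK.search(reply)
    candidate = match.group(1) if match else reply
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        # Fall back to the outermost {...} span in the text.
        start, end = candidate.find("{"), candidate.rfind("}")
        if start == -1 or end <= start:
            raise
        return json.loads(candidate[start:end + 1])
```

In `process`, a call such as `parse_llm_json(self.llm(section_prompt))` would then replace the direct `json.loads` call. Replies that contain no JSON object still raise `json.JSONDecodeError`, so failures stay visible.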