Add more benchmarks and pipelines

Ruslan Bel'kov
2025-03-08 20:34:01 +03:00
parent d41142359c
commit b135ae1f42


@@ -13,29 +13,36 @@ contents [by default](https://github.blog/changelog/2021-04-13-table-of-contents
## Comparison
| Pipeline | [OmniDocBench](#omnidocbench) Overall ↓ | [Omni OCR](#omni-ocr-benchmark) Accuracy ↑ | [olmOCR](#olmoocr-eval) ELO ↑ | [Marker](#marker-benchmarks) Overall ↓ | [Mistral](#mistral-ocr-benchmarks) Overall ↑ | [dp-bench](#dp-bench) NID ↑ | [READoc](#readoc) Overall ↑ | [Actualize.pro](#actualize-pro) Overall ↑ |
|-------------------------------|-----------------------------------------|:-------------------------------------------|-------------------------------|----------------------------------------|:---------------------------------------------|-----------------------------|-----------------------------|-------------------------------------------|
| [MinerU](#MinerU) | **0.150** ⚠️ | | 1545.2 | | | | 60.17 | **8** |
| [Marker](#Marker) | 0.336 | | 1429.1 | **4.23916** ⚠️ | | | 63.57 | 6.5 |
| [DocLing](#DocLing) | 0.589 | | | 3.70429 | | | | 7.3 |
| [GOT-OCR](#GOT-OCR) | 0.289 | | 1212.7 | | | | | |
| [olmOCR](#olmOCR) | | | **1813.0** ⚠️ | | | | | |
| [MarkItDown](#MarkItDown) | | | | | | | | 7.78 |
| [Nougat](#Nougat) | 0.453 | | | | | | **81.42** | |
| [Zerox (OmniAI)](#Zerox) | | **91.7** ⚠️ | | | | | | 7.9 |
| [Unstructured](#Unstructured) | | 50.8 | | | | 91.18 | | 6.2 |
| [Pix2Text](#Pix2Text) | | | | | | | 64.39 | |
| [open-parse](#open-parse) | | | | | | | | |
| [Markdrop](#markdrop) | | | | | | | | |
| | | | | | | | | |
| Mistral OCR 2503 | | | | | **94.89** ⚠️ | | | |
| Google Document AI | | 67.8 | | | 83.42 | 90.86 | | |
| Azure OCR | | 85.1 | | | 89.52 | 87.69 | | |
| AWS Textract | | 74.3 | | | | 96.71 | | |
| [LlamaParse](#LlamaParse) | | | | 3.97619 | | 92.82 | | 7.1 |
| [Mathpix](#Mathpix) | 0.189 | | | 4.15626 | | | | |
| upstage | | | | | | **97.02** ⚠️ | | |
| | | | | | | | | |
| Gemini-1.5-Flash-002 | | | | | 90.23 | | | |
| Gemini-1.5-Pro-002 | | | | | 89.92 | | | |
| Gemini-2.0-Flash-001 | | 86.1 | | | 88.69 | | | |
| GPT-4o                        | 0.233                                   | 75.5                                        |                               |                                        | 89.77                                         |                             |                             |                                           |
| Claude Sonnet 3.5 | | 69.3 | | | | | | |
- **Bold** indicates the best result for a given metric.
- An empty cell means the pipeline was not evaluated on that benchmark.
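The olmOCR ELO column ranks pipelines from pairwise preference judgments rather than an absolute score. A minimal sketch of the standard Elo update that such rankings are built on is below; the k-factor of 32 and the 1500 starting rating are common defaults, not values taken from the olmOCR evaluation.

```python
def elo_update(r_a: float, r_b: float, score_a: float, k: float = 32.0) -> tuple[float, float]:
    """Update two Elo ratings after one pairwise comparison.

    score_a is 1.0 if A's output was preferred, 0.0 if B's was, 0.5 for a tie.
    """
    # Expected score of A under the logistic Elo model (400-point scale).
    expected_a = 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))
    delta = k * (score_a - expected_a)
    return r_a + delta, r_b - delta

# Two pipelines start at 1500; A's output is preferred once.
a, b = elo_update(1500.0, 1500.0, score_a=1.0)
print(a, b)  # → 1516.0 1484.0
```

Running many such updates over a set of judged page pairs is what separates the ratings in the table (e.g. olmOCR's 1813.0 vs GOT-OCR's 1212.7).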
@@ -694,6 +701,17 @@ that aligns text with ground truth text segments, and an LLM as a judge scoring
| GPT-4o-2024-11-20 | 89.77 | 87.55 | 86.00 | 94.58 | 91.70 |
| Mistral OCR 2503 | **94.89** | **94.29** | **89.55** | **98.96** | **96.12** |
### [dp-bench](https://huggingface.co/datasets/upstage/dp-bench)
| Source | Request date | TEDS ↑ | TEDS-S ↑ | NID ↑ | Avg. Time (secs) ↓ |
|--------------|--------------|--------|----------|-------|--------------------|
| upstage | 2024-10-24 | 93.48 | 94.16 | 97.02 | 3.79 |
| aws | 2024-10-24 | 88.05 | 90.79 | 96.71 | 14.47 |
| llamaparse | 2024-10-24 | 74.57 | 76.34 | 92.82 | 4.14 |
| unstructured | 2024-10-24 | 65.56 | 70.00 | 91.18 | 13.14 |
| google | 2024-10-24 | 66.13 | 71.58 | 90.86 | 5.85 |
| microsoft | 2024-10-24 | 87.19 | 89.75 | 87.69 | 4.44 |
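The NID column above is an edit-distance-based similarity between extracted and ground-truth text, scaled so that 100 means an exact match. A minimal sketch of that idea, using plain Levenshtein distance normalized by the longer string, is below; the exact normalization and distance variant used by dp-bench may differ.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def nid_similarity(ref: str, hyp: str) -> float:
    """Similarity in [0, 100]; 100 means the texts match exactly."""
    if not ref and not hyp:
        return 100.0
    return 100.0 * (1.0 - edit_distance(ref, hyp) / max(len(ref), len(hyp)))

print(round(nid_similarity("kitten", "sitting"), 2))  # → 57.14
```

Scores like upstage's 97.02 mean the extracted text differs from the ground truth by only a few percent of its characters.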
### [Actualize pro](https://www.actualize.pro/recourses/unlocking-insights-from-pdfs-a-comparative-study-of-extraction-tools)
[![GitHub last commit](https://img.shields.io/github/last-commit/actualize-ae/pdf-benchmarking?label=GitHub&logo=github)](https://github.com/actualize-ae/pdf-benchmarking)