Benchmark
Compose competitors (OCR, OCR→LLM, VLM) and compare them on a corpus in a single run. With no competitor added, the run is a demonstration.
Corpus
Select a prepared corpus, or stay in demonstration mode.
Corpus ready to run
Choose in the Library
Competitors
Each row summarizes one pipeline that will actually run.
Queue ready to run

No competitor added yet.

Options
Normalization
Custom profile (YAML) — takes precedence if filled

    
Filtered from both the ground truth (GT) and the hypothesis before scoring; those characters are no longer measured.
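The filtering rule above can be illustrated with a minimal sketch. It assumes the score is a character error rate (CER) based on Levenshtein distance, which this page does not confirm; the function names `strip_chars`, `edit_distance`, and `cer` are hypothetical and not part of the tool.

```python
def strip_chars(text: str, excluded: str) -> str:
    """Remove every character listed in `excluded` from `text`."""
    table = {ord(ch): None for ch in excluded}
    return text.translate(table)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def cer(gt: str, hyp: str, excluded: str = "") -> float:
    """Assumed metric: edit distance divided by the filtered GT length."""
    # The same filter is applied to both sides before any comparison.
    gt_f = strip_chars(gt, excluded)
    hyp_f = strip_chars(hyp, excluded)
    if not gt_f:
        return 0.0 if not hyp_f else 1.0
    return edit_distance(gt_f, hyp_f) / len(gt_f)
```

With `excluded=",!"`, a hypothesis that only drops punctuation scores the same as an exact match, because the punctuation is removed from the ground truth as well.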
Picks the report's ranking columns; does not drop collected data (sections unchanged).
Saves the form state (competitors + options) as JSON; reloadable later. No server persistence.
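As a rough sketch of what saving and reloading the form state might look like, the snippet below round-trips a state object through JSON. The key names `competitors` and `options` mirror the description above, but the actual file layout is not documented here, so the structure and the helper names `dump_form_state` and `load_form_state` are assumptions.

```python
import json


def dump_form_state(competitors: list, options: dict) -> str:
    """Serialize the form state to a JSON string (hypothetical layout)."""
    state = {"competitors": competitors, "options": options}
    return json.dumps(state, ensure_ascii=False, indent=2, sort_keys=True)


def load_form_state(payload: str) -> tuple:
    """Restore competitors and options, defaulting to empty values."""
    state = json.loads(payload)
    return state.get("competitors", []), state.get("options", {})


# Example values are illustrative only.
saved = dump_form_state(
    competitors=[{"kind": "OCR"}, {"kind": "OCR→LLM"}],
    options={"normalization": "custom"},
)
competitors, options = load_form_state(saved)
```

Because nothing is persisted on the server, the JSON file itself is the only copy of the configuration.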
Run the benchmark
Launch the run, then open the final report or segmentation view.
Status: Ready.