Modern Benchmark Report

This notebook summarizes the modern replication of the bachelor's thesis (TFG) and presents the benchmark results at a glance: a comparison table, plots, and visual examples from the evaluation set.

Context

The modern replication compares current anomalib models on data/mandarins_pynq_cropped using fixed seeds. The primary metric is image_AUROC; ties are broken by image_AUPR and then by latency.
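
The ranking rule described above can be sketched as a sort with a compound key. This is a minimal illustration in plain Python, not the project's actual selection code: the helper name `rank_models` and the list-of-dicts layout are assumptions made for the example, and the numbers are copied from the leaderboard in this report.

```python
# Hypothetical helper illustrating the stated ranking rule:
# higher image_AUROC wins; ties go to higher image_AUPR, then to lower latency.

def rank_models(rows):
    """Return the rows sorted best-first."""
    return sorted(
        rows,
        key=lambda r: (
            -r['mean_image_AUROC'],   # primary metric, descending
            -r['mean_image_AUPR'],    # first tie-breaker, descending
            r['mean_latency_ms'],     # second tie-breaker, ascending
        ),
    )

# Aggregated values copied from the leaderboard in this report.
leaderboard_rows = [
    {'model': 'anomalydino', 'mean_image_AUROC': 0.786667,
     'mean_image_AUPR': 0.811111, 'mean_latency_ms': 106.279883},
    {'model': 'patchcore', 'mean_image_AUROC': 0.933333,
     'mean_image_AUPR': 0.939841, 'mean_latency_ms': 173.664240},
]

ranked = rank_models(leaderboard_rows)
print([r['model'] for r in ranked])
```

With only two models and a clear AUROC gap, the tie-breakers never come into play here; they would matter mainly if two models reached the same mean AUROC.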

In [1]:
from pathlib import Path
import json

import matplotlib.pyplot as plt
import pandas as pd
from IPython.display import Image, display

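# Walk up from the working directory until a folder holding both artifacts/ and README.md is found.
PROJECT_ROOT = None
for candidate in [Path.cwd().resolve(), *Path.cwd().resolve().parents]: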
    if (candidate / 'artifacts').exists() and (candidate / 'README.md').exists():
        PROJECT_ROOT = candidate
        break
if PROJECT_ROOT is None:
    raise RuntimeError('Could not locate the project root from the current working directory.')

benchmark_root = PROJECT_ROOT / 'artifacts' / 'modern' / 'benchmark'
final_root = PROJECT_ROOT / 'artifacts' / 'modern' / 'final_model'
leaderboard = pd.read_csv(benchmark_root / 'leaderboard.csv')
runs = pd.read_csv(benchmark_root / 'benchmark_runs.csv')
summary = json.loads((benchmark_root / 'metrics_summary.json').read_text(encoding='utf-8'))
winner_name = summary['winner']['model']
leaderboard
Out[1]:
         model  mean_image_AUROC  std_image_AUROC  mean_image_AUPR  std_image_AUPR  mean_image_F1  std_image_F1  mean_latency_ms  std_latency_ms  completed_runs
0    patchcore          0.933333         0.049889         0.939841        0.050396       0.840741      0.036665       173.664240        2.374339               3
1  anomalydino          0.786667         0.018856         0.811111        0.015713       0.551587      0.170681       106.279883        1.855926               3

Benchmark winner

This cell shows the winning model and its metrics aggregated over the three seeds that were run.

In [2]:
pd.DataFrame([summary['winner']])
Out[2]:
   completed_runs  mean_image_AUPR  mean_image_AUROC  mean_image_F1  mean_latency_ms      model  std_image_AUPR  std_image_AUROC  std_image_F1  std_latency_ms
0               3         0.939841          0.933333       0.840741        173.66424  patchcore        0.050396         0.049889      0.036665        2.374339

Quick plots

These two views show at a glance the trade-off between predictive quality and latency.

In [3]:
fig, axes = plt.subplots(1, 2, figsize=(12, 4))

axes[0].bar(leaderboard['model'], leaderboard['mean_image_AUROC'], color=['#d97706', '#2563eb'])
axes[0].set_ylim(0, 1.05)
axes[0].set_title('Mean image AUROC')
axes[0].set_ylabel('score')

axes[1].bar(leaderboard['model'], leaderboard['mean_latency_ms'], color=['#b45309', '#1d4ed8'])
axes[1].set_title('Latency per image (ms)')
axes[1].set_ylabel('ms')

fig.tight_layout()
plt.close(fig)  # keep the inline backend from rendering the figure a second time
fig
Out[3]:
[Figure: two side-by-side bar charts, "Mean image AUROC" (y-axis: score) and "Latency per image (ms)" (y-axis: ms), one bar per model]
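
The trade-off shown in the two charts can also be put into numbers. The snippet below is a small sketch using the aggregated values copied from the leaderboard in this report; the variable names are chosen for illustration only.

```python
# Aggregated values copied from the leaderboard in this report.
patchcore = {'auroc': 0.933333, 'latency_ms': 173.664240}
anomalydino = {'auroc': 0.786667, 'latency_ms': 106.279883}

# Quality gained by choosing patchcore over anomalydino.
auroc_gain = patchcore['auroc'] - anomalydino['auroc']

# Price paid in latency, as a ratio and as extra milliseconds per image.
latency_ratio = patchcore['latency_ms'] / anomalydino['latency_ms']
extra_ms = patchcore['latency_ms'] - anomalydino['latency_ms']

print(f'AUROC gain: {auroc_gain:+.3f}')
print(f'Latency ratio: {latency_ratio:.2f}x ({extra_ms:.1f} ms more per image)')
```

With these values, patchcore gains about 0.147 mean image AUROC while taking roughly 1.63 times as long per image.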

Results per seed

The full table makes it possible to check how stable each model's behavior is across partitions.

In [4]:
runs
Out[4]:
   seed        model  status  error_message  image_AUROC  image_F1Score  image_AUPR  latency_ms
0    13    patchcore      ok            NaN         1.00       0.888889    1.000000   176.98990
1    13  anomalydino      ok            NaN         0.80       0.571429    0.800000   107.14405
2    23    patchcore      ok            NaN         0.88       0.800000    0.876667   171.59989
3    23  anomalydino      ok            NaN         0.80       0.333333    0.800000   107.99410
4    42    patchcore      ok            NaN         0.92       0.833333    0.942857   172.40293
5    42  anomalydino      ok            NaN         0.76       0.750000    0.833333   103.70150
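
As a consistency check, the leaderboard aggregates can be recomputed from the per-seed values in the table above. The sketch below uses only the standard library and the image_AUROC column. It assumes the leaderboard's std columns are population standard deviations (dividing by n rather than n - 1), which is what the reported numbers appear to match; the helper name `summarize` is hypothetical.

```python
from statistics import fmean, pstdev

# Per-seed image_AUROC values copied from the runs table in this report
# (seeds 13, 23, 42, in that order).
auroc_by_model = {
    'patchcore': [1.00, 0.88, 0.92],
    'anomalydino': [0.80, 0.80, 0.76],
}

def summarize(values):
    """Return (mean, population standard deviation) of the values."""
    return fmean(values), pstdev(values)

for model, values in auroc_by_model.items():
    mean, std = summarize(values)
    print(f'{model}: mean={mean:.6f} std={std:.6f}')
```

Note that pandas' `Series.std()` defaults to the sample standard deviation (`ddof=1`), which would give a larger value (about 0.061 for patchcore); passing `ddof=0` reproduces the population figure. With only three seeds per model, either estimate is a rough indication of stability.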

Visual examples from the final set

The modern pipeline already generates images exported by anomalib. Normal and anomalous examples from the final split are shown here so that visual inspection is immediate.

In [5]:
gallery_root = final_root / 'Patchcore' / 'mandarine_cropped_modern' / 'v0' / 'images'
good_examples = sorted((gallery_root / 'good').glob('*'))[:3]
bad_examples = sorted((gallery_root / 'bad').glob('*'))[:3]

print('Good examples:')
for path in good_examples:
    print(path.name)
    display(Image(filename=str(path), width=420))

print('Bad examples:')
for path in bad_examples:
    print(path.name)
    display(Image(filename=str(path), width=420))
Good examples:
cropped_normal_0002.jpg
cropped_normal_0008.jpg
cropped_normal_0009.jpg
Bad examples:
cropped_abnormal_0001.jpg
cropped_abnormal_0003.jpg
cropped_abnormal_0004.jpg