Extended metrics on MIDOG 2022

MIDOG 2022 comes with extended metrics compared to its predecessor. Following popular request, we have added average precision (AP), which is commonly used in object detection, to the threshold-based metrics (F1, precision, and recall).
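To illustrate how the two kinds of metrics differ, here is a minimal sketch. It is not the challenge's evaluation code. It assumes every detection has already been matched against the ground truth (the matching rule is not described in this post), and it uses one common, non-interpolated definition of AP. The function names and the example numbers are made up for illustration.

```python
from typing import List, Tuple

# Each detection: (probability, is_true_positive).
# Assumption: matching against ground truth has already been done elsewhere.
Detection = Tuple[float, bool]


def average_precision(detections: List[Detection], num_ground_truth: int) -> float:
    """Non-interpolated AP: mean of the precision values at each true positive,
    taken over all ground-truth objects (missed objects contribute 0)."""
    if num_ground_truth == 0:
        return 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            true_positives += 1
            precision_sum += true_positives / rank
    return precision_sum / num_ground_truth


def f1_at_threshold(detections: List[Detection], num_ground_truth: int,
                    threshold: float) -> float:
    """F1 score when only detections above the threshold are counted."""
    tp = sum(1 for p, hit in detections if p > threshold and hit)
    fp = sum(1 for p, hit in detections if p > threshold and not hit)
    fn = num_ground_truth - tp
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical example: four detections, three ground-truth mitotic figures.
example = [(0.9, True), (0.8, False), (0.7, True), (0.3, True)]
print(average_precision(example, 3))
print(f1_at_threshold(example, 3, threshold=0.5))
print(f1_at_threshold(example, 3, threshold=0.2))
```

In this toy example, lowering the cut-off from 0.5 to 0.2 raises F1, while AP stays the same because it depends only on how the detections are ranked by probability.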

Additionally, participants will be able to see the performance on each of the tumor types (four in the preliminary test set and ten in the final test set).

Changes in the Docker container

These new metrics, however, also require a small change in the container. The reference Docker container has already been updated to reflect these changes.

In particular, the output format now provides a “probability” field and a “name” field for each detection.

In the probability field, we expect a number between 0 and 1, which reflects the detection probability (e.g., after softmax).

The “name” field is of particular importance: it must be set to “mitotic figure” whenever a detection is above the threshold. Otherwise, it may have any other value (for instance, “non-mitotic figure”). We use this field for the threshold-based metrics (F1, precision, and recall). For the AP metric, this field is not evaluated, but the “probability” field is.
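As a rough sketch of how a container could fill these two fields, consider the following. Only the “name” and “probability” fields and the label “mitotic figure” come from the description above. The “point” key, the helper name `to_detection`, and the threshold of 0.5 are assumptions made for illustration, so please check the reference Docker container for the exact output layout.

```python
import json
from typing import Any, Dict

MITOTIC_LABEL = "mitotic figure"
OTHER_LABEL = "non-mitotic figure"  # any other value would do as well


def to_detection(x: float, y: float, probability: float,
                 threshold: float) -> Dict[str, Any]:
    """Build one detection record (hypothetical helper, not the reference code)."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return {
        "point": [x, y],  # assumed key for the coordinates
        "probability": probability,  # used for the AP metric
        # "name" drives the threshold-based metrics (F1, precision, recall)
        "name": MITOTIC_LABEL if probability > threshold else OTHER_LABEL,
    }


# Hypothetical usage with a made-up operating point of 0.5.
detections = [
    to_detection(120.0, 340.5, 0.92, threshold=0.5),
    to_detection(410.0, 88.0, 0.31, threshold=0.5),
]
print(json.dumps(detections, indent=2))
```

Reporting detections below the threshold as well (with a different “name”) keeps them available for the AP calculation without affecting F1.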


Sen Yang
26. July 2022 at 4:05

Dear all,
So what is the final ranking?
Thanks
Sen

Sen Yang
26. July 2022 at 4:31

I think F1 is more reasonable. AP takes the probability into account, but it is mostly used in computer vision and rarely in clinical practice. Doctors only care about the final decision, mitosis or not; a numerical probability of presence plays no role for them.

aubreville
26. July 2022 at 6:40

Dear Sen,

The decisive metric is the F1 score, as with MIDOG 2021. But I think AP adds significant insight into the results. For instance, if an approach has a very high AP value but failed (for some reason) to find a proper cut-off, then it would be an approach worth looking into further 🙂

Best regards,

Marc

Sen Yang
26. July 2022 at 8:05

Dear Marc,
It’s okay to put the AP value in the article, but in this case, my docker from last year has to be repackaged.
Thanks
Sen

aubreville
26. July 2022 at 8:29

Dear Sen,
This is only about MIDOG 2022; MIDOG 2021 is not affected. As you rightly stated, we did not have the outputs needed to calculate an AP metric for MIDOG 2021.
If your plan was to use your Docker container from MIDOG 2021 without modifications, I can only say that I’d strongly recommend against it. The domain shift from MIDOG 2021 to 2022 is pretty significant. Please also see how the MIDOG 2021 baseline approach did on the MIDOG 2022 leaderboard.

Best,

Marc

Sen Yang
27. July 2022 at 9:54

Dear Marc,
Thanks. Yes, I will retrain with a new strategy. It’s just the way of packaging that needs to be changed.
Thanks
Sen


April 20	Training set released
August 5 - August 25 23:59 CEST	Preliminary test phase. Submissions on preliminary test set possible.
August 26 - August 29 23:59 CEST	Submission on final test set possible.
Sept 18	Challenge workshop
