How was the reference approach trained?

We provided all participants with a reference Docker container, developed by Frauke Wilm (FAU, Germany). Since the approach seems harder to beat than we expected, we prepared a two-pager to let you know exactly how she did it. Of course, the approach was not optimized on the test set or the preliminary test set in any way.

In principle, the approach is a vanilla RetinaNet approach with a domain adversarial branch added:

Figure 1 from the pre-print
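To make the idea of a domain adversarial branch more concrete, here is a minimal, framework-free sketch. It assumes the branch is attached through a gradient reversal layer, which is a common way to implement domain adversarial training; the class name `GradientReversal`, the weight `lam`, and the helper `combined_feature_gradient` are hypothetical and are not taken from the reference code. Please refer to the paper and the code in the container for what was actually done.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GradientReversal:
    """Identity in the forward pass, sign-flipped and scaled gradient in the backward pass."""

    lam: float = 1.0  # hypothetical weight of the adversarial signal

    def forward(self, features: List[float]) -> List[float]:
        # Features reach the domain classifier unchanged.
        return list(features)

    def backward(self, grad_from_domain_head: List[float]) -> List[float]:
        # The domain-loss gradient is reversed before it reaches the shared
        # feature extractor, pushing the features to become less
        # domain-discriminative.
        return [-self.lam * g for g in grad_from_domain_head]


def combined_feature_gradient(
    grad_detection: List[float],
    grad_domain: List[float],
    grl: GradientReversal,
) -> List[float]:
    """Gradient w.r.t. the shared features: detection gradient plus reversed domain gradient."""
    reversed_grad = grl.backward(grad_domain)
    return [d + r for d, r in zip(grad_detection, reversed_grad)]


# Toy numbers, purely illustrative.
grl = GradientReversal(lam=0.5)
features = [0.2, -1.0, 3.0]
passed_through = grl.forward(features)
update = combined_feature_gradient([1.0, 0.0, -2.0], [0.4, -0.8, 2.0], grl)
print(passed_through, update)
```

In a real detector the same logic would typically live inside the deep learning framework's autograd mechanism rather than in hand-written lists; the sketch only shows which sign each loss contributes to the shared features.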

You can find all the details in the paper:

Wilm, F., Breininger, K., Aubreville, M.: Domain Adversarial RetinaNet as a Reference Algorithm for the MItosis DOmain Generalization (MIDOG) Challenge,


  1. Hello Marc,
    thank you very much for this great challenge. I’m just wondering why you are releasing Frauke Wilm’s reference method before the deadline. For me, and I bet for some others as well, it’s actually pretty easy to add this method to an existing pipeline (i.e. I’ve already implemented it in my networks, but not tested it). Using this feature right now would feel like cheating. So is the purpose of this blog post to make use of your experience, or what was the intent?

    • Dear Jakob,

      there are a number of reasons for that:

    • We wanted to have a showcase of how to describe your algorithm for the two-page abstract submission that we require as a part of the submission. It’s one thing to provide a template but another thing to provide an example.
    • The approach is a pretty straightforward standard approach for domain augmentation. That’s why we chose it as a reference, which was, by the way, available all along (the code is part of the Docker container!), and a look at the implementation reveals most parts of the approach.
    • If you want to build on that (and improve on it): yes, go ahead. Everything we do typically means taking something that works well and improving on it with our own ideas. But keep in mind that we require a description of what they did from all participants, and of course your submission needs to represent your own work. So just reimplementing the scheme we describe is not enough.
    • Please also keep in mind that performance on the preliminary test set is not equal to performance on the test set.

      Have fun in the challenge! Best,


  2. It seems that the best method on the leaderboard achieves an F1 score of 0.7514, exactly equal to this baseline algorithm?

  3. Hi, according to the preprint, I think this baseline approach didn’t use the hard negative labels. Am I right?

    • Yes, that is correct. We experimented with both but ultimately decided to only train with the mitotic figure labels.
