Mantas Deployed, Time for Whales
Since last fall, I’ve been working on reidentifying reef manta rays with “low k-shot” (few labeled examples). That project is now deployed to the scientists and, until they start giving me feedback (and, hopefully, more labeled data), it’s pretty much at the finish line.
My ultimate goal is to build a “bring your own species” reidentification pipeline. The technique(s) I use are well-trodden at the research level and are unlikely to get much more attention in the near term: everyone’s agog about Large Language Models and Generative AI, which this is not. There might be a breakthrough in zero-shot reidentification, the situation where you’re going through your photos and not only say “I haven’t seen this individual before,” but a while later say “Hey, that’s the new individual I saw 100 photos ago.” My manta model works very well with a dozen photos of an individual and works helpfully-well with as few as five, but doesn’t generalize helpfully below that (it’s better than random, but it’s not likely to put the right individual in the first screen or two of results).
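To make “the first screen or two of results” concrete, here is a minimal sketch of the ranking step. It assumes an embedding-based approach in which every photo is mapped to a vector and known individuals are ranked by cosine similarity to the query photo. That approach, the function name `rank_individuals`, the `new_threshold` value, and the toy vectors are all illustrative assumptions, not a description of the deployed manta model.

```python
import numpy as np

def rank_individuals(query, gallery, labels, new_threshold=0.5):
    """Rank known individuals by cosine similarity to a query embedding.

    query:   1-D embedding of the new photo (hypothetical model output).
    gallery: 2-D array, one embedding per labeled photo.
    labels:  individual ID for each gallery row.
    new_threshold is an arbitrary illustrative cutoff, not a tuned value.
    """
    # Normalize so that a dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = g @ q

    # Score each individual by its best-matching labeled photo.
    best = {}
    for label, sim in zip(labels, sims):
        best[label] = max(best.get(label, -1.0), float(sim))

    # Highest-scoring individuals come first (the "first screen" of results).
    ranked = sorted(best.items(), key=lambda kv: kv[1], reverse=True)

    # If even the best match is weak, flag a possible new individual.
    is_new = ranked[0][1] < new_threshold
    return ranked, is_new

# Toy data: two labeled photos of one manta, one of another.
gallery = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
labels = ["manta_A", "manta_A", "manta_B"]

ranked, is_new = rank_individuals(np.array([0.8, 0.2]), gallery, labels)
print(ranked, is_new)
```

With more labeled photos per individual there are more chances for one of them to sit close to the query, which is one intuition for why a dozen examples work better than five.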
I chose mantas for two reasons:
1) There was already a very good paper and dataset [moskuyak link here] that I could use to validate my code.
2) I see boats heading to the manta ray feeding sites every night.
Reidentifying cetaceans is also a well-trodden field from a research perspective. An organization called Happywhale has a very large dataset of labeled photos and, for many species at least, some good reidentification models. I’m working with some proprietary data from a research organization, and I don’t want to say any more than that the project involves melon-headed whales (P. electra). I’ve already done some initial data analysis and built a throwaway model that looks encouraging. The scientific goals go beyond reidentification and involve some pretty fun mathematics. From my perspective, a new species and dataset should help me generalize my code towards the ultimate “bring your own species” goal.
I don’t know how long this project will last, as I might be able to contribute above and beyond the reidentification aspect.