Behind the Scenes: The Journey to Nepal's First ECCV Paper

Embarking on a journey in AI research from a developing country like Nepal is not for the faint-hearted. Here’s a behind-the-scenes look at how Anubhav and I reached a significant milestone with our paper, “iHuman: Instant Digital Humans From Monocular Videos,” which has been accepted to ECCV 2024.

The Obsession Begins

For the past two years, I’ve been obsessed with reconstructing human forms from images and depth maps. I had developed various techniques based on optimization and depth maps, but I hadn’t yet cracked the code for high-fidelity human reconstruction. Meanwhile, the idea of writing a paper felt as distant as the Himalayas. At our college, the most prestigious engineering school in Nepal, paper-writing was rarely discussed. Professors mentioned it occasionally, but what exactly is a research paper? How do you even start writing one? These were questions with no answers in sight.

The Prestigious College with No Research Lab

Our college, Pulchowk Campus, IOE, is the crème de la crème of engineering schools in Nepal, with only the top 100 out of over 30,000 applicants making the cut in computer engineering. Yet, there’s no research lab. The concept itself is foreign. Final-year projects are the norm, with many leaning towards AI/ML. But the irony? Our college doesn’t have a single GPU. Students rely on Google Colab for their projects.

The GPU Struggle and Innovation Under Constraints

For our project, my co-author Anubhav and I desperately needed GPUs to run methods that generate meshes from videos. I had a PC with a 7 GB GPU, and Anubhav had a laptop with a 2 GB GPU. That was our arsenal. So I decided to develop a method that could run in under 2 GB of memory and finish in less than a minute. This constraint not only focused our research direction but also filtered out time-wasting avenues early on. In a twist of irony, the lack of resources became our greatest ally.

Later, we got access to an RTX 4090 from a company linked to our supervisor. The catch? We could only use it before 9 am and after 6 pm, sharing it with three other teams. The struggle was real, but so was our determination.

From Good Engineers to Research Paper Authors

While we were good engineers, we had no experience in writing research papers. In fact, no one at our college had significant experience with top conferences. So we did what any resourceful students would do: we searched for and cold-emailed leading 3D vision researchers from Nepal. Our persistence paid off when we finally connected with Dr. Danda Pani Paudel and Dr. Ajad Chhatkuli, both postdoctoral researchers at ETH Zurich. Their guidance on writing the paper was invaluable. Making it to ECCV wouldn’t have been possible without their support.

The Birth of iHuman and Recognition at ECCV

Through sheer perseverance and innovative thinking, we developed "iHuman: Instant Digital Humans From Monocular Videos." This method achieves state-of-the-art results in human body reconstruction with minimal memory and computational requirements. The breakthrough? iHuman was accepted to ECCV 2024, making it the first paper from Nepal to be recognized at one of the top three conferences in computer vision. This achievement is extra special because it highlights the capabilities and potential of researchers from Nepal despite the numerous challenges. The acceptance rate is just 25-30%, even though the submissions come largely from top universities, PhD holders, and renowned labs, so this recognition is a testament to our hard work and dedication.

Conclusion

To anyone reading this, if you are in a position to help colleges like ours – where smart, driven students are hampered by a lack of resources and mentorship – please consider lending a hand. There’s untapped potential waiting to be unleashed. We hope our story inspires others and brings attention to the incredible talent and potential that exists in under-resourced regions.