kuulmets@ut.ee
I am a PhD candidate in the Natural Language Processing lab at the University of Tartu, supervised by Prof. Mark Fishel. My dissertation is complete and pending defense.
My research focused on learning under data scarcity, with an emphasis on cross-lingual knowledge transfer for low-resource languages. I studied how synthetic and native data can be leveraged at different stages of model training, through continued pre-training, fine-tuning and prompting, to improve language understanding in settings where high-quality data is limited. I contributed to the development of Llammas 🐑.
In addition, my work addressed practical challenges of model evaluation in low-resource settings, including the use of translated and synthetic datasets, and resulted in the development of a new evaluation benchmark for low-resource Finno-Ugric languages.
Alongside my PhD research, I have experience applying NLP in industry settings. I have worked as a data scientist and software developer, including an internship at STACC where I focused on detecting adverse drug reactions from clinical text, work that became my Master’s thesis.
My current research interests span language-robust representations, reliable language modeling, and learning beyond static training setups, for example continual learning.
I am open to postdoctoral and industry research roles.
| 06/05/2025 | Launched baromeeter.ai - a collaborative Estonian chatbot evaluation platform. |
| 03/05/2025 | Presented at NAACL 2025. [poster] |
| 03/03/2025 | Presented at NoDaLiDa/Baltic-HLT 2025. [slides] |
| 23/01/2025 | LLMs for Extremely Low-Resource Finno-Ugric Languages was accepted at NAACL 2025 Findings. |
† Equal contribution