David Dukić


Postdoctoral NLP researcher at TakeLab, University of Zagreb, Croatia

I am a postdoctoral researcher at the University of Zagreb specializing in natural language processing. I obtained my PhD (summa cum laude) in 2025 from the Faculty of Electrical Engineering and Computing, University of Zagreb, under the supervision of Jan Šnajder. I completed my bachelor’s degree in 2019 and my master’s degree in 2021 at the same faculty. During my master’s studies, I received the Rector’s Award for scientific research. Since 2021, I have been employed as an assistant at the Faculty of Electrical Engineering and Computing and am a member of TakeLab, where I lead the TakeLab Retriever project. TakeLab Retriever is a platform that has scanned and indexed (with topics, named entities, and phrases) more than ten million articles published by Croatian news outlets over the last 25 years. During my PhD, I co-authored more than ten scientific papers, publishing at leading NLP venues such as ACL and EMNLP. In the final year of my PhD, I completed a three-month research visit to the WüNLP group in Germany with Professor Goran Glavaš.

News

  1. December 4, 2025 — Talk at Faculty of Political Science in Zagreb
  2. November 5, 2025 — Interview at HRT radio
  3. November 5, 2025 — Talk at Croatian Academy of Sciences and Arts
  4. October 13, 2025 — Interview at HRT television
  5. September 30, 2025 — PhD thesis defended
  6. July 31, 2025 — Talk at SLAVIC NLP 2025 (Vienna)
  7. June 19, 2025 — Interview at HRT television
  8. June 5, 2025 — Talk at PhD day
  9. March 26, 2025 — Talk at Srce DEI 2025
  10. February 28, 2025 — DU-CHECK Podcast
  11. November 13, 2024 — Poster at EMNLP 2024 (Miami)
  12. September 4, 2024 — Start of a 3-month visit at WüNLP (Prof. Glavaš)
  13. August 14, 2024 — Poster at ACL 2024 (Bangkok)
  14. July 14, 2024 — First day of EEML summer school in Novi Sad
  15. April 9, 2024 — Interview for NOVA TV
  16. March 22, 2024 — Poster at EACL 2024 (Malta)
  17. December 7, 2023 — Interview for Netokracija outlet
  18. October 20, 2023 — Talk at AI2FUTURE conference (Zagreb)
  19. June 25, 2023 — First day of MLSS summer school in Krakow
  20. May 6, 2023 — Poster at SLAVIC NLP 2023 (Dubrovnik)
  21. November 8, 2022 — Talk at Faculty of Political Science (FPZG)
  22. July 23, 2022 — First day of LXMLS summer school in Lisbon


Selected Publications

Sequence Repetition Enhances Token Embeddings and Improves Sequence Labeling with Decoder-only Language Models
Matija Luka Kukić, Marko Čuljak, David Dukić, Martin Tutek, Jan Šnajder
EACL 2026 (Findings) | paper

Characterizing Linguistic Shifts in Croatian News via Diachronic Word Embeddings
David Dukić, Ana Barić, Marko Čuljak, Josip Jukić, Martin Tutek
ACL 2025 (SLAVIC NLP Workshop) | paper

Are ELECTRA’s Sentence Embeddings Beyond Repair? The Case of Semantic Textual Similarity
Ivan Rep, David Dukić, Jan Šnajder
EMNLP 2024 (Findings) | paper

Looking Right is Sometimes Right: Investigating the Capabilities of Decoder-only LLMs for Sequence Labeling
David Dukić, Jan Šnajder
ACL 2024 (Findings) | paper

Leveraging Open Information Extraction for More Robust Domain Transfer of Event Trigger Detection
David Dukić, Kiril Gashteovski, Goran Glavaš, Jan Šnajder
EACL 2024 (Findings) | paper

Target Two Birds With One SToNe: Entity-Level Sentiment and Tone Analysis in Croatian News Headlines
Ana Barić, Laura Majer, David Dukić, Marijana Grbeša-Zenzerović, Jan Šnajder
EACL 2023 (SLAVIC NLP Workshop) | paper

Are You Human? Detecting Bots on Twitter Using BERT
David Dukić, Dominik Keča, Dominik Stipić
2020 IEEE 7th International Conference on Data Science and Advanced Analytics | paper


Teaching

Introduction to Artificial Intelligence
Machine Learning 1