David Dukić

Postdoctoral NLP researcher at TakeLab, University of Zagreb, Croatia

I am a postdoctoral researcher at the University of Zagreb specializing in natural language processing. I obtained my PhD (summa cum laude) in 2025 from the Faculty of Electrical Engineering and Computing, University of Zagreb, under the supervision of Jan Šnajder. I completed my bachelor’s degree in 2019 and my master’s degree in 2021 at the same faculty. During my master’s studies, I received the Rector’s Award for scientific research. Since 2021, I have been employed as an assistant at the Faculty of Electrical Engineering and Computing and am a member of TakeLab, where I lead the TakeLab Retriever project. TakeLab Retriever is a platform that has scanned and indexed (with topics, named entities, and phrases) more than ten million articles published by Croatian news outlets over the last 25 years. During my PhD, I co-authored more than ten scientific papers, publishing at leading NLP venues such as ACL and EMNLP. In the final year of my PhD, I completed a three-month research visit to the WüNLP group in Germany with Professor Goran Glavaš.

News

  1. April 30, 2026 — Talk at Fondazione Bruno Kessler (Trento)
  2. April 28, 2026 — Talk at Srce DEI 2026
  3. March 24, 2026 — Poster at EACL 2026 (Rabat)
  4. March 23, 2026 — Interview at HRT radio
  5. March 11, 2026 — Interview at HRT radio
  6. December 4, 2025 — Talk at Faculty of Political Science in Zagreb
  7. November 5, 2025 — Interview at HRT radio
  8. November 5, 2025 — Talk at Croatian Academy of Sciences and Arts
  9. October 13, 2025 — Interview at HRT television
  10. September 30, 2025 — PhD thesis defended
  11. July 31, 2025 — Talk at SLAVIC NLP 2025 (Vienna)
  12. June 19, 2025 — Interview at HRT television
  13. June 5, 2025 — Talk at PhD day
  14. March 26, 2025 — Talk at Srce DEI 2025
  15. February 28, 2025 — DU-CHECK Podcast
  16. November 13, 2024 — Poster at EMNLP 2024 (Miami)
  17. September 4, 2024 — 3 month visit starts at WüNLP (Prof. Glavaš)
  18. August 14, 2024 — Poster at ACL 2024 (Bangkok)
  19. July 14, 2024 — First day of EEML summer school in Novi Sad
  20. April 9, 2024 — Interview for NOVA TV
  21. March 22, 2024 — Poster at EACL 2024 (Malta)
  22. December 7, 2023 — Interview for Netokracija outlet
  23. October 20, 2023 — Talk at AI2FUTURE conference (Zagreb)
  24. June 25, 2023 — First day of MLSS summer school in Krakow
  25. May 6, 2023 — Poster at SLAVIC NLP 2023 (Dubrovnik)
  26. November 8, 2022 — Talk at Faculty of Political Science (FPZG)
  27. July 23, 2022 — First day of LXMLS summer school in Lisbon


Selected Publications

Sequence Repetition Enhances Token Embeddings and Improves Sequence Labeling with Decoder-only Language Models
Matija Luka Kukić, Marko Čuljak, David Dukić, Martin Tutek, Jan Šnajder
EACL 2026 (Findings) | paper

Characterizing Linguistic Shifts in Croatian News via Diachronic Word Embeddings
David Dukić, Ana Barić, Marko Čuljak, Josip Jukić, Martin Tutek
ACL 2025 (SLAVIC NLP Workshop) | paper

Are ELECTRA's Sentence Embeddings Beyond Repair? The Case of Semantic Textual Similarity
Ivan Rep, David Dukić, Jan Šnajder
EMNLP 2024 (Findings) | paper

Looking Right is Sometimes Right: Investigating the Capabilities of Decoder-only LLMs for Sequence Labeling
David Dukić, Jan Šnajder
ACL 2024 (Findings) | paper

Leveraging Open Information Extraction for More Robust Domain Transfer of Event Trigger Detection
David Dukić, Kiril Gashteovski, Goran Glavaš, Jan Šnajder
EACL 2024 (Findings) | paper

Target Two Birds With One SToNe: Entity-Level Sentiment and Tone Analysis in Croatian News Headlines
Ana Barić, Laura Majer, David Dukić, Marijana Grbeša-Zenzerović, Jan Šnajder
EACL 2023 (SLAVIC NLP Workshop) | paper

Are You Human? Detecting Bots on Twitter Using BERT
David Dukić, Dominik Keča, Dominik Stipić
2020 IEEE 7th International Conference on Data Science and Advanced Analytics | paper


Teaching

Introduction to Artificial Intelligence
Machine Learning 1