Nico Colic

Background

About

Since finishing my MSc in Computational Linguistics at the University of Zurich in 2016, I have worked as a researcher, programmer and lecturer in Switzerland and Japan 👨‍🎓 Alongside this, I have tended bar throughout, creating award-winning alcoholic and non-alcoholic drinks and developing a fascination for all things olfaction 👃

Work Experience

Scientific Programmer,
Apr, 2021 - Dec, 2021 (8 months)
💊 💽 In the SwissMADE project, I am responsible for the automatic processing of electronic patient reports to discover adverse drug events, while in the BERGAMOS project, I am contributing to making annotations on the biomedical literature more interchangeable.
Lecturer In Computer Science,
Aug, 2020 - Present
🕵️‍♀️ 🛁 Teaching a variety of courses to young adults aged 14 to 20, such as advanced networking, introduction to programming in C#, web application development and, particularly engaging, an Arduino tinkering class. Furthermore, I am responsible for the development of new teaching materials for a machine learning course.
Visiting Lecturer,
Mar, 2021 - Present
🐍 👅 Developing and teaching an introductory Python course aimed at linguists, with a focus on NLP.
Scientific Programmer,
Dec, 2018 - Dec, 2021 (3 years)
💊 📈 As part of a project on evaluating patient data involving several Swiss hospitals (CHUV, KSB, HUG, USZ) and universities (UZH, UniL), I work to discover adverse drug reactions using NLP techniques.
Teaching Assistant,
Dec, 2014 - Aug, 2020 (5 years 8 months)
🎩 👨‍🏫 Running exercises, tutoring sessions and the occasional lecture for XML, Software Engineering, Introduction to Programming and, most recently, Advanced Text Mining Techniques
Research Assistant,
Dec, 2016 - Dec, 2020 (4 years)
🩺 🏔 Working on several projects related to Biomedical Text Mining at the OntoGene group (ontogene.org) at the Department of Computational Linguistics, using natural language processing technologies, Python and machine learning, and co-authoring several papers.
Visiting Lecturer,
Dec, 2019 - Dec, 2019 (1 day)
🐠 👨‍🎓 Teaching in a course about XML for a continuing education programme for the employees of the Zentralbibliothek
Visiting Researcher,
Dec, 2016 - Dec, 2016 (1 day)
🇯🇵 🌳 Building a RESTful API in Ruby for several dependency parsers and pubannotation.org, an online repository of biomedical annotations.

Skills

Machine Learning
University Teaching
Natural Language Processing
Ruby
Computational Linguistics
Information Extraction
Text Mining
Python
Social Media Mining

Education

Computational Linguistics, Master’s Degree, University of Zurich
Dec, 2014 - Dec, 2016
summa cum laude
Neuroinformatics, Bachelor’s Degree, University of Zurich
Dec, 2009 - Dec, 2013

Awards

JSPS Fostering Joint International Research, Japan Society for the Promotion of Science
Awarded on: Dec 31, 2016
summa cum laude, University of Zurich
Awarded on: Dec 31, 2016

Volunteer Work

Nightliner, VSUZH Verband der Studierenden der Universität Zürich
Dec, 2012 - Dec, 2013
☎️ 🌙 During my Bachelor's studies, I volunteered at Nightline Zürich, a service that students can call if they are overwhelmed by their situation and need a listener and advice.
Martial Arts Instructor, Akademischer Sportverband Zürich (ASVZ)
Dec, 2006 - Dec, 2020
💪 🥋 I have practised karate for over 20 years and have attained my dan (black belt). Particularly rewarding was my job as a trainer of a children's class. Since then, I have taken up Wing Chun and teach students at the university.

Publications

Automated Detection of Adverse Drug Events from Older Patients’ Electronic Medical Records Using Text Mining, ICPR 2021: Pattern Recognition. ICPR International Workshops and Challenges
Published on: Dec 31, 2021
The Swiss Monitoring of Adverse Drug Events (SwissMADE) project is part of the SNSF-funded Smarter Health Care initiative, which aims at improving health services for the public. Its goal is to use text mining on electronic patient reports to automatically detect adverse drug events in hospitalised elderly patients who received anti-thrombotic drugs. The project is the first of its kind in Switzerland: the data is provided by four hospitals from both the German- and French-speaking parts of Switzerland, none of which had previously released electronic patient records for research, making extraction and anonymisation of records one of the major challenges of the project.

In this paper, we describe the part of the project concerned with the de-identification and annotation of German data obtained from one of the hospitals in the form of patient reports.

All of these reports are automatically de-identified using a dictionary-based approach augmented with manually created rules, and then automatically annotated. For this, we employ our entity recognition pipeline called OGER (OntoGene Entity Recognizer), also a dictionary-based approach, augmented by an adapted transformer model to obtain state-of-the-art performance, to detect drug, disease and symptom mentions in these reports. Furthermore, a subset of reports is manually annotated for drugs and diagnoses by a medical expert, serving as a validation set for the automatic annotations.
Annotating the Pandemic: Named Entity Recognition and Normalisation in COVID-19 Literature, Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020
Published on: Dec 31, 2020
The COVID-19 pandemic has been accompanied by such an explosive increase in media coverage and scientific publications that researchers find it difficult to keep up. We are presenting a publicly available pipeline to perform named entity recognition and normalisation in parallel to help find relevant publications and to aid in downstream NLP tasks such as text summarisation. In our approach, we are using a dictionary-based system for its high recall in conjunction with two models based on BioBERT for their accuracy. Their outputs are combined according to different strategies depending on the entity type. In addition, we are using a manually crafted dictionary to increase performance for new concepts related to COVID-19. We have previously evaluated our work on the CRAFT corpus, and make the output of our pipeline available on two visualisation platforms.
Approaching SMM4H with Merged Models and Multi-task Learning, Proceedings of the 4th Social Media Mining for Health Applications (#SMM4H) Workshop & Shared Task
Published on: Dec 31, 2019
We describe our submissions to the 4th edition of the Social Media Mining for Health Applications (SMM4H) shared task. Our team (UZH) participated in two sub-tasks: Automatic classifications of adverse effects mentions in tweets (Task 1) and Generalizable identification of personal health experience mentions (Task 4). For our submissions, we exploited ensembles based on a pre-trained language representation with a neural transformer architecture (BERT) (Tasks 1 and 4) and a CNN-BiLSTM(-CRF) network within a multi-task learning scenario (Task 1). These systems are placed on top of a carefully crafted pipeline of domain-specific preprocessing steps.
Improving spaCy dependency annotation and PoS tagging web service using independent NER services, Genomics Inform.
Published on: Dec 31, 2019
Dependency parsing is often used as a component in many text analysis pipelines. However, performance, especially in specialized domains, suffers from the presence of complex terminology. Our hypothesis is that including named entity annotations can improve the speed and quality of dependency parses. As part of BLAH5, we built a web service delivering improved dependency parses by taking into account named entity annotations obtained by third party services. Our evaluation shows improved results and better speed.
OGER++: hybrid multi-type entity recognition, Journal of Cheminformatics
Published on: Dec 31, 2019
We present a text-mining tool for recognizing biomedical entities in scientific literature. OGER++ is a hybrid system for named entity recognition and concept recognition (linking), which combines a dictionary-based annotator with a corpus-based disambiguation component. The annotator uses an efficient look-up strategy combined with a normalization method for matching spelling variants. The disambiguation classifier is implemented as a feed-forward neural network which acts as a postfilter to the previous step.
UZH@SMM4H: System Descriptions, SMM4H: The 3rd Social Media Mining for Health Applications Workshop and Shared Task
Published on: Dec 31, 2019
Our team at the University of Zurich participated in the first 3 of the 4 sub-tasks at the Social Media Mining for Health Applications (SMM4H) shared task. We experimented with different approaches for text classification, namely traditional feature-based classifiers (Logistic Regression and Support Vector Machines), shallow neural networks, RCNNs, and CNNs. This system description paper provides details regarding the different system architectures and the achieved results.
Using a Hybrid Approach for Entity Recognition in the Biomedical Domain, 7th International Symposium on Semantic Mining in Biomedicine
Published on: Dec 31, 2016
This paper presents an approach towards high performance extraction of biomedical entities from the literature, which is built by combining a high recall dictionary-based technique with a high-precision machine learning filtering step. The technique is then evaluated on the CRAFT corpus. We present the performance we obtained, analyze the errors and propose a possible follow-up of this work.