Welcome to the Human Language Technology Group at the Department of Computer and Systems Sciences (DSV) at Stockholm University.
Within the Human Language Technology Group we develop efficient, resource-lean NLP methods, resources and tools that apply language technology to information access and refinement over very large text sets, with a special interest in the medical domain. Our main focus is the Swedish language, but we also work with English and other languages.
Research areas
- Text summarization
- Text extraction
- Text generation
- Semantic modelling
- Information retrieval with HLT techniques
- Medical informatics with HLT techniques
Courses
The links below will open the course syllabus in a separate tab/window.
The ISBI course gives an insight into techniques for information searching and news monitoring on the Internet. It presents the fundamentals of information retrieval and human language technology, as well as business intelligence (omvärldsbevakning) techniques such as news archives and indexing tools, news alerts and agents, and RSS-based news surveillance tools.
- WEBMIN/IV2038 – Web Mining, 7.5 hp
The Internet contains a huge amount of information, and that amount is growing at an ever-increasing pace. People, organizations and corporations all over the world continuously add different types of information to the web in various languages. The web therefore contains potentially very interesting and valuable information. This course investigates various techniques for processing the web in order to extract such information, refine it and make it more structured, thus making it both more valuable and more accessible. These techniques are often referred to as web mining techniques.
- Natural Language Processing, 7.5 hp, starting in the fall of 2012
Master thesis proposals
Ongoing projects
Past projects
- KEA – Knowledge Extraction Agent
- IMAIL – Intelligent e-mail answering service for eGovernment
- TvärSök – Tvärspråklig sökning på skandinaviska (cross-lingual search in the Scandinavian languages)
- The use of language tools for writers in the context of learning Swedish as a second language
Publications
- Hercules Dalianis’ list of publications
- Martin Hassel’s list of publications
- Ola Knutsson’s list of publications
- Tessy C-Pargman’s list of publications
- Petter Karlström’s list of publications
- Sumithra Velupillai’s list of publications
- Maria Skeppstedt’s list of publications
- Andrea Andrenucci’s list of publications
- Eriks Sneiders’ list of publications
Tools
- SweSum – An automatic text summarizer for 10 languages (demo)
- JavaSDM – Java tool-kit for working with Random Indexing (Java class files and source)
- SimilarityMeasures – Java class package containing a large number of Vector/Matrix Similarity Measures (Java class files and source)
- KTH eXtract Corpus – A small corpus of manually made news extracts (in Swedish)
- StemmingLab – An environment for experimenting with stemming and IR (also available in Swedish)
- SweNam – Named Entity Tagging tuned for Swedish
Contact: Martin Hassel