Human Data Science (HDS)



SALTclass (Short and Long Text Classifier) is a Python module for text classification built under the MIT license.

Short text classification can be defined as follows: given a set of documents with representation D and a set of labels C, define a function F that assigns a label from C to each document in D. Because short texts contain few words and therefore yield sparse representations, we try to optimize D and F so that they perform better when managing and analyzing text data from electronic health records.
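To make the definition concrete, the following is a minimal sketch in plain Python, not SALTclass's own API or method. It assumes a character trigram count vector as the representation D and a nearest-centroid rule with cosine similarity as the function F. The function names, the toy documents and the labels are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=3):
    """Represent a document as counts of character n-grams (the representation D)."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fit(documents, labels):
    """Sum the n-gram counts of the training documents per label (one centroid per label in C)."""
    centroids = {}
    for doc, label in zip(documents, labels):
        centroids.setdefault(label, Counter()).update(char_ngrams(doc))
    return centroids


def classify(centroids, text):
    """The function F: return the label whose centroid is most similar to the document."""
    vec = char_ngrams(text)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Made-up toy notes and labels, for illustration only.
train_docs = [
    "chest pain and shortness of breath",
    "chest tightness on exertion",
    "itchy red skin rash",
    "dry skin with rash on arm",
]
train_labels = ["cardiology", "cardiology", "dermatology", "dermatology"]

centroids = fit(train_docs, train_labels)
print(classify(centroids, "rash on skin"))
```

Character n-grams are used here because they give a very short text more features than whole words would, which is one simple way to soften the sparsity problem described above.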

For more information, check out the publication, GitHub page and PyPI repository.


cmfilter (coordinate-wise mediation filter) is an R package for the simultaneous discovery of multiple mediators in an x → M → y system.

Check out the GitHub repository and publication for more information on cmfilter.


tensorsem is an R and Python package for structural equation modeling using Torch. The package is meant for researchers who know their way around SEM, Torch, and lavaan.

Check out the GitHub repository for tensorsem.



ASReview (Active Systematic Review) is software developed to make systematic reviews easier and faster. Using deep learning, ASReview predicts which papers are likely to be included in the review. The software is developed in collaboration with the University of Amsterdam.

For more information, see the ASReview website and GitHub repository.


SQP (Survey Quality Predictor) is a system for predicting the quality of questions used in survey research. SQP has an extensive open-source database of survey questions and quality estimates, built collaboratively by its users.

To read more, see the SQP website.