Visualizing graphs with overlapping node groups
I recently came across some data about multilateral agreements, which needed to be visualized as network plots. This data had …
Recently, I’ve worked a lot with geospatial data in Python. One thing that we needed for our analysis was generating …
Modern computers are equipped with processors that allow fast parallel computation at several levels: Vector or array operations, which allow …
I just uploaded my slides on probabilistic Topic Modeling with LDA that give an overview of the theory, the basic …
Web scraping, i.e. automated data mining from websites, usually involves fetching a web page’s HTML document, parsing it, extracting the …
Topic modeling is a method for finding abstract topics in a large collection of documents. With it, it is possible …
Suppose you have a list of addresses and want to connect them with some kind of location-based information. For example, …
I recently gave a small workshop on Text Preprocessing and Feature Extraction for Quantitative Text Analysis with Python at the …
This week the LATINNO project has published its comprehensive database on democratic innovations in South and Latin America on its …
When doing text processing with NLTK on large corpora, you often need a lot of patience since even simple methods …
Lemmatization is the process of finding the base (or dictionary) form of a possibly inflected word, its lemma. It …
As explained before, balloon plots can be a good way to compare many observations with lots of variables. I have …
The Django web framework is well suited for creating medium-sized research databases. It allows rapid development of a convenient …
Over the last few months I often had to deal with the problem of extracting tabular data from scanned documents. These …
Heat maps are great for comparing observations with lots of variables (which must be comparable in terms of unit, domain, …
As a small by-product of the experiments that I recently implemented with oTree, I created otreeutils. This package contains …
I’m currently working on implementing some multiplayer decision strategy games for different experiments in the field of Experimental Economics. We …
The Python Data Analysis Library pandas provides basic but reliable Excel input and output. However, more advanced features for writing …
Parallel Coordinate Plots are useful for visualizing multivariate data. R provides several packages/functions to draw Parallel Coordinate Plots (PCPs): ggparcoord …
Data manipulation works like a charm in R when using a library like dplyr. An often overlooked feature of this …