New qeML Plotting Function
This article is originally published at https://matloff.wordpress.com
I’ve added a new function to qeML 1.2, qeMittalGraph, based on an idea by my student Aditya Mittal. Below is an example that I think is rather compelling.
The basic idea is quite simple (and not necessarily new, just something I had not seen before): Instead of comparing several curves directly, plot their growth from their initial baseline value. So if for example X is time, then all curves start from the common point X = 0, Y = 1. Viewing the curves in this manner may make comparison more insightful.
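To make the idea concrete, here is a minimal base-R sketch. It assumes that "growth from baseline" means dividing each series by its own first value; that is my reading of the description above and not necessarily how qeMittalGraph is implemented internally. The data frame `toy` and the helper `relGrowth` are hypothetical names, and the numbers are invented purely for illustration.

```r
# Toy data, invented for illustration: two series on very different scales.
toy <- data.frame(
  t = 0:4,
  a = c(100, 110, 121, 130, 150),
  b = c(2.0, 2.1, 2.4, 2.2, 2.6)
)

# Hypothetical helper: divide every curve by its own first value,
# so that each rescaled curve starts at Y = 1.
relGrowth <- function(d, xCol = 1) {
  out <- d
  yCols <- setdiff(seq_len(ncol(d)), xCol)
  out[yCols] <- lapply(d[yCols], function(y) y / y[1])
  out
}

g <- relGrowth(toy)

# Plot the rescaled curves on a common axis with base R graphics.
nCurves <- ncol(g) - 1
matplot(g$t, g[, -1], type = "l", lty = 1, col = seq_len(nCurves),
        xlab = "t", ylab = "value / initial value")
legend("topleft", legend = names(g)[-1], col = seq_len(nCurves), lty = 1)
```

On the raw scale, series b would be nearly flat next to series a; after rescaling, both start at 1 and their relative growth can be compared directly.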
As an example, we’ll use the currency dataset included in qeML, consisting of data on five currencies from the pre-euro era: the Canadian dollar, German mark, French franc, UK pound and Japanese yen.
> data(currency)
> head(currency)
  Can..dollar Ger..mark Fr..franc UK.pound J..yen
1          19       580     4.763       29    602
2          18       609     4.818       44    609
3          20       618     4.806       66    613
4          46       635     4.825       79    607
5          42       631     4.796       77    611
6          45       635     4.818       74    610
curr <- cbind(1:nrow(currency),currency)
names(curr)[1] <- 'weeknum'
OK, let’s graph the raw values:
z <- reshape2::melt(curr,id.vars='weeknum')
qePlotCurves(z,1,3,2)
Now with qeMittalGraph:
qeMittalGraph(curr,'weeknum','rate','country')
We immediately see two clusters, franc/mark/yen and Canadian dollar/pound, potentially a significant insight. There may be some economic context needed, but clearly this view could be of great interest.
Note that the ‘loess’ smoothing option is the default, which has resulted in one of the curves not passing through (0,1). Setting this option to FALSE would fix this, but at a cost of having jagged curves.
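The effect of smoothing on the starting point can be seen in a small base-R sketch. It uses simulated data (not the currency data) and calls stats::loess with its default settings as a stand-in; whether qeMittalGraph smooths in exactly this way is an assumption on my part.

```r
set.seed(1)
x <- 1:50
y <- 100 + cumsum(rnorm(50))   # simulated noisy series, not real data
rel <- y / y[1]                # rescaled so the raw curve starts at exactly 1

fit <- loess(rel ~ x)          # default span; a stand-in for the package's smoothing
smoothed <- predict(fit)       # fitted values at the original x

rel[1]        # the raw rescaled curve starts at exactly 1
smoothed[1]   # the smoothed curve is typically near 1, but need not equal it
```

The smoother fits local regressions through neighboring points rather than interpolating each one, so nothing forces the fitted curve through the first observation.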
Another class of use cases is graphing the effect of a hyperparameter, say graphing the effect of minimum leaf size X in random forests, over several different datasets, with Y = Mean Absolute Prediction Error.