Calculate the odds of winning Powerball in R
This Wednesday’s Powerball grand prize already climbed up to $1.5 BILLION. If you choose to cash out, it would be $930 million. And it keeps increasing… So, what’s the odd...continue reading.
In the previous post (https://statcompute.wordpress.com/2016/01/01/the-power-of-decision-stumps), it was shown that the boosting algorithm performs extremely well even with a simple 1-level stump as the base learner and provides a better performance...continue reading.
A decision stump is a weak classification model with a simple tree structure consisting of a single split, which can also be considered a one-level decision tree. Due to its simplicity,...continue reading.
This is an article we recently published in “Renewable and Sustainable Energy Reviews”. It starts with a thorough review of the methods used for wind resource assessment: from algorithms based...continue reading.
Back in January 2013 I wrote a blog post showing how to implement a basic cluster/block bootstrap in R. One drawback of the cluster bootstrap is the length of time...continue reading.
In the world of big data and real-time analytics, Microsoft users are still living with the constraints of the bygone days of little data and basic numeracy. If you happen to...continue reading.
Are you in Montreal and curious about big data? Well here is your chance to attend a session about the same at Concordia University on Tuesday, Nov. 03 at 6:00...continue reading.
When modeling the frequency measure in operational risk with regressions, most modelers prefer Poisson or Negative Binomial regressions as best practices in the industry. However, as an alternative...continue reading.
The Canadian newspaper, Globe and Mail, is a leader in diction and style, but it may need improvement in the ‘grammar of graphics’. Globe’s recent depiction of metropolitan economic growth in...continue reading.
The reason why football is so exciting is uncertainty. The outcome of any match or league is unknown, and you get to watch the action unfold without knowing what’s going...continue reading.
Stata 14 has just been released. The new and big thing with version 14 is the introduction of Bayesian Statistics. A wide variety of new models can now be estimated...continue reading.
The example below shows how to estimate a simple univariate Poisson time series model with the tscount package. While the model estimation is straightforward and yields very similar parameter estimates...continue reading.
Modeling time series of count outcomes is of interest in operational risk when forecasting the frequency of losses. Below is an example showing how to estimate a simple...continue reading.
Cubist is a tree-based model with an OLS regression attached to each terminal node and is somewhat similar to the mob() function in the Party package (https://statcompute.wordpress.com/2014/10/26/model-segmentation-with-recursive-partitioning). Below is a demonstrate...continue reading.
For the 2015 NBA season, the only exciting Lakers news is the return of the Kobe show and Charles Barkley’s Lakers Lent. The Lakers started the season with 0 wins and...continue reading.
library(betareg) library(sas7bdat) df1 <- read.sas7bdat('lgd.sas7bdat') df2 <- df1[df1$y < 1, ] fml <- as.formula('y ~ x2 + x3 + x4 + x5 + x6 | x3 + x4 | x1...continue reading.
pkgs <- c('sas7bdat', 'betareg', 'lmtest') lapply(pkgs, require, character.only = T) df1 <- read.sas7bdat("lgd.sas7bdat") df2 <- df1[which(df1$y < 1), ] xvar <- paste("x", 1:7, sep = '', collapse = " +...continue reading.
R is great at accomplishing complex tasks. Doing simple things with R though takes some effort. Consider the simple task of producing summary statistics for continuous variables over some factor...continue reading.
Similar to the NLMIXED procedure in SAS, optim() in R provides the functionality to estimate a model by specifying the log likelihood function explicitly. Below is a demo showing how to...continue reading.
In [1]: import pandas as pd In [2]: import statsmodels.api as sm In [3]: data = pd.read_table('/home/liuwensui/Documents/data/csdata.txt') In [4]: Y = data.LEV_LT3 In [5]: X = sm.add_constant(data[['COLLAT1', 'SIZE1', 'PROF2', 'LIQ',...continue reading.