# Bayesian network in R: Introduction

This article is originally published at https://hameddaily.blogspot.com/

- It is easy to exploit expert knowledge in BN models.
- BN models have been found to be very robust to i) noisy data, ii) missing data and iii) sparse data.
- Unlike many machine learning models (including artificial neural networks), which usually appear as a “black box,” all the parameters in BNs have an understandable semantic interpretation.

library(bnlearn)

data(coronary)

This data contains the following information:

- *Smoking* (smoking): a two-level factor with levels no and yes.
- *M. Work* (strenuous mental work): a two-level factor with levels no and yes.
- *P. Work* (strenuous physical work): a two-level factor with levels no and yes.
- *Pressure* (systolic blood pressure): a two-level factor with levels <140 and >140.
- *Proteins* (ratio of beta and alpha lipoproteins): a two-level factor with levels <3 and >3.
- *Family* (family anamnesis of coronary heart disease): a two-level factor with levels neg and pos.
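Two of the variable names above, *M. Work* and *P. Work*, contain a space. As a small sketch of what happens to such names when a data set is wrapped in data.frame() with its default check.names = TRUE, the snippet below applies base R's make.names() to the six names listed above. The vector `orig_names` is typed in by hand for illustration and is not read from the data set.

```r
# Variable names as listed above, typed in by hand for illustration.
orig_names <- c("Smoking", "M. Work", "P. Work", "Pressure", "Proteins", "Family")

# data.frame() relies on make.names() when check.names = TRUE (the default),
# which replaces characters that are not valid in R names with a dot.
clean_names <- make.names(orig_names)
print(clean_names)
```

This is why the code later in this post refers to the node as M..Work rather than M. Work.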

### Learn structure

bn_df <- data.frame(coronary)

res <- hc(bn_df)

plot(res)

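The next line removes the arc from *M. Work* (named M..Work in bn_df) to *Family* from the learned structure:

res$arcs <- res$arcs[-which((res$arcs[,'from'] == "M..Work" & res$arcs[,'to'] == "Family")),]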
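Assigning to res$arcs directly may leave other parts of the bn object out of sync with the arc set, and the `-which(...)` idiom drops every row if the arc is absent. As a hedged alternative, the sketch below uses two bnlearn facilities: drop.arc() to remove an arc after learning, and the blacklist argument of hc() to forbid it during learning. The names `has_arc`, `forbidden` and `res_bl` are introduced here for illustration only.

```r
library(bnlearn)

data(coronary)
bn_df <- data.frame(coronary)

# Option 1: learn first, then drop the arc if hill climbing produced it.
res <- hc(bn_df)
learned <- arcs(res)
has_arc <- any(learned[, "from"] == "M..Work" & learned[, "to"] == "Family")
if (has_arc) {
  res <- drop.arc(res, from = "M..Work", to = "Family")
}

# Option 2: forbid the arc up front so hc() never adds it.
forbidden <- data.frame(from = "M..Work", to = "Family")
res_bl <- hc(bn_df, blacklist = forbidden)

print(arcs(res))
print(arcs(res_bl))
```

The two options need not produce the same graph: with a blacklist, the search is free to compensate for the forbidden arc with other arcs, whereas dropping an arc afterwards leaves the rest of the structure untouched.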

### Training

After learning the structure, we need to estimate the conditional probability tables (CPTs) at each node. The bn.fit function fits these parameters from the data; by default it uses maximum likelihood estimation, not the EM algorithm, to learn the CPT of each node in the above graph.

fittedbn <- bn.fit(res, data = bn_df)
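bn.fit() accepts a method argument that selects the estimator. The sketch below, which refits the network from scratch so that it runs on its own, compares maximum likelihood ("mle") with Bayesian posterior estimates ("bayes"). The imaginary sample size `iss = 10` is an arbitrary value chosen for illustration, and `fit_mle` and `fit_bayes` are names introduced here.

```r
library(bnlearn)

data(coronary)
bn_df <- data.frame(coronary)
res <- hc(bn_df)

# Maximum likelihood: CPT entries are conditional relative frequencies.
fit_mle <- bn.fit(res, data = bn_df, method = "mle")

# Bayesian estimate: iss (imaginary sample size) controls how strongly
# the uniform prior pulls the estimates toward equal probabilities.
fit_bayes <- bn.fit(res, data = bn_df, method = "bayes", iss = 10)

# coef() extracts the CPT of a node as an array.
print(coef(fit_mle$Proteins))
print(coef(fit_bayes$Proteins))
```

With a data set of this size the two estimates should be close; the Bayesian version mainly matters when some parent configurations are rarely observed.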

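For example, let us look at the CPT of the *Proteins* node.

print(fittedbn$Proteins)

*Proteins* is conditioned on *M. Work* and *Smoking*. Since both of these variables are binary (only two values), there are 2x2=4 parent configurations, and the CPT gives a distribution over the two levels of *Proteins* for each of them.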
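To make the shape of such a table concrete, here is a minimal base R sketch that builds a CPT by counting. The data frame `toy` is a made-up eight-row example, not the coronary data, and the resulting numbers are not the ones printed by the fitted network.

```r
# Hypothetical toy data (NOT the coronary data set): one child, two parents.
toy <- data.frame(
  Smoking  = factor(c("no", "no", "no", "no", "yes", "yes", "yes", "yes")),
  M..Work  = factor(c("no", "no", "yes", "yes", "no", "no", "yes", "yes")),
  Proteins = factor(c("<3", "<3", "<3", ">3", "<3", ">3", ">3", ">3"))
)

# Joint counts, with the child variable in the first dimension.
counts <- table(toy$Proteins, toy$Smoking, toy$M..Work,
                dnn = c("Proteins", "Smoking", "M..Work"))

# Normalise within each parent configuration (dimensions 2 and 3), so that
# each column holds P(Proteins | Smoking, M..Work).
cpt <- prop.table(counts, margin = c(2, 3))
print(cpt)
```

The array has 2x2x2 = 8 cells: one column of two probabilities for each of the four parent configurations, and each column sums to 1.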

### Inference

Now the BN is ready and we can start inferring from the network. For example, what is the chance that a non-smoker has a *Proteins* level less than 3?

cpquery(fittedbn, event = (Proteins=="<3"), evidence = ( Smoking=="no") )

which results in 0.61. Note that although the *Proteins* variable is conditioned on 2 variables, we did the query based on the available evidence on only one variable. But let's make our evidence richer by asking the following: What is the chance that a non-smoker with pressure greater than 140 has a *Proteins* level less than 3?

cpquery(fittedbn, event = (Proteins=="<3"), evidence = ( Smoking=="no" & Pressure==">140" ) )

which results in a probability of 0.63.
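cpquery() estimates the probability by Monte Carlo sampling (logic sampling by default), so repeated calls return slightly different numbers. The sketch below shows one way to make a query reproducible and less noisy, by fixing the seed and raising the number of samples `n`. It refits the network from scratch, without the arc removal, so that it runs on its own; the seed, the value of `n`, and the names `p_sampled` and `p_empirical` are choices made for this illustration.

```r
library(bnlearn)

data(coronary)
bn_df <- data.frame(coronary)
# Refit from scratch so the snippet runs on its own (no arc removal here).
fittedbn <- bn.fit(hc(bn_df), data = bn_df)

# Fix the seed so the sampled estimate is reproducible, and raise n
# (the number of generated samples) to reduce Monte Carlo noise.
set.seed(1)
p_sampled <- cpquery(fittedbn,
                     event = (Proteins == "<3"),
                     evidence = (Smoking == "no" & Pressure == ">140"),
                     n = 10^5)

# Raw relative frequency in the data, as a rough sanity check only:
# it is not expected to match the model-based estimate exactly.
subset_rows <- bn_df$Smoking == "no" & bn_df$Pressure == ">140"
p_empirical <- mean(bn_df$Proteins[subset_rows] == "<3")

print(c(sampled = p_sampled, empirical = p_empirical))
```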

We can also move in the opposite direction of an arc between two nodes. If a person's *Proteins* level is less than 3, what is the chance that his or her *Pressure* level is greater than 140?

cpquery(fittedbn, event = (Pressure==">140"), evidence = ( Proteins=="<3" ) )

The answer is that *Pressure* is greater than 140 with probability 0.41.
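Querying against the direction of an arc is an application of Bayes' theorem. As a sketch of the arithmetic for a two-node network with a single arc from *Pressure* to *Proteins*, the snippet below inverts a conditional probability by hand. All three input probabilities are hypothetical numbers picked for illustration; they are not taken from the fitted network.

```r
# Hypothetical inputs for a two-node network Pressure -> Proteins.
prior_high <- 0.40  # P(Pressure > 140)
lik_high   <- 0.50  # P(Proteins < 3 | Pressure > 140)
lik_low    <- 0.75  # P(Proteins < 3 | Pressure < 140)

# Total probability of the evidence, P(Proteins < 3).
p_evidence <- lik_high * prior_high + lik_low * (1 - prior_high)

# Bayes' theorem: P(Pressure > 140 | Proteins < 3).
posterior_high <- lik_high * prior_high / p_evidence
print(posterior_high)
```

In a larger network the same inversion involves summing over all the other variables, which is the work cpquery() approximates by sampling.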

