Process mining introduction 3 – decision tree


a good article here
Entropy: the degree of uncertainty
Inverse of compressibility (zippability): the lower the entropy, the better the data compresses.
Goal: reduce entropy in leaves of the tree to improve predictability.
$$E = -\sum_{i=1}^{k} p_i \log_2 p_i$$
$k$: the number of possible values (enumerated $1, \dots, k$)
$p_i = c_i / n$ is the fraction of elements having value $i$, where $c_i \geq 1$ is the number of elements with value $i$ and $n = \sum_{i=1}^{k} c_i$ is the total number of elements.
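The entropy formula and the goal of lowering entropy in the leaves can be sketched in a few lines of Python. This is only an illustration, not a full decision tree learner. The function names `entropy` and `weighted_leaf_entropy` and the yes/no labels are made up for the example. Weighting each leaf by its size is one common way to compare a split against the unsplit node, and is an assumption here rather than something stated above.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Base-2 entropy of a list of categorical values."""
    n = len(values)
    if n == 0:
        return 0.0
    counts = Counter(values)  # count of each distinct value (the c_i above)
    # Sum of -p_i * log2(p_i), with p_i = c_i / n
    return sum(-(c / n) * log2(c / n) for c in counts.values())


def weighted_leaf_entropy(leaves):
    """Average entropy over the leaves, weighted by leaf size (an assumed convention)."""
    total = sum(len(leaf) for leaf in leaves)
    return sum(len(leaf) / total * entropy(leaf) for leaf in leaves)


# Hypothetical labels, invented for illustration only.
root = ["yes"] * 5 + ["no"] * 5
left = ["yes"] * 4 + ["no"] * 1
right = ["yes"] * 1 + ["no"] * 4

print(entropy(root))                         # 1.0 for a 50/50 mix
print(weighted_leaf_entropy([left, right]))  # lower than the root entropy
```

A leaf that contains a single value has entropy 0, which is the most predictable case. A split is useful when the weighted entropy of its leaves is lower than the entropy of the node it splits.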
decision tree