explainit-cisco-2019-sigmod

A small application of causation theory. The paper also shows how to check that a score did not arise by chance.

RCA tools can generally query and classify anomalies and run correlation analysis (causal probabilistic graphical models).

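Spurious correlations appear when the dimensionality exceeds the number of data points.
Interactive query: the user supplies the target metrics of interest (Y), the normal and anomalous time ranges, optional specificity metrics to control for (Z), and an optional search space of metrics (X) => the top 20 root causes in the search space, scored as scores(Xi) <- assoc(Y, Xi | Z).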
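The query shape above can be sketched in a few lines of Python. This is only an illustration: the `Query` class, the `rank` function and the `abs_pearson` placeholder are hypothetical names, not the paper's API, and the placeholder association ignores Z entirely.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Series = Sequence[float]


@dataclass
class Query:
    """Hypothetical container for one root-cause query (illustrative names)."""
    y: Series                        # target metric of interest, Y
    search_space: Dict[str, Series]  # candidate metrics X_i, keyed by name
    z: Optional[Series] = None       # optional metric to control for, Z


def abs_pearson(y: Series, x: Series, z: Optional[Series] = None) -> float:
    """Placeholder assoc(Y, X | Z): absolute Pearson correlation, Z is ignored."""
    n = len(y)
    my, mx = sum(y) / n, sum(x) / n
    cov = sum((a - my) * (b - mx) for a, b in zip(y, x))
    vy = sum((a - my) ** 2 for a in y)
    vx = sum((b - mx) ** 2 for b in x)
    if vy == 0 or vx == 0:
        return 0.0  # a constant series carries no linear association
    return abs(cov) / sqrt(vy * vx)


def rank(query: Query,
         assoc: Callable[[Series, Series, Optional[Series]], float] = abs_pearson,
         top_k: int = 20) -> List[Tuple[str, float]]:
    """Score every X_i with assoc(Y, X_i | Z) and return the top_k by score."""
    scores = {name: assoc(query.y, x, query.z)
              for name, x in query.search_space.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


query = Query(
    y=[1, 2, 3, 4, 5],
    search_space={
        "cpu": [2, 4, 6, 8, 10],
        "noise": [3, 1, 4, 1, 5],
        "const": [1, 1, 1, 1, 1],
    },
)
result = rank(query)
```

Passing `assoc` as a parameter keeps the query declarative: the caller states Y, X and Z, and the scoring method can be swapped without touching the query.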

Principle

Causal Bayesian network. Complex relationships can be built up from conditional relationships between pairs of variables.

 - ExplainIt! - A Declarative Root-cause Analysis Engine for Time Series Data
 - Why? The above approach offers three main benefits. 
 - First, the formalism is a non-parametric and declarative way of expressing dependencies between variables and defers any specific approach to the runtime system. 
 - Second, the unified approach naturally lends itself to multivariate dependencies of more complex relationships beyond simple correlations between pairwise univariate metrics. 
 - Third, the approach also gives us a way to reason about dependencies that might be easier to detect only when holding some variables constant.

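1. Feature family (metrics can be aggregated by host, similar to GROUP BY; for example, one feature family might be the 75th-percentile latency, another the current number of cluster jobs).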

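2. Ranking: given a hypothesis (X, Y, Z) => produce a ranking of the Xi.
Univariate, Z empty: for each Xi in X and each Yj in Y, compute the Pearson product-moment correlation ρi,j, and score by the mean and the max of its absolute value, e.g. corrMean = mean_{i,j} |ρi,j|.
Multivariate, Z empty: linear regression (with random projection for dimensionality reduction) plus a loss function to compute R².
Z non-empty: regress Y ~ Z and X ~ Z to obtain the residuals R_{Y;Z} and R_{X;Z}, then regress one residual on the other to compute R²(Y; X | Z).
When X has many predictors and few observations, a ridge penalty achieves the same effect as adjusted R². See below.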
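The univariate scores and the residual-based conditional score can be sketched as follows. This is a minimal pure-Python illustration restricted to a single X, Y and Z series, so R² reduces to a squared Pearson correlation; the function names are made up for the example, and the random-projection and ridge steps from the notes are not implemented here.

```python
import random
from math import sqrt


def pearson(a, b):
    """Pearson product-moment correlation of two equal-length series."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return 0.0 if va == 0 or vb == 0 else cov / sqrt(va * vb)


def corr_scores(X, Y):
    """Z empty: mean and max of |rho_ij| over every pair (X_i, Y_j)."""
    rhos = [abs(pearson(x, y)) for x in X for y in Y]
    return sum(rhos) / len(rhos), max(rhos)


def residuals(target, control):
    """Residuals of the least-squares fit target ~ control (with intercept)."""
    n = len(target)
    mt, mc = sum(target) / n, sum(control) / n
    sxx = sum((c - mc) ** 2 for c in control)
    sxy = sum((c - mc) * (t - mt) for c, t in zip(control, target))
    slope = sxy / sxx if sxx > 0 else 0.0
    return [(t - mt) - slope * (c - mc) for c, t in zip(control, target)]


def conditional_r2(y, x, z):
    """R^2(Y; X | Z): regress Y ~ Z and X ~ Z, then relate the two residuals.
    For one predictor, R^2 of a simple regression equals the squared correlation."""
    return pearson(residuals(y, z), residuals(x, z)) ** 2


# Synthetic data (an assumption for illustration): Z drives X, and Z alone
# drives y_confounded, while y_direct depends on X itself.
rng = random.Random(0)
n = 500
z = [rng.gauss(0, 1) for _ in range(n)]
x = [2 * zi + rng.gauss(0, 1) for zi in z]
y_confounded = [-zi + rng.gauss(0, 1) for zi in z]
y_direct = [xi + rng.gauss(0, 0.1) for xi in x]

confounded_raw = pearson(x, y_confounded) ** 2           # inflated by the shared Z
confounded_cond = conditional_r2(y_confounded, x, z)     # should be close to zero
direct_cond = conditional_r2(y_direct, x, z)             # should stay high
```

Conditioning on Z removes the association that X and `y_confounded` share only through Z, while the direct dependence of `y_direct` on X survives.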

Can experiments complete the graph?

Evaluation

Evaluating the scoring methods:
Ranking accuracy: if the true cause is ranked r-th, the score is 1/r.
Success rate: 1 if the cause is in the top k, otherwise 0.
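Both metrics are simple enough to write down directly. In this sketch the function names are illustrative, and returning 0.0 when the cause is missing from the ranking is an assumption, since the notes do not cover that case.

```python
def ranking_accuracy(ranked, cause):
    """1/r, where r is the 1-based rank of the true cause.
    Assumption: a cause that is absent from the ranking scores 0.0."""
    if cause not in ranked:
        return 0.0
    return 1.0 / (ranked.index(cause) + 1)


def success_at_k(ranked, cause, k):
    """1 if the true cause appears among the top k results, otherwise 0."""
    return 1 if cause in ranked[:k] else 0


ranked = ["disk", "cpu", "net"]
acc = ranking_accuracy(ranked, "cpu")   # cause is ranked second
hit1 = success_at_k(ranked, "cpu", 1)
hit2 = success_at_k(ranked, "cpu", 2)
```

Averaging either metric over many labelled incidents gives a single number per scoring method.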

Theory

The PC/SGS algorithms use pairwise conditional independence tests to recover the full causal structure, also considering joint sets of variables.
Root-cause analysis rarely requires the full causal structure.

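To handle overfitting, the paper uses adjusted R² (r_adj). It relates n and p to the probability that a score of at least s arises purely by chance. A score s below this threshold is not trustworthy.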
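To make the dependence on n and p concrete, here is a small sketch using the textbook adjusted R² formula, which may not match the paper's exact derivation. It assumes ordinary least squares with an intercept, n observations, p predictors and Gaussian noise; under those assumptions the expected R² when the predictors are pure noise is p / (n - 1). The function names are illustrative.

```python
def adjusted_r2(r2, n, p):
    """Textbook adjusted R^2 for n observations and p predictors."""
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1 observations")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def expected_null_r2(n, p):
    """Expected R^2 when all p predictors are unrelated to the target
    (assumes OLS with intercept and Gaussian noise)."""
    return p / (n - 1)


# With only 11 observations and 5 predictors, R^2 = 0.5 is what chance alone
# produces on average, and the adjustment pulls it down to zero.
small_sample = adjusted_r2(0.5, n=11, p=5)
chance_level = expected_null_r2(n=11, p=5)

# The same raw score with 101 observations is barely penalized.
large_sample = adjusted_r2(0.5, n=101, p=5)
```

The same raw score can therefore be meaningless or convincing depending on how many observations back it up relative to the number of predictors.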
