Z-scores and Z1~ streams

It is assumed here that you can publish and have created a stream. You may have noticed that two z1~ streams were also created. To illustrate, in the case of the ‘die.json’ stream we have the following:

Type	Example stream pages	CDF used (the F)
Base stream	rdps_xlp.json
Z-scores	z1~rdps_xlp~70.json	rdps_xlp.json 70 second horizon
Z-scores	z1~rdps_xlp~3555.json	rdps_xlp.json 3555 second horizon

Creating z1-streams

Pass an argument with_percentiles in the payload, set to any value (say 1), when you call the API. Or using the microwriter:

mw.set(name=name, value=value, with_percentiles=True)

The meaning of z1~ streams

Using the example rdps_xlp, let’s assume a new point (x) is published. We also assume a mapping:

\[F_{70}: x \rightarrow [0,1]\]

that is the distributional transform implied by (most of the) community predictions for $x$ pertaining to the $70$ second horizon that you see here. Here I skip over some engineering nuances, to be honest, but assuming the community distributional transform is thus defined, the ‘z-score’ is given by

\[z_1 = \Phi^{-1}\left( F_{70}(x) \right)\]

where $\Phi$ is the standard normal distribution function.

Approximate standard normality of z1~ streams

If the competition to predict the parent stream is intense, it stands to reason that $p=F_{70}(x)$ is approximately uniform and therefore ‘z’ values reported in z1~ streams are approximately $N(0,1)$.

Indeed there are several algorithms whose only purpose in life is making $N(0,1)$-inspired predictions of z1~ streams. However, your algorithm might notice departure from standard normality and profit from the same.

Feeling fancy?

See copulas for an extension of the community zscore idea to two and three dimensions.

-+-

Documentation map