Thanks for the implementation. (Hey, was that taken from the random generator in mrmath?
It looks quite similar.)

I actually already have one that has gotten that far, including the (non-SIMD) version of Poly1305.
I'm currently bringing Poly1305 together with the ChaCha cipher.
Also, although the standard implementation is stated to have 20 rounds, the code actually performs a "double round"
(a column round followed by a diagonal round), which halves the number of loop iterations to 10
(at least that is what I found when implementing the example from the RFC).
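
To illustrate what I mean by the double round, here is a minimal Python sketch of how I read the RFC. The function names (rotl32, quarter_round, double_round, chacha20_permute) are my own and not taken from any library, and the sketch only covers the 20-round permutation of the 16-word state. It leaves out the key/nonce setup and the final addition of the original state, so it is not a complete block function.

```python
MASK32 = 0xFFFFFFFF  # ChaCha works on 32-bit words


def rotl32(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) & MASK32) | (x >> (32 - n))


def quarter_round(s, a, b, c, d):
    """Apply the ChaCha quarter round in place to s[a], s[b], s[c], s[d]."""
    s[a] = (s[a] + s[b]) & MASK32
    s[d] = rotl32(s[d] ^ s[a], 16)
    s[c] = (s[c] + s[d]) & MASK32
    s[b] = rotl32(s[b] ^ s[c], 12)
    s[a] = (s[a] + s[b]) & MASK32
    s[d] = rotl32(s[d] ^ s[a], 8)
    s[c] = (s[c] + s[d]) & MASK32
    s[b] = rotl32(s[b] ^ s[c], 7)


def double_round(s):
    """One double round = one column round + one diagonal round."""
    # Column round: the four columns of the 4x4 state matrix
    quarter_round(s, 0, 4, 8, 12)
    quarter_round(s, 1, 5, 9, 13)
    quarter_round(s, 2, 6, 10, 14)
    quarter_round(s, 3, 7, 11, 15)
    # Diagonal round: the four diagonals of the 4x4 state matrix
    quarter_round(s, 0, 5, 10, 15)
    quarter_round(s, 1, 6, 11, 12)
    quarter_round(s, 2, 7, 8, 13)
    quarter_round(s, 3, 4, 9, 14)


def chacha20_permute(state):
    """Run the 20 rounds as 10 double rounds on a copy of the 16-word state."""
    s = list(state)
    for _ in range(10):  # 10 iterations x 2 rounds each = 20 rounds
        double_round(s)
    return s
```

So the loop counter only goes to 10, but each pass does two rounds (eight quarter rounds), which is where the "20" in ChaCha20 comes from.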