As you said, `dplyr` reuses variables. As a result, your initial code tries to calculate a standard deviation from just one value. When you look at the formula for the sample standard deviation:

$$ s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2} $$

you can see that with a single value (n = 1) the denominator n - 1 is 0, which causes the missing values in your result.

In your second `dplyr` code, the standard deviation is calculated from the original variable. As the groups for which an `sd` is calculated have `n > 1`, the denominator in this case is larger than zero, which results in a non-missing value.

`dplyr` just takes the last created instance of a variable.
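
To make the reuse concrete, here is a minimal sketch. The data frame `df` and its columns `g` and `x` are made up for illustration and are not your data, and I'm assuming a `summarise()` call similar in spirit to yours and a reasonably recent `dplyr` version:

```r
library(dplyr)

# Hypothetical toy data: two groups with two observations each
df <- data.frame(g = c("a", "a", "b", "b"), x = c(1, 3, 5, 9))

# Base R: the sample standard deviation of a single value is missing
single <- sd(5)

# Reusing the name `x`: after `x = mean(x)`, `x` holds one value per group,
# so the following `sd(x)` is computed from that single value
reused <- df %>%
  group_by(g) %>%
  summarise(x = mean(x), s = sd(x))

print(single)
print(reused)
```

In this sketch the `s` column should be missing for every group, because `sd()` only ever sees the one-element group mean.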

In the page @baptiste linked to, you can find a statement by Hadley Wickham from which you can conclude that it's better to use new names when creating new variables.
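
A small sketch of that advice, again on made-up data (`df`, `g`, `x`, `x_mean` and `x_sd` are hypothetical names, not taken from your code): giving the summaries new names keeps the original column available to later expressions.

```r
library(dplyr)

# Hypothetical toy data: two groups with two observations each
df <- data.frame(g = c("a", "a", "b", "b"), x = c(1, 3, 5, 9))

# New names for the summaries: `x` is never overwritten, so `sd(x)`
# is computed from the original values within each group
renamed <- df %>%
  group_by(g) %>%
  summarise(x_mean = mean(x), x_sd = sd(x))

print(renamed)
```

Here each group has two observations, so the denominator n - 1 equals 1 and `x_sd` is a regular number for both groups.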

I think this behavior should be stated explicitly in the documentation.