As you said, `dplyr` reuses variables. As a result, your initial code tries to calculate a standard deviation from just one value. When you look at the formula for the standard deviation:

$$ s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} $$

you can see that with `n = 1` the denominator of the formula is `0`, which causes the `NaN` result.
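To make the arithmetic concrete, here is a minimal sketch that writes the sample standard deviation out by hand for a single value. The variable names are made up for illustration, and the formula is computed manually rather than through `sd()`:

```r
# Hypothetical single observation, i.e. a "group" with n = 1
x <- 4

# Numerator: sum of squared deviations from the mean (0 for one value)
numerator <- sum((x - mean(x))^2)

# Denominator: n - 1, which is 0 when n = 1
denominator <- length(x) - 1

# 0 / 0 is NaN in R, and the square root of NaN is still NaN
manual_sd <- sqrt(numerator / denominator)
manual_sd
```

With only one value both the numerator and the denominator are zero, so the division itself is undefined.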

In your second `dplyr` code, the standard deviation is calculated from the original variable. As the groups for which an `sd` is calculated have `n > 1`, the denominator in this case is larger than zero, which results in an `sd` value.

`dplyr` just takes the last created instance of a variable. In the page @baptiste linked to, you can find this statement by Hadley Wickham, from which you can conclude that it's better to use new names when creating new variables.
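As a rough illustration (not your actual data), the sketch below contrasts overwriting a variable inside `summarise()` with using new names. The data frame `df`, its columns `g` and `x`, and the choice of `mean()` as the first summary are all assumptions made up for this example:

```r
library(dplyr)

# Hypothetical toy data: two groups with two values each
df <- data.frame(g = c("a", "a", "b", "b"), x = c(1, 3, 2, 6))

# `x` is overwritten first, so sd() only sees the single summarised value
reused <- df %>%
  group_by(g) %>%
  summarise(x = mean(x), sd = sd(x))

# New names leave the original `x` untouched, so sd() sees the full group
renamed <- df %>%
  group_by(g) %>%
  summarise(mean_x = mean(x), sd_x = sd(x))

reused
renamed
```

In `reused`, the `sd` column is missing for every group (shown as `NaN` or `NA` depending on the `dplyr` version), while `renamed` contains a proper standard deviation per group.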

I think this behavior should be stated explicitly in the
documentation.