This paper investigates cooperative estimation of
averaged three-dimensional poses of moving target objects for
visual sensor networks. In particular, we consider the situation
where multiple cameras observe a common target object but
the poses consistent with the visual measurements differ from camera
to camera due to a variety of uncertainties. In this situation,
we aim to estimate an average of the contaminated poses, for
moving as well as static target objects, using only
local negotiations. For this purpose, we present a cooperative
estimation mechanism called the networked visual motion observer.
We then derive an upper bound on the ultimate error between
the actual average and the estimates produced by the proposed
estimation mechanism for both static and moving target objects.
Finally, the effectiveness of the networked visual motion observer
is demonstrated through simulation.