Graph protection strategies aim to suppress the epidemic propagation in a network by allocating protection resources to maximize the ratio of surviving node. Research on this topic has been active and promising due to its wide-range applications. However, most of the recent developments are simulated by assuming that the network structure remains static during epidemics. Moreover, the existing protection schemes are limited to the simplified pre-emptive and post-emptive schemes. The pre-emptive scheme protects the most critical nodes of networks prior to epidemic spreading, behaving as a prevention mechanism. In post-emptive schemes, the protections are allocated in the presence of epidemics, when the attacks have already spread over the network, simulating a late curative response. Given a limited k resource budget, both of those schemes spend the whole resources in a single chance. In this paper, we introduce a novel adaptive protection scheme by gradually protecting nodes in response to the incoming attacks. We consider the adaptive scheme in a more challenging network structure, the dynamic networks. We propose the n-step fitted Q-learning for training the model under reinforcement approach. We further incorporate graph embedding as a feature-based representation of the network state. We also demonstrate the potential of our proposal as a non-deterministic approach for this graph protection problem. Experimental results show that our proposed model effectively restrain epidemic propagation in real-world network datasets.