Mobile Edge Computing (MEC) has emerged as a promising computing paradigm in 5G networks. It can supplement the computation and energy resources of User Equipments (UEs) by migrating workloads from the UEs to MEC servers. Although computation offloading and resource allocation in MEC have been studied under various optimization objectives, existing work mainly considers quasi-static system environments and does not account for the differing resource requirements and time-varying conditions of a dynamic system. In this paper, we consider a multi-user MEC system and investigate a task execution scheme that dynamically and jointly optimizes the offloading decision and the resource allocation. Our objective is to minimize the energy consumption of all UEs, subject to the delay constraint and the dynamic resource requirements of heterogeneous computation tasks. Accordingly, we formulate the problem as a mixed-integer nonlinear programming (MINLP) problem and propose a value-iteration-based Reinforcement Learning (RL) approach, namely Q-Learning, to obtain the optimal policy for computation offloading and resource allocation. Simulation results demonstrate that the proposed approach significantly decreases the UEs' energy consumption in different scenarios compared with baseline methods.
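
To make the Q-Learning idea concrete, the following is a minimal tabular sketch for a single UE, not the formulation used in this paper. The state space (three uplink channel levels), the binary action (local execution or offloading), all energy and delay figures, and the handling of the delay constraint as a fixed penalty are hypothetical assumptions chosen purely for illustration.

```python
import random

# Hypothetical toy model: every constant below is an illustrative assumption,
# not a parameter taken from the paper.
CHANNEL_STATES = (0, 1, 2)               # 0 = poor, 1 = fair, 2 = good uplink
ACTIONS = (0, 1)                         # 0 = execute locally, 1 = offload to the MEC server
LOCAL_ENERGY = 5.0                       # UE energy for local execution
TX_ENERGY = {0: 9.0, 1: 2.0, 2: 1.0}     # UE transmission energy per channel state
LOCAL_DELAY = 0.9                        # completion delay for local execution
OFFLOAD_DELAY = {0: 1.5, 1: 0.8, 2: 0.3} # completion delay when offloading
DEADLINE = 1.0                           # delay constraint
PENALTY = 10.0                           # assumed cost added when the deadline is missed


def step_cost(state, action):
    """UE energy for one task, plus a penalty if the delay constraint is violated."""
    energy = LOCAL_ENERGY if action == 0 else TX_ENERGY[state]
    delay = LOCAL_DELAY if action == 0 else OFFLOAD_DELAY[state]
    return energy + (PENALTY if delay > DEADLINE else 0.0)


def train(steps=20000, alpha=0.05, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in CHANNEL_STATES for a in ACTIONS}
    state = rng.choice(CHANNEL_STATES)
    for _ in range(steps):
        # Explore with probability epsilon, otherwise act greedily.
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        reward = -step_cost(state, action)           # minimizing cost = maximizing reward
        next_state = rng.choice(CHANNEL_STATES)      # assumed i.i.d. channel evolution
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        # Standard Q-learning update toward the one-step bootstrapped target.
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = next_state
    return q


def greedy_policy(q):
    """Map each channel state to the action with the highest learned Q-value."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in CHANNEL_STATES}
```

Calling `greedy_policy(train())` returns a mapping from channel state to action. Under the assumed costs, the learned policy should keep the task local when the channel is poor and offload it otherwise. A multi-user system with joint resource allocation would need a much larger state and action space than this sketch covers.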