OfflineData is the repository for training behavioral policies and generating datasets for the NeoRL benchmarks. To install it:
```bash
git clone https://agit.ai/Polixir/OfflineData.git
cd OfflineData
pip install -e .
```
Please install `neorl` to get the environments:
```bash
git clone https://agit.ai/Polixir/NeoRL.git
cd NeoRL
pip install -e .
```
We use tianshou, a popular RL framework, to train the behavioral policies. Please install tianshou via pip:
```bash
pip install tianshou==0.4.2
```
If you use the MuJoCo environments, please make sure mujoco and mujoco_py are installed.
You can use `neorl` to get all the standardized environments, for example:
```python
import neorl
env = neorl.make("halfcheetah-medium-v3")
env = neorl.make("citylearn")
```
You can use the following environments now:
Env Name | Observation Shape | Action Shape | Has Done Signal | Max Timesteps |
---|---|---|---|---|
HalfCheetah-v3 | 18 | 6 | False | 1000 |
Hopper-v3 | 12 | 3 | True | 1000 |
Walker2d-v3 | 18 | 6 | True | 1000 |
ib | 182 | 3 | False | 1000 |
finance | 181 | 30 | False | 2516 |
citylearn | 74 | 14 | False | 1000 |
waterworks | 14 | 4 | False | 287 |
Train a behavioral policy for a given environment:

```bash
python get_model.py --task env_name
```
The policy models, labelled by their trajectory returns, will be saved in `models/`. Some models are pre-saved in that folder.
`get_model.py` uses SAC. You can also choose PPO by using `get_model_ppo.py`.
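For reference, the snippet below is a minimal sketch of how a SAC behavioral policy could be set up with tianshou 0.4.2 on one of these environments. The network sizes, hyperparameters, trainer settings, and save path are illustrative assumptions and may differ from what `get_model.py` actually does.

```python
import torch
import neorl
from tianshou.data import Collector, VectorReplayBuffer
from tianshou.env import DummyVectorEnv
from tianshou.policy import SACPolicy
from tianshou.trainer import offpolicy_trainer
from tianshou.utils.net.common import Net
from tianshou.utils.net.continuous import ActorProb, Critic

task = "citylearn"  # illustrative; any environment name from the table works
device = "cuda" if torch.cuda.is_available() else "cpu"

env = neorl.make(task)
state_shape = env.observation_space.shape
action_shape = env.action_space.shape
max_action = env.action_space.high[0]

train_envs = DummyVectorEnv([lambda: neorl.make(task) for _ in range(4)])
test_envs = DummyVectorEnv([lambda: neorl.make(task) for _ in range(4)])

# Actor: Gaussian policy network; critics: twin Q-networks, as in standard SAC.
net_a = Net(state_shape, hidden_sizes=[256, 256], device=device)
actor = ActorProb(net_a, action_shape, max_action=max_action, device=device,
                  unbounded=True, conditioned_sigma=True).to(device)
actor_optim = torch.optim.Adam(actor.parameters(), lr=3e-4)

net_c1 = Net(state_shape, action_shape, hidden_sizes=[256, 256],
             concat=True, device=device)
critic1 = Critic(net_c1, device=device).to(device)
critic1_optim = torch.optim.Adam(critic1.parameters(), lr=3e-4)

net_c2 = Net(state_shape, action_shape, hidden_sizes=[256, 256],
             concat=True, device=device)
critic2 = Critic(net_c2, device=device).to(device)
critic2_optim = torch.optim.Adam(critic2.parameters(), lr=3e-4)

policy = SACPolicy(actor, actor_optim, critic1, critic1_optim,
                   critic2, critic2_optim,
                   tau=0.005, gamma=0.99, alpha=0.2,
                   action_space=env.action_space)

train_collector = Collector(policy, train_envs,
                            VectorReplayBuffer(1_000_000, len(train_envs)),
                            exploration_noise=True)
test_collector = Collector(policy, test_envs)

# A short training run just to show the call; real runs train much longer.
result = offpolicy_trainer(policy, train_collector, test_collector,
                           max_epoch=10, step_per_epoch=5000,
                           step_per_collect=4, episode_per_test=4,
                           batch_size=256, update_per_step=0.25)

torch.save(policy.state_dict(), "models/sac_policy.pth")  # hypothetical path
```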
You can also skip the first step and sample data using our pre-saved policy models:
```bash
python get_data.py --task env_name
python get_data.py --task env_name --add_noise
```
The datasets will be saved in `datasets/`.
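Conceptually, data generation rolls out a policy in the environment, optionally perturbs its actions with noise (the idea behind `--add_noise`), and stores the transitions. The sketch below illustrates that loop; the noise scheme, function names, and file format here are assumptions for illustration, not the exact behaviour of `get_data.py`.

```python
import numpy as np
import neorl

def collect_dataset(env_name, policy_fn, num_steps=100_000, noise_std=None):
    """Roll out `policy_fn` (obs -> action) and gather transitions.

    `noise_std` mimics the idea behind --add_noise by adding Gaussian
    noise to the actions; the real script's noise scheme may differ.
    """
    env = neorl.make(env_name)
    data = {k: [] for k in ("obs", "action", "reward", "next_obs", "done")}
    obs, steps = env.reset(), 0
    while steps < num_steps:
        action = policy_fn(obs)
        if noise_std is not None:
            action = np.clip(action + np.random.normal(0, noise_std, np.shape(action)),
                             env.action_space.low, env.action_space.high)
        next_obs, reward, done, _ = env.step(action)
        for k, v in zip(data, (obs, action, reward, next_obs, done)):
            data[k].append(v)
        obs = env.reset() if done else next_obs
        steps += 1
    return {k: np.array(v) for k, v in data.items()}

# Example with a random "policy"; a real run would load a trained model
# from the models folder instead.
env = neorl.make("citylearn")
dataset = collect_dataset("citylearn", lambda obs: env.action_space.sample(),
                          num_steps=1000, noise_std=0.1)
np.savez("datasets/citylearn_example.npz", **dataset)  # hypothetical filename/format
```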