OfflineData

OfflineData is the repository used to train behavioral policies and generate the datasets for the NeoRL benchmarks.

Install OfflineData

1. Install offlinedata

git clone https://agit.ai/Polixir/OfflineData.git
cd OfflineData
pip install -e .

2. Install neorl

Please install neorl, which provides the environments:

git clone https://agit.ai/Polixir/NeoRL.git
cd NeoRL
pip install -e .

3. Install tianshou

We use tianshou, a popular RL framework, to train the behavioral policies. Please install tianshou from GitHub at the pinned commit:

pip install git+https://github.com/thu-ml/tianshou.git@866e35d

If you use the MuJoCo environments, please make sure you have installed mujoco and mujoco_py.
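
A quick way to confirm the installs worked is a minimal import check like the sketch below (the mujoco_py import only matters if you plan to use the MuJoCo tasks):

import neorl
import tianshou
print(tianshou.__version__)   # tianshou imports and reports its version
import mujoco_py              # optional: only needed for the MuJoCo environments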

Envs

You can use neorl to create any of the standardized environments, for example:

import neorl

env = neorl.make("halfcheetah-meidum-v3")

env = neorl.make("citylearn")

The following environments are currently available:

Env Name         Observation shape   Action shape   Has done   Max timesteps
HalfCheetah-v3   18                  6              False      1000
Hopper-v3        12                  3              True       1000
Walker2d-v3      18                  6              True       1000
ib               182                 3              False      1000
finance          181                 30             False      2516
citylearn        74                  14             False      1000

Usage

1. Train policy

python get_model.py --task env_name

The policy models, labelled by the total reward of their sampled trajectories, will be saved in the models folder. Some pre-trained models are already saved in that folder.
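
For example, to train behavioral policies for HalfCheetah-v3 (an illustrative choice; any environment name from the table above can be passed as the task):

python get_model.py --task HalfCheetah-v3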

2. Sample data

You can also skip the first step and sample data using our pre-saved policy models.

(1) Deterministic policy sampling:

python get_data.py --task env_name

(2) Stochastic policy sampling:

python get_data.py --task env_name --add_noise

The datasets will be saved in the datasets folder.
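
Putting the two steps together, a full pass over one environment might look like the following (HalfCheetah-v3 is again just an illustrative task name from the table above):

python get_model.py --task HalfCheetah-v3               # train and save behavioral policies
python get_data.py --task HalfCheetah-v3                # deterministic policy sampling
python get_data.py --task HalfCheetah-v3 --add_noise    # stochastic policy sampling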