
CAP-Dataset-CSV-Converter

Convert the CAP Sleep Database from EDF to CSV and keep only the first 10% of each recording's rows.

Steps:

  1. Download the dataset from https://physionet.org/content/capslpdb/1.0.0/ (about 40.1 GB).
  2. Extract the ZIP archive.
  3. Put minifier.py, files.py and requirements.txt in the dataset folder (inside /cap-sleep-dataset-1.0.0/).
  4. Install the requirements (pip install -r requirements.txt).
  5. Run minifier.py
  6. Wait. The conversion takes a very long time (see Notes).
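
Before step 5, it can help to confirm that the scripts sit next to the EDF files and that the disk has room. The following is a minimal sketch and not part of this repository: `preflight` and `REQUIRED_FREE_BYTES` are hypothetical names, the 100 GB threshold is taken from the notes in this README (read here as decimal gigabytes), and it assumes the EDF files lie directly in the dataset folder.

```python
import shutil
from pathlib import Path

# Assumption: "more than 100GB" from the notes, read as decimal gigabytes.
REQUIRED_FREE_BYTES = 100 * 10**9


def preflight(dataset_dir, required_free=REQUIRED_FREE_BYTES):
    """Return (edf_files, free_bytes, has_enough_space) for dataset_dir."""
    root = Path(dataset_dir)
    # Match the extension case-insensitively and ignore everything else.
    edf_files = sorted(
        p for p in root.iterdir() if p.is_file() and p.suffix.lower() == ".edf"
    )
    # Free space on the filesystem that holds the dataset folder.
    free_bytes = shutil.disk_usage(root).free
    return edf_files, free_bytes, free_bytes > required_free


if __name__ == "__main__":
    files, free, ok = preflight(".")
    print(f"{len(files)} EDF files found, {free / 10**9:.1f} GB free")
    if not ok:
        print("Warning: less free space than the notes recommend")
```

Run it from inside the dataset folder. It only reads directory metadata, so it finishes immediately and does not touch the recordings.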

Notes:

  1. Make sure you have more than 100 GB of free disk space.
  2. Edit files.py if you want to work with only part of the dataset. By default it converts all EDF files.
  3. Processing takes a long time. If you leave the computer running, make sure it does not go to sleep automatically.
  4. This code truncates each CSV to 10% of the original EDF data. For example, brux1.edf converted to brux1.csv has 7342592 rows, but the minified output includes only the first 734259 rows.
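
The 10% cut in note 4 can be illustrated with a short standard-library sketch. It is not the code in minifier.py: `rows_to_keep` and `minify_csv` are hypothetical names, and the sketch assumes the CSV has one header row and that the kept row count is rounded down, which matches the brux1 figures above (7342592 rows give 734259).

```python
import csv
from itertools import islice


def rows_to_keep(total_rows, percent=10):
    """Number of leading rows kept, rounded down (integer arithmetic)."""
    return total_rows * percent // 100


def minify_csv(src_path, dst_path, percent=10):
    """Copy the header and the first `percent`% of data rows to dst_path."""
    # First pass: count data rows without loading the file into memory.
    with open(src_path, newline="") as src:
        reader = csv.reader(src)
        next(reader, None)  # skip the assumed header row
        total = sum(1 for _ in reader)
    keep = rows_to_keep(total, percent)
    # Second pass: stream the header plus the first `keep` rows.
    with open(src_path, newline="") as src, \
            open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        header = next(reader, None)
        if header is not None:
            writer.writerow(header)
        writer.writerows(islice(reader, keep))
    return keep
```

Streaming the file twice keeps memory use flat even for multi-gigabyte CSV files, at the cost of reading the source a second time.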