
ES-HDF schema

Why HDF5?


HDF5 is portable and platform independent. Once HDF5 files are generated, they can be used anywhere.

Basic strategy

  • The same schema is used for serial and parallel executions.
  • An H5Group represents a physical entity or concept.
  • Multiple instances of a datatype are represented by H5Group and enumerated. E.g., state_0, ..., state_99 for 100 states. 
  • Avoid using high-dimensional arrays. 
  • High dimensionality due to physical concepts, such as spin, band and k-point, should be handled by HDF5 groups.
  • Different codes use different index schemes (band index, spin index, k-point, etc.), but the fast index means the same thing in any language and any reasonable code: a 3D grid with three indexes or plane-wave vectors with one index.
  • Repacking to a higher-dimensional array is inefficient (CPU and memory). 
  • Tools and libraries can handle any indexing scheme without requiring modification in application codes. 
  • Visualization tools can display data easily if an array is mapped on a regular grid. 
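
The enumeration convention above can be sketched as a small path builder. This is only an illustration, not part of any ES-HDF tool: the function name `state_path` is hypothetical, and the group names are taken from the sample file listed further down this page.

```python
def state_path(kpoint: int, spin: int, state: int, dataset: str = "psi_g") -> str:
    """Build the HDF5 path of one orbital dataset (hypothetical helper).

    Each physical index (k-point, spin, state) becomes an enumerated group
    instead of an extra array dimension.
    """
    for name, value in (("kpoint", kpoint), ("spin", spin), ("state", state)):
        if value < 0:
            raise ValueError(f"{name} index must be non-negative, got {value}")
    return f"/electrons/kpoint_{kpoint}/spin_{spin}/state_{state}/{dataset}"


# Enumerate 100 states of one k-point and one spin: state_0, ..., state_99.
paths = [state_path(0, 0, n) for n in range(100)]
print(paths[0])
print(paths[-1])
```

A code with a different internal index order only needs to change how it calls such a helper; the low-dimensional datasets stored under each group stay the same.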

ESHDF schema

  • The memory layout of a multi-dimensional array follows the C convention (row-major: the last index varies fastest).
  • The boolean type is stored as an integer and follows the C convention, i.e., 1 for true.
  • Complex type is represented by extending the dimension of an array, e.g., psi_r[n0][n1][n2][2] for complex wavefunctions. 
  • To reduce the size of a file, it is recommended NOT to store psi_r.
    • This is the default behavior of the pw2qmcpack and wfconvert tools.
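
The layout rules above can be illustrated with a few lines of plain Python. This is a minimal sketch under stated assumptions: the helper names `c_order_offset`, `pack_complex` and `unpack_complex` are hypothetical, and a tiny 2x2x2 grid stands in for a real psi_r mesh.

```python
from math import prod


def c_order_offset(index, shape):
    """Flat offset of a multi-index in a C-ordered (row-major) array."""
    if len(index) != len(shape):
        raise ValueError("index and shape must have the same rank")
    offset = 0
    for i, n in zip(index, shape):
        if not 0 <= i < n:
            raise IndexError(f"index {i} out of range for extent {n}")
        offset = offset * n + i
    return offset


def pack_complex(values):
    """Flatten complex numbers into [re0, im0, re1, im1, ...]."""
    flat = []
    for z in values:
        flat.extend((z.real, z.imag))
    return flat


def unpack_complex(flat):
    """Inverse of pack_complex."""
    return [complex(flat[i], flat[i + 1]) for i in range(0, len(flat), 2)]


# Tiny stand-in for psi_r[n0][n1][n2][2]; the trailing 2 holds (real, imag).
shape = (2, 2, 2, 2)
psi = [complex(i, -i) for i in range(prod(shape[:-1]))]
flat = pack_complex(psi)

# Real and imaginary parts of grid point (1, 0, 1) sit next to each other.
re_at = c_order_offset((1, 0, 1, 0), shape)
im_at = c_order_offset((1, 0, 1, 1), shape)
print(re_at, im_at, flat[re_at], flat[im_at])
```

Because the extra dimension is the last one, it is the fastest-varying index in C order, so each (real, imaginary) pair is contiguous in memory.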



Sample file: output of h5ls -r *.h5


/application             Group
/application/code        Dataset {1}
/application/version     Dataset {3}
/atoms                   Group
/atoms/number_of_atoms   Dataset {1}
/atoms/number_of_species Dataset {1}
/atoms/positions         Dataset {64, 3}
/atoms/species_0         Group
/atoms/species_0/atomic_number Dataset {1}
/atoms/species_0/name    Dataset {1}
/atoms/species_0/valence_charge Dataset {1}
/atoms/species_ids       Dataset {64}
/electrons               Group
/electrons/density       Group
/electrons/density/gvectors Dataset {758273, 3}
/electrons/density/mesh  Dataset {3}
/electrons/density/number_of_gvectors Dataset {1}
/electrons/density/spin_0 Group
/electrons/density/spin_0/density_g Dataset {758273, 2}
/electrons/kpoint_0      Group
/electrons/kpoint_0/gvectors Dataset {94760, 3}
/electrons/kpoint_0/number_of_gvectors Dataset {1}
/electrons/kpoint_0/reduced_k Dataset {3}
/electrons/kpoint_0/spin_0 Group
/electrons/kpoint_0/spin_0/eigenvalues Dataset {160}
/electrons/kpoint_0/spin_0/number_of_states Dataset {1}
/electrons/kpoint_0/spin_0/state_0 Group
/electrons/kpoint_0/spin_0/state_0/psi_g Dataset {94760, 2}
/electrons/kpoint_0/spin_0/state_0/psi_r Dataset {120, 120, 120, 2}
... (intermediate states omitted)
/electrons/kpoint_0/spin_0/state_156 Group
/electrons/kpoint_0/spin_0/state_156/psi_g Dataset {94760, 2}
/electrons/kpoint_0/spin_0/state_156/psi_r Dataset {120, 120, 120, 2}
/electrons/kpoint_0/weight Dataset {1}
/electrons/number_of_electrons Dataset {2}
/electrons/number_of_kpoints Dataset {1}
/electrons/number_of_spins Dataset {1}
/electrons/psi_r_is_complex Dataset {1}
/electrons/psi_r_mesh    Dataset {3}
/format                  Dataset {1}
/supercell               Group
/supercell/primitive_vectors Dataset {3, 3}
/version                 Dataset {3}
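
The shapes in the listing above also show why leaving out psi_r is recommended. The sketch below parses a few listing lines and estimates the dataset sizes. It is only an illustration: `parse_h5ls_line` is a hypothetical helper that handles the fixed-size shapes seen in this listing, and the 8 bytes per value is an assumption (double precision), since `h5ls -r` does not print the datatype.

```python
import re
from math import prod

# Matches lines such as "/atoms/positions         Dataset {64, 3}".
LINE = re.compile(
    r"^(?P<path>/\S*)\s+(?P<kind>Group|Dataset)(?:\s+\{(?P<shape>[^}]*)\})?\s*$"
)


def parse_h5ls_line(line):
    """Return (path, kind, shape) for one listing line, or None if it does not match."""
    m = LINE.match(line.strip())
    if m is None:
        return None
    shape = None
    if m.group("shape") is not None:
        shape = tuple(int(n) for n in m.group("shape").split(","))
    return m.group("path"), m.group("kind"), shape


sample = """\
/electrons/kpoint_0/spin_0/state_0 Group
/electrons/kpoint_0/spin_0/state_0/psi_g Dataset {94760, 2}
/electrons/kpoint_0/spin_0/state_0/psi_r Dataset {120, 120, 120, 2}
"""

BYTES_PER_VALUE = 8  # assumption: double precision; not shown by h5ls -r

sizes = {}
for line in sample.splitlines():
    parsed = parse_h5ls_line(line)
    if parsed and parsed[1] == "Dataset":
        path, _, shape = parsed
        sizes[path.rsplit("/", 1)[-1]] = prod(shape) * BYTES_PER_VALUE

print(sizes)
```

Under that assumption, psi_r for a single state takes about 27.6 MB while psi_g takes about 1.5 MB, roughly an 18-fold difference per state.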

Attachment: eshdf.xml (4k), Jeongnim Kim, May 2, 2011 2:09 PM