Class hdf_file (o2scl_hdf)

class o2scl_hdf::hdf_file

Store data in an O2scl-compatible HDF5 file.

See also the hdf_section section of the User’s guide.

The member functions which write or get data from an HDF file begin with either get or set. Where appropriate, the next character is either c for character, d for double, f for float, or i for int.

By default, vectors and matrices are written to HDF files in a chunked format, so their length can be changed later as necessary. The chunk size is chosen in def_chunk() to be the closest power of 10 to the current vector size.

All files not closed by the user are closed in the destructor, but the destructor does not automatically close groups.

Idea for Future:

This class opens all files in R/W mode, which may cause I/O problems in file systems. This needs to be fixed by allowing the user to open a read-only file. (AWS: 3/16/18 I think this is fixed now.)

Idea for Future:

The HDF functions do not consistently choose between throwing O2scl exceptions and throwing HDF5 exceptions. Check and/or fix this.

Idea for Future:

Automatically close groups, e.g. by storing hid_t’s in a stack?
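
One way the stack idea above could look is sketched below. This is only an illustration and not part of hdf_file: the class name handle_stack, its template parameter, and the closer callback are all hypothetical, and a plain integer type stands in for hid_t so that the sketch does not depend on HDF5.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical helper (not part of hdf_file): remembers opened handles and
// closes them in reverse order of opening when it goes out of scope.
template <class handle_t>
class handle_stack {
public:
  explicit handle_stack(std::function<void(handle_t)> closer)
    : closer_(std::move(closer)) {}

  ~handle_stack() { close_all(); }

  // Copying would close every handle twice, so it is disabled
  handle_stack(const handle_stack &) = delete;
  handle_stack &operator=(const handle_stack &) = delete;

  // Record a handle that was just opened and return it unchanged
  handle_t push(handle_t id) {
    ids_.push_back(id);
    return id;
  }

  // Close every recorded handle, most recently opened first
  void close_all() {
    while (!ids_.empty()) {
      handle_t id = ids_.back();
      ids_.pop_back();
      closer_(id);
    }
  }

  std::size_t size() const { return ids_.size(); }

private:
  std::function<void(handle_t)> closer_;
  std::vector<handle_t> ids_;
};
```

With a real file, the closer would presumably forward each id to close_group(), so that nested groups are released innermost first.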

Idea for Future:

Rewrite the _arr_alloc() functions so that they return a shared_ptr?

Idea for Future:

Move the code from the ‘filelist’ acol command here into hdf_file.

Note

Currently, HDF I/O functions write data to HDF files assuming that int and float have 4 bytes, while size_t and double are 8 bytes. All output is done in little endian format. While get functions can read data with different sizes or in big endian format, the set functions cannot currently write data this way.
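
A quick way to check whether a given platform matches the assumptions in the note above is sketched here. The function names are hypothetical and not part of the library; the checks only restate the sizes and byte order mentioned in the note.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// True if the fundamental types have the sizes assumed in the note:
// 4 bytes for int and float, 8 bytes for size_t and double.
// (Hypothetical helper, not part of hdf_file.)
constexpr bool sizes_match_note() {
  return sizeof(int) == 4 && sizeof(float) == 4 &&
         sizeof(std::size_t) == 8 && sizeof(double) == 8;
}

// True if the host stores the least significant byte first
// (Hypothetical helper, not part of hdf_file.)
inline bool host_is_little_endian() {
  const std::uint32_t probe = 1;
  unsigned char first = 0;
  std::memcpy(&first, &probe, 1);
  return first == 1;
}
```

If either check fails, files written on that host may need extra care before being shared with other machines.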

Note

It does make sense to write a zero-length vector to an HDF file if the vector does not have a fixed size, in order to create a placeholder for future output. Thus the set_vec() functions allow zero-length vectors and the set_arr() functions allow the size_t parameter to be zero, in which case the pointer parameter is ignored. The set_vec_fixed() and set_arr_fixed() functions do not allow this, and will throw an exception if sent a zero-length vector.

Warning

This class is still in development. Because of this, HDF5 files generated by this class may not be easily read by future versions. Later versions of O2scl may have stronger guarantees on backwards compatibility.

Open and close files

int open(std::string fname, bool write_access = false, bool err_on_fail = true)

Open a file named fname.

If err_on_fail is true, this calls the error handler if opening the file fails (e.g. because the file does not exist). If err_on_fail is false and opening the file fails, nothing is done and the function returns the value o2scl::exc_efilenotfound. If the open succeeds, this function returns o2scl::success.

void open_or_create(std::string fname)

Open a file named fname, or create it if it doesn’t already exist.

void close()

Close the file.

Manipulate ids

hid_t get_file_id()

Get the current file id.

void set_current_id(hid_t cur)

Set the current working id.

hid_t get_current_id()

Retrieve the current working id.

Simple get functions

If the specified object is not found, the error handler will be called.

int getc(std::string name, char &c)

Get a character named name.

int getd(std::string name, double &d)

Get a double named name.

int getf(std::string name, float &f)

Get a float named name.

int geti(std::string name, int &i)

Get an integer named name.

int get_szt(std::string name, size_t &u)

Get an unsigned integer named name.

int gets(std::string name, std::string &s)

Get a string named name.

Note

Strings are stored as character arrays and thus retrieving a string from a file requires loading the information from the file into a character array, and then copying it to the string. This will be slow for very long strings.

int gets_var(std::string name, std::string &s)

Get a variable length string named name.

int gets_fixed(std::string name, std::string &s)

Get a fixed-length string named name.

int gets_def_fixed(std::string name, std::string def, std::string &s)

Get a fixed-length string named name with default value def.

Simple set functions

void setc(std::string name, char c)

Set a character named name to value c.

void setd(std::string name, double d)

Set a double named name to value d.

void setf(std::string name, float f)

Set a float named name to value f.

void seti(std::string name, int i)

Set an integer named name to value i.

void set_szt(std::string name, size_t u)

Set an unsigned integer named name to value u.

void sets(std::string name, std::string s)

Set a string named name to value s.

The string is stored in the HDF file as an extensible character array rather than a string.

void sets_fixed(std::string name, std::string s)

Set a fixed-length string named name to value s.

This function stores s as a fixed-length string in the HDF file. If a dataset named name is already present, then s must not be longer than the string length already specified in the HDF file.

Group manipulation

hid_t open_group(hid_t init_id, std::string path)

Open a group relative to the location specified in init_id.

Note

In order to ensure that future objects are written to the newly-created group, the user must use set_current_id() using the newly-created group ID for the argument.

hid_t open_group(std::string path)

Open a group relative to the current location.

Note

In order to ensure that future objects are written to the newly-created group, the user must use set_current_id() using the newly-created group ID for the argument.

int close_group(hid_t group)

Close a previously created group.

Vector get functions

These functions automatically free any previously allocated memory in v and then allocate the proper space required to read the information from the HDF file.

int getd_vec(std::string name, std::vector<double> &v)

Get vector dataset and place data in v.

template<class vec_t>
int getd_vec_copy(std::string name, vec_t &v)

Get vector dataset and place data in v.

This works with any vector class which has a resize() method.

Idea for Future:

This currently requires a copy, but there may be a way to write a new version which does not.

int geti_vec(std::string name, std::vector<int> &v)

Get vector dataset and place data in v.

template<class vec_int_t>
int geti_vec_copy(std::string name, vec_int_t &v)

Get vector dataset and place data in v.

Idea for Future:

This currently requires a copy, but there may be a way to write a new version which does not.

int get_szt_vec(std::string name, std::vector<size_t> &v)

Get vector dataset and place data in v.

template<class vec_size_t>
int get_szt_vec_copy(std::string name, vec_size_t &v)

Get vector dataset and place data in v.

Idea for Future:

This currently requires a copy, but there may be a way to write a new version which does not.

int gets_vec(std::string name, std::vector<std::string> &s)

Get a vector of strings named name and store it in s.

Vector set functions

These functions automatically write all of the vector elements to the HDF file, if necessary extending the data that is already present.

int setd_vec(std::string name, const std::vector<double> &v)

Set vector dataset named name with v.

template<class vec_t>
int setd_vec_copy(std::string name, const vec_t &v)

Set vector dataset named name with v.

This requires a copy before the vector is written to the file.

int seti_vec(std::string name, const std::vector<int> &v)

Set vector dataset named name with v.

template<class vec_int_t>
int seti_vec_copy(std::string name, vec_int_t &v)

Set vector dataset named name with v.

This requires a copy before the vector is written to the file.

int set_szt_vec(std::string name, const std::vector<size_t> &v)

Set vector dataset named name with v.

template<class vec_size_t>
int set_szt_vec_copy(std::string name, const vec_size_t &v)

Set vector dataset named name with v.

This requires a copy before the vector is written to the file.

int sets_vec(std::string name, const std::vector<std::string> &s)

Set a vector of strings named name.

Developer note:

String vectors are reformatted as a single character array, in order to allow each string to have a different length and to make each string extensible. The size of the vector s is stored as an integer named nw.
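
The reformatting described in the developer note can be pictured with the sketch below. It is not the library's code: the struct packed_strings and the functions pack() and unpack() are hypothetical, and keeping the per-string lengths in a separate array is an assumption, since the note only states that the strings become one character array and that the count is stored as nw.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical in-memory picture of a packed string vector
struct packed_strings {
  std::size_t nw = 0;            // number of strings, as in the note
  std::vector<std::size_t> len;  // length of each string (assumed layout)
  std::vector<char> data;        // all characters, concatenated
};

// Flatten a vector of strings into one character array
packed_strings pack(const std::vector<std::string> &s) {
  packed_strings p;
  p.nw = s.size();
  for (const std::string &w : s) {
    p.len.push_back(w.size());
    p.data.insert(p.data.end(), w.begin(), w.end());
  }
  return p;
}

// Rebuild the original strings from the packed form
std::vector<std::string> unpack(const packed_strings &p) {
  std::vector<std::string> s;
  std::size_t pos = 0;
  for (std::size_t i = 0; i < p.nw; i++) {
    s.emplace_back(p.data.begin() + pos, p.data.begin() + pos + p.len[i]);
    pos += p.len[i];
  }
  return s;
}
```

Because the characters live in one array, strings of different lengths need no padding, and appending more strings only extends the arrays.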

Matrix get functions

These functions automatically free any previously allocated memory in m and then allocate the proper space required to read the information from the HDF file.

int getd_mat_copy(std::string name, ubmatrix &m)

Get matrix dataset and place data in m.

int geti_mat_copy(std::string name, ubmatrix_int &m)

Get matrix dataset and place data in m.

Matrix set functions

These functions automatically write all of the matrix elements to the HDF file, if necessary extending the data that is already present.

int setd_mat_copy(std::string name, const ubmatrix &m)

Set matrix dataset named name with m.

int seti_mat_copy(std::string name, const ubmatrix_int &m)

Set matrix dataset named name with m.

template<class arr2d_t>
int setd_arr2d_copy(std::string name, size_t r, size_t c, const arr2d_t &a2d)

Set a two-dimensional array dataset named name with a2d.

template<class arr2d_t>
int seti_arr2d_copy(std::string name, size_t r, size_t c, const arr2d_t &a2d)

Set a two-dimensional array dataset named name with a2d.

template<class arr2d_t>
int set_szt_arr2d_copy(std::string name, size_t r, size_t c, const arr2d_t &a2d)

Set a two-dimensional array dataset named name with a2d.

Tensor I/O functions

int getd_ten(std::string name, o2scl::tensor<double, std::vector<double>, std::vector<size_t>> &t)

Get a tensor of double-precision numbers from an HDF file.

This version does not require a full copy of the tensor.

int geti_ten(std::string name, o2scl::tensor<int, std::vector<int>, std::vector<size_t>> &t)

Get a tensor of integers from an HDF file.

This version does not require a full copy of the tensor.

int get_szt_ten(std::string name, o2scl::tensor<size_t, std::vector<size_t>, std::vector<size_t>> &t)

Get a tensor of size_t from an HDF file.

This version does not require a full copy of the tensor.

template<class vec_t, class vec_size_t>
int getd_ten_copy(std::string name, o2scl::tensor<double, vec_t, vec_size_t> &t)

Get a tensor of double-precision numbers from an HDF file.

This version requires a full copy of the tensor from the HDF5 file into the o2scl::tensor object.

template<class vec_t, class vec_size_t>
int geti_ten_copy(std::string name, o2scl::tensor<int, vec_t, vec_size_t> &t)

Get a tensor of integers from an HDF file.

This version requires a full copy of the tensor from the HDF5 file into the o2scl::tensor object.

int setd_ten(std::string name, const o2scl::tensor<double, std::vector<double>, std::vector<size_t>> &t)

Write a tensor of double-precision numbers to an HDF file.

You may overwrite a tensor already present in the HDF file only if it has the same rank. This version does not require a full copy of the tensor.

int seti_ten(std::string name, const o2scl::tensor<int, std::vector<int>, std::vector<size_t>> &t)

Write a tensor of integers to an HDF file.

You may overwrite a tensor already present in the HDF file only if it has the same rank. This version does not require a full copy of the tensor.

int set_szt_ten(std::string name, const o2scl::tensor<size_t, std::vector<size_t>, std::vector<size_t>> &t)

Write a tensor of size_t to an HDF file.

You may overwrite a tensor already present in the HDF file only if it has the same rank. This version does not require a full copy of the tensor.

template<class vec_t, class vec_size_t>
int setd_ten_copy(std::string name, const o2scl::tensor<double, std::vector<double>, std::vector<size_t>> &t)

Write a tensor of double-precision numbers to an HDF file.

You may overwrite a tensor already present in the HDF file only if it has the same rank. This version requires a full copy of the tensor from the o2scl::tensor object into the HDF5 file.

template<class vec_t, class vec_size_t>
int seti_ten_copy(std::string name, const o2scl::tensor<int, std::vector<int>, std::vector<size_t>> &t)

Write a tensor of integers to an HDF file.

You may overwrite a tensor already present in the HDF file only if it has the same rank. This version requires a full copy of the tensor from the o2scl::tensor object into the HDF5 file.

Array get functions

All of these functions assume that the pointer has been allocated beforehand and that its size matches the size of the array in the HDF file. If the specified object is not found, the error handler will be called.

int getc_arr(std::string name, size_t n, char *c)

Get a character array named name of size n.

Note

The pointer c must be allocated beforehand to hold n entries, and n must match the size of the array in the HDF file.

int getd_arr(std::string name, size_t n, double *d)

Get a double array named name of size n.

Note

The pointer d must be allocated beforehand to hold n entries, and n must match the size of the array in the HDF file.

int getd_arr_compr(std::string name, size_t n, double *d, int &compr)

Get a double array named name of size n and put the compression type in compr.

Note

The pointer d must be allocated beforehand to hold n entries, and n must match the size of the array in the HDF file.

int getf_arr(std::string name, size_t n, float *f)

Get a float array named name of size n.

Note

The pointer f must be allocated beforehand to hold n entries, and n must match the size of the array in the HDF file.

int geti_arr(std::string name, size_t n, int *i)

Get an integer array named name of size n.

Note

The pointer i must be allocated beforehand to hold n entries, and n must match the size of the array in the HDF file.

Array get functions with memory allocation

These functions allocate memory with new, which should be freed by the user with delete.

int getc_arr_alloc(std::string name, size_t &n, char *c)

Get a character array named name of size n.

int getd_arr_alloc(std::string name, size_t &n, double *d)

Get a double array named name of size n.

int getf_arr_alloc(std::string name, size_t &n, float *f)

Get a float array named name of size n.

int geti_arr_alloc(std::string name, size_t &n, int *i)

Get an integer array named name of size n.

Array set functions

int setc_arr(std::string name, size_t n, const char *c)

Set a character array named name of size n to value c.

int setd_arr(std::string name, size_t n, const double *d)

Set a double array named name of size n to value d.

int setf_arr(std::string name, size_t n, const float *f)

Set a float array named name of size n to value f.

int seti_arr(std::string name, size_t n, const int *i)

Set an integer array named name of size n to value i.

int set_szt_arr(std::string name, size_t n, const size_t *u)

Set a size_t array named name of size n to value u.

Fixed-length array set functions

If a dataset named name is already present, then the user-specified array must not be longer than the array already present in the HDF file.

int setc_arr_fixed(std::string name, size_t n, const char *c)

Set a character array named name of size n to value c.

int setd_arr_fixed(std::string name, size_t n, const double *c)

Set a double array named name of size n to value c.

int setf_arr_fixed(std::string name, size_t n, const float *f)

Set a float array named name of size n to value f.

int seti_arr_fixed(std::string name, size_t n, const int *i)

Set an integer array named name of size n to value i.

Get functions with default values

If the requested dataset is not found in the HDF file, the object is set to the specified default value and the error handler is not called.

int getc_def(std::string name, char def, char &c)

Get a character named name.

int getd_def(std::string name, double def, double &d)

Get a double named name.

int getf_def(std::string name, float def, float &f)

Get a float named name.

int geti_def(std::string name, int def, int &i)

Get an integer named name.

int get_szt_def(std::string name, size_t def, size_t &i)

Get a size_t named name.

int gets_def(std::string name, std::string def, std::string &s)

Get a string named name.

int gets_var_def(std::string name, std::string def, std::string &s)

Get a variable length string named name.

Get functions with pre-allocated pointer

int getd_vec_prealloc(std::string name, size_t n, double *d)

Get a double array d pre-allocated to have size n.

int geti_vec_prealloc(std::string name, size_t n, int *i)

Get an integer array i pre-allocated to have size n.

int getd_mat_prealloc(std::string name, size_t n, size_t m, double *d)

Get a double matrix d pre-allocated to have size (n,m).

int geti_mat_prealloc(std::string name, size_t n, size_t m, int *i)

Get an integer matrix i pre-allocated to have size (n,m).
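
For the matrix versions, the caller supplies one flat buffer holding n*m entries. The sketch below shows one way to allocate and index such a buffer. The helper names are hypothetical, and row-major ordering (the usual convention for the HDF5 C interface) is an assumption, since the layout is not stated above.

```cpp
#include <cstddef>
#include <vector>

// Allocate a zero-filled flat buffer large enough for an (n,m) matrix.
// (Hypothetical helper, not part of hdf_file.)
inline std::vector<double> make_matrix_buffer(std::size_t n, std::size_t m) {
  return std::vector<double>(n * m, 0.0);
}

// Position of element (i,j) in the flat buffer, assuming row-major order
// with m entries per row. (Hypothetical helper, not part of hdf_file.)
inline std::size_t flat_index(std::size_t i, std::size_t j, std::size_t m) {
  return i * m + j;
}
```

A buffer created this way exposes its storage through data(), which yields the double pointer that a pre-allocated get function expects.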

Find a group

int find_object_by_type(std::string type, std::string &name, int verbose = 0)

Look in the file for an object of type type and, if found, set name to the associated object name.

This function returns 0 if an object of type type is found and o2scl::exc_enoprog if it fails.

int find_object_by_name(std::string name, std::string &type, int verbose = 0)

Look in the file for an object with name name and, if found, set type to the associated type.

This function returns 0 if an object with name name is found and o2scl::exc_enoprog if it fails.

int find_object_by_pattern(std::string name, std::string &type, int verbose = 0)

Look in the file for an object whose name matches the pattern given in name, using fnmatch().

If an object is found, type is set to the associated type. This function returns 0 if a matching object is found and o2scl::exc_enoprog if it fails.
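
The kind of matching that fnmatch() performs can be tried out with the small POSIX sketch below. The wrapper name_matches() is hypothetical, and passing 0 for the flags argument is an assumption; the flags used inside the library are not stated here.

```cpp
#include <fnmatch.h>
#include <string>

// Return true if name matches the shell-style wildcard pattern.
// (Hypothetical wrapper around POSIX fnmatch(), with flags assumed to be 0.)
inline bool name_matches(const std::string &pattern, const std::string &name) {
  return fnmatch(pattern.c_str(), name.c_str(), 0) == 0;
}
```

In these patterns, * matches any run of characters, ? matches a single character, and a bracket expression such as [a-c] matches one character from a set.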

void file_list(int verbose)

List datasets and objects in the top-level of the file.

void copy(int verbose, hdf_file &hf2)

Desc.

Mode values for iterate_parms

const int ip_filelist = 1
const int ip_name_from_type = 2
const int ip_type_from_name = 3
const int ip_type_from_pattern = 4

void type_process(iterate_parms &ip, int mode, size_t ndims, hsize_t dims[100], hsize_t max_dims[100], std::string base_type, std::string name)

Process a type for iterate_func()

herr_t iterate_func(hid_t loc, const char *name, const H5L_info_t *inf, void *op_data)

HDF object iteration function.

herr_t iterate_copy_func(hid_t loc, const char *name, const H5L_info_t *inf, void *op_data)

HDF5 object iteration function when copying.

hdf_file(const hdf_file&)
hdf_file &operator=(const hdf_file&)

Public Types

typedef boost::numeric::ublas::vector<double> ubvector
typedef boost::numeric::ublas::matrix<double> ubmatrix
typedef boost::numeric::ublas::vector<int> ubvector_int
typedef boost::numeric::ublas::matrix<int> ubmatrix_int

Public Functions

hdf_file()
~hdf_file()
bool has_write_access()

If true, then the file has read and write access.

Public Members

int compr_type

Compression type (support experimental)

size_t min_compr_size

Minimum size to compress by default.

Protected Functions

hsize_t def_chunk(size_t n)

Default chunk size.

Choose the closest power of 10 which is greater than or equal to 10 and less than or equal to \( 10^6 \).
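
The rule described above can be sketched as follows. This is not the library's implementation: the function name sketch_def_chunk is hypothetical, unsigned long long stands in for hsize_t, and "closest" is assumed to be measured on a logarithmic scale, which is not stated explicitly.

```cpp
#include <cmath>
#include <cstddef>

// Closest power of 10 to n, clamped to the range [10, 10^6].
// (Hypothetical sketch; "closest" is taken on a logarithmic scale.)
unsigned long long sketch_def_chunk(std::size_t n) {
  const unsigned long long lo = 10ULL;
  const unsigned long long hi = 1000000ULL;
  // log10(0) is undefined, so treat an empty vector as the smallest chunk
  if (n == 0) return lo;
  // Nearest integer exponent of 10 for the given size
  const double e = std::floor(std::log10(static_cast<double>(n)) + 0.5);
  if (e <= 1.0) return lo;
  if (e >= 6.0) return hi;
  // Build 10^e with integer arithmetic to avoid rounding in pow()
  unsigned long long chunk = 1;
  for (int k = 0; k < static_cast<int>(e); k++) chunk *= 10;
  return chunk;
}
```

Under this assumption a vector of 250 entries would get a chunk size of 100, while one of 400 entries would get 1000.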

Protected Attributes

hid_t file

File ID.

bool file_open

True if a file has been opened.

hid_t current

Current file or group location.

bool write_access

If true, then the file has read and write access.

struct iterate_copy_parms

Parameters for iterate_copy_func()

Public Members

o2scl_hdf::hdf_file *hf
o2scl_hdf::hdf_file *hf2
int verbose

struct iterate_parms

Parameters for iterate_func()

Public Members

std::string tname
o2scl_hdf::hdf_file *hf
bool found
std::string type
int verbose
int mode