pyiron.base.generic.hdfio module

class pyiron.base.generic.hdfio.FileHDFio(file_name, h5_path='/', mode='a')[source]

Bases: object

Class that provides all information needed to access an HDF5 file. This class is based on h5io.py, which allows getting and putting a large variety of jobs to/from HDF5.

Parameters:
  • file_name (str) – absolute path of the HDF5 file
  • h5_path (str) – absolute path inside the h5 path - starting from the root group
  • mode (str) – mode : {‘a’, ‘w’, ‘r’, ‘r+’}, default ‘a’ See HDFStore docstring or tables.open_file for info about modes
file_name
absolute path to the HDF5 file
h5_path
path inside the HDF5 file - also stored as absolute path
history
previously opened groups / folders
file_exists
boolean if the HDF5 file was already written
base_name
name of the HDF5 file but without any file extension
file_path
directory where the HDF5 file is located
is_root
boolean if the HDF5 object is located at the root level of the HDF5 file
is_open
boolean if the HDF5 file is currently opened - if an active file handler exists
is_empty
boolean if the HDF5 file is empty
base_name

Name of the HDF5 file - but without the file extension .h5

Returns:file name without the file extension
Return type:str
close()[source]

Close the current HDF5 path and return to the path before the last open

copy()[source]

Copy the Python object which links to the HDF5 file - in contrast to copy_to() which copies the content of the HDF5 file to a new location.

Returns:New FileHDFio object pointing to the same HDF5 file
Return type:FileHDFio
copy_to(destination, file_name=None, maintain_name=True)[source]

Copy the content of the HDF5 file to a new location

Parameters:
  • destination (FileHDFio) – FileHDFio object pointing to the new location
  • file_name (str) – name of the new HDF5 file - optional
  • maintain_name (bool) – by default the names of the HDF5 groups are maintained
Returns:

FileHDFio object pointing to a file which now contains the same content as the file of the current FileHDFio object.

Return type:

FileHDFio

create_group(name)[source]

Create an HDF5 group - similar to a folder in the filesystem - the HDF5 groups allow the users to structure their data.

Parameters:name (str) – name of the HDF5 group
Returns:FileHDFio object pointing to the new group
Return type:FileHDFio
file_exists

Check if the HDF5 file exists already

Returns:[True/False]
Return type:bool
file_name

Get the file name of the HDF5 file

Returns:absolute path to the HDF5 file
Return type:str
file_path

Path where the HDF5 file is located - posixpath.dirname()

Returns:HDF5 file location
Return type:str
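The relationship between the file_name, file_path, and base_name attributes can be sketched with the standard library (a hypothetical stand-alone helper for illustration, not the pyiron implementation):

```python
import posixpath

def split_hdf5_file_name(file_name):
    """Split an absolute HDF5 file name into the documented attributes:
    file_path (the containing directory) and base_name (the file name
    without the .h5 extension)."""
    # file_path is the directory where the HDF5 file is located - posixpath.dirname()
    file_path = posixpath.dirname(file_name)
    # base_name is the name of the HDF5 file without any file extension
    base_name = posixpath.splitext(posixpath.basename(file_name))[0]
    return file_path, base_name

file_path, base_name = split_hdf5_file_name("/home/user/project/job.h5")
# file_path → '/home/user/project', base_name → 'job'
```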
static file_size(hdf)[source]

Get size of the HDF5 file

Parameters:hdf (FileHDFio) – hdf file
Returns:file size in Bytes
Return type:float
get(key)[source]

Internal wrapper function for __getitem__() - self[key]

Parameters:key (str, slice) – path to the data or key of the data object
Returns:data or data object
Return type:dict, list, float, int
get_from_table(path, name)[source]

Get a specific value from a pandas.DataFrame

Parameters:
  • path (str) – relative path to the data object
  • name (str) – parameter key
Returns:

the value associated to the specific parameter key

Return type:

dict, list, float, int

get_pandas(name)[source]

Load a dictionary from the HDF5 file and display it as a pandas DataFrame

Parameters:name (str) – HDF5 node name
Returns:The dictionary is returned as a pandas.DataFrame object
Return type:pandas.DataFrame
get_size(hdf)[source]

Get size of the groups inside the HDF5 file

Parameters:hdf (FileHDFio) – hdf file
Returns:file size in Bytes
Return type:float
groups()[source]

Filter HDF5 file by groups

Returns:an HDF5 file which is filtered by groups
Return type:FileHDFio
h5_path

Get the path in the HDF5 file starting from the root group - meaning this path starts with ‘/’

Returns:HDF5 path
Return type:str
hd_copy(hdf_old, hdf_new, exclude_groups=None, exclude_nodes=None)[source]

Copy the content of one HDF5 file to another, optionally excluding the specified groups and nodes.

Parameters:
  • hdf_old (ProjectHDFio) – old hdf
  • hdf_new (ProjectHDFio) – new hdf
  • exclude_groups (list/None) – list of groups to delete
  • exclude_nodes (list/None) – list of nodes to delete
is_empty

Check if the HDF5 file is empty

Returns:[True/False]
Return type:bool
is_root

Check if the current h5_path is pointing to the HDF5 root group.

Returns:[True/False]
Return type:bool
items()[source]

List all keys and values as items of all groups and nodes of the HDF5 file

Returns:list of sets (key, value)
Return type:list
keys()[source]

List all groups and nodes of the HDF5 file - where groups are equivalent to directories and nodes to files.

Returns:all groups and nodes
Return type:list
list_all()[source]

List all groups and nodes of the HDF5 file - where groups are equivalent to directories and nodes to files.

Returns:{‘groups’: [list of groups], ‘nodes’: [list of nodes]}
Return type:dict
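The distinction between groups (directory-like) and nodes (file-like) that list_all() reports can be illustrated with a plain nested dictionary standing in for the HDF5 hierarchy (a toy model, not the actual pyiron implementation):

```python
def list_all(hierarchy):
    """Split the children of one HDF5-like level into groups and nodes,
    mirroring the {'groups': [...], 'nodes': [...]} layout of list_all().
    In this toy model a dict value is a group, anything else a node."""
    groups = [key for key, value in hierarchy.items() if isinstance(value, dict)]
    nodes = [key for key, value in hierarchy.items() if not isinstance(value, dict)]
    return {"groups": sorted(groups), "nodes": sorted(nodes)}

h5 = {"input": {"structure": [1, 2, 3]}, "output": {}, "status": "finished"}
# list_all(h5) → {'groups': ['input', 'output'], 'nodes': ['status']}
```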
list_dirs()[source]

equivalent to os.listdir (consider groups as equivalent to dirs)

Returns:list of groups in pytables for the path self.h5_path
Return type:list
list_groups()[source]

equivalent to os.listdir (consider groups as equivalent to dirs)

Returns:list of groups in pytables for the path self.h5_path
Return type:list
list_nodes()[source]

List all groups and nodes of the HDF5 file

Returns:list of nodes
Return type:list
listdirs()[source]

equivalent to os.listdir (consider groups as equivalent to dirs)

Returns:list of groups in pytables for the path self.h5_path
Return type:list
nodes()[source]

Filter HDF5 file by nodes

Returns:an HDF5 file which is filtered by nodes
Return type:FileHDFio
open(h5_rel_path)[source]

Create an HDF5 group and enter this specific group. If the group exists in the HDF5 path only the h5_path is set correspondingly otherwise the group is created first.

Parameters:h5_rel_path (str) – relative path from the current HDF5 path - h5_path - to the new group
Returns:FileHDFio object pointing to the new group
Return type:FileHDFio
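The open()/close() pair behaves like a navigation stack over h5_path: close() returns to the path before the last open(), which is what the history attribute records. A minimal sketch of that bookkeeping (a hypothetical model, not the pyiron code):

```python
import posixpath

class PathStack:
    """Toy model of the h5_path / history bookkeeping behind open() and close()."""

    def __init__(self):
        self.h5_path = "/"   # start at the HDF5 root group
        self.history = []    # previously opened groups / folders

    def open(self, h5_rel_path):
        # Remember the current path so close() can return to it
        self.history.append(self.h5_path)
        self.h5_path = posixpath.join(self.h5_path, h5_rel_path)
        return self

    def close(self):
        # Return to the path before the last open
        self.h5_path = self.history.pop()

hdf = PathStack()
hdf.open("input").open("structure")
# hdf.h5_path is now '/input/structure'
hdf.close()
# hdf.h5_path is back to '/input'
```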
put(key, value)[source]

Store data inside the HDF5 file

Parameters:
  • key (str) – key to store the data
  • value (pandas.DataFrame, pandas.Series, dict, list, float, int) – basically any kind of data is supported
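put() and get() give FileHDFio its dict-like feel: get() wraps __getitem__(), so hdf[key] and hdf.get(key) are equivalent. A minimal in-memory sketch of that interface (a hypothetical stand-in, not backed by a real HDF5 file):

```python
class DictLikeStore:
    """Toy model of the put()/get() interface documented for FileHDFio."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Store data under the given key - basic Python types are supported
        self._data[key] = value

    def get(self, key):
        # Wrapper around __getitem__ - self[key]
        return self[key]

    def __getitem__(self, key):
        return self._data[key]

store = DictLikeStore()
store.put("energy", -1.5)
# store.get("energy") and store["energy"] both return -1.5
```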
remove_file()[source]

Remove the HDF5 file with all the related content

remove_group()[source]

Remove an HDF5 group - if it exists. If the group does not exist no error message is raised.

rewrite_hdf5(job_name, info=False, exclude_groups=None, exclude_nodes=None)[source]

Rewrite the HDF5 file, deleting the specified groups and nodes to reclaim the space they occupied.

Parameters:
  • job_name (str) – name of the job
  • info (bool) – whether to print the information on how much space has been saved
  • exclude_groups (list/None) – list of groups to delete from hdf
  • exclude_nodes (list/None) – list of nodes to delete from hdf
show_hdf()[source]

Iterate over the HDF5 data structure and generate a human-readable graph.

values()[source]

List all values for all groups and nodes of the HDF5 file

Returns:list of all values
Return type:list
class pyiron.base.generic.hdfio.HDFStoreIO(path, mode=None, complevel=None, complib=None, fletcher32=False, **kwargs)[source]

Bases: pandas.io.pytables.HDFStore

dict-like IO interface for storing pandas objects in PyTables, in either Fixed or Table format - copied from pandas.HDFStore

Parameters:
  • path (str) – File path to HDF5 file
  • mode (str) – {‘a’, ‘w’, ‘r’, ‘r+’}, default ‘a’

    'r'
    Read-only; no data can be modified.
    'w'
    Write; a new file is created (an existing file with the same name would be deleted).
    'a'
    Append; an existing file is opened for reading and writing, and if the file does not exist it is created.
    'r+'
    Similar to 'a', but the file must already exist.
  • complevel (int) – 1-9, default 0 If a complib is specified compression will be applied where possible
  • complib (str) – {‘zlib’, ‘bzip2’, ‘lzo’, ‘blosc’, None}, default None If complevel is > 0 apply compression to objects written in the store wherever possible
  • fletcher32 (bool) – bool, default False If applying compression use the fletcher32 checksum
open(**kwargs)[source]

Open the file in the specified mode - copied from pandas.HDFStore.open()

Parameters:**kwargs – mode : {‘a’, ‘w’, ‘r’, ‘r+’}, default ‘a’ See HDFStore docstring or tables.open_file for info about modes
Returns:self - in contrast to the original implementation in pandas.
Return type:HDFStoreIO
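Returning self - in contrast to pandas.HDFStore.open(), which returns None - is what allows constructing and opening the store in a single chained expression. The pattern, sketched with a generic hypothetical class rather than the real HDFStoreIO:

```python
class ChainableStore:
    """Minimal illustration of the return-self pattern used by HDFStoreIO.open()."""

    def __init__(self, path):
        self.path = path
        self.mode = None

    def open(self, mode="a"):
        self.mode = mode
        # Returning self - in contrast to the original pandas implementation -
        # enables call chaining: ChainableStore(path).open(mode="r")
        return self

store = ChainableStore("data.h5").open(mode="r")
# store.mode is 'r'
```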
class pyiron.base.generic.hdfio.ProjectHDFio(project, file_name, h5_path=None, mode=None)[source]

Bases: pyiron.base.generic.hdfio.FileHDFio

The ProjectHDFio class connects the FileHDFio and the Project class. It is derived from the FileHDFio class, but in addition a Project object instance is located at self.project, enabling direct access to the database and other project-related functionality, some of which is mapped to the ProjectHDFio class as well.

Parameters:
  • project (Project) – pyiron Project the current HDF5 project is located in
  • file_name (str) – name of the HDF5 file - in contrast to the FileHDFio object where file_name represents the absolute path of the HDF5 file.
  • h5_path (str) – absolute path inside the h5 path - starting from the root group
  • mode (str) – mode : {‘a’, ‘w’, ‘r’, ‘r+’}, default ‘a’ See HDFStore docstring or tables.open_file for info about modes
.. attribute:: project

Project instance the ProjectHDFio object is located in

.. attribute:: root_path

the pyiron user directory, defined in the .pyiron configuration

.. attribute:: project_path

the relative path of the current project / folder starting from the root path of the pyiron user directory

.. attribute:: path

the absolute path of the current project / folder plus the absolute path in the HDF5 file as one path

.. attribute:: file_name

absolute path to the HDF5 file

.. attribute:: h5_path

path inside the HDF5 file - also stored as absolute path

.. attribute:: history

previously opened groups / folders

.. attribute:: file_exists

boolean if the HDF5 file was already written

.. attribute:: base_name

name of the HDF5 file but without any file extension

.. attribute:: file_path

directory where the HDF5 file is located

.. attribute:: is_root

boolean if the HDF5 object is located at the root level of the HDF5 file

.. attribute:: is_open

boolean if the HDF5 file is currently opened - if an active file handler exists

.. attribute:: is_empty

boolean if the HDF5 file is empty

.. attribute:: user

current unix/linux/windows user who is running pyiron

.. attribute:: sql_query

an SQL query to limit the jobs within the project to a subset which matches the SQL query.

.. attribute:: db

connection to the SQL database

.. attribute:: working_directory

working directory the job is executed in - outside the HDF5 file

base_name

The absolute path of the current pyiron project - absolute path on the file system, not including the HDF5 path.

Returns:current project path
Return type:str
copy()[source]

Copy the ProjectHDFio object - copying just the Python object but maintaining the same pyiron path

Returns:copy of the ProjectHDFio object
Return type:ProjectHDFio
create_hdf(path, job_name)[source]

Create a ProjectHDFio object to store project-related information - for testing aggregated data

Parameters:
  • path (str) – absolute path
  • job_name (str) – name of the HDF5 container
Returns:

HDF5 object

Return type:

ProjectHDFio

create_object(class_name, **qwargs)[source]

Internal function to create a pyiron object

Parameters:
  • class_name (str) – name of a pyiron class
  • **qwargs – object parameters
Returns:

defined by the pyiron class in class_name with the input from **qwargs

Return type:

pyiron object

create_working_directory()[source]

Create the working directory on the file system if it does not exist already.

db

Get connection to the SQL database

Returns:database connection
Return type:DatabaseAccess
get_job_id(job_specifier)[source]

Get the job_id of the job specified by job_specifier in the local project path from the database

Parameters:job_specifier (str, int) – name of the job or job ID
Returns:job ID of the job
Return type:int
inspect(job_specifier)[source]

Inspect an existing pyiron object - most commonly a job - from the database

Parameters:job_specifier (str, int) – name of the job or job ID
Returns:Access to the HDF5 object - not a GenericJob object - use load() instead.
Return type:JobCore
load(job_specifier, convert_to_object=True)[source]

Load an existing pyiron object - most commonly a job - from the database

Parameters:
  • job_specifier (str, int) – name of the job or job ID
  • convert_to_object (bool) – convert the object to a pyiron object or only access the HDF5 file - default=True. Accessing only the HDF5 file is about an order of magnitude faster, but only provides limited functionality. Compare the GenericJob object to the JobCore object.
Returns:

Either the full GenericJob object or just a reduced JobCore object

Return type:

GenericJob, JobCore

load_from_jobpath(job_id=None, db_entry=None, convert_to_object=True)[source]

Internal function to load an existing job either based on the job ID or based on the database entry dictionary.

Parameters:
  • job_id (int) – Job ID - optional, but either the job_id or the db_entry is required.
  • db_entry (dict) – database entry dictionary - optional, but either the job_id or the db_entry is required.
  • convert_to_object (bool) – convert the object to a pyiron object or only access the HDF5 file - default=True. Accessing only the HDF5 file is about an order of magnitude faster, but only provides limited functionality. Compare the GenericJob object to the JobCore object.
Returns:

Either the full GenericJob object or just a reduced JobCore object

Return type:

GenericJob, JobCore

name
path

Absolute path of the HDF5 group starting from the system root - combination of the absolute system path plus the absolute path inside the HDF5 file starting from the root group.

Returns:absolute path
Return type:str
project

Get the project instance the ProjectHDFio object is located in

Returns:pyiron project
Return type:Project
project_path

the relative path of the current project / folder starting from the root path of the pyiron user directory

Returns:relative path of the current project / folder
Return type:str
remove_job(job_specifier, _unprotect=False)[source]

Remove a single job from the project based on its job_specifier - see also remove_jobs()

Parameters:
  • job_specifier (str, int) – name of the job or job ID
  • _unprotect (bool) – [True/False] delete the job without validating the dependencies to other jobs - default=False
root_path

the pyiron user directory, defined in the .pyiron configuration

Returns:pyiron user directory of the current project
Return type:str
sql_query

Get the SQL query for the project

Returns:SQL query
Return type:str
to_object(object_type=None, **qwargs)[source]

Load the full pyiron object from an HDF5 file

Parameters:
  • object_type – if the ‘TYPE’ node is not available in the HDF5 file a manual object type can be set - optional
  • **qwargs – optional parameters [‘job_name’, ‘project’] - to specify the location of the HDF5 path
Returns:

pyiron object

Return type:

GenericJob

user

Get the current unix/linux/windows user who is running pyiron

Returns:username
Return type:str
working_directory

Get the working directory of the current ProjectHDFio object. The working directory equals the path, but it is represented on the filesystem:

  /absolute/path/to/the/file.h5/path/inside/the/hdf5/file

becomes:

  /absolute/path/to/the/file_hdf5/path/inside/the/hdf5/file

Returns:absolute path to the working directory
Return type:str
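The path transformation described for working_directory - mapping the .h5 file to a _hdf5 directory - can be sketched as a plain string operation (a hypothetical helper for illustration, not the pyiron implementation):

```python
def working_directory_from_path(path):
    """Map the combined project/HDF5 path to its filesystem working directory:
    .../file.h5/inner/path  ->  .../file_hdf5/inner/path"""
    # Replace the first '.h5/' file suffix with a '_hdf5/' directory suffix
    return path.replace(".h5/", "_hdf5/", 1)

wd = working_directory_from_path(
    "/absolute/path/to/the/file.h5/path/inside/the/hdf5/file"
)
# → '/absolute/path/to/the/file_hdf5/path/inside/the/hdf5/file'
```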