pyiron.base.generic.hdfio module¶
-
class
pyiron.base.generic.hdfio.
FileHDFio
(file_name, h5_path='/', mode='a')[source]¶ Bases:
object
Class that provides all information needed to access an h5 file. This class is based on h5io.py, which allows getting and putting a large variety of jobs to and from HDF5 files.
- Parameters
file_name (str) – absolute path of the HDF5 file
h5_path (str) – absolute path inside the HDF5 file, starting from the root group
mode (str) – one of {‘a’, ‘w’, ‘r’, ‘r+’}, default ‘a’. See the HDFStore docstring or tables.open_file for information about modes.
-
file_name
¶ -
absolute path to the HDF5 file
-
h5_path
¶ -
path inside the HDF5 file - also stored as absolute path
-
history
¶ -
previously opened groups / folders
-
file_exists
¶ -
boolean indicating whether the HDF5 file has already been written
-
base_name
¶ -
name of the HDF5 file but without any file extension
-
file_path
¶ -
directory where the HDF5 file is located
-
is_root
¶ -
boolean if the HDF5 object is located at the root level of the HDF5 file
-
is_open
¶ -
boolean if the HDF5 file is currently opened - if an active file handler exists
-
is_empty
¶ -
boolean if the HDF5 file is empty
-
property
base_name
¶ Name of the HDF5 file - but without the file extension .h5
- Returns
file name without the file extension
- Return type
str
-
copy
()[source]¶ Copy the Python object which links to the HDF5 file - in contrast to copy_to() which copies the content of the HDF5 file to a new location.
- Returns
New FileHDFio object pointing to the same HDF5 file
- Return type
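The difference between copying the Python object and copying the file content can be illustrated without pyiron. The following is a minimal sketch using only the standard library; the `Handle` class and the file names are hypothetical stand-ins, not pyiron's implementation. `copy.copy` duplicates only the Python object, while `shutil.copyfile` duplicates the bytes on disk.

```python
import copy
import os
import shutil
import tempfile


class Handle:
    """Hypothetical stand-in for a file handle object; not pyiron's FileHDFio."""

    def __init__(self, file_name, h5_path="/"):
        self.file_name = file_name
        self.h5_path = h5_path


tmp_dir = tempfile.mkdtemp()
source_file = os.path.join(tmp_dir, "source.h5")
with open(source_file, "wb") as f:
    f.write(b"placeholder content")  # placeholder bytes, not a real HDF5 file

original = Handle(source_file)

# Analogous to copy(): a second Python object that points to the same file.
same_file = copy.copy(original)

# Analogous to copy_to(): the content is duplicated at a new location.
target_file = os.path.join(tmp_dir, "target.h5")
shutil.copyfile(original.file_name, target_file)
new_file = Handle(target_file)
```

After running the sketch, `same_file` and `original` share one file on disk, whereas `new_file` refers to an independent file with identical content.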
-
copy_to
(destination, file_name=None, maintain_name=True)[source]¶ Copy the content of the HDF5 file to a new location
- Parameters
destination (FileHDFio) – FileHDFio object pointing to the new location
file_name (str) – name of the new HDF5 file - optional
maintain_name (bool) – by default the names of the HDF5 groups are maintained
- Returns
FileHDFio object pointing to a file which now contains the same content as the file of the current FileHDFio object.
- Return type
-
create_group
(name)[source]¶ Create an HDF5 group - similar to a folder in the filesystem - the HDF5 groups allow the users to structure their data.
- Parameters
name (str) – name of the HDF5 group
- Returns
FileHDFio object pointing to the new group
- Return type
-
property
file_exists
¶ Check if the HDF5 file exists already
- Returns
[True/False]
- Return type
bool
-
property
file_name
¶ Get the file name of the HDF5 file
- Returns
absolute path to the HDF5 file
- Return type
str
-
property
file_path
¶ Path where the HDF5 file is located - posixpath.dirname()
- Returns
HDF5 file location
- Return type
str
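As a rough illustration of how file_path and base_name relate to file_name, the sketch below derives both from an absolute file name with `posixpath`. The helper name `split_hdf5_file_name` and the example path are assumptions made for this illustration and are not part of pyiron.

```python
import posixpath


def split_hdf5_file_name(file_name):
    """Return (file_path, base_name) for an absolute HDF5 file name."""
    # Directory that contains the file, as given by posixpath.dirname().
    file_path = posixpath.dirname(file_name)
    # Drop the directory, then drop the last extension (for example ".h5").
    base_name = posixpath.splitext(posixpath.basename(file_name))[0]
    return file_path, base_name


file_path, base_name = split_hdf5_file_name("/home/user/project/job.h5")
print(file_path)  # /home/user/project
print(base_name)  # job
```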
-
static
file_size
(hdf)[source]¶ Get size of the HDF5 file
- Parameters
hdf (FileHDFio) – hdf file
- Returns
file size in Bytes
- Return type
float
-
get
(key)[source]¶ Internal wrapper function for __getitem__() - self[name]
- Parameters
key (str, slice) – path to the data or key of the data object
- Returns
data or data object
- Return type
dict, list, float, int
-
get_from_table
(path, name)[source]¶ Get a specific value from a pandas.DataFrame
- Parameters
path (str) – relative path to the data object
name (str) – parameter key
- Returns
the value associated with the specific parameter key
- Return type
dict, list, float, int
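The lookup by parameter key can be pictured with plain Python data. The sketch below assumes the table is held as two parallel lists under the column names "Parameter" and "Value"; these column names, the helper `lookup_parameter`, and the sample entries are assumptions for illustration only and may differ from the layout pyiron actually stores.

```python
def lookup_parameter(table, name):
    """Return the value stored next to a parameter key.

    Assumes `table` holds two parallel lists under the hypothetical
    column names "Parameter" and "Value".
    """
    parameters = table["Parameter"]
    if name not in parameters:
        raise ValueError("Unknown parameter: {}".format(name))
    # The value sits at the same position as its key.
    return table["Value"][parameters.index(name)]


table = {"Parameter": ["temperature", "steps"], "Value": [300.0, 1000]}
temperature = lookup_parameter(table, "temperature")
```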
-
get_pandas
(name)[source]¶ Load a dictionary from the HDF5 file and return it as a pandas DataFrame
- Parameters
name (str) – HDF5 node name
- Returns
the dictionary as a pandas.DataFrame object
- Return type
pandas.DataFrame
-
get_size
(hdf)[source]¶ Get size of the groups inside the HDF5 file
- Parameters
hdf (FileHDFio) – hdf file
- Returns
file size in Bytes
- Return type
float
-
groups
()[source]¶ Filter HDF5 file by groups
- Returns
an HDF5 file which is filtered by groups
- Return type
-
property
h5_path
¶ Get the path in the HDF5 file starting from the root group - meaning this path starts with ‘/’
- Returns
HDF5 path
- Return type
str
-
hd_copy
(hdf_old, hdf_new, exclude_groups=None, exclude_nodes=None)[source]¶ - Parameters
hdf_old (ProjectHDFio) – old hdf
hdf_new (ProjectHDFio) – new hdf
exclude_groups (list/None) – list of groups to delete
exclude_nodes (list/None) – list of nodes to delete
-
property
is_empty
¶ Check if the HDF5 file is empty
- Returns
[True/False]
- Return type
bool
-
property
is_root
¶ Check if the current h5_path is pointing to the HDF5 root group.
- Returns
[True/False]
- Return type
bool
-
items
()[source]¶ List all keys and values as items of all groups and nodes of the HDF5 file
- Returns
list of (key, value) pairs
- Return type
list
-
keys
()[source]¶ List all groups and nodes of the HDF5 file - where groups are equivalent to directories and nodes to files.
- Returns
all groups and nodes
- Return type
list
-
list_all
()[source]¶ List all groups and nodes of the HDF5 file - where groups are equivalent to directories and nodes to files.
- Returns
{‘groups’: [list of groups], ‘nodes’: [list of nodes]}
- Return type
dict
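The group/node distinction can be modelled with a nested dictionary, where a nested dict plays the role of a group and any other value plays the role of a node. This is only a toy model with a hypothetical helper `classify_entries`; it does not read an HDF5 file and is not pyiron's implementation.

```python
def classify_entries(tree):
    """Split the entries of one level into groups and nodes.

    In this toy model a group is a nested dict and a node is any other value.
    """
    groups = [key for key, value in tree.items() if isinstance(value, dict)]
    nodes = [key for key, value in tree.items() if not isinstance(value, dict)]
    return {"groups": groups, "nodes": nodes}


content = {
    "input": {"temperature": 300.0},
    "output": {},
    "TYPE": "job",
    "VERSION": "0.1",
}
listing = classify_entries(content)
# Every entry is either a group or a node, so both lists together cover all keys.
all_keys = listing["groups"] + listing["nodes"]
```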
-
list_dirs
()[source]¶ equivalent to os.listdir() (groups are treated as directories)
- Returns
list of groups in pytables for the path self.h5_path
- Return type
(list)
-
list_groups
()[source]¶ equivalent to os.listdir() (groups are treated as directories)
- Returns
list of groups in pytables for the path self.h5_path
- Return type
(list)
-
list_nodes
()[source]¶ List all nodes of the HDF5 file
- Returns
list of nodes
- Return type
list
-
listdirs
()[source]¶ equivalent to os.listdir() (groups are treated as directories)
- Returns
list of groups in pytables for the path self.h5_path
- Return type
(list)
-
nodes
()[source]¶ Filter HDF5 file by nodes
- Returns
an HDF5 file which is filtered by nodes
- Return type
-
open
(h5_rel_path)[source]¶ Create an HDF5 group and enter this specific group. If the group already exists, only h5_path is updated accordingly; otherwise the group is created first.
- Parameters
h5_rel_path (str) – relative path from the current HDF5 path - h5_path - to the new group
- Returns
FileHDFio object pointing to the new group
- Return type
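How a relative path is resolved against the current h5_path can be sketched with a small model class. The class `GroupCursor` is hypothetical and touches no file. It assumes that opening a group returns a new object and that history records the relative paths that were opened; both are assumptions about the behaviour, not statements about pyiron's code.

```python
import posixpath


class GroupCursor:
    """Hypothetical model of a position inside an HDF5 file; not pyiron code."""

    def __init__(self, h5_path="/", history=None):
        self.h5_path = h5_path
        self.history = list(history or [])

    def open(self, h5_rel_path):
        # Resolve the relative path against the current absolute path and
        # return a new cursor; the current cursor is left unchanged.
        new_path = posixpath.join(self.h5_path, h5_rel_path)
        return GroupCursor(new_path, self.history + [h5_rel_path])

    @property
    def is_root(self):
        return self.h5_path == "/"


root = GroupCursor()
group = root.open("input").open("structure")
print(group.h5_path)  # /input/structure
```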
-
put
(key, value)[source]¶ Store data inside the HDF5 file
- Parameters
key (str) – key to store the data
value (pandas.DataFrame, pandas.Series, dict, list, float, int) – basically any kind of data is supported
-
remove_group
()[source]¶ Remove an HDF5 group - if it exists. If the group does not exist no error message is raised.
-
rewrite_hdf5
(job_name, info=False, exclude_groups=None, exclude_nodes=None)[source]¶ - Parameters
info (True/False) – whether to give the information on how much space has been saved
exclude_groups (list/None) – list of groups to delete from hdf
exclude_nodes (list/None) – list of nodes to delete from hdf
-
class
pyiron.base.generic.hdfio.
HDFStoreIO
(path, mode=None, complevel=None, complib=None, fletcher32=False, **kwargs)[source]¶ Bases:
pandas.io.pytables.HDFStore
dict-like IO interface for storing pandas objects in PyTables either Fixed or Table format. - copied from pandas.HDFStore
- Parameters
path (str) – File path to HDF5 file
mode (str) – one of {‘a’, ‘w’, ‘r’, ‘r+’}, default ‘a’
'r': Read-only; no data can be modified.
'w': Write; a new file is created (an existing file with the same name would be deleted).
'a': Append; an existing file is opened for reading and writing, and if the file does not exist it is created.
'r+': Similar to 'a', but the file must already exist.
complevel (int) – 1-9, default 0. If a complib is specified, compression will be applied where possible.
complib (str) – {‘zlib’, ‘bzip2’, ‘lzo’, ‘blosc’, None}, default None. If complevel is > 0, compression is applied to objects written in the store wherever possible.
fletcher32 (bool) – default False. If compression is applied, use the fletcher32 checksum.
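The four modes differ in whether the file may be written, whether it must already exist, and whether existing content is discarded. The sketch below encodes these properties in a lookup table and checks a path against them before any file is opened. The names `MODES` and `check_mode` are hypothetical helpers for this illustration and are not part of pandas or pyiron.

```python
import os
import tempfile

# Properties of each mode, following the descriptions in the parameter list.
MODES = {
    "r": {"writable": False, "must_exist": True, "truncates": False},
    "w": {"writable": True, "must_exist": False, "truncates": True},
    "a": {"writable": True, "must_exist": False, "truncates": False},
    "r+": {"writable": True, "must_exist": True, "truncates": False},
}


def check_mode(path, mode):
    """Raise early if `mode` cannot be used with `path`."""
    if mode not in MODES:
        raise ValueError("invalid mode: {!r}".format(mode))
    if MODES[mode]["must_exist"] and not os.path.exists(path):
        raise FileNotFoundError(path)
    return MODES[mode]


# A path that does not exist yet: acceptable for 'a' and 'w', not for 'r' or 'r+'.
missing = os.path.join(tempfile.mkdtemp(), "missing.h5")
append_properties = check_mode(missing, "a")
```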
-
open
(**kwargs)[source]¶ Open the file in the specified mode - copied from pandas.HDFStore.open()
- Parameters
**kwargs – mode: one of {‘a’, ‘w’, ‘r’, ‘r+’}, default ‘a’. See the HDFStore docstring or tables.open_file for information about modes.
- Returns
self - in contrast to the original implementation in pandas.
- Return type
-
class
pyiron.base.generic.hdfio.
ProjectHDFio
(project, file_name, h5_path=None, mode=None)[source]¶ Bases:
pyiron.base.generic.hdfio.FileHDFio
The ProjectHDFio class connects the FileHDFio and the Project class. It is derived from the FileHDFio class; in addition, a project object instance is available at self.project, enabling direct access to the database and other project-related functionality, some of which is mapped to the ProjectHDFio class as well.
- Parameters
project (Project) – pyiron Project the current HDF5 project is located in
file_name (str) – name of the HDF5 file - in contrast to the FileHDFio object where file_name represents the absolute path of the HDF5 file.
h5_path (str) – absolute path inside the HDF5 file, starting from the root group
mode (str) – one of {‘a’, ‘w’, ‘r’, ‘r+’}, default ‘a’. See the HDFStore docstring or tables.open_file for information about modes.
-
.. attribute:: project
Project instance the ProjectHDFio object is located in
-
.. attribute:: root_path
the pyiron user directory, defined in the .pyiron configuration
-
.. attribute:: project_path
the relative path of the current project / folder starting from the root path of the pyiron user directory
-
.. attribute:: path
the absolute path of the current project / folder plus the absolute path in the HDF5 file as one path
-
.. attribute:: file_name
absolute path to the HDF5 file
-
.. attribute:: h5_path
path inside the HDF5 file - also stored as absolute path
-
.. attribute:: history
previously opened groups / folders
-
.. attribute:: file_exists
boolean indicating whether the HDF5 file has already been written
-
.. attribute:: base_name
name of the HDF5 file but without any file extension
-
.. attribute:: file_path
directory where the HDF5 file is located
-
.. attribute:: is_root
boolean if the HDF5 object is located at the root level of the HDF5 file
-
.. attribute:: is_open
boolean if the HDF5 file is currently opened - if an active file handler exists
-
.. attribute:: is_empty
boolean if the HDF5 file is empty
-
.. attribute:: user
current unix/linux/windows user who is running pyiron
-
.. attribute:: sql_query
an SQL query to limit the jobs within the project to a subset which matches the SQL query.
-
.. attribute:: db
connection to the SQL database
-
.. attribute:: working_directory
working directory the job is executed in - outside the HDF5 file
-
property
base_name
¶ The absolute path of the current pyiron project - absolute path on the file system, not including the HDF5 path.
- Returns
current project path
- Return type
str
-
copy
()[source]¶ Copy the ProjectHDFio object - copying just the Python object but maintaining the same pyiron path
- Returns
copy of the ProjectHDFio object
- Return type
-
create_hdf
(path, job_name)[source]¶ Create a ProjectHDFio object to store project related information - for testing aggregated data
- Parameters
path (str) – absolute path
job_name (str) – name of the HDF5 container
- Returns
HDF5 object
- Return type
-
create_object
(class_name, **qwargs)[source]¶ Internal function to create a pyiron object
- Parameters
class_name (str) – name of a pyiron class
**qwargs – object parameters
- Returns
defined by the pyiron class in class_name with the input from **qwargs
- Return type
pyiron object
-
create_working_directory
()[source]¶ Create the working directory on the file system if it does not exist already.
-
property
db
¶ Get connection to the SQL database
- Returns
database connection
- Return type
-
get_job_id
(job_specifier)[source]¶ Get the job ID of the job identified by job_specifier in the local project path from the database
- Parameters
job_specifier (str, int) – name of the job or job ID
- Returns
job ID of the job
- Return type
int
-
inspect
(job_specifier)[source]¶ Inspect an existing pyiron object - most commonly a job - from the database
- Parameters
job_specifier (str, int) – name of the job or job ID
- Returns
Access to the HDF5 object - not a GenericJob object - use load() instead.
- Return type
-
load
(job_specifier, convert_to_object=True)[source]¶ Load an existing pyiron object - most commonly a job - from the database
- Parameters
job_specifier (str, int) – name of the job or job ID
convert_to_object (bool) – convert the object to a pyiron object or only access the HDF5 file, default True. Accessing only the HDF5 file is about an order of magnitude faster but provides only limited functionality. Compare the GenericJob object to the JobCore object.
- Returns
Either the full GenericJob object or just a reduced JobCore object
- Return type
-
load_from_jobpath
(job_id=None, db_entry=None, convert_to_object=True)[source]¶ Internal function to load an existing job either based on the job ID or based on the database entry dictionary.
- Parameters
job_id (int) – Job ID - optional, but either the job_id or the db_entry is required.
db_entry (dict) – database entry dictionary - optional, but either the job_id or the db_entry is required.
convert_to_object (bool) – convert the object to a pyiron object or only access the HDF5 file, default True. Accessing only the HDF5 file is about an order of magnitude faster but provides only limited functionality. Compare the GenericJob object to the JobCore object.
- Returns
Either the full GenericJob object or just a reduced JobCore object
- Return type
-
property
name
¶
-
property
path
¶ Absolute path of the HDF5 group starting from the system root - combination of the absolute system path plus the absolute path inside the HDF5 file starting from the root group.
- Returns
absolute path
- Return type
str
-
property
project
¶ Get the project instance the ProjectHDFio object is located in
- Returns
pyiron project
- Return type
-
property
project_path
¶ the relative path of the current project / folder starting from the root path of the pyiron user directory
- Returns
relative path of the current project / folder
- Return type
str
-
remove_job
(job_specifier, _unprotect=False)[source]¶ Remove a single job from the project based on its job_specifier - see also remove_jobs()
- Parameters
job_specifier (str, int) – name of the job or job ID
_unprotect (bool) – [True/False] delete the job without validating the dependencies to other jobs - default=False
-
property
root_path
¶ the pyiron user directory, defined in the .pyiron configuration
- Returns
pyiron user directory of the current project
- Return type
str
-
property
sql_query
¶ Get the SQL query for the project
- Returns
SQL query
- Return type
str
-
to_object
(object_type=None, **qwargs)[source]¶ Load the full pyiron object from an HDF5 file
- Parameters
object_type – if the ‘TYPE’ node is not available in the HDF5 file a manual object type can be set - optional
**qwargs – optional parameters [‘job_name’, ‘project’] - to specify the location of the HDF5 path
- Returns
pyiron object
- Return type
-
property
user
¶ Get current unix/linux/windows user who is running pyiron
- Returns
username
- Return type
str
-
property
working_directory
¶ Get the working directory of the current ProjectHDFio object. The working directory mirrors the path, but it is represented as a directory on the file system:
/absolute/path/to/the/file.h5/path/inside/the/hdf5/file
- becomes:
/absolute/path/to/the/file_hdf5/path/inside/the/hdf5/file
- Returns
absolute path to the working directory
- Return type
str
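The mapping from the combined path to the working directory can be reproduced with `posixpath`. The sketch below is one way to obtain the transformation shown in the example; the function `to_working_directory` is a hypothetical helper and not pyiron's implementation.

```python
import posixpath


def to_working_directory(file_name, h5_path):
    """Map an HDF5 file plus its internal path to a directory on the file system."""
    directory = posixpath.dirname(file_name)
    base_name = posixpath.splitext(posixpath.basename(file_name))[0]
    # "file.h5" becomes the directory "file_hdf5"; the internal path is appended.
    return posixpath.join(directory, base_name + "_hdf5", h5_path.lstrip("/"))


result = to_working_directory(
    "/absolute/path/to/the/file.h5", "/path/inside/the/hdf5/file"
)
print(result)  # /absolute/path/to/the/file_hdf5/path/inside/the/hdf5/file
```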