pyiron.base.project.generic module¶
-
class
pyiron.base.project.generic.
Project
(path='', user=None, sql_query=None)[source]¶ Bases:
pyiron.base.project.path.ProjectPath
The project is the central class in pyiron; all other objects can be created from the project object.
- Parameters
path (GenericPath, str) – path of the project defined by GenericPath, absolute or relative (with respect to current working directory) path
user (str) – current pyiron user
sql_query (str) – SQL query to only select a subset of the existing jobs within the current project
-
.. attribute:: root_path
the pyiron user directory, defined in the .pyiron configuration
-
.. attribute:: project_path
the relative path of the current project / folder starting from the root path of the pyiron user directory
-
.. attribute:: path
the absolute path of the current project / folder
-
.. attribute:: base_name
the name of the current project / folder
-
.. attribute:: history
previously opened projects / folders
-
.. attribute:: parent_group
parent project - one level above the current project
-
.. attribute:: user
current unix/linux/windows user who is running pyiron
-
.. attribute:: sql_query
an SQL query to limit the jobs within the project to a subset which matches the SQL query.
-
.. attribute:: db
connection to the SQL database
-
.. attribute:: job_type
- Job Type object with all the available job types: [‘ExampleJob’, ‘SerialMaster’, ‘ParallelMaster’,
‘ScriptJob’, ‘ListMaster’]
-
.. attribute:: view_mode
If viewer_mode is enabled, pyiron has read-only access to the database.
-
compress_jobs
(recursive=False)[source]¶ Compress all finished jobs in the current project and in all subprojects if recursive=True is selected.
- Parameters
recursive (bool) – [True/False] compress all jobs in all subprojects - default=False
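The effect of compressing a single finished job can be sketched with the standard library alone. The helper below is hypothetical (pyiron's actual compression is implemented on the job objects themselves); it only illustrates the idea of packing a job's working directory into an archive next to it and removing the originals:

```python
import os
import tarfile

def compress_job_directory(job_dir):
    """Pack a finished job's working directory into a .tar.bz2 archive
    placed next to it, then remove the original files and directory.
    Hypothetical per-job helper, not pyiron's implementation.
    """
    archive = job_dir.rstrip(os.sep) + ".tar.bz2"
    with tarfile.open(archive, "w:bz2") as tar:
        tar.add(job_dir, arcname=os.path.basename(job_dir))
    for name in os.listdir(job_dir):
        os.remove(os.path.join(job_dir, name))
    os.rmdir(job_dir)
    return archive
```

With `recursive=True` the same operation would be applied to every finished job found in the subprojects as well.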
-
copy
()[source]¶ Copy the project object - copying just the Python object but maintaining the same pyiron path
- Returns
copy of the project object
- Return type
-
copy_to
(destination)[source]¶ Copy the project object to a different pyiron path - including the content of the project (all jobs).
-
create_from_job
(job_old, new_job_name)[source]¶ Create a new job from an existing pyiron job
- Parameters
job_old (GenericJob) – Job to copy
new_job_name (str) – New job name
- Returns
New job with the new job name.
- Return type
-
create_group
(group)[source]¶ Create a new subproject/ group/ folder
- Parameters
group (str) – name of the new project
- Returns
New subproject
- Return type
-
static
create_hdf
(path, job_name)[source]¶ Create a ProjectHDFio object to store project related information - for example aggregated data
- Parameters
path (str) – absolute path
job_name (str) – name of the HDF5 container
- Returns
HDF5 object
- Return type
-
create_job
(job_type, job_name)[source]¶ Create one of the following jobs: - ‘ExampleJob’: example job just generating random numbers - ‘SerialMaster’: series of jobs run in serial - ‘ParallelMaster’: series of jobs run in parallel - ‘ScriptJob’: Python script or Jupyter notebook job container - ‘ListMaster’: list of jobs
- Parameters
job_type (str) – job type can be [‘ExampleJob’, ‘SerialMaster’, ‘ParallelMaster’, ‘ScriptJob’, ‘ListMaster’]
job_name (str) – name of the job
- Returns
job object depending on the job_type selected
- Return type
-
delete_output_files_jobs
(recursive=False)[source]¶ Delete the output files of all finished jobs in the current project and in all subprojects if recursive=True is selected.
- Parameters
recursive (bool) – [True/False] delete the output files of all jobs in all subprojects - default=False
-
get_child_ids
(job_specifier, project=None)[source]¶ Get the child jobs for a specific job
- Parameters
job_specifier (str, int) – name of the job or job ID
project (Project) – Project the job is located in - optional
- Returns
list of child IDs
- Return type
list
-
get_db_columns
()[source]¶ Get column names
- Returns
- list of column names like:
[‘id’, ‘parentid’, ‘masterid’, ‘projectpath’, ‘project’, ‘job’, ‘subjob’, ‘chemicalformula’, ‘status’, ‘hamilton’, ‘hamversion’, ‘username’, ‘computer’, ‘timestart’, ‘timestop’, ‘totalcputime’]
- Return type
list
-
static
get_external_input
()[source]¶ Get external input either from the HDF5 file of the ScriptJob object which executes the Jupyter notebook or from an input.json file located in the same directory as the Jupyter notebook.
- Returns
Dictionary with external input
- Return type
dict
-
get_job_id
(job_specifier)[source]¶ Get the job ID for the job matching job_specifier in the local project path from the database
- Parameters
job_specifier (str, int) – name of the job or job ID
- Returns
job ID of the job
- Return type
int
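Most lookup methods in this class accept a job_specifier that is either a job name (str) or a job ID (int). A minimal standalone sketch of that convention, with a hypothetical name-to-ID table standing in for the project database:

```python
def resolve_job_id(job_specifier, job_index):
    """Resolve a job name (str) or job ID (int) to a job ID.

    job_index maps job names to IDs - a hypothetical stand-in for the
    database table queried by get_job_id(); returns None if no match.
    """
    if isinstance(job_specifier, int):
        return job_specifier if job_specifier in job_index.values() else None
    return job_index.get(job_specifier)

# Invented names and IDs for illustration:
jobs = {"murnaghan": 7, "minimize": 12}
job_id = resolve_job_id("minimize", jobs)  # 12
```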
-
get_job_ids
(recursive=True)[source]¶ Return the job IDs matching a specific query
- Parameters
recursive (bool) – search subprojects [True/False]
- Returns
a list of job IDs
- Return type
list
-
get_job_status
(job_specifier, project=None)[source]¶ Get the status of a particular job
- Parameters
job_specifier (str, int) – name of the job or job ID
project (Project) – Project the job is located in - optional
- Returns
- job status can be one of the following [‘initialized’, ‘appended’, ‘created’, ‘submitted’, ‘running’,
’aborted’, ‘collect’, ‘suspended’, ‘refresh’, ‘busy’, ‘finished’]
- Return type
str
-
get_job_working_directory
(job_specifier, project=None)[source]¶ Get the working directory of a particular job
- Parameters
job_specifier (str, int) – name of the job or job ID
project (Project) – Project the job is located in - optional
- Returns
working directory as absolute path
- Return type
str
-
get_jobs
(recursive=True, columns=None)[source]¶ Internal function to return the jobs as a dictionary rather than a pandas.DataFrame
- Parameters
recursive (bool) – search subprojects [True/False]
columns (list) – by default only the columns [‘id’, ‘project’] are selected, but the user can select a subset of [‘id’, ‘status’, ‘chemicalformula’, ‘job’, ‘subjob’, ‘project’, ‘projectpath’, ‘timestart’, ‘timestop’, ‘totalcputime’, ‘computer’, ‘hamilton’, ‘hamversion’, ‘parentid’, ‘masterid’]
- Returns
columns are used as keys and point to a list of the corresponding values
- Return type
dict
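The returned dictionary is column-oriented: each selected column name maps to a list of per-job values. A sketch of that shape, with invented IDs and project paths:

```python
# Hypothetical return value of get_jobs(columns=["id", "project"]) for a
# project containing three jobs:
jobs_dict = {
    "id": [1, 2, 3],
    "project": ["demo/a/", "demo/a/", "demo/b/"],
}

# Per-job rows can be recovered by zipping the column lists together:
rows = list(zip(jobs_dict["id"], jobs_dict["project"]))
```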
-
get_jobs_status
(recursive=True, element_lst=None)[source]¶ Gives an overview of the status of all jobs.
- Parameters
recursive (bool) – search subprojects [True/False] - default=True
element_lst (list) – list of elements required in the chemical formula - by default None
- Returns
an overview of the job status.
- Return type
pandas.Series
-
get_project_size
()[source]¶ Get the size of the project in megabytes.
- Returns
project size
- Return type
float
-
static
get_repository_status
()[source]¶ Finds the hashes for every pyiron module available.
- Returns
The name of each module and the hash for its current git head.
- Return type
pandas.DataFrame
-
groups
()[source]¶ Filter project by groups
- Returns
a project which is filtered by groups
- Return type
-
inspect
(job_specifier)[source]¶ Inspect an existing pyiron object - most commonly a job - from the database
- Parameters
job_specifier (str, int) – name of the job or job ID
- Returns
Access to the HDF5 object - not a full GenericJob object; use load() to get the full object.
- Return type
-
items
()[source]¶ All items in the current project - this includes jobs, sub projects/ groups/ folders and any kind of files
- Returns
items in the project
- Return type
list
-
iter_groups
()[source]¶ Iterate over the groups within the current project
- Returns
Yield of sub projects/ groups/ folders
- Return type
yield
-
iter_jobs
(path=None, recursive=True, convert_to_object=True, status=None)[source]¶ Iterate over the jobs within the current project and its subprojects
- Parameters
path (str) – HDF5 path inside each job object
recursive (bool) – search subprojects [True/False] - True by default
convert_to_object (bool) – load the full GenericJob object (default) or just the HDF5 / JobCore object
status (str/None) – status of the jobs to filter for - [‘finished’, ‘aborted’, ‘submitted’, …]
- Returns
Yield of GenericJob or JobCore
- Return type
yield
-
iter_output
(recursive=True)[source]¶ Iterate over the output of jobs within the current project and its subprojects
- Parameters
recursive (bool) – search subprojects [True/False] - True by default
- Returns
Yield of GenericJob or JobCore
- Return type
yield
-
job_table
(recursive=True, columns=None, all_columns=True, sort_by='id', full_table=False, element_lst=None, job_name_contains='')[source]¶ Access the job_table
- Parameters
recursive (bool) – search subprojects [True/False] - default=True
columns (list) – by default only the columns [‘job’, ‘project’, ‘chemicalformula’] are selected, but the user can select a subset of [‘id’, ‘status’, ‘chemicalformula’, ‘job’, ‘subjob’, ‘project’, ‘projectpath’, ‘timestart’, ‘timestop’, ‘totalcputime’, ‘computer’, ‘hamilton’, ‘hamversion’, ‘parentid’, ‘masterid’]
all_columns (bool) – Select all columns - this overwrites the columns option.
sort_by (str) – Sort by a specific column
full_table (bool) – Whether to show the entire pandas table
element_lst (list) – list of elements required in the chemical formula - by default None
job_name_contains (str) – a string which should be contained in every job_name
- Returns
Return the result as a pandas.DataFrame object
- Return type
pandas.DataFrame
-
keys
()[source]¶ List of file, folder and object names
- Returns
list of the names of project directories and project nodes
- Return type
list
-
list_all
()[source]¶ Combination of list_groups(), list_nodes() and list_files() all in one dictionary with the corresponding keys: - ‘groups’: Subprojects/ -folder/ -groups. - ‘nodes’: Jobs or pyiron objects - ‘files’: Files inside a project which do not belong to any pyiron object
- Returns
dictionary with all items in the project
- Return type
dict
-
list_dirs
(skip_hdf5=True)[source]¶ List directories inside the project
- Parameters
skip_hdf5 (bool) – Skip directories which belong to a pyiron object/ pyiron job - default=True
- Returns
list of directory names
- Return type
list
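The skip_hdf5 filter can be pictured as dropping every directory that has a sibling ‘&lt;name&gt;.h5’ file, since such a directory belongs to a pyiron job. The helper below is a standalone sketch of that rule; the name-pairing convention is an assumption of this sketch, not the actual implementation:

```python
def list_dirs(entries, skip_hdf5=True):
    """Filter directory names out of a mixed listing.

    entries: list of (name, is_dir) tuples - a hypothetical stand-in
    for a listing of the project folder. With skip_hdf5=True, any
    directory with a sibling '<name>.h5' file is treated as belonging
    to a pyiron job and skipped (assumption of this sketch).
    """
    files = {name for name, is_dir in entries if not is_dir}
    dirs = [name for name, is_dir in entries if is_dir]
    if not skip_hdf5:
        return dirs
    return [d for d in dirs if d + ".h5" not in files]

listing = [("subproject", True), ("job_a", True), ("job_a.h5", False)]
dirs_only = list_dirs(listing)  # ["subproject"]
```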
-
list_files
(extension=None)[source]¶ List files inside the project
- Parameters
extension (str) – filter by a specific extension
- Returns
list of file names
- Return type
list
-
list_groups
()[source]¶ List directories inside the project
- Returns
list of directory names
- Return type
list
-
list_nodes
(recursive=False)[source]¶ List nodes/ jobs/ pyiron objects inside the project
- Parameters
recursive (bool) – search subprojects [True/False] - default=False
- Returns
list of nodes/ jobs/ pyiron objects inside the project
- Return type
list
-
load
(job_specifier, convert_to_object=True)[source]¶ Load an existing pyiron object - most commonly a job - from the database
- Parameters
job_specifier (str, int) – name of the job or job ID
convert_to_object (bool) – convert the object to a pyiron object or only access the HDF5 file - default=True. Accessing only the HDF5 file is about an order of magnitude faster, but only provides limited functionality. Compare the GenericJob object to the JobCore object.
- Returns
Either the full GenericJob object or just a reduced JobCore object
- Return type
-
load_from_jobpath
(job_id=None, db_entry=None, convert_to_object=True)[source]¶ Internal function to load an existing job either based on the job ID or based on the database entry dictionary.
- Parameters
job_id (int/ None) – Job ID - optional, but either the job_id or the db_entry is required.
db_entry (dict) – database entry dictionary - optional, but either the job_id or the db_entry is required.
convert_to_object (bool) – convert the object to a pyiron object or only access the HDF5 file - default=True. Accessing only the HDF5 file is about an order of magnitude faster, but only provides limited functionality. Compare the GenericJob object to the JobCore object.
- Returns
Either the full GenericJob object or just a reduced JobCore object
- Return type
-
static
load_from_jobpath_string
(job_path, convert_to_object=True)[source]¶ Internal function to load an existing job from an HDF5 path string.
- Parameters
job_path (str) – string to reload the job from an HDF5 file - ‘/root_path/project_path/filename.h5/h5_path’
convert_to_object (bool) – convert the object to a pyiron object or only access the HDF5 file - default=True. Accessing only the HDF5 file is about an order of magnitude faster, but only provides limited functionality. Compare the GenericJob object to the JobCore object.
- Returns
Either the full GenericJob object or just a reduced JobCore object
- Return type
-
move_to
(destination)[source]¶ Similar to the copy_to() function, move the project object to a different pyiron path - including the content of the project (all jobs).
-
property
name
¶ The name of the current project folder
- Returns
name of the current project folder
- Return type
str
-
property
parent_group
¶ Get the parent group of the current project
- Returns
parent project
- Return type
-
static
queue_check_job_is_waiting_or_running
(item)[source]¶ Check if a job is still listed in the queue system as either waiting or running.
- Parameters
item (int, GenericJob) – Provide either the job_ID or the full hamiltonian
- Returns
[True/False]
- Return type
bool
-
queue_delete_job
(item)[source]¶ Delete a job from the queuing system
- Parameters
item (int, GenericJob) – Provide either the job_ID or the full hamiltonian
- Returns
Output from the queuing system as string - optimized for the Sun grid engine
- Return type
str
-
static
queue_enable_reservation
(item)[source]¶ Enable a reservation for a particular job within the queuing system
- Parameters
item (int, GenericJob) – Provide either the job_ID or the full hamiltonian
- Returns
Output from the queuing system as string - optimized for the Sun grid engine
- Return type
str
-
static
queue_is_empty
()[source]¶ Check if the queue table is currently empty - no more jobs to wait for.
- Returns
True if the table is empty, else False - optimized for the Sun grid engine
- Return type
bool
-
queue_table
(project_only=True, recursive=True, full_table=False)[source]¶ Display the queuing system table as a pandas.DataFrame
- Parameters
project_only (bool) – Query only for jobs within the current project - True by default
recursive (bool) – Include jobs from sub projects
full_table (bool) – Whether to show the entire pandas table
- Returns
Output from the queuing system - optimized for the Sun grid engine
- Return type
pandas.DataFrame
-
queue_table_global
(full_table=False)[source]¶ Display the queuing system table as a pandas.DataFrame
- Parameters
full_table (bool) – Whether to show the entire pandas table
- Returns
Output from the queuing system - optimized for the Sun grid engine
- Return type
pandas.DataFrame
-
refresh_job_status_based_on_job_id
(job_id, que_mode=True)[source]¶ Internal function to check if a job is still listed ‘running’ in the job_table while it is no longer listed in the queuing system. In this case update the entry in the job_table to ‘aborted’.
- Parameters
job_id (int) – job ID
que_mode (bool) – [True/False] - default=True
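The check described above can be sketched as a small function over the two states involved - the job table and the set of queued job IDs (both hypothetical stand-ins for the database and the queuing system):

```python
def refresh_status(job_id, job_status, queue_ids):
    """If a job is still marked 'running' in the job table but is no
    longer listed in the queuing system, flag it 'aborted'.

    job_status: dict mapping job IDs to status strings (stand-in for
    the job_table); queue_ids: set of IDs known to the queue.
    """
    if job_status.get(job_id) == "running" and job_id not in queue_ids:
        job_status[job_id] = "aborted"
    return job_status[job_id]
```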
-
refresh_job_status_based_on_queue_status
(job_specifier, status='running')[source]¶ Check if the job is still listed as running, while it is no longer listed in the queue.
- Parameters
job_specifier (str, int) – name of the job or job ID
status (str) – Currently only the jobstatus of ‘running’ jobs can be refreshed - default=’running’
-
remove
(enable=False, enforce=False)[source]¶ Delete the whole project including all jobs in the project and its subprojects
- Parameters
enforce (bool) – [True/False] delete jobs even though they are used in other projects - default=False
enable (bool) – [True/False] enable this command.
-
remove_file
(file_name)[source]¶ Remove a file (same as unlink()).
- Parameters
file_name (str) – name of the file
-
remove_job
(job_specifier, _unprotect=False)[source]¶ Remove a single job from the project based on its job_specifier - see also remove_jobs()
- Parameters
job_specifier (str, int) – name of the job or job ID
_unprotect (bool) – [True/False] delete the job without validating the dependencies to other jobs - default=False
-
remove_jobs
(recursive=False)[source]¶ Remove all jobs in the current project and in all subprojects if recursive=True is selected - see also remove_job()
- Parameters
recursive (bool) – [True/False] delete all jobs in all subprojects - default=False
-
set_job_status
(job_specifier, status, project=None)[source]¶ Set the status of a particular job
- Parameters
job_specifier (str) – name of the job or job ID
status (str) – job status can be one of the following [‘initialized’, ‘appended’, ‘created’, ‘submitted’, ‘running’, ‘aborted’, ‘collect’, ‘suspended’, ‘refresh’, ‘busy’, ‘finished’]
project (str) – project path
-
static
set_logging_level
(level, channel=None)[source]¶ Set level for logger
- Parameters
level (str) – one of ‘DEBUG’, ‘INFO’, ‘WARN’
channel (int) – 0: file_log, 1: stream, None: both
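A standalone sketch of this behaviour using Python's standard logging module. The logger name and the handler ordering (file handler first, stream handler second) are assumptions of this sketch, not pyiron's actual configuration:

```python
import logging

def set_logging_level(level, channel=None):
    """Set the level on the file handler (channel=0), the stream
    handler (channel=1) or on both (channel=None)."""
    logger = logging.getLogger("pyiron_log")  # hypothetical logger name
    if not logger.handlers:
        logger.addHandler(logging.NullHandler())    # stand-in for file_log
        logger.addHandler(logging.StreamHandler())  # stream
    if channel is None:
        for handler in logger.handlers:
            handler.setLevel(level)
    else:
        logger.handlers[channel].setLevel(level)

set_logging_level("WARN")  # both handlers now at WARNING
```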
-
switch_to_central_database
()[source]¶ Switch from local mode to central mode - if local_mode is enabled pyiron uses a local database.
-
switch_to_local_database
(file_name='pyiron.db', cwd=None)[source]¶ Switch from central mode to local mode - if local_mode is enabled pyiron uses a local database.
- Parameters
file_name (str) – file name or file path for the local database
cwd (str) – directory where the local database is located
-
switch_to_user_mode
()[source]¶ Switch from viewer mode to user mode - if viewer_mode is enabled pyiron has read-only access to the database.
-
switch_to_viewer_mode
()[source]¶ Switch from user mode to viewer mode - if viewer_mode is enabled pyiron has read-only access to the database.
-
values
()[source]¶ All items in the current project - this includes jobs, sub projects/ groups/ folders and any kind of files
- Returns
items in the project
- Return type
list
-
property
view_mode
¶ Get viewer_mode - if viewer_mode is enabled pyiron has read-only access to the database.
- Returns
True when viewer_mode is enabled
- Return type
bool
-
static
wait_for_job
(job, interval_in_s=5, max_iterations=100)[source]¶ Sleep until the job is finished, but at most interval_in_s * max_iterations seconds.
- Parameters
job (GenericJob) – Job to wait for
interval_in_s (int) – interval when the job status is queried from the database - default 5 sec.
max_iterations (int) – maximum number of iterations - default 100
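The polling behaviour can be sketched in a few lines. The job argument here only needs a .status attribute (an assumption of this sketch - the real method queries the job status from the database):

```python
import time

def wait_for_job(job, interval_in_s=5, max_iterations=100):
    """Re-check the job status every interval_in_s seconds, giving up
    after max_iterations checks. Returns True if the job finished,
    False if the iteration limit was reached first.
    """
    for _ in range(max_iterations):
        if job.status == "finished":
            return True
        time.sleep(interval_in_s)
    return False
```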