pyiron.base.project.generic module

class pyiron.base.project.generic.Project(path='', user=None, sql_query=None)[source]

Bases: pyiron.base.project.path.ProjectPath

The project is the central class in pyiron; all other objects can be created from the project object.

Parameters:
  • path (GenericPath, str) – path of the project defined by GenericPath, absolute or relative (with respect to current working directory) path
  • user (str) – current pyiron user
  • sql_query (str) – SQL query to only select a subset of the existing jobs within the current project
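
A Project is usually the first object created in a pyiron session; the snippet below is a minimal sketch, assuming a configured pyiron installation (the folder name ‘demo’ is an arbitrary example):

```python
from pyiron import Project

# Opens (or creates) the folder 'demo' relative to the current working directory.
pr = Project(path='demo')
print(pr.path)       # absolute path of the project folder
print(pr.base_name)  # name of the project folder, here 'demo'
```
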
.. attribute:: root_path

the pyiron user directory, defined in the .pyiron configuration

.. attribute:: project_path

the relative path of the current project / folder starting from the root path of the pyiron user directory

.. attribute:: path

the absolute path of the current project / folder

.. attribute:: base_name

the name of the current project / folder

.. attribute:: history

previously opened projects / folders

.. attribute:: parent_group

parent project - one level above the current project

.. attribute:: user

current unix/linux/windows user who is running pyiron

.. attribute:: sql_query

an SQL query used to restrict the jobs within the project to a matching subset

.. attribute:: db

connection to the SQL database

.. attribute:: job_type

Job Type object with all the available job types: [‘ExampleJob’, ‘SerialMaster’, ‘ParallelMaster’, ‘ScriptJob’, ‘ListMaster’]
.. attribute:: view_mode

If viewer_mode is enabled, pyiron has read-only access to the database.

compress_jobs(recursive=False)[source]

Compress all finished jobs in the current project and in all subprojects if recursive=True is selected.

Parameters:recursive (bool) – [True/False] compress all jobs in all subprojects - default=False
copy()[source]

Copy the project object - copying just the Python object but maintaining the same pyiron path

Returns:copy of the project object
Return type:Project
copy_to(destination)[source]

Copy the project object to a different pyiron path - including the content of the project (all jobs).

Parameters:destination (Project) – project path to copy the project content to
Returns:pointing to the new project path
Return type:Project
create_from_job(job_old, new_job_name)[source]

Create a new job from an existing pyiron job

Parameters:
  • job_old (GenericJob) – Job to copy
  • new_job_name (str) – New job name
Returns:

New job with the new job name.

Return type:

GenericJob

create_group(group)[source]

Create a new subproject/ group/ folder

Parameters:group (str) – name of the new project
Returns:New subproject
Return type:Project
static create_hdf(path, job_name)[source]

Create a ProjectHDFio object to store project-related information - for example aggregated data

Parameters:
  • path (str) – absolute path
  • job_name (str) – name of the HDF5 container
Returns:

HDF5 object

Return type:

ProjectHDFio

create_job(job_type, job_name)[source]

Create one of the following jobs:
  • ‘ExampleJob’: example job that just generates random numbers
  • ‘SerialMaster’: series of jobs run in serial
  • ‘ParallelMaster’: series of jobs run in parallel
  • ‘ScriptJob’: Python script or Jupyter notebook job container
  • ‘ListMaster’: list of jobs

Parameters:
  • job_type (str) – job type can be [‘ExampleJob’, ‘SerialMaster’, ‘ParallelMaster’, ‘ScriptJob’, ‘ListMaster’]
  • job_name (str) – name of the job
Returns:

job object depending on the job_type selected

Return type:

GenericJob
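
The job types above combine into a short workflow; a minimal sketch, assuming a configured pyiron installation (project and job names are arbitrary examples):

```python
from pyiron import Project

pr = Project(path='demo')
job = pr.create_job(job_type='ExampleJob', job_name='toy')
job.run()          # executes the job and stores its output in HDF5
print(job.status)  # e.g. 'finished'
```
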

delete_output_files_jobs(recursive=False)[source]

Delete the output files of all finished jobs in the current project and in all subprojects if recursive=True is selected.

Parameters:recursive (bool) – [True/False] delete the output files of all jobs in all subprojects - default=False
get_child_ids(job_specifier, project=None)[source]

Get the child jobs for a specific job

Parameters:
  • job_specifier (str, int) – name of the job or job ID
  • project (Project) – Project the job is located in - optional
Returns:

list of child IDs

Return type:

list

get_db_columns()[source]

Get column names

Returns:
list of column names like:
[‘id’, ‘parentid’, ‘masterid’, ‘projectpath’, ‘project’, ‘job’, ‘subjob’, ‘chemicalformula’, ‘status’, ‘hamilton’, ‘hamversion’, ‘username’, ‘computer’, ‘timestart’, ‘timestop’, ‘totalcputime’]
Return type:list
get_job_id(job_specifier)[source]

Get the job ID for the job matching job_specifier in the local project path from the database

Parameters:job_specifier (str, int) – name of the job or job ID
Returns:job ID of the job
Return type:int
get_job_ids(recursive=True)[source]

Return the job IDs matching a specific query

Parameters:recursive (bool) – search subprojects [True/False]
Returns:a list of job IDs
Return type:list
get_job_status(job_specifier, project=None)[source]

Get the status of a particular job

Parameters:
  • job_specifier (str, int) – name of the job or job ID
  • project (Project) – Project the job is located in - optional
Returns:

job status can be one of the following: [‘initialized’, ‘appended’, ‘created’, ‘submitted’, ‘running’, ‘aborted’, ‘collect’, ‘suspended’, ‘refresh’, ‘busy’, ‘finished’]

Return type:

str

get_job_working_directory(job_specifier, project=None)[source]

Get the working directory of a particular job

Parameters:
  • job_specifier (str, int) – name of the job or job ID
  • project (Project) – Project the job is located in - optional
Returns:

working directory as absolute path

Return type:

str

get_jobs(recursive=True, columns=None)[source]

Internal function to return the jobs as a dictionary rather than a pandas.DataFrame

Parameters:
  • recursive (bool) – search subprojects [True/False]
  • columns (list) – by default only the columns [‘id’, ‘project’] are selected, but the user can select a subset of [‘id’, ‘status’, ‘chemicalformula’, ‘job’, ‘subjob’, ‘project’, ‘projectpath’, ‘timestart’, ‘timestop’, ‘totalcputime’, ‘computer’, ‘hamilton’, ‘hamversion’, ‘parentid’, ‘masterid’]
Returns:

columns are used as keys and point to a list of the corresponding values

Return type:

dict

get_jobs_status(recursive=True, element_lst=None)[source]

Gives an overview of the status of all jobs.

Parameters:
  • recursive (bool) – search subprojects [True/False] - default=True
  • element_lst (list) – list of elements required in the chemical formula - by default None
Returns:

overview of the status of all jobs as a pandas.Series

Return type:

pandas.Series

get_project_size()[source]

Get the size of the project in MegaByte.

Returns:project size
Return type:float
static get_repository_status()[source]

Finds the hashes for every pyiron module available.

Returns:The name of each module and the hash for its current git head.
Return type:pandas.DataFrame
groups()[source]

Filter project by groups

Returns:a project which is filtered by groups
Return type:Project
inspect(job_specifier)[source]

Inspect an existing pyiron object - most commonly a job - from the database

Parameters:job_specifier (str, int) – name of the job or job ID
Returns:Access to the HDF5 object as a JobCore object - not a full GenericJob object; use load() to get the full job object.
Return type:JobCore
items()[source]

All items in the current project - this includes jobs, sub projects/ groups/ folders and any kind of files

Returns:items in the project
Return type:list
iter_groups()[source]

Iterate over the groups within the current project

Returns:Yield of sub projects/ groups/ folders
Return type:yield
iter_jobs(path=None, recursive=True, convert_to_object=True, status=None)[source]

Iterate over the jobs within the current project and its subprojects

Parameters:
  • path (str) – HDF5 path inside each job object
  • recursive (bool) – search subprojects [True/False] - True by default
  • convert_to_object (bool) – load the full GenericJob object (default) or just the HDF5 / JobCore object
  • status (str/None) – status of the jobs to filter for - [‘finished’, ‘aborted’, ‘submitted’, …]
Returns:

Yield of GenericJob or JobCore

Return type:

yield
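
iter_jobs() is the natural way to aggregate one quantity over many jobs; a sketch assuming a configured pyiron installation and jobs that store an ‘output/generic/energy’ dataset (that HDF5 path is an illustrative assumption):

```python
from pyiron import Project

pr = Project(path='demo')
# convert_to_object=False yields lightweight JobCore handles, which is
# considerably faster when only stored output is needed.
energies = [
    job['output/generic/energy']
    for job in pr.iter_jobs(convert_to_object=False, status='finished')
]
```
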

iter_output(recursive=True)[source]

Iterate over the output of jobs within the current project and its subprojects

Parameters:recursive (bool) – search subprojects [True/False] - True by default
Returns:Yield of GenericJob or JobCore
Return type:yield
job_table(recursive=True, columns=None, all_columns=True, sort_by='id', full_table=False, element_lst=None, job_name_contains='')[source]

Access the job_table

Parameters:
  • recursive (bool) – search subprojects [True/False] - default=True
  • columns (list) – by default only the columns [‘job’, ‘project’, ‘chemicalformula’] are selected, but the user can select a subset of [‘id’, ‘status’, ‘chemicalformula’, ‘job’, ‘subjob’, ‘project’, ‘projectpath’, ‘timestart’, ‘timestop’, ‘totalcputime’, ‘computer’, ‘hamilton’, ‘hamversion’, ‘parentid’, ‘masterid’]
  • all_columns (bool) – Select all columns - this overwrites the columns option.
  • sort_by (str) – Sort by a specific column
  • full_table (bool) – Whether to show the entire pandas table
  • element_lst (list) – list of elements required in the chemical formula - by default None
  • job_name_contains (str) – a string which should be contained in every job_name
Returns:

The result as a pandas.DataFrame object

Return type:

pandas.DataFrame
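
Since job_table() returns a plain pandas.DataFrame, the result can be filtered with ordinary pandas operations. The stand-in frame below only mimics a few of the documented columns, so the filtering pattern can be shown without a database connection (all rows are invented for illustration):

```python
import pandas as pd

# Stand-in for the DataFrame returned by job_table() - same column names,
# invented rows purely for illustration.
df = pd.DataFrame({
    'id': [1, 2, 3],
    'job': ['relax_1', 'relax_2', 'md_1'],
    'status': ['finished', 'aborted', 'finished'],
    'chemicalformula': ['Fe2', 'Fe2', 'Al4'],
})

# Select all finished jobs - the same expression works on a real job table.
finished = df[df['status'] == 'finished']
print(finished['job'].tolist())  # ['relax_1', 'md_1']
```
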

keys()[source]

List of file, folder and object names

Returns:list of the names of project directories and project nodes
Return type:list
list_all()[source]

Combination of list_groups(), list_nodes() and list_files() all in one dictionary with the corresponding keys:
  • ‘groups’: subprojects/ folders/ groups
  • ‘nodes’: jobs or pyiron objects
  • ‘files’: files inside a project which do not belong to any pyiron object

Returns:dictionary with all items in the project
Return type:dict
list_dirs(skip_hdf5=True)[source]

List directories inside the project

Parameters:skip_hdf5 (bool) – Skip directories which belong to a pyiron object/ pyiron job - default=True
Returns:list of directory names
Return type:list
list_files(extension=None)[source]

List files inside the project

Parameters:extension (str) – filter by a specific extension
Returns:list of file names
Return type:list
list_groups()[source]

List directories/ groups inside the project

Returns:list of directory names
Return type:list
list_nodes(recursive=False)[source]

List nodes/ jobs/ pyiron objects inside the project

Parameters:recursive (bool) – search subprojects [True/False] - default=False
Returns:list of nodes/ jobs/ pyiron objects inside the project
Return type:list
load(job_specifier, convert_to_object=True)[source]

Load an existing pyiron object - most commonly a job - from the database

Parameters:
  • job_specifier (str, int) – name of the job or job ID
  • convert_to_object (bool) – convert the object to a pyiron object or only access the HDF5 file - default=True. Accessing only the HDF5 file is about an order of magnitude faster, but only provides limited functionality. Compare the GenericJob object to the JobCore object.
Returns:

Either the full GenericJob object or just a reduced JobCore object

Return type:

GenericJob, JobCore
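
load() and inspect() trade functionality for speed; a sketch assuming a configured pyiron installation with an existing job named ‘toy’:

```python
from pyiron import Project

pr = Project(path='demo')
job_full = pr.load('toy')     # full GenericJob - complete functionality
job_fast = pr.inspect('toy')  # JobCore - about an order of magnitude faster
# JobCore still allows dictionary-style access to the stored HDF5 content:
print(job_fast['output'])
```
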

load_from_jobpath(job_id=None, db_entry=None, convert_to_object=True)[source]

Internal function to load an existing job either based on the job ID or based on the database entry dictionary.

Parameters:
  • job_id (int/ None) – Job ID - optional, but either the job_id or the db_entry is required.
  • db_entry (dict) – database entry dictionary - optional, but either the job_id or the db_entry is required.
  • convert_to_object (bool) – convert the object to a pyiron object or only access the HDF5 file - default=True. Accessing only the HDF5 file is about an order of magnitude faster, but only provides limited functionality. Compare the GenericJob object to the JobCore object.
Returns:

Either the full GenericJob object or just a reduced JobCore object

Return type:

GenericJob, JobCore

static load_from_jobpath_string(job_path, convert_to_object=True)[source]

Internal function to load an existing job from a job path string pointing into the HDF5 file.

Parameters:
  • job_path (str) – string to reload the job from an HDF5 file - ‘/root_path/project_path/filename.h5/h5_path’
  • convert_to_object (bool) – convert the object to a pyiron object or only access the HDF5 file - default=True. Accessing only the HDF5 file is about an order of magnitude faster, but only provides limited functionality. Compare the GenericJob object to the JobCore object.
Returns:

Either the full GenericJob object or just a reduced JobCore object

Return type:

GenericJob, JobCore

move_to(destination)[source]

Similar to copy_to(), move the project object to a different pyiron path - including the content of the project (all jobs).

Parameters:destination (Project) – project path to move the project content to
Returns:pointing to the new project path
Return type:Project
name

The name of the current project folder

Returns:name of the current project folder
Return type:str
nodes()[source]

Filter project by nodes

Returns:a project which is filtered by nodes
Return type:Project
parent_group

Get the parent group of the current project

Returns:parent project
Return type:Project
static queue_check_job_is_waiting_or_running(item)[source]

Check if a job is still listed in the queue system as either waiting or running.

Parameters:item (int, GenericJob) – Provide either the job_ID or the full hamiltonian
Returns:[True/False]
Return type:bool
queue_delete_job(item)[source]

Delete a job from the queuing system

Parameters:item (int, GenericJob) – Provide either the job_ID or the full hamiltonian
Returns:Output from the queuing system as string - optimized for the Sun grid engine
Return type:str
static queue_enable_reservation(item)[source]

Enable a reservation for a particular job within the queuing system

Parameters:item (int, GenericJob) – Provide either the job_ID or the full hamiltonian
Returns:Output from the queuing system as string - optimized for the Sun grid engine
Return type:str
static queue_is_empty()[source]

Check if the queue table is currently empty - no more jobs to wait for.

Returns:True if the table is empty, else False - optimized for the Sun grid engine
Return type:bool
queue_table(project_only=True, recursive=True, full_table=False)[source]

Display the queuing system table as a pandas.DataFrame

Parameters:
  • project_only (bool) – Query only for jobs within the current project - True by default
  • recursive (bool) – Include jobs from sub projects
  • full_table (bool) – Whether to show the entire pandas table
Returns:

Output from the queuing system - optimized for the Sun grid engine

Return type:

pandas.DataFrame

queue_table_global(full_table=False)[source]

Display the queuing system table as a pandas.DataFrame

Parameters:full_table (bool) – Whether to show the entire pandas table
Returns:Output from the queuing system - optimized for the Sun grid engine
Return type:pandas.DataFrame
refresh_job_status_based_on_job_id(job_id, que_mode=True)[source]

Internal function to check if a job is still listed as ‘running’ in the job_table while it is no longer listed in the queuing system. In this case the entry in the job_table is updated to ‘aborted’.

Parameters:
  • job_id (int) – job ID
  • que_mode (bool) – [True/False] - default=True
refresh_job_status_based_on_queue_status(job_specifier, status='running')[source]

Check if the job is still listed as running, while it is no longer listed in the queue.

Parameters:
  • job_specifier (str, int) – name of the job or job ID
  • status (str) – Currently only the job status of ‘running’ jobs can be refreshed - default=’running’
remove(enable=False, enforce=False)[source]

Delete the whole project including all jobs in the project and its subprojects

Parameters:
  • enforce (bool) – [True/False] delete jobs even though they are used in other projects - default=False
  • enable (bool) – [True/False] enable this command.
remove_file(file_name)[source]

Remove a file (same as os.remove() / unlink())

Parameters:file_name (str) – name of the file
remove_job(job_specifier, _unprotect=False)[source]

Remove a single job from the project based on its job_specifier - see also remove_jobs()

Parameters:
  • job_specifier (str, int) – name of the job or job ID
  • _unprotect (bool) – [True/False] delete the job without validating the dependencies to other jobs - default=False
remove_jobs(recursive=False)[source]

Remove all jobs in the current project and in all subprojects if recursive=True is selected - see also remove_job()

Parameters:recursive (bool) – [True/False] delete all jobs in all subprojects - default=False
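
Both removal methods are deliberately conservative; a sketch assuming a configured pyiron installation (the job name ‘toy’ is an arbitrary example):

```python
from pyiron import Project

pr = Project(path='demo')
pr.remove_job('toy')            # remove a single job by name
pr.remove_jobs(recursive=True)  # remove all jobs, including subprojects
pr.remove(enable=True)          # delete the project itself - requires enable=True
```
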
set_job_status(job_specifier, status, project=None)[source]

Set the status of a particular job

Parameters:
  • job_specifier (str, int) – name of the job or job ID
  • status (str) – job status can be one of the following [‘initialized’, ‘appended’, ‘created’, ‘submitted’, ‘running’, ‘aborted’, ‘collect’, ‘suspended’, ‘refresh’, ‘busy’, ‘finished’]
  • project (str) – project path
static set_logging_level(level, channel=None)[source]

Set level for logger

Parameters:
  • level (str) – ‘DEBUG, INFO, WARN’
  • channel (int) – 0: file_log, 1: stream, None: both
switch_to_central_database()[source]

Switch from local mode to central mode - if local_mode is enabled pyiron uses a local database.

switch_to_local_database(file_name='pyiron.db', cwd=None)[source]

Switch from central mode to local mode - if local_mode is enabled pyiron uses a local database.

Parameters:
  • file_name (str) – file name or file path for the local database
  • cwd (str) – directory where the local database is located
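
A local database keeps a project self-contained; a sketch assuming a configured pyiron installation (file name and location are arbitrary examples):

```python
from pyiron import Project

pr = Project(path='demo')
pr.switch_to_local_database(file_name='pyiron.db', cwd=pr.path)
# ... create and run jobs against the local database file ...
pr.switch_to_central_database()
```
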
switch_to_user_mode()[source]

Switch from viewer mode to user mode - if viewer_mode is enabled pyiron has read-only access to the database.

switch_to_viewer_mode()[source]

Switch from user mode to viewer mode - if viewer_mode is enabled pyiron has read-only access to the database.

values()[source]

All items in the current project - this includes jobs, sub projects/ groups/ folders and any kind of files

Returns:items in the project
Return type:list
view_mode

Get viewer_mode - if viewer_mode is enabled pyiron has read-only access to the database.

Returns:True if viewer_mode is enabled
Return type:bool
static wait_for_job(job, interval_in_s=5, max_iterations=100)[source]

Sleep until the job is finished, but at most interval_in_s * max_iterations seconds.

Parameters:
  • job (GenericJob) – Job to wait for
  • interval_in_s (int) – interval when the job status is queried from the database - default 5 sec.
  • max_iterations (int) – maximum number of iterations - default 100
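
The behaviour of wait_for_job() amounts to polling the job status at a fixed interval. The self-contained sketch below reimplements that loop against a stub job, so the logic can be shown without a pyiron installation (the stub, its status property, and the boolean return value are illustrative assumptions, not part of the pyiron API):

```python
import time

class StubJob:
    """Illustrative stand-in for a GenericJob: finishes after three status queries."""
    def __init__(self):
        self._polls = 0

    @property
    def status(self):
        self._polls += 1
        return 'finished' if self._polls >= 3 else 'running'

def wait_for_job(job, interval_in_s=0.01, max_iterations=100):
    # Sleep until the job is finished, but at most
    # interval_in_s * max_iterations seconds.
    for _ in range(max_iterations):
        if job.status == 'finished':
            return True
        time.sleep(interval_in_s)
    return False

job = StubJob()
print(wait_for_job(job))  # True once the stub reports 'finished'
```
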