API
PySEMTools is a collection of modules that together provide a comprehensive toolkit for working with spectral element method (SEM) data.
The presently included submodules are:
- Command line interface functionalities available in pySEMTools.
- Communication utilities for parallel processing.
- Data types to hold and operate with SEM data.
- High-order interpolation routines for SEM data.
- Classes to read or write data from Nek style codes.
- IO operations using ADIOS2.
- Classes to read or write data in the HDF format.
- Classes to interface with ParaView Catalyst.
- Wrappers to ease IO.
- Monitoring tools for pySEMTools.
- Set of tools to perform reduced order modeling tasks.
The compression module inside the package is still considered experimental and is prone to change; its documentation is therefore not yet included here.
The post-processing module is in a similar state. It builds on the other tools to perform more specific post-processing tasks, but it is still under development and its API is prone to change, so its documentation is reserved for a future release.
The full API documentation for all the classes and functions is provided below.
cli
Command line interface functionalities available in pySEMTools.
- pysemtools.cli.extract_subdomain()
Extract a subdomain from an existing file.
This script extracts a box-shaped subdomain from a given field file and saves it to a new file.
- Parameters:
- --input_file: Path to the input field file.
- --output_file: Path to the output field file where the extracted subdomain will be saved.
- --bounds: Comma-separated list of 6 floats defining the subdomain bounds: xmin,xmax,ymin,ymax,zmin,zmax.
- --fields: (Optional) Comma-separated list of field names to extract. If not provided, all fields will be extracted.
Examples
To use this script, run it with MPI using the following command:
>>> mpirun -n <num_processes> python pysemtools_extract_subdomain --input_file <input.fld> --output_file <output.fld> --bounds=<xmin,xmax,ymin,ymax,zmin,zmax> [--fields=<field1,field2,...>]
Replace the placeholders with actual values (removing the angle brackets).
- pysemtools.cli.index_files()
Create a JSON file that contains an index of files in a folder.
This script creates an index of files in a specified folder and saves it as a JSON file. The index can include file contents and time intervals based on user preferences.
- Parameters:
- --folder_path: Path to the folder containing files to index.
- --output_folder: Path to the output folder.
- --file_type: (Optional) List of file types to index.
- --include_file_contents: (Optional) Boolean flag to include file contents in the index.
- --include_time_interval: (Optional) Boolean flag to include time intervals in the index.
- --run_start_time: (Optional) Float indicating the start time of the run.
- --stat_start_time: (Optional) Float indicating the start time of the statistics.
Examples
To use this script, run the following command:
>>> pysemtools_index_files --folder_path <path_to_folder> --output_folder <output_folder> [--file_type <file_type1,file_type2,...>] [--include_file_contents] [--include_time_interval] [--run_start_time <float>] [--stat_start_time <float>]
Replace the placeholders with actual values (removing the angle brackets).
- pysemtools.cli.visnek()
Create a nek5000 file template from existing files. Used to visualize nek5000 files in VisIt.
This script generates a nek5000 file template based on existing files in the current directory. The generated file contains information about the file naming pattern, first timestep, and number of timesteps.
- Parameters:
- filename_pattern: Pattern to match the nek5000 files.
Examples
To use this script, run it with the following command:
>>> pysemtools_visnek <filename_pattern>
Replace the placeholder with the actual filename pattern (removing the angle brackets).
comm
Communication utilities for parallel processing
- class pysemtools.comm.Router(comm)
This class can be used to handle communication between ranks in an MPI communicator.
With it, one can send and receive data to/from any rank specified in a destination list.
- Parameters:
- comm : MPI communicator
The MPI communicator that is used for the communication.
- Attributes:
- comm : MPI communicator
The MPI communicator that is used for the communication.
- destination_count : ndarray
Buffer specifying how many points this rank sends to each rank.
- source_count : ndarray
Buffer specifying how many points this rank receives from each rank.
Methods
all_gather([data, dtype]) : Gathers data from all processes to all processes.
all_to_all([destination, data, dtype]) : Sends data to specified destinations and receives data from whoever sent.
gather_in_root([data, root, dtype]) : Gathers data from all processes to the root process.
scatter_from_root([data, sendcounts, root, ...]) : Scatters data from the root process to all other processes.
send_recv([destination, data, dtype, tag]) : Sends data to specified destinations and receives data from whoever sent.
transfer_data(comm_pattern, **kwargs) : Moves data between ranks in the specified pattern.
Notes
The data is always flattened before sending, and received data is always flattened. The user must reshape the data after receiving it.
Examples
To initialize simply use the communicator
>>> from mpi4py import MPI
>>> from pysemtools.comm.router import Router
>>> comm = MPI.COMM_WORLD
>>> rt = Router(comm)
- all_gather(data=None, dtype=None)
Gathers data from all processes to all processes.
This is a wrapper to the MPI Allgatherv function.
- Parameters:
- data : ndarray
Data that is gathered in all processes.
- dtype : dtype
The data type of the data that is gathered.
- Returns:
- recvbuf : ndarray
The gathered data in all processes. The data is always received flattened. The user must reshape it.
- sendcounts : ndarray
The number of data entries that were sent from each rank.
Examples
To gather data from all ranks to the root rank, do the following:
>>> rt = Router(comm)
>>> local_data = np.ones(((rank+1)*10, 3), dtype=np.double)*rank
>>> recvbf, sendcounts = rt.all_gather(data=local_data, dtype=np.double)
- all_to_all(destination=None, data=None, dtype=None, **kwargs)
Sends data to specified destinations and receives data from whoever sent.
In this instance the all-to-all collective is used.
- Parameters:
- destination : list
A list with the rank ids that the data should be sent to.
- data : list or ndarray
The data that will be sent. If it is a list, each entry is sent to the corresponding index in the destination list. If the data is an ndarray, the same data will be sent to all destinations.
- dtype : dtype
The data type of the data that is sent.
- Returns:
- sources : list
A list with the rank ids that the data was received from.
- recvbuff : list
A list with the received data. The data is stored in the same order as the sources.
Notes
Extra keyword arguments are ignored. This is to keep the same interface as the send_recv method.
Examples
To send and receive data between ranks, do the following:
>>> rt = Router(comm)
>>> local_data = np.zeros(((rank+1)*10, 3), dtype=np.double)
>>> destination = [rank + 1, rank + 2]
>>> for i, dest in enumerate(destination):
>>>     if dest >= size:
>>>         destination[i] = dest - size
>>> sources, recvbf = rt.all_to_all(destination=destination,
>>>                                 data=local_data, dtype=np.double)
>>> for i in range(0, len(recvbf)):
>>>     recvbf[i] = recvbf[i].reshape((-1, 3))
- gather_in_root(data=None, root=0, dtype=None)
Gathers data from all processes to the root process.
This is a wrapper to the MPI Gatherv function.
- Parameters:
- data : ndarray
Data that is gathered in the root process.
- root : int
The rank that will gather the data.
- dtype : dtype
The data type of the data that is gathered.
- Returns:
- recvbuf : ndarray
The gathered data in the root process. The data is always received flattened. The user must reshape it.
- sendcounts : ndarray
The number of data entries that were sent from each rank.
Examples
To gather data from all ranks to the root rank, do the following:
>>> rt = Router(comm)
>>> local_data = np.ones(((rank+1)*10, 3), dtype=np.double)*rank
>>> recvbf, sendcounts = rt.gather_in_root(data=local_data,
>>>                                        root=0, dtype=np.double)
- scatter_from_root(data=None, sendcounts=None, root=0, dtype=None)
Scatters data from the root process to all other processes.
This is a wrapper to the MPI Scatterv function.
- Parameters:
- data : ndarray
The data that is scattered to all processes.
- sendcounts : ndarray, optional
The number of data entries that are sent to each process. If not specified, the data is divided equally among all processes.
- root : int
The rank that will scatter the data.
- dtype : dtype
The data type of the data that is scattered.
- Returns:
- recvbuf : ndarray
The scattered data in the current process. The data is always received flattened. The user must reshape it.
Examples
To scatter data from the root rank, do the following:
>>> rt = Router(comm)
>>> recvbf = rt.scatter_from_root(data=recvbf, sendcounts=sendcounts,
>>>                               root=0, dtype=np.double)
>>> recvbf = recvbf.reshape((-1, 3))
Note that sendcounts is just an ndarray of size comm.Get_size() with the number of data entries that are sent to each rank.
- send_recv(destination=None, data=None, dtype=None, tag=None)
Sends data to specified destinations and receives data from whoever sent.
Typically, when a rank needs to send some data, it also needs to receive some. In this method this is done by using non-blocking communication. Note, however, that when the method returns, the data has already been received.
- Parameters:
- destination : list
A list with the rank ids that the data should be sent to.
- data : list or ndarray
The data that will be sent. If it is a list, each entry is sent to the corresponding destination. If the data is an ndarray, the same data will be sent to all destinations.
- dtype : dtype
The data type of the data that is sent.
- tag : int
Tag used to identify the messages.
- Returns:
- sources : list
A list with the rank ids that the data was received from.
- recvbuff : list
A list with the received data. The data is stored in the same order as the sources.
Examples
To send and receive data between ranks, do the following:
>>> rt = Router(comm)
>>> local_data = np.zeros(((rank+1)*10, 3), dtype=np.double)
>>> destination = [rank + 1, rank + 2]
>>> for i, dest in enumerate(destination):
>>>     if dest >= size:
>>>         destination[i] = dest - size
>>> sources, recvbf = rt.send_recv(destination=destination,
>>>                                data=local_data, dtype=np.double, tag=0)
>>> for i in range(0, len(recvbf)):
>>>     recvbf[i] = recvbf[i].reshape((-1, 3))
- transfer_data(comm_pattern, **kwargs)
Moves data between ranks in the specified pattern.
This method wraps others in this class.
- Parameters:
- comm_pattern : str
The keyword that specifies the pattern of the data movement. Current options are:
distribute_p2p: sends data to specified destinations and receives data from whoever sent (point-to-point communication).
distribute_a2a: sends data to specified destinations and receives data from whoever sent (all-to-all communication).
gather: gathers data from all processes to the root process.
scatter: scatters data from the root process to all other processes.
- kwargs : dict
The arguments that are passed to the specified pattern. Check the documentation of each pattern for its required arguments.
- Returns:
- tuple
The output of the specified pattern.
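The keyword dispatch described above can be pictured as a small table mapping each comm_pattern value to one of the Router methods. The sketch below is plain Python illustrating the pattern, not the library internals; the method names come from the documentation above, and stand-in strings are returned instead of actually calling them:

```python
# Minimal sketch of keyword-based dispatch mirroring the documented
# comm_pattern options. The library would call the selected Router method
# with the forwarded keyword arguments; here we just return its name.
def transfer_data(comm_pattern, **kwargs):
    patterns = {
        "distribute_p2p": "send_recv",        # point-to-point communication
        "distribute_a2a": "all_to_all",       # all-to-all communication
        "gather": "gather_in_root",
        "scatter": "scatter_from_root",
    }
    method_name = patterns[comm_pattern]      # KeyError for unknown patterns
    return method_name, kwargs
```

A call such as transfer_data("gather", data=local_data, root=0) would thus forward data and root to gather_in_root.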
datatypes
Data types to hold and operate with SEM data.
- class pysemtools.datatypes.Coef(msh, comm, get_area=False, apply_1d_operators=True, bckend='numpy')
Class that contains arrays like mass matrix, jacobian, jacobian inverse, etc.
This class can be used when mathematical operations such as differentiation and integration are needed on the SEM mesh.
- Parameters:
- msh : Mesh
Mesh object.
- comm : Comm
MPI communicator object.
- get_area : bool, optional
If True, the area integration weights and normal vectors will be calculated. (Default value = False).
- apply_1d_operators : bool, optional
If True, the 1D operators will be applied instead of building 3D operators. (Default value = True).
- Attributes:
- drdx : ndarray
Component [0,0] of the jacobian inverse tensor for each point. Shape is (nelv, lz, ly, lx).
- drdy : ndarray
Component [0,1] of the jacobian inverse tensor for each point. Shape is (nelv, lz, ly, lx).
- drdz : ndarray
Component [0,2] of the jacobian inverse tensor for each point. Shape is (nelv, lz, ly, lx).
- dsdx : ndarray
Component [1,0] of the jacobian inverse tensor for each point. Shape is (nelv, lz, ly, lx).
- dsdy : ndarray
Component [1,1] of the jacobian inverse tensor for each point. Shape is (nelv, lz, ly, lx).
- dsdz : ndarray
Component [1,2] of the jacobian inverse tensor for each point. Shape is (nelv, lz, ly, lx).
- dtdx : ndarray
Component [2,0] of the jacobian inverse tensor for each point. Shape is (nelv, lz, ly, lx).
- dtdy : ndarray
Component [2,1] of the jacobian inverse tensor for each point. Shape is (nelv, lz, ly, lx).
- dtdz : ndarray
Component [2,2] of the jacobian inverse tensor for each point. Shape is (nelv, lz, ly, lx).
- B : ndarray
Mass matrix for each point. Shape is (nelv, lz, ly, lx).
- area : ndarray
Area integration weight for each point in the facets. Shape is (nelv, 6, ly, lx).
- nx : ndarray
X component of the normal vector for each point in the facets. Shape is (nelv, 6, ly, lx).
- ny : ndarray
Y component of the normal vector for each point in the facets. Shape is (nelv, 6, ly, lx).
- nz : ndarray
Z component of the normal vector for each point in the facets. Shape is (nelv, 6, ly, lx).
Methods
apply_spatial_filter(field) : Apply the stored spatial filters.
build_spatial_filter([r_tf, s_tf, t_tf]) : Build a spatial filter based on the given transfer functions.
dssum(field, msh) : Perform an average of a given field over the shared points in each rank.
dudrst(field[, direction]) : Perform the derivative with respect to reference coordinate r/s/t.
dudrst_1d_operator(field, dr, ds[, dt]) : Perform the derivative applying the 1D operators provided as inputs.
dudrst_1d_operator_torch(field, dr, ds[, dt]) : Perform the derivative applying the 1D operators provided as inputs.
dudrst_3d_operator(field, dr) : Perform the derivative with respect to reference coordinate r.
dudrst_transposed(field[, direction]) : Perform the derivative with respect to reference coordinate r/s/t.
dudxyz(field, drdx, dsdx[, dtdx]) : Perform the derivative with respect to physical coordinates x, y, z.
glsum(a, comm[, dtype]) : Perform a global summation of a given quantity a using MPI.
Examples
Assuming you have a mesh object and MPI communicator object, you can initialize the Coef object as follows:
>>> from pysemtools import Coef
>>> coef = Coef(msh, comm)
- apply_spatial_filter(field)
Apply the stored spatial filters
- Parameters:
- field : np.array
Field to apply the spatial filter to. Shape should be (nelv, lz, ly, lx).
- Returns:
- np.array
Filtered field. Shape is the same as the input field.
Notes
The spatial filters must be created before calling this function; otherwise, an error will be raised.
- build_spatial_filter(r_tf: array | None = None, s_tf: array | None = None, t_tf: array | None = None)
Build a spatial filter based on the given transfer functions.
- Parameters:
- r_tf : np.array
Transfer function for the r dimension. Shape should be (lx, lx).
- s_tf : np.array
Transfer function for the s dimension. Shape should be (ly, ly).
- t_tf : np.array, optional
Transfer function for the t dimension. Shape should be (lz, lz). If none is passed, the field is assumed to be 2D.
- Returns:
- None
The spatial filters are stored in the r_filter, s_filter, and t_filter attributes.
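As the parameter list above states, each transfer function is a square matrix sized by the points per element in that direction. The snippet below is only a shape illustration under that assumption: identity matrices are the trivial (no-op) transfer functions, and a real filter would damp selected components instead. The final call is shown commented out since it requires an existing Coef object:

```python
import numpy as np

# Shape illustration for build_spatial_filter inputs: one (n, n) matrix
# per direction. np.eye leaves the field unchanged when applied.
lx, ly, lz = 8, 8, 8          # points per element in each direction
r_tf = np.eye(lx)             # (lx, lx) transfer function for r
s_tf = np.eye(ly)             # (ly, ly) transfer function for s
t_tf = np.eye(lz)             # (lz, lz) transfer function for t; omit in 2D
# coef.build_spatial_filter(r_tf=r_tf, s_tf=s_tf, t_tf=t_tf)
```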
- dssum(field, msh)
Perform an average of a given field over the shared points in each rank.
This method averages the field over shared points in the same rank. It uses the connectivity data in the mesh object. dssum might be a misleading name.
- Parameters:
- field : ndarray
Field to average over shared points.
- msh : Mesh
pySEMTools Mesh object.
- Returns:
- ndarray
Input field with shared points averaged with shared points in the SAME rank.
Examples
Assuming you have a Coef object and are working on a 3D field:
>>> dudx = coef.dssum(dudx, msh)
- dudrst(field, direction='r')
Perform the derivative with respect to reference coordinate r/s/t.
Used to perform the derivative in the reference coordinates.
- Parameters:
- field : ndarray
Field to take the derivative of. Shape should be (nelv, lz, ly, lx).
- direction : str
Direction to take the derivative in. Can be ‘r’, ‘s’, or ‘t’. (Default value = ‘r’).
- Returns:
- ndarray
Derivative of the field with respect to r/s/t. Shape is the same as the input field.
- dudrst_1d_operator(field, dr, ds, dt=None)
Perform the derivative applying the 1D operators provided as inputs.
This method uses the 1D operators to apply the derivative. To apply the derivative in r, you must provide the 1D differentiation matrix in that direction and the identity in the others.
- Parameters:
- field : ndarray
Field to take the derivative of. Shape should be (nelv, lz, ly, lx).
- dr : ndarray
Derivative matrix in the r direction to apply to each element. Shape should be (lx, lx).
- ds : ndarray
Derivative matrix in the s direction to apply to each element. Shape should be (ly, ly).
- dt : ndarray, optional
Derivative matrix in the t direction to apply to each element. Shape should be (lz, lz). If none is passed, the field is assumed to be 2D.
- Returns:
- ndarray
Derivative of the field with respect to r/s/t. Shape is the same as the input field.
Examples
Assuming you have a Coef object:
>>> dxdr = coef.dudrst_1d_operator(x, coef.dr, np.eye(ly, dtype=coef.dtype), np.eye(lz, dtype=coef.dtype))
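To see what applying a 1D operator along one direction means, here is a self-contained numpy sketch (an illustration of the technique, not the library code) using a 2-point differentiation matrix on a linear field; the einsum contracts the operator with the last (r) axis of the (nelv, lz, ly, lx) array:

```python
import numpy as np

# Differentiation matrix for linear interpolation on the nodes r = -1, 1:
# the derivative of a degree-1 polynomial is its slope at both nodes.
dr = np.array([[-0.5, 0.5],
               [-0.5, 0.5]])

# Field f(r) = r on a single element with 2 points per direction.
r = np.array([-1.0, 1.0])
field = np.broadcast_to(r, (1, 2, 2, 2)).copy()  # shape (nelv, lz, ly, lx)

# Apply dr along the last axis:
# out[e, k, j, i] = sum_m dr[i, m] * field[e, k, j, m]
dfdr = np.einsum("im,ekjm->ekji", dr, field)
# df/dr of f(r) = r is 1 everywhere.
```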
- dudrst_1d_operator_torch(field, dr, ds, dt=None)
Perform derivative applying the 1D operators provided as inputs.
- Parameters:
- field : torch.Tensor
Field to take the derivative of. Shape should be (nelv, lz, ly, lx).
- dr : torch.Tensor
Derivative matrix in the r direction to apply to each element. Shape should be (lx, lx).
- ds : torch.Tensor
Derivative matrix in the s direction to apply to each element. Shape should be (ly, ly).
- dt : torch.Tensor, optional
Derivative matrix in the t direction. Shape should be (lz, lz).
- Returns:
- torch.Tensor
Derivative of the field with respect to r/s/t. Shape is the same as the input field.
- dudrst_3d_operator(field, dr)
Perform the derivative with respect to reference coordinate r.
This method uses derivation matrices built from the Lagrange polynomials at the GLL points.
- Parameters:
- field : ndarray
Field to take the derivative of. Shape should be (nelv, lz, ly, lx).
- dr : ndarray
Derivative matrix in the r/s/t direction to apply to each element. Shape should be (lx*ly*lz, lx*ly*lz).
- Returns:
- ndarray
Derivative of the field with respect to r/s/t. Shape is the same as the input field.
Examples
Assuming you have a Coef object:
>>> dxdr = coef.dudrst_3d_operator(x, coef.dr)
- dudrst_transposed(field, direction='r')
Perform the derivative with respect to reference coordinate r/s/t.
Used to perform the derivative in the reference coordinates.
- Parameters:
- field : ndarray
Field to take the derivative of. Shape should be (nelv, lz, ly, lx).
- direction : str
Direction to take the derivative in. Can be ‘r’, ‘s’, or ‘t’. (Default value = ‘r’).
- Returns:
- ndarray
Derivative of the field with respect to r/s/t. Shape is the same as the input field.
- dudxyz(field, drdx, dsdx, dtdx=None)
Perform the derivative with respect to physical coordinates x, y, z.
This method uses the chain rule: it first evaluates derivatives with respect to r, s, t, then multiplies by the inverse of the jacobian to map to x, y, z.
- Parameters:
- field : ndarray
Field to take the derivative of. Shape should be (nelv, lz, ly, lx).
- drdx : ndarray
Derivative of the first reference coordinate with respect to the chosen physical coordinate, i.e., the first entry in the appropriate row of the jacobian inverse. Shape should be the same as the field.
- dsdx : ndarray
Derivative of the second reference coordinate with respect to the chosen physical coordinate, i.e., the second entry in the appropriate row of the jacobian inverse. Shape should be the same as the field.
- dtdx : ndarray, optional
Derivative of the third reference coordinate with respect to the chosen physical coordinate, i.e., the third entry in the appropriate row of the jacobian inverse. Shape should be the same as the field. (Default value = None). Only valid for 3D fields.
- Returns:
- ndarray
Derivative of the field with respect to x, y, or z. Shape is the same as the input field.
Examples
Assuming you have a Coef object and are working on a 3D field:
>>> dudx = coef.dudxyz(u, coef.drdx, coef.dsdx, coef.dtdx)
- glsum(a, comm, dtype=<class 'numpy.float64'>)
Perform a global summation of a given quantity a using MPI.
This method uses MPI to sum over all MPI ranks. It works with any numpy array shape and returns one value.
- Parameters:
- a : ndarray
Quantity to sum over all MPI ranks.
- comm : Comm
MPI communicator object.
- dtype : numpy.dtype
(Default value = np.double).
- Returns:
- float
Sum of the quantity a over all MPI ranks.
Examples
Assuming you have a Coef object and are working on a 3D field:
>>> volume = coef.glsum(coef.B, comm)
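Conceptually, the global sum is the ordinary sum of every rank's local array, reduced over the communicator; with a communicator of size 1 it is simply a.sum(). A serial numpy sketch of that reduction (simulating four ranks, no MPI involved):

```python
import numpy as np

# Each simulated rank owns an array of arbitrary shape; the global sum is
# the scalar sum of all local sums, identical on every rank after the
# MPI reduction.
local_arrays = [np.full((2, 3, 3, 3), float(r)) for r in range(4)]
local_sums = [a.sum() for a in local_arrays]   # what each rank computes
global_sum = sum(local_sums)                   # what the MPI reduce yields
```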
- class pysemtools.datatypes.Field(comm, data=None)
Class that contains fields.
This is the main class used to contain data for post-processing. The data does not generally need to be present in this class, as it is typically enough to have the data as ndarrays of shape (nelv, lz, ly, lx) for each field. However, this class provides an easy interface to collect data that is somehow associated.
It also makes it easy to write data to disk, as all the data in this class will be stored in the same file.
- Parameters:
- comm : Comm
MPI communicator object.
- data : HexaData, optional
HexaData object that contains the coordinates of the domain.
- Attributes:
- fields : dict
Dictionary that contains the fields. The keys are the field names and the values are lists of ndarrays. The keys are the same as for HexaData objects, i.e. vel, pres, temp, scal.
- vel_fields : int
Number of velocity fields.
- pres_fields : int
Number of pressure fields.
- temp_fields : int
Number of temperature fields.
- scal_fields : int
Number of scalar fields.
- t : float
Time of the data.
Methods
update_vars() : Update the number of fields.
clear()
Examples
If a HexaData object data has been read from disk, the Field object can be created directly from it.
>>> from pysemtools.datatypes.field import Field
>>> fld = Field(comm, data=data)
If one wishes to use the data in the fields, it is possible to reference it with an ndarray of shape (nelv, lz, ly, lx) as follows:
>>> u = fld.fields["vel"][0]
>>> v = fld.fields["vel"][1]
>>> w = fld.fields["vel"][2]
>>> vel_magnitude = np.sqrt(u**2 + v**2 + w**2)
A Field object can also be created empty and have fields added to it, which is useful for writing data to disk. If an ndarray u is created with shape (nelv, lz, ly, lx), it can be added to the field object as follows:
>>> from pysemtools.datatypes.field import Field
>>> fld = Field(comm)
>>> fld.fields["vel"].append(u)
>>> fld.update_vars()
This fld object can then be used to write fld files from field u created in the code.
- update_vars()
Update number of fields.
Update the number of fields in the class in the event that it has been modified. This is needed for writing data properly if more arrays are added to the class.
Examples
A Field object can be created empty and have fields added to it, which is useful for writing data to disk. If an ndarray u is created with shape (nelv, lz, ly, lx), it can be added to the field object as follows:
>>> from pysemtools.datatypes.field import Field
>>> fld = Field(comm)
>>> fld.fields["vel"].append(u)
>>> fld.update_vars()
This fld object can then be used to write fld files from field u created in the code.
- class pysemtools.datatypes.FieldRegistry(comm, data=None, bckend='numpy')
Class that contains fields.
This class extends the main Field class with a registry that allows the fields to be easily referenced by name.
- Parameters:
- comm : Comm
MPI communicator object.
- data : HexaData, optional
HexaData object that contains the coordinates of the domain.
Methods
add_field(comm[, field_name, field, ...]) : Add fields to the registry.
clear() : Clear the registry and the fields.
rename_registry_key([old_key, new_key]) : Rename a key in the registry.
to([comm, bckend]) : Move all fields to the CPU as numpy arrays, to write out.
update_vars() : Update the registry with the fields that are present in the fields dictionary.
- add_field(comm, field_name='', field=None, file_type=None, file_name=None, file_key=None, dtype=<class 'numpy.float64'>)
Add fields to the registry. They will be stored in the fields dictionary to easily write them.
- Parameters:
- comm : Comm
MPI communicator object.
- field_name : str
Name of the field to be added. Where the field is added depends on the name.
- field : ndarray
Field to be added to the registry. If this is provided, it is assumed to be an ndarray and will be added to the registry directly.
- file_type : str
Type of the file the field is read from. If this is provided, the field is assumed to come from a file. Currently, only “fld” is supported.
- file_name : str
File name of the field to be added. If this is provided, the field is assumed to come from a file and will then be added to the registry.
- file_key : str
Key to search for in the file. For Nek files, keys have the following format: “vel_0”, “vel_1”, “pres”, “temp”, “scal_0”, “scal_1”, etc. Only for “vel” are the 2/3 components read at the same time.
- dtype : np.dtype
Data type of the field. Default is np.double.
- clear()
Clear the registry and the fields.
- rename_registry_key(old_key='', new_key='')
Rename a key in the registry.
- Parameters:
- old_key : str
Old key to be renamed.
- new_key : str
New key to be used.
Notes
If you update the registry, some keys might be overwritten or multiple keys might reference the same data.
- to(comm=None, bckend='numpy')
Move all fields to the CPU as numpy arrays, to write out.
- update_vars()
Update the registry with the fields that are present in the fields dictionary.
- class pysemtools.datatypes.Mesh(comm, data=None, x=None, y=None, z=None, elmap=None, create_connectivity=False, bckend='numpy', log_level=None)
Class that contains coordinate and partitioning data of the domain.
This class is generally needed, as it contains the coordinates of the domain and some information about its partitioning.
- Parameters:
- comm : Comm
MPI communicator object.
- data : HexaData, optional
HexaData object that contains the coordinates of the domain.
- x : ndarray, optional
X coordinates of the domain. Shape is (nelv, lz, ly, lx).
- y : ndarray, optional
Y coordinates of the domain. Shape is (nelv, lz, ly, lx).
- z : ndarray, optional
Z coordinates of the domain. Shape is (nelv, lz, ly, lx).
- elmap : ndarray, optional
1D ndarray of global element ids. Shape is (nelv,).
- create_connectivity : bool, optional
If True, the connectivity of the domain will be created. (Memory intensive).
- bckend : str, optional
Backend to use for the data. Options are ‘numpy’ and ‘torch’. Default is ‘numpy’.
- Attributes:
- x : ndarray
X coordinates of the domain. Shape is (nelv, lz, ly, lx).
- y : ndarray
Y coordinates of the domain. Shape is (nelv, lz, ly, lx).
- z : ndarray
Z coordinates of the domain. Shape is (nelv, lz, ly, lx).
- lx : int
Polynomial degree in the x direction.
- ly : int
Polynomial degree in the y direction.
- lz : int
Polynomial degree in the z direction.
- nelv : int
Number of elements of the domain in the current rank.
- glb_nelv : int
Total number of elements in the domain.
- gdim : int
Dimension of the domain.
- non_linear_shared_points : list, optional
List that shows the indices where points in the domain are shared; used by Coef in dssum.
Methods
create_connectivity() : Create connectivity with the information from one processor.
get_edge_centers() : Get the edge centers of the domain.
get_facet_centers() : Get the centroid of each facet.
get_vertices() : Get the vertices of the domain.
init_common(comm) : Initialize common attributes.
init_from_coords(comm, x, y, z[, elmap]) : Initialize from coordinates.
init_from_data(comm, data) : Initialize from data.
to([comm, bckend]) : Transfer the Mesh object to the desired backend.
Examples
If a HexaData object data has been read from disk, the Mesh object can be created directly from it.
>>> from pysemtools.datatypes.msh import Mesh
>>> msh = Mesh(comm, data=data)
If the coordinates are already available, the mesh object can be created from them.
>>> from pysemtools.datatypes.msh import Mesh
>>> msh = Mesh(comm, x=x, y=y, z=z)
This is useful in situations where the coordinates are generated in the code or streamed into python from another source.
- create_connectivity()
Create connectivity with the information from one processor
Notes
This function creates a map that contains the connectivity of the domain. This is not the recommended way to compute connectivity for large domains; it is better to use the dedicated connectivity object, which performs the operations in parallel.
- get_edge_centers()
Get the edge centers of the domain.
Get all the edge centers of the domain in 2D or 3D.
Notes
We need 4 edges for 2D and 12 edges for 3D. For all cases we store 3 coordinates for each edge.
- get_facet_centers()
Get the centroid of each facet
Find the centroid of each facet. This is used to find the shared facets between elements.
Notes
This is not really the centroid, as we also find a coordinate in the dimension perpendicular to the facet. This means that these values can be outside or inside the element. However, the same behaviour should be seen in the matching elements.
- get_vertices()
Get the vertices of the domain.
Get all the vertices of the domain in 2D or 3D.
Notes
We need 4 vertices for 2D and 8 vertices for 3D. For all cases, we store 3 coordinates for each vertex.
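The vertex counts quoted above can be pictured as the corner GLL nodes of each element: every combination of first or last index along each direction. A small self-contained sketch of that enumeration (illustrative only, not the library implementation):

```python
from itertools import product

# Corner (vertex) index tuples of a single element with lz x ly x lx GLL
# points: the first or last index in each direction. 3D gives 2^3 = 8
# vertices; 2D gives 2^2 = 4.
lx = ly = lz = 5
corners_3d = list(product((0, lz - 1), (0, ly - 1), (0, lx - 1)))  # (k, j, i)
corners_2d = list(product((0, ly - 1), (0, lx - 1)))               # (j, i)
```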
- init_common(comm)
Initialize common attributes.
This function is used to initialize the common attributes of the mesh object.
- Parameters:
- comm : Comm
MPI communicator object.
- Returns:
- None
Nothing is returned, the attributes are set in the object.
- init_from_coords(comm, x, y, z, elmap=None)
Initialize from coordinates.
This function is used to initialize the mesh object from x, y, z ndarrays.
- Parameters:
- comm : Comm
MPI communicator object.
- x : ndarray
X coordinates of the domain. Shape is (nelv, lz, ly, lx).
- y : ndarray
Y coordinates of the domain. Shape is (nelv, lz, ly, lx).
- z : ndarray
Z coordinates of the domain. Shape is (nelv, lz, ly, lx).
- elmap : ndarray, optional
1D ndarray of global element ids. Shape is (nelv,). If not provided, it will be set to None.
- Returns:
- None
Nothing is returned, the attributes are set in the object.
- init_from_data(comm, data)
Initialize from data.
This function is used to initialize the mesh object from a hexadata object.
- Parameters:
- comm : Comm
MPI communicator object.
- data : HexaData
HexaData object that contains the coordinates of the domain.
- Returns:
- None
Nothing is returned, the attributes are set in the object.
- to(comm=None, bckend='numpy')
Transfer the Mesh object to the desired backend.
- Parameters:
- comm : Comm
MPI communicator object.
- bckend : str
Backend to use for the data. Options are ‘numpy’ and ‘torch’. Default is ‘numpy’.
- Returns:
- msh_cpu : Mesh
Mesh object in the desired backend.
- class pysemtools.datatypes.MeshConnectivity(comm, msh: Mesh | None = None, rel_tol=1e-05, use_hashtable=False, max_simultaneous_sends=1, max_elem_per_vertex: int | None = None, max_elem_per_edge: int | None = None, max_elem_per_face: int | None = None, coef=None)
Class to compute the connectivity of the mesh.
Uses facets and vertices to determine which elements are connected to each other.
- Parameters:
- comm : MPI communicator
The MPI communicator.
- msh : Mesh
The mesh object.
- rel_tol : float
The relative tolerance to use when comparing the coordinates of the facets/edges.
- use_hashtable : bool
Whether to use a hashtable to define connectivity. This is faster but uses more memory.
- max_simultaneous_sends : int
The maximum number of simultaneous sends to use when sending data to other ranks. A lower number saves memory for buffers but is slower.
- max_elem_per_vertex : int
The maximum number of elements that share a vertex. The defaults are 4 for 2D and 8 for 3D (which works for a structured mesh); the default is selected if this input is left as None.
- max_elem_per_edge : int
The maximum number of elements that share an edge. The defaults are 2 for 2D and 4 for 3D (which works for a structured mesh); the default is selected if this input is left as None.
- max_elem_per_face : int
The maximum number of elements that share a face. The default is 2 for 3D (which works for a structured mesh); the default is selected if this input is left as None.
Methods
dssum([field, msh, average])Computes the dssum of the field
dssum_global([local_dssum_field, field, msh])Computes the global dssum of the field
dssum_local([field, msh])Computes the local dssum of the field
get_boundary_node_indices_2d(msh[, ...])Return list of (e, k, j, i) indices of GLL nodes lying on boundary edges in a 2D mesh.
get_multiplicity(msh)Computes the multiplicity of the elements in the mesh
global_connectivity(msh)Computes the global connectivity of the mesh
local_connectivity(msh)Computes the local connectivity of the mesh
- dssum(field: ndarray | None = None, msh: Mesh | None = None, average: str = 'multiplicity')
Computes the dssum of the field
- Parameters:
- fieldnp.ndarray
The field to compute the dssum
- mshMesh
The mesh object
- averagestr
The averaging weights to use. Can be “multiplicity”
- Returns:
- np.ndarray
The dssum of the field
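The idea behind dssum with multiplicity averaging can be illustrated without the pysemtools API: shared nodes accumulate the contributions of every element that owns them and are then divided by the number of owners. A minimal pure-NumPy sketch on two 1D elements sharing one node (toy data, not the library's implementation):

```python
import numpy as np

# Two 1D elements with 2 nodes each; global node ids per local node.
# Element 0 owns global nodes (0, 1); element 1 owns (1, 2): node 1 is shared.
gidx = np.array([[0, 1], [1, 2]])
field = np.array([[1.0, 2.0], [4.0, 8.0]])  # local field values, shape (nelv, lx)

# dssum: accumulate local values into global nodes, then gather back.
summed = np.zeros(3)
np.add.at(summed, gidx.ravel(), field.ravel())

# Multiplicity: how many local nodes map to each global node.
mult = np.zeros(3)
np.add.at(mult, gidx.ravel(), 1.0)

averaged = (summed / mult)[gidx]  # gather averaged values back to local layout
print(averaged)  # shared node holds (2.0 + 4.0) / 2 = 3.0 in both elements
```

With average="multiplicity", continuous fields are left unchanged while discontinuities at element interfaces are smoothed out.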
- dssum_global(local_dssum_field: ndarray | None = None, field: ndarray | None = None, msh: Mesh | None = None)
Computes the global dssum of the field
- Parameters:
- local_dssum_fieldnp.ndarray
The local dssum of the field, computed with dssum_local
- fieldnp.ndarray
The field to compute the dssum
- mshMesh
The mesh object
- Returns:
- np.ndarray
The global dssum of the field
- dssum_local(field: ndarray | None = None, msh: Mesh | None = None)
Computes the local dssum of the field
- Parameters:
- fieldnp.ndarray
The field to compute the dssum
- mshMesh
The mesh object
- Returns:
- np.ndarray
The local dssum of the field
- get_boundary_node_indices_2d(msh, masking_function=None)
Return list of (e, k, j, i) indices of GLL nodes lying on boundary edges in a 2D mesh.
- Parameters:
- mshMesh
The mesh associated with this connectivity object.
- masking_functioncallable or None
Optional function with signature (msh, e, k, j, i) → bool. If provided, only nodes for which this returns True are included.
- Returns:
- boundary_node_indiceslist of tuple
Indices in the form (element, z=0, j, i) for all GLL nodes on boundary edges.
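The masking function receives the mesh and one (e, k, j, i) node index and returns a boolean. A hypothetical example that keeps only boundary nodes lying on the wall y = 0 (a stand-in object is used here in place of a real Mesh; the y attribute follows the (nelv, lz, ly, lx) layout documented above):

```python
import numpy as np
from types import SimpleNamespace

# Stand-in mesh with a y-coordinate array of shape (nelv, lz, ly, lx).
msh = SimpleNamespace(y=np.zeros((2, 1, 3, 3)))
msh.y[:, 0, 1:, :] = 1.0  # only the j = 0 row sits on y = 0

def on_wall(msh, e, k, j, i):
    """Keep only boundary nodes lying on the wall y = 0 (within a tolerance)."""
    return abs(msh.y[e, k, j, i]) < 1e-8

# Would be passed as: conn.get_boundary_node_indices_2d(msh, masking_function=on_wall)
print(on_wall(msh, 0, 0, 0, 0), on_wall(msh, 0, 0, 1, 0))  # True False
```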
- get_multiplicity(msh: Mesh)
Computes the multiplicity of the elements in the mesh
- Parameters:
- mshMesh
Notes
The multiplicity is the number of times a point in an element is shared with its own element or others. The minimum multiplicity is 1, since a point is always shared with itself.
- global_connectivity(msh: Mesh)
Computes the global connectivity of the mesh
Currently this function sends data from all to all.
- Parameters:
- mshMesh
The mesh object
Notes
In 3D, this function sends the facet centers of the unique_efp_elem and unique_efp_facet to all other ranks, as well as the element ID and facet ID to be assigned.
We compare the unique facet centers of our rank to those of others and determine which ones match. When a match is found, we populate the dictionaries: global_shared_efp_to_rank_map[(e, f)] = rank global_shared_efp_to_elem_map[(e, f)] = elem global_shared_efp_to_facet_map[(e, f)] = facet
So for each element-facet pair we will know which rank has it, and what its ID is on that rank.
BE MINDFUL: Later, when redistributing, send the points, but also send the element and facet ID to the other ranks so the receiver can know which facet corresponds.
- local_connectivity(msh: Mesh)
Computes the local connectivity of the mesh
This function checks elements within a rank
- Parameters:
- mshMesh
The mesh object
Notes
In 3D, the centers of the facets are compared. efp means element facet pair. One obtains a local_shared_efp_to_elem_map and a local_shared_efp_to_facet_map dictionary.
local_shared_efp_to_elem_map[(e, f)] = [e1, e2, …] gives a list with the elements e1, e2 … that share the same facet f of element e.
local_shared_efp_to_facet_map[(e, f)] = [f1, f2, …] gives a list with the facets f1, f2 … of the elements e1, e2 … that share the same facet f of element e.
In each case, the index of the element list corresponds to the index of the facet list. Therefore, the element list might have repeated element entries.
Additionally, we create lists unique_efp_elem and unique_efp_facet, which hold the elements and facets that are not shared with any other element on this rank. These belong either to boundary elements or to elements connected to other ranks. The unique pairs are the ones checked in the global connectivity step.
- class pysemtools.datatypes.MeshPartitioner(comm, msh: Mesh | None = None, conditions: list[ndarray] | None = None)
A class that repartitions SEM mesh data using a given partitioning algorithm.
The idea is to be able to choose subdomains of the mesh and split the elements such that the load is balanced among the ranks.
One could use this to repartition the data if the condition array is full of True values, but the idea is to be able to use any condition array.
- Parameters:
- commMPI communicator
MPI communicator
- mshMesh
Mesh object to partition
- conditionslist[np.ndarray]
List of conditions to apply to the mesh elements. The conditions should be in the form of a list of numpy arrays. Each numpy array should have the same length as the number of elements in the mesh. The conditions should be boolean arrays.
Methods
create_partitioned_field([fld, ...])Create a partitioned field object
create_partitioned_mesh([msh, ...])Create a partitioned mesh object
redistribute_field_elements([field, ...])Redistribute the elements of the mesh object to different ranks
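As a rough illustration of what a linear load-balanced partitioning of a conditioned subset looks like, here is a pure-Python/NumPy sketch (an assumption about how "load_balanced_linear" distributes elements, not the pysemtools internals): the elements selected by a boolean condition are split over the ranks as evenly as possible, in linear order.

```python
import numpy as np

def load_balanced_linear(n_selected: int, n_ranks: int):
    """Split n_selected items over n_ranks as evenly as possible, in order."""
    base, rem = divmod(n_selected, n_ranks)
    counts = [base + 1 if r < rem else base for r in range(n_ranks)]
    offsets = np.concatenate(([0], np.cumsum(counts)))
    return [range(offsets[r], offsets[r + 1]) for r in range(n_ranks)]

# A boolean condition selects a subset of 10 elements; split them over 3 ranks.
condition = np.array([True, False, True, True, True, False, True, True, True, True])
selected = np.flatnonzero(condition)  # global ids of the selected elements
parts = load_balanced_linear(selected.size, 3)
print([list(selected[list(p)]) for p in parts])  # [[0, 2, 3], [4, 6, 7], [8, 9]]
```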
- create_partitioned_field(fld: Field | FieldRegistry | None = None, partitioning_algorithm: str = 'load_balanced_linear') FieldRegistry
Create a partitioned field object
- Parameters:
- fldField or FieldRegistry
Field object to partition
- partitioning_algorithmstr
Algorithm to use for partitioning the mesh elements
- Returns:
- partitioned_fieldFieldRegistry
Partitioned field object
- create_partitioned_mesh(msh: Mesh | None = None, partitioning_algorithm: str = 'load_balanced_linear', create_conectivity: bool = False) Mesh
Create a partitioned mesh object
- Parameters:
- mshMesh
Mesh object to partition
- partitioning_algorithmstr
Algorithm to use for partitioning the mesh elements
- Returns:
- partitioned_meshMesh
Partitioned mesh object
- redistribute_field_elements(field: ndarray | None = None, partitioning_algorithm: str = 'load_balanced_linear') None
Redistribute the elements of the mesh object to different ranks
- Parameters:
- fieldnp.ndarray
Field to redistribute based on the conditions at initialization
- partitioning_algorithmstr
Algorithm to use for partitioning the mesh elements
- class pysemtools.datatypes.VTKMesh(comm: Comm, x: ndarray, y: ndarray, z: ndarray, cell_type: str = 'hex', global_connectivity: bool = True, distributed_axis: int = 0)
Class that contains the mesh data in a VTK-friendly format
Helper to build connectivity and offsets etc.
- Parameters:
- commMPI.Comm
The MPI communicator
- xnp.ndarray
The x coordinates of the mesh points
- ynp.ndarray
The y coordinates of the mesh points
- znp.ndarray
The z coordinates of the mesh points
- cell_typestr, optional
The type of cells in the mesh, by default “hex”. Only “hex” is currently supported.
- global_connectivitybool, optional
Whether to use global connectivity or local connectivity in parallel. By default True, but if only one rank is used, it is set to False since the connectivity is already global in that case.
- distributed_axisint, optional
The axis along which the mesh is distributed, by default 0 (Only zero allowed now)
Notes
Offsets is n_cells + 1 to work on VTKHDF. This might need to be reduced to n_cells for other use cases like Catalyst.
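To illustrate the kind of arrays such a helper builds, here is a hand-rolled sketch of connectivity and offsets for a single linear hexahedron in the usual VTK unstructured-grid layout (an illustration only; the actual class derives these from SEM coordinate arrays):

```python
import numpy as np

# 8 corner points of a unit cube, listed in VTK hexahedron ordering:
# bottom face counter-clockwise, then top face counter-clockwise.
points = np.array([
    [0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0],   # bottom face
    [0, 0, 1], [1, 0, 1], [1, 1, 1], [0, 1, 1],   # top face
], dtype=float)

# One hex cell referencing those 8 points.
connectivity = np.arange(8, dtype=np.int64)

# The offsets array has n_cells + 1 entries (the VTKHDF convention):
# connectivity[offsets[i]:offsets[i + 1]] are the point ids of cell i.
offsets = np.array([0, 8], dtype=np.int64)

print(connectivity[offsets[0]:offsets[1]])  # the point ids of cell 0
```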
interpolation
High-order interpolation routines for SEM data.
- class pysemtools.interpolation.Probes(comm, output_fname: str = './interpolated_fields.csv', probes: ndarray | str | None = None, msh=typing.Union[pysemtools.datatypes.msh.Mesh, str, list], write_coords: bool = True, progress_bar: bool = False, point_interpolator_type: str = 'single_point_legendre', max_pts: int = 128, find_points_iterative: list = [False, 5000], find_points_comm_pattern: str = 'point_to_point', elem_percent_expansion: float = 0.01, global_tree_type: str = 'rank_bbox', global_tree_nbins: int = 1024, use_autograd: bool = False, find_points_tol: float = np.float64(2.220446049250313e-15), find_points_max_iter: int = 50, local_data_structure: str = 'kdtree', use_oriented_bbox: bool = False, clean_search_traces: bool = False)
Class to interpolate fields at probes from a SEM mesh.
Main interpolation class. This works in parallel.
If the points to interpolate are available only at rank 0, make sure to only pass them at that rank and set the others to None. In that case, the points will be scattered to all ranks.
If the points are passed in all ranks, then they will be considered as different points to be interpolated and the work to interpolate will be multiplied. If you are doing this, make sure that the points that each rank pass are different. Otherwise simply pass None in all ranks but 0.
See example below to observe how to avoid unnecessary replication of data in all ranks when passing probes as argument if the points are the same in all ranks.
If reading probes from file, they will be read on rank 0 and scattered unless parallel hdf5 is used, in which case the probes will be read in all ranks. (In development)
- Parameters:
- commMPI communicator
MPI communicator.
- output_fnamestr
Output file name. Default is “./interpolated_fields.csv”. Note that you can change the file extension to .hdf5 (e.g. “interpolated_fields.hdf5”) to write in hdf5 format.
- probesUnion[np.ndarray, str]
Probes coordinates. If a string, it is assumed to be a file name.
- mshUnion[Mesh, str, list]
Mesh data. If a string, it is assumed to be a file name. If a list, the first entry is the file name and the second is the dtype of the data. If a Mesh object, the x, y, z coordinates are taken from the object.
- write_coordsbool
If True, the coordinates of the probes are written to a file. Default is True.
- progress_barbool
If True, a progress bar is shown. Default is False.
- point_interpolator_typestr
Type of point interpolator. Default is single_point_legendre. Options are: single_point_legendre, single_point_lagrange, multiple_point_legendre_numpy, multiple_point_legendre_torch.
- max_ptsint, optional
Maximum number of points to interpolate. Default is 128. Used if multiple point interpolator is selected.
- find_points_iterativelist
List with two elements. First element is a boolean that indicates if the search is iterative. Second element is the maximum number of candidate ranks to send the data. This affects memory. Default is [False, 5000].
- find_points_comm_patternstr
Communication pattern for finding points. Default is point_to_point. Options are: point_to_point, collective, rma.
- elem_percent_expansionfloat
Percentage expansion of the element bounding box. Default is 0.01.
- global_tree_typestr
How the global tree is constructed to determine rank candidates for the probes. Only relevant when tree structures are used to determine candidates. Default is rank_bbox. Options are: rank_bbox, domain_binning.
- global_tree_nbinsint
Number of bins in the global tree. Only used if the global tree is domain_binning. Default is 1024.
- use_autogradbool
If True, autograd is used. Default is False.
- find_points_tolfloat
The tolerance to use when finding points. Default is np.finfo(np.double).eps * 10.
- find_points_max_iterint
The maximum number of iterations to use when finding points. Default is 50.
- local_data_structurestr
The local data structure to use when finding points. Default is kdtree. Options are: kdtree, obb_tree.
- use_oriented_bboxbool
If True, oriented bounding boxes are used when finding points. Default is False.
- clean_search_tracesbool
If True, cleans all the data used for searching points after initialization. This saves memory if only interpolation is needed afterwards. Default is False.
- Attributes:
- probesndarray
2D array of probe coordinates. shape = (n_probes, 3).
- interpolated_fieldsndarray
2D array of interpolated fields at probes. shape = (n_probes, n_fields + 1). The first column is always time, the rest are the interpolated fields.
Methods
interpolate_from_field_list(t, field_list, comm)Interpolate the probes from a list of fields.
clean_search_traces
Notes
A sample input file can be found in the examples folder of the main repository; note, however, that the file is not used in said example.
Examples
1. Initialize from file:
>>> from mpi4py import MPI
>>> from pysemtools.interpolation.probes import Probes
>>> comm = MPI.COMM_WORLD
>>> probes = Probes(comm, filename="path/to/params.json")
2. Initialize from code, passing everything as arguments. Assume msh is created. One must then create the probe data on rank 0; a dummy probes_data must be created on all other ranks:
>>> from mpi4py import MPI
>>> import numpy as np
>>> from pysemtools.interpolation.probes import Probes
>>> comm = MPI.COMM_WORLD
>>> if comm.Get_rank() == 0:
...     probes_data = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [2.0, 2.0, 2.0]])
... else:
...     probes_data = None
>>> probes = Probes(comm, probes=probes_data, msh=msh)
Note that probes is initialized on all ranks, but the probes_data containing the coordinates is only relevant on rank 0. It is scattered internally.
- interpolate_from_field_list(t, field_list, comm, write_data=True, field_names: list[str] | None = None)
Interpolate the probes from a list of fields.
This method interpolates from a list of fields (ndarrays of shape (nelv, lz, ly, lx)).
- Parameters:
- tfloat
Time of the field data.
- field_listlist
List of fields to interpolate. Each field is an ndarray of shape (nelv, lz, ly, lx).
- commComm
MPI communicator.
- write_databool
If True, the interpolated data is written to a file. Default is True.
- field_nameslist
List of names of the interpolated fields. Useful when writing to file. Default is None.
Examples
This method can be used to interpolate fields from a list of fields. If you have previously obtained a set of fields u, v, w as ndarrays of shape (nelv, lz, ly, lx), you can:
>>> probes.interpolate_from_field_list(t, [u,v,w], comm)
The results are stored in probes.interpolated_fields attribute. Remember: the first column of this attribute is always the time t given.
io
There are multiple interfaces to perform IO operations. The main one relates to Nek5000/Neko files, which are read using the ppymech submodule. It is also possible to use ADIOS2 to, for example, exploit in-situ processing capabilities such as data streaming or compression. Support for HDF5 files in parallel is also provided, including VTKHDF files that can be displayed in ParaView. Finally, a set of wrappers to read and write data in a more user-friendly way is also included.
ppymech
Classes to read or write data from Nek style codes
- pysemtools.io.ppymech.preadnek(filename, comm, data_dtype=<class 'numpy.float64'>)
Read an fld file and return a pymech hexadata object (Parallel).
Main function for reading Nek-type fld files.
- Parameters:
- filenamestr
The filename of the fld file.
- commComm
MPI communicator.
- data_dtypestr
The data type of the data in the file. (Default value = “float64”).
- Returns:
- HexaData
The data read from the file in a pymech hexadata object.
Examples
>>> from mpi4py import MPI
>>> from pysemtools.io.ppymech.neksuite import preadnek
>>> comm = MPI.COMM_WORLD
>>> data = preadnek('field00001.fld', comm)
- pysemtools.io.ppymech.pwritenek(filename, data, comm)
Write an fld file from a pymech hexadata object (Parallel).
Main function for writing fld files.
- Parameters:
- filenamestr
The filename of the fld file.
- dataHexaData
The data to write to the file.
- commComm
MPI communicator.
Examples
Assuming you have a hexadata object already:
>>> from pysemtools.io.ppymech.neksuite import pwritenek
>>> pwritenek('field00001.fld', data, comm)
- pysemtools.io.ppymech.pynekread(filename, comm, data_dtype=<class 'numpy.float64'>, msh=None, fld=None, overwrite_fld=False)
Read a Nek file and populate pysemtools objects (Parallel).
Main function for reading Nek-type fld files.
- Parameters:
- filenamestr
The filename of the fld file.
- commComm
MPI communicator.
- data_dtypestr
The data type of the data in the file. (Default value = “float64”).
- mshMesh
The mesh object to put the data in. (Default value = None).
- fldField
The field object to put the data in. (Default value = None).
- overwrite_fldbool
Whether or not to overwrite the contents of fld. (Default value = False).
- Returns:
- None
Nothing is returned, the attributes are set in the object.
Examples
>>> from mpi4py import MPI
>>> from pysemtools.io.ppymech.neksuite import pynekread
>>> comm = MPI.COMM_WORLD
>>> msh = msh_c(comm)
>>> fld = field_c(comm)
>>> pynekread(fname, comm, msh=msh, fld=fld)
- pysemtools.io.ppymech.pynekread_field(filename, comm, data_dtype=<class 'numpy.float64'>, key='')
Read a Nek file and return the requested field (Parallel).
Main function for reading Nek-type fld files.
- Parameters:
- filenamestr
The filename of the fld file.
- commComm
MPI communicator.
- data_dtypestr
The data type of the data in the file. (Default value = “float64”).
- keystr
The key of the field to read. Typically “vel”, “pres”, “temp” or “scal_1”, “scal_2”, etc.
- Returns:
- list
The data read from the file in a list.
- pysemtools.io.ppymech.pynekwrite(filename, comm, msh=None, fld=None, wdsz=4, istep=0, write_mesh=True)
Write an fld file from pysemtools data types (Parallel).
Main function for writing fld files.
- Parameters:
- filenamestr
The filename of the fld file.
- commComm
MPI communicator.
- mshMesh
The mesh object to write to the file. (Default value = None).
- fldField
The field object to write to the file. (Default value = None).
- wdszint
The word size of the data in the file. (Default value = 4).
- istepint
The time step of the data. (Default value = 0).
- write_meshbool
If True, write the mesh data. (Default value = True).
Examples
Assuming a mesh object and field object are already present in the namespace:
>>> from pysemtools.io.ppymech.neksuite import pynekwrite
>>> pynekwrite('field00001.fld', comm, msh=msh, fld=fld)
ADIOS2
IO operations using ADIOS2.
- class pysemtools.io.adios2.DataCompressor(comm, mesh_info=None, wrd_size=4)
Class used to write compressed data to disk.
This assumes that the input data has a msh object available.
- Parameters:
- commComm
MPI communicator.
- mesh_infodict
Dictionary with mesh information.
- wrd_sizeint
Word size to write data. (Default value = 4). Single precision is 4, double is 8.
Methods
read(comm[, fname, variable_names])Read data from disk using adios2.
write(comm[, fname, variable_names, data])Write data to disk using adios2.
Examples
This class is used to write data to disk. The data is compressed using bzip2.
>>> mesh_info = {"glb_nelv": msh.glb_nelv, "lxyz": msh.lxyz, "gdim": msh.gdim}
>>> dc = DataCompressor(comm, mesh_info=mesh_info, wrd_size=4)
- read(comm, fname='compress.bp', variable_names=None)
Read data from disk using adios2.
Read compressed data and internally decompress it.
- Parameters:
- commComm
MPI communicator.
- fnamestr
File name to read. (Default value = “compress.bp”).
- variable_nameslist
List of strings with the names of the variables to read. These names NEED to match the names adios2 used to write the data.
- Returns:
- list
List of numpy arrays with the data read. The ndarrays in the list are 1D.
Examples
This function is used to read data from disk. The data is compressed using bzip2.
>>> variable_names = ["x", "y", "z"]
>>> data = dc.read(comm, fname="compress.bp", variable_names=variable_names)
- write(comm, fname='compress.bp', variable_names=None, data=None)
Write data to disk using adios2.
Lossless compression with bzip2.
- Parameters:
- commComm
MPI communicator.
- fnamestr
File name to write. (Default value = “compress.bp”).
- variable_nameslist
List of strings with the names of the variables to write. This is very important, as adios2 will use these names to process the data.
- datalist
List of numpy arrays with the data to write. Corresponding in index to variable_names. The arrays must be 1d.
Examples
This function is used to write data to disk. The data is compressed using bzip2.
>>> variable_names = ["x", "y", "z"]
>>> data = [msh.x, msh.y, msh.z]
>>> dc.write(comm, fname="compress.bp", variable_names=variable_names, data=data)
- class pysemtools.io.adios2.DataStreamer(comm, from_nek=True)
Class used to communicate data between codes using adios2.
Use this to send and receive data.
The data is always transported as a 1D array, so it is necessary to reshape it.
- Parameters:
- commComm
MPI communicator.
- from_nekbool
Define whether the data being streamed comes from a nek-like code. (Default value = True).
- Attributes:
- glb_nelvint
Total number of elements in the global domain.
- lxyzint
Number of points per element.
- gdimint
Problem dimension.
- nelvint
Number of elements that this rank has.
Methods
finalize()Finalize the execution of the module.
recieve([fld, variable])Receive data from another code using adios2.
stream(fld)Send data to another code using adios2.
Examples
This type must be paired with another stream in the other executable/code. The codes will not start if the streams are not paired.
A full example of receiving data is shown below. It is possible to pair with other classes to, for example, write data to disk.
>>> ds = data_streamer_c(comm)
>>> x = get_fld_from_ndarray(ds.recieve(), ds.lx, ds.ly, ds.lz, ds.nelv)  # Receive and reshape x
>>> y = get_fld_from_ndarray(ds.recieve(), ds.lx, ds.ly, ds.lz, ds.nelv)  # Receive and reshape y
>>> z = get_fld_from_ndarray(ds.recieve(), ds.lx, ds.ly, ds.lz, ds.nelv)  # Receive and reshape z
>>> ds.finalize()
>>> msh = msh_c(comm, x=x, y=y, z=z)
>>> write_fld_file_from_list("field0.f00001", comm, msh, [x, y, z])
To send data to the other code, use the stream method.
>>> ds.stream(x.reshape(x.size))
- finalize()
Finalize the execution of the module.
Used to close reader and writer. The stream will end and the code will not be coupled.
- recieve(fld=None, variable='f2py_field')
Receive data from another code using adios2.
The data is always transported as a 1D array, so it is necessary to reshape it to the desired shape.
- Parameters:
- fldndarray
Buffer to contain the data. If None, it will be created. (Default value = None).
- variablestr
Name of the adios2 variable to be read. Neko uses this default name; change as needed. This name can remain the same during execution even if different quantities are being transported. (Default value = “f2py_field”).
- Returns:
- ndarray
Returns the field that was received.
- stream(fld)
Send data to another code using adios2.
Couple 2 code or executables.
- Parameters:
- fldndarray
Field to be sent. Must be a 1d array.
HDF
Classes to read or write data in the hdf format
- class pysemtools.io.hdf.HDF5File(comm: Comm, fname: str, mode: str, parallel: bool)
Class to write and read hdf5 files in parallel using h5py.
Open an hdf5 file based on inputs.
- Parameters:
- commMPI.Comm
MPI communicator.
- fnamestr
Name of the hdf5 file to read or write.
- modestr
Mode to open the file. Should be “r” for reading or “w” for writing.
- parallelbool
Whether to use parallel I/O or not.
Methods
close([clean])Close the hdf5 file object
open(fname, mode, parallel)Open an hdf5 file based on inputs.
read_dataset(dataset_name[, dtype, ...])Read a dataset from the hdf5 file object
read_slices(dataset_name[, dtype])Read the slices hyperslabs from the file
set_active_group(group_name)Set the active group to read or write data from.
set_read_slices_external(global_shape, slices)Set the slices that should be read from the file based on external input.
set_read_slices_linear_lb(global_shape, ...)Set the slices that should be read from the file.
set_write_slices(local_shape, distributed_axis)Set the slices that should be written to the file.
write_dataset(dataset_name, data[, ...])Write a dataset to the hdf5 file object
write_slices(dataset_name, data[, shape_in_file])Write the hyperslab to the file.
- close(clean: bool = True)
Close the hdf5 file object
- Parameters:
- cleanbool
Whether to clean the attributes that are assigned when opening a file. This is useful if the file object will be reused to open another file after closing the current one. Default is True.
- open(fname: str, mode: str, parallel: bool)
Open an hdf5 file based on inputs.
This can be used to open a new file after closing the previous one.
- Parameters:
- fnamestr
Name of the hdf5 file to read or write.
- modestr
Mode to open the file. Should be “r” for reading or “w” for writing.
- parallelbool
Whether to use parallel I/O or not. If True, the file will be opened using the MPI-IO driver. If False, the file will be opened using the default driver.
- read_dataset(dataset_name: str, dtype: ~numpy.dtype = <class 'numpy.float64'>, distributed_axis: int | None = None, slices: list | None = None, as_array_list_in_file: bool = False, ignore_metadata: bool = False)
Read a dataset from the hdf5 file object
- Parameters:
- dataset_namestr
Name of the dataset to read. Can include the group path, e.g. “/group1/group2/dataset”.
- dtypenp.dtype
Data type to read the dataset in. Default is np.double.
- distributed_axisint
Axis along which the data is distributed in parallel. This is required for parallel reading. Default is None.
- sliceslist
Optional. List of slices to read from the dataset, in case they are known beforehand.
- as_array_list_in_filebool
Optional. Default is False. Whether the data is stored as an array list in the file. This is useful if the data originally had a different shape but was flattened to 1D before writing. The shape attribute stored in the file is then used for the partitioning, while accounting for the fact that the data is stored as a 1D array so it can be read properly.
- ignore_metadatabool
Optional. Default is False. Force reading the data ignoring any shape metadata. This will just read the arrays as stored and will not try to assume an original shape.
- Returns:
- local_datanp.ndarray
Data read from the file. This will be a local array with the shape determined by the global shape of the dataset and the parallel distribution. If slices are provided, the shape will be determined by the slices.
- read_slices(dataset_name: str, dtype: ~numpy.dtype = <class 'numpy.float64'>)
Read the slices hyperslabs from the file
- Parameters:
- dataset_namestr
Name of the dataset to read. Can include the group path, e.g. “/group1/group2/dataset”.
- dtypenp.dtype
Data type to read the dataset in. Default is np.double.
- Returns:
- local_datanp.ndarray
Data read from the file. This will be a local array with the shape determined by the global shape of the dataset and the parallel distribution. If slices are provided, the shape will be determined by the slices.
- set_active_group(group_name: str)
Set the active group to read or write data from.
This is useful to avoid having to specify the group every time a dataset is read or written.
- Parameters:
- group_namestr
Name of the group to set as active. Can include the group path, e.g. “/group1/group2”. If the group does not exist, it will be created if the file is opened in write mode, otherwise an error will be raised.
- set_read_slices_external(global_shape: tuple, slices: list)
Set the slices that should be read from the file based on external input.
The slices need to be precomputed in this case.
- Parameters:
- global_shapetuple
Shape of the global array to be read.
- sliceslist
List of slices to read from the data set.
- set_read_slices_linear_lb(global_shape: tuple, distributed_axis: int, explicit_strides: bool = False, shape_in_file: list | None = None)
Set the slices that should be read from the file.
Data is distributed in a linear load balanced way.
- Parameters:
- global_shapetuple
Shape of the global array to be read. This is required to determine the local shape and the slices to read from the file.
- distributed_axisint
Axis along which the data is distributed in parallel. This is required to determine the local shape and the slices to read from the file.
- explicit_stridesbool
Whether to use explicit strides to read the data. This is useful if the data is stored as 1D in the file but originally had a different shape.
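The linear load-balanced distribution along one axis can be sketched as follows (a hypothetical helper for illustration, not the internal pysemtools routine): rank r of P gets N // P entries of the distributed axis, plus one extra if r < N % P, and the resulting slice tuple addresses that rank's hyperslab.

```python
import numpy as np

def linear_lb_slices(global_shape, distributed_axis, rank, size):
    """Slice tuple selecting this rank's share of the distributed axis."""
    n = global_shape[distributed_axis]
    base, rem = divmod(n, size)
    start = rank * base + min(rank, rem)
    stop = start + base + (1 if rank < rem else 0)
    slices = [slice(None)] * len(global_shape)
    slices[distributed_axis] = slice(start, stop)
    return tuple(slices)

data = np.arange(10 * 4).reshape(10, 4)  # pretend this is the global array
chunks = [data[linear_lb_slices(data.shape, 0, r, 3)] for r in range(3)]
print([c.shape[0] for c in chunks])  # rows per "rank": [4, 3, 3]
```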
- set_write_slices(local_shape: tuple, distributed_axis: int, extra_global_entries: list[int] | None = None)
Set the slices that should be written to the file.
Obtain global shape from the local one
- Parameters:
- local_shapetuple
Shape of the local array to be written. This is required to determine the global shape and the slices to write to the file.
- distributed_axisint
Axis along which the data is distributed in parallel.
- extra_global_entrieslist[int], optional
List of extra entries to add to the global shape of the dataset. This is useful if the ranks are writing a certain amount of data but the global array should be bigger than what they collectively write. Default is None.
- write_dataset(dataset_name: str, data: ndarray, distributed_axis: int | None = None, extra_global_entries: list[int] | None = None, shape_in_ram: tuple | None = None)
Write a dataset to the hdf5 file object
- Parameters:
- dataset_namestr
Name of the dataset to write. Can include the group path, e.g. “/group1/group2/dataset”.
- datanp.ndarray
Data to write to the file.
- distributed_axisint
Axis along which the data is distributed in parallel. This is required for parallel writing. Default is None.
- extra_global_entrieslist[int]
Optional. List of extra entries to add to the global shape of the dataset. This is useful if the ranks are writing a certain amount of data but the global array should be bigger than what they collectively write.
- shape_in_ramtuple
Optional. Shape of the data in RAM. This is useful if the data is stored in a different shape than it originally had, for example in a 1D array. This shape is stored in the file in the attribute “shape” and can be used to reshape the data when reading it.
- write_slices(dataset_name: str, data: ndarray, shape_in_file: tuple | None = None)
Write the hyperslab to the file.
Perform the write operations
- Parameters:
- dataset_namestr
Name of the dataset to write. Can include the group path, e.g. “/group1/group2/dataset”.
- datanp.ndarray
Data to write to the file. This should have the same shape as the local shape determined by the set_write_slices method.
- shape_in_filetuple, optional
Shape of the data to be stored in the file. This is useful if the data is stored in a different shape in the file than it is in RAM.
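The set_write_slices / write_slices pattern corresponds to plain h5py hyperslab writes. A serial sketch of the idea (simulating two "ranks" in a loop, each writing a disjoint slice of one global dataset; the file name is arbitrary):

```python
import os
import tempfile

import h5py
import numpy as np

fname = os.path.join(tempfile.mkdtemp(), "slabs.h5")
global_shape = (4, 3)

with h5py.File(fname, "w") as f:
    dset = f.create_dataset("data", shape=global_shape, dtype="f8")
    # Each "rank" writes only its own hyperslab along axis 0.
    for rank, rows in [(0, slice(0, 2)), (1, slice(2, 4))]:
        local = np.full((rows.stop - rows.start, 3), float(rank))
        dset[rows, :] = local

with h5py.File(fname, "r") as f:
    print(f["data"][:])  # first two rows hold 0.0, last two rows hold 1.0
```

In the parallel case each MPI rank performs one such write on its own slice, which is exactly the bookkeeping set_write_slices does for you.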
- class pysemtools.io.hdf.VTKHDFFile(comm: Comm, fname: str, mode: str, parallel: bool)
Class to write and read vtkhdf files in parallel using h5py. Open an hdf5 file based on inputs.
- Parameters:
- commMPI.Comm
MPI communicator.
- fnamestr
Name of the hdf5 file to read or write.
- modestr
Mode to open the file. Should be “r” for reading or “w” for writing.
- parallelbool
Whether to use parallel I/O or not.
Methods
close([clean])Close the hdf5 file object
link_to_existing_mesh(mesh_name)Link to an existing mesh
open(fname, mode, parallel)Open an hdf5 file based on inputs.
read_dataset(dataset_name[, dtype, ...])Read a dataset from the hdf5 file object
read_mesh_data([dtype, distributed_axis])Read the mesh data from the hdf5 file
read_point_data(dataset_name[, dtype, ...])Read point data from the hdf5 file
read_slices(dataset_name[, dtype])Read the slices hyperslabs from the file
set_active_group(group_name)Set the active group to read or write data from.
set_read_slices_external(global_shape, slices)Set the slices that should be read from the file based on external input.
set_read_slices_linear_lb(global_shape, ...)Set the slices that should be read from the file.
set_write_slices(local_shape, distributed_axis)Set the slices that should be written to the file.
write_dataset(dataset_name, data[, ...])Write a dataset to the hdf5 file object
write_mesh_data(x, y, z[, distributed_axis])Write the mesh data to the hdf5 file
write_point_data(dataset_name, data[, ...])Write point data to the hdf5 file
write_slices(dataset_name, data[, shape_in_file])Write the hyperslab to the file.
- close(clean: bool = True)
Close the hdf5 file object
- Parameters:
- cleanbool
Whether to clean up the file after closing. This will delete the file from disk. Should only be used for testing.
- link_to_existing_mesh(mesh_name: str)
Link to an existing mesh
Avoid rewriting mesh data if not necessary. It can be quite costly in storage.
- Parameters:
- mesh_namestr
Name of the hdf5 file to link to.
- read_mesh_data(dtype: ~numpy.dtype = <class 'numpy.float64'>, distributed_axis: int = 0)
Read the mesh data from the hdf5 file
- Parameters:
- dtypenp.dtype
Data type to read the mesh data as. Should be a floating point type.
- distributed_axisint
Axis along which the data is distributed in parallel. Should be 0 for now.
- Returns:
- xnp.ndarray
The x coordinates of the mesh points.
- ynp.ndarray
The y coordinates of the mesh points.
- znp.ndarray
The z coordinates of the mesh points.
- read_point_data(dataset_name: str, dtype: ~numpy.dtype = <class 'numpy.float64'>, distributed_axis: int = 0)
Read point data from the hdf5 file
- Parameters:
- dataset_namestr
Name of the dataset to read. This should be the name of the dataset in the hdf5 file.
- dtypenp.dtype
Data type to read the dataset as. Should be a floating point type.
- distributed_axisint
Axis along which the data is distributed in parallel. Should be 0 for now.
- Returns:
- np.ndarray
The point data read from the hdf5 file. Will have the same shape as the mesh points.
- write_mesh_data(x: ndarray, y: ndarray, z: ndarray, distributed_axis: int = 0)
Write the mesh data to the hdf5 file
- Parameters:
- xnp.ndarray
The x coordinates of the mesh points.
- ynp.ndarray
The y coordinates of the mesh points.
- znp.ndarray
The z coordinates of the mesh points.
- distributed_axisint
Axis along which the data is distributed in parallel. Should be 0 for now.
- write_point_data(dataset_name: str, data: ndarray, distributed_axis: int = 0)
Write point data to the hdf5 file
- Parameters:
- dataset_namestr
Name of the dataset to write. This will be used as the name of the dataset in the hdf5 file.
- datanp.ndarray
Data to write. Should have the same number of points as the mesh.
- distributed_axisint
Axis along which the data is distributed in parallel. Should be 0 for now.
Catalyst2
Classes to interface with paraview - catalyst
- class pysemtools.io.catalyst.CatalystSession(comm: Comm, pipeline: str, channel: str, implementation_name=None, implementation_path: str | None = None)
Class to interface with catalyst
Interface with catalyst to set up a session and load the pipeline.
- Parameters:
- commMPI.Comm
- pipelinestr
Path to the saved ParaView Catalyst state file (e.g. “pipeline.py”).
- channelstr
Name of the catalyst channel to use (must match the registrationName in your saved state script, e.g. “rbc00001.vtkhdf”).
- implementation_namestr
Optional name of the catalyst implementation to use. Also set with env variable CATALYST_IMPLEMENTATION_NAME e.g. export CATALYST_IMPLEMENTATION_NAME=paraview
- implementation_pathstr
Optional path to the catalyst implementation. Also set with env variable CATALYST_IMPLEMENTATION_PATH e.g. export CATALYST_IMPLEMENTATION_PATH=/path/to/paraview/lib/catalyst
Methods
execute([timestep, time_value])Execute the catalyst pipeline for the current mesh and fields
finalize()Finalize the catalyst session
set_field(fields)Set the fields for the catalyst session / conduit node
set_mesh(x, y, z[, cell_type])Set the mesh for the catalyst session / conduit node
- execute(timestep=0, time_value=0.0)
Execute the catalyst pipeline for the current mesh and fields
- Parameters:
- timestep: int
Current timestep to set in the catalyst state.
- time_value: float
Current time value to set in the catalyst state.
- finalize()
Finalize the catalyst session
- set_field(fields: dict[str, ndarray])
Set the fields for the catalyst session / conduit node
- Parameters:
- fields: dict[str, np.ndarray]
Dictionary of field name to field values.
- set_mesh(x: ndarray, y: ndarray, z: ndarray, cell_type: str = 'hex')
Set the mesh for the catalyst session / conduit node
- Parameters:
- x: np.ndarray
array with local x coordinates on this rank.
- y: np.ndarray
array with local y coordinates on this rank.
- z: np.ndarray
array with local z coordinates on this rank.
- cell_type: str
cell type, currently only “hex” is supported.
wrappers
Wrappers to ease IO
- pysemtools.io.wrappers.partition_read_data(comm, fname: str | None = None, distributed_axis: int = 0)
Generate partition information for hdf5 files. Useful when reading or writing multiple files with the same partitioning, so that the read/write functions do not need to recompute the partition every time.
- Parameters:
- commMPI.Comm
The MPI communicator
- fnamestr
The name of the file to read
- distributed_axisint, optional
The axis along which the data is distributed, by default 0. This is used to determine how many elements to read from the file in parallel.
- Returns:
- list
A list of slices corresponding to the local data to be read by each process
- pysemtools.io.wrappers.read_data(comm, fname: str, keys: list[str], parallel_io: bool = False, dtype=<class 'numpy.float32'>, distributed_axis: int = 0, slices: list | None = None)
Read data from a file and return a dictionary with the names of the files and keys
- Parameters:
- comm, MPI.Comm
The MPI communicator
- fnamestr
The name of the file to read
- keyslist[str]
The keys to read from the file
- parallel_iobool, optional
If True, read the file in parallel, by default False. This is aimed at hdf5 files, and it currently does not work when set to True.
- dtypenp.dtype, optional
The data type of the data to read, by default np.single
- distributed_axisint, optional
The axis along which the data is distributed, by default 0. This is used to determine how many elements to read from the file in parallel.
- sliceslist, optional
A list of slices to read from the file. If None, the local data will be read based on the distributed_axis and the communicator. If provided, it should match the number of dimensions in the data. Note that if you are reading in parallel, the slices should be provided in such a way that they correspond to the local data on each process, otherwise the data will be replicated. If in doubt, do not provide slices, and the local data will be determined automatically.
- Returns:
- dict
A dictionary with the keys and the data read from the file
- pysemtools.io.wrappers.write_data(comm, fname: str, data_dict: dict[str, ~numpy.ndarray], parallel_io: bool = False, dtype=<class 'numpy.float32'>, msh: ~pysemtools.datatypes.msh.Mesh | list[~numpy.ndarray] | None = None, write_mesh: bool = False, distributed_axis: int = 0, uniform_shape: bool = False)
Write data to a file
- Parameters:
- comm, MPI.Comm
The MPI communicator
- fnamestr
The name of the file to write
- data_dictdict
The data to write to the file
- parallel_iobool, optional
If True, write the file in parallel, by default False. This is aimed at hdf5 files, and it currently does not work when set to True.
- dtypenp.dtype, optional
- mshMesh, optional
The mesh object to write to a fld file, by default None
- write_meshbool, optional
Only valid for writing fld files
- distributed_axisint, optional
The axis along which the data is distributed, by default 0
- uniform_shapebool, optional
If True, the global shape of the data is assumed to be uniform, by default False
monitoring
Monitoring tools for pySEMTools
- class pysemtools.monitoring.Logger(level=None, comm=None, module_name=None)
Class that takes charge of logging messages
This class takes care of logging only in one rank and setting different logging levels.
Generally, the levels are set using the environment variables PYSEMTOOLS_DEBUG and PYSEMTOOLS_HIDE_LOG.
- Parameters:
- levelint, optional
Logging level. The default is None, which sets it to logging.INFO.
- commMPI.Comm
MPI communicator.
- module_namestr, optional
Name of the module that is using the logger. The default is None.
- Attributes:
- loglogging.Logger
Logger object that handles the logging
- commMPI.Comm
MPI communicator
- sync_timedict
Dictionary to store the times for sync_tic/sync_toc methods
- timefloat
Variable to store the time for tic/toc methods
Methods
write(level, message)
Method that writes messages in the log
tic()
Store the current time.
toc()
Write elapsed time since the last call to tic.
sync_tic(id=0)
Store the current time.
sync_toc(id=0, message=None, time_message=”Elapsed time: “)
Write elapsed time since the last call to tic.
- sync_tic(id=0)
Store the current time in the attribute sync_time with key id.
- Returns:
- None.
- sync_toc(id=0, message=None, time_message='Elapsed time: ', level='info')
Write elapsed time since the last call to tic for the given id in the sync_time attribute.
- tic()
Store the current time.
- Returns:
- None.
- toc(message=None, time_message='Elapsed time: ', level='info')
Write elapsed time since the last call to tic.
- write(level, message)
Writes messages in the log
- Parameters:
- levelstr
Level of the message. Possible values are: “debug_all”, “debug”, “info”, “info_all”, “warning”, “error”, “critical”.
- messagestr
Message to be logged.
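The idea of logging only in one rank can be sketched with the standard library alone. This is a conceptual illustration, not the pysemtools Logger implementation; the helper name and formatting are hypothetical:

```python
import io
import logging

def make_rank_logger(rank: int, module_name: str = "demo"):
    """Minimal sketch of rank-aware logging: only rank 0 emits INFO
    messages, so parallel runs produce readable output.
    (Conceptual only; the actual pysemtools Logger differs.)"""
    stream = io.StringIO()
    log = logging.getLogger(f"{module_name}.rank{rank}")
    # Non-root ranks are raised to CRITICAL, silencing normal messages.
    log.setLevel(logging.INFO if rank == 0 else logging.CRITICAL)
    log.propagate = False
    log.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))
    log.addHandler(handler)
    return log, stream

log0, out0 = make_rank_logger(rank=0)
log1, out1 = make_rank_logger(rank=1)
log0.info("mesh read")
log1.info("mesh read")  # suppressed: not rank 0
```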
rom
Set of tools to perform reduced order modeling tasks. Particularly, proper orthogonal decomposition (POD)
- class pysemtools.rom.IoHelp(comm, number_of_fields=1, batch_size=1, field_size=1, field_data_type=<class 'numpy.float64'>, mass_matrix_data_type=<class 'numpy.float64'>, module_name='io_helper')
Class used to help with IO buffering
This class contains buffers to be used when carrying out, for example, a POD.
- Parameters:
- commMPI communicator
MPI communicator object.
- number_of_fieldsint
Number of fields to be stored in the buffer.
- batch_sizeint
Size of the buffer.
- field_sizeint
Size of each field.
- field_data_typedata-type, optional
Data type of the fields. The default is np.double.
- mass_matrix_data_typedata-type, optional
Data type of the mass matrix. The default is np.double.
- module_namestr, optional
Name of the module for logging purposes. The default is “io_helper”.
Methods
copy_fieldlist_to_xi([field_list])Copy a field list into the buffer xi position 0
load_buffer([scale_snapshot])Function to load snapshot into the allocated buffer.
split_narray_to_1dfields(array)Split a snapshot into a set of fields.
- copy_fieldlist_to_xi(field_list=None)
Copy a field list into the buffer xi position 0
This is used to turn multi-dimensional data into one big column vector that serves as a snapshot for the POD.
- Parameters:
- field_listlist of np.ndarray
List of fields to be copied into xi.
- Returns:
- None
- load_buffer(scale_snapshot=True)
Function to load snapshot into the allocated buffer.
This transfers the data from xi into the buffer at the current buffer index.
If the buffer is full, it sets the flag to update from buffer to True.
This is the buffer that the POD class uses to compute.
- Parameters:
- scale_snapshotbool, optional
If True, the snapshot is scaled with the mass matrix before being loaded into the buffer. The default is True.
- Returns:
- None
- split_narray_to_1dfields(array)
Split a snapshot into a set of fields.
This is essentially the inverse of copy_fieldlist_to_xi: a big column vector is split back into a list of fields.
- Parameters:
- arraynp.ndarray
Array to be split into fields.
- Returns:
- field_list1dlist of np.ndarray
List of fields obtained from splitting the array.
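The pack/unpack pattern behind copy_fieldlist_to_xi and split_narray_to_1dfields can be illustrated with plain numpy. The helper names below are hypothetical stand-ins for the buffer logic, not library code:

```python
import numpy as np

def fields_to_column(field_list):
    """Flatten each field and stack them into one column vector,
    as copy_fieldlist_to_xi does conceptually (sketch only)."""
    return np.concatenate([f.reshape(-1) for f in field_list])[:, None]

def column_to_fields(column, shapes):
    """Inverse operation: cut the column vector back into fields with
    the given shapes (cf. split_narray_to_1dfields)."""
    flat = column.reshape(-1)
    fields, start = [], 0
    for shape in shapes:
        n = int(np.prod(shape))
        fields.append(flat[start:start + n].reshape(shape))
        start += n
    return fields

u = np.arange(12.0).reshape(3, 4)
v = np.arange(6.0).reshape(2, 3)
xi = fields_to_column([u, v])          # one (18, 1) snapshot column
u2, v2 = column_to_fields(xi, [u.shape, v.shape])
```

The round trip recovers the original fields exactly, which is the property the IO helper relies on.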
- class pysemtools.rom.POD(comm, number_of_modes_to_update=1, global_updates=True, auto_expand=False, auto_expand_from_these_modes=1, bckend='numpy')
Class that wraps the SVD to facilitate the use of the POD
This class performs the POD in parallel or locally, and allows for incremental updates of the modes.
This wraps the SVD class to provide a more user-friendly interface for performing POD.
- Parameters:
- commMPI communicator
MPI communicator object.
- number_of_modes_to_updateint
Number of modes to update during each update step. The default is 1.
- global_updatesbool, optional
If True, perform global updates. If False, perform local updates. The default is True.
- auto_expandbool, optional
If True, automatically expand the number of modes when the orthogonality criterion is not met. The default is False.
- auto_expand_from_these_modesint, optional
Number of modes from which to start the automatic expansion. The default is 1.
- bckendstr, optional
Backend to be used for the math operations. The default is “numpy”. “torch” can be used if PyTorch is installed.
Methods
check_snapshot_orthogonality(comm[, xi])Check the level of orthogonality of the new snapshot with the current basis
rotate_local_modes_to_global(comm)Do a rotation of the current modes into a global basis
scale_modes(comm[, bm1sqrt, op])Scale the current modes with the given mass matrix and the provided operation (div or mult)
update(comm[, buff])Update POD modes from a batch of snapshots in buff
- Returns:
- None
- check_snapshot_orthogonality(comm, xi=None)
Check the level of orthogonality of the new snapshot with the current basis
The data is stored in the running_ra attribute. This is valid only if auto expansion is activated. In that case, if the orthogonality ratio is above the minimun_orthogonality_ratio attribute and the current number of modes is below the set k, the number of modes k is increased by one. The minimum orthogonality ratio is set to 0.99 by default.
- Parameters:
- commMPI communicator
MPI communicator object.
- xinp.ndarray
New snapshot to be checked.
- Returns:
- None
- rotate_local_modes_to_global(comm)
Do a rotation of the current modes into a global basis
This is needed if the POD is initially performed locally. The rotation is done only when required.
- Parameters:
- commMPI communicator
MPI communicator object.
- Returns:
- None
- scale_modes(comm, bm1sqrt=None, op='div')
Scale the current modes with the given mass matrix and the provided operation (div or mult)
- Parameters:
- commMPI communicator
MPI communicator object.
- bm1sqrtnp.ndarray
Mass matrix to scale the modes.
- opstr, optional
Operation to be performed. “div” for division, “mult” for multiplication. The default is “div”.
- Returns:
- None
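The scaling that scale_modes applies can be sketched element-wise with numpy. The function below is a hypothetical reimplementation for illustration, assuming bm1sqrt is a per-point weight vector that broadcasts over the mode columns:

```python
import numpy as np

def scale_modes_sketch(modes, bm1sqrt, op="div"):
    """Element-wise scaling of POD modes with the square root of the
    mass matrix (sketch of scale_modes, not library code).
    'mult' applies the mass weighting, 'div' removes it."""
    if op == "div":
        return modes / bm1sqrt[:, None]
    if op == "mult":
        return modes * bm1sqrt[:, None]
    raise ValueError("op must be 'div' or 'mult'")

rng = np.random.default_rng(0)
modes = rng.standard_normal((8, 3))          # 8 points, 3 modes
bm1sqrt = np.sqrt(rng.uniform(0.5, 2.0, 8))  # sqrt of lumped mass matrix
unweighted = scale_modes_sketch(modes, bm1sqrt, op="div")
restored = scale_modes_sketch(unweighted, bm1sqrt, op="mult")
```

Because the two operations are exact inverses, dividing and then multiplying by the same weights recovers the original modes.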
- update(comm, buff=None)
Update POD modes from a batch of snapshots in buff
- Parameters:
- commMPI communicator
MPI communicator object.
- buffnp.ndarray
Buffer containing the new snapshots to update the modes.
- Returns:
- None
- class pysemtools.rom.SVD(logger, bckend='numpy')
Class used to obtain parallel and streaming SVD results
This is the main class of the ROM module; POD wraps around it.
- Parameters:
- loggerLogger
Logger object to be used for logging.
- bckendstr, optional
Backend to be used for the math operations. The default is “numpy”. “torch” can be used if PyTorch is installed.
Methods
gbl_svd(xi, comm)Perform a global SVD that contains all necessary rotations for global modes
gbl_svd_numpy(xi, comm)Perform a global SVD using numpy, including necessary rotations for global modes using numpy
gbl_svd_torch(xi, comm)Perform a global SVD using PyTorch, including necessary rotations for global modes.
gbl_update(u_1t, d_1t, vt_1t, xi, k, comm)Method to update the global svds from a batch of data
gbl_update_numpy(u_1t, d_1t, vt_1t, xi, k, comm)Method to update the global svds from a batch of data using numpy
gbl_update_torch(u_1t, d_1t, vt_1t, xi, k, comm)Method to update the global svds from a batch of data using PyTorch
lcl_to_gbl_svd(uii, dii, vtii, k_set, comm)Perform rotations to obtain global modes from local modes.
lcl_update(u_1t, d_1t, vt_1t, xi, k)Method to update the local svds from a batch of data
- Returns:
- None
- gbl_svd(xi, comm)
Perform a global SVD that contains all necessary rotations for global modes
- Parameters:
- xinp.ndarray
Data to perform the svd on.
- commMPI communicator
MPI communicator object.
- Returns:
- u_localnp.ndarray
Local left singular vectors after global rotation. Here “local” means that the modes are distributed across ranks; the modes themselves are global.
- dynp.ndarray
Global singular values.
- vtynp.ndarray
Global right singular vectors.
- gbl_svd_numpy(xi, comm)
Perform a global SVD using numpy, including necessary rotations for global modes using numpy
- Parameters:
- xinp.ndarray
Data to perform the svd on.
- commMPI communicator
MPI communicator object.
- Returns:
- u_localnp.ndarray
Local left singular vectors after global rotation. Here “local” means that the modes are distributed across ranks; the modes themselves are global.
- dynp.ndarray
Global singular values.
- vtynp.ndarray
Global right singular vectors.
- gbl_svd_torch(xi, comm)
Perform a global SVD using PyTorch, including necessary rotations for global modes.
- Parameters:
- xinp.ndarray or torch.Tensor
Data to perform the svd on.
- commMPI communicator
MPI communicator object.
- Returns:
- u_localtorch.Tensor
Local left singular vectors after global rotation. Here “local” means that the modes are distributed across ranks; the modes themselves are global.
- dytorch.Tensor
Global singular values.
- vtytorch.Tensor
Global right singular vectors.
- gbl_update(u_1t, d_1t, vt_1t, xi, k, comm)
Method to update the global svds from a batch of data
- Parameters:
- u_1tnp.ndarray
Left singular vectors from previous update.
- d_1tnp.ndarray
Singular values from previous update.
- vt_1tnp.ndarray
Right singular vectors from previous update.
- xinp.ndarray
New data to be used for the update.
- kint
Number of modes to be kept.
- commMPI communicator
MPI communicator object.
- Returns:
- u_1tnp.ndarray
Updated left singular vectors.
- d_1tnp.ndarray
Updated singular values.
- vt_1tnp.ndarray
Updated right singular vectors.
Notes
No need to delete xi, as this comes from a buffer that is pre allocated
- gbl_update_numpy(u_1t, d_1t, vt_1t, xi, k, comm)
Method to update the global svds from a batch of data using numpy
- Parameters:
- u_1tnp.ndarray
Left singular vectors from previous update.
- d_1tnp.ndarray
Singular values from previous update.
- vt_1tnp.ndarray
Right singular vectors.
- xinp.ndarray
New data to be used for the update.
- kint
Number of modes to be kept.
- commMPI communicator
MPI communicator object.
- Returns:
- u_1tnp.ndarray
Updated left singular vectors.
- d_1tnp.ndarray
Updated singular values.
- vt_1tnp.ndarray
Updated right singular vectors.
Notes
No need to delete xi, as this comes from a buffer that is pre allocated
- gbl_update_torch(u_1t, d_1t, vt_1t, xi, k, comm)
Method to update the global svds from a batch of data using PyTorch
- Parameters:
- u_1ttorch.Tensor
Left singular vectors from previous update.
- d_1ttorch.Tensor
Singular values from previous update.
- vt_1ttorch.Tensor
Right singular vectors.
- xinp.ndarray or torch.Tensor
New data to be used for the update.
- kint
Number of modes to be kept.
- commMPI communicator
MPI communicator object.
- Returns:
- u_1ttorch.Tensor
Updated left singular vectors.
- d_1ttorch.Tensor
Updated singular values.
- vt_1ttorch.Tensor
Updated right singular vectors.
Notes
No need to delete xi, as this comes from a buffer that is pre allocated
- lcl_to_gbl_svd(uii, dii, vtii, k_set, comm)
Perform rotations to obtain global modes from local modes.
- Parameters:
- uiinp.ndarray
Local left singular vectors.
- diinp.ndarray
Local singular values.
- vtiinp.ndarray
Local right singular vectors.
- k_setint
Number of modes to keep globally. This is different from k because, in local updates, each rank can have a different k; together they provide ALL global modes.
- commMPI communicator
MPI communicator object.
- Returns:
- ui_globalnp.ndarray
Global left singular vectors.
- dynp.ndarray
Global singular values.
- vtynp.ndarray
Global right singular vectors.
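The rotation performed by lcl_to_gbl_svd follows the standard identity for a row-partitioned SVD. A serial numpy sketch with two simulated “ranks” (assuming each rank owns a row block of the snapshot matrix; this is an illustration of the algorithm, not the library's code):

```python
import numpy as np

rng = np.random.default_rng(1)
# Two "ranks" each own a row block of the global snapshot matrix.
a1 = rng.standard_normal((6, 4))
a2 = rng.standard_normal((5, 4))

# Step 1: each rank computes its local SVD independently.
u1, d1, vt1 = np.linalg.svd(a1, full_matrices=False)
u2, d2, vt2 = np.linalg.svd(a2, full_matrices=False)

# Step 2: stack the small matrices D_i @ Vt_i (this is what would be
# communicated between ranks) and take their SVD to obtain the global
# singular values and right singular vectors.
stacked = np.vstack([np.diag(d1) @ vt1, np.diag(d2) @ vt2])
uy, dy, vty = np.linalg.svd(stacked, full_matrices=False)

# Step 3: each rank rotates its local left vectors into the global basis.
r1 = d1.size
ug1 = u1 @ uy[:r1, :]   # rows of the global modes owned by "rank" 1
ug2 = u2 @ uy[r1:, :]   # rows of the global modes owned by "rank" 2
```

Only the small D_i Vt_i factors need to be gathered, so the communication volume is independent of the number of mesh points each rank holds.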
- lcl_update(u_1t, d_1t, vt_1t, xi, k)
Method to update the local svds from a batch of data
This method does not require communication.
- Parameters:
- u_1tnp.ndarray
Left singular vectors from previous update.
- d_1tnp.ndarray
Singular values from previous update.
- vt_1tnp.ndarray
Right singular vectors from previous update.
- xinp.ndarray
New data to be used for the update.
- kint
Number of modes to be kept.
- Returns:
- u_1tnp.ndarray
Updated left singular vectors.
- d_1tnp.ndarray
Updated singular values.
- vt_1tnp.ndarray
Updated right singular vectors.
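The communication-free update in lcl_update follows the classic incremental (streaming) SVD scheme: project the new batch onto the current basis, orthogonalize the residual, rotate via the SVD of a small core matrix, and truncate back to k modes. A numpy sketch of that scheme (a hypothetical reimplementation, not the library's exact code):

```python
import numpy as np

def incremental_svd_update(u, d, vt, xi, k):
    """One streaming-SVD update: extend the factorization
    U @ diag(d) @ Vt with a new batch of columns xi, then truncate to
    k modes. Sketch of the algorithm behind lcl_update."""
    m_proj = u.T @ xi                       # projection onto current modes
    resid = xi - u @ m_proj                 # part of xi outside the basis
    q, r = np.linalg.qr(resid)              # orthonormal complement
    k_old, p = d.size, xi.shape[1]
    # Small core matrix whose SVD rotates the old and new directions.
    core = np.block([[np.diag(d), m_proj],
                     [np.zeros((q.shape[1], k_old)), r]])
    uc, dc, vtc = np.linalg.svd(core, full_matrices=False)
    u_new = np.hstack([u, q]) @ uc[:, :k]
    d_new = dc[:k]
    v_aug = np.block([[vt, np.zeros((k_old, p))],
                      [np.zeros((p, vt.shape[1])), np.eye(p)]])
    vt_new = vtc[:k, :] @ v_aug
    return u_new, d_new, vt_new

rng = np.random.default_rng(2)
x1 = rng.standard_normal((6, 3))
x2 = rng.standard_normal((6, 3))
u, d, vt = np.linalg.svd(x1, full_matrices=False)
# With k equal to the full rank of [x1, x2] (here 6), the update is exact;
# in practice k is much smaller and the update is a truncation.
u, d, vt = incremental_svd_update(u, d, vt, x2, k=6)
```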