API
PySEMTools is a collection of modules that together provide a comprehensive toolkit for working with spectral element method (SEM) data.
The presently included submodules are:
- Command line interface functionalities available in pySEMTools.
- Communication utilities for parallel processing.
- Data types to hold and operate with SEM data.
- High-order interpolation routines for SEM data.
- Classes to read or write data from Nek style codes.
- IO operations using ADIOS2.
- Classes to read or write data in the HDF format.
- Classes to interface with ParaView Catalyst.
- Wrappers to ease IO.
- Monitoring tools for pySEMTools.
- Set of tools to perform reduced order modeling tasks.
The compression module inside the package is still considered experimental and is prone to change; its documentation is therefore not yet included here.
The post-processing module is in a similar state. It builds on the other tools to perform more specific post-processing tasks, but it is still under development and its API is prone to change, so its documentation is reserved for a future release.
The full API documentation for all the classes and functions is provided below.
cli
Command line interface functionalities available in pySEMTools.
- pysemtools.cli.extract_subdomain()
Extract a subdomain from an existing file.
This script extracts a box-shaped subdomain from a given field file and saves it to a new file.
- Parameters:
- --input_file: Path to the input field file.
- --output_file: Path to the output field file where the extracted subdomain will be saved.
- --bounds: Comma-separated list of 6 floats defining the subdomain bounds: xmin,xmax,ymin,ymax,zmin,zmax.
- --fields: (Optional) Comma-separated list of field names to extract. If not provided, all fields will be extracted.
Examples
To use this script, run it with MPI using the following command:
>>> mpirun -n <num_processes> python pysemtools_extract_subdomain --input_file <input.fld> --output_file <output.fld> --bounds=<xmin,xmax,ymin,ymax,zmin,zmax> [--fields=<field1,field2,...>]
Replace the placeholders with actual values (removing the angle brackets).
- pysemtools.cli.index_files()
Create a JSON file that contains an index of files in a folder.
This script creates an index of files in a specified folder and saves it as a JSON file. The index can include file contents and time intervals based on user preferences.
- Parameters:
- --folder_path: Path to the folder containing files to index.
- --output_folder: Path to the output folder.
- --file_type: (Optional) List of file types to index.
- --include_file_contents: (Optional) Boolean flag to include file contents in the index.
- --include_time_interval: (Optional) Boolean flag to include time intervals in the index.
- --run_start_time: (Optional) Float indicating the start time of the run.
- --stat_start_time: (Optional) Float indicating the start time of the statistics.
Examples
To use this script, run the following command:
>>> pysemtools_index_files --folder_path <path_to_folder> --output_folder <output_folder> [--file_type <file_type1,file_type2,...>] [--include_file_contents] [--include_time_interval] [--run_start_time <float>] [--stat_start_time <float>]
Replace the placeholders with actual values (removing the angle brackets).
- pysemtools.cli.visnek()
Create a nek5000 file template from existing files. Used to visualize nek5000 files in VisIt.
This script generates a nek5000 file template based on existing files in the current directory. The generated file contains information about the file naming pattern, first timestep, and number of timesteps.
- Parameters:
- filename_pattern: Pattern to match the nek5000 files.
Examples
To use this script, run it with the following command:
>>> pysemtools_visnek <filename_pattern>
Replace the placeholder with the actual filename pattern (removing the angle brackets).
comm
Communication utilities for parallel processing
- class pysemtools.comm.Router(comm)
This class can be used to handle communication between ranks in an MPI communicator.
With it, one can send and receive data to/from any rank specified in a destination list.
- Parameters:
- comm : MPI communicator
The MPI communicator that is used for the communication.
- Attributes:
- comm : MPI communicator
The MPI communicator that is used for the communication.
- destination_count : ndarray
Buffer specifying how many points this rank sends to each rank.
- source_count : ndarray
Buffer specifying how many points this rank receives from each rank.
Methods
all_gather([data, dtype]) : Gathers data from all processes to all processes.
all_to_all([destination, data, dtype]) : Sends data to specified destinations and receives data from whoever sent.
gather_in_root([data, root, dtype]) : Gathers data from all processes to the root process.
scatter_from_root([data, sendcounts, root, ...]) : Scatters data from the root process to all other processes.
send_recv([destination, data, dtype, tag]) : Sends data to specified destinations and receives data from whoever sent.
transfer_data(comm_pattern, **kwargs) : Moves data between ranks in the specified pattern.
Notes
The data is always flattened before sending, and received data is always flattened. The user must reshape the data after receiving it.
Examples
To initialize simply use the communicator
>>> from mpi4py import MPI
>>> from pysemtools.comm.router import Router
>>> comm = MPI.COMM_WORLD
>>> rt = Router(comm)
- all_gather(data=None, dtype=None)
Gathers data from all processes to all processes.
This is a wrapper to the MPI Allgatherv function.
- Parameters:
- data : ndarray
Data that is gathered in all processes.
- dtype : dtype
The data type of the data that is gathered.
- Returns:
- recvbuf : ndarray
The gathered data in all processes. The data is always received flattened. The user must reshape it.
- sendcounts : ndarray
The number of data entries that were sent from each rank.
Examples
To gather data from all ranks to the root rank, do the following:
>>> rt = Router(comm)
>>> local_data = np.ones(((rank+1)*10, 3), dtype=np.double)*rank
>>> recvbf, sendcounts = rt.all_gather(data=local_data, dtype=np.double)
- all_to_all(destination=None, data=None, dtype=None, **kwargs)
Sends data to specified destinations and receives data from whoever sent.
In this instance the all-to-all collective is used.
- Parameters:
- destination : list
A list with the rank ids that the data should be sent to.
- data : list or ndarray
The data that will be sent. If it is a list, each entry is sent to the corresponding index in the destination list. If the data is an ndarray, the same data will be sent to all destinations.
- dtype : dtype
The data type of the data that is sent.
- Returns:
- sources : list
A list with the rank ids that the data was received from.
- recvbuff : list
A list with the received data. The data is stored in the same order as the sources.
Notes
Extra keyword arguments are ignored. This is to keep the same interface as the send_recv method.
Examples
To send and receive data between ranks, do the following:
>>> rt = Router(comm)
>>> local_data = np.zeros(((rank+1)*10, 3), dtype=np.double)
>>> destination = [rank + 1, rank + 2]
>>> for i, dest in enumerate(destination):
>>>     if dest >= size:
>>>         destination[i] = dest - size
>>> sources, recvbf = rt.all_to_all(destination=destination,
>>>                                 data=local_data, dtype=np.double)
>>> for i in range(0, len(recvbf)):
>>>     recvbf[i] = recvbf[i].reshape((-1, 3))
- gather_in_root(data=None, root=0, dtype=None)
Gathers data from all processes to the root process.
This is a wrapper to the MPI Gatherv function.
- Parameters:
- data : ndarray
Data that is gathered in the root process.
- root : int
The rank that will gather the data.
- dtype : dtype
The data type of the data that is gathered.
- Returns:
- recvbuf : ndarray
The gathered data in the root process. The data is always received flattened. The user must reshape it.
- sendcounts : ndarray
The number of data entries that were sent from each rank.
Examples
To gather data from all ranks to the root rank, do the following:
>>> rt = Router(comm)
>>> local_data = np.ones(((rank+1)*10, 3), dtype=np.double)*rank
>>> recvbf, sendcounts = rt.gather_in_root(data=local_data,
>>>                                        root=0, dtype=np.double)
- scatter_from_root(data=None, sendcounts=None, root=0, dtype=None)
Scatters data from the root process to all other processes.
This is a wrapper to the MPI Scatterv function.
- Parameters:
- data : ndarray
The data that is scattered to all processes.
- sendcounts : ndarray, optional
The number of data entries that are sent to each process. If not specified, the data is divided equally among all processes.
- root : int
The rank that will scatter the data.
- dtype : dtype
The data type of the data that is scattered.
- Returns:
- recvbuf : ndarray
The scattered data in the current process. The data is always received flattened. The user must reshape it.
Examples
To scatter data from the root rank, do the following:
>>> rt = Router(comm)
>>> recvbf = rt.scatter_from_root(data=recvbf, sendcounts=sendcounts,
>>>                               root=0, dtype=np.double)
>>> recvbf = recvbf.reshape((-1, 3))
Note that sendcounts is just an ndarray of size comm.Get_size() with the number of data entries that are sent to each rank.
- send_recv(destination=None, data=None, dtype=None, tag=None)
Sends data to specified destinations and receives data from whoever sent.
Typically, when a rank needs to send some data, it also needs to receive some. In this method this is done by using non-blocking communication. Note, however, that when the method returns, the data has already been received.
- Parameters:
- destination : list
A list with the rank ids that the data should be sent to.
- data : list or ndarray
The data that will be sent. If it is a list, each entry is sent to the corresponding destination. If the data is an ndarray, the same data will be sent to all destinations.
- dtype : dtype
The data type of the data that is sent.
- tag : int
Tag used to identify the messages.
- Returns:
- sources : list
A list with the rank ids that the data was received from.
- recvbuff : list
A list with the received data. The data is stored in the same order as the sources.
Examples
To send and receive data between ranks, do the following:
>>> rt = Router(comm)
>>> local_data = np.zeros(((rank+1)*10, 3), dtype=np.double)
>>> destination = [rank + 1, rank + 2]
>>> for i, dest in enumerate(destination):
>>>     if dest >= size:
>>>         destination[i] = dest - size
>>> sources, recvbf = rt.send_recv(destination=destination,
>>>                                data=local_data, dtype=np.double, tag=0)
>>> for i in range(0, len(recvbf)):
>>>     recvbf[i] = recvbf[i].reshape((-1, 3))
- transfer_data(comm_pattern, **kwargs)
Moves data between ranks in the specified pattern.
This method wraps others in this class.
- Parameters:
- comm_pattern : str
The keyword that specifies the pattern of the data movement. Current options are:
distribute_p2p: sends data to specified destinations and receives data from whoever sent (point-to-point communication).
distribute_a2a: sends data to specified destinations and receives data from whoever sent (all-to-all communication).
gather: gathers data from all processes to the root process.
scatter: scatters data from the root process to all other processes.
- kwargs : dict
The arguments that are passed to the specified pattern. Check the documentation of each pattern for its required arguments.
- Returns:
- tuple
The output of the specified pattern.
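The keyword dispatch described above can be pictured as a small table mapping each comm_pattern value to one of the Router methods. The sketch below is plain Python illustrating the pattern, not the library internals; the method names come from the documentation above, and stand-in strings are returned instead of actually calling them:

```python
# Minimal sketch of keyword-based dispatch mirroring the documented
# comm_pattern options. The library would call the selected Router method
# with the forwarded keyword arguments; here we just return its name.
def transfer_data(comm_pattern, **kwargs):
    patterns = {
        "distribute_p2p": "send_recv",        # point-to-point communication
        "distribute_a2a": "all_to_all",       # all-to-all communication
        "gather": "gather_in_root",
        "scatter": "scatter_from_root",
    }
    method_name = patterns[comm_pattern]      # KeyError for unknown patterns
    return method_name, kwargs
```

A call such as transfer_data("gather", data=local_data, root=0) would thus forward data and root to gather_in_root.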
datatypes
Data types to hold and operate with SEM data.
- class pysemtools.datatypes.Coef(msh, comm, get_area=False, apply_1d_operators=True, bckend='numpy')
Class that contains arrays like mass matrix, jacobian, jacobian inverse, etc.
This class can be used when mathematical operations such as differentiation and integration are needed on the SEM mesh.
- Parameters:
- msh : Mesh
Mesh object.
- comm : Comm
MPI communicator object.
- get_area : bool, optional
If True, the area integration weights and normal vectors will be calculated. (Default value = False).
- apply_1d_operators : bool, optional
If True, the 1D operators will be applied instead of building 3D operators. (Default value = True).
- Attributes:
- drdx : ndarray
Component [0,0] of the jacobian inverse tensor for each point. Shape is (nelv, lz, ly, lx).
- drdy : ndarray
Component [0,1] of the jacobian inverse tensor for each point. Shape is (nelv, lz, ly, lx).
- drdz : ndarray
Component [0,2] of the jacobian inverse tensor for each point. Shape is (nelv, lz, ly, lx).
- dsdx : ndarray
Component [1,0] of the jacobian inverse tensor for each point. Shape is (nelv, lz, ly, lx).
- dsdy : ndarray
Component [1,1] of the jacobian inverse tensor for each point. Shape is (nelv, lz, ly, lx).
- dsdz : ndarray
Component [1,2] of the jacobian inverse tensor for each point. Shape is (nelv, lz, ly, lx).
- dtdx : ndarray
Component [2,0] of the jacobian inverse tensor for each point. Shape is (nelv, lz, ly, lx).
- dtdy : ndarray
Component [2,1] of the jacobian inverse tensor for each point. Shape is (nelv, lz, ly, lx).
- dtdz : ndarray
Component [2,2] of the jacobian inverse tensor for each point. Shape is (nelv, lz, ly, lx).
- B : ndarray
Mass matrix for each point. Shape is (nelv, lz, ly, lx).
- area : ndarray
Area integration weight for each point in the facets. Shape is (nelv, 6, ly, lx).
- nx : ndarray
X component of the normal vector for each point in the facets. Shape is (nelv, 6, ly, lx).
- ny : ndarray
Y component of the normal vector for each point in the facets. Shape is (nelv, 6, ly, lx).
- nz : ndarray
Z component of the normal vector for each point in the facets. Shape is (nelv, 6, ly, lx).
Methods
apply_spatial_filter(field) : Apply the stored spatial filters.
build_spatial_filter([r_tf, s_tf, t_tf]) : Build a spatial filter based on the given transfer functions.
dssum(field, msh) : Perform an average of a given field over the shared points in each rank.
dudrst(field[, direction]) : Perform the derivative with respect to reference coordinate r/s/t.
dudrst_1d_operator(field, dr, ds[, dt]) : Perform the derivative applying the 1D operators provided as inputs.
dudrst_1d_operator_torch(field, dr, ds[, dt]) : Perform the derivative applying the 1D operators provided as inputs.
dudrst_3d_operator(field, dr) : Perform the derivative with respect to reference coordinate r.
dudrst_transposed(field[, direction]) : Perform the derivative with respect to reference coordinate r/s/t.
dudxyz(field, drdx, dsdx[, dtdx]) : Perform the derivative with respect to physical coordinates x, y, z.
glsum(a, comm[, dtype]) : Perform a global summation of a given quantity a using MPI.
Examples
Assuming you have a mesh object and MPI communicator object, you can initialize the Coef object as follows:
>>> from pysemtools import Coef
>>> coef = Coef(msh, comm)
- apply_spatial_filter(field)
Apply the stored spatial filters
- Parameters:
- field : np.array
Field to apply the spatial filter to. Shape should be (nelv, lz, ly, lx).
- Returns:
- np.array
Filtered field. Shape is the same as the input field.
Notes
The spatial filters must be created before calling this function; otherwise, an error will be raised.
- build_spatial_filter(r_tf: array | None = None, s_tf: array | None = None, t_tf: array | None = None)
Build a spatial filter based on the given transfer functions.
- Parameters:
- r_tf : np.array
Transfer function for the r dimension. Shape should be (lx, lx).
- s_tf : np.array
Transfer function for the s dimension. Shape should be (ly, ly).
- t_tf : np.array, optional
Transfer function for the t dimension. Shape should be (lz, lz). If none is passed, the field is assumed to be 2D.
- Returns:
- None
The spatial filters are stored in the r_filter, s_filter, and t_filter attributes.
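As the parameter list above states, each transfer function is a square matrix sized by the points per element in that direction. The snippet below is only a shape illustration under that assumption: identity matrices are the trivial (no-op) transfer functions, and a real filter would damp selected components instead. The final call is shown commented out since it requires an existing Coef object:

```python
import numpy as np

# Shape illustration for build_spatial_filter inputs: one (n, n) matrix
# per direction. np.eye leaves the field unchanged when applied.
lx, ly, lz = 8, 8, 8          # points per element in each direction
r_tf = np.eye(lx)             # (lx, lx) transfer function for r
s_tf = np.eye(ly)             # (ly, ly) transfer function for s
t_tf = np.eye(lz)             # (lz, lz) transfer function for t; omit in 2D
# coef.build_spatial_filter(r_tf=r_tf, s_tf=s_tf, t_tf=t_tf)
```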
- dssum(field, msh)
Perform an average of a given field over the shared points in each rank.
This method averages the field over shared points in the same rank. It uses the connectivity data in the mesh object. dssum might be a misleading name.
- Parameters:
- field : ndarray
Field to average over shared points.
- msh : Mesh
pySEMTools Mesh object.
- Returns:
- ndarray
Input field with shared points averaged with shared points in the SAME rank.
Examples
Assuming you have a Coef object and are working on a 3D field:
>>> dudx = coef.dssum(dudx, msh)
- dudrst(field, direction='r')
Perform the derivative with respect to reference coordinate r/s/t.
Used to perform the derivative in the reference coordinates.
- Parameters:
- field : ndarray
Field to take the derivative of. Shape should be (nelv, lz, ly, lx).
- direction : str
Direction to take the derivative in. Can be ‘r’, ‘s’, or ‘t’. (Default value = ‘r’).
- Returns:
- ndarray
Derivative of the field with respect to r/s/t. Shape is the same as the input field.
- dudrst_1d_operator(field, dr, ds, dt=None)
Perform the derivative applying the 1D operators provided as inputs.
This method uses the 1D operators to apply the derivative. To apply the derivative in r, you must provide the 1D differentiation matrix in that direction and the identity in the others.
- Parameters:
- field : ndarray
Field to take the derivative of. Shape should be (nelv, lz, ly, lx).
- dr : ndarray
Derivative matrix in the r direction to apply to each element. Shape should be (lx, lx).
- ds : ndarray
Derivative matrix in the s direction to apply to each element. Shape should be (ly, ly).
- dt : ndarray, optional
Derivative matrix in the t direction to apply to each element. Shape should be (lz, lz). If none is passed, the field is assumed to be 2D.
- Returns:
- ndarray
Derivative of the field with respect to r/s/t. Shape is the same as the input field.
Examples
Assuming you have a Coef object:
>>> dxdr = coef.dudrst_1d_operator(x, coef.dr, np.eye(ly, dtype=coef.dtype), np.eye(lz, dtype=coef.dtype))
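To see what applying a 1D operator along one direction means, here is a self-contained numpy sketch (an illustration of the technique, not the library code) using a 2-point differentiation matrix on a linear field; the einsum contracts the operator with the last (r) axis of the (nelv, lz, ly, lx) array:

```python
import numpy as np

# Differentiation matrix for linear interpolation on the nodes r = -1, 1:
# the derivative of a degree-1 polynomial is its slope at both nodes.
dr = np.array([[-0.5, 0.5],
               [-0.5, 0.5]])

# Field f(r) = r on a single element with 2 points per direction.
r = np.array([-1.0, 1.0])
field = np.broadcast_to(r, (1, 2, 2, 2)).copy()  # shape (nelv, lz, ly, lx)

# Apply dr along the last axis:
# out[e, k, j, i] = sum_m dr[i, m] * field[e, k, j, m]
dfdr = np.einsum("im,ekjm->ekji", dr, field)
# df/dr of f(r) = r is 1 everywhere.
```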
- dudrst_1d_operator_torch(field, dr, ds, dt=None)
Perform derivative applying the 1D operators provided as inputs.
- Parameters:
- field : torch.Tensor
Field to take the derivative of. Shape should be (nelv, lz, ly, lx).
- dr : torch.Tensor
Derivative matrix in the r direction to apply to each element. Shape should be (lx, lx).
- ds : torch.Tensor
Derivative matrix in the s direction to apply to each element. Shape should be (ly, ly).
- dt : torch.Tensor, optional
Derivative matrix in the t direction. Shape should be (lz, lz).
- Returns:
- torch.Tensor
Derivative of the field with respect to r/s/t. Shape is the same as the input field.
- dudrst_3d_operator(field, dr)
Perform the derivative with respect to reference coordinate r.
This method uses derivation matrices built from the Lagrange polynomials at the GLL points.
- Parameters:
- field : ndarray
Field to take the derivative of. Shape should be (nelv, lz, ly, lx).
- dr : ndarray
Derivative matrix in the r/s/t direction to apply to each element. Shape should be (lx*ly*lz, lx*ly*lz).
- Returns:
- ndarray
Derivative of the field with respect to r/s/t. Shape is the same as the input field.
Examples
Assuming you have a Coef object:
>>> dxdr = coef.dudrst_3d_operator(x, coef.dr)
- dudrst_transposed(field, direction='r')
Perform the derivative with respect to reference coordinate r/s/t.
Used to perform the derivative in the reference coordinates.
- Parameters:
- field : ndarray
Field to take the derivative of. Shape should be (nelv, lz, ly, lx).
- direction : str
Direction to take the derivative in. Can be ‘r’, ‘s’, or ‘t’. (Default value = ‘r’).
- Returns:
- ndarray
Derivative of the field with respect to r/s/t. Shape is the same as the input field.
- dudxyz(field, drdx, dsdx, dtdx=None)
Perform the derivative with respect to physical coordinates x, y, z.
This method uses the chain rule: it first evaluates derivatives with respect to r, s, t, then multiplies by the inverse of the jacobian to map to x, y, z.
- Parameters:
- field : ndarray
Field to take the derivative of. Shape should be (nelv, lz, ly, lx).
- drdx : ndarray
Derivative of the first reference coordinate with respect to the chosen physical coordinate, i.e., the first entry in the appropriate row of the jacobian inverse. Shape should be the same as the field.
- dsdx : ndarray
Derivative of the second reference coordinate with respect to the chosen physical coordinate, i.e., the second entry in the appropriate row of the jacobian inverse. Shape should be the same as the field.
- dtdx : ndarray, optional
Derivative of the third reference coordinate with respect to the chosen physical coordinate, i.e., the third entry in the appropriate row of the jacobian inverse. Shape should be the same as the field. (Default value = None). Only valid for 3D fields.
- Returns:
- ndarray
Derivative of the field with respect to x, y, or z. Shape is the same as the input field.
Examples
Assuming you have a Coef object and are working on a 3D field:
>>> dudx = coef.dudxyz(u, coef.drdx, coef.dsdx, coef.dtdx)
- glsum(a, comm, dtype=<class 'numpy.float64'>)
Perform a global summation of a given quantity a using MPI.
This method uses MPI to sum over all MPI ranks. It works with any numpy array shape and returns one value.
- Parameters:
- a : ndarray
Quantity to sum over all MPI ranks.
- comm : Comm
MPI communicator object.
- dtype : numpy.dtype
(Default value = np.double).
- Returns:
- float
Sum of the quantity a over all MPI ranks.
Examples
Assuming you have a Coef object and are working on a 3D field:
>>> volume = coef.glsum(coef.B, comm)
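Conceptually, the global sum is the ordinary sum of every rank's local array, reduced over the communicator; with a communicator of size 1 it is simply a.sum(). A serial numpy sketch of that reduction (simulating four ranks, no MPI involved):

```python
import numpy as np

# Each simulated rank owns an array of arbitrary shape; the global sum is
# the scalar sum of all local sums, identical on every rank after the
# MPI reduction.
local_arrays = [np.full((2, 3, 3, 3), float(r)) for r in range(4)]
local_sums = [a.sum() for a in local_arrays]   # what each rank computes
global_sum = sum(local_sums)                   # what the MPI reduce yields
```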
- class pysemtools.datatypes.Field(comm, data=None)
Class that contains fields.
This is the main class used to contain data for post-processing. The data does not generally need to be present in this class, as it is typically enough to have the data as ndarrays of shape (nelv, lz, ly, lx) for each field. However, this class provides an easy interface to collect data that is somehow associated.
It also makes it easy to write data to disk, as all the data in this class will be stored in the same file.
- Parameters:
- comm : Comm
MPI communicator object.
- data : HexaData, optional
HexaData object that contains the coordinates of the domain.
- Attributes:
- fields : dict
Dictionary that contains the fields. The keys are the field names and the values are lists of ndarrays. The keys are the same as for HexaData objects, i.e. vel, pres, temp, scal.
- vel_fields : int
Number of velocity fields.
- pres_fields : int
Number of pressure fields.
- temp_fields : int
Number of temperature fields.
- scal_fields : int
Number of scalar fields.
- t : float
Time of the data.
Methods
update_vars() : Update the number of fields.
clear()
Examples
If a HexaData object data has been read from disk, the Field object can be created directly from it.
>>> from pysemtools.datatypes.field import Field
>>> fld = Field(comm, data=data)
If one wishes to use the data in the fields, it is possible to reference it with an ndarray of shape (nelv, lz, ly, lx) as follows:
>>> u = fld.fields["vel"][0]
>>> v = fld.fields["vel"][1]
>>> w = fld.fields["vel"][2]
>>> vel_magnitude = np.sqrt(u**2 + v**2 + w**2)
A Field object can also be created empty and have fields added to it, which is useful for writing data to disk. If an ndarray u is created with shape (nelv, lz, ly, lx), it can be added to the field object as follows:
>>> from pysemtools.datatypes.field import Field
>>> fld = Field(comm)
>>> fld.fields["vel"].append(u)
>>> fld.update_vars()
This fld object can then be used to write fld files from field u created in the code.
- update_vars()
Update number of fields.
Update the number of fields in the class in the event that it has been modified. This is needed for writing data properly if more arrays are added to the class.
Examples
A Field object can be created empty and have fields added to it, which is useful for writing data to disk. If an ndarray u is created with shape (nelv, lz, ly, lx), it can be added to the field object as follows:
>>> from pysemtools.datatypes.field import Field
>>> fld = Field(comm)
>>> fld.fields["vel"].append(u)
>>> fld.update_vars()
This fld object can then be used to write fld files from field u created in the code.
- class pysemtools.datatypes.FieldRegistry(comm, data=None, bckend='numpy')
Class that contains fields.
This class extends the main Field class with a registry that allows the fields to be easily referenced by name.
- Parameters:
- comm : Comm
MPI communicator object.
- data : HexaData, optional
HexaData object that contains the coordinates of the domain.
Methods
add_field(comm[, field_name, field, ...]) : Add fields to the registry.
clear() : Clear the registry and the fields.
rename_registry_key([old_key, new_key]) : Rename a key in the registry.
to([comm, bckend]) : Move all fields to the CPU as numpy arrays, to write out.
update_vars() : Update the registry with the fields that are present in the fields dictionary.
- add_field(comm, field_name='', field=None, file_type=None, file_name=None, file_key=None, dtype=<class 'numpy.float64'>)
Add fields to the registry. They will be stored in the fields dictionary to easily write them.
- Parameters:
- comm : Comm
MPI communicator object.
- field_name : str
Name of the field to be added. Where the field is added depends on the name.
- field : ndarray
Field to be added to the registry. If this is provided, it is assumed to be an ndarray and will be added to the registry directly.
- file_type : str
Type of the file the field is read from. If this is provided, the field is assumed to come from a file. Currently, only “fld” is supported.
- file_name : str
File name of the field to be added. If this is provided, the field is assumed to come from a file and will then be added to the registry.
- file_key : str
Key to search for in the file. For Nek files, keys have the following format: “vel_0”, “vel_1”, “pres”, “temp”, “scal_0”, “scal_1”, etc. Only for “vel” are the 2/3 components read at the same time.
- dtype : np.dtype
Data type of the field. Default is np.double.
- clear()
Clear the registry and the fields.
- rename_registry_key(old_key='', new_key='')
Rename a key in the registry.
- Parameters:
- old_key : str
Old key to be renamed.
- new_key : str
New key to be used.
Notes
If you update the registry, some keys might be overwritten or multiple keys might reference the same data.
- to(comm=None, bckend='numpy')
Move all fields to the CPU as numpy arrays, to write out.
- update_vars()
Update the registry with the fields that are present in the fields dictionary.
- class pysemtools.datatypes.Mesh(comm, data=None, x=None, y=None, z=None, elmap=None, create_connectivity=False, bckend='numpy', log_level=None)
Class that contains coordinate and partitioning data of the domain.
This class is generally needed, as it contains the coordinates of the domain and some information about its partitioning.
- Parameters:
- comm : Comm
MPI communicator object.
- data : HexaData, optional
HexaData object that contains the coordinates of the domain.
- x : ndarray, optional
X coordinates of the domain. Shape is (nelv, lz, ly, lx).
- y : ndarray, optional
Y coordinates of the domain. Shape is (nelv, lz, ly, lx).
- z : ndarray, optional
Z coordinates of the domain. Shape is (nelv, lz, ly, lx).
- elmap : ndarray, optional
1D ndarray of global element ids. Shape is (nelv,).
- create_connectivity : bool, optional
If True, the connectivity of the domain will be created. (Memory intensive).
- bckend : str, optional
Backend to use for the data. Options are ‘numpy’ and ‘torch’. Default is ‘numpy’.
- Attributes:
- x : ndarray
X coordinates of the domain. Shape is (nelv, lz, ly, lx).
- y : ndarray
Y coordinates of the domain. Shape is (nelv, lz, ly, lx).
- z : ndarray
Z coordinates of the domain. Shape is (nelv, lz, ly, lx).
- lx : int
Polynomial degree in the x direction.
- ly : int
Polynomial degree in the y direction.
- lz : int
Polynomial degree in the z direction.
- nelv : int
Number of elements of the domain in the current rank.
- glb_nelv : int
Total number of elements in the domain.
- gdim : int
Dimension of the domain.
- non_linear_shared_points : list, optional
List that shows the indices where points in the domain are shared; used by Coef in dssum.
Methods
create_connectivity() : Create connectivity with the information from one processor.
get_edge_centers() : Get the edge centers of the domain.
get_facet_centers() : Get the centroid of each facet.
get_vertices() : Get the vertices of the domain.
init_common(comm) : Initialize common attributes.
init_from_coords(comm, x, y, z[, elmap]) : Initialize from coordinates.
init_from_data(comm, data) : Initialize from data.
to([comm, bckend]) : Transfer the Mesh object to the desired backend.
Examples
If a HexaData object data has been read from disk, the Mesh object can be created directly from it.
>>> from pysemtools.datatypes.msh import Mesh
>>> msh = Mesh(comm, data=data)
If the coordinates are already available, the mesh object can be created from them.
>>> from pysemtools.datatypes.msh import Mesh
>>> msh = Mesh(comm, x=x, y=y, z=z)
This is useful in situations where the coordinates are generated in the code or streamed into python from another source.
- create_connectivity()
Create connectivity with the information from one processor
Notes
This function creates a map that contains the connectivity of the domain. This is not the recommended way to compute connectivity for large domains; it is better to use the dedicated connectivity object, which performs the operations in parallel.
- get_edge_centers()
Get the edge centers of the domain.
Get all the edge centers of the domain in 2D or 3D.
Notes
We need 4 edges for 2D and 12 edges for 3D. For all cases we store 3 coordinates for each edge.
- get_facet_centers()
Get the centroid of each facet
Find the centroid of each facet. This is used to find the shared facets between elements.
Notes
This is not really the centroid, as we also find a coordinate in the dimension perpendicular to the facet. This means that these values can be outside or inside the element. However, the same behaviour should be seen in the matching elements.
- get_vertices()
Get the vertices of the domain.
Get all the vertices of the domain in 2D or 3D.
Notes
We need 4 vertices for 2D and 8 vertices for 3D. For all cases, we store 3 coordinates for each vertex.
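The vertex counts quoted above can be pictured as the corner GLL nodes of each element: every combination of first or last index along each direction. A small self-contained sketch of that enumeration (illustrative only, not the library implementation):

```python
from itertools import product

# Corner (vertex) index tuples of a single element with lz x ly x lx GLL
# points: the first or last index in each direction. 3D gives 2^3 = 8
# vertices; 2D gives 2^2 = 4.
lx = ly = lz = 5
corners_3d = list(product((0, lz - 1), (0, ly - 1), (0, lx - 1)))  # (k, j, i)
corners_2d = list(product((0, ly - 1), (0, lx - 1)))               # (j, i)
```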
- init_common(comm)
Initialize common attributes.
This function is used to initialize the common attributes of the mesh object.
- Parameters:
- comm : Comm
MPI communicator object.
- Returns:
- None
Nothing is returned, the attributes are set in the object.
- init_from_coords(comm, x, y, z, elmap=None)
Initialize from coordinates.
This function is used to initialize the mesh object from x, y, z ndarrays.
- Parameters:
- comm : Comm
MPI communicator object.
- x : ndarray
X coordinates of the domain. Shape is (nelv, lz, ly, lx).
- y : ndarray
Y coordinates of the domain. Shape is (nelv, lz, ly, lx).
- z : ndarray
Z coordinates of the domain. Shape is (nelv, lz, ly, lx).
- elmap : ndarray, optional
1D ndarray of global element ids. Shape is (nelv,). If not provided, it will be set to None.
- Returns:
- None
Nothing is returned, the attributes are set in the object.
- init_from_data(comm, data)
Initialize from data.
This function is used to initialize the mesh object from a hexadata object.
- Parameters:
- comm : Comm
MPI communicator object.
- data : HexaData
HexaData object that contains the coordinates of the domain.
- Returns:
- None
Nothing is returned, the attributes are set in the object.
- to(comm=None, bckend='numpy')
Transfer the Mesh object to the desired backend.
- Parameters:
- comm : Comm
MPI communicator object.
- bckend : str
Backend to use for the data. Options are ‘numpy’ and ‘torch’. Default is ‘numpy’.
- Returns:
- msh_cpu : Mesh
Mesh object in the desired backend.
- class pysemtools.datatypes.MeshConnectivity(comm, msh: Mesh | None = None, rel_tol=1e-05, use_hashtable=False, max_simultaneous_sends=1, max_elem_per_vertex: int | None = None, max_elem_per_edge: int | None = None, max_elem_per_face: int | None = None, coef=None)
Class to compute the connectivity of the mesh.
Uses facets and vertices to determine which elements are connected to each other.
- Parameters:
- comm : MPI communicator
The MPI communicator.
- msh : Mesh
The mesh object.
- rel_tol : float
The relative tolerance to use when comparing the coordinates of the facets/edges.
- use_hashtable : bool
Whether to use a hashtable to define connectivity. This is faster but uses more memory.
- max_simultaneous_sends : int
The maximum number of simultaneous sends to use when sending data to other ranks. A lower number saves memory for buffers but is slower.
- max_elem_per_vertex : int
The maximum number of elements that share a vertex. The defaults are 4 for 2D and 8 for 3D (which works for a structured mesh); the default is selected if this input is left as None.
- max_elem_per_edge : int
The maximum number of elements that share an edge. The defaults are 2 for 2D and 4 for 3D (which works for a structured mesh); the default is selected if this input is left as None.
- max_elem_per_face : int
The maximum number of elements that share a face. The default is 2 for 3D (which works for a structured mesh); the default is selected if this input is left as None.
Methods
dssum([field, msh, average])Computes the dssum of the field
dssum_global([local_dssum_field, field, msh])Computes the global dssum of the field
dssum_local([field, msh])Computes the local dssum of the field
get_boundary_node_indices_2d(msh[, ...])Return list of (e, k, j, i) indices of GLL nodes lying on boundary edges in a 2D mesh.
get_multiplicity(msh)Computes the multiplicity of the elements in the mesh
global_connectivity(msh)Computes the global connectivity of the mesh
local_connectivity(msh)Computes the local connectivity of the mesh
- dssum(field: ndarray | None = None, msh: Mesh | None = None, average: str = 'multiplicity')
Computes the dssum of the field
- Parameters:
- fieldnp.ndarray
The field to compute the dssum
- mshMesh
The mesh object
- averagestr
The averaging weights to use. Can be “multiplicity”
- Returns:
- np.ndarray
The dssum of the field
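The idea behind dssum with multiplicity averaging can be illustrated without the pysemtools API: shared nodes accumulate the contributions of every element that owns them and are then divided by the number of owners. A minimal pure-NumPy sketch on two 1D elements sharing one node (toy data, not the library's implementation):

```python
import numpy as np

# Two 1D elements with 2 nodes each; global node ids per local node.
# Element 0 owns global nodes (0, 1); element 1 owns (1, 2): node 1 is shared.
gidx = np.array([[0, 1], [1, 2]])
field = np.array([[1.0, 2.0], [4.0, 8.0]])  # local field values, shape (nelv, lx)

# dssum: accumulate local values into global nodes, then gather back.
summed = np.zeros(3)
np.add.at(summed, gidx.ravel(), field.ravel())

# Multiplicity: how many local nodes map to each global node.
mult = np.zeros(3)
np.add.at(mult, gidx.ravel(), 1.0)

averaged = (summed / mult)[gidx]  # gather averaged values back to local layout
print(averaged)  # shared node holds (2.0 + 4.0) / 2 = 3.0 in both elements
```

With average="multiplicity", continuous fields are left unchanged while discontinuities at element interfaces are smoothed out.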
- dssum_global(local_dssum_field: ndarray | None = None, field: ndarray | None = None, msh: Mesh | None = None)
Computes the global dssum of the field
- Parameters:
- local_dssum_fieldnp.ndarray
The local dssum of the field, computed with dssum_local
- fieldnp.ndarray
The field to compute the dssum
- mshMesh
The mesh object
- Returns:
- np.ndarray
The global dssum of the field
- dssum_local(field: ndarray | None = None, msh: Mesh | None = None)
Computes the local dssum of the field
- Parameters:
- fieldnp.ndarray
The field to compute the dssum
- mshMesh
The mesh object
- Returns:
- np.ndarray
The local dssum of the field
- get_boundary_node_indices_2d(msh, masking_function=None)
Return list of (e, k, j, i) indices of GLL nodes lying on boundary edges in a 2D mesh.
- Parameters:
- mshMesh
The mesh associated with this connectivity object.
- masking_functioncallable or None
Optional function with signature (msh, e, k, j, i) → bool. If provided, only nodes for which this returns True are included.
- Returns:
- boundary_node_indiceslist of tuple
Indices in the form (element, z=0, j, i) for all GLL nodes on boundary edges.
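The masking function receives the mesh and one (e, k, j, i) node index and returns a boolean. A hypothetical example that keeps only boundary nodes lying on the wall y = 0 (a stand-in object is used here in place of a real Mesh; the y attribute follows the (nelv, lz, ly, lx) layout documented above):

```python
import numpy as np
from types import SimpleNamespace

# Stand-in mesh with a y-coordinate array of shape (nelv, lz, ly, lx).
msh = SimpleNamespace(y=np.zeros((2, 1, 3, 3)))
msh.y[:, 0, 1:, :] = 1.0  # only the j = 0 row sits on y = 0

def on_wall(msh, e, k, j, i):
    """Keep only boundary nodes lying on the wall y = 0 (within a tolerance)."""
    return abs(msh.y[e, k, j, i]) < 1e-8

# Would be passed as: conn.get_boundary_node_indices_2d(msh, masking_function=on_wall)
print(on_wall(msh, 0, 0, 0, 0), on_wall(msh, 0, 0, 1, 0))  # True False
```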
- get_multiplicity(msh: Mesh)
Computes the multiplicity of the elements in the mesh
- Parameters:
- mshMesh
Notes
The multiplicity is the number of times a point in an element is shared with its own element or others. The minimum multiplicity is 1, since a point is always shared with itself.
- global_connectivity(msh: Mesh)
Computes the global connectivity of the mesh
Currently this function sends data from all to all.
- Parameters:
- mshMesh
The mesh object
Notes
In 3D, this function sends the facet centers of the unique_efp_elem and unique_efp_facet to all other ranks, as well as the element ID and facet ID to be assigned.
We compare the unique facet centers of our rank to those of others and determine which ones match. When a match is found, we populate the dictionaries: global_shared_efp_to_rank_map[(e, f)] = rank global_shared_efp_to_elem_map[(e, f)] = elem global_shared_efp_to_facet_map[(e, f)] = facet
So for each element-facet pair we will know which rank has it, and what its ID is on that rank.
BE MINDFUL: Later, when redistributing, send the points, but also send the element and facet ID to the other ranks so the receiver can know which facet corresponds.
- local_connectivity(msh: Mesh)
Computes the local connectivity of the mesh
This function checks elements within a rank
- Parameters:
- mshMesh
The mesh object
Notes
In 3D, the centers of the facets are compared. efp means element facet pair. One obtains a local_shared_efp_to_elem_map and a local_shared_efp_to_facet_map dictionary.
local_shared_efp_to_elem_map[(e, f)] = [e1, e2, …] gives a list with the elements e1, e2 … that share the same facet f of element e.
local_shared_efp_to_facet_map[(e, f)] = [f1, f2, …] gives a list with the facets f1, f2 … of the elements e1, e2 … that share the same facet f of element e.
In each case, the index of the element list corresponds to the index of the facet list. Therefore, the element list might have repeated element entries.
Additionally, we create lists unique_efp_elem and unique_efp_facet, which hold the elements and facets that are not shared with any other element on this rank. These belong either to boundary elements or to elements connected to other ranks. The unique pairs are the ones checked in the global connectivity step.
- class pysemtools.datatypes.MeshPartitioner(comm, msh: Mesh | None = None, conditions: list[ndarray] | None = None)
A class that repartitions SEM mesh data using a given partitioning algorithm.
The idea is to be able to choose subdomains of the mesh and split the elements such that the load is balanced among the ranks.
One could use this to repartition the data if the condition array is full of True values, but the idea is to be able to use any condition array.
- Parameters:
- commMPI communicator
MPI communicator
- mshMesh
Mesh object to partition
- conditionslist[np.ndarray]
List of conditions to apply to the mesh elements. The conditions should be in the form of a list of numpy arrays. Each numpy array should have the same length as the number of elements in the mesh. The conditions should be boolean arrays.
Methods
create_partitioned_field([fld, ...])Create a partitioned field object
create_partitioned_mesh([msh, ...])Create a partitioned mesh object
redistribute_field_elements([field, ...])Redistribute the elements of the mesh object to different ranks
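As a rough illustration of what a linear load-balanced partitioning of a conditioned subset looks like, here is a pure-Python/NumPy sketch (an assumption about how "load_balanced_linear" distributes elements, not the pysemtools internals): the elements selected by a boolean condition are split over the ranks as evenly as possible, in linear order.

```python
import numpy as np

def load_balanced_linear(n_selected: int, n_ranks: int):
    """Split n_selected items over n_ranks as evenly as possible, in order."""
    base, rem = divmod(n_selected, n_ranks)
    counts = [base + 1 if r < rem else base for r in range(n_ranks)]
    offsets = np.concatenate(([0], np.cumsum(counts)))
    return [range(offsets[r], offsets[r + 1]) for r in range(n_ranks)]

# A boolean condition selects a subset of 10 elements; split them over 3 ranks.
condition = np.array([True, False, True, True, True, False, True, True, True, True])
selected = np.flatnonzero(condition)  # global ids of the selected elements
parts = load_balanced_linear(selected.size, 3)
print([list(selected[list(p)]) for p in parts])  # [[0, 2, 3], [4, 6, 7], [8, 9]]
```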
- create_partitioned_field(fld: Field | FieldRegistry | None = None, partitioning_algorithm: str = 'load_balanced_linear') FieldRegistry
Create a partitioned field object
- Parameters:
- fldField or FieldRegistry
Field object to partition
- partitioning_algorithmstr
Algorithm to use for partitioning the mesh elements
- Returns:
- partitioned_fieldFieldRegistry
Partitioned field object
- create_partitioned_mesh(msh: Mesh | None = None, partitioning_algorithm: str = 'load_balanced_linear', create_conectivity: bool = False) Mesh
Create a partitioned mesh object
- Parameters:
- mshMesh
Mesh object to partition
- partitioning_algorithmstr
Algorithm to use for partitioning the mesh elements
- Returns:
- partitioned_meshMesh
Partitioned mesh object
- redistribute_field_elements(field: ndarray | None = None, partitioning_algorithm: str = 'load_balanced_linear') None
Redistribute the elements of the mesh object to different ranks
- Parameters:
- fieldnp.ndarray
Field to redistribute based on the conditions at initialization
- partitioning_algorithmstr
Algorithm to use for partitioning the mesh elements
- class pysemtools.datatypes.VTKMesh(comm: Comm, x: ndarray, y: ndarray, z: ndarray, cell_type: str = 'hex', global_connectivity: bool = True, distributed_axis: int = 0)
Class that contains the mesh data in a VTK-friendly format
Helper to build connectivity and offsets etc.
- Parameters:
- commMPI.Comm
The MPI communicator
- xnp.ndarray
The x coordinates of the mesh points
- ynp.ndarray
The y coordinates of the mesh points
- znp.ndarray
The z coordinates of the mesh points
- cell_typestr, optional
The type of cells in the mesh, by default “hex”. Only “hex” is currently supported.
- global_connectivitybool, optional
Whether to use global connectivity or local connectivity in parallel. By default True, but if only one rank is used, it is set to False since the connectivity is already global in that case.
- distributed_axisint, optional
The axis along which the mesh is distributed, by default 0 (Only zero allowed now)
Notes
Offsets is n_cells + 1 to work on VTKHDF. This might need to be reduced to n_cells for other use cases like Catalyst.
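To illustrate the kind of arrays such a helper builds, here is a hand-rolled sketch of connectivity and offsets for a single linear hexahedron in the usual VTK unstructured-grid layout (an illustration only; the actual class derives these from SEM coordinate arrays):

```python
import numpy as np

# 8 corner points of a unit cube, listed in VTK hexahedron ordering:
# bottom face counter-clockwise, then top face counter-clockwise.
points = np.array([
    [0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0],   # bottom face
    [0, 0, 1], [1, 0, 1], [1, 1, 1], [0, 1, 1],   # top face
], dtype=float)

# One hex cell referencing those 8 points.
connectivity = np.arange(8, dtype=np.int64)

# The offsets array has n_cells + 1 entries (the VTKHDF convention):
# connectivity[offsets[i]:offsets[i + 1]] are the point ids of cell i.
offsets = np.array([0, 8], dtype=np.int64)

print(connectivity[offsets[0]:offsets[1]])  # the point ids of cell 0
```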
interpolation
High-order interpolation routines for SEM data.
- class pysemtools.interpolation.Probes(comm, output_fname: str = './interpolated_fields.csv', probes: ndarray | str | None = None, msh=typing.Union[pysemtools.datatypes.msh.Mesh, str, list], write_coords: bool = True, progress_bar: bool = False, point_interpolator_type: str = 'single_point_legendre', max_pts: int = 128, find_points_iterative: list = [False, 5000], find_points_comm_pattern: str = 'point_to_point', elem_percent_expansion: float = 0.01, global_tree_type: str = 'rank_bbox', global_tree_nbins: int = 1024, use_autograd: bool = False, find_points_tol: float = np.float64(2.220446049250313e-15), find_points_max_iter: int = 50, local_data_structure: str = 'kdtree', use_oriented_bbox: bool = False, clean_search_traces: bool = False)
Class to interpolate fields at probes from a SEM mesh.
Main interpolation class. This works in parallel.
If the points to interpolate are available only at rank 0, make sure to only pass them at that rank and set the others to None. In that case, the points will be scattered to all ranks.
If the points are passed in all ranks, then they will be considered as different points to be interpolated and the work to interpolate will be multiplied. If you are doing this, make sure that the points that each rank pass are different. Otherwise simply pass None in all ranks but 0.
See example below to observe how to avoid unnecessary replication of data in all ranks when passing probes as argument if the points are the same in all ranks.
If reading probes from file, they will be read on rank 0 and scattered unless parallel hdf5 is used, in which case the probes will be read in all ranks. (In development)
- Parameters:
- commMPI communicator
MPI communicator.
- output_fnamestr
Output file name. Default is “./interpolated_fields.csv”. Note that you can change the file extension to .hdf5 (e.g. “interpolated_fields.hdf5”) to write in hdf5 format.
- probesUnion[np.ndarray, str]
Probes coordinates. If a string, it is assumed to be a file name.
- mshUnion[Mesh, str, list]
Mesh data. If a string, it is assumed to be a file name. If a list, the first entry is the file name and the second is the dtype of the data. If a Mesh object, the x, y, z coordinates are taken from the object.
- write_coordsbool
If True, the coordinates of the probes are written to a file. Default is True.
- progress_barbool
If True, a progress bar is shown. Default is False.
- point_interpolator_typestr
Type of point interpolator. Default is single_point_legendre. Options are: single_point_legendre, single_point_lagrange, multiple_point_legendre_numpy, multiple_point_legendre_torch.
- max_ptsint, optional
Maximum number of points to interpolate. Default is 128. Used if multiple point interpolator is selected.
- find_points_iterativelist
List with two elements. First element is a boolean that indicates if the search is iterative. Second element is the maximum number of candidate ranks to send the data. This affects memory. Default is [False, 5000].
- find_points_comm_patternstr
Communication pattern for finding points. Default is point_to_point. Options are: point_to_point, collective, rma.
- elem_percent_expansionfloat
Percentage expansion of the element bounding box. Default is 0.01.
- global_tree_typestr
How the global tree is constructed to determine rank candidates for the probes. Only relevant when tree structures are used to determine candidates. Default is rank_bbox. Options are: rank_bbox, domain_binning.
- global_tree_nbinsint
Number of bins in the global tree. Only used if the global tree is domain_binning. Default is 1024.
- use_autogradbool
If True, autograd is used. Default is False.
- find_points_tolfloat
The tolerance to use when finding points. Default is np.finfo(np.double).eps * 10.
- find_points_max_iterint
The maximum number of iterations to use when finding points. Default is 50.
- local_data_structurestr
The local data structure to use when finding points. Default is kdtree. Options are: kdtree, obb_tree.
- use_oriented_bboxbool
If True, oriented bounding boxes are used when finding points. Default is False.
- clean_search_tracesbool
If True, cleans all the data used for searching points after initialization. This saves memory if only interpolation is needed afterwards. Default is False.
- Attributes:
- probesndarray
2D array of probe coordinates. shape = (n_probes, 3).
- interpolated_fieldsndarray
2D array of interpolated fields at probes. shape = (n_probes, n_fields + 1). The first column is always time, the rest are the interpolated fields.
Methods
interpolate_from_field_list(t, field_list, comm)Interpolate the probes from a list of fields.
clean_search_traces
Notes
A sample input file can be found in the examples folder of the main repository; note, however, that the file is not used in said example.
Examples
1. Initialize from file:
>>> from mpi4py import MPI
>>> from pysemtools.interpolation.probes import Probes
>>> comm = MPI.COMM_WORLD
>>> probes = Probes(comm, filename="path/to/params.json")
2. Initialize from code, passing everything as arguments. Assume msh is created. One must then create the probe data on rank 0; a dummy probes_data must be created on all other ranks:
>>> from mpi4py import MPI
>>> import numpy as np
>>> from pysemtools.interpolation.probes import Probes
>>> comm = MPI.COMM_WORLD
>>> if comm.Get_rank() == 0:
...     probes_data = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [2.0, 2.0, 2.0]])
... else:
...     probes_data = None
>>> probes = Probes(comm, probes=probes_data, msh=msh)
Note that probes is initialized on all ranks, but the probes_data containing the coordinates is only relevant on rank 0. It is scattered internally.
- interpolate_from_field_list(t, field_list, comm, write_data=True, field_names: list[str] | None = None)
Interpolate the probes from a list of fields.
This method interpolates from a list of fields (ndarrays of shape (nelv, lz, ly, lx)).
- Parameters:
- tfloat
Time of the field data.
- field_listlist
List of fields to interpolate. Each field is an ndarray of shape (nelv, lz, ly, lx).
- commComm
MPI communicator.
- write_databool
If True, the interpolated data is written to a file. Default is True.
- field_nameslist
List of names of the interpolated fields. Useful when writing to file. Default is None.
Examples
This method can be used to interpolate fields from a list of fields. If you have previously obtained a set of fields u, v, w as ndarrays of shape (nelv, lz, ly, lx), you can:
>>> probes.interpolate_from_field_list(t, [u,v,w], comm)
The results are stored in probes.interpolated_fields attribute. Remember: the first column of this attribute is always the time t given.
io
There are multiple interfaces to perform IO operations. The main one relates to Nek5000/Neko files, which are read using the ppymech submodule. It is also possible to use ADIOS2 to, for example, exploit in-situ processing capabilities such as data streaming or compression. Support for HDF5 files in parallel is also provided, including VTKHDF files that can be displayed in ParaView. Finally, a set of wrappers to read and write data in a more user-friendly way is also included.
ppymech
Classes to read or write data from Nek style codes
- pysemtools.io.ppymech.preadnek(filename, comm, data_dtype=<class 'numpy.float64'>)
Read an fld file and return a pymech hexadata object (Parallel).
Main function for reading Nek-type fld files.
- Parameters:
- filenamestr
The filename of the fld file.
- commComm
MPI communicator.
- data_dtypestr
The data type of the data in the file. (Default value = “float64”).
- Returns:
- HexaData
The data read from the file in a pymech hexadata object.
Examples
>>> from mpi4py import MPI
>>> from pysemtools.io.ppymech.neksuite import preadnek
>>> comm = MPI.COMM_WORLD
>>> data = preadnek('field00001.fld', comm)
- pysemtools.io.ppymech.pwritenek(filename, data, comm)
Write an fld file from a pymech hexadata object (Parallel).
Main function for writing fld files.
- Parameters:
- filenamestr
The filename of the fld file.
- dataHexaData
The data to write to the file.
- commComm
MPI communicator.
Examples
Assuming you have a hexadata object already:
>>> from pysemtools.io.ppymech.neksuite import pwritenek
>>> pwritenek('field00001.fld', data, comm)
- pysemtools.io.ppymech.pynekread(filename, comm, data_dtype=<class 'numpy.float64'>, msh=None, fld=None, overwrite_fld=False)
Read a Nek file and populate pysemtools objects (Parallel).
Main function for reading Nek-type fld files.
- Parameters:
- filenamestr
The filename of the fld file.
- commComm
MPI communicator.
- data_dtypestr
The data type of the data in the file. (Default value = “float64”).
- mshMesh
The mesh object to put the data in. (Default value = None).
- fldField
The field object to put the data in. (Default value = None).
- overwrite_fldbool
Whether or not to overwrite the contents of fld. (Default value = False).
- Returns:
- None
Nothing is returned, the attributes are set in the object.
Examples
>>> from mpi4py import MPI
>>> from pysemtools.io.ppymech.neksuite import pynekread
>>> comm = MPI.COMM_WORLD
>>> msh = msh_c(comm)
>>> fld = field_c(comm)
>>> pynekread(fname, comm, msh=msh, fld=fld)
- pysemtools.io.ppymech.pynekread_field(filename, comm, data_dtype=<class 'numpy.float64'>, key='')
Read a Nek file and return the requested field (Parallel).
Main function for reading Nek-type fld files.
- Parameters:
- filenamestr
The filename of the fld file.
- commComm
MPI communicator.
- data_dtypestr
The data type of the data in the file. (Default value = “float64”).
- keystr
The key of the field to read. Typically “vel”, “pres”, “temp” or “scal_1”, “scal_2”, etc.
- Returns:
- list
The data read from the file in a list.
- pysemtools.io.ppymech.pynekwrite(filename, comm, msh=None, fld=None, wdsz=4, istep=0, write_mesh=True)
Write an fld file from pysemtools data types (Parallel).
Main function for writing fld files.
- Parameters:
- filenamestr
The filename of the fld file.
- commComm
MPI communicator.
- mshMesh
The mesh object to write to the file. (Default value = None).
- fldField
The field object to write to the file. (Default value = None).
- wdszint
The word size of the data in the file. (Default value = 4).
- istepint
The time step of the data. (Default value = 0).
- write_meshbool
If True, write the mesh data. (Default value = True).
Examples
Assuming a mesh object and field object are already present in the namespace:
>>> from pysemtools.io.ppymech.neksuite import pynekwrite
>>> pynekwrite('field00001.fld', comm, msh=msh, fld=fld)
ADIOS2
IO operations using ADIOS2.
- class pysemtools.io.adios2.DataCompressor(comm, mesh_info=None, wrd_size=4)
Class used to write compressed data to disk.
This assumes that the input data has a msh object available.
- Parameters:
- commComm
MPI communicator.
- mesh_infodict
Dictionary with mesh information.
- wrd_sizeint
Word size to write data. (Default value = 4). Single precision is 4, double is 8.
Methods
read(comm[, fname, variable_names])Read data from disk using adios2.
write(comm[, fname, variable_names, data])Write data to disk using adios2.
Examples
This class is used to write data to disk. The data is compressed using bzip2.
>>> mesh_info = {"glb_nelv": msh.glb_nelv, "lxyz": msh.lxyz, "gdim": msh.gdim}
>>> dc = DataCompressor(comm, mesh_info=mesh_info, wrd_size=4)
- read(comm, fname='compress.bp', variable_names=None)
Read data from disk using adios2.
Read compressed data and internally decompress it.
- Parameters:
- commComm
MPI communicator.
- fnamestr
File name to read. (Default value = “compress.bp”).
- variable_nameslist
List of strings with the names of the variables to read. These names NEED to match the names adios2 used to write the data.
- Returns:
- list
List of numpy arrays with the data read. The ndarrays in the list are 1D.
Examples
This function is used to read data from disk. The data is compressed using bzip2.
>>> variable_names = ["x", "y", "z"]
>>> data = dc.read(comm, fname="compress.bp", variable_names=variable_names)
- write(comm, fname='compress.bp', variable_names=None, data=None)
Write data to disk using adios2.
Lossless compression with bzip2.
- Parameters:
- commComm
MPI communicator.
- fnamestr
File name to write. (Default value = “compress.bp”).
- variable_nameslist
List of strings with the names of the variables to write. This is very important, as adios2 will use these names to process the data.
- datalist
List of numpy arrays with the data to write. Corresponding in index to variable_names. The arrays must be 1d.
Examples
This function is used to write data to disk. The data is compressed using bzip2.
>>> variable_names = ["x", "y", "z"]
>>> data = [msh.x, msh.y, msh.z]
>>> dc.write(comm, fname="compress.bp", variable_names=variable_names, data=data)
- class pysemtools.io.adios2.DataStreamer(comm, from_nek=True)
Class used to communicate data between codes using adios2.
Use this to send and receive data.
The data is always transported as a 1D array, so it is necessary to reshape it.
- Parameters:
- commComm
MPI communicator.
- from_nekbool
Define whether the data being streamed comes from a nek-like code. (Default value = True).
- Attributes:
- glb_nelvint
Total number of elements in the global domain.
- lxyzint
Number of points per element.
- gdimint
Problem dimension.
- nelvint
Number of elements that this rank has.
Methods
finalize()Finalize the execution of the module.
recieve([fld, variable])Receive data from another code using adios2.
stream(fld)Send data to another code using adios2.
Examples
This type must be paired with another stream in the other executable/code. The codes will not start if the streams are not paired.
A full example of receiving data is shown below. It is possible to pair with other classes to, for example, write data to disk.
>>> ds = data_streamer_c(comm)
>>> x = get_fld_from_ndarray(ds.recieve(), ds.lx, ds.ly, ds.lz, ds.nelv)  # Receive and reshape x
>>> y = get_fld_from_ndarray(ds.recieve(), ds.lx, ds.ly, ds.lz, ds.nelv)  # Receive and reshape y
>>> z = get_fld_from_ndarray(ds.recieve(), ds.lx, ds.ly, ds.lz, ds.nelv)  # Receive and reshape z
>>> ds.finalize()
>>> msh = msh_c(comm, x=x, y=y, z=z)
>>> write_fld_file_from_list("field0.f00001", comm, msh, [x, y, z])
To send data to the other code, use the stream method.
>>> ds.stream(x.reshape(x.size))
- finalize()
Finalize the execution of the module.
Used to close reader and writer. The stream will end and the code will not be coupled.
- recieve(fld=None, variable='f2py_field')
Receive data from another code using adios2.
The data is always transported as a 1D array, so it is necessary to reshape it to the desired shape.
- Parameters:
- fldndarray
Buffer to contain the data. If None, it will be created. (Default value = None).
- variablestr
Name of the adios2 variable to be read. Neko uses this default name; change as needed. This name can remain the same during execution even if different quantities are being transported. (Default value = “f2py_field”).
- Returns:
- ndarray
Returns the field that was received.
- stream(fld)
Send data to another code using adios2.
Couple 2 code or executables.
- Parameters:
- fldndarray
Field to be sent. Must be a 1d array.
HDF
Classes to read or write data in the hdf format
- class pysemtools.io.hdf.HDF5File(comm: Comm, fname: str, mode: str, parallel: bool)
Class to write and read hdf5 files in parallel using h5py.
Open an hdf5 file based on inputs.
- Parameters:
- commMPI.Comm
MPI communicator.
- fnamestr
Name of the hdf5 file to read or write.
- modestr
Mode to open the file. Should be “r” for reading or “w” for writing.
- parallelbool
Whether to use parallel I/O or not.
Methods
close([clean])Close the hdf5 file object
open(fname, mode, parallel)Open an hdf5 file based on inputs.
read_dataset(dataset_name[, dtype, ...])Read a dataset from the hdf5 file object
read_slices(dataset_name[, dtype])Read the slices hyperslabs from the file
set_active_group(group_name)Set the active group to read or write data from.
set_read_slices_external(global_shape, slices)Set the slices that should be read from the file based on external input.
set_read_slices_linear_lb(global_shape, ...)Set the slices that should be read from the file.
set_write_slices(local_shape, distributed_axis)Set the slices that should be written to the file.
write_dataset(dataset_name, data[, ...])Write a dataset to the hdf5 file object
write_slices(dataset_name, data[, shape_in_file])Write the hyperslab to the file.
- close(clean: bool = True)
Close the hdf5 file object
- Parameters:
- cleanbool
Whether to clean the attributes that are assigned when opening a file. This is useful if the file object will be reused to open another file after closing the current one. Default is True.
- open(fname: str, mode: str, parallel: bool)
Open an hdf5 file based on inputs.
This can be used to open a new file after closing the previous one.
- Parameters:
- fnamestr
Name of the hdf5 file to read or write.
- modestr
Mode to open the file. Should be “r” for reading or “w” for writing.
- parallelbool
Whether to use parallel I/O or not. If True, the file will be opened using the MPI-IO driver. If False, the file will be opened using the default driver.
- read_dataset(dataset_name: str, dtype: ~numpy.dtype = <class 'numpy.float64'>, distributed_axis: int | None = None, slices: list | None = None, as_array_list_in_file: bool = False, ignore_metadata: bool = False)
Read a dataset from the hdf5 file object
- Parameters:
- dataset_namestr
Name of the dataset to read. Can include the group path, e.g. “/group1/group2/dataset”.
- dtypenp.dtype
Data type to read the dataset in. Default is np.double.
- distributed_axisint
Axis along which the data is distributed in parallel. This is required for parallel reading. Default is None.
- sliceslist
Optional. List of slices to read from the dataset, in case they are known beforehand.
- as_array_list_in_filebool
Optional. Default is False. Whether the data is stored as an array list in the file. This is useful if the data originally had a different shape but was flattened to 1D before writing. The shape attribute stored in the file is then used for the partitioning, while accounting for the fact that the data is stored as a 1D array so it can be read properly.
- ignore_metadatabool
Optional. Default is False. Force reading the data ignoring any shape metadata. This will just read the arrays as stored and will not try to assume an original shape.
- Returns:
- local_datanp.ndarray
Data read from the file. This will be a local array with the shape determined by the global shape of the dataset and the parallel distribution. If slices are provided, the shape will be determined by the slices.
- read_slices(dataset_name: str, dtype: ~numpy.dtype = <class 'numpy.float64'>)
Read the slices hyperslabs from the file
- Parameters:
- dataset_namestr
Name of the dataset to read. Can include the group path, e.g. “/group1/group2/dataset”.
- dtypenp.dtype
Data type to read the dataset in. Default is np.double.
- Returns:
- local_datanp.ndarray
Data read from the file. This will be a local array with the shape determined by the global shape of the dataset and the parallel distribution. If slices are provided, the shape will be determined by the slices.
- set_active_group(group_name: str)
Set the active group to read or write data from.
This is useful to avoid having to specify the group every time a dataset is read or written.
- Parameters:
- group_namestr
Name of the group to set as active. Can include the group path, e.g. “/group1/group2”. If the group does not exist, it will be created if the file is opened in write mode, otherwise an error will be raised.
- set_read_slices_external(global_shape: tuple, slices: list)
Set the slices that should be read from the file based on external input.
The slices need to be precomputed in this case.
- Parameters:
- global_shapetuple
Shape of the global array to be read.
- sliceslist
List of slices to read from the data set.
- set_read_slices_linear_lb(global_shape: tuple, distributed_axis: int, explicit_strides: bool = False, shape_in_file: list | None = None)
Set the slices that should be read from the file.
Data is distributed in a linear load balanced way.
- Parameters:
- global_shapetuple
Shape of the global array to be read. This is required to determine the local shape and the slices to read from the file.
- distributed_axisint
Axis along which the data is distributed in parallel. This is required to determine the local shape and the slices to read from the file.
- explicit_stridesbool
Whether to use explicit strides to read the data. This is useful if the data is stored as 1D in the file but originally had a different shape.
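The linear load-balanced distribution along one axis can be sketched as follows (a hypothetical helper for illustration, not the internal pysemtools routine): rank r of P gets N // P entries of the distributed axis, plus one extra if r < N % P, and the resulting slice tuple addresses that rank's hyperslab.

```python
import numpy as np

def linear_lb_slices(global_shape, distributed_axis, rank, size):
    """Slice tuple selecting this rank's share of the distributed axis."""
    n = global_shape[distributed_axis]
    base, rem = divmod(n, size)
    start = rank * base + min(rank, rem)
    stop = start + base + (1 if rank < rem else 0)
    slices = [slice(None)] * len(global_shape)
    slices[distributed_axis] = slice(start, stop)
    return tuple(slices)

data = np.arange(10 * 4).reshape(10, 4)  # pretend this is the global array
chunks = [data[linear_lb_slices(data.shape, 0, r, 3)] for r in range(3)]
print([c.shape[0] for c in chunks])  # rows per "rank": [4, 3, 3]
```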
- set_write_slices(local_shape: tuple, distributed_axis: int, extra_global_entries: list[int] | None = None)
Set the slices that should be written to the file.
Obtain global shape from the local one
- Parameters:
- local_shapetuple
Shape of the local array to be written. This is required to determine the global shape and the slices to write to the file.
- distributed_axisint
Axis along which the data is distributed in parallel.
- extra_global_entrieslist[int], optional
List of extra entries to add to the global shape of the dataset. This is useful if the ranks are writing a certain amount of data but the global array should be bigger than what they collectively write. Default is None.
- write_dataset(dataset_name: str, data: ndarray, distributed_axis: int | None = None, extra_global_entries: list[int] | None = None, shape_in_ram: tuple | None = None)
Write a dataset to the hdf5 file object
- Parameters:
- dataset_namestr
Name of the dataset to write. Can include the group path, e.g. “/group1/group2/dataset”.
- datanp.ndarray
Data to write to the file.
- distributed_axisint
Axis along which the data is distributed in parallel. This is required for parallel writing. Default is None.
- extra_global_entrieslist[int]
Optional. List of extra entries to add to the global shape of the dataset. This is useful if the ranks are writing a certain amount of data but the global array should be bigger than what they collectively write.
- shape_in_ramtuple
Optional. Shape of the data in RAM. This is useful if the data is stored in a different shape than it originally had, for example in a 1D array. This shape is stored in the file in the attribute “shape” and can be used to reshape the data when reading it.
- write_slices(dataset_name: str, data: ndarray, shape_in_file: tuple | None = None)
Write the hyperslab to the file.
Perform the write operations
- Parameters:
- dataset_namestr
Name of the dataset to write. Can include the group path, e.g. “/group1/group2/dataset”.
- datanp.ndarray
Data to write to the file. This should have the same shape as the local shape determined by the set_write_slices method.
- shape_in_filetuple, optional
Shape of the data to be stored in the file. This is useful if the data is stored in a different shape in the file than it is in RAM.
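The set_write_slices / write_slices pattern corresponds to plain h5py hyperslab writes. A serial sketch of the idea (simulating two "ranks" in a loop, each writing a disjoint slice of one global dataset; the file name is arbitrary):

```python
import os
import tempfile

import h5py
import numpy as np

fname = os.path.join(tempfile.mkdtemp(), "slabs.h5")
global_shape = (4, 3)

with h5py.File(fname, "w") as f:
    dset = f.create_dataset("data", shape=global_shape, dtype="f8")
    # Each "rank" writes only its own hyperslab along axis 0.
    for rank, rows in [(0, slice(0, 2)), (1, slice(2, 4))]:
        local = np.full((rows.stop - rows.start, 3), float(rank))
        dset[rows, :] = local

with h5py.File(fname, "r") as f:
    print(f["data"][:])  # first two rows hold 0.0, last two rows hold 1.0
```

In the parallel case each MPI rank performs one such write on its own slice, which is exactly the bookkeeping set_write_slices does for you.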
- class pysemtools.io.hdf.VTKHDFFile(comm: Comm, fname: str, mode: str, parallel: bool)
Class to write and read vtkhdf files in parallel using h5py. Open an hdf5 file based on inputs.
- Parameters:
- commMPI.Comm
MPI communicator.
- fnamestr
Name of the hdf5 file to read or write.
- modestr
Mode to open the file. Should be “r” for reading or “w” for writing.
- parallelbool
Whether to use parallel I/O or not.
Methods
close([clean])Close the hdf5 file object
link_to_existing_mesh(mesh_name)Link to an existing mesh
open(fname, mode, parallel)Open an hdf5 file based on inputs.
read_dataset(dataset_name[, dtype, ...])Read a dataset from the hdf5 file object
read_mesh_data([dtype, distributed_axis])Read the mesh data from the hdf5 file
read_point_data(dataset_name[, dtype, ...])Read point data from the hdf5 file
read_slices(dataset_name[, dtype])Read the slices hyperslabs from the file
set_active_group(group_name)Set the active group to read or write data from.
set_read_slices_external(global_shape, slices)Set the slices that should be read from the file based on external input.
set_read_slices_linear_lb(global_shape, ...)Set the slices that should be read from the file.
set_write_slices(local_shape, distributed_axis)Set the slices that should be written to the file.
write_dataset(dataset_name, data[, ...])Write a dataset to the hdf5 file object
write_mesh_data(x, y, z[, distributed_axis])Write the mesh data to the hdf5 file
write_point_data(dataset_name, data[, ...])Write point data to the hdf5 file
write_slices(dataset_name, data[, shape_in_file])Write the hyperslab to the file.
- close(clean: bool = True)
Close the hdf5 file object
- Parameters:
- cleanbool
Whether to clean up the file after closing. This will delete the file from disk. Should only be used for testing.
- link_to_existing_mesh(mesh_name: str)
Link to an existing mesh
Avoid rewriting mesh data if not necessary. It can be quite costly in storage.
- Parameters:
- mesh_namestr
Name of the hdf5 file to link to.
- read_mesh_data(dtype: ~numpy.dtype = <class 'numpy.float64'>, distributed_axis: int = 0)
Read the mesh data from the hdf5 file
- Parameters:
- dtypenp.dtype
Data type to read the mesh data as. Should be a floating point type.
- distributed_axisint
Axis along which the data is distributed in parallel. Should be 0 for now.
- Returns:
- xnp.ndarray
The x coordinates of the mesh points.
- ynp.ndarray
The y coordinates of the mesh points.
- znp.ndarray
The z coordinates of the mesh points.
- read_point_data(dataset_name: str, dtype: ~numpy.dtype = <class 'numpy.float64'>, distributed_axis: int = 0)
Read point data from the hdf5 file
- Parameters:
- dataset_namestr
Name of the dataset to read. This should be the name of the dataset in the hdf5 file.
- dtypenp.dtype
Data type to read the dataset as. Should be a floating point type.
- distributed_axisint
Axis along which the data is distributed in parallel. Should be 0 for now.
- Returns:
- np.ndarray
The point data read from the hdf5 file. Will have the same shape as the mesh points.
- write_mesh_data(x: ndarray, y: ndarray, z: ndarray, distributed_axis: int = 0)
Write the mesh data to the hdf5 file
- Parameters:
- xnp.ndarray
The x coordinates of the mesh points.
- ynp.ndarray
The y coordinates of the mesh points.
- znp.ndarray
The z coordinates of the mesh points.
- distributed_axisint
Axis along which the data is distributed in parallel. Should be 0 for now.
- write_point_data(dataset_name: str, data: ndarray, distributed_axis: int = 0)
Write point data to the hdf5 file
- Parameters:
- dataset_namestr
Name of the dataset to write. This will be used as the name of the dataset in the hdf5 file.
- datanp.ndarray
Data to write. Should have the same number of points as the mesh.
- distributed_axisint
Axis along which the data is distributed in parallel. Should be 0 for now.
Catalyst2
Classes to interface with paraview - catalyst
- class pysemtools.io.catalyst.CatalystSession(comm: Comm, pipeline: str, channel: str, implementation_name=None, implementation_path: str | None = None)
Class to interface with catalyst
Interface with catalyst to set up a session and load the pipeline.
- Parameters:
- commMPI.Comm
- pipelinestr
Path to the saved ParaView Catalyst state file (e.g. “pipeline.py”).
- channelstr
Name of the catalyst channel to use (must match the registrationName in your saved state script, e.g. “rbc00001.vtkhdf”).
- implementation_namestr
Optional name of the catalyst implementation to use. Also set with env variable CATALYST_IMPLEMENTATION_NAME e.g. export CATALYST_IMPLEMENTATION_NAME=paraview
- implementation_pathstr
Optional path to the catalyst implementation. Also set with env variable CATALYST_IMPLEMENTATION_PATH e.g. export CATALYST_IMPLEMENTATION_PATH=/path/to/paraview/lib/catalyst
Methods
execute([timestep, time_value])Execute the catalyst pipeline for the current mesh and fields
finalize()Finalize the catalyst session
set_field(fields)Set the fields for the catalyst session / conduit node
set_mesh(x, y, z[, cell_type])Set the mesh for the catalyst session / conduit node
- execute(timestep=0, time_value=0.0)
Execute the catalyst pipeline for the current mesh and fields
- Parameters:
- timestep: int
Current timestep to set in the catalyst state.
- time_value: float
Current time value to set in the catalyst state.
- finalize()
Finalize the catalyst session
- set_field(fields: dict[str, ndarray])
Set the fields for the catalyst session / conduit node
- Parameters:
- fields: dict[str, np.ndarray]
Dictionary of field name to field values.
- set_mesh(x: ndarray, y: ndarray, z: ndarray, cell_type: str = 'hex')
Set the mesh for the catalyst session / conduit node
- Parameters:
- x: np.ndarray
array with local x coordinates on this rank.
- y: np.ndarray
array with local y coordinates on this rank.
- z: np.ndarray
array with local z coordinates on this rank.
- cell_type: str
cell type, currently only “hex” is supported.
wrappers
Wrappers to ease IO
- pysemtools.io.wrappers.partition_read_data(comm, fname: str | None = None, distributed_axis: int = 0)
Generate partition information for hdf5 files. Useful when reading or writing multiple files with the same partitioning, so that the read/write functions do not need to recompute the partition every time.
- Parameters:
- commMPI.Comm
The MPI communicator
- fnamestr
The name of the file to read
- distributed_axisint, optional
The axis along which the data is distributed, by default 0. This is used to determine how many elements to read from the file in parallel.
- Returns:
- list
A list of slices corresponding to the local data to be read by each process
- pysemtools.io.wrappers.read_data(comm, fname: str, keys: list[str], parallel_io: bool = False, dtype=<class 'numpy.float32'>, distributed_axis: int = 0, slices: list | None = None)
Read data from a file and return a dictionary with the names of the files and keys
- Parameters:
- comm, MPI.Comm
The MPI communicator
- fnamestr
The name of the file to read
- keyslist[str]
The keys to read from the file
- parallel_iobool, optional
If True, read the file in parallel, by default False. This is aimed at hdf5 files, and it currently does not work when set to True.
- dtypenp.dtype, optional
The data type of the data to read, by default np.single
- distributed_axisint, optional
The axis along which the data is distributed, by default 0. This is used to determine how many elements to read from the file in parallel.
- sliceslist, optional
A list of slices to read from the file. If None, the local data will be read based on the distributed_axis and the communicator. If provided, it should match the number of dimensions in the data. Note that if you are reading in parallel, the slices should be provided in such a way that they correspond to the local data on each process, otherwise the data will be replicated. If in doubt, do not provide slices, and the local data will be determined automatically.
- Returns:
- dict
A dictionary with the keys and the data read from the file
- pysemtools.io.wrappers.write_data(comm, fname: str, data_dict: dict[str, ~numpy.ndarray], parallel_io: bool = False, dtype=<class 'numpy.float32'>, msh: ~pysemtools.datatypes.msh.Mesh | list[~numpy.ndarray] | None = None, write_mesh: bool = False, distributed_axis: int = 0, uniform_shape: bool = False)
Write data to a file
- Parameters:
- comm, MPI.Comm
The MPI communicator
- fnamestr
The name of the file to write
- data_dictdict
The data to write to the file
- parallel_iobool, optional
If True, write the file in parallel, by default False. This is aimed at hdf5 files, and it currently does not work when set to True.
- dtypenp.dtype, optional
- mshMesh, optional
The mesh object to write to a fld file, by default None
- write_meshbool, optional
Only valid for writing fld files
- distributed_axisint, optional
The axis along which the data is distributed, by default 0
- uniform_shapebool, optional
If True, the global shape of the data is assumed to be uniform, by default False
monitoring
Monitoring tools for pySEMTools
- class pysemtools.monitoring.Logger(level=None, comm=None, module_name=None)
Class that takes charge of logging messages
This class takes care of logging only in one rank and setting different logging levels.
Generally, the levels are set using the environment variables PYSEMTOOLS_DEBUG and PYSEMTOOLS_HIDE_LOG.
- Parameters:
- levelint, optional
Logging level. The default is None, which sets it to logging.INFO.
- commMPI.Comm
MPI communicator.
- module_namestr, optional
Name of the module that is using the logger. The default is None.
- Attributes:
- loglogging.Logger
Logger object that handles the logging
- commMPI.Comm
MPI communicator
- sync_timedict
Dictionary to store the times for sync_tic/sync_toc methods
- timefloat
Variable to store the time for tic/toc methods
Methods
write(level, message)
Method that writes messages in the log
tic()
Store the current time.
toc()
Write elapsed time since the last call to tic.
sync_tic(id=0)
Store the current time.
sync_toc(id=0, message=None, time_message=”Elapsed time: “)
Write elapsed time since the last call to tic.
- sync_tic(id=0)
Store the current time in the attribute sync_time with key id.
- Returns:
- None.
- sync_toc(id=0, message=None, time_message='Elapsed time: ', level='info')
Write elapsed time since the last call to tic for the given id in the sync_time attribute.
- tic()
Store the current time.
- Returns:
- None.
- toc(message=None, time_message='Elapsed time: ', level='info')
Write elapsed time since the last call to tic.
- write(level, message)
Writes messages in the log
- Parameters:
- levelstr
Level of the message. Possible values are: “debug_all”, “debug”, “info”, “info_all”, “warning”, “error”, “critical”.
- messagestr
Message to be logged.
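The idea of logging only in one rank can be sketched with the standard library alone. This is a conceptual illustration, not the pysemtools Logger implementation; the helper name and formatting are hypothetical:

```python
import io
import logging

def make_rank_logger(rank: int, module_name: str = "demo"):
    """Minimal sketch of rank-aware logging: only rank 0 emits INFO
    messages, so parallel runs produce readable output.
    (Conceptual only; the actual pysemtools Logger differs.)"""
    stream = io.StringIO()
    log = logging.getLogger(f"{module_name}.rank{rank}")
    # Non-root ranks are raised to CRITICAL, silencing normal messages.
    log.setLevel(logging.INFO if rank == 0 else logging.CRITICAL)
    log.propagate = False
    log.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))
    log.addHandler(handler)
    return log, stream

log0, out0 = make_rank_logger(rank=0)
log1, out1 = make_rank_logger(rank=1)
log0.info("mesh read")
log1.info("mesh read")  # suppressed: not rank 0
```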
rom
Set of tools to perform reduced order modeling tasks. Particularly, proper orthogonal decomposition (POD)
- class pysemtools.rom.IoHelp(comm, number_of_fields=1, batch_size=1, field_size=1, field_data_type=<class 'numpy.float64'>, mass_matrix_data_type=<class 'numpy.float64'>, module_name='io_helper')
Class used to help with IO buffering
This class contains buffers to be used when carrying out, for example, a POD.
- Parameters:
- commMPI communicator
MPI communicator object.
- number_of_fieldsint
Number of fields to be stored in the buffer.
- batch_sizeint
Size of the buffer.
- field_sizeint
Size of each field.
- field_data_typedata-type, optional
Data type of the fields. The default is np.double.
- mass_matrix_data_typedata-type, optional
Data type of the mass matrix. The default is np.double.
- module_namestr, optional
Name of the module for logging purposes. The default is “io_helper”.
Methods
copy_fieldlist_to_xi([field_list])Copy a field list into the buffer xi position 0
load_buffer([scale_snapshot])Function to load snapshot into the allocated buffer.
split_narray_to_1dfields(array)Split a snapshot into a set of fields.
- copy_fieldlist_to_xi(field_list=None)
Copy a field list into the buffer xi position 0
This is used to turn multi-dimensional data into one big column vector that serves as a snapshot for the POD.
- Parameters:
- field_listlist of np.ndarray
List of fields to be copied into xi.
- Returns:
- None
- load_buffer(scale_snapshot=True)
Function to load snapshot into the allocated buffer.
This transfers the data from xi into the buffer at the current buffer index.
If the buffer is full, it sets the flag to update from buffer to True.
This is the buffer that the POD class uses to compute.
- Parameters:
- scale_snapshotbool, optional
If True, the snapshot is scaled with the mass matrix before being loaded into the buffer. The default is True.
- Returns:
- None
- split_narray_to_1dfields(array)
Split a snapshot into a set of fields.
This is essentially the inverse of copy_fieldlist_to_xi: a big column vector is split back into a list of fields.
- Parameters:
- arraynp.ndarray
Array to be split into fields.
- Returns:
- field_list1dlist of np.ndarray
List of fields obtained from splitting the array.
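The pack/unpack pattern behind copy_fieldlist_to_xi and split_narray_to_1dfields can be illustrated with plain numpy. The helper names below are hypothetical stand-ins for the buffer logic, not library code:

```python
import numpy as np

def fields_to_column(field_list):
    """Flatten each field and stack them into one column vector,
    as copy_fieldlist_to_xi does conceptually (sketch only)."""
    return np.concatenate([f.reshape(-1) for f in field_list])[:, None]

def column_to_fields(column, shapes):
    """Inverse operation: cut the column vector back into fields with
    the given shapes (cf. split_narray_to_1dfields)."""
    flat = column.reshape(-1)
    fields, start = [], 0
    for shape in shapes:
        n = int(np.prod(shape))
        fields.append(flat[start:start + n].reshape(shape))
        start += n
    return fields

u = np.arange(12.0).reshape(3, 4)
v = np.arange(6.0).reshape(2, 3)
xi = fields_to_column([u, v])          # one (18, 1) snapshot column
u2, v2 = column_to_fields(xi, [u.shape, v.shape])
```

The round trip recovers the original fields exactly, which is the property the IO helper relies on.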
- class pysemtools.rom.POD(comm, number_of_modes_to_update=1, global_updates=True, auto_expand=False, auto_expand_from_these_modes=1, bckend='numpy')
Class that wraps the SVD to facilitate the use of the POD
This class performs the POD in parallel or locally, and allows for incremental updates of the modes.
This wraps the SVD class to provide a more user-friendly interface for performing POD.
- Parameters:
- commMPI communicator
MPI communicator object.
- number_of_modes_to_updateint
Number of modes to update during each update step. The default is 1.
- global_updatesbool, optional
If True, perform global updates. If False, perform local updates. The default is True.
- auto_expandbool, optional
If True, automatically expand the number of modes when the orthogonality criterion is not met. The default is False.
- auto_expand_from_these_modesint, optional
Number of modes from which to start the automatic expansion. The default is 1.
- bckendstr, optional
Backend to be used for the math operations. The default is “numpy”. “torch” can be used if PyTorch is installed.
Methods
check_snapshot_orthogonality(comm[, xi])Check the level of orthogonality of the new snapshot with the current basis
rotate_local_modes_to_global(comm)Do a rotation of the current modes into a global basis
scale_modes(comm[, bm1sqrt, op])Scale the current modes with the given mass matrix and the provided operation (div or mult)
update(comm[, buff])Update POD modes from a batch of snapshots in buff
- Returns:
- None
- check_snapshot_orthogonality(comm, xi=None)
Check the level of orthogonality of the new snapshot with the current basis
The data is stored in the running_ra attribute. This is valid only if auto expansion is activated. In that case, if the orthogonality ratio is above the minimun_orthogonality_ratio attribute and the current number of modes is below the set k, the number of modes k is increased by one. The minimum orthogonality ratio is set to 0.99 by default.
- Parameters:
- commMPI communicator
MPI communicator object.
- xinp.ndarray
New snapshot to be checked.
- Returns:
- None
- rotate_local_modes_to_global(comm)
Do a rotation of the current modes into a global basis
This is needed if the POD is initially performed locally. The rotation is done only when required.
- Parameters:
- commMPI communicator
MPI communicator object.
- Returns:
- None
- scale_modes(comm, bm1sqrt=None, op='div')
Scale the current modes with the given mass matrix and the provided operation (div or mult)
- Parameters:
- commMPI communicator
MPI communicator object.
- bm1sqrtnp.ndarray
Mass matrix to scale the modes.
- opstr, optional
Operation to be performed. “div” for division, “mult” for multiplication. The default is “div”.
- Returns:
- None
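The scaling that scale_modes applies can be sketched element-wise with numpy. The function below is a hypothetical reimplementation for illustration, assuming bm1sqrt is a per-point weight vector that broadcasts over the mode columns:

```python
import numpy as np

def scale_modes_sketch(modes, bm1sqrt, op="div"):
    """Element-wise scaling of POD modes with the square root of the
    mass matrix (sketch of scale_modes, not library code).
    'mult' applies the mass weighting, 'div' removes it."""
    if op == "div":
        return modes / bm1sqrt[:, None]
    if op == "mult":
        return modes * bm1sqrt[:, None]
    raise ValueError("op must be 'div' or 'mult'")

rng = np.random.default_rng(0)
modes = rng.standard_normal((8, 3))          # 8 points, 3 modes
bm1sqrt = np.sqrt(rng.uniform(0.5, 2.0, 8))  # sqrt of lumped mass matrix
unweighted = scale_modes_sketch(modes, bm1sqrt, op="div")
restored = scale_modes_sketch(unweighted, bm1sqrt, op="mult")
```

Because the two operations are exact inverses, dividing and then multiplying by the same weights recovers the original modes.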
- update(comm, buff=None)
Update POD modes from a batch of snapshots in buff
- Parameters:
- commMPI communicator
MPI communicator object.
- buffnp.ndarray
Buffer containing the new snapshots to update the modes.
- Returns:
- None
- class pysemtools.rom.SVD(logger, bckend='numpy')
Class used to obtain parallel and streaming SVD results
This is the main class of the ROM module; POD wraps around it.
- Parameters:
- loggerLogger
Logger object to be used for logging.
- bckendstr, optional
Backend to be used for the math operations. The default is “numpy”. “torch” can be used if PyTorch is installed.
Methods
gbl_svd(xi, comm)Perform a global SVD that contains all necessary rotations for global modes
gbl_svd_numpy(xi, comm)Perform a global SVD using numpy, including necessary rotations for global modes using numpy
gbl_svd_torch(xi, comm)Perform a global SVD using PyTorch, including necessary rotations for global modes.
gbl_update(u_1t, d_1t, vt_1t, xi, k, comm)Method to update the global svds from a batch of data
gbl_update_numpy(u_1t, d_1t, vt_1t, xi, k, comm)Method to update the global svds from a batch of data using numpy
gbl_update_torch(u_1t, d_1t, vt_1t, xi, k, comm)Method to update the global svds from a batch of data using PyTorch
lcl_to_gbl_svd(uii, dii, vtii, k_set, comm)Perform rotations to obtain global modes from local modes.
lcl_update(u_1t, d_1t, vt_1t, xi, k)Method to update the local svds from a batch of data
- Returns:
- None
- gbl_svd(xi, comm)
Perform a global SVD that contains all necessary rotations for global modes
- Parameters:
- xinp.ndarray
Data to perform the svd on.
- commMPI communicator
MPI communicator object.
- Returns:
- u_localnp.ndarray
Local left singular vectors after global rotation. Here “local” means that the modes are distributed across ranks; the modes themselves are global.
- dynp.ndarray
Global singular values.
- vtynp.ndarray
Global right singular vectors.
- gbl_svd_numpy(xi, comm)
Perform a global SVD using numpy, including necessary rotations for global modes using numpy
- Parameters:
- xinp.ndarray
Data to perform the svd on.
- commMPI communicator
MPI communicator object.
- Returns:
- u_localnp.ndarray
Local left singular vectors after global rotation. Here “local” means that the modes are distributed across ranks; the modes themselves are global.
- dynp.ndarray
Global singular values.
- vtynp.ndarray
Global right singular vectors.
- gbl_svd_torch(xi, comm)
Perform a global SVD using PyTorch, including necessary rotations for global modes.
- Parameters:
- xinp.ndarray or torch.Tensor
Data to perform the svd on.
- commMPI communicator
MPI communicator object.
- Returns:
- u_localtorch.Tensor
Local left singular vectors after global rotation. Here “local” means that the modes are distributed across ranks; the modes themselves are global.
- dytorch.Tensor
Global singular values.
- vtytorch.Tensor
Global right singular vectors.
- gbl_update(u_1t, d_1t, vt_1t, xi, k, comm)
Method to update the global svds from a batch of data
- Parameters:
- u_1tnp.ndarray
Left singular vectors from previous update.
- d_1tnp.ndarray
Singular values from previous update.
- vt_1tnp.ndarray
Right singular vectors from previous update.
- xinp.ndarray
New data to be used for the update.
- kint
Number of modes to be kept.
- commMPI communicator
MPI communicator object.
- Returns:
- u_1tnp.ndarray
Updated left singular vectors.
- d_1tnp.ndarray
Updated singular values.
- vt_1tnp.ndarray
Updated right singular vectors.
Notes
No need to delete xi, as this comes from a buffer that is pre allocated
- gbl_update_numpy(u_1t, d_1t, vt_1t, xi, k, comm)
Method to update the global svds from a batch of data using numpy
- Parameters:
- u_1tnp.ndarray
Left singular vectors from previous update.
- d_1tnp.ndarray
Singular values from previous update.
- vt_1tnp.ndarray
Right singular vectors.
- xinp.ndarray
New data to be used for the update.
- kint
Number of modes to be kept.
- commMPI communicator
MPI communicator object.
- Returns:
- u_1tnp.ndarray
Updated left singular vectors.
- d_1tnp.ndarray
Updated singular values.
- vt_1tnp.ndarray
Updated right singular vectors.
Notes
No need to delete xi, as this comes from a buffer that is pre allocated
- gbl_update_torch(u_1t, d_1t, vt_1t, xi, k, comm)
Method to update the global svds from a batch of data using PyTorch
- Parameters:
- u_1ttorch.Tensor
Left singular vectors from previous update.
- d_1ttorch.Tensor
Singular values from previous update.
- vt_1ttorch.Tensor
Right singular vectors.
- xinp.ndarray or torch.Tensor
New data to be used for the update.
- kint
Number of modes to be kept.
- commMPI communicator
MPI communicator object.
- Returns:
- u_1ttorch.Tensor
Updated left singular vectors.
- d_1ttorch.Tensor
Updated singular values.
- vt_1ttorch.Tensor
Updated right singular vectors.
Notes
No need to delete xi, as this comes from a buffer that is pre allocated
- lcl_to_gbl_svd(uii, dii, vtii, k_set, comm)
Perform rotations to obtain global modes from local modes.
- Parameters:
- uiinp.ndarray
Local left singular vectors.
- diinp.ndarray
Local singular values.
- vtiinp.ndarray
Local right singular vectors.
- k_setint
Number of modes to keep globally. This is different from k because, in local updates, each rank can have a different k; together they provide ALL global modes.
- commMPI communicator
MPI communicator object.
- Returns:
- ui_globalnp.ndarray
Global left singular vectors.
- dynp.ndarray
Global singular values.
- vtynp.ndarray
Global right singular vectors.
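The rotation performed by lcl_to_gbl_svd follows the standard identity for a row-partitioned SVD. A serial numpy sketch with two simulated “ranks” (assuming each rank owns a row block of the snapshot matrix; this is an illustration of the algorithm, not the library's code):

```python
import numpy as np

rng = np.random.default_rng(1)
# Two "ranks" each own a row block of the global snapshot matrix.
a1 = rng.standard_normal((6, 4))
a2 = rng.standard_normal((5, 4))

# Step 1: each rank computes its local SVD independently.
u1, d1, vt1 = np.linalg.svd(a1, full_matrices=False)
u2, d2, vt2 = np.linalg.svd(a2, full_matrices=False)

# Step 2: stack the small matrices D_i @ Vt_i (this is what would be
# communicated between ranks) and take their SVD to obtain the global
# singular values and right singular vectors.
stacked = np.vstack([np.diag(d1) @ vt1, np.diag(d2) @ vt2])
uy, dy, vty = np.linalg.svd(stacked, full_matrices=False)

# Step 3: each rank rotates its local left vectors into the global basis.
r1 = d1.size
ug1 = u1 @ uy[:r1, :]   # rows of the global modes owned by "rank" 1
ug2 = u2 @ uy[r1:, :]   # rows of the global modes owned by "rank" 2
```

Only the small D_i Vt_i factors need to be gathered, so the communication volume is independent of the number of mesh points each rank holds.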
- lcl_update(u_1t, d_1t, vt_1t, xi, k)
Method to update the local svds from a batch of data
This method does not require communication.
- Parameters:
- u_1tnp.ndarray
Left singular vectors from previous update.
- d_1tnp.ndarray
Singular values from previous update.
- vt_1tnp.ndarray
Right singular vectors from previous update.
- xinp.ndarray
New data to be used for the update.
- kint
Number of modes to be kept.
- Returns:
- u_1tnp.ndarray
Updated left singular vectors.
- d_1tnp.ndarray
Updated singular values.
- vt_1tnp.ndarray
Updated right singular vectors.
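The communication-free update in lcl_update follows the classic incremental (streaming) SVD scheme: project the new batch onto the current basis, orthogonalize the residual, rotate via the SVD of a small core matrix, and truncate back to k modes. A numpy sketch of that scheme (a hypothetical reimplementation, not the library's exact code):

```python
import numpy as np

def incremental_svd_update(u, d, vt, xi, k):
    """One streaming-SVD update: extend the factorization
    U @ diag(d) @ Vt with a new batch of columns xi, then truncate to
    k modes. Sketch of the algorithm behind lcl_update."""
    m_proj = u.T @ xi                       # projection onto current modes
    resid = xi - u @ m_proj                 # part of xi outside the basis
    q, r = np.linalg.qr(resid)              # orthonormal complement
    k_old, p = d.size, xi.shape[1]
    # Small core matrix whose SVD rotates the old and new directions.
    core = np.block([[np.diag(d), m_proj],
                     [np.zeros((q.shape[1], k_old)), r]])
    uc, dc, vtc = np.linalg.svd(core, full_matrices=False)
    u_new = np.hstack([u, q]) @ uc[:, :k]
    d_new = dc[:k]
    v_aug = np.block([[vt, np.zeros((k_old, p))],
                      [np.zeros((p, vt.shape[1])), np.eye(p)]])
    vt_new = vtc[:k, :] @ v_aug
    return u_new, d_new, vt_new

rng = np.random.default_rng(2)
x1 = rng.standard_normal((6, 3))
x2 = rng.standard_normal((6, 3))
u, d, vt = np.linalg.svd(x1, full_matrices=False)
# With k equal to the full rank of [x1, x2] (here 6), the update is exact;
# in practice k is much smaller and the update is a truncation.
u, d, vt = incremental_svd_update(u, d, vt, x2, k=6)
```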