API

PySEMTools is a collection of modules that together provide a comprehensive toolkit for working with spectral element method (SEM) data.

The presently included submodules are:

pysemtools.cli

Command line interface functionalities available in pySEMTools.

pysemtools.comm

Communication utilities for parallel processing

pysemtools.datatypes

Data types to hold and operate with SEM data.

pysemtools.interpolation

High-order interpolation routines for SEM data.

pysemtools.io.ppymech

Classes to read or write data from Nek-style codes.

pysemtools.io.adios2

IO operations using ADIOS2.

pysemtools.io.hdf

Classes to read or write data in the HDF5 format.

pysemtools.io.catalyst

Classes to interface with ParaView Catalyst.

pysemtools.io.wrappers

Wrappers to ease IO

pysemtools.monitoring

Monitoring tools for pySEMTools

pysemtools.rom

Set of tools to perform reduced order modeling tasks.

The compression module inside the package is still considered to be experimental and is prone to change, therefore its documentation is not yet included here.

The post-processing module is in a similar state. It builds on these tools to perform more specific post-processing tasks. It is still under development and its API is prone to change, so its documentation is reserved for a future release.

The full API documentation for all the classes and functions is provided below.


cli

Command line interface functionalities available in pySEMTools.

pysemtools.cli.extract_subdomain()

Extract a subdomain from an existing file.

This script extracts a box-shaped subdomain from a given field file and saves it to a new file.

Parameters:
- -input_file: Path to the input field file.
- -output_file: Path to the output field file where the extracted subdomain will be saved
- -bounds: Comma-separated list of 6 floats defining the subdomain bounds: xmin,xmax,ymin,ymax,zmin,zmax
- -fields: (Optional) Comma-separated list of field names to extract. If not provided, all fields will be extracted.

Examples

To use this script, run it with MPI using the following command:

>>> mpirun -n <num_processes> python pysemtools_extract_subdomain --input_file <input.fld> --output_file <output.fld> --bounds=<xmin,xmax,ymin,ymax,zmin,zmax> [--fields=<field1,field2,...>]

Replace the placeholders with actual values; note that the angle brackets must be removed.

pysemtools.cli.index_files()

Create a JSON file that contains an index of files in a folder.

This script creates an index of files in a specified folder and saves it as a JSON file. The index can include file contents and time intervals based on user preferences.

Parameters:
- -folder_path: Path to the folder containing files to index.
- -output_folder: Path to the output folder.
- -file_type: (Optional) List of file types to index.
- -include_file_contents: (Optional) Boolean flag to include file contents in the index
- -include_time_interval: (Optional) Boolean flag to include time interval in the index.
- -run_start_time: (Optional) Float indicating the start time of the run.
- -stat_start_time: (Optional) Float indicating the start time of the statistics.

Examples

To use this script, run the following command:

>>> pysemtools_index_files --folder_path <path_to_folder> --output_folder <output_folder> [--file_type <file_type1,file_type2,...>] [--include_file_contents] [--include_time_interval] [--run_start_time <float>] [--stat_start_time <float>]

Replace the placeholders with actual values; note that the angle brackets must be removed.

pysemtools.cli.visnek()

Create a nek5000 file template from existing files. Used to visualize nek5000 files in VisIt.

This script generates a nek5000 file template based on existing files in the current directory. The generated file contains information about the file naming pattern, first timestep, and number of timesteps.

Parameters:
filename_pattern: Pattern to match the nek5000 files.

Examples

To use this script, run it with the following command:

>>> pysemtools_visnek <filename_pattern>

Replace the placeholder with the actual filename pattern; note that the angle brackets must be removed.

comm

Communication utilities for parallel processing

class pysemtools.comm.Router(comm)

This class can be used to handle communication between ranks in an MPI communicator.

With it, one can send and receive data to/from any rank specified in a destination list.

Parameters:
comm : MPI communicator

The MPI communicator that is used for the communication.

Attributes:
comm : MPI communicator

The MPI communicator that is used for the communication.

destination_count : ndarray

Buffer specifying how many points this rank sends to each rank.

source_count : ndarray

Buffer specifying how many points this rank receives from each rank.

Methods

all_gather([data, dtype])

Gathers data from all processes to all processes.

all_to_all([destination, data, dtype])

Sends data to specified destinations and receives data from whoever sent.

gather_in_root([data, root, dtype])

Gathers data from all processes to the root process.

scatter_from_root([data, sendcounts, root, ...])

Scatters data from the root process to all other processes.

send_recv([destination, data, dtype, tag])

Sends data to specified destinations and receives data from whoever sent.

transfer_data(comm_pattern, **kwargs)

Moves data between ranks in the specified pattern.

Notes

The data is always flattened before sending, and received data is likewise flattened. The user must reshape the data after receiving it.
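The flatten/reshape round trip described above can be sketched in plain numpy (assuming, for illustration, that points are stored as rows of 3 coordinates, as in the examples in this section):

```python
import numpy as np

# A rank's local block of 5 points with 3 coordinates each.
local_data = np.arange(15, dtype=np.double).reshape(5, 3)

# The Router flattens data before sending; the receiver gets a 1D buffer.
sendbuf = local_data.flatten()

# After receiving, restore the point layout by reshaping.
received = sendbuf.reshape((-1, 3))
```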

Examples

To initialize, simply pass the communicator:

>>> from mpi4py import MPI
>>> from pysemtools.comm.router import Router
>>> comm = MPI.COMM_WORLD
>>> rt = Router(comm)
all_gather(data=None, dtype=None)

Gathers data from all processes to all processes.

This is a wrapper to the MPI Allgatherv function.

Parameters:
data : ndarray

Data that is gathered in all processes.

dtype : dtype

The data type of the data that is gathered.

Returns:
recvbuf : ndarray

The gathered data in all processes. The data is always received flattened; the user must reshape it.

sendcounts : ndarray

The number of entries that were sent from each rank.

Examples

To gather data from all ranks onto all ranks, do the following:

>>> rt = Router(comm)
>>> local_data = np.ones(((rank+1)*10, 3), dtype=np.double)*rank
>>> recvbf, sendcounts = rt.all_gather(data = local_data, dtype = np.double)
all_to_all(destination=None, data=None, dtype=None, **kwargs)

Sends data to specified destinations and receives data from whoever sent.

In this instance, the all-to-all collective is used.

Parameters:
destination : list

A list with the rank ids that the data should be sent to.

data : list or ndarray

The data that will be sent. If it is a list, each entry is sent to the corresponding index in the destination list. If the data is an ndarray, the same data is sent to all destinations.

dtype : dtype

The data type of the data that is sent.

Returns:
sources : list

A list with the rank ids that the data was received from.

recvbuff : list

A list with the received data, stored in the same order as the sources.

Notes

Extra keyword arguments are ignored. This is to keep the same interface as the send_recv method.

Examples

To send and receive data between ranks, do the following:

>>> rt = Router(comm)
>>> local_data = np.zeros(((rank+1)*10, 3), dtype=np.double)
>>> destination = [rank + 1, rank + 2]
>>> for i, dest in enumerate(destination):
...     if dest >= size:
...         destination[i] = dest - size
>>> sources, recvbf = rt.all_to_all(destination=destination,
...                                 data=local_data, dtype=np.double)
>>> for i in range(0, len(recvbf)):
...     recvbf[i] = recvbf[i].reshape((-1, 3))
gather_in_root(data=None, root=0, dtype=None)

Gathers data from all processes to the root process.

This is a wrapper to the MPI Gatherv function.

Parameters:
data : ndarray

Data that is gathered in the root process.

root : int

The rank that will gather the data.

dtype : dtype

The data type of the data that is gathered.

Returns:
recvbuf : ndarray

The gathered data in the root process. The data is always received flattened; the user must reshape it.

sendcounts : ndarray

The number of entries that were sent from each rank.

Examples

To gather data from all ranks to the root rank, do the following:

>>> rt = Router(comm)
>>> local_data = np.ones(((rank+1)*10, 3), dtype=np.double)*rank
>>> recvbf, sendcounts = rt.gather_in_root(data=local_data,
...                                        root=0, dtype=np.double)
scatter_from_root(data=None, sendcounts=None, root=0, dtype=None)

Scatters data from the root process to all other processes.

This is a wrapper to the MPI Scatterv function.

Parameters:
data : ndarray

The data that is scattered to all processes.

sendcounts : ndarray, optional

The number of entries that are sent to each process. If not specified, the data is divided equally among all processes.

root : int

The rank that will scatter the data.

dtype : dtype

The data type of the data that is scattered.

Returns:
recvbuf : ndarray

The scattered data in the current process. The data is always received flattened; the user must reshape it.

Examples

To scatter data from the root rank, do the following:

>>> rt = Router(comm)
>>> recvbf = rt.scatter_from_root(data=recvbf,
...                               sendcounts=sendcounts, root=0, dtype=np.double)
>>> recvbf = recvbf.reshape((-1, 3))

Note that sendcounts is just an ndarray of size comm.Get_size() holding the number of entries sent to each rank.
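The role of sendcounts can be sketched without MPI: Scatterv-style splitting hands each rank a contiguous slice of the flattened root buffer, with offsets given by the cumulative sum of the counts (the rank count and values below are hypothetical):

```python
import numpy as np

size = 4  # stand-in for comm.Get_size() in a real run

# Flattened root buffer: 3 values for rank 0, 6 for rank 1, and so on.
sendcounts = np.array([3, 6, 9, 12])
data = np.arange(sendcounts.sum(), dtype=np.double)

# Scatterv hands rank r the slice [offset_r, offset_r + sendcounts[r]).
offsets = np.concatenate(([0], np.cumsum(sendcounts)[:-1]))
chunks = [data[o:o + c] for o, c in zip(offsets, sendcounts)]
```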

send_recv(destination=None, data=None, dtype=None, tag=None)

Sends data to specified destinations and receives data from whoever sent.

Typically, when a rank needs to send some data, it also needs to receive some. This method does both using non-blocking communication. Note, however, that by the time the method returns, the data has already been received.

Parameters:
destination : list

A list with the rank ids that the data should be sent to.

data : list or ndarray

The data that will be sent. If it is a list, each entry is sent to the corresponding destination. If the data is an ndarray, the same data is sent to all destinations.

dtype : dtype

The data type of the data that is sent.

tag : int

Tag used to identify the messages.

Returns:
sources : list

A list with the rank ids that the data was received from.

recvbuff : list

A list with the received data, stored in the same order as the sources.

Examples

To send and receive data between ranks, do the following:

>>> rt = Router(comm)
>>> local_data = np.zeros(((rank+1)*10, 3), dtype=np.double)
>>> destination = [rank + 1, rank + 2]
>>> for i, dest in enumerate(destination):
...     if dest >= size:
...         destination[i] = dest - size
>>> sources, recvbf = rt.send_recv(destination=destination,
...                                data=local_data, dtype=np.double, tag=0)
>>> for i in range(0, len(recvbf)):
...     recvbf[i] = recvbf[i].reshape((-1, 3))
transfer_data(comm_pattern, **kwargs)

Moves data between ranks in the specified pattern.

This method wraps others in this class.

Parameters:
comm_pattern : str

The keyword that specifies the pattern of the data movement. Current options are:

  • distribute_p2p: sends data to specified destinations and receives data from whoever sent (point-to-point communication).

  • distribute_a2a: sends data to specified destinations and receives data from whoever sent (all-to-all communication).

  • gather: gathers data from all processes to the root process.

  • scatter: scatters data from the root process to all other processes.

kwargs : dict

The arguments that are passed to the selected pattern. Check the documentation of the corresponding method for the required arguments in each scenario.

Returns:
tuple

The output of the specified pattern.

datatypes

Data types to hold and operate with SEM data.

class pysemtools.datatypes.Coef(msh, comm, get_area=False, apply_1d_operators=True, bckend='numpy')

Class that contains arrays such as the mass matrix, Jacobian, and Jacobian inverse.

This class can be used when mathematical operations such as differentiation and integration are needed on the SEM mesh.

Parameters:
msh : Mesh

Mesh object.

comm : Comm

MPI communicator object.

get_area : bool, optional

If True, the area integration weights and normal vectors will be calculated. (Default value = False).

apply_1d_operators : bool, optional

If True, the 1D operators will be applied instead of building 3D operators. (Default value = True).

Attributes:
drdx : ndarray

Component [0,0] of the inverse Jacobian tensor at each point. Shape is (nelv, lz, ly, lx).

drdy : ndarray

Component [0,1] of the inverse Jacobian tensor at each point. Shape is (nelv, lz, ly, lx).

drdz : ndarray

Component [0,2] of the inverse Jacobian tensor at each point. Shape is (nelv, lz, ly, lx).

dsdx : ndarray

Component [1,0] of the inverse Jacobian tensor at each point. Shape is (nelv, lz, ly, lx).

dsdy : ndarray

Component [1,1] of the inverse Jacobian tensor at each point. Shape is (nelv, lz, ly, lx).

dsdz : ndarray

Component [1,2] of the inverse Jacobian tensor at each point. Shape is (nelv, lz, ly, lx).

dtdx : ndarray

Component [2,0] of the inverse Jacobian tensor at each point. Shape is (nelv, lz, ly, lx).

dtdy : ndarray

Component [2,1] of the inverse Jacobian tensor at each point. Shape is (nelv, lz, ly, lx).

dtdz : ndarray

Component [2,2] of the inverse Jacobian tensor at each point. Shape is (nelv, lz, ly, lx).

B : ndarray

Mass matrix entry at each point. Shape is (nelv, lz, ly, lx).

area : ndarray

Area integration weight for each point on the facets. Shape is (nelv, 6, ly, lx).

nx : ndarray

x component of the normal vector for each point on the facets. Shape is (nelv, 6, ly, lx).

ny : ndarray

y component of the normal vector for each point on the facets. Shape is (nelv, 6, ly, lx).

nz : ndarray

z component of the normal vector for each point on the facets. Shape is (nelv, 6, ly, lx).

Methods

apply_spatial_filter(field)

Apply the stored spatial filters

build_spatial_filter([r_tf, s_tf, t_tf])

Build a spatial filter based on the given transfer functions.

dssum(field, msh)

Perform averaging of a given field over shared points within each rank.

dudrst(field[, direction])

Perform derivative with respect to reference coordinate r/s/t.

dudrst_1d_operator(field, dr, ds[, dt])

Perform derivative applying the 1d operators provided as inputs.

dudrst_1d_operator_torch(field, dr, ds[, dt])

Perform derivative applying the 1D operators provided as inputs.

dudrst_3d_operator(field, dr)

Perform derivative with respect to reference coordinate r.

dudrst_transposed(field[, direction])

Perform derivative with respect to reference coordinate r/s/t.

dudxyz(field, drdx, dsdx[, dtdx])

Perform derivative with respect to physical coordinate x,y,z.

glsum(a, comm[, dtype])

Perform global summation of a given quantity a using MPI.

Examples

Assuming you have a mesh object and MPI communicator object, you can initialize the Coef object as follows:

>>> from pysemtools import Coef
>>> coef = Coef(msh, comm)
apply_spatial_filter(field)

Apply the stored spatial filters

Parameters:
field : np.array

Field to apply the spatial filter to. Shape should be (nelv, lz, ly, lx).

Returns:
np.array

Filtered field. Shape is the same as the input field.

Notes

The spatial filters must be created before calling this function; otherwise, an error will be raised.

build_spatial_filter(r_tf: array | None = None, s_tf: array | None = None, t_tf: array | None = None)

Build a spatial filter based on the given transfer functions.

Parameters:
r_tf : np.array

Transfer function for the r dimension. Shape should be (lx, lx).

s_tf : np.array

Transfer function for the s dimension. Shape should be (ly, ly).

t_tf : np.array

Transfer function for the t dimension. Shape should be (lz, lz). This is optional; if None is passed, the field is assumed to be 2D.

Returns:
None

The spatial filters are stored in the r_filter, s_filter, and t_filter attributes.
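To make the tensor-product action of these transfer functions concrete, here is a small numpy sketch of how 1D matrices act on a (nelv, lz, ly, lx) field along each reference dimension. apply_filter is a hypothetical helper for illustration, not the library routine; with identity transfer functions, the field is left unchanged:

```python
import numpy as np

nelv, lz, ly, lx = 2, 4, 4, 4
rng = np.random.default_rng(0)
field = rng.standard_normal((nelv, lz, ly, lx))

# Identity transfer functions: filtering should leave the field unchanged.
r_tf, s_tf, t_tf = np.eye(lx), np.eye(ly), np.eye(lz)

def apply_filter(f, r_tf, s_tf, t_tf):
    """Apply 1D transfer functions along each reference dimension."""
    f = np.einsum('ai,ekji->ekja', r_tf, f)  # filter along r (x index)
    f = np.einsum('bj,ekja->ekba', s_tf, f)  # filter along s (y index)
    f = np.einsum('ck,ekba->ecba', t_tf, f)  # filter along t (z index)
    return f

filtered = apply_filter(field, r_tf, s_tf, t_tf)
```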

dssum(field, msh)

Perform averaging of a given field over shared points within each rank.

This method averages the field over shared points in the same rank, using the connectivity data in the mesh object. dssum might be a misleading name.

Parameters:
field : ndarray

Field to average over shared points.

msh : Mesh

pySEMTools Mesh object.

Returns:
ndarray

Input field with shared points averaged with shared points in the SAME rank.

Examples

Assuming you have a Coef object and are working on a 3d field:

>>> dudx = coef.dssum(dudx, msh)
dudrst(field, direction='r')

Perform derivative with respect to reference coordinate r/s/t.

Used to perform the derivative in the reference coordinates

Parameters:
field : ndarray

Field to take the derivative of. Shape should be (nelv, lz, ly, lx).

direction : str

Direction to take the derivative in. Can be ‘r’, ‘s’, or ‘t’. (Default value = ‘r’).

Returns:
ndarray

Derivative of the field with respect to r/s/t. Shape is the same as the input field.

dudrst_1d_operator(field, dr, ds, dt=None)

Perform derivative applying the 1d operators provided as inputs.

This method uses the 1D operators to apply the derivative. To take the derivative in r, you must provide the 1D differentiation matrix in that direction and the identity in the others.

Parameters:
field : ndarray

Field to take the derivative of. Shape should be (nelv, lz, ly, lx).

dr : ndarray

Derivative matrix in the r direction to apply to each element. Shape should be (lx, lx).

ds : ndarray

Derivative matrix in the s direction to apply to each element. Shape should be (ly, ly).

dt : ndarray

Derivative matrix in the t direction to apply to each element. Shape should be (lz, lz). This is optional; if None is passed, the field is assumed to be 2D.

Returns:
ndarray

Derivative of the field with respect to r/s/t. Shape is the same as the input field.

Examples

Assuming you have a Coef object:

>>> dxdr = coef.dudrst_1d_operator(x, coef.dr, np.eye(ly, dtype=coef.dtype), np.eye(lz, dtype=coef.dtype))
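As a sanity check of the 1D-operator mechanism, the sketch below builds a generic barycentric Lagrange differentiation matrix (diff_matrix is a hypothetical helper, not the library's stored GLL matrix, and equispaced nodes stand in for GLL nodes) and applies it along the r axis of a (nelv, lz, ly, lx) field with einsum; polynomials below the nodal degree are differentiated exactly:

```python
import numpy as np

def diff_matrix(x):
    """Barycentric Lagrange differentiation matrix for distinct nodes x."""
    n = len(x)
    w = np.array([1.0 / np.prod([x[j] - x[m] for m in range(n) if m != j])
                  for j in range(n)])
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:
                D[i, j] = (w[j] / w[i]) / (x[i] - x[j])
        D[i, i] = -D[i].sum()  # rows of a differentiation matrix sum to zero
    return D

lx = 5
nodes = np.linspace(-1.0, 1.0, lx)  # stand-in for GLL nodes
Dr = diff_matrix(nodes)

nelv, lz, ly = 2, 3, 3
r = np.broadcast_to(nodes, (nelv, lz, ly, lx))
f = r**3  # degree 3 < lx, so the nodal derivative is exact

# Apply the 1D operator along the last (r) axis for all elements at once.
dfdr = np.einsum('ai,ekji->ekja', Dr, f)
```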
dudrst_1d_operator_torch(field, dr, ds, dt=None)

Perform derivative applying the 1D operators provided as inputs.

Parameters:
field : torch.Tensor

Field to take the derivative of. Shape should be (nelv, lz, ly, lx).

dr : torch.Tensor

Derivative matrix in the r direction to apply to each element. Shape should be (lx, lx).

ds : torch.Tensor

Derivative matrix in the s direction to apply to each element. Shape should be (ly, ly).

dt : torch.Tensor, optional

Derivative matrix in the t direction. Shape should be (lz, lz).

Returns:
torch.Tensor

Derivative of the field with respect to r/s/t. Shape is the same as the input field.

dudrst_3d_operator(field, dr)

Perform derivative with respect to reference coordinate r.

This method uses differentiation matrices from the Lagrange polynomials at the GLL points.

Parameters:
field : ndarray

Field to take the derivative of. Shape should be (nelv, lz, ly, lx).

dr : ndarray

Derivative matrix in the r/s/t direction to apply to each element. Shape should be (lx*ly*lz, lx*ly*lz).

Returns:
ndarray

Derivative of the field with respect to r/s/t. Shape is the same as the input field.

Examples

Assuming you have a Coef object:

>>> dxdr = coef.dudrst_3d_operator(x, coef.dr)
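The relationship between the 1D and 3D operator paths can be sketched as follows: for a C-ordered (nelv, lz, ly, lx) field, applying a 1D matrix along r is equivalent to applying the Kronecker product I_t ⊗ I_s ⊗ D_r as one (lx*ly*lz, lx*ly*lz) matrix to each flattened element. This is an illustrative check, not the library's implementation:

```python
import numpy as np

lx = ly = lz = 3
rng = np.random.default_rng(1)
Dr1d = rng.standard_normal((lx, lx))      # any 1D operator acting along r
f = rng.standard_normal((1, lz, ly, lx))  # a single element

# Path 1: apply the 1D matrix along the last (r) axis.
out_1d = np.einsum('ai,ekji->ekja', Dr1d, f)

# Path 2: build the full 3D operator (identity in t and s, Dr1d in r)
# and apply it to the C-order flattened element.
Dr3d = np.kron(np.eye(lz), np.kron(np.eye(ly), Dr1d))
out_3d = (Dr3d @ f.reshape(-1)).reshape(1, lz, ly, lx)
```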
dudrst_transposed(field, direction='r')

Perform derivative with respect to reference coordinate r/s/t.

Used to perform the derivative in the reference coordinates

Parameters:
field : ndarray

Field to take the derivative of. Shape should be (nelv, lz, ly, lx).

direction : str

Direction to take the derivative in. Can be ‘r’, ‘s’, or ‘t’. (Default value = ‘r’).

Returns:
ndarray

Derivative of the field with respect to r/s/t. Shape is the same as the input field.

dudxyz(field, drdx, dsdx, dtdx=None)

Perform derivative with respect to physical coordinate x,y,z.

This method uses the chain rule: it first evaluates derivatives with respect to r, s, t and then multiplies by the inverse of the Jacobian to map them to x, y, z.

Parameters:
field : ndarray

Field to take the derivative of. Shape should be (nelv, lz, ly, lx).

drdx : ndarray

Derivative of the first reference coordinate with respect to the chosen physical coordinate, i.e., the first entry in the appropriate row of the inverse Jacobian. Shape should be the same as the field.

dsdx : ndarray

Derivative of the second reference coordinate with respect to the chosen physical coordinate, i.e., the second entry in the appropriate row of the inverse Jacobian. Shape should be the same as the field.

dtdx : ndarray

Derivative of the third reference coordinate with respect to the chosen physical coordinate, i.e., the third entry in the appropriate row of the inverse Jacobian. Shape should be the same as the field. (Default value = None). Only valid for 3D fields.

Returns:
ndarray

Derivative of the field with respect to x, y, or z. Shape is the same as the input field.

Examples

Assuming you have a Coef object and are working on a 3d field:

>>> dudx = coef.dudxyz(u, coef.drdx, coef.dsdx, coef.dtdx)
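The chain-rule combination performed here can be sketched with a 1D affine map x = 2r, where dr/dx = 0.5 and the reference-space derivatives are known analytically. All arrays below are hand-built stand-ins for what Coef stores, used only to show the combination du/dx = du/dr · dr/dx + du/ds · ds/dx + du/dt · dt/dx:

```python
import numpy as np

nelv, lz, ly, lx = 1, 3, 3, 4
r = np.broadcast_to(np.linspace(-1.0, 1.0, lx), (nelv, lz, ly, lx))

# Affine map x = 2r, so dr/dx = 0.5 everywhere; s and t do not enter.
x = 2.0 * r
u = x**2

# Reference-space derivatives (analytic here; Coef would compute them
# with the stored differentiation matrices): du/dr = 8r, du/ds = du/dt = 0.
dudr = 8.0 * r
duds = np.zeros_like(u)
dudt = np.zeros_like(u)
drdx = np.full_like(u, 0.5)
dsdx = np.zeros_like(u)
dtdx = np.zeros_like(u)

# The chain-rule combination dudxyz performs:
dudx = dudr * drdx + duds * dsdx + dudt * dtdx
```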
glsum(a, comm, dtype=<class 'numpy.float64'>)

Perform global summation of a given quantity a using MPI.

This method uses MPI to sum over all MPI ranks. It works with any numpy array shape and returns a single value.

Parameters:
a : ndarray

Quantity to sum over all MPI ranks.

comm : Comm

MPI communicator object.

dtype : numpy.dtype

(Default value = np.double).

Returns:
float

Sum of the quantity a over all MPI ranks.

Examples

Assuming you have a Coef object and are working on a 3d field:

>>> volume = coef.glsum(coef.B, comm)
class pysemtools.datatypes.Field(comm, data=None)

Class that contains fields.

This is the main class used to contain data for post-processing. The data does not generally need to be held in this class, as it is typically enough to have the data as ndarrays of shape (nelv, lz, ly, lx) for each field. However, this class provides an easy interface to collect data that is somehow associated.

It also makes it easy to write data to disk, as all the data in this class is stored in the same file.

Parameters:
comm : Comm

MPI communicator object.

data : HexaData, optional

HexaData object that contains the coordinates of the domain.

Attributes:
fields : dict

Dictionary that contains the fields. The keys are the field names and the values are lists of ndarrays. The keys are the same as for HexaData objects, i.e. vel, pres, temp, scal.

vel_fields : int

Number of velocity fields.

pres_fields : int

Number of pressure fields.

temp_fields : int

Number of temperature fields.

scal_fields : int

Number of scalar fields.

t : float

Time of the data.

Methods

update_vars()

Update number of fields.

clear

Examples

If a HexaData object data has been read from disk, the field object can be created directly from it.

>>> from pysemtools.datatypes.field import Field
>>> fld = Field(comm, data = data)

If one wishes to use the data in the fields, it is possible to reference it as an ndarray of shape (nelv, lz, ly, lx) as follows:

>>> u = fld.fields["vel"][0]
>>> v = fld.fields["vel"][1]
>>> w = fld.fields["vel"][2]
>>> vel_magnitude = np.sqrt(u**2 + v**2 + w**2)

A field object can be created empty and fields can then be added to it; this is useful for writing data to disk. If an ndarray u is created with shape (nelv, lz, ly, lx), it can be added to the field object as follows:

>>> from pysemtools.datatypes.field import Field
>>> fld = Field(comm)
>>> fld.fields["vel"].append(u)
>>> fld.update_vars()

This fld object can then be used to write fld files containing the field u created in the code.

update_vars()

Update number of fields.

Update the number of fields in the class in the event that it has been modified. This is needed to write data properly if more arrays are added to the class.

Examples

A field object can be created empty and fields can then be added to it; this is useful for writing data to disk. If an ndarray u is created with shape (nelv, lz, ly, lx), it can be added to the field object as follows:

>>> from pysemtools.datatypes.field import Field
>>> fld = Field(comm)
>>> fld.fields["vel"].append(u)
>>> fld.update_vars()

This fld object can then be used to write fld files containing the field u created in the code.

class pysemtools.datatypes.FieldRegistry(comm, data=None, bckend='numpy')

Class that contains fields.

This class extends the main Field class with a registry that allows the fields to be easily referenced.

Parameters:
comm : Comm

MPI communicator object.

data : HexaData, optional

HexaData object that contains the coordinates of the domain.

Methods

add_field(comm[, field_name, field, ...])

Add fields to the registry.

clear()

Clear the registry and the fields.

rename_registry_key([old_key, new_key])

Rename a key in the registry.

to([comm, bckend])

Move all fields to the CPU numpy backend in order to write them out.

update_vars()

Update the registry with the fields that are present in the fields dictionary.

add_field(comm, field_name='', field=None, file_type=None, file_name=None, file_key=None, dtype=<class 'numpy.float64'>)

Add fields to the registry. They will be stored in the fields dictionary so they can easily be written.

Parameters:
comm : Comm

MPI communicator object.

field_name : str

Name of the field to be added. Where the field is added depends on the name.

field : ndarray

Field to be added to the registry. If this is provided, it is assumed to be an ndarray and is added to the registry directly.

file_type : str

Type of the file to be added. If this is provided, the field is assumed to come from a file. Currently, only “fld” is supported.

file_name : str

File name of the field to be added. If this is provided, the field is assumed to come from a file and is added to the registry from it.

file_key : str

File key that will be searched for in the file. For Nek files, the keys have the following format: “vel_0”, “vel_1”, “pres”, “temp”, “scal_0”, “scal_1”, etc. Only for “vel” are the 2/3 components read at the same time.

dtype : np.dtype

Data type of the field. Default is np.double.

clear()

Clear the registry and the fields.

rename_registry_key(old_key='', new_key='')

Rename a key in the registry.

Parameters:
old_key : str

Old key to be renamed.

new_key : str

New key to be used.

Notes

If you update the registry, some keys might be overwritten or multiple keys might reference the same data.

to(comm=None, bckend='numpy')

Move all fields to the CPU numpy backend in order to write them out.

update_vars()

Update the registry with the fields that are present in the fields dictionary.

class pysemtools.datatypes.Mesh(comm, data=None, x=None, y=None, z=None, elmap=None, create_connectivity=False, bckend='numpy', log_level=None)

Class that contains coordinate and partitioning data of the domain.

This class generally needs to be used, as it contains the coordinates of the domain and some information about its partitioning.

Parameters:
comm : Comm

MPI communicator object.

data : HexaData, optional

HexaData object that contains the coordinates of the domain.

x : ndarray, optional

X coordinates of the domain. Shape is (nelv, lz, ly, lx).

y : ndarray, optional

Y coordinates of the domain. Shape is (nelv, lz, ly, lx).

z : ndarray, optional

Z coordinates of the domain. Shape is (nelv, lz, ly, lx).

elmap : ndarray, optional

1D ndarray of global element ids. Shape is (nelv,).

create_connectivity : bool, optional

If True, the connectivity of the domain will be created. (Memory intensive).

bckend : str, optional

Backend to use for the data. Options are ‘numpy’ and ‘torch’. Default is ‘numpy’.

Attributes:
x : ndarray

X coordinates of the domain. Shape is (nelv, lz, ly, lx).

y : ndarray

Y coordinates of the domain. Shape is (nelv, lz, ly, lx).

z : ndarray

Z coordinates of the domain. Shape is (nelv, lz, ly, lx).

lx : int

Polynomial degree in the x direction.

ly : int

Polynomial degree in the y direction.

lz : int

Polynomial degree in the z direction.

nelv : int

Number of elements of the domain in the current rank.

glb_nelv : int

Total number of elements in the domain.

gdim : int

Dimension of the domain.

non_linear_shared_points : list, optional

List of indices showing where points in the domain are shared; used by Coef in dssum.

Methods

create_connectivity()

Create connectivity with the information from one processor

get_edge_centers()

Get the edge centers of the domain.

get_facet_centers()

Get the centroid of each facet

get_vertices()

Get the vertices of the domain.

init_common(comm)

Initialize common attributes.

init_from_coords(comm, x, y, z[, elmap])

Initialize from coordinates.

init_from_data(comm, data)

Initialize from data.

to([comm, bckend])

Transfer the Mesh object to the desired backend.

Examples

If a HexaData object data has been read from disk, the mesh object can be created directly from it.

>>> from pysemtools.datatypes.msh import Mesh
>>> msh = Mesh(comm, data = data)

If the coordinates are already available, the mesh object can be created from them.

>>> from pysemtools.datatypes.msh import Mesh
>>> msh = Mesh(comm, x = x, y = y, z = z)

This is useful in situations where the coordinates are generated in the code or streamed into python from another source.

create_connectivity()

Create connectivity with the information from one processor

Notes

This function creates a map that contains the connectivity of the domain. This is not the recommended way to compute connectivity for large domains; for those, it is better to use the dedicated connectivity object, which performs the operations in parallel.

get_edge_centers()

Get the edge centers of the domain.

Get all the edge centers of the domain in 2D or 3D.

Notes

We need 4 edges for 2D and 12 edges for 3D. For all cases we store 3 coordinates for each edge.

get_facet_centers()

Get the centroid of each facet

Find the centroid of each facet. This is used to find the shared facets between elements.

Notes

This is not really the centroid, as we also include a coordinate in the dimension perpendicular to the facet. This means that these values can lie outside or inside the element. However, the same behaviour should be seen in the matching elements.

get_vertices()

Get the vertices of the domain.

Get all the vertices of the domain in 2D or 3D.

Notes

We need 4 vertices for 2D and 8 vertices for 3D. For all cases, we store 3 coordinates for each vertex.
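The vertex extraction can be sketched in numpy: for a 3D mesh, the 8 vertices of each element are the points at index 0 or -1 along each of the lz, ly, lx axes. This is an illustrative sketch, not the library routine, and the coordinate arrays here are random stand-ins:

```python
import numpy as np

nelv, lz, ly, lx = 2, 4, 4, 4
rng = np.random.default_rng(2)
x = rng.standard_normal((nelv, lz, ly, lx))
y = rng.standard_normal((nelv, lz, ly, lx))
z = rng.standard_normal((nelv, lz, ly, lx))

# The 8 corners of each 3D element sit at index 0 or -1 along each axis.
corner_idx = [(k, j, i) for k in (0, -1) for j in (0, -1) for i in (0, -1)]
vertices = np.stack(
    [np.stack([x[:, k, j, i], y[:, k, j, i], z[:, k, j, i]], axis=-1)
     for (k, j, i) in corner_idx], axis=1)
# vertices has shape (nelv, 8, 3): 8 vertices with 3 coordinates each.
```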

init_common(comm)

Initialize common attributes.

This function is used to initialize the common attributes of the mesh object.

Parameters:
comm : Comm

MPI communicator object.

Returns:
None

Nothing is returned; the attributes are set in the object.

init_from_coords(comm, x, y, z, elmap=None)

Initialize from coordinates.

This function is used to initialize the mesh object from x, y, z ndarrays.

Parameters:
comm : Comm

MPI communicator object.

x : ndarray

X coordinates of the domain. Shape is (nelv, lz, ly, lx).

y : ndarray

Y coordinates of the domain. Shape is (nelv, lz, ly, lx).

z : ndarray

Z coordinates of the domain. Shape is (nelv, lz, ly, lx).

elmap : ndarray, optional

1D ndarray of global element ids. Shape is (nelv,). If not provided, it will be set to None.

Returns:
None

Nothing is returned; the attributes are set in the object.

init_from_data(comm, data)

Initialize from data.

This function is used to initialize the mesh object from a hexadata object.

Parameters:
comm : Comm

MPI communicator object.

data : HexaData

HexaData object that contains the coordinates of the domain.

Returns:
None

Nothing is returned; the attributes are set in the object.

to(comm=None, bckend='numpy')

Transfer the Mesh object to the desired backend.

Parameters:
comm : Comm

MPI communicator object.

bckend : str

Backend to use for the data. Options are ‘numpy’ and ‘torch’. Default is ‘numpy’.

Returns:
msh_cpu : Mesh

Mesh object in the desired backend.

class pysemtools.datatypes.MeshConnectivity(comm, msh: Mesh | None = None, rel_tol=1e-05, use_hashtable=False, max_simultaneous_sends=1, max_elem_per_vertex: int | None = None, max_elem_per_edge: int | None = None, max_elem_per_face: int | None = None, coef=None)

Class to compute the connectivity of the mesh

Uses facets and vertices to determine which elements are connected to each other

Parameters:
commMPI communicator

The MPI communicator

mshMesh

The mesh object

rel_tolfloat

The relative tolerance to use when comparing the coordinates of the facets/edges

use_hashtablebool

Whether to use a hashtable to define connectivity. This is faster but uses more memory

max_simultaneous_sendsint

The maximum number of simultaneous sends to use when sending data to other ranks. A lower number saves memory for buffers but is slower.

max_elem_per_vertexint

The maximum number of elements that share a vertex. The default values are 4 for 2D and 8 for 3D (works for a structured mesh). The default value is selected if this input is left as None.

max_elem_per_edgeint

The maximum number of elements that share an edge. The default values are 2 for 2D and 4 for 3D (works for a structured mesh). The default value is selected if this input is left as None.

max_elem_per_faceint

The maximum number of elements that share a face. The default value is 2 for 3D (works for a structured mesh). The default value is selected if this input is left as None.

Methods

dssum([field, msh, average])

Computes the dssum of the field

dssum_global([local_dssum_field, field, msh])

Computes the global dssum of the field

dssum_local([field, msh])

Computes the local dssum of the field

get_boundary_node_indices_2d(msh[, ...])

Return list of (e, k, j, i) indices of GLL nodes lying on boundary edges in a 2D mesh.

get_multiplicity(msh)

Computes the multiplicity of the elements in the mesh

global_connectivity(msh)

Computes the global connectivity of the mesh

local_connectivity(msh)

Computes the local connectivity of the mesh

dssum(field: ndarray | None = None, msh: Mesh | None = None, average: str = 'multiplicity')

Computes the dssum of the field

Parameters:
fieldnp.ndarray

The field to compute the dssum

mshMesh

The mesh object

averagestr

The averaging weights to use. Can be “multiplicity”

Returns:
np.ndarray

The dssum of the field

dssum_global(local_dssum_field: ndarray | None = None, field: ndarray | None = None, msh: Mesh | None = None)

Computes the global dssum of the field

Parameters:
local_dssum_fieldnp.ndarray

The local dssum of the field, computed with dssum_local

fieldnp.ndarray

The field to compute the dssum

mshMesh

The mesh object

Returns:
np.ndarray

The global dssum of the field

dssum_local(field: ndarray | None = None, msh: Mesh | None = None)

Computes the local dssum of the field

Parameters:
fieldnp.ndarray

The field to compute the dssum

mshMesh

The mesh object

Returns:
np.ndarray

The local dssum of the field

get_boundary_node_indices_2d(msh, masking_function=None)

Return list of (e, k, j, i) indices of GLL nodes lying on boundary edges in a 2D mesh.

Parameters:
mshMesh

The mesh associated with this connectivity object.

masking_functioncallable or None

Optional function with signature (msh, e, k, j, i) → bool. If provided, only nodes for which this returns True are included.

Returns:
boundary_node_indiceslist of tuple

Indices in the form (element, z=0, j, i) for all GLL nodes on boundary edges.

get_multiplicity(msh: Mesh)

Computes the multiplicity of the elements in the mesh

Parameters:
mshMesh

Notes

The multiplicity is the number of times a point in a element is shared with its own element or others. The minimum multiplicity is 1, since the point is always shared with itself.
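The counting can be illustrated with a small numpy sketch. The global node ids below are hypothetical; the class derives this information from the mesh connectivity:

```python
import numpy as np

# Toy illustration of multiplicity: two 1D "elements" of 3 points each
# that share one boundary node (entries are global node ids per local point).
node_ids = np.array([[0, 1, 2],
                     [2, 3, 4]])  # node 2 is shared between the elements

# Multiplicity = number of times each global node appears across all elements
ids, counts = np.unique(node_ids, return_counts=True)
multiplicity = counts[np.searchsorted(ids, node_ids)]

# Shared node 2 has multiplicity 2; all others have multiplicity 1
print(multiplicity)  # [[1 1 2]
                     #  [2 1 1]]
```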

global_connectivity(msh: Mesh)

Computes the global connectivity of the mesh

Currently this function sends data from all to all.

Parameters:
mshMesh

The mesh object

Notes

In 3D, this function sends the facet centers of the unique_efp_elem and unique_efp_facet pairs to all other ranks, as well as the element ID and facet ID to be assigned.

We compare the unique facet centers of our rank to those of others and determine which ones match. When a match is found, we populate the dictionaries: global_shared_efp_to_rank_map[(e, f)] = rank, global_shared_efp_to_elem_map[(e, f)] = elem, global_shared_efp_to_facet_map[(e, f)] = facet

So for each element facet pair we will know which rank has it, and which is their ID in that rank.

BE MINDFUL: Later when redistributing, send the points, but also send the element and facet ID to the other ranks so the receiver knows which facet corresponds.

local_connectivity(msh: Mesh)

Computes the local connectivity of the mesh

This function checks elements within a rank

Parameters:
mshMesh

The mesh object

Notes

In 3D, the centers of the facets are compared. efp means element facet pair. One obtains a local_shared_efp_to_elem_map and a local_shared_efp_to_facet_map dictionary.

local_shared_efp_to_elem_map[(e, f)] = [e1, e2, …] gives a list with the elements e1, e2 … that share the same facet f of element e.

local_shared_efp_to_facet_map[(e, f)] = [f1, f2, …] gives a list with the facets f1, f2 … of the elements e1, e2 … that share the same facet f of element e.

In each case, the index of the element list corresponds to the index of the facet list. Therefore, the element list might have repeated element entries.

Additionally, we create a list of unique_efp_elem and unique_efp_facet, which are the elements and facets that are not shared with any other element in the rank. These belong either to boundary elements or to elements that are connected to other ranks. The unique pairs are the ones that are checked in the global connectivity step.
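The facet-center matching described above can be sketched for a toy 2D case with two quadrilateral elements sharing one edge. The brute-force double loop below is for illustration only and is not the class implementation:

```python
import numpy as np

# Two unit squares side by side; in 2D the "facets" are the four edges of
# each element, identified by their midpoints (facet centers).
elems = {
    0: np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float),
    1: np.array([[1, 0], [2, 0], [2, 1], [1, 1]], dtype=float),
}

def facet_centers(verts):
    # midpoint of each edge (v_i, v_{i+1}), counter-clockwise ordering
    return 0.5 * (verts + np.roll(verts, -1, axis=0))

local_shared_efp_to_elem_map = {}
local_shared_efp_to_facet_map = {}
tol = 1e-12
for e, verts in elems.items():
    for f, c in enumerate(facet_centers(verts)):
        for e2, verts2 in elems.items():
            if e2 == e:
                continue
            for f2, c2 in enumerate(facet_centers(verts2)):
                if np.linalg.norm(c - c2) < tol:
                    local_shared_efp_to_elem_map.setdefault((e, f), []).append(e2)
                    local_shared_efp_to_facet_map.setdefault((e, f), []).append(f2)

# The shared edge at x = 1 is facet 1 of element 0 and facet 3 of element 1
```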

class pysemtools.datatypes.MeshPartitioner(comm, msh: Mesh | None = None, conditions: list[ndarray] | None = None)

A class that repartitions SEM mesh data using a given partitioning algorithm.

The idea is to be able to choose subdomains of the mesh and split the elements such that the load is balanced among the ranks.

One could use this to repartition the data if the condition array is full of True values, but the idea is to be able to use any condition array.

Parameters:
commMPI communicator

MPI communicator

mshMesh

Mesh object to partition

conditionslist[np.ndarray]

List of conditions to apply to the mesh elements. The conditions should be given as a list of boolean numpy arrays, each with the same length as the number of elements in the mesh.

Methods

create_partitioned_field([fld, ...])

Create a partitioned field object

create_partitioned_mesh([msh, ...])

Create a partitioned mesh object

redistribute_field_elements([field, ...])

Redistribute the elements of the mesh object to different ranks

create_partitioned_field(fld: Field | FieldRegistry | None = None, partitioning_algorithm: str = 'load_balanced_linear') FieldRegistry

Create a partitioned field object

Parameters:
fldField or FieldRegistry

Field object to partition

partitioning_algorithmstr

Algorithm to use for partitioning the mesh elements

Returns:
partitioned_fieldFieldRegistry

Partitioned field object

create_partitioned_mesh(msh: Mesh | None = None, partitioning_algorithm: str = 'load_balanced_linear', create_conectivity: bool = False) Mesh

Create a partitioned mesh object

Parameters:
mshMesh

Mesh object to partition

partitioning_algorithmstr

Algorithm to use for partitioning the mesh elements

Returns:
partitioned_meshMesh

Partitioned mesh object

redistribute_field_elements(field: ndarray | None = None, partitioning_algorithm: str = 'load_balanced_linear') None

Redistribute the elements of the mesh object to different ranks

Parameters:
fieldnp.ndarray

Field to redistribute based on the conditions at initialization

partitioning_algorithmstr

Algorithm to use for partitioning the mesh elements

class pysemtools.datatypes.VTKMesh(comm: Comm, x: ndarray, y: ndarray, z: ndarray, cell_type: str = 'hex', global_connectivity: bool = True, distributed_axis: int = 0)

Class that contains the mesh data in a vtk friendly format

Helper to build connectivity and offsets etc.

Parameters:
commMPI.Comm

The MPI communicator

xnp.ndarray

The x coordinates of the mesh points

ynp.ndarray

The y coordinates of the mesh points

znp.ndarray

The z coordinates of the mesh points

cell_typestr, optional

The type of cells in the mesh, by default “hex”. Only “hex” is currently supported.

global_connectivitybool, optional

Whether to use global connectivity or local connectivity in parallel. By default True, but if only one rank is used, it is set to False since the connectivity is already global in that case.

distributed_axisint, optional

The axis along which the mesh is distributed, by default 0 (Only zero allowed now)

Notes

The offsets array has n_cells + 1 entries to work with VTKHDF. This might need to be reduced to n_cells for other use cases like Catalyst.
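A minimal sketch of that layout, assuming hexahedral cells with 8 points each (this shows only the array shapes, not the actual point ordering the class computes):

```python
import numpy as np

# For n_cells hexahedra, VTKHDF expects an offsets array with n_cells + 1
# entries, where offsets[i]:offsets[i+1] indexes cell i's points in the
# connectivity array.
n_cells = 3
points_per_cell = 8  # "hex" cells

connectivity = np.arange(n_cells * points_per_cell)
offsets = np.arange(n_cells + 1) * points_per_cell

print(offsets)  # [ 0  8 16 24]
```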

interpolation

High-order interpolation routines for SEM data.

class pysemtools.interpolation.Probes(comm, output_fname: str = './interpolated_fields.csv', probes: ndarray | str | None = None, msh=typing.Union[pysemtools.datatypes.msh.Mesh, str, list], write_coords: bool = True, progress_bar: bool = False, point_interpolator_type: str = 'single_point_legendre', max_pts: int = 128, find_points_iterative: list = [False, 5000], find_points_comm_pattern: str = 'point_to_point', elem_percent_expansion: float = 0.01, global_tree_type: str = 'rank_bbox', global_tree_nbins: int = 1024, use_autograd: bool = False, find_points_tol: float = np.float64(2.220446049250313e-15), find_points_max_iter: int = 50, local_data_structure: str = 'kdtree', use_oriented_bbox: bool = False, clean_search_traces: bool = False)

Class to interpolate fields at probes from a SEM mesh.

Main interpolation class. This works in parallel.

If the points to interpolate are available only at rank 0, make sure to only pass them at that rank and set the others to None. In that case, the points will be scattered to all ranks.

If the points are passed in all ranks, then they will be considered as different points to be interpolated and the work to interpolate will be multiplied. If you are doing this, make sure that the points each rank passes are different. Otherwise simply pass None in all ranks but 0.

See example below to observe how to avoid unnecessary replication of data in all ranks when passing probes as argument if the points are the same in all ranks.

If reading probes from file, they will be read on rank 0 and scattered unless parallel hdf5 is used, in which case the probes will be read in all ranks. (In development)

Parameters:
commMPI communicator

MPI communicator.

output_fnamestr

Output file name. Default is “./interpolated_fields.csv”. Note that you can write in hdf5 format by changing the file extension, e.g. by using the name “interpolated_fields.hdf5”.

probesUnion[np.ndarray, str]

Probes coordinates. If a string, it is assumed to be a file name.

mshUnion[Mesh, str, list]

Mesh data. If a string, it is assumed to be a file name. If it is a list the first entry is the file name and the second is the dtype of the data. if it is a Mesh object, the x, y, z coordinates are taken from the object.

write_coordsbool

If True, the coordinates of the probes are written to a file. Default is True.

progress_barbool

If True, a progress bar is shown. Default is False.

point_interpolator_typestr

Type of point interpolator. Default is single_point_legendre. options are: single_point_legendre, single_point_lagrange, multiple_point_legendre_numpy, multiple_point_legendre_torch.

max_ptsint, optional

Maximum number of points to interpolate. Default is 128. Used if multiple point interpolator is selected.

find_points_iterativelist

List with two elements. First element is a boolean that indicates if the search is iterative. Second element is the maximum number of candidate ranks to send the data. This affects memory. Default is [False, 5000].

find_points_comm_patternstr

Communication pattern for finding points. Default is point_to_point. options are: point_to_point, collective, rma

elem_percent_expansionfloat

Percentage expansion of the element bounding box. Default is 0.01.

global_tree_typestr

How is the global tree constructed to determine rank candidates for the probes. Only really used if using tree structures to determine candidates. Default is rank_bbox. options are: rank_bbox, domain_binning.

global_tree_nbinsint

Number of bins in the global tree. Only used if the global tree is domain_binning. Default is 1024.

use_autogradbool

If True, autograd is used. Default is False.

find_points_tolfloat

The tolerance to use when finding points. Default is np.finfo(np.double).eps * 10.

find_points_max_iterint

The maximum number of iterations to use when finding points. Default is 50.

local_data_structurestr

The local data structure to use when finding points. Default is kdtree. options are: kdtree, obb_tree.

use_oriented_bboxbool

If True, oriented bounding boxes are used when finding points. Default is False.

clean_search_tracesbool

If True, cleans all the data used for searching points after initialization. This saves memory if only interpolation is needed afterwards. Default is False.

Attributes:
probesndarray

2D array of probe coordinates. shape = (n_probes, 3).

interpolated_fieldsndarray

2D array of interpolated fields at probes. shape = (n_probes, n_fields + 1). The first column is always time, the rest are the interpolated fields.

Methods

interpolate_from_field_list(t, field_list, comm)

Interpolate the probes from a list of fields.

clean_search_traces

Notes

A sample input file can be found in the examples folder of the main repository; however, the file is not used in said example.

Examples

  1. Initialize from file:

>>> from mpi4py import MPI
>>> from pysemtools.interpolation.probes import Probes
>>> comm = MPI.COMM_WORLD
>>> probes = Probes(comm, filename="path/to/params.json")

2. Initialize from code, passing everything as arguments. Assume msh is created. One must then create the probe data in rank 0. A dummy probe_data must be created in all other ranks

>>> import numpy as np
>>> from mpi4py import MPI
>>> from pysemtools.interpolation.probes import Probes
>>> comm = MPI.COMM_WORLD
>>> if comm.Get_rank() == 0:
>>>     probes_data = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [2.0, 2.0, 2.0]])
>>> else:
>>>     probes_data = None
>>> probes = Probes(comm, probes=probes_data, msh=msh)

Note that probes is initialized in all ranks, but the probes_data array containing the coordinates is only relevant in rank 0. It is scattered internally.

interpolate_from_field_list(t, field_list, comm, write_data=True, field_names: list[str] | None = None)

Interpolate the probes from a list of fields.

This method interpolates from a list of fields (ndarrays of shape (nelv, lz, ly, lx)).

Parameters:
tfloat

Time of the field data.

field_listlist

List of fields to interpolate. Each field is an ndarray of shape (nelv, lz, ly, lx).

commComm

MPI communicator.

write_databool

If True, the interpolated data is written to a file. Default is True.

field_nameslist

List of names of the interpolated fields. Useful when writing to file. Default is None.

Examples

This method can be used to interpolate fields from a list of fields. If you have previously obtained a set of fields u,v,w as ndarrays of shape (nelv, lz, ly, lx), you can:

>>> probes.interpolate_from_field_list(t, [u,v,w], comm)

The results are stored in probes.interpolated_fields attribute. Remember: the first column of this attribute is always the time t given.

io

There are multiple interfaces to perform IO operations. The main one relates to NEk5000/Neko files. These are read by using the ppymech submodule. It is also possible to use ADIOS2 to, for example, use the in-situ processing capabilities, like data streaming or compression. Support for HDF5 files in parallel is also provided, including VTKHDF files that can be shown in Paraview. Finally, a set of wrappers to read and write data in a more user-friendly way is also included.

ppymech

Classes to read or write data from Nek style codes

pysemtools.io.ppymech.preadnek(filename, comm, data_dtype=<class 'numpy.float64'>)

Read an fld file and return a pymech HexaData object (Parallel).

Main function for reading Nek-type fld files.

Parameters:
filenamestr

The filename of the fld file.

commComm

MPI communicator.

data_dtypestr

The data type of the data in the file. (Default value = “float64”).

Returns:
HexaData

The data read from the file in a pymech hexadata object.

Examples

>>> from mpi4py import MPI
>>> from pysemtools.io.ppymech.neksuite import preadnek
>>> comm = MPI.COMM_WORLD
>>> data = preadnek('field00001.fld', comm)
pysemtools.io.ppymech.pwritenek(filename, data, comm)

Write an fld file from a pymech HexaData object (Parallel).

Main function to write fld files.

Parameters:
filenamestr

The filename of the fld file.

dataHexaData

The data to write to the file.

commComm

MPI communicator.

Examples

Assuming you have a hexadata object already:

>>> from pysemtools.io.ppymech.neksuite import pwritenek
>>> pwritenek('field00001.fld', data, comm)
pysemtools.io.ppymech.pynekread(filename, comm, data_dtype=<class 'numpy.float64'>, msh=None, fld=None, overwrite_fld=False)

Read a Nek file into pysemtools objects (Parallel).

Main function for reading Nek-type fld files.

Parameters:
filenamestr

The filename of the fld file.

commComm

MPI communicator.

data_dtypestr

The data type of the data in the file. (Default value = “float64”).

mshMesh

The mesh object to put the data in. (Default value = None).

fldField

The field object to put the data in. (Default value = None).

overwrite_fldbool

Whether or not to overwrite the contents of fld. (Default value = False).

Returns:
None

Nothing is returned, the attributes are set in the object.

Examples

>>> from mpi4py import MPI
>>> from pysemtools.io.ppymech.neksuite import pynekread
>>> comm = MPI.COMM_WORLD
>>> msh = msh_c(comm)
>>> fld = field_c(comm)
>>> pynekread(fname, comm, msh = msh, fld=fld)
pysemtools.io.ppymech.pynekread_field(filename, comm, data_dtype=<class 'numpy.float64'>, key='')

Read a field from a Nek file and return it as a list (Parallel).

Main function for reading Nek-type fld files.

Parameters:
filenamestr

The filename of the fld file.

commComm

MPI communicator.

data_dtypestr

The data type of the data in the file. (Default value = “float64”).

keystr

The key of the field to read. Typically “vel”, “pres”, “temp” or “scal_1”, “scal_2”, etc.

Returns:
list

The data read from the file in a list.

pysemtools.io.ppymech.pynekwrite(filename, comm, msh=None, fld=None, wdsz=4, istep=0, write_mesh=True)

Write an fld file from pysemtools datatypes (Parallel).

Main function to write fld files.

Parameters:
filenamestr

The filename of the fld file.

commComm

MPI communicator.

mshMesh

The mesh object to write to the file. (Default value = None).

fldField

The field object to write to the file. (Default value = None).

wdszint

The word size of the data in the file. (Default value = 4).

istepint

The time step of the data. (Default value = 0).

write_meshbool

If True, write the mesh data. (Default value = True).

Examples

Assuming a mesh object and field object are already present in the namespace:

>>> from pysemtools.io.ppymech.neksuite import pynekwrite
>>> pynekwrite('field00001.fld', comm, msh = msh, fld=fld)

ADIOS2

IO operations using ADIOS2.

class pysemtools.io.adios2.DataCompressor(comm, mesh_info=None, wrd_size=4)

Class used to write compressed data to disk.

This Assumes that the input data has a msh object available.

Parameters:
commComm

MPI communicator.

mesh_infodict

Dictionary with mesh information.

wrd_sizeint

Word size to write data. (Default value = 4). Single precision is 4, double is 8.

Methods

read(comm[, fname, variable_names])

Read data from disk using adios2.

write(comm[, fname, variable_names, data])

Write data to disk using adios2.

Examples

This class is used to write data to disk. The data is compressed using bzip2.

>>> mesh_info = {"glb_nelv": msh.glb_nelv, "lxyz": msh.lxyz, "gdim": msh.gdim}
>>> dc = DataCompressor(comm, mesh_info = mesh_info, wrd_size = 4)
read(comm, fname='compress.bp', variable_names=None)

Read data from disk using adios2.

Read compressed data and internally decompress it.

Parameters:
commComm

MPI communicator.

fnamestr

File name to read. (Default value = “compress.bp”).

variable_nameslist

List of strings with the names of the variables to read. These names NEED to match the names adios2 used to write the data.

Returns:
list

List of numpy arrays with the data read. The ndarrays in the list are 1d.

Examples

This function is used to read data from disk. The data is compressed using bzip2.

>>> variable_names = ["x", "y", "z"]
>>> data = dc.read(comm, fname = "compress.bp", variable_names = variable_names)
write(comm, fname='compress.bp', variable_names=None, data=None)

Write data to disk using adios2.

Lossless compression with bzip2.

Parameters:
commComm

MPI communicator.

fnamestr

File name to write. (Default value = “compress.bp”).

variable_nameslist

List of strings with the names of the variables to write. This is very important, as adios2 will use these names to process the data.

datalist

List of numpy arrays with the data to write. Corresponding in index to variable_names. The arrays must be 1d.

Examples

This function is used to write data to disk. The data is compressed using bzip2.

>>> variable_names = ["x", "y", "z"]
>>> data = [msh.x, msh.y, msh.z]
>>> dc.write(comm, fname = "compress.bp", variable_names = variable_names, data = data)
class pysemtools.io.adios2.DataStreamer(comm, from_nek=True)

Class used to communicate data between codes using adios2.

Use this to send and receive data.

The data is always transported as a 1d array, so it is necessary to reshape.

Parameters:
commComm

MPI communicator.

from_nekbool

Defines whether the data being streamed comes from a Nek-like code. (Default value = True).

Attributes:
glb_nelvint

Total number of elements in the global domain.

lxyzint

Number of points per element.

gdimint

Problem dimension.

nelvint

Number of elements that this rank has.

Methods

finalize()

Finalize the execution of the module.

recieve([fld, variable])

Receive data from another code using adios2.

stream(fld)

Send data to another code using adios2.

Examples

This type must be paired with another stream in the other executable/code. The codes will not start if the streams are not paired.

A full example of receiving data is shown below. It is possible to pair with other classes to, for example, write data to disk.

>>> ds = data_streamer_c(comm)
>>> x = get_fld_from_ndarray(ds.recieve(), ds.lx, ds.ly, ds.lz, ds.nelv) # Receive and reshape x
>>> y = get_fld_from_ndarray(ds.recieve(), ds.lx, ds.ly, ds.lz, ds.nelv) # Receive and reshape y
>>> z = get_fld_from_ndarray(ds.recieve(), ds.lx, ds.ly, ds.lz, ds.nelv) # Receive and reshape z
>>> ds.finalize()
>>> msh = msh_c(comm, x = x, y = y, z = z)
>>> write_fld_file_from_list("field0.f00001", comm, msh, [x, y, z])

To send data to the other code, use the stream method.

>>> ds.stream(x.reshape(x.size))
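Since the stream always carries 1D arrays, the round trip between the SEM field layout and the transport buffer is just a reshape. The sizes below are hypothetical:

```python
import numpy as np

# Round trip between the field layout (nelv, lz, ly, lx) and the flat 1D
# buffer that the stream transports.
nelv, lx, ly, lz = 4, 8, 8, 8  # hypothetical sizes
field = np.random.rand(nelv, lz, ly, lx)

buf = field.reshape(field.size)           # flatten before ds.stream(buf)
received = buf.reshape(nelv, lz, ly, lx)  # reshape after ds.recieve()

assert (field == received).all()
```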
finalize()

Finalize the execution of the module.

Used to close the reader and writer. The stream will end and the codes will no longer be coupled.

recieve(fld=None, variable='f2py_field')

Receive data from another code using adios2.

The data is always transported as a 1d array, so it is necessary to reshape it to the desired shape.

Parameters:
fldndarray

Buffer to contain the data. If None, it will be created. (Default value = None).

variablestr

Name of the adios2 variable to be read. Neko uses this default name; change as needed. This name can remain the same during the execution even if different quantities are being transported. (Default value = “f2py_field”).

Returns:
ndarray

Returns the field that was received.

stream(fld)

Send data to another code using adios2.

Couples two codes or executables.

Parameters:
fldndarray

Field to be sent. Must be a 1d array.

HDF

Classes to read or write data in the hdf format

class pysemtools.io.hdf.HDF5File(comm: Comm, fname: str, mode: str, parallel: bool)

Class to write and read hdf5 files in parallel using h5py.

Open an hdf5 file based on inputs.

Parameters:
commMPI.Comm

MPI communicator.

fnamestr

Name of the hdf5 file to read or write.

modestr

Mode to open the file. Should be “r” for reading or “w” for writing.

parallelbool

Whether to use parallel I/O or not.

Methods

close([clean])

Close the hdf5 file object

open(fname, mode, parallel)

Open an hdf5 file based on inputs.

read_dataset(dataset_name[, dtype, ...])

Read a dataset from the hdf5 file object

read_slices(dataset_name[, dtype])

Read the slices hyperslabs from the file

set_active_group(group_name)

Set the active group to read or write data from.

set_read_slices_external(global_shape, slices)

Set the slices that should be read from the file based on external input.

set_read_slices_linear_lb(global_shape, ...)

Set the slices that should be read from the file.

set_write_slices(local_shape, distributed_axis)

Set the slices that should be written to the file.

write_dataset(dataset_name, data[, ...])

Write a dataset to the hdf5 file object

write_slices(dataset_name, data[, shape_in_file])

Write the hyperslab to the file.

close(clean: bool = True)

Close the hdf5 file object

Parameters:
cleanbool

Whether to clean the attributes that are assigned when opening a file. This is useful if the file object will be reused to open another file after closing the current one. Default is True.

open(fname: str, mode: str, parallel: bool)

Open an hdf5 file based on inputs.

This can be used to open a new file after closing the previous one.

Parameters:
fnamestr

Name of the hdf5 file to read or write.

modestr

Mode to open the file. Should be “r” for reading or “w” for writing.

parallelbool

Whether to use parallel I/O or not. If True, the file will be opened using the MPI-IO driver. If False, the file will be opened using the default driver.

read_dataset(dataset_name: str, dtype: ~numpy.dtype = <class 'numpy.float64'>, distributed_axis: int | None = None, slices: list | None = None, as_array_list_in_file: bool = False, ignore_metadata: bool = False)

Read a dataset from the hdf5 file object

Parameters:
dataset_namestr

Name of the dataset to read. Can include the group path, e.g. “/group1/group2/dataset”.

dtypenp.dtype

Data type to read the dataset in. Default is np.double.

distributed_axisint

Axis along which the data is distributed in parallel. This is required for parallel reading. Default is None.

sliceslist

Optional. List of slices to read from the dataset, in case they are known in advance. Default is None.

as_array_list_in_filebool

Optional. Default is False. Whether the data is stored as an array list in the file. This is useful if originally the data had a different shape but was flattened to 1d before writing. This will use the shape attribute stored in the file to do the partitioning, while keeping in mind that the data is stored as a 1d array, so it is read properly.

ignore_metadatabool

Optional. Default is False. Forces reading the data while ignoring any shape metadata. This will just read the arrays as stored and will not try to assume an original shape.

Returns:
local_datanp.ndarray

Data read from the file. This will be a local array with the shape determined by the global shape of the dataset and the parallel distribution. If slices are provided, the shape will be determined by the slices.

read_slices(dataset_name: str, dtype: ~numpy.dtype = <class 'numpy.float64'>)

Read the slices hyperslabs from the file

Parameters:
dataset_namestr

Name of the dataset to read. Can include the group path, e.g. “/group1/group2/dataset”.

dtypenp.dtype

Data type to read the dataset in. Default is np.double.

Returns:
local_datanp.ndarray

Data read from the file. This will be a local array with the shape determined by the global shape of the dataset and the parallel distribution. If slices are provided, the shape will be determined by the slices.

set_active_group(group_name: str)

Set the active group to read or write data from.

This is useful to avoid having to specify the group every time a dataset is read or written.

Parameters:
group_namestr

Name of the group to set as active. Can include the group path, e.g. “/group1/group2”. If the group does not exist, it will be created if the file is opened in write mode, otherwise an error will be raised.

set_read_slices_external(global_shape: tuple, slices: list)

Set the slices that should be read from the file based on external input.

Slices need to be precomputed in this case.

Parameters:
global_shapetuple

Shape of the global array to be read.

sliceslist

List of slices to read from the data set.

set_read_slices_linear_lb(global_shape: tuple, distributed_axis: int, explicit_strides: bool = False, shape_in_file: list | None = None)

Set the slices that should be read from the file.

Data is distributed in a linear load balanced way.

Parameters:
global_shapetuple

Shape of the global array to be read. This is required to determine the local shape and the slices to read from the file.

distributed_axisint

Axis along which the data is distributed in parallel. This is required to determine the local shape and the slices to read from the file.

explicit_stridesbool

Whether to use explicit strides to read the data. This is useful if the data is stored as 1D in the file but originally had a different shape.
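A sketch of how such a linear load-balanced split can be computed, using a hypothetical helper that mirrors the behaviour described above (not the class internals): each rank gets a contiguous chunk along the distributed axis, with the first `n % size` ranks receiving one extra entry.

```python
# Hypothetical helper: contiguous, load-balanced slice for one rank.
def linear_lb_slice(n, rank, size):
    base, rem = divmod(n, size)
    start = rank * base + min(rank, rem)
    stop = start + base + (1 if rank < rem else 0)
    return slice(start, stop)

global_shape = (10, 3)   # hypothetical dataset shape
distributed_axis = 0
size = 4                 # number of MPI ranks

chunks = [linear_lb_slice(global_shape[distributed_axis], r, size)
          for r in range(size)]
# Ranks 0-1 read 3 rows each, ranks 2-3 read 2 rows each
print([(s.start, s.stop) for s in chunks])  # [(0, 3), (3, 6), (6, 8), (8, 10)]
```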

set_write_slices(local_shape: tuple, distributed_axis: int, extra_global_entries: list[int] | None = None)

Set the slices that should be written to the file.

Obtain global shape from the local one

Parameters:
local_shapetuple

Shape of the local array to be written. This is required to determine the global shape and the slices to write to the file.

distributed_axisint

Axis along which the data is distributed in parallel.

extra_global_entrieslist[int], optional

List of extra entries to add to the global shape of the dataset. This is useful if the ranks are writing a certain amount of data but the global array should be bigger than what they collectively write. Default is None.

write_dataset(dataset_name: str, data: ndarray, distributed_axis: int | None = None, extra_global_entries: list[int] | None = None, shape_in_ram: tuple | None = None)

Write a dataset to the hdf5 file object

Parameters:
dataset_namestr

Name of the dataset to write. Can include the group path, e.g. “/group1/group2/dataset”.

datanp.ndarray

Data to write to the file.

distributed_axisint

Axis along which the data is distributed in parallel. This is required for parallel writing. Default is None.

extra_global_entrieslist[int]

Optional. List of extra entries to add to the global shape of the dataset. This is useful if the ranks are writing a certain amount of data but the global array should be bigger than what they collectively write.

shape_in_ramtuple

Optional. Shape of the data in RAM. This is useful if the data is stored in a different shape than it originally had, for example, if it is stored in a 1d array but originally had a different shape. This will be the shape stored in the file in the attribute “shape” and can be used to reshape the data when reading it.

write_slices(dataset_name: str, data: ndarray, shape_in_file: tuple | None = None)

Write the hyperslab to the file.

Perform the write operations

Parameters:
dataset_namestr

Name of the dataset to write. Can include the group path, e.g. “/group1/group2/dataset”.

datanp.ndarray

Data to write to the file. This should have the same shape as the local shape determined by the set_write_slices method.

shape_in_filetuple, optional

Shape of the data to be stored in the file. This is useful if the data is stored in a different shape in the file than it is in RAM.

class pysemtools.io.hdf.VTKHDFFile(comm: Comm, fname: str, mode: str, parallel: bool)

Class to write and read vtkhdf files in parallel using h5py.

Open an hdf5 file based on inputs.

Parameters:
commMPI.Comm

MPI communicator.

fnamestr

Name of the hdf5 file to read or write.

modestr

Mode to open the file. Should be “r” for reading or “w” for writing.

parallelbool

Whether to use parallel I/O or not.

Methods

close([clean])

Close the hdf5 file object

link_to_existing_mesh(mesh_name)

Link to an existing mesh

open(fname, mode, parallel)

Open an hdf5 file based on inputs.

read_dataset(dataset_name[, dtype, ...])

Read a dataset from the hdf5 file object

read_mesh_data([dtype, distributed_axis])

Read the mesh data from the hdf5 file

read_point_data(dataset_name[, dtype, ...])

Read point data from the hdf5 file

read_slices(dataset_name[, dtype])

Read the slices hyperslabs from the file

set_active_group(group_name)

Set the active group to read or write data from.

set_read_slices_external(global_shape, slices)

Set the slices that should be read from the file based on external input.

set_read_slices_linear_lb(global_shape, ...)

Set the slices that should be read from the file.

set_write_slices(local_shape, distributed_axis)

Set the slices that should be written to the file.

write_dataset(dataset_name, data[, ...])

Write a dataset to the hdf5 file object

write_mesh_data(x, y, z[, distributed_axis])

Write the mesh data to the hdf5 file

write_point_data(dataset_name, data[, ...])

Write point data to the hdf5 file

write_slices(dataset_name, data[, shape_in_file])

Write the hyperslab to the file.

close(clean: bool = True)

Close the hdf5 file object

Parameters:
clean : bool

Whether to clean up the file after closing. This will delete the file from disk. Should only be used for testing.

link_to_existing_mesh(mesh_name: str)

Link to an existing mesh

Avoid rewriting mesh data if not necessary. It can be quite costly in storage.

Parameters:
mesh_name : str

Name of the hdf5 file to link to.

read_mesh_data(dtype: ~numpy.dtype = <class 'numpy.float64'>, distributed_axis: int = 0)

Read the mesh data from the hdf5 file

Parameters:
dtype : np.dtype

Data type to read the mesh data as. Should be a floating point type.

distributed_axis : int

Axis along which the data is distributed in parallel. Should be 0 for now.

Returns:
x : np.ndarray

The x coordinates of the mesh points.

y : np.ndarray

The y coordinates of the mesh points.

z : np.ndarray

The z coordinates of the mesh points.

read_point_data(dataset_name: str, dtype: ~numpy.dtype = <class 'numpy.float64'>, distributed_axis: int = 0)

Read point data from the hdf5 file

Parameters:
dataset_name : str

Name of the dataset to read. This should be the name of the dataset in the hdf5 file.

dtype : np.dtype

Data type to read the dataset as. Should be a floating point type.

distributed_axis : int

Axis along which the data is distributed in parallel. Should be 0 for now.

Returns:
np.ndarray

The point data read from the hdf5 file. Will have the same shape as the mesh points.

write_mesh_data(x: ndarray, y: ndarray, z: ndarray, distributed_axis: int = 0)

Write the mesh data to the hdf5 file

Parameters:
x : np.ndarray

The x coordinates of the mesh points.

y : np.ndarray

The y coordinates of the mesh points.

z : np.ndarray

The z coordinates of the mesh points.

distributed_axis : int

Axis along which the data is distributed in parallel. Should be 0 for now.

write_point_data(dataset_name: str, data: ndarray, distributed_axis: int = 0)

Write point data to the hdf5 file

Parameters:
dataset_name : str

Name of the dataset to write. This will be used as the name of the dataset in the hdf5 file.

data : np.ndarray

Data to write. Should have the same number of points as the mesh.

distributed_axis : int

Axis along which the data is distributed in parallel. Should be 0 for now.

Catalyst2

Classes to interface with paraview - catalyst

class pysemtools.io.catalyst.CatalystSession(comm: Comm, pipeline: str, channel: str, implementation_name=None, implementation_path: str | None = None)

Class to interface with catalyst

Interface with catalyst to set up a session and load the pipeline.

Parameters:
comm : MPI.Comm

pipeline : str

Path to the saved ParaView Catalyst state file (e.g. “pipeline.py”).

channel : str

Name of the catalyst channel to use (must match the registrationName in your saved state script, e.g. “rbc00001.vtkhdf”).

implementation_name : str

Optional name of the catalyst implementation to use. Can also be set with the environment variable CATALYST_IMPLEMENTATION_NAME, e.g. export CATALYST_IMPLEMENTATION_NAME=paraview

implementation_path : str

Optional path to the catalyst implementation. Can also be set with the environment variable CATALYST_IMPLEMENTATION_PATH, e.g. export CATALYST_IMPLEMENTATION_PATH=/path/to/paraview/lib/catalyst

Methods

execute([timestep, time_value])

Execute the catalyst pipeline for the current mesh and fields

finalize()

Finalize the catalyst session

set_field(fields)

Set the fields for the catalyst session / conduit node

set_mesh(x, y, z[, cell_type])

Set the mesh for the catalyst session / conduit node

execute(timestep=0, time_value=0.0)

Execute the catalyst pipeline for the current mesh and fields

Parameters:
timestep: int

Current timestep to set in the catalyst state.

time_value: float

Current time value to set in the catalyst state.

finalize()

Finalize the catalyst session

set_field(fields: dict[str, ndarray])

Set the fields for the catalyst session / conduit node

Parameters:
fields: dict[str, np.ndarray]

Dictionary of field name to field values.

set_mesh(x: ndarray, y: ndarray, z: ndarray, cell_type: str = 'hex')

Set the mesh for the catalyst session / conduit node

Parameters:
x: np.ndarray

array with local x coordinates on this rank.

y: np.ndarray

array with local y coordinates on this rank.

z: np.ndarray

array with local z coordinates on this rank.

cell_type: str

cell type, currently only “hex” is supported.

wrappers

Wrappers to ease IO

pysemtools.io.wrappers.partition_read_data(comm, fname: str | None = None, distributed_axis: int = 0)

Generate partition information for hdf5 files. Useful when reading or writing multiple files with the same partitioning, so that the read/write functions do not need to recompute the partition every time.

Parameters:
comm : MPI.Comm

The MPI communicator

fname : str

The name of the file to read

distributed_axis : int, optional

The axis along which the data is distributed, by default 0. This is used to determine how many elements to read from the file in parallel.

Returns:
list

A list of slices corresponding to the local data to be read by each process
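As a rough illustration of what such partition information looks like, the sketch below computes a linearly load-balanced slice along the distributed axis for each rank. This is a generic scheme written for this example (the function name and details are illustrative, not the library's internals): remainder elements are spread over the first ranks.

```python
# Sketch (not the library's implementation): compute a linearly
# load-balanced slice along the distributed axis for each rank.

def linear_lb_slice(global_n, rank, size, ndim, distributed_axis=0):
    """Return a tuple of slices selecting this rank's chunk.

    global_n is the global extent along the distributed axis; the
    remainder elements are assigned to the first ranks.
    """
    base, rem = divmod(global_n, size)
    local_n = base + (1 if rank < rem else 0)
    start = rank * base + min(rank, rem)
    slices = [slice(None)] * ndim
    slices[distributed_axis] = slice(start, start + local_n)
    return tuple(slices)

# Example: 10 rows distributed over 3 ranks along axis 0
parts = [linear_lb_slice(10, r, 3, ndim=2) for r in range(3)]
```

Each entry of `parts` can then select the corresponding rank's chunk of a 2D array along axis 0, with all other axes kept whole.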

pysemtools.io.wrappers.read_data(comm, fname: str, keys: list[str], parallel_io: bool = False, dtype=<class 'numpy.float32'>, distributed_axis: int = 0, slices: list | None = None)

Read data from a file and return a dictionary with the names of the files and keys

Parameters:
comm : MPI.Comm

The MPI communicator

fname : str

The name of the file to read

keys : list[str]

The keys to read from the file

parallel_io : bool, optional

If True, read the file in parallel, by default False. This is intended for hdf5 files; currently it does not work if set to True.

dtype : np.dtype, optional

The data type of the data to read, by default np.single

distributed_axis : int, optional

The axis along which the data is distributed, by default 0. This is used to determine how many elements to read from the file in parallel.

slices : list, optional

A list of slices to read from the file. If None, the local data will be read based on the distributed_axis and the communicator. If provided, it should match the number of dimensions in the data. Note that when reading in parallel, the slices should correspond to the local data on each process; otherwise the data will be replicated. If in doubt, do not provide slices, and the local data will be determined automatically.

Returns:
dict

A dictionary with the keys and the data read from the file

pysemtools.io.wrappers.write_data(comm, fname: str, data_dict: dict[str, ~numpy.ndarray], parallel_io: bool = False, dtype=<class 'numpy.float32'>, msh: ~pysemtools.datatypes.msh.Mesh | list[~numpy.ndarray] | None = None, write_mesh: bool = False, distributed_axis: int = 0, uniform_shape: bool = False)

Write data to a file

Parameters:
comm : MPI.Comm

The MPI communicator

fname : str

The name of the file to write

data_dict : dict

The data to write to the file

parallel_io : bool, optional

If True, write the file in parallel, by default False. This is intended for hdf5 files; currently it does not work if set to True.

dtype : np.dtype, optional

msh : Mesh, optional

The mesh object to write to a fld file, by default None

write_mesh : bool, optional

Only valid for writing fld files

distributed_axis : int, optional

The axis along which the data is distributed, by default 0

uniform_shape : bool, optional

If True, the global shape of the data is assumed to be uniform, by default False

monitoring

Monitoring tools for pySEMTools

class pysemtools.monitoring.Logger(level=None, comm=None, module_name=None)

Class that takes charge of logging messages

This class takes care of logging only in one rank and of setting different logging levels.

Generally, the levels are set using the environment variables PYSEMTOOLS_DEBUG and PYSEMTOOLS_HIDE_LOG.

Parameters:
level : int, optional

Logging level. The default is None, which sets it to logging.INFO.

comm : MPI.Comm

MPI communicator.

module_name : str, optional

Name of the module that is using the logger. The default is None.

Attributes:
log : logging.Logger

Logger object that handles the logging

comm : MPI.Comm

MPI communicator

sync_time : dict

Dictionary to store the times for sync_tic/sync_toc methods

time : float

Variable to store the time for tic/toc methods

Methods

write(level, message)

Method that writes messages in the log

tic()

Store the current time.

toc()

Write elapsed time since the last call to tic.

sync_tic(id=0)

Store the current time.

sync_toc(id=0, message=None, time_message='Elapsed time: ')

Write elapsed time since the last call to tic.

sync_tic(id=0)

Store the current time in the attribute sync_time with key id.

Returns:
None.
sync_toc(id=0, message=None, time_message='Elapsed time: ', level='info')

Write elapsed time since the last call to tic for the given id in the sync_time attribute.

tic()

Store the current time.

Returns:
None.
toc(message=None, time_message='Elapsed time: ', level='info')

Write elapsed time since the last call to tic.

write(level, message)

Writes messages in the log

Parameters:
level : str

Level of the message. Possible values are: “debug_all”, “debug”, “info”, “info_all”, “warning”, “error”, “critical”.

message : str

Message to be logged.
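The single-rank logging and tic/toc pattern described above can be sketched with the standard library alone. This is a conceptual illustration, not the Logger implementation; here the rank is a plain variable, whereas the library obtains it from the MPI communicator:

```python
# Sketch of the rank-0 logging + tic/toc pattern (illustrative only).
import logging
import time

class TimedLogger:
    def __init__(self, rank=0, module_name="demo"):
        self.rank = rank
        self.log = logging.getLogger(module_name)
        self._t = 0.0

    def write(self, level, message):
        # Only rank 0 emits the message, so parallel runs log once.
        if self.rank == 0:
            getattr(self.log, level)(message)

    def tic(self):
        # Store the current time.
        self._t = time.perf_counter()

    def toc(self, time_message="Elapsed time: "):
        # Write elapsed time since the last call to tic.
        elapsed = time.perf_counter() - self._t
        self.write("info", f"{time_message}{elapsed:.6f} s")
        return elapsed

lg = TimedLogger()
lg.tic()
elapsed = lg.toc()
```

The sync_tic/sync_toc variants follow the same idea but keep a dictionary of start times keyed by id, so several timed regions can be tracked at once.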

rom

Set of tools to perform reduced order modeling tasks. Particularly, proper orthogonal decomposition (POD)

class pysemtools.rom.IoHelp(comm, number_of_fields=1, batch_size=1, field_size=1, field_data_type=<class 'numpy.float64'>, mass_matrix_data_type=<class 'numpy.float64'>, module_name='io_helper')

Class used to help with IO buffering

This class contains buffers to be used when carrying out, for example, POD

Parameters:
comm : MPI communicator

MPI communicator object.

number_of_fields : int

Number of fields to be stored in the buffer.

batch_size : int

Size of the buffer.

field_size : int

Size of each field.

field_data_type : data-type, optional

Data type of the fields. The default is np.double.

mass_matrix_data_type : data-type, optional

Data type of the mass matrix. The default is np.double.

module_name : str, optional

Name of the module for logging purposes. The default is “io_helper”.

Methods

copy_fieldlist_to_xi([field_list])

Copy a field list into the buffer xi position 0

load_buffer([scale_snapshot])

Function to load snapshot into the allocated buffer.

split_narray_to_1dfields(array)

Split a snapshot into a set of fields.

copy_fieldlist_to_xi(field_list=None)

Copy a field list into the buffer xi position 0

This is used to pack multi-dimensional data into one big column vector that works as a snapshot for the POD

Parameters:
field_list : list of np.ndarray

List of fields to be copied into xi.

Returns:
None
load_buffer(scale_snapshot=True)

Function to load snapshot into the allocated buffer.

This transfers the data from xi into the buffer at the current buffer index.

If the buffer is full, it sets the flag to update from buffer to True.

This is the buffer that the POD class uses to compute.

Parameters:
scale_snapshot : bool, optional

If True, the snapshot is scaled with the mass matrix before being loaded into the buffer. The default is True.

Returns:
None
split_narray_to_1dfields(array)

Split a snapshot into a set of fields.

This is essentially the inverse of copy_fieldlist_to_xi: a big column vector is split back into a list of fields.

Parameters:
array : np.ndarray

Array to be split into fields.

Returns:
field_list1d : list of np.ndarray

List of fields obtained from splitting the array.
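The pack/split round trip performed by copy_fieldlist_to_xi and split_narray_to_1dfields can be illustrated with plain numpy. The function names below are illustrative, not the library internals:

```python
# Conceptual numpy sketch of the snapshot pack/split round trip.
import numpy as np

def fields_to_snapshot(field_list):
    """Stack flattened fields into one big column vector (a snapshot)."""
    return np.concatenate([f.reshape(-1) for f in field_list])[:, None]

def snapshot_to_fields(xi, shapes):
    """Inverse: split the column vector back into fields of the given shapes."""
    fields, offset = [], 0
    for shape in shapes:
        n = int(np.prod(shape))
        fields.append(xi[offset:offset + n, 0].reshape(shape))
        offset += n
    return fields

u = np.arange(6.0).reshape(2, 3)
v = np.arange(4.0).reshape(2, 2)
xi = fields_to_snapshot([u, v])                    # shape (10, 1)
u2, v2 = snapshot_to_fields(xi, [u.shape, v.shape])
```

The round trip recovers the original fields exactly, which is the property the POD pipeline relies on when turning multi-field SEM data into column snapshots and back.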

class pysemtools.rom.POD(comm, number_of_modes_to_update=1, global_updates=True, auto_expand=False, auto_expand_from_these_modes=1, bckend='numpy')

Class that wraps the SVD to facilitate the use of the POD

This class performs the POD in parallel or locally, and allows for incremental updates of the modes.

This wraps the SVD class to provide a more user-friendly interface for performing POD.

Parameters:
comm : MPI communicator

MPI communicator object.

number_of_modes_to_update : int

Number of modes to update during each update step. The default is 1.

global_updates : bool, optional

If True, perform global updates. If False, perform local updates. The default is True.

auto_expand : bool, optional

If True, automatically expand the number of modes when the orthogonality criterion is not met. The default is False.

auto_expand_from_these_modes : int, optional

Number of modes from which to start the automatic expansion. The default is 1.

bckend : str, optional

Backend to be used for the math operations. The default is “numpy”. “torch” can be used if PyTorch is installed.

Methods

check_snapshot_orthogonality(comm[, xi])

Check the level of orthogonality of the new snapshot with the current basis

rotate_local_modes_to_global(comm)

Do a rotation of the current modes into a global basis

scale_modes(comm[, bm1sqrt, op])

Scale the current modes with the given mass matrix and the provided operation (div or mult)

update(comm[, buff])

Update POD modes from a batch of snapshots in buff

Returns:
None
check_snapshot_orthogonality(comm, xi=None)

Check the level of orthogonality of the new snapshot with the current basis

The data is stored in the running_ra attribute. This is valid only if auto expansion is activated. In that case, if the orthogonality ratio is above the minimun_orthogonality_ratio attribute and the current number of modes is below the set k, the number of modes k is increased by one. The minimum orthogonality ratio is set to 0.99 by default.

Parameters:
comm : MPI communicator

MPI communicator object.

xi : np.ndarray

New snapshot to be checked.

Returns:
None
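One plausible way to quantify how well a new snapshot is represented by the current basis (a sketch only; the exact ratio used by the class is an implementation detail) is the fraction of the snapshot's norm that remains after projecting onto the current modes:

```python
# Sketch: residual fraction of a snapshot relative to an orthonormal
# basis U. A value near 0 means the snapshot lies in the span of the
# current modes; a value near 1 means it is nearly orthogonal to them.
import numpy as np

def residual_ratio(U, xi):
    """||xi - U U^T xi|| / ||xi|| for an orthonormal basis U."""
    r = xi - U @ (U.T @ xi)
    return np.linalg.norm(r) / np.linalg.norm(xi)

rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((50, 3)))   # orthonormal basis
in_span = U @ rng.standard_normal((3, 1))           # lies in span(U)

ratio = residual_ratio(U, in_span)   # ~0 for a snapshot in the span
```

A threshold on such a ratio is the kind of criterion that can drive automatic expansion of the number of modes.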
rotate_local_modes_to_global(comm)

Do a rotation of the current modes into a global basis

This is needed if the POD is initially performed locally. The rotation is done only when required.

Parameters:
comm : MPI communicator

MPI communicator object.

Returns:
None
scale_modes(comm, bm1sqrt=None, op='div')

Scale the current modes with the given mass matrix and the provided operation (div or mult)

Parameters:
comm : MPI communicator

MPI communicator object.

bm1sqrt : np.ndarray

Mass matrix to scale the modes.

op : str, optional

Operation to be performed. “div” for division, “mult” for multiplication. The default is “div”.

Returns:
None
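The purpose of this scaling can be shown with a small numpy example, assuming (as is typical for SEM data, though not stated here) a diagonal mass matrix B: multiplying fields by sqrt(B) turns the mass-weighted inner product into a plain Euclidean dot product, and dividing undoes the scaling. All names below are illustrative:

```python
# Sketch: with phi_s = sqrt(B) * phi, the mass-weighted inner product
# phi1^T B phi2 equals the Euclidean dot product of the scaled fields.
import numpy as np

rng = np.random.default_rng(1)
b = rng.uniform(0.5, 2.0, size=8)      # diagonal of an assumed mass matrix B
bm1sqrt = np.sqrt(b)[:, None]          # square root of the mass matrix

phi1 = rng.standard_normal((8, 1))
phi2 = rng.standard_normal((8, 1))

weighted = float(phi1.T @ (b[:, None] * phi2))           # phi1^T B phi2
euclidean = float((bm1sqrt * phi1).T @ (bm1sqrt * phi2)) # scaled dot product

# op="div" recovers the unscaled mode from the scaled one
phi1_back = (bm1sqrt * phi1) / bm1sqrt
```

This is why modes computed from mass-scaled snapshots are typically divided by the square root of the mass matrix before being interpreted as physical fields.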
update(comm, buff=None)

Update POD modes from a batch of snapshots in buff

Parameters:
comm : MPI communicator

MPI communicator object.

buff : np.ndarray

Buffer containing the new snapshots to update the modes.

Returns:
None
class pysemtools.rom.SVD(logger, bckend='numpy')

Class used to obtain parallel and streaming SVD results

This is the main class of the ROM module, and POD wraps around it.

Parameters:
logger : Logger

Logger object to be used for logging.

bckend : str, optional

Backend to be used for the math operations. The default is “numpy”. “torch” can be used if PyTorch is installed.

Methods

gbl_svd(xi, comm)

Perform a global SVD that contains all necessary rotations for global modes

gbl_svd_numpy(xi, comm)

Perform a global SVD using numpy, including necessary rotations for global modes using numpy

gbl_svd_torch(xi, comm)

Perform a global SVD using PyTorch, including necessary rotations for global modes.

gbl_update(u_1t, d_1t, vt_1t, xi, k, comm)

Method to update the global svds from a batch of data

gbl_update_numpy(u_1t, d_1t, vt_1t, xi, k, comm)

Method to update the global svds from a batch of data using numpy

gbl_update_torch(u_1t, d_1t, vt_1t, xi, k, comm)

Method to update the global svds from a batch of data using PyTorch

lcl_to_gbl_svd(uii, dii, vtii, k_set, comm)

Perform rotations to obtain global modes from local modes.

lcl_update(u_1t, d_1t, vt_1t, xi, k)

Method to update the local svds from a batch of data

Returns:
None
gbl_svd(xi, comm)

Perform a global SVD that contains all necessary rotations for global modes

Parameters:
xi : np.ndarray

Data to perform the svd on.

comm : MPI communicator

MPI communicator object.

Returns:
u_local : np.ndarray

Local left singular vectors after global rotation. Here “local” means that the modes are distributed across ranks; the modes themselves are global.

dy : np.ndarray

Global singular values.

vty : np.ndarray

Global right singular vectors.

gbl_svd_numpy(xi, comm)

Perform a global SVD using numpy, including necessary rotations for global modes using numpy

Parameters:
xi : np.ndarray

Data to perform the svd on.

comm : MPI communicator

MPI communicator object.

Returns:
u_local : np.ndarray

Local left singular vectors after global rotation. Here “local” means that the modes are distributed across ranks; the modes themselves are global.

dy : np.ndarray

Global singular values.

vty : np.ndarray

Global right singular vectors.

gbl_svd_torch(xi, comm)

Perform a global SVD using PyTorch, including necessary rotations for global modes.

Parameters:
xi : np.ndarray or torch.Tensor

Data to perform the svd on.

comm : MPI communicator

MPI communicator object.

Returns:
u_local : torch.Tensor

Local left singular vectors after global rotation. Here “local” means that the modes are distributed across ranks; the modes themselves are global.

dy : torch.Tensor

Global singular values.

vty : torch.Tensor

Global right singular vectors.

gbl_update(u_1t, d_1t, vt_1t, xi, k, comm)

Method to update the global svds from a batch of data

Parameters:
u_1t : np.ndarray

Left singular vectors from previous update.

d_1t : np.ndarray

Singular values from previous update.

vt_1t : np.ndarray

Right singular vectors from previous update.

xi : np.ndarray

New data to be used for the update.

k : int

Number of modes to be kept.

comm : MPI communicator

MPI communicator object.

Returns:
u_1t : np.ndarray

Updated left singular vectors.

d_1t : np.ndarray

Updated singular values.

vt_1t : np.ndarray

Updated right singular vectors.

Notes

No need to delete xi, as this comes from a buffer that is pre-allocated

gbl_update_numpy(u_1t, d_1t, vt_1t, xi, k, comm)

Method to update the global svds from a batch of data using numpy

Parameters:
u_1t : np.ndarray

Left singular vectors from previous update.

d_1t : np.ndarray

Singular values from previous update.

vt_1t : np.ndarray

Right singular vectors from previous update.

xi : np.ndarray

New data to be used for the update.

k : int

Number of modes to be kept.

comm : MPI communicator

MPI communicator object.

Returns:
u_1t : np.ndarray

Updated left singular vectors.

d_1t : np.ndarray

Updated singular values.

vt_1t : np.ndarray

Updated right singular vectors.

Notes

No need to delete xi, as this comes from a buffer that is pre-allocated

gbl_update_torch(u_1t, d_1t, vt_1t, xi, k, comm)

Method to update the global svds from a batch of data using PyTorch

Parameters:
u_1t : torch.Tensor

Left singular vectors from previous update.

d_1t : torch.Tensor

Singular values from previous update.

vt_1t : torch.Tensor

Right singular vectors from previous update.

xi : np.ndarray or torch.Tensor

New data to be used for the update.

k : int

Number of modes to be kept.

comm : MPI communicator

MPI communicator object.

Returns:
u_1t : torch.Tensor

Updated left singular vectors.

d_1t : torch.Tensor

Updated singular values.

vt_1t : torch.Tensor

Updated right singular vectors.

Notes

No need to delete xi, as this comes from a buffer that is pre-allocated

lcl_to_gbl_svd(uii, dii, vtii, k_set, comm)

Perform rotations to obtain global modes from local modes.

Parameters:
uii : np.ndarray

Local left singular vectors.

dii : np.ndarray

Local singular values.

vtii : np.ndarray

Local right singular vectors.

k_set : int

Number of modes to keep globally. This differs from k because, in local updates, each rank can have a different k; together these provide ALL global modes.

comm : MPI communicator

MPI communicator object.

Returns:
ui_global : np.ndarray

Global left singular vectors.

dy : np.ndarray

Global singular values.

vty : np.ndarray

Global right singular vectors.

lcl_update(u_1t, d_1t, vt_1t, xi, k)

Method to update the local svds from a batch of data

This method does not require communication.

Parameters:
u_1t : np.ndarray

Left singular vectors from previous update.

d_1t : np.ndarray

Singular values from previous update.

vt_1t : np.ndarray

Right singular vectors from previous update.

xi : np.ndarray

New data to be used for the update.

k : int

Number of modes to be kept.

Returns:
u_1t : np.ndarray

Updated left singular vectors.

d_1t : np.ndarray

Updated singular values.

vt_1t : np.ndarray

Updated right singular vectors.
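The kind of update these methods perform can be sketched in pure numpy with a generic textbook incremental truncated SVD step (this is not the library's exact implementation): project the new batch onto the current modes, orthogonalize the residual, solve a small core SVD, and rotate.

```python
# Generic incremental truncated SVD update (illustrative sketch).
import numpy as np

def svd_update(u_1t, d_1t, vt_1t, xi, k):
    """Update the truncated SVD u*diag(d)*vt with a new batch of columns xi."""
    n = vt_1t.shape[1]
    p = xi.shape[1]
    utx = u_1t.T @ xi                     # projection on current modes
    resid = xi - u_1t @ utx               # part orthogonal to current modes
    q, r = np.linalg.qr(resid)
    # small core matrix combining old singular values and the new batch
    core = np.block([[np.diag(d_1t), utx],
                     [np.zeros((p, d_1t.size)), r]])
    uc, dc, vtc = np.linalg.svd(core, full_matrices=False)
    u_new = np.hstack([u_1t, q]) @ uc[:, :k]
    vt_aug = np.block([[vt_1t, np.zeros((d_1t.size, p))],
                       [np.zeros((p, n)), np.eye(p)]])
    vt_new = vtc[:k] @ vt_aug
    return u_new, dc[:k], vt_new

rng = np.random.default_rng(2)
a1 = rng.standard_normal((6, 3))          # first batch of snapshots
a2 = rng.standard_normal((6, 3))          # second batch of snapshots
u, d, vt = np.linalg.svd(a1, full_matrices=False)
u, d, vt = svd_update(u, d, vt, a2, k=6)  # k large enough to be exact here
```

With k at least the rank of the accumulated data, the update is exact; in streaming POD, k truncates to the number of modes one wants to keep.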