CSE Quick Reference Guide

The Computational Science Environment (CSE) project is a stack of tools that combines all Baseline Configuration Team (BCT) mandated packages, built in a similar way on every HPC system in the program. This gives users a common environment on every machine in the program, so they can develop software and run jobs in a similar fashion regardless of the machine.

Packages and Libraries Included

Anaconda     ARPACK-NG    Boost C++    C-Blosc
CMake        Dakota       Doxygen      FFTW
Flex         Git          Gnuplot      Graphviz
GSL          HDF5         LAPACK       libpcap
Mesa         METIS        NCL          NetCDF
Octave       OpenMPI      PAPI         PETSc
Python       Python3      QT           R
ScaLAPACK    SCALASCA     SQLite       SuperLU
TAU          Valgrind     VTK          XDMF
XV           ZeroMQ

How to Use the CSE Software and Libraries

Initial module setup:                           module load cseinit

View what software is available:                module avail cse

Load the desired module (GSL as an example):    module load cse/gsl/latest

After loading the module, the software package or library will be added to your path.

Module name format for CSE applications is cse/application/version. The latest tag for each module always links to the latest installed version of that application. For example, to load the latest CSE version of GSL:

module load cseinit
module load cse/gsl/latest

When loading modules, the version tag can be left off. If more than one version exists, this loads the newest version of the software. For example, to load the CSE version of GSL without specifying a version:

module load cseinit
module load cse/gsl
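
The steps above can be combined at the top of a login session or job script. The following is a minimal sketch, assuming a shell in which the module command is already defined. The helper name cse_module is hypothetical; it only composes the cse/application/version name described above and defaults the version to latest.

```shell
#!/bin/sh
# Hypothetical helper (not part of CSE): build a module name in the
# documented cse/application/version form; version defaults to "latest".
cse_module() {
    app=$1
    version=${2:-latest}
    printf 'cse/%s/%s\n' "$app" "$version"
}

# Only call `module` where it is defined (it is usually a shell function
# set up by the system's login scripts).
if command -v module >/dev/null 2>&1; then
    module load cseinit                # initial CSE setup
    module avail cse                   # list the available CSE software
    module load "$(cse_module gsl)"    # same as: module load cse/gsl/latest
    module list                        # confirm what is now loaded
else
    echo "module command not found; run this on an HPC system" >&2
fi
```

Pinning an explicit version instead of latest, for example with cse_module gsl followed by a version string shown by module avail cse, keeps a job reproducible when a newer version is installed later.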

More information about the module command can be found in any of the DSRC modules guides. For example:
https://centers.hpc.mil/users/docs/afrl/modulesGuide.html
Additional information can be found at http://modules.sourceforge.net/

Compiler Support

The CSE application stack consists largely of open-source software, most of which was written with the GNU Compiler Collection (GCC) in mind. Because of this, GCC is the primary supported compiler for the CSE application stack.

The Intel compilers contain a GCC compatibility layer that allows for an Intel-built version of CSE. To load the Intel version of the CSE software stack:

module load cseinit-intel

The module names for CSE software are the same whether using the GCC-built cseinit or the Intel-built cseinit-intel environment modules.
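
Because the module names are the same in both environments, a script can switch compilers by changing only the init module. The sketch below illustrates this under the assumption that the module command is available; the function name cse_init_module and the variable CSE_COMPILER are hypothetical and exist only for this example.

```shell
#!/bin/sh
# Hypothetical helper: map a compiler family to the CSE init module
# named in this guide (cseinit for GCC, cseinit-intel for Intel).
cse_init_module() {
    case "$1" in
        intel) echo cseinit-intel ;;
        *)     echo cseinit ;;
    esac
}

# CSE_COMPILER is an assumed, user-chosen variable; GCC is the default
# because it is the primary supported compiler.
init_module=$(cse_init_module "${CSE_COMPILER:-gcc}")

if command -v module >/dev/null 2>&1; then
    module load "$init_module"
    # The package module name does not change with the compiler.
    module load cse/fftw/latest
else
    echo "module command not found; would have loaded $init_module" >&2
fi
```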

Software Explanation

Anaconda

Description

Anaconda is a free and open-source distribution of the Python and R programming languages for scientific computing (data science, machine learning applications, large-scale data processing, predictive analytics, etc.), that aims to simplify package management and deployment. The distribution includes data-science packages suitable for Windows, Linux, and macOS.

NOTE: The modules included within the cse/anaconda2 and cse/anaconda3 installs are too numerous to list in the scope of this document. A full list of installed modules can be found by loading either the cse/anaconda2 or cse/anaconda3 module and running the conda list command.

Usage

module load cse/anaconda2/latest - or -
module load cse/anaconda3/latest
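
As the note above describes, conda list reports what an Anaconda install contains. A minimal sketch follows, assuming the module command is available and that loading the Anaconda module puts conda on the path. The scratch file and the search for numpy are illustrative choices only.

```shell
#!/bin/sh
if command -v module >/dev/null 2>&1; then
    module load cseinit
    module load cse/anaconda3/latest
fi

# Save the full list to a scratch file so it can be searched afterwards.
pkg_list=$(mktemp)
if command -v conda >/dev/null 2>&1; then
    conda list > "$pkg_list" || echo "conda list failed" >&2
    # Example search: is numpy part of this install?
    grep -i '^numpy' "$pkg_list" || echo "numpy is not in this install"
else
    echo "conda not found; load cse/anaconda2 or cse/anaconda3 first" >&2
fi
```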

Vendor Links

https://www.anaconda.com/

ARPACK-NG

Description

ARPACK, the ARnoldi PACKage, is a numerical software library written in FORTRAN 77 for solving large-scale eigenvalue problems in a matrix-free fashion.

The package is designed to compute a few eigenvalues and corresponding eigenvectors of large sparse or structured matrices, using the Implicitly Restarted Arnoldi Method (IRAM) or, in the case of symmetric matrices, the corresponding variant of the Lanczos algorithm. It is used by many popular numerical computing environments such as SciPy, Mathematica, GNU Octave and MATLAB to provide this functionality.

Usage
module load cse/arpack_ng/latest
Vendor Links

https://github.com/opencollab/arpack-ng

Boost C++

Description

Boost is a set of libraries for the C++ programming language that provide support for tasks and structures such as linear algebra, pseudorandom number generation, multithreading, image processing, regular expressions, and unit testing. It contains over eighty individual libraries.

Usage
module load cse/boost/latest
Vendor Links

http://www.boost.org/

C-Blosc

Description

C-Blosc is a blocking, shuffling, and lossless compression library that can be faster than memcpy().

Usage
module load cse/cblosc/latest
Vendor Links

https://blosc.org/
https://github.com/Blosc/c-blosc

CMake

Description

CMake is an open-source, cross-platform family of tools designed to build, test, and package software. CMake controls the software compilation process using simple platform-independent and compiler-independent configuration files, and generates native makefiles and workspaces that can be used in the compiler environment of your choice. The suite of CMake tools was created by Kitware in response to the need for a powerful, cross-platform build environment for open-source projects such as ITK and VTK.

Usage
module load cse/cmake/latest
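
The description above mentions configuration files that generate native makefiles. The following sketch shows one way this can look for a one-file C project, assuming cmake and a C compiler are on the path after loading the module. The project name hello and the scratch directory are made up for illustration.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Minimal configuration file describing one executable.
cat > "$workdir/CMakeLists.txt" <<'EOF'
cmake_minimum_required(VERSION 3.10)
project(hello C)
add_executable(hello hello.c)
EOF

cat > "$workdir/hello.c" <<'EOF'
#include <stdio.h>

int main(void)
{
    puts("hello from a CMake build");
    return 0;
}
EOF

if command -v cmake >/dev/null 2>&1; then
    # Out-of-source build: generated files stay inside build/.
    mkdir -p "$workdir/build"
    (cd "$workdir/build" && cmake .. && cmake --build .) \
        || echo "configure or build failed" >&2
else
    echo "cmake not found; load cse/cmake first" >&2
fi
```
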
Vendor Links

https://cmake.org

Dakota

Description

From: https://dakota.sandia.gov/content/about

The Dakota project delivers both state-of-the-art research and robust, usable software for optimization and UQ. Broadly, the Dakota software's advanced parametric analyses enable design exploration, model calibration, risk analysis, and quantification of margins and uncertainty with computational models. The Dakota toolkit provides a flexible, extensible interface between such simulation codes and its iterative systems analysis methods, which include:

  • optimization with gradient and non-gradient-based methods;
  • uncertainty quantification with sampling, reliability, stochastic expansion, and epistemic methods;
  • parameter estimation using nonlinear least squares (deterministic) or Bayesian inference (stochastic); and
  • sensitivity/variance analysis with design of experiments and parameter study methods.

These capabilities may be used on their own or as components within advanced strategies such as hybrid optimization, surrogate-based optimization, mixed integer nonlinear programming, or optimization under uncertainty.

Usage
module load cse/dakota/latest
Vendor Links

https://dakota.sandia.gov/

Doxygen

Description

Doxygen is the de facto standard tool for generating documentation from annotated C++ sources, but it also supports other popular programming languages such as C, Objective-C, C#, PHP, Java, Python, IDL (Corba, Microsoft, and UNO/OpenOffice flavors), Fortran, VHDL, and to some extent D.

Usage
module load cse/doxygen/latest
Vendor Links

http://www.doxygen.nl/

FFTW

Description

FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions, of arbitrary input size, and of both real and complex data (as well as of even/odd data, i.e., the discrete cosine/sine transforms or DCT/DST).

Usage
module load cse/fftw/latest
Vendor Links

http://www.fftw.org

Flex

Description

Flex (fast lexical analyzer generator) is a free and open-source software alternative to lex. It is a computer program that generates lexical analyzers (also known as "scanners" or "lexers").

Usage
module load cse/flex/latest
Vendor Links

https://github.com/westes/flex

Git

Description

Git is a free and open-source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.

Usage
module load cse/git/latest
Vendor Links

https://git-scm.com/

Gnuplot

Description

Gnuplot is a portable command-line driven graphing utility for Linux, OS/2, MS Windows, OSX, VMS, and many other platforms. The source code is copyrighted but freely distributed (i.e., you don't have to pay for it). It was originally created to allow scientists and students to visualize mathematical functions and data interactively but has grown to support many non-interactive uses such as web scripting. It is also used as a plotting engine by third-party applications like Octave. Gnuplot has been supported and under active development since 1986.

Usage
module load cse/gnuplot/latest
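
Since Gnuplot is command-line driven, it can run non-interactively from a script. The sketch below assumes gnuplot is on the path after loading the module; the data values and file names are made up. It uses the dumb terminal, which draws the plot as text and therefore needs no display.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Made-up sample data: x in column 1, x squared in column 2.
cat > "$workdir/data.dat" <<'EOF'
1 1
2 4
3 9
4 16
EOF

# Gnuplot script; the unquoted heredoc expands $workdir here.
cat > "$workdir/plot.gp" <<EOF
set terminal dumb
set title "Sample data"
plot "$workdir/data.dat" using 1:2 with linespoints title "x^2"
EOF

if command -v gnuplot >/dev/null 2>&1; then
    gnuplot "$workdir/plot.gp" || echo "gnuplot failed" >&2
else
    echo "gnuplot not found; load cse/gnuplot first" >&2
fi
```
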
Vendor Links

http://www.gnuplot.info/

Graphviz

Description

Graphviz is open-source graph visualization software. Graph visualization is a way of representing structural information as diagrams of abstract graphs and networks. It has important applications in networking, bioinformatics, software engineering, database and web design, machine learning, and in visual interfaces for other technical domains.

Usage
module load cse/graphviz/latest
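
As an illustration of turning structural information into a diagram, the sketch below writes a small graph description and renders it with the dot layout tool. It assumes dot is on the path after loading the module; the graph itself is invented for the example and does not describe real CSE dependencies.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# A made-up directed graph in the DOT language.
cat > "$workdir/deps.dot" <<'EOF'
digraph deps {
    rankdir=LR;
    app -> solver;
    app -> io;
    solver -> mesh;
}
EOF

if command -v dot >/dev/null 2>&1; then
    # Render the graph to an SVG file.
    dot -Tsvg "$workdir/deps.dot" -o "$workdir/deps.svg" \
        || echo "dot failed" >&2
else
    echo "dot not found; load cse/graphviz first" >&2
fi
```
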
Vendor Links

https://graphviz.org/

GSL

Description

The GNU Scientific Library (GSL) is a numerical library for C and C++ programmers. It is free software under the GNU General Public License.

The library provides a wide range of mathematical routines such as random number generators, special functions, and least-squares fitting. There are over 1000 functions in total with an extensive test suite.

The complete range of subject areas covered by the library includes:

Subject Areas Covered
Complex Numbers                Roots of Polynomials       Special Functions
Vectors and Matrices           Permutations               Sorting
BLAS Support                   Linear Algebra             Eigensystems
Fast Fourier Transforms        Quadrature                 Random Numbers
Quasi-Random Sequences         Random Distributions       Statistics
Histograms                     N-Tuples                   Monte Carlo Integration
Simulated Annealing            Differential Equations     Interpolation
Numerical Differentiation      Chebyshev Approximation    Series Acceleration
Discrete Hankel Transforms     Root-Finding               Minimization
Least-Squares Fitting          Physical Constants         IEEE Floating-Point
Discrete Wavelet Transforms    Basis splines              Running Statistics
Sparse Matrices and Linear

Usage
module load cse/gsl/latest
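
Once the module is loaded, a C program can be compiled against GSL. The sketch below is one way to do this, assuming gcc and the gsl-config helper that ships with GSL are both on the path; the file names are made up. The program calls gsl_sf_bessel_J0, one of the special functions listed above.

```shell
#!/bin/sh
workdir=$(mktemp -d)

cat > "$workdir/bessel.c" <<'EOF'
#include <stdio.h>
#include <gsl/gsl_sf_bessel.h>

int main(void)
{
    double x = 5.0;
    /* Regular cylindrical Bessel function of order zero, J0(x). */
    printf("J0(%g) = %.18e\n", x, gsl_sf_bessel_J0(x));
    return 0;
}
EOF

if command -v gsl-config >/dev/null 2>&1 && command -v gcc >/dev/null 2>&1; then
    # gsl-config prints the include and link flags for this GSL install;
    # the substitutions are left unquoted so the flags split into words.
    gcc $(gsl-config --cflags) "$workdir/bessel.c" \
        $(gsl-config --libs) -o "$workdir/bessel" \
        && "$workdir/bessel" \
        || echo "compile or run failed" >&2
else
    echo "gsl-config or gcc not found; load cse/gsl first" >&2
fi
```
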
Vendor Links

http://www.gnu.org/software/gsl

HDF5

Description

HDF5 is a data model, library, and file format for storing and managing data. It supports an unlimited variety of datatypes and is designed for flexible and efficient I/O and for high volume and complex data. HDF5 is portable and is extensible, allowing applications to evolve in their use of HDF5. The HDF5 Technology suite includes tools and applications for managing, manipulating, viewing, and analyzing data in the HDF5 format.

Usage

module load cse/hdf5/latest - or -
module load cse/hdf5_no_mpi/latest

Vendor Links

http://www.hdfgroup.org/HDF5/

LAPACK

Description

LAPACK is written in Fortran 90 and provides routines for solving systems of simultaneous linear equations, least-squares solutions of linear systems of equations, eigenvalue problems, and singular value problems. The associated matrix factorizations (LU, Cholesky, QR, SVD, Schur, and generalized Schur) are also provided, as are related computations such as reordering of the Schur factorizations and estimating condition numbers. Dense and banded matrices are handled, but not general sparse matrices. In all areas, similar functionality is provided for real and complex matrices, in both single and double precision.

The Basic Linear Algebra Subprograms (BLAS) are a de facto standard application programming interface for libraries that perform basic linear algebra operations such as vector and matrix multiplication.

Usage
module load cse/lapack/latest
Vendor Links

http://www.netlib.org/lapack

libpcap

Description

libpcap is an application programming interface (API) for capturing network traffic.

Usage
module load cse/libpcap/latest
Vendor Links

https://www.tcpdump.org/

Mesa

Description

Mesa, also called Mesa3D and The Mesa 3D Graphics Library, is an open-source software implementation of OpenGL, Vulkan, and other graphics API specifications. Mesa translates these specifications to vendor-specific graphics hardware drivers.

Usage
module load cse/mesa/latest
Vendor Links

https://mesa3d.org/

METIS

Description

METIS is a set of serial programs for partitioning graphs, partitioning finite element meshes, and producing fill reducing orderings for sparse matrices. The algorithms implemented in METIS are based on the multilevel recursive-bisection, multilevel k-way, and multi-constraint partitioning schemes developed at the Karypis Lab at the University of Minnesota.

Usage
module load cse/metis/latest
Vendor Links

http://glaros.dtc.umn.edu/gkhome/metis/metis/overview

NCL & NCAR Graphics

Description

NCL/NCAR Graphics comprises:

  • A library containing over two dozen Fortran/C utilities for drawing contours, maps, vectors, streamlines, weather maps, surfaces, histograms, X/Y plots, annotations, and more
  • An ANSI/ISO standard version of GKS, with both C and FORTRAN callable entries
  • A math library containing a collection of C and Fortran interpolators and approximators for one-dimensional, two-dimensional, and three-dimensional data
  • Applications for displaying, editing, and manipulating graphical output
  • Map databases
  • Hundreds of FORTRAN and C examples
  • Demo programs
  • Compilation scripts
Usage
module load cse/ncl/latest
Vendor Links

http://www.ncl.ucar.edu/overview.shtml

NetCDF

Description

NetCDF (network Common Data Form) is a set of interfaces for array-oriented data access and a freely distributed collection of data access libraries for C, C++, and FORTRAN. The NetCDF libraries support a machine-independent format for representing scientific data. Together, the interfaces, libraries, and format support the creation, access, and sharing of scientific data.

NetCDF is available in C, C++, and FORTRAN library sets. Because each library set is developed and versioned separately, each is available as its own module.

Usage

module load cse/netcdf/latest - or -
module load cse/netcdf_cxx/latest - or -
module load cse/netcdf_fortran/latest
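
To illustrate the machine-independent format, the sketch below creates a small NetCDF file from a text (CDL) description and then prints its header. It assumes the ncgen and ncdump utilities from the NetCDF C distribution are on the path after loading cse/netcdf; the variable name and values are made up.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# CDL is the text form of a NetCDF file: dimensions, variables, data.
cat > "$workdir/sample.cdl" <<'EOF'
netcdf sample {
dimensions:
    x = 3 ;
variables:
    double temperature(x) ;
        temperature:units = "K" ;
data:
    temperature = 280.5, 281.0, 282.25 ;
}
EOF

if command -v ncgen >/dev/null 2>&1 && command -v ncdump >/dev/null 2>&1; then
    # Convert the text description to a binary NetCDF file,
    # then print only its header (dimensions, variables, attributes).
    ncgen -o "$workdir/sample.nc" "$workdir/sample.cdl" \
        && ncdump -h "$workdir/sample.nc" \
        || echo "ncgen or ncdump failed" >&2
else
    echo "ncgen/ncdump not found; load cse/netcdf first" >&2
fi
```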

Vendor Links

http://www.unidata.ucar.edu/software/netcdf/

Octave

Description

GNU Octave is a high-level interpreted language, primarily intended for numerical computations. It provides capabilities for the numerical solution of linear and nonlinear problems, and for performing other numerical experiments. It also provides extensive graphics capabilities for data visualization and manipulation. Octave is normally used through its interactive command line interface, but it can also be used to write non-interactive programs. The Octave language is quite similar to MATLAB so that most programs are easily portable.

Usage
module load cse/octave/latest
Vendor Links

http://www.gnu.org/software/octave

OpenMPI

Description

The Open MPI Project is an open-source Message Passing Interface implementation that is developed and maintained by a consortium of academic, research, and industry partners. Open MPI is therefore able to combine the expertise, technologies, and resources from across the High-Performance Computing community to build the best MPI library available. Open MPI offers advantages for system and software vendors, application developers and computer science researchers.

Usage
module load cse/openmpi/latest
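
A short MPI program is a quick way to check that the module works. The sketch below assumes the Open MPI wrappers mpicc and mpirun are on the path after loading the module; the file names and the process count of 2 are arbitrary. On a production HPC system, parallel runs normally go through the batch scheduler rather than a login node, so treat the mpirun line as illustrative.

```shell
#!/bin/sh
workdir=$(mktemp -d)

cat > "$workdir/hello_mpi.c" <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total process count */
    printf("rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF

if command -v mpicc >/dev/null 2>&1 && command -v mpirun >/dev/null 2>&1; then
    # mpicc wraps the underlying compiler and adds the MPI flags.
    mpicc "$workdir/hello_mpi.c" -o "$workdir/hello_mpi" \
        && mpirun -np 2 "$workdir/hello_mpi" \
        || echo "MPI compile or run failed" >&2
else
    echo "mpicc/mpirun not found; load cse/openmpi first" >&2
fi
```
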
Vendor Links

https://www.open-mpi.org/

PAPI

Description

PAPI provides the tool designer and application engineer with a consistent interface and methodology for use of the performance counter hardware found in most major microprocessors. PAPI enables software engineers to see, in near real time, the relation between software performance and processor events.

Usage
module load cse/papi/latest
Vendor Links

http://icl.cs.utk.edu/papi/

PETSc

Description

PETSc is a suite of data structures and routines for the scalable (parallel) solution of scientific applications modeled by partial differential equations. It supports MPI, shared memory pthreads, and NVIDIA GPUs, as well as hybrid MPI-shared memory pthreads or MPI-GPU parallelism.

Usage

module load cse/petsc_complex/latest - or -
module load cse/petsc_real/latest

Vendor Links

http://www.mcs.anl.gov/petsc

Python

Description

Python is an interpreted, interactive, object-oriented programming language. It is often compared to Tcl, Perl, Scheme or Java.

Python combines remarkable power with very clear syntax. It has modules, classes, exceptions, very high-level dynamic data types, and dynamic typing. There are interfaces to many system calls and libraries, as well as to various windowing systems (X11, Motif, Tk, Mac, MFC). New built-in modules are easily written in C or C++. Python is also usable as an extension language for applications that need a programmable interface.

CSE's Python module, cse/python/latest, provides the following modules:

cse/cython cse/dask cse/dateutil cse/distributed
cse/docutils cse/dpkt cse/h5py cse/h5py_no_mpi
cse/ipython cse/lineprofiler cse/logilab cse/lzo
cse/matplotlib cse/mercurial cse/mgen cse/mock
cse/mpi4py cse/networkx cse/numexpr cse/numpy
cse/pandas cse/petsc4py_complex cse/petsc4py_real cse/pybindgen
cse/pyblosc cse/pygal cse/pylint cse/pyparsing
cse/pypcap cse/pyqt cse/pyro cse/pytables
cse/pytecplot cse/pytz cse/pyzmq cse/scipy
cse/scons cse/sip cse/skimage cse/sklearn
cse/snappy cse/statprof cse/theano cse/winpdb
cse/wxpython
Usage
module load cse/python/latest
Vendor Links

http://www.python.org/

Python3

Description

Python is an interpreted, interactive, object-oriented programming language. It is often compared to Tcl, Perl, Scheme or Java.

Python combines remarkable power with very clear syntax. It has modules, classes, exceptions, very high-level dynamic data types, and dynamic typing. There are interfaces to many system calls and libraries, as well as to various windowing systems (X11, Motif, Tk, Mac, MFC). New built-in modules are easily written in C or C++. Python is also usable as an extension language for applications that need a programmable interface.

NOTE: CSE's cse/python3 module references the same install as cse/anaconda3. As such, it provides no Python3 modules outside of what is provided with Anaconda3.

Usage
module load cse/python3/latest
Vendor Links

http://www.python.org/

QT

Description

Qt is a free and open-source widget toolkit for creating graphical user interfaces as well as cross-platform applications that run on various software and hardware platforms such as Linux, Windows, macOS, Android or embedded systems with little or no change in the underlying codebase while still being a native application with native capabilities and speed.

Usage
module load cse/qt/latest
Vendor Links

https://www.qt.io/

R

Description

R is a free software environment for statistical computing and graphics.

Usage
module load cse/R/latest
Vendor Links

https://www.r-project.org/

ScaLAPACK

Description

ScaLAPACK is a library of high-performance linear algebra routines for parallel distributed memory machines. ScaLAPACK solves dense and banded linear systems, least squares problems, eigenvalue problems, and singular value problems.

Usage
module load cse/scalapack/latest
Vendor Links

http://www.netlib.org/scalapack

SCALASCA

Description

Scalasca is a software tool that supports the performance optimization of parallel programs by measuring and analyzing their runtime behavior. The analysis identifies potential performance bottlenecks - particularly those concerning communication and synchronization - and offers guidance in exploring their causes.

Usage
module load cse/scalasca/latest
Vendor Links

http://www.scalasca.org/

SQLite

Description

SQLite is a C-language library that implements a small, fast, self-contained, high-reliability, full-featured, SQL database engine.

Usage
module load cse/sqlite/latest
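
Because SQLite is self-contained, a database is a single file and needs no server. The sketch below assumes the sqlite3 command-line shell is on the path after loading the module (the module may provide only the library; that is not confirmed here). The table and its rows are made up.

```shell
#!/bin/sh
workdir=$(mktemp -d)
db="$workdir/runs.db"

# Made-up schema and rows recording solver run times.
cat > "$workdir/runs.sql" <<'EOF'
CREATE TABLE runs (id INTEGER PRIMARY KEY, solver TEXT, seconds REAL);
INSERT INTO runs (solver, seconds) VALUES ('cg', 12.5), ('gmres', 20.25);
SELECT solver, seconds FROM runs ORDER BY seconds;
EOF

if command -v sqlite3 >/dev/null 2>&1; then
    # The database file is created on first use.
    sqlite3 "$db" < "$workdir/runs.sql" || echo "sqlite3 failed" >&2
else
    echo "sqlite3 not found; load cse/sqlite first" >&2
fi
```
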
Vendor Links

https://sqlite.org/

SuperLU

Description

SuperLU is a general-purpose library for the direct solution of large, sparse, non-symmetric systems of linear equations on high performance machines. The library is written in C and is callable from either C or Fortran. The library routines will perform an LU decomposition with partial pivoting and triangular system solves through forward and back substitution. The LU factorization routines can handle non-square matrices but the triangular solves are performed only for square matrices. The matrix columns may be preordered (before factorization) either through library or user supplied routines. This preordering for sparsity is completely separate from the factorization. Working precision iterative refinement subroutines are provided for improved backward stability. Routines are also provided to equilibrate the system, estimate the condition number, calculate the relative backward error, and estimate error bounds for the refined solutions.

Usage
module load cse/superlu/latest
Vendor Links

http://crd-legacy.lbl.gov/~xiaoye/SuperLU

TAU

Description

TAU (Tuning and Analysis Utilities) is capable of gathering performance information through instrumentation of functions, methods, basic blocks, and statements. All C++ language features are supported including templates and namespaces. The API also provides selection of profiling groups for organizing and controlling instrumentation. The instrumentation can be inserted in the source code using an automatic instrumentor tool based on the Program Database Toolkit (PDT), dynamically using DyninstAPI, at runtime in the Java Virtual Machine, or manually using the instrumentation API.

TAU's profile visualization tool, paraprof, provides graphical displays of all the performance analysis results, in aggregate and single node/context/thread forms. The user can quickly identify sources of performance bottlenecks in the application using the graphical interface. In addition, TAU can generate event traces that can be displayed with the Vampir, Paraver or JumpShot trace visualization tools.

Usage

module load cse/tau/latest - or -
module load cse/tau_mpi/latest

Vendor Links

http://www.cs.uoregon.edu/research/tau/home.php

Valgrind

Description

Valgrind is an instrumentation framework for building dynamic analysis tools. There are Valgrind tools that can automatically detect many memory management and threading bugs, and profile your programs in detail. You can also use Valgrind to build new tools.

Usage
module load cse/valgrind/latest
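
To show the memory-management checks mentioned above, the sketch below compiles a small C program with a deliberate leak and runs it under Valgrind's default Memcheck tool. It assumes gcc and valgrind are on the path; the program and file names are made up.

```shell
#!/bin/sh
workdir=$(mktemp -d)

cat > "$workdir/leak.c" <<'EOF'
#include <stdlib.h>

int main(void)
{
    int *values = malloc(10 * sizeof *values);
    if (values == NULL)
        return 1;
    values[0] = 42;
    /* Deliberately missing free(values) so there is a leak to report. */
    return values[0] == 42 ? 0 : 1;
}
EOF

if command -v gcc >/dev/null 2>&1 && command -v valgrind >/dev/null 2>&1; then
    # -g keeps line numbers so the report points at the malloc call.
    gcc -g -O0 "$workdir/leak.c" -o "$workdir/leak" \
        && valgrind --leak-check=full "$workdir/leak" \
        || echo "compile or valgrind run failed" >&2
else
    echo "gcc or valgrind not found; load cse/valgrind first" >&2
fi
```
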
Vendor Links

http://valgrind.org/

VTK

Description

The Visualization Toolkit (VTK) is open-source software for manipulating and displaying scientific data. It comes with state-of-the-art tools for 3D rendering, a suite of widgets for 3D interaction, and extensive 2D plotting capability.

Usage
module load cse/vtk/latest
Vendor Links

https://vtk.org/

XDMF

Description

XDMF (eXtensible Data Model and Format) is a library providing a standard way to access data produced by HPC codes. Data format refers to the raw data to be manipulated; the description of the data is separate from the values themselves. XDMF distinguishes the metadata (Light data) from the values themselves (Heavy data). Light data is stored using XML and Heavy data is typically stored using HDF5, so some information is stored redundantly in both XML and HDF5.

Usage

module load cse/xdmf/latest - or -
module load cse/xdmf2/latest - or -
module load cse/xdmf3/latest

Vendor Links

http://www.xdmf.org/index.php/Main_Page

XV

Description

xv is a shareware program written by John Bradley to display and modify digital images under the X Window System.

Usage
module load cse/xv/latest
Vendor Links

http://www.trilon.com/xv/

ZeroMQ

Description

ZeroMQ (also spelled ØMQ, 0MQ or ZMQ) is a high-performance asynchronous messaging library, aimed at use in distributed or concurrent applications. It provides a message queue, but unlike message-oriented middleware, a ZeroMQ system can run without a dedicated message broker. The library's API is designed to resemble Berkeley sockets.

Usage
module load cse/zeromq/latest
Vendor Links

https://zeromq.org/



All descriptions found on vendors' websites