BLAS-like Library Instantiation Software Framework
Updated Aug 20, 2025 - C
Acceleration package for neural networks on multi-core CPUs
(WIP) A small but powerful, homemade PyTorch from scratch.
Official development repository for SUNDIALS - a SUite of Nonlinear and DIfferential/ALgebraic equation Solvers. Pull requests are welcome for bug fixes and minor changes.
CEED Library: Code for Efficient Extensible Discretizations
HermitCore: A C-based, lightweight unikernel
PaRSEC is a generic framework for architecture aware scheduling and management of micro-tasks on distributed, GPU accelerated, many-core heterogeneous architectures. PaRSEC assigns computation threads to the cores, GPU accelerators, overlaps communications and computations and uses a dynamic, fully-distributed scheduler based on architectural fe…
CUDA bindings for Ruby
High-accuracy SIMD sin/cos/sincos library in C with AVX2, AVX-512, and NEON support. Full-range reduction. Fast at scale. Portable by design.
Massive-Parallel Trajectory Calculations (MPTRAC) is a Lagrangian particle dispersion model for the analysis of atmospheric transport processes in the free troposphere and stratosphere.
A playground to build C/C++/Go/Fortran applications on top of RustyHermit
A Flexible Storage Framework for HPC
CPU/GPU sparse solver for large sparse matrices
DPLASMA is a highly optimized, accelerator-aware implementation of a dense linear algebra package for distributed heterogeneous systems. It is designed to deliver sustained performance on distributed systems where each node features multiple sockets of multicore processors and, if available, accelerators, using the PaRSEC runtime as a backend.
☕ Implementation of Parallel Matrix Multiplication Methods Using the Fox Algorithm on Peking University's High-performance Computing System
Solution to the Telegram ML Competition 2023
The Juelich Rapid Spectral Simulation Code (JURASSIC) is a fast infrared radiative transfer model for the analysis of atmospheric remote sensing measurements.
This repository contains an MPI program written in C that approximates the Riemann zeta function ζ(s) for a given value of `s` (for `s` = 3 the result is Apéry's constant) and evaluates its performance using MPI collective communication functions. It reports the runtime, speedup, and efficiency for different numbers of processes.
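As a rough illustration of the idea in the description above, the sketch below splits the zeta series into contiguous blocks, one per simulated process, and combines the partial sums, together with the usual speedup and efficiency formulas. It is not the repository's code: it is plain Python without MPI, and the names `partial_zeta`, `zeta_partitioned`, `speedup`, and `efficiency` are hypothetical, chosen only for this example. The final summation stands in for what a sum reduction (such as `MPI_Reduce` with `MPI_SUM`) would do in a real MPI program.

```python
import math

# Reference value of zeta(3), Apery's constant, for comparison.
APERY = 1.2020569031595942


def partial_zeta(s, start, stop):
    """Sum 1/k**s for k in [start, stop): the work of one simulated rank."""
    return math.fsum(1.0 / k**s for k in range(start, stop))


def zeta_partitioned(s, n_terms, n_procs):
    """Approximate zeta(s) with n_terms terms split across n_procs blocks.

    Each block is what one process would compute; adding the partial
    sums at the end plays the role of a sum reduction.
    """
    base, extra = divmod(n_terms, n_procs)
    partials = []
    start = 1
    for rank in range(n_procs):
        # The first `extra` ranks take one additional term each.
        count = base + (1 if rank < extra else 0)
        partials.append(partial_zeta(s, start, start + count))
        start += count
    return math.fsum(partials)


def speedup(t_serial, t_parallel):
    """Speedup S = T_1 / T_p."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_procs):
    """Efficiency E = S / p."""
    return speedup(t_serial, t_parallel) / n_procs


if __name__ == "__main__":
    approx = zeta_partitioned(3, 100_000, 4)
    print(f"zeta(3) ~ {approx:.12f}, error {abs(approx - APERY):.2e}")
```

With measured wall-clock times, `speedup(t1, tp)` and `efficiency(t1, tp, p)` give the two performance figures the description mentions; the timings themselves would have to come from real runs.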