Icicle C++ Usage Guide
Overview
This guide covers the usage of ICICLE's C++ API, including device management, memory operations, data transfer, synchronization, and compute APIs.
Device Management
See all ICICLE runtime APIs in runtime.h
Loading a Backend
The backend can be loaded from a specific path or from an environment variable. Load it before selecting a device, since devices such as CUDA only become available once their backend is loaded.
#include "icicle/runtime.h"
eIcicleError result = icicle_load_backend_from_env_or_default();
// or load from custom install dir
eIcicleError result = icicle_load_backend("/path/to/backend/installdir", true);
Setting and Getting Active Device
You can set the active device for the current thread and retrieve it when needed:
icicle::Device device = {"CUDA", 0}; // or other
eIcicleError result = icicle_set_device(device);
// or query current (thread) device
eIcicleError result = icicle_get_active_device(device);
Setting and Getting the Default Device
You can set the default device for all threads:
icicle::Device device = {"CUDA", 0}; // or other
eIcicleError result = icicle_set_default_device(device);
Setting a default device should be done once, from the main thread of the application. If a specific thread needs another device or backend, use icicle_set_device in that thread instead.
Querying Device Information
Retrieve the number of available devices and check whether a pointer is allocated on the host or on the active device:
int device_count;
eIcicleError result = icicle_get_device_count(device_count);
// The pointer queries take no output argument; the answer is assumed to come back
// through the return code (eIcicleError::SUCCESS meaning "yes")
bool is_host_memory = (icicle_is_host_memory(ptr) == eIcicleError::SUCCESS);
bool is_device_memory = (icicle_is_active_device_memory(ptr) == eIcicleError::SUCCESS);
Memory Management
Allocating and Freeing Memory
Memory can be allocated and freed on the active device:
void* ptr;
eIcicleError result = icicle_malloc(&ptr, 1024); // Allocate 1024 bytes
eIcicleError result = icicle_free(ptr); // Free the allocated memory
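Because every allocation has to be paired with a free, it can help to tie the two together with RAII. The block below is a minimal sketch of that idea, not part of ICICLE: `ScopedBuffer`, `StubError`, `stub_malloc`, and `stub_free` are hypothetical names, and the stubs use the host heap so that the sketch compiles without ICICLE installed. In real code the stubs would presumably be replaced by `icicle_malloc` and `icicle_free`.

```cpp
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <utility>

// Stand-ins (hypothetical) so the sketch compiles on its own.
// With ICICLE these would be eIcicleError, icicle_malloc and icicle_free.
enum class StubError { SUCCESS, ALLOCATION_FAILED };

inline StubError stub_malloc(void** ptr, std::size_t size)
{
  *ptr = std::malloc(size);
  return *ptr ? StubError::SUCCESS : StubError::ALLOCATION_FAILED;
}

inline StubError stub_free(void* ptr)
{
  std::free(ptr);
  return StubError::SUCCESS;
}

// Owns one allocation and frees it exactly once, even on early return or exception.
class ScopedBuffer
{
public:
  explicit ScopedBuffer(std::size_t size) : size_(size)
  {
    if (stub_malloc(&ptr_, size) != StubError::SUCCESS) throw std::runtime_error("allocation failed");
  }
  ~ScopedBuffer()
  {
    if (ptr_) stub_free(ptr_);
  }

  // Copying would lead to a double free, so only moves are allowed.
  ScopedBuffer(const ScopedBuffer&) = delete;
  ScopedBuffer& operator=(const ScopedBuffer&) = delete;
  ScopedBuffer(ScopedBuffer&& other) noexcept
      : ptr_(std::exchange(other.ptr_, nullptr)), size_(std::exchange(other.size_, 0))
  {
  }

  void* get() const { return ptr_; }
  std::size_t size() const { return size_; }

private:
  void* ptr_ = nullptr;
  std::size_t size_ = 0;
};
```

A caller would write `ScopedBuffer buf(1024);` and pass `buf.get()` wherever a raw pointer is expected; the memory is released when `buf` goes out of scope.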
Asynchronous Memory Operations
You can perform memory allocation and deallocation asynchronously using streams:
icicleStreamHandle stream;
eIcicleError err = icicle_create_stream(&stream);
void* ptr;
err = icicle_malloc_async(&ptr, 1024, stream);
err = icicle_free_async(ptr, stream);
Querying Available Memory
Retrieve the total and available memory on the active device:
size_t total_memory, available_memory;
eIcicleError err = icicle_get_available_memory(total_memory, available_memory);
Setting Memory Values
Set memory to a specific value on the active device, synchronously or asynchronously:
eIcicleError err = icicle_memset(ptr, 0, 1024); // Set 1024 bytes to 0
eIcicleError err = icicle_memset_async(ptr, 0, 1024, stream);
Data Transfer
Copying Data
Data can be copied between host and device, or between devices. The location of the memory is inferred from the pointers:
eIcicleError result = icicle_copy(dst, src, size);
eIcicleError result = icicle_copy_async(dst, src, size, stream);
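To make "inferred from the pointers" concrete, the following toy model shows the kind of lookup a generic copy has to perform before it can choose a transfer path. It is only an illustration under assumptions: `PointerRegistry` and `CopyDirection` are hypothetical names, and this is not how ICICLE tracks memory internally.

```cpp
#include <unordered_set>

enum class CopyDirection { HostToHost, HostToDevice, DeviceToHost, DeviceToDevice };

// Toy model only (hypothetical): remembers which pointers were handed out as device memory.
class PointerRegistry
{
public:
  void mark_device(const void* p) { device_ptrs_.insert(p); }
  bool is_device(const void* p) const { return device_ptrs_.count(p) != 0; }

  // Classifies both pointers, then derives the copy direction from the pair.
  CopyDirection infer(const void* dst, const void* src) const
  {
    const bool dst_on_device = is_device(dst);
    const bool src_on_device = is_device(src);
    if (dst_on_device && src_on_device) return CopyDirection::DeviceToDevice;
    if (dst_on_device) return CopyDirection::HostToDevice;
    if (src_on_device) return CopyDirection::DeviceToHost;
    return CopyDirection::HostToHost;
  }

private:
  std::unordered_set<const void*> device_ptrs_;
};
```

When the caller already knows the direction, naming it directly skips this classification step.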
Explicit Data Transfers
To avoid device-inference overhead, use explicit copy functions:
eIcicleError result = icicle_copy_to_host(host_dst, device_src, size);
eIcicleError result = icicle_copy_to_host_async(host_dst, device_src, size, stream);
eIcicleError result = icicle_copy_to_device(device_dst, host_src, size);
eIcicleError result = icicle_copy_to_device_async(device_dst, host_src, size, stream);
Stream Management
Creating and Destroying Streams
Streams are used to manage asynchronous operations:
icicleStreamHandle stream;
eIcicleError result = icicle_create_stream(&stream);
eIcicleError result = icicle_destroy_stream(stream);
Synchronization
Synchronizing Streams and Devices
Ensure all previous operations on a stream or device are completed before proceeding:
eIcicleError result = icicle_stream_synchronize(stream);
eIcicleError result = icicle_device_synchronize();
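The reason synchronization matters is that an asynchronous call may return before its work has run. The sketch below models a stream as a FIFO queue of tasks to show that ordering guarantee. It is a toy model with hypothetical names (`ModelStream`, `fill_then_copy`), not ICICLE's implementation; a real backend may start work before the synchronize call, and the only property modeled here is that all queued work has finished once synchronize returns.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>

// Toy model only (hypothetical): work is queued and runs in submission order.
class ModelStream
{
public:
  // Like an *_async call: records the work and returns immediately.
  void enqueue(std::function<void()> task) { pending_.push(std::move(task)); }

  // Like a stream synchronize: returns only after all queued work has run, in order.
  void synchronize()
  {
    while (!pending_.empty()) {
      pending_.front()();
      pending_.pop();
    }
  }

  std::size_t pending() const { return pending_.size(); }

private:
  std::queue<std::function<void()>> pending_;
};

struct CopyOutcome {
  int before_sync; // value of the destination right after queuing
  int after_sync;  // value of the destination after synchronize
};

// Queues a fill followed by a copy and samples the destination before and after the sync.
CopyOutcome fill_then_copy(int value)
{
  int src = 0;
  int dst = -1;
  ModelStream stream;
  stream.enqueue([&] { src = value; }); // stands in for an async memset
  stream.enqueue([&] { dst = src; });   // stands in for an async copy
  const int before = dst;               // nothing has executed yet in this model
  stream.synchronize();
  return {before, dst};
}
```

Reading `dst` before the synchronize call sees the stale value; reading it afterwards sees the result of both queued operations.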
Device Properties
Checking Device Availability
Check whether a device is available (the runtime can also list registered devices; see runtime.h):
icicle::Device dev = {"CUDA", 0}; // the device to test, or other
eIcicleError result = icicle_is_device_available(dev);
Querying Device Properties
Retrieve properties of the active device:
DeviceProperties properties;
eIcicleError result = icicle_get_device_properties(properties);
/******************/
// where DeviceProperties is
struct DeviceProperties {
  bool using_host_memory;      // Indicates if the device uses host memory
  int num_memory_regions;      // Number of memory regions available on the device
  bool supports_pinned_memory; // Indicates if the device supports pinned memory
  // Add more properties as needed
};
Compute APIs
Multi-Scalar Multiplication (MSM) Example
Icicle provides high-performance compute APIs such as the Multi-Scalar Multiplication (MSM) for cryptographic operations. Here's a simple example of how to use the MSM API.
#include <iostream>
#include <memory> // std::make_unique
#include "icicle/runtime.h"
#include "icicle/api/bn254.h"
using namespace bn254;
int main()
{
  // Load installed backends
  icicle_load_backend_from_env_or_default();
  // Try to choose CUDA if available, or fall back to CPU otherwise (default device)
  const bool is_cuda_device_available = (eIcicleError::SUCCESS == icicle_is_device_available("CUDA"));
  if (is_cuda_device_available) {
    Device device = {"CUDA", 0};             // GPU-0
    ICICLE_CHECK(icicle_set_device(device)); // ICICLE_CHECK asserts that the API call returns eIcicleError::SUCCESS
  } // else we stay on the CPU backend
  // Set up inputs
  int msm_size = 1024;
  auto scalars = std::make_unique<scalar_t[]>(msm_size);
  auto points = std::make_unique<affine_t[]>(msm_size);
  projective_t result;
  // Generate random inputs
  scalar_t::rand_host_many(scalars.get(), msm_size);
  projective_t::rand_host_many(points.get(), msm_size);
  // (Optional) copy scalars to device memory explicitly
  scalar_t* scalars_d = nullptr;
  auto err = icicle_malloc((void**)&scalars_d, sizeof(scalar_t) * msm_size);
  // Note: test err after each call to make sure no error occurred
  err = icicle_copy(scalars_d, scalars.get(), sizeof(scalar_t) * msm_size);
  // MSM configuration
  MSMConfig config = default_msm_config();
  // Tell ICICLE that the scalars are on the device. The EC points and the result are in host memory in this example.
  config.are_scalars_on_device = true;
  // Execute the MSM kernel (on the current device)
  eIcicleError result_code = msm(scalars_d, points.get(), msm_size, config, &result);
  // OR call bn254_msm(scalars_d, points.get(), msm_size, config, &result);
  // Free the device memory
  icicle_free(scalars_d);
  // Check for errors
  if (result_code == eIcicleError::SUCCESS) {
    std::cout << "MSM result: " << projective_t::to_affine(result) << std::endl;
  } else {
    std::cerr << "MSM computation failed with error: " << get_error_string(result_code) << std::endl;
  }
  return 0;
}
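For readers new to MSM, the quantity being computed is the sum of each scalar times its matching point. The sketch below spells that definition out over a stand-in group, the integers modulo a small prime under addition, so that it runs without ICICLE. `toy_msm` is a hypothetical name, and the sketch assumes a modulus below 2^32 so that the products fit in 64 bits; the real API works on elliptic-curve points and uses far faster algorithms than this loop.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Reference definition of MSM over a stand-in group (hypothetical helper):
// result = sum_i scalars[i] * points[i], all arithmetic modulo `modulus`.
// Assumes modulus < 2^32 so each product fits in 64 bits.
std::uint64_t toy_msm(
  const std::vector<std::uint64_t>& scalars, const std::vector<std::uint64_t>& points, std::uint64_t modulus)
{
  if (scalars.size() != points.size()) throw std::invalid_argument("scalars and points differ in length");
  std::uint64_t acc = 0;
  for (std::size_t i = 0; i < scalars.size(); ++i) {
    const std::uint64_t term = (scalars[i] % modulus) * (points[i] % modulus) % modulus;
    acc = (acc + term) % modulus;
  }
  return acc;
}
```

With scalars {2, 3}, points {5, 7} and modulus 11, the sum is 2·5 + 3·7 = 31, which reduces to 9.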
Polynomial Operations Example
Here's another example demonstrating polynomial operations using Icicle:
#include <cstdint>
#include <iostream>
#include <memory> // std::make_unique
#include "icicle/runtime.h"
#include "icicle/polynomials/polynomials.h"
#include "icicle/api/bn254.h"
using namespace bn254;
// Define bn254Poly to be a polynomial over the scalar field of bn254
using bn254Poly = Polynomial<scalar_t>;
static bn254Poly randomize_polynomial(uint32_t size)
{
  auto coeff = std::make_unique<scalar_t[]>(size);
  for (uint32_t i = 0; i < size; i++) // unsigned index matches the type of size
    coeff[i] = scalar_t::rand_host();
  return bn254Poly::from_rou_evaluations(coeff.get(), size);
}
int main()
{
  // Load backend and set device
  icicle_load_backend_from_env_or_default();
  // Try to choose CUDA if available, or fall back to CPU otherwise (default device)
  const bool is_cuda_device_available = (eIcicleError::SUCCESS == icicle_is_device_available("CUDA"));
  if (is_cuda_device_available) {
    Device device = {"CUDA", 0};             // GPU-0
    ICICLE_CHECK(icicle_set_device(device)); // ICICLE_CHECK asserts that the API call returns eIcicleError::SUCCESS
  } // else we stay on the CPU backend
  int poly_size = 1024;
  // An NTT domain must be built first, since some polynomial ops rely on NTT
  ntt_init_domain(scalar_t::omega(12), default_ntt_init_domain_config());
  // Randomize polynomials f(x), g(x) over the scalar field of bn254
  bn254Poly f = randomize_polynomial(poly_size);
  bn254Poly g = randomize_polynomial(poly_size);
  // Perform polynomial multiplication
  auto result = f * g; // Executes on the current device
  ICICLE_LOG_INFO << "Done";
  return 0;
}
In this example, the polynomial multiplication runs on CUDA when it is available and on the CPU otherwise, showing that the same code works across Icicle's backends.
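As a reference for what `f * g` means, here is a schoolbook multiplication in coefficient form over the integers modulo a small prime. It is an illustrative sketch only: `toy_poly_mul` is a hypothetical helper, the modulus is assumed to be below 2^32, and the library is expected to use NTT-based methods rather than this quadratic loop. It also shows why the product of two polynomials with 1024 coefficients each has up to 2047 coefficients, which is the size the NTT domain has to accommodate.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Schoolbook product of two polynomials given as coefficient vectors (lowest degree first).
// Hypothetical helper; assumes modulus < 2^32 so each product fits in 64 bits.
std::vector<std::uint64_t> toy_poly_mul(
  const std::vector<std::uint64_t>& f, const std::vector<std::uint64_t>& g, std::uint64_t modulus)
{
  if (f.empty() || g.empty()) return {};
  // deg(f * g) = deg(f) + deg(g), hence size(f) + size(g) - 1 coefficients
  std::vector<std::uint64_t> out(f.size() + g.size() - 1, 0);
  for (std::size_t i = 0; i < f.size(); ++i) {
    for (std::size_t j = 0; j < g.size(); ++j) {
      const std::uint64_t term = (f[i] % modulus) * (g[j] % modulus) % modulus;
      out[i + j] = (out[i + j] + term) % modulus;
    }
  }
  return out;
}
```

For example, (1 + 2x)(3 + x) = 3 + 7x + 2x², so `toy_poly_mul({1, 2}, {3, 1}, 17)` yields the coefficients {3, 7, 2}.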
Error Handling
Checking for Errors
Icicle APIs return an eIcicleError enumeration value. Always check the returned value to ensure that the operation succeeded.
if (result != eIcicleError::SUCCESS) {
// Handle error
}
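When many calls need the same check, a small helper keeps the call sites short. The sketch below turns a failing status into an exception that names the call. It is an illustration under assumptions: `StubError` stands in for `eIcicleError` so that the block compiles without ICICLE, `SOME_FAILURE` is a placeholder value, and `throw_on_error` is a hypothetical helper rather than a library function.

```cpp
#include <stdexcept>
#include <string>

// Stand-in for eIcicleError (hypothetical); only SUCCESS is taken from the guide.
enum class StubError { SUCCESS, SOME_FAILURE };

// Throws with call-site context if the status is not SUCCESS; otherwise does nothing.
inline void throw_on_error(StubError status, const std::string& what)
{
  if (status != StubError::SUCCESS) throw std::runtime_error("ICICLE call failed: " + what);
}
```

A call site would then read `throw_on_error(status, "icicle_malloc");`, with the error handled once in a surrounding `try` block instead of after every call.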
This guide provides an overview of the essential APIs available in Icicle for C++. The provided examples should help you get started with integrating Icicle into your high-performance computing projects.