Version: 3.4.0

Icicle C++ Usage Guide

Overview

This guide covers the usage of ICICLE's C++ API, including device management, memory operations, data transfer, synchronization, and compute APIs.

Device Management

note

See all ICICLE runtime APIs in runtime.h

Loading a Backend

The backend can be loaded from a specific path or from an environment variable. This is essential for setting up the computing environment.

#include "icicle/runtime.h"

// Load from the environment variable or the default install dir
eIcicleError result = icicle_load_backend_from_env_or_default();
// or load from a custom install dir
result = icicle_load_backend("/path/to/backend/installdir", true);

Setting and Getting Active Device

You can set the active device for the current thread and retrieve it when needed:

icicle::Device device = {"CUDA", 0}; // or other
eIcicleError result = icicle_set_device(device);
// or query current (thread) device
result = icicle_get_active_device(device);

Setting and Getting the Default Device

You can set the default device for all threads:

icicle::Device device = {"CUDA", 0}; // or other
eIcicleError result = icicle_set_default_device(device);
caution

Set the default device once, from the main thread of the application. If a specific thread needs another device or backend, call icicle_set_device from that thread instead.

Querying Device Information

Retrieve the number of available devices and check if a pointer is allocated on the host or on the active device:

int device_count;
eIcicleError result = icicle_get_device_count(device_count);

// The pointer's location is reported through the return code
bool is_host_memory = (eIcicleError::SUCCESS == icicle_is_host_memory(ptr));

bool is_device_memory = (eIcicleError::SUCCESS == icicle_is_active_device_memory(ptr));

Memory Management

Allocating and Freeing Memory

Memory can be allocated and freed on the active device:

void* ptr;
eIcicleError result = icicle_malloc(&ptr, 1024); // Allocate 1024 bytes
result = icicle_free(ptr); // Free the allocated memory
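
Pairing every allocation with a free by hand is easy to get wrong when a function returns early. One common C++ approach is a small RAII wrapper, sketched below. To keep the sketch compilable without the library, `AllocStatus`, `demo_malloc` and `demo_free` are hypothetical stand-ins that only mimic the shape of eIcicleError, icicle_malloc and icicle_free; `DeviceBuffer` is likewise an illustrative name, not an ICICLE type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-ins so the sketch compiles without ICICLE.
// In real code they would be replaced by eIcicleError, icicle_malloc and icicle_free.
enum class AllocStatus { SUCCESS, ALLOCATION_FAILED };

static int g_live_allocations = 0; // bookkeeping for this demo only

static AllocStatus demo_malloc(void** ptr, std::size_t size)
{
  *ptr = std::malloc(size);
  if (*ptr == nullptr) return AllocStatus::ALLOCATION_FAILED;
  ++g_live_allocations;
  return AllocStatus::SUCCESS;
}

static AllocStatus demo_free(void* ptr)
{
  std::free(ptr);
  --g_live_allocations;
  return AllocStatus::SUCCESS;
}

// Owns one buffer and frees it when the object goes out of scope.
class DeviceBuffer
{
public:
  explicit DeviceBuffer(std::size_t size) : status_(demo_malloc(&ptr_, size)) {}
  ~DeviceBuffer()
  {
    if (ptr_ != nullptr) demo_free(ptr_);
  }

  // Non-copyable: two owners would free the same pointer twice.
  DeviceBuffer(const DeviceBuffer&) = delete;
  DeviceBuffer& operator=(const DeviceBuffer&) = delete;

  bool ok() const { return status_ == AllocStatus::SUCCESS; }
  void* get() const { return ptr_; }

private:
  void* ptr_ = nullptr; // declared first, so it is initialized before status_
  AllocStatus status_;
};

// Allocates inside a scope and reports how many allocations are live within it.
static int live_allocations_inside_scope()
{
  DeviceBuffer buf(1024);
  assert(buf.ok());
  return g_live_allocations;
}
```

After `live_allocations_inside_scope()` returns, the buffer has already been released by the destructor, so no explicit free call is needed on any exit path.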

Asynchronous Memory Operations

You can perform memory allocation and deallocation asynchronously using streams:

icicleStreamHandle stream;
eIcicleError err = icicle_create_stream(&stream);

void* ptr;
err = icicle_malloc_async(&ptr, 1024, stream);
err = icicle_free_async(ptr, stream);

Querying Available Memory

Retrieve the total and available memory on the active device:

size_t total_memory, available_memory;
eIcicleError err = icicle_get_available_memory(total_memory, available_memory);

Setting Memory Values

Set memory to a specific value on the active device, synchronously or asynchronously:

eIcicleError err = icicle_memset(ptr, 0, 1024); // Set 1024 bytes to 0
err = icicle_memset_async(ptr, 0, 1024, stream); // Asynchronous variant, queued on the stream

Data Transfer

Copying Data

Data can be copied between host and device, or between devices. The location of the memory is inferred from the pointers:

eIcicleError result = icicle_copy(dst, src, size);
result = icicle_copy_async(dst, src, size, stream);

Explicit Data Transfers

To avoid device-inference overhead, use explicit copy functions:

// Device to host
eIcicleError result = icicle_copy_to_host(host_dst, device_src, size);
result = icicle_copy_to_host_async(host_dst, device_src, size, stream);

// Host to device
result = icicle_copy_to_device(device_dst, host_src, size);
result = icicle_copy_to_device_async(device_dst, host_src, size, stream);

Stream Management

Creating and Destroying Streams

Streams are used to manage asynchronous operations:

icicleStreamHandle stream;
eIcicleError result = icicle_create_stream(&stream);
result = icicle_destroy_stream(stream);

Synchronization

Synchronizing Streams and Devices

Ensure all previous operations on a stream or device are completed before proceeding:

eIcicleError result = icicle_stream_synchronize(stream);
result = icicle_device_synchronize();
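
To make the ordering guarantee concrete, here is a toy model of a stream written in plain C++. It is only an analogy under stated assumptions: `ToyStream` is a hypothetical class, not part of ICICLE, and it defers all work until `synchronize()` is called, whereas a real backend may start work earlier. The point it illustrates is that the result of an asynchronous operation should only be read after the stream has been synchronized.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>

// Toy model of a stream: work is queued in order and only runs at synchronize().
// The guarantee modeled here is that everything queued has finished
// once synchronize() returns.
class ToyStream
{
public:
  void enqueue(std::function<void()> task) { pending_.push(std::move(task)); }

  void synchronize()
  {
    while (!pending_.empty()) {
      pending_.front()(); // run tasks in submission order
      pending_.pop();
    }
  }

  std::size_t pending() const { return pending_.size(); }

private:
  std::queue<std::function<void()>> pending_;
};

// Queues a "fill" followed by a "copy" and returns the destination value
// observed before and after synchronization.
static std::pair<int, int> copy_before_and_after_sync()
{
  ToyStream stream;
  int device_value = 0; // stands in for device memory
  int host_value = -1;  // stands in for a host destination buffer

  stream.enqueue([&] { device_value = 42; });         // plays the role of an async memset
  stream.enqueue([&] { host_value = device_value; }); // plays the role of an async copy to host

  const int before = host_value; // not yet written: the stream has not been synchronized
  stream.synchronize();
  return {before, host_value};
}
```

In this model the host value is still `-1` before synchronization and `42` afterwards, which mirrors why a call such as icicle_stream_synchronize belongs between an asynchronous copy and the first read of its destination.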

Device Properties

Checking Device Availability

Check if a device is available and retrieve a list of registered devices:

icicle::Device dev = {"CUDA", 0}; // device to query
eIcicleError result = icicle_is_device_available(dev); // SUCCESS if the device is available

Querying Device Properties

Retrieve properties of the active device:

DeviceProperties properties;
eIcicleError result = icicle_get_device_properties(properties);

/******************/
// where DeviceProperties is
struct DeviceProperties {
  bool using_host_memory;      // Indicates if the device uses host memory
  int num_memory_regions;      // Number of memory regions available on the device
  bool supports_pinned_memory; // Indicates if the device supports pinned memory
  // Add more properties as needed
};

Compute APIs

Multi-Scalar Multiplication (MSM) Example

Icicle provides high-performance compute APIs such as the Multi-Scalar Multiplication (MSM) for cryptographic operations. Here's a simple example of how to use the MSM API.

#include <iostream>
#include <memory>
#include "icicle/runtime.h"
#include "icicle/api/bn254.h"

using namespace bn254;

int main()
{
  // Load installed backends
  icicle_load_backend_from_env_or_default();

  // Try to choose CUDA if available; otherwise fall back to CPU (the default device)
  const bool is_cuda_device_available = (eIcicleError::SUCCESS == icicle_is_device_available("CUDA"));
  if (is_cuda_device_available) {
    Device device = {"CUDA", 0};             // GPU-0
    ICICLE_CHECK(icicle_set_device(device)); // ICICLE_CHECK asserts that the API call returns eIcicleError::SUCCESS
  } // else we stay on the CPU backend

  // Set up inputs
  int msm_size = 1024;
  auto scalars = std::make_unique<scalar_t[]>(msm_size);
  auto points = std::make_unique<affine_t[]>(msm_size);
  projective_t result;

  // Generate random inputs
  scalar_t::rand_host_many(scalars.get(), msm_size);
  projective_t::rand_host_many(points.get(), msm_size);

  // (optional) copy scalars to device memory explicitly, checking each call for errors
  scalar_t* scalars_d = nullptr;
  ICICLE_CHECK(icicle_malloc((void**)&scalars_d, sizeof(scalar_t) * msm_size));
  ICICLE_CHECK(icicle_copy(scalars_d, scalars.get(), sizeof(scalar_t) * msm_size));

  // MSM configuration
  MSMConfig config = default_msm_config();
  // Tell ICICLE that the scalars are on the device. The EC points and the result are in host memory in this example.
  config.are_scalars_on_device = true;

  // Execute the MSM kernel (on the current device)
  eIcicleError result_code = msm(scalars_d, points.get(), msm_size, config, &result);
  // OR call bn254_msm(scalars_d, points.get(), msm_size, config, &result);

  // Free the device memory
  icicle_free(scalars_d);

  // Check for errors
  if (result_code == eIcicleError::SUCCESS) {
    std::cout << "MSM result: " << projective_t::to_affine(result) << std::endl;
  } else {
    std::cerr << "MSM computation failed with error: " << get_error_string(result_code) << std::endl;
  }

  return 0;
}
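
For readers new to the operation, an MSM computes the sum of scalars[i] * points[i] over all i. The sketch below shows that arithmetic in a deliberately simplified setting: the "points" are integers modulo a small prime under addition, standing in for elliptic-curve points. The names `toy_msm` and `kModulus` are illustrative assumptions and have nothing to do with how ICICLE represents or computes anything.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy group: integers modulo a small prime under addition.
// It stands in for the elliptic-curve group used by a real MSM.
constexpr std::uint64_t kModulus = 1000003;

// Computes sum_i scalars[i] * points[i] in the toy group.
static std::uint64_t toy_msm(const std::vector<std::uint64_t>& scalars,
                             const std::vector<std::uint64_t>& points)
{
  assert(scalars.size() == points.size());
  std::uint64_t acc = 0;
  for (std::size_t i = 0; i < scalars.size(); ++i) {
    // Scalar multiplication in an additive group is repeated addition,
    // which for integers mod p reduces to a modular product.
    acc = (acc + (scalars[i] % kModulus) * (points[i] % kModulus)) % kModulus;
  }
  return acc;
}
```

For instance, `toy_msm({2, 3, 5}, {10, 20, 30})` evaluates 2*10 + 3*20 + 5*30. On a real curve each term is an elliptic-curve scalar multiplication, which is far more expensive than a modular product, and that cost is why the operation is offloaded to an accelerated backend.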

Polynomial Operations Example

Here's another example demonstrating polynomial operations using Icicle:

#include <cstdint>
#include <iostream>
#include <memory>
#include "icicle/runtime.h"
#include "icicle/polynomials/polynomials.h"
#include "icicle/api/bn254.h"

using namespace bn254;

// define bn254Poly to be a polynomial over the scalar field of bn254
using bn254Poly = Polynomial<scalar_t>;

static bn254Poly randomize_polynomial(uint32_t size)
{
  auto coeff = std::make_unique<scalar_t[]>(size);
  for (uint32_t i = 0; i < size; i++)
    coeff[i] = scalar_t::rand_host();
  return bn254Poly::from_rou_evaluations(coeff.get(), size);
}

int main()
{
  // Load backend and set device
  icicle_load_backend_from_env_or_default();

  // Try to choose CUDA if available; otherwise fall back to CPU (the default device)
  const bool is_cuda_device_available = (eIcicleError::SUCCESS == icicle_is_device_available("CUDA"));
  if (is_cuda_device_available) {
    Device device = {"CUDA", 0};             // GPU-0
    ICICLE_CHECK(icicle_set_device(device)); // ICICLE_CHECK asserts that the API call returns eIcicleError::SUCCESS
  } // else we stay on the CPU backend

  int poly_size = 1024;

  // An NTT domain must be initialized first, since some polynomial operations rely on NTT
  ntt_init_domain(scalar_t::omega(12), default_ntt_init_domain_config());

  // Randomize polynomials f(x), g(x) over the scalar field of bn254
  bn254Poly f = randomize_polynomial(poly_size);
  bn254Poly g = randomize_polynomial(poly_size);

  // Perform polynomial multiplication
  auto result = f * g; // Executes on the current device

  ICICLE_LOG_INFO << "Done";

  return 0;
}

In this example, the * operator multiplies the two polynomials on the current device (CUDA if available, otherwise CPU), so the same code runs unchanged on either backend.
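
As a reference point for what the product of two polynomials contains, the sketch below multiplies two polynomials given in coefficient form with the schoolbook method over a small toy prime. It is an illustration only: `poly_mul` and `kPrime` are hypothetical names, the modulus is not the bn254 scalar field, and an NTT-based library would not compute the product this way.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint64_t kPrime = 1000003; // toy modulus, not the bn254 scalar field

// Schoolbook product of two polynomials given as coefficient vectors
// (lowest degree first). Runs in O(n * m) field operations.
static std::vector<std::uint64_t> poly_mul(const std::vector<std::uint64_t>& f,
                                           const std::vector<std::uint64_t>& g)
{
  if (f.empty() || g.empty()) return {};
  std::vector<std::uint64_t> h(f.size() + g.size() - 1, 0);
  for (std::size_t i = 0; i < f.size(); ++i) {
    for (std::size_t j = 0; j < g.size(); ++j) {
      // The coefficient of x^(i+j) collects every product f[i] * g[j].
      h[i + j] = (h[i + j] + (f[i] % kPrime) * (g[j] % kPrime)) % kPrime;
    }
  }
  return h;
}
```

For example, (1 + 2x)(3 + x) yields the coefficients 3, 7, 2. The output length is also worth noting: two inputs with 1024 coefficients each produce up to 2047 coefficients, so an NTT-based multiplication needs a domain of at least 2048 points. Assuming `scalar_t::omega(12)` denotes a root of unity of order 2^12, the domain initialized in the example above is large enough.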

Error Handling

Checking for Errors

Icicle APIs return an eIcicleError enumeration value. Always check the returned value to ensure that operations were successful.

if (result != eIcicleError::SUCCESS) {
// Handle error
}
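
Repeating this check after every call gets verbose, so applications often wrap it in a helper. The examples above use ICICLE_CHECK for that purpose; the sketch below shows one alternative style that throws an exception naming the failing step. `Status`, `status_name`, `throw_on_error` and `message_for` are hypothetical names introduced for illustration, with `Status` standing in for eIcicleError so the code compiles without the library.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for eIcicleError so the sketch compiles without ICICLE.
enum class Status { SUCCESS, INVALID_DEVICE, OUT_OF_MEMORY };

static const char* status_name(Status s)
{
  switch (s) {
  case Status::SUCCESS:
    return "SUCCESS";
  case Status::INVALID_DEVICE:
    return "INVALID_DEVICE";
  case Status::OUT_OF_MEMORY:
    return "OUT_OF_MEMORY";
  }
  return "UNKNOWN";
}

// Throws if the status is not SUCCESS, naming the step that failed.
static void throw_on_error(Status s, const std::string& what)
{
  if (s != Status::SUCCESS) { throw std::runtime_error(what + " failed: " + status_name(s)); }
}

// Demo helper: returns the error message produced for a status, or "" on success.
static std::string message_for(Status s, const std::string& what)
{
  try {
    throw_on_error(s, what);
  } catch (const std::runtime_error& e) {
    return e.what();
  }
  return "";
}
```

With the real library, the same idea would compare against eIcicleError::SUCCESS and could use get_error_string, as in the MSM example, to build the message.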

This guide provides an overview of the essential APIs available in Icicle for C++. The provided examples should help you get started with integrating Icicle into your high-performance computing projects.