Modern processors (both CPUs and GPUs) have traded clock speed for parallelism. Carefully crafted parallel code can run orders of magnitude faster than straightforward serial C/C++ implementations. However, achieving this speedup requires extensive knowledge of a variety of hardware and software technologies. Royal Caliber can lend its expertise in HPC to help you maximize performance while you continue to focus on algorithmic innovation. Our core services include:
‣development for GPUs using CUDA or OpenCL
‣algorithm parallelization for multi-core, many-core and multi-node scaling
‣algorithmic analysis and identification of software bottlenecks
‣code optimization using improved memory access, SIMD instruction sets, etc.
‣hardware recommendations for specific use cases
We have experience working with both server-class hardware (x86 CPUs + discrete GPUs) and high-end mobile hardware (ARM CPUs + mobile GPUs such as Adreno and Mali). We have developed highly optimized solutions in a wide array of application domains, including:
‣image and video processing
‣computational chemistry and medicine
‣electronic design automation
In addition to HPC code development and optimization, we offer customized training sessions to help you maintain or develop high-performance code on your own. Get your team over the learning curve with hands-on workshops designed to make you productive as quickly as possible.