Published in PC Hardware

MediaTek CorePilot 2.0 speaks CPU, GPU too

05 May 2015


Maximizes device performance and power efficiency

MediaTek was definitely a breath of fresh air in the mobile SoC arena a few years back.

It made a lot of noise with the world's first true octa-core processor, and the company's CorePilot 1.0 technology was responsible for balancing performance and battery life. CorePilot 1.0 was announced in July 2013, along with MediaTek's first mobile System on a Chip (SoC) with Heterogeneous Multi-Processing. Other SoC manufacturers, including Qualcomm, support Heterogeneous Multi-Processing too, and AMD is big on it as well.

MediaTek's technology for Heterogeneous Multi-Processing is called CorePilot. It is designed to maximize device performance and power savings through interactive power management, adaptive thermal management and advanced scheduler algorithms. CorePilot 1.0 controlled only the CPU cores, whereas CorePilot 2.0 can combine the computational power of both the CPU and the GPU.

CorePilot 2.0 adds new advancements in heterogeneous computing and works with both CPUs and GPUs; MediaTek calls this CPU + GPU heterogeneous computing "Device Fusion". This is an "intelligent" technology that can efficiently execute OpenCL programs by fusing GPU and CPU computing capabilities. Long story short, Device Fusion and CPU + GPU heterogeneous computing should make your phone a bit cooler, and in some cases a bit faster, while extending your battery life.

The company promises a performance improvement of up to 146% compared with CPU-only or GPU-only execution, and says CorePilot 2.0 can lower energy consumption by up to 18% against the same baselines. It also frees programmers from predicting which computing device is best suited to which task. Programmers can focus on algorithm design while the SoC decides what will run the code faster: the CPU, the GPU or both.

CorePilot 2.0 with Device Fusion is designed to overcome the limitations of OpenCL, the language of choice for heterogeneous computing. OpenCL is designed to serve as the common high-level language for programming multiple computing devices, but it normally leaves the choice of device to the programmer. With the help of "intelligent technology", MediaTek CorePilot 2.0 can efficiently execute OpenCL programs by fusing GPU and CPU computing capabilities.

Device Fusion can flexibly dispatch each kernel (functional part) of an OpenCL program to the most suitable computing device. If the code is GPU-favoured, Device Fusion sends it to the GPU. If it is CPU-favoured, Device Fusion sends it to the CPU. If the code is throughput-oriented, Device Fusion sends it to both the CPU and the GPU for parallel processing. We will not go deeper than that, as there is a whitepaper about CorePilot 2.0 posted on the company's website that can tell you a bit more.
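
To make the three dispatch cases concrete, here is a minimal Python sketch of such a policy. It is not MediaTek's implementation: the `Kernel` class, the `affinity` labels, the kernel names and the proportional split for throughput-oriented work are all assumptions made for illustration, and the real scheduler described in the whitepaper may decide quite differently.

```python
from dataclasses import dataclass


@dataclass
class Kernel:
    name: str        # hypothetical kernel name
    affinity: str    # hypothetical label: "cpu", "gpu" or "throughput"
    work_items: int  # number of independent work items in the kernel


def dispatch(kernel: Kernel, cpu_rate: float = 1.0, gpu_rate: float = 1.0) -> dict:
    """Return a mapping of device name to the number of work items it gets.

    cpu_rate and gpu_rate are assumed relative speeds, used only to split
    throughput-oriented kernels between the two devices.
    """
    if kernel.affinity == "gpu":
        return {"gpu": kernel.work_items}
    if kernel.affinity == "cpu":
        return {"cpu": kernel.work_items}
    if kernel.affinity == "throughput":
        # Split the work in proportion to each device's assumed speed.
        gpu_share = round(kernel.work_items * gpu_rate / (cpu_rate + gpu_rate))
        return {"cpu": kernel.work_items - gpu_share, "gpu": gpu_share}
    raise ValueError(f"unknown affinity: {kernel.affinity!r}")


# Three made-up kernels, one per case described in the article.
for k in (Kernel("kernel_a", "gpu", 1000),
          Kernel("kernel_b", "cpu", 1000),
          Kernel("kernel_c", "throughput", 1000)):
    print(k.name, dispatch(k, cpu_rate=1.0, gpu_rate=3.0))
```

In this sketch the programmer still supplies the affinity label by hand; the point of Device Fusion, as MediaTek describes it, is that the SoC makes that call instead.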

MediaTek has a case study showing how CorePilot 2.0 can speed things up. Super-resolution is an image-processing algorithm that can enhance image resolution and extract significant detail from the original image, but it requires significant computation. The algorithm divides the task into several stages: find_neighbor, col_upsample, row_upsample, sub and blur2, and each stage is implemented as an OpenCL kernel. The case study, done by the company, uses three image resolutions: 1, 2 and 4 megapixels. Each image was enlarged by 2.25x using the GPU only, the CPU only, and Device Fusion (in CorePilot 2.0).

The case study compared this algorithm executed on the MediaTek MT6795 GPU, the MT6795 CPU, an unnamed Device Fusion device (probably Helio X20) and the Adreno 420 from Qualcomm's Snapdragon 805. The case study conveniently did not include the Adreno 430 of the Snapdragon 810.

It turns out that the MediaTek MT6795, also known as Helio X10, runs this algorithm faster than the Adreno 420, but it gets smoked by the unnamed Device Fusion device. The MT6795 CPU processes a 4-megapixel image in about 800 milliseconds, and the MT6795 GPU is slightly faster, getting closer to 700 ms. The Adreno 420 runs the same test in some 1600 milliseconds, while Device Fusion can do the same task in about 400 ms. This is a significant performance jump, no doubt about it.
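
For readers who prefer throughput to milliseconds, the rounded timings above can be converted with a few lines of Python. This is only a back-of-the-envelope sketch: the numbers are the approximate readings quoted in this article rather than exact measurements, and the dictionary labels and function names are our own, so the results are indicative only.

```python
IMAGE_MEGAPIXELS = 4.0

# Approximate times for the 4-megapixel image, as quoted in the article.
TIMINGS_MS = {
    "MT6795 CPU": 800,
    "MT6795 GPU": 700,
    "Adreno 420": 1600,
    "Device Fusion": 400,
}


def throughput_mp_per_s(megapixels: float, time_ms: float) -> float:
    """Megapixels processed per second for one image."""
    return megapixels * 1000.0 / time_ms


def speedup(baseline_ms: float, new_ms: float) -> float:
    """How many times faster the new run is than the baseline run."""
    return baseline_ms / new_ms


fusion_ms = TIMINGS_MS["Device Fusion"]
for name, ms in TIMINGS_MS.items():
    print(f"{name}: {throughput_mp_per_s(IMAGE_MEGAPIXELS, ms):.2f} MP/s, "
          f"Device Fusion is {speedup(ms, fusion_ms):.2f}x as fast")
```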

With Device Fusion and CorePilot 2.0, the SoC can increase throughput, measured in megapixels processed per second, by 22 percent compared with the CPU alone and by a whopping 146 percent compared with the GPU alone. At the same time, energy consumption drops by 18 percent. It all looks good on paper, and we hope to see more about CorePilot 2.0 and the next-generation SoC very soon.
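
A percentage gain in throughput is easy to misread as the same percentage of time saved. The short Python sketch below converts the quoted gains into multipliers and into the share of the original processing time that remains. It assumes the percentages describe throughput relative to a single-device baseline, which is our reading of MediaTek's claim; the function names are ours.

```python
def gain_to_multiplier(percent_gain: float) -> float:
    """A gain of X percent means (1 + X/100) times the baseline throughput."""
    return 1.0 + percent_gain / 100.0


def remaining_time_fraction(percent_gain: float) -> float:
    """Share of the baseline processing time still needed after the gain."""
    return 1.0 / gain_to_multiplier(percent_gain)


for gain in (22, 146):
    print(f"+{gain}% throughput = {gain_to_multiplier(gain):.2f}x, "
          f"{remaining_time_fraction(gain) * 100:.0f}% of the original time")
```

The energy figure works the other way round: an 18 percent drop simply means 0.82 times the baseline energy.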

Last modified on 05 May 2015