Date of Award

9-30-2014

Document Type

Thesis

Degree Name

Computer Science, MS

First Advisor

Hai Jiang

Committee Members

Hung-Chi Su; Jeff Jenness; Xiuzhen Huang

Call Number

LD 251 .A566t 2014 G74

Abstract

The graphics processing unit (GPU) plays a critical role in high-performance parallel computing. NVIDIA introduced CUDA as a parallel computing platform and programming model in the mid-2000s. With CUDA, programmers can use the GPU directly from C and C++ code compiled with the NVCC compiler, and from languages such as Fortran, Java, or Python through additional compilers and bindings. This thesis introduces a checkpoint/restart scheme and a computation state migration strategy for fault tolerance. The checkpoint/restart scheme saves the full computation state at run time so that it can be restored later if necessary. Computation state migration moves computation states from a heavily loaded host to a lightly loaded host for load balancing and load sharing. The thesis focuses on constructing computation states in the GPU, including local variables, the execution counter, and application-level stack structures; on achieving communication between the GPU and CPU; and on migrating computation state from one machine to another with the support of a run-time module.
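The idea of a computation state built from local variables, an execution counter, and an application-level stack can be illustrated with a small host-side sketch. It is not the thesis's implementation and does not touch the GPU. The `Frame` layout, the JSON checkpoint format, and the names `checkpoint`, `restart`, `run_sum`, and `demo` are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Frame:
    """One application-level stack frame (hypothetical layout)."""
    function: str                                   # routine this frame belongs to
    counter: int = 0                                # execution counter: next step to run
    local_vars: dict = field(default_factory=dict)  # saved local variables


def checkpoint(stack):
    """Serialize the whole application-level stack into bytes."""
    return json.dumps([asdict(frame) for frame in stack]).encode("utf-8")


def restart(blob):
    """Rebuild the application-level stack from a checkpoint."""
    return [Frame(**item) for item in json.loads(blob.decode("utf-8"))]


def run_sum(stack, n, stop_at=None):
    """Sum 0..n-1, keeping all state in the top frame of the stack.

    Returns True when the loop has finished. If stop_at is given, the loop
    pauses (returns False) when the execution counter reaches that value.
    """
    frame = stack[-1]
    while frame.counter < n:
        if stop_at is not None and frame.counter == stop_at:
            return False
        frame.local_vars["total"] += frame.counter
        frame.counter += 1
    return True


def demo():
    """Run part of the computation, checkpoint it, then resume from the copy."""
    stack = [Frame("run_sum", 0, {"total": 0})]
    run_sum(stack, 10, stop_at=4)      # pause after steps 0..3
    blob = checkpoint(stack)           # saved state could be sent to another host
    restored = restart(blob)           # rebuild the state from the checkpoint
    finished = run_sum(restored, 10)   # resume at the saved execution counter
    return finished, restored[-1].local_vars["total"]
```

Because every piece of state lives in explicit data rather than on the native call stack, the resumed run continues at the saved counter instead of starting over, and the same byte string could be handed to a different machine before calling `restart`.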

Rights Management

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Recommended Citation

Guo, Xinyuan, "GPU Computation Checkpoint/Restart Scheme with Application-Level Stacks" (2014). Student Theses and Dissertations. 784.
https://arch.astate.edu/all-etd/784

Included in

Computer Sciences Commons
