Transferring Data
Overview
The bastion host provides secure access to the HPC supercomputers. It has limited storage capacity and is not intended for file transfers.
To transfer data to or from HPC, you must have logged into your account at least once. Your home directory is not created until your first login, so transfers attempted before then may fail with "directory does not exist" errors. See System Access for details on logging in.
Files are transferred to shared data storage and not to the bastion node, login nodes, or compute nodes. Because the storage is shared, your files are accessible on all clusters: Puma, Ocelote and Elgato. As the diagram above shows, transferring data directly to shared storage avoids additional hops. Keep in mind that the data pipes are also wider, allowing for faster data transmission.
Data Transfers by Size
Small transfers: the web portal offers the most intuitive method.
Transfers <100GB: we recommend sftp, scp or rsync using filexfer.hpc.arizona.edu.
Transfers >100GB, transfers outside the university, and large transfers within HPC: we recommend using Globus (GridFTP).
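The size guidelines above can be read as a simple decision rule. The sketch below is only an illustration of that rule, not an official tool. The function name, the `small_limit_bytes` cutoff for "small" transfers, the use of decimal gigabytes, and the handling of a transfer of exactly 100GB are all assumptions, since the guidelines do not define them.

```python
GB = 10**9  # assumption: decimal gigabytes; the guidelines do not say GB vs GiB


def recommend_transfer_method(size_bytes, outside_university=False,
                              small_limit_bytes=1 * GB):
    """Return the transfer method suggested by the size guidelines.

    small_limit_bytes is a hypothetical cutoff for "small" transfers;
    the guidelines do not give a number.
    """
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    # Transfers outside the university are pointed at Globus, as are
    # transfers over 100GB (exactly 100GB is treated as large here,
    # which is an assumption).
    if outside_university or size_bytes >= 100 * GB:
        return "Globus (GridFTP)"
    # Small transfers: the web portal is the most intuitive option.
    if size_bytes <= small_limit_bytes:
        return "web portal"
    # Everything else under 100GB: the SSH-based tools on the transfer node.
    return "sftp, scp or rsync via filexfer.hpc.arizona.edu"
```

For example, `recommend_transfer_method(50 * GB)` returns the SSH-based option, while `recommend_transfer_method(200 * GB)` returns Globus.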
Transfer Software Summary
Columns: Software | CLI Interface? | GUI Interface? | Cloud Services (Google Drive, Amazon Web Services, Box, Dropbox) | Notes
Notes:
On Windows, WinSCP is available as a GUI.
Grsync is a GUI for rsync on multiple platforms.
rclone has announced an experimental GUI.
File transfers and SSH Keys
Several of the file transfer methods listed below, including scp, sftp and rsync, authenticate over the SSH protocol. Adding your SSH key to the filexfer.hpc.arizona.edu node therefore lets you use those methods without entering a password each time. See the documentation for adding SSH keys.
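As a rough illustration of how the SSH-based tools address the transfer node, the sketch below builds scp and rsync command lines that target filexfer.hpc.arizona.edu. The helper name, the `netid` username and the `results/` paths are hypothetical placeholders. The flags shown (`-r` for scp, `-av --partial` for rsync) are common choices and not a site requirement.

```python
import shlex

TRANSFER_HOST = "filexfer.hpc.arizona.edu"  # transfer node named in this page


def build_transfer_command(tool, netid, local_path, remote_path):
    """Build an upload command as an argument list for subprocess.run().

    netid, local_path and remote_path are placeholders supplied by the
    caller; a relative remote_path resolves against the remote home directory.
    """
    remote = f"{netid}@{TRANSFER_HOST}:{remote_path}"
    if tool == "scp":
        # -r copies directories recursively
        return ["scp", "-r", local_path, remote]
    if tool == "rsync":
        # -a preserves permissions and timestamps, -v is verbose,
        # --partial keeps partially transferred files so a rerun can resume
        return ["rsync", "-av", "--partial", local_path, remote]
    raise ValueError(f"unsupported tool: {tool!r}")


# Print the command instead of running it, so nothing is transferred here.
print(shlex.join(build_transfer_command("rsync", "netid", "results/", "results/")))
```

With an SSH key installed on the transfer node, passing the returned list to `subprocess.run()` should complete without a password prompt. Without a key, the tool prompts as usual.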
Transfer Applications and Protocol
GridFTP / Globus
For comprehensive information on using Globus, see the Globus page.
Overview
GridFTP is an extension of the standard File Transfer Protocol (FTP) for high-speed, reliable, and secure data transfer. Because GridFTP provides more reliable, higher-performance file transfer than protocols such as SCP or rsync, it enables the transmission of very large files. GridFTP also addresses the problem of incompatibility between storage and access systems. You can read more about the advantages of GridFTP here.
To use GridFTP, we recommend you use Globus. Globus uses endpoints to make transfers.
Endpoints