Transferring Data

THIS SITE IS DEPRECATED

We have transitioned to another service and are no longer actively updating this site.

Refer to our new documentation at: hpcdocs.hpc.arizona.edu

Overview

  • The bastion host provides secure access to the HPC supercomputers; it has limited storage capacity and is not intended for file transfers.
  • To make transfers to/from HPC, you must have logged into your account at least once. If you have not, you may encounter "directory does not exist" errors, because your home directory is not created until you log in for the first time. See System Access for details.

Files are transferred to shared data storage, not to the bastion host, login nodes, or compute nodes. Because the storage is shared, your files are accessible on all clusters: Puma, Ocelote, and El Gato. The diagram above illustrates the efficiency of transferring data without additional hops. Keep in mind that the data pipes are also wider, allowing faster data transmission.

Data Transfers by Size

  1. Small transfers: for small data transfers, the web portal offers the most intuitive method.
  2. Transfers under 100 GB: we recommend sftp, scp, or rsync using filexfer.hpc.arizona.edu.
  3. Transfers over 100 GB, transfers outside the university, and large transfers within HPC: we recommend Globus (GridFTP).
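As an illustration of the second option, this is a minimal sketch of an rsync transfer through the filexfer node (the local and remote paths are placeholders; replace NetID with your own):

```shell
# Copy a local directory to your HPC home directory over SSH.
# -a preserves permissions and timestamps, -v is verbose, and
# --partial keeps partially transferred files so interrupted
# transfers can be resumed by re-running the same command.
rsync -av --partial ./my_data/ NetID@filexfer.hpc.arizona.edu:~/my_data/
```

Re-running the command only sends files that have changed, which makes rsync convenient for repeated synchronization of the same directory.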
Transfer Software Summary

The four cloud-service columns indicate whether the tool can transfer to/from that service.

Software  | CLI? | GUI?         | Google Drive | Amazon Web Services | Box | Dropbox | Notes
--------- | ---- | ------------ | ------------ | ------------------- | --- | ------- | -----
Globus    | Yes  | Yes          | Yes          | Yes                 | No  | No      |
SFTP      | Yes  | Yes          | No           | No                  | No  | No      |
SCP       | Yes  | Yes          | No           | No                  | No  | No      | On Windows, WinSCP is available as a GUI interface.
rsync     | Yes  | Yes          | No           | No                  | No  | No      | Grsync is a GUI interface for rsync for multiple platforms.
rclone    | Yes  | Experimental | Yes          | Yes                 | Yes | Yes     | rclone has recently announced an experimental GUI.
Cyberduck | Yes  | Yes          | Yes          | Yes                 | Yes | Yes     |
iRODS     | Yes  | Yes          | No           | No                  | No  | No      |


File Transfers and SSH Keys

Several of the file transfer methods listed below, including scp, sftp, and rsync, authenticate over the SSH protocol. Adding your SSH key to the filexfer.hpc.arizona.edu node therefore lets you avoid entering your password when using those methods. See the documentation on adding SSH keys.
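A minimal sketch of setting this up with OpenSSH (the key type, and whether the transfer node accepts your key as configured here, are assumptions; follow the SSH Keys documentation for the supported procedure):

```shell
# Generate a key pair if you do not already have one (accept the defaults).
ssh-keygen -t ed25519

# Copy the public key to the file transfer node; replace NetID with your own.
# You will be prompted for your password one last time.
ssh-copy-id NetID@filexfer.hpc.arizona.edu

# Subsequent scp/sftp/rsync sessions should no longer prompt for a password.
sftp NetID@filexfer.hpc.arizona.edu
```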


Transfer Applications and Protocols

GridFTP / Globus

NEW: for comprehensive information on using Globus, see: Globus

Overview

GridFTP is an extension of the standard File Transfer Protocol (FTP) for high-speed, reliable, and secure data transfer. Because GridFTP provides more reliable, higher-performance file transfer than protocols such as SCP or rsync, it is well suited to transmitting very large files. GridFTP also addresses the problem of incompatibility between storage and access systems. You can read more about the advantages of GridFTP here.

To use GridFTP, we recommend you use Globus. Globus uses endpoints to make transfers. 
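Globus is used primarily through its web interface (described below), but it also offers a command-line client, globus-cli. This is a minimal sketch, assuming the client is installed in a working Python environment and that you have looked up the endpoint UUIDs; the UUIDs and paths are placeholders:

```shell
# Install the Globus CLI (assumes pip is available).
pip install globus-cli

# Authenticate; this opens a browser window for the usual Globus login.
globus login

# Submit a transfer between two endpoints (UUIDs and paths are placeholders).
globus transfer "SRC_ENDPOINT_UUID:/path/to/source/file" \
                "DST_ENDPOINT_UUID:/path/to/dest/file"
```

Transfers submitted this way run asynchronously on the Globus service, so you can log out while they complete.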

Endpoints

Personal Endpoint

If you're trying to use Globus to move files to a local external drive, you may run into a permissions issue. Globus has information on resolving this in their FAQs.

  1. Go to https://www.globus.org/ and click Log In in the top right corner.
  2. In the Use your existing organizational login box, type in or find The University of Arizona and hit Continue.
  3. This will take you to Webauth. Log in as normal.
  4. You will end up at the Globus File Manager web interface.
  5. Choose Collections on the left and download Globus Connect Personal for your operating system.
  6. Type a descriptive name for your local computer under Provide label for future reference and click Allow.
  7. Under Collection Details, use your UArizona email under Owner Identity (this should be the default) and enter a descriptive Collection Name.
  8. You should now be able to find your collection on Globus under Endpoints → Administered by You.


HPC Endpoint

The endpoint for HPC can be found by searching UA HPC Filesystems under the Collections tab. If you do not see the UA HPC Filesystems collection, uncheck any checked filters under Quick Filters. Select the result UA HPC Filesystems with the subheading Managed Mapped Collection.

Storage Rental Endpoint

The endpoint for rental storage (found on the filexfer nodes under /rental) can be found by searching UA Rental Storage Filesystem under the Collections tab.

AWS S3 Endpoint (UITS Subsidized Tier 2 Storage)

  1. Under the Collections tab, enter UA AWS S3 in the search bar. In the results, you should see the name UA AWS S3 with the description Managed Mapped Collection. Click the endpoint's name to proceed.
  2. Next, select the Credentials tab. If you are prompted for Authentication/Consent, click Continue.
  3. If requested, authenticate by selecting your Arizona email address, then click Allow.
  4. You will then be returned to the Credentials tab. From there, link to your AWS S3 bucket by entering your public and private keys in the provided fields.
  5. Once you've added your keys, navigate back to the UA AWS S3 collection, go to the Collections tab, and click Add a Guest Collection on the right.
  6. Under Create New Guest Collection, click Browse next to the Directory field to find your group's AWS bucket. You will find it under /ua-rt-t2-faculty_netid/, where faculty_netid is the NetID of the faculty member who requested the bucket. Under Display Name, enter a descriptive name that you can use to identify your bucket. Once you've completed the process, click Create Collection.

    If you encounter Authentication/Consent Required after clicking Browse, click Continue, select your university credentials, and click Allow. That should bring you back to the Browse window.

  7. To find and use your new collection, navigate to the Collections tab, go to Shareable By You, and select the name. That will open your collection in the File Manager window allowing you to view the contents and initiate transfers.


SFTP

filexfer.hpc.arizona.edu is intended to be used for most file transfers. SFTP encrypts data before it is sent across the network. Additional capabilities include resuming interrupted transfers, directory listings, and remote file removal. To transfer files with SFTP, open an SSH v2 compliant terminal and navigate to the desired working directory on your local machine. To access HPC:

$ sftp NetID@filexfer.hpc.arizona.edu

You will then be able to move files between your machine and HPC using get and put commands. For example:

sftp> get /path/to/remote/file /path/to/local/directory ### Retrieves file from HPC. Omitting paths will default to working directories.
sftp> put /path/to/local/file /path/to/remote/directory ### Uploads a file from your local computer to HPC. Omitting paths will default to working directories.
sftp> help ### prints detailed sftp usage
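For scripted or repeated transfers, sftp also supports non-interactive batch files. A sketch, assuming key-based authentication is set up (the batch file name and its contents are placeholders):

```shell
# Create a batch file with one sftp command per line.
cat > batch.txt <<'EOF'
put results.tar.gz
get logs.txt
EOF

# Run the commands non-interactively; -b exits with an error if any
# command fails, which makes this safe to use in scripts.
sftp -b batch.txt NetID@filexfer.hpc.arizona.edu
```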

FTP/LFTP

Due to security risks, it is not possible to FTP to the file transfer node from a remote machine. However, you may FTP from the file transfer node to a remote machine.

HPC uses the FTP client LFTP to transfer files between the file transfer node and remote machines. This can be done using get and put commands. To use lftp, you must first connect to our file transfer node using an SSH v2 compliant terminal:

$ ssh NetID@filexfer.hpc.arizona.edu

Once connected, you may connect to the external host using the command lftp. For example:

$ lftp ftp.hostname.gov

You will then be able to move files between HPC and the remote host using get and put commands. For example:

> get /path/to/remote/file /path/to/local/directory ### retrieves file from remote host
> put /path/to/local/file /path/to/remote/directory ### Uploads file from HPC to remote host
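Beyond single files, lftp can also transfer whole directory trees with its mirror command. A sketch, reusing the example host above (the directory names are placeholders):

```shell
# From the file transfer node, connect to the remote FTP server.
lftp ftp.hostname.gov

# Then, inside the lftp session:
#   mirror remote_dir local_dir      # download a directory tree to HPC
#   mirror -R local_dir remote_dir   # upload a directory tree (-R reverses direction)
#   exit                             # close the session
```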

For more information on LFTP, see their official documentation.


SCP

SCP uses Secure Shell (SSH) for data transfer and utilizes the same mechanisms for authentication, thereby ensuring the authenticity and confidentiality of the data in transit.
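As a sketch of typical scp usage with the filexfer node (the file and directory names are placeholders; replace NetID with your own):

```shell
# Upload a local file to your HPC home directory.
scp ./data.csv NetID@filexfer.hpc.arizona.edu:~

# Download a file from HPC into the current local directory.
scp NetID@filexfer.hpc.arizona.edu:~/results.txt .

# Copy a directory recursively with -r.
scp -r ./project NetID@filexfer.hpc.arizona.edu:~/project
```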

Mac/Linux

You will need to use an SSH v2 compliant termin