Storage


Where Should I Store My Data?

  1. Data undergoing active analyses should be stored in HPC's local High Performance Storage.

  2. Large amounts of data not requiring immediate access from our HPC compute nodes can be stored at reasonable rates on our Rental Storage.

  3. R-DAS (Research Desktop Attached Storage) is a research data service that provides SMB shares mountable from macOS, Linux, and Windows. It offers 5 TB of free storage per PI group (see the mounting sketch after this list).

  4. Research data not requiring immediate access should be stored in General Research Data Storage (Tier 2).
    For example:
    1. Large datasets where only subsets are actively being analyzed
    2. Results no longer requiring immediate access
    3. Backups (highly encouraged!)

  5. Data that require HIPAA compliance can be stored on Soteria (currently in its pilot phase).
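
To illustrate item 3, here is a minimal sketch of mounting an R-DAS SMB share on a Linux workstation. The server name, share name, and mount point below are placeholders rather than actual R-DAS addresses; use the values provided for your group.

    # Create a mount point and mount the share over SMB (requires cifs-utils).
    # //rdas.example.arizona.edu/pi_group is a placeholder for your group's share.
    sudo mkdir -p /mnt/rdas
    sudo mount -t cifs //rdas.example.arizona.edu/pi_group /mnt/rdas \
        -o username=your_netid,vers=3.0

    # When finished, unmount the share.
    sudo umount /mnt/rdas

On macOS the same share can be reached from Finder (Go > Connect to Server) with an smb:// URL, and on Windows by mapping a network drive.
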
Detailed Storage Documentation

  • HPC High Performance Storage: The University’s Research Data Center provides data storage for active analysis on the high-performance computing (HPC) clusters. Your storage is mounted as a filesystem, and all of the clusters have access to the same filesystems.
  • Rental Storage: We offer a rental storage solution on a storage array housed in the Research Data Center. This array is mounted on our data transfer nodes (filexfer.hpc.arizona.edu); rental storage is not mounted on HPC compute or login nodes (see the transfer sketch after this list).
  • Tier 2 Storage: Research Technologies, in partnership with UITS, has implemented an AWS rental storage solution. Buckets can be requested through the user portal and are intended for archival storage and for data not requiring immediate access.
  • Research Desktop Attached Storage: R-DAS provides up to 5 TB of no-cost storage for each PI group and is designed to make it easy to share data with other members of your research group. You can treat the allocation as a drive mounted on your local computer. R-DAS is intended for open research data, not for controlled or regulated data.
  • Google Drive: This offering is deprecated.
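
Because rental storage is reachable only through the data transfer nodes and Tier 2 buckets are typically accessed with the AWS command line interface, the sketch below shows what transfers might look like. The group path and bucket name are placeholders, not actual allocations.

    # Stage a local results directory onto rental storage through a data
    # transfer node; /rental/your_group is a placeholder path.
    rsync -av ./results/ your_netid@filexfer.hpc.arizona.edu:/rental/your_group/results/

    # Copy an archive into a Tier 2 bucket with the AWS CLI; the bucket name
    # is a placeholder and assumes your credentials are already configured.
    aws s3 cp ./archive.tar.gz s3://your-tier2-bucket/archive.tar.gz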

Storage Option Summary


| Option | Purpose | Capacity | Cost | Restricted data? | Access | Duration | Backup |
|---|---|---|---|---|---|---|---|
| Primary Storage | Research data; supports compute; directly attached to HPC | /home 50 GB, /groups 500 GB, /xdisk 20 TB | Free | Not for restricted data | Directly mounted on HPC; also uses Globus and DTNs | Long term; aligns with HPC purchase cycle | No |
| R-DAS | Research Desktop Attached Storage (SMB shares) | 5 TB | Free | Not for restricted data | Mounted to workstations as shares | Long term | No |
| Rental Storage | Research data; large datasets; typically for staging to HPC | Rented per terabyte per year | $47.35 per TB per year | Not for restricted data | Uses Globus and DTNs; copy data to Primary | Long term; aligns with HPC purchase cycle | No |
| Tier 2 | Typically research data; unused data is archived | 15 GB to TBs | Tier-based: first 1 TB of active data and all archival data are free; active data over 1 TB is paid | Not for restricted data | Uses Globus and the AWS command line interface | Typically long term, since use of Glacier is free and slow | Archival |
| ReData | Research data; managed by UA Libraries | Quota system | Free | Not for restricted data | Log in, fill out fields, then upload | Longer than 10 years | No |
| Soteria | Secure data enclave for HIPAA data | Individual requests | Free upon qualification | Restricted data: HIPAA, ePHI | HIPAA training required, followed by request process | Long term | No |
| Box | General data | 50 GB | Free | Not for restricted data | Browser | Long term | No |
| Google Drive | General data | 15 GB | Free; Google rates apply above 15 GB | Not for restricted data | Browser | Unlimited usage expires March 1, 2023 | No |
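
As a quick way to see how your usage compares with the primary-storage quotas in the table (/home 50 GB, /groups 500 GB, /xdisk 20 TB), something along these lines can be run on a login or data transfer node; the group and xdisk paths are placeholders for your own allocations.

    # Report total usage of your home directory and of a group/xdisk allocation.
    du -sh $HOME
    du -sh /groups/your_pi /xdisk/your_pi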

NIH Data Management and Sharing Policy

The NIH has issued a new data management and sharing policy, effective January 25, 2023. The University Libraries now offer a comprehensive guide on how to navigate this policy and what it means for you.

What's new about the 2023 NIH Data Management and Sharing Policy?

Previously, the NIH required only grants with $500,000 or more per year in direct costs to provide a brief explanation of how and when data resulting from the grant would be shared.

The 2023 policy is entirely new. Beginning in 2023, ALL grant applications or renewals that generate Scientific Data must now include a robust and detailed plan for how you will manage and share data during the entire funded period. This includes information on data storage, access policies/procedures, preservation, metadata standards, distribution approaches, and more. You must provide this information in a data management and sharing plan (DMSP). The DMSP is similar to what other funders call a data management plan (DMP).

The DMSP will be assessed by NIH Program Staff (though peer reviewers will be able to comment on the proposed data management budget). The Institute, Center, or Office (ICO)-approved plan becomes a Term and Condition of the Notice of Award.