.. _indra-data-collection:
.. _introduction_usaxs:

Data Collection
===============

.. index::
   Indra; USAXS data collection
   HDF5 data files
   Nexus files

The USAXS-SAXS-WAXS instrument uses three independent devices: Bonse-Hart USAXS, pinhole SAXS, and WAXS. Data collection proceeds on one device at a time, requiring each device to move in and out of the beam position. Two typical strategies are used:

* **Static samples:** All samples are measured with USAXS first, then the SAXS device moves in for all samples, then WAXS. This is the most time-efficient sequence.
* **Time-resolved / temperature-resolved experiments:** USAXS, SAXS, and WAXS are collected sequentially for each sample before repeating the cycle. The overhead for device movements is larger in this case.

Each USAXS and SAXS segment includes a short transmission measurement performed before each scan. Transmission data are included automatically in the metadata.

Folder structure
----------------

Data are saved in a specific folder structure inside the base user folder configured by staff. For example::

    12_5_UserName/
        DataSetName/                (default name: "data")
            DataSetName_usaxs/
                ... many HDF5 files (.h5)
            DataSetName_saxs/
                ... many HDF5 files (.hdf)
            DataSetName_waxs/
                ... many HDF5 files (.hdf)

Users can create as many ``DataSetName`` folders as needed. Each folder can contain many datasets (even thousands).

Filenames
---------

Sample names are used as HDF5 file names and must meet the following requirements: start with a letter, contain only letters, numbers, and ``_``, and be no more than approximately 40 characters long. The system will automatically remove any unacceptable characters from user input.

For time-, pressure-, or temperature-resolved experiments, a control variable value is typically appended to the name, for example ``_246degC_526min``.

An order number guaranteeing uniqueness (e.g., ``_0001``) is appended to each filename automatically.

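The naming rules above can be checked before typing names into the instrument. The following is a minimal sketch, not the instrument's actual cleanup code: the function names, the exact 40-character cutoff, and the choice to drop (rather than replace) unacceptable characters are assumptions made for illustration.

```python
import re

MAX_NAME_LENGTH = 40  # assumption: the text only says "approximately 40"


def sanitize_sample_name(raw: str, max_length: int = MAX_NAME_LENGTH) -> str:
    """Return a name that starts with a letter and uses only letters, digits, and '_'."""
    # Drop every character outside the allowed ASCII set.
    cleaned = re.sub(r"[^A-Za-z0-9_]", "", raw)
    # Drop leading characters until the name starts with a letter.
    cleaned = re.sub(r"^[^A-Za-z]+", "", cleaned)
    # Truncate to the assumed maximum length.
    return cleaned[:max_length]


def with_order_number(name: str, order: int) -> str:
    """Append a zero-padded order number such as '_0001' (hypothetical helper)."""
    return f"{name}_{order:04d}"


if __name__ == "__main__":
    print(sanitize_sample_name("12 my-sample #3"))
    print(with_order_number("Alloy_246degC_526min", 1))
```

A name made only of digits or symbols comes back empty from this sketch, so it is worth checking the result before use.
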
Metadata
--------

Each Nexus file contains a large amount of metadata in fields with human-readable names. You can browse this metadata using free tools such as HDFView, Igor Pro, or other HDF5 browsers. The *Metadata Browser* tool in Irena can also quickly extract specific metadata fields.

.. index::
   Indra; USAXS/SAXS/WAXS HDF5 files
   HDF5 files

HDF5 Nexus data files
---------------------

Each data collection segment saves data in a separate HDF5 file. SAXS and WAXS files follow the NXsas definition (https://manual.nexusformat.org/classes/applications/NXsas.html). There is no formal definition for Bonse-Hart USAXS instruments; the USAXS files follow a loose NXsas convention with modified data formats. The Matilda automatic data reduction system (see below) appends reduced data following the NXcanSAS definition (https://manual.nexusformat.org/classes/applications/NXcanSAS.html#nxcansas) into each file.

.. index::
   Indra; Matilda automatic data reduction

.. _automatic_data_reduction:

Automatic Data Reduction ("Matilda")
-------------------------------------

Since September 2025, a Python script called *Matilda* (https://github.com/APS-USAXS/Matilda) has been running to reduce data automatically for users. Matilda reduces each dataset and appends the results to the raw data file in NXcanSAS format.

.. note::
   Automatic data reduction may fail due to user errors, code issues, or weak data. **Primary data reduction remains the Indra and Nika packages in Igor Pro.** When in doubt, verify results using Indra and Nika and compare to the automatic output. The results should be close, though minor differences may arise from parameter choices. If non-default reduction parameters are needed, use Igor Pro.

Examples of parameters that can only be changed in Igor: calibration method (Matilda uses sample thickness for absolute intensity in cm\ :sup:`2`/cm\ :sup:`3`; Igor supports per-gram calibration using density/weight, or thickness from transmission); number of Q bins (Matilda defaults to 500 for USAXS and 200 for SAXS); and other advanced options.

Important: Background data collection strategy
----------------------------------------------

*These requirements must be met for automatic data reduction to work correctly.*

1. An appropriate background measurement ("Blank") must be available in the data folder when sample data are collected.
2. Data reduction proceeds correctly when:

   a. The **most recent Blank** before each sample measurement is used as its background.
   b. When a new Blank is collected, it is applied to all subsequent samples. Samples measured before the new Blank was collected use the Blank that preceded them.
   c. Any measurement with "blank" anywhere in its name (case-insensitive) is treated as a Blank, for example "AirBlank", "Air Blank", "blank", "Capillary blank". If "blank" does not appear in the name, the measurement is treated as a sample.
   d. If no measurement has "blank" in its name, automatic data reduction is not performed.
   e. The Blank must be somewhere in the *user folder*; it does not have to be in the same sample subfolder.
   f. Samples without a matching Blank are reduced only to QR data (no background subtraction, no calibration) and must be re-reduced in Igor.

3. Sample calibration uses the **thickness** value stored in the file. There is no way to patch this after data collection; thickness must be provided before collecting data.
4. Calibration assumes thickness is the correct method, giving intensity in units of cm\ :sup:`2`/cm\ :sup:`3`. Calibration per weight is not available in Matilda.
5. Data reduction runs automatically within 15 seconds after each dataset finishes collecting.

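The Blank rules above can be illustrated with a short sketch. It is not Matilda's code: the function names and the file names are hypothetical, and it assumes that the automatically appended order number (the digits before the extension) reflects the order in which measurements were collected.

```python
import re

# Assumed pattern: trailing "_NNNN" order number right before the file extension.
ORDER_RE = re.compile(r"_(\d+)\.\w+$")


def is_blank(filename: str) -> bool:
    """A measurement is a Blank if 'blank' appears anywhere in its name (case-insensitive)."""
    return "blank" in filename.lower()


def order_number(filename: str) -> int:
    """Extract the order number from a file name such as 'SampleA_0002.h5'."""
    match = ORDER_RE.search(filename)
    if match is None:
        raise ValueError(f"no order number found in {filename!r}")
    return int(match.group(1))


def assign_blanks(filenames):
    """Map each sample file to the most recent preceding Blank, or None if there is none."""
    assignments = {}
    current_blank = None
    for name in sorted(filenames, key=order_number):
        if is_blank(name):
            current_blank = name  # applies to all subsequent samples
        else:
            assignments[name] = current_blank
    return assignments


if __name__ == "__main__":
    files = [
        "Early_0000.h5",
        "AirBlank_0001.h5",
        "SampleA_0002.h5",
        "Capillary_blank_0004.h5",
        "SampleC_0005.h5",
    ]
    for sample, blank in assign_blanks(files).items():
        print(sample, "->", blank)
```

In this sketch a sample collected before any Blank maps to ``None``, which corresponds to the case where only QR data are produced and the sample must be re-reduced in Igor.
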
*Example of a proper data collection sequence:*

::

    Sample set 1
        1. newSample("Set1")
            1. measure Blank1
            2. measure samples belonging to Blank1 (*)
            3. measure Blank2
            4. measure samples belonging to Blank2 (*)
    Sample set 2
        1. (Optional) RE(newSample("Set2"))
            1. Recommended: measure Blank3
            2. measure samples belonging to the last Blank measured (*)
    Sample set 3
        ...

(*) Collect a new blank every 10–20 samples or every hour for standard resolution (Si 220), or every 5–10 samples or every 30 minutes for high resolution (Si 440). If this is not possible, collect a new blank as soon as convenient.

What data reduction Matilda does
---------------------------------

1. Detects a new USAXS/SAXS/WAXS data file on the server after measurement completes.
2. Reads the file name and order number (``_XYZ`` suffix before the extension).
3. Identifies the most recent Blank in the same *userName* folder based on the ``_XYZ`` number. The blank with the closest lower order number is selected.
4. Follows the same reduction path as Igor:

   a. Reduces both Blank and Sample to Q–Intensity–uncertainty data.
   b. Calculates transmission and calibration factors:

      * For USAXS: standard-less calculation from first principles.
      * For SAXS and WAXS: uses a Glassy Carbon SRM 3600 measurement to determine the calibration constant.

   c. Subtracts the Blank from the Sample and applies calibration constants, producing USAXS slit-smeared (``SMR``) and SAXS/WAXS (``QRS``) data.
   d. Rebins data: USAXS flyscans (~8k points) to 500 log-Q bins; SAXS (~800 points) to 200 log-Q bins; WAXS is kept at maximum resolution.
   e. For USAXS, desmears the data (``DSM``) to produce pinhole-equivalent data suitable for any analysis tool.

5. Appends the reduced data to the raw Nexus file in this order:

   a. Sample QRS data
   b. Blank QRS data
   c. Calibrated data (NXcanSAS format)
   d. For USAXS: slit-smeared data in NXcanSAS slit-smeared format.

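To make the rebinning in step 4d concrete, here is a minimal sketch of averaging a curve into logarithmically spaced Q bins. It is an illustration only and not the algorithm used by Matilda or Igor: the function name is hypothetical, each bin is reduced with plain arithmetic means, empty bins are dropped, and uncertainty propagation is left out.

```python
import math


def rebin_log_q(q, intensity, n_bins):
    """Average (q, intensity) points into n_bins logarithmically spaced Q bins.

    Returns a list of (mean_q, mean_intensity) tuples, one per non-empty bin.
    """
    if len(q) != len(intensity):
        raise ValueError("q and intensity must have the same length")
    # Logarithmic bins are only defined for positive Q values.
    points = [(qi, ii) for qi, ii in zip(q, intensity) if qi > 0]
    if not points or n_bins < 1:
        raise ValueError("need at least one positive Q value and one bin")

    log_min = math.log10(min(qi for qi, _ in points))
    log_max = math.log10(max(qi for qi, _ in points))
    width = (log_max - log_min) / n_bins

    # Per bin: [sum of q, sum of intensity, number of points].
    sums = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for qi, ii in points:
        if width == 0:
            index = 0
        else:
            # The largest Q lands exactly on the upper edge; clamp it into the last bin.
            index = min(int((math.log10(qi) - log_min) / width), n_bins - 1)
        sums[index][0] += qi
        sums[index][1] += ii
        sums[index][2] += 1

    return [(sq / n, si / n) for sq, si, n in sums if n > 0]


if __name__ == "__main__":
    # Synthetic stand-in for a flyscan: 8000 points following a q**-4 power law.
    q = [10 ** (-4 + 3 * i / 7999) for i in range(8000)]
    intensity = [qi ** -4 for qi in q]
    rebinned = rebin_log_q(q, intensity, 500)
    print(len(q), "points reduced to", len(rebinned), "bins")
```

Log-spaced bins keep roughly constant relative Q resolution across several decades, which is why far fewer points are needed than in the raw scan.
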
Data files: next steps
------------------------

After data collection you have HDF5 files containing both raw (NXsas) and reduced/calibrated (NXcanSAS) data. If Matilda ran correctly and no custom parameters are needed, these files are all you need. If anything needs to be changed or re-reduced, use Igor Pro (Indra for USAXS, Nika for SAXS and WAXS). Igor allows overriding many default parameters, including selecting a different Blank (even measurements without "blank" in their name).

Options for using these HDF5 data:

1. Import into Igor using the new USAXS GUI, the HDF5 importer, or native Igor Pro HDF5 handling. See :ref:`Import data `.
2. Use the HDF5 files directly in applications that read NXcanSAS. For example, SasView 6 (https://www.sasview.org/) can open these files and automatically locate the relevant data.
3. Open the files in HDFView to inspect the data structure and metadata. Any application that reads HDF5 (e.g., MATLAB) can import the data.
4. Use Python to read the data for analysis. The NXcanSAS definition describes how to locate data using attributes. The Matilda repository (GitHub) contains example code for reading these files.
