Automated slice‐specific z‐shimming for functional magnetic resonance imaging of the human spinal cord

Abstract Functional magnetic resonance imaging (fMRI) of the human spinal cord faces many challenges, such as signal loss due to local magnetic field inhomogeneities. This issue can be addressed with slice‐specific z‐shimming, which compensates for the dephasing effect of the inhomogeneities using a slice‐specific gradient pulse. Here, we aim to address outstanding issues regarding this technique by evaluating its effects on several aspects that are directly relevant for spinal fMRI and by developing two automated procedures in order to improve upon the time‐consuming and subjective nature of manual selection of z‐shims: one procedure finds the z‐shim that maximizes signal intensity in each slice of an EPI reference‐scan and the other finds the through‐slice field inhomogeneity for each EPI‐slice in field map data and calculates the required compensation gradient moment. We demonstrate that the beneficial effects of z‐shimming are apparent across different echo times, hold true for both the dorsal and ventral horn, and are also apparent in the temporal signal‐to‐noise ratio (tSNR) of EPI time‐series data. Both of our automated approaches were faster than the manual approach, lead to significant improvements in gray matter tSNR compared to no z‐shimming and resulted in beneficial effects that were stable across time. While the field‐map‐based approach performed slightly worse than the manual approach, the EPI‐based approach performed as well as the manual one and was furthermore validated on an external corticospinal data‐set (N > 100). Together, automated z‐shimming may improve the data quality of future spinal fMRI studies and lead to increased reproducibility in longitudinal studies.


| INTRODUCTION
The spinal cord is one of the key structures linking the brain with the peripheral nervous system and participates in numerous sensory, motor and autonomic functions (Hochman, 2007). Noninvasive approaches to investigate the human spinal cord are therefore of great interest not only from a basic neuroscientific perspective, but also with regards to their possible clinical utility in order to understand pathological mechanisms in motor and sensory disorders such as multiple sclerosis and chronic pain . Currently, the main approach to investigate spinal cord function is based on blood-oxygen-level-dependent functional magnetic resonance imaging (BOLD fMRI; for reviews see Giove et al., 2004;Stroman et al., 2014;Summers et al., 2014;Cohen-Adad, 2017). Using conventional BOLD fMRI techniques such as gradient-echo echo-planar imaging (GE EPI) is however challenging in the spinal cord due to (i) its small cross-sectional diameter, (ii) prominent physiological noise from cardiac and respiratory sources, and (iii) magnetic field inhomogeneities.
In the cervical spinal cord (i.e., the part that is easiest to access with currently available receive coils at 3 T), inhomogeneities in the magnetic field occur at both large and small spatial scales (Cohen-Adad, 2017). While large-scale variations are for example due to the proximity of the lungs (and can thus vary dynamically; e.g., Verma & Cohen-Adad, 2014), small-scale variations are due to the interfaces between vertebrae and connective tissue, which have different magnetic susceptibilities (Cooke et al., 2004;Finsterbusch et al., 2012).
These small-scale field inhomogeneities are reproduced spatially along the superiorinferior axis of the spinal cord and significantly affect image quality, leading to consistent patterns of signal loss Maieron et al., 2007). While it would thus be imperative for reliable and reproducible fMRI of the spinal cord to mitigate these effects, standard shimming techniques implemented on common whole-body MR systems are not able to compensate these spatially repeating inhomogeneities to an adequate degree (Finsterbusch, 2014).
One method that is commonly employed to overcome throughslice dephasing is slice-specific "z-shimming" (Constable, 1995;Frahm et al., 1988;Glover, 1999) where an additional gradient pulse is applied in the slice-selection direction in order to compensate the effect of susceptibility-induced gradients and resulting signal loss. In the brain, z-shimming has been applied in GE EPI studies focused on susceptibility-prone regions, that is, those that are close to air/bone interfaces such as the orbitofrontal, the medial temporal, and the inferior temporal cortex (Deichmann et al., 2003;Posse et al., 2003;Weiskopf et al., 2006;Yang et al., 1997). Finsterbusch et al. (2012) investigated whether one could use this approach to also compensate for the periodically occurring signal drop-outs (along the superiorinferior axis) on T2*-weighted GE EPI images of the spinal cord. By applying single, slice-specific compensation momentswhich were manually determined based on a reference-scan acquired prior to the experimental EPI acquisitionthey were able to demonstrate an improvement in spinal cord image quality: reducing the spatially repeating signal drop-outs via slice-specific z-shimming resulted in an increase of mean signal-intensity by $20% and a reduction of signalintensity variability along the cord by $80%.
While the slice-specific z-shimming protocol developed by Finsterbusch and colleagues has already been used in numerous spinal (e.g., Sprenger et al., 2012;Geuter & Buchel, 2013;Kong et al., 2014;van de Sand et al., 2015;Eippert et al., 2017;Sprenger et al., 2018) and cortico-spinal fMRI studies (e.g., Oliva et al., 2022;Sprenger et al., 2015;Tinnermann et al., 2017;Vahdat et al., 2020), the impact of slice-specific z-shimming on EPI time-series data has not been investigated systematically, as Finsterbusch and colleagues only evaluated its effects on single volumes of GE EPI data, but not on timeseries metrics such as tSNR (Welvaert & Rosseel, 2013). Even more importantand already argued for by Finsterbusch and colleagueswould be an automated way to determine the slice-specific z-shims, as these are currently determined manually by the scanner operator: either visually by going through each slice and z-shim value obtained in a reference-scan or by manually placing a region of interest on each slice of this reference scan and evaluating the extracted signal intensity. This procedure is time-consuming, requires expertise in judging the quality of spinal EPI data, and contains a subjective component, thus also limiting its potential in terms of reproducibility.
In this study, we aim to develop an automated and user-friendly procedure for determining slice-specific z-shims in order to improve the quality of spinal fMRI. In a first step, we aim to replicate the results of Finsterbusch et al. using twice the original sample size (N = 48). Next, we aim to extend their findings by probing the relevance of slice-specific z-shimming for fMRI through investigating its effects (a) across different echo times, (b) in distinct anatomical regions, and (c) on a time-series metric (tSNR). Most importantly, we propose two different automated methods for determining slicespecific z-shims (each based on a sample size of N = 24). The first method is based on a z-shim reference-scan acquisition and determines z-shim values by analyzing EPI signal intensity within the spinal cord for each combination of slice and z-shim value. The second method is based on a field map (FM) acquisition and determines zshim values by estimating the strength of the gradient field needed to compensate for the local through-slice inhomogeneity for each slice.
In a final step, we use an independently-acquired external data-set (N > 100; Oliva et al., 2022) in order to validate our candidate approach for automating the selection of slice-specific z-shims.

| Participants
48 healthy participants (22 females, mean age: 27.17 years, range 20-37 years) participated in this study. All participants provided written informed consent and the study was approved by the ethics committee at the Medical Faculty of the University of Leipzig. The sample size was determined based on a study by Finsterbusch et al. (2012): as we wanted to replicate and extend their findings (which were based on a sample of N = 24), we chose the same sample size for each of our two subgroups, resulting in an overall sample size of N = 48.

| Study design
All participants underwent the following scans in the order described below (for details of scans, see section "2.3 Data acquisition").
After an initial localizer scan, the EPI slice stack and the adjust volume were prescribed and a single EPI volume was acquired in order to initialize the scanner's "Advanced shim mode"this shim was then employed in all the following EPI acquisitions by using the same adjust volume. The adjust volume had the same angle as the EPI acquisitions, was centered on the spinal cord in leftright and anteriorposterior directions and had a volume of 60 Â 60 Â 130 mm (thus extending 5 mm in superior and inferior direction beyond the EPI slice-stack). An EPI z-shim reference scan was performed next in order to allow for the manual as well as EPI-based automated selection of the optimal zshim moment for each slice. Two sagittal FMs (vendor-based and inhouse versions, respectively) were then acquired to obtain the B 0 static magnetic field distribution, of which the vendor-based one was used for the FM based automated z-shim selection due to it being widely available. This was followed by the acquisition of a highresolution T2-weighted image in order to allow for spinal cord segmentation as needed for the FM based automated z-shim selection.
In order to compare the signal characteristics under different zshimming conditions, EPI data were acquired with three different EPI protocols for each participant: without z-shim gradient compensation (condition "no z-shim"), with z-shim gradient compensation based on manual z-shim selection (condition "manual z-shim"), and with z-shim gradient compensation based on automated z-shim selection (condition "automated z-shim"). For one-half of the participants (24 participants), the automated selection was based on the EPI reference scan, whereas for the other half, the automated selection was based on the vendor-based FM. Both single EPI volumes , as well as 250 EPI volumes (in order to assess effects on time-series data), were acquired for each condition; the order of the EPI scans under different conditions was pseudo-randomized across participants.
We also wanted to assess the benefits of slice-specific zshimming at different echo times (TE), and therefore acquired 25 EPI volumes under three different TEs (30, 40, and 50 ms, each with a repetition time [TR] of 2552 ms) for each of the three conditions (please note that the z-shim indices chosen reflect gradient fields to be compensatedrather than moments of the compensation gradient pulseand thus scale the pulsed gradient moment with the TE such that a determined index is valid for all TEs). The order of the EPI scans acquired with different TEs were also pseudo-randomized across participants.
The EPI reference scan and the in-house FM acquisitions were repeated at the end of the scanning session in order to assess the stability of z-shimming across time.

| Data acquisition
All measurements were performed on a 3 T whole-body Siemens Prisma MRI System (Siemens, Erlangen, Germany) equipped with a whole-body radio-frequency (RF) transmit coil and 64-channel RF head-and-neck coil and a 32-channel RF spine-array, using the head coil element groups 5-7, the neck coil element groups 1 and 2, and spine coil element group 1 (all receive-only).
EPI acquisitions were based on the z-shim protocol developed by Finsterbusch et al. (2012) that employed a single, slice-specific gradient pulse for compensating through-slice signal dephasing. EPI volumes covered the spinal cord from the second cervical vertebra to the first thoracic vertebra and were acquired with the following parameters: slice orientation: transverse oblique; number of slices: 24; slice thickness: 5 mm; field of view: 128 Â 128 mm 2 , in-plane resolution: 1 Â 1 mm 2 ; TR: 2312 ms; TE: 40 ms; flip angle: 84 (chosen based on our repetition time of 2312 ms and the cervical cord gray matter T1 estimate of $1000 ms at 3 T; Smith et al., 2008); GRAPPA acceleration factor: 2; partial Fourier factor: 7/8, phase-encoding direction: anterior-to-posterior (AP), echo spacing: 0.93 ms, bandwidth per pixel: 1220 Hz/Pixel; additionally, fat saturation was employed. The EPI reference scan (TE: 40 ms, total acquisition time: 55 s) was acquired with 21 equidistant z-shim moments compensating field inhomogeneities between +0.21 and À 0.21 mT/m (in steps of 0.021 mT/m).
The vendor-based FM (total acquisition time: 4.31 min) was obtained using the 2D GRE sequence provided by Siemens with two echoes per shot (TE 1: 4.00 ms; TE 2: 6.46 ms; slice orientation: sagittal (parallel to the normal vector of the axial EPI slices); slice number: 32; slice thickness: 2.2 mm; field-of-view: 180 Â 180 mm 2 ; in-plane resolution: 1 Â 1 mm 2 ; TR: 500 ms; flip angle: 50 , bandwidth per pixel of 1030 Hz/pixel). Additionally, an in-house FM based on a 3D multi-echo FLASH sequence with multiple gradient echoes acquired at short inter-TEs was acquired, which yielded a superior signal-to-noise ratio at a reduced overall scan time. This contained 12 bipolar gradient echoes (which allowed for shorter inter-echo spacings; note that potential image shifts were avoided by a multi-echo navigator scan without phase encoding right at the start of image acquisition; a phase correction between the odd and even echoes was performed by the vendor's Ice reconstruction pipeline), a TE increment/difference of 1.3 ms, fat suppression RF pulses with corresponding spoiler gradients before each slabselective excitation, a repetition time of 32 ms, a flip angle of 15 , bandwidth per pixel of 1030 Hz/pixel, and sagittal slice orientation (parallel to the normal vector of the axial EPI slices). The in-plane and partition resolutions of this in-house FM were 1 Â 1 mm 2 and 2.2 mm, respectively, with corresponding fields-of-view of 180 Â 180 Â 70.4 mm 3 . A total scan time of less than 2 min was achieved by the application of GRAPPA (an acceleration factor of 2 was used in PE dimension). The frequency offset Δν 0 in each voxel was extracted from a linear fit to the unwrapped phases of all echoes (unwrapping of phase jumps exceeding +/À Pi was performed using a simple algorithm; due to the employed short echo and inter-echo times, this unwrapping could be applied because problems of noisy phase jumps or an undersampling of the phase evolution were largely absent).
2.4 | Selection of slice-specific z-shim moments

| Manual selection
The researcher carrying out the data acquisition (MK) determined the z-shim moment with the highest signal intensity in the spinal cord for each slice by visual inspection (i.e., for each of the 24 slices, the researcher looked at all 21 volumeseach volume reflecting an acquisition with one z-shim momentin order to determine the "optimal" z-shim moment for each slice). This selection process took $10 min per participant and was carried out for all 48 participants, that is, in both subgroups of 24 participants.  Jenkinson et al., 2012; https://fsl.fmrib.ox.ac.uk/fsl/fslwiki) were employed to determine the optimal z-shim moment for each slice.
These values were then sent back to the scanner console in a text file that is read by the z-shim sequence. An overview of the automated methods is given in Figure 1 (please note that the z-shim selection process is automated and does not require any input from the user).

EPI-based selection
In a subsample of 24 participants, the EPI z-shim reference-scan was used to determine the optimum z-shim moments. The EPI z-shim reference-scanconsisting of 21 volumes (each volume (a) (b) F I G U R E 1 Schematic depiction of automated z-shim methods. After the acquisition of the necessary scans for each method (z-shim reference EPI for EPI-based approach, T2-weighted image and field map for field map based approach), DICOM images were exported to an online calculation computer, and converted to NIfTI format before further processing. (a) EPI-based selection. The z-shim reference scan was then averaged across its 21 volumes (one volume per z-shim moment; three volumes are depicted here as mid-sagittal sections to illustrate the varying signal loss) and the resulting mean image was segmented (the segmentation is shown here as a transparent red overlay for display purposes). The mean signal intensities for each slice and z-shim moment were extracted from the segmented cord, resulting in a 24 Â 21 signal intensity matrix (slicesÂvolumes). For each slice, the z-shim value (i.e., the corresponding index in the reference scan) resulting in the maximum intensity was selected. (b) Field map based selection. A high-resolution T2-weighted image was segmented and used to determine the field map voxels to be included in the fitting procedure (the segmentation is shown as a transparent red overlay for display purposes). The gray and the black boxes depict the EPI coverage on the T2-weighted image and field map, respectively. Voxels within a 9 mm thick slab (i.e., nine transversal field map slices, corresponding to a 5 mm EPI slice +2 mm on each side) were included in a slice-wise fitting procedure. The green lines on the field map indicate the input volume for fitting an exemplary target slice (dashed green line). Exemplary transversal slices are also shown, with the red line outlining the spinal cord. Slice-wise fitting, including three linear field coefficients (G x , Gy, and G z ) along the main axes of the imaging volume and a spatially homogenous field term (field offset), was repeated over slices and the z-shim (G z ) moments corresponding to the center of the EPI slices were selected corresponding to one z-shim moment) with 24 slices eachwas then averaged over the 21 volumes (i.e., over all z-shim moments) and the resulting mean image was automatically segmented using the PropSeg approach implemented in SCT (De Leener et al., 2014). Based on experience from pilot experiments, we built in several fail-safes (i.e., systematically changing the arguments of SCT's PropSeg function that affect the propagation in the z-direction) in order to ensure that the segmentation would propagate across the entire slice stack; this possibility to automatically adjust parameters in case of failure was also the reason thatout of SCT's segmentation algorithmswe chose PropSeg instead of DeepSeg. We used the mean image for segmentation because we wanted to ensure that image quality was sufficient for automatic segmentation of the spinal cord and because the averaging of volumes acquired during different breathing cycles avoids a bias towards one respiratory state as could occur with single volumes. In post hoc investigations regarding the suitability of using the mean EPI image for segmentation, we (i) used a maximum image instead of a mean image as the input for segmentation and (ii) used a segmentation obtained from the T2-weighted image (registered to the EPI segmentation), but both of these alternative approaches resulted in highly similar results compared to our original approach (data not shown). Using the automatically generated spinal cord mask, the mean signal intensities for each slice and z-shim moment were extracted, resulting in a 24 Â 21 matrix, from which the z-shim moment yielding the maximum intensity across the cord mask was determined for each slice. The average run-time for the execution of the selection code was 15.6 s (range across the entire sample: 7.7-62.3 s), with the variation mostly being due to the number of PropSeg runs needed to achieve complete propagation. The interested reader can assess the quality of the EPI-based spinal cord segmentation via a quality-control HTML-report shared together with our data-set (see section 2.8).

FM-based selection
In another subsample of 24 participants, sagittal FMs (acquired with the same angulation as EPI data) were used to determine the optimum z-shim moments; note that FMs had anisotropic voxels, as (i) a high in-plane resolution of the sagittal FM is necessary in order to obtain sufficient information about the gradient in the through-slice direction of the EPI (i.e., foot-head) and (ii) the leftright direction (where voxels were largest) is expected to have the least field variation and is thus least sensitive to resolution. First, a spinal cord mask was generated via a PropSeg-based automatic segmentation of each participant's T2-weighted image because a high-quality segmentation of the FM magnitude image was not possible due to the sagittal slice thickness of 2.2 mm as well as the poor image contrast between spinal cord and cerebrospinal fluid (note that since the T2-weighted image and FM were well aligned and acquired right after each other we did not carry out a separate registration step). FM based (from now on referred to as FM-based) z-shim moments were then calculated for each EPI slice using a linear least-squares fit of a set of spatial basis functions to the measured FM (which was smoothed with an isotropic 1 mm Gaussian kernel prior to the calculation). The spatial basis functions consisted of three linear field terms along the main imaging axes and a spatially homogenous field term, representing a field offset (although obtaining x-and y-gradients is not necessary for calculating the through-slice field component, their inclusion can be seen as a step towards full slice-wise shimming [see also Islam et al., 2019] and obtaining y-gradients is necessary for determining the effective TE [see below]). Only voxels within the spinal cord mask contributed to the fitting procedure, which included voxels within a 9 mm thick slab (i.e., nine transversal FM slices) centered on the center of the corresponding EPI slice. The slab was chosen to be thicker than the EPI slice (i.e., an additional 2 mm either side) in order to give more robust estimates of the through-slice field gradient. The fitted through-slice linear field term (G z ) was taken to represent the local field gradient causing through-slice signal dephasing within the corresponding EPI slice. The resulting dephasing gradient moment of G z Á TE was rounded to the nearest of the 21 z-shim compensations available in the EPI protocol and then used for subsequent EPI acquisitions. The average time for the execution of the selection code was 36.1 s (range across the entire sample: 31.5-53.3 s).

| Preprocessing
All images were visually inspected before the analysis for potential artefacts. Preprocessing steps were performed using MATLAB (version 2021a), FSL (version 6.0.3), and SCT (version 4.2.2; please note that a more recent version of SCT was used for preprocessing (4.2.2) compared to the automated analysis during data acquisition (3.2.7), due to the availability of releases at the respective times). The reason we carried out preprocessing steps and did not work only on the raw data is two-fold: (i) we were interested in z-shim effects on timeseries metrics (tSNR) and thus needed to motion-correct the EPI timeseries data and (ii) we were performing most analyses in template space and thus need to bring structural and functional data to this space (requiring segmentation and registration-to-template steps). We only carried out the necessary amount of preprocessing (motion correction and registration to template space) needed to assess the effects of slice-specific z-shimming and thus refrained from performing additional steps such as spatial smoothing and physiological noise correction. Please note thatdepending on contextwe are using the terms "fMRI data" and "EPI time-series data" interchangeably.

| Motion-correction of EPI time-series data
A two-step motion correction procedure (with spline interpolation) was applied to the EPI time-series data. Initially, the mean of 750 volumes (250 volumes under each of the three different conditions, that is, no z-shimjmanual z-shimjautomated z-shim) was calculated in order to serve as the target image for the first step of motion correction; averaging across all three conditions eliminates a bias towards any one condition with respect to the target image. Based on this mean image, the spinal cord was automatically segmented in order to provide a spinal cord centerline that then served as input for creating a cylindrical mask (with a diameter of 30 mm). This mask was employed during the motion-correction procedure in order to ensure that regions moving independently from the cord would not adversely affect motion estimation. Slice-wise motion correction with a second degree polynomial regularization in the z-direction was then performed (De Leener et al., 2017). In the second step, a new target image was obtained by calculating the mean of motion-corrected images from the first step and the raw images were realigned to this new target image, using the identical procedure as described above.
Please note that the data obtained under different TEs (25 images per TE and condition) were also registered to this target image using the same procedure.
Under the "no z-shim" condition, especially the inferior slices suffered from severe signal drop-outs that hampered the quality of the slice-wise motion correction algorithm by inducing "artificial" movements that were indeed not present in the raw data. This could impact the tSNR calculation negatively by artificially increasing the standard deviation across time and thus give an inflated estimate of the beneficial effects of z-shimming. Therefore, in a control analysis, we also performed a "censoring" of outlier volumes before the tSNR calculation. The outlier volumes were defined using dVARS (the root mean square difference between successive volumes) and refRMS (root mean square intensity difference of each volume to the reference volume) as metrics using FSL's "fsl_motion_outliers" tool. Volumes presenting with dVARS or refRMS values two standard deviations above the mean values of each run were selected as outliers. These outlier volumes were then individually modelled as regressors of no interest.

| Segmentation
T2-weighted images were initially segmented using the DeepSeg approach implemented in SCT (Gros et al., 2019). This initial segmentation was used for smoothing the cord along its centerline using an anisotropic kernel with 8 mm sigma. The smoothed image was again segmented in order to improve the robustness of segmentation. The quality of the segmentations was assessed visually and further manual corrections were not deemed to be necessary in any participant.
For functional images, a manual segmentation was used instead of an automated procedure, as the registration to template space relied on segmentations and we therefore aimed to make this preprocessing step as accurate as possible. For the single-volume EPIs, the single volumes under the three different z-shimming conditions were averaged and this across-condition mean image was used to manually draw a spinal cord mask. For the EPI time-series, all motion-corrected volumes were averaged and a spinal cord mask was manually drawn based on this mean image (please note that this mask was also used for the normalization of the volumes with different TEs). These manually drawn masks were also used to calculate results in native space.

| Registration to template space
SCT was utilized for registering the EPI images to the PAM50 template space (De Leener et al., 2018); PAM50 is an MRI template of the spinal cord and brainstem available in SCT for multiple MRI contrasts.
The T2-weighted image of each participant was brought into template space using three consecutive registration steps: (i) using the spinal cord segmentation, the spinal cord was straightened, (ii) the automatically determined labels of vertebrae between C2 and C7 (manually corrected where necessary) were used for vertebral alignment between the template and the individual T2-weighted image, and (iii) the T2-weighted image was registered to the template using nonrigid segmentation-based transformations.
In order to bring the functional images to template space, the template was registered to the functional images using nonrigid transformations (with the initial step using the inverse warping field obtained from the registration of the T2-weighted image to the template image). The resulting inverse warping fields obtained from this registration (from native EPI space to template space) were then applied to the respective functional images (e.g., single EPI volumes, mean EPI volume, tSNR maps) to bring them into template space where statistical analyses were carried out.
Finally, we also brought each participant's FM into template space in order to visualize the average B 0 field variation across participants.
Each participant's FM was first resampled to the resolution of the T2-weighted image before the warping field obtained from the registration to template space was applied to the FM.

| EPI signal extraction
In order to assess the effects of z-shimming, we obtained signal intensity data from each EPI slice. When analyses were carried out in native space and were based on the entire spinal cord cross-section, we used the above-mentioned hand-drawn masks of the spinal cord and obtained one value per slice (average across the entire slice). In contrast, when analyses were carried out in template space or were based on gray matter regions only, we made use of the available PAM50 template masks of the entire spinal cord or the gray matter (with the probabilistic gray matter masks thresholded at 90%); again, we obtained one average value per mask and slice. Please note that in addition to reporting p values from statistical tests, we also report (where appropriate) the percentage difference between conditions and the associated 95% confidence interval (CI) as estimated via bootstrapping.

Direct replication
In a first set of analyses (across all 48 participants), we aimed to replicate the findings of Finsterbusch et al. (2012). We, therefore, used template space single-volume EPI data acquired under no z-shim and manual z-shim conditions, calculated the individual EPI signal intensity per slice and reported the mean of signal intensity across all slices as well as the variation of signal intensity across all slices; for the latter, we initially used the variance (as done by Finsterbusch et al., 2012), but after the replication of their results we employed the coefficient of variation for the remainder of the manuscript (due to it being a standardized measure of variability). Both descriptive changes (percent increase / decrease), as well as statistical values (based on paired t tests), were reported for the condition comparison. To additionally investigate how robust these findings were, we complemented these single-volume analysesthat might be affected by various noise sourcesby the same analysis approach, but now carried out on an EPI volume that is the average of a time-series of 250 motioncorrected EPI volumes (acquired both for no z-shim and manual zshim; Supplementary Material). In order to demonstrate that neither of these results were impacted by registration to template space, we also reported native space results in the Supplementary Material.
Slice-by-slice characterization of z-shim effects Finsterbusch et al. (2012) already demonstrated that the improvement due to slice-specific z-shimming varies spatially along the rostrocaudal direction. We therefore reasoned that it might be informative to also quantify the benefit for slices with various degrees of signalloss (obviously, such an analysis could only be performed in native space). We first did this in a descriptive manner by reporting (i) the maximally found percentage increase in signal intensity due to zshimming and (ii) the proportion of slices that differed by 0, 1, 2, 3, and >3 z-shim steps from the "neutral" setting of no z-shim (i.e., the number of slices [divided by the total number of slices] that had (i) a value of 11 (no z-shim applied) and thus a step difference of 0 compared to the 'neutral' value of 11, (ii) a value of 10 or 12 and thus a step-difference of 1 compared to the 'neutral' value of 11, (iii) a value of 9 or 13 and thus a step-difference of 2 compared to the 'neutral' value of 11, etc.). In the Supplementary Material, we then followed this up more formally with an analysis where we categorized slices according to the manually chosen z-shim value and compared the signal intensity in these categories between no z-shim and manual z-shim both descriptively (using % signal intensity difference) and inferentially using a 2 Â 5 repeated-measures ANOVA (factor 1: condition with two levels: no z-shim, manual z-shim; factor 2: step-difference with five levels: 0, 1, 2, 3, >3). We tested for a main effect of condition, a main effect of step-difference and an interaction between these two factors; post hoc t tests were Bonferroni corrected. To estimate the robustness of the results from these analyses (which were based on single EPI volumes), we repeated them on the average across the 250 motion-corrected EPI volumes (Supplementary Material).

z-shim effects across different TEs
We also aimed to assess the effects of z-shimming at TEs clearly shorter (30 ms; fastest TE possible with the employed partial-Fourier factor of 7/8) and longer (50 ms; same distance to our standard TE of 40 ms) than the estimated T2* in the cervical spinal cord at 3 T ($40 ms; Barry et al., 2019), considering that such choices might often be necessary in fMRI studies. We, therefore repeated the analyses described in section "Direct replication" (assessing the mean of signal intensity across all slices as well as the variation of signal intensity across all slices for no z-shim and manual z-shim conditions) on the template-space EPI data obtained with TEs of 30 ms and 50 ms, both shimming are also present in the gray matter (which is the relevant tissue for fMRI) and might even vary spatially (i.e., between dorsal and ventral horns). In order to address these two questions, we ran a 2 Â 2 repeated-measures ANOVA (factor 1: condition with two levels: no z-shim, manual z-shim; factor 2: anatomical location: dorsal horn, ventral horn) where we tested for a main effect of condition, a main effect of location and an interaction between the two factors (Supplementary Material); this was followed up by post hoc Bonferroni-corrected t tests (where we also report % increase for the direct comparisons). As underlying metrics, we tested both the mean of signal intensity across all slices and the variation of signal intensity across slices. To assess robustness, the above-described analyses (based on single-volume EPIs) were repeated based on the average across the 250 motion-corrected volumes. As a negative control, we also performed the same analyses as above, but now splitting the spinal cord gray matter into left and right parts.

z-shim effects on time-series data
The analyses described above, as well as the results reported by Finsterbusch et al. (2012) were solely based on measures of signal intensity. In order to directly investigate the potential benefit of zshimming for spinal cord fMRI, we also investigated its effect on the temporal signal-to-noise ratio (tSNR, i.e., temporal mean divided by temporal standard deviation on a voxel-by-voxel basis) of motioncorrected data (250 volumes). We are aware that effects on tSNR do not allow for a perfect one-to-one extrapolation to effects on BOLD sensitivity, but we nevertheless believe this to be an adequate proxy measure due to the following reasoning (De Panfilis & Schwarzbauer, 2005;Deichmann et al., 2002;Poser et al., 2006): since the contrast-to-noise ratio (CNR) of BOLD responses is proportional to the product of the effective TE and tSNR and the effective TE does not depend on the magnetic field gradient in the z-direction, any tSNR gain obtained by z-shimming should reflect a corresponding relative gain in BOLD-CNR in arbitrary task-based fMRI studies.
Following up on section "z-shim effects in gray matter regions", we only assessed this in the region most relevant for fMRI, that is, the gray matter of the spinal cord. We compared mean tSNR across all slices, as well as variation of tSNR across slices, between no z-shim and manual z-shim conditions: we descriptively reported % increase and also tested for significant differences using paired t tests.
Since signal loss in the most caudal (inferior) slices in the no zshimming condition could negatively impact the motion correction (as this is regularized along z using a second-degree polynomial), we performed the above-mentioned analyses also after "censoring" of outlier volumes (Supplementary Material; see also section 2.5.1).
As we only acquired 25 volumes for the short and long TEs due to time constraints, we did not calculate TE-dependent z-shim effects on tSNR (as these would be based on unstable tSNR estimates).

EPI-based automation
Next, we investigated the performance of the EPI-based automated approach for selecting z-shim values, both in comparison to the conditions of no z-shim and manual z-shim; this was carried out in a subgroup of 24 participants. For the sake of brevity, we (i) only reported our effects of interestsignal intensity based on single EPI volumes (Supplementary Material) and tSNR based on EPI time-seriesin the spinal cord gray matter (i.e., ignoring whole-cord data) and (ii) employed direct comparisons of conditions without using an initial omnibus test. Thus, in this subgroup of 24 participants we investigated: (i) no z-shim versus manual z-shim, (ii) no z-shim versus auto zshim, and (iii) manual z-shim versus auto z-shim. We reported % differences, as well as Bonferroni-corrected p values from paired t tests, again using mean and variation metrics.

FM-based automation
We investigated the performance of the FM-based automated approach for selecting z-shim values (based on a different subgroup of 24 participants) using the identical procedure as outlined in the previous paragraph.
However, since we discovered that the performance of the FMbased approach was slightly inferior compared to the manual approach, we followed this up with several post hoc investigations

Comparing all three approaches
So far, the automated approaches were compared to the manual approach within each subgroup of 24 participants. We next turned to directly comparing the approaches, using all 48 participants.
First, we used two-sample t tests (with Bonferroni-corrected two-tailed p values) in order to assess the following, based on gray matter tSNR from EPI time-series (using both the mean as well as the variation of tSNR across all slices): (i) comparing the baselines of no zshim between the two groups, (ii) comparing the improvement of manual z-shim versus no z-shim between the two groups, (iii) comparing the improvement of auto z-shim versus no z-shim between the two groups, and (iv) comparing the difference of manual z-shim versus auto z-shim between the two groups. In complementary analyses, we also assessed the similarity between the automated approaches and the manual approach in terms of the actually chosen z-shim step using rank-based correlation and Euclidean distance (Supplementary Material).
Second, we assessed the stability of z-shim effects (based on either of the automated approaches as well as the manual approach) over time in all 48 participants. We were able to do this since we acquired an EPI reference-scan not only at the beginning of the experiment, but also at the end ($60 min later). Using these reference scans, we "artificially reconstructed" an EPI volume from each of the reference scans by selecting the corresponding volume for each slice based on the chosen z-shim values, no matter whether a participant was in the EPI-based or FM-based automation group. Importantly, we chose the "originally" determined z-shim values to reconstruct "artificial volumes" from both the first and the second reference scan.
These volumes were then realigned to the mean of the motioncorrected time series. The warping fields that were obtained during the normalization of motion-corrected mean image to the template space were used to bring these volumes to the template space. We then compared gray matter signal characteristics (mean and variation of signal intensity across slices, respectively) for both time points using the various conditions via paired t tests with Bonferroni correction.

| Validation of EPI-based automation approach
In order to validate the EPI-based automation method (which performed at least as well as the manual approach), we obtained an independent, externally acquired data set of spinal GE-EPI data. These data were acquired by VO, RHD, and JCWB as part of a larger project on pharmacological aspects of cortico-spinal pain modulation (Oliva et al., 2022). Here, we report results based on analyzing the z-shim reference data from 117 acquisitions (39 participants, each with three visits).
As the validation dataset did not include volumes that were acquired under different z-shimming conditions, for each participant we "artificially reconstructed" an EPI volume from their reference scan by selecting the corresponding volume for each slice based on the chosen z-shim values (see also section "Comparing all three approaches"). We created three different EPI volumes for each participant and visit: (i) a "no z-shim" volume (based on an index of 8 for each slice, which corresponds to a z-shim moment of 0 mT m À1 ms), (ii) a "manual z-shim" volume (based on the z-shim values manually selected by VO when the experiment was carried out), and (iii) an "automated z-shim" volume (based on the above-described EPI-based automation carried out post hoc).
To bring these volumes to template space for each participant and visit, we applied the following steps to the T1-weighted anatomical data: (i) segmenting the T1 image using SCT's DeepSeg approach (Gros et al., 2019), (ii) automatically labelling the vertebral levels C2-C7, and (iii) bringing the T1 image to template space using nonrigid transformations. Then, we applied the following steps to the reconstructed EPI volumes: (i) calculating the average of these three volumes (one volume for no z-shim, manual z-shim and automated zshim each), (ii) segmenting the average (using the PropSeg approach), (iii) registering this average EPI to the template space (with the initial step of using the inverse warping field obtained from the registration of the T1-weighted image to the template image), (iv) registering individual EPI volumes to the template space using the warps obtained from the previous step (in order to be unbiased), and (v) in template space obtaining the signal over slices using the PAM50 cord mask.
Four individual data sets were excluded due to artifacts in the images (three data sets) and a wrong placement of the slice stack (one data set). Our final sample thus consisted of 113 measurements from 38 participants. Please also note that for preprocessing of data from one individual data set, we used a more recent version of SCT (version 5.2.0) due to a bug present in version 4.2.2.
Finally, we compared whole cord signal characteristics (mean and variation of signal intensity across slices) for (i) no z-shim versus manual z-shim, (ii) no z-shim versus auto z-shim, and (iii) manual z-shim versus auto z-shim via paired t tests with Bonferroni-correction and also reported % differences. For sake of simplicity, we treated each visit as a separate data point, thus ignoring the within-subject dependency structure. We also reported the results of the same analyses for gray matter signal characteristics (Supplementary Material).

| Open science
The code that was run during the experiment for the automated selection of z-shim moments (both EPI-based and FM-based), as well as all the code necessary to reproduce the reported results, is publicly available on GitHub (https://github.com/eippertlab/zshim-spinalcord).
Please also see the file Methods.md in this repository for a version of the Methods section with links to specific parts of the processing and analysis code. The underlying data are available in BIDS-format via OpenNeuro (https://openneuro.org/datasets/ds004068), with the exception of the external validation dataset obtained by VO, RHD and JCWB. The intended data-sharing via OpenNeuro was mentioned in the Informed Consent Form signed by the participants and approved by the ethics committee of the University of Leipzig.

| Direct replication
Our first aim was to replicate earlier findings that demonstrated a significant increase of mean signal intensity and a decrease of signal intensity variation across slices via z-shimming. In our data set we were able to replicate these findings (Figure 2a), by also showing a significant increase of mean signal intensity (t (47) = 19.97, p < .001, difference of 14.8%, CI: 13.4%-16.2%) and a significant reduction of signal intensity variation across slices, either using the variance as a metric (as the to-be-replicated study did; t (47) = 18.03, p < .001, difference of 67.8%, CI: 64%-71.2%) or using the coefficient of variation (as we did in all further analyses; t (47) = 23.97, p < .001, difference of 51%, CI: 47.7%-53.8%).

| Slice-by-slice characterization of z-shim effects
As depicted in Figure 2a, the improvement afforded by slice-specific z-shimming periodically varies along the rostro-caudal direction in a consistent manner across participants (for a depiction of individual data, see Supplementary Figure S1). In a next step, we thus investigated not only what the average benefit of z-shimming is across the entire slice-stack, but also quantified the benefit for slices with various degrees of signal-loss. We first asked what the maximal signal intensity gain is per participant and observed that this varied between 72% and 209%, with an average across participants of 122% (note that this analysis is based on the most-affected slice per participant).
To descriptively characterize how many slices were affected by signal drop-out to what degree across participants, we quantified for each slice by how much the manually chosen z-shim value (between 1 and 21) differs from that of the no z-shim condition (a constant value of 11). We observed that on average 20% of slices had no difference, 32% of slices had a 1-step difference, 22% of slices had a 2-step difference, 11% of slices had a 3-step difference, and 16% of slices had more than a 3-step difference. In this last category, the most extreme possibly value (i.e., a 10-step difference) occurred only in 1% of the slices across the whole sample, demonstrating that the range chosen here for the z-shim reference scan is appropriate. As expected, signal intensity improvements became more pronounced with the increasing z-shim step size: 0% difference for a 0-step-difference, 5% different for a 1-step-difference, 18% difference for a 2-step-difference, 41% difference for a 3-step-difference and 122% difference for a > 3-stepdifference ( Figure 2b); a statistical characterization of this relation can be found in the Supplementary Material.

| z-shim effects across different TEs
In addition to the TE of 40 ms (which was the default across this study), we also investigated the effects of z-shimming at shorter  where the whole cord is affected). As a negative control, we also performed the same analyses as above, but now splitting the spinal cord gray matter into left and right parts: as expected, there were no significant differences between these two regions.

| z-shim effects on time-series data
Moving away from reporting single-volume signal intensity measures, we next investigated the effect of z-shimming on the gray matter temporal signal-to-noise ratio (tSNR) of motion-corrected time-series data (250 volumes, acquired under no z-shim and manual z-shim, respectively). We observed a significant increase in mean tSNR (t (47) = 10.64, p < .001, difference of 11.9%, CI: 9.7%-14.2%), as well as a significant reduction of tSNR variation across slices (t (47) = 11.01, p < .001, difference of 26%, CI: 21.9%-30%), directly highlighting the benefits for spinal fMRI (Figure 2d). In the most-affected slices, zshimming increased the tSNR by 28% on average, ranging from 1% to 155% across participants (this analysis also revealed that there was one outlier where tSNR decreased by 26% for manual z-shimming compared to no z-shimming).

| Automation of z-shimming
The previous results were all obtained using manually determined zshim values and we now turn to results obtained when automating the z-shim selection process, for which we propose two methods: one is based on obtaining these values from the EPI z-shim reference scan (EPI-based) and one relies on calculating the necessary z-shim values based on a FM (FM-based).
Unexpectedly though, despite this clear benefit, the performance of the FM-based approach was slightly worse than using manually In post hoc analyses carried out after the complete data-set was acquired, we investigated several possibilities that might account for this slightly poorer performanceall of these are explained in detail in the Supplemental Material. Briefly, we investigated the influence of (i) the choice of mask for data extraction, (ii) the choice of parameters for the fitting process, (iii) the influence of field-gradients in the APdirection, and (iv) inhomogeneity-induced mis-localizations between EPIs and FM. We also investigated whether the type of FM played a role and whether z-shims could be reliably derived at all from FM data. These investigations aimed to improve the estimation of the through-slice field inhomogeneities in the FM. However, it should be noted that compensating the mean through-slice field inhomogeneity of a slice may not result in the optimum signal intensity: a few extreme values of the field inhomogeneity may shift the mean value significantly, thereby decreasing the signal of the majority of these voxels significantly; on the other hand, this shift may also not recover significant signal in the voxels with the extreme values, yielding an overall lower signal amplitude. To address this issue, a different approach of determining the z-shim value was used that was based on a histogram of the field gradients and aimed to reduce the influence of extreme values. This approach led to a consistent improvement in performance, although even this method still did not achieve the performance of the manual selection.

| Comparing all three approaches
To extend the within-group analyses reported above (each with N = 24) we next (i) formally compared the three approaches based on the entire set of participants (N = 48) and (ii) investigated the general question of how stable z-shim effects obtained via the three methods are across an experiment.
First, and most relevant for fMRI, we used mean gray matter tSNR to test for differences between the EPI-based and FM-based groups. These analyses (using Bonferroni corrected two-sample t tests) revealed that there was neither a significant difference between the baselines of no z-shimming in the two groups (p = 1), nor a significant difference between the improvement compared to no z-shimming for either the manual (p = 0.38) or the automated approaches in the two groups (p = 1). However, we did observe a significant difference between manual z-shim versus auto z-shim in the two groups (p = 0.003), indicating the slightly worse performance of Second, we investigated how stable the beneficial effects of zshimming are across time. When comparing how well each of the three z-shim methods performed against the case of no z-shimming in terms of mean signal intensity, we observed that despite some differences the beneficial effect of z-shimming was rather stable across time. More specifically, we observed that (i) there was a significant difference between the two time-points in the baseline condition of no z-shim F I G U R E 3 Performance of both automated methods. Top panel. EPI-based automation. Bottom panel. FM-based automation. In both panels, the left-most plots show the group-averaged gray-matter tSNR for no (red), manual (blue), and automated (green) z-shim sequences along the rostro-caudal axis of the cord. The solid line depicts the mean value and the shaded area depicts the standard error of the mean. Conditionwise group-average tSNR maps of the transversal slices at the middle of each segment are shown in the second graphs from the left. The maps are overlaid onto the group-average mean image of the motion-corrected EPI data and depict a tSNR range from 11 to 20. The outlines of the thresholded gray matter mask are marked by green lines. The scatter plots to the right show gray matter tSNR for manual (x-axis) and automated z-shim sequences (y-axis) plotted against each other (N = 24 for each automation subgroup). Bland-Altman plots show the gray matter tSNR for manual z-shim plotted as the ground truth (x-axis) and the difference in gray matter tSNR between automated and manual sequences plotted on the y-axis. The horizontal solid gray line represents the mean difference in the gray matter tSNR between the two (automated and manual) sequences, and the dotted lines show the 95% limits of agreement (1.96 Â standard deviation of the differences) (t (47) = 5.59, p < .001, with the first time point having significantly higher mean signal compared to second one), (ii) that there was a slight degradation in performance when comparing z-shim benefits against no z-shimming between the second and the first reference scan (manual: t (47) = 8.44, p < .001; EPI-based: t (47) = 9.70, p < .001; FM-based: t (47) = 9.84, p < .001; thus similar across all three approaches) and (iii) that all z-shim methods led to significant benefits not only in the data acquired at the beginning (manual vs. no z-shimming:

| Validation of EPI-based automation approach
In order to validate the EPI-based automation approach, we obtained an externally acquired corticospinal GE-EPI dataset consisting of 113 EPI z-shim reference scans acquired on a different MR-system (Oliva et al., 2022), which also allowed us to investigate the generalizability of the EPI-based automated approach in a dataset in which the manual selection was conducted by a different researcher (VO). Using this independently acquired data set, we observed thatcompared to no z-shimmanual z-shimming resulted in a significant increase in mean signal intensity (t (112) = 19.24, p < .001, difference of 22.1%, CI: 19.7%-24.4%) and a significant decrease in signal intensity variation across slices (t (112) = 8.83, p < .001, difference of 37.1%, CI: 29.7%-43.9%). Most importantly, the automated EPI-based approach resulted in a significant increase in mean signal intensity (t (112) = 25.93, p < .001, difference of 28.3%, CI: 26.2%-30.6%) and a significant decrease in signal intensity variation across slices (t (112) = 10.98, p < .001, difference of 43.1%, CI: 36.4%-49.3%).
When we directly compared the automated and manual approaches, we observed that the automated method performed significantly better than the manual method both for mean signal intensity (t (112) = 11.85, p < .001), and signal intensity variation across slices (t (112) = 4.79, p < .001), demonstrating that the proposed EPI-based automated method can even outperform the manual selection ( Figure 4).

| DISCUSSION
One of the main challenges in fMRI of the human spinal cord is the occurrence of spatially varying signal loss due to local magnetic field inhomogeneities. Here, we addressed this issue by employing the technique of slice-specific z-shimming. First, we aimed to replicate the results from the initial study on z-shimming in the spinal cord by investigating whether slice-specific z-shims mitigate signal loss in spinal cord GE-EPI data. Next, we probed the direct relevance of zshimming to studies measuring spinal cord activity with fMRI, by investigating its benefits with respect to different TEs, gray-matter signals and EPI time-series metrics. Most importantly, we aimed to improve upon the typical implementation of slice-specific z-shimming (user-dependent shim selection) by developing two automated approaches: one based on data from an EPI reference-scan and one based on data from a FM acquisition.
F I G U R E 4 Validation of EPI-based automation on an independent data-set. The mid-sagittal EPI sections on the left consist of the groupaverage reconstructed z-shim reference scan EPI data in template space for the three different conditions (note that "EPI reconstruction" was carried out via creating a single volume for each participant from the corresponding 15-volume z-shim reference scan by selecting for each slice the volume in which the z-shim moment maximized the signal intensity; for no z-shim reconstruction, the 8th volume of the z-shim reference scan was selected, which corresponds to the central/neutral z-shim moment, as this acquisition had a range of 15 moments). The line plots in the middle depict the group-averaged spinal cord signal intensity (obtained from the reconstructed z-shim reference-scan EPIs) along the rostrocaudal axis of the cord for the different conditions. The solid lines depict the group-mean values and the shaded areas depict the standard error of the mean. The box plots on the right show the group-mean spinal cord signal intensity averaged over the entire slice-stack. The median values are denoted by the central marks and the bottom and top edges of the boxes represent the 25% and 75%, respectively. The whiskers encompass approximately 99% of the data and outliers are represented by red dots. The gray lines indicate the participant-specific data (N = 113) upon which the box-plots are based 4.1 | Replication and extension of z-shim effects The first demonstration of the benefits obtainable with slice-specific z-shimming in T2*-weighted imaging of the human spinal cord was provided by Finsterbusch et al. (2012), who developed a z-shim protocol tailored to the peculiarities of spinal cord imaging and assessed its effects on single volume GE-EPI data. Here, our first aim was to provide a direct replication of their results in a larger cohort of participants (N = 48) on a different MR-system. Similar to Finsterbusch and colleagues, we observed that z-shimming led to a significant and meaningful increase of average signal intensity (15%) and decrease of signal intensity variation over slices (68%) compared to the baseline of no z-shimming. In order to provide some detail on the expected benefits afforded by this method, we also performed a slice-by-slice characterization: while in $20% of the slices no z-shimming was needed, in the rest of the slices the application of a slice-specific z-shim resulted in a significant signal increase which could be as large as $200% in the most extreme cases. Comparing these effects to those obtained with slice-specific z-shimming in the brain (Deichmann et al., 2003;Volz et al., 2019;Weiskopf et al., 2006)-where zshimming is critically important for signal recovery in susceptibilityprone regions such as the orbitofrontal cortexit becomes clear that they are at least as prominent in the spinal cord and their compensation is thus critical in spinal cord fMRI.
The above-discussed results were obtained with a TE of 40 ms in order to be close to the estimated T2* in the gray matter of the cervical spinal cord at 3 T (41 ms; Barry et al., 2019) and the TE considered by Finsterbusch and colleagues (44 ms). Similar to Finsterbusch et al. (2012), we however also investigated the effect of z-shimming over different TEs (30 ms, 40 ms, 50 ms), though now quantitatively and at the group-level. We observed that the beneficial effect of z-shimming was present to a similar degree across TEs, which is of direct relevance for fMRI. Longer TEs may be hard to avoid when covering lower cord sections due to the larger field of view required to avoid aliasing, in particular as higher in-plane acceleration factors may not be reasonable for the standard receive coils available. Conversely, shorter TEs might be desirable with respect to increasing the temporal resolution or optimizing BOLD sensitivity (Gati et al., 1997;Menon et al., 1993). In this respect, the consistent effect across TEs bodes well for using this technique flexibly in various settings.
In There is indeed the possibility that z-shim effects might be rather negligible for the spinal cord gray matter, considering that field variations are most pronounced at the edge of the cord (Cooke 2004, Cohen-Adad, 2017, which largely consists of white matter. With the recent availability of probabilistic gray matter maps via SCT (De Leener et al., 2018), we were in a position to address this question in this study. We observed that the beneficial effects of zshimming were highly significant and of appreciable magnitude in the gray matter. While these effects were already prominent in the ventral horns (11% increase), they were much stronger in the dorsal horns (18% increase) where signal losses were more severe (see also Cooke et al., 2004). Together, these results demonstrate the relevance of zshimming for spinal fMRI and highlight its necessity specifically in studies of dorsal horn function, such as somatosensation and nociception. It should be mentioned though that it is currently unclear whether such a pattern will also hold outside of the cervical spinal cord, that is, in thoracic and lumbar segments (see e.g., Finsterbusch, 2014). It is also important to note that in the current study, we aimed to optimize the signal in the entire spinal cord cross-section, but one might also consider optimizing the z-shim moments based on a gray matter region of interest. However, this approach would be more time-consuming (and might require user intervention), as for obtaining the gray matter masks it is necessary to first register the participant's native-space data to template space and then warp the probabilistic gray-matter masks back to native space.
Such a two-step approach is necessary since with the current spatial resolution and signal quality of EPI data at 3 T it is not possible to automatically segment the gray matter robustly in every slice of every participant (in our experience, this also holds for T2*-weighted ME-GRE protocols in lower cervical segments).
The improvement in signal intensity we have discussed so far might in the worst case not directly translate into improved fMRI data quality (as indexed e.g., by tSNR): this might for example happen if physiological noise dominates the time-series or if participants move strongly in the z-direction and thus render the chosen z-shim moment for a slice incorrect. We therefore quantified the beneficial effect of z-shimming on gray matter tSNR, by acquiring time-series data under different z-shimming conditions, and observed a 12% increase in the mean tSNR and a 26% decrease of tSNR variability over slices. It is important to note that as none of the data analysed here were high-pass filtered or corrected for the presence of physiological noise (Brooks et al., 2008), it is likely that the absolute tSNR values observed (range across participants for manually z-shimmed data in template space: 11.4-18.1) represent a worst case. By acquiring z-shim reference scans at the beginning and at the end of our experiment (separated by $60 min), we were furthermore able to demonstrate that the effect of z-shimming was sufficiently stable across time, which is another important consideration for fMRI studies, as they usually require long scanning sessions. It should be mentioned however that participant-movement in the slice direction may reduce the performance of the z-shim compensation, although we deem this unlikely to happen frequently, considering the slice thick-

| Automation of z-shimming
While the above-mentioned results demonstrate the utility of zshimming for fMRI of the spinal cord, this approach requires detailed manual intervention in order to select the slice-specific z-shim moments. In order to overcome this drawback, in this study, we developed two different automated methods for the selection of the z-shim moments. Although such approaches have been developed for fMRI of the brain (Marshall et al., 2009;Tang & Huang, 2011;Volz et al., 2019; The first automated method is based on the acquisition of an EPI z-shim reference scanwhich is also employed for the manual selectionand relies on finding the z-shim moment that leads to the highest spinal cord EPI signal in each slice. This simple method achieved an at least identical performance in terms of all the investigated signal characteristics compared to the manual z-shim moment selection. In addition to that, the EPI-based approach was much faster compared to the manual selection: the calculations were completed in 15 s on average, whereas the manual selection took approximately 10 min for our set-up (24 slices and 21 z-shim values). The acquisition time of the z-shim reference scan was 55 s, but this could be shortened by limiting the range of the acquired z-shim moments. With the current set-up, we observed that the range of the acquired z-shim moments could indeed be restricted to achieve shorter acquisition times, as the most extreme z-shim moments were chosen quite rarely (lower-most moment of z-shim range chosen only in 1% of the slices, upper-most moment never). A drawback of the EPI-based approach is that it does not provide the flexibility to obtain slice-specific x-and yshim settings at the same time in order to account for field gradients in the read and phase direction simultaneously (Volz et al., 2019). To obtain those, additional reference scans would be necessary and thus prolong the scan-time significantly . This drawback could be overcome by basing the slice-specific shim selection on a FM, which would allow for estimating x-, y-, and z-shims for each slice simultaneouslyas already suggested by Finsterbusch et al.
(2012) and Islam et al. (2019)-and this was the second approach we employed.
The FM-based approach was motivated by the fact that the source of the signal drop-outs which slice-specific z-shimming aims to compensate are B 0 field inhomogeneities in the slice direction. The optimum z-shim value should thus be derivable from a FM, and we therefore fit a spatially linear gradient field in the slice direction to the measured FM data in order to estimate the gradient moment that will compensate the local through-slice field inhomogeneity. This FMbased approach provided highly significant benefits when compared to no z-shimming in terms of all the investigated signal characteristics. an improvement with this approach is observed, it still does not perform as well as the approach based on the reference scan which may have several reasons. On one hand, the relative intensities of the voxels as relevant in the EPI images are not appropriately considered. On the other hand, while the EPI-based approach is based on the same pulse sequence and has identical acquisition parameters as the target data (i.e., it exactly reflects the signal intensity achieved with the fMRI protocol), the FM-based approach is based on a different pulse sequence that is less prone to artifacts, but comes with a different voxel size as well as image orientation and position. These data could thus theoretically be expected to have a better quality and be more accurate, but may be less consistent with the EPI data (e.g., in terms of effects arising from in-plane field gradients) (Deichmann et al., 2002;Weiskopf, Hutton, Josephs, et al., 2007) or slice thickness/profile modifications due to field inhomogeneities (Epstein & Magland, 2006) and most importantly are not determined from the EPI signal intensity.
It is also important to note that there are several ways of calculating the optimal z-shim moments from FM data and other approaches have for example taken the route of directly optimizing BOLD sensitivity in the brain based on EPI BOLD contrast models (e.g., Balteau et al., 2010;Volz et al., 2019). In the spinal cord, Islam et al. (2019) recently proposed an FM-based automated z-shim selection method for simultaneous brain and spinal fMRI. However, their implementation was aimed at compensating spatially broader field variations, as they fit a quadratic field term using voxels from slices that were ± 4 cm distant from the target slice (which would cover 16 EPI slices in our case). In our study, we aimed to compensate for more localized field variations along the superiorinferior axis of the cord and therefore only included voxels from slices that were ± 4 mm distant from the target slice. While comparing the performance of our approach directly to these approaches is beyond the scope of the current work, with the open availability of our code and data, this should be possible for the interested reader.

| Validation of EPI-based z-shim automation
Finally, we demonstrated the validity of our EPI-based automation approach in an independently acquired large-scale cortico-spinal dataset (N > 100; Oliva et al., 2022). In this case, the automated approach exceeded the performance of manual selection (though we were not able to test this performance advantage in a further independent data-set). Such a pattern of results might be expected for studies where manual z-shim selection has to be performed rather fast due to time constraints (such as in the validation dataset, where a pharmacological challenge of the opioidergic and noradrenergic systems took place)in our methodologically oriented study, particular emphasis was placed on the manual z-shim selection being as precise as possible, thus making the advantage of the automated approach possibly less apparent. This also hints at the potential of this approach to make z-shim selection more reliable and homogenous in complex studies where personnel might vary (e.g., in longitudinal or multi-site projects) and thus have different levels of experience that could detrimentally influence manual z-shim selection. Finally, since the cortico-spinal dataset naturally suffered from more severe signal drop-outs and acquisition artefacts such as ghosting (e.g., due to the large acquisition volume), the performance of the EPI-based automation approach demonstrates the robustness of this method with regards to varying levels of data quality.

| Limitations
We would also like to point out several limitations of the presented work. First, the slice-wise z-shim approach is only applicable to axially acquired single-shot GE-EPI data. While this type of acquisition is used by numerous groups when studying somatomotor (e.g., Kinany et al., 2019;Maieron et al., 2007;Vahdat et al., 2015;Weber et al., 2016), somatosensory (Brooks et al. 2008;Tinnermann et al., 2017;Weber et al., 2020;Oliva et al., 2022) or resting-state spinal cord responses (Kong et al., 2014;San Emeterio Nateras, 2016;Kinany et al., 2020), there is also a strong tradition of using spin-echo approaches (for reviews, see e.g., Stroman, 2005 andPowers et al., 2018) and a more recent development in using multi-shot acquisitions (e.g., Barry et al., 2014;Barry et al., 2021; note that while the use of short TEs makes these acquisitions less affected by signal-dropout, in principle z-shimming might also be beneficial here  Weiskopf et al., 2005;Poser et al., 2006), we cannot make direct extrapolations from the here-observed beneficial effects of zshimming on tSNR to similar effects on task-based BOLD responses. In future methodological studies, it would thus be interesting to also acquire task-based spinal fMRI data under different z-shimming conditions to demonstrate the effect of z-shimming on the detection of BOLD responseswhile this has been demonstrated in brain fMRI studies (Du et al., 2007;Gu et al., 2002), such evidence is currently lacking for the spinal cord (for a first step in this direction, see Islam et al., 2019). Third, the FM-based approach could be optimized for example, by improving FM quality to a degree where an automated segmentation of the magnitude image is possible (thus precluding any possible mismatch between the FM and the T2-weighted image that is used for spinal cord identification) and increasing the spatial resolution of the FM (currently limited at $2 mm in x-direction) in order to allow for full xyz-shimming (see also Islam et al., 2019).

| CONCLUSIONS
Spinal cord fMRI suffers from magnetic field inhomogeneities that negatively affect data quality, particularly via signal loss. In the current study, we extensively characterized the performance of slicespecific z-shimming in mitigating the effects of these inhomogeneities and developed two automated slice-specific z-shim approaches.
We believe that our automated approaches will be beneficial for future spinal cord fMRI studies since they (i) are less time-consuming than the traditional approach, (ii) do not require extensive experience in judging data quality, and (iii) are expected to increase reproducibility by eliminating the subjective component in the z-shim selection processes. This latter point is particularly important for longitudinal fMRI studies as they could be envisioned in the clinical setting where disease progression and treatment effects could be monitored.