Skip to content

Commit ad664ae

Browse files
tsaloJoshua Teves
andauthored
[ENH] Generate BIDS Derivatives-compatible outputs (#691)
* BIDS Derivatives-compatible outputs. * Switches to class-based output generator for file naming Co-authored-by: Taylor Salo <[email protected]> Co-authored-by: Joshua Teves <[email protected]>
1 parent 8477aab commit ad664ae

27 files changed

+1667
-847
lines changed

.gitignore

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -108,4 +108,7 @@ ENV/
108108

109109
# jupyter notebooks
110110
.ipynb_checkpoints/
111-
*.ipynb
111+
*.ipynb
112+
113+
# vim swap files
114+
*.swp

docs/api.rst

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -161,7 +161,6 @@ API
161161
tedana.io.add_decomp_prefix
162162
tedana.io.split_ts
163163
tedana.io.write_split_ts
164-
tedana.io.writefeats
165164
tedana.io.writeresults
166165
tedana.io.writeresults_echoes
167166

docs/approach.rst

Lines changed: 8 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -69,7 +69,7 @@ estimate voxel-wise :math:`T_{2}^*` and :math:`S_0`.
6969
:math:`S_0` corresponds to the total signal in each voxel before decay and can reflect coil sensivity.
7070
:math:`T_{2}^*` corresponds to the rate at which a voxel decays over time, which
7171
is related to signal dropout and BOLD sensitivity.
72-
Estimates of the parameters are saved as **t2sv.nii.gz** and **s0v.nii.gz**.
72+
Estimates of the parameters are saved as **T2starmap.nii.gz** and **S0map.nii.gz**.
7373

7474
While :math:`T_{2}^*` and :math:`S_0` in fact fluctuate over time, estimating
7575
them on a volume-by-volume basis with only a small number of echoes is not
@@ -153,7 +153,7 @@ between the distributions for other echoes.
153153

154154
The time series for the optimally combined data also looks like a combination
155155
of the other echoes (which it is).
156-
This optimally combined data is written out as **ts_OC.nii.gz**
156+
This optimally combined data is written out as **desc-optcom_bold.nii.gz**
157157

158158
.. image:: /_static/a10_optimal_combination_timeseries.png
159159

@@ -194,7 +194,7 @@ component analysis (PCA).
194194
The goal of this step is to make it easier for the later ICA decomposition to converge.
195195
Dimensionality reduction is a common step prior to ICA.
196196
TEDPCA applies PCA to the optimally combined data in order to decompose it into component maps and
197-
time series (saved as **mepca_mix.1D**).
197+
time series (saved as **desc-PCA_mixing.tsv**).
198198
Here we can see time series for some example components (we don't really care about the maps):
199199

200200
.. image:: /_static/a11_pca_component_timeseries.png
@@ -277,9 +277,9 @@ Next, ``tedana`` applies TE-dependent independent component analysis (ICA) in
277277
order to identify and remove TE-independent (i.e., non-BOLD noise) components.
278278
The dimensionally reduced optimally combined data are first subjected to ICA in
279279
order to fit a mixing matrix to the whitened data.
280-
This generates a number of independent timeseries (saved as **meica_mix.1D**),
281-
as well as beta maps which show the spatial loading of these components on the
282-
brain (**betas_OC.nii.gz**).
280+
This generates a number of independent timeseries (saved as **desc-ICA_mixing.tsv**),
281+
as well as parameter estimate maps which show the spatial loading of these components on the
282+
brain (**desc-ICA_components.nii.gz**).
283283

284284
.. image:: /_static/a13_ica_component_timeseries.png
285285

@@ -312,13 +312,13 @@ The blue and red lines show the predicted values for the :math:`S_0` and
312312
A decision tree is applied to :math:`\kappa`, :math:`\rho`, and other metrics in order to
313313
classify ICA components as TE-dependent (BOLD signal), TE-independent
314314
(non-BOLD noise), or neither (to be ignored).
315-
These classifications are saved in `comp_table_ica.txt`.
315+
These classifications are saved in **desc-tedana_metrics.tsv**.
316316
The actual decision tree is dependent on the component selection algorithm employed.
317317
``tedana`` includes the option `kundu` (which uses hardcoded thresholds
318318
applied to each of the metrics).
319319

320320
Components that are classified as noise are projected out of the optimally combined data,
321-
yielding a denoised timeseries, which is saved as `dn_ts_OC.nii.gz`.
321+
yielding a denoised timeseries, which is saved as **desc-optcomDenoised_bold.nii.gz**.
322322

323323
.. image:: /_static/a15_denoised_data_timeseries.png
324324

docs/outputs.rst

Lines changed: 118 additions & 100 deletions
Large diffs are not rendered by default.

tedana/decomposition/pca.py

Lines changed: 69 additions & 40 deletions
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,6 @@
22
PCA and related signal decomposition methods for tedana
33
"""
44
import logging
5-
import os.path as op
65
from numbers import Number
76

87
import numpy as np
@@ -48,8 +47,8 @@ def low_mem_pca(data):
4847

4948

5049
def tedpca(data_cat, data_oc, combmode, mask, adaptive_mask, t2sG,
51-
ref_img, tes, algorithm='mdl', kdaw=10., rdaw=1.,
52-
out_dir='.', verbose=False, low_mem=False):
50+
io_generator, tes, algorithm='mdl', kdaw=10., rdaw=1.,
51+
verbose=False, low_mem=False):
5352
"""
5453
Use principal components analysis (PCA) to identify and remove thermal
5554
noise from multi-echo data.
@@ -73,8 +72,8 @@ def tedpca(data_cat, data_oc, combmode, mask, adaptive_mask, t2sG,
7372
For more information on thresholding, see `make_adaptive_mask`.
7473
t2sG : (S,) array_like
7574
Map of voxel-wise T2* estimates.
76-
ref_img : :obj:`str` or img_like
77-
Reference image to dictate how outputs are saved to disk
75+
io_generator : :obj:`tedana.io.OutputGenerator`
76+
The output generation object for this workflow
7877
tes : :obj:`list`
7978
List of echo times associated with `data_cat`, in milliseconds
8079
algorithm : {'kundu', 'kundu-stabilize', 'mdl', 'aic', 'kic', float}, optional
@@ -91,8 +90,6 @@ def tedpca(data_cat, data_oc, combmode, mask, adaptive_mask, t2sG,
9190
rdaw : :obj:`float`, optional
9291
Dimensionality augmentation weight for Rho calculations. Must be a
9392
non-negative float, or -1 (a special value). Default is 1.
94-
out_dir : :obj:`str`, optional
95-
Output directory.
9693
verbose : :obj:`bool`, optional
9794
Whether to output files from fitmodels_direct or not. Default: False
9895
low_mem : :obj:`bool`, optional
@@ -147,19 +144,20 @@ def tedpca(data_cat, data_oc, combmode, mask, adaptive_mask, t2sG,
147144
Outputs:
148145
149146
This function writes out several files:
150-
151-
====================== =================================================
152-
Filename Content
153-
====================== =================================================
154-
pca_decomposition.json PCA component table.
155-
pca_mixing.tsv PCA mixing matrix.
156-
pca_components.nii.gz Component weight maps.
157-
====================== =================================================
147+
=========================== =============================================
148+
Default Filename Content
149+
=========================== =============================================
150+
desc-PCA_decomposition.json PCA component table
151+
desc-PCA_mixing.tsv PCA mixing matrix
152+
desc-PCA_components.nii.gz Component weight maps
153+
=========================== =============================================
158154
159155
See Also
160156
--------
161-
:func:`tedana.utils.make_adaptive_mask` : The function used to create the ``adaptive_mask``
162-
parameter.
157+
:func:`tedana.utils.make_adaptive_mask` : The function used to create
158+
the ``adaptive_mask` parameter.
159+
:module:`tedana.constants` : The module describing the filenames for
160+
various naming conventions
163161
"""
164162
if algorithm == 'kundu':
165163
alg_str = ("followed by the Kundu component selection decision "
@@ -204,8 +202,8 @@ def tedpca(data_cat, data_oc, combmode, mask, adaptive_mask, t2sG,
204202
data_z = (data_z - data_z.mean()) / data_z.std() # var normalize everything
205203

206204
if algorithm in ['mdl', 'aic', 'kic']:
207-
data_img = io.new_nii_like(ref_img, utils.unmask(data, mask))
208-
mask_img = io.new_nii_like(ref_img, mask.astype(int))
205+
data_img = io.new_nii_like(io_generator.reference_img, utils.unmask(data, mask))
206+
mask_img = io.new_nii_like(io_generator.reference_img, mask.astype(int))
209207
voxel_comp_weights, varex, varex_norm, comp_ts = ma_pca(
210208
data_img, mask_img, algorithm, normalize=True)
211209
elif isinstance(algorithm, Number):
@@ -231,10 +229,11 @@ def tedpca(data_cat, data_oc, combmode, mask, adaptive_mask, t2sG,
231229
# Compute Kappa and Rho for PCA comps
232230
# Normalize each component's time series
233231
vTmixN = stats.zscore(comp_ts, axis=0)
234-
comptable, _, _, _ = metrics.dependence_metrics(
235-
data_cat, data_oc, comp_ts, adaptive_mask, tes, ref_img,
236-
reindex=False, mmixN=vTmixN, algorithm=None,
237-
label='mepca_', out_dir=out_dir, verbose=verbose)
232+
comptable, _, metric_metadata, _, _ = metrics.dependence_metrics(
233+
data_cat, data_oc, comp_ts, adaptive_mask, tes, io_generator,
234+
reindex=False, mmixN=vTmixN, algorithm=None,
235+
label='PCA', verbose=verbose
236+
)
238237

239238
# varex_norm from PCA overrides varex_norm from dependence_metrics,
240239
# but we retain the original
@@ -243,38 +242,68 @@ def tedpca(data_cat, data_oc, combmode, mask, adaptive_mask, t2sG,
243242
comptable['normalized variance explained'] = varex_norm
244243

245244
# write component maps to 4D image
246-
comp_ts_z = stats.zscore(comp_ts, axis=0)
247-
comp_maps = utils.unmask(computefeats2(data_oc, comp_ts_z, mask), mask)
248-
io.filewrite(comp_maps, op.join(out_dir, 'pca_components.nii.gz'), ref_img)
245+
comp_maps = utils.unmask(computefeats2(data_oc, comp_ts, mask), mask)
246+
io_generator.save_file(comp_maps, 'z-scored PCA components img')
249247

250248
# Select components using decision tree
251249
if algorithm == 'kundu':
252-
comptable = kundu_tedpca(comptable, n_echos, kdaw, rdaw, stabilize=False)
250+
comptable, metric_metadata = kundu_tedpca(
251+
comptable,
252+
metric_metadata,
253+
n_echos,
254+
kdaw,
255+
rdaw,
256+
stabilize=False,
257+
)
253258
elif algorithm == 'kundu-stabilize':
254-
comptable = kundu_tedpca(comptable, n_echos, kdaw, rdaw, stabilize=True)
259+
comptable, metric_metadata = kundu_tedpca(
260+
comptable,
261+
metric_metadata,
262+
n_echos,
263+
kdaw,
264+
rdaw,
265+
stabilize=True,
266+
)
255267
else:
256268
alg_str = "variance explained-based" if isinstance(algorithm, Number) else algorithm
257269
LGR.info('Selected {0} components with {1} dimensionality '
258270
'detection'.format(comptable.shape[0], alg_str))
259271
comptable['classification'] = 'accepted'
260272
comptable['rationale'] = ''
261273

262-
# Save decomposition
274+
# Save decomposition files
263275
comp_names = [io.add_decomp_prefix(comp, prefix='pca', max_value=comptable.index.max())
264276
for comp in comptable.index.values]
265277

266278
mixing_df = pd.DataFrame(data=comp_ts, columns=comp_names)
267-
mixing_df.to_csv(op.join(out_dir, 'pca_mixing.tsv'), sep='\t', index=False)
268-
269-
comptable['Description'] = 'PCA fit to optimally combined data.'
270-
mmix_dict = {}
271-
mmix_dict['Method'] = ('Principal components analysis implemented by '
272-
'sklearn. Components are sorted by variance '
273-
'explained in descending order. '
274-
'Component signs are flipped to best match the '
275-
'data.')
276-
io.save_comptable(comptable, op.join(out_dir, 'pca_decomposition.json'),
277-
label='pca', metadata=mmix_dict)
279+
io_generator.save_file(mixing_df, "PCA mixing tsv")
280+
281+
# Save component table and associated json
282+
temp_comptable = comptable.set_index("Component", inplace=False)
283+
io_generator.save_file(temp_comptable, "PCA metrics tsv")
284+
285+
metric_metadata["Component"] = {
286+
"LongName": "Component identifier",
287+
"Description": (
288+
"The unique identifier of each component. "
289+
"This identifier matches column names in the mixing matrix TSV file."
290+
),
291+
}
292+
io_generator.save_file(metric_metadata, "PCA metrics json")
293+
294+
decomp_metadata = {
295+
"Method": (
296+
"Principal components analysis implemented by sklearn. "
297+
"Components are sorted by variance explained in descending order. "
298+
"Component signs are flipped to best match the data."
299+
),
300+
}
301+
for comp_name in comp_names:
302+
decomp_metadata[comp_name] = {
303+
"Description": "PCA fit to optimally combined data.",
304+
"Method": "tedana",
305+
}
306+
io_generator.save_file(decomp_metadata, "PCA decomposition json")
278307

279308
acc = comptable[comptable.classification == 'accepted'].index.values
280309
n_components = acc.size

tedana/gscontrol.py

Lines changed: 21 additions & 23 deletions
Original file line numberDiff line numberDiff line change
@@ -2,21 +2,21 @@
22
Global signal control methods
33
"""
44
import logging
5-
import os.path as op
65

76
import numpy as np
7+
import pandas as pd
88
from scipy import stats
99
from scipy.special import lpmv
1010

11-
from tedana import io, utils
11+
from tedana import utils
1212
from tedana.due import due, Doi
1313

1414
LGR = logging.getLogger(__name__)
1515
RepLGR = logging.getLogger("REPORT")
1616
RefLGR = logging.getLogger("REFERENCES")
1717

1818

19-
def gscontrol_raw(catd, optcom, n_echos, ref_img, out_dir='.', dtrank=4):
19+
def gscontrol_raw(catd, optcom, n_echos, io_generator, dtrank=4):
2020
"""
2121
Removes global signal from individual echo `catd` and `optcom` time series
2222
@@ -34,10 +34,8 @@ def gscontrol_raw(catd, optcom, n_echos, ref_img, out_dir='.', dtrank=4):
3434
Optimally combined functional data (i.e., the output of `make_optcom`)
3535
n_echos : :obj:`int`
3636
Number of echos in data. Should be the same as `E` dimension of `catd`
37-
ref_img : :obj:`str` or img_like
38-
Reference image to dictate how outputs are saved to disk
39-
out_dir : :obj:`str`, optional
40-
Output directory.
37+
io_generator : :obj:`tedana.io.OutputGenerator`
38+
The output generator for this workflow
4139
dtrank : :obj:`int`, optional
4240
Specifies degree of Legendre polynomial basis function for estimating
4341
spatial global signal. Default: 4
@@ -78,23 +76,25 @@ def gscontrol_raw(catd, optcom, n_echos, ref_img, out_dir='.', dtrank=4):
7876
detr = dat - np.dot(sol.T, Lmix.T)[0]
7977
sphis = (detr).min(axis=1)
8078
sphis -= sphis.mean()
81-
io.filewrite(utils.unmask(sphis, Gmask), op.join(out_dir, 'T1gs'), ref_img)
79+
io_generator.save_file(utils.unmask(sphis, Gmask), "gs img")
8280

8381
# find time course ofc the spatial global signal
8482
# make basis with the Legendre basis
8583
glsig = np.linalg.lstsq(np.atleast_2d(sphis).T, dat, rcond=None)[0]
8684
glsig = stats.zscore(glsig, axis=None)
87-
np.savetxt(op.join(out_dir, 'glsig.1D'), glsig)
85+
86+
glsig_df = pd.DataFrame(data=glsig.T, columns=['global_signal'])
87+
io_generator.save_file(glsig_df, "global signal time series tsv")
8888
glbase = np.hstack([Lmix, glsig.T])
8989

9090
# Project global signal out of optimally combined data
9191
sol = np.linalg.lstsq(np.atleast_2d(glbase), dat.T, rcond=None)[0]
9292
tsoc_nogs = dat - np.dot(np.atleast_2d(sol[dtrank]).T,
9393
np.atleast_2d(glbase.T[dtrank])) + Gmu[Gmask][:, np.newaxis]
9494

95-
io.filewrite(optcom, op.join(out_dir, 'tsoc_orig'), ref_img)
95+
io_generator.save_file(optcom, "has gs combined img")
9696
dm_optcom = utils.unmask(tsoc_nogs, Gmask)
97-
io.filewrite(dm_optcom, op.join(out_dir, 'tsoc_nogs'), ref_img)
97+
io_generator.save_file(dm_optcom, "removed gs combined img")
9898

9999
# Project glbase out of each echo
100100
dm_catd = catd.copy() # don't overwrite catd
@@ -111,7 +111,7 @@ def gscontrol_raw(catd, optcom, n_echos, ref_img, out_dir='.', dtrank=4):
111111
@due.dcite(Doi("10.1073/pnas.1301725110"),
112112
description="Minimum image regression to remove T1-like effects "
113113
"from the denoised data.")
114-
def minimum_image_regression(optcom_ts, mmix, mask, comptable, ref_img, out_dir="."):
114+
def minimum_image_regression(optcom_ts, mmix, mask, comptable, io_generator):
115115
"""
116116
Perform minimum image regression (MIR) to remove T1-like effects from
117117
BOLD-like components.
@@ -131,10 +131,8 @@ def minimum_image_regression(optcom_ts, mmix, mask, comptable, ref_img, out_dir=
131131
comptable : (C x X) :obj:`pandas.DataFrame`
132132
Component metric table. One row for each component, with a column for
133133
each metric. The index should be the component number.
134-
ref_img : :obj:`str` or img_like
135-
Reference image to dictate how outputs are saved to disk
136-
out_dir : :obj:`str`, optional
137-
Output directory.
134+
io_generator : :obj:`tedana.io.OutputGenerator`
135+
The output generating object for this workflow
138136
139137
Notes
140138
-----
@@ -202,7 +200,7 @@ def minimum_image_regression(optcom_ts, mmix, mask, comptable, ref_img, out_dir=
202200
mehk_ts = np.dot(comp_pes[:, acc], mmix[:, acc].T)
203201
t1_map = mehk_ts.min(axis=-1) # map of T1-like effect
204202
t1_map -= t1_map.mean()
205-
io.filewrite(utils.unmask(t1_map, mask), op.join(out_dir, "sphis_hik"), ref_img)
203+
io_generator.save_file(utils.unmask(t1_map, mask), "t1 like img")
206204
t1_map = t1_map[:, np.newaxis]
207205

208206
# Find the global signal based on the T1-like effect
@@ -213,11 +211,11 @@ def minimum_image_regression(optcom_ts, mmix, mask, comptable, ref_img, out_dir=
213211
np.linalg.lstsq(glob_sig.T, mehk_ts.T, rcond=None)[0].T, glob_sig
214212
)
215213
hik_ts = mehk_noT1gs * optcom_std # rescale
216-
io.filewrite(utils.unmask(hik_ts, mask), op.join(out_dir, "hik_ts_OC_MIR"), ref_img)
214+
io_generator.save_file(utils.unmask(hik_ts, mask), "ICA accepted mir denoised img")
217215

218216
# Make denoised version of T1-corrected time series
219217
medn_ts = optcom_mean + ((mehk_noT1gs + resid) * optcom_std)
220-
io.filewrite(utils.unmask(medn_ts, mask), op.join(out_dir, "dn_ts_OC_MIR"), ref_img)
218+
io_generator.save_file(utils.unmask(medn_ts, mask), "mir denoised img")
221219

222220
# Orthogonalize mixing matrix w.r.t. T1-GS
223221
mmix_noT1gs = mmix.T - np.dot(
@@ -230,9 +228,9 @@ def minimum_image_regression(optcom_ts, mmix, mask, comptable, ref_img, out_dir=
230228

231229
# Write T1-corrected components and mixing matrix
232230
comp_pes_norm = np.linalg.lstsq(mmix_noT1gs_z.T, optcom_z.T, rcond=None)[0].T
233-
io.filewrite(
231+
io_generator.save_file(
234232
utils.unmask(comp_pes_norm[:, 2:], mask),
235-
op.join(out_dir, "betas_hik_OC_MIR"),
236-
ref_img,
233+
"ICA accepted mir component weights img",
237234
)
238-
np.savetxt(op.join(out_dir, "meica_mix_MIR.1D"), mmix_noT1gs)
235+
mixing_df = pd.DataFrame(data=mmix_noT1gs.T, columns=comptable["Component"].values)
236+
io_generator.save_file(mixing_df, "ICA MIR mixing tsv")

0 commit comments

Comments
 (0)