Review and update part 1 materials #10

Open: wants to merge 9 commits into base: main

mgeaghan
Contributor

I have reviewed and updated the part 1 materials. For most of the .md files I have just updated the wording, as well as addressed Issue #3 to update section headers to use a more active voice.

I have updated both 03_hellonf.md and 04_output.md to address Issue #5 and clear up the distinction between processes and workflows. In 04_output.md I have clarified how outputs are declared but not guaranteed by the process output blocks.

I have also made significant changes to 05_inputs.md, mainly to address Issues #4 and #7. As part of this, I have merged the explanation of channels into this section; we can discuss whether it would be better to split it off into a separate section. As part of the changes to this section, I have:

  • Expanded on the difference between value and queue channels
  • Expanded on channel factories
  • Shown how to use an input channel to feed multiple input values into a process
  • Touched on the use of -ansi-log false for clearer terminal logs
  • Shown how to use an array of values to create our input channel (taken from the Seqera training)
  • Shown how to use channel operators to manipulate the channel data
  • Addressed Issue #7 (Addressing publishdir as inputs) and mentioned that we should not use publishDir as an input to anything
  • Made a note about multiple input channels

I think some of this may be a bit overkill, particularly the explanation of the map() operator and the multiple input channels, so happy to discuss this and potentially remove.

mgeaghan requested a review from fredjaya on May 19, 2025, 01:03
Comment on lines +338 to +386
### Transforming each value with the `map` operator

Sometimes, you will need to modify the values within a channel in a predictable way. For example, you might need to get part of a file name, or perhaps you need to take a numeric value and apply a mathematical operation to it. In these situations, you will likely want to use the `map()` operator. This takes a channel and applies a **closure** to every element in the channel.

An in-depth discussion of closures is outside the scope of this workshop, but briefly, they are blocks of code, similar to functions, that can be passed to some Nextflow operators to control how they behave. With the `map()` operator, the closure defines the exact operation that should be performed on each element of the channel. A closure is defined within curly braces like so:

```groovy
{ x ->
    // Do something with x
}
```

At the start of the closure definition, we declare the name of a variable that will represent each element of our channel. Here we have simply called it `x`, but it could be named anything. Next, we write an arrow operator `->`; this signifies that the value on the left (`x`) will be processed by the code on the right. Finally, we write some code to do something with our variable. For example, we can square integer values:

```groovy
{ x -> x ** 2 }
```

Or, if we have strings as inputs, we could reverse the strings:

```groovy
{ x -> x.reverse() }
```
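
As a rough illustration outside of Nextflow, the two closures above can be tried in plain Groovy. This is only a sketch: the variable names `square` and `reverse` are made up for this example, and the list method `collect()` stands in for what `map()` does to a channel.

```groovy
// Closures can be stored in variables and called like functions.
// The names `square` and `reverse` are hypothetical, chosen for this sketch.
def square = { x -> x ** 2 }
def reverse = { x -> x.reverse() }

// Call each closure directly on a single value
println(square(4))                // prints 16
println(reverse('Hello world!'))  // prints !dlrow olleH

// Apply a closure to every element of a list; this is the plain-Groovy
// analogue of applying map() to every element of a channel
println([1, 2, 3].collect(square))  // prints [1, 4, 9]
```

The closure itself is written exactly as it would be for `map()`; only the thing it is applied to (a list rather than a channel) differs.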

To use the `map()` operator with a channel, we simply do:

```groovy
ch.map { x -> ... }
```

!!!question "Exercise"

Use the `map()` operator to reverse the greeting strings.

???Solution

```groovy title="hello-world.nf" hl_lines="7"
workflow {

    // Create a channel for inputs
    greetings_array = [ 'Hello world!', 'Bonjour le monde!', 'Holà mundo' ]
    greeting_ch = Channel.of(greetings_array)
        .flatten()
        .map { x -> x.reverse() }

    // Emit a greeting
    SAYHELLO(greeting_ch)
}
```
Contributor Author

I got a bit too into the weeds here. Probably just remove this.

Comment on lines +400 to +473
## A note about multiple inputs

The input block can be used to define multiple inputs to the process. Importantly, the number of inputs passed to the process call within the workflow must match the number of inputs defined in the process. For example:

```groovy title="example.nf"
process MYFUNCTION {
    debug true

    input:
    val input_1
    val input_2

    output:
    stdout

    script:
    """
    echo $input_1 $input_2
    """
}

workflow {
    MYFUNCTION('Hello', 'World!')
}
```

Another important aspect of multiple inputs is how they behave with **queue channels**. A process launches a task as soon as a value is available from every input channel, taking one value from each channel per task. If those channels are filled by upstream processes, the order in which values arrive can differ between runs, so the combinations the process sees can be **non-deterministic**. For that reason, multiple inputs are typically used either with value channels (a single value that is reused for every task) or with the `each` qualifier, which runs the process once for every value in a collection or queue. For example:

```groovy
process cat_message {
    input:
    val greeting
    each noun

    output:
    path "message.txt"

    script:
    """
    echo '$greeting' '$noun' > message.txt
    """
}

workflow {
    ch1 = Channel.of(
        'Hello',
        'Bonjour'
    )
    ch2 = Channel.of(
        'world',
        'everyone'
    )

    cat_message(ch1, ch2)
}
```

This will run four tasks, each writing one of the following four messages to its own `message.txt` (possibly in a different order):

```
Bonjour everyone
Hello world
Hello everyone
Bonjour world
```

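In contrast, if we didn't use the `each` qualifier with the `noun` input, each task would take one value from each channel, so we would only get two output lines, such as:

```
Hello world
Bonjour everyone
```

Channels created with `Channel.of()` emit their values in the order given, so here the values are paired by position. When the input channels are instead filled by upstream processes, values arrive as those tasks finish, and the exact combinations become uncertain.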
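
The difference between the two behaviours can be modelled in plain Groovy, outside of Nextflow. This is only an analogy under stated assumptions: ordinary lists stand in for the two channels, `combinations()` stands in for the `each` qualifier, and `transpose()` stands in for two plain queue inputs whose values arrive in a fixed order. The variable names are made up for this sketch.

```groovy
// Lists standing in for the two channels (an assumption of this sketch)
def greetings = ['Hello', 'Bonjour']
def nouns = ['world', 'everyone']

// `each`-like behaviour: every greeting is combined with every noun
def withEach = [greetings, nouns].combinations()
    .collect { g, n -> "$g $n".toString() }

// Plain queue-input behaviour, modelled as pairing values by position
def paired = [greetings, nouns].transpose()
    .collect { g, n -> "$g $n".toString() }

println(withEach.size())  // prints 4
println(paired)           // prints [Hello world, Bonjour everyone]
```

The sketch does not capture task scheduling: in a real run the tasks execute in parallel, so the order in which results appear can vary.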
Contributor Author

I saw that multiple inputs was a question from last year's workshop, so it might be useful to keep this, but perhaps wrapped in an 'advanced material' block?

Member

fredjaya left a comment

Minor changes, I'll update them
