Review and update part 1 materials #10
…ows, declarative outputs
### Transforming each value with the `map` operator

Sometimes you will need to modify the values within a channel in a predictable way. For example, you might need to extract part of a file name, or take a numeric value and apply a mathematical operation to it. In these situations, you will likely want to use the `map()` operator, which takes a channel and applies a **closure** to every element in it.

An in-depth discussion of closures is outside the scope of this workshop, but briefly, they are blocks of code, similar to functions, that can be passed to some Nextflow operators to control how they behave. With the `map()` operator, the closure defines the exact operation that should be performed on each element of the channel. A closure is defined within curly braces like so:

```groovy
{ x ->
    // Do something with x
}
```

At the start of the closure definition, we declare the name of a variable that will represent each element of our channel. Here we have simply called it `x`, but it could be named anything. Next, we write an arrow operator `->`; this signifies that the value on the left (`x`) will be processed by the code on the right. Finally, we write some code to do something with our variable. For example, we can square integer values:

```groovy
{ x -> x ** 2 }
```

Or, if we have strings as inputs, we could reverse the strings:

```groovy
{ x -> x.reverse() }
```

To use the `map()` operator with a channel, we simply do:

```groovy
ch.map { x -> ... }
```

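To see what a closure does to each element, the two closures above can be tried in plain Groovy, outside of Nextflow. This is only a sketch: the variable names `square` and `reverseText` are hypothetical, and Groovy's `collect()` method on a list is used as a stand-in for what `map()` does to a channel.

```groovy
// Assign each closure to a variable so it can be called like a function
def square = { x -> x ** 2 }
def reverseText = { x -> x.reverse() }

// Call the closures directly on single values
println square(4)             // prints 16
println reverseText('Hello')  // prints olleH

// Apply a closure to every element of a list; collect() is used here as a
// stand-in for what map() does to each element of a channel
println([1, 2, 3].collect(square))                  // prints [1, 4, 9]
println(['Hello', 'Bonjour'].collect(reverseText))  // prints [olleH, ruojnoB]
```

In a workflow, the same closures would be passed to `map()` in exactly the same form, with the channel supplying the elements one at a time.
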
!!!question "Exercise"

    Use the `map()` operator to reverse the greeting strings.

    ???Solution

        ```groovy title="hello-world.nf" hl_lines="7"
        workflow {

            // Create a channel for inputs
            greetings_array = [ 'Hello world!', 'Bonjour le monde!', 'Holà mundo' ]
            greeting_ch = Channel.of(greetings_array)
                .flatten()
                .map { x -> x.reverse() }

            // Emit a greeting
            SAYHELLO(greeting_ch)
        }
        ```

I got a bit too into the weeds here. Probably just remove this.
## A note about multiple inputs

The input block can be used to define multiple inputs to the process. Importantly, the number of inputs passed to the process call within the workflow must match the number of inputs defined in the process. For example:

```groovy title="example.nf"
process MYFUNCTION {
    debug true

    input:
    val input_1
    val input_2

    output:
    stdout

    script:
    """
    echo $input_1 $input_2
    """
}

workflow {
    MYFUNCTION('Hello', 'World!')
}
```

Another important aspect of multiple inputs is that, when working with **queue channels**, they can produce **non-deterministic** results. A process executes as soon as a value is available on every input channel, so values are paired by arrival order, and that order is not guaranteed when the queues are filled by upstream processes. For that reason, multiple inputs are typically used either with value channels (single values that are reused for every task) or with the `each` qualifier, which runs the process once for every value in a collection or queue. For example:

```groovy
process cat_message {
    input:
    val greeting
    each noun

    output:
    path "message.txt"

    script:
    """
    echo '$greeting' '$noun' > message.txt
    """
}

workflow {
    ch1 = Channel.of(
        'Hello',
        'Bonjour'
    )
    ch2 = Channel.of(
        'world',
        'everyone'
    )

    cat_message(ch1, ch2)
}
```

This will produce the following four messages, one per task (possibly in a different order):

```
Bonjour everyone
Hello world
Hello everyone
Bonjour world
```

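In contrast, if we didn't use the `each` qualifier with the `noun` input, values from the two channels would be paired one-to-one and we would only get two output lines:

```
Hello world
Bonjour everyone
```

Here the pairing is predictable because `Channel.of` emits its values in the order given. When queue channels are instead filled by upstream processes, the order in which values arrive is not guaranteed, so the exact combinations we would get are uncertain.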
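
The value-channel case mentioned above can be sketched as follows. This is a minimal illustration rather than part of the workshop scripts: the process name `SAY_PAIR` and the channel names `greeting_ch` and `noun_ch` are hypothetical.

```groovy
// Hypothetical process for illustration; not part of the workshop scripts
process SAY_PAIR {
    debug true

    input:
    val greeting
    val noun

    output:
    stdout

    script:
    """
    echo $greeting $noun
    """
}

workflow {
    // Value channel: holds a single value that can be read any number of times
    greeting_ch = Channel.value('Hello')

    // Queue channel: each value is consumed exactly once
    noun_ch = Channel.of('world', 'everyone')

    // Runs once per noun, reusing the same greeting for every task
    SAY_PAIR(greeting_ch, noun_ch)
}
```

Under these assumptions the process should run twice, once with `Hello world` and once with `Hello everyone`. The set of combinations is fixed because only one input is a queue channel, although the order in which the two lines are printed may vary.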
I saw that multiple inputs was a question from last year's workshop, so it might be useful to keep this, but perhaps wrapped in an 'advanced material' block?
Minor changes, I'll update them
I have reviewed and updated the part 1 materials. For most of the .md files I have just updated the wording, as well as addressed Issue #3 to update section headers to use a more active voice.
I have updated both 03_hellonf.md and 04_output.md to address Issue #5 and clear up the distinction between processes and workflows. In 04_output.md I have clarified how outputs are declared but not guaranteed by the process output blocks.
I have also made significant changes to 05_inputs.md, mainly to address Issues #4 and #7. As part of this, I have merged the explanation of channels into this section; we can discuss whether it would be better to split it off into a separate section. As part of the changes to this section, I have:
`-ansi-log false` for clearer terminal logs

I think some of this may be a bit overkill, particularly the explanation of the `map()` operator and the multiple input channels, so happy to discuss this and potentially remove.