Difference between revisions of "Command Line Exercises"
Revision as of 17:29, 12 April 2017
Command-line Exercises
Many of these are so-called throwaway one-liners.
In each of these examples we show the finished one-liner first, and then build it up step by step.
Extract columns from files held in different subfolders
We shall run a for-loop in the shell and, for each subfolder, execute an awk column-selecting command on a file called report.tsv.
Finished Command
for i in $(ls -d */); do S=$(echo ${i%/*}); export S; \
awk '{if(NR!=1){if($19 != ".") {system("./echon.sh $S"); printf "\t%s\t%s\t%s\t%s\t%s\n",$2,$15,$17,$19,$20}}}' \
${i}report.tsv;echo; \
done >a.xls
Get to know our target file
Our target file is a tab-separated list of data rows with quite a few columns. We're only interested in some of these columns, so we need to find their column numbers. We go inside one of the subfolders and use a combination of head, tr and cat (with its numbering option, -n) to see the number-label correspondence better:
head -1 report.tsv | tr "\t" "\n" | cat -n
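To try the header-numbering step without the real data, here is a minimal sketch on a small stand-in file. The column names (sample, gene, score) and the temporary directory are assumptions made up for illustration; the real report.tsv has many more columns.

```shell
# Build a tiny stand-in for report.tsv (hypothetical column names).
tmp=$(mktemp -d)
printf 'sample\tgene\tscore\n' >  "$tmp/report.tsv"
printf 's1\tBRCA1\t0.9\n'      >> "$tmp/report.tsv"

# Take the header line, put each label on its own line, then number the lines.
numbered=$(head -1 "$tmp/report.tsv" | tr '\t' '\n' | cat -n)
echo "$numbered"
```

Each label is printed next to its position, and that position is the number to use after the $ sign in awk.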
for-loop over the subfolders
The ls command lists only folders when it is given the */ argument together with the -d option; without -d it would list the contents of each folder instead. Let's check that it's really doing that:
for i in $(ls -d */); do echo $i;done | head -3
Because there are many subfolders, we use head to limit the output to just three lines.
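The listing step can be checked in a scratch directory. The sketch below uses made-up folder names (runA, runB, runC) and also shows a variant that loops over the */ glob directly, which is not what the finished command uses but avoids word-splitting problems if a folder name ever contains a space.

```shell
# Scratch area with three hypothetical subfolders and one plain file.
tmp=$(mktemp -d)
mkdir "$tmp/runA" "$tmp/runB" "$tmp/runC"
touch "$tmp/notes.txt"
cd "$tmp" || exit 1

# ls -d */ prints the folder names themselves, each with a trailing slash;
# the plain file notes.txt does not match */ and is left out.
listed=$(ls -d */)
echo "$listed"

# Variant: let the shell expand the glob, with no call to ls at all.
globbed=""
for i in */; do globbed="$globbed$i "; done
echo "$globbed"
```

Both forms yield the same three names here; the difference only shows up with unusual folder names.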
Because we are using awk and bash together, we can expect complications when passing variables from one into the other. For a start we definitely need the subfolder name in a bash variable:
for i in $(ls -d */); do S=$(echo ${i%/*}); echo $S; done |head -3
The output is almost the same, except that we have stripped the trailing slash from the variable using one of bash's parameter expansions, ${i%/*}, which looks better. So far S is known only to the current bash session. As we add on other commands, new "child" shell sessions will be launched and they will not know about this variable, so we need to export it so that awk, and the shell command it launches through system(), can see it later.
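The effect of the expansion and of export can be seen in isolation. This is a small sketch with a hypothetical folder name (sampleA/); the last line shows awk's -v option as an alternative way of handing a value to awk, which the finished command does not use.

```shell
i="sampleA/"      # hypothetical folder name, as printed by ls -d */
S=${i%/*}         # remove the shortest trailing match of "/*": here, the final slash
echo "$S"

# A child shell cannot see S until it has been exported.
before=$(sh -c 'echo "seen:$S"')
export S
after=$(sh -c 'echo "seen:$S"')
echo "$before"
echo "$after"

# awk's system() hands its string to a child shell, which expands $S
# from the exported environment.
from_awk=$(awk 'BEGIN { system("echo \"from awk: $S\"") }')
echo "$from_awk"

# Alternative (not used in the finished command): pass the value with -v.
via_v=$(awk -v s="$S" 'BEGIN { print "via -v: " s }')
echo "$via_v"
```

The single quotes around the awk program matter: they stop bash from expanding $S itself, so the expansion happens only in the shell started by system().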
Now for the awk part. We don't want the first line as this is a header with labels for each column.
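Skipping the header can be sketched on a three-column stand-in file. The column names and values are invented for illustration, and the -F'\t' option (split on tabs only) is an assumption added here; the finished command relies on awk's default whitespace splitting.

```shell
# Hypothetical miniature report.tsv: one header line and two data rows.
tmp=$(mktemp -d)
printf 'id\tname\tvalue\n' >  "$tmp/report.tsv"
printf '1\talpha\t.\n'     >> "$tmp/report.tsv"
printf '2\tbeta\t0.7\n'    >> "$tmp/report.tsv"

# NR is the number of the current input line, so NR != 1 skips the header.
# Rows whose third column is just "." are dropped as well.
rows=$(awk -F'\t' 'NR != 1 && $3 != "." { printf "%s\t%s\n", $2, $3 }' "$tmp/report.tsv")
echo "$rows"
```

The pattern NR != 1 && $3 != "." is equivalent to the nested if statements in the finished command, with column 3 standing in for column 19.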