Latest revision as of 17:44, 12 April 2017
Command-line Exercises
Many of these are so-called throwaway one-liners.
In each of these examples we show the finished one-liner first, and then build it up step by step.
Extract columns from files held in different subfolders
We shall run a for-loop in the shell and, for each subfolder, execute an awk column-selecting command on the file called report.tsv that it contains.
Finished Command
for i in $(ls -d */); do S=$(echo ${i%/*}); export S; \
awk '{if(NR!=1){if($19 != ".") {system("./echon.sh $S"); printf "\t%s\t%s\t%s\t%s\t%s\n",$2,$15,$17,$19,$20}}}' \
${i}report.tsv;echo; \
done >a.xls
Get to know our target file
Our target file is a tab-separated list of data rows, with quite a few columns. We're only interested in some of these columns, and we need to find out what their column numbers are. We go inside one of the subfolders and use a combination of head, tr and cat (with its numbering option) to see the number-label correspondence better:
head -1 report.tsv | tr "\t" "\n" | cat -n
In this way we find out that columns 2, 15, 17, 19 and 20 are of interest, and that we want to skip rows in which column 19 is a dot (this is the $19 != "." test in the finished command). Note that awk refers to column N as $N, so the column numbers need a dollar sign in front of them for awk to interpret them properly.
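The numbering step and awk's dollar-sign field syntax can be tried on a small file first. The following is only a sketch: the file sample.tsv, its three column labels and its values are invented for the illustration and have nothing to do with the real report.

```shell
# Build a tiny tab-separated sample file (name and contents are invented).
tmp=$(mktemp -d)
printf 'id\tgene\tscore\n1\tabc\t0.5\n2\tdef\t.\n' > "$tmp/sample.tsv"

# Number the column labels found in the header line.
labels=$(head -1 "$tmp/sample.tsv" | tr "\t" "\n" | cat -n)
echo "$labels"

# In awk, $2 and $3 are the second and third fields of each row.
picked=$(awk -F'\t' '{printf "%s\t%s\n", $2, $3}' "$tmp/sample.tsv")
echo "$picked"

rm -r "$tmp"
```

The -F'\t' option used here makes awk split on tabs only, which matters if a field may contain spaces.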
for-loop over the subfolders
The shell expands the */ argument to the subfolders of the current directory, and the -d option tells ls to list those names rather than their contents. Let's check that it's really doing that:
for i in $(ls -d */); do echo $i; done | head -3
Because there are many subfolders, we use head to limit the output to just three lines.
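As an aside, the */ pattern can also drive the loop on its own, without ls. This is an alternative to the loop used in this exercise, not a replacement for it, and its main advantage is that folder names containing spaces survive intact. A minimal sketch, using a throwaway directory with invented folder names:

```shell
# Create a scratch area with two folders and one plain file (names invented).
tmp=$(mktemp -d)
mkdir "$tmp/sampleA" "$tmp/sample B"
touch "$tmp/notes.txt"

# The trailing slash makes the pattern match directories only,
# and quoting "$i" keeps a name with a space in one piece.
found=$(cd "$tmp" && for i in */; do echo "$i"; done)
echo "$found"

rm -r "$tmp"
```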
Because we are using awk and bash together, we can expect complications when passing variables from one to the other. For a start, we definitely need the subfolder name in a bash variable:
for i in $(ls -d */); do S=$(echo ${i%/*}); echo $S; done | head -3
The output is almost the same, except that we have stripped the trailing slash from the variable using one of bash's parameter expansions, ${i%/*}, which looks better. So far the variable is known only to the current bash session. The commands we add next run as child processes, which will not know about this variable, so we need to export it so that awk can see it later.
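The effect of the expansion and of export can be checked in isolation. In this sketch the folder name sampleA/ is hypothetical, and the variables unset_view, awk_view and sys_view exist only to capture what each child process sees:

```shell
unset S               # start clean, in case S was already exported
i="sampleA/"          # hypothetical folder name, as the loop would supply it
S=${i%/*}             # remove the shortest trailing match of "/*"
echo "$S"

# Before export, a child shell does not inherit the variable.
unset_view=$(sh -c 'echo "[$S]"')

export S
# After export, awk can read it from its environment...
awk_view=$(awk 'BEGIN{print ENVIRON["S"]}')
# ...and so can the shell that awk's system() starts.
sys_view=$(awk 'BEGIN{system("echo $S")}')

echo "$unset_view $awk_view $sys_view"
```

Note that awk itself never expands $S inside the single-quoted program. The string is handed unchanged to the shell started by system(), and that shell finds S in its environment.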
Now for the awk part. We don't want the first line as this is a header with labels for each column.
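The header-skipping test can be tried on a small invented file before it is applied to the real one. The sketch below assumes a two-column sample.tsv made up for the purpose. It also passes the folder name to awk with the -v option, which is an alternative to the exported variable and helper script of the finished command, not the method used there.

```shell
# Invented two-column file: a header, a row to skip and a row to keep.
tmp=$(mktemp -d)
printf 'name\tvalue\nrow1\t.\nrow2\t7\n' > "$tmp/sample.tsv"

S="sampleA"   # hypothetical folder name

# NR is the current line number, so NR != 1 skips the header;
# the second test drops rows whose column 2 is a lone dot.
rows=$(awk -F'\t' -v s="$S" \
  'NR != 1 && $2 != "." {printf "%s\t%s\t%s\n", s, $1, $2}' \
  "$tmp/sample.tsv")
echo "$rows"

rm -r "$tmp"
```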