Latest revision as of 17:44, 12 April 2017

Command-line Exercises

Many of these are so-called throwaway one-liners.

In each of these examples we show the finished one-liner first, and then build it up step by step.

Extract columns from files held in different subfolders

We shall run a for-loop in the shell and, for each subfolder, execute an awk column-selecting command on a file called report.tsv.

Finished Command

for i in $(ls -d */); do S=$(echo ${i%/*}); export S; \
   awk '{if(NR!=1){if($19 != ".") {system("./echon.sh $S"); printf "\t%s\t%s\t%s\t%s\t%s\n",$2,$15,$17,$19,$20}}}' \
   ${i}report.tsv;echo; \
done >a.xls

Get to know our target file

Our target file is a tab-separated list of data rows, with quite a few columns. We're only interested in some of these columns, and we need to find out what their column numbers are. We go inside one of the subfolders and use a combination of head, tr and cat (with its numbering option) to see the number-label correspondence better:

head -1 report.tsv |tr "\t" "\n" |cat -n

In this way we find out that columns 2, 15, 17, 19 and 20 are of interest, and we want to skip rows in which column 19 is a dot. Note that awk refers to a column by its number prefixed with a dollar sign, for example $19.
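The column-numbering step can be tried on a small stand-in file. This is only a sketch: the three-column sample.tsv and its labels are invented for illustration and are not the real report.tsv.

```shell
# Hypothetical three-column sample standing in for report.tsv
tmpdir=$(mktemp -d)
printf 'id\tgene\tscore\n1\tabc\t0.5\n' > "$tmpdir/sample.tsv"

# Header line only, one label per line, each prefixed with its number
numbered=$(head -1 "$tmpdir/sample.tsv" | tr '\t' '\n' | cat -n)
echo "$numbered"

# The same listing produced by awk alone
awk -F'\t' 'NR == 1 { for (c = 1; c <= NF; c++) print c, $c; exit }' "$tmpdir/sample.tsv"
```

The numbers printed next to each label are the ones to use after the dollar sign in awk.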

for-loop over the subfolders

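The ls command will list only the subfolders if it is given the */ argument together with the -d option: the shell expands */ to the subfolder names, and -d tells ls to list those names rather than their contents. Let's check that it's really doing that: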

for i in $(ls -d */); do echo $i;done | head -3

Because there are many subfolders, we use head to limit the output to just three lines.
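As a variation, the loop can iterate over the */ pattern directly, without ls, which also copes with folder names that contain spaces. The sketch below assumes a scratch directory with made-up folder names; it is not the layout used in this exercise.

```shell
# Hypothetical scratch area: four folders (names with spaces) and one plain file
tmpdir=$(mktemp -d)
mkdir "$tmpdir/run a" "$tmpdir/run b" "$tmpdir/run c" "$tmpdir/run d"
touch "$tmpdir/notes.txt"

# The */ pattern matches directories only; head keeps the first three lines
first_three=$(cd "$tmpdir" && for i in */; do echo "$i"; done | head -3)
echo "$first_three"
```

The plain file notes.txt does not show up, because the trailing slash in the pattern restricts the match to directories.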

Because we are using awk and bash together, we can expect complications when passing variables from one to the other. To start with, we need the subfolder name in a bash variable:

for i in $(ls -d */); do S=$(echo ${i%/*}); echo $S; done |head -3

The output is almost the same, except that we have stripped the trailing slash from the variable with one of bash's parameter expansions, ${i%/*}, which looks better. A plain variable is known only to the current bash session. As we add other commands, new "child" shell sessions will be launched that know nothing about it, so we need to export the variable so that the shell started from within awk can see it later.
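The effect of the expansion and of export can be checked in isolation. In this sketch the folder name sample1/ and the variable T are made up for the demonstration.

```shell
i="sample1/"      # hypothetical folder name, as produced by the */ pattern
S=${i%/*}         # remove the shortest trailing match of "/*"
echo "$S"

# A variable that is not exported is invisible to a child shell
unset T
T="local only"
child_T=$(sh -c 'echo "${T:-unset}"')

# After export, child shells see S, including the one awk starts via system()
export S
child_S=$(sh -c 'echo "$S"')
via_awk=$(awk 'BEGIN { system("echo $S") }')

# awk can also read exported variables itself through ENVIRON
via_environ=$(awk 'BEGIN { print ENVIRON["S"] }')
echo "$child_T / $child_S / $via_awk / $via_environ"
```

Inside the single-quoted awk program the $S is left untouched by bash and by awk. It is the shell launched by system() that expands it, which is why the export is needed.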

Now for the awk part. We don't want the first line, as it is a header with labels for each column, which is what the NR!=1 test in the awk program takes care of.
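The header skip and the dot filter can be sketched on a small stand-in file. Everything here is an assumption made for illustration: the three-column sample.tsv, the folder name sample1, and the use of awk's -v option to pass the folder name in, which is a different technique from the system() call in the finished command.

```shell
# Hypothetical sample: a header line, then rows whose third column may be a dot
tmpdir=$(mktemp -d)
printf 'name\tvalue\tflag\na\t1\tx\nb\t2\t.\nc\t3\ty\n' > "$tmpdir/sample.tsv"
S="sample1"       # hypothetical folder name

# NR != 1 skips the header; $3 != "." drops rows with a dot in column 3;
# -v copies the shell variable S into the awk variable s
kept=$(awk -F'\t' -v s="$S" 'NR != 1 && $3 != "." { printf "%s\t%s\t%s\n", s, $1, $3 }' "$tmpdir/sample.tsv")
echo "$kept"
```

Rows a and c are kept, each prefixed with the folder name, while the header and row b are dropped.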