
I have a group of machines on Google Cloud. From my localhost:

gcloud compute instance-groups list-instances workers
OUTPUT:
NAME           ZONE           STATUS
workers-lya2   us-central1-a  RUNNING
workers-23d4   us-central1-a  RUNNING
...
workers-3asd3  us-central1-a  RUNNING

I want to take a random worker name from that list (let's say workers-23d4) and its zone (us-central1-a) from the first command and paste them into this command:

gcloud compute --project "my-project" ssh --zone "<zone_name_from_first_command>" "<machine_name_from_first_command>"

I'm a little weak on bash. Please help.

  • which one do you need? A random one? Or a specific one which matches a string? Commented Jan 22, 2017 at 13:23
  • sorry for not being clear, will edit the question. I meant a random one Commented Jan 22, 2017 at 13:24
  • thanks, now it's clear. I added my attempt at the problem as an answer Commented Jan 22, 2017 at 13:40

3 Answers


The following command picks a random line (excluding the header) from the output of the gcloud command, then stores the first two "words" into machine and zone variables:

read -r machine zone unused <<< $(
  gcloud compute instance-groups list-instances workers | \
    perl -e '@_ = <>; shift @_; print $_[rand @_]'
)

After this command you are ready to use machine and zone variables, e.g.:

gcloud compute --project "my-project" ssh --zone "$zone" "$machine"

Explanations

The perl command reads all lines from standard input into the @_ array using the diamond operator <>. Then the shift function removes the first item (the header line) from @_. rand @_ returns a random decimal number between zero (inclusive) and the number of items in @_ (exclusive). The decimal number is implicitly truncated to an integer in the index context. Therefore, the result of $_[rand @_] is a random item of @_, i.e. a random line from the output of the gcloud command.

The output of the gcloud and perl commands is captured using command substitution and passed to the read command via a here string.

I put "words" in quotes in the first paragraph, because the shell splits character sequences into words according to the IFS (Internal Field Separator) variable. So the IFS-separated words from the here string are assigned to the machine (the first word), zone (the second word), and unused (the rest of the line) variables.

The -r option disables the special meaning of backslash. In other words, read will not try to interpret escape sequences in the input when this option is given.
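A quick self-contained illustration of both points (the input strings are made up for the demonstration):

```shell
#!/usr/bin/env bash
# IFS-based splitting: runs of whitespace separate words, and the
# last variable receives the rest of the line.
read -r first second rest <<< 'a    b   c d'
echo "first='$first' second='$second' rest='$rest'"

# Effect of -r: without it, read treats backslash as an escape character.
read    w1 w2 <<< 'a\ b c'   # backslash escapes the space: w1='a b'
read -r v1 v2 <<< 'a\ b c'   # backslash is kept literally:  v1='a\'
echo "without -r: '$w1'  with -r: '$v1'"
```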

The Case of a Large Number of Lines

Note, the solution assumes that the output of the gcloud command is relatively small, i.e. small enough to slurp entirely into an array. This operation is fast, but requires more memory than reading line by line with a while (<>) loop. Here is another solution for the off chance that the output is very large, or memory is very limited:

read -r machine zone unused <<< $(
  gcloud compute instance-groups list-instances workers | \
    perl -e '<>; $. = 0; rand($.) < 1 && ($line = $_) while <>; print $line'
)

where <> reads the header; $. is the built-in variable keeping the current line number; and the rest is taken from this cookbook.


1 Comment

Nicely done, especially the memory-efficient 2nd perl command. While it doesn't make a difference in this case, I still suggest double-quoting the command substitution to promote good habits.
gcloud compute instance-groups list-instances workers | grep -v "^NAME"  | shuf -n 1 | awk '{print $1, $2}' | 
while read machine zone; do
    export SELECTED_MACHINE="$machine"
    export SELECTED_ZONE="$zone"
done
gcloud compute --project "my-project" ssh --zone "$SELECTED_ZONE" "$SELECTED_MACHINE"
  • grep -v "^NAME" strips away all lines starting with NAME (assuming it's just the first line you want to strip away)
  • shuf takes a random line from the remaining lines
  • awk '{print $1, $2}' splits the line at the spaces and prints the first and second column
  • while read reads the output of awk into variables $machine and $zone

Update: The above code works for zsh, but not for bash, as bash runs the pipeline stages in subshells (zsh doesn't), and export only passes variables to child processes, not to parent processes. The following script solves this problem by having read run in the parent process:

machine_zone=$(gcloud compute instance-groups list-instances workers | 
              grep -v "^NAME"  | shuf -n 1 | awk '{print $1, $2}')
read machine zone <<< $machine_zone
gcloud compute --project "my-project" ssh --zone "$zone" "$machine"
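The difference can be demonstrated without gcloud at all:

```shell
#!/usr/bin/env bash
# In bash, each stage of a pipeline runs in a subshell, so variables
# assigned by read inside the pipeline vanish when the stage exits.
name=unset
echo 'workers-23d4 us-central1-a' | read name zone
echo "after pipeline:    name='$name'"   # still 'unset'

# A here-string keeps read in the current shell, so the variables stick.
read name zone <<< 'workers-23d4 us-central1-a'
echo "after here-string: name='$name'"   # 'workers-23d4'
```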

9 Comments

@hansaplast thanks for that. I got: Pseudo-terminal will not be allocated because stdin is not a terminal.
I solved that by exporting the variables (export SELECTED_MACHINE="$machine") and then using them outside the while loop. Change your answer and I'll accept. Thanks
@WebQube OK, solved it like that for the moment, although it doesn't feel right; I'll look into it later
++ for the part that extracts the random line, but since your read command executes in a subshell (due to use of a pipeline), the 2nd gcloud command won't see the variables you're defining - whether you export them or not; also, you don't need a loop to read the single-line output from awk.
@mklement0 aha, that makes sense. I updated my explanation and also replaced the backticks with $(...), thanks for the link. I never understood why $(...) is preferred over backticks; now I do

If the input from which to extract a random line is large, and you want to avoid reading it into memory as a whole, as shuf does, consider the 2nd perl solution in Ruslan Osmanov's helpful answer.

Otherwise, hansaplast's helpful answer uses a multi-utility approach based on shuf that is easy to understand, but it can be streamlined (and, as of this writing, has flaws):

read -r machine zone _ < \
  <(gcloud compute instance-groups list-instances workers | tail -n +2 | shuf -n 1)

gcloud compute --project "my-project" ssh --zone "$zone" "$machine"
  • By making read read the output from a process substitution (<(...)), it is ensured that read is executed in the current shell, which means that the variables it creates are visible to the remaining commands, notably the 2nd gcloud call.

    • By contrast, if you use a pipeline (gcloud ... | read ...), read executes in a subshell, and the variables it creates won't be seen by subsequent commands.
  • tail -n +2 skips the first line in the input (the header row).

  • shuf -n 1 extracts 1 randomly chosen line from the input.

  • Note the _ as the name of the 3rd variable passed to read, which receives the (unused) rest of the input line after the first 2 whitespace-separated tokens have been read into $machine and $zone.

    • If we only specified machine and zone, $zone would receive not just the 2nd token, but the entire rest of the input line.
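For example, with a line of the three-column gcloud output:

```shell
#!/usr/bin/env bash
line='workers-23d4 us-central1-a RUNNING'

# Two variables: the last one soaks up the rest of the line.
read -r machine zone <<< "$line"
echo "two vars:   zone='$zone'"    # 'us-central1-a RUNNING'

# Three variables with _ as a throwaway: zone gets only the 2nd token.
read -r machine zone _ <<< "$line"
echo "three vars: zone='$zone'"    # 'us-central1-a'
```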

Comments
