Solving Sadservers


SadServers is a LeetCode style puzzle for Site Reliability Engineers/DevOps Engineers or whatever Ops people in IT are called nowadays. The following is a writeup of walking through the challenges given.

Index

  1. ❌ Saint John
  2. 🧮 Saskatoon - Counting IPs
  3. 🔡 Santiago - Find the secret combination
  4. 🗡️ The Command Line Murders
  5. 🙋‍♂️ Taipei - Come a-knocking
  6. ➕ Lhasa - Easy Math
  7. 🐘 Bucharest - Connecting to Postgres
  8. 🕷️ Bilbao - Basic Kubernetes Problems
  9. 💀 Gitega - Find the Bad Git Commit

Linux Commands Cheatsheet

grep: Search for patterns in files, printing matching lines.
wc: Counts lines, words, and characters in one or more files. (-l for counting lines)
df: Display filesystem disk space usage, showing available and used space on mounted filesystems.
du: Display disk space usage of files and directories, summarizing their sizes.
awk: A powerful text-processing tool for extracting and manipulating data from files.
less: View file contents page by page, allowing navigation and search within large files.
xargs: Build and execute command lines from standard input, often used with other commands for complex operations.
knock: used to send a sequence of connection attempts (knocks) to specified ports on a remote server to trigger actions such as opening a port.
bc: do basic mathematical calculations. (echo "12+5" | bc) => 17

❌ Saint John

A developer created a testing program that is continuously writing to a log file /var/log/bad.log and filling up disk. You can check for example with tail -f /var/log/bad.log. This program is no longer needed. Find it and terminate it.

So let’s see what is accessing this file with lsof:

$ lsof /var/log/bad.log
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
badlog.py 621 ubuntu 3w REG 259,1 10629 67701 /var/log/bad.log

Second column lists the ID of the process what writes to the log.

We’ll kill this process using kill:

kill 621

We can accomplish the same task in one command by grepping (filtering) the output of the lsof command and searching for “badlog.py”. Then, we extract the second result and use it as a parameter with xargs to kill that specific PID.

lsof /var/log/bad.log | grep -w 'badlog.py' | awk '{print $2}' | xargs kill

🧮 Saskatoon - Counting IPs

In this scenario we have to find what’s the IP address that has the most requests in the file /home/admin/access.log.

By reading the file we can see all messages have the following format:

less /home/admin/access.log
83.149.9.216 - - [17/May/2015:10:05:50 +0000] "GET /presentations/logstash-monitorama-2013/images/kibana-dashboard.png HTTP/1.1" 200 321631 "http://semicomplete.com/presentations/logstash-monitorama-2013/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36"

Looks like awk commmand is perfect for this scenario as it allows you to perform operations on text files, usually line by line. In this way we can get first field of each line in the file.

awk '{print $1}' /home/admin/access.log
awk '{print $1}' /home/admin/access.log | sort | uniq -c | sort -nr | head -n 1 | awk '{print $2}' > /home/admin/highestip.txt

🔡 Santiago - Find the secret combination

In this scenario, we are going to count the number of times “Alice” appears in some text files.

  • The first command is a simple word count (wc) of the lines that contain “Alice”.
  • The second command filters for “Alice” and includes the next line with the -A 1 flag. It then filters again to extract numbers.

Afterward, it’s simply a matter of writing the two results to the file in the specified format.

cat /home/admin/*.txt | grep Alice | wc -l  > /home/admin/solution
cat /home/admin/1342-0.txt | grep Alice -A 1 | grep -o '[0-9]\+' >> /home/admin/solution

🗡️ The Command Line Murders

This one was really fun, but I think it could easily take more than 20 minutes if you are not using the hints.

First, we are going to read the crimescene file and filter by the important information, the clues:

grep "CLUE" /home/admin/clmystery/mystery/crimescene
- Tall man 6'
- Cards for Rotary_Club, Delta SkyMiles, the local library, and the Museum of Bash History.
- A woman left right before they heard the shots. The name on her latte was Annabel, she had blond spiky hair and a New Zealand accent.

Let’s investigate the witness. Filtering for women named Annabel, we get two results:

cat /home/admin/clmystery/mystery/people | grep Annabel
Annabel Sun     F       26      Hart Place, line 40
Annabel Church  F       38      Buckingham Place, line 179

Following the addresses, we can see what information the interviews have. As we know that Annabel has a New Zealand accent, we can discard the first result.

head -n 40 /home/admin/clmystery/mystery/streets/Hart_Place | tail -n 1
SEE INTERVIEW #47246024

cat /home/admin/clmystery/mystery/interviews/interview-47246024 
Ms. Sun has brown hair and is not from New Zealand.  Not the witness from the cafe.


head -n 179 /home/admin/clmystery/mystery/streets/Buckingham_Place | tail -n 1
SEE INTERVIEW #699607

cat /home/admin/clmystery/mystery/interviews/interview-699607 
Interviewed Ms. Church at 2:04 pm.  Witness stated that she did not see anyone she could identify as the shooter, that she ran away as soon as the shots were fired.
However, she reports seeing the car that fled the scene.  Describes it as a blue Honda, with a license plate that starts with "L337" and ends with "9"

So now we know that the thief drives a blue Honda and part of its plate number. Also, the owner must be a man with a height greater than 6’. After filtering car information, we end up with two results.

grep "L337..9" /home/admin/clmystery/mystery/vehicles -A5 | grep "Honda" -A4 -B1 | grep "Blue" -A3 -B2

License Plate L337DV9
Make: Honda
Color: Blue
Owner: Joe Germuska
Height: 6'2"
Weight: 164 lbs
--
License Plate L3375A9
Make: Honda
Color: Blue
Owner: Jeremy Bowers
Height: 6'1"
Weight: 204 lbs

Let’s quickly check if there have been any interviews done with these guys:

grep -E "Joe Germuska|Jeremy Bowers" /home/admin/clmystery/mystery/people
Joe Germuska    M       65      Plainfield Street, line 275
Jeremy Bowers   M       34      Dunstable Road, line 284


head -n 275 /home/admin/clmystery/mystery/streets/Plainfield_Street | tail -n 1
SEE INTERVIEW #29741223

cat /home/admin/clmystery/mystery/interviews/interview-29741223 
Not available to interview


head -n 284 /home/admin/clmystery/mystery/streets/Dunstable_Road | tail -n 1
SEE INTERVIEW #9620713

cat /home/admin/clmystery/mystery/interviews/interview-9620713 
Home appears to be empty, no answer at the door.
After questioning neighbors, appears that the occupant may have left for a trip recently.
Considered a suspect until proven otherwise, but would have to eliminate other suspects to confirm.

Nothing really useful, but if you remember from the clues, we can check their memberships and see which one matches all the ones mentioned in the clue:

grep "Jeremy Bowers" /home/admin/clmystery/mystery/memberships/Museum_of_Bash_History  /home/admin/clmystery/mystery/memberships/Rotary_Club /home/admin/clmystery/mystery/memberships/Delta_SkyMiles  /home/admin/clmystery/mystery/memberships/Terminal_City_Library 
/home/admin/clmystery/mystery/memberships/Museum_of_Bash_History:Jeremy Bowers
/home/admin/clmystery/mystery/memberships/Delta_SkyMiles:Jeremy Bowers
/home/admin/clmystery/mystery/memberships/Terminal_City_Library:Jeremy Bowers

grep "Joe Germuska" /home/admin/clmystery/mystery/memberships/Museum_of_Bash_History  /home/admin/clmystery/mystery/memberships/Rotary_Club /home/admin/clmystery/mystery/memberships/Delta_SkyMiles  /home/admin/clmystery/mystery/memberships/Terminal_City_Library 
/home/admin/clmystery/mystery/memberships/Museum_of_Bash_History:Joe Germuska
/home/admin/clmystery/mystery/memberships/Rotary_Club:Joe Germuska
/home/admin/clmystery/mystery/memberships/Delta_SkyMiles:Joe Germuska
/home/admin/clmystery/mystery/memberships/Terminal_City_Library:Joe Germuska

We see that only Joe matches all four memberships, so we can conclude this is our guy.

echo "Joe Germuska" > ~/mysolution

🙋‍♂️ Taipei - Come a-knocking

In this scenario we are going to learn about Port Knocking.

Port knocking is a security technique used to control access to a network service by requiring clients to perform a specific sequence of connection attempts, or “knocks,” on closed ports. Here’s how it works:

  • Initial State: All ports on the server are closed.
  • Knock Sequence: The client sends a sequence of connection attempts to predefined ports.
  • Sequence Recognition: The server monitors and verifies the sequence. If correct, it temporarily opens the desired port.
  • Access Granted: The client can access the service, after which the port may be closed again.

We are given a web server listening on port 80. In this scenario is just a matter of knocking this port, but realisticly it would be a secounce of port what enables us to communicate.

# Example
knock <server-ip> 9012 5678 1234

# Solution
knock localhost 80
curl localhost

Now if we curl localhost we get a message instead of a connection refused.

➕ Lhasa - Easy Math

We are given a file with two columns. The first column is just a count of scores, and the second column contains the actual scores. We need to calculate the average score.

This is a very good case to use the awk command, as we can grab the second column by using its natural delimiter, the blank space.

awk '{print $2}' /home/admin/scores.txt 

We get the total number of scores by using NR to keep a current count of the number of input records. Remember that records are usually lines. The total score is calculated by looping through the values using the += operator.

count=$(awk 'END {print NR}' /home/admin/scores.txt)
sum=$(awk '{sum += $2} END {print sum}' /home/admin/scores.txt)

Finally, to calculate the average, we divide the total score by the number of scores and use bc (basic calculator) to perform the division and format the output to two decimal places with %.2f.

average=$(echo "scale=2; $sum / $count" | bc | awk '{printf "%.2f", $0}')

echo $average > /home/admin/solution

🐘 Bucharest - Connecting to Postgres

In this server, we have a PostgreSQL database, and the default connection is not working.

We can verify that the service is running and in a healthy state with:

systemctl --type=service --state=running

Inspecting the PostgreSQL service, we can see the path where the configuration files are located:

systemctl status postgresql@13-main
  641 /usr/lib/postgresql/13/bin/postgres -D /var/lib/postgresql/13/main -c config_file=/etc/postgresql/13/main/postgresql.conf

Running the command given in the problem description, we get the following error:

PGPASSWORD=app1user psql -h 127.0.0.1 -d app1 -U app1user -c '\q'
psql: error: FATAL:  pg_hba.conf rejects connection for host "127.0.0.1", user "app1user", database "app1", SSL on
FATAL:  pg_hba.conf rejects connection for host "127.0.0.1", user "app1user", database "app1", SSL off

Let’s check the file it is complaining about, pg_hba.conf. The pg_hba.conf file is a PostgreSQL configuration file that stands for “PostgreSQL Host-Based Authentication.” This file controls client authentication, i.e., it specifies which users can connect to which databases from which hosts and which authentication methods they must use.

We can see that all connections were rejected. We can change the configuration to “trust” and restart the service.

sudo su
vi /etc/postgresql/13/main/pg_hba.conf
  host    all             all             all                     reject -> trust
systemctl restart postgresql

Now we can connect using the given credentials.

🕷️ Bilbao - Basic Kubernetes Problems

There’s a Kubernetes Deployment where the pod is not coming up. We need to find the issue and fix it.

It’s convenient to have an alias for kubectl, so let’s create it. Taking a look at the pod details, we can see it has a node selector by tag and is failing to be scheduled.

alias k=kubectl

k get pods
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-67699598cc-zrj6f   0/1     Pending   0          173d


k describe pod nginx-deployment-67699598cc-zrj6f
Node-Selectors:              disk=ssd
Events:
  Type     Reason            Age    From               Message
  ----     ------            ----   ----               -------
  Warning  FailedScheduling  173d   default-scheduler  0/2 nodes are available: 1 node(s) didn't match Pod's node affinity/selector, 1 node(s) had untolerated taint {node.kubernetes.io/unreachable: }. preemption: 0/2 nodes are available: 2 Preemption is not helpful for scheduling..
  Warning  FailedScheduling  2m20s  default-scheduler  0/2 nodes are available: 1 node(s) didn't match Pod's node affinity/selector, 1 node(s) had untolerated taint {node.kubernetes.io/unreachable: }. preemption: 0/2 nodes are available: 2 Preemption is not helpful for scheduling..

There is only one node ready, “node1”. Let’s see the labels it has and add our pod’s tags if they are not present.

k get nodes
NAME                  STATUS     ROLES                  AGE    VERSION
i-02f8e6680f7d5e616   NotReady   control-plane,master   173d   v1.28.5+k3s1
node1                 Ready      control-plane,master   173d   v1.28.5+k3s1

k get node node1 --show-labels
NAME    STATUS   ROLES                  AGE    VERSION        LABELS
node1   Ready    control-plane,master   176d   v1.28.5+k3s1   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=k3s,beta.kubernetes.io/os=linux,disk=ssd,kubernetes.io/arch=amd64,kubernetes.io/hostname=node1,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=true,node-role.kubernetes.io/master=true,node.kubernetes.io/instance-type=k3s

k label nodes node1 disk=ssd  

k get node node1 -o json | jq '.metadata.labels'
{
  "beta.kubernetes.io/arch": "amd64",
  "beta.kubernetes.io/instance-type": "k3s",
  "beta.kubernetes.io/os": "linux",
  "disk": "ssd",
  "kubernetes.io/arch": "amd64",
  "kubernetes.io/hostname": "node1",
  "kubernetes.io/os": "linux",
  "node-role.kubernetes.io/control-plane": "true",
  "node-role.kubernetes.io/master": "true",
  "node.kubernetes.io/instance-type": "k3s"
}

We can see that even though it is properly scheduled to node1, it is not coming up. The reason for that is that the resource demands 2GB of memory that the node is not able to provide. We can edit the manifest configuration to request lower resources and now, if we apply the manifest again, we will see that the pod is able to run.

vi manifest.yaml
        resources:
          limits:
            memory: 200Mi
            cpu: 100m
          requests:
            cpu: 100m
            memory: 200Mi

k apply -f manifest.yaml

💀 Gitega - Find the Bad Git Commit

Here we can simple get all the commits hashes and checkout to the commit we want to check and execute the tests.

git log --pretty=format:"%h - %an, %ar : %s"
f2e018e - fduran, 6 weeks ago : README.md
47995fc - fduran, 6 weeks ago : README
c21bcc4 - fduran, 6 weeks ago : module name
3657dad - fduran, 6 weeks ago : 6th
96086c8 - fduran, 6 weeks ago : 5th
2e44089 - fduran, 6 weeks ago : 4th
5179399 - fduran, 6 weeks ago : third
641eade - fduran, 6 weeks ago : second
9e80a7e - fduran, 6 weeks ago : first

git checkout 2e44089
go test

.

.

.