Solving SadServers
SadServers is a LeetCode-style puzzle site for Site Reliability Engineers, DevOps Engineers, or whatever Ops people in IT are called nowadays. The following is a writeup of my walk through the challenges.
Index
- ❌ Saint John
- 🧮 Saskatoon - Counting IPs
- 🔡 Santiago - Find the secret combination
- 🗡️ The Command Line Murders
- 🙋‍♂️ Taipei - Come a-knocking
- ➕ Lhasa - Easy Math
- 🐘 Bucharest - Connecting to Postgres
- 🕷️ Bilbao - Basic Kubernetes Problems
- 💀 Gitega - Find the Bad Git Commit
Linux Commands Cheatsheet
grep: Search for patterns in files, printing matching lines.
wc: Counts lines, words, and characters in one or more files. (-l for counting lines)
df: Display filesystem disk space usage, showing available and used space on mounted filesystems.
du: Display disk space usage of files and directories, summarizing their sizes.
awk: A powerful text-processing tool for extracting and manipulating data from files.
less: View file contents page by page, allowing navigation and search within large files.
xargs: Build and execute command lines from standard input, often used with other commands for complex operations.
knock: Send a sequence of connection attempts (knocks) to specified ports on a remote server to trigger actions such as opening a port.
bc: Do basic mathematical calculations. (echo "12+5" | bc) => 17
❌ Saint John
A developer created a testing program that is continuously writing to a log file /var/log/bad.log and filling up disk. You can check for example with tail -f /var/log/bad.log. This program is no longer needed. Find it and terminate it.
So let’s see what is accessing this file with lsof:
$ lsof /var/log/bad.log
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
badlog.py 621 ubuntu 3w REG 259,1 10629 67701 /var/log/bad.log
The second column lists the ID (PID) of the process that writes to the log.
We’ll kill this process using kill:
kill 621
We can accomplish the same task in one command by grepping (filtering) the output of the lsof command for “badlog.py”. Then, we extract the second field (the PID) and pass it to kill through xargs.
lsof /var/log/bad.log | grep -w 'badlog.py' | awk '{print $2}' | xargs kill
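The field extraction in that one-liner can be tried without a running process. The sketch below replays the lsof output shown above from a here-document and picks the PID column with awk, matching on the command name. The variable names are illustrative only.

```shell
# Sample lsof output copied from the walkthrough above; on the real server this
# would come from: lsof /var/log/bad.log
lsof_output=$(cat <<'EOF'
COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF NODE  NAME
badlog.py 621 ubuntu 3w   REG  259,1  10629    67701 /var/log/bad.log
EOF
)

# Select rows whose first field is the command name, then print field 2 (the PID).
pid=$(printf '%s\n' "$lsof_output" | awk '$1 == "badlog.py" {print $2}')
echo "PID to terminate: $pid"

# On the real server the next step would be: kill "$pid"
```

lsof also has a -t option that prints only PIDs, which would make the grep and awk steps unnecessary.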
🧮 Saskatoon - Counting IPs
In this scenario we have to find what’s the IP address that has the most requests in the file /home/admin/access.log.
By reading the file we can see all messages have the following format:
less /home/admin/access.log
83.149.9.216 - - [17/May/2015:10:05:50 +0000] "GET /presentations/logstash-monitorama-2013/images/kibana-dashboard.png HTTP/1.1" 200 321631 "http://semicomplete.com/presentations/logstash-monitorama-2013/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36"
The awk command is a good fit for this scenario, as it lets you perform operations on text files, usually line by line.
With it we can print the first field of each line in the file, which is the client IP address.
awk '{print $1}' /home/admin/access.log
To find the most frequent address, we sort the IPs, count duplicates with uniq -c, sort the counts in descending order, keep the top line, and write its second field (the IP) to the solution file.
awk '{print $1}' /home/admin/access.log | sort | uniq -c | sort -nr | head -n 1 | awk '{print $2}' > /home/admin/highestip.txt
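To see what each stage of that pipeline contributes, here is a small sketch that runs it against a throwaway log. The IP addresses and request lines are made up for illustration and do not come from the real access.log.

```shell
# Build a tiny sample log in a temporary file (made-up data).
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [17/May/2015:10:05:50 +0000] "GET /a HTTP/1.1" 200 10
10.0.0.2 - - [17/May/2015:10:05:51 +0000] "GET /b HTTP/1.1" 200 10
10.0.0.2 - - [17/May/2015:10:05:52 +0000] "GET /c HTTP/1.1" 200 10
10.0.0.3 - - [17/May/2015:10:05:53 +0000] "GET /d HTTP/1.1" 200 10
10.0.0.2 - - [17/May/2015:10:05:54 +0000] "GET /e HTTP/1.1" 404 10
EOF

# uniq -c only collapses adjacent duplicates, hence the first sort.
# The second sort orders the "count ip" lines numerically, largest first.
counts=$(awk '{print $1}' "$log" | sort | uniq -c | sort -nr)
printf '%s\n' "$counts"

# The first line holds the winner; its second field is the IP.
top_ip=$(printf '%s\n' "$counts" | head -n 1 | awk '{print $2}')
echo "Most frequent IP: $top_ip"

rm -f "$log"
```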
🔡 Santiago - Find the secret combination
In this scenario, we are going to count the number of times “Alice” appears in some text files.
- The first command is a simple word count (wc) of the lines that contain “Alice”.
- The second command filters for “Alice” and includes the following line with the -A 1 flag. It then filters again to extract the number.
Afterward, it’s simply a matter of writing the two results to the file in the specified format.
cat /home/admin/*.txt | grep Alice | wc -l > /home/admin/solution
cat /home/admin/1342-0.txt | grep Alice -A 1 | grep -o '[0-9]\+' >> /home/admin/solution
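The two greps can be tried on a couple of throwaway files. This is a minimal sketch with invented file names and contents; it also shows one possible way to locate the file with a single match using grep -c, a step the commands above take for granted.

```shell
# Two made-up text files standing in for /home/admin/*.txt.
workdir=$(mktemp -d)
printf 'Alice went out\nnothing here\nAlice again\n' > "$workdir/a.txt"
printf 'A note for Alice\nthe code is 156 today\nbye\n' > "$workdir/b.txt"

# Count the lines containing "Alice" across all files.
line_count=$(cat "$workdir"/*.txt | grep -c Alice)

# grep -c on several files prints "file:count"; keep the file whose count is 1.
single_file=$(grep -c Alice "$workdir"/*.txt | awk -F: '$2 == 1 {print $1}')

# -A 1 adds the line after each match; -o prints only the matching digits.
secret=$(grep -A 1 Alice "$single_file" | grep -oE '[0-9]+')

echo "lines: $line_count, number after the single match: $secret"
rm -rf "$workdir"
```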
🗡️ The Command Line Murders
This one was really fun, but I think it could easily take more than 20 minutes if you are not using the hints.
First, we are going to read the crimescene file and filter by the important information, the clues:
grep "CLUE" /home/admin/clmystery/mystery/crimescene
- Tall man 6'
- Cards for Rotary_Club, Delta SkyMiles, the local library, and the Museum of Bash History.
- A woman left right before they heard the shots. The name on her latte was Annabel, she had blond spiky hair and a New Zealand accent.
Let’s investigate the witness. Filtering for women named Annabel, we get two results:
cat /home/admin/clmystery/mystery/people | grep Annabel
Annabel Sun F 26 Hart Place, line 40
Annabel Church F 38 Buckingham Place, line 179
Following the addresses, we can see what information the interviews have. As we know that Annabel has a New Zealand accent, we can discard the first result.
head -n 40 /home/admin/clmystery/mystery/streets/Hart_Place | tail -n 1
SEE INTERVIEW #47246024
cat /home/admin/clmystery/mystery/interviews/interview-47246024
Ms. Sun has brown hair and is not from New Zealand. Not the witness from the cafe.
head -n 179 /home/admin/clmystery/mystery/streets/Buckingham_Place | tail -n 1
SEE INTERVIEW #699607
cat /home/admin/clmystery/mystery/interviews/interview-699607
Interviewed Ms. Church at 2:04 pm. Witness stated that she did not see anyone she could identify as the shooter, that she ran away as soon as the shots were fired.
However, she reports seeing the car that fled the scene. Describes it as a blue Honda, with a license plate that starts with "L337" and ends with "9"
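The head | tail combination used above prints a single line by number. As a sketch, the same lookup can be wrapped in a small helper built on sed; the helper name line_at and the sample file are hypothetical.

```shell
# Print only line N of a file: -n suppresses default output, "Np" prints line N.
line_at() {
  sed -n "${2}p" "$1"
}

# A made-up street file with the interview pointer on line 3.
street=$(mktemp)
printf 'line one\nline two\nSEE INTERVIEW #699607\nline four\n' > "$street"

via_head_tail=$(head -n 3 "$street" | tail -n 1)
via_sed=$(line_at "$street" 3)

# Strip everything up to the last '#' to get the interview number.
interview_id=${via_sed##*#}
echo "interview file: interview-$interview_id"

rm -f "$street"
```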
So now we know that the suspect drives a blue Honda, and we know part of its plate number. Also, the owner must be a man at least 6’ tall. After filtering the vehicle information, we end up with two results.
grep "L337..9" /home/admin/clmystery/mystery/vehicles -A5 | grep "Honda" -A4 -B1 | grep "Blue" -A3 -B2
License Plate L337DV9
Make: Honda
Color: Blue
Owner: Joe Germuska
Height: 6'2"
Weight: 164 lbs
--
License Plate L3375A9
Make: Honda
Color: Blue
Owner: Jeremy Bowers
Height: 6'1"
Weight: 204 lbs
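Chaining grep with -A and -B works, but the context counts are easy to get wrong. As an alternative sketch, awk can remember the fields of each record and print the owner only when all conditions hold. The record layout is assumed from the output above, and the second record (the red car) is invented for contrast.

```shell
# Sample vehicle records; the layout is assumed from the grep output above.
vehicles=$(mktemp)
cat > "$vehicles" <<'EOF'
License Plate L337DV9
Make: Honda
Color: Blue
Owner: Joe Germuska
Height: 6'2"
Weight: 164 lbs

License Plate L337QE9
Make: Honda
Color: Red
Owner: Example Person
Height: 5'8"
Weight: 150 lbs
EOF

# Remember each field as it goes by; Weight is the last line of a record,
# so that is where the collected values are checked.
owners=$(awk '
  /^License Plate/ { plate = $3 }
  /^Make:/         { make = $2 }
  /^Color:/        { color = $2 }
  /^Owner:/        { sub(/^Owner: /, ""); owner = $0 }
  /^Weight:/       { if (plate ~ /^L337..9$/ && make == "Honda" && color == "Blue") print owner }
' "$vehicles")

echo "$owners"
rm -f "$vehicles"
```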
Let’s quickly check if there have been any interviews done with these guys:
grep -E "Joe Germuska|Jeremy Bowers" /home/admin/clmystery/mystery/people
Joe Germuska M 65 Plainfield Street, line 275
Jeremy Bowers M 34 Dunstable Road, line 284
head -n 275 /home/admin/clmystery/mystery/streets/Plainfield_Street | tail -n 1
SEE INTERVIEW #29741223
cat /home/admin/clmystery/mystery/interviews/interview-29741223
Not available to interview
head -n 284 /home/admin/clmystery/mystery/streets/Dunstable_Road | tail -n 1
SEE INTERVIEW #9620713
cat /home/admin/clmystery/mystery/interviews/interview-9620713
Home appears to be empty, no answer at the door.
After questioning neighbors, appears that the occupant may have left for a trip recently.
Considered a suspect until proven otherwise, but would have to eliminate other suspects to confirm.
Nothing really useful, but if you remember from the clues, we can check their memberships and see which one matches all the ones mentioned in the clue:
grep "Jeremy Bowers" /home/admin/clmystery/mystery/memberships/Museum_of_Bash_History /home/admin/clmystery/mystery/memberships/Rotary_Club /home/admin/clmystery/mystery/memberships/Delta_SkyMiles /home/admin/clmystery/mystery/memberships/Terminal_City_Library
/home/admin/clmystery/mystery/memberships/Museum_of_Bash_History:Jeremy Bowers
/home/admin/clmystery/mystery/memberships/Delta_SkyMiles:Jeremy Bowers
/home/admin/clmystery/mystery/memberships/Terminal_City_Library:Jeremy Bowers
grep "Joe Germuska" /home/admin/clmystery/mystery/memberships/Museum_of_Bash_History /home/admin/clmystery/mystery/memberships/Rotary_Club /home/admin/clmystery/mystery/memberships/Delta_SkyMiles /home/admin/clmystery/mystery/memberships/Terminal_City_Library
/home/admin/clmystery/mystery/memberships/Museum_of_Bash_History:Joe Germuska
/home/admin/clmystery/mystery/memberships/Rotary_Club:Joe Germuska
/home/admin/clmystery/mystery/memberships/Delta_SkyMiles:Joe Germuska
/home/admin/clmystery/mystery/memberships/Terminal_City_Library:Joe Germuska
We see that only Joe matches all four memberships, so we can conclude this is our guy.
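The comparison can also be reduced to a count per suspect. The sketch below uses stand-in membership files that mirror the grep output above (the real files contain many more names), and a hypothetical helper membership_count.

```shell
# Stand-in membership lists mirroring the grep results above.
members=$(mktemp -d)
printf 'Joe Germuska\nJeremy Bowers\n' > "$members/Museum_of_Bash_History"
printf 'Joe Germuska\n'                > "$members/Rotary_Club"
printf 'Joe Germuska\nJeremy Bowers\n' > "$members/Delta_SkyMiles"
printf 'Joe Germuska\nJeremy Bowers\n' > "$members/Terminal_City_Library"

# -l lists each file with a match once, -x requires a whole-line match,
# -F treats the name as a fixed string; wc -l counts the listed files.
membership_count() {
  grep -lxF "$1" "$members"/* | wc -l | tr -d ' '
}

joe_count=$(membership_count "Joe Germuska")
jeremy_count=$(membership_count "Jeremy Bowers")
echo "Joe: $joe_count of 4, Jeremy: $jeremy_count of 4"

rm -rf "$members"
```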
echo "Joe Germuska" > ~/mysolution
🙋‍♂️ Taipei - Come a-knocking
In this scenario we are going to learn about Port Knocking.
Port knocking is a security technique used to control access to a network service by requiring clients to perform a specific sequence of connection attempts, or “knocks,” on closed ports. Here’s how it works:
- Initial State: All ports on the server are closed.
- Knock Sequence: The client sends a sequence of connection attempts to predefined ports.
- Sequence Recognition: The server monitors and verifies the sequence. If correct, it temporarily opens the desired port.
- Access Granted: The client can access the service, after which the port may be closed again.
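The sequence recognition step can be pictured as a tiny state machine. The following is a toy simulation in plain shell, not a real knock daemon: no packets are sent, the expected sequence is the hypothetical one from the example further down, and the reset-on-wrong-knock rule is a simplifying assumption.

```shell
expected="9012 5678 1234"   # hypothetical knock sequence
port_state="closed"
received=""

# Record one knock and compare the knocks seen so far with the expected sequence.
knock_on() {
  received="${received:+$received }$1"
  case "$expected" in
    "$received")   port_state="open" ;;   # full sequence seen: open the port
    "$received "*) : ;;                   # correct prefix so far: keep waiting
    *)             received="" ;;         # wrong knock: start over
  esac
}

knock_on 9012
knock_on 5678
state_after_two=$port_state
knock_on 1234
echo "after two knocks: $state_after_two, after three: $port_state"
```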
We are given a web server listening on port 80. In this scenario it is just a matter of knocking on that port, but realistically it would take a sequence of ports to enable communication.
# Example
knock <server-ip> 9012 5678 1234
# Solution
knock localhost 80
curl localhost
With the port open, curl localhost now returns a message instead of a “connection refused” error.
➕ Lhasa - Easy Math
We are given a file with two columns. The first column is just a count of scores, and the second column contains the actual scores. We need to calculate the average score.
This is a very good case for the awk command, as we can grab the second column by using its natural delimiter, the blank space.
awk '{print $2}' /home/admin/scores.txt
We get the total number of scores by using NR to keep a current count of the number of input records. Remember that records are usually lines.
The total score is calculated by accumulating the values of the second column with the += operator as awk reads each line.
count=$(awk 'END {print NR}' /home/admin/scores.txt)
sum=$(awk '{sum += $2} END {print sum}' /home/admin/scores.txt)
Finally, to calculate the average, we divide the total score by the number of scores, using bc (basic calculator) to perform the division and printf with %.2f to format the output to two decimal places.
average=$(echo "scale=2; $sum / $count" | bc | awk '{printf "%.2f", $0}')
echo $average > /home/admin/solution
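Note that bc with scale=2 truncates the quotient instead of rounding it. As an alternative sketch, the count, sum, and truncated average can be produced in a single awk call without bc. The scores below are made up so that truncating and rounding give different answers, and floating-point arithmetic could behave differently on values very close to a two-decimal boundary.

```shell
# Made-up scores: sum = 5, count = 3, mean = 1.666...
scores=$(mktemp)
printf '1 1\n2 2\n3 2\n' > "$scores"

# int() drops the fraction after scaling by 100, which truncates to two decimals.
average=$(awk '{sum += $2} END {if (NR > 0) printf "%.2f", int(sum * 100 / NR) / 100}' "$scores")

# For comparison: plain %.2f rounds instead of truncating.
rounded=$(awk '{sum += $2} END {if (NR > 0) printf "%.2f", sum / NR}' "$scores")

echo "truncated: $average, rounded: $rounded"
rm -f "$scores"
```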
🐘 Bucharest - Connecting to Postgres
In this server, we have a PostgreSQL database, and the default connection is not working.
We can verify that the service is running and in a healthy state with:
systemctl --type=service --state=running
Inspecting the PostgreSQL service, we can see the path where the configuration files are located:
systemctl status postgresql@13-main
641 /usr/lib/postgresql/13/bin/postgres -D /var/lib/postgresql/13/main -c config_file=/etc/postgresql/13/main/postgresql.conf
Running the command given in the problem description, we get the following error:
PGPASSWORD=app1user psql -h 127.0.0.1 -d app1 -U app1user -c '\q'
psql: error: FATAL: pg_hba.conf rejects connection for host "127.0.0.1", user "app1user", database "app1", SSL on
FATAL: pg_hba.conf rejects connection for host "127.0.0.1", user "app1user", database "app1", SSL off
Let’s check the file it is complaining about, pg_hba.conf. The pg_hba.conf file is a PostgreSQL configuration file that stands for “PostgreSQL Host-Based Authentication.” This file controls client authentication, i.e., it specifies which users can connect to which databases from which hosts and which authentication methods they must use.
In that file, a rule rejects all host connections. We can change the rule's method to “trust” and restart the service.
sudo su
vi /etc/postgresql/13/main/pg_hba.conf
host all all all reject -> trust
systemctl restart postgresql
Now we can connect using the given credentials.
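The same edit can be scripted instead of done in vi. The sketch below applies it with sed to a temporary stand-in file, not the real /etc/postgresql/13/main/pg_hba.conf; the header comment line in the sample is an assumption. Keep in mind that trust skips password checks entirely, which is acceptable for this exercise but rarely elsewhere.

```shell
# Stand-in for pg_hba.conf containing the rule from the walkthrough.
hba=$(mktemp)
printf '%s\n' '# TYPE DATABASE USER ADDRESS METHOD' 'host all all all reject' > "$hba"

# Rewrite only the catch-all host rule, leaving every other line untouched.
sed -E 's/^(host[[:space:]]+all[[:space:]]+all[[:space:]]+all[[:space:]]+)reject$/\1trust/' \
  "$hba" > "$hba.new"

new_rule=$(grep '^host' "$hba.new")
echo "$new_rule"

rm -f "$hba" "$hba.new"
```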
🕷️ Bilbao - Basic Kubernetes Problems
There’s a Kubernetes Deployment where the pod is not coming up. We need to find the issue and fix it.
It’s convenient to have an alias for kubectl, so let’s create it. Taking a look at the pod details, we can see it has a node selector (disk=ssd) and is failing to be scheduled.
alias k=kubectl
k get pods
NAME READY STATUS RESTARTS AGE
nginx-deployment-67699598cc-zrj6f 0/1 Pending 0 173d
k describe pod nginx-deployment-67699598cc-zrj6f
Node-Selectors: disk=ssd
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 173d default-scheduler 0/2 nodes are available: 1 node(s) didn't match Pod's node affinity/selector, 1 node(s) had untolerated taint {node.kubernetes.io/unreachable: }. preemption: 0/2 nodes are available: 2 Preemption is not helpful for scheduling..
Warning FailedScheduling 2m20s default-scheduler 0/2 nodes are available: 1 node(s) didn't match Pod's node affinity/selector, 1 node(s) had untolerated taint {node.kubernetes.io/unreachable: }. preemption: 0/2 nodes are available: 2 Preemption is not helpful for scheduling..
There is only one node ready, “node1”. Let’s see which labels it has and add the label our pod selects on if it is not present.
k get nodes
NAME STATUS ROLES AGE VERSION
i-02f8e6680f7d5e616 NotReady control-plane,master 173d v1.28.5+k3s1
node1 Ready control-plane,master 173d v1.28.5+k3s1
k get node node1 --show-labels
NAME STATUS ROLES AGE VERSION LABELS
node1 Ready control-plane,master 176d v1.28.5+k3s1 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=k3s,beta.kubernetes.io/os=linux,disk=ssd,kubernetes.io/arch=amd64,kubernetes.io/hostname=node1,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=true,node-role.kubernetes.io/master=true,node.kubernetes.io/instance-type=k3s
k label nodes node1 disk=ssd
k get node node1 -o json | jq '.metadata.labels'
{
"beta.kubernetes.io/arch": "amd64",
"beta.kubernetes.io/instance-type": "k3s",
"beta.kubernetes.io/os": "linux",
"disk": "ssd",
"kubernetes.io/arch": "amd64",
"kubernetes.io/hostname": "node1",
"kubernetes.io/os": "linux",
"node-role.kubernetes.io/control-plane": "true",
"node-role.kubernetes.io/master": "true",
"node.kubernetes.io/instance-type": "k3s"
}
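Checking whether a node already carries a label can be done on the comma-separated LABELS column directly. This is a small sketch with a shortened label string and a hypothetical helper has_label; on a cluster the string could come from k get node node1 --show-labels.

```shell
# Shortened label list in the same comma-separated form as the LABELS column.
node_labels="beta.kubernetes.io/arch=amd64,disk=ssd,kubernetes.io/hostname=node1"

# Wrap both sides in commas so that only a complete key=value entry matches.
has_label() {
  case ",$1," in
    *",$2,"*) return 0 ;;
    *)        return 1 ;;
  esac
}

if has_label "$node_labels" "disk=ssd"; then
  label_status="present"
else
  label_status="missing"   # here one would run: k label nodes node1 disk=ssd
fi
echo "disk=ssd is $label_status"
```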
Even though the pod can now be scheduled on node1, it still does not come up. The reason is that it requests 2GB of memory, which the node cannot provide. We can edit the manifest to request fewer resources; after applying the manifest again, the pod is able to run.
vi manifest.yaml
resources:
  limits:
    memory: 200Mi
    cpu: 100m
  requests:
    cpu: 100m
    memory: 200Mi
k apply -f manifest.yaml
💀 Gitega - Find the Bad Git Commit
Here we can simply list all the commit hashes, check out the commit we want to inspect, and run the tests.
git log --pretty=format:"%h - %an, %ar : %s"
f2e018e - fduran, 6 weeks ago : README.md
47995fc - fduran, 6 weeks ago : README
c21bcc4 - fduran, 6 weeks ago : module name
3657dad - fduran, 6 weeks ago : 6th
96086c8 - fduran, 6 weeks ago : 5th
2e44089 - fduran, 6 weeks ago : 4th
5179399 - fduran, 6 weeks ago : third
641eade - fduran, 6 weeks ago : second
9e80a7e - fduran, 6 weeks ago : first
git checkout 2e44089
go test
.
.
.
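Checking out commits one by one works for nine commits. For longer histories, git bisect automates the search. The sketch below is not the method used above: it builds a throwaway repository whose stand-in test (a grep on a file) starts failing at the fourth commit, then lets git bisect run find that commit. In the scenario the test command would be go test, under the assumption that the oldest commit passes and the newest fails.

```shell
# Throwaway repository with six commits.
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" config user.email "sketch@example.com"
git -C "$repo" config user.name "Sketch"
git -C "$repo" config commit.gpgsign false

for i in 1 2 3 4 5 6; do
  # The stand-in test passes for commits 1-3 and fails from commit 4 onward.
  if [ "$i" -ge 4 ]; then echo "fail" > "$repo/status"; else echo "pass" > "$repo/status"; fi
  echo "$i" > "$repo/counter"
  git -C "$repo" add status counter
  git -C "$repo" commit -q -m "commit $i"
done

first_commit=$(git -C "$repo" rev-list --max-parents=0 HEAD)
expected=$(git -C "$repo" rev-parse HEAD~2)   # the 4th commit

# bisect start <bad> <good>; bisect run treats exit 0 as good, 1 as bad.
git -C "$repo" bisect start HEAD "$first_commit" >/dev/null
git -C "$repo" bisect run grep -q pass status >/dev/null
bad_commit=$(git -C "$repo" rev-parse refs/bisect/bad)
git -C "$repo" bisect reset >/dev/null 2>&1

echo "first bad commit: $bad_commit"
rm -rf "$repo"
```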