
Ultimate checklist for a newly joining fullstack software engineer
I worked as a frontend engineer for quite a while, so when I went back to fullstack, it took some time to pick it up again. One of the challenges was gathering all the information about the setup of the project I joined, in order to be able to perform my duties at a professional level.
So here I've put together a checklist that every newly joining fullstack engineer should go through, to feel confident developing, maintaining and observing a service they are responsible for.
Logs
The first and most important thing is to know where to find logs. You need logs for every microservice, for both Staging and every Live environment in every region. Logs can be managed via Datadog, Loki or Loggly; if that's the case, go there, filter by the service name, set the log level to "Error" and choose a time range of one day.
That way, at any time you can check the logs and quickly see if there are any recent errors.
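For example, in Datadog's Log Explorer the saved query for such a check might look something like this (the service name is just a placeholder):
service:<service_name> status:error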
Alerts
Alerting helps SREs and the rest of the team react to possible outages, so it's essential to have all of that properly rigged. Alerting can be set up via Datadog as well, with all relevant information dumped into dedicated Slack channels, so make sure you have joined them.
Metrics
Metrics typically contain:
- Resource consumption (CPU, Memory, DB Size).
- Requests per second.
- Request duration.
- Query duration.
- Error rate.
- Cost of running the cloud-native application, per month, per quarter, per year.
- Any other custom metrics you might be interested in.
All of these may be rigged up in Grafana; however, Datadog can also be utilised for this. Again, the dashboards should be available for every environment, and the quick links should be saved for easy access.
Tracing
If a project uses something like OpenTelemetry, it may gather information about so-called spans. A span typically corresponds to a sub-routine in the code, and tracing allows you to measure the duration of these. It's essentially like a profiler, but for the backend. Several tools may be used to view the spans, such as Jaeger or Datadog, so make sure you have access to whichever one is in use.
Deployment
If the application is containerised, it is most likely running inside a K8s cluster. Typically there is a Staging cluster and several regional Live clusters. One way or another, with K8s a good option is to use Spinnaker. Spinnaker allows deploying new versions of Docker images into the cluster swiftly and without friction. Whenever you make a release that triggers an image build and, consequently, a Spinnaker pipeline, you might want to go to Spinnaker and see if there were any errors deploying the new image.
Again, the links must be saved for both Staging and Live environments.
GCP dashboard access
The very same thing applies to AWS. But if the infrastructure runs on GCP, at least two things should be done:
- Get access to the GCP panel to see all the resources.
- Set up the gcloud CLI tool in order to perform useful operations from the terminal (see the example below).
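A minimal gcloud setup, assuming a fresh machine (the project name below is a placeholder), usually boils down to authenticating and pointing the CLI at your project:
gcloud auth login
gcloud config set project <project_name>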
Managing clusters
Sometimes it's necessary to interact with the clusters directly. Typically, two tasks are quite frequent:
- Restart a misbehaving container.
- Obtain the logs of a failing container.
Assuming the gcloud CLI tool is already installed and configured, you can get the credentials for a specific cluster and store them locally:
gcloud container clusters get-credentials <cluster_name> --zone <zone_name> --project <project_name>
To see the list of all clusters you already have credentials for, you type:
kubectl config get-contexts
Then, to start working with a specific cluster, you use the following command:
kubectl config use-context <context_name>
Most of the time there are multiple cloud-native applications running on a single cluster, each under its own namespace. You list the namespaces and find the one that holds your project:
kubectl get namespace
Then, to list the pods within a specific namespace:
kubectl -n <namespace> get pod
To get information about the pod:
kubectl -n <namespace> describe pods <pod_name>
Restart a misbehaving container
To restart a container, you scale it down to zero and then back to the regular amount of replicas:
kubectl -n <namespace> scale deploy <container_name> --replicas=0
# wait for some time
kubectl -n <namespace> scale deploy <container_name> --replicas=2
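If the workload is a regular Deployment and your kubectl version supports it, a gentler alternative is a rolling restart, which replaces the pods without scaling to zero:
kubectl -n <namespace> rollout restart deploy <container_name>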
Obtain the logs of a failing container
When you notice that Spinnaker failed to spin up a new version of an image, you go check the container logs:
kubectl -n <namespace> logs -f <pod_name> -c <container_name>
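If the container keeps crashing, the current logs may be empty or cut short; in that case it often helps to look at the logs of the previous (crashed) instance instead:
kubectl -n <namespace> logs <pod_name> -c <container_name> --previous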
There is also an amazing CLI tool called k9s for managing K8s clusters. I mostly use it for two things:
- To read the logs of a staging container, to understand why it crashes.
- To open a shell inside a container ("SSH" into it), in order to do things like reading env variables with printenv (a plain kubectl equivalent is shown below).
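The plain kubectl equivalent for the second point looks roughly like this (sh may need to be replaced with bash, depending on the image):
kubectl -n <namespace> exec -it <pod_name> -c <container_name> -- sh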
Connecting to the databases
You must always have read/write access to the staging database. You'll also need access to all production databases in read-only mode. The best way to do this is to tunnel the connection to a local port.
With GCP it can be done via the cloud_sql_proxy tool. Pick a port that you want to be allocated locally, and then run:
cloud_sql_proxy -instances=<project_name>:<region>:<clouddb_instance_name>=tcp:<local_port>
The best way to automate this would be to create a script, like this one, that allows connecting to a database for an arbitrary country or region:
./dbconn country=<country_name>
So make sure you have one. It is company-specific, so you'll have to write that script yourself (a rough sketch is given below).
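As a purely hypothetical sketch (the country-to-instance mapping, regions and ports below are made up and will differ in your company), such a script can be a thin wrapper around cloud_sql_proxy:
#!/usr/bin/env bash
# dbconn: open a tunnel to the Cloud SQL instance of a given country (hypothetical sketch)
set -euo pipefail
# accept an argument of the form country=<country_name>
ARG="${1:-}"
COUNTRY="${ARG#country=}"
case "$COUNTRY" in
  de) INSTANCE="<project_name>:europe-west3:<clouddb_instance_name>"; PORT=5433 ;;
  us) INSTANCE="<project_name>:us-east1:<clouddb_instance_name>"; PORT=5434 ;;
  *) echo "Usage: ./dbconn country=<country_name>" >&2; exit 1 ;;
esac
cloud_sql_proxy -instances="${INSTANCE}=tcp:${PORT}"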
You can use any client to access the database. It can be DataGrip, but projects like pgAdmin or phpMyAdmin will also do fine if you use Postgres or MySQL respectively. One important note: when connecting to the live database, always connect to the read replica. As mentioned before, if you don't have a read replica (which you should), at least make the connection read-only. You don't want to mess up the production data, do you?
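As a hedged example for Postgres, assuming the tunnel from the previous section is running on <local_port>, you can also make every session read-only by default right from the client side:
PGOPTIONS="-c default_transaction_read_only=on" psql -h 127.0.0.1 -p <local_port> -U <user_name> <db_name>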
Running the application locally
It's good practice to have the option of running the whole cloud-native app on your local machine. In my opinion, it's wise not to rely solely on unit testing and TDD, but also to be able to actually test new features before pushing to Staging.
Docker Desktop or Colima to the rescue, if you are on Mac or Windows. There is also a variety of projects that offer mocking of the most popular cloud services: Localstack for AWS and a handful of projects like gcloud-pubsub-emulator or gcp-storage-emulator for GCP.
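As a quick example, and assuming a recent LocalStack version where 4566 is the edge port, it can be started as a single container:
docker run --rm -it -p 4566:4566 localstack/localstack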
Ask your team members for the .env.local and .env.test files, so you don't have to spend time figuring out the right values on your own. It is a good shortcut.
Dumping the staging database locally
One of the most frequent things that may happen to you is your QA engineer reporting an issue on Staging. Then, without further ado, you can just dump the staging database locally and do the research in a local environment, which is of course transparent and safe. It's better than digging through the logs and trying to figure out the issue. You can even use a debugger if the situation calls for it.
There is always a dumping tool available for your database out there. Use pg_dump for Postgres, mongodump for MongoDB, etc. Make a tunnel to the read replica, dump and then restore locally. You can even make a script to automate this, exactly as I did.
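For Postgres, assuming the tunnel to the read replica is already running on <local_port> and the names below are placeholders, the dump-and-restore round trip looks roughly like this:
pg_dump -h 127.0.0.1 -p <local_port> -U <user_name> -Fc -f staging.dump <db_name>
pg_restore -h localhost -p 5432 -U <local_user_name> -d <local_db_name> --clean --no-owner staging.dump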
So yeah, as you can see, observability is key.
This article is a work in progress, so as soon as I find new relevant information, I am gonna expand the post.

Sergei Gannochenko
React, Node, Go, Docker, AWS, Jamstack.
15+ years in dev.