With the right tools, live debugging is an effective way to troubleshoot production errors in microservices applications. In this codelab, you'll learn how to set up Ozcode's live debugger to troubleshoot microservices in production.

This video provides a brief overview of Ozcode and the tools you will use in this codelab:

https://youtu.be/bx7EWdKohzI

What you'll learn

What you need to get started

For this codelab, we have prepared a sample application containing bugs that you can troubleshoot with Ozcode's live debugger. The application is a night school for superheroes, composed of several Docker microservices that support the following main actions:

Microservices structure

The sample application is composed of the following Docker microservices:

| Service | Technology | URL | Description |
|---|---|---|---|
| ServiceA | .NET Core 3.1 Web API | http://localhost:5001 | Manages course schedules |
| ServiceB | .NET Core 3.1 Web API | http://localhost:5002 | Manages room allocations |
| ServiceAuth | .NET Core 3.1 Web API | http://localhost:5003 | Manages authentication and token generation |
| ServiceUI | .NET Core 3.1 Razor Pages | http://localhost:5000 | UI service |
| ServiceSPA | Node.js, Vue | http://localhost:5080 | SPA UI service; generates traffic and errors in the cluster |
| SEQ | SEQ | http://localhost:5050 | Log aggregation |

Get the Sample Application

To get the sample application, clone the following Git repository and move to the corresponding directory:

git clone https://github.com/oz-code/Nightschool-DockerLab
cd Nightschool-DockerLab

Create an Ozcode Account

To use Ozcode's live debugger to debug your microservices application, you first need to set up an account.

Browse to https://app.oz-code.com/sign-up-check and provide your email address.

On the next screen, fill in the details of your account:

After submitting your details, you should receive an email message at the address you provided. After you confirm your email address, you will be taken to the Sign In screen.

After signing in for the first time, you will be prompted to create your first application.

Create an Application and Configure the Ozcode Agent

An application in Ozcode's live debugger represents the application you want to troubleshoot.

Give your application a name (we will use "DockerLab" in this codelab) and click Create.

Ozcode's live debugger will present options for Platform, OS, and Deployment.

Select My Host, Linux, and Docker.

On the next screen, scroll down to the installation instructions.

NOTE:

These installation instructions tell you to copy the code snippet into your Dockerfile. In our sample application, we have already included the relevant snippet. All you have to do is provide the right agent token by editing the .env file as described below.

Look for the OzCode_Agent_Token variable and copy it to the clipboard.

Back in the Nightschool-DockerLab folder, edit the .env file and replace the placeholder for the OzCode_Agent_Token value with the one you copied.
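If you prefer to script this edit rather than make it by hand, the following is a minimal sketch. It assumes the .env file uses plain KEY=value lines and that the key is spelled exactly OzCode_Agent_Token; the helper names set_env_value and update_env_file are hypothetical and are not part of the sample repository.

```python
from pathlib import Path


def set_env_value(text, key, value):
    """Return .env text with `key` set to `value` (appended if the key is absent).

    Assumes simple KEY=value lines; all other lines are left untouched.
    """
    lines = text.splitlines()
    found = False
    for i, line in enumerate(lines):
        name, sep, _ = line.partition("=")
        if sep and name.strip() == key:
            lines[i] = f"{key}={value}"
            found = True
    if not found:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


def update_env_file(path, key, value):
    """Rewrite the file at `path` in place using set_env_value."""
    env_path = Path(path)
    env_path.write_text(set_env_value(env_path.read_text(), key, value))
```

For example, calling update_env_file(".env", "OzCode_Agent_Token", token) from the Nightschool-DockerLab folder, where token holds the string you copied from Ozcode, would perform the replacement described above.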

To run the sample application, navigate to your Nightschool-DockerLab folder and run:

docker-compose up

This downloads several images from Docker Hub and starts your local cluster. When it finishes, the Night School application should be up and running. Navigate to http://localhost:5080.

This is what you should see:

If you now check back with Ozcode's live debugger, it will indicate that your agent is up and running alongside the Night School application.
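To confirm that every service listed under Microservices structure is answering on its port, you could run a quick probe like the sketch below. The URLs are copied from that list; the function names are hypothetical, and treating any HTTP response (including an error status) as "reachable" is an assumption made here, since this codelab does not document health endpoints for the services.

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen

# Service names and URLs as listed under "Microservices structure".
SERVICES = {
    "ServiceUI": "http://localhost:5000",
    "ServiceA": "http://localhost:5001",
    "ServiceB": "http://localhost:5002",
    "ServiceAuth": "http://localhost:5003",
    "SEQ": "http://localhost:5050",
    "ServiceSPA": "http://localhost:5080",
}


def is_reachable(url, timeout=2.0, opener=urlopen):
    """Return True if something answers HTTP at `url`, even with an error status."""
    try:
        with opener(url, timeout=timeout):
            return True
    except HTTPError:
        # The server responded, just not with a success status.
        return True
    except (URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return False


def check_all(services=SERVICES, probe=is_reachable):
    """Map each service name to True/False depending on whether it answered."""
    return {name: probe(url) for name, url in services.items()}
```

Calling check_all() after docker-compose up has finished returns a dictionary such as {"ServiceA": True, ...}; any False entry points at a container that has not started or is not listening on the expected port.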

The Lab

There are two main parts to this lab:

  1. Debugging an error that generates an exception
  2. Debugging a logical error with tracepoints and dynamic logging

Exercise 1: Debugging an Error that Generates an Exception

1.1 Generate the error

  1. Navigate to http://localhost:5080 to see the main screen of the application.
  2. Click on the Courses tab to see all available courses. None of them are scheduled yet.
  3. Click the Schedule button for course INTRO-001. The application will create a schedule for the course, and the record will turn green.
  4. Do the same for course LEGAL-001. It, too, will turn green. Amazing!
  5. Now try to create a schedule for HAND-001. This will generate an error, and an alert will pop up.

    Click OK to dismiss the alert.

1.2 Capturing the Exception

  1. Go back to Ozcode's live debugger and click the Ozcode logo in the top left corner to view the Exception Dashboard. You should see two exceptions in Capturing mode.

INFO

Ozcode's live debugger captures exceptions in stages. The first time an exception is thrown, it is in a "Capturing" state. In this state, you get some initial data but are not yet able to debug the exception. The second time the exception is thrown, it is in a "Captured" state. In this state, you can start debugging the exception. If you request a full time-travel capture, next time the exception is thrown, you will have full time-travel debug information and can step through the complete execution flow of the error with full details. For more information about capturing exceptions with Ozcode, see Capturing Exceptions.

Your screen should look like this:

NOTE:

If you don't see exceptions, click the Refresh button to update the dashboard.

  2. Now go back to the Night School application and generate the bug again by clicking the Schedule button for HAND-001.
  3. Refresh the Ozcode dashboard, and you should see that the exceptions are now in a "Captured" state.
  4. Select one of the exceptions and, in the Code Preview panel that appears, click Debug.
  5. You should now see the Debug screen showing primary debug information for this exception.

    In the next section, you will generate a full time-travel capture for this exception.

1.3 Capturing full time-travel information

  1. Click the Full Capture button at the top of the Debug screen.
  2. Now go back to the Night School application and once again click the Schedule button for HAND-001. As expected (errors don't just fix themselves), you will see the alert again.
  3. In Ozcode, go back to the Exception Dashboard. The Status column for the exception should now indicate "Fully Captured." Select the exception, and click Debug again.

INFO

The exception whose message is "No available rooms with required features" is the best one for exploring the time-travel debugging features.
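The three capture stages you walked through in this exercise ("Capturing", "Captured", and "Fully Captured") can be summarized as a small state machine. The sketch below is only a toy model of the behavior described in this codelab; the names CaptureState and next_state are hypothetical, and it does not reflect how Ozcode implements capturing internally.

```python
from enum import Enum


class CaptureState(Enum):
    CAPTURING = "Capturing"            # first throw: initial data only
    CAPTURED = "Captured"              # second throw: debugging is available
    FULLY_CAPTURED = "Fully Captured"  # throw after Full Capture was requested


def next_state(state, full_capture_requested=False):
    """Return the state after the exception is thrown once more.

    `state` is None for an exception that has never been seen.
    """
    if state is None:
        return CaptureState.CAPTURING
    if state is CaptureState.CAPTURING:
        return CaptureState.CAPTURED
    if state is CaptureState.CAPTURED and full_capture_requested:
        return CaptureState.FULLY_CAPTURED
    return state
```

In this model, clicking Schedule for HAND-001 three times, with Full Capture requested before the third click, moves the exception from None through all three states in order.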

Exercise 2: Debugging a Logical Error with Tracepoints and Dynamic Logging

NOTE

For this exercise to work, you must first complete Exercise 1.

2.1 Generate the logical error

  1. After completing Exercise 1, both INTRO-001 and LEGAL-001 should have courses scheduled. If this is not the case, please restart your cluster and redo Exercise 1.
  2. To generate a logical error, go back to the Night School application and try to schedule rooms for FLIGHT-001. The row for that course will be highlighted in yellow to indicate a problem with scheduling. You'll see that rooms were assigned, but no students were registered.

2.2 Create a tracepoint session for debugging

  1. In the Ozcode dashboard, select Tracepoint Sessions.
  2. In the Tracepoints screen, click Create in the top right corner and give your session a name.
  3. In the Code Explorer panel, drill down into ServiceA to find the class ServiceA.Controllers.CoursesController.
  4. Select function ResolveSchedule.
  5. In the Code panel, click to the left of the line that says course.Schedule = newSchedule; to add a tracepoint.
  6. Add the following message to the tracepoint:
    Found new schedule for course {id}

    This is what your screen should now look like:

    Click the green tick to set the tracepoint.
  7. Click Start Collecting in the top right corner of the screen.
  8. Go back to the Night School application and once again click Schedule for FLIGHT-001.
  9. Back in Ozcode, you should now see a tracepoint hit.
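The tracepoint message in step 6 works like a log template: the name in braces is replaced by the value of that variable when the tracepoint is hit. As a rough analogy only, the Python sketch below renders the same template with str.format_map. The function name and the sample values are hypothetical, and this is not Ozcode's actual expression evaluator.

```python
def render_tracepoint_message(template, local_vars):
    """Fill {name} placeholders in `template` from a dict of local variables."""
    return template.format_map(local_vars)


# Hypothetical value for `id`; the real value comes from the running service.
message = render_tracepoint_message(
    "Found new schedule for course {id}", {"id": "FLIGHT-001"}
)
print(message)
```

Each tracepoint hit in the Ozcode UI shows the rendered message, so choosing placeholders that name the variables you care about is what makes the hits useful.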

2.3 Determine the root cause of the error

Add more tracepoints to the code in Ozcode. Keep clicking Schedule for FLIGHT-001 in Night School to generate more tracepoint hits and work out why the application can't schedule classes for that course.

HINT

To schedule a class, the application must find an available room that has the capacity to handle the number of students registered for the course. Add some tracepoints in FindAvailableRooms (under ServiceA.Controllers.CoursesController). If you still don't get it, go to 2.5 Root Cause at the end of this lab where we spill the beans.

NOTE

You can add as many tracepoints as you like. Try to put one at ServiceB.Controllers.RoomsController to see how the transaction moves from one microservice to the other. To see all tracepoints related to a transaction, add a column to the tracepoint hits table and display CorrelationID-Ozcode.
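The value of the CorrelationID-Ozcode column is that hits from different microservices sharing the same ID belong to one transaction. The sketch below illustrates that grouping idea on hypothetical data: the hit dictionaries, their "service" and "message" fields, and the ID values are invented for illustration and do not represent an Ozcode export format.

```python
from collections import defaultdict


def group_by_correlation(hits, key="CorrelationID-Ozcode"):
    """Group tracepoint hits (dicts) by their correlation ID, preserving order."""
    groups = defaultdict(list)
    for hit in hits:
        groups[hit[key]].append(hit)
    return dict(groups)


# Hypothetical hits from two transactions spanning ServiceA and ServiceB.
hits = [
    {"CorrelationID-Ozcode": "abc", "service": "ServiceA", "message": "resolve schedule"},
    {"CorrelationID-Ozcode": "abc", "service": "ServiceB", "message": "list rooms"},
    {"CorrelationID-Ozcode": "def", "service": "ServiceA", "message": "resolve schedule"},
]

for correlation_id, group in group_by_correlation(hits).items():
    print(correlation_id, [h["service"] for h in group])
```

Reading the hits for a single correlation ID in order shows how one Schedule click travels from ServiceA to ServiceB.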

2.4 The Brave New World of Dynamic Logging

Displaying dynamic logs is not limited to the Ozcode UI. Using the Ozcode.ProductionDebugger.Client NuGet package, you can transform tracepoint hits into log entries in the framework of your choice without having to redeploy your code. Let's see this in action using SEQ.

  1. Go to http://localhost:5050 to open SEQ.
  2. This service is configured to collect the logs generated by all the other microservices.
  3. Search for "Tracepoint is not null".
  4. Expand the Tracepoint property and copy/paste the TracepointUrl to open the exact tracepoint hit that caused that log line to be created.

2.5 Root Cause

If you're still struggling with our logical bug, here's why you couldn't schedule rooms for the FLIGHT-001 course.

Set the following tracepoint in the last line of FindAvailableRooms:
desiredRoomCapacity = {desiredRoomCapacity}

If you now try to schedule again, you'll find the following tracepoint hit.

You'll notice that 20 students are registered for this course, and you need a room with a large enough capacity.

If you examine the Locals for that tracepoint hit, you'll see that 12 rooms were found.

But if you drill down into each of those rooms, none of them has the capacity for the 20 students registered for the course (maxCapacity < 20 for all rooms).
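The reasoning above can be checked with a small worked example. The sketch below applies a capacity check to twelve rooms whose capacities are all below 20. The room names and capacity values are made up for illustration (only the count of 12 rooms, the maxCapacity field name, and the requirement of 20 come from this lab), and the function is not the code of FindAvailableRooms.

```python
def rooms_with_capacity(rooms, desired_room_capacity):
    """Return the rooms whose maxCapacity can hold the desired number of students."""
    return [r for r in rooms if r["maxCapacity"] >= desired_room_capacity]


# Hypothetical capacities: twelve rooms, every one smaller than 20.
capacities = [8, 10, 12, 15, 18, 10, 12, 15, 8, 19, 16, 14]
rooms = [
    {"name": f"Room {i}", "maxCapacity": cap}
    for i, cap in enumerate(capacities, start=1)
]

desired_room_capacity = 20  # students registered for FLIGHT-001
suitable = rooms_with_capacity(rooms, desired_room_capacity)
print(f"{len(rooms)} rooms found, {len(suitable)} large enough")
```

With no room able to hold 20 students, there is nothing suitable to assign, which matches the scheduling problem observed for FLIGHT-001.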