Jaeger Trace MDC
1. Solution Overview
1.1 Background and Objectives
In distributed systems, tracing a request across services is a key means of locating problems. This solution integrates Jaeger distributed tracing with SLF4J MDC (Mapped Diagnostic Context) so that:
- Logs automatically carry traceId/spanId, which makes aggregated log queries straightforward
- Trace context is passed across services
- Trace context stays accurate in asynchronous scenarios such as thread pools
1.2 Core Value
- Faster problem location: all logs belonging to one request can be correlated across the distributed system by traceId
- No manual MDC handling: business code does not manage MDC keys itself; injection and cleanup happen when a Scope is activated and closed
- Thread safety: nested Spans and thread pool reuse are supported
2. Architecture Design
2.1 Overall Architecture Diagram
Application Layer
    Controller (HTTP Entry) --> Service (Business) --> DAO (Database)
        |
        v
Tracing Layer
    CustomMDCScopeManager (Core Component)
        ThreadLocal<CustomMDCScope>
            - Manage Scope Lifecycle
            - Maintain Span Stack
        CustomMDCScope (Scope Impl)
            - MDC Snapshot and Restore
            - Support Nested Span
        |
        v
Logging Layer
    SLF4J MDC
        traceId: 04bf92f3577b34da...
        spanId: 36bd32b7a5712a1a
        parentId: 00f067aa0ba902b7
        |
        v
Jaeger Collector (Trace Storage)
2.2 Core Component Description
2.2.1 CustomMDCScopeManager
Responsibilities: Implement OpenTracing's ScopeManager interface and manage Span activation and propagation.
Key Implementation:
    // Top of the current thread's Scope stack; null when no Span is active
    private final ThreadLocal<CustomMDCScope> tlsScope = new ThreadLocal<>();

    @Override
    public Scope activate(Span span) {
        // The CustomMDCScope constructor registers itself in tlsScope and writes MDC
        return new CustomMDCScope(span);
    }

    @Override
    public Span activeSpan() {
        CustomMDCScope scope = tlsScope.get();
        return scope == null ? null : scope.wrapped;
    }
Design Points:
- Use ThreadLocal to ensure thread isolation
- Support Scope nesting (each Scope keeps a reference to the previous one, forming a linked list)
- Implement lazy activation: MDC is injected only when activate() is called
2.2.2 CustomMDCScope
Responsibilities: Implement OpenTracing's Scope interface and manage the lifecycle of a single Span.
Core Mechanism: Snapshot-Inject-Restore
    // Inner class of CustomMDCScopeManager; field declarations are omitted here
    CustomMDCScope(Span span) {
        this.wrapped = span; // read by CustomMDCScopeManager.activeSpan()
        // 1. Save current MDC snapshot
        this.previousTraceId = MDC.get("traceId");
        this.previousSpanId = MDC.get("spanId");
        // 2. Establish linked list relationship (support nesting)
        this.previous = CustomMDCScopeManager.this.tlsScope.get();
        CustomMDCScopeManager.this.tlsScope.set(this);
        // 3. Inject new trace context
        MDC.put("traceId", span.context().toTraceId());
        MDC.put("spanId", span.context().toSpanId());
    }

    @Override
    public void close() {
        // 4. Restore to previous Scope
        CustomMDCScopeManager.this.tlsScope.set(previous);
        // 5. Restore MDC to snapshot state (restoreMDC is a helper not shown here)
        restoreMDC("traceId", previousTraceId);
        restoreMDC("spanId", previousSpanId);
    }
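The restoreMDC helper called in close() is not shown in the source. A minimal sketch of what it plausibly does follows, assuming a null snapshot means "the key was absent before the Scope opened". To stay runnable on the JDK alone, the hypothetical MDC_STAND_IN map replaces org.slf4j.MDC; in real code the two branches would call MDC.remove(key) and MDC.put(key, value).

```java
import java.util.HashMap;
import java.util.Map;

/** JDK-only sketch of the restore step; not the author's implementation. */
final class MdcRestoreSketch {

    // Hypothetical per-thread map standing in for SLF4J's MDC.
    private static final ThreadLocal<Map<String, String>> MDC_STAND_IN =
            ThreadLocal.withInitial(HashMap::new);

    static String get(String key) {
        return MDC_STAND_IN.get().get(key);
    }

    static void put(String key, String value) {
        MDC_STAND_IN.get().put(key, value);
    }

    /**
     * Restores one key to its snapshot value. A null snapshot means the key
     * did not exist before, so it is removed instead of being set to null.
     */
    static void restoreMDC(String key, String previousValue) {
        if (previousValue == null) {
            MDC_STAND_IN.get().remove(key);
        } else {
            MDC_STAND_IN.get().put(key, previousValue);
        }
    }
}
```

Removing the key rather than writing null matters with the real MDC as well, because SLF4J only permits null values when the underlying logging backend supports them.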
3. Key Process Design
3.1 Process of Receiving Remote Trace Context
HTTP Request Headers
    uber-trace-id: 4bf92f3577b...
        |
        v
1. Parse HTTP Header
    - Extract traceId (128-bit)
    - Extract parentSpanId (64-bit)
    - Extract flags (sampling flag)
        |
        v
2. Construct JaegerSpanContext
    long[] parts = splitTraceId()
    new JaegerSpanContext(
        traceIdHigh,
        traceIdLow,
        parentSpanId,
        parentOfParentId,
        flags
    )
        |
        v
3. Create Child Span
    tracer.buildSpan("child-span")
        .asChildOf(parentContext)
        .withTag(...)
        .start()
        |
        v
4. Activate Span on Current Thread
    try (Scope scope =
            tracer.scopeManager()
                .activate(childSpan))
    {
        // MDC Auto Injection
        // Business Logic Execution
    } // MDC Auto Cleanup
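Step 1 of the flow above can be sketched as follows. This assumes the Jaeger propagation layout `traceId:spanId:parentSpanId:flags` with every field hex-encoded. The class name UberTraceIdSketch and its fields are hypothetical, it does not handle URL-encoded separators, and in production the tracer's own extract() call is usually the safer route.

```java
/** Sketch: parse an uber-trace-id header value ("traceId:spanId:parentId:flags"). */
final class UberTraceIdSketch {
    final long traceIdHigh;
    final long traceIdLow;
    final long spanId;    // remote Span; becomes the parent of the Span created in step 3
    final long parentId;  // parent of the remote Span (parentOfParentId in the flow above)
    final byte flags;

    private UberTraceIdSketch(long traceIdHigh, long traceIdLow,
                              long spanId, long parentId, byte flags) {
        this.traceIdHigh = traceIdHigh;
        this.traceIdLow = traceIdLow;
        this.spanId = spanId;
        this.parentId = parentId;
        this.flags = flags;
    }

    /** Returns null for a missing or malformed header so the caller can start a root Span. */
    static UberTraceIdSketch parse(String headerValue) {
        if (headerValue == null) {
            return null;
        }
        String[] parts = headerValue.trim().split(":");
        if (parts.length != 4) {
            return null;
        }
        String traceId = parts[0];
        if (traceId.isEmpty() || traceId.length() > 32) {
            return null;
        }
        try {
            // The last 16 hex characters are the low 64 bits; anything before is the high half.
            int split = Math.max(0, traceId.length() - 16);
            long high = split == 0 ? 0L : Long.parseUnsignedLong(traceId.substring(0, split), 16);
            long low = Long.parseUnsignedLong(traceId.substring(split), 16);
            long span = Long.parseUnsignedLong(parts[1], 16);
            long parent = Long.parseUnsignedLong(parts[2], 16);
            byte flagBits = (byte) Integer.parseInt(parts[3], 16);
            return new UberTraceIdSketch(high, low, span, parent, flagBits);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    /** Bit 0x01 is the "sampled" flag. */
    boolean isSampled() {
        return (flags & 0x01) != 0;
    }
}
```

Returning null instead of throwing keeps a bad upstream header from failing the request; the handler simply starts a new trace.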
3.2 MDC Management Process for Nested Spans
Timeline (top to bottom):

Initial state
    ThreadLocal stack: null

activate(spanA)
    ThreadLocal stack: ScopeA {traceId: xxx, spanId: A, previous: null}
    MDC: {traceId: xxx, spanId: A}

activate(spanB)    // nested call
    ThreadLocal stack: ScopeB {traceId: xxx, spanId: B, previous: ScopeA}
    MDC: {traceId: xxx, spanId: B}    // spanId updated

    // Business code execution
    log.info("Processing...")    // log carries spanId=B

close(ScopeB)
    ThreadLocal stack: ScopeA {traceId: xxx, spanId: A, previous: null}
    MDC: {traceId: xxx, spanId: A}    // restored to spanId=A

close(ScopeA)
    ThreadLocal stack: null
    MDC: {}    // completely cleared
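The timeline above can be replayed with a small JDK-only simulation. This is a sketch, not the CustomMDCScope source: the MDC map, the ACTIVE stack pointer, and the string-based Scope are hypothetical stand-ins for SLF4J MDC, tlsScope, and real Spans.

```java
import java.util.HashMap;
import java.util.Map;

/** JDK-only replay of the nested activate/close timeline. */
final class NestedScopeSketch {

    // Hypothetical stand-in for SLF4J MDC.
    static final ThreadLocal<Map<String, String>> MDC = ThreadLocal.withInitial(HashMap::new);
    // Top of the per-thread Scope stack (plays the role of tlsScope).
    static final ThreadLocal<Scope> ACTIVE = new ThreadLocal<>();

    static final class Scope implements AutoCloseable {
        private final Scope previous;
        private final String previousTraceId;
        private final String previousSpanId;

        Scope(String traceId, String spanId) {
            Map<String, String> mdc = MDC.get();
            // Snapshot
            this.previousTraceId = mdc.get("traceId");
            this.previousSpanId = mdc.get("spanId");
            // Link into the stack
            this.previous = ACTIVE.get();
            ACTIVE.set(this);
            // Inject
            mdc.put("traceId", traceId);
            mdc.put("spanId", spanId);
        }

        @Override
        public void close() {
            ACTIVE.set(previous);
            restore("traceId", previousTraceId);
            restore("spanId", previousSpanId);
        }

        private static void restore(String key, String value) {
            if (value == null) {
                MDC.get().remove(key);
            } else {
                MDC.get().put(key, value);
            }
        }
    }

    /** Records the spanId visible at each step, then "-" if MDC ends up empty. */
    static String replayTimeline() {
        StringBuilder seen = new StringBuilder();
        try (Scope a = new Scope("xxx", "A")) {
            seen.append(MDC.get().get("spanId"));
            try (Scope b = new Scope("xxx", "B")) {
                seen.append(MDC.get().get("spanId"));
            }
            seen.append(MDC.get().get("spanId"));
        }
        seen.append(MDC.get().isEmpty() ? "-" : "?");
        return seen.toString();
    }
}
```

Because each Scope is closed by try-with-resources, the restore step also runs when the business code throws, which is what keeps pooled threads from retaining a stale traceId.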
3.3 Trace ID Split Algorithm
Jaeger supports 128-bit trace IDs, but a Java long holds only 64 bits, so the ID must be split into a high half and a low half:
    /**
     * Split hexadecimal traceId string into high 64 bits and low 64 bits
     *
     * Example:
     * Input: "04bf92f3577b34da63ce929d0e0e4736" (32 hex characters = 128 bit)
     * Output: [0x04bf92f3577b34da, 0x63ce929d0e0e4736]
     */
    public static long[] splitTraceId(String traceIdHex) {
        // parseLong is a helper not shown here; it must parse unsigned hexadecimal
        if (traceIdHex.length() <= 16) {
            // Only low 64 bits, high bits are 0
            return new long[]{0L, parseLong(traceIdHex)};
        } else {
            // Split into high 64 bits and low 64 bits
            String highHex = traceIdHex.substring(0, traceIdHex.length() - 16);
            String lowHex = traceIdHex.substring(traceIdHex.length() - 16);
            return new long[]{parseLong(highHex), parseLong(lowHex)};
        }
    }
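The parseLong helper used by splitTraceId is not shown in the source. One detail is easy to get wrong: Long.parseLong(hex, 16) rejects any 16-digit value whose top bit is set, because it exceeds the signed range. The sketch below assumes the helper should therefore use the unsigned parser; the class name and the input validation are additions for illustration.

```java
/** Sketch of splitTraceId with an unsigned hex parser. */
final class TraceIdSplitSketch {

    /** Hypothetical parseLong helper: unsigned parse, so "ffffffffffffffff" is accepted. */
    static long parseLong(String hex) {
        return Long.parseUnsignedLong(hex, 16);
    }

    /** Returns {high 64 bits, low 64 bits} of a 1 to 32 character hex trace ID. */
    static long[] splitTraceId(String traceIdHex) {
        if (traceIdHex == null || traceIdHex.isEmpty() || traceIdHex.length() > 32) {
            throw new IllegalArgumentException("traceId must be 1 to 32 hex characters");
        }
        if (traceIdHex.length() <= 16) {
            // Only the low half is present; the high half is zero.
            return new long[] {0L, parseLong(traceIdHex)};
        }
        int split = traceIdHex.length() - 16;
        return new long[] {
            parseLong(traceIdHex.substring(0, split)),
            parseLong(traceIdHex.substring(split))
        };
    }
}
```

A low half such as ffffffffffffffff comes back as -1L, which is the same 64-bit pattern interpreted as a signed value.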
4. Configuration and Integration
4.1 Tracer Initialization Configuration
    static Tracer tracer = new Configuration("order-service")
        .withSampler(
            new Configuration.SamplerConfiguration()
                .withType("const")
                .withParam(1) // Sampling Rate 100%
        )
        .withReporter(
            new Configuration.ReporterConfiguration()
                .withLogSpans(true) // Enable log output in development environment
                .withSender(
                    new Configuration.SenderConfiguration()
                        .withAgentHost("localhost")
                        .withAgentPort(6831)
                )
        )
        .getTracerBuilder()
        .withScopeManager(new CustomMDCScopeManager()) // Key: Inject Custom Manager
        .build();
4.2 Logback Configuration
    <configuration>
        <appender name="CONSOLE" class="ch.qos.logback.core.ConsoleAppender">
            <encoder>
                <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} [traceId=%X{traceId} spanId=%X{spanId}] - %msg%n</pattern>
            </encoder>
        </appender>
        <appender name="JSON" class="ch.qos.logback.core.FileAppender">
            <file>logs/app.json</file>
            <encoder class="net.logstash.logback.encoder.LogstashEncoder">
                <includeMdcKeyName>traceId</includeMdcKeyName>
                <includeMdcKeyName>spanId</includeMdcKeyName>
                <includeMdcKeyName>parentId</includeMdcKeyName>
            </encoder>
        </appender>
        <root level="INFO">
            <appender-ref ref="CONSOLE" />
            <appender-ref ref="JSON" />
        </root>
    </configuration>
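4.3 Spring Boot Integration (Optional)
    @Configuration
    public class TracingConfig {
        private final Environment env; // org.springframework.core.env.Environment

        public TracingConfig(Environment env) {
            this.env = env;
        }

        @Bean
        public Tracer jaegerTracer() {
            // Fully qualified because the simple name Configuration is already
            // taken by Spring's @Configuration annotation in this file
            return new io.jaegertracing.Configuration(
                    env.getProperty("spring.application.name", "unknown-service")
                )
                // samplerConfig() and reporterConfig() are helpers not shown here
                .withSampler(samplerConfig())
                .withReporter(reporterConfig())
                .getTracerBuilder()
                .withScopeManager(new CustomMDCScopeManager())
                .build();
        }

        @Bean
        public TracingFilter tracingFilter(Tracer tracer) {
            return new TracingFilter(tracer);
        }
    }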
5. Usage Scenarios and Best Practices
5.1 Receive Upstream Trace Context
    @RestController
    public class OrderController {
        @GetMapping("/order/{id}")
        public Order getOrder(
            @PathVariable String id,
            @RequestHeader(value = "uber-trace-id", required = false) String uberTraceId
        ) {
            // parseUberTraceId is a helper not shown here; it should return null when the
            // header is absent, in which case asChildOf(null) is a no-op and a root Span starts
            JaegerSpanContext parentContext = parseUberTraceId(uberTraceId);
            Span span = tracer.buildSpan("get-order")
                .asChildOf(parentContext) // Key: Link Remote Parent Span
                .withTag(Tags.SPAN_KIND, Tags.SPAN_KIND_SERVER)
                .withTag("order.id", id)
                .start();
            try (Scope scope = tracer.scopeManager().activate(span)) {
                log.info("Processing order request"); // Automatically carries traceId/spanId
                return orderService.getById(id);
            } finally {
                span.finish();
            }
        }
    }
5.2 Pass Trace Context Downstream
    public class PaymentClient {
        public void processPayment(String orderId) {
            Span span = tracer.buildSpan("call-payment-service")
                .withTag(Tags.SPAN_KIND, Tags.SPAN_KIND_CLIENT)
                .start();
            try (Scope scope = tracer.scopeManager().activate(span)) {
                HttpHeaders headers = new HttpHeaders();
                // Inject trace context to HTTP Header
                tracer.inject(
                    span.context(),
                    Format.Builtin.HTTP_HEADERS,
                    new HttpHeadersCarrier(headers)
                );
                restTemplate.exchange(
                    "http://payment-service/pay",
                    HttpMethod.POST,
                    new HttpEntity<>(paymentRequest, headers),
                    PaymentResponse.class
                );
            } finally {
                span.finish();
            }
        }
    }
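To make the injected header concrete, here is a rough sketch of the value that ends up under uber-trace-id, assuming the `traceId:spanId:parentSpanId:flags` hex layout. The class and method names are hypothetical, and the zero-padding of the trace and span IDs is an assumption of this sketch; the client library may choose to omit leading zeros. In real code, tracer.inject() produces the value.

```java
/** Sketch: build an uber-trace-id header value from raw IDs. */
final class UberTraceIdFormatSketch {

    static String format(long traceIdHigh, long traceIdLow,
                         long spanId, long parentId, byte flags) {
        // %x renders a long as unsigned two's-complement hex, so negative values are safe.
        String traceId = traceIdHigh == 0
                ? String.format("%016x", traceIdLow)
                : String.format("%016x%016x", traceIdHigh, traceIdLow);
        return traceId
                + ":" + String.format("%016x", spanId)
                + ":" + Long.toHexString(parentId)          // "0" for a root Span
                + ":" + Integer.toHexString(flags & 0xff);  // 1 = sampled
    }
}
```

The downstream service sees this string in its request headers and can link its own Span to the caller's spanId.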
5.3 Async Scenario Handling
    @Service
    public class AsyncOrderService {
        @Autowired
        private Tracer tracer;
        @Autowired
        private ExecutorService executorService;

        public void processOrderAsync(String orderId) {
            // Capture the active Span on the calling thread (null if none is active)
            Span parentSpan = tracer.activeSpan();
            executorService.submit(() -> {
                // Create a child of the captured Span and activate it on the worker thread
                Span asyncSpan = tracer.buildSpan("async-process")
                    .asChildOf(parentSpan)
                    .start();
                try (Scope scope = tracer.scopeManager().activate(asyncSpan)) {
                    log.info("Async processing order"); // MDC Correctly Injected
                    // Business Logic
                } finally {
                    asyncSpan.finish();
                }
            });
        }
    }
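The same capture-then-activate idea can be packaged as a reusable task wrapper. The sketch below is JDK-only and hypothetical: CONTEXT stands in for the logging context, and wrap() copies it at submission time, installs the copy on the worker thread, and puts the worker's previous context back afterwards. With SLF4J, the corresponding calls would be MDC.getCopyOfContextMap() and MDC.setContextMap().

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Sketch: hand a context snapshot to a pooled thread explicitly. */
final class ContextHandoffSketch {

    // Hypothetical stand-in for the per-thread logging context.
    static final ThreadLocal<Map<String, String>> CONTEXT = ThreadLocal.withInitial(HashMap::new);

    /** Captures the caller's context now; the returned task restores the worker's own context when done. */
    static Runnable wrap(Runnable task) {
        Map<String, String> captured = new HashMap<>(CONTEXT.get());
        return () -> {
            Map<String, String> previous = CONTEXT.get();
            CONTEXT.set(new HashMap<>(captured));
            try {
                task.run();
            } finally {
                CONTEXT.set(previous);
            }
        };
    }

    /** Returns {traceId seen inside the wrapped task, traceId left on the worker afterwards}. */
    static String[] demo() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        String[] result = new String[2];
        try {
            CONTEXT.get().put("traceId", "abc123");
            pool.submit(wrap(() -> { result[0] = CONTEXT.get().get("traceId"); }))
                .get(5, TimeUnit.SECONDS);
            // A plain task on the same worker must not see the earlier traceId.
            pool.submit(() -> { result[1] = CONTEXT.get().get("traceId"); })
                .get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            CONTEXT.get().remove("traceId");
            pool.shutdownNow();
        }
        return result;
    }
}
```

The second task runs on the same pooled thread and sees no traceId, which is the property that prevents crosstalk between requests.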
6. Key Design Decisions
6.1 Why not operate MDC directly in business code?
Problem: With manual management it is easy to miss a cleanup call, which leaves a stale traceId on pooled threads that are later reused by other requests.
Solution: Tie MDC to the Scope lifecycle managed by ScopeManager: inject at activate() and clean up automatically at close().
6.2 Why save an MDC snapshot?
Scenario: When Spans are nested, the parent Span's MDC values must be restored after the child Span ends.
Implementation:
// When entering child Span
this.previousTraceId = MDC.get("traceId"); // Save snapshot
MDC.put("traceId", childSpan.context().toTraceId()); // Overwrite
// When exiting child Span
restoreMDC("traceId", previousTraceId); // Restore snapshot
6.3 Why use ThreadLocal instead of InheritableThreadLocal?
Consideration:
- ThreadLocal: Strict thread isolation, suitable for synchronous scenarios
- InheritableThreadLocal: A child thread inherits the parent thread's values, but only when the thread is created, so context leakage is prone to occur when thread pools reuse threads
Recommendation: Pass the Span explicitly in asynchronous scenarios rather than relying on automatic inheritance.
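The leakage mentioned above can be reproduced with the JDK alone. In this sketch (class and variable names are hypothetical), a single-thread pool creates its worker during the first submit, so the worker copies the value of "request-1" and keeps it for every later task, even after the submitting thread has moved on to "request-2".

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Sketch: InheritableThreadLocal values go stale on reused pool threads. */
final class InheritableLeakSketch {

    static final InheritableThreadLocal<String> TRACE_ID = new InheritableThreadLocal<>();

    /** Returns the trace ID observed by two tasks submitted for two different requests. */
    static String[] demo() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        String[] seen = new String[2];
        try {
            TRACE_ID.set("request-1");
            // The worker thread is created here and inherits "request-1".
            pool.submit(() -> { seen[0] = TRACE_ID.get(); }).get(5, TimeUnit.SECONDS);

            TRACE_ID.set("request-2");
            // The worker is reused; inheritance happens only at thread creation.
            pool.submit(() -> { seen[1] = TRACE_ID.get(); }).get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            TRACE_ID.remove();
            pool.shutdownNow();
        }
        return seen;
    }
}
```

Both entries of the returned array are "request-1": the second task logs under the wrong trace, which is exactly the crosstalk that explicit Span passing avoids.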
7. Monitoring and Debugging
7.1 Log Output Example
21:45:32.123 [http-nio-8080-exec-1] INFO c.w.OrderService [traceId=04bf92f3577b34da63ce929d0e0e4736 spanId=36bd32b7a5712a1a] - Processing order ORD-001
21:45:32.234 [http-nio-8080-exec-1] INFO c.w.PaymentClient [traceId=04bf92f3577b34da63ce929d0e0e4736 spanId=7f3a28b9c4d5e6a1] - Calling payment service
21:45:32.456 [http-nio-8080-exec-1] INFO c.w.OrderService [traceId=04bf92f3577b34da63ce929d0e0e4736 spanId=36bd32b7a5712a1a] - Order processed successfully
7.2 Jaeger UI Query
In Jaeger UI you can:
- View the complete call chain via traceId 04bf92f3577b34da63ce929d0e0e4736
- View each Span's Tags (such as db.statement, http.status_code)
- View the parent-child relationships between Spans and how time is distributed across them
7.3 Common Troubleshooting
| Symptom | Possible Cause | Troubleshooting Method |
|---|---|---|
| traceId is empty in log | Span not activated or ScopeManager not configured correctly | Check the activate() call and Tracer construction |
| traceId crosstalk between different requests | MDC not cleaned up correctly (thread pool reuse) | Ensure every Scope is closed (try-with-resources); where MDC is set manually, call MDC.remove() in a finally block |
| Child Span not linked to Parent Span | asChildOf() parameter error | Verify that parentSpanId is parsed correctly |
| Span not visible in Jaeger | Sampling rate set to 0 or Reporter configuration error | Check withParam(1) and network connectivity |
8. Performance Considerations
8.1 Performance Impact Analysis
| Operation | Approximate Cost | Impact |
|---|---|---|
| MDC.put() | < 1 μs | Negligible |
| ThreadLocal.get() | < 1 μs | Negligible |
| Span.start() | ~10 μs | Low |
| Span.finish() + Report | ~100 μs | Async reporting, small impact on main flow |
8.2 Optimization Suggestions
- Control the sampling rate: In production, use the probabilistic sampler with a rate such as 0.1 (10%) to reduce storage costs; the const sampler shown in 4.1 only switches sampling fully on or off
- Batch reporting: Configure the Reporter with withFlushInterval(1000) to send Spans in batches
- Avoid excessive Spans: Do not create a Span for every database query; control the granularity
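9. Extension Directions
9.1 Support Reactive Programming (Reactor/WebFlux)
    // Outline only: shows where Reactor Context would replace ThreadLocal
    public class ReactorScopeManager implements ScopeManager {
        @Override
        public Scope activate(Span span) {
            return new ReactorScope(span);
        }

        @Override
        public Span activeSpan() {
            // Placeholder: a real implementation would read the Span from the Reactor Context
            return null;
        }

        static class ReactorScope implements Scope {
            private final Context context;

            ReactorScope(Span span) {
                // Use Reactor Context instead of ThreadLocal. Context is immutable, so the
                // value must be kept and attached to the reactive pipeline to take effect
                this.context = Context.of("span", span);
            }

            @Override
            public void close() {
                // Nothing to restore: no thread-local state was modified
            }
        }
    }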
9.2 Integrate Spring Cloud Sleuth
Spring Cloud Sleuth provides out-of-the-box distributed tracing and can replace this solution:
    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-sleuth</artifactId>
    </dependency>
9.3 Support OpenTelemetry
OpenTelemetry is the new-generation observability standard, and migrating to it is recommended in the future:
    OpenTelemetry openTelemetry = AutoConfiguredOpenTelemetrySdk
        .initialize()
        .getOpenTelemetrySdk();
10. Summary
This solution integrates Jaeger tracing with SLF4J MDC through a custom ScopeManager, with the following features:
- Automation: Scope lifecycle management, no need to operate MDC manually
- Thread Safety: ThreadLocal isolation plus a snapshot recovery mechanism
- Nesting Support: A linked-list structure maintains multi-layer Span relationships
- Production Ready: Boundary conditions such as thread pool reuse and asynchronous scenarios are considered
This solution is suitable for microservice architectures requiring fine-grained control of trace context propagation, especially scenarios such as service gateways and BFF layers that need to receive upstream trace information.