Logging best practices

1. Logging Best Practices. Geshan Manandhar, Senior Software Engineer, THE ICONIC (@geshan)

2. whoami: Geshan Manandhar
● Senior Software Engineer
● Microservices are good, agile is better :)

3. I work for THE ICONIC (Tech)

4. “If a dog is man’s best friend, logs are a software engineer’s best friend.”

5. We start from this: a pile of logs (if any), probably sorted

6. Hopefully, we end up with this by following best practices :)

7. “This feature we deployed last week was working fine till yesterday. Now I have no idea why it is not working on production!”

8. Logging at the application level
● If errors should be reported, normal operations also need to be logged
● Applications should log actions to provide visibility and observability
● This allows software engineers to debug and pinpoint problems faster when an issue comes up

9. How does logging help you?
● If you have logs in the right places, you will find out where the program is not behaving as expected
● It helps you find things on production you were not sure of
● Be careful not to log secrets like passwords, though
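One way to act on the "don't log secrets" point is to mask sensitive fields before they reach the logger. The sketch below is only an illustration: the `redact` helper and the `SENSITIVE_KEYS` list are hypothetical names, not part of any logging library, and the list would need to match your own payloads.

```python
import logging

# Hypothetical list of sensitive field names; adjust it to your application.
SENSITIVE_KEYS = {"password", "token", "secret", "api_key"}


def redact(context: dict) -> dict:
    """Return a copy of context with sensitive values masked, including nested dicts."""
    cleaned = {}
    for key, value in context.items():
        if str(key).lower() in SENSITIVE_KEYS:
            cleaned[key] = "***"
        elif isinstance(value, dict):
            cleaned[key] = redact(value)
        else:
            cleaned[key] = value
    return cleaned


logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("login")

payload = {"user": "alice", "password": "hunter2", "meta": {"token": "abc"}}
# The original payload is left untouched; only the logged copy is masked.
logger.info("login attempt: %s", redact(payload))
```

Masking on a copy keeps the application data intact while the log line carries only the safe version.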

10. Having logs is like having a torch light in a dark place

11. Log information optimally. Too many logs = noise, too few = inadequate information

12. Logging in microservices
● The same request ID travels through multiple apps/services
○ For example, a create shipment request that travelled through 3 apps
○ Request ID 112Ac120 -> App A -> MS B -> Service C
● This also helps with distributed tracing between apps/microservices
● Istio telemetry is a good read (distributed tracing, visualizing…)
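A minimal sketch of carrying one request ID through a service, using only the Python standard library. The `X-Request-ID` header name, the `start_request` helper and the `shipment` logger name are assumptions for illustration; your services may use a different header or a tracing library instead.

```python
import contextvars
import logging
import uuid

# Holds the request ID for the current request; "-" when none is set.
request_id_var = contextvars.ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    """Copy the current request ID onto every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.request_id = request_id_var.get()
        return True


def start_request(incoming_headers: dict) -> dict:
    """Reuse the caller's request ID or create one; return headers for downstream calls."""
    # "X-Request-ID" is an assumed header name, not a standard mandated by the slides.
    request_id = incoming_headers.get("X-Request-ID") or uuid.uuid4().hex[:8]
    request_id_var.set(request_id)
    return {"X-Request-ID": request_id}


handler = logging.StreamHandler()
handler.addFilter(RequestIdFilter())
handler.setFormatter(
    logging.Formatter("%(asctime)s %(levelname)s [%(request_id)s] %(message)s")
)
logger = logging.getLogger("shipment")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# App A received the request with ID 112Ac120 and forwards it to MS B.
outgoing_headers = start_request({"X-Request-ID": "112Ac120"})
logger.info("create shipment received")
```

Every service that repeats this pattern logs the same ID, so searching for it in the log aggregator shows the whole journey of the request.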

13. Logs are not permanent, they are temporal

14. Logging severity standards. How is an alert different from a notice, for instance?

15. Log severity levels
● Standard RFC 5424
○ 0 Emergency: system is unusable
○ 1 Alert: action must be taken immediately
○ 2 Critical: critical conditions
○ 3 Error: error conditions
○ 4 Warning: warning conditions
○ 5 Notice: normal but significant condition
○ 6 Informational: informational messages
○ 7 Debug: debug-level messages
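The eight severities can be written down once in code so the whole team shares them. The numeric values below come from RFC 5424; the mapping onto Python's five built-in levels is an assumption made for this sketch, since Python's `logging` module has no Emergency, Alert or Notice level of its own.

```python
import logging
from enum import IntEnum


class SyslogSeverity(IntEnum):
    """Severity values as defined in RFC 5424 (lower number = more severe)."""

    EMERGENCY = 0
    ALERT = 1
    CRITICAL = 2
    ERROR = 3
    WARNING = 4
    NOTICE = 5
    INFORMATIONAL = 6
    DEBUG = 7


# Assumed mapping for illustration: Python has fewer levels, so some are merged.
_TO_PYTHON_LEVEL = {
    SyslogSeverity.EMERGENCY: logging.CRITICAL,
    SyslogSeverity.ALERT: logging.CRITICAL,
    SyslogSeverity.CRITICAL: logging.CRITICAL,
    SyslogSeverity.ERROR: logging.ERROR,
    SyslogSeverity.WARNING: logging.WARNING,
    SyslogSeverity.NOTICE: logging.INFO,
    SyslogSeverity.INFORMATIONAL: logging.INFO,
    SyslogSeverity.DEBUG: logging.DEBUG,
}


def to_python_level(severity: SyslogSeverity) -> int:
    """Translate an RFC 5424 severity into a Python logging level."""
    return _TO_PYTHON_LEVEL[severity]


def is_at_least(severity: SyslogSeverity, threshold: SyslogSeverity) -> bool:
    """True when severity is as severe as threshold or worse (numerically lower or equal)."""
    return severity <= threshold
```

A check such as `is_at_least(SyslogSeverity.ALERT, SyslogSeverity.CRITICAL)` is one way to decide which logs should page someone.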

16. Always follow severity standards. Emergency means your on-call phone rings at 2 AM. Having agreed-upon logging standards helps everyone.

17. Have structure in logs. Structured logs go a long way as they are easier to parse

18. Structure your logs
● Define a log format, for example: the date is required and the log title must be less than 255 characters
● Always add contextual information like the request ID and the ID of the subject in context, such as order ID/order nr
● JSON can be used to structure and parse logs better
● Think of how to make searching ultra easy
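A minimal sketch of such a format with Python's standard `logging` module: one JSON object per line, with a required timestamp, a message capped at 255 characters, and contextual IDs. The `JsonFormatter` class and the `context` key passed through `extra` are hypothetical conventions chosen for this example.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each record as a single JSON line with optional context fields."""

    def format(self, record: logging.LogRecord) -> str:
        # Start from the caller-supplied context (hypothetical "context" attribute).
        entry = dict(getattr(record, "context", {}))
        # Core fields are written last so context can never overwrite them.
        entry.update(
            {
                "time": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
                "level": record.levelname,
                "logger": record.name,
                "message": record.getMessage()[:255],
            }
        )
        return json.dumps(entry)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("orders")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info(
    "order paid",
    extra={"context": {"request_id": "112Ac120", "order_id": 42}},
)
```

Because every line is valid JSON, a log viewer can filter on `request_id` or `order_id` directly instead of relying on free-text search.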

19. Always provide context with structured logs. Follow a structure and format for logs. Context always helps; JSON is your friend.

20. Write logs carefully. Don’t add more milliseconds to your app’s response time because of logging

21. Write logs async as far as possible
● If you call a 3rd party HTTPS API to write your logs, it will add milliseconds to your app’s response time
○ Write logs locally, then ship them some other way (e.g. ELK)
○ Queues for logs can also be a good option
● With languages that favor non-blocking execution, like JavaScript, you can make it async easily
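The queue idea can be sketched with Python's standard library: the application thread only puts records on an in-memory queue, and a background listener thread hands them to the slow handler. The `build_async_logger` helper is a hypothetical name for this example, and `StreamHandler` stands in for whatever slow target (file, network) you really use.

```python
import logging
import logging.handlers
import queue


def build_async_logger(name: str, target_handler: logging.Handler):
    """Return a logger that enqueues records and a listener that writes them out."""
    log_queue = queue.Queue(-1)  # unbounded queue; a bounded one may suit production better
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # The application thread only pays for putting the record on the queue.
    logger.addHandler(logging.handlers.QueueHandler(log_queue))
    # The listener drains the queue on its own thread and calls the slow handler.
    listener = logging.handlers.QueueListener(log_queue, target_handler)
    return logger, listener


logger, listener = build_async_logger("app", logging.StreamHandler())
listener.start()
logger.info("order %s placed", 42)
listener.stop()  # processes the remaining records, then stops the background thread
```

Calling `listener.stop()` at shutdown matters: it lets queued records be written before the process exits.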

22. Use a trusted logging library
● Depending on the language, you can choose one that suits your needs
● Some languages, like Go, also come with built-in support

23. Some logging libraries
Language | Library | GitHub stars
PHP | Monolog | ~13.5k
TypeScript/JS | Winston | ~12.5k
Python | Native | N/A
Note: Don’t forget Monolog handlers and formatters :)

24. Monolog to Logentries

25. Write logs asynchronously. Non-blocking logs are the best. Be careful with console.log in JS/TS. Log shipping is intelligent and efficient.

26. Tools we are using. Logentries.com aggregates most of our logs

27. Log aggregators and viewers
● Logs can be aggregated, shipped and viewed in multiple ways
● The primary choice might be between self-hosted/self-managed and SaaS
○ Graylog and the ELK stack are self-hosted, self-managed solutions
○ Logentries, Loggly, Sematext Logsense and Scalyr are some good SaaS options

28. K8s container logs to LE
[Diagram: K8s cluster with nodes (N1, N2, ... Nx) -> Logspout -> Logentries]

29. Logentries
● Currently we are using Logentries to view and search all our logs
● You can also create alerts with logs

30. Alerts with logs
● Searching and viewing logs are the primary requirements of a log management system
● Alerts add that extra zing
○ “If I get these logs more than 80 times in 5 minutes, send me an email or Slack message” is the kind of alert you can base on logs
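The "more than 80 times in 5 minutes" rule is a sliding-window count. Log management tools implement this for you; the sketch below only illustrates the logic, and the `RateAlert` class is a hypothetical name, not the API of Logentries or any other product.

```python
from collections import deque


class RateAlert:
    """Fire when more than `threshold` matching logs arrive within `window_seconds`."""

    def __init__(self, threshold: int = 80, window_seconds: float = 300.0):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record(self, timestamp: float) -> bool:
        """Register one matching log line; return True when the alert should fire."""
        self._timestamps.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop entries that have fallen out of the window.
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) > self.threshold


alert = RateAlert(threshold=80, window_seconds=300)
# Feed it the timestamp (in seconds) of each log line that matches the search.
should_notify = alert.record(timestamp=0.0)
```

When `record` returns True, the caller would send the email or Slack message; that part is left out here.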

31. Use the tools at your disposal efficiently. Know how to search your logs, add dashboards if needed. You can even set up alerts if some logs are consistent over time.

32. Logging -> Instrumentation -> Observability
● Instrument every meaningful number available for capture (source: the logging v. instrumentation post in the credits)
○ Tends to be things like incoming request counts, request durations, and error counts
○ No. of orders per minute, no. of stuck payments
● Both logging and instrumentation are ultimately just methods to achieve system observability.
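A rough sketch of capturing those three numbers (request count, request duration, error count) in-process. The `Metrics` class is hypothetical and keeps everything in memory; a real setup would more likely export these values to a metrics system.

```python
import time
from collections import Counter
from contextlib import contextmanager


class Metrics:
    """Collect request counts, error counts and request durations in memory."""

    def __init__(self):
        self.counts = Counter()
        self.durations = []  # seconds per request

    @contextmanager
    def track_request(self):
        """Wrap one request: count it, time it, and count it as an error if it raises."""
        self.counts["requests"] += 1
        started = time.perf_counter()
        try:
            yield
        except Exception:
            self.counts["errors"] += 1
            raise
        finally:
            self.durations.append(time.perf_counter() - started)


metrics = Metrics()
with metrics.track_request():
    pass  # handle the incoming request here
```

Business numbers such as orders per minute or stuck payments can be counted the same way with their own keys in `counts`.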

33. Thanks! Any questions?

34. Credits/references
● https://blog.scalyr.com/2018/08/microservices-logging-best-practices/
● https://www.loggly.com/blog/30-best-practices-logging-scale/
● https://peter.bourgon.org/blog/2016/02/07/logging-v-instrumentation.html
● https://news.ycombinator.com/item?id=11054973
● https://surfingthe.cloud/dont-fear-node-js-console-log/
● https://tools.ietf.org/html/rfc5424
● https://en.wikipedia.org/wiki/Instrumentation_(computer_programming)
● https://www.loomsystems.com/blog/single-post/2017/01/26/9-logging-best-practices-based-on-hands-on-experience
● https://geshan.com.np/blog/2015/08/importance-of-logging-in-your-applications/
● https://blog.scalyr.com/2018/06/go-logging/
● https://blog.codeship.com/how-to-understand-logs-with-logentries/