List of videos

try! Swift NYC 2016 - Property-Based Testing with SwiftCheck

try! Swift New York Conference 2016 - try! Swift is an immersive community gathering about Apple Technologies, Swift Language Best Practices, Application Development in Swift, Server-Side Swift, Open Source Swift, and the Swift Community in New York!
Topic - Property-Based Testing with SwiftCheck
Speaker - TJ Usiyan (Twitter - https://www.twitter.com/griotspeak)
Bio - TJ is a writer, musician, and developer interested in crafting interesting and artful work, and the developer of the universal apps Chordal Text and AU Additive Synthesizer. TJ is a graduate of Eugene Lang College and Berklee College of Music.
Abstract - Unit tests are a challenge to write. “Did I think of every relevant case?” is an almost impossible question to answer. Fortunately, we have the tools to help find more relevant cases with less searching. In this talk, TJ demonstrates how property-based testing with SwiftCheck helps us find edge cases and become more confident about the assumptions our code is built upon.
Presentation Link - https://speakerdeck.com/griotspeak/property-based-testing-with-swiftcheck
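As a rough illustration of the abstract's central idea: a property-based test states an invariant that should hold for all inputs and lets the framework generate (and shrink) the test cases, which is how surprising edge cases surface. The talk itself uses SwiftCheck in Swift; the sketch below expresses the same kind of property with Python's hypothesis library, and the toy run-length encoder is just a stand-in for whatever code is under test.

```python
# Illustration of the property-based testing idea the talk describes.
# SwiftCheck is a Swift library; this sketch uses Python's hypothesis
# purely to show the shape of a property test: instead of hand-picking
# cases, we state an invariant and let the framework generate inputs.
from hypothesis import given, strategies as st


def run_length_encode(s: str) -> list[tuple[str, int]]:
    """Toy function under test: collapse runs of repeated characters."""
    out: list[tuple[str, int]] = []
    for ch in s:
        if out and out[-1][0] == ch:
            out[-1] = (ch, out[-1][1] + 1)
        else:
            out.append((ch, 1))
    return out


def run_length_decode(pairs: list[tuple[str, int]]) -> str:
    return "".join(ch * n for ch, n in pairs)


@given(st.text())
def test_encode_decode_roundtrip(s: str) -> None:
    # The property: decoding an encoding gives back the original string.
    # hypothesis generates many strings (empty strings, unicode, long runs)
    # and shrinks any failure down to a minimal counterexample.
    assert run_length_decode(run_length_encode(s)) == s


if __name__ == "__main__":
    test_encode_decode_roundtrip()  # calling the decorated test runs the property
    print("property held for all generated inputs")
```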

Watch
Monitorama PDX 2024 - Tracing Service Dependencies at Salesforce

Sudeep Kumar's session from Monitorama PDX 2024. This talk focuses on our strategic choice to use a streaming pipeline for inferring service dependencies from trace telemetry data. We'll also delve into a key use case that showcases how service dependencies are visualized and managed through the streaming pipeline on our distributed tracing platform. These service dependency views are crucial for monitoring the deployment status of services and the health of related services. Moreover, by providing a clear overview of service interactions, we also facilitate risk assessment for new feature rollouts, enhancing both product development and operational stability. Additionally, we discuss how integrating a service mesh on Kubernetes led to comprehensive coverage and completeness of our service dependency data.

We'll also explore how a Flink-based streaming platform and a Druid backend are used to gather all trace telemetry data, enabling us to process 100% of trace data (from 300 million spans collected per minute) and deduce complete trace contexts. By establishing unique trace contexts, we create a trace state that represents every request occurring within the system. This state (dependency edges) contains the information required to map out the path of transactions as they move through different services and components within Salesforce. The talk delves into the process of converting individual trace states into service dependency edge records through Flink and Druid, revealing the complex web of interactions between services.

Attendees will be equipped with methods to uncover key interactions, such as identifying the services or operations that most frequently initiate contact with other services. We will also explore strategies for using the service dependency topology to gain a thorough grasp of the relationships and dependencies among services and components in a distributed system, and the audience will learn how service mesh coverage on any Kubernetes infrastructure can be leveraged to accurately deduce service dependencies. This underscores the importance of infrastructure design in enhancing traceability and reliability within distributed systems. Armed with this understanding, participants will be better positioned to improve system performance and reliability. The session aims to provide attendees with actionable insights and methodologies for effectively managing and navigating the intricate service dependencies that characterize modern distributed systems.
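To make the edge-derivation step concrete, here is a deliberately simplified sketch of the transformation the talk describes, not Salesforce's actual Flink/Druid pipeline: group spans by trace, join each span to its parent, and emit a (caller service, callee service) edge whenever a parent/child pair crosses a service boundary. The Span fields and sample data are assumed for illustration; a production pipeline would do this incrementally over a stream.

```python
# Minimal sketch (not Salesforce's pipeline) of the core transformation the
# talk describes: turning raw spans into service-dependency edges.
# The span fields (trace_id, span_id, parent_id, service) are assumed names.
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    trace_id: str
    span_id: str
    parent_id: str | None
    service: str


def dependency_edges(spans: list[Span]) -> Counter:
    """Count (caller service -> callee service) edges across all traces."""
    by_trace: dict[str, dict[str, Span]] = defaultdict(dict)
    for span in spans:
        by_trace[span.trace_id][span.span_id] = span

    edges: Counter = Counter()
    for trace in by_trace.values():
        for span in trace.values():
            parent = trace.get(span.parent_id) if span.parent_id else None
            # A cross-service parent/child pair implies a dependency edge.
            if parent and parent.service != span.service:
                edges[(parent.service, span.service)] += 1
    return edges


spans = [
    Span("t1", "a", None, "web"),
    Span("t1", "b", "a", "checkout"),
    Span("t1", "c", "b", "payments"),
]
print(dependency_edges(spans))
# Counter({('web', 'checkout'): 1, ('checkout', 'payments'): 1})
```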

Watch
Monitorama PDX 2024 - Distributed Context Propagation: How you can use it to Improve Observability..

J. Kalyana Sundaram's session from Monitorama PDX 2024. One of the main challenges in a complex distributed system is: How do we get all the participants on the same page about the “shared context” of a logical request? How do we do that in a vendor-neutral and interoperable way? Say hello to distributed context propagation, also known as Baggage. In this session, Kalyan will cover the "what", "why", and "how" of distributed context propagation. Baggage enables a variety of use cases in two categories: 1) improving the observability of a system and 2) enabling better control of a system. You will learn about use cases such as labelling synthetic traffic, chaos engineering, and attributing infrastructure spend. Kalyan represents the work done by the W3C Distributed Tracing Working Group that is driving the standardization of this mechanism (W3C Baggage). He will build up the building blocks for solving this problem together with you, from the bottom up, and then extend that solution to be open and interoperable. You will also see a live demo of this built using the OpenTelemetry Baggage APIs.
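For a taste of what the demo covers, here is a minimal sketch using the OpenTelemetry Python baggage and propagation APIs; the baggage keys (synthetic_request, team) are made-up examples of the kinds of labels the talk discusses.

```python
# Minimal sketch of W3C Baggage using the OpenTelemetry Python API
# (pip install opentelemetry-api opentelemetry-sdk). The baggage keys and
# values are assumptions, e.g. labelling a request as synthetic traffic so
# every downstream service can see that label.
from opentelemetry import baggage
from opentelemetry.propagate import inject, extract

# Service A: attach baggage to a context and propagate it over the wire.
ctx = baggage.set_baggage("synthetic_request", "true")
ctx = baggage.set_baggage("team", "checkout", context=ctx)

headers: dict[str, str] = {}
inject(headers, context=ctx)      # writes a W3C `baggage` header (plus traceparent when a span is active)
print(headers.get("baggage"))     # e.g. "synthetic_request=true,team=checkout"

# Service B: extract the context from incoming headers and read the baggage.
incoming_ctx = extract(headers)
print(baggage.get_baggage("synthetic_request", incoming_ctx))  # "true"
print(baggage.get_all(incoming_ctx))
```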

Watch
Monitorama PDX 2024 - Use counters to count things

Fred Moyer's session from Monitorama PDX 2024. eBPF, OTel, TSM, data lakes, observability. All this cool new stuff. But you know what's really cool? Counting things. Lots of things. One, two, three, 18,446,744,073,709,551,615. Can your system count that high? And how fast can it do that? Being able to count certain things in your business quickly and efficiently is important to establishing business goals and understanding where you stand from both a business and technical health perspective. Counting lots of things is also a genuinely interesting computer science problem. And in the monitoring realm, there are tools which use statistics to figure out how many distinct things are in a large pile (from a monitoring perspective). One classic example is "How many times did users hit this API endpoint?" Inquiring product managers want to know. There are easy ways to do this, and there are hard ways, and there are ways in between, all with different tradeoffs. This talk will look at the different ways this problem can be approached and what business requirements drive those technical approaches. Should you count up logs? Should you bump a StatsD metric? Should you consult a staff data scientist? You shouldn't be using the same approach from startup to mega corp, so you'll get a walk through the tradeoffs, drawn from practical experience across that whole range.
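To make the "easy way" concrete, the sketch below does exact in-memory counting; the endpoints and user IDs are invented. The unbounded set of distinct users is exactly the memory tradeoff that, at larger scale, pushes teams toward approximate structures such as HyperLogLog, or toward bumping counters in StatsD or a time-series database.

```python
# A small sketch of the "easy way" the talk contrasts with harder ones:
# exact counting in memory. Endpoint and user values are made up.
# Fine at small scale; the set of distinct users grows without bound,
# which is the tradeoff that pushes large systems toward approximate
# sketches (e.g. HyperLogLog) or toward incrementing counters in
# StatsD / a time-series database instead.
from collections import Counter

hits = Counter()          # exact hit counts per endpoint
distinct_users = set()    # exact distinct-user count (memory grows with users)

requests = [
    ("/api/orders", "user-1"),
    ("/api/orders", "user-2"),
    ("/api/orders", "user-1"),
    ("/api/search", "user-3"),
]

for endpoint, user in requests:
    hits[endpoint] += 1
    distinct_users.add(user)

print(hits.most_common())   # [('/api/orders', 3), ('/api/search', 1)]
print(len(distinct_users))  # 3
```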

Watch
Monitorama PDX 2024 - The shoemaker’s children have no shoes - why SRE teams must help themselves

Pete Fritchman's session from Monitorama PDX 2024. There’s an old proverb that says ‘the shoemaker’s children have no shoes.’ In other words, people with specific skills or who offer in-demand services are often so busy providing their expertise and service to others that they aren’t able or don’t have the time to provide it to themselves. The same could be said of SRE teams and internal services, and this has never been more true than for developer tooling and workflows. SREs are laser-focused on ensuring a great experience for customer-facing applications, but their own internal services, and their internal users, can sometimes go neglected. It’s time for the shoemaker to give their children shoes! In this 30-minute talk, Pete Fritchman, staff infrastructure engineer at Observe, will offer guiding principles and practical examples to get SRE team members thinking about how they handle internal services and customers. Fritchman will also explore how organizations are putting themselves at risk, as well as outline best practices that must be followed to adequately support internal services.

Watch
Monitorama PDX 2024 - Incident Management: Lessons from Emergency Services

Julia Thoreson's session from Monitorama PDX 2024. What do missing people, production outages, and natural disasters all have in common? They are all different types of incidents! Although these may seem like completely unique situations, similar principles and processes can be applied in all kinds of challenging circumstances. Drawing on my experience as both a software engineer at Bloomberg and a Lieutenant on the Alameda County Search and Rescue Team, we will discuss pre-planning, alerting, responding and debriefing for all sorts of incidents.

Watch
Monitorama PDX 2024 - No observability without theory

Dan Slimmon's session from Monitorama PDX 2024. When our system isn't observable enough, what do we do? We add telemetry. The more signals we can observe, the more knowledge we'll have. Or so the thinking goes. But observability requires more than just data: it requires a _theory_ about how our system works. Only within a theory can signals be interpreted and made useful. In this talk we'll see that, often, the limiting factor of a system's observability is not the thoroughness of our measurements, but rather the strength and coherence of our theory. This fact carries major implications for how we must build dashboards, write alerts, and respond to incidents.

Watch
Monitorama PDX 2024 - Disintegrated telemetry: The pains of monitoring asynchronous workflows

Johannes Tax's session from Monitorama PDX 2024. Many tools and best practices around instrumentation and observability are tailored to synchronous request/response workflows, HTTP and RESTful APIs being the most prominent examples. However, if you have to instrument and monitor a system that relies on asynchronous communication based on events or messages, you'll soon find out that established concepts and practices don't work so well. Observing loosely coupled processing steps often leads to disintegrated telemetry, which makes it hard to derive actionable insights. In this talk, I focus on the challenge of correlating the disintegrated telemetry pieces (metrics and traces) that are emitted during the lifetime of a message or an event. I describe the problem and present possible solution approaches. I show how each solution approach is broken in its own way, and provide insights that help you to choose the least broken solution for your scenario. Finally, to show some light at the end of the tunnel, I give an overview of standardization efforts in this space, including W3C context propagation drafts for messaging protocols, and the messaging semantic conventions created by the OpenTelemetry messaging workgroup, which I'm leading.
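One common way to re-join those disintegrated pieces is to carry the trace context inside the message itself, which is the approach the context propagation drafts and messaging semantic conventions build on. Below is a minimal sketch using the OpenTelemetry Python API, with an in-memory list standing in for the broker; the span names, message shape, and queue are assumptions for illustration.

```python
# Minimal sketch of one common way to re-join "disintegrated" telemetry:
# carry the trace context inside the message itself. The in-memory queue,
# span names, and message shape are placeholders; real instrumentations
# (Kafka, AMQP, ...) do the same inject/extract against protocol headers.
from opentelemetry import trace
from opentelemetry.propagate import inject, extract
from opentelemetry.sdk.trace import TracerProvider

trace.set_tracer_provider(TracerProvider())
tracer = trace.get_tracer("messaging-demo")


def publish(queue: list[dict]) -> None:
    with tracer.start_as_current_span("orders publish", kind=trace.SpanKind.PRODUCER):
        message = {"body": "order created", "headers": {}}
        inject(message["headers"])  # writes traceparent (and baggage, if set) headers
        queue.append(message)


def consume(queue: list[dict]) -> None:
    message = queue.pop(0)
    parent_ctx = extract(message["headers"])
    # The consumer span is parented via the message headers, so both ends of
    # the asynchronous hop end up in the same trace.
    with tracer.start_as_current_span(
        "orders process", context=parent_ctx, kind=trace.SpanKind.CONSUMER
    ):
        pass


queue: list[dict] = []
publish(queue)
consume(queue)
```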

Watch
Monitorama PDX 2024 - The Hater's Guide to Dealing with Generative AI

Phillip Carter's session from Monitorama PDX 2024. Generative AI! LLMs! They're all the rage! ...and for good reason. Unlike crypto, this new phase of AI is a legitimate step change in capabilities, with companies in every economic sector finding places where it can add value. People are experimenting and C-levels are rewriting roadmaps to incorporate this tech. It's here, it'll probably stay, and we're gonna have to make it work. So, how do you make this stuff work in the long run? You guessed it: Observability! LLMs are black boxes you can't debug in the traditional sense. The most effective way to improve them over time is to put them into production, gather good telemetry from real-world usage, and analyze that data to figure out how to improve answers. In this talk, you'll learn the basics of what you need to track, how you can track it, how you should consider monitoring the health of an LLM over time, and how you can add value to your organization by feeding valuable data back into the development process.
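As a rough sketch of the kind of telemetry the talk argues for, the snippet below wraps an LLM call in an OpenTelemetry span and records model, token, and latency attributes; the attribute names and the call_llm helper are illustrative placeholders rather than an official schema.

```python
# A hedged sketch of LLM telemetry: wrap each model call in a span and
# record the inputs, outputs, and cost signals needed to analyze real-world
# usage later. Attribute names and call_llm() are illustrative only.
import time

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider

trace.set_tracer_provider(TracerProvider())
tracer = trace.get_tracer("llm-demo")


def call_llm(prompt: str) -> dict:
    # Stand-in for a real model call; returns made-up usage numbers.
    return {"text": "example answer", "prompt_tokens": 42, "completion_tokens": 128}


def answer_question(prompt: str) -> str:
    with tracer.start_as_current_span("llm.completion") as span:
        span.set_attribute("llm.model", "example-model")
        span.set_attribute("llm.prompt_length", len(prompt))
        start = time.monotonic()
        response = call_llm(prompt)
        span.set_attribute("llm.latency_ms", (time.monotonic() - start) * 1000)
        span.set_attribute("llm.prompt_tokens", response["prompt_tokens"])
        span.set_attribute("llm.completion_tokens", response["completion_tokens"])
        # User feedback (thumbs up/down, corrections) can be recorded later
        # and joined on the trace ID to close the improvement loop.
        return response["text"]


print(answer_question("Summarize this incident report."))
```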

Watch