Inclusive Design and dbt
Have you ever felt that you were not the intended audience of the technology you consume? Are there software design paradigms that simply do not work for you?
In this presentation, I lean on the research of Dr. Margaret Burnett, who studies the intersection of software design and cognitive diversity. Her thesis is that narrow-minded software design alienates people who are not neurotypical with respect to the assumed average user.
I believe dbt was designed with cognitive diversity in mind, which is its primary value-add over similar software solutions.
Browse this talk’s Slack archives
The day-of-talk conversation is archived here in dbt Community Slack.
Not a member of the dbt Community yet? You can join here to view the Coalesce chat archives.
Full transcript
Elize Papineau: [00:00:00] Hello, thank you for joining us at Coalesce. My name is Elize Papineau and I am a senior analytics engineer at dbt Labs. I’ll be the host of this session. The title of this session is Inclusive Design and dbt. And we’ll be joined by Evelyn Stamey, who is a data scientist at Civis Analytics, but her talents don’t end there.
She has a bachelor’s degree in mathematics and a master’s degree in statistics, and she did college gymnastics. Evelyn credits her interest in inclusive software design to a specific podcast episode, which Emmy is going to be dropping the link to in the chat shortly. And finally, I wanted to give Evelyn an extra special shout out because all the images we’re going to see in today’s presentation are her own freehand drawings.
And it makes me extra excited to see this. Before we get started, some quick housekeeping. All chat conversation is going to be taking place in the coalesce-inclusive-design [00:01:00] channel of dbt Slack. If you’re not a part of the chat, you have time to join. Visit community.getdbt.com and search for coalesce-inclusive-design.
When you enter this space, we encourage you to ask other attendees questions, make comments, and react in the channel. This is an ongoing conversation we’re going to have throughout the talk. Please flag any questions you have for the speaker. And the speaker will be available in Slack after the session to answer those questions.
All right, let’s get started. Over to you.
Evelyn Stamey: Hey, thank you Elize for that introduction, and thank you to all of you for joining me in this presentation on inclusive software design and dbt. If you’re watching this presentation, let’s face it: you have technical skills. You might be a data engineer or product manager.
You might be a business [00:02:00] analyst or a data scientist. Whatever your job title is, you are technically proficient in some way. In spite of this fact, have you ever felt that you were not the intended audience of the technology you consume? Have you ever needed to sacrifice productivity in order to operate industry standard tooling?
Do certain technologies leave you feeling alienated instead of empowered? For me, the answer is yes. Even though I’m a highly technical contributor on a data team, I often have low self-efficacy when it comes to working with new technologies. I did not experience my usual pain points when onboarding to dbt, which got me thinking: maybe I’m not bad with technology. Maybe technology is bad with me.
[00:02:57] dbt Supports Inclusive User Experiences
Evelyn Stamey: In this presentation, I’m going to [00:03:00] talk about inclusive software design and how dbt supports inclusive user experiences.
At the root of many bad user experiences are bad assumptions. To set the scene, I’ll tell you about a time when I lost my scissors and bought a new pair. The only problem was that my scissors came in a clamshell package, which instructed me to cut along the dotted lines to open.
Okay. I’m resourceful, so I figured out a different way to open the scissors, but it was not a pretty sight. Has this ever happened to you: needing scissors to open a box of scissors?
How about this scenario: maybe you’re left-handed and all you have are right-handed scissors. With every use, you get to choose your own adventure. For a physically uncomfortable experience, [00:04:00] put in left hand. For a cognitively taxing experience, put in right hand. In all of these examples, the burden is placed on the user to work around bad assumptions made by the designer.
Evelyn Stamey: Biased design is everywhere, from scissors to software. In general, if a tool is hard to assemble, hard to learn, uncomfortable to use, or lacks technical support, you are probably not the intended end user, and that’s okay sometimes. While there’s nothing inherently wrong with exclusive technology, it can bear hidden financial and ethical costs if designed carelessly.
In my experience, dbt stands out as one of the few tools in the analytics engineering space that was designed with inclusivity as a first-class objective. [00:05:00]
Here’s what I’ll be diving into today. I’ll start with what got me interested in the topic of inclusive software design, a paper coauthored by Dr. Margaret Burnett. Her thesis is that software products can be biased against certain information processing and learning styles. I’ll highlight the key findings supporting her thesis, and then draw from examples in my professional life of software that have biased design in the way that Dr. Burnett describes. Then I’ll shift focus to inclusive design paradigms by highlighting specific features in dbt that I believe support inclusive user experiences. Lastly, I will draw on the fundamentals of universal design to encourage greater awareness of the diverse needs of end users and why universal design principles are relevant to all of us, [00:06:00] but first a disclaimer.
Inclusivity in software design is a rich topic that is as nuanced as individual users themselves. The irony of presenting on the topic of inclusive software design is that I am guaranteed to exclude certain points of view. My goal for this presentation is, one, to spark your interest in the topic and, two, to encourage follow-up conversation.
One of the perks of presenting at Coalesce is that we can have this conversation live in my dbt Slack channel and asynchronously in dbt Discourse. I hope we can all use these resources to our advantage. I would love to hear all of your perspectives.
[00:06:46] Information Processing and Learning Styles #
Evelyn Stamey: All right. Onto Dr. Margaret Burnett, whose research single-handedly inspired this presentation.
Who’s Dr. Burnett? She studied the intersection of software [00:07:00] design, cognitive diversity, and gender diversity at Oregon State. Her research group founded the GenderMag Project, which is an initiative that helps software professionals and usability professionals find and fix gender bias in software design.
There’s a large volume of research backing the GenderMag Project, so I’m going to narrow the scope of this presentation to just the paper shown here on the right. By the way, all the references for this presentation are free to access and they’re cited at the end of this slideshow.
All right. In Dr. Burnett’s paper, she argues that modern software products favor specific information processing and learning styles that disproportionately alienate female-identifying users. There’s a lot to unpack here, so I’m going to start with defining some vocabulary. [00:08:00] In this context, information processing style is a user’s preferred strategy for gathering information while problem solving within a software environment.
Some users prefer comprehensive information processing, where they gather a lot of information over a wide scope before acting on a solution. Others prefer selective information processing, where they quickly iterate through solution paths, using smaller bits of information.
Dr. Burnett observed that comprehensive information processing is more common among female users than male users. What about learning style? It describes how a user approaches learning a new software application. Some users are process-oriented in that they prefer to follow guided instructions.
Others are more exploratory and they prefer to learn new tooling [00:09:00] by tinkering via trial and error. Dr. Burnett observed that female users are more likely to be process-oriented learners and male users are more likely to be tinkerers.
Evelyn Stamey: While the relationship between user behavior and gender identity is extremely complex, the correlations summarized here constitute the foundations of the GenderMag Project. Just as a recap, information processing and learning styles are two cognitive facets, or features, each of which has values that describe user behavior.
Not only has the GenderMag research group observed significant interactions between user behavior and gender identity, but they also found that selective information processing styles and tinkering learning styles are more likely to be supported in software, which, statistically speaking, [00:10:00] favors the preferences of male-identifying end users.
Dr. Burnett argues that for software to be gender inclusive, it must have features that cater to all four of these facet values.
[00:10:19] Examples of Information Processing Style Bias in Software
Evelyn Stamey: To ground us in the real world, I’ll give a few examples of software that I believe have biased design in the way that Dr. Burnett describes. Here’s a simple example: Control-copy, Control-paste. This famous command sequence favors selective information processing styles. For comprehensive thinkers, who prefer to think holistically about a task before taking action,
the clipboard is a prohibitively small space for their virtual working memory. A comprehensive thinker would much rather copy several items to the clipboard before pasting. [00:11:00] Of course, anyone can achieve this functionality on their computers in theory, but there is no out-of-the-box solution for people who prefer to work in this way.
Here’s another example of information processing bias. For those of you familiar with the version control software Git, it is pretty clear that it was founded on selective information processing principles. Each commit is intended to represent a single unit of change and users are encouraged to commit early and commit often.
The thing that always surprises me, though, is how selective information processing paradigms are manifested in the GitHub UI. For example, in a pull request, GitHub only displays a few lines of contextual code around code changes and hides everything else by default. It is up to the user to expand the [00:12:00] accordion of hidden content. Comprehensive thinkers, who might prefer more context before contributing to code review, have a much clickier GitHub experience. A related limitation in the UI is that only certain lines of code in a file can be commented on, as shown in this illustration.
You can directly comment on the changes and a few lines of context, but unfortunately you can’t comment directly anywhere else in the file. This limitation adds friction to the review process for those users who prefer to offer more comprehensive commentary.
[00:12:45] Examples of Learning Style Bias
Evelyn Stamey: I’ve touched on a few examples of information processing style bias in software, but how about learning style bias? Recall that process-oriented learners prefer more structured learning, whereas tinkerers [00:13:00] prefer trial and error. Learning style bias is hard to spot once you’ve already learned how to use a tool.
But there are some common themes, especially in open source software, that favor tinkering learning styles. Ideally, tech support services should scale with the complexity of a system. But when complexity outpaces support, explaining how the system works is sometimes left as an exercise to the user.
Take Pandas, for example. As a data scientist who works with medium-sized data, I use Pandas a lot for in-memory data transformations. I believe tinkerers fare better than process-oriented learners when working with Pandas. Most notably, there are several ways to accomplish the same task in Pandas, and the official documentation is unopinionated about idiomatic usage.
The vastness of the Pandas [00:14:00] library, in combination with its decentralized community resources, can be overwhelming for users who prefer more structured learning.
Here’s another Python library that I love: Great Expectations. It is a powerful, flexible, and extensible tool for data quality control. Tinkerers tend to embrace tools with flexible architecture because they are highly customizable and conducive to exploration. Great Expectations offers this kind of freedom, but for process-oriented learners, Great Expectations is less approachable.
Accomplishing even the most basic of data quality checks requires knowing about data contexts, data sources, data connectors, and so on, which imposes a larger cognitive tax on users who don’t directly benefit from this extra overhead. [00:15:00]
I offer these examples in support of Dr. Burnett’s thesis that rigid software design can marginalize anyone who is not neurotypical with respect to the assumed average user. I also offer these examples in support of a broader thesis, that there is no one correct way to solve a problem with software. Dr. Burnett gave me a new perspective on the tools that I use on a daily basis, which of course got me reflecting on my experience with dbt.
So where does dbt fit into all of this? The analytics engineering community comprises individuals with diverse educational and professional backgrounds. As such, it demands software solutions that are flexible to a variety of end user needs. In my opinion, dbt lowers the barrier to entry at every stage in the user journey. [00:16:00]
Recall that in my scissors example, I argued that you are probably not the intended user if your tool is hard to unbox, has a steep learning curve, is uncomfortable to use, or lacks technical support. This is not the case with dbt. Let’s start with onboarding. I found that setting up dbt was easy due to its up-to-date documentation and tutorials.
I inevitably ran into configuration errors early on, but these errors surfaced as intelligible messages that were easy to debug by referencing the docs. I was also able to set up a basic data pipeline with documentation in a short period of time. Eventually I got around to macrowising and testing this pipeline, but the basic utility of dbt was immediately apparent.
I also believe that dbt is easy to learn. [00:17:00] According to Stack Overflow’s 2020 Developer Survey, SQL ranked third place among the most popular languages. It’s no surprise, then, that dbt leverages SQL to reach wide audiences. For better or for worse, SQL is the de facto universal language that data practitioners speak.
You might also notice that Python ranks fourth place in the survey. While dbt doesn’t require end users to write Python per se, it leverages the Jinja templating engine, which can compose queries using Python-like syntax.
I also believe that dbt meets users at their comfort level, and comfort means different things to different user groups. The most notable examples are dbt’s IDE and CLI entry points. More importantly, each of these entry [00:18:00] points is equally supported in documentation and has reasonable feature parity. This communicates to users that their preferred method of interacting with dbt is valued.
Lastly, I think dbt takes the cake in championing not only good documentation, but a welcoming community. Between dbt Slack and dbt Discourse, it’s clear that participation is encouraged, which I believe inspires a tighter feedback loop between developers and end users.
If your job involves designing, deploying, or maintaining data pipelines, you are probably the intended end user of dbt, no matter what industry you’re in. dbt achieves this by observing best practices in universal design. To wrap up this presentation, [00:19:00] I want to highlight a few lessons that we can all learn from universal design.
[00:19:07] Lessons from Universal Design
Evelyn Stamey: Those of you who work in tech support might know this acronym: PEBCAK, "problem exists between chair and keyboard." I have been the proverbial problem in chair more times than I can count, and it can feel very disempowering. An alternative framing of this scenario is that there is no such thing as user error, just poor design. Usability professionals and experts in interaction design are more likely to support this framing of user error, because it is much easier to change software user interfaces than it is to change users.
This is where universal design comes into play. While humans are immeasurably diverse, universal design attempts to maximize utility across all user groups. [00:20:00] Perhaps counterintuitively, designing for people with very specific needs can improve accessibility for all users. This is commonly known as the curb cut effect, in reference to sidewalk ramps, which were originally designed for people in wheelchairs but turned out to be useful to many other pedestrians. With that said, universal design is much more than a one-size-fits-all approach to usability. The Center for Universal Design at NC State presents these seven principles, which can be used to guide and evaluate product design, including software.
[00:20:35] The 7 Principles of Universal Design
Evelyn Stamey: Unfortunately, I won’t have time to discuss all of these principles, but I’ll highlight a few using dbt as an example. The principle of perceptible information states that the design should communicate information effectively to the user, which might include redundant presentation of essential information.
I noticed this principle at play in the dbt documentation site, which offers different [00:21:00] ways of understanding the relationship between models. For example, users have the option to inspect a template, which references parent models by means of the ref function. Alternatively, users can inspect the rendered template, which surfaces the names of specific tables.
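To make the template and rendered views described above concrete, here is a minimal Python sketch. It is a toy stand-in, not dbt's implementation: the model names, the `analytics` schema, and the `render` and `parents` helpers are all hypothetical, and a regular expression substitutes for the Jinja engine that dbt actually uses.

```python
import re

# Hypothetical model templates keyed by model name. The names, columns, and
# the "analytics" schema below are invented for illustration only.
TEMPLATES = {
    "stg_orders": "select id, amount from raw.orders",
    "orders_summary": "select count(*) as n from {{ ref('stg_orders') }}",
}

# Matches {{ ref('model_name') }} with single or double quotes.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")


def parents(template):
    """Return the model names that a template references via ref()."""
    return REF_PATTERN.findall(template)


def render(template, schema="analytics"):
    """Replace each ref() call with a schema-qualified table name."""
    return REF_PATTERN.sub(lambda m: f"{schema}.{m.group(1)}", template)


template = TEMPLATES["orders_summary"]
print(template)            # the template view: parent named through ref()
print(render(template))    # the rendered view: a concrete table name
print(parents(template))   # the parent list a lineage graph could draw from
```

The same `ref` call feeds every view here, which is the redundancy the principle of perceptible information asks for: one piece of essential information, presented more than one way.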
Of course, if you’re not in the mood to read SQL, you can always visualize model relationships by looking at the DAG. According to official documentation, the most important function in dbt is ref, and dbt makes ref perceptible to users in these three different ways. The principle of error tolerance states that the design should minimize the adverse consequences of unintended actions.
I can confidently say that I am accident prone, so I always appreciate error-tolerant design. One of my favorite things about dbt is that its operations are idempotent, which means that there is a low risk of irreparably [00:22:00] destroying production objects. Even when I accidentally deleted a production table that one time, I was one dbt command away from rebuilding it.
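The idea of an idempotent rebuild can be sketched in a few lines of Python using the standard-library `sqlite3` module. This is an illustration under stated assumptions, not how dbt materializes models: the `orders` and `daily_totals` tables and the `build_daily_totals` helper are hypothetical, and real materialization logic varies by warehouse.

```python
import sqlite3


def build_daily_totals(conn):
    """Rebuild the derived table from its source; safe to run repeatedly."""
    conn.execute("drop table if exists daily_totals")
    conn.execute(
        "create table daily_totals as "
        "select day, sum(amount) as total from orders group by day"
    )


def snapshot(conn):
    """Return the derived table's rows in a stable order."""
    query = "select day, total from daily_totals order by day"
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("create table orders (day text, amount integer)")
conn.executemany(
    "insert into orders values (?, ?)",
    [("2021-12-06", 10), ("2021-12-06", 5), ("2021-12-07", 7)],
)

build_daily_totals(conn)
first = snapshot(conn)

build_daily_totals(conn)  # a second run leaves the result unchanged
second = snapshot(conn)

conn.execute("drop table daily_totals")  # simulate the accidental delete
build_daily_totals(conn)  # a single rebuild restores the table
restored = snapshot(conn)

print(first == second == restored)
```

Because the derived table is always recomputed from its source, running the build once, twice, or after a mistaken drop ends in the same state.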
Speaking of rebuilding models, the principle of low effort states that the design should facilitate efficient and comfortable use. For a lot of software applications, this means minimizing repetitive actions. For dbt specifically, this means shouldering the burden of boilerplate SQL and reducing the overhead of rebuilding.
For my team, replacing a legacy data pipeline with dbt was a game changer, specifically because it sped up our testing cycle. Pipelines fail all the time during development, and the last thing you want to do when your test fails is track down and clean up all the artifacts from your failed run. dbt is low effort because no matter what sad state your pipeline is in, [00:23:00] all you need is two words to pick up where you left off: dbt run.
You might be wondering why anyone outside a team of designers should care about these principles. At the end of the day, investing in inclusive design is not only morally favorable, but good for business. When done right, it can lead to greater market share and give you a competitive advantage against technologies with more rigid design.
Inclusive design doesn’t happen by accident, by the way. If you don’t intentionally include, you will unintentionally exclude. Even if you don’t have the resources to reach all target demographics, it’s better to have awareness of your product’s shortcomings than to unwittingly build on top of them. I’ll say it again.
Software is for humans, not computers. [00:24:00] The principles that guide inclusive product design are the same principles that make your code readable, your tables queryable, your reports intelligible, and your pipelines maintainable. You don’t need to be a UX professional to implement inclusive design. All you need is your imagination, the right set of tools, and a strong capacity for empathy.
Thank you. As mentioned, these are the references to my presentation, which I encourage everybody to check out. I also want to give a special shout out to my colleagues at Civis Analytics, past and present, who have always made me feel welcome at every stage in my career as a data scientist. All right. I hope to see you all soon in Slack. Thank you very much.
Last modified on: Apr 19, 2022