M365.FM - Modern work, security, and productivity with Microsoft 365

The Star Schema Trick All Pros Use (But Won’t Say)


Listen Later

Your tangled web of tables isn’t a data model—it’s digital spaghetti. No wonder DAX feels like you’re solving a sudoku puzzle after twelve beers. The good news: cleaning it up pays off fast. With the right design, your visuals respond to filters in seconds, your DAX stops fighting you, and your model finally looks like something you’d want to show your boss. The trick is a star schema. That means one or more fact tables in the center holding your measures and events, surrounded by dimension tables—the who, what, when, and where. Relationships define the roles, and the engine is built to optimize that structure. You don’t need a PhD in data warehousing; you just need to untangle the chaos into this simple pattern. For more deep dives, hit m365.show—you’ll want it for your next model. Now, why does your current report crawl like a floppy drive in 1995 the moment you add a filter? Let’s get into that.The Digital Spaghetti ProblemWelcome to the heart of the problem: the Digital Spaghetti model. You know the type—a giant flat table packed with every column anyone ever thought was useful. Customer names, job titles, phone numbers, sales amounts, discount codes, the works—all jammed together. It feels fine at first because you can throw visuals at it and see numbers appear. But once you stack slicers, cross filters, and extra pages, the whole thing bogs down. That’s not bad luck, and it’s not Fabric throwing a tantrum. It’s the wrong design. Think of it like a city built without streets. Every building stacked on top of each other in one giant pile. Sure, you can live there if you’re willing to climb over roofs and windows, but try driving across it efficiently—gridlock. A flattened model does the same thing: it clumps facts and context in the same space, so every query has to crawl through duplicate information before getting to the answer. Microsoft’s own documentation is clear on this point. The Vertipaq engine running Power BI is optimized for one specific design: dimensions store descriptive context such as customers, dates, or regions, and facts store numeric events like sales, clicks, or costs. When you collapse everything into one giant fact-like table, you force the engine to re-do work on every query. SQLBI calls out exactly why this fails: DAX’s auto-exist behavior can produce incorrect results, and missing combinations of data break relationships that should exist but don’t. In practice, this means your report isn’t just sluggish—it can also be misleading. A large share of real-world performance problems trace back to this exact modeling choice. Not formulas. Not your GPU. Just chaotic schema design. Flattened models force inefficient query patterns: text values get repeated thousands of times, DAX has to de-duplicate attributes over and over, and filter propagation breaks when dimension logic is buried inside fact rows. That’s why your calculations feel heavy—they’re retracing steps the star schema would handle automatically. Now, here’s a quick 30-second check to see if you’re stuck in Digital Spaghetti: First: open your fact table. If you see descriptive text like customer names or region values repeated tens of thousands of times, you’ve got spaghetti. Second: look at your slicers. If 90% of them are built directly from giant fact table columns instead of small lookup tables, that’s spaghetti too. Third: ask yourself if you’ve got fact columns that double as static attributes—like a “salon group” typed into transaction rows—even when no visits exist. That right there is spaghetti. One “yes” on these checks doesn’t doom your model, but if you hit all three, you’re running in the wrong direction. The fix doesn’t happen by blaming DAX. The formulas aren’t broken. What’s broken is the road they’re driving on. When attributes live in fact rows, the engine burns time scanning duplicated text for every query. Star schemas solve this by splitting out those attributes into clean, slim dimension tables. One join, one filter, clean result. No detective work required. This is why experts keep hammering the same advice: expose attributes through dimensions, hide columns in fact tables, and respect the separation between context and numbers. It isn’t academic nitpicking—it’s the design that prevents your report from collapsing in front of the VP. Get the model shape right, and suddenly the engine works with you instead of against you. Bottom line: what looks like a harmless shortcut—a single huge table—creates a brittle, sluggish model that makes everything harder. Recognizing that the problem is structural is the first real win. Once you see the spaghetti for what it is, draining it becomes the logical next move. And draining it starts with a sort: deciding what belongs in facts and what belongs in dimensions. That single choice—the first clean cut—is what shifts you from chaos to clarity.Facts vs Dimensions: The First Sorting HatSo here’s where the Sorting Hat comes in: deciding what goes into facts and what belongs in dimensions. It might feel like a simple split, but it’s the first real test of whether your model is going to work or implode. Facts are the measurements—how many, how much, how often. Dimensions are the descriptions—the who, what, when, and where. Keep those roles clean, and suddenly filters know exactly where to go instead of trying to squeeze through gridlock. The general rule is blunt: dimensions describe, facts measure. A fact table is just measurable stuff—transactions, sales amounts, counts of visits. Dimensions hold your lookups: Customers, Products, Dates, Regions. If you jam them all into one table, you get nothing but duplicated values, heavy filtering, and DAX errors that feel like gremlins. Take relationships: every one-to-many relationship in Power BI tells you who’s who. The “one” side is always the dimension. The “many” side is always the fact. That simple distinction saves you from guessing. Dimensions provide the clean list, facts reference them. If your so-called dimension sits on the “many” end, it’s not a dimension—it’s another fact with identity issues. And if your would-be dimension doesn’t have a unique column? Fine. Build one. Power Query has “Add Index Column.” That’s your surrogate key. No excuses, no drama—just give the engine something unique to latch onto and move on. What happens if you don’t respect that split? SQLBI has a classic example: a beauty salon dataset. At first, people dumped salon group information straight into the Visits fact table. Looked convenient—until slicing by gender or job title produced missing totals. Why? Because auto-exist logic in DAX couldn’t handle the missing combinations. Key groups didn’t exist in the fact table at all, so filters silently dropped numbers. The fix was obvious once you see it: build real dimension tables for Gender, Job, and Salon. Then adjust the measure to operate on those. Suddenly, the totals matched reality, filters worked, and ghost results disappeared. That’s the power of getting the fact/dimension boundary right. Another pitfall: stuffing descriptive text straight into your fact table because “it’s already there.” For example, Region repeated half a million times for every transaction. That’s not a lookup—it’s spam. Every time you slice on Region, the model wastes cycles mashing those rows down into a unique list. Instead, throw Region into a dimension table, store each region once, and let the join handle the rest. That’s cleaner, faster, and accurate. Best practice here is non-negotiable: hide descriptive columns in the fact table and expose attributes only through the dimension tables. You will thank yourself later when your report actually slices cleanly. Slicers should point to dimensions, not bloated fact text fields. Get lazy, and you’ll be back to watching spinners while DAX cries in the background. If you want a mental image: facts are the receipts, dimensions are the catalog. Receipts don’t carry full product names, addresses, or job titles—they just reference IDs and amounts. The catalog—your dimension—stores the product info once and for all. Mix them up and you’re basically stapling the entire IKEA manual onto every receipt, over and over. That’s what kills performance. Even Microsoft’s docs repeat this like gospel: dimensions are the single source of truth for lookups. When you follow that, a slicer on Customer or Region works the way you expect—once and cleanly across all related facts. It works not because DAX woke up smarter, but because the schema is finally aligned with how the engine is built to behave. So the Sorting Hat rule is simple. Facts: your sales, visits, or other measurable events. Dimensions: your customers, products, dates, and regions. Keep them in their lanes. If the “one” side of the relationship can’t stand uniquely, give it a surrogate key. Then hide every descriptive column in your facts and let dimensions carry them. It sounds strict, but the payoff is filters that work, models that load fast, and measures that stop tripping over themselves. Now that we’ve sorted the cast into facts and dimensions, there’s a twist waiting. Microsoft insists you treat each side differently: slim facts, chunky dimensions. Sounds like a paradox. But there’s a reason for it—and that’s where we’re heading next.Normalize the Fact, Flatten the DimensionNormalize the fact, flatten the dimension. That’s the rule Microsoft keeps drilling into us, and once you see it in practice, it makes sense. Facts are meant to stay lean and normalized. Dimensions? They’re meant to carry all the descriptive weight in one flattened place. Get that wrong, and your filters stall out while memory usage balloons. Start with fact tables. These are your transaction logs—each row an actual event: a purchase, a return, a shipment. What belongs inside? Keys that link to dimensions, plus numeric measures you can aggregate. That’s it. Every time you toss in descriptive fields—like customer names, produ

Become a supporter of this podcast: https://www.spreaker.com/podcast/m365-fm-modern-work-security-and-productivity-with-microsoft-365--6704921/support.

If this clashes with how you’ve seen it play out, I’m always curious. I use LinkedIn for the back-and-forth.
...more
View all episodesView all episodes
Download on the App Store

M365.FM - Modern work, security, and productivity with Microsoft 365By Mirko Peters (Microsoft 365 consultant and trainer)