How to Reduce Costs Using Snowflake

Explore top LinkedIn content from expert professionals.

Summary

Learn how to reduce costs using Snowflake, a cloud-based data warehousing platform, by adopting smarter query practices, optimizing resource allocation, and streamlining data usage. These strategies can help you minimize unnecessary expenses while improving performance.

  • Use auto-scaling and suspend features: Configure Snowflake warehouses to automatically scale up or down and suspend when idle to avoid incurring costs from unused compute resources (see the sketch just after this summary).
  • Improve query efficiency: Speed up processing and lower costs by using query result caching, partitioning, and clustering to minimize unnecessary data scans.
  • Consolidate workloads: Combine similar warehouses into multi-cluster ones to maximize resource utilization and reduce redundant spending on compute power.
Summarized by AI based on LinkedIn member posts
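
As a concrete starting point for the auto-scaling and auto-suspend settings summarized above, here is a minimal sketch of a warehouse definition. The warehouse name, size, and cluster counts are illustrative assumptions, not values taken from the posts below.

    -- Hypothetical warehouse: suspends after 60 idle seconds, resumes on demand,
    -- and scales from 1 to 3 clusters only when queries start to queue.
    CREATE WAREHOUSE IF NOT EXISTS REPORTING_WH
      WAREHOUSE_SIZE      = 'XSMALL'
      AUTO_SUSPEND        = 60         -- seconds of inactivity before suspending
      AUTO_RESUME         = TRUE       -- wake up automatically when a query arrives
      MIN_CLUSTER_COUNT   = 1
      MAX_CLUSTER_COUNT   = 3          -- multi-cluster scale-out requires Enterprise Edition
      SCALING_POLICY      = 'STANDARD'
      INITIALLY_SUSPENDED = TRUE;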
  • View profile for Jyoti Pathak

    Empowering SMB’s with Data & GenAI Solutions | Founder & CEO @Clipeum.ai | ❄️ Snowflake Data Superhero | Board of Directors | Certified Career and Confidence Coach | Aspiring Pilot ✈️

    5,556 followers

    Excited to host today’s Snowflake #Phoenix User Group Chapter Meeting where we’ll cover my top tips for optimizing Snowflake! Whether you’re new or experienced, these insights will help you ensure your platform stays efficient and ROI is maximized. Here’s a preview of the Top Tips we’ll discuss:

    1. Auto-Scaling & Auto-Suspend: Automatically scale up/down and suspend warehouses when idle to avoid overprovisioning.
    2. Query Result Caching: Speed up performance by using cached results, reducing the need to rerun queries.
    3. Monitor Query Profiles: Regularly check query profiles to optimize slow-running or resource-heavy queries.
    4. Right-Size Virtual Warehouses: Start small and scale up based on demand instead of over-allocating.
    5. Clustering Keys: Use clustering to improve data retrieval speed for large datasets.
    6. Minimize Data Movement: Avoid excessive data transfers between stages to reduce costs.
    7. Zero-Copy Cloning: Efficiently create environments without data duplication.
    8. Adjust Time Travel & Fail-Safe: Fine-tune these settings based on your data retention needs to lower storage costs.
    9. Clean Up Unused Data: Regularly delete unused tables and objects to free up storage.
    10. Resource Monitors: Set up resource monitors to cap usage and control runaway costs.

    The key is periodic monitoring and adjusting to meet your specific needs. Drop your favorite Snowflake optimization tip below 👇

    #Optimization #CloudData #CostManagement #SnowflakeSuperhero #ROI #Datasuperhero #Snowflake_advocate
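
    A minimal sketch of tips 8 and 10 from the list above: a resource monitor that caps monthly credits, and a shortened Time Travel retention window. The monitor name, quota, warehouse name, table name, and retention value are assumptions for illustration, not recommendations from the post.

        -- Tip 10: cap monthly credit consumption and suspend the warehouse at the limit
        CREATE RESOURCE MONITOR monthly_cap
          WITH CREDIT_QUOTA = 500
          FREQUENCY = MONTHLY
          START_TIMESTAMP = IMMEDIATELY
          TRIGGERS ON 80 PERCENT DO NOTIFY
                   ON 100 PERCENT DO SUSPEND;
        ALTER WAREHOUSE REPORTING_WH SET RESOURCE_MONITOR = monthly_cap;

        -- Tip 8: lower Time Travel retention on tables that do not need long recovery windows
        ALTER TABLE staging.events SET DATA_RETENTION_TIME_IN_DAYS = 1;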

  • View profile for Jeff Skoldberg

    Cut Data Stack Cost | dbt + Snowflake + Tableau Expert | DM me for data consultation!

    10,275 followers

    Here is a tip to help you save money and reduce complexity: Instead of using a third-party data loader to copy data into Snowflake, use Snowflake’s native functionality and orchestrate it with the transformation tool you are already using: dbt.

    We all know that the most common way to load data into Snowflake is the good ole copy command. But the observability on Snowflake Tasks and Stored Procedures leaves something to be desired, driving customers to use tools like Fivetran or similar to simply copy data from cloud storage. The problem is, this is a super expensive way to do something that should cost next to nothing. 💸

    A long time ago I discussed this challenge with Randy Pitcher 🌞, when he was working at dbt Labs. I shared with him a pre-hook approach I was using to copy data… then he showed me something MUCH better: a copy_into materialization. By using a custom materialization in dbt, your data in cloud storage becomes part of the DAG and your data is copied into Snowflake every time you run `dbt build`.

    Here’s how to do it:

    Step 1: Grab the custom materialization here: https://lnkd.in/evSbWfJV
    Step 2: Add a source for the stage in sources.yml: https://lnkd.in/eewYJAc4 (line 29)
    Step 3: Create variables in dbt_project.yml for file search patterns (if needed): https://lnkd.in/ezDacf4E (line 50)
    Step 4: Use it in a model, as shown below! https://lnkd.in/enBr8ixW
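
    For context, a copy_into materialization like the one linked presumably wraps Snowflake’s native COPY INTO command and runs it as part of `dbt build`. Below is a rough, standalone sketch of that underlying statement; the table, stage, file pattern, and file format are made-up names for illustration, not values from the linked examples.

        -- Native Snowflake load from an external stage (all object names hypothetical)
        COPY INTO raw.orders
          FROM @raw.s3_orders_stage
          PATTERN = '.*orders_.*[.]csv'
          FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1)
          ON_ERROR = 'ABORT_STATEMENT';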

  • View profile for Ankit Goyal

    Founder @ Retape | IIT Bombay CS | YC W23

    9,828 followers

    Snowflake Optimization Tip of the Day #4 ➡ Consolidate “similar” warehouses into one multi-cluster warehouse.

    What do I mean by “similar”? All your workloads can be bucketed based on their performance/latency requirements (in other words, to what extent queueing of queries is acceptable). Customer-facing dashboards and BI workloads are more sensitive to latency, whereas transformation jobs and data ingestion aren’t as much. Any two warehouses of the same size that run workloads from the same bucket are “similar” by this definition. An example use case: warehouses MARKETING_SMALL and PRODUCT_SMALL can be consolidated into one multi-cluster warehouse, BI_SMALL.

    The idea is to maximize the utilization of every warehouse by concentrating workloads. Using 2 different warehouses does not have any advantage over using a single warehouse with 2 clusters unless you have a strict performance SLA. A single multi-cluster warehouse can scale out as required and run at high utilization at the same time. While cost attribution does get a little simpler with separate warehouses, it's not worth the extra Snowflake spend. You can download the free app, Finops Center, from the Snowflake Marketplace for complete attribution.

    A caveat: the maximum number of clusters Snowflake allows is 10, so you need to create a new warehouse when you get close to that limit.
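
    A minimal sketch of the consolidation described above, using the warehouse names from the post. The size, cluster counts, and the user being repointed are assumptions, and multi-cluster warehouses require Snowflake Enterprise Edition or higher.

        -- One multi-cluster warehouse replacing MARKETING_SMALL and PRODUCT_SMALL
        CREATE WAREHOUSE BI_SMALL
          WAREHOUSE_SIZE    = 'SMALL'
          MIN_CLUSTER_COUNT = 1
          MAX_CLUSTER_COUNT = 4        -- stays well under the 10-cluster limit
          SCALING_POLICY    = 'STANDARD'
          AUTO_SUSPEND      = 60
          AUTO_RESUME       = TRUE;

        -- Repoint workloads at the consolidated warehouse (user name is hypothetical),
        -- then drop the old warehouses once nothing references them.
        ALTER USER marketing_bi_user SET DEFAULT_WAREHOUSE = BI_SMALL;
        DROP WAREHOUSE MARKETING_SMALL;
        DROP WAREHOUSE PRODUCT_SMALL;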

  • View profile for Ameena Ansari

    Engineering @Walmart | LinkedIn [in]structor, distributed computing | Simplifying Distributed Systems | Writing about Spark, Data lakes and Data Pipelines best practices

    6,427 followers

    Ever had a Snowflake query that takes forever because it’s scanning the entire table? Here’s how I made sure queries scanned only the relevant partitions, using a simple date-based strategy:

    Sort Data on Ingestion – Loaded data in a way that naturally groups records by order_date. Avoided small inserts to preserve partition locality.
    Use Clustering – Enabled auto-clustering on order_date to keep related data together for faster scans.
    Write Partition-Friendly Queries – Always filter by order_date to let Snowflake prune unnecessary partitions.
    Materialized Views – Pre-aggregated frequently used date ranges to avoid full table scans.
    Query Acceleration – Turned it on for large queries to dynamically parallelize execution.

    End result? Faster queries, lower compute costs, and Power BI dashboards that load in seconds instead of minutes. If your Snowflake queries are slow, check if they’re scanning more data than needed. Partition pruning makes a huge difference.
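
    A minimal sketch of the clustering and partition-friendly query pattern above. Only order_date comes from the post; the table name, measure column, and 30-day window are assumptions for illustration.

        -- Keep micro-partitions organized by the filter column (auto-clustering maintains this)
        ALTER TABLE orders CLUSTER BY (order_date);

        -- Prune-friendly: filter directly on the clustering column so Snowflake
        -- can skip micro-partitions whose order_date range cannot match.
        SELECT order_date, SUM(order_total) AS daily_revenue
        FROM orders
        WHERE order_date >= DATEADD(DAY, -30, CURRENT_DATE())
        GROUP BY order_date;
        -- Note: wrapping order_date in a function in the WHERE clause can prevent pruning.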

  • View profile for Lakshmi Shiva Ganesh Sontenam

    Data Engineering - Vision & Strategy | Visual Illustrator | Medium✍️

    13,763 followers

    🧠 The Invisible Cost in Analytics Workloads: Data Skew and Poor Partitioning

    When people talk about optimizing data pipelines, most eyes go straight to ingestion or ELT runtime. But the real silent killer of cloud data costs? Poorly partitioned or skewed datasets in your fact tables — especially at query time...

    🔍 Example: Let’s say you have a customer transactions fact table with 200 million rows.

    #Scenario_A – Properly partitioned:
    - Partitioned by event_date or customer_region
    - Queries filtered by date or region
    - Execution time: 12 seconds
    - Compute billed: Minimal
    - Only relevant partitions are scanned

    #Scenario_B – Unpartitioned or skewed:
    - No partitioning OR 90% of data falls into 1 region/date
    - Same query: SELECT * WHERE region = 'North'
    - Now you scan 150M unnecessary rows
    - Query time: 3+ minutes
    - Compute billed: 3x or more than Scenario A

    💡 The query result might look the same, but your credit card knows the difference.

    💥 Now let’s tie it to real dollars: In Snowflake, compute is billed per second via credits. We track this using:

        SELECT
          NAME AS WAREHOUSE,
          START_TIME,
          END_TIME,
          CREDITS_USED,
          CREDITS_USED * 2.00 AS DOLLAR_SPENT_LOW,
          CREDITS_USED * 3.10 AS DOLLAR_SPENT_HIGH
        FROM SNOWFLAKE.ACCOUNT_USAGE.METERING_HISTORY
        WHERE START_TIME >= DATEADD(DAY, -3, CURRENT_TIMESTAMP())
          AND NAME IN ('LARGE_WH') -- Or your actual warehouse name
        ORDER BY START_TIME DESC;

    ✅ For a 3-minute query on a LARGE warehouse:
    Credits used: ~0.4
    Dollar impact: $0.80 to $1.24 (based on $2.00–$3.10 per credit)

    Now multiply that by:
    50 dashboards
    Refreshed 5 times a day
    Used across 10 teams

    🧮 That’s: 50 dashboards × 5 runs/day × 10 teams × $0.80–$1.24 → $2,000 to $3,100 a day spent just on inefficient queries.

    🔄 TL;DR: You don’t always need a bigger warehouse or more memory. You just need a better data structure and more thoughtful querying.

    📌 If you're in analytics, finance, or platform ops, review your top queried tables and look at bytes scanned vs. rows returned. You'll be surprised where your budget is going.

    #CloudData #DataModeling #AnalyticsEngineering #DataOps #WarehousePerformance #BigData #CostOptimization #Snowflake #DataSkew #PartitioningMatters

    PC: Keebo
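
    Following the closing suggestion above, here is one way to compare bytes scanned against rows returned using the account usage views; the 7-day window and the row limit are arbitrary choices for illustration.

        -- Queries that scan far more bytes than the rows they return are pruning candidates.
        SELECT
          query_id,
          warehouse_name,
          bytes_scanned,
          rows_produced,
          partitions_scanned,
          partitions_total
        FROM snowflake.account_usage.query_history
        WHERE start_time >= DATEADD(DAY, -7, CURRENT_TIMESTAMP())
          AND bytes_scanned > 0
        ORDER BY bytes_scanned DESC
        LIMIT 50;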

  • View profile for Benjamin Rogojan

    Fractional Head of Data | Tool-Agnostic. Outcome-Obsessed

    181,281 followers

    If you work on a data engineering or data science team, then cost reduction is likely a major point of discussion. Especially this time of year. As a data consultant, I have managed to save millions of dollars over the past few years. The surprising thing is that much of those expenses come from the same usual suspects (perhaps it's not that surprising).

    1. Make sure you set up partitions or clusters where needed.
    2. Don't build a view-on-view-on-view mess that takes 10 minutes to run and feeds a heavily used dashboard.
    3. Check to ensure you've set Snowflake idle time to 1 minute (when it makes sense).
    4. Make sure you've optimized your data ingestion solution (if you're paying 100k a year for ingestion, we should talk!).
    5. Have some level of governance on who can build in production.
    6. Create a process to review costs every month or so. New projects and workflows can suddenly increase costs, and if you're not constantly ensuring your costs are managed, they will explode.

    I'd love to hear your tips as well!
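
    A minimal sketch of item 3 from the list above; the warehouse name is a placeholder, and 60 seconds corresponds to the 1-minute idle time the post suggests.

        -- AUTO_SUSPEND is specified in seconds; 60 = suspend after one idle minute.
        ALTER WAREHOUSE TRANSFORM_WH SET AUTO_SUSPEND = 60;

        -- Quick audit: list warehouses and check their current auto_suspend settings.
        SHOW WAREHOUSES;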

  • View profile for Ergest Xheblati

    Data Architect | Author: Minimum Viable SQL Patterns

    16,591 followers

    A friend recently asked me what you can do to reduce your Snowflake bill. There are a lot of small tips and tricks, but pattern-wise I see just 3:

    1. Make queries run faster. The faster they run, the fewer compute credits they consume. To make them run faster, read up on Snowflake's micro-partitions and get very familiar with the Query Profiler.

    2. Make queries run less frequently. If a dashboard only shows daily data, there’s no need for it to run multiple times during the day. At a previous company 99.9% of our data was daily, so we refreshed it overnight. At another they needed data much more frequently, so we ran stuff hourly or more often. We put a threshold in place: every query had to finish in less than 25% of the allotted time, or else we moved it to a slower cadence schedule.

    3. Use a smaller warehouse size. In Snowflake, warehouse sizes (i.e., compute resources) go from XS (1 credit/hr) to 6XL (512 credits/hr), doubling at each step (1, 2, 4, 8, 16, ...). There are a few other smaller factors involved, like spin-up time. So if your query can run in a reasonable amount of time on a smaller warehouse, definitely use that.
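
    To make point 3 concrete, a rough sketch of the credit math and the one-line change involved; the warehouse name, the runtimes, and the assumption that the query slows only modestly on the smaller size are illustrative, not from the post.

        -- Hypothetical: a job that runs 10 min per hour on a MEDIUM warehouse (4 credits/hr)
        --   MEDIUM: 10 min -> 4 * (10/60) = 0.67 credits per run
        --   SMALL:  15 min -> 2 * (15/60) = 0.50 credits per run (cheaper despite being slower)
        ALTER WAREHOUSE REPORTING_WH SET WAREHOUSE_SIZE = 'SMALL';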
