Understanding the principles of cost optimization
A Google Cloud whitepaper
Table of Contents
Introduction
Chapter 1: Principles and processes to optimize your cloud costs
Chapter 2: Optimizing compute costs
Chapter 3: Optimizing storage costs
Chapter 4: Optimizing network costs
Chapter 5: Optimizing data analytics costs with BigQuery
Contributors
• Justin Lerma
• Pathik Sharma
• Amber Yadron
• Andrew Sallaway
• Akshay Kumbhar
Introduction
At the end of the day, getting more out of your cloud resources can translate into more customers served, more issues resolved, and more adaptability for the overall business. Using your cloud resources more efficiently can help your team and business adjust to these new realities, and be as effective as possible.
Our own teams have worked for years with IT and operations teams across industries and around the world, listening to your challenges, your successes, and your future plans. While a lot has changed, the ability of technologists to adapt and thrive hasn’t. Your ability to shift and change as the business changes matters more than ever, and our cloud technology is designed to support that kind of agility and resilience. We’ve gathered together the tips and best practices we recommend so that you can get more out of your existing resources—more VMs, more storage, more queries—and help your business meet its goals.
Chapter 1
Principles and processes to optimize your cloud costs
Cloud is more than just a cost center. Moving to the cloud allows you
to enable innovation at a global scale, expedite feature velocity for
faster time to market, and drive competitive advantage by quickly
responding to customer needs. So it's no surprise that many
businesses are looking to transform their organization's digital
strategy as soon as possible. But while it makes sense to adopt cloud
quickly, it’s also important to take time and review key concepts prior
to migrating or deploying your applications into the cloud. Likewise, if you
already have existing applications in the cloud, you’ll want to audit
your environment to make sure you are following best practices. The
goal is to maximize business value while optimizing cost, keeping in
mind the most effective and efficient use of cloud resources.
We’ve been working side by side with some complex customers as they usher in the next generation of applications and services on Google Cloud. When it comes to optimizing costs, there are lots of tools and techniques that organizations can use. But tools can only take you so far. In our experience, there are several high-level principles that organizations, no matter the size, can follow to make sure they’re getting the most out of the cloud.
We’ve seen this dynamic many times, and it's unfortunate that one of the most desirable features of the cloud—elasticity—is sometimes perceived as an issue. When there is an unexpected spike in a bill, some customers might see the increase in cost as worrisome. Unless you attribute the cost to business metrics such as transactions processed or number of users served, you are missing the context needed to interpret your cloud bill. For many customers, it's easy to see that costs are rising and attribute that increase to a specific business owner or group, but they don’t have enough context to give a specific recommendation to the project owner. The team could be spending more money because they are serving more customers—a good thing. Conversely, costs may be rising because someone forgot to shut down an unneeded high-CPU VM running over the weekend—and it’s pushing unnecessary traffic to Australia.
One way to fix this problem is to organize and structure your costs in
relation to your business needs. Then, you can drill down into the
services using Cloud Billing reports to get an at-a-glance view of your
costs. You can also get more granular cost views of your environment
by attributing costs back to departments or teams using labels, and
by building your own custom dashboards. This approach allows you
to label a resource based on a predefined business metric, then track
its spend over time. Longer term, the goal isn’t to understand that you
spent “$X on Compute Engine last month,” but that “it costs $X to
serve customers who bring in $Y revenue.” This is the type of analysis
you should strive to create.
Billing Reports in the Google Cloud console let you explore granular cost details.
One of the main features of the cloud is that it allows you to expedite feature velocity for faster time to market, and
this elasticity is what lets you deploy workloads in a matter of minutes as opposed to waiting months in the
traditional on-premises environment. You may not know how fast your business will actually grow, so establishing a
cost visibility model up front is essential. And once you go beyond simple cost-per-service metrics, you can start to
measure new business metrics like profitability as a performance metric per project.
Similarly, our most sophisticated customers aren’t fixated on a specific cost-cutting number; they’re asking a variety of questions to get at their overall operational fitness.
In short, they have gone ahead and created their own unit economics
model. They ask these questions up front, and then work to build a
system that enables them to answer these key questions as well as
audit their behavior. This is not something we typically see in a crawl
state customer, but many of those that are in the walk state are
employing some of these concepts as they design their system for
the future.
Google Cloud’s resource hierarchy documentation walks you through recommendations and steps to create your optimal environment. Within this resource hierarchy, you can use projects, folders, and labels to help create logical groupings of resources that support your management and cost attribution requirements.
In your resource hierarchy, labeling resources is a top priority for organizations interested in managing costs. This is
essentially your ability to attribute costs back to a specific business, service, unit, leader, etc. Without labeling
resources, it’s incredibly difficult to decipher how much it costs you to do any specific thing. Rather than saying you
spent $36,000 on Compute Engine, it’s preferable to be able to say you spent $36,000 to deliver memes to 400,000
users last month. The second statement is much more insightful than the first. We highly recommend creating
standardized labels together with the engineering and finance teams, and using labels for as many resources as you
can.
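To make that kind of label-driven reporting concrete, here is a minimal sketch (the project, dataset, table, and label key below are placeholders) that queries the standard Cloud Billing export to BigQuery and groups the last 30 days of spend by a hypothetical "team" label:

# A minimal sketch: break down the last 30 days of spend by a "team" label using the
# standard Cloud Billing export to BigQuery. Substitute your own billing export
# project, dataset, and table for the placeholder below.
from google.cloud import bigquery  # pip install google-cloud-bigquery

client = bigquery.Client()

query = """
SELECT
  (SELECT value FROM UNNEST(labels) WHERE key = 'team') AS team,
  service.description AS service,
  ROUND(SUM(cost), 2) AS total_cost
FROM `my-project.billing_export.gcp_billing_export_v1_XXXXXX`  -- placeholder table
WHERE usage_start_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
GROUP BY team, service
ORDER BY total_cost DESC
"""

for row in client.query(query).result():
    print(f"{row.team or 'unlabeled':<20} {row.service:<30} ${row.total_cost}")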
If you’re a stable customer, you can review your spending less frequently, as the opportunities to tweak your
strategies will be reliant on items like new Google Cloud features vs. a business change on your product roadmap.
But if you’re deploying many new applications and spending millions of dollars per month, a small investment in
conducting more frequent cost reviews can lead to big savings in a short amount of time. In some cases, our more
advanced customers meet and adjust forecasts as often as every day. When you’re spending millions of dollars a
month, even a small percentage shift in your overall bill can take money away from things like experimenting with
new technologies or hiring additional engineers.
To truly operate efficiently and maximize the value of the cloud takes multiple teams with various backgrounds
working together to design a system catered to your specific business needs. Some best practices are to establish a
review cadence based on how fast you are building and spending in the cloud. The Iron Triangle is a commonly used
framework that measures cost vs. speed vs. quality. You can work with your teams to set up an agreed-upon
framework that works for your business. From there, you can either tighten your belt, or invest more.
• Effort: Estimated level of work (in weeks) required by the customer to coordinate the resources and
implement a cost optimization recommendation.
• Savings: Amount of estimated potential savings (in percentage per service) that customers will realize by
implementing a cost optimization recommendation.
While it's not always possible to estimate with pinpoint accuracy how much a cost savings measure will save you
before testing, it's important to try and make an educated guess for each effort. For instance, knowing that a certain
change could potentially save you 60% on your Cloud Storage for project X should be enough to help with the
prioritization matrix and establishing engineering priorities with your team. Sometimes you can estimate actual
savings. Especially with purchasing options, a FinOps team can estimate the potential savings by taking advantage
of features like committed use discounts for a specific amount of their infrastructure. By performing this exercise,
the team can make informed decisions about where to direct engineering effort and focus its energy accordingly.
Chapter 2
Optimizing compute costs
An important step in gaining visibility into your Compute Engine
costs is to use Cloud Billing reports in the Google Cloud Console, and
customize your views based on filtering and grouping by projects,
labels, and more. Here, you’ll find information about the various
Compute Engine machine types, committed use discounts, and how
to view your usage, among other things. If you need an advanced
view, you can even export Compute Engine usage details to BigQuery
for more granular analysis. This allows you to query the exported data to
understand your project’s vCPU usage trends and how many vCPUs
can be reclaimed. If you have defined thresholds for the number of
cores per project, usage trends can help you spot anomalies and take
proactive actions. These actions could be rightsizing the VMs or
reclaiming idle VMs.
Now, with better cost visibility available, let’s go over the five ways you
can optimize your Compute Engine resources that we believe will
offer the most immediate benefit.
For more info, read the Recommender documentation. And stay tuned as we add more usage-based recommenders
to the portfolio.
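If you want to pull these recommendations programmatically rather than viewing them in the console, a minimal sketch along the following lines can work; the project and zone are placeholders, and in practice you would loop over every zone you use:

# A minimal sketch (not an official tool): list Compute Engine machine-type
# (rightsizing) recommendations for one project and zone via the Recommender API.
from google.cloud import recommender_v1  # pip install google-cloud-recommender

client = recommender_v1.RecommenderClient()
parent = (
    "projects/my-project/locations/us-central1-a/"
    "recommenders/google.compute.instance.MachineTypeRecommender"
)

for rec in client.list_recommendations(parent=parent):
    # Each recommendation describes a suggested machine-type change for one VM.
    print(rec.name)
    print(" ", rec.description)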
Schedule VMs to auto start and stop: The benefit of a platform like Compute Engine is that you only pay for the
compute resources that you use. Production systems tend to run 24/7; however, VMs in development, test, or
personal environments tend to only be used during business hours, and turning them off can save you a lot of
money! For example, a VM that runs for only 10 hours per day, Monday through Friday, runs roughly 50 of the 168 hours in a week, so it costs about 70% less per month than one left running around the clock.
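One lightweight way to act on this is a small script, run on a schedule (for example from Cloud Scheduler and Cloud Functions), that stops development VMs outside business hours. The sketch below assumes a hypothetical "env: dev" label and placeholder project and zone values:

# A minimal sketch, not a turnkey scheduler: stop every running VM in one zone that
# carries a hypothetical "env: dev" label. Run it outside business hours.
from google.cloud import compute_v1  # pip install google-cloud-compute

PROJECT = "my-project"
ZONE = "us-central1-a"

instances = compute_v1.InstancesClient()

for instance in instances.list(project=PROJECT, zone=ZONE):
    if instance.status == "RUNNING" and instance.labels.get("env") == "dev":
        print(f"Stopping {instance.name} for the night...")
        instances.stop(project=PROJECT, zone=ZONE, instance=instance.name)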
It's worth exploring how VMs are set up and sized across your cloud infrastructure to find cost savings.
Rightsize VMs: On Google Cloud, you can already get significant savings by creating custom machine types with the
right amount of CPU and RAM to meet your needs. But workload requirements can change over time. Instances that
were once optimized may now be serving fewer users and traffic. To help, our rightsizing recommendations can
show you how to effectively downsize your machine type based on changes in vCPU and RAM usage. These
rightsizing recommendations for your instance’s machine type (or managed instance group) are generated using
system metrics gathered by Cloud Monitoring over the previous eight days.
If your organization uses infrastructure as code to manage your environment, check out this guide, which will show
you how to deploy VM rightsizing recommendations at scale.
2. Purchase commitments
Effort ••• Savings •••
Pricing efficiency is another key concept to employ as part of a cloud optimization effort. The ability to purchase
commitments and receive the related discounts is a huge step forward to ensuring you are truly optimizing your
cloud costs.
Our customers have diverse workloads running on Google Cloud, with differing availability requirements. Many
customers follow a 70/30 rule when it comes to managing their VM fleet—they have constant year-round usage of
about 70%, and a seasonal burst of about 30% during holidays or special events.
If this sounds like you, you are probably provisioning resources for peak capacity. However, after migrating to Google
Cloud, you can baseline your usage and take advantage of deeper discounts for compute workloads. Committed use
discounts are ideal if you have a predictable steady-state workload, as you can purchase a one- or three-year
commitment in exchange for a substantial discount on your VM usage. We recently released a committed use
discount analysis report in the Cloud Console that helps you understand and analyze the effectiveness of the
commitments you’ve purchased and even estimate what your resource floor looks like based on historical data.
In addition to this, discuss your usage with your internal team and get a sense of whether or not a committed use
discount makes sense for your workload. You can work proactively with them to increase your committed use
discount coverage and maximize your savings.
Using Cloud Functions to automate the cleanup of other Compute Engine resources can also save you engineering
time and money. For example, customers often forget about unattached (orphaned) persistent disks, or unused IP
addresses. These accrue costs, even if they are not attached to a virtual machine instance. VMs with the deletion
rule option set to “keep disk” will retain persistent disks even after the VM is deleted. That’s great if you need to save
the data on that disk for a later time, but those orphaned persistent disks can add up quickly. This Google Cloud
solutions article describes the architecture and sample code for using Cloud Functions, Cloud Scheduler, and Cloud
Monitoring to automatically look for these orphaned disks, take a snapshot of them, and remove them. This solution
can be used as a blueprint for other cost automations such as cleaning up unused IP addresses or stopping idle
VMs.
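The detection half of that automation can be as simple as the sketch below, which lists disks in a zone that no VM currently uses; the snapshot-and-delete steps from the solutions article are intentionally left out, and the project and zone are placeholders:

# A minimal sketch of the detection step: find persistent disks in one zone that are
# not attached to any VM (disk.users is empty). Review before deleting anything.
from google.cloud import compute_v1  # pip install google-cloud-compute

PROJECT = "my-project"
ZONE = "us-central1-a"

disks = compute_v1.DisksClient()

for disk in disks.list(project=PROJECT, zone=ZONE):
    if not disk.users:  # no VM currently references this disk
        print(f"Orphaned disk: {disk.name} ({disk.size_gb} GB)")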
Preemptible VMs are short-lived—they run a maximum of 24 hours and may be shut down before the 24-hour mark
as well. A 30-second preemption notice is sent to the instance when a VM needs to be reclaimed, and you can use a
shutdown script to clean up in that 30-second period. Be sure to fully review the entire list of stipulations when
considering preemptible VMs for your workload. All machine types are available as preemptible VMs, and you can
launch one simply by adding the “--preemptible” flag to the gcloud command line or selecting the option in the Cloud Console.
Using preemptible VMs in your architecture is a great way to scale compute at a discounted rate, but you need to be
sure that the workload can handle the potential interruptions if the VM needs to be reclaimed. One way to handle this
is to ensure your application is checkpointing as it processes data—i.e., that it’s writing to storage outside the VM
itself, like Google Cloud Storage or a database. As an example, try this sample code for using a shutdown script to
write a checkpoint file into a Cloud Storage bucket. For web applications behind a load balancer, consider using the
30-second preemption notice to drain connections to that VM so the traffic can be shifted to another VM. Some
customers also choose to automate the shutdown of preemptible VMs on a rolling basis before the 24-hour period is
over to avoid having multiple VMs shut down at the same time if they were launched together.
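As a rough illustration (not the referenced sample itself), a shutdown script can hand off work like this; the bucket name and file paths are placeholders:

# A minimal sketch of a preemption-time checkpoint: register this as the VM's
# shutdown script (or call it from one) so the local checkpoint file is copied to
# Cloud Storage within the 30-second preemption window.
from google.cloud import storage  # pip install google-cloud-storage

BUCKET = "my-checkpoint-bucket"
LOCAL_CHECKPOINT = "/tmp/work-in-progress.ckpt"


def save_checkpoint() -> None:
    """Upload the latest local checkpoint so another VM can resume the work."""
    client = storage.Client()
    blob = client.bucket(BUCKET).blob("checkpoints/work-in-progress.ckpt")
    blob.upload_from_filename(LOCAL_CHECKPOINT)
    print(f"Checkpoint uploaded to gs://{BUCKET}/{blob.name}")


if __name__ == "__main__":
    save_checkpoint()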
You don’t have to limit preemptible VMs to a Compute Engine environment. GPUs, GKE clusters and secondary
instances in Dataproc can also use preemptible VMs. You can also reduce your Cloud Dataflow batch analytics
costs by using Flexible Resource Scheduling to supplement regular instances with preemptible VMs.
5. Try autoscaling
Effort ••• Savings •••
Another great way to save on costs is to run only as much capacity as you need when you need it. This is a key tenet
of resource usage optimization. As we mentioned earlier, typically around 70% of capacity is needed for steady-state
usage, but when you need extra capacity, it’s critical to have it available.
Chapter 3
Optimizing storage costs
There are multiple factors to consider when looking into storage cost
optimization. The trick here is to ensure that you don’t negatively
impact performance and that you don’t throw anything out that may
need to be retained for future purposes, whether that be for
compliance, legal, or simply business value reasons. With data
emerging as a top business commodity, you’ll want to use appropriate
storage management strategies. Broadly, there are three themes to consider:
1. Retention
2. Access patterns
3. Performance
There can be many additional use cases with cost implications, but
we’ll focus on recommendations around these themes. Here are more
details on each.
We see customers use lifecycle policies in a multitude of ways with great success. A great application is for compliance in legal discovery. Depending on your industry and data type, there are laws that regulate the data types that need to be retained and for how long. Using a Cloud Storage lifecycle policy, you can instantly tag an object for deletion once it has met the minimum threshold for legal compliance needs, ensuring you aren’t charged for retaining it longer than is needed and you don’t have to remember which data expires when.
Within Cloud Storage, you can also set policies to transform a storage
type to a different class. This is particularly useful for data that will be
accessed relatively frequently for a short period of time, but then
won’t be needed for frequent access in the long term. You might want
to retain these particular objects for a longer period of time for legal
or security purposes, or even general long-term business value. A
good place to put this in practice is in a lab environment. Once you
complete an experiment, you likely want to analyze the results quite a
bit in the near term, but in the long term won’t access that data very
frequently. Having a policy set up to convert this storage to nearline or
coldline storage classes after a month is a great way to save on its
long-term data costs.
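A minimal sketch of such a policy, with a placeholder bucket name and illustrative thresholds, might look like this:

# A minimal sketch of the lifecycle rules described above: move objects to Nearline
# after 30 days, Coldline after a year, and delete them after a 7-year retention
# window. The bucket name and thresholds are placeholders.
from google.cloud import storage  # pip install google-cloud-storage

client = storage.Client()
bucket = client.get_bucket("my-lab-results-bucket")

bucket.add_lifecycle_set_storage_class_rule("NEARLINE", age=30)
bucket.add_lifecycle_set_storage_class_rule("COLDLINE", age=365)
bucket.add_lifecycle_delete_rule(age=7 * 365)  # past the retention threshold

bucket.patch()  # apply the updated lifecycle configuration

for rule in bucket.lifecycle_rules:
    print(rule)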
One thing to keep in mind when considering this option is that storage in multi-regional locations allows for better
performance and higher availability, but comes at a premium and could increase network egress charges, depending
on your application’s design. During the application design phase, this is an important factor to consider. Another
option when you’re thinking about performance is buckets in regional locations—a good choice if your region is
relatively close to your end users. You can select a specific region that your data will reside in, and get guaranteed
redundancy within that region. This location type is typically a safe bet when you have a team working in a particular
area and accessing a dataset with relatively high frequency. This is the most commonly used storage location type
that we see, as it handles most workloads’ needs quite well. It's fast to access, redundant within the region, and
affordable overall as an object store.
For something as simple-sounding as a bucket, cloud-based object storage actually offers vast amounts of
possibility, all with varying cost and performance implications. As you can see, there are many ways to fine-tune your
own company’s storage needs to help save some space and some cash in a well thought-out, automated way.
Google Cloud provides many features to help ensure you are getting the most out of your Google Cloud investment.
Chapter 4
Optimizing network costs
Some of the use cases for VPC Flow Logs include network monitoring, forensics, real-time security analysis, and for today’s purposes, cost optimization. When it comes to optimizing networking spend, the most relevant information in VPC Flow Logs is which resources are sending traffic to which destinations, and how much traffic is flowing between them.
For general internet egress charges, i.e., a group of web servers that
serve content to the internet, prices can vary depending on the region
where those servers are located. For instance, the price per GB in
us-central1 is cheaper than the price per GB in asia-southeast1.
Another example is traffic flowing between Google Cloud regions,
which can vary significantly depending on the location of those
regions—even if it isn’t egressing out to the Internet. For example, the
cost to synchronize data between asia-south1 (India) and asia-east1
(Taiwan) is five times as much as synchronizing traffic between
us-east1 (South Carolina) and us-west1 (Oregon).
Example topology using Network Topology, part of Network Intelligence Center.
As well as regional considerations, it’s important to consider which
zones your workloads are in. Depending on their availability
requirements, you may be able to architect them to use intrazone
network traffic at no cost. You read that right, at no cost! Consider VMs that sit in the same region or zone but communicate via public, external IP addresses. By configuring them to communicate via their internal IP addresses instead, you save what you would have paid for that traffic traveling over external IP addresses.
Keep in mind, you’ll need to weigh any potential network cost savings with the availability implications of a single-
zone architecture. Deploying to only a single zone is not recommended for workloads that require high availability,
but it can make sense to have certain services use a virtual private cloud (VPC) network within the same zone. One
example could be to use a single-zone approach in regions that have higher costs (Asia), but a multi-zone or multi-
regional architecture in North America, where the costs are lower.
Once you have established your network costs for an average month, you may want to consider a few different
approaches to better allocate spending. Some customers re-architect solutions to bring applications closer to their
user base, and some employ Cloud CDN to reduce traffic volume and latency, as well as potentially take advantage of
Cloud CDN’s lower costs to serve content to users. Both of these are viable options that can reduce costs and/or
enhance performance.
We have seen many customers who push large amounts of data on a daily basis from their on-premises environment
to Google Cloud, either using a VPN or perhaps directly over the Internet (encrypted with SSL, hopefully!). Some
customers, for example, have databases on dedicated, on-prem hardware, whereas their front-end applications are
serving requests in Google Cloud. If this describes your situation, consider whether you should use a Dedicated
Interconnect or Partner Interconnect. If you push large amounts of data on a consistent basis, it can be cheaper to
establish a dedicated connection vs. accruing costs associated with your traffic traversing the public internet or
using a VPN.
Check out the details of the architectural considerations to review when selecting an interconnect.
Choosing the Premium networking tier brings performance and low latency.
By choosing either Standard or Premium Tier, you can allocate the appropriate connectivity between your services,
fine-tuning the network to the needs of your application and potentially reducing costs on services that might
tolerate more latency and don’t require an SLA.
There are some limitations when leveraging the Standard tier for its pricing benefits. At a high level, these include
compliance needs around traffic traversing the public internet, as well as HTTP(S), SSL proxy, TCP proxy load
balancing, or usage of Cloud CDN. After reviewing some of the recommendations, you’ll be empowered to review
your services with your team and determine whether you can benefit from lower Standard Tier pricing without
impacting the performance of your external-facing services.
• Cloud Logging—You may not know it, but you do have control
over network traffic visibility by filtering out logs that you no
longer need. Check out some common examples of logs that
you can safely exclude. The same applies to data access
audit logs, which can be quite large and incur additional
costs. For example, you probably don’t need to log them for
development projects. For VPC Flow Logs and Cloud Load
Balancing, you can also enable sampling, which can
dramatically reduce the volume of logs being written
to Cloud Logging. You can set the sampling rate from 1.0 (100% of log
entries are kept) to 0.0 (0%, no logs are kept). For
troubleshooting or custom use cases, you can always choose
to collect telemetry for a particular VPC network or subnet or
drill down further to monitor a specific VM instance or virtual
interface.
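As one hedged example of turning sampling down, the sketch below patches a subnet's flow log configuration to keep 10% of flows; the project, region, and subnet names are placeholders:

# A minimal sketch, assuming placeholder project, region, and subnet names: lower
# VPC Flow Logs sampling on a subnet to 10% of flows, which proportionally reduces
# the volume of flow logs sent to Logging.
from google.cloud import compute_v1  # pip install google-cloud-compute

PROJECT = "my-project"
REGION = "us-central1"
SUBNET = "my-subnet"

subnets = compute_v1.SubnetworksClient()
current = subnets.get(project=PROJECT, region=REGION, subnetwork=SUBNET)

patch_body = compute_v1.Subnetwork(
    fingerprint=current.fingerprint,  # required so the patch applies cleanly
    log_config=compute_v1.SubnetworkLogConfig(enable=True, flow_sampling=0.1),
)

subnets.patch(
    project=PROJECT,
    region=REGION,
    subnetwork=SUBNET,
    subnetwork_resource=patch_body,
)
print(f"Requested 10% flow log sampling on {SUBNET}")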
Chapter 5
Optimizing data analytics costs with BigQuery
Running and managing legacy data warehouses can be frustrating and time-consuming, especially now, when data
is everywhere and in everything we do. Scaling systems to meet this increase in data has made it even more
challenging to maintain daily operations. There’s also the additional hassle of upgrading your data warehouse with
minimal downtime and supporting ML and AI initiatives to meet business needs. We hear from our customers that
they choose BigQuery, Google Cloud’s serverless enterprise data warehouse, so they can focus on analytics and be
more productive instead of managing infrastructure.
With BigQuery, you can run blazing fast queries, get real-time insights with streaming data, and start using advanced
and predictive analytics with built-in machine learning capabilities. An Enterprise Strategy Group (ESG) analysis
revealed that BigQuery can provide a 26% to 34% lower total cost of ownership (TCO) over a three-year period
compared to other cloud data warehouse alternatives. But that doesn't mean there’s no room for further
optimizations for your data housed in BigQuery. Since cost is one of the prominent drivers behind technology
decisions in this cloud computing era, the natural follow-up questions we hear from our customers are about billing
details and how to continually optimize costs.
We’ve put together this list of actions you can take to help you optimize your costs—and in turn, business outcomes—
based on our experiences and product knowledge. One particular benefit of optimizing costs in BigQuery is that
because of its serverless architecture, those optimizations also yield better performance, so you won’t have to make
stressful tradeoffs of choosing performance over cost or vice versa.
Let’s look at the pricing for BigQuery, then explore each billing
subcategory to offer tips to reduce your BigQuery spending. For any
location, the BigQuery pricing is broken down like this (and you’ll find
more details below):
• Query processing
• On-demand: This is based on the amount of data
processed by each query you run.
• Flat-rate: This is best for customers who desire cost
predictability. Customers purchase dedicated resources
for query processing and are not charged for individual
queries.
• Storage
• Active storage: A monthly charge for data stored in tables
or in partitions that have been modified in the last 90
days.
• Long-term storage: A lower monthly charge for data
stored in tables or in partitions that have not been
modified in the last 90 days.
• Streaming inserts
Before diving into those, keep in mind that several BigQuery operations are free of charge in any location, including batch loading data, copying tables, exporting data, and deleting datasets, tables, views, or partitions.
Buying too few slots can impact performance, while buying too many slots will introduce idle processing capacity,
resulting in cost implications. In order to find your sweet spot, you can start with a monthly flat-rate plan, which
allows more flexibility to downgrade or cancel after 30 days. Once you have a good enough ballpark estimate on the
number of slots you need, switch to an annual flat-rate plan for further savings. In addition, BigQuery Reservations
helps you use flat-rate pricing even more efficiently and plan your spending.
Given rapidly changing business requirements, we recently introduced Flex Slots, a new way to purchase BigQuery
slots for durations as short as 60 seconds, on top of monthly and annual flat-rate commitments.
With this combination of on-demand and flat-rate pricing, you can respond quickly and cost-effectively to changing
demand for analytics. Now that you understand the fundamentals of BigQuery pricing, let’s take a look at how you
get started with cost optimization.
When you’re getting started understanding costs within your organization, take a few minutes to run a quick report of
your BigQuery usage for the past month to get a sense of your costs. You can use Billing Reports in the Cloud
Console, or simply export your billing data into BigQuery. A detailed Data Studio dashboard is also available that
allows you to identify costly queries so that you can optimize for cost and query performance. It will also provide
insight into the usage patterns and resource utilization associated with your workload. Follow these step-by-step
instructions to create a dashboard, as shown below.
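If you prefer to start with a quick script instead of a dashboard, a hedged sketch like the one below uses the INFORMATION_SCHEMA jobs view to surface who processed the most bytes over the past month; the region qualifier and the $5-per-TiB on-demand rate are assumptions and may differ for your location and pricing:

# A minimal sketch for spotting expensive query patterns via INFORMATION_SCHEMA.
# Adjust the region qualifier for your location; cost is approximated at $5/TiB.
from google.cloud import bigquery  # pip install google-cloud-bigquery

client = bigquery.Client()

query = """
SELECT
  user_email,
  ROUND(SUM(total_bytes_billed) / POW(1024, 4), 2) AS tib_billed,
  ROUND(SUM(total_bytes_billed) / POW(1024, 4) * 5, 2) AS approx_cost_usd
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE job_type = 'QUERY'
  AND creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
GROUP BY user_email
ORDER BY tib_billed DESC
LIMIT 20
"""

for row in client.query(query).result():
    print(f"{row.user_email or 'unknown':<40} {row.tib_billed:>8} TiB  ~${row.approx_cost_usd}")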
You’ll likely query your BigQuery data for analytics and to satisfy business use cases like predictive analysis or real-
time inventory management.
On-demand pricing is what most users and businesses choose when starting with BigQuery. You are charged for the
number of bytes processed, whether the data is housed in BigQuery or in an external data source. There are
some ways you can reduce the number of bytes processed. Let's go through the best practices to reduce the cost of
running your queries, such as SQL commands, jobs, user-defined functions, and more.
Let’s look at an example of how much data a query will process. Here we’re querying one of the public weather
datasets available in BigQuery:
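As a stand-in example, the sketch below runs dry-run queries against the public bigquery-public-data.samples.gsod weather table (which may not be the exact dataset used above) to compare SELECT * with selecting only the columns you need; a dry run reports the bytes a query would process without running it or incurring cost:

# A hedged sketch comparing bytes processed, using the public samples.gsod weather
# table as a stand-in. Dry runs report bytes processed without executing the query.
from google.cloud import bigquery  # pip install google-cloud-bigquery

client = bigquery.Client()
dry_run = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)

select_star = "SELECT * FROM `bigquery-public-data.samples.gsod`"
select_some = (
    "SELECT station_number, year, month, day, mean_temp "
    "FROM `bigquery-public-data.samples.gsod`"
)

for label, sql in [("SELECT *", select_star), ("needed columns", select_some)]:
    job = client.query(sql, job_config=dry_run)
    print(f"{label}: {job.total_bytes_processed / 1e9:.2f} GB processed")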
In that example, selecting only the necessary columns reduces the bytes processed by about eight-fold, which is a quick way to optimize for cost. Also note that applying a LIMIT clause to your query doesn’t have an effect on cost, because BigQuery still scans the full columns you reference.
If you do need to explore the data and understand its semantics, you can always use the no-charge data preview
option.
Also remember you are charged for bytes processed in the first stage of query execution. Avoid creating a complex
multistage query just to optimize for bytes processed in the intermediate stages, since there are no cost implications
anyway (though you may achieve performance gains).
Filter your query as early and as often as you can to reduce cost and improve performance in BigQuery.
In this case, use the maximum bytes billed setting to limit query cost. Going above the limit will cause the query to
fail without incurring the cost of the query, as shown below.
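A minimal sketch of that guardrail, using a placeholder table and an arbitrary 1 GB cap, looks like this:

# A minimal sketch of the maximum-bytes-billed guardrail: cap this query at roughly
# 1 GB billed. A query that would bill more fails up front and costs nothing.
from google.cloud import bigquery  # pip install google-cloud-bigquery

client = bigquery.Client()
capped = bigquery.QueryJobConfig(maximum_bytes_billed=10**9)  # ~1 GB cap

sql = "SELECT sales_rep, amount FROM `my-project.sales.transactions`"  # placeholder

try:
    rows = client.query(sql, job_config=capped).result()
    print(f"Query returned {rows.total_rows} rows within the byte cap")
except Exception as err:  # an over-the-cap query is rejected with an error
    print("Query blocked by maximum bytes billed:", err)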
A customer once asked why custom control is so important. To put things into perspective, let’s say you have 10 TB
of data in a U.S. (multi-regional) location, for which you are charged about $200 per month for storage. If 10 users
sweep all the data using SELECT * queries 10 times a month, your BigQuery bill is now about $5,000, because you are
sweeping 1 PB of data per month. Applying thoughtful limits can help you prevent these types of accidental queries.
Note that cancelling a running query may incur up to the full cost of the query as if it was allowed to complete.
Along with enabling cost control on a query level, you can apply similar logic to the user level and project level as
well.
Let’s take a real-world example, where you have a Data Studio dashboard backed by BigQuery and accessed by
hundreds or even thousands of users. This will show right away that there is a need for intelligently caching your
queries across multiple users.
To significantly increase the cache hit across multiple users, use a single service account to query BigQuery, or use
community connectors, as shown in this Next ‘19 demo.
Now, when you query to analyze sales data for the month of August, you only pay for data processed in those 31
partitions, not the entire table.
One more benefit is that each partition is separately considered for long-term storage, as discussed earlier.
Considering our above example, sales data is often loaded and modified for the last few months. So all the
partitions that were not modified in the last 90 days are already saving you some storage costs. To really get the
benefits of querying a partitioned table, you should filter the table using a partition column.
While creating or updating partitioned tables, you can enable Require partition filter, which will force users to include
a WHERE clause that filters on the partition column, or else the query will fail with an error.
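Putting those ideas together, here is a hedged sketch (all names are placeholders) that creates a date-partitioned table with Require partition filter enabled and then queries just the August partitions:

# A minimal sketch of the partitioned-table setup described above. Dataset, table,
# and column names are placeholders.
from google.cloud import bigquery  # pip install google-cloud-bigquery

client = bigquery.Client()

table = bigquery.Table(
    "my-project.sales.transactions_partitioned",
    schema=[
        bigquery.SchemaField("transaction_date", "DATE"),
        bigquery.SchemaField("sales_rep", "STRING"),
        bigquery.SchemaField("amount", "NUMERIC"),
    ],
)
table.time_partitioning = bigquery.TimePartitioning(
    type_=bigquery.TimePartitioningType.DAY, field="transaction_date"
)
table.require_partition_filter = True  # queries must filter on transaction_date
client.create_table(table)

# Only the August partitions are scanned (and billed), not the whole table.
august = """
SELECT sales_rep, SUM(amount) AS total
FROM `my-project.sales.transactions_partitioned`
WHERE transaction_date BETWEEN '2019-08-01' AND '2019-08-31'
GROUP BY sales_rep
"""
client.query(august).result()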
For example, below, sales leadership needs a dashboard that displays relevant metrics for specific sales
representatives. Enabling clustering on the sales_rep column is a good strategy, as it is going to be used often as a
filter. As shown below, you can see that BigQuery only scans one partition (2019/09/01) and the two blocks where
sales representatives Bob and Tom can be found. The rest of the blocks in that partition are pruned. This reduces the
number of bytes processed and thus the associated querying cost.
Clustering is allowed only on partitioned data. You can always use partitioning based on ingestion time, or introduce a synthetic date or timestamp column, to enable clustering on your table.
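The same setup can also be expressed as a DDL statement that adds clustering on sales_rep; the sketch below, with placeholder names, mirrors the example above:

# A hedged sketch: create a partitioned and clustered table with a DDL statement,
# then query it with filters on both the partition and clustering columns.
from google.cloud import bigquery  # pip install google-cloud-bigquery

client = bigquery.Client()

ddl = """
CREATE TABLE `my-project.sales.transactions_clustered`
PARTITION BY transaction_date
CLUSTER BY sales_rep
AS
SELECT transaction_date, sales_rep, amount
FROM `my-project.sales.transactions_partitioned`
"""
client.query(ddl).result()

# Only the matching partition and the blocks holding these sales reps are read.
sql = """
SELECT sales_rep, SUM(amount) AS total
FROM `my-project.sales.transactions_clustered`
WHERE transaction_date = '2019-09-01'
  AND sales_rep IN ('Bob', 'Tom')
GROUP BY sales_rep
"""
client.query(sql).result()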
You can find much more here on clustering. And explore new Materialized Views for improved performance and
efficiency, along with cost savings.
Once data is loaded into BigQuery, charges are based on the amount of data stored in your tables per second. Here
are a few tips to optimize your BigQuery storage costs.
For instance, in this example, we only need to query the staging weather dataset until the downstream job cleans the
data and pushes it to a production dataset. Here, we can set seven days for the default table expiration.
Note that if you’re updating the default table expiration for a dataset, it will only apply to the new tables created. Use
a DDL statement to alter your existing tables.
BigQuery also offers the flexibility to provide different table expiration dates within the same dataset. So this table
called new_york in the same dataset needs data retained for longer.
In this example, new_york will retain its data for six months, and because we haven’t specified a table expiration for california, its expiration will default to the dataset’s seven days.
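A minimal sketch of both expirations, with placeholder dataset and table names, might look like this:

# A minimal sketch: the staging dataset defaults new tables to a 7-day expiration,
# while the new_york table is kept for roughly six months.
import datetime

from google.cloud import bigquery  # pip install google-cloud-bigquery

client = bigquery.Client()

# Dataset-level default: new tables in the staging dataset expire after 7 days.
dataset = client.get_dataset("my-project.staging_weather")
dataset.default_table_expiration_ms = 7 * 24 * 60 * 60 * 1000
client.update_dataset(dataset, ["default_table_expiration_ms"])

# Table-level override: keep new_york for about six months.
table = client.get_table("my-project.staging_weather.new_york")
table.expires = datetime.datetime.now(datetime.timezone.utc) + datetime.timedelta(days=180)
client.update_table(table, ["expires"])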
Similar to dataset-level and table-level expiration, you can also set up expiration at the partition level. Check out our
public documentation for default behaviors.
Querying the table data, along with a few other actions, does not reset the 90-day timer, and the pricing continues to be considered long-term storage.
Choose this technique for the use cases where it makes the most
sense. Typically, queries that run on external sources don’t perform as
well compared to queries executed on the same data stored in BigQuery, since data stored in BigQuery is kept in a columnar format that yields much better performance.
To find the number of rows from a snapshot of a table one hour ago, use the following query:
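A hedged reconstruction of such a query, using BigQuery time travel and a placeholder table name, is:

# A hedged sketch: count the rows a table held one hour ago using time travel.
from google.cloud import bigquery  # pip install google-cloud-bigquery

client = bigquery.Client()

sql = """
SELECT COUNT(*) AS row_count
FROM `my-project.sales.transactions`
FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)
"""
for row in client.query(sql).result():
    print("Rows one hour ago:", row.row_count)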
For business-critical data, follow the Disaster Recovery Scenarios for Data guide for a data backup, especially if you
are using BigQuery in a regional location.
Done right, BigQuery can satisfy all your modern data warehousing needs at a very reasonable price. Once the cost
optimization actions are implemented, you should see a visible drop in your BigQuery bill (unless you’ve followed
best practices since day one). In either case, celebrate your success! You deserve it.
Following these principles may save on cost and open up capacity in the short term—but the long-term ROI pays
plenty of dividends. The customers we talk to have seen drastically reduced data costs and infrastructure savings,
letting them add brand-new capabilities and explore innovative solutions like machine learning. They’ve also gained
better performance and simplicity across their IT environments. It becomes much easier to scale as needed, too.
One study, for example, found that Google Cloud’s Spanner database brings a total cost of ownership (TCO) that’s
78% lower than on-premises databases. Optiva migrated its legacy Oracle databases to Google Cloud for cost
efficiency and achieved better scale.
And customers across industries have been able to do much more with cloud, and help teams be more productive,
compared to other cloud providers or legacy technology. That can lead to better customer experiences, new product
innovation, and adding new business initiatives. Machine learning platform provider MD Insider often experienced
slow network performance, and failure rates and costs grew accordingly. Using Google Cloud for data services has
led to a 5x performance improvement and accelerated time to market of up to 30%. The physician performance site’s
developers are now much more productive, and time previously spent managing infrastructure is now spent on
product development.
Raycatch reduced infrastructure costs for its AI-led renewable energy tech by 80% using Compute Engine, Cloud
Bigtable, and other technologies. The company has added flexibility and stability to its systems, and opened up
capacity from its previous cloud, which means developers can work faster. Raycatch has been able to reduce work
hours needed to oversee one analysis task by 60 times, so the IT team can do more valuable optimization work.
There are many cost management strategies and best practices that can help you get the most from the cloud. Use
what works best for your team and organization. With a little bit of effort, the cloud offers efficiencies today and the
potential for significant ROI into the future.
Discover ways to reduce and optimize your IT spend for immediate and long-term growth. Talk with a Google Cloud
expert.
cloud.google.com