Inside the Force.com Query Optimizer
如果无法正常显示,请先停止浏览器的去广告插件。
1. Inside the Force.com Query Optimizer
From salesforce.com ’ s Customer Centric Engineering – Technical Enablement team
2. Safe harbor
Safe harbor statement under the Private Securities Litigation Reform Act of 1995:
This presentation may contain forward-looking statements that involve risks, uncertainties, and assumptions. If any such uncertainties
materialize or if any of the assumptions proves incorrect, the results of salesforce.com, inc. could differ materially from the results
expressed or implied by the forward-looking statements we make. All statements other than statements of historical fact could be
deemed forward-looking, including any projections of product or service availability, subscriber growth, earnings, revenues, or other
financial items and any statements regarding strategies or plans of management for future operations, statements of belief, any
statements concerning new, planned, or upgraded services or technology developments and customer contracts or use of our services.
The risks and uncertainties referred to above include – but are not limited to – risks associated with developing and delivering new
functionality for our service, new products and services, our new business model, our past operating losses, possible fluctuations in our
operating results and rate of growth, interruptions or delays in our Web hosting, breach of our security measures, the outcome of
intellectual property and other litigation, risks associated with possible mergers and acquisitions, the immature market in which we
operate, our relatively limited operating history, our ability to expand, retain, and motivate our employees and manage our growth, new
releases of our service and successful customer deployment, our limited history reselling non-salesforce.com products, and utilization
and selling to larger enterprise customers. Further information on potential factors that could affect the financial results of salesforce.com,
inc. is included in our annual report on Form 10-Q for the most recent fiscal quarter ended July 31, 2012. This documents and others
containing important disclosures are available on the SEC Filings section of the Investor Information section of our Web site.
Any unreleased services or features referenced in this or other presentations, press releases or public statements are not currently
available and may not be delivered on time or at all. Customers who purchase our services should make the purchase decisions based
upon features that are currently available. Salesforce.com, inc. assumes no obligation and does not intend to update these forward-
looking statements.
Join the conversation: #forcewebinar
3. Speakers
Join the conversation: #forcewebinar
John Tan Jaikumar Bathija
Architect Evangelist
@johntansfdc Architect – DB Performance
@
4. Follow Developer Force for the latest news
@forcedotcom / #forcewebinar
Developer Force – Force.com Community
+Developer Force – Force.com Community
Developer Force
Developer Force group
Join the conversation: #forcewebinar
5. Architect Core Resource page
• Featured content for architects
• Articles, papers, blog posts, events
• Follow us on Twitter
Updated weekly!
http://developer.force.com/architect
Join the conversation: #forcewebinar
6. Have questions?
§ We have an expert support team at the ready to answer your questions
during the webinar.
§ Ask your questions via the GoToWebinar Questions Pane.
§ The speaker(s) will choose top questions to answer live at the end of the
webinar.
§ Please post your questions as we go along!
§ Only post your question once; we’ll get to it as we go down the list.
Join the conversation: #forcewebinar
7. Today s Learning Goal
AWARENESS
Join the conversation: #forcewebinar
8. Why are we here?
SELECT Id
FROM Account
WHERE Status__c != ‘Closed’ AND
Rating = Null AND
CreatedDate > 2013-04-01
Empower developers to write selective queries.
Join the conversation: #forcewebinar
Don’t worry we have lots of examples.
9. Selective Filters
Leverages
in dexes.
Re duces the
nu mber of
records in yo ur
result set.
Avo ids full table
scans.
Join the conversation: #forcewebinar
10. Query Performance Impact
Go vernor Limits
(Timeo uts,
Concur rent
Request Limit,
Concur rent API
limit, etc).
User Experience
(Visualforce
pages, API,
Reports,
Listviews, etc).
Large Data
Volu mes (LDV ).
Join the conversation: #forcewebinar
11. Join the conversation: #forcewebinar
12. Agenda
• Design - http://developer.force.com/architect
• Query Optimizer
• SOQL Examples
• Skinny Tables
• Other Performance Factors
Join the conversation: #forcewebinar
13. Query Optimizer
Join the conversation: #forcewebinar
14. Query execution
Join the conversation: #forcewebinar
15. Multitenancy
Join the conversation: #forcewebinar
16. Basic Algorithm
• Pre-Query engine.
• Chooses the most selective filter from the WHERE clause.
• Determine the best leading table/index to drive the query.
Join the conversation: #forcewebinar
17. Indexing
• Standard index - is available out of the box and we have a whole bunch of
fields that are indexed on Standard and custom entities.
• Custom index – is created on-demand, based on performance analysis done
pro-actively by salesforce team.
• What other fields are indexed – External Id fields, fields marked unique,
foreign keys by way of lookup or master detail relationship.
Join the conversation: #forcewebinar
18. Statistics
• Pre-computed Statistics
§ Row count
§ User visibility
§ Custom index
§ Owner row count
Join the conversation: #forcewebinar
19. Options considered by the Optimizer
Join the conversation: #forcewebinar
20. The numbers game
• Standard index will be considered only if the filter fetches < 30% of the
records for the first million records and less than 15% of the records after the
first million records, up to 1M records. * The selectivity threshold is subject
to change.
• Custom index will be considered only if the filter fetches < 10% of the records
for the first million records and less than 5% of the records after the first
million records, up to 333,333 records. * The selectivity threshold is
subject to change.
Join the conversation: #forcewebinar
21. The numbers game – Standard Index
# of records First Threshold Second Threshold Final Threshold
Up to 1 million 30% of total N/A 30% of total
Up to 2 million 300,000 150,000 450,000
Up to 3 million 300,000 300,000 600,000
Up to 4 million 300,000 450,000 750,000
Up to 5 million 300,000 600,000 900,000
Above 5.6 million 300,000 700,000 1,000,000
Join the conversation: #forcewebinar
22. The numbers game – Custom Index
# of records First Threshold Second Threshold Final Threshold
Up to 1 million 10% of total N/A 10% of total
Up to 2 million 100,000 50,000 150,000
Up to 3 million 100,000 100,000 200,000
Up to 4 million 100,000 150,000 250,000
Up to 5 million 100,000 200,000 300,000
Above 5.6 million 100,000 233,333 333,333
Join the conversation: #forcewebinar
23. Other Optimizations
AND optimizations
§ Composite Index Join - INTERSECTION of indexes should still meet
selectivity threshold.
OR optimizations
§ Union - SUM of the filters should still meet selectivity threshold.
sort optimizations
§ an index aligns with our order by clause and the query has a row limit,
we can use the index to find the first rows quickly and exit.
Join the conversation: #forcewebinar
24. Examples
Join the conversation: #forcewebinar
25. Schema
MyCase – 100,000 Records
MyUser – 100,000 Records
Join the conversation: #forcewebinar
26. Reminder: Goal of Optimizer
• Generate efficient SQL
• Leverage an index to drive query
• Avoid full table scans
Query Optimizer cannot make up for non-selective filters. It will
make the best choice from the filters in your query.
Join the conversation: #forcewebinar
27. Selectivity
Total # Records = 100,000
Selectivity Threshold = 10,000
Indexed Field
Status
Status
Value
Closed
New
# of Records
96,500
3,500
SELECT Id FROM MyCase__c
WHERE Status__c = ‘Closed’ will be do a full table scan
SELECT Id FROM MyCase__c
WHERE Status__c = ‘New’ will use the index
Join the conversation: #forcewebinar
Selective?
No
Yes
28. Not Equals / Not In
Total # Records = 100,000
Selectivity Threshold = 10,000
Indexed Value # of Records Selective?
Priority
Priority
Priority 1
2
3 6,000
3,500
90,500 Yes
Yes
No
Can’t use index because of not equals
SELECT Id FROM MyCase__c
WHERE Priority__c != 3 will do a full table scan
SELECT Id FROM MyCase__c
WHERE Priority__c IN (1,2) will use the index
Join the conversation: #forcewebinar
29. Formula Fields
Field
CaseType__c
Type
Formula
Formula CASE(MyUser__r.UserType__c,1,”Gold”,”Silver”)
(Text)
Can’t create an index on CaseType__c since this formula spans objects
IF MyUser__r.UserType__c has an index
• SELECT Id FROM MyCase__c WHERE MyUser__r.UserType__c = 1
Join the conversation: #forcewebinar
30. Formula Fields
Field
CaseTypeClone__c
Type
Text(255)
Or avoid a join and create CaseTypeClone__c field and index it
• SELECT Id FROM MyCase__c WHERE CaseTypeClone__c = ‘Gold’
Join the conversation: #forcewebinar
31. Nulls
Total # Records = 100,000
Selectivity Threshold = 10,000
Indexed Field Value # of Records Selective?
ClosedDate
ClosedDate Non-Null
Null 96,500
3,500 Yes for specific dates
Yes
Customer Support will need to create a custom index that includes null records. Standard indexes by default
include nulls.
SELECT Id FROM MyCase__c WHERE ClosedDate__c = null will use the index
http://blogs.developerforce.com/engineering/2013/02/force-com-soql-best-practices-nulls-and-formula-fields.html
Join the conversation: #forcewebinar
32. Date & Number Range
SELECT Id
FROM MyCase
WHERE ClosedDate__c > 2013-01-01 AND ClosedDate__c < 2013-02-01
Query Optimizer can detect only date and number ranges.
Join the conversation: #forcewebinar
33. AND conditions
Composite Index Join
SELECT Id FROM MyUser
WHERE FirstName__c = ‘Jane’ AND LastName__c = ‘Doe’ AND City__c = ‘San Francisco’
Step 1 – Allow each index to still be considered if they return < 2X selectivity threshold
Step 2 – INTERSECTION of all indexes must meet *selectivity threshold
Step 3 – Use composite index join to drive query
*If all indexes are standard indexes, use standard index selectivity threshold. Otherwise, use the custom index standard selectivity
threshold
Join the conversation: #forcewebinar
34. AND conditions
Composite Index Join – MyUser object 100,000 records
Join the conversation: #forcewebinar
35. AND conditions
Composite Index Join – MyUser object 100,000 records
Join the conversation: #forcewebinar
36. 2-column Index
For this simple example, it makes more sense to have Customer Support create a 2-column
index.
Join the conversation: #forcewebinar
37. OR conditions
Union
SELECT Id FROM MyUser
WHERE FirstName__c = ‘Jane’ OR LastName__c = ‘Doe’ OR City__c = ‘San Francisco’
Step 1 – Each field must be indexed and meet selectivity threshold
Step 2 – ADDITION of all the indexes must meet *selectivity threshold
Step 3 – Use union to drive query
*If all indexes are standard indexes, use standard index selectivity threshold. Otherwise, use the custom index standard selectivity
threshold
Join the conversation: #forcewebinar
38. OR conditions
Union – MyUser object 100,000 records
Join the conversation: #forcewebinar
39. OR conditions
Using SOSL may be a better option
• SELECT Id FROM Account WHERE PersonMobilePhone LIKE ‘%123’ – leading %
wildcard as bad as full scan
• SELECT Id FROM Account WHERE PersonMobilePhone = ‘1234567890’ OR
PersonHomePhone = ‘1234567890’ OR Phone = ‘1234567890’
Join the conversation: #forcewebinar
40. Relationship
Relationship
SELECT Id FROM MyCase__c
WHERE MyUser__r.JobType = 1 AND Priority__c = ‘Priority 1’
Each index’s selectivity threshold is analyzed separately and the index with the lower
threshold % is chosen.
Join the conversation: #forcewebinar
41. Soft Deletes
• Records in the Recycle Bin with isDeleted = true
• DO NOT USE isDeleted = false as a filter
• Counted in pre-computed statistics
• Use hard delete option in Bulk API or Contact Customer Support
Join the conversation: #forcewebinar
42. Sort Optimization
• Number and Date fields only
• Limit Clause required
• Can make up for a non-selective filter
SELECT Id FROM MyCase__c
ORDER BY CreatedDate LIMIT 10
SELECT Id FROM MyCase__c
WHERE CreatedDate > 2001-01-01
ORDER BY CreatedDate LIMIT 10
Join the conversation: #forcewebinar
43. Review
ü Selectivity thresholds determine if an index is considered
ü Not Equals filters will not leverage indexes
ü Be careful filtering on Null
ü And conditions involve an INTERSECTION of indexes
ü OR conditions involve an ADDITION of indexes
ü ORDER BY with a LIMIT on an index can make up for non-selective filters
Join the conversation: #forcewebinar
44. Sharing
Join the conversation: #forcewebinar
45. Record Visibility
• Applies only to non-Admin users.
• Depending on your user profile, you may have visibility to few or large
number of records.
• Sharing tables may drive query instead of index
Join the conversation: #forcewebinar
46. Skinny Table
Join the conversation: #forcewebinar
47. Skinny Table
Join the conversation: #forcewebinar
48. Skinny Tables
• Single Object
• Maximum of 100 fields
• Not Aggregate/Summary. 1:1 record count between source object and skinny
• It is not a cross-object join
• Updates to source object automatically reflected in skinny
• Improved performance – minimal joins since fields are in one table
Join the conversation: #forcewebinar
49. Skinny Table
Join the conversation: #forcewebinar
50. When are Skinny Tables used?
ü After attempting to tune with custom indexes
ü All fields selected and filtered must be in skinny
ü Salesforce.com will analyze and create
Join the conversation: #forcewebinar
51. Other Performance Factors
Join the conversation: #forcewebinar
52. Performance Factors
Sharing
§ Test as a non-System Admin User
Data Skews
§ Avoid parent-child and ownership data skews
Archiving
Database Caching
§ Avoid relying on cache performance or attempting to warm the cache
Join the conversation: #forcewebinar
53. Key Takeaways
ü Query Performance improves with indexes
ü Use selective filters to reduce result set
ü Query Optimizer chooses the best table/index to drive a query
ü Skinny Tables may help when indexing is exhausted
Join the conversation: #forcewebinar
54. Cheat Sheet:
Indexed Fields
Database
Query & Search Optimization
Cheat Sheet
Query Optimization Overview Selectivity Overview
When building queries, list views, and reports, it's best to create filter conditions
that are selective so that Force.com scans the most appropriate rows in the
objects that your queries target. This best practice is especially important when
your queries target "large objects," objects containing more than one million
records. Several things can affect the selectivity of a query filter's conditions.
When writing SOQL, consider using the following fields, which can make your
query filter conditions more selective, and improve your query response times
and your database's overall performance.
Whether the field in the condition has an index
Whether the value in the condition is selective relative to the total number of records in the
object. These numbers determine the selectivity threshold, which the Force.com query optimizer
uses to ensure that the most appropriate index, if any, drives each of your queries.
Whether the operator in the condition permits the use of available indexes
When writing your queries, remember the following selectivity conditions and tips.
SOQL
Fields with Database Indexes
Primary
Keys
Foreign
Keys
Audit
Dates
Index Selectivity Conditions and Thresholds
Custom
Fields
Id CreatedById CreatedDate Unique fields
Name LastModifiedById LastActivityDate OwnerId Lookup fields SystemModstamp External ID
fields
Master-detail
relationship fields
Unary Condition:
Custom Index AND
Condition OR
Condition LIKE
Condition
Force.com uses a
standard index if
the filter targets less
than: Force.com uses a
custom index if the
filter targets less
than: Force.com uses
a composite
index join if the
filter targets less
than: Force.com
uses a union
if the filter
targets less
than: For
conditions
that don't
start with
a leading
wildcard,
Force.com
tests the
first 100,000
rows for
selectivity.
30% of the first
million records 10% of the first
million records
15% of all records
after the first million
records 5% of all records
after the first
million records
1 million total
records 333,333 total
records
Twice the index
selectivity
thresholds for
each field The index
selectivity
thresholds for
each field
The index
selectivity
thresholds for
the intersection
of those fields The index
selectivity
thresholds for
the sum of
those fields
Query Optimization Resources Index Selectivity Exceptions
In addition to this cheat sheet's previous sections, we recommend reading the
following related resources, which can help you retrieve the records you want
from a large volume of data—and do so quickly and efficiently. When you build a filter condition with the following operators, Force.com doesn't use an available
index. Instead, it scans all records in the object to find the records that satisfy the condition. Feel
free to use these operators, but be sure to add selective filter conditions.
Best Practices for Deployments with Large Data Volumes (white paper)
The following filter operators
Force.com Apex Code Developer's Guide (guide) » not equal to
Force.com Blogs: Engineering (blog posts) » contains
How to Improve Listview Performance (Salesforce Knowledge article)
In the online help:
http://developer.force.com/architect
Unary Condition:
Standard Index
» does not contain
When used with text and text fields, the following comparison operators
» "Build Effective Filters" »
» "Getting the Most Out of Filter Logic" »
» "Improve Report Performance" »
( < )
(>)
(<=)
»
(>=)
Additionally, Force.com doesn't use available indexes when you use:
Leading wildcards
Non-deterministic or cross-object formula fields
SOSL
Fields with Search Indexes Search Selectivity Tips
General General
Name fields
Phone fields
Text fields
Picklist fields
These fields vary by object. See "Search Fields" in the online help.
Be as selective as possible. For
example, use Michael*, not Mich*.
Remember that Chatter feed
searches aren't affected by the scope
of your search; Chatter feed search
results include matches across all
objects.
Sidebar Search and Advanced Search
Search for the exact phrase with an advanced search.
Limit scope by targeting:
» Specific objects
» Rows owned by the searcher
» Rows within a division, when applicable
See "Search Overview" in the online help.
http://developer.force.com
Join the conversation: #forcewebinar
55. Upcoming Events
April 21-27, 2013
Salesforce Mobile Developer Week
May 8, 2013
Summer ‘13 Release Developer Preview Webinar
May 9, 2013
SOQL Best Practices CodeTalk
Join the conversation: #forcewebinar
56. Survey
Your feedback is crucial to the success of our webinar programs.
Thank you!
http://bit.ly/querysurvey
*Look in the GoToWebinar chat
window now for a hyperlink.
Join the conversation: #forcewebinar
57. Q&A
Join the conversation: #forcewebinar
John Tan Jaikumar Bathija
Architect Evangelist
@johntansfdc Architect – DB Perfomance